Computational complexity of probabilistic inference
One has to remember that probabilistic inference is worst-case NP-hard, which means in practice that probabilistic models can easily reach a size and complexity that are prohibitive. The complexity of probabilistic models stems from two sources: (1) the exponential growth of conditional probability tables in the number of parents, and (2) the connectivity of the directed graph modeling the problem structure. While SMILE is very efficient and is among the fastest (if not the fastest!) probabilistic inference engines, it is not difficult to create models that will exhaust available memory and processor time.
Our advice to users is to keep an eye on the number of parents of a node. Please note that the size of the conditional probability table in a node grows exponentially with the number of its parents: while 10 binary parents lead to 2^(10+1)=2,048 parameters, 11 binary parents lead to 2^(11+1)=4,096 parameters, and 20 binary parents lead to 2^(20+1)=2,097,152 parameters. Adding just a few more parents may exhaust your computer's memory.
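The growth above can be verified with a few lines of Python (a small illustrative sketch; the helper name `cpt_parameter_count` is hypothetical and not part of SMILE or GeNIe):

```python
def cpt_parameter_count(num_parents, parent_states=2, node_states=2):
    """Number of entries in a node's conditional probability table:
    one distribution over the node's states for every combination
    of parent states.  For a binary node with n binary parents this
    is 2 * 2**n = 2**(n+1)."""
    return node_states * parent_states ** num_parents

for n in (10, 11, 20):
    print(n, "parents ->", cpt_parameter_count(n), "parameters")
# 10 parents -> 2048, 11 parents -> 4096, 20 parents -> 2097152
```

Note that the growth is even steeper when nodes have more than two states; passing, say, `parent_states=3` makes the table grow as 3^n.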
There is no magic number that one should not exceed, but we believe that one should slow down when adding parents to nodes whose number of parents exceeds 15 or so. When the number of states of the nodes in question is large, this number becomes even smaller. GeNIe will issue a warning when more than 20 parents are added (this threshold can be changed through the Options dialog). GeNIe also offers a dialog that presents a list of all model nodes sorted from the highest to the lowest in-degree. We advise making generous use of this function and focusing model development on those nodes that have the highest in-degree. Section Structural analysis describes this functionality in detail.
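The same in-degree ranking that the dialog presents is easy to reproduce outside GeNIe. The sketch below (plain Python, with a hypothetical model represented as a mapping from node identifier to its list of parents) sorts nodes from highest to lowest in-degree:

```python
# Hypothetical model structure: node id -> list of parent ids.
model = {
    "Disease":   [],
    "Test1":     ["Disease"],
    "Symptom":   ["Disease", "Age", "Lifestyle"],
    "Age":       [],
    "Lifestyle": ["Age"],
}

# In-degree of a node in a Bayesian network is simply its number of parents.
by_in_degree = sorted(model, key=lambda node: len(model[node]), reverse=True)

for node in by_in_degree:
    print(node, len(model[node]))
# "Symptom" (3 parents) is listed first, so it is the first candidate
# for parameter-reduction techniques.
```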
There are several techniques for reducing the number of model parameters. One of them is the use of so-called canonical gates, such as the Noisy-MAX gates, implemented in GeNIe and SMILE. They not only reduce the number of parameters and, hence, the elicitation effort, but also significantly speed up inference. Another technique, described thoroughly in the Bayesian network literature, is called parent divorcing, which amounts to introducing intermediate nodes that reduce the number of parents of a single node. We encourage users of decision-theoretic techniques to study these methods, as they will lead to increased productivity and will significantly move the boundaries of what can be modeled in practice. Increasing computer speed or memory will lead to linear improvements in performance. Structural changes, such as reducing the number of parents of a node, may lead to exponential improvements. Focusing on the modeling effort is, thus, by far more effective than increasing the computational power of the computer solving the model.
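The parameter savings offered by canonical gates can be quantified with a simple back-of-the-envelope calculation. The sketch below assumes binary nodes and the common parameterization of a binary Noisy-OR/Noisy-MAX gate: one noise parameter per parent plus one leak parameter (the function names are illustrative, not SMILE API calls):

```python
def full_cpt_params(num_parents):
    """A binary node with n binary parents needs a full CPT of
    2**(n+1) entries."""
    return 2 ** (num_parents + 1)

def noisy_or_params(num_parents):
    """A binary Noisy-OR/MAX gate needs only one noise parameter
    per parent plus one leak probability."""
    return num_parents + 1

for n in (5, 10, 20):
    print(n, "parents: full CPT", full_cpt_params(n),
          "vs Noisy-OR", noisy_or_params(n))
# At 20 parents: 2,097,152 CPT entries vs 21 Noisy-OR parameters.
```

The contrast makes the point of this section concrete: the full CPT grows exponentially in the number of parents, while the canonical-gate parameterization grows only linearly.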