Posterior Distribution using SMILE

Post by **lukasbr** » Tue Oct 04, 2016 9:49 am

Hello,

I have an issue performing inference in a Bayesian network using SMILE.

Based on a dataset, I set the evidence and then update the beliefs. Afterwards I read the posterior distribution of the target node and output the values. Loading the same network into GeNIe and setting the same evidence results in a completely different posterior distribution. Calculating the posterior distribution by hand shows that the result obtained in GeNIe is the correct one.

Why is this the case and how can I get the correct results using SMILE? I saw an earlier post, which mentioned some default setting in GeNIe which enables d-separation. The original post is not accessible anymore though. Or does this have something to do with setting relevance?

Thank you!

Tue Oct 04, 2016 2:02 pm

I saw an earlier post, which mentioned some default setting in GeNIe which enables d-separation. The original post is not accessible anymore though.

D-separation is used for entropy-based diagnosis. If you're just setting the evidence/performing inference/reading the posteriors there are no settings to adjust. Can you post your code and the dataset here?

Post by **lukasbr** » Wed Oct 05, 2016 11:39 am

While trying to recreate the error with a smaller network, I realized that merely opening a Bayesian network with SMILE and setting the evidence gives the same posterior distribution as GeNIe. This changes once I start adding nodes in SMILE.

The following example illustrates this behavior:
I open a small naive Bayes network whose first node is the class variable. I add another node to the network, set all the nodes except the class variable to their first state, and calculate the posterior distribution for the class variable. Afterwards I save the net, open it again, set all the nodes to their first state, and calculate the posterior distribution again. The posterior distribution differs between the two nets, even though they have the same nodes, CPTs, and evidence. The second net produces the same result as GeNIe. Why is this the case? Or am I missing something?

I'm using the following code:

Code:

	Manager manager; // only used to output the posteriors
	DSL_network nb;
	
	stringstream path2net;
	path2net <<  "/.../someBayesNet.xdsl"; 
	nb.ReadFile(path2net.str().c_str());

	// create new node "Test"
	stringstream nodeName;
	nodeName << "Test";
	int nodeTest = nb.AddNode(DSL_CPT, nodeName.str().c_str());
	DSL_node* nodeHandlerTest = nb.GetNode(nodeTest);

	// add arc from the first node to node "Test"
	nb.AddArc(nb.GetFirstNode(), nodeTest); 

	// get the number of states of the first node
	int numStates = (nb.GetNode(nb.GetFirstNode())->Definition())->GetNumberOfOutcomes();

	// create CPT for node "Test"
	DSL_Dmatrix cpt = DSL_Dmatrix();
	cpt.AddDimension(2*numStates); // Is there another way to initialize the matrix?
	int j = 0; 
	for (int i = 0; i < numStates ; i++) {
		// filling in random values
		cpt[j] = ((double) rand() / (RAND_MAX));
		cpt[j + 1] = 1 - cpt[j]; 
		j+=2; 
	}

	// set the matrix as CPT for "Test"
	nodeHandlerTest->Definition()->SetDefinition(cpt);
	
	// set evidence (always the first State) for all nodes except the first node
	for (int i = nb.GetFirstNode()+1; i <= nb.GetLastNode() ; i++) {
		 nb.GetNode(i)->Value()->SetEvidence(0);
	}
	
	nb.UpdateBeliefs();
	// output the posterior distribution of the first node
	manager.outputPosteriors(nb); 

	// Save the net with the added arc
	stringstream path2UpdatedNet;
	path2UpdatedNet << "/.../newBayesNet.xdsl"; 
	nb.WriteFile(path2UpdatedNet.str().c_str());

	// Open the net 
	nb.ReadFile(path2UpdatedNet.str().c_str());

	// set evidence (always the first State) for all nodes except the first node
	for (int i = nb.GetFirstNode()+1; i <= nb.GetLastNode() ; i++) {
		nb.GetNode(i)->Value()->SetEvidence(0);
	}

	nb.UpdateBeliefs();
	// output the posterior distribution of the first node
	manager.outputPosteriors(nb);

Wed Oct 05, 2016 12:32 pm

To fully understand the issue I need the following:
1. compilable code (no references to missing classes)
2. input network

A couple of things I noticed while reading the code:
1. There is no error checking; most of the SMILE methods return an integer error code.
2. Iteration over the nodes: you should use DSL_network::GetFirstNode and DSL_network::GetNextNode.
3. I'm not sure why you use std::stringstream with DSL_network::ReadFile and DSL_network::WriteFile.
4. You may consider adding ErrorH.RedirectToFile(stdout); at the top of your main() to ensure you're not missing any error and warning messages at runtime.

Post by **lukasbr** » Fri Oct 07, 2016 12:52 pm

Thank you for your help.

I have sent you the code and the input network in a private message.

Fri Oct 07, 2016 6:06 pm

There are two major problems in your code.

1. When initializing the CPT for the newly added node, you add only one dimension to the DSL_Dmatrix, while it requires two. One dimension is for the parent node (the arc was added between 'Color' and 'Test'), the other is for the new node itself:

Code:

// create CPT for node "Test"
DSL_Dmatrix cpt;
cpt.AddDimension(numStates);
cpt.AddDimension(2); // 2 is the default outcome count for new CPT node

2. Your for loop iterating over the network has an invalid exit check. You're using "i <= nb.GetLastNode()". Don't do that: GetNextNode returns a negative value when the end of the node list is reached. The exit condition is not satisfied. Do this instead when you want to start from the second node and iterate over the rest:

Code:

for (int i = nb.GetNextNode(nb.GetFirstNode()); i >= 0; i = nb.GetNextNode(i)) {
...
}

Finally, a bit of C++ advice, not related directly to SMILE. You should avoid creating unnecessary objects when passing the arguments to functions. Instead of this:

Code:

void outputPosteriors(DSL_network nb) { ... }

do this:

Code:

void outputPosteriors(const DSL_network & nb) { ... }

Passing by reference avoids the object copy/destruction after the function completes. This doesn't

Similarly, instead of this:

Code:

DSL_doubleArray posteriorsClassVariable = dMatrix->GetItems();

do this:

Code:

const DSL_doubleArray & posteriorsClassVariable = dMatrix->GetItems();

Post by **lukasbr** » Mon Oct 10, 2016 9:15 am

Thank you so much! I didn't know how the AddDimension() operation was supposed to work.

BayesFusion Support Forum
