Marginalize node: GeNIe vs. Smile

The engine.
Post Reply
jonnie
Posts: 41
Joined: Mon Feb 06, 2012 12:49 pm

Marginalize node: GeNIe vs. Smile

Post by jonnie »

There's a new feature in GeNIe that lets you marginalize a node into its child. It removes the node and instead adds the node's parents to the child's parents.
Also, Smile has a method int DSL_network::MarginalizeNode(int theParent, int theChild) that lets you marginalize a node into its child. However, there seem to be some differences (I haven't actually worked with the Smile function so if I'm mistaken please excuse me):
- GeNIe's marginalization works only if the node has only one child
- GeNIe deletes the node
- GeNIe merges the node's parents into the child's parents, which possibly increases the child's size; while Smile's docu sounds like the node's outcomes just get summed out in the child's CPT and the arc gets removed.
Are these points correct? If so, what would be a simple approach to do something like this in Smile? (Maybe I should download the GeNIe source and look it up, instead of asking this question here...)
My impression is that GeNIe's method of marginalizing nodes preserves the JPD of the model (except that the node in question is summed out) while Smile's method doesn't (because it removes dependencies among the surrounding nodes). Is this correct?
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: Marginalize node: GeNIe vs. Smile

Post by shooltz[BayesFusion] »

Also, Smile has a method int DSL_network::MarginalizeNode(int theParent, int theChild) that lets you marginalize a node into its child.
The method you're looking for is int DSL_network::MarginalizeNode(int thisNode, DSL_progress *progress = NULL). This is the code GeNIe calls into.
GeNIe's marginalization works only if the node has only one child
It should work regarless of the child count of the marginalized node.
jonnie
Posts: 41
Joined: Mon Feb 06, 2012 12:49 pm

Re: Marginalize node: GeNIe vs. Smile

Post by jonnie »

It should work regarless of the child count of the marginalized node.
Seems I interpreted the wrong thing into GeNIe's behavior. I was thinking that it's meant to be like that, that you need a single child like in DSL_network::MarginalizeNode(int, int). I've reproduced marginalizing with several children now, and I've found two very similar cases where the only difference between the networks are the CPTs. I attach the example network. The node to marginalize is n0. Some CPT entries contain zeroes. If you set all CPT entries to 0.5, the marginalization works.
Attachments
marginalize.xdsl
marginalize n0 doesn't work
(2.16 KiB) Downloaded 308 times
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: Marginalize node: GeNIe vs. Smile

Post by shooltz[BayesFusion] »

jonnie wrote:I've reproduced marginalizing with several children now, and I've found two very similar cases where the only difference between the networks are the CPTs.
During marginalization, SMILE runs inference to obtain entries for the CPTs of nodes affected by marginalization. When n0 is marginalized, this node set includes n3. In turn, n3's new parents after successful marginalization would be n1 and n2. The n3's CPT entries for (n1=s1, n2=s0) can't be obtained, because n1=s1 implies n2=s1 - you can check it simply by setting evidence in n1. DSL_network::MarginalizeNode will return an error code.

SMILE should emit a meaningful message to the error output - this will be fixed soon.
jonnie
Posts: 41
Joined: Mon Feb 06, 2012 12:49 pm

Re: Marginalize node: GeNIe vs. Smile

Post by jonnie »

Thanks a lot for clarifying. I have a few follow-up questions:

- In order to preserve the JPD, the children of the marginalized node are "bound together" by adding arcs between them. Does this always result in a complete subgraph of the children?

- Is there any way to know which children will become the parents of which other children? I think there are alternative possibilities, and Smile's pick simply depends on its internal ordering of the nodes.
This is interesting because you only should marginalize a node if it's a) auxiliary (neither evidence nor target) and b) the resulting clique sizes will not be bigger than before.
While the actual cliques are determined by the triangulation, I think the family sizes are a good heuristic. So I want to compare the family sizes of the marginalized node and its children before and after the marginalization, and if they got bigger I don't want to perform the operation.
Does this heuristic make sense btw?
For my own intents and purposes it is actually okay to copy the DSL_network, perform marginalization there and look at the result. Still, it's not the most efficient thing to do, plus, a comparison of family sizes might be of interest to the gui user as well...

- Is it true that, if there's any impossible evidence combination on the node's children, then the node cannot be marginalized? In my eyes it depends on which children become the parents of other children. If one combination doesn't work, then maybe another one would? Is Smile checking alternative possibilities there?

And one further bug report: you didn't consider deterministic children either :P Marginalizing a chance node into a deterministic child should turn this deterministic child into a chance node, else it can't hold the resulting CPT. The current implementation puts a chance CPT into the child but it is still deterministic, making it invalid. Manually changing it into a chance node reveals the new CPT.
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: Marginalize node: GeNIe vs. Smile

Post by shooltz[BayesFusion] »

In order to preserve the JPD, the children of the marginalized node are "bound together" by adding arcs between them. Does this always result in a complete subgraph of the children?
That is correct. The node marginalization as currently implemented in SMILE is basically a series of arc reversals (to ensure that node to be marginalized has no children) followed by node deletion. The reversals will create a complete subgraph of the children (assuming you look at the subgraph as undirected).
- Is there any way to know which children will become the parents of which other children? I think there are alternative possibilities, and Smile's pick simply depends on its internal ordering of the nodes.
SMILE picks the order of reversals to ensure there's no cycle in the modified network (using simple one step ahead heuristics). There's no API which exposes the ordering.
- Is it true that, if there's any impossible evidence combination on the node's children, then the node cannot be marginalized? In my eyes it depends on which children become the parents of other children. If one combination doesn't work, then maybe another one would? Is Smile checking alternative possibilities there?
No, the only check performed is to ensure the network remains a DAG after marginalization. In theory it should be possible to examine DAG-yielding orderings until one of them gives the structure which can be used to calculate all CPT entries.
And one further bug report: you didn't consider deterministic children either :P
Yes, we're aware of that :)
Post Reply