problems with parameter learning with EM

ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

problems with parameter learning with EM

Post by ninio »

I'm trying to learn a network with several hidden variables and some missing data from real-life data. When I remove the hidden variables and learn a Naive Bayes network, learning works fine. But when I try to learn with the hidden variables, I get the following message:
em: not all nodes are updated (deterministic cpts?)

and either nothing happens or all/most of my CPTs are zeroed.

Any idea what the problem may be?
Thanks!
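
On the "deterministic cpts?" hint in that message: one general property of EM (not necessarily the cause in this thread) is that a CPT entry that starts at exactly 0 can never move away from 0, because the expected counts that would update it are weighted by that same zero. A minimal sketch, assuming a toy network with one hidden binary node H and one observed binary child X. The function `em_step` and the data are made up for illustration and say nothing about SMILE's internals:

```python
def em_step(prior_h, p_x_given_h, data):
    """One EM iteration for a hidden binary H with one observed binary child X.

    prior_h[h]        = P(H=h)
    p_x_given_h[h][x] = P(X=x | H=h)
    data              = list of observed X values (0 or 1)
    """
    counts_h = [0.0, 0.0]
    counts_hx = [[0.0, 0.0], [0.0, 0.0]]
    for x in data:
        # E-step: posterior P(H=h | X=x) under the current parameters.
        joint = [prior_h[h] * p_x_given_h[h][x] for h in (0, 1)]
        total = joint[0] + joint[1]
        for h in (0, 1):
            resp = joint[h] / total
            counts_h[h] += resp
            counts_hx[h][x] += resp
    # M-step: re-estimate parameters from the expected counts.
    new_prior = [c / len(data) for c in counts_h]
    new_cpt = [[counts_hx[h][x] / counts_h[h] for x in (0, 1)] for h in (0, 1)]
    return new_prior, new_cpt


# Hypothetical starting point: P(X=1 | H=0) is exactly 0 (a deterministic column).
prior = [0.5, 0.5]
cpt = [[1.0, 0.0], [0.3, 0.7]]
data = [0, 1, 1, 0, 1]

for _ in range(20):
    prior, cpt = em_step(prior, cpt, data)

print(cpt[0][1])  # the entry that started at 0
```

Starting from softened values such as 0.999 and 0.001 instead of 1 and 0 leaves EM room to adjust the entry.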
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

some more information

Post by ninio »

I'm using 2.0.3259.0, and the program tends to crash after several iterations of training.
mark
Posts: 179
Joined: Tue Nov 27, 2007 4:02 pm

Post by mark »

It should never crash. Could you share your network and data set so that I can reproduce the crash? The error message you were getting could be due to insufficient records. How many were you using?
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: some more information

Post by shooltz[BayesFusion] »

ninio wrote: I'm using 2.0.3259.0, and the program tends to crash after several iterations of training.
Did you try the most recent build (2.0.3393.0)?
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

some more observations

Post by ninio »

I updated to the latest version, and still get this.
In addition, I found that GeNIe has a problem when the network loaded from an xdsl file has a CPT column whose probabilities sum to more than one. For example, on a binary variable, if I set the probabilities to 1 and 1e-5, then run parameter learning and try to open the node properties, the program dies.

Worse, if you set the probabilities to 1 and 1e-9, GeNIe will not complain, but the EM parameter learning will fail with the same error message.


My current suspicion is that during the EM process, a rounding error leads to a sum that is not exactly one, and this leads to the problem.
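
To illustrate the kind of rounding involved, here is a minimal sketch. The helper `sums_to_one` and its tolerance of 1e-6 are hypothetical and are not GeNIe's actual check. It only shows that floating-point sums need not equal one exactly, and that a tolerance-based test could reject a column of 1 and 1e-5 while letting 1 and 1e-9 through, which would be consistent with the behavior described above but is not confirmed:

```python
import math


def sums_to_one(column, tol=1e-6):
    """Return True if the column sums to 1 within an absolute tolerance.

    tol is an assumed value chosen for illustration only.
    """
    return abs(math.fsum(column) - 1.0) <= tol


# Ten copies of 0.1 do not add up to exactly 1.0 with naive summation.
naive_is_exact = sum([0.1] * 10) == 1.0
print(naive_is_exact)

# A tolerance-based check accepts that column anyway.
print(sums_to_one([0.1] * 10))

# The two columns from the posts above, under the assumed tolerance.
print(sums_to_one([1.0, 1e-5]))
print(sums_to_one([1.0, 1e-9]))
```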

Attached is a small network that displays the problem. The column for s2 in "problem" has 1e-009 and 1, so learning this network fails. If you open the definition tab for the node, mark the s2 column and press normalize ("\Sigma=1") and OK, you can train the network without a problem.
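
The same workaround can be applied outside the GUI before the values go into the network. A minimal sketch, assuming the normalize button rescales the column proportionally so that it sums to one (not confirmed in this thread). The function `normalize_column` is a hypothetical helper, not part of GeNIe or SMILE:

```python
import math


def normalize_column(column):
    """Rescale non-negative entries proportionally so they sum to 1."""
    if any(p < 0 for p in column):
        raise ValueError("probabilities must be non-negative")
    total = math.fsum(column)
    if total <= 0:
        raise ValueError("column must contain at least one positive entry")
    return [p / total for p in column]


# The s2 column of node "problem" as described above.
s2 = normalize_column([1e-9, 1.0])
print(s2, math.fsum(s2))
```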
Attachments
Network3test.xdsl: the network (2.33 KiB)
Network2.txt: the data file (8.13 KiB)
mark
Posts: 179
Joined: Tue Nov 27, 2007 4:02 pm

Post by mark »

Are you randomizing the initial parameters?
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

not randomizing the parameters

Post by ninio »

I have a good idea what I want, and randomizing will give my hidden variables some random meaning, which will probably not match what I want.

While on the subject: in what units is the "Confidence" setting? I used 1 to get the results above.

I will mention that on my large network (a dozen hidden variables and about twice as many observed ones) I get the same error message even if I do randomize (but the time it takes to reach this state depends on the randomization). I will also mention that after a couple of dozen training sessions GeNIe tends to become unstable, and I see problems in image rendering and a significant tendency to crash, mostly when I try to close the program.
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: not randomizing the parameters

Post by shooltz[BayesFusion] »

ninio wrote: I will also mention that after a couple of dozen training sessions GeNIe tends to become unstable, and I see problems in image rendering and a significant tendency to crash, mostly when I try to close the program.
Does this happen with the small model you attached in one of your previous posts, or is the large network required to crash the program?
mark
Posts: 179
Joined: Tue Nov 27, 2007 4:02 pm

Post by mark »

I found the cause and solution of the 'em: not all nodes are updated (deterministic cpts?)' error message. This should be fixed in the next GeNIe/SMILE release.
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

glad to hear

Post by ninio »

Is there a time frame for the next release? Is it possible to get a preview?

I'm currently stuck with my project, being unable to train my network.


As for the crashes - the software will crash when the input network has a node whose distribution sums to more than one and this network is sent to the learn-parameters function. Learning will fail, but once the node is opened for editing, the program will crash.
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: glad to hear

Post by shooltz[BayesFusion] »

ninio wrote: Is there a time frame for the next release? Is it possible to get a preview?
We'll be testing the modified EM for a couple of days. If it works OK we'll release GeNIe, or at least notify you and provide a link to a preview version.
ninio wrote: As for the crashes - the software will crash when the input network has a node whose distribution sums to more than one and this network is sent to the learn-parameters function. Learning will fail, but once the node is opened for editing, the program will crash.
I can't reproduce that. Using the most recent public GeNIe build (the one which does not yet have our recent EM fix), I ran EM on your model. EM returned an error message as expected, and I can inspect all four CPTs in GeNIe without problems. This does not change when I run EM multiple times. Next I modified the network to have a sum > 1 in node 'observed'. EM still returns an error, but I can open the CPTs without a crash.
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

I can't reproduce it now

Post by ninio »

I had a set where this happened consistently, but I saved over the file. I will try to see if I can find a copy.

Thanks for all your help!
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

Re: I can't reproduce it now

Post by ninio »

ninio wrote:I had a set where this happened consistently, but I saved over the file. I will try to see if I can find a copy.

Thanks for all your help!
Come to think of it, the crashing may have happened on a slightly out-of-date version (2.0.3259.0) that I had been using until recently. It may still be worth checking, as I did see problems when working with larger networks even with the new code, but those I cannot reproduce so easily.

Any news on the updated code?
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: I can't reproduce it now

Post by shooltz[BayesFusion] »

ninio wrote:Any news on the updated code?
Send me a private message.
ninio
Posts: 9
Joined: Sun May 17, 2009 12:22 pm
Location: Israel

Re: I can't reproduce it now

Post by ninio »

shooltz wrote:
ninio wrote:Any news on the updated code?
Send me a private message.

Thanks, looks like it works fine. It's great to see such support!