Lauritzen algorithm?

ermutarra
Posts: 17
Joined: Fri Apr 24, 2009 1:19 pm

Lauritzen algorithm?

Post by ermutarra »

Hi,

I have noticed that when I use SMILearn to learn the parameters of my model, I get some probabilities that are not zero but should be!

I've tried to find out how the DSL_ALG_BN_LAURITZEN algorithm works but couldn't find any information about it. Could you please explain?

I suspect that this algorithm incorporates a small-sample correction in all probability estimates such that no probability is ever set to be exactly zero. Is this true?

In any case, do you have any algorithm that uses the straight counts from the data, with no correction, so that if an event is not observed in the data then its probability is 0?
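(For readers following along: the distinction being asked about is straight maximum-likelihood counting versus a pseudocount correction. The sketch below uses made-up (G, T) pairs, not the attached PuppyData.txt, just to show how an unobserved event gets probability exactly 0 under raw counts but not under a Laplace-style correction.)

```python
from collections import Counter

# Toy (G, T) observations, invented for illustration only.
pairs = [(0, 10), (0, 10), (0, 20), (1, 20), (2, 55)]

def cond_prob(pairs, g, t, pseudocount=0.0):
    """Estimate P(T=t | G=g) from counts, with an optional pseudocount per state."""
    t_states = sorted({t_ for _, t_ in pairs})
    counts = Counter(t_ for g_, t_ in pairs if g_ == g)
    total = sum(counts.values()) + pseudocount * len(t_states)
    return (counts[t] + pseudocount) / total

# Straight maximum-likelihood counts: T=55 never occurs with G=0, so the
# estimate is exactly 0.
print(cond_prob(pairs, g=0, t=55))                   # 0.0
# With a Laplace-style pseudocount of 1, no estimate is ever exactly 0.
print(cond_prob(pairs, g=0, t=55, pseudocount=1.0))  # 1/6
```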

Thanks a lot!
Noelia
mark
Posts: 179
Joined: Tue Nov 27, 2007 4:02 pm

Post by mark »

There should be no small sample correction. Is all the data complete? Would you mind sharing your network and data?
ermutarra
Posts: 17
Joined: Fri Apr 24, 2009 1:19 pm

Post by ermutarra »

Yes, the data is complete.
It is a naive Bayes classifier. The "G" node is the parent and the "D", "T", and "B" nodes are the children.
I have attached the data as requested.

Thank you for your help.
Attachments
PuppyData.txt
(92.88 KiB) Downloaded 383 times
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Post by shooltz[BayesFusion] »

ermutarra wrote:Yes, the data is complete.
It is a naive Bayes classifier. The "G" node is the parent and the "D", "T", and "B" nodes are the children.
I just want to clarify that DSL_ALG_BN_LAURITZEN is the identifier you can pass to DSL_network::SetDefaultBNAlgorithm, which in turn controls which inference algorithm is executed when you call DSL_network::UpdateBeliefs. However, UpdateBeliefs is not called by SMILearn's naive Bayes code.

Please note that for SMILearn, 'learn the parameters' means 'run the EM algorithm', at least at this point :) Naive Bayes belongs to the 'structure learning' family of algorithms, even if the learned structure is trivial; there is still code that runs to obtain the probabilities, but that code is not EM.

Please post your model and indicate which probabilities you'd expect to be zero so we can investigate this further.
ermutarra
Posts: 17
Joined: Fri Apr 24, 2009 1:19 pm

Post by ermutarra »

I'm not quite sure what you mean when you ask me to post the model. I'm attaching the xdsl file generated when I learn the naive Bayes network with the data I posted before.

In this model, with this data, I would expect P(T=55|G=0) and P(T=55|G=1) to be zero, as there were no observations of T=55 when G=0 or G=1. These two are just an example; there are many more cases.

I don't quite understand why you say that the algorithm learns the structure if I have already told it that it is a naive Bayes classifier and also told it which node is the parent. What is there to learn about the structure if I have already provided this information?

Cheers!
Attachments
naivebayes.xdsl
(24.21 KiB) Downloaded 420 times
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Post by shooltz[BayesFusion] »

ermutarra wrote:I'm not quite sure what you mean when you ask me to post the model
Model == network.

Did you use DSL_naiveBayes::Learn, or did you create the arcs yourself and then run DSL_em::Learn?
mark
Posts: 179
Joined: Tue Nov 27, 2007 4:02 pm

Post by mark »

I think I found the problem. Instead of just using integers in the data set, prepend them with 'State_' so that the exact state names from the network are matched. That should work.