EM algorithm details

Post by **Ant_B** » Fri Mar 21, 2014 4:14 am

Hi Genie/Smile community,

I recently started to experiment with SMILE via JSmile. Following the documentation, I was able to get up and running - creating a network, reading in a simple data file that includes missing data, and performing EM on the dataset. The simple project is on my GitHub page (improvements/fixes very welcome):
https://github.com/amb-enthusiast/BayesianHack

I also repeated this exercise with SamIam's inflib.jar library and Mallet's GRMM library (part of the above GitHub project). When comparing results, I noticed a discrepancy: my by-hand calculations, GRMM and SamIam results all matched up with the tutorial notes. However, JSmile/SMILE gave different results. My hunch is that the differences are due to different initial estimated CPT values, but it may be something else.

I tried to look up details of the EM implementation, but couldn't find any in this forum or in the site documentation.

Having experimented with JSmile a little, I have a few questions about the SMILE EM implementation:

What is the default behaviour for JSmile/SMILE EM? In particular, what initial parameter estimate is used?
Is it possible to override these defaults, and supply initial parameters for the target BN to the EM algorithm?
Is it possible to set thresholds (on log-likelihood difference or max/target number of iterations) for EM execution?

I'd be grateful for any advice, information or guidance on these questions.

Thanks in advance,

Ant

Fri Mar 21, 2014 5:45 pm

What is the default behaviour for JSmile/SMILE EM? In particular, what initial parameter estimate is used?

You can randomize the parameters, uniformize them, or use the existing parameters as initial values. EM.setRandomizeParameters and EM.setUniformizeParameters control this behavior. Currently, randomize defaults to true and uniformize defaults to false (but I suggest calling the setters anyway to express your intent explicitly).

See also EM.get/setEqSampleSize, which defaults to 1.

Is it possible to set thresholds (on log-likelihood difference or max/target number of iterations) for EM execution?

No, this is not controlled through the API.
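For anyone comparing EM output across libraries, here is a minimal, self-contained Java sketch of why the starting parameters and the stopping rule matter. It is not SMILE's implementation and does not use the JSmile API. It runs plain EM on a toy two-node network A -> B (both binary) where A is sometimes missing. The class name `EmInitSketch`, the parameter layout, the data, and the `maxIter`/`tol` arguments are all assumptions made for illustration. The three starting points loosely mirror the "uniformize", "randomize" and "use existing parameters" options described above.

```java
import java.util.Arrays;
import java.util.Random;

/** Toy EM for a two-node network A -> B (both binary); A may be missing. Not SMILE code. */
final class EmInitSketch {
    static final int MISSING = -1;

    // Hypothetical parameter layout: p[0] = P(A=1), p[1] = P(B=1|A=0), p[2] = P(B=1|A=1).
    /** Joint probability of (A=a, B=b) under parameters p. */
    static double joint(double[] p, int a, int b) {
        double pa = (a == 1) ? p[0] : 1 - p[0];
        double pb1 = p[1 + a];
        return pa * (b == 1 ? pb1 : 1 - pb1);
    }

    /** Log-likelihood of the observed data; a missing A is summed out. */
    static double logLikelihood(int[][] data, double[] p) {
        double ll = 0.0;
        for (int[] r : data) {
            double l = (r[0] == MISSING)
                    ? joint(p, 0, r[1]) + joint(p, 1, r[1])
                    : joint(p, r[0], r[1]);
            ll += Math.log(l);
        }
        return ll;
    }

    /** One EM iteration: expected counts (E-step), then re-estimation (M-step). */
    static double[] emStep(int[][] data, double[] p) {
        double n0 = 0, n1 = 0, b0 = 0, b1 = 0;
        for (int[] r : data) {
            double w = r[0]; // weight of A=1; exact (0 or 1) when A is observed
            if (r[0] == MISSING) {
                double l1 = joint(p, 1, r[1]);
                w = l1 / (joint(p, 0, r[1]) + l1); // posterior P(A=1 | B)
            }
            n1 += w;
            n0 += 1 - w;
            if (r[1] == 1) { b1 += w; b0 += 1 - w; }
        }
        // No smoothing or priors here; keep the old value if a state got no weight.
        return new double[] {
            n1 / data.length,
            n0 > 0 ? b0 / n0 : p[1],
            n1 > 0 ? b1 / n1 : p[2]
        };
    }

    /** Runs EM from init until the log-likelihood gain drops below tol or maxIter is hit. */
    static double[] fit(int[][] data, double[] init, int maxIter, double tol) {
        double[] p = init.clone();
        double prev = logLikelihood(data, p);
        for (int i = 0; i < maxIter; i++) {
            p = emStep(data, p);
            double ll = logLikelihood(data, p);
            if (ll - prev < tol) break;
            prev = ll;
        }
        return p;
    }

    public static void main(String[] args) {
        // Made-up records {A, B}; MISSING marks an unobserved A.
        int[][] data = {
            {1, 1}, {1, 1}, {0, 0}, {MISSING, 1},
            {MISSING, 0}, {0, 1}, {1, 0}, {MISSING, 1}
        };
        Random rng = new Random(42);
        double[] uniform = {0.5, 0.5, 0.5};
        double[] random = {
            0.05 + 0.9 * rng.nextDouble(),
            0.05 + 0.9 * rng.nextDouble(),
            0.05 + 0.9 * rng.nextDouble()
        };
        double[] existing = {0.8, 0.1, 0.9}; // stands in for parameters already in a network

        for (double[] init : new double[][] {uniform, random, existing}) {
            double[] p = fit(data, init, 100, 1e-9);
            System.out.println(Arrays.toString(init) + " -> " + Arrays.toString(p)
                    + "  logLik=" + logLikelihood(data, p));
        }
    }
}
```

With fully observed data every start gives the same counts-based estimate after one step. With missing values, the estimates can differ between runs if the likelihood has several local maxima or if the runs stop at different points, so a fair comparison between libraries needs the same starting parameters and a comparable stopping rule.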

BayesFusion Support Forum
