Problem about learning parameters

JohnYu · Post by **JohnYu** » Fri May 16, 2008 3:50 am

Hi,
I have just started using jSMILE and have a question about learning parameters with EM.

My BN has 3 nodes: "a", "b" and "c".
There are two arcs, "a" -> "c" and "b" -> "c".
"a" has 3 states, "a0", "a1", "a2".
"b" has 5 states, "b0", "b1", "b2", "b3", "b4".
"c" has 2 states "c0", "c1".

The following is the training data:
c, a, b
c1, a2, b1
c0, a1, b4
c0, a0, b2
c1, a2, b4
c1, a1, b4

Now I want to know P(c=c1 | a=a1, b=b2).
As I understand it, this can be estimated from the training data as:
P(c=c1, a=a1, b=b2) / P(a=a1, b=b2)

But in the training data both P(c=c1, a=a1, b=b2)=0 and P(a=a1, b=b2)=0.
So I think the value P(c=c1 | a=a1, b=b2) is undefined (0/0), i.e. unknown.
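
The counting argument above can be checked with a small sketch. This is plain Python, not jSMILE code; the helper name `empirical_conditional` is made up for illustration, and the rows are copied from the training data in this post.

```python
from collections import Counter

# Training rows copied from the post, in (c, a, b) column order.
rows = [
    ("c1", "a2", "b1"),
    ("c0", "a1", "b4"),
    ("c0", "a0", "b2"),
    ("c1", "a2", "b4"),
    ("c1", "a1", "b4"),
]

joint_counts = Counter(rows)                          # N(c, a, b)
parent_counts = Counter((a, b) for _, a, b in rows)   # N(a, b)


def empirical_conditional(c, a, b):
    """Return N(c, a, b) / N(a, b), or None if (a, b) never occurs in the data."""
    denom = parent_counts[(a, b)]
    if denom == 0:
        return None  # 0/0: the data says nothing about this parent configuration
    return joint_counts[(c, a, b)] / denom


print(empirical_conditional("c1", "a1", "b2"))  # parent configuration never observed
print(empirical_conditional("c1", "a1", "b4"))  # parent configuration observed twice
```

The first call has no data to work from, so the sketch returns `None` rather than a number; the second uses the two rows with a=a1, b=b4.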

However, when I use EM to learn the parameters of the BN from this training data and then infer P(c=c1 | a=a1, b=b2), I do get a value, and it is different on every training run (with the same training data).

My question is: if my reasoning is correct, I should not get a value at all.
Can anyone please tell me why a value is produced, and why it differs on every training run?
Or is my reasoning wrong?

Thanks.

JohnYu · Post by **JohnYu** » Fri May 16, 2008 4:29 am

Hi,
I found the reason why it is different on every training run:
I had set the "randomize parameters" option to true before learning.
Maybe it was a stupid question...
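
A toy illustration of why randomized starting parameters would show up in exactly this entry. This is not jSMILE code and makes no claim about how SMILE implements EM; it assumes complete data, where fitting reduces to counting, and a hypothetical learner `learn_c1_given_ab` that only overwrites entries for which it has data.

```python
import random
from collections import Counter
from itertools import product

A_STATES = ["a0", "a1", "a2"]
B_STATES = ["b0", "b1", "b2", "b3", "b4"]  # the five states listed in the post

# Training rows copied from the post, in (c, a, b) column order.
rows = [
    ("c1", "a2", "b1"),
    ("c0", "a1", "b4"),
    ("c0", "a0", "b2"),
    ("c1", "a2", "b4"),
    ("c1", "a1", "b4"),
]


def learn_c1_given_ab(rows, seed):
    """Toy learner: returns P(c=c1 | a, b) for every parent configuration."""
    rng = random.Random(seed)
    joint = Counter(rows)
    cpt = {}
    for a, b in product(A_STATES, B_STATES):
        p = rng.random()  # randomized starting value for this entry
        n = joint[("c0", a, b)] + joint[("c1", a, b)]
        if n > 0:
            p = joint[("c1", a, b)] / n  # data available: overwrite with the count ratio
        cpt[(a, b)] = p  # no data: the random starting value survives
    return cpt


run1 = learn_c1_given_ab(rows, seed=0)
run2 = learn_c1_given_ab(rows, seed=1)
print(run1[("a1", "b4")], run2[("a1", "b4")])  # seen configuration: same in both runs
print(run1[("a1", "b2")], run2[("a1", "b2")])  # unseen configuration: depends on the seed
```

Under these assumptions, entries backed by data agree across runs, while the entry for a=a1, b=b2 simply reflects whatever it was initialized to.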

JohnYu · Post by **JohnYu** » Wed May 21, 2008 4:32 am

Hi,
My question is: if the training set contains no cases for some condition, e.g. the counts behind P(c=c1, a=a1, b=b2) and P(a=a1, b=b2) are both zero, then we cannot estimate P(c=c1 | a=a1, b=b2) in the conditional probability table of variable "c". How should this be handled?

mark · Post by **mark** » Wed May 21, 2008 7:23 pm

There is not much you can do when you don't have the data. In this case, keeping the distribution the network had before learning, or assuming a uniform distribution, may be reasonable.
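
The two fallbacks mentioned above could look roughly like this. It is a plain-Python sketch, not jSMILE code; `estimate_cpt` and its `prior` argument are hypothetical names, and the data is assumed to be complete so that estimation is just counting.

```python
from collections import Counter
from itertools import product

A_STATES = ["a0", "a1", "a2"]
B_STATES = ["b0", "b1", "b2", "b3", "b4"]  # the five states listed in the thread
C_STATES = ["c0", "c1"]

# Training rows copied from the first post, in (c, a, b) column order.
rows = [
    ("c1", "a2", "b1"),
    ("c0", "a1", "b4"),
    ("c0", "a0", "b2"),
    ("c1", "a2", "b4"),
    ("c1", "a1", "b4"),
]


def estimate_cpt(rows, prior=None):
    """Estimate P(c | a, b) by counting, with a fallback for unseen (a, b).

    prior: optional dict mapping (a, b) -> {c: probability}, standing in for
    the distribution the network had before learning. If it is None, unseen
    parent configurations get a uniform distribution over c.
    """
    joint = Counter(rows)
    cpt = {}
    for a, b in product(A_STATES, B_STATES):
        total = sum(joint[(c, a, b)] for c in C_STATES)
        if total > 0:
            cpt[(a, b)] = {c: joint[(c, a, b)] / total for c in C_STATES}
        elif prior is not None:
            cpt[(a, b)] = dict(prior[(a, b)])  # keep the pre-learning column
        else:
            cpt[(a, b)] = {c: 1 / len(C_STATES) for c in C_STATES}  # uniform
    return cpt


cpt = estimate_cpt(rows)
print(cpt[("a1", "b2")])  # unseen configuration: uniform fallback
print(cpt[("a2", "b1")])  # seen configuration: estimated from counts
```

Passing a `prior` table keeps the pre-learning column for a=a1, b=b2 instead of the uniform one; which fallback is more sensible depends on how much that prior is trusted.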