About learning

megha · Post by **megha** » Tue Mar 25, 2008 9:20 am

Hello Everyone,
Can anybody please help me determine the CPT for each node? Suppose I have a network with nodes A, B, C and D, with states
A: a1, a2; B: b1, b2; C: c1, c2; D: d1, d2. Suppose A is the parent of B and C, and B & C are the parents of D. Then b1, b2, c1 and c2 will each have two conditional probabilities, depending on the value of A, i.e. a1 or a2.
Now when we calculate the conditional probability for d1, i.e. P(d1|B&C), we have to calculate P(d1|B=b1 & C=c1), P(d1|B=b1 & C=c2), P(d1|B=b2 & C=c1) and P(d1|B=b2 & C=c2), and likewise for d2.
Now, when propagating the probability of B=b1, we actually have two probabilities for b1, P(b1|a1) and P(b1|a2). When propagating to the child, do we need to add these probabilities to get the probability of b1 as a whole?
I know this is a very stupid question, but I still need to clarify it from a good source.

mark · Post by **mark** » Tue Mar 25, 2008 7:18 pm

It's not totally clear to me what you're trying to calculate. Probability P(d1|B,C) is a vector of 4 values, one for each combination of states of B and C. You can find the probabilities for these states in the CPT for D. There is not much else to calculate in this setting. Does this help?

megha · Post by **megha** » Wed Mar 26, 2008 5:15 am

Yes, that's true. But the CPT for B will be P(b1|a1), P(b1|a2), P(b2|a1), P(b2|a2). Since B is a parent of D, when determining the CPT for D we have to take the values of B (b1 & b2) into consideration, and these can be taken from the CPT of B. My problem is this: when we say P(d1|b1&c1), b1 has two values, P(b1|a1) and P(b1|a2). So when we consider b1 while calculating d1, do we have to combine both values, i.e. add P(b1|a1) and P(b1|a2), and then use the sum for further calculation? In that case we would have to add the values in the vector of probabilities for b1 from its CPT.

mark · Post by **mark** » Wed Mar 26, 2008 2:23 pm

When specifying a CPT for a node, only the direct parents of this node are relevant. So the CPT in B only depends on A, and the CPT in D only depends on B and C (and not A). There is not really any calculation involved, since the CPTs are fixed for a given network. Or are you trying to calculate P(D) or something along those lines?

If you want to learn the probabilities in a CPT it basically boils down to counting occurrences of state combinations of a family (child plus its parents) in the data. In case of missing data it gets somewhat more complicated though.
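
The counting described above can be sketched in a few lines of Python. This is only a minimal illustration for complete data (no missing values); the function name `learn_cpt` and the four records are hypothetical and are not taken from GeNIe or from any data set mentioned in this thread.

```python
from collections import Counter

def learn_cpt(records, child, parents):
    """Estimate P(child | parents) by counting complete records.

    records: list of dicts mapping node name -> observed state.
    Returns {parent_state_tuple: {child_state: probability}}.
    """
    family_counts = Counter()  # N(parent states, child state)
    parent_counts = Counter()  # N(parent states)
    for row in records:
        key = tuple(row[p] for p in parents)
        family_counts[(key, row[child])] += 1
        parent_counts[key] += 1

    cpt = {}
    for (key, state), n in family_counts.items():
        # Relative frequency of the child state within this parent configuration
        cpt.setdefault(key, {})[state] = n / parent_counts[key]
    return cpt

# Hypothetical complete records for the family (D, B, C); A is not needed here
records = [
    {"B": "b1", "C": "c1", "D": "d1"},
    {"B": "b1", "C": "c1", "D": "d1"},
    {"B": "b1", "C": "c1", "D": "d2"},
    {"B": "b2", "C": "c1", "D": "d2"},
]
cpt_d = learn_cpt(records, child="D", parents=["B", "C"])
print(cpt_d[("b1", "c1")])
```

Note that only the columns for D and its direct parents are read; the CPTs of B and C never enter the calculation.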

megha · Post by **megha** » Thu Mar 27, 2008 5:47 pm

Hello Mark,
Thanks for your replies. So you mean that when I have to calculate P(d1|b1&c1), I have to calculate the probability of d1, b1 and c1 occurring together, and also the probability of state b1 of node B and the probability of state c1 of node C, since P(d1|b1&c1) = P(d1,b1,c1) / P(b1).P(c1), right? I thought that we would not have to calculate P(b1) & P(c1) again in order to calculate P(d1), as we have already calculated them in the CPTs of B & C, so we could use them directly from those CPTs. That is why I asked whether we have to sum P(b1|a1) & P(b1|a2) to get P(b1).

mark · Post by **mark** » Thu Mar 27, 2008 6:17 pm

P(d1|b1&c1) = P(d1,b1,c1) / P(b1).P(c1)

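This formula is wrong; it should be P(d1|b1&c1) = P(d1,b1,c1) / P(b1,c1). B and C share the parent A, so in general they are not independent and P(b1,c1) is not equal to P(b1).P(c1). So you can't reuse any of the already calculated probabilities in B or C.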
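
To see why the denominator matters, here is a small numeric sketch for the A, B, C part of the network discussed in this thread. All probability values are made up for illustration. It also shows that P(b1) is a sum of P(b1|a) weighted by P(a), not a plain sum of P(b1|a1) and P(b1|a2).

```python
# Hypothetical CPTs for the network A -> B, A -> C (numbers are invented)
p_a = {"a1": 0.3, "a2": 0.7}
p_b_given_a = {"a1": {"b1": 0.9, "b2": 0.1}, "a2": {"b1": 0.2, "b2": 0.8}}
p_c_given_a = {"a1": {"c1": 0.8, "c2": 0.2}, "a2": {"c1": 0.1, "c2": 0.9}}

# Marginals: weight each conditional probability by P(a), then sum over a
p_b1 = sum(p_a[a] * p_b_given_a[a]["b1"] for a in p_a)
p_c1 = sum(p_a[a] * p_c_given_a[a]["c1"] for a in p_a)

# Joint of B and C: they are independent only once A is known,
# so the product is taken inside the sum over a
p_b1_c1 = sum(p_a[a] * p_b_given_a[a]["b1"] * p_c_given_a[a]["c1"] for a in p_a)

print("P(b1) * P(c1) =", p_b1 * p_c1)
print("P(b1, c1)     =", p_b1_c1)
```

With these numbers the two printed values differ, which is why dividing by P(b1).P(c1) gives wrong results.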

megha · Post by **megha** » Thu Mar 27, 2008 6:49 pm

OK, so this is where I was going wrong and getting all my calculations wrong. Anyway, thank you so much for your guidance, Mark.

megha · Post by **megha** » Thu Mar 27, 2008 6:59 pm

Hi,
You said that probability calculation becomes tough when there are missing values in the data. Can we fill in those missing values using some technique? For example, fill a missing entry with the value that occurs most often together with the value in the adjacent column, or something like that. There is one more situation that may create a problem, i.e. when a column has continuous data, because how can a CPT be prepared for such a large number of states?

mark · Post by **mark** » Thu Mar 27, 2008 7:22 pm

You might want to take a look at the expectation-maximization (EM) algorithm (http://en.wikipedia.org/wiki/Expectatio ... _algorithm). Your intuition is roughly correct: EM fills in the missing values using the current parameter estimates (strictly speaking with expected counts over the possible states rather than a single most likely value) and then updates the CPTs as if it were complete data. Then it iterates until convergence.
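
As a rough illustration of that loop, here is a minimal EM sketch for the smallest interesting case: a two-node network A -> B where A is sometimes missing (`None`) and B is always observed. The function name `em_two_node` and the records are hypothetical, and the sketch assumes every state of A is observed at least once; it is not how GeNIe or any particular library implements EM.

```python
def em_two_node(records, states_a, states_b, iterations=50):
    """EM for A -> B. records: list of (a, b) pairs, where a may be None."""
    # Start from uniform parameters
    p_a = {a: 1.0 / len(states_a) for a in states_a}
    p_b_given_a = {a: {b: 1.0 / len(states_b) for b in states_b} for a in states_a}

    for _ in range(iterations):
        # E-step: accumulate expected (fractional) counts
        n_a = {a: 0.0 for a in states_a}
        n_ab = {a: {b: 0.0 for b in states_b} for a in states_a}
        for a_obs, b in records:
            if a_obs is not None:
                # Observed value: the whole record counts for that state
                weights = {a: 1.0 if a == a_obs else 0.0 for a in states_a}
            else:
                # Missing value: split the record according to P(a | b)
                joint = {a: p_a[a] * p_b_given_a[a][b] for a in states_a}
                total = sum(joint.values())
                weights = {a: joint[a] / total for a in states_a}
            for a in states_a:
                n_a[a] += weights[a]
                n_ab[a][b] += weights[a]

        # M-step: re-estimate the CPTs from the expected counts
        n = sum(n_a.values())
        p_a = {a: n_a[a] / n for a in states_a}
        p_b_given_a = {
            a: {b: n_ab[a][b] / n_a[a] for b in states_b} for a in states_a
        }
    return p_a, p_b_given_a

# Hypothetical data: two records have a missing value for A
records = [("a1", "b1"), ("a1", "b1"), ("a2", "b2"), ("a2", "b2"),
           (None, "b1"), (None, "b2")]
p_a, p_b_given_a = em_two_node(records, ["a1", "a2"], ["b1", "b2"])
print(p_a)
print(p_b_given_a)
```

A fixed number of iterations is used here for brevity; a real implementation would normally stop when the likelihood or the parameters stop changing.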

If you have continuous data, one approach you could take is to discretize it.
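
One simple way to discretize is equal-width binning, sketched below. The helper name `equal_width_bins`, the `s1`, `s2`, ... labels and the sample values are made up for illustration; other schemes (equal-frequency bins, expert-chosen thresholds) are just as valid.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to one of n_bins equal-width state labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything falls in one bin
        else:
            # The maximum value would land in bin n_bins, so clamp it
            idx = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"s{idx + 1}")
    return labels

# Hypothetical continuous column
values = [10, 20, 35, 50, 90, 100]
labels = equal_width_bins(values, n_bins=3)
print(labels)
```

After this step the column has only `n_bins` states, so its CPT stays small.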

megha · Post by **megha** » Fri Mar 28, 2008 6:50 am

Hi,
As you said earlier, P(d1|b1&c1) = P(d1,b1,c1)/P(b1,c1). If we use this formula to calculate the probability for node tstsc, which has parent spend (please refer to the learning example with retention_discretized.txt), then P(s1_below_52|s3_28052_33376) = P(s1_below_52, s3_28052_33376) / P(s3_28052_33376), and the answer should be "0", since P(s1_below_52, s3_28052_33376) is "0" because this combination never occurs in the table. Likewise, P(s4_78_up|s3_28052_33376) should be "1", because all combinations of 33376 are with s2_52_69 only. But the values calculated in the CPT in GeNIe are "0.16667" and "0.5" respectively. How is that?

megha · Post by **megha** » Fri Mar 28, 2008 10:08 am

Hi,
Also, while calculating the CPT for node rejr in the same example (retention_discretized.txt), the probability comes out as 'infinity' for many states of rejr. The probability of many combinations of parent values is '0', so dividing the probability of the child-parent combination by these zero probabilities yields 'infinity'. But the answers shown in GeNIe are different, i.e. equal probabilities are assigned to many of the combinations. What is the idea behind it?

mark · Post by **mark** » Fri Mar 28, 2008 3:30 pm

Good points and I don't really know at the moment how those parameters are estimated. But what you could do is apply EM (parameter learning) after learning the structure and then it will give you the expected results.
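
One common technique that produces this kind of behaviour is additive (Laplace) smoothing, where a small pseudo-count is added to every state before normalizing. Whether GeNIe does this is an assumption on my part and is not confirmed anywhere in this thread; the counts and state labels below are hypothetical.

```python
def smoothed_distribution(counts, alpha=1.0):
    """Additive smoothing: add alpha to every count, then normalize.

    counts: dict mapping child state -> count for ONE parent configuration.
    """
    total = sum(counts.values()) + alpha * len(counts)
    return {state: (n + alpha) / total for state, n in counts.items()}

# Hypothetical: parent configuration seen twice, always with child state s2
seen = smoothed_distribution({"s1": 0, "s2": 2, "s3": 0, "s4": 0})

# Hypothetical: parent configuration never seen in the data at all
unseen = smoothed_distribution({"s1": 0, "s2": 0, "s3": 0, "s4": 0})

print(seen)
print(unseen)
```

With smoothing, a parent configuration that never occurs gets a uniform distribution instead of a division by zero, and combinations that never occur get a small non-zero probability instead of 0. Under the made-up counts above, the first case gives 1/6 and 1/2, which happens to match the values quoted earlier, but that is only a guess at the cause.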

megha · Post by **megha** » Fri Mar 28, 2008 6:05 pm

Hi,
But can you please help me understand these things? I have been trying to do this for many days but cannot get my calculations right. Somebody told me to use the parents' CPTs to calculate the child's CPT, and it has really confused me, because from a programming perspective it would also be far too complicated to use CPTs for calculating other CPTs.

mark · Post by **mark** » Fri Mar 28, 2008 6:21 pm

Like I said, apply EM and you'll see that your calculations are correct.

megha · Post by **megha** » Fri Mar 28, 2008 6:39 pm

OK, one more query just to double-check myself: when we say P(d1|b1&c1) = P(d1,b1,c1)/P(b1,c1), we simply need to count the number of occurrences of d1, b1 and c1 appearing together divided by the total number of records, and likewise the number of occurrences of b1 & c1 together divided by the total number of records, right?
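
For complete data, that reading can be written out as follows. Here N is the total number of records and N(.) is the number of records matching the given states; both symbols are introduced only for this note. The estimate is the plain relative-frequency (maximum likelihood) one, without any smoothing.

```latex
P(d_1 \mid b_1, c_1)
  = \frac{P(d_1, b_1, c_1)}{P(b_1, c_1)}
  \approx \frac{N(d_1, b_1, c_1) / N}{N(b_1, c_1) / N}
  = \frac{N(d_1, b_1, c_1)}{N(b_1, c_1)}
```

The total N cancels, so it is enough to divide the two counts directly. The ratio is undefined when N(b1, c1) = 0, i.e. when that parent combination never occurs in the data.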