Validation process: model's result should be probability and not exact value (?)

danielhombing
Posts: 17
Joined: Sat Jun 10, 2017 6:53 pm

Validation process: model's result should be probability and not exact value (?)

Post by danielhombing »

Hello,

I have a question regarding the validation method.
I have two states in my outcome nodes, "yes" and "no".
My understanding is that a BN gives you a probability distribution over the states of the outcome node. But the validation method reports accuracy, i.e., whether the model can predict/distinguish "yes" and "no" correctly. How does GeNIe do that, given that the model's result is a probability of "yes" and of "no", not the exact value "yes" or "no"? I mean, the model's result would be, for example, 75% probability of "yes" and 25% of "no", not 100% "yes" and 0% "no".
Or maybe GeNIe categorizes probabilities above 50% as "yes" and below that as "no"? How does this work in validation?

Can you explain, please? Thank you very much.

Daniel

*I have other questions in another topic. Could you respond to those questions as well, please? Thanks!
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: Validation process: model's result should be probability and not exact value (?)

Post by shooltz[BayesFusion] »

danielhombing wrote:
> Or maybe Genie will categorize probability above 50% to "yes" and below that value to "no"? How does it works in validation?
That's correct. GeNIe selects the outcome with the highest probability as the predicted outcome of the class node(s) during validation. For a binary class node, this means a probability above 50%.
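The selection rule described above can be sketched in a few lines (a hypothetical illustration of the idea, not GeNIe's actual code):

```python
# Sketch of the validation rule: pick the outcome with the highest
# posterior probability as the predicted class. Not GeNIe's code.
def predict(posterior):
    """posterior: dict mapping outcome name -> probability."""
    return max(posterior, key=posterior.get)

print(predict({"yes": 0.75, "no": 0.25}))          # yes
print(predict({"low": 0.2, "mid": 0.5, "high": 0.3}))  # mid
```

With more than two outcomes there is no fixed 50% threshold; the highest-probability outcome simply wins.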
danielhombing
Posts: 17
Joined: Sat Jun 10, 2017 6:53 pm

Re: Validation process: model's result should be probability and not exact value (?)

Post by danielhombing »

Dear Shooltz,

Thanks for your answer.
Could you help me to answer these questions from my previous conversation please?

1. When performing parameter learning, we get a log(p) value. Do I need to consider this value when evaluating my model's performance? I see that closer to zero is better. My log(p) value is -1084380.522721, which is very far from zero.
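For context on this question: log(p) is the sum of the log-probabilities the model assigns to the individual data records, so its magnitude grows with the number of records, and a large negative value is expected for a large dataset. A toy sketch with hypothetical per-record probabilities:

```python
import math

# log(P(data)) = sum of per-record log-probabilities, so its magnitude
# scales with dataset size; the per-record average is easier to compare.
record_probs = [0.8, 0.6, 0.9, 0.7]  # hypothetical model likelihoods
log_p = sum(math.log(p) for p in record_probs)
avg_log_p = log_p / len(record_probs)

# 100x more records with the same fit -> roughly 100x larger |log(p)|
big_log_p = sum(math.log(p) for p in record_probs * 100)
```

Because of this scaling, comparing log(p) values is only meaningful between models fitted to the same dataset.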

2. Is it possible to get an illogical CPT after parameter learning but still a good result in validation? If that is a kind of trade-off, which one should I prioritize and which should I sacrifice, in your opinion?

3. I have no data on an intermediate node, and when I do parameter learning, the EM algorithm fills in the CPT of this node. I am curious how EM can obtain these values, because logically EM does not know the relationship/effect of each state of the parent nodes on the child node.

For example, consider this network: housing condition (states: finished, natural) -> wealth condition (rich, middle, poor) -> buy product (yes, no).
Let's say I have data on "housing condition" and "buy product" but not on "wealth condition". Logically, if the housing condition is natural, then the wealth condition is poor and they buy the product. The EM algorithm will fill in the probabilities for "wealth condition", but how can EM find this "logical" relationship? I have tried to find the answer in the literature, but this part is not explained: the literature only says that EM can learn from missing values, whereas in this case all values in one node are missing.
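To illustrate how EM can fill in a CPT for a node that is never observed: EM does not discover the "logical" relationship. It chooses parameters for the hidden node that make the observed (housing, buy) pairs most likely, and the labels of the hidden states are arbitrary. A self-contained sketch on the hypothetical network above, with made-up data (not GeNIe's implementation):

```python
import random

# Toy EM sketch for the chain housing -> wealth -> buy, where the
# "wealth" node is never observed. Hypothetical data; not GeNIe's code.
H, W, B = 2, 3, 2  # state counts for housing, wealth, buy
random.seed(0)

def normalize(row):
    total = sum(row)
    return [x / total for x in row]

# random initial CPTs (cf. "randomize initial parameters")
p_w_given_h = [normalize([random.random() for _ in range(W)]) for _ in range(H)]
p_b_given_w = [normalize([random.random() for _ in range(B)]) for _ in range(W)]

# observed records (housing, buy); wealth is missing in every record
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 5 + [(1, 1)] * 45

for _ in range(100):
    cw = [[1e-6] * W for _ in range(H)]  # expected counts for P(W|H)
    cb = [[1e-6] * B for _ in range(W)]  # expected counts for P(B|W)
    for h, b in data:
        # E-step: posterior over the hidden wealth state for this record
        post = normalize([p_w_given_h[h][w] * p_b_given_w[w][b] for w in range(W)])
        for w in range(W):
            cw[h][w] += post[w]
            cb[w][b] += post[w]
    # M-step: re-estimate both CPTs from the expected counts
    p_w_given_h = [normalize(row) for row in cw]
    p_b_given_w = [normalize(row) for row in cb]

# The learned CPTs reproduce the observed association between housing and
# buying, even though no value of "wealth" was ever seen.
pb1_h0 = sum(p_w_given_h[0][w] * p_b_given_w[w][1] for w in range(W))
pb1_h1 = sum(p_w_given_h[1][w] * p_b_given_w[w][1] for w in range(W))
```

EM only maximizes the likelihood of the observed columns; which hidden state ends up playing the role of "poor" depends on the initialization, which is also one reason different initializations (question 5) can give different results.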

4. Can you explain more about the "randomize initial parameters" and "uniformize" options in parameter learning? What I know is that you choose these options to wipe out the current/default parameters. Which one should I use?

5. I also found that the validation result differs depending on whether I tick "randomize initial parameters" or "uniformize"; randomizing the initial parameters gives a better result. Can you explain this as well, please?
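One plausible explanation, stated as a general EM property rather than anything specific to GeNIe: with fully uniform starting parameters, every state of a hidden node is interchangeable, so EM can sit at a symmetric stationary point where the hidden states never differentiate, while random initialization breaks the symmetry. A toy sketch of the symmetric case:

```python
# With uniform CPTs, the E-step posterior over a hidden node is uniform for
# every observation, so expected counts split evenly across hidden states
# and an EM update cannot make the states differ. Hypothetical illustration.
W = 3  # states of the hidden node
p_w_given_h = [[1 / W] * W, [1 / W] * W]       # uniform P(W|H)
p_b_given_w = [[0.5, 0.5] for _ in range(W)]   # uniform P(B|W)

def posterior_w(h, b):
    scores = [p_w_given_h[h][w] * p_b_given_w[w][b] for w in range(W)]
    total = sum(scores)
    return [s / total for s in scores]

# identical, uniform posterior for every (h, b) observation
print(posterior_w(0, 1))
```

Random initialization gives each hidden state a slightly different starting role, so the E-step posteriors differ across records and the states can specialize during learning.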

Thank you again for your kind response. I really appreciate your help.

Best,
Daniel
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: Validation process: model's result should be probability and not exact value (?)

Post by marek [BayesFusion] »

I have just posted a reply to your questions in the original thread -- Marek