enter evidence, virtual evidence, and controlling values

Post by **Yan** » Fri Nov 17, 2023 1:45 am

Dear staff, I'd like to ask some questions about entering evidence, virtual evidence, and controlling values:
1. My understanding is that entering evidence can only assign 100% to one state of node A, while virtual evidence can specify a probability distribution (e.g., 20%, 40%, 40%, assuming three states) over all states of node A. Is that right? If so, does setting state 1 = 100% in virtual evidence amount to the same thing as selecting state 1 with set evidence?
2. Could you please explain this sentence from 6.2.5 Virtual evidence in the manual (page 334): "Some modelers use this construct to modify the prior probability distribution over a variable, although this works only when the variable does not have parents." So this works only for root nodes. Why can't nodes with parents use this method to set their prior probability distributions?
3. The difference between controlling values and entering evidence is that the former affects only the posterior probabilities of the descendants, while the latter (i.e., double-clicking on one state of node A) affects the posterior probabilities of both parents and descendants. In practical terms, could you please explain when each method applies? For example, if I want to test the effect of a strategy (e.g., setting node A to 100% absence) on the outcome node B, which method should I use, and why not the other one?

Many thanks!

Post by **marek [BayesFusion]** » Thu Nov 23, 2023 2:37 pm

Hi Yan,

Let me try to answer your questions.

1. My understanding is that entering evidence can only assign 100% to one state of node A, while virtual evidence can specify a probability distribution (e.g., 20%, 40%, 40%, assuming three states) over all states of node A. Is that right? If so, does setting state 1 = 100% in virtual evidence amount to the same thing as selecting state 1 with set evidence?

This is correct.
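
One way to see why a one-hot virtual-evidence vector coincides with ordinary evidence is to treat the vector as a likelihood that is multiplied into the node's current distribution and then renormalized. That likelihood reading is an assumption in this sketch (the thread does not spell out the computation), and both the prior and the helper `apply_virtual_evidence` are made up for illustration.

```python
def apply_virtual_evidence(prior, likelihood):
    """Return the distribution proportional to prior * likelihood.

    Hypothetical helper: it assumes virtual evidence acts as a likelihood
    vector over the states of a single node.
    """
    weighted = [p * l for p, l in zip(prior, likelihood)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("the evidence is impossible under this prior")
    return [w / total for w in weighted]


# Made-up marginal over three states of node A.
prior = [0.5, 0.3, 0.2]

# One-hot vector: all mass ends up on state 1, as with ordinary evidence.
hard = apply_virtual_evidence(prior, [1.0, 0.0, 0.0])

# Softer vector: it is combined with the prior instead of replacing it.
soft = apply_virtual_evidence(prior, [0.2, 0.4, 0.4])

print("one-hot :", hard)
print("20/40/40:", soft)
```

Under this reading, the one-hot vector reproduces ordinary evidence on state 1, while the 20/40/40 vector yields exactly 20/40/40 only when the distribution it is combined with is uniform.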

2. Could you please explain this sentence from 6.2.5 Virtual evidence in the manual (page 334): "Some modelers use this construct to modify the prior probability distribution over a variable, although this works only when the variable does not have parents." So this works only for root nodes. Why can't nodes with parents use this method to set their prior probability distributions?

Thank you for noticing this -- I have realized that this sentence is confusing and will make sure that it gets modified or just removed in the next version of the manual. The marginal probability distribution over a node without parents (and no other relevant observations) is the prior. When there are parents, one can also interpret the marginal as prior when there are no relevant observations. You can use virtual evidence for modifying the marginal over the node in question. It is, I guess, a little harder mentally to choose the values of indirect evidence to influence the priors in the case when there are parents. Now that I think about it, the most appropriate thing to do is to remove this sentence from the manual, which we will do. Thanks for noticing it and for asking about it!
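
The remark that choosing the values is mentally harder can be made concrete with a small sketch. It assumes that virtual evidence behaves as a likelihood multiplied into the node's current marginal (an assumption, not something stated in this thread). Under that assumption, the vector that moves a node to a desired marginal is proportional to target / current, not the target itself. The numbers and the helpers `likelihood_for_target` and `updated_marginal` are hypothetical.

```python
def likelihood_for_target(current, target):
    """Likelihood vector that turns `current` into `target` (hypothetical helper).

    Requires every entry of `current` to be strictly positive.
    """
    if any(c <= 0 for c in current):
        raise ValueError("current marginal must be strictly positive")
    ratios = [t / c for t, c in zip(target, current)]
    top = max(ratios)
    # Only the ratios matter, so rescale to make the largest entry 1.
    return [r / top for r in ratios]


def updated_marginal(current, likelihood):
    """Normalize current * likelihood."""
    weighted = [c * l for c, l in zip(current, likelihood)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Made-up marginal of a node (with or without parents) before any evidence.
current = [0.6, 0.3, 0.1]
# Marginal we would like the node to have.
target = [0.2, 0.4, 0.4]

needed = likelihood_for_target(current, target)
print("virtual evidence to enter:", needed)
print("resulting marginal       :", updated_marginal(current, needed))
```

When the node has parents, the current marginal has to be computed from the parents' distributions first, which is one reason the values are less obvious than for a root node.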

3. The difference between controlling values and entering evidence is that the former affects only the posterior probabilities of the descendants, while the latter (i.e., double-clicking on one state of node A) affects the posterior probabilities of both parents and descendants. In practical terms, could you please explain when each method applies? For example, if I want to test the effect of a strategy (e.g., setting node A to 100% absence) on the outcome node B, which method should I use, and why not the other one?

The difference is that the first operation models manipulation and the second models observation. When your model is causal, controlling a variable (manipulating it) will have no effect on its ancestors. For example, when you see a dead person, you wonder what caused the death. All possible causes of death of that person become more likely than they were before your observation. However, when you kill somebody, i.e., intervene/manipulate/control, it will have no effect on any other possible causes of death of that person. In fact, all the other causes will become irrelevant to the death of the person. You can see that GeNIe dims all incoming arcs of a variable that is manipulated/controlled. You should use observation when you mean that you have observed a variable and controlling/manipulation when you mean manipulation. Does this help?
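
A minimal numeric sketch of the same contrast, using a made-up two-node causal model (disease -> death). The probabilities and function names are hypothetical. Observation is computed with Bayes' rule; manipulation is modeled by cutting the incoming arc, so the cause keeps its prior.

```python
# Made-up numbers for a two-node causal model: disease -> death.
P_DISEASE = 0.1
P_DEATH_GIVEN_DISEASE = {True: 0.8, False: 0.05}


def disease_after_observing_death():
    """P(disease | death observed), by Bayes' rule."""
    with_disease = P_DISEASE * P_DEATH_GIVEN_DISEASE[True]
    without_disease = (1 - P_DISEASE) * P_DEATH_GIVEN_DISEASE[False]
    return with_disease / (with_disease + without_disease)


def disease_after_controlling_death():
    """P(disease | death forced from outside).

    The arc disease -> death is cut, so forcing the effect carries
    no information about the cause and the prior is unchanged.
    """
    return P_DISEASE


print("observed  :", disease_after_observing_death())
print("controlled:", disease_after_controlling_death())
```

With these numbers, observing the death raises the probability of the disease well above its prior, while controlling it leaves the prior untouched.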

Marek

Post by **Yan** » Fri Nov 24, 2023 5:41 am

Thanks Marek. I'm still a bit confused about question 3 and would like to discuss it with you. For example, node A "sport" (low frequency, medium frequency, high frequency) points to node B "drink water" (low frequency, medium frequency, high frequency), and both "sport" and "drink water" point to node C "health" (not good, good). Now I have a strategy or intervention, drinking more water, and I want to see whether drinking more water is good for health. According to your explanation, my understanding is that if I have already observed that a person drinks water very frequently, I can set "drink water" (high frequency = 100%) as evidence and see the probability distribution of "health" and possibly also of "sport". If I haven't observed anything and want to test the effect of drinking more water (an intervention) on health, I need to control "drink water" (high frequency = 100%); in this case "sport" will not be influenced, and we see only the effect of drinking more water on health. And if the strategy is to do more sport and drink more water, then I need to control/manipulate both "sport" and "drink water" to high frequency to assess their effect on health. Is that correct? Kind regards~

Post by **marek [BayesFusion]** » Fri Nov 24, 2023 9:42 pm

Hi Yan,

Yes, you are correct: if you want to see the effect of both sport and drinking water, you should control both. In the example that you gave, the effect on health of controlling both will be the same as the effect of observing both. In fact, you can read that effect directly from the CPT for node C. However, controlling water intake alone will tell you the effect of water intake given whatever the sport intensity is (I guess the prior distribution). Observing that water intake is high also tells you something about the distribution of sport intensity (I assume that it becomes more likely that it is high). Difficult, isn't it :-).

Cheers,

Marek
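
The sport / water / health example can be checked by brute-force enumeration. The sketch below is not GeNIe or SMILE code: it uses two states per node instead of three to stay short, and every probability in it is made up. Controlling a node is modeled by replacing its factor with an indicator on the forced state, which is the same as cutting its incoming arcs.

```python
from itertools import product

STATES = ("low", "high")

# Made-up parameters for sport -> water, sport -> health, water -> health.
P_SPORT = {"low": 0.7, "high": 0.3}
P_WATER = {  # P_WATER[sport][water]
    "low": {"low": 0.8, "high": 0.2},
    "high": {"low": 0.2, "high": 0.8},
}
P_GOOD = {  # P(health = good | sport, water)
    ("low", "low"): 0.4,
    ("low", "high"): 0.5,
    ("high", "low"): 0.7,
    ("high", "high"): 0.9,
}


def query(observed=None, controlled=None):
    """Return (P(health = good), P(sport = high)) under the given settings."""
    observed = observed or {}
    controlled = controlled or {}
    total = good = sport_high = 0.0
    for sport, water in product(STATES, repeat=2):
        # A controlled node ignores its parents: indicator on the forced state.
        if "sport" in controlled:
            weight = 1.0 if sport == controlled["sport"] else 0.0
        else:
            weight = P_SPORT[sport]
        if "water" in controlled:
            weight *= 1.0 if water == controlled["water"] else 0.0
        else:
            weight *= P_WATER[sport][water]
        # An observation keeps all factors and discards inconsistent worlds.
        if observed.get("sport", sport) != sport:
            continue
        if observed.get("water", water) != water:
            continue
        total += weight
        good += weight * P_GOOD[(sport, water)]
        if sport == "high":
            sport_high += weight
    return good / total, sport_high / total


both = {"sport": "high", "water": "high"}
print("observe water :", query(observed={"water": "high"}))
print("control water :", query(controlled={"water": "high"}))
print("observe both  :", query(observed=both))
print("control both  :", query(controlled=both))
```

With these made-up numbers, observing high water intake raises P(sport = high) from 0.3 to about 0.63 and gives P(health = good) of about 0.75, whereas controlling water leaves sport at 0.3 and gives 0.62. Observing both and controlling both return the same P(health = good), namely the CPT entry 0.9.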

Post by **Yan** » Sat Nov 25, 2023 7:58 am

Haha, great, thanks Marek!

BayesFusion Support Forum
