## Why should virtual evidence sum to 1?

sverrevr
Posts: 12
Joined: Mon Aug 24, 2020 7:37 am

### Why should virtual evidence sum to 1?

Why should virtual evidence sum to 1? Have I understood correctly that when inserting virtual evidence you give the probability of observing a particular piece of evidence E for each of the node's states? P(E|node="state0"), P(E|node="state1"), P(E|node="state2"), and so on.
If this is correct then the probabilities do not need to sum to 1, right? Multiple states could have a 100% probability of giving this evidence.
If this is not correct, then what is the definition of virtual evidence? If it is P(node="state0"|E), then I don't see how the prior of the node can affect the distribution.
According to the documentation, virtual evidence is implemented by making a virtual node and populating its CPT. When I try to mimic this, I get the same behavior if I make a new node called node_evidence that has the states "E" and "not E", and specify P(node_evidence="E"|node="state0"), P(node_evidence="E"|node="state1") and so on. These don't have to sum to 1; only P(node_evidence="E"|node="state0") and P(node_evidence="not E"|node="state0") must sum to 1.
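A minimal sketch of the setup described above, in plain Python rather than SMILE (the priors and likelihoods are made-up numbers, and the dictionaries are only an illustration, not the library's API). It builds the CPT of a two-state child node, enters "E" as ordinary evidence, and computes the posterior by enumeration:

```python
# Hypothetical three-state parent node with an assumed prior.
states = ["state0", "state1", "state2"]
prior = {"state0": 0.5, "state1": 0.3, "state2": 0.2}

# First CPT row: P(node_evidence="E" | node=state). Illustrative values
# only; note that this row does not sum to 1.
p_e_given_state = {"state0": 1.0, "state1": 1.0, "state2": 0.25}

# Each CPT column must sum to 1, so the "not E" row is the complement.
cpt = {
    s: {"E": p_e_given_state[s], "not E": 1.0 - p_e_given_state[s]}
    for s in states
}

# Observe node_evidence = "E" and apply Bayes' rule.
joint = {s: prior[s] * cpt[s]["E"] for s in states}
p_e = sum(joint.values())  # marginal probability of the observation
posterior = {s: joint[s] / p_e for s in states}

row_sum = sum(cpt[s]["E"] for s in states)
column_sums = [sum(cpt[s].values()) for s in states]

print("first row sums to", row_sum)
print("posterior:", posterior)
```

The columns of the CPT each sum to 1 while the first row does not, and the resulting posterior is still a proper distribution.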

I would be very grateful for an explanation if I have misunderstood how virtual evidence works 😊

PS: This sentence of the virtual evidence documentation (HTML version) was very difficult to understand: "A typical modeling practice in such cases is that we model such variables but next to them variables that are observable and can provide us with information about the unobservables."

marek [BayesFusion]
Posts: 338
Joined: Tue Dec 11, 2007 4:24 pm

### Re: Why should virtual evidence sum to 1?

Hi sverrevr,

It's been a while since I looked at the issue -- we implemented virtual evidence in GeNIe and SMILE quite possibly 20 years ago :-). The best source of information will be articles that contain the precise definition and formulas. Here are a couple that you may try:

https://www.researchgate.net/publicatio ... le_Example

https://vannevar.ece.uw.edu/techsite/pa ... 4-0016.pdf

The issue of normalization (i.e., whether it is necessary or not) may be stemming from Pearl, who first defined virtual evidence (I'm not sure if he used the name :-)). I believe that he defined it in terms of lambda messages reaching a node. Lambda messages do not need to be normalized, I believe. Our implementation is very simple and straightforward -- we implement virtual evidence by creating a child node (invisible to the user) that gets real evidence, and we make sure that the support coming from it is precisely what the user has defined in terms of virtual evidence. Even if it is not necessary to normalize the support, it is very handy and conceptually clear -- you distribute the total pie of evidential support over the states of the node. Unnormalized values, if we opted for that, would have to be normalized at some point anyway. Please also note that SMILE can use any of a set of inference algorithms. Normalized support makes it easy on them.
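As a rough illustration of why normalizing the support is harmless, here is a small Python sketch (not SMILE code; the function name `posterior` and all numbers are assumptions made for the example). For a single node, only the ratios between the support values affect the posterior, so the raw and the normalized vectors give the same answer:

```python
def posterior(prior, support):
    """Combine a prior with evidential support: P(state | E) is
    proportional to support[state] * prior[state]."""
    joint = [p * s for p, s in zip(prior, support)]
    total = sum(joint)
    if total == 0:
        raise ValueError("the evidence is impossible under this prior")
    return [j / total for j in joint]


prior = [0.5, 0.3, 0.2]          # assumed prior over three states
raw_support = [1.0, 1.0, 0.25]   # unnormalized, sums to 2.25
normalized_support = [s / sum(raw_support) for s in raw_support]

post_raw = posterior(prior, raw_support)
post_norm = posterior(prior, normalized_support)

# The common scale factor cancels in the division by `total`.
same = all(abs(a - b) < 1e-12 for a, b in zip(post_raw, post_norm))
print(same)
```

Dividing every support value by the same constant changes `joint` and `total` by that same constant, which cancels out in the final division.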

I hope this helps. I'm sorry for not being very, very specific.
Cheers,

Marek

sverrevr
Posts: 12
Joined: Mon Aug 24, 2020 7:37 am

### Re: Why should virtual evidence sum to 1?

Thanks for the reply Marek :)
The first article you referred to was actually the one I read to understand virtual evidence. It's quite good, so maybe it could be linked to in the wiki? :)

I understand that it is a hassle to allow unnormalized virtual evidence and that this is not a priority. But, at least for me, having to supply normalized evidence made it difficult to understand what the probability meant. I guess it can make sense if you are not supplying probabilities, but rather want to give vaguer knowledge. I would argue that being able to supply non-normalized evidence would help users think correctly about the probabilities they supply. They could then think: "What is the probability that this state would cause our observation?" When giving normalized evidence it's easy to start thinking "With this observation in mind, what is the probability of the node being in this particular state?", which is wrong unless the prior happens to be uniform.
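A small Python sketch of that last point, with made-up numbers and a hypothetical helper `posterior` (this is not SMILE code). Reading normalized virtual evidence as "the probability of each state given the observation" matches the actual posterior only when the prior is uniform:

```python
def posterior(prior, likelihood):
    """Bayes' rule for one discrete node."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


# Normalized virtual evidence over three states (illustrative values).
evidence = [0.5, 0.3, 0.2]

uniform_prior = [1 / 3, 1 / 3, 1 / 3]
skewed_prior = [0.8, 0.1, 0.1]

post_uniform = posterior(uniform_prior, evidence)
post_skewed = posterior(skewed_prior, evidence)

# With a uniform prior the posterior reproduces the evidence vector.
matches_uniform = all(
    abs(a - b) < 1e-12 for a, b in zip(post_uniform, evidence)
)
# With a skewed prior it does not.
matches_skewed = all(
    abs(a - b) < 1e-12 for a, b in zip(post_skewed, evidence)
)

print(matches_uniform, matches_skewed)
print(post_skewed)
```

With the skewed prior, the first state ends up far more probable than the 0.5 entered as evidence, because the prior already favored it.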
Also, with regard to implementation: if virtual evidence generates a virtual child node, then there should be no need to normalize, which is what I tried to explain in my previous message. In the CPT of the virtual child node, the virtual evidence gives the probabilities for, say, the top value of each column. These values (the first row) do not need to sum to 1, so I don't see why you would need to normalize at some point?

Anyway, I understand if doing something about this is not a priority, but I think it would be better to be able to supply the values unnormalized ;)

Best regards
Sverre