ERROR

nira
Posts: 8
Joined: Tue Mar 22, 2022 5:18 pm

ERROR

Post by nira »

Hi
Can you explain what this error means: "Discretization problem in node Blood_Sugar: Underflow samples: 73593, min=4.75416e-19 loBound=0.4 Overflow samples: 831715, max=782875 hiBound=4.4 Total valid samples: 94692 of 1000000"? What should I do to solve this problem?
Thanks.
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: ERROR

Post by marek [BayesFusion] »

Hi Nira,

It is hard to say without looking at the node in question and its parents and their definitions (mainly domains). Essentially, GeNIe complains here that there have not been enough valid samples to derive a precise CPT for the node in question. There may be many reasons for that, so suggesting how to fix it is hard. Please look at the domains of the nodes and see whether sampling from the parents' domains will always lead to a value in the node Blood_Sugar that is within the node's domain. If not, has the domain perhaps been defined too tightly? Sometimes the nature of the equation defining the node is such that it rarely produces values within the domain. In short, hard to say. If you want me to look at it in detail, please post the relevant fragment of the model.
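As a rough illustration of the check described above (this is only a sketch, not GeNIe's actual algorithm), the snippet below draws samples from a node's defining equation and counts how many land below, above, or inside the node's domain. The helper `count_samples`, the parent distribution, and the equation `Child = 2 * Parent` are all hypothetical and chosen only to show a domain that is set too tightly.

```python
import random

def count_samples(sampler, lo, hi, n=100_000, seed=0):
    """Draw n samples and count how many fall below lo, above hi, or inside [lo, hi]."""
    rng = random.Random(seed)
    underflow = overflow = valid = 0
    for _ in range(n):
        x = sampler(rng)
        if x < lo:
            underflow += 1
        elif x > hi:
            overflow += 1
        else:
            valid += 1
    return underflow, overflow, valid

# Hypothetical example: Parent is uniform on [0, 3] and Child = 2 * Parent,
# so Child really ranges over [0, 6], but its domain is declared as [0, 4].
under, over, valid = count_samples(lambda rng: 2 * rng.uniform(0, 3), lo=0, hi=4)
print(f"underflow={under} overflow={over} valid={valid}")
```

In this made-up case about a third of the samples overflow, which is the kind of mismatch between equation and domain that the message points to.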
I hope this helps,

Marek
nira

Re: ERROR

Post by nira »

Here is my Bayesian network. I have this problem for all continuous nodes:
Attachments
el taliiiiiii (avec des equations) - Copy.xdsl
(29.54 KiB) Downloaded 218 times
marek [BayesFusion]

Re: ERROR

Post by marek [BayesFusion] »

Thank you for the network. I can see that the domain of the node Blood_Sugar is [0.4,4.4] and the definition is Blood_Sugar=Weibull(451.545,0.365594). The samples from your Weibull span between practically zero and almost 400K. Many, many samples fall outside of the interval [0.4,4.4], and this is what GeNIe is complaining about. Would another distribution perhaps be more appropriate? If you have no other ideas, a custom distribution should do the job and waste no samples.
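To see why so few samples are valid, one can compute the probability mass that this Weibull places inside [0.4,4.4] directly from its CDF. The sketch below assumes that in Weibull(451.545,0.365594) the first argument is the scale and the second is the shape; that reading is an assumption, although it reproduces the proportions in the error message (roughly 7% below the domain, 83% above it, and 9% inside). The function name `weibull_cdf` is just a label for this illustration.

```python
import math

def weibull_cdf(x, scale, shape):
    """P(X <= x) for a Weibull distribution with the given scale and shape."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

# Assumption: first argument = scale, second argument = shape.
scale, shape = 451.545, 0.365594
lo, hi = 0.4, 4.4  # domain of Blood_Sugar

p_under = weibull_cdf(lo, scale, shape)        # mass below the domain
p_over = 1.0 - weibull_cdf(hi, scale, shape)   # mass above the domain
p_valid = 1.0 - p_under - p_over               # mass inside the domain
print(f"underflow={p_under:.4f} overflow={p_over:.4f} valid={p_valid:.4f}")
```

With a shape parameter well below 1 the distribution has a very heavy right tail, so most of the mass sits far above 4.4 regardless of how many samples are drawn.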

The same, I suspect, is happening in other nodes. In fact when I check "Reject out-of-bounds and invalid samples" in the Inference tab of Network properties, I get no valid samples at all for the network (without evidence!), even though I set the number of samples to 1,000,000. This is not right, as your model is supposed to model reality. It tells me that your definitions of the nodes can be seriously improved.

Generally, when things like that start happening and you are unwilling to change the model, throwing more samples may lead to some valid samples and meaningful results. I suggest, however, examining your model and making sure that you get valid samples.
I hope this helps,

Marek
nira

Re: ERROR

Post by nira »

Thank you for your answer.

Here is the same Bayesian network, except that in this one I customized the definitions of the nodes. The "x" values of the histogram are now the same as in the dataset and are more logical. For example, for Age the x values are 10, 20, 30, 40, 50, 60, etc., as in my data. As shown in the picture, in the tab "node property -> value" the max and min are within the proposed range [6,96]. But after doing "rediscretize" I still get this error: "Discretization problem in node Age: Underflow samples: 91, min=1.50963 loBound=6 Overflow samples: 322, max=99.5983 hiBound=96 Total valid samples: 99587 of 100000". I still don't understand it. Is it going to influence the inference result?

Another problem: I have arcs that are always gray, such as the arc between morbid_obesity and diabetes. Is that going to be a problem for me later?
Attachments
CaptureMarek.PNG
CaptureMarek.PNG (37.77 KiB) Viewed 3495 times
el taliiiiiii (avec des equations).xdsl
(29.58 KiB) Downloaded 207 times
marek [BayesFusion]

Re: ERROR

Post by marek [BayesFusion] »

I’m on the road, so I will be very brief. A small percentage of invalid samples is generally not a problem. Gray arcs tell you that there is something wrong with the definition of the child. A gray arc means, essentially, that the arc is not necessary (the child is independent of the parent), which means that you probably need to revisit the definition.
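To put "a small percentage" in numbers, the two messages quoted in this thread can be compared directly. The sketch below pulls the counts out of the message text with regular expressions and reports the share of invalid samples; `invalid_fraction` is a hypothetical helper written for this illustration and is not part of GeNIe. For Age, 413 of 100,000 samples are out of bounds (about 0.4%), while for Blood_Sugar it was about 90.5%.

```python
import re

def invalid_fraction(message):
    """Return the share of out-of-bounds samples reported in a discretization message."""
    under = int(re.search(r"Underflow samples:\s*(\d+)", message).group(1))
    over = int(re.search(r"Overflow samples:\s*(\d+)", message).group(1))
    total = int(re.search(r"Total valid samples:\s*\d+\s+of\s+(\d+)", message).group(1))
    return (under + over) / total

age_msg = ("Discretization problem in node Age: Underflow samples: 91, "
           "min=1.50963 loBound=6 Overflow samples: 322, max=99.5983 hiBound=96 "
           "Total valid samples: 99587 of 100000")
sugar_msg = ("Discretization problem in node Blood_Sugar: Underflow samples: 73593, "
             "min=4.75416e-19 loBound=0.4 Overflow samples: 831715, max=782875 "
             "hiBound=4.4 Total valid samples: 94692 of 1000000")

for name, msg in [("Age", age_msg), ("Blood_Sugar", sugar_msg)]:
    print(f"{name}: {invalid_fraction(msg):.2%} invalid")
```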
I hope this helps,

Marek