missing value

Post by **Yan** » Sun Oct 08, 2023 12:35 pm

Dear staff,

I have some questions about missing values, please see below:
(1) If I open the data file and replace all missing values with a specified value, will the replaced values affect structure/parameter learning? For example, node A has three states, "L", "M" and "H". If I replace the missing values in node A with "-99", will this value take part in the computation? Or is it just a label (similar to a null in SPSS) that is ignored (this is what I want)?
(2) If I follow the above step and learn the structure with the PC algorithm, the missing value replacement (i.e., "-99") is not listed as a state of node A in the node properties. However, if I use Greedy Thick Thinning to learn the structure, the replacement is listed as a state of node A, shown as "S_99" (see below). Could you please explain the reason?

Many thanks.

Post by **shooltz[BayesFusion]** » Mon Oct 09, 2023 10:46 pm

The structure learning algorithms in SMILE/GeNIe currently require that a dataset has no missing values. From the POV of the learning algorithm the missing value replacement is no different from any state label.

The PC algorithm should output the nodes with outcomes like S_99, just like the attached image.
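Since structure learning needs a complete dataset, it can help to check a file for missing cells before loading it. The following is a minimal sketch using only the Python standard library, not part of SMILE/GeNIe. The function name `count_missing` and the set of tokens treated as missing are assumptions for illustration; adjust them to match your data file.

```python
import csv
import io

# Assumption: these tokens mark a missing cell in the data file.
MISSING_TOKENS = {"", "NA", "*"}

def count_missing(rows, missing_tokens=MISSING_TOKENS):
    """Return {column_name: number_of_missing_cells} for a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip() in missing_tokens:
                counts[column] = counts.get(column, 0) + 1
    return counts

# Small in-memory sample standing in for a real data file.
sample = "A,B\nL,yes\n,no\nH,\nM,yes\n"
rows = list(csv.DictReader(io.StringIO(sample)))
missing = count_missing(rows)
print(missing)
```

Columns that do not appear in the result have no missing cells under the assumed token set.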

Post by **Yan** » Tue Oct 10, 2023 12:34 am

shooltz[BayesFusion] wrote: ↑Mon Oct 09, 2023 10:46 pm The structure learning algorithms in SMILE/GeNIe currently require that a dataset has no missing values. From the POV of the learning algorithm the missing value replacement is no different from any state label.

The PC algorithm should output the nodes with outcomes like S_99, just like the attached image.

Thanks Shooltz. So structure learning doesn't allow missing data, but parameter learning does. Does that mean structure learning and parameter learning are based on different data sets? Otherwise, we must use the complete data for both structure learning and parameter learning.

Cheers,
Yan

Post by **Yan** » Tue Oct 10, 2023 4:13 am

I tried the PC algorithm again, and the output doesn't show outcomes/states like S_99 (missing data replaced with 99). But if I use other structure learning algorithms (e.g., Bayesian Search), the bar chart view shows S_99, which is a bit weird. Could you please check that with some random data? Thanks a lot!

Wed Oct 11, 2023 7:15 pm

It seems the issue is with the PC learning algorithm when the missing value replacement is numeric. I ran PC with a data file with missing values replaced with the label "x99", and the output did contain the x99 outcome. When the missing value was replaced with a number (like 99), the outcome representing that value was not present.

We're searching for the root cause. In the meantime, please replace missing values with non-numeric labels.
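One way to apply this workaround outside GeNIe is to rewrite the data file before loading it. The sketch below uses only the Python standard library; `relabel_missing` is a hypothetical helper, and the token set `{"", "-99", "99"}` is an assumption that only holds if those values never occur as genuine data in your file.

```python
import csv
import io

def relabel_missing(in_file, out_file, missing_tokens, label="x99"):
    """Copy a CSV, replacing cells found in missing_tokens with a non-numeric label."""
    try:
        float(label)
    except ValueError:
        pass  # label is non-numeric, as intended
    else:
        raise ValueError("label must be non-numeric, got %r" % label)

    reader = csv.reader(in_file)
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        writer.writerow(
            [label if cell.strip() in missing_tokens else cell for cell in row]
        )

# In-memory example; with real files, open them with newline="".
source = io.StringIO("A,B\nL,1\n-99,2\nH,\n")
target = io.StringIO()
relabel_missing(source, target, missing_tokens={"", "-99", "99"})
result = target.getvalue()
print(result)
```

The check on `label` is there so that a numeric replacement such as `99` is rejected early.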

BayesFusion Support Forum
