data type in GeNIe

Yang Yajie
Posts: 34
Joined: Thu Mar 19, 2020 11:49 am

data type in GeNIe

Post by Yang Yajie »

Hello, I am using GeNIe to build a dynamic Bayesian belief network. I have a question about the value labels in data columns.

For example, I use 0,1,2,3,4,5,6,7,8,9 to represent the age ranges (0,10], (10,20], (20,30], ..., (90,>90] in a .csv file, and then load this file into GeNIe to run "learn structure" and "learn network". I noticed that in the dynamic Bayesian network you provide on the website, strings can be used as value labels, such as True and False, or Pittsburg and Saraha, in the .csv file. If I use 1,2,3,...,99 as value labels in the file, do these numbers have numeric meaning, i.e. is 2 greater than 1 and 87 greater than 23? Or are they treated as plain labels, the same way as "A,B,C,...,P"?

At first I used "A,B,C,...,S,M,L" as value labels, but when I checked the correlations in GeNIe, the variables labeled with "A,B,C,...,S,M,L" were not shown in the correlation table. So I changed all these labels to '1,2,3,...,11,12,13,...' in each variable, and now the correlation table shows these variables. Because I want to check the correlations among these variables, I am afraid that I am using the wrong value labels in the CSV file and getting wrong correlations. Could you please help me with this problem?

My second question: I put discrete and continuous data into one CSV file. When I run structure learning in GeNIe, it tells me "Can't start learning due to problems with data. Selected algorithm supports continuous or discrete columns, but they can't be used together." Could you please tell me how to deal with this problem?

kind regards!
Yajie
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: data type in GeNIe

Post by shooltz[BayesFusion] »

Yang Yajie wrote:
For example, I use 0,1,2,3,4,5,6,7,8,9 to represent the age ranges (0,10], (10,20], (20,30], ..., (90,>90] in a .csv file, and then load this file into GeNIe to run "learn structure" and "learn network". I noticed that in the dynamic Bayesian network you provide on the website, strings can be used as value labels, such as True and False, or Pittsburg and Saraha, in the .csv file. If I use 1,2,3,...,99 as value labels in the file, do these numbers have numeric meaning, i.e. is 2 greater than 1 and 87 greater than 23? Or are they treated as plain labels, the same way as "A,B,C,...,P"?

If a data column contains only integers, its values are sorted numerically (8 is less than 23, for example).
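
To illustrate the difference between numeric and text ordering, here is a minimal Python sketch. It is not GeNIe's own code; it only shows how the same values order as integers and as strings. The `age_labels` mapping is a hypothetical helper built from the age bins in the question.

```python
# Integer codes compare numerically; the same digits stored as text compare
# character by character, so "23" sorts before "8".
codes = [23, 8, 1, 87, 2]
as_numbers = sorted(codes)
as_text = sorted(str(c) for c in codes)

# Hypothetical mapping from integer codes to readable age-bin labels
# (the open-ended last bin from the question is left out).
age_labels = {i: f"({10 * i},{10 * (i + 1)}]" for i in range(9)}

print(as_numbers)      # numeric order
print(as_text)         # lexicographic order
print(age_labels[2])   # label for code 2
```

Keeping a mapping like this next to the data makes it easy to translate integer codes back into readable labels after learning.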

Yang Yajie wrote:
At first I used "A,B,C,...,S,M,L" as value labels, but when I checked the correlations in GeNIe, the variables labeled with "A,B,C,...,S,M,L" were not shown in the correlation table. So I changed all these labels to '1,2,3,...,11,12,13,...' in each variable, and now the correlation table shows these variables. Because I want to check the correlations among these variables, I am afraid that I am using the wrong value labels in the CSV file and getting wrong correlations.

Can you post your data file here? Alternatively, upload the file to Google Drive or a similar service and post a link.

Yang Yajie wrote:
My second question: I put discrete and continuous data into one CSV file. When I run structure learning in GeNIe, it tells me "Can't start learning due to problems with data. Selected algorithm supports continuous or discrete columns, but they can't be used together." Could you please tell me how to deal with this problem?

You should check (select) either only the discrete columns or only the continuous columns before you proceed to learn the structure.
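
If you prefer to prepare the data outside GeNIe, one option is to write the discrete and the continuous columns to separate CSV files. The following is a minimal sketch using only Python's standard library; the column names and sample rows are hypothetical and not taken from the attached file.

```python
import csv
import io

# Hypothetical sample data: Gender and AgeGroup are discrete codes,
# Income is continuous.
raw = """Gender,AgeGroup,Income
1,3,2500.5
2,5,3100.0
1,2,1800.25
"""

def select_columns(text, keep):
    """Return CSV text containing only the columns named in `keep`."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    # extrasaction="ignore" drops every column that is not listed in `keep`.
    writer = csv.DictWriter(out, fieldnames=keep, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

discrete_csv = select_columns(raw, ["Gender", "AgeGroup"])
continuous_csv = select_columns(raw, ["Income"])
print(discrete_csv)
```

Each resulting text can then be saved to its own file and loaded separately, so that a single learning run sees only one kind of column.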
Yang Yajie

Re: data type in GeNIe

Post by Yang Yajie »

Here is the data file. I use 1,2,3,...,10 to represent different states; for example, for gender, 1=Male and 2=Female.
Thanks
Attachment: example.csv (2.92 KiB)
shooltz[BayesFusion]

Re: data type in GeNIe

Post by shooltz[BayesFusion] »

Your HomeLocation variables are considered continuous because the number of distinct values is greater than 20. To make sure GeNIe treats these variables as discrete, increase the 'Discrete threshold' parameter in the 'Learn New Network' window.
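
The rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behavior (more distinct values than the threshold means continuous), not GeNIe's actual implementation; the function name `looks_continuous` and the sample columns are hypothetical.

```python
def looks_continuous(values, discrete_threshold=20):
    """Return True if a numeric column has more distinct values than the
    threshold. Illustration only; not GeNIe's actual implementation."""
    try:
        distinct = {float(v) for v in values}
    except ValueError:
        # Non-numeric labels are treated as discrete in this sketch.
        return False
    return len(distinct) > discrete_threshold

home_location = [str(i) for i in range(1, 26)]  # 25 hypothetical location codes
gender = ["1", "2", "1", "2"]

print(looks_continuous(home_location))       # 25 distinct values, threshold 20
print(looks_continuous(home_location, 30))   # same column, threshold raised to 30
print(looks_continuous(gender))              # 2 distinct values
```

With 25 distinct codes, the column is flagged as continuous at the threshold of 20 and as discrete once the threshold is raised above 25.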

Note that structure learning for dynamic networks is not supported; only parameters can be learned in such a case. If you run learning with the CSV file you attached in the previous message, you'll obtain a non-dynamic network.