data representation

student
Posts: 5
Joined: Wed Jun 18, 2008 8:49 am

data representation

Post by student »

I've started using GeNIe with a simple example, and the results I got look fine.
Now I am trying to build a more complex BN with 14 parameters and one target node.
Using the 14 parameters, it should classify patients into two classes, 'cancer' or 'not cancer' (the target node).

I have a training set with 60 examples. It looks like this:

MMP-9 TIMP-1 IGF-1
188,4429041 128,1473695 99,12979943
1622,460964 272,0132637 120,7913285
648,6884035 223,3910573 100,2765291
1074,058998 151,3514913 202,5293192
910,2007009 180,9122295 143,9813973
541,2574401 210,4037978 87,23793444
581,2954049 207,0669932 227,0130343
1208,975256 704,4374826 33,79843346
371,2985708 197,1676542 146,3000646
433,7125762 190,5068819 90,66447127
244,6561241 163,5205956 107,914169
570,3505797 192,1733215 153,7459332
727,8080358 161,9484687 146,2752897
669,9939991 177,6005678 104,5662187
432,200093 188,5920888 142,3640146
527,1738718 231,6239566 294,8347591


Now my question is how I should represent this data.
Should I treat each value as a separate state, which would mean 60 states for each node, or should I divide the values into intervals?

Thank you for your answer.
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: data representation

Post by shooltz[BayesFusion] »

student wrote: Now my question is how I should represent this data.
Should I treat each value as a separate state, which would mean 60 states for each node, or should I divide the values into intervals?

The data looks continuous, so you should discretize it. The choice of intervals depends, of course, on the problem domain.
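
As a rough illustration of what dividing a column into intervals means, here is a minimal Python sketch of equal-width binning done outside GeNIe. It is only one simple option, not GeNIe's own discretization procedure. The number of intervals (3), the state labels, and the helper names `equal_width_edges` and `discretize` are assumptions made for this example; the four values are the first MMP-9 entries from the post above, typed with decimal points as Python requires.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Return the n_bins - 1 inner cut points that split [min, max] into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(value, edges, labels):
    """Map a numeric value to the label of the interval it falls into."""
    return labels[bisect_right(edges, value)]


# First four MMP-9 values from the training set (assumed here to be plain floats).
mmp9 = [188.4429041, 1622.460964, 648.6884035, 1074.058998]

# Hypothetical choice: three intervals with illustrative state names.
labels = ["low", "medium", "high"]
edges = equal_width_edges(mmp9, len(labels))

states = [discretize(v, edges, labels) for v in mmp9]
print(states)
```

In practice the cut points would more likely come from domain knowledge (for example, clinically meaningful thresholds) than from the range of the sample, which is why the interval choice above is only a placeholder.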
student
Posts: 5
Joined: Wed Jun 18, 2008 8:49 am

data representation

Post by student »

I thought so too. But when I selected the column and tried to click the discretize button, it was disabled, which surprised me.

But if you think so too, I'll do it by hand.
Thank you for your answer.
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: data representation

Post by shooltz[BayesFusion] »

Your data uses a decimal comma (not a decimal point), so GeNIe treats the values as text, not numbers, and discretization is not possible.
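
One way to fix this before loading the file is to rewrite the decimal commas as decimal points. The following is a minimal Python sketch, assuming the columns are separated by whitespace and that commas appear only as decimal separators (no thousands separators). The helper names `comma_to_dot` and `parse_row` are made up for this example; the two sample rows are taken from the training set posted above.

```python
def comma_to_dot(line: str) -> str:
    """Replace every decimal comma in a line with a decimal point."""
    return line.replace(",", ".")


def parse_row(line: str) -> list[float]:
    """Split a whitespace-separated row and convert each field to a float."""
    return [float(token) for token in comma_to_dot(line).split()]


# Two rows copied from the posted data (MMP-9, TIMP-1, IGF-1).
rows = [
    "188,4429041 128,1473695 99,12979943",
    "1622,460964 272,0132637 120,7913285",
]

converted = [comma_to_dot(r) for r in rows]   # text ready to be saved back to a file
values = [parse_row(r) for r in rows]         # the same rows as numbers

print(converted[0])
print(values[0])
```

The same replacement can also be done with a text editor's find-and-replace or by changing the decimal separator setting of the spreadsheet that exported the data.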
student
Posts: 5
Joined: Wed Jun 18, 2008 8:49 am

data representation

Post by student »

Oh! That's the error. :oops:
Thanks a lot for your help.