Missing values in test set

lizbona
Posts: 9
Joined: Mon Mar 22, 2010 12:38 pm

Missing values in test set

Post by lizbona »

I want to learn a network from a data set with no missing values and then test it on a data set that contains some missing values. I use readFile(path, missingValueToken) to read both the training set (which is complete) and the test set (which has some missing values marked with missingValueToken). Then I learn the network with GTT from the training set and test it with the test set. This gives me some results, but also messages like the following:
Invalid outcome index ? for node 'synfuels_corporation_cutback', valid indices are 0..1

"?" stands for a missing value (it has index -1), and this message is printed for every missing value in the test set. Is this normal behaviour, or does the message mean there is an error? Am I doing something wrong? What is the difference between reading data with readFile(path) and with readFile(path, missingValue)?
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: Missing values in test set

Post by shooltz[BayesFusion] »

lizbona wrote: Then I learn the network with GTT from the training set and test it with the test set. This gives me some results, but also messages like the following:
Invalid outcome index ? for node 'synfuels_corporation_cutback', valid indices are 0..1

"?" stands for a missing value (it has index -1), and this message is printed for every missing value in the test set.
How exactly do you test? If you're iterating over the values in the dataset and calling Network.setEvidence, you'll need to check for missing values manually before passing them to setEvidence.
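
A minimal sketch of such a manual check, assuming one record has already been read into a plain int array of outcome indices and that -1 marks a missing entry, as reported in this thread. The class name MissingValueGuard and the method observedEntries are hypothetical helpers for illustration and are not part of the library; each returned pair is what would then be handed to setEvidence.

```java
import java.util.ArrayList;
import java.util.List;

class MissingValueGuard {
    // Assumed sentinel: the thread reports -1 as the index of a missing value.
    static final int MISSING = -1;

    // Returns {column, outcomeIndex} pairs for the entries of one record that
    // are safe to pass on as evidence; missing entries are skipped.
    static List<int[]> observedEntries(int[] record) {
        List<int[]> observed = new ArrayList<>();
        for (int col = 0; col < record.length; col++) {
            if (record[col] != MISSING) {
                observed.add(new int[] {col, record[col]});
            }
        }
        return observed;
    }

    public static void main(String[] args) {
        int[] record = {1, MISSING, 0}; // made-up record with one missing entry
        for (int[] entry : observedEntries(record)) {
            // In real code this is where setEvidence would be called.
            System.out.println("column " + entry[0] + " -> outcome " + entry[1]);
        }
    }
}
```

Filtering before the call keeps the missing-value case out of the error path, so a genuinely invalid index still surfaces as an error.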
lizbona

Post by lizbona »

My testing is done as you described above. Right now I call setEvidence and catch smileException, which is thrown when the index is -1. I guess index -1 corresponds to the missing value '?'.
shooltz[BayesFusion]

Post by shooltz[BayesFusion] »

lizbona wrote: My testing is done as you described above. Right now I call setEvidence and catch smileException, which is thrown when the index is -1. I guess index -1 corresponds to the missing value '?'.
Yes, by default -1 represents a missing value in the dataset. However, I'd suggest checking the value of the index before passing it to setEvidence instead of catching an exception.

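Make sure you call clearAllEvidence before instantiating each record from the test set, or clearEvidence for each entry with a missing value. Otherwise evidence set for an earlier record stays on the nodes that are missing in the current one.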
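
The two reset strategies can be sketched as follows. This is an illustration under stated assumptions, not the library's code: EvidenceTarget is a hypothetical stand-in interface that only borrows the method names setEvidence, clearEvidence and clearAllEvidence from this thread, FakeNetwork is an in-memory substitute so the sketch runs without the library, column i of a record is assumed to map to node i, and -1 is assumed to be the missing-value index.

```java
import java.util.HashMap;
import java.util.Map;

class EvidenceLoop {
    static final int MISSING = -1; // assumed sentinel, as reported in the thread

    // Hypothetical stand-in for the evidence calls named in the thread;
    // this is not the real Network class.
    interface EvidenceTarget {
        void setEvidence(int node, int outcome);
        void clearEvidence(int node);
        void clearAllEvidence();
    }

    // In-memory substitute used only to make the sketch runnable.
    static class FakeNetwork implements EvidenceTarget {
        final Map<Integer, Integer> evidence = new HashMap<>();
        public void setEvidence(int node, int outcome) { evidence.put(node, outcome); }
        public void clearEvidence(int node) { evidence.remove(node); }
        public void clearAllEvidence() { evidence.clear(); }
    }

    // Strategy 1: wipe all evidence, then set only the observed entries.
    static void instantiateWithClearAll(EvidenceTarget net, int[] record) {
        net.clearAllEvidence();
        for (int node = 0; node < record.length; node++) {
            if (record[node] != MISSING) {
                net.setEvidence(node, record[node]);
            }
        }
    }

    // Strategy 2: overwrite observed entries and clear only the missing ones.
    static void instantiateWithPerEntryClear(EvidenceTarget net, int[] record) {
        for (int node = 0; node < record.length; node++) {
            if (record[node] == MISSING) {
                net.clearEvidence(node);
            } else {
                net.setEvidence(node, record[node]);
            }
        }
    }

    public static void main(String[] args) {
        FakeNetwork net = new FakeNetwork();
        int[][] records = {{1, 0, 1}, {0, MISSING, 1}}; // made-up test records
        for (int[] record : records) {
            instantiateWithClearAll(net, record);
            // Inference and scoring for this record would go here.
            System.out.println("evidence now set on " + net.evidence.size() + " node(s)");
        }
    }
}
```

Both strategies leave the second record without evidence on node 1. Skipping the missing entry without any clearing would keep the value set by the first record.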