How to calculate the accuracy?
Hi,
I learned a network with the GREEDY algorithm and I want to test the accuracy of this model. The network has a class variable named "motivation_level", and I want to use the other variables to predict it and measure how accurate that prediction is. How can I test the accuracy, that is, how can I evaluate my model using SMILE?
Thanks. Any help will be appreciated.
Load a record of the data set into the learned network except for the target variable (in this case motivation_level). Then perform inference and see if the state with the highest posterior probability is the state that was recorded in the data set. Repeat this procedure for all records in the data set. This is a simple approach and variations are possible.
(This approach assumes that you learned the parameters as well.)
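The procedure described above can be sketched as follows. This is a minimal, library-independent C++ sketch, not SMILE code: `Record`, `PosteriorFn`, `mostProbableState` and `accuracy` are hypothetical names introduced for illustration, and the `posteriorOf` callback is an assumed stand-in for the steps "clear evidence, set evidence on every non-target variable, update beliefs, read the target's posterior" in whatever inference library is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// One data record: a state index per variable (column).
using Record = std::vector<int>;

// Assumed stand-in for inference: given a record and the target column, it
// must ignore the recorded target value and return the posterior
// distribution over the target's states.
using PosteriorFn =
    std::function<std::vector<double>(const Record&, std::size_t)>;

// Index of the state with the highest posterior probability.
int mostProbableState(const std::vector<double>& posterior) {
    assert(!posterior.empty());
    return static_cast<int>(std::distance(
        posterior.begin(),
        std::max_element(posterior.begin(), posterior.end())));
}

// Fraction of records whose recorded target state equals the most probable
// state computed from the remaining variables.
double accuracy(const std::vector<Record>& records, std::size_t targetColumn,
                const PosteriorFn& posteriorOf) {
    if (records.empty()) return 0.0;
    std::size_t correct = 0;
    for (const Record& record : records) {
        const std::vector<double> posterior = posteriorOf(record, targetColumn);
        if (mostProbableState(posterior) == record.at(targetColumn)) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(records.size());
}
```

Note that testing on the same records that were used for learning tends to give an optimistic estimate; holding out part of the data is one of the possible variations.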
Hi
I was just looking through the forum and found this topic. I was trying to do something similar, but I don't know what I'm doing wrong.
First I learn my BN from a dataset file using the greedy algorithm.
Code: Select all
greedy.Learn(m_dataset,result)
Once I have the resulting BN, I read the original file again and go through each line, setting the evidence for every variable except the target:
Code: Select all
result.SetTarget(TARGET);
for each line:
{
    readline()
    for each column:
    {
        value=column.value;
        if (col!=TARGET)
            result.GetNode(col)->Value()->SetEvidence(value);
        else
            targetValue=value;
    }
    result.UpdateBeliefs();
    int value=result.GetNode(TARGET)->Value()->GetEvidence();
    if (value==targetValue)
        ... ok it works good
    else
        ... we got a fail
}
So basically I read each line, set the evidence, and get the target value to compare it with the known value from the file.
But GetEvidence() just returns -2. I don't know whether I should use another method, or whether the evidence is supposed to propagate by itself.
I think I can answer myself.
Instead of using GetEvidence (it's normal that it fails, since no evidence was set on the target node O:) ), read the posterior probabilities from the value matrix:
Code: Select all
DSL_Dmatrix* matriz=result.GetNode(TARGET)->Value()->GetMatrix();
double value_state0=matriz->GetItems().Subscript(0);
Hi all,
I'm running a simple test to check the accuracy of the BN I created.
For the test I just learn from the whole data file:
Code: Select all
crossValid->readFile("data.txt");
if (greedy.Learn(*crossValid->m_dataSet,result)!=DSL_OKAY)
{
    ExitProcess(0);
}
int TARGET_INDEX=result.FindNode("Class");
result.SetTarget(TARGET_INDEX);
result.WriteFile("test.xdsl");
After that, if I open the test.xdsl file and enter the evidence from data.txt by hand, I get the correct results on the Class node.
But right after the code above I want to run the same test in my program, so I do the following (I rewrote part of the code by hand for clarity):
Code: Select all
StreamReader* file = new StreamReader("data.txt");
while ( line = file->ReadLine() )
{
    ...
    result.ClearAllEvidence();
    float targetStatus[TARGET_NUM_STATUS]; // NUM_STATUS=2
    while (s is string In Column)
    {
        int value;
        if (s->Equals("Yes"))
            value=0;
        else if (s->Equals("No"))
            value=1;
        if (i==TARGET_INDEX)
        {
            if (value==1)
            {
                targetStatus[0]=1;
                targetStatus[1]=0;
            }
            else
            {
                targetStatus[0]=0;
                targetStatus[1]=1;
            }
        }
        else
            result.GetNode(i)->Value()->SetEvidence(value);
        i++;
    }
    result.UpdateBeliefs();
    DSL_Dmatrix* matriz=result.GetNode(TARGET_INDEX)->Value()->GetMatrix();
    float inferenceStatus[TARGET_NUM_STATUS];
    inferenceStatus[0]=matriz->GetItems().Subscript(0);
    inferenceStatus[1]=matriz->GetItems().Subscript(1);
    float difference=euclideanDistance(targetStatus,inferenceStatus,2);
    if (difference<=1)
    {
        correct++;
    }
    numTotal++;
}
float porcentage=correct/(float)numTotal;
So basically I go through the file I used to create and learn the network, set all the evidence except for the Class node, update the beliefs, and then compare both values using the Euclidean distance.
The problem is that with this method I should get 100%, but I only get 52% or so.
Any idea what I could be doing wrong?
Thank you very much in advance.
Is it possible that you're not doing anything wrong, but the predictions you're making are simply imperfect? Please note that the correct value for the target node is 0 or 1, but that you will not obtain these values after performing inference (you'll get a probability distribution).
Also, why are you checking for difference <= 1?
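To see what a threshold on the Euclidean distance actually tests, here is a short worked derivation for a two-state target. It assumes the comparison is between a one-hot vector for the recorded state and the posterior, and it writes $p$ for the posterior probability of the recorded state and $t$ for the threshold (both symbols are introduced here for illustration).

```latex
\[
d = \sqrt{(1-p)^2 + \bigl(0-(1-p)\bigr)^2} = \sqrt{2}\,(1-p),
\qquad
d \le t \iff p \ge 1 - \frac{t}{\sqrt{2}} .
\]
\[
t = 1 \;\Rightarrow\; p \ge 0.293,
\qquad
t = 0.5 \;\Rightarrow\; p \ge 0.646,
\qquad
p > 0.5 \iff d < \tfrac{\sqrt{2}}{2} \approx 0.707 .
\]
```

Under these assumptions, a threshold of 1 counts a record as correct even when the recorded state has a posterior of only about 0.3, while picking the state with the highest posterior corresponds to a threshold of about 0.707.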
Hi Mark!
I'll try to explain myself better. (By the way, I made a mistake: it shouldn't be difference<=1 but difference<=0.5.)
The Class node has two states, Yes and No, and each of them gets a probability; they are two different states, not a single value that can be 0 or 1.
So if I read from the file that the expected value is Yes, the probabilities should be:
Code: Select all
State(Yes)=1
State(No)=0
And for No:
State(Yes)=0
State(No)=1
So if I get Class=YES in one line, and the probabilities from inference (GetNode->Value->GetMatrix) are, let's say:
Code: Select all
ProbYes=matriz->GetItems().Subscript(0); <-- 0.8
ProbNo=matriz->GetItems().Subscript(1); <-- 0.2
The distance =
sqrt( (State(YES)-ProbYes)^2 + (State(NO)-ProbNo)^2 ) =
sqrt( (1-0.8)^2 + (0-0.2)^2 ) = 0.2828 (it's correct because diff<=0.5)
In a wrong case, I read Class=NO and the probabilities from inference are:
Code: Select all
ProbYes=0.6
ProbNo=0.4
Distance=sqrt( (0-0.6)^2 + (1-0.4)^2 ) = 0.8485 > 0.5, so wrong estimation.
----
The thing is, right after calling result.WriteFile("test.xdsl"), if I open that file and enter each piece of evidence one by one, I get the expected Class as in the file. So I just want to get the same result in code, which should be possible if I can get it in the GeNIe interface, no?
I hope it's clear; if not, let me know. I'm really confused by this problem.
Another example X)
The first line of the file is:
Code: Select all
No No Yes No Yes
Where the last column is the class.
So I go to GeNIe, set the evidence on the first 4 nodes, click Update Beliefs, and I get in the Class node:
Code: Select all
Yes: 85%
No: 15%
But when I read the file in my program, call SetEvidence for the same nodes, and call Value()->GetMatrix() on the Class node, I get:
Code: Select all
(Yes) 0: 0.38
(No) 1: 0.62
I don't really understand why I don't get the same values if the network I'm using is the same :S
Hi Mark!
I've also included the VS 2003 project in case you want to check the whole thing, but the main part is the code I copy below.
http://kile.stravaganza.org/temp/bayesants.zip
Here's the code:
Code: Select all
if (m_dataset.ReadFile(filename)!=DSL_OKAY)
    ExitProcess(0);
if (greedy.Learn(*crossValid->m_dataSet,result)!=DSL_OKAY)
{
    ExitProcess(0);
}
//result.SetTarget(m_studyData->getNumAttributes()-1);
int TARGET_INDEX=result.FindNode("Paliza");
result.SetTarget(TARGET_INDEX);
result.WriteFile("D://testest.xdsl");
StreamReader* file = new StreamReader("D://test2.txt");
String* line;
bool first=true;
int numTotal=-1; // after the first row it's 0
int correct=0;
while ( line = file->ReadLine() )
{
    if (first)
    {
        first=false;
    }
    else
    {
        String *split[];
        String* delimStr = S" ";
        Char delimiter[] = delimStr->ToCharArray();
        split=line->Split(delimiter,20);
        result.ClearAllEvidence();
        IEnumerator* myEnum = split->GetEnumerator();
        int i=0;
        float targetStatus[TARGET_NUM_STATUS];
        while (myEnum->MoveNext())
        {
            String* s = __try_cast<String*>(myEnum->Current);
            int value;
            if (s->Equals("Si"))
                value=0;
            else if (s->Equals("No"))
                value=1;
            if (i==TARGET_INDEX)
            {
                if (value==1)
                {
                    targetStatus[0]=1;
                    targetStatus[1]=0;
                }
                else
                {
                    targetStatus[0]=0;
                    targetStatus[1]=1;
                }
            }
            else
                result.GetNode(i)->Value()->SetEvidence(value);
            i++;
        }
        result.UpdateBeliefs();
        DSL_Dmatrix* matriz=result.GetNode(TARGET_INDEX)->Value()->GetMatrix();
        float inferenceStatus[TARGET_NUM_STATUS];
        inferenceStatus[0]=matriz->GetItems().Subscript(0);
        inferenceStatus[1]=matriz->GetItems().Subscript(1);
        float difference=euclideanDistance(targetStatus,inferenceStatus,2);
        if (difference<=0.5)
        {
            correct++;
        }
    }
    numTotal++;
}
Hi Mark!
To be safe, I've checked that each node is exactly the one I need, using the following statement:
Code: Select all
const char* name=result.GetNode(i)->GetId();
It gives me the variable names in the same order in which they are read. But as you point out, I think my problem comes not from the variable names but from the state names, as another user asked in a previous post:
http://genie.sis.pitt.edu/forum/viewtopic.php?t=217
If the first word read is NO and the second is YES, the states will be (0: NO, 1: YES), but if they appear in a different order the state indices will change too.
I'll try to work out how to load the states and use them without depending on the order in which they are read.
Thank you for your answer. I'll post some results as soon as I get it working.
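One way to avoid depending on the order in which the states were read is to look up each state index by name instead of hard-coding "Yes" = 0 and "No" = 1. The following is a minimal sketch in plain C++, not SMILE code: `stateIndexByName` and `lookupState` are hypothetical helpers, and the list of state names is assumed to have been obtained from the learned network's node in the order the network stores them.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Build a lookup from state name to state index, given the state names in
// the order the network stores them (assumed input, obtained elsewhere).
std::map<std::string, int> stateIndexByName(
    const std::vector<std::string>& stateNames) {
    std::map<std::string, int> index;
    for (std::size_t i = 0; i < stateNames.size(); ++i) {
        index[stateNames[i]] = static_cast<int>(i);
    }
    return index;
}

// Return the index of the named state, or -1 if the name is unknown, so that
// an unexpected token in the data file is detected instead of silently
// reusing an uninitialized value.
int lookupState(const std::map<std::string, int>& index,
                const std::string& name) {
    const auto it = index.find(name);
    return it == index.end() ? -1 : it->second;
}
```

With a lookup like this, the value passed as evidence for a column would come from `lookupState(...)` for that node's own state list, and the same lookup would tell which entry of the posterior corresponds to the recorded class.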