More questions about DSL_dataset's ReadFile

ermutarra
Posts: 17
Joined: Fri Apr 24, 2009 1:19 pm

Post by ermutarra »

Hi,

When I use DSL_dataset's ReadFile method to read my data, say I have the following file:

3 4
2 4
1 4

Then, I've seen in the XML file that it stores the states as "State_1", "State_2" and "State_3". But does it name them in the order it reads them, so that State_1 would be equal to 3, State_2 equal to 2 and State_3 equal to 1? Or does it sort them, so that State_1 actually corresponds to 1?

Is there a way to see this mapping programmatically with the methods in DSL_dataset?

Thank you very much!
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: More questions about DSL_dataset's ReadFile

Post by shooltz[BayesFusion] »

ermutarra wrote:Then, I've seen in the XML file that it stores the states as "State_1", "State_2" and "State_3". But does it name them in the order it reads them, so that State_1 would be equal to 3, State_2 equal to 2 and State_3 equal to 1? Or does it sort them, so that State_1 actually corresponds to 1?
What you see in the .xdsl file is the output of the structure learning algorithm. The state names were created automatically because you didn't provide your own with a call to DSL_dataset::SetStateNames. The automatically appended numeric suffixes correspond to the actual values in the dataset. To see this, try replacing { 3, 2, 1 } in the first column with { 4, 5, 7 }.

BTW: why don't you use GeNIe to view the .xdsl file? It should be much easier to see the structure of the learned network.
ermutarra
Posts: 17
Joined: Fri Apr 24, 2009 1:19 pm

Post by ermutarra »

Thanks, that clarifies things!

Yes, I suppose I should install GeNIe...
kile
Posts: 19
Joined: Sat Apr 25, 2009 3:36 pm

Post by kile »

Hi,

If I have the following file:
NO
YES
YES
NO

When I read the file, state 0 will be NO and state 1 will be YES,
but if the values appear in a different order, this will change.
I would like to know how to get the correct index of a state that was read, so that I can call SetEvidence with the right index.
In other words, once I have read the file, would it be possible to do something like this?

network.GetNode(i)->Value()->SetEvidence("NO");
or something like

status=network.GetNode(i)->Value()->GETSTATUSFROMNAME("NO");
network.GetNode(i)->Value()->SetEvidence(status);
kile
Posts: 19
Joined: Sat Apr 25, 2009 3:36 pm

Post by kile »

OK, I'll answer my own question :) I've written the following function:

// Returns the index of stateName among the states of dataset variable varID,
// or -1 if no state has that name.
int findVarState(int varID, const std::string &stateName)
{
   const DSL_datasetVarInfo &varInfo = m_dataSet->GetVariableInfo(varID);
   for (int j = 0; j < (int)varInfo.stateNames.size(); j++)
   {
      if (stateName == varInfo.stateNames[j])
         return j;
   }
   return -1;
}
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Post by shooltz[BayesFusion] »

kile wrote:Hi,

If I have the following file:
NO
YES
YES
NO

When I read the file, state 0 will be NO and state 1 will be YES,
but if the values appear in a different order, this will change.
Just to clarify things, the order should not change. If a dataset column contains non-numeric values, they are sorted alphabetically in stateNames, and the data contains the corresponding integer indices.
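
To see why the row order in the file does not matter, the described behavior can be imitated without SMILE. This is only a minimal sketch under the assumption stated above (distinct non-numeric values sorted alphabetically, cells stored as indices into that list). buildStateNames and encodeColumn are hypothetical helpers, not SMILE functions, and "alphabetically" is assumed here to mean plain std::string ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for what ReadFile is described as doing with a
// non-numeric column: keep the distinct values, sorted alphabetically.
std::vector<std::string> buildStateNames(std::vector<std::string> column)
{
    std::sort(column.begin(), column.end());
    column.erase(std::unique(column.begin(), column.end()), column.end());
    return column;
}

// Replace each cell with the index of its value in the sorted state list.
std::vector<int> encodeColumn(const std::vector<std::string> &column,
                              const std::vector<std::string> &stateNames)
{
    std::vector<int> data;
    data.reserve(column.size());
    for (const std::string &cell : column)
    {
        // Binary search is valid because stateNames is sorted.
        auto it = std::lower_bound(stateNames.begin(), stateNames.end(), cell);
        assert(it != stateNames.end() && *it == cell); // every cell must be a known state
        data.push_back(static_cast<int>(it - stateNames.begin()));
    }
    return data;
}
```

With this sketch, a column read as NO, YES, YES, NO and one read as YES, NO, NO, YES both give the state list { NO, YES }, so NO is index 0 in either case.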
kile
Posts: 19
Joined: Sat Apr 25, 2009 3:36 pm

Post by kile »

Ah, OK, thank you very much, shooltz. I was confused about that ;)
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Post by shooltz[BayesFusion] »

kile wrote:OK, I'll answer my own question :) I've written the following function:
But if you want to use an entry in the dataset as a parameter when calling SetEvidence, you should do the lookup in the other direction:

1) get the integer value from the dataset
2) use that value as an index to obtain the string from stateNames
3) find the index of that string in the node's outcomes. There is no need for a manual loop here: call DSL_nodeDefinition::GetOutcomesNames, which returns a DSL_idArray, then pass your string to DSL_idArray::FindPosition.

If you plan to run a tight loop over a large dataset, it would be best to build the lookup before entering the main loop body, so that you can convert a raw integer from the dataset to an outcome index in constant time.
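
The precomputed lookup could look roughly like the sketch below. It deliberately does not call SMILE: both name lists are plain vectors standing in for the dataset's stateNames and the node's outcome names, and std::find stands in for DSL_idArray::FindPosition. buildStateToOutcome is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Build, once, a table that maps each dataset state index to the matching
// outcome index of the node, or -1 when the node has no outcome of that name.
// Assumption: stateNames and outcomeNames were copied out of the dataset and
// the node beforehand; this sketch itself makes no SMILE calls.
std::vector<int> buildStateToOutcome(const std::vector<std::string> &stateNames,
                                     const std::vector<std::string> &outcomeNames)
{
    std::vector<int> table(stateNames.size(), -1);
    for (std::size_t s = 0; s < stateNames.size(); ++s)
    {
        auto it = std::find(outcomeNames.begin(), outcomeNames.end(), stateNames[s]);
        if (it != outcomeNames.end())
            table[s] = static_cast<int>(it - outcomeNames.begin());
    }
    return table;
}
```

Inside the main loop, table[rawValue] then gives the outcome index for a raw dataset integer in constant time. A result of -1 means there is no matching outcome, so that entry should not be passed to SetEvidence.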