Help with SmiLearn changes
Hi,
I am trying to learn a network with continuous variables by modifying code that worked with the previous version of SmiLearn.
The dataset has 30 variables and approx 4800 records. I have run tests where all data are real numbers with no change in the results (the zero values are integers in the original datafile).
Dataset::ReadFile returns 0. Does ReadFile now return DSL_OKAY, as it did in earlier versions (before the last)?
DSL_pattern::ToDAG returns true
However, DSL_pc::Learn returns 1
Any thoughts on what is preventing this from working?
Also, I have another question related to DSL_learnProgress. Does this class enable one to stop the learning code and restart at a later time?
Thanks for your time,
Greg Johnson
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Re: Help with SmiLearn changes
Greg wrote: Dataset::ReadFile returns 0. Does ReadFile now return DSL_OKAY, as it did in earlier versions (before the last)?
That's correct. The type returned from DSL_dataset::ReadFile was changed from bool to int around October 2008.
Greg wrote: DSL_pattern::ToDAG returns true
Yes, this method still returns a bool value.
Greg wrote: However, DSL_pc::Learn returns 1
That's highly unlikely. DSL_pc::Learn has always returned an int and kept the SMILE convention of using negative error codes. Maybe you meant -1 (DSL_GENERAL_ERROR) instead of 1? If you have a console app, try adding the following line before other SMILE calls; you may get some diagnostic messages from the learning code on the screen:
Code: Select all
ErrorH.RedirectToFile(stdout);
Greg wrote: Also, I have another question related to DSL_learnProgress. Does this class enable one to stop the learning code and restart at a later time?
If you derive from DSL_learnProgress and implement the Tick method, you'll be able to stop by returning false. For stop/restart you'd have to enter a loop or wait for an external event inside Tick, because the method is called on the same thread where DSL_pc::Learn runs.
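To illustrate the stop-by-returning-false idea, here is a minimal sketch of a cooperative-cancel callback. Everything in it is a stand-in: Progress, StopAfter, tick and runSteps are hypothetical names, not the SMILE declarations, and the real signature of DSL_learnProgress::Tick should be taken from the SMILE headers.

```cpp
// Hypothetical stand-in for a progress interface; NOT SMILE's DSL_learnProgress.
struct Progress {
    virtual ~Progress() {}
    virtual bool tick() = 0;  // return false to ask the caller to stop
};

// Example callback that allows a fixed number of steps, then requests a stop.
struct StopAfter : Progress {
    int remaining;
    explicit StopAfter(int n) : remaining(n) {}
    bool tick() override { return remaining-- > 0; }
};

// Hypothetical long-running routine. It calls tick() on the caller's thread
// before each step, so a tick() that blocks also pauses the routine.
int runSteps(int total, Progress& p) {
    int done = 0;
    while (done < total && p.tick()) ++done;
    return done;  // number of steps completed before finishing or being stopped
}
```

With a StopAfter(3) callback, runSteps(10, callback) completes 3 steps and then stops. Pausing instead of stopping would mean blocking inside tick() until some external event arrives, as described above.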
Thanks for the quick response!
pc::Learn returns 1 not -1 as expected. However, the error message indicates that there is a mix of continuous and discrete variables. Is there a way to force all variables to be continuous? I have edited the dataset so all data elements have at least one decimal place.
Thanks again,
Greg
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Greg wrote: pc::Learn returns 1 not -1 as expected.
That's a highly unexpected outcome. What's the size/date of the pc.h file?
Greg wrote: However, the error message indicates that there is a mix of continuous and discrete variables. Is there a way to force all variables to be continuous? I have edited the dataset so all data elements have at least one decimal place.
The parser included in SMILE only cares about the actual values in the parsed files, not the notation (so if you have 1.0, 2.0 and 3.0, the column will be considered discrete anyway). You'll need to copy columns or the entire dataset to ensure all columns are continuous.
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Greg wrote: I'm afraid I don't understand what you mean about copying columns or the dataset to ensure all columns are continuous.
After loading the data file, iterate over the dataset columns. For each column, call DSL_dataset::IsDiscrete. If IsDiscrete returns true, you'll have to create a new dataset column with AddFloatVar and copy the values from the discrete column into the newly created continuous column. When the copy is complete, delete the discrete column with DSL_dataset::RemoveVar.
The copy can be done within a single dataset object, or you can create a second dataset.
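The column-copy steps can be sketched as follows. This is only an illustration under stated assumptions: Column, Dataset and makeAllContinuous are hypothetical stand-ins, not SMILE types, and the comments merely indicate which DSL_dataset call named above each step corresponds to. The real signatures need to be checked against the SMILE headers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in types for illustration only; DSL_dataset's real interface differs.
struct Column {
    std::string id;
    bool discrete;
    std::vector<int> ints;      // values when the column is discrete
    std::vector<float> floats;  // values when the column is continuous
};
typedef std::vector<Column> Dataset;

// Replace every discrete column with a continuous copy of the same values.
void makeAllContinuous(Dataset& ds) {
    for (std::size_t i = 0; i < ds.size(); ++i) {
        if (!ds[i].discrete) continue;  // corresponds to the IsDiscrete check
        Column copy;                    // corresponds to AddFloatVar
        copy.id = ds[i].id;
        copy.discrete = false;
        // Copy each integer value into the float column (int -> float conversion).
        copy.floats.assign(ds[i].ints.begin(), ds[i].ints.end());
        ds[i] = copy;                   // old discrete column is dropped, cf. RemoveVar
    }
}
```

The sketch overwrites the column in place to stay short. With the real dataset object the new column would be added first and the discrete one removed afterwards, so the column order may change.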
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Greg wrote: Okay, when I use a different dataset, or a concatenation of several datasets, DSL_pattern::Print produces the following output:
None: 0
Undirected: 1
Directed: 2
Perhaps I am confused about how to use the pattern, dataset and network. After a dataset has been created, I have:
if (!pattern.ToDAG(d, result))
    cout << "\tDSL_pattern::ToDAG returned false!" << endl;
pattern.Print();
cout << "Starting learning... ";
if (error_code = pc.Learn(d,pattern)!=DSL_OKAY)
{
    cout << "Learning failed with error " << error_code << "." << endl;
    return -1;
}
cout << "Learning complete!" << endl;
where result is a DSL_network. Am I using something wrong or forgetting something?
The output above is just a 'legend'. If the pattern contains anything, the legend is followed by a series of numbers (0/1/2) representing the relationships between vertices in the pattern. What does DSL_pattern::GetSize return?
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
ToDAG takes the dataset as input, but it is only used to initialize node identifiers (the actual numeric values from the data are not used). You should follow these steps:
1) obtain a dataset (parse the file, optionally ensure that all columns are continuous)
2) pass the dataset to DSL_pc::Learn. If DSL_pc::Learn returns DSL_OKAY you should have something in the learned pattern.
3) try DSL_pattern::ToDAG - this may return false (ToDAG doesn't remove arcs between pattern vertices; it doesn't convert bidirectional arcs to unidirectional ones). In that case, you'll need to ensure that the pattern is directed and acyclic in your own code, or provide some kind of user interface for this manipulation.
When I invoke ToDAG after DSL_pc::Learn, DSL_pattern::ToDAG returns false, but I get the following output from DSL_pattern::Print, which shows no undirected arcs.
None: 0
Undirected: 1
Directed: 2
0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
0 0 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 2 2 0
0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0
0 0 0 0 0 0 2 0 0 0 0 0 2 0 0 0 0 2 0 0 0 0 2 2 0 0 0 0 0 0
0 0 0 0 0 2 0 0 0 2 0 0 2 0 0 0 2 0 2 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 2 2 0 2 0 0 0 0
0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 2
0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 2 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0
0 2 0 0 2 0 0 0 0 0 0 0 0 0 2 2 0 0 0 0 0 0 2 0 0 2 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 2 2 2 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 2
0 0 0 0 0 0 0 0 0 0 2 0 0 2 2 0 0 2 0 0 2 2 0 0 0 0 0 0 0 0
0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0
0 0 0 0 0 2 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 2 0 0 2 0 2 0 0
0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 2 0 0 0 0 0 2
0 0 0 0 0 0 0 0 2 0 2 0 0 0 0 0 2 0 2 0 0 2 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 0 2 0
0 0 0 2 0 0 0 2 2 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 2 0 0 0
0 0 2 0 0 2 0 0 2 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0
0 0 0 0 0 2 0 2 2 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0 0 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 0 2 0 2 2 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 2 0 2 2 2 0 2 0 0 0 0
0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 2 0
0 0 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0
2 0 2 0 0 0 0 0 0 2 0 0 0 0 2 0 0 0 2 0 0 0 0 2 0 0 0 0 0 0
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
BTW, this...
if (error_code = pc.Learn(d,pattern)!=DSL_OKAY)
... should be changed to this (notice the extra parentheses):
Code: Select all
if ((error_code = pc.Learn(d,pattern))!=DSL_OKAY)
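This probably also explains the mysterious 1: in C++ the != operator binds more tightly than =, so without the extra parentheses error_code receives the bool result of the comparison instead of the value returned by Learn. A minimal sketch, using a hypothetical fakeLearn stand-in rather than the real DSL_pc::Learn, and assuming DSL_OKAY is 0 and DSL_GENERAL_ERROR is -1 as mentioned earlier in the thread:

```cpp
// Stand-ins for illustration only; these are not SMILE declarations.
const int kOkay = 0;           // assumed value of DSL_OKAY
const int kGeneralError = -1;  // assumed value of DSL_GENERAL_ERROR

// Hypothetical replacement for DSL_pc::Learn that always fails.
int fakeLearn() { return kGeneralError; }

// Same expression as in the original if (...): parsed as
// code = (fakeLearn() != kOkay), so code holds the converted bool.
int codeWithoutParens() {
    int code = 0;
    code = fakeLearn() != kOkay;
    return code;
}

// With the extra parentheses the assignment happens first, then the comparison.
int codeWithParens() {
    int code = 0;
    bool failed = (code = fakeLearn()) != kOkay;
    (void)failed;  // only the stored code matters for this demonstration
    return code;
}
```

Here codeWithoutParens() yields 1 while codeWithParens() yields -1, which matches an error message being printed while the reported code was 1.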
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Greg wrote: When I invoke ToDAG after DSL_pc::Learn DSL_pattern::ToDAG returns false but I get the following output from DSL_pattern::Print which shows no undirected arcs.
The pattern with directed arcs may not be a directed acyclic graph. In other words, it may contain a directed cycle. In that case ToDAG returns false.
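One way to check this in your own code is a depth-first search over the printed matrix. The sketch below is not part of SMILE. It assumes the matrix has been copied into a std::vector and that a 2 in row i, column j means a directed arc from vertex i to vertex j. The thread does not confirm that orientation, and hasDirectedCycle and visit are hypothetical helper names.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;

// Assumption: m[i][j] == 2 marks a directed arc from vertex i to vertex j.
const int kDirected = 2;

// Depth-first search with three states per vertex:
// 0 = unvisited, 1 = on the current search path, 2 = fully explored.
static bool visit(const Matrix& m, std::size_t v, std::vector<int>& state) {
    state[v] = 1;
    for (std::size_t w = 0; w < m.size(); ++w) {
        if (m[v][w] != kDirected) continue;
        if (state[w] == 1) return true;  // arc back into the current path: cycle
        if (state[w] == 0 && visit(m, w, state)) return true;
    }
    state[v] = 2;
    return false;
}

// Returns true if the directed arcs in m contain at least one cycle.
bool hasDirectedCycle(const Matrix& m) {
    std::vector<int> state(m.size(), 0);
    for (std::size_t v = 0; v < m.size(); ++v)
        if (state[v] == 0 && visit(m, v, state)) return true;
    return false;
}
```

If that reading of the matrix is right, the output posted above has a 2 both in row 1, column 2 and in row 2, column 1 (counting from zero). This check would report that pair as a two-vertex cycle, which would be consistent with ToDAG returning false.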