EM-learn DBN possible bug [solved]

Post by mvdheijd »

Hi,

I'm trying to use SMILE to learn the parameters of a DBN, and the outcome depends on the structure of the network in a strange way. Learning (from the same data set) either succeeds, returns an error from em.Learn(), or crashes with a segmentation fault. I've tried to reduce the code to the minimum needed to reproduce the problem; it now consists mainly of code taken from the tutorial:

Code:

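#include "lib/smile.h"
#include "lib/smilearn.h"
#include <cassert>   // assert
#include <cstdlib>   // exit, malloc, free, atoi
#include <cstring>   // strlen, strncpy, strcmp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>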

using namespace std;

DSL_network MakeNetwork(const char* netfile, const char* datafile, const char def) {

  DSL_network net;
  if (net.ReadFile(netfile, DSL_XDSL_FORMAT) != DSL_OKAY) {
    cerr << "Error: Cannot read model file... exiting." << endl;
    exit(1);
  }
  cerr << "Read model with " << net.GetNumberOfNodes() << " nodes." << endl;

  DSL_dataset ds;
  if (ds.ReadFile(datafile) != DSL_OKAY) {
    cerr << "Error: Cannot read data file... exiting." << endl;
    exit(1);
  }

  //match up variables, assumes 'varname_timeslice' 
  vector<DSL_datasetMatch> dsMap(ds.GetNumberOfVariables());
  int v=0; //counts matching variables (# in ds is not necessarily == # in net)
  for (int i = 0; i < ds.GetNumberOfVariables(); i++) {
    string id = ds.GetId(i);
    const char* idc = id.c_str();
    for (int j = 0; j < (int) strlen(idc); j++) {
      if (idc[j] == '_') {
        char* nodeId = (char*) malloc((j+1) * sizeof(char));
        strncpy(nodeId, idc, j);
        nodeId[j] = '\0';

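        if (strcmp(nodeId,"exacerbationA")==0) {
          if (def == 'A') {
            nodeId[j-1] = '\0'; //cut off the definition letter to match with the 'exacerbation' var
          } else {
            free(nodeId); //column is skipped, so release the buffer before leaving the loop
            break;
          }
        } else if (strcmp(nodeId,"exacerbationB")==0) {
          if (def == 'B') {
            nodeId[j-1] = '\0'; //cut off the definition letter to match with the 'exacerbation' var
          } else {
            free(nodeId); //column is skipped, so release the buffer before leaving the loop
            break;
          }
        }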
        int index = net.FindNode(nodeId);
        assert(index >= 0);
        DSL_intArray orders;
        net.GetTemporalOrders(index, orders);

        dsMap[v].node   = index;
        dsMap[v].slice  = atoi(idc + j + 1);
        dsMap[v].column = i;
        v++;
        free(nodeId);
        break;
      }
    }
  }
  dsMap.resize(v); //resize vector to actual number of used variables (instead of number in dataset)

  //match variable states
  for (int i = 0; i < dsMap.size(); i++) {
    DSL_datasetMatch &p = dsMap[i];
    int index = p.node;

    DSL_nodeDefinition* def = net.GetNode(index)->Definition();
    DSL_idArray* ids = def->GetOutcomesNames();
    const DSL_datasetVarInfo &varInfo = ds.GetVariableInfo(p.column);
    const vector<string> &stateNames = varInfo.stateNames;
    vector<int> map(stateNames.size(), -1);
    for (int j = 0; j < (int) stateNames.size(); j++) {
      const char* id = stateNames[j].c_str();
      for (int k = 0; k < ids->NumItems(); k++) {
        char* tmpid = (*ids)[k];
        if (!strcmp(id, tmpid)) {
          map[j] = k;
        }
      }
    }
    for (int k = 0; k < ds.GetNumberOfRecords(); k++) {
      if (ds.GetInt(p.column, k) >= 0) {
        ds.SetInt(p.column, k, map[ds.GetInt(p.column, k)]); //write back to the column that was read (i != p.column once a column is skipped)
      }
    }
  }

  DSL_em em;
  if (em.Learn(ds, net, dsMap) != DSL_OKAY) {
    cerr << "Error: Cannot learn parameters... exiting." << endl;
    exit(1);
  }
  cerr << "Learned parameters with EM." << endl;

  return net;
}


int main(int argc, char* argv[]) {

  if (argc != 3) {
    cerr << "Expecting 2 arguments <netfile> <datafile> ... exiting." << endl;
    exit(1);
  }

  const char* netfile  = argv[1];
  const char* datafile = argv[2];

  DSL_network net = MakeNetwork(netfile, datafile, 'A');
  cerr << "Done." << endl;
  return 0;
}

I'm using the g++ (GCC) 4.5.1 20100924 (Red Hat 4.5.1-4) compiler on Fedora 14 (i686).
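
The column-matching loop in the program above splits each data set column name at the first underscore and, for the two exacerbation definitions, strips the trailing letter. A minimal sketch of the same logic with std::string instead of malloc/strncpy might look like the following. SplitColumnId, ResolveNodeId and ColumnId are hypothetical names introduced here for illustration; they are not part of SMILE.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper type (not part of SMILE): a column id split as "node_slice".
struct ColumnId {
    std::string node;  // text before the first '_'
    int slice;         // integer parsed from the text after the first '_'
    bool ok;           // false when the id contains no '_'
};

// Splits an id such as "cough_3" into node "cough" and slice 3.
ColumnId SplitColumnId(const std::string& id) {
    ColumnId out;
    out.slice = 0;
    out.ok = false;
    std::string::size_type pos = id.find('_');
    if (pos == std::string::npos) {
        return out;  // no underscore: not a 'varname_timeslice' column
    }
    out.node = id.substr(0, pos);
    out.slice = std::atoi(id.c_str() + pos + 1);
    out.ok = true;
    return out;
}

// Maps "exacerbationA"/"exacerbationB" to "exacerbation" when the letter
// matches def. Returns false when the column belongs to the other definition
// and should be skipped. All other names are left unchanged.
bool ResolveNodeId(std::string& node, char def) {
    if (node == "exacerbationA" || node == "exacerbationB") {
        if (node[node.size() - 1] != def) {
            return false;
        }
        node.erase(node.size() - 1);  // drop the definition letter
    }
    return true;
}
```

With these helpers, a column would be used only when SplitColumnId reports ok and ResolveNodeId returns true; the resulting node name would then go to net.FindNode as in the original loop. Since no buffer is allocated by hand, there is nothing to free on the skip path.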

The model is fairly simple, with 9 variables (it's about COPD, in case anyone wondered). The crashes can be reproduced with a single order-1 parent. The probabilities shown below were initialised randomly.

Code:

<?xml version='1.0'?>
<smile id="aerial_model" version="1.0">
<nodes>
<cpt dynamic="plate" id="exacerbation"><state id="False" /><state id="True" /><probabilities>0.243537334706 0.756462665294 </probabilities></cpt>
<cpt dynamic="plate" id="dyspnea"><state id="False" /><state id="True" /><parents>exacerbation </parents><probabilities>0.860181812722 0.139818187278 0.82267139849 0.17732860151 </probabilities></cpt>
<cpt dynamic="plate" id="cough"><state id="False" /><state id="True" /><parents>exacerbation </parents><probabilities>0.714359611857 0.285640388143 0.335642119973 0.664357880027 </probabilities></cpt>
<cpt dynamic="plate" id="activity"><state id="False" /><state id="True" /><parents>exacerbation </parents><probabilities>0.109020152567 0.890979847433 0.97315730061 0.0268426993897 </probabilities></cpt>
<cpt dynamic="plate" id="malaise"><state id="False" /><state id="True" /><parents>cough </parents><probabilities>0.0926196683246 0.907380331675 0.694698962139 0.305301037861 </probabilities></cpt>
<cpt dynamic="plate" id="sputumVol"><state id="False" /><state id="True" /><parents>cough </parents><probabilities>0.0171582009709 0.982841799029 0.311515553305 0.688484446695 </probabilities></cpt>
<cpt dynamic="plate" id="sputumCol"><state id="False" /><state id="True" /><parents>sputumVol </parents><probabilities>0.728338637561 0.271661362439 0.0845809574053 0.915419042595 </probabilities></cpt>
<cpt dynamic="plate" id="wheeze"><state id="False" /><state id="True" /><parents>dyspnea sputumCol </parents><probabilities>0.875434963447 0.124565036553 0.323648097991 0.676351902009 0.350625515093 0.649374484907 0.399005620822 0.600994379178 </probabilities></cpt>
<cpt dynamic="plate" id="temperature"><state id="False" /><state id="True" /><parents>sputumVol </parents><probabilities>0.984635551874 0.0153644481258 0.80365760654 0.19634239346 </probabilities></cpt>
</nodes>
<dynamic numslices="2">
<cpt id="wheeze" order="1"><parents>dyspnea</parents><probabilities>0.593322483894 0.406677516106 0.763396055827 0.236603944173 0.200441777266 0.799558222734 0.940747671441 0.0592523285594 0.736401644759 0.263598355241 0.79469092255 0.20530907745 0.194799918165 0.805200081835 0.140266467197 0.859733532803</probabilities></cpt>
</dynamic>
</smile>

With this code and model I get the following result:

Read model with 9 nodes.
Error: Cannot learn parameters... exiting.

Changing the dynamic parent of wheeze from dyspnea to either cough, activity or exacerbation leads to:

Read model with 9 nodes.
Segmentation fault (core dumped)

Changing it to a temporal self-loop or to malaise, sputumVol, sputumCol or temperature succeeds with:

Read model with 9 nodes.
Learned parameters with EM.
Done.

Looking at this partition, it appears that the order of the variables in the model specification matters. Indeed, moving dyspnea to appear just above wheeze makes learning succeed. It therefore seems that there is some fragility in how models are read. I had already noticed that, even though xdsl is an XML format, there are hidden assumptions about the order of the variables: loading fails when the variables are not in topological order. These crashes, however, suggest that there is some further constraint that I have unknowingly violated and which is possibly a bug.

[edit]
Just a quick clarification: I'm generating the xdsl files from the output of a structure learner, so making it work by manually shuffling variables around is not really feasible.
[/edit]
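
Because the xdsl files are generated, the topological-order requirement mentioned above could be met in the generator rather than by hand. The following is a minimal sketch of Kahn's algorithm over the within-slice parent lists. NodeSpec and TopologicalOrder are hypothetical names, not SMILE types, and the sketch assumes that only the parents listed in the nodes section (not the temporal arcs in the dynamic section) constrain the order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical description of one node: its id and the ids of its parents.
struct NodeSpec {
    std::string id;
    std::vector<std::string> parents;
};

// Returns the nodes reordered so that every parent precedes its children.
// Returns an empty vector if a parent id is unknown or the graph has a cycle.
std::vector<NodeSpec> TopologicalOrder(const std::vector<NodeSpec>& nodes) {
    std::map<std::string, std::size_t> indexOf;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        indexOf[nodes[i].id] = i;
    }

    // pending[i] counts parents of node i that have not been emitted yet.
    std::vector<std::size_t> pending(nodes.size(), 0);
    std::vector<std::vector<std::size_t> > children(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        for (std::size_t p = 0; p < nodes[i].parents.size(); ++p) {
            std::map<std::string, std::size_t>::const_iterator it =
                indexOf.find(nodes[i].parents[p]);
            if (it == indexOf.end()) {
                return std::vector<NodeSpec>();  // unknown parent id
            }
            children[it->second].push_back(i);
            ++pending[i];
        }
    }

    // Start with the nodes that have no parents, in their original order.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (pending[i] == 0) ready.push(i);
    }

    std::vector<NodeSpec> ordered;
    while (!ready.empty()) {
        std::size_t n = ready.front();
        ready.pop();
        ordered.push_back(nodes[n]);
        for (std::size_t c = 0; c < children[n].size(); ++c) {
            if (--pending[children[n][c]] == 0) ready.push(children[n][c]);
        }
    }

    if (ordered.size() != nodes.size()) {
        ordered.clear();  // some nodes were never released: the graph has a cycle
    }
    return ordered;
}
```

The generator would call TopologicalOrder on the learned structure and write the cpt elements in the returned order. Whether this ordering alone satisfies every constraint of the xdsl reader is not established here.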

To make it possible to reproduce what I have done, I have also attached my data set. It has 320 columns (10 variables, 9 of them used, each with a time series of length 32) but only 10 rows and a lot of missing values (hence EM).
[edit]Removed the data now that the issue is solved.[/edit]
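
With this many missing values, the state-matching step in the program above has to leave missing cells alone while translating the rest. The following is a small sketch of that remapping in isolation, assuming (as the `>= 0` check in the program suggests) that a negative integer marks a missing value. BuildStateMap and RemapValue are hypothetical names, not SMILE functions, and returning -1 for an out-of-range index is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// For each state name used in the data, finds the index of the state with the
// same name in the network; -1 when the network has no such state.
std::vector<int> BuildStateMap(const std::vector<std::string>& dataStates,
                               const std::vector<std::string>& netStates) {
    std::vector<int> map(dataStates.size(), -1);
    for (std::size_t j = 0; j < dataStates.size(); ++j) {
        for (std::size_t k = 0; k < netStates.size(); ++k) {
            if (dataStates[j] == netStates[k]) {
                map[j] = static_cast<int>(k);
            }
        }
    }
    return map;
}

// Translates one data cell. Negative values (assumed to mean "missing") are
// passed through unchanged; out-of-range indices become -1.
int RemapValue(int value, const std::vector<int>& map) {
    if (value < 0) {
        return value;
    }
    if (static_cast<std::size_t>(value) >= map.size()) {
        return -1;
    }
    return map[value];
}
```

A state name that appears in the data but not in the network maps to -1 here, so such a cell ends up treated like a missing value rather than indexing past the end of the map.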

Any help on this issue would be greatly appreciated, as my data analysis rather depends on getting this to work. If any further information is needed I'd be glad to provide it.

Thanks in advance,

Maarten van der Heijden
----
PhD-student
Model-Based System Development
Radboud University Nijmegen
Last edited by mvdheijd on Wed Dec 12, 2012 9:09 pm, edited 1 time in total.

Re: EM-learn DBN possible bug

Post by shooltz[BayesFusion] »

Using the xdsl included in the message body I'm getting "Learned parameters with EM.". However, I'm running the test on Win32.

Our binaries for Linux/x86 are compiled with gcc 4.4.5 - can you try running your code on a Linux box with that version of the compiler?
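
Since the question above comes down to which compiler built the program, it may help to have the binary report that itself. The following is a small sketch using the predefined macros __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ that GCC provides. CompilerVersion is a hypothetical helper name and not part of SMILE.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns a short description of the compiler that built this translation unit.
// Clang is checked first because it also defines __GNUC__ for compatibility.
std::string CompilerVersion() {
    std::ostringstream os;
#if defined(__clang__)
    os << "clang " << __clang_major__ << "." << __clang_minor__ << "."
       << __clang_patchlevel__;
#elif defined(__GNUC__)
    os << "gcc " << __GNUC__ << "." << __GNUC_MINOR__ << "."
       << __GNUC_PATCHLEVEL__;
#else
    os << "unknown";
#endif
    return os.str();
}
```

Printing the result of CompilerVersion() at the start of main (for example to cerr) would make it easy to compare the version used for the test program with the one used for the library binaries.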

Re: EM-learn DBN possible bug

Post by mvdheijd »

Thank you for your suggestion; it does indeed appear to be a compiler/toolchain issue. I got it working in a virtual machine, both with a downgraded 4.4.7 compiler and with 4.5.3. Thanks again!