I am running a simulation study in which I use static EM to estimate the parameters of a BN. I assume that SMILE stores the CPs in a matrix according to the rule described in the manual, and I extract the CPs by following that same rule in the code below. However, I noticed that the states of a node seem to flip between runs on different datasets.
For example, in my BN a latent node B depends on a latent node A. Both nodes have two states: 0 = false, 1 = true, so I did not define the states explicitly when creating the network. To generate the data, the true values of P(B=1|A=1) and P(B=1|A=0) were set to .60 and .40, respectively. I then ran static EM on 100 datasets, with the seed set to 0 for all runs. Below is part of the output I extracted from the CP tables using the attached code. It seems that the order of the states of node B in the CP tables was assigned randomly. If so, is there any way to fix it?
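To make that indexing assumption concrete, here is a minimal sketch of the layout as I understand it: parent coordinates come first and the node's own state varies fastest. The helper `flatIndex` is a hypothetical name used only for illustration; it is not a SMILE function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SMILE): flat index into a CPT stored with
// parent coordinates first and the node's own state as the last,
// fastest-varying coordinate.
std::size_t flatIndex(const std::vector<std::size_t>& coords,
                      const std::vector<std::size_t>& sizes) {
    std::size_t idx = 0;
    for (std::size_t i = 0; i < coords.size(); ++i) {
        // Shift the index accumulated so far by the size of this dimension,
        // then add the coordinate along this dimension.
        idx = idx * sizes[i] + coords[i];
    }
    return idx;
}
```

Under this assumption, for B with the single binary parent A, the coordinates (A=1, B=0) map to element 2 and (A=1, B=1) to element 3.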
dataset P_B_1_A_1 P_B_0_A_1 P_B_1_A_0 P_B_0_A_0
1 0.593043 0.406957 0.415188 0.584812
2 0.60443 0.39557 0.410549 0.589451
3 0.603798 0.396202 0.390583 0.609417
4 0.405125 0.594875 0.599858 0.400142
5 0.389113 0.610887 0.612045 0.387955
6 0.401198 0.598802 0.609971 0.390029
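One possible post-processing workaround, given only as a sketch: it assumes the only problem is a swap of state labels, and that the true ordering P(B=1|A=1) > P(B=1|A=0) is known, as it is in this simulation. The struct `CptB` and the function `alignLabels` are hypothetical names, not SMILE API. Because the true values .60 and .40 sum to 1, a flip of A's labels would produce the same pattern in this table, so this check cannot tell which node was flipped.

```cpp
#include <utility>

// Hypothetical container for the four entries of P(B | A).
struct CptB {
    double b0_a0, b1_a0, b0_a1, b1_a1;
};

// Swap the state labels of B when the estimate violates the assumed ordering
// P(B=1|A=1) > P(B=1|A=0). Returns true if a swap was applied.
bool alignLabels(CptB& t) {
    if (t.b1_a1 >= t.b1_a0) {
        return false;  // ordering already matches the assumption
    }
    std::swap(t.b0_a0, t.b1_a0);  // relabel B within the A=0 column
    std::swap(t.b0_a1, t.b1_a1);  // relabel B within the A=1 column
    return true;
}
```

Applied to dataset 4 above, this would report a swap and leave P(B=1|A=1) = 0.594875 and P(B=1|A=0) = 0.400142. Any node that has B as a parent would need the same relabeling for the results to stay consistent.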
Thanks a lot!
Bo
Code:
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include "smile.h"

void staticEM() {
    // Load the data file.
    DSL_dataset LT;
    std::string errMsg;
    int parseCode = LT.ReadFile("C:\\Users\\User\\Desktop\\C++\\LIN_4.txt", NULL, &errMsg);
    if (parseCode != DSL_OKAY) {
        std::cout << "Cannot read data file...exiting" << std::endl;
        std::cout << "Error code:" << parseCode << std::endl;
        std::cout << "Error message:" << errMsg << std::endl;
        exit(1);
    }

    // Load the network whose parameters will be learned.
    DSL_network LIN_4T;
    if (LIN_4T.ReadFile("LIN_4T.xdsl", DSL_XDSL_FORMAT) != DSL_OKAY) {
        std::cout << "Cannot read network...exiting." << std::endl;
        exit(1);
    }

    // Match dataset columns to network nodes.
    std::vector<DSL_datasetMatch> matches;
    std::string err;
    if (LT.MatchNetwork(LIN_4T, matches, err) != DSL_OKAY) {
        std::cout << "Cannot match network...exiting." << std::endl;
        std::cout << err << std::endl;
        exit(1);
    }

    // Run EM with a fixed seed.
    int seed = 0;
    double loglik;
    DSL_em em;
    em.SetSeed(seed);
    if (em.Learn(LT, LIN_4T, matches, &loglik) != DSL_OKAY) {
        std::cout << "Cannot learn parameters...exiting." << std::endl;
        exit(1);
    }
    LIN_4T.UpdateBeliefs();

    // ATTRIBUTE CPs
    // Node A has no parents: elements 0 and 1 are P(A=0) and P(A=1).
    int A = LIN_4T.FindNode("A");
    const DSL_Dmatrix* mtx_A = LIN_4T.GetNode(A)->Definition()->GetMatrix();
    double P_A_0 = (*mtx_A)[0];
    double P_A_1 = (*mtx_A)[1];

    // Node B has parent A: the state of B varies fastest in the flat matrix.
    int B = LIN_4T.FindNode("B");
    const DSL_Dmatrix* mtx_B = LIN_4T.GetNode(B)->Definition()->GetMatrix();
    double P_B_0_A_0 = (*mtx_B)[0];
    double P_B_1_A_0 = (*mtx_B)[1];
    double P_B_0_A_1 = (*mtx_B)[2];
    double P_B_1_A_1 = (*mtx_B)[3];