Hi,
I used validation for a DBN in GeNIe, but I am not sure how the output prediction probabilities are computed.
For example, is the leave-one-out method repeated N times (where N is the data size), using N-1 records for training and leaving one record out for testing (a different record on each repetition)?
Are the output results the average probabilities?
I ran the leave-one-out method to evaluate the predictive accuracy of my model, but I am getting some strange results (accuracy = 0.99 at time slices 0 and 1, and accuracy = 1 at time slice 2).
Do I understand the validation procedure in GeNIe correctly, or am I missing something?
Thank you in advance,
Kalia
Validation methods in GeNIe
- Site Admin
- Posts: 1417
- Joined: Mon Nov 26, 2007 5:51 pm
Re: Validation methods in GeNIe
korfan01 wrote: For example, is the leave-one-out method repeated N times (where N is the data size), using N-1 records for training and leaving one record out for testing (a different record on each repetition)?

Yes, that's correct. See http://en.wikipedia.org/wiki/Cross-vali ... validation for more info.

korfan01 wrote: Are the output results the average probabilities?

No, the accuracy value calculated by GeNIe is defined as the number of correctly classified cases divided by the total number of relevant records. For node accuracy, the number of records is equal to the total record count in the data. For outcome accuracy, the total is the number of records with the specified outcome.
Note that regardless of the validation method used (k-fold, leave one out, test only), each of the data records will be used exactly once for classification.
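To make the accuracy definitions above concrete, here is a minimal sketch of the two ratios described (node accuracy over all records, outcome accuracy over records with a given actual outcome). The helper functions `node_accuracy` and `outcome_accuracy` are hypothetical names for illustration only, not part of the GeNIe/SMILE API:

```python
def node_accuracy(predicted, actual):
    """Node accuracy: correctly classified records / total record count."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def outcome_accuracy(predicted, actual, outcome):
    """Outcome accuracy: restricted to records whose actual class is `outcome`."""
    relevant = [(p, a) for p, a in zip(predicted, actual) if a == outcome]
    correct = sum(p == a for p, a in relevant)
    return correct / len(relevant)
```

For example, with predictions ["yes", "no", "yes", "yes"] against actual values ["yes", "no", "no", "yes"], the node accuracy is 3/4 = 0.75, while the outcome accuracy for "no" is 1/2 = 0.5.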
BTW, are you using the validation on the unrolled network?
Re: Validation methods in GeNIe
Yes, I used an unrolled DBN for the validation process. Thank you for your answer; I found out what was going wrong with my results, but I have one more question.
How are the final conditional probabilities of the model computed after the validation process? Are they the average over all training repetitions, or the probabilities learned in the last training run?
Also, is it possible to access these parameters after the validation process?
Thanks,
Kalia
Re: Validation methods in GeNIe
kalia_or wrote: How are the final conditional probabilities of the model computed after the validation process? Are they the average over all training repetitions, or the probabilities learned in the last training run?

Validation does not modify the conditional probabilities of the model you have open in SMILE. When you run in leave-one-out or K-fold mode, the EM phase works on a copy of the network. Depending on your choice of EM parameters, the conditionals can be uniformized, randomized, or used as a starting point with a given confidence level. When EM completes the training phase (and a new set of conditionals is obtained), the records not used in training are instantiated, and the posteriors of the class node(s) are compared to the actual values in the data. This gives the accuracy, which is the output of validation.

kalia_or wrote: Also, is it possible to access these parameters after the validation process?

If you mean the modified conditionals, the answer is no. You should run EM to obtain new parameters.
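The workflow described above (EM runs on a copy, each held-out record is classified exactly once, and the original model's parameters are untouched) can be sketched as a leave-one-out loop. This is illustrative pseudo-workflow only; `train_em` and `predict_class` are hypothetical callbacks standing in for the real SMILE learning and inference calls:

```python
import copy

def leave_one_out(records, train_em, predict_class, base_net):
    """Leave-one-out sketch: each record is held out exactly once.
    EM works on a deep copy of the network, so base_net is never modified."""
    correct = 0
    for i, held_out in enumerate(records):
        net = copy.deepcopy(base_net)            # EM phase works on a copy
        training = records[:i] + records[i + 1:]  # N-1 records for training
        train_em(net, training)                   # hypothetical EM step
        if predict_class(net, held_out) == held_out["class"]:
            correct += 1
    return correct / len(records)                 # GeNIe-style accuracy
```

After the loop, `base_net` still holds its original conditionals, which matches the answer above: to obtain new parameters, EM must be run separately on the full data.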
Re: Validation methods in GeNIe
One last question: how can I use cross-validation to evaluate a DBN prognostic model?
Suppose the DBN has 5 variables [A, B, C, D, E] and is unrolled for 3 time slices [t0, t1, t2], and the goal is to predict the class value at t=2 (the last time slice), e.g. E_2,
given all the observations up to t=1 [A_0, A_1, B_0, B_1, C_0, C_1, D_0, D_1].
Thus, all the variables at t=2 will be hidden. Is it correct to select all the hidden variables as class variables in the cross-validation window, or is there another way to evaluate prognostic models?
Re: Validation methods in GeNIe
korfan01 wrote: Thus, all the variables at t=2 will be hidden. Is it correct to select all the hidden variables as class variables in the cross-validation window, or is there another way to evaluate prognostic models?

Yes, that's correct.