How to assess the parameter learning score and the validation results of my Bayesian network

wxk8000
Posts: 20
Joined: Fri Jan 19, 2018 11:58 am

Post by wxk8000 »

Hello!
I performed structure learning on my data sample using the Tree Augmented Naive Bayes algorithm and obtained a Bayesian network like the one in picture 1. I have several questions about building this network.
Question 1
During parameter learning, the order of the nodes and states never corresponds to the columns and values in the data. Should I match them manually?
[Attachment: 11.png]
Question 2
I selected the Uniformize method for parameter learning and obtained a score of Log(p) = -503.829825. I do not know what this value means. How should I assess it?
[Attachment: 55.png]
Question 3
I then obtained a Bayesian network like this one and ran K-fold cross-validation on it. The validation results show an accuracy of 0.342 (171/500) for all 5 nodes. Is this accuracy good or bad?
[Attachment: 66.png]
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: How to assess the parameter learning score and the validation results of my Bayesian network

Post by marek [BayesFusion] »

Question 1
During parameter learning, the order of the nodes and states never corresponds to the columns and values in the data. Should I match them manually?
Yes, you can match the data and the model manually by dragging and dropping state names into the columns and values. Better, however, is to pre-process your data by changing digits/numbers into state names before learning. This way, you will have the same labels in the model and in the data and matching will be automatic.
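The pre-processing step can be done in any tool before loading the file for learning. Below is a minimal Python sketch of the idea. The column names, numeric codes, and state names in STATE_NAMES are hypothetical and should be replaced with the ones used in your own model and data file.

```python
import csv
import io

# Hypothetical mapping from numeric codes in the data to state names in the model.
STATE_NAMES = {
    "Smoking": {"0": "No", "1": "Yes"},
    "Age": {"1": "Young", "2": "Middle", "3": "Old"},
}

def relabel_rows(rows, state_names):
    """Return copies of the rows with numeric codes replaced by state names."""
    relabeled = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            mapping = state_names.get(column, {})
            # Values without a defined mapping are left untouched.
            new_row[column] = mapping.get(value.strip(), value)
        relabeled.append(new_row)
    return relabeled

# A small in-memory stand-in for a real data file.
raw_csv = "Smoking,Age\n0,1\n1,3\n1,2\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))
relabeled = relabel_rows(rows, STATE_NAMES)

# Write the relabeled data back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Smoking", "Age"], lineterminator="\n")
writer.writeheader()
writer.writerows(relabeled)
print(out.getvalue())
```

Once the data file contains the same labels as the model, no manual matching should be needed.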
Question 2
I selected the Uniformize method for parameter learning and obtained a score of Log(p) = -503.829825. I do not know what this value means. How should I assess it?
Log(p), which ranges from minus infinity to zero, is a measure of how well the model fits the data. Its numerical value is most useful for comparing two runs on the same data: the higher the number, the better. Please keep in mind that the number is negative, so a lower absolute value means a higher score (-10 is higher than -100).
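To illustrate how such a score behaves, here is a small Python sketch. It assumes that Log(p) is the log-likelihood of the data, that is, the sum over all records of the natural logarithm of the probability the model assigns to each record. That reading, the natural-log base, and the per-record probabilities below are assumptions made for illustration only, not values taken from the software.

```python
import math

def log_likelihood(record_probs):
    """Sum of natural logs of the probabilities a model assigns to each record."""
    return sum(math.log(p) for p in record_probs)

def geometric_mean_probability(log_p, n_records):
    """Average per-record probability implied by a total log-likelihood."""
    return math.exp(log_p / n_records)

# Hypothetical per-record probabilities from two models scored on the same data.
model_a = [0.20, 0.10, 0.25, 0.05]
model_b = [0.30, 0.15, 0.25, 0.10]

ll_a = log_likelihood(model_a)
ll_b = log_likelihood(model_b)

# Both scores are negative; the one closer to zero indicates the better fit.
better = "B" if ll_b > ll_a else "A"
print(f"Log(p) A = {ll_a:.3f}, Log(p) B = {ll_b:.3f}, better fit: {better}")
```

Because the score is a sum over records, it grows more negative as the data set grows, which is one reason it is only meaningful when both runs use the same data. Dividing by the number of records, as geometric_mean_probability does, gives a per-record figure that is easier to interpret.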
Question 3
I then obtained a Bayesian network like this one and ran K-fold cross-validation on it. The validation results show an accuracy of 0.342 (171/500) for all 5 nodes. Is this accuracy good or bad?
I noticed that you computed accuracy for all five variables. Why did you do this? What is your class node? I would check accuracy for the class node only.

Whether accuracy is good or bad cannot be judged in isolation from the problem. Consider a model that guesses the outcome of a coin toss. What accuracy would you expect? For many data sets, an accuracy of 34.2% is not impressive, but it all depends on the problem. Try learning from some standard data sets, such as the popular ones from the UCI (Irvine) Machine Learning Repository (e.g., Congressional Voting Records). Other researchers typically report accuracy for these data sets, so you can check how well you are doing compared to other researchers and methods.
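One simple way to put an accuracy figure in context is to compare it with trivial baselines for the class node: always predicting the most frequent state, and guessing uniformly at random. The Python sketch below shows the idea. The class labels are hypothetical, and 0.342 is used only as a placeholder for the accuracy you measure for your class node.

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy of always predicting the most frequent class state."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

def uniform_baseline(labels):
    """Expected accuracy of guessing uniformly among the observed states."""
    return 1 / len(set(labels))

def beats_baselines(accuracy, labels):
    """True only if the accuracy exceeds both trivial baselines."""
    return accuracy > max(majority_baseline(labels), uniform_baseline(labels))

# Hypothetical class-node values: 50 "low", 30 "medium", 20 "high".
labels = ["low"] * 50 + ["medium"] * 30 + ["high"] * 20

observed_accuracy = 0.342  # placeholder for the measured class-node accuracy
print("majority baseline:", majority_baseline(labels))
print("uniform baseline:", round(uniform_baseline(labels), 3))
print("beats baselines:", beats_baselines(observed_accuracy, labels))
```

A model that does not beat the majority baseline has learned little about the class node, whatever the raw accuracy number is.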
I hope this helps,

Marek