Hello!
I did structure learning on a data sample. I selected the Tree Augmented Naïve Bayes algorithm and obtained a Bayesian network like the one in picture 1. I have several questions about building the network.
Question 1
During parameter learning, the order of the nodes and states never corresponds to the columns and values in the data. Should I match them manually?
Question 2
I selected the Uniformize method for parameter learning and obtained a score of Log(p)=-503.829825. I do not know what this value means. How should I assess it?
Question 3
I then obtained a Bayesian network like this and ran K-fold cross-validation. The validation results show an accuracy for all 5 nodes of 0.342 (171/500). Is this accuracy good or bad?
How to assess the parameter learning score and the validation results of my Bayesian network
Re: How to assess the parameter learning score and the validation results of my Bayesian network
Question 1
During parameter learning, the order of the nodes and states never corresponds to the columns and values in the data. Should I match them manually?
Yes, you can match the data and the model manually by dragging and dropping state names into the columns and values. A better approach, however, is to pre-process your data by changing digits/numbers into state names before learning. This way, the model and the data will have the same labels and matching will be automatic.
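As a rough illustration of the pre-processing idea, here is a minimal Python sketch that replaces numeric codes with state names before the data is loaded for learning. The column names (Rain, Sprinkler), the state names, and the inline sample data are all hypothetical; substitute the labels your own model uses.

```python
import csv
import io

# Hypothetical mapping from numeric codes to state names, one entry per column.
# These names are assumptions for illustration, not taken from the original model.
STATE_NAMES = {
    "Rain": {"0": "No", "1": "Yes"},
    "Sprinkler": {"0": "Off", "1": "On"},
}


def relabel(rows, mapping):
    """Return new rows with codes replaced by state names where a mapping exists."""
    relabeled_rows = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            # Values without a mapping entry are passed through unchanged.
            new_row[column] = mapping.get(column, {}).get(value, value)
        relabeled_rows.append(new_row)
    return relabeled_rows


# Hypothetical sample data standing in for the real data file.
raw = "Rain,Sprinkler\n0,1\n1,0\n"
rows = list(csv.DictReader(io.StringIO(raw)))
relabeled = relabel(rows, STATE_NAMES)

# Write the relabeled data back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Rain", "Sprinkler"])
writer.writeheader()
writer.writerows(relabeled)
print(out.getvalue())
```

Once the file holds the same labels as the model, no manual dragging and dropping should be needed.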
Question 2
I selected the Uniformize method for parameter learning and obtained a score of Log(p)=-503.829825. I do not know what this value means. How should I assess it?
Log(p), which ranges from minus infinity to zero, is a measure of how well the model fits the data. Its numerical value is most useful for comparing two runs: the higher the number, the better. Please keep in mind that the number is negative, so a lower absolute value means a higher score; -10 is higher than -100.
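To make the comparison concrete, here is a small Python sketch. It assumes the score is a log-likelihood, that is, the sum over records of the log of the probability the model assigns to each record; that is a common definition but an assumption here. The one-parameter coin model and the data are hypothetical.

```python
import math


def log_likelihood(records, p_heads):
    """Sum of log-probabilities of the records under a one-parameter coin model."""
    total = 0.0
    for record in records:
        probability = p_heads if record == "H" else 1.0 - p_heads
        total += math.log(probability)
    return total


# Hypothetical data: 7 heads and 3 tails.
records = ["H"] * 7 + ["T"] * 3

# Two candidate parameterizations, standing in for two learning runs.
ll_a = log_likelihood(records, 0.7)
ll_b = log_likelihood(records, 0.5)

# Both scores are negative; the one closer to zero fits the data better.
better = "A" if ll_a > ll_b else "B"
print(f"run A: {ll_a:.4f}, run B: {ll_b:.4f}, better fit: {better}")
```

The absolute size of the score grows with the number of records, which is one reason a single value such as -503.829825 says little on its own and is best compared with another run on the same data.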
Question 3
I then obtained a Bayesian network like this and ran K-fold cross-validation. The validation results show an accuracy for all 5 nodes of 0.342 (171/500). Is this accuracy good or bad?
I have noticed that you ran accuracy for all five variables. Why did you do this? What is your class node? I would check accuracy for the class node only.
Whether accuracy is good or bad cannot be judged in isolation from the problem. Consider a model that guesses the outcome of a coin toss. What accuracy would you expect? For many data sets, an accuracy of 34.2% is not impressive, but it all depends on the problem. Try learning from some standard data sets, such as the popular data sets from the UC Irvine Machine Learning Repository (e.g., Congressional Voting Records). Other researchers typically report accuracy for these data sets, so you can check how well you are doing compared to other researchers and methods.
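One way to put an accuracy figure in context is to compare it with simple baselines: uniform guessing, and always predicting the most frequent state of the class node. The sketch below is only an illustration. The class labels are hypothetical, and since the reported 0.342 was computed over all five nodes, the comparison is not like-for-like with a single class node.

```python
from collections import Counter


def chance_level(n_states):
    """Expected accuracy of uniform random guessing among n_states outcomes."""
    return 1.0 / n_states


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)


# Accuracy reported in the question (171 correct out of 500).
reported_accuracy = 171 / 500

# Hypothetical class-node labels; replace with the real class column.
labels = ["low"] * 50 + ["medium"] * 30 + ["high"] * 20

baseline = majority_baseline(labels)
coin_toss = chance_level(2)
beats_baseline = reported_accuracy > baseline

print(f"reported: {reported_accuracy:.3f}")
print(f"majority baseline: {baseline:.3f}")
print(f"coin-toss chance level: {coin_toss:.3f}")
print(f"beats majority baseline: {beats_baseline}")
```

A model whose class-node accuracy does not clearly exceed the majority baseline is not adding much over always guessing the most common state.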
I hope this helps,
Marek