Bayesian search and negative best score

ind
Posts: 8
Joined: Fri May 03, 2013 9:36 pm

Bayesian search and negative best score

Post by ind »

Could you please explain what a negative best score signifies? How should I interpret it? For example, the best score in iteration x is -70 000. Is that good or bad? What about -170 000? I expected the score to be positive, and intuition tells me that the bigger the positive score, the better.

Kind regards,
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Bayesian search and negative best score

Post by Martijn »

Hi Ind,

The score is a log-likelihood.
Since the likelihood is a number between 0 and 1, the log-likelihood will be negative.
For the log-likelihood, higher (in this case, closer to 0) is better.
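To illustrate the point (a minimal Python sketch, not GeNIe code): the log-likelihood of a dataset is a sum of log-probabilities, each of which is at most 0, so the total is negative, and a better-fitting model pushes it closer to 0.

```python
import math

def log_likelihood(data, p):
    """Log-likelihood of binary data under a Bernoulli(p) model.
    Each term log P(x) <= 0, so the sum is always negative."""
    return sum(math.log(p if x == 1 else 1 - p) for x in data)

data = [1, 1, 0, 1]  # three ones, one zero

poor_fit = log_likelihood(data, 0.5)
good_fit = log_likelihood(data, 0.75)  # 0.75 matches the empirical frequency

print(poor_fit)  # about -2.77
print(good_fit)  # about -2.25, i.e. higher (closer to 0) = better fit
```

Both scores are negative; the better parameter simply gives the less negative one.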

Best,

Martijn
ind
Posts: 8
Joined: Fri May 03, 2013 9:36 pm

Re: Bayesian search and negative best score

Post by ind »

Thank you for your answer! It seems that fewer nodes give better results. The question then arises: should I always prefer models whose score is closer to zero? Please tell me about your experience.
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Bayesian search and negative best score

Post by Martijn »

No, that view is too simplistic.

Consider building a classifier that has a class variable and a few feature variables.
Learning parameters for this model gives you a likelihood score.
I can guarantee that if you drop the feature variables and learn parameters for just the class variable, the likelihood score will be higher, but the model will be completely useless.

The range of the likelihood score depends on the number of variables and the number of samples in the data file.
More variables and more samples -> a smaller (more negative) score.

You shouldn't compare likelihood scores of models that have either a different number of variables or were learned with a different number of samples.
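As a rough illustration of why the scale shifts (hypothetical numbers, Python): for independent variables the log-likelihood is additive over both samples and variables, so doubling either one roughly doubles the magnitude of the negative score, which is why scores across differently sized models or data sets are not comparable.

```python
import math

def independent_log_likelihood(n_samples, probs_per_variable):
    """Total log-likelihood when every variable independently assigns
    the given probability to each observed value:
    n_samples * sum(log p) -- additive in both samples and variables."""
    return n_samples * sum(math.log(p) for p in probs_per_variable)

# Hypothetical: each variable predicts each record with probability 0.8.
base = independent_log_likelihood(100, [0.8] * 3)      # 3 variables, 100 samples
more_vars = independent_log_likelihood(100, [0.8] * 6) # twice the variables
more_rows = independent_log_likelihood(200, [0.8] * 3) # twice the samples

print(base)       # about -67
print(more_vars)  # about -134: twice as negative
print(more_rows)  # about -134: also twice as negative
```

Neither of the larger scores means a worse model; the score simply sums more terms.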

Best,

Martijn
ind
Posts: 8
Joined: Fri May 03, 2013 9:36 pm

Re: Bayesian search and negative best score

Post by ind »

I think another possibility is to use accuracy as a scoring function. Let me give you an example. Given a table (C, A, B) with 5 rows:
(1,1,1)
(2,1,2)
(3,2,1)
(4,2,2)
(1,1,1)
Bayesian search generates a model without arcs for class variable C (k-fold=2). The probability distribution for C without hard evidence for A and B is then [0.375, 0.208, 0.208, 0.208], and the best score in iteration x is 2. I don't understand how I should interpret this score, and why aren't there any arcs? By the way, dropping either A or B doesn't change the score; it's still 2. Maybe this example is too simplistic, but I would like to use it to get a better understanding of this topic.

Best wishes,
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Bayesian search and negative best score

Post by Martijn »

In this case we calculate classification accuracy for each network structure that we try and pick the one with the highest accuracy.
This is not really the typical way of doing it, but it is useful when learning networks for classification.

Here the score represents the number of correct classifications done by the test network. In your case 2 records were classified correctly by the resulting network.

There are most likely no arcs in this network because you provided only 5 records, which is very few. That was probably not enough to find any dependence among the variables.
Normally data sets should have at least hundreds of records or more.

The score did not change when you removed either A or B because of the network structure. Due to the independence among all the variables, neither A nor B had any influence on C to begin with.
This means that the classification results for C stayed unchanged. Every time, state 1 was chosen because it was the state of C with the highest probability. This way the network was correct twice and wrong the other three times.
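That count can be reproduced directly (a Python sketch of the accuracy scoring described above, not GeNIe's implementation): with no arcs into C, the features are ignored and the network always predicts the most frequent state of C, which matches 2 of the 5 records.

```python
from collections import Counter

# The five (C, A, B) records from the example above.
records = [(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2), (1, 1, 1)]

class_values = [c for c, a, b in records]

# With no arcs into C, evidence on A and B changes nothing: the
# classifier always picks the most probable (most frequent) state of C.
prediction = Counter(class_values).most_common(1)[0][0]

score = sum(1 for c in class_values if c == prediction)
print(prediction, score)  # state 1 is always predicted; 2 records match
```

Removing A or B leaves `class_values` untouched, which is why the score stays at 2.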

I hope this explains what happened in your example.

Best,

Martijn