Hello,
For the calibration plot in GeNIe, I saw that binning can be used.
I just want to make sure I understand it correctly.
Am I right that the probability interval [0,1] is divided into equally sized bins, and for each bin the plotted value corresponds to the mean observed frequency of the positive outcome among the samples in that bin?
Thank you!
Calibration Curve
shooltz[BayesFusion]
Re: Calibration Curve
Here's the quote from GeNIe's manual:
The final tab, Calibration, shows a very important measure of performance of a probabilistic model, notably the calibration curve. Because the output of a probabilistic model is a probability and this probability is useful in decision making, ideally we would like it to be as accurate as possible. One way of measuring the accuracy of a model is comparing the output probability to the actually observed frequencies in the data. The calibration curve shows how these two compare. For each probability p produced by the model (the horizontal axis) for the class variable, the plot shows the actual frequencies in the data (vertical axis) observed for all cases for which the model produced probability p. The dim diagonal line shows the ideal calibration curve, i.e., one in which every probability produced by the classifier is precisely equal to the frequency observed in the data. Because p is a continuous variable, the plot groups the values of probability so that sufficiently many data records are found to estimate the actual frequency in the data for the vertical axis. There are two methods of grouping implemented in GeNIe: Binning and Moving average. Binning works similarly to a histogram - we divide the interval [0..1] into equal size bins. As we change the number of bins, the plot changes as well. It is a good practice to explore several bin sizes to get an idea of the model calibration.
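To make the binning idea concrete, here is a minimal Python sketch of a binned calibration curve. It is not GeNIe's code. The manual passage only states that [0..1] is split into equal-size bins; the rest is an assumption for illustration: the function name `binned_calibration`, the choice to plot each bin at its mean predicted probability, the handling of p = 1.0, and skipping empty bins. The vertical value for a bin is taken to be the fraction of records in that bin whose observed outcome is positive.

```python
def binned_calibration(probs, outcomes, n_bins=10):
    """Sketch of a binned calibration curve (hypothetical helper, not GeNIe's API).

    probs    -- model probabilities for the positive outcome, each in [0, 1]
    outcomes -- observed outcomes, 1 for positive and 0 otherwise
    n_bins   -- number of equal-size bins covering [0, 1]

    Returns a list of (mean_predicted, observed_frequency, count) tuples,
    one per non-empty bin, ordered from the lowest bin to the highest.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")

    prob_sums = [0.0] * n_bins   # sum of predicted probabilities per bin
    positives = [0] * n_bins     # number of positive outcomes per bin
    counts = [0] * n_bins        # number of records per bin

    for p, y in zip(probs, outcomes):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        # Assumption: bins are half-open [lo, hi), except that p == 1.0
        # is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        prob_sums[idx] += p
        positives[idx] += y
        counts[idx] += 1

    curve = []
    for i in range(n_bins):
        if counts[i] == 0:
            continue  # assumption: an empty bin contributes no point
        curve.append((prob_sums[i] / counts[i],
                      positives[i] / counts[i],
                      counts[i]))
    return curve


if __name__ == "__main__":
    example_probs = [0.1, 0.15, 0.8, 0.9, 1.0]
    example_outcomes = [0, 1, 1, 1, 0]
    for mean_p, freq, n in binned_calibration(example_probs, example_outcomes, n_bins=2):
        print(f"mean p = {mean_p:.3f}, observed frequency = {freq:.3f}, n = {n}")
```

A well-calibrated model gives points close to the diagonal, where the observed frequency equals the predicted probability. Rerunning the function with different `n_bins` values shows how the curve shifts with bin size, which is why the manual suggests trying several.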