Calibration Curve

Post by **merli** » Thu Feb 05, 2026 11:44 pm

Hello,
for the calibration plot in GeNIe I saw that binning can be used.
I just want to make sure I understand it correctly.

Am I right that the probability interval [0,1] is divided into equally sized bins, and for each bin the plotted value corresponds to the mean observed frequency of the positive outcome among the samples in that bin?

Thank you!

Fri Feb 06, 2026 11:10 am

Here's the quote from GeNIe's manual:

The final tab, Calibration, shows a very important measure of performance of a probabilistic model, notably the calibration curve. Because the output of a probabilistic model is a probability and this probability is useful in decision making, ideally we would like it to be as accurate as possible. One way of measuring the accuracy of a model is comparing the output probability to the actually observed frequencies in the data. The calibration curve shows how these two compare. For each probability p produced by the model (the horizontal axis) for the class variable, the plot shows the actual frequencies in the data (vertical axis) observed for all cases for which the model produced probability p. The dim diagonal line shows the ideal calibration curve, i.e., one in which every probability produced by the classifier is precisely equal to the frequency observed in the data. Because p is a continuous variable, the plot groups the values of probability so that sufficiently many data records are found to estimate the actual frequency in the data for the vertical axis. There are two methods of grouping implemented in GeNIe: Binning and Moving average. Binning works similarly to a histogram - we divide the interval [0..1] into equal size bins. As we change the number of bins, the plot changes as well. It is a good practice to explore several bin sizes to get an idea of the model calibration.
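To make the equal-width reading of binning from the question concrete, here is a minimal C++ sketch. It is an illustration only and is not taken from GeNIe or SMILE: the `CalibrationBin` struct, the `equalWidthCalibration` function, and the choice to skip empty bins are all assumptions made for this example. It divides [0,1] into `binCount` intervals of equal width and reports, for each non-empty bin, the mean predicted probability and the observed frequency of the positive outcome.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One point of a binned calibration curve (hypothetical type for this sketch).
struct CalibrationBin {
    double meanPredicted;      // mean model probability of the records in the bin
    double observedFrequency;  // fraction of positive outcomes in the bin
    std::size_t count;         // number of records in the bin
};

// Hypothetical helper, not part of GeNIe or SMILE: splits [0,1] into
// binCount equal-width bins and estimates the observed frequency in each.
std::vector<CalibrationBin> equalWidthCalibration(
    const std::vector<double>& predicted,
    const std::vector<int>& outcome,  // 1 = positive outcome, 0 = otherwise
    int binCount)
{
    assert(binCount > 0);
    assert(predicted.size() == outcome.size());

    std::vector<double> sumPredicted(binCount, 0.0);
    std::vector<std::size_t> positives(binCount, 0);
    std::vector<std::size_t> counts(binCount, 0);

    for (std::size_t r = 0; r < predicted.size(); ++r) {
        // Map p in [0,1] to a bin index; p == 1.0 falls into the last bin.
        int bin = static_cast<int>(predicted[r] * binCount);
        if (bin >= binCount) bin = binCount - 1;
        if (bin < 0) bin = 0;
        sumPredicted[bin] += predicted[r];
        positives[bin] += (outcome[r] == 1) ? 1 : 0;
        counts[bin] += 1;
    }

    std::vector<CalibrationBin> curve;
    for (int b = 0; b < binCount; ++b) {
        if (counts[b] == 0) continue;  // assumption: empty bins yield no point
        curve.push_back({sumPredicted[b] / counts[b],
                         static_cast<double>(positives[b]) / counts[b],
                         counts[b]});
    }
    return curve;
}
```

For a perfectly calibrated model, `observedFrequency` would be close to `meanPredicted` in every bin, which corresponds to the diagonal line mentioned in the manual. With equal-width bins, sparsely populated bins give noisy frequency estimates, which is one reason to try several bin counts.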

Post by **merli** » Sun May 31, 2026 6:05 pm

Hello,
thank you for the fast reply!

Regarding the Hosmer-Lemeshow test, I have a question about the binning mode.

How are the observations assigned to the groups used for the Hosmer-Lemeshow calculation? Are they separated into equal-width bins over the interval [0,1], or into quantile-based groups containing approximately the same number of observations?

In addition, how are the degrees of freedom determined for the Hosmer-Lemeshow statistic?

I also noticed that the calibration plots in GeNIe appear as step functions, whereas calibration plots generated with scikit-learn are typically displayed as line graphs connecting the calibration points. Is the step-like shape only a visualization choice, or does it reflect a different way of computing the calibration curve?

Thank you for your help.

Mon Jun 08, 2026 10:35 pm

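Here's the code snippet from the SMILE sources. The 'indices' vector contains row numbers sorted by the class node posterior value. As you can see, SMILE splits the sorted observations into bins that each hold the same number of records (validRowCount / binCount, up to rounding), that is, equal width in terms of record count rather than equal-width intervals of [0,1]. The HL test value is accumulated in the 'hosmerLemeshTest' variable.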

hosmerLemeshTest = 0;
for (int i = 0; i < binCount; i ++)
{
	int startPos = firstPosteriorPos + i * validRowCount / binCount;
	int endPos = firstPosteriorPos + (i + 1) * validRowCount / binCount - 1;
	int observedEvents = 0;
	double expectedEvents = 0;
	for (int p = startPos; p <= endPos; p ++) 
	{
		int row = indices[p];
		int actual = ds.GetInt(varIndex, row);
		if (actual == outcomeIndex)
		{
			observedEvents ++;
		}
		expectedEvents += posteriors[posteriorsPerRecord * row + rowOffset];
	}

	int binSize = endPos - startPos + 1;
	double prevalence = double(observedEvents) / binSize;
	curve[2 * i].first = posteriors[posteriorsPerRecord * indices[startPos] + rowOffset];
	curve[2 * i + 1].first = posteriors[posteriorsPerRecord * indices[endPos] + rowOffset];
	curve[2 * i].second = curve[2 * i + 1].second = prevalence;

	double diff = expectedEvents - observedEvents;
	hosmerLemeshTest += (diff * diff) / (expectedEvents * (1 - expectedEvents / binSize));
}
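
For readers who want to experiment with the same grouping outside SMILE, the loop above can be approximated by the self-contained sketch below. It is not the SMILE implementation: the `HosmerLemeshowResult` struct, the `hosmerLemeshow` function, the plain vectors used in place of the dataset and posterior arrays, and the guard against a zero denominator are all assumptions of this example. Per group it accumulates (E - O)^2 / (E * (1 - E / n)), where E is the sum of predicted probabilities, O the number of observed events, and n the group size, as in the snippet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch (not a SMILE type).
struct HosmerLemeshowResult {
    double statistic;                              // accumulated HL value
    std::vector<std::pair<double, double>> curve;  // two points per group
};

HosmerLemeshowResult hosmerLemeshow(
    const std::vector<double>& predicted,
    const std::vector<int>& outcome,  // 1 = event, 0 = no event
    int binCount)
{
    assert(binCount > 0);
    assert(predicted.size() == outcome.size());
    const int n = static_cast<int>(predicted.size());
    assert(n >= binCount);

    // Row numbers sorted by predicted probability, like 'indices' above.
    std::vector<int> indices(n);
    std::iota(indices.begin(), indices.end(), 0);
    std::sort(indices.begin(), indices.end(),
              [&](int a, int b) { return predicted[a] < predicted[b]; });

    HosmerLemeshowResult result{0.0, {}};
    for (int i = 0; i < binCount; ++i) {
        // Groups are ranges of sorted positions, so they hold equal counts.
        const int startPos = i * n / binCount;
        const int endPos = (i + 1) * n / binCount - 1;
        int observedEvents = 0;
        double expectedEvents = 0.0;
        for (int p = startPos; p <= endPos; ++p) {
            const int row = indices[p];
            if (outcome[row] == 1) ++observedEvents;
            expectedEvents += predicted[row];
        }
        const int binSize = endPos - startPos + 1;
        const double prevalence = static_cast<double>(observedEvents) / binSize;

        // Both end points of the group share one y value (a flat segment).
        result.curve.emplace_back(predicted[indices[startPos]], prevalence);
        result.curve.emplace_back(predicted[indices[endPos]], prevalence);

        const double diff = expectedEvents - observedEvents;
        const double denom = expectedEvents * (1.0 - expectedEvents / binSize);
        // Assumption of this sketch: skip groups with a zero denominator.
        // The SMILE snippet has no such guard.
        if (denom > 0.0) result.statistic += diff * diff / denom;
    }
    return result;
}
```

Because each group contributes two curve points with the same prevalence, one at its lowest and one at its highest posterior, the resulting curve consists of flat segments, which is consistent with the step-like appearance of the plot.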

BayesFusion Support Forum
