
Calibration error

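Calibration error is the gap between the confidence someone reports in their predictions and the empirical accuracy of those predictions. A well-calibrated estimator who says they are 80% confident is right 80% of the time across many predictions; an overconfident estimator is right less often than they claim. The Brier score is the most common single-number measure used to track it, although strictly it scores overall probabilistic accuracy: it blends calibration with resolution (how well the forecasts discriminate between outcomes), so calibration proper is isolated by comparing stated confidence with the observed hit rate at each confidence level.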
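The definition above can be made concrete with a short Python sketch. It is illustrative only: the function names `brier_score` and `calibration_error` are hypothetical, the sample data is invented, and grouping predictions by their exact stated confidence is a simplifying assumption (real analyses usually bin nearby confidence values together).

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between stated probability and the 0/1 outcome."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_error(forecasts, outcomes):
    """Weighted average gap between stated confidence and observed hit rate.

    Assumption: predictions are grouped by their exact stated confidence.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    buckets = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        buckets[p].append(o)
    total = len(forecasts)
    return sum(
        (len(hits) / total) * abs(p - sum(hits) / len(hits))
        for p, hits in buckets.items()
    )


# Invented example: ten predictions made at 80% confidence, six of them correct.
forecasts = [0.8] * 10
outcomes = [1] * 6 + [0] * 4

print("Brier score:", round(brier_score(forecasts, outcomes), 3))
print("Calibration error:", round(calibration_error(forecasts, outcomes), 3))
```

With this invented data the estimator claims 80% but is right 60% of the time, so the calibration error is about 0.2, while the Brier score comes out at roughly 0.28 because it also penalises the remaining uncertainty in each forecast.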

Calibration is the central concept in the psychology of probabilistic prediction (Lichtenstein, Fischhoff & Phillips 1982 and decades of replication). Almost all populations are systematically overconfident on hard tasks; only weather forecasters, professional poker players, and a small number of intensively trained experts are well-calibrated by default. Software estimators sit firmly in the overconfident camp: the gap between 'I'm 80% sure this sprint commitment is achievable' and 'this sprint commitment is achievable 80% of the time' is one of the largest known calibration gaps in any professional field. The intervention literature shows that calibration improves with deliberate practice, on the order of 8 weeks of repeated estimate-and-feedback cycles, yet almost no engineering organisation runs that training.
