
QA Calibration

— Definition —

QA Calibration is the process of aligning multiple QA reviewers so that they interpret and apply a quality rubric consistently. In a calibration session, reviewers independently score the same set of conversations, then compare scores to identify and resolve disagreements. The goal is for a conversation to receive the same score regardless of who reviews it. Otherwise, IQS (Internal Quality Score) data is unreliable and unfair to use for agent coaching or performance management.

— Formula —

Inter-Rater Reliability (IRR) = (Conversations where reviewers agree within 10 percentage points ÷ Total conversations reviewed in calibration session) × 100

A conversation counts as "in agreement" when two independent reviewers score it within 10 percentage points of each other. An IRR of 85% or higher indicates a well-calibrated program. Track IRR per rubric dimension to identify which criteria reviewers consistently interpret differently.
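
As a rough illustration of the formula, the sketch below computes IRR for one calibration session and then per rubric dimension. It assumes scores are percentages on a 0 to 100 scale and that exactly two reviewers scored each conversation; the function names and the sample data are hypothetical, not part of any particular QA tool.

```python
from collections import defaultdict

# Assumption: scores are percentages (0-100), so the tolerance is 10 points.
TOLERANCE_POINTS = 10.0


def inter_rater_reliability(score_pairs, tolerance=TOLERANCE_POINTS):
    """Return IRR as a percentage for a list of (reviewer_a, reviewer_b) scores."""
    if not score_pairs:
        raise ValueError("at least one scored conversation is required")
    # A pair is "in agreement" when the two scores differ by at most `tolerance`.
    agreed = sum(1 for a, b in score_pairs if abs(a - b) <= tolerance)
    return agreed / len(score_pairs) * 100


def irr_by_dimension(rows, tolerance=TOLERANCE_POINTS):
    """Group (dimension, score_a, score_b) rows and return {dimension: IRR %}."""
    grouped = defaultdict(list)
    for dimension, a, b in rows:
        grouped[dimension].append((a, b))
    return {
        dimension: inter_rater_reliability(pairs, tolerance)
        for dimension, pairs in grouped.items()
    }


# Hypothetical session: four conversations, each scored by two reviewers.
session = [(90, 85), (70, 95), (80, 80), (60, 72)]
print(inter_rater_reliability(session))  # 2 of 4 pairs agree -> 50.0
```

Passing rows such as `("tone", 90, 85)` to `irr_by_dimension` gives one IRR value per rubric criterion, which makes the least consistently interpreted criteria easy to spot.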

— Benchmark ranges —

Support QA calibration sessions

Well-calibrated: 85%+ agreement across reviewers
Needs alignment: 70% to under 85% agreement
Poorly calibrated: Below 70% agreement
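
The benchmark ranges can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and treating exactly 85% as well-calibrated and exactly 70% as needing alignment is an assumption about how the boundaries are meant to be read.

```python
def calibration_band(irr_percent):
    """Map an IRR percentage to the benchmark label for calibration sessions."""
    if not 0 <= irr_percent <= 100:
        raise ValueError("IRR must be between 0 and 100")
    if irr_percent >= 85:  # assumption: the boundary value 85 counts as well-calibrated
        return "Well-calibrated"
    if irr_percent >= 70:  # assumption: the boundary value 70 counts as needs alignment
        return "Needs alignment"
    return "Poorly calibrated"


print(calibration_band(85.0))  # Well-calibrated
print(calibration_band(72.5))  # Needs alignment
print(calibration_band(50.0))  # Poorly calibrated
```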
— Common mistakes —
  • 1. Running calibration sessions only at program launch. Calibration drift happens gradually, so run sessions at least monthly, especially when new reviewers join.
  • 2. Calibrating only on easy, clear-cut conversations. Calibrate on edge cases and disputed scores instead; agreement on easy conversations doesn't confirm alignment on the hard ones.
  • 3. Treating calibration disagreements as reviewer errors. Disagreements reveal rubric ambiguity: if two experienced reviewers disagree, the rubric criteria are likely under-defined.
  • 4. Not tracking calibration scores over time. An IRR that declines over several months signals that standards are drifting and re-alignment is needed.
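
One way to watch for the drift described in the last point is to fit a simple trend line through the IRR values from successive sessions. The sketch below is one possible approach, not a prescribed method: the function names, the threshold of 1 percentage point per session, and the sample history are all assumptions to adjust for your own program.

```python
def irr_trend_slope(irr_history):
    """Least-squares slope of IRR, in percentage points per session."""
    n = len(irr_history)
    if n < 2:
        raise ValueError("need at least two sessions to estimate a trend")
    mean_x = (n - 1) / 2  # sessions are indexed 0, 1, ..., n-1
    mean_y = sum(irr_history) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(irr_history))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def is_drifting(irr_history, max_decline_per_session=1.0):
    """Flag drift when IRR falls faster than the allowed decline per session."""
    # Assumption: a 1-point drop per session is the alert threshold.
    return irr_trend_slope(irr_history) < -max_decline_per_session


# Hypothetical monthly IRR values, oldest first.
monthly_irr = [90.0, 88.0, 86.0, 84.0]
print(irr_trend_slope(monthly_irr))  # -2.0
print(is_drifting(monthly_irr))      # True
```

A trend line is less sensitive to a single unusual session than comparing only the two most recent values, though with very few sessions any trend estimate should be read cautiously.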