Inter-Rater Reliability Calculator Formula
Understand the math behind the inter-rater reliability calculator. Each variable explained with a worked example.
Formulas Used
Cohen's Kappa
kappa = (p_observed - p_expected) / (1 - p_expected)

Observed Agreement
observed_agreement = p_observed * 100

Chance Agreement
chance_agreement = p_expected * 100

Total Observations
total_observations = n_total

Variables
| Variable | Description | Default |
|---|---|---|
| agree_both_yes | Both Raters Agree Yes | 40 |
| agree_both_no | Both Raters Agree No | 30 |
| rater1_yes_rater2_no | Rater 1 Yes, Rater 2 No | 10 |
| rater1_no_rater2_yes | Rater 1 No, Rater 2 Yes | 20 |
| n_total | Derived: agree_both_yes + agree_both_no + rater1_yes_rater2_no + rater1_no_rater2_yes | calculated |
| p_observed | Derived: (agree_both_yes + agree_both_no) / n_total | calculated |
| p_yes_r1 | Derived: (agree_both_yes + rater1_yes_rater2_no) / n_total | calculated |
| p_yes_r2 | Derived: (agree_both_yes + rater1_no_rater2_yes) / n_total | calculated |
| p_no_r1 | Derived: 1 - p_yes_r1 | calculated |
| p_no_r2 | Derived: 1 - p_yes_r2 | calculated |
| p_expected | Derived: p_yes_r1 * p_yes_r2 + p_no_r1 * p_no_r2 | calculated |
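A minimal Python sketch of these formulas, assuming the four cell counts as inputs; the function name and return order are illustrative, not the calculator's actual implementation.

```python
def cohens_kappa(agree_both_yes, agree_both_no,
                 rater1_yes_rater2_no, rater1_no_rater2_yes):
    # Total number of paired ratings
    n_total = (agree_both_yes + agree_both_no
               + rater1_yes_rater2_no + rater1_no_rater2_yes)

    # Observed proportion of agreement
    p_observed = (agree_both_yes + agree_both_no) / n_total

    # Marginal "yes"/"no" proportions for each rater
    p_yes_r1 = (agree_both_yes + rater1_yes_rater2_no) / n_total
    p_yes_r2 = (agree_both_yes + rater1_no_rater2_yes) / n_total
    p_no_r1 = 1 - p_yes_r1
    p_no_r2 = 1 - p_yes_r2

    # Agreement expected by chance alone
    p_expected = p_yes_r1 * p_yes_r2 + p_no_r1 * p_no_r2

    kappa = (p_observed - p_expected) / (1 - p_expected)
    return kappa, p_observed * 100, p_expected * 100, n_total

# Defaults from the table above: kappa = 0.400, 70% observed, 50% chance, n = 100
print(cohens_kappa(40, 30, 10, 20))
```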
How It Works
How Cohen's Kappa Works
Cohen's Kappa measures inter-rater agreement while correcting for chance. It is more robust than simple percent agreement.
Formula
Kappa = (P_observed - P_expected) / (1 - P_expected)
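The same relationship in LaTeX notation, with p_o for observed agreement and p_e for chance agreement built from each rater's marginal yes/no proportions (the superscripts simply label Rater 1 and Rater 2):

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = p_{\text{yes}}^{(1)} \, p_{\text{yes}}^{(2)} + p_{\text{no}}^{(1)} \, p_{\text{no}}^{(2)}
```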
Interpretation
As a rule of thumb, values above 0.60 indicate substantial agreement, and 0.80 or higher is recommended for high-stakes decisions (see the FAQ below).
Worked Example
Two graders evaluate 100 essays: both say pass (40), both say fail (30), only Rater 1 passes (10), only Rater 2 passes (20).
1. Total: 40 + 30 + 10 + 20 = 100
2. P_observed = (40 + 30) / 100 = 0.70
3. P(R1 yes) = 50/100 = 0.50, P(R2 yes) = 60/100 = 0.60
4. P_expected = 0.50 x 0.60 + 0.50 x 0.40 = 0.30 + 0.20 = 0.50
5. Kappa = (0.70 - 0.50) / (1 - 0.50) = 0.20 / 0.50 = 0.400
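The same arithmetic checked as a short script; the values are copied from the example above and the variable names are only for illustration.

```python
# Worked example: 100 essays graded by two raters
both_pass, both_fail = 40, 30
only_r1_pass, only_r2_pass = 10, 20

n = both_pass + both_fail + only_r1_pass + only_r2_pass    # 100
p_observed = (both_pass + both_fail) / n                   # 0.70
p_yes_r1 = (both_pass + only_r1_pass) / n                  # 0.50
p_yes_r2 = (both_pass + only_r2_pass) / n                  # 0.60
p_expected = p_yes_r1 * p_yes_r2 + (1 - p_yes_r1) * (1 - p_yes_r2)  # 0.50
kappa = (p_observed - p_expected) / (1 - p_expected)       # 0.400

print(round(kappa, 3))  # 0.4
```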
Frequently Asked Questions
When should I use Kappa instead of percent agreement?
Always use Kappa when reporting inter-rater reliability in research. Percent agreement inflates reliability by ignoring chance: in the worked example above, percent agreement is 70% while Kappa is only 0.40.
What is a good Kappa value?
Values above 0.60 indicate substantial agreement. For high-stakes decisions, aim for 0.80 or higher.
Can Kappa be used with more than two raters?
Cohen's Kappa is designed for two raters. For multiple raters, use Fleiss' Kappa instead.
Ready to run the numbers?
Open Inter-Rater Reliability Calculator