Outlier Detection Calculator Formula

Understand the math behind the outlier detection calculator. Each variable is explained, followed by a worked example.

Formulas Used

IQR

iqr_val = q3 - q1

Lower Fence

lower_fence = q1 - 1.5 * iqr

Upper Fence

upper_fence = q3 + 1.5 * iqr

Outside Fences? (1=yes)

is_outlier = ((value < (q1 - 1.5 * iqr)) + (value > (q3 + 1.5 * iqr))) > 0 ? 1 : 0

Distance Beyond Fence

distance = value > (q3 + 1.5 * iqr) ? value - (q3 + 1.5 * iqr) : value < (q1 - 1.5 * iqr) ? (q1 - 1.5 * iqr) - value : 0
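
As a cross-check, the expressions above can be sketched in Python. This is a minimal sketch rather than the calculator's actual implementation: the function name `fence_report` and the dictionary keys are hypothetical. It measures distance against whichever fence the value actually crosses, so negative values are handled the same way as positive ones.

```python
def fence_report(value: float, q1: float, q3: float) -> dict:
    """Return the IQR, both fences, the outlier flag, and the distance beyond the fence."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    if value > upper_fence:
        distance = value - upper_fence      # how far above the upper fence
    elif value < lower_fence:
        distance = lower_fence - value      # how far below the lower fence
    else:
        distance = 0.0                      # inside the fences

    return {
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "is_outlier": 1 if distance > 0 else 0,
        "distance": distance,
    }


# The page's default inputs: value = 95, q1 = 30, q3 = 70
print(fence_report(95, 30, 70))
```

With the default inputs the value sits inside the fences, so the flag is 0 and the distance is 0.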

Variables

Variable   Description                  Default
value      Value to Test                95
q1         Q1 (25th Percentile)         30
q3         Q3 (75th Percentile)         70
iqr        Derived value (= q3 - q1)    calculated

How It Works

How to Detect Outliers Using the IQR Method

Formula

IQR = Q3 - Q1

Lower Fence = Q1 - 1.5 * IQR

Upper Fence = Q3 + 1.5 * IQR

Values below the lower fence or above the upper fence are flagged as potential outliers. The 1.5 multiplier is Tukey's convention. Using 3.0 instead identifies extreme outliers.
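
To illustrate the two multipliers side by side, here is a small Python sketch that checks a value against both the 1.5 and the 3.0 fences. The function name `classify` and the labels "none", "mild", and "extreme" are hypothetical choices for this example, not terms used by the calculator.

```python
def classify(value: float, q1: float, q3: float) -> str:
    """Label a value using the 1.5 * IQR and 3.0 * IQR fences."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    if value < outer_low or value > outer_high:
        return "extreme"   # beyond the 3.0 * IQR fences
    if value < inner_low or value > inner_high:
        return "mild"      # beyond 1.5 * IQR but within 3.0 * IQR
    return "none"          # inside the 1.5 * IQR fences


# With q1 = 30 and q3 = 70 the inner fences are -30 and 130,
# and the outer fences are -90 and 190.
for v in (95, 150, 200):
    print(v, classify(v, 30, 70))
```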

Worked Example

Q1 = 30, Q3 = 70. Is 95 an outlier?

value = 95, q1 = 30, q3 = 70
  1. IQR = 70 - 30 = 40
  2. Lower fence = 30 - 1.5*40 = 30 - 60 = -30
  3. Upper fence = 70 + 1.5*40 = 70 + 60 = 130
  4. 95 is between -30 and 130, so it is NOT an outlier
  5. It would need to exceed 130 to be flagged
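
The worked example takes Q1 and Q3 as given. When starting from raw data, the quartiles have to be computed first. The sketch below assumes Python's `statistics.quantiles` with the `inclusive` method. Other quartile conventions can give slightly different fences. The function name `iqr_outliers` and the sample data are hypothetical, with the data chosen so that the quartiles match the example (Q1 = 30, Q3 = 70).

```python
from statistics import quantiles


def iqr_outliers(data: list[float]) -> list[float]:
    """Return the values in data that fall outside the 1.5 * IQR fences."""
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical sample: Q1 = 30 and Q3 = 70, so the fences are -30 and 130
sample = [10, 20, 30, 40, 50, 60, 70, 80, 200]
print(iqr_outliers(sample))  # [200]
```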

Frequently Asked Questions

Why use 1.5 times the IQR?

The 1.5 multiplier is John Tukey's convention. For a normal distribution, it places the fences about 2.7 standard deviations from the mean, so roughly 99.3% of the data falls inside them and only the most extreme 0.7% is flagged. This provides a good balance between identifying true outliers and avoiding false flags.
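
The 99.3% figure can be checked numerically. The following sketch uses `statistics.NormalDist` from the Python standard library and assumes a standard normal distribution (mean 0, standard deviation 1). The variable names are chosen for this example only.

```python
from statistics import NormalDist

std_normal = NormalDist()           # mean 0, standard deviation 1

q3 = std_normal.inv_cdf(0.75)       # 75th percentile, about 0.6745
q1 = -q3                            # 25th percentile, by symmetry
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr        # about 2.698 standard deviations

# Share of the distribution that lies between the two fences
coverage = std_normal.cdf(upper_fence) - std_normal.cdf(lower_fence)

print(f"upper fence: {upper_fence:.3f} standard deviations")
print(f"coverage inside fences: {coverage:.4f}")
```

The coverage comes out at about 0.993, which leaves roughly 0.7% of a normal sample outside the fences.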

Are outliers always errors?

No. Outliers may be data entry errors, measurement errors, or genuine extreme observations. Investigate each case before removing. In some fields (fraud detection, rare events), outliers are the most important data points.

What are alternatives to the IQR method?

Alternatives include the Z-score method (flag |z| > 2 or 3), Grubbs' test, Dixon's Q test, and the modified Z-score based on the median absolute deviation (MAD). The IQR method is robust because Q1 and Q3 are themselves resistant to outliers.
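
As one illustration of these alternatives, here is a minimal sketch of the modified Z-score. It assumes the commonly cited form 0.6745 * (x - median) / MAD and a cutoff of 3.5. Neither the constant nor the cutoff comes from this calculator, and the function name and sample data are hypothetical.

```python
from statistics import median


def modified_z_scores(data: list[float]) -> list[float]:
    """Return the modified Z-score of each value, based on the median and MAD."""
    med = median(data)
    mad = median([abs(x - med) for x in data])   # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; the modified Z-score is undefined")
    return [0.6745 * (x - med) / mad for x in data]


# Hypothetical sample with one suspicious value
sample = [10, 12, 11, 13, 12, 95]
scores = modified_z_scores(sample)

# Assumed cutoff: flag values whose absolute score exceeds 3.5
flagged = [x for x, z in zip(sample, scores) if abs(z) > 3.5]
print(flagged)  # [95]
```

Because both the median and the MAD ignore how far the extreme values lie, a single large value does not inflate the scale the way it would with a mean and standard deviation.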
