How to Calculate Sample Size
Learn how to calculate the required sample size for surveys and experiments. This guide covers the formula using margin of error, confidence level, and population proportion to determine how many subjects you need.
Why Sample Size Matters
The sample size n determines the precision and statistical power of a study. Too small a sample produces wide confidence intervals and low power to detect true effects; too large a sample wastes resources without meaningfully improving precision. Calculating the required sample size before data collection is a fundamental step in study design, ensuring you collect enough data to answer your research question reliably.
Sample Size for Estimating a Proportion
When estimating a population proportion p with a desired margin of error E, the required sample size is: n = z*² × p(1−p) / E², where z* is the critical value for the chosen confidence level. If you have no prior estimate for p, use p = 0.5, which maximizes p(1−p) and therefore gives the most conservative (largest) sample size. For a 95% CI (z* = 1.96) with E = 0.05 and p = 0.5: n = (1.96)² × 0.25 / (0.05)² = 3.8416 × 0.25 / 0.0025 = 384.16, which rounds up to n = 385.
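The proportion formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name sample_size_proportion and its parameter names are hypothetical, and the critical value is computed from the standard normal distribution rather than taken from the rounded 1.96.

```python
import math
from statistics import NormalDist


def sample_size_proportion(margin_of_error, confidence=0.95, p=0.5):
    """Sample size for estimating a proportion: n = z*^2 * p(1-p) / E^2.

    Hypothetical helper for illustration; p defaults to the conservative 0.5.
    """
    # Two-sided critical value z* for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Always round up: a fractional subject cannot be sampled.
    return math.ceil(n)


n = sample_size_proportion(0.05)  # 95% confidence, E = 0.05, p = 0.5
print(n)
```

Passing a prior estimate such as p=0.2 instead of the default lowers the result, because p(1−p) is largest at p = 0.5.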
Finite Population Correction
When the initial sample size n is a substantial fraction of the population size N (a common rule of thumb is more than about 5%), apply the finite population correction: n_adjusted = n / (1 + (n − 1)/N). For example, if N = 1000 and the initial estimate is n = 385, then n_adjusted = 385 / (1 + 384/1000) = 385 / 1.384 ≈ 278. This correction reduces the required sample size, reflecting the fact that surveying a large fraction of a finite population provides more information per observation.
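As a rough sketch, the correction can be written as a one-line function. The name finite_population_correction is hypothetical, and the function deliberately returns the unrounded value so that the rounding convention stays with the caller.

```python
def finite_population_correction(n, population_size):
    """Apply n_adjusted = n / (1 + (n - 1) / N).

    Hypothetical helper for illustration; returns an unrounded float.
    """
    return n / (1 + (n - 1) / population_size)


adjusted = finite_population_correction(385, 1000)
print(round(adjusted, 2))

# For a very large population the correction has almost no effect.
print(round(finite_population_correction(385, 10_000_000), 2))
```

The second call illustrates why the correction is usually skipped when the population is much larger than the sample.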
Sample Size for Estimating a Mean
When estimating a population mean with known or estimated standard deviation σ, the formula is: n = (z* × σ / E)², where E is the desired margin of error. For a 95% CI, z* = 1.96. If σ = 15 and E = 3, then n = (1.96 × 15 / 3)² = (9.8)² = 96.04, which rounds up to n = 97. If no estimate of σ is available from prior studies, use a pilot study or the rough rule of thumb σ ≈ range/4.
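The mean formula can be sketched the same way. The function names below are hypothetical, and sigma_from_range simply encodes the range/4 rule of thumb mentioned above, which is only a crude approximation.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin_of_error, confidence=0.95):
    """Sample size for estimating a mean: n = (z* * sigma / E)^2, rounded up.

    Hypothetical helper for illustration.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


def sigma_from_range(lowest, highest):
    """Crude estimate of sigma from the expected range (range / 4)."""
    return (highest - lowest) / 4


print(sample_size_mean(sigma=15, margin_of_error=3))
# With no prior sigma, fall back on the range rule (assumed range 40 to 100).
print(sample_size_mean(sigma_from_range(40, 100), margin_of_error=3))
```

Because n grows with the square of σ/E, halving the margin of error roughly quadruples the required sample size.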
Sample Size for Hypothesis Testing
When designing an experiment to test a hypothesis, sample size depends on the significance level α, desired statistical power (1−β), effect size d, and the test type. For a two-sided, two-sample t-test, a common normal-approximation formula is: n = 2 × [(z_(α/2) + z_β) / d]² per group, where d = |μ₁ − μ₂| / σ is Cohen's d. For 80% power (z_β = 0.842) and α = 0.05 (z_(α/2) = 1.96) with d = 0.5: n = 2 × [(1.96 + 0.842) / 0.5]² = 2 × (5.604)² ≈ 62.8, which rounds up to 63 per group.
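The per-group formula above can be sketched as follows. The function name sample_size_two_groups is hypothetical, and the calculation uses normal quantiles only, so it is an approximation: an exact t-based calculation typically asks for slightly more subjects per group.

```python
import math
from statistics import NormalDist


def sample_size_two_groups(d, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    Implements n = 2 * ((z_(alpha/2) + z_beta) / d)^2 using normal quantiles.
    Hypothetical helper for illustration.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. about 0.842 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


print(sample_size_two_groups(0.5))              # medium effect, 80% power
print(sample_size_two_groups(0.5, power=0.90))  # same effect, 90% power
print(sample_size_two_groups(0.2))              # small effect, 80% power
```

Comparing the three calls shows how quickly the requirement grows when the target effect shrinks or the power requirement rises.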
Effect Size and Power
Statistical power is the probability of correctly rejecting a false null hypothesis (i.e., detecting a real effect). Power of 80% (β = 0.20) is a common minimum standard; 90% power is often used in clinical trials. Effect size quantifies the practical magnitude of the difference you want to detect: Cohen's d for means, Cohen's h for proportions. Smaller effect sizes, higher power requirements, and stricter significance levels all increase the required sample size.
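To see how power responds to sample size, the two-sample formula n = 2 × [(z_(α/2) + z_β) / d]² can be solved for power instead of n. The sketch below is an approximation under stated assumptions: it uses the normal distribution, ignores the negligible chance of rejecting in the wrong direction, and the function name approximate_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.

    Rearranges n = 2 * ((z_(alpha/2) + z_beta) / d)^2 to give
    power = Phi(d * sqrt(n / 2) - z_(alpha/2)). Hypothetical helper.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(d * sqrt(n_per_group / 2) - z_alpha)


# Power for a medium effect (d = 0.5) at several per-group sample sizes.
for n in (20, 63, 100):
    print(n, round(approximate_power(n, 0.5), 3))
```

Running the loop with a smaller d shows the same sample sizes delivering much lower power, which is the trade-off described above.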
Practical Considerations
Always add a buffer for anticipated non-response or dropout: if you expect a 20% dropout rate, inflate n by dividing by 0.80. For stratified or cluster sampling, account for the design effect (DEFF): multiply the simple random sample size by DEFF to maintain the same precision. DEFF is often greater than 1, particularly for cluster sampling. Sample size calculators automate these formulas but still require careful specification of effect size and variance inputs.
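Both adjustments are simple multiplications and can be combined in one step. The sketch below assumes a hypothetical helper named inflate_sample_size; the DEFF value of 1.5 in the example is purely illustrative and would normally come from prior surveys with a similar design.

```python
import math


def inflate_sample_size(n, dropout_rate=0.0, design_effect=1.0):
    """Scale a simple-random-sample n for dropout and design effect.

    Multiplies by DEFF and divides by the expected retention (1 - dropout).
    Hypothetical helper for illustration.
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n * design_effect / (1 - dropout_rate))


print(inflate_sample_size(385, dropout_rate=0.20))
# Illustrative DEFF of 1.5 for a clustered design (assumed value).
print(inflate_sample_size(385, dropout_rate=0.20, design_effect=1.5))
```

Dividing by the retention rate, rather than adding 20% to n, is what ensures the expected number of completed responses still reaches the target.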