You rolled a die 60 times and got 18 sixes, where a fair die would average 10. Loaded, or just luck? The chi-square test compares what you *observed* with what you'd *expect*, and tells you whether the gap is bigger than chance alone would plausibly produce.
The chi-square (χ²) test checks whether observed counts in categories match expected counts. Two main uses: goodness-of-fit (does this die / distribution behave as expected?) and independence (are two categorical variables related?).
Genetics (do offspring ratios match Mendel's?), survey cross-tabs (is opinion linked to age group?), quality control, A/B tests with categorical outcomes — wherever your data is *counts in boxes*.
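A chi-square test of 'is voting preference independent of region?' gives p = 0.001. Conclusion?
Reject independence: if preference and region were truly unrelated, a gap this large would turn up only about 1 time in 1,000. The data point to an association, though the p-value alone says nothing about how strong it is or what causes it.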
For each category, square the gap between observed and expected, divide by expected, and sum. A large χ² means the data strays far from expectation.
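The recipe above can be sketched in a few lines of Python. The function name `chi_square_statistic` is illustrative and not taken from any library. The example reuses the 60 die rolls with 18 sixes and collapses them into two categories (six vs. not six), which is a simplifying assumption made here.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 60 rolls, 18 sixes. A fair die averages 10 sixes and 50 non-sixes.
# Collapsing to two categories is an assumption made for this sketch.
observed = [18, 42]   # [six, not six]
expected = [10, 50]

die_chi2 = chi_square_statistic(observed, expected)
print(round(die_chi2, 2))   # 7.68
```

Each term is a squared gap scaled by the expected count, so the same gap of 8 contributes 6.4 where only 10 were expected but just 1.28 where 50 were expected.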
A fair coin is flipped 100 times: 60 heads, 40 tails. Expected is 50/50. Compute χ².
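One way to work this through, as a sketch: sum the two (O − E)²/E terms, then turn the result into a p-value. With two categories there is one degree of freedom, and a χ² variable with 1 degree of freedom is the square of a standard normal, so the standard library's `math.erfc` is enough here. The variable names are illustrative.

```python
import math

observed = [60, 40]   # heads, tails
expected = [50, 50]   # fair coin, 100 flips

coin_chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 1 degree of freedom, chi-square is a squared standard normal Z, so
# P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
# This shortcut is valid for 1 degree of freedom only.
coin_p = math.erfc(math.sqrt(coin_chi2 / 2))

print(coin_chi2)          # 4.0
print(round(coin_p, 4))   # 0.0455
```

So χ² = 4, with a p-value just under the conventional 0.05 cutoff: mild evidence against a fair coin, not overwhelming.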
What's a 'degree of freedom' in a chi-square test?
Roughly, how many category counts are free to vary once the totals are fixed. Goodness-of-fit: (number of categories − 1). Independence test on an r×c table: (r−1)(c−1). It sets which χ² distribution you compare against.
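To make the (r−1)(c−1) rule concrete, here is a minimal sketch of the independence test's bookkeeping on a 2×3 table. The counts are hypothetical, invented purely for illustration, and `independence_test` is an illustrative name. Expected counts use the standard formula row total × column total / grand total.

```python
def independence_test(table):
    """Return (chi2, df, expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Hypothetical counts: rows = two regions, columns = three voting preferences.
votes = [
    [30, 20, 10],
    [20, 20, 20],
]
chi2, df, expected = independence_test(votes)
print(df)               # 2
print(round(chi2, 2))   # 5.33
```

Once the row and column totals are fixed, only two cells of this table can be chosen freely, which is why df = (2−1)(3−1) = 2. The statistic of 5.33 falls short of 5.991, the 5% critical value for 2 degrees of freedom, so these made-up counts would not count as evidence of an association at that level.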
Chi-square needs reasonably large expected counts (a common rule: every expected count ≥ 5). With tiny counts the test is unreliable — use an exact test (like Fisher's) instead. Also: it works on counts, never on percentages or means.
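The rule of thumb is easy to check before running the test. The sketch below flags expected counts under 5. The helper name `small_expected_cells` and the 18-roll scenario are illustrative assumptions, not part of any library or of the examples above.

```python
def small_expected_cells(expected, minimum=5):
    """Return (index, value) pairs for expected counts below `minimum`."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]


# Hypothetical scenario: a fair six-sided die rolled only 18 times,
# so each face is expected 18 / 6 = 3 times.
n_rolls = 18
expected = [n_rolls / 6] * 6

flagged = small_expected_cells(expected)
print(len(flagged))   # 6
```

All six cells fall below the threshold, so a chi-square test on 18 rolls would be unreliable. Collecting more rolls or merging categories raises the expected counts.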
Karl Pearson introduced the chi-square test in 1900. R. A. Fisher later refined and used it, and in 1936 he famously re-examined Mendel's pea data with it, finding the results suspiciously *too* good a fit.
- Compares observed vs expected counts in categories: χ² = Σ(O−E)²/E.
- Two uses: goodness-of-fit and test of independence.
- Needs decent expected counts (≥ ~5); works on counts, not percentages.