Roll two dice and add them. The total — 2 to 12 — isn't fixed; it's a number that depends on chance. That's a random variable: a number waiting for the universe to decide it.
A random variable assigns a number to each outcome of a random experiment. It can be discrete (countable values: dice totals, number of heads) or continuous (any value in a range: heights, waiting times).
Every probability distribution, every expected value, every statistical model is built on random variables. They're the noun that probability is about.
[Interactive Galton board: balls dropped through 8 rows of pegs. The dashed line is the theoretical normal curve; the histogram approaches it as you drop more balls.]
Key ideas
- Discrete RV — takes specific values, each with a probability (a 'probability mass function').
- Continuous RV — described by a density curve; probabilities are areas under it.
- Expected value E(X) — the long-run average, Σ(value × probability).
- Variance Var(X) — the expected squared deviation from E(X); its square root, the standard deviation, measures how far the variable typically lands from E(X).
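The expected value and variance of a discrete random variable can be computed directly from its probability mass function. Here is a minimal sketch, assuming X is the number of heads in two fair coin flips. The function names `expectation` and `variance` are illustrative choices, and variance is taken as the expected squared deviation from E(X).

```python
from fractions import Fraction

# PMF for X = number of heads in two fair coin flips (value -> probability).
# Exact fractions avoid floating-point rounding in this small example.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expectation(pmf):
    """E(X): the sum of value * probability over all values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X): the expected squared deviation from E(X)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# A valid PMF must sum to 1.
assert sum(pmf.values()) == 1

mean = expectation(pmf)
var = variance(pmf)
print(mean, var)  # 1 1/2
```

On average you see one head per pair of flips, and the variance of 1/2 corresponds to a standard deviation of about 0.71 heads.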
Let X = the number on one fair die. What's E(X)?
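A worked answer, applying the Σ(value × probability) formula from the key ideas. Each of the six faces has probability 1/6, so:

```latex
E(X) = \sum_{k=1}^{6} k \cdot \frac{1}{6}
     = \frac{1 + 2 + 3 + 4 + 5 + 6}{6}
     = \frac{21}{6}
     = 3.5
```

Note that 3.5 is not a value the die can ever show. The expected value is a long-run average, not a prediction for a single roll.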
Roll two dice, X = the sum. Why is 7 the most likely value?
There are 6 ways to roll a 7 (1+6, 2+5, …, 6+1) but only 1 way to roll a 2 or a 12. More outcomes map to 7, so it has the highest probability — the distribution of X peaks at 7.
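The counting argument can be checked by brute force. The sketch below lists all 36 equally likely outcomes of two fair dice, treats X as a function from an outcome to its sum, and tallies how many outcomes map to each total. The names `X`, `sample_space`, and `ways` are illustrative.

```python
from collections import Counter
from itertools import product


def X(outcome):
    """The random variable: maps an outcome (die1, die2) to the sum of its faces."""
    return sum(outcome)


# All 36 equally likely outcomes of rolling two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

# Number of outcomes that map to each total.
ways = Counter(X(outcome) for outcome in sample_space)

# Probability of each total = (ways to get it) / (number of outcomes).
sum_pmf = {total: count / len(sample_space) for total, count in sorted(ways.items())}

most_likely = max(ways, key=ways.get)
print(most_likely, ways[most_likely])  # 7 6

for total, count in sorted(ways.items()):
    print(f"{total:2d}: {'#' * count}")
```

The printed bars rise from 1 way at a total of 2 up to 6 ways at 7, then fall back to 1 way at 12, which is the peaked shape described above.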
A random variable is a *rule* that assigns a number to each outcome, not a single number. 'X = die roll' is the variable; the 4 you actually rolled is a *realisation* of it, and once observed there is nothing random left about that 4.
Add up many independent random variables with finite variance (like the left/right bounces in the Galton board above) and their sum, once centred and rescaled, tends toward a normal distribution. This is the Central Limit Theorem, a major reason the bell curve shows up so often.
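A small simulation can make this concrete. The sketch below models the Galton board as a sum of 8 independent 50/50 bounces per ball. The ball count, the seed, and the name `drop_ball` are assumptions chosen for illustration, not details of the interactive board. Under this model the bin counts should centre on 4 with variance 2 (rows/2 and rows/4).

```python
import random
from collections import Counter
from statistics import mean, pvariance

ROWS = 8        # rows of pegs, matching the board above
BALLS = 10_000  # assumed number of balls for this sketch

rng = random.Random(42)  # fixed seed so the run is repeatable


def drop_ball(rows, rng):
    """Return the final bin: the number of rightward bounces out of `rows`,
    each bounce being an independent 50/50 left (0) or right (1) step."""
    return sum(rng.randint(0, 1) for _ in range(rows))


bins = [drop_ball(ROWS, rng) for _ in range(BALLS)]
histogram = Counter(bins)

sample_mean = mean(bins)
sample_var = pvariance(bins)
print(f"mean = {sample_mean:.3f} (theory {ROWS / 2})")
print(f"variance = {sample_var:.3f} (theory {ROWS / 4})")

# Text histogram: one '#' per 0.5% of balls landing in that bin.
for k in range(ROWS + 1):
    print(f"{k}: {'#' * (histogram[k] * 200 // BALLS)}")
```

With only 8 rows the exact distribution is binomial, which the bell curve approximates. Raising `ROWS` makes the match to the normal curve closer.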
- A random variable = a number determined by a random outcome.
- Discrete (countable values) or continuous (density curve).
- E(X) is the long-run average; sums of many independent RVs tend toward normal (CLT).