
Random variables

A variable whose value is a number determined by the outcome of a random event.

Roll two dice and add them. The total — 2 to 12 — isn't fixed; it's a number that depends on chance. That's a random variable: a number waiting for the universe to decide it.

A random variable assigns a number to each outcome of a random experiment. It can be discrete (countable values: dice totals, number of heads) or continuous (any value in a range: heights, waiting times).
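
To make "assigns a number to each outcome" concrete, here is a minimal sketch for the number-of-heads example. It assumes three fair coin flips; the names `outcomes`, `num_heads` and `heads_pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 8 equally likely outcomes of three fair coin flips,
# written as strings such as "HTH". (Three flips is an assumption.)
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

def num_heads(outcome: str) -> int:
    """The random variable X: maps an outcome to a number."""
    return outcome.count("H")

# Counting how many outcomes map to each value gives the distribution of X.
counts = Counter(num_heads(o) for o in outcomes)
heads_pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in heads_pmf.items():
    print(f"P(X = {x}) = {p}")
```

The random variable here is the function `num_heads`, not any particular count it returns.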

Where you'll meet this

Every probability distribution, every expected value, every statistical model is built on random variables. They're the noun that probability is about.

probability · statistics · ML
Galton board — watch the bell curve build

Balls drop through 8 rows of pegs. The dashed line is the theoretical normal curve; as you drop more balls, the histogram settles close to it.

Key ideas

  • Discrete RV: takes specific values, each with a probability given by its probability mass function (PMF).
  • Continuous RV: described by a density curve; probabilities are areas under it.
  • Expected value E(X): the long-run average; for a discrete RV, Σ(value × probability).
  • Variance Var(X): the expected squared deviation from E(X); its square root, the standard deviation, measures how far the variable typically lands from E(X).
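
As a rough illustration of the expected value and variance formulas, the sketch below uses a made-up loaded four-sided spinner. The probabilities in `spinner_pmf` are hypothetical, chosen only so that the arithmetic comes out cleanly.

```python
from fractions import Fraction

# Hypothetical loaded spinner: value -> probability (its PMF).
spinner_pmf = {
    1: Fraction(1, 10),
    2: Fraction(1, 5),
    3: Fraction(3, 10),
    4: Fraction(2, 5),
}

# A valid PMF must sum to 1.
total_probability = sum(spinner_pmf.values())

# E(X) = sum of value * probability.
expected = sum(x * p for x, p in spinner_pmf.items())

# Var(X) = sum of (value - E(X))^2 * probability.
variance = sum((x - expected) ** 2 * p for x, p in spinner_pmf.items())

print(f"E(X) = {expected}, Var(X) = {variance}")  # E(X) = 3, Var(X) = 1
```

Using `Fraction` keeps the results exact, so no floating-point rounding creeps in.
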
Your turn

Let X = the number on one fair die. What's E(X)?

Try it

Roll two dice, X = the sum. Why is 7 the most likely value?

There are 6 ways to roll a 7 (1+6, 2+5, …, 6+1) but only 1 way to roll a 2 or a 12. More outcomes map to 7, so it has the highest probability — the distribution of X peaks at 7.
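
The counting argument above can be checked by listing all 36 equally likely rolls. This is a small sketch, assuming two fair six-sided dice; `rolls`, `ways` and `most_likely` are illustrative names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# Number of outcomes that map to each total.
ways = Counter(a + b for a, b in rolls)

for total in range(2, 13):
    print(f"X = {total:2d}: {ways[total]} way(s), P = {Fraction(ways[total], len(rolls))}")

most_likely = max(ways, key=ways.get)
print("Most likely total:", most_likely)  # Most likely total: 7
```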

Watch out

A random variable is a *rule* that assigns a number to each outcome, not a single number. 'X = die roll' is the variable; the 4 you actually rolled is a *realisation* of it, and once observed that value is no longer random.

Add up many independent random variables with finite variance (like the left/right bounces in the Galton board above) and their sum, once centred and rescaled, tends toward a normal distribution. This is the Central Limit Theorem, and it is why the bell curve shows up so often.
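
One way to see this numerically is to model the 8-row Galton board. The sketch assumes each bounce goes right with probability 0.5, independently of the others, so the final bin is the number of right bounces. It compares simulated frequencies, the exact binomial probabilities, and a normal density with the same mean and variance. The seed and the ball count are arbitrary choices.

```python
import math
import random

ROWS = 8           # rows of pegs, as in the board above
P_RIGHT = 0.5      # assumption: each bounce is a fair left/right choice
N_BALLS = 10_000   # arbitrary number of simulated balls

MEAN = ROWS * P_RIGHT
STD = math.sqrt(ROWS * P_RIGHT * (1 - P_RIGHT))

def binomial_pmf(k: int) -> float:
    """Exact probability that a ball makes k right bounces."""
    return math.comb(ROWS, k) * P_RIGHT**k * (1 - P_RIGHT) ** (ROWS - k)

def normal_density(x: float) -> float:
    """Normal density with the same mean and variance as the bin number."""
    return math.exp(-((x - MEAN) ** 2) / (2 * STD**2)) / (STD * math.sqrt(2 * math.pi))

def drop_ball(rng: random.Random) -> int:
    """Sum of ROWS independent 0/1 bounces: the bin the ball lands in."""
    return sum(rng.random() < P_RIGHT for _ in range(ROWS))

rng = random.Random(0)  # fixed seed so the run is repeatable
bins = [0] * (ROWS + 1)
for _ in range(N_BALLS):
    bins[drop_ball(rng)] += 1

for k in range(ROWS + 1):
    print(f"bin {k}: simulated {bins[k] / N_BALLS:.3f}  "
          f"exact {binomial_pmf(k):.3f}  normal {normal_density(k):.3f}")
```

With only 8 rows the match is approximate rather than exact; adding rows (more summed variables) tightens it.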

Recap
  • A random variable = a number determined by a random outcome.
  • Discrete (countable values) or continuous (density curve).
  • E(X) is the long-run average; sums of many independent RVs tend toward normal (CLT).