'How tall are the students?' is one question. 'Do taller students weigh more?' is a deeper one. One variable describes; two variables relate — and they need completely different toolkits.
Univariate analysis studies one variable at a time — its centre, spread, and shape. Bivariate analysis studies two variables together — whether and how they're related.
This choice is the fork in the road for every data analysis: are you summarising one thing, or hunting for a relationship between two? The chart, the statistic, and the conclusion all depend on the answer.
Which one is a bivariate display?
Tools for each
- Univariate: mean, median, mode, range, standard deviation (SD); bar graph, histogram, box plot, dot plot.
- Bivariate: correlation coefficient, regression line; scatter plot, two-way table, side-by-side box plots.
- Univariate asks: *what's typical, how spread out, what shape?*
- Bivariate asks: *do these move together — and how strongly?*
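To make the two toolkits concrete, here is a minimal sketch using only Python's standard library. The height and weight values are made up for illustration, and the helper names `describe`, `pearson_r`, and `fit_line` are hypothetical, not part of any library. The first helper answers the univariate questions about one list of numbers; the other two answer the bivariate question about a pair of lists.

```python
import statistics

# Hypothetical data: heights (cm) and weights (kg) for six students.
heights = [150, 155, 160, 165, 170, 175]
weights = [45, 50, 54, 61, 64, 70]


def describe(values):
    """Univariate summary: centre and spread of a single variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def _sums_of_squares(xs, ys):
    """Return (sxy, sxx, syy): sums of products of deviations from the means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Bivariate summary: Pearson correlation, between -1 and 1."""
    sxy, sxx, syy = _sums_of_squares(xs, ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Bivariate summary: least-squares regression line y = slope * x + intercept."""
    sxy, sxx, _ = _sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = statistics.mean(ys) - slope * statistics.mean(xs)
    return slope, intercept


# Univariate: one variable at a time.
height_summary = describe(heights)

# Bivariate: the two variables together.
r = pearson_r(heights, weights)
slope, intercept = fit_line(heights, weights)

print(height_summary)
print(round(r, 3), round(slope, 3), round(intercept, 1))
```

Note that `describe` takes a single list while `pearson_r` and `fit_line` each need two lists of equal length: the function signatures themselves mirror the one-variable versus two-variable split.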
Classify each task as univariate or bivariate: (a) average rainfall this year, (b) does rainfall affect crop yield, (c) the distribution of exam grades.
What comes after bivariate?
Multivariate analysis — three or more variables at once (e.g. how age, income, and education jointly predict spending). Most real-world data science is multivariate; univariate and bivariate are the foundations.
Don't reach for correlation when you only have one variable: there's nothing to correlate it with. And don't stop at summarising two related variables separately; you'd miss the relationship that's the whole point.
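A small sketch of why separate summaries can hide the relationship, again with made-up data and a hypothetical `pearson_r` helper. Reversing the order of the weights leaves every univariate summary of the weights unchanged, yet it flips the direction of the relationship with height.

```python
import math
import statistics

# Hypothetical data: the same six weights, paired with heights in two different ways.
heights = [150, 155, 160, 165, 170, 175]
weights = [45, 50, 54, 61, 64, 70]
weights_reversed = weights[::-1]  # same values, opposite pairing with heights


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Univariate summaries cannot tell the two weight lists apart.
same_mean = math.isclose(statistics.mean(weights), statistics.mean(weights_reversed))
same_sd = math.isclose(statistics.stdev(weights), statistics.stdev(weights_reversed))

# The bivariate summary can: one pairing rises with height, the other falls.
r_original = pearson_r(heights, weights)
r_reversed = pearson_r(heights, weights_reversed)

print(same_mean, same_sd)
print(round(r_original, 3), round(r_reversed, 3))
```

Only a statistic that looks at the pairs, such as the correlation, separates the two datasets.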
First question of any analysis: *how many variables am I actually interested in here?* One → describe it. Two → look for a relationship. Three+ → multivariate methods.
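The counting rule above can be written down directly. This is only an illustrative sketch: the function name `analysis_type` and the variable names passed to it are hypothetical.

```python
def analysis_type(variables):
    """Classify an analysis by how many variables of interest it involves."""
    count = len(variables)
    if count == 0:
        raise ValueError("an analysis needs at least one variable")
    if count == 1:
        return "univariate"    # describe it: centre, spread, shape
    if count == 2:
        return "bivariate"     # look for a relationship
    return "multivariate"      # three or more variables at once


print(analysis_type(["height"]))
print(analysis_type(["height", "weight"]))
print(analysis_type(["age", "income", "education", "spending"]))
```

The three calls classify, in order, as univariate, bivariate, and multivariate.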
- Univariate = one variable: centre, spread, shape.
- Bivariate = two variables: is there a relationship, and how strong?
- Three or more → multivariate — the basis of most data science.