Least-squares regression
Find the line that minimises total squared error — the line of best fit.
Picture a scatter of dots that *roughly* line up. You could draw infinitely many lines through them. Least squares picks exactly one: the line that is, in a precise sense, as close as possible to the dots taken all together.
Least-squares regression finds the straight line that minimises the total of the *squared* vertical distances from the data points to the line. That line is the 'line of best fit'.
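To make "total of the squared vertical distances" concrete, here is a minimal sketch that scores two candidate lines on a small made-up data set. The data, the function name `sum_squared_error`, and both candidate lines are assumptions for illustration only; the first candidate happens to be the least-squares line for this data.

```python
def sum_squared_error(xs, ys, intercept, slope):
    """Total of squared vertical distances from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustration data (not taken from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate 1: y = 2.2 + 0.6x (the least-squares line for this data).
sse_best = sum_squared_error(xs, ys, 2.2, 0.6)
# Candidate 2: y = 2.0 + 0.7x (a nearby line that looks plausible by eye).
sse_other = sum_squared_error(xs, ys, 2.0, 0.7)

print(round(sse_best, 2))   # 2.4
print(round(sse_other, 2))  # 2.55
```

Any other line you try on this data should score above 2.4, which is what "best fit" means here.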
Where it shows up: trend lines, forecasting, calibration curves, A/B test analysis, and as the foundation of linear regression in statistics and machine learning.
y ≈ 0.96x + 1.86
The correlation coefficient r runs from −1 (a perfect downward line) through 0 (no linear link) to +1 (a perfect upward line). Correlation isn't causation!
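As a rough sketch of how r can be computed by hand, the snippet below uses the standard Pearson formula: the sum of products of deviations, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustration data (not taken from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 3))  # 0.775
```

Points that sit exactly on a rising line, such as (1, 2), (2, 4), (3, 6), give r = 1; flip the direction and r becomes −1.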
In the fitted line ŷ = a + bx, b is the slope (how much ŷ changes per unit of x) and a is the intercept (the predicted value at x = 0). The line always passes through the point (x̄, ȳ).
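One common way to get the intercept a and slope b is the closed-form solution: b is the sum of products of deviations divided by the sum of squared x deviations, and a = ȳ − b·x̄. The sketch below assumes that formula and uses made-up data; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Made-up illustration data (not taken from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))  # 2.2 0.6

# The line passes through the point of means (here x̄ = 3, ȳ = 4).
print(round(a + b * 3, 3))  # 4.0
```

Because a is defined as ȳ − b·x̄, plugging x̄ into the line always returns ȳ, which is why the fitted line goes through (x̄, ȳ).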
A best-fit line is ŷ = 50 + 8x, where x = hours studied. What does the 8 mean?
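One way to explore this question is to plug a few values into the line and compare neighbouring predictions. The function name `predict` is made up for this sketch, and the text does not say what y measures, so the output is just called the prediction.

```python
def predict(hours):
    """Prediction from the line y-hat = 50 + 8x, with x = hours studied."""
    return 50 + 8 * hours

print(predict(0))               # 50
print(predict(2))               # 66
print(predict(3))               # 74
print(predict(3) - predict(2))  # 8
```

Each extra hour moves the prediction up by the same 8 units, and zero hours gives the intercept, 50.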
What does R² tell you about a regression?
R² is the fraction of the variation in y explained by the line — 0 (line useless) to 1 (perfect fit). R² = 0.8 means 80% of y's variability is accounted for by x; the other 20% is scatter the line can't capture.
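A minimal sketch of that calculation, assuming the usual definition R² = 1 − (sum of squared residuals) / (total sum of squares around ȳ). The data and the line y = 2.2 + 0.6x (the least-squares fit for this data) are made up for illustration, as is the function name `r_squared`.

```python
def r_squared(xs, ys, intercept, slope):
    """Fraction of the variation in ys explained by the given line."""
    y_bar = sum(ys) / len(ys)
    # Variation left over after using the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its mean.
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up illustration data (not taken from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys, 2.2, 0.6)
print(round(r2, 3))  # 0.6
```

Here the line accounts for 60% of the variation in y. A flat line at ȳ would score 0, and a line through every point would score 1.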
Don't extrapolate far beyond your data. A line fitted on 0 to 5 hours of study says nothing reliable about 20 hours. And a good fit still isn't proof that x *causes* y.
Gauss and Legendre both claimed least squares: Legendre published it in 1805, and Gauss published it in 1809 while saying he had used it for years. Gauss had already predicted where Ceres would reappear in 1801, and nailed it. Two centuries on, it's still the standard way to fit a linear regression.
- Finds the line minimising total squared vertical error.
- Slope b = change in y per unit x; line passes through (x̄, ȳ).
- R² = fraction of variation explained; don't extrapolate or claim causation.