Math Playground
Back to Graphics

Graphics › Coordinate spaces

Perspective & the divide

Why distant things look small: the projection matrix smuggles depth into the w coordinate, then everything gets divided by it.

Projection — orthographic vs perspective
image planeeyenear (d=1.4)mid (d=2.6)far (d=4)

what the camera records

Rays converge on the eye, so the far object lands closer to the centre and is recorded smaller — that's the ÷ depth that gives a sense of distance. Widen the FOV and more of the scene crams onto the same image.

The ÷w trick

A perspective projection matrix is built so that, after multiplying a view-space point by it, the 4th coordinate w comes out equal to roughly the distance to the camera (−z, in OpenGL terms). Then the GPU does the perspective divide: it divides x, y and z by that w.

So a point twice as far away has twice the w, and after the divide its x and y are halved — it lands half as far from the centre of the screen and draws half as big. That single division is the entire reason a hallway converges to a vanishing point.

Three knobs

Field of view (FOV) — the angle the frustum opens to. Wide FOV = more in shot, more “fish-eye” distortion at the edges; narrow FOV = zoomed-in, flattened look. Aspect ratio — width ÷ height of the window, so circles don't become ovals (see aspect ratio). Near & far planes — the closest and furthest distances that get drawn.

Why the near plane can't be zero

Dividing by w means dividing by depth — and at depth 0 you'd divide by zero. So there's always a near plane a small distance in front of the eye; anything closer is clipped. Set near too small and far too large and depth precision collapses: distant surfaces flicker in and out of each other — z-fighting. The fix is almost always “push the near plane out”, not “pull far in”, because depth precision is bunched up near the camera.

Perspective-correct interpolation

Because of the ÷w, you can't blend texture coordinates across a triangle linearly in screen space — that gives the wobbly-texture look of early PlayStation games. Modern GPUs interpolate attribute/w and 1/w instead, then divide: “perspective-correct” interpolation.