The rendering pipeline
An assembly line: a list of vertices goes in one end, a grid of coloured pixels comes out the other. Here are the stations, in order.
Coordinate spaces — follow a cube from mesh to pixels
Model space: the cube as the mesh file stores it, built around its own origin, with no idea where it will end up in the world.
In this space the cube's far-top-right corner sits at (0.5, 0.5, 0.5).
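To make the trip from mesh to pixels concrete, here is a minimal sketch that follows that one corner through every space. The scene is hypothetical and not taken from this page: the cube is placed at world (1, 0, 0), the camera sits at (1, 0, 5), the field of view is 90°, and the viewport is 800×600. It assumes OpenGL-style conventions (camera looks down -Z, normalized device coordinates span [-1, 1]) and a top-left pixel origin. Under those conventions +Z points towards the camera, so this particular corner lands on the near side of the cube; a left-handed setup would put it on the far side.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Apply a 4x4 matrix to a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    """Matrix that moves a point by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection (an assumed convention, see the text above)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def follow_corner(corner=(0.5, 0.5, 0.5), width=800, height=600):
    # Hypothetical scene: cube placed at world (1, 0, 0), camera at (1, 0, 5).
    model = translation(1.0, 0.0, 0.0)
    view = translation(-1.0, 0.0, -5.0)  # inverse of the camera's placement
    proj = perspective(90.0, width / height, 0.1, 100.0)

    model_pos = [corner[0], corner[1], corner[2], 1.0]
    world = transform(model, model_pos)                  # model -> world
    mvp = mat_mul(proj, mat_mul(view, model))
    clip = transform(mvp, model_pos)                     # model -> clip
    ndc = [c / clip[3] for c in clip[:3]]                # perspective divide
    screen = ((ndc[0] * 0.5 + 0.5) * width,              # viewport transform
              (1.0 - (ndc[1] * 0.5 + 0.5)) * height)     # y flipped: origin top-left
    return world, clip, ndc, screen
```

With these made-up numbers, `follow_corner()` puts the corner at world (1.5, 0.5, 0.5), gives it a clip-space w of 4.5 (its distance in front of the camera), and lands it at roughly pixel (433.3, 266.7), a little up and to the right of the screen centre.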
The stations on the line
- Vertex shader: runs once per vertex. Its main job is to multiply each position by the model-view-projection (MVP) matrix, taking the vertex from model space to clip space. It can also move vertices around (waves, wind, skinning) and pass data (colour, UVs, normals) down the line. Programmable.
- Primitive assembly & clipping: groups vertices into triangles, clips any that cross the view-volume boundary, and optionally back-face culls (skips triangles facing away from the camera). The perspective divide (÷w, clip space to normalized device coordinates) and the viewport transform (to screen pixels) happen here. Fixed-function.
- Rasterization: turns each screen-space triangle into the set of “fragments” (candidate pixels) it covers, interpolating the per-vertex data across them perspective-correctly. See rasterization. Fixed-function.
- Fragment (pixel) shader: runs once per fragment. Computes that fragment's colour: sample textures, do lighting, fog, etc. The result only reaches the screen if the fragment survives the next station. Programmable.
- Per-fragment tests & blending: depth test (is something already in front?), stencil test, alpha blending. Whatever survives is written to the framebuffer. Fixed-function.
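The last three stations can be imitated in a few lines of software. The sketch below is a toy, not how a GPU is built: the name `draw_triangle` and the nested-list buffers are made up for illustration, the "fragment shader" is just a flat colour, and only depth is interpolated (real hardware also interpolates colour, UVs and normals, and does so perspective-correctly). It assumes smaller depth means closer.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area of (a, b, p), doubled: tells which side of edge a->b p is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def draw_triangle(tri, colour, width, height, colour_buf, depth_buf):
    """Rasterize one screen-space triangle given as three (x, y, z) vertices.

    Returns how many fragments were shaded, including ones the depth test
    later rejected.
    """
    (x0, y0, z0), (x1, y1, z1), (x2, y2, z2) = tri
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return 0  # degenerate triangle covers nothing

    shaded = 0
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5  # sample at the pixel centre
            # Rasterization: barycentric weights from the three edge functions.
            w0 = edge(x1, y1, x2, y2, cx, cy) / area
            w1 = edge(x2, y2, x0, y0, cx, cy) / area
            w2 = edge(x0, y0, x1, y1, cx, cy) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # pixel centre is outside the triangle
            z = w0 * z0 + w1 * z1 + w2 * z2  # interpolated depth

            # Fragment shader: here it only returns a flat colour.
            fragment_colour = colour
            shaded += 1

            # Per-fragment test: keep the fragment only if it is nearer.
            if z < depth_buf[py][px]:
                depth_buf[py][px] = z
                colour_buf[py][px] = fragment_colour
    return shaded
```

Drawing a near triangle and then a farther one over the same pixels shades every fragment of both, yet the buffer keeps only the near colour. That wasted second pass is overdraw in miniature.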
The shape of it
Per-vertex work (a few thousand items) → assembly & clip → per-fragment work (millions of items). The fan-out at rasterization is why fragment shading usually dominates the GPU's time, and why “overdraw” (shading fragments that later get hidden behind something nearer) hurts.
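A back-of-envelope count shows the size of that fan-out. The figures used below (a 5,000-vertex mesh, a 1920×1080 target, an average overdraw of 2.5) are illustrative assumptions, not measurements, and the helper name is made up.

```python
def shader_invocations(vertex_count, width, height, overdraw):
    """Rough count of shader runs for one frame.

    overdraw is the assumed average number of fragments shaded per pixel.
    """
    vertex_runs = vertex_count
    fragment_runs = int(width * height * overdraw)
    return vertex_runs, fragment_runs

vertex_runs, fragment_runs = shader_invocations(5_000, 1920, 1080, 2.5)
ratio = fragment_runs / vertex_runs
```

Under these assumptions the fragment shader runs 5,184,000 times against 5,000 vertex shader runs, about a thousand fragment invocations per vertex invocation.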
What you actually write
Only two boxes are always yours: the vertex shader and the fragment shader. Modern GPUs add optional programmable stages (geometry and tessellation shaders) and, outside this pipeline, compute shaders. Everything else (clipping, rasterizing, depth testing) is fixed-function hardware you configure but don't code. The art of real-time graphics is doing as little per-fragment work as possible and as much per-vertex (or precomputed) work as you can get away with.
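One common way to shift work up the line is to evaluate lighting at the three vertices and let interpolation fill in the fragments. The sketch below compares that with lighting every fragment. It uses a simple diffuse (Lambert) term as a stand-in for "lighting"; the function names are hypothetical, and fragments are represented only by their barycentric weights.

```python
import math

def lambert(normal, light_dir):
    """Diffuse term max(n . l, 0) for unit-length vectors."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))

def shade_per_fragment(vertex_normals, light_dir, fragments):
    """Interpolate the normal, then light each fragment: one evaluation each."""
    out = []
    for w in fragments:
        n = [sum(w[i] * vertex_normals[i][k] for i in range(3)) for k in range(3)]
        length = math.sqrt(sum(c * c for c in n)) or 1.0  # avoid dividing by zero
        out.append(lambert([c / length for c in n], light_dir))
    return out

def shade_per_vertex(vertex_normals, light_dir, fragments):
    """Light the three vertices once, then only interpolate per fragment."""
    lit = [lambert(n, light_dir) for n in vertex_normals]  # 3 evaluations total
    return [sum(w[i] * lit[i] for i in range(3)) for w in fragments]
```

On a flat triangle (all three normals equal) the two versions agree, so the per-vertex one is the cheaper choice. When the normals differ across the triangle the results diverge, which is the quality you trade away for the saving.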