3D pipeline overview and description

This short description explains the mathematical tools and considerations used in the provided 3D rendering engines. These techniques are commonly used, but I have introduced some subjective choices, such as the axis orientation.

The 3D rendering technique used is a combination, in this order, of: camera space transformation + perspective projection + clipping + scanline rendering + (Z-buffering + bilinear interpolation).

I have chosen this combination as a compromise between real-time rendering and realistic rendering.

As 3D rendering programming is a very broad topic, I will not explain each step in great detail. I hope the short description below will help you understand how the provided 3D rendering engines work.

Each step is developed further in the sections below, with some specific explanations related to our code.


1 Camera space transformation

This first step of the process transforms vertex coordinates from 3D scene space (absolute space, or "world space") to camera space (the point of view).

The camera position is always defined in the absolute space (or 3D world space) by two points:
  • The "IsAt" point, the space coordinates where the camera is.
  • The "LookAt" point, the space coordinates the camera points to.
Note that the picture shows an inverted Y axis for the camera; I have used this orientation because computer graphics often use a Y value increasing from top to bottom. We could flip the camera's Y axis by changing the transformation matrix coefficients, but in that case we would also have to change the sign of some of the coefficients used to compute the projection plane coordinates.

To perform a space transformation we use a transformation matrix; this point is developed further in the camera space transformation section.
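As a minimal sketch of this step, the transform can be applied as a 3x4 affine matrix (rotation rows plus a translation column) per vertex. The names `Vec3` and `world_to_camera` are illustrative, not taken from the engine's actual code:

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Apply a 3x4 affine transform stored row-major: m[0..2] is the first
   rotation row and m[3] its translation term, and so on for y and z. */
static Vec3 world_to_camera(const float m[12], Vec3 p)
{
    Vec3 q;
    q.x = m[0]*p.x + m[1]*p.y + m[2]*p.z  + m[3];
    q.y = m[4]*p.x + m[5]*p.y + m[6]*p.z  + m[7];
    q.z = m[8]*p.x + m[9]*p.y + m[10]*p.z + m[11];
    return q;
}
```

With an identity rotation and a translation of -IsAt, a vertex located exactly at the camera's IsAt point maps to the camera-space origin, as expected.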


2 Perspective projection

We usually consider that the 3D scene is projected onto a virtual screen (the projection plane), which is the 2D view observable from the camera. You may think of it as a screen on which the 3D scene is displayed, seen along the Z axis of the camera (like your computer's screen).

Once again, the axis orientation of this projection plane has been chosen according to the computer graphics convention:
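The perspective divide itself is a one-liner per axis. In this sketch, `focal` (the camera-to-plane distance) and the plane centre `(cx, cy)` are illustrative parameter names, not the engine's own identifiers:

```c
#include <assert.h>

/* Project a camera-space point onto the projection plane. 'focal' is
   the camera-to-plane distance; (cx, cy) is the plane centre in pixel
   coordinates, with Y increasing downward as described in the text. */
static void project(float focal, float cx, float cy,
                    float x, float y, float z,
                    float *sx, float *sy)
{
    *sx = cx + focal * x / z;
    *sy = cy + focal * y / z;  /* camera Y already points down: no sign flip */
}
```

Because the camera's Y axis was already chosen to point downward, no sign change is needed here; with a Y-up camera, the `sy` term would need a minus sign, which is the coefficient-sign change mentioned in the previous section.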



3 Clipping

The clipping process step eliminates the parts of the 3D scene that lie outside the projection plane. It also removes from the rendering process the parts that cannot be seen from the camera at all, and clips the parts that are only partially visible. This reduces the CPU time needed to render the scene.

Clipping is generally performed both before and after the perspective projection. That means we need information both in 3D space and on the 2D projection plane to eliminate and reduce what must be drawn.
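One essential pre-projection clip is against the near plane, since a point at or behind the camera cannot be projected safely. A minimal sketch for a segment (the function name and `Vec3` type are illustrative, not the engine's):

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Clip the segment (p0, p1) against the near plane z = z_near.
   Fully hidden segments are rejected; a partially visible segment has
   its hidden endpoint moved onto the plane by linear interpolation. */
static int clip_near(Vec3 *p0, Vec3 *p1, float z_near)
{
    if (p0->z < z_near && p1->z < z_near) return 0;   /* fully behind: drop */
    if (p0->z >= z_near && p1->z >= z_near) return 1; /* fully in front: keep */
    float t = (z_near - p0->z) / (p1->z - p0->z);     /* intersection param */
    Vec3 hit = { p0->x + t * (p1->x - p0->x),
                 p0->y + t * (p1->y - p0->y),
                 z_near };
    if (p0->z < z_near) *p0 = hit; else *p1 = hit;
    return 1;
}
```

Polygon clipping works the same way edge by edge; the post-projection clip against the screen rectangle is then a simple 2D bounds test.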



4 Scanline rendering

Scanline rendering is an algorithm for drawing the pixels of a polygon projected onto the projection plane, working on a row-by-row basis.

The main idea is to use only the values associated with each vertex of the polygon to be drawn (three sets of values for a triangular polygon), then to scan each pixel of a line, applying bilinear interpolation to compute the values for the pixels inside the polygon (z coordinate, color, diffuse light, specular light, and the UV texture interpolants).
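The inner loop of this idea can be sketched as filling one span of a row. Here only z is interpolated and stored; the function name and parameters are illustrative, and a real engine would interpolate the other per-vertex values and shade each pixel the same way:

```c
#include <assert.h>

/* Fill one scanline span, linearly interpolating z from the left edge
   value to the right edge value. Colour, light and UV interpolants
   would be stepped across the span in exactly the same manner. */
static void draw_span(int x_left, int x_right,
                      float z_left, float z_right, float *row_z)
{
    float dz = (x_right > x_left)
             ? (z_right - z_left) / (float)(x_right - x_left)
             : 0.0f;
    float z = z_left;
    int x;
    for (x = x_left; x <= x_right; ++x) {
        row_z[x] = z;  /* a real engine would run the z-test and shade here */
        z += dz;
    }
}
```

Note the incremental form: one addition per pixel per interpolant, instead of a full interpolation formula per pixel, which matters for real-time rendering.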


5 Z-buffering and bilinear interpolation

Z-buffering and bilinear interpolation are generally mixed, because the z coordinates themselves are interpolated using bilinear interpolation. Bilinear interpolation is also used to determine the values (light, colors, UV texture interpolants) associated with each vertex of the polygon to be drawn; these are computed only when the pixel passes the z-buffering test and is therefore visible.


The Z-buffering operation consists in managing image depth coordinates in 3D space. It is one solution to the visibility problem: deciding which elements of a rendered scene are visible and which are hidden (Z-buffering is also known as depth buffering).

When a part of the 3D scene is rendered, the depth of each generated pixel (its z coordinate along the z axis) is stored in a buffer (the z-buffer, or depth buffer). This buffer is usually arranged as a two-dimensional array (x*y) with one element per pixel of the projection plane. If another part of the scene must be rendered at the same pixel (same x and y coordinates on the 2D projection plane), the two depths are compared and the one closer to the observer is drawn (because it is necessarily the visible one). The new depth is then saved into the z-buffer, replacing the old one. In the end, the z-buffer correctly reproduces the usual depth perception: a close object hides a farther one. Discarding hidden pixels this way is called z-culling.
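The per-pixel test described above is tiny; this sketch uses a 16-bit buffer to match the choice discussed below, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* One depth test per pixel: draw only when the incoming fragment is
   closer than what the 16-bit z-buffer already holds at that pixel.
   The buffer is assumed to be cleared to 0xFFFF (farthest) each frame. */
static int z_test_and_set(uint16_t *zbuf, int index, uint16_t depth)
{
    if (depth < zbuf[index]) {   /* smaller depth = closer to the camera */
        zbuf[index] = depth;     /* remember the new closest surface */
        return 1;                /* caller may write the pixel colour */
    }
    return 0;                    /* hidden behind something: skip */
}
```

Returning early on a failed test is exactly what makes the "compute the other interpolants only if the pixel is visible" optimisation possible.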

The precision of the z-buffer elements has a great influence on the scene quality.

As Palm OS based devices do not have a large heap memory, I have chosen to use a 16-bit z-buffer. This choice may generate artifacts (called "z-fighting") when two objects are very close to each other; a 32-bit z-buffer behaves much better but uses more memory.


5.1 Bilinear interpolation

This last step is the one where pixels are set to their interpolated color, according to the lights, texture, material, depth, and so on.

This part of the code must be well optimized to run as fast as possible for real-time rendering. A more detailed description can be found in the related section.
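In this pipeline, "bilinear" means two chained linear interpolations: one along the polygon edges to find each scanline's end values, and one across the scanline to find each pixel's value. A sketch with illustrative names (the real engine interpolates several values at once, not a single float):

```c
#include <assert.h>

/* One linear interpolation step. */
static float lerp(float a, float b, float t) { return a + t * (b - a); }

/* Value of one pixel: ty selects the position along the edges (which
   scanline), tx the position across the resulting span (which pixel). */
static float bilinear(float left_top, float left_bottom,
                      float right_top, float right_bottom,
                      float tx, float ty)
{
    float left  = lerp(left_top,  left_bottom,  ty);  /* left edge value  */
    float right = lerp(right_top, right_bottom, ty);  /* right edge value */
    return lerp(left, right, tx);                     /* across the span  */
}
```

The same scheme is applied to the z coordinate, the colour channels, the light terms, and the UV texture interpolants listed earlier.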