Introduction

The following are observations distilled from the "Scanline Rendering in POV-Ray" discussion thread at news.povray.org. They cover the pros and cons of scanline rendering and raytracing in detail. The two methods are described first and then compared. "Scanline" is used here as a catch-all term for any method that renders by rasterizing polygons into an output buffer, including REYES.


Raytracing

Raytracing is the dominant method for rendering photorealistic scenes. POV-Ray and Rayshade are examples of raytracers. Hardware implementations of raytracers exist but tend to be rare.

The idea behind raytracing is to iterate through the output buffer (screen) pixels and figure out what part of the scene each of them shows. As a result, if scene geometry remains constant, render time increases in linear proportion to the number of output pixels.

For each screen pixel, an imaginary ray is cast from the camera into the scene. Intersections between the ray and scene objects are compared and the closest one to the camera is used to color the pixel, making hidden surface removal implicit.

If the object is reflective or transparent/refractive, a secondary ray is cast (bounced off the object or passed through it) to find out what the object is reflecting or letting show through. Rays can also be cast towards light sources to determine shadows. Every primitive must provide some way to test itself for intersection with a ray.
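
The primary-ray loop described above can be sketched in a few lines of Python. This is only an illustration, not how POV-Ray or any other raytracer is actually written: the names `hit_sphere`, `trace`, and `render` are hypothetical, the scene contains only spheres, the camera is assumed to be a pinhole at the origin looking down the negative Z axis, and no secondary or shadow rays are cast.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the nearest positive ray parameter t, or None on a miss.
    Assumes `direction` is unit length."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:          # ignore hits behind the ray origin
            return t
    return None

def trace(origin, direction, spheres):
    """Test every object and keep the hit closest to the camera.
    Hidden surface removal is implicit in this comparison."""
    closest = None
    for sphere in spheres:
        t = hit_sphere(origin, direction, sphere["center"], sphere["radius"])
        if t is not None and (closest is None or t < closest[0]):
            closest = (t, sphere)
    return closest

def render(width, height, spheres, background="."):
    """Cast one eye ray per output pixel and return rows of characters."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the pixel centre onto an image plane at z = -1.
            px = (x + 0.5) / width * 2.0 - 1.0
            py = 1.0 - (y + 0.5) / height * 2.0
            length = math.sqrt(px * px + py * py + 1.0)
            direction = (px / length, py / length, -1.0 / length)
            hit = trace((0.0, 0.0, 0.0), direction, spheres)
            row.append(hit[1]["color"] if hit else background)
        rows.append("".join(row))
    return rows

spheres = [
    {"center": (0.0, 0.0, -5.0), "radius": 1.0, "color": "A"},
    {"center": (0.0, 0.0, -3.0), "radius": 0.5, "color": "B"},
]
```

Calling `render(3, 3, spheres)` walks the nine output pixels in turn; the centre pixel sees sphere "B" because its intersection lies nearer the camera than the one on sphere "A". A reflective or refractive surface would call `trace` again from the hit point with a new direction.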

A Z or depth buffer may also be used to provide quick redrawing effects after a rendering is completed, although some raytracers forgo such a feature because the main rendering task does not explicitly require a Z buffer.


Scanline Rendering

Scanline rendering is the preferred method for generating most computer graphics in motion pictures. One particular implementation, REYES, is so popular that it has become almost standard in that industry. Scanline rendering is also the method used by video games and most scientific/engineering visualization software (usually via OpenGL). Scanline algorithms have also been widely and cheaply implemented in hardware.

In scanline rendering, drawing is accomplished by iterating through component parts of scene geometry primitives. If the number of output pixels remains constant, render time tends to increase in linear proportion to the number of primitives. OpenGL and PhotoRealistic RenderMan are two examples of scanline renderers.

Before drawing, a Z or depth buffer containing as many pixels as the output buffer is allocated and initialized. The Z buffer is like a heightfield facing the camera, and it keeps track of which scene geometry part is closest to the camera, making hidden surface removal easy. The Z buffer may store additional per-pixel attributes, or other buffers can be allocated to do this (more on this below). Unless primitives are prearranged in back-to-front painting order and do not present pathological depth issues, a Z buffer is mandatory.

Each primitive is either already an easily drawable part (usually a triangle) or can be divided up (tessellated) into such parts. Triangles or polygons that fit within a screen pixel are called micropolygons, and represent the smallest size a polygon needs to be for drawing. It is sometimes desirable (but not absolutely necessary) for polygons to be micropolygons; what matters is how simply (and therefore quickly) a polygon can be drawn.

Assigning color to output pixels using these polygons is called rasterization. After figuring out which screen pixel locations the corners of a polygon occupy, the polygon is scan-converted into a series of horizontal or vertical strips (usually horizontal). As each scanline is stepped through pixel by pixel (from one edge of the polygon to the other), various attributes of the polygon are computed so that each pixel can be colored properly. These include surface normal, scene location, z-buffer depth, and polygon s,t coordinates. If the depth of a polygon pixel is nearer to the camera than the value for the respective screen pixel in the Z buffer, the Z buffer is updated and the pixel is colored. Otherwise, the polygon pixel is ignored and the next one is tried.


Technique Comparison

Each rendering method has its strengths and weaknesses. Because the shortcomings of one approach tend to be strengths in the other, some renderers, suitably named "hybrid renderers", use both methods in an attempt to have few or no weaknesses.

Raytracers are good at:

  • Photorealistic features such as reflections, transparency, multiple lights, shadows, area lights, etc. With only a little work, these features pretty much "fall out" of the algorithm, because rays are a good analogy for light paths, thereby modeling the real-world properties of light.
     
  • Rendering images with very large amounts of scene geometry. By using a hierarchical bounding box tree data structure, the cost of locating the objects a ray must be intersection-tested against grows only logarithmically with the number of primitives, much like a binary search for a number in a sorted list. Because only world-aligned boxes need to be intersection-tested when searching the tree, searches are fast relative to scene complexity.
     
  • Using different cameras. By simply altering how eye rays are projected into the scene, one can easily imitate the optical properties of many different lenses, scene projections, and special lens distortions.
     
  • CSG. Constructive Solid Geometry modeling is easy to support (todo: specifics).
     
  • Motion blur (todo: specifics).
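
The bounding box hierarchy mentioned in the list above can be illustrated with a small Python sketch. The function names `ray_hits_box` and `candidates` and the dictionary-based tree layout are assumptions made for this example only; a real raytracer would use a tuned traversal (and typically precomputed reciprocals of the ray direction) rather than this recursive version.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test of a ray against a world-aligned box.
    Returns True if the ray enters the box at some t >= 0."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of slab planes.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def candidates(node, origin, direction):
    """Collect the primitives whose leaf boxes the ray enters.
    A subtree is skipped entirely when its enclosing box is missed."""
    if not ray_hits_box(origin, direction, node["min"], node["max"]):
        return []
    if "primitive" in node:
        return [node["primitive"]]
    found = []
    for child in node["children"]:
        found.extend(candidates(child, origin, direction))
    return found

# A hypothetical two-leaf hierarchy.
tree = {
    "min": (-4, -1, -1), "max": (4, 1, 1),
    "children": [
        {"min": (-4, -1, -1), "max": (-2, 1, 1), "primitive": "left"},
        {"min": (2, -1, -1), "max": (4, 1, 1), "primitive": "right"},
    ],
}
```

Only the primitives returned by `candidates` need the more expensive per-primitive intersection test. In a reasonably balanced tree, each missed box discards a whole subtree, which is where the logarithmic behaviour comes from.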
     
Scanline renderers are good at:

  • Drawing quickly if the final number of polygons is under some threshold relative to the visibility determination algorithm being used (BSP, octree, etc.). By not searching for scene geometry for each pixel, they just "hop to it" and start drawing.
     
  • Supporting displacement shaders. After splitting a primitive into polygons or patches, the polygon or patch can easily be subdivided further to produce more geometry.
     
  • Maintaining CPU/GPU code and data cache coherency, because the switching of textures and primitives occurs less frequently.
     
  • Arbitrary generation of primitives/patches/polygons, because they can be unloaded after being drawn. This is useful when implementing, for example, shaders that work by inserting additional geometry on the fly.
     
  • Realtime rendering even without hardware support, and realtime rendering of considerable model complexity with hardware support.
     
  • Wireframe, pointcloud, and other diagnostic-style rendering.
     

What impedes raytracing performance:

  • Although each screen pixel need only be computed once, that computation is expensive, and the cost is incurred even for pixels that are not intersected by any geometry. First, the projected eye ray is determined. This costs at least 10 multiplies and 7 adds. Next, the bounding slabs hierarchy is traversed. This requires an optimized search with intersection testing of world-aligned bounding boxes. Each box test costs two multiply-adds plus some comparison logic. Then, when the nearest bbox leaf node is found, the primitive inside is tested. This costs at least 18 multiplies and 15 adds, because the eye ray must be transformed into the primitive's local coordinate system. Then the actual hit test is done; for a sphere, we're looking at an extra 10 multiplies and 15 adds.

    All the conditional logic inside these routines (and it can be complex) impedes the CPU's branch prediction. There is also the overhead of the slab machinery, which must maintain state flags in rays, etc.

    If it takes five bbox tests during the bounding slabs traversal, we've used 10 multiply-adds, so the total for a sphere intersection for one screen pixel would be 48 multiplies and 47 additions. So far, we have not cast any secondary rays; this computation load occurs for each primary ray. Even without global illumination effects, a raytracer would have to trace width x height pixels in 1/24 second to perform in realtime, which for a 640 x 480 display means about 14,745,600 multiplies and 14,438,400 additions per frame.

    There's just so much computing going on. Considering that current chip fabrication processes are hitting a wall, the necessary speed might not be available for some time. Clearly, if raytracing is to perform in realtime before Moore's Law can be unstalled, a hardware assist (or massive parallelism) is necessary.
     

  • In scanlining, there are no computations required for pixels that do not intersect geometry. The expensive operations are projecting the primitive into eye space, tessellating a primitive into polygons, projecting each polygon into screen space, and computing per-polygon edge lists. The larger a polygon is, the more pixels there are across which to spread the cost of the per-polygon computation. Since the ratio of the perimeter length of a polygon to its surface area decreases as the polygon gets larger, the per-pixel cost can become very small, the ultimate minimum being 4 multiplies and 5 adds (an optimized interpolation to compute the pixel's 3D location and depth buffer value plus a single addition to increment the pixel's X coordinate). This benefit accrues particularly in preview rendering and rendering of flat surfaces such as planes, boxes, triangles, etc. Throw in the greater cache coherency and less disruptive effects upon branch prediction, and it's apparent that a scanliner can afford to suffer pixel overwrites several times before a raytracer becomes competitive. With efficient visibility determination, scanlining is an order of magnitude ahead. For micropolygons, edge lists and their per-pixel interpolations become unnecessary, so a different set of computation costs applies (todo: investigate this).
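
The operation counts quoted in the two items above can be tallied with a short Python script. The per-step figures are taken directly from the text; the constant and function names are hypothetical, and the scanline figure assumes the best case stated above (every pixel covered exactly once at the minimum per-pixel cost, with per-polygon setup ignored).

```python
# (multiplies, adds) for each raytracing step, as quoted in the text.
EYE_RAY = (10, 7)          # determine the projected eye ray
BOX_TEST = (2, 2)          # one bbox test = two multiply-adds
TRANSFORM = (18, 15)       # move the ray into the primitive's local space
SPHERE_TEST = (10, 15)     # the actual sphere hit test

# Minimum per-pixel scanline cost quoted in the text.
SCANLINE_PIXEL = (4, 5)

def per_ray_cost(box_tests=5):
    """Total (multiplies, adds) for one primary ray hitting a sphere."""
    mults = EYE_RAY[0] + box_tests * BOX_TEST[0] + TRANSFORM[0] + SPHERE_TEST[0]
    adds = EYE_RAY[1] + box_tests * BOX_TEST[1] + TRANSFORM[1] + SPHERE_TEST[1]
    return mults, adds

def per_frame_cost(per_pixel, width=640, height=480):
    """Scale a per-pixel (multiplies, adds) cost up to a whole frame."""
    pixels = width * height
    return per_pixel[0] * pixels, per_pixel[1] * pixels

ray = per_ray_cost()                        # (48, 47)
ray_frame = per_frame_cost(ray)             # (14745600, 14438400)
scan_frame = per_frame_cost(SCANLINE_PIXEL) # (1228800, 1536000)
```

Under these assumptions the raytracer performs 12 times as many multiplies per frame as the best-case scanliner, which is consistent with the "order of magnitude" estimate above. It also leaves room for the scanliner to overwrite each pixel several times before the totals become comparable.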

Copyright Daylon Graphics Ltd.