Introduction
The following are observations distilled from the
"Scanline Rendering in POV-Ray" discussion thread
at news.povray.org. They discuss in detail the pros
and cons of scanline and raytracer rendering. The
two methods are first described, and then compared.
"Scanline" is a catch-all term for any method that renders
using output buffer polygon rasterization, including
REYES.
Raytracing
Raytracing is the dominant method for rendering
photorealistic scenes. POV-Ray and Rayshade are
examples of raytracers. Hardware implementations
of raytracers exist but tend to be rare.
The idea behind raytracing is to iterate through
the output buffer (screen) pixels and figure out
what part of the scene each of them shows. As a
result, if scene geometry remains constant, render time
increases in linear proportion to the number of
output pixels.
For each screen pixel, an imaginary ray is cast
from the camera into the scene. Intersections between
the ray and scene objects are compared and the closest
one to the camera is used to color the pixel, making
hidden surface removal implicit.
If the object is reflective or transparent/refractive,
a secondary ray is cast from the hit point (reflected off
or refracted through the object) to find out what the
object is reflecting or letting show through. Rays can
also be cast towards light sources to determine shadows.
Every primitive must provide some way to test itself
for intersection with a ray.
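The per-pixel loop described above can be sketched in a few
lines of Python. This is only a minimal illustration, not the
code of POV-Ray or Rayshade: the names hit_sphere and render,
the camera fixed at the origin looking down -z, and the
sphere-only scene are all assumptions made for the example.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the nearest positive ray parameter t, or None on a miss.

    direction is assumed to be unit length.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:  # ignore intersections behind the ray origin
            return t
    return None

def render(spheres, width, height):
    """Cast one eye ray per pixel; return rows of sphere names (or None).

    spheres is a list of (name, center, radius) tuples (hypothetical layout).
    """
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the pixel center onto an image plane at z = -1 spanning [-1, 1].
            px = (x + 0.5) / width * 2.0 - 1.0
            py = 1.0 - (y + 0.5) / height * 2.0
            length = math.sqrt(px * px + py * py + 1.0)
            direction = (px / length, py / length, -1.0 / length)
            # Keep the closest intersection: hidden surface removal is implicit.
            nearest_t, nearest_name = math.inf, None
            for name, center, radius in spheres:
                t = hit_sphere((0.0, 0.0, 0.0), direction, center, radius)
                if t is not None and t < nearest_t:
                    nearest_t, nearest_name = t, name
            row.append(nearest_name)
        image.append(row)
    return image
```

Reflection, refraction, and shadow rays would be added by
calling the same intersection search again from the hit point;
they are left out here to keep the sketch short.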
A Z or depth buffer may also be used to provide
quick redrawing effects after a rendering is completed,
although some raytracers forgo such a feature because
the main rendering task does not explicitly require
a Z buffer.
Scanline Rendering
Scanline rendering is the preferred method for
generating most computer graphics in motion pictures.
One particular implementation, REYES, is so popular
that it has become almost standard in that industry.
Scanline rendering is also the method used by
video games and most scientific/engineering visualization
software (usually via OpenGL). Scanline algorithms
have also been widely and cheaply implemented
in hardware.
In scanline rendering, drawing is accomplished by
iterating through component parts of scene geometry primitives.
If the number of output pixels remains constant, render
time tends to increase in linear proportion to the
number of primitives. OpenGL and PhotoRealistic RenderMan
are two examples of scanline rendering systems.
Before drawing, a Z or depth buffer containing as
many pixels as the output buffer is allocated
and initialized. The Z buffer is like a heightfield
facing the camera, and it keeps track of which
scene geometry part is closest to the camera,
making hidden surface removal easy. The Z buffer
may store additional per-pixel attributes, or
other buffers can be allocated to do this (more
on this below). Unless primitives are prearranged
in back-to-front painting order and do not present
pathological depth issues, a Z buffer is mandatory.
Each primitive is either already an easily drawable
part (usually a triangle) or can be divided up
(tessellated) into such parts. Triangles or polygons
small enough to fit within a single screen pixel are
called micropolygons, and represent the smallest
size a polygon needs to be for drawing. It is
sometimes desirable (but not absolutely necessary)
for polygons to be micropolygons; what matters
is how simply (and therefore quickly) a polygon
can be drawn.
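As a rough illustration of tessellating down to micropolygon
size, the sketch below splits a screen-space triangle at its
edge midpoints until every piece is at most one pixel across.
The names midpoint and dice, the max_size parameter, and the
midpoint-splitting strategy are assumptions for the example;
a production renderer would normally use its own dicing scheme.

```python
def midpoint(p, q):
    """Return the point halfway between two 2D points."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def dice(tri, max_size=1.0):
    """Split a 2D screen-space triangle into pieces whose bounding
    boxes are no larger than max_size pixels in either direction."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    if max(xs) - min(xs) <= max_size and max(ys) - min(ys) <= max_size:
        return [tri]  # already small enough to draw directly
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    pieces = []
    # Each of the four sub-triangles is half the size of its parent,
    # so the recursion always terminates.
    for sub in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        pieces.extend(dice(sub, max_size))
    return pieces
```

For example, a right triangle with 4-pixel legs is split twice
and yields 16 pieces. Raising max_size trades fewer, larger
polygons against more per-pixel interpolation work.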
Assigning color to output pixels using these polygons
is called rasterization. After figuring out which
screen pixel locations the corners of a polygon occupy,
the polygon is scan-converted into a series of
horizontal or vertical strips (usually horizontal).
As each scanline is stepped through pixel by
pixel (from one edge of the polygon to the other),
various attributes of the polygon are computed
so that each pixel can be colored properly.
These include surface normal, scene location,
z-buffer depth, and polygon s,t coordinates.
If the depth of a polygon pixel is nearer to
the camera than the value for the respective
screen pixel in the Z buffer, the Z buffer is
updated and the pixel is colored. Otherwise,
the polygon pixel is ignored and the next one
is tried.
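The scan conversion and depth test described above can be
sketched as follows. This is a simplified illustration under
stated assumptions: the function name draw_triangle, the
(x, y, z) vertex layout, sampling at pixel centers, and the
convention that a smaller z is nearer the camera are all
choices made for the example, and only depth is interpolated
(normals and s,t coordinates would be stepped the same way).

```python
import math

def draw_triangle(tri, color, zbuf, image):
    """Scan-convert a triangle of three (x, y, z) screen-space vertices
    into image, using zbuf for hidden surface removal."""
    height, width = len(zbuf), len(zbuf[0])
    y_min = max(0, math.ceil(min(v[1] for v in tri) - 0.5))
    y_max = min(height - 1, math.floor(max(v[1] for v in tri) - 0.5))
    for y in range(y_min, y_max + 1):
        yc = y + 0.5  # sample each scanline at the pixel centers
        crossings = []
        for i in range(3):
            (x0, y0, z0), (x1, y1, z1) = tri[i], tri[(i + 1) % 3]
            # Half-open test so a shared vertex is counted only once.
            if (y0 <= yc < y1) or (y1 <= yc < y0):
                t = (yc - y0) / (y1 - y0)
                crossings.append((x0 + t * (x1 - x0), z0 + t * (z1 - z0)))
        if len(crossings) != 2:
            continue  # this scanline does not cross the triangle
        crossings.sort()
        (xl, zl), (xr, zr) = crossings
        x_start = max(0, math.ceil(xl - 0.5))
        x_end = min(width - 1, math.floor(xr - 0.5))
        dz = (zr - zl) / (xr - xl) if xr > xl else 0.0
        z = zl + (x_start + 0.5 - xl) * dz
        for x in range(x_start, x_end + 1):
            if z < zbuf[y][x]:  # nearer than what is already stored
                zbuf[y][x] = z
                image[y][x] = color
            z += dz  # one addition per pixel steps the depth along the span
```

A typical use allocates both buffers at output resolution,
initializes the Z buffer to infinity, and then draws the
triangles in any order:

    zbuf = [[math.inf] * 4 for _ in range(4)]
    image = [[None] * 4 for _ in range(4)]
    draw_triangle(((0, 0, 5), (8, 0, 5), (0, 8, 5)), "A", zbuf, image)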
Technique Comparison
Each rendering method has its strengths and weaknesses.
Because the shortcomings of one approach tend to be
strengths in the other, some renderers, suitably named "hybrid renderers",
use both methods in an attempt to have few or no weaknesses.
Raytracers are good at:
-
Photorealistic features such as reflections, transparency,
multiple lights, shadows, area lights, etc. With only a little
work, these features pretty much "fall out" of the algorithm,
because rays are a good analogy for light paths, thereby
modeling the real-world properties of light.
-
Rendering images with very large amounts of scene geometry.
With a hierarchical bounding box tree data structure,
the cost of locating the objects to intersection-test grows
roughly with the logarithm of the number of primitives, much
like binary search in a sorted list of numbers. Because
only world-aligned (axis-aligned) boxes need to be
intersection-tested when searching the tree, searches
are fast relative to scene complexity.
-
Using different cameras. By simply altering how eye
rays are projected into the scene, one can easily imitate
the optical properties of many different lenses,
scene projections, and special lens distortions.
-
CSG. Constructive Solid Geometry modeling is easy
to support (todo: specifics).
-
Motion blur (todo: specifics).
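The bounding box search mentioned in the list above can be
sketched as a slab test plus a recursive tree walk. This is a
minimal illustration and not POV-Ray's implementation: the
names hits_box and candidates and the (lo, hi, children,
primitive) node layout are assumptions for the example, and
the logarithmic behavior depends on the tree being reasonably
balanced.

```python
import math

def hits_box(origin, direction, lo, hi):
    """Slab test: does the ray enter the axis-aligned box [lo, hi]?"""
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this pair of slabs: it must start between them.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False  # the slab intervals do not overlap
    return True

def candidates(node, origin, direction):
    """Collect primitives whose leaf boxes the ray enters.

    A miss on an inner box skips that whole subtree, which is where
    the savings over testing every primitive come from.
    """
    lo, hi, children, primitive = node
    if not hits_box(origin, direction, lo, hi):
        return []
    if primitive is not None:
        return [primitive]
    found = []
    for child in children:
        found.extend(candidates(child, origin, direction))
    return found
```

Only the primitives returned by candidates need the full (and
more expensive) ray-primitive intersection test.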
Scanline renderers are good at:
-
Drawing quickly if the final number of polygons is
under some threshold relative to the visibility
determination algorithm being used (BSP, octree, etc.).
By not searching for scene geometry for each pixel,
they just "hop to it" and start drawing.
-
Supporting displacement shaders. After splitting
a primitive into polygons or patches, the polygon
or patch can easily be subdivided further to
produce more geometry.
-
Maintaining CPU/GPU code and data cache coherency,
because the switching of textures and primitives
occurs less frequently.
-
Arbitrary generation of primitives/patches/polygons,
because they can be unloaded after being drawn.
This is useful when implementing, for example,
shaders that work by inserting additional geometry
on the fly.
-
Realtime rendering even without hardware support,
and realtime rendering of considerable model complexity
with hardware support.
-
Wireframe, pointcloud, and other diagnostic-style rendering.
What impedes raytracing performance:
-
Although each screen pixel need only be computed once, that
computation is expensive, and much of the cost is paid even for
pixels that are not intersected by any geometry. First, the projected eye
ray is determined. This costs at least 10 multiplies and 7 adds.
Next, the bounding slabs hierarchy is traversed. This requires
an optimized search with intersection testing of world-aligned
bounding boxes. Each box test costs two multiply-adds plus
some comparison logic. Then, when
the nearest bbox leaf node is found, the primitive inside is
tested. This costs at least 18 multiplies and 15 adds, because
the eye ray must be transformed into the primitive's local
coordinate system. Then the actual hit test is done; for
a sphere, we're looking at an extra 10 multiplies and 15 adds.
All the conditional logic inside these routines (and it can
be complex) impedes the CPU's branch prediction. There is
also the overhead of the slab machinery, which must maintain
state flags in rays, etc. If it takes five bbox tests during
the bounding slabs traversal, we've used 10 multiply-adds,
so the total for a sphere intersection for one screen pixel
would be 48 multiplies and 47 additions. So far, we have
not cast any secondary rays -- this computation load occurs
for each primary ray. Even without global illumination effects,
a raytracer would have to trace width x height pixels
in 1/24 second to perform in realtime. For a 640 x 480
display that means 14,745,600 multiplies and 14,438,400
additions per frame, or roughly 354 million multiplies and
347 million additions per second.
There's just so much computing going on. Considering that current chip fabrication
processes are hitting a wall, the necessary speed might not
be available for some time. Clearly, if raytracing is to
perform in realtime before Moore's Law can be unstalled,
a hardware assist (or massive parallelism) is necessary.
-
In scanlining, there are no computations required for
pixels that do not intersect geometry. The expensive operations are projecting
the primitive into eye space, tessellating
a primitive into polygons, projecting each polygon into screen space,
and computing per-polygon edge lists.
The larger a polygon is, the more pixels there are over which to spread
the cost of the per-polygon computation. Since the ratio of the perimeter length
of a polygon to its surface area decreases as the polygon
gets larger, the per-pixel cost can become very small, the
ultimate minimum being 4 multiplies and 5 adds (an optimized
interpolation to compute the pixel's 3D location and depth buffer
value plus a single addition to increment the pixel's X coordinate).
This benefit accrues particularly in preview rendering
and rendering of flat surfaces such as planes, boxes, triangles,
etc. Throw in the greater cache coherency and less
disruptive effects upon branch prediction, and it's apparent
that a scanliner can afford to suffer pixel overwrites
several times before a raytracer becomes competitive.
With efficient visibility determination, scanlining
is an order of magnitude ahead. For micropolygons,
edge lists and their per-pixel interpolations become
unnecessary, so a different set of computation costs applies
(todo: investigate this).
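The primary-ray arithmetic quoted above can be tallied in a
few lines. The operation counts are the ones given in the
text; the function names and the default of five bounding box
tests per pixel are just a convenient way to package them, and
each multiply-add is counted as one multiply plus one add.

```python
# (multiplies, adds) for each stage, as quoted in the text.
EYE_RAY = (10, 7)      # projecting the eye ray
BOX_TEST = (2, 2)      # one bounding box test = two multiply-adds
TRANSFORM = (18, 15)   # moving the ray into the primitive's local space
SPHERE_HIT = (10, 15)  # the sphere intersection test itself

def per_pixel_cost(box_tests=5):
    """Return (multiplies, adds) for one primary ray that hits a sphere."""
    stages = [
        EYE_RAY,
        (BOX_TEST[0] * box_tests, BOX_TEST[1] * box_tests),
        TRANSFORM,
        SPHERE_HIT,
    ]
    return sum(m for m, _ in stages), sum(a for _, a in stages)

def per_frame_cost(width=640, height=480, box_tests=5):
    """Return (multiplies, adds) for one frame of primary rays."""
    muls, adds = per_pixel_cost(box_tests)
    pixels = width * height
    return muls * pixels, adds * pixels
```

With the defaults, per_pixel_cost() gives (48, 47) and
per_frame_cost() gives (14745600, 14438400), matching the
figures above; multiplying by 24 gives the per-second load at
24 frames per second.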