The following is an excerpt from the Daylon Leveller documentation.
Written by Ray Gardener.
Last updated: May 20, 2003


About Heightfields

The Basics
How Heightfields are Rendered
How Sample Heights are Determined
Heightfield Pros and Cons
UV Displacement Mapping of Heightfields
Texturing Issues
Gridding in Linearly Sampled Heightfields
Efficient Rendering of Heightfields






The Basics

A heightfield is simple: it's a set of numbers arranged so that they form a two-dimensional grid, like a bitmap, except each number represents a ground elevation instead of a color. Most of the stress over heightfields comes from confusing them with bitmaps. This is not surprising, since the two are conceptually similar, and bitmap files are often used to store heightfields.

Heightfields are also sometimes called terrain objects, although this is not quite as accurate, since many types of objects can be used to model terrain.

Essentially, a heightfield has two parts: a grid of height values and a water level.

A water level is a hint to the renderer that height values below the level will be obscured (usually by a plane representing water, hence the name) and can be skipped when rendering the final image. A water level is optional, but it's worth using for just this reason.

 

How Heightfields are Rendered

Most renderers will display an untransformed heightfield so that it lies flat in the scene like a floor plane. The height values deform the heightfield so that it is stretched up and/or down. In Leveller, as in POV-Ray, the Y-axis is the vertical axis by default, so height values are measured along this axis.

The renderer determines the location of each 3D point in a heightfield by simply iterating over its width and breadth, and sampling the height value at each grid location. If the renderer samples the heightfield more finely than the grid itself, height values that lie between adjacent samples can be approximated by averaging the neighbouring samples.
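
As a minimal sketch of the in-between case, the Python function below approximates a height at a fractional grid position by bilinear interpolation of the four surrounding samples. The name sample_height and the row-major grid[z][x] layout are assumptions made for illustration; a particular renderer may well interpolate differently.

```python
def sample_height(grid, x, z):
    """Bilinearly interpolate a height at fractional grid position (x, z).

    grid is a list of rows (an assumed layout); grid[z][x] is a stored sample.
    """
    rows, cols = len(grid), len(grid[0])
    # Clamp the query so it stays inside the grid.
    x = min(max(x, 0.0), cols - 1)
    z = min(max(z, 0.0), rows - 1)
    x0, z0 = int(x), int(z)
    x1, z1 = min(x0 + 1, cols - 1), min(z0 + 1, rows - 1)
    fx, fz = x - x0, z - z0
    # Blend along X on the two bracketing rows, then blend those along Z.
    near = grid[z0][x0] * (1 - fx) + grid[z0][x1] * fx
    far = grid[z1][x0] * (1 - fx) + grid[z1][x1] * fx
    return near * (1 - fz) + far * fz
```

Querying exactly on a grid location returns the stored sample unchanged, so the function can be used for both on-grid and off-grid lookups.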

The renderer will usually create a polygon mesh to describe the heightfield as a set of connected triangles. The samples in the heightfield determine the vertical coordinate of the triangles' vertices, while the current sample location determines the vertices' position in the other two dimensions. The normal of a particular vertex is the average of the plane normals of the surrounding triangles. With the triangle vertices and normals computed, the renderer can draw the heightfield.
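
The steps above can be sketched in Python as follows. This is only an illustration under assumed conventions (Y is vertical, samples are one unit apart, grid[z][x] holds the heights); the function names are hypothetical and no particular renderer is being described.

```python
import math


def build_mesh(grid):
    """Turn a grid of heights (grid[z][x]) into vertices and triangles."""
    rows, cols = len(grid), len(grid[0])
    # X and Z come from the sample location, Y from the sample itself.
    vertices = [(float(x), float(grid[z][x]), float(z))
                for z in range(rows) for x in range(cols)]
    triangles = []
    for z in range(rows - 1):
        for x in range(cols - 1):
            i = z * cols + x
            # Two triangles per grid cell, wound so their normals face +Y.
            triangles.append((i, i + cols, i + 1))
            triangles.append((i + 1, i + cols, i + cols + 1))
    return vertices, triangles


def face_normal(a, b, c):
    """Unit normal of the plane through points a, b and c."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)


def vertex_normals(vertices, triangles):
    """Average the normals of the triangles surrounding each vertex."""
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    for tri in triangles:
        n = face_normal(*(vertices[i] for i in tri))
        for i in tri:
            for k in range(3):
                sums[i][k] += n[k]
    result = []
    for s in sums:
        length = math.sqrt(s[0] ** 2 + s[1] ** 2 + s[2] ** 2)
        result.append((s[0] / length, s[1] / length, s[2] / length))
    return result
```

A grid of W by H samples yields 2 x (W-1) x (H-1) triangles, which is why large heightfields are expensive to draw without some form of simplification.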

Smoother heightfields can be rendered by using nonlinear sampling. Instead of simply connecting adjacent elevation points into triangles, the points are used to compute spline patches. The slope and surface normal of the heightfield vary continuously and smoothly, although more computation must be done to determine the patches. RenderMan-compliant renderers, for example, support bicubic PatchMesh objects.

In POV-Ray (and most other renderers), heightfields are described in their own coordinate system, which spans from the point (0,0,0) to the point (1,1,1). RenderMan "Pz" PatchMesh objects (those with height values only) are similarly described. The final size and orientation of the heightfield are determined by transforming space. Leveller, for example, will add scale and translate commands to make the heightfield occupy the coordinates in POV-Ray world space that the user expects.

 

How Sample Heights are Determined

For efficiency, each height value in a heightfield is described using some multiple of eight bits. POV-Ray, for example, supports both 8-bit and 16-bit heightfields. Leveller internally uses 32 bits when saving heightfields in its own TER format. USGS DEM files use human-readable text representations for height values, so their magnitude and precision depend to some extent on the system exporting and importing the file.

The more bits per sample, the smoother the terrain can look. With eight bits, there can be 256 distinct elevations in the heightfield. With sixteen bits, there can be up to 65,536. This range is sometimes called the heightfield's vertical resolution. In practice, eight bits tends to be insufficient -- some such heightfields suffer from a slightly terraced look.
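
To make the effect of vertical resolution concrete, the short Python sketch below computes the smallest height difference that a given bit depth can represent over a given elevation range. The function name and the 1000-unit range used in the usage note are illustrative assumptions.

```python
def vertical_step(height_range, bits):
    """Smallest representable height difference for a given bit depth."""
    levels = 2 ** bits          # 256 for 8 bits, 65,536 for 16 bits
    # The lowest and highest levels sit at the ends of the range,
    # so there are (levels - 1) steps between them.
    return height_range / (levels - 1)
```

For a terrain spanning 1000 units of elevation, vertical_step(1000, 8) is roughly 3.9 units per step, which is coarse enough to show up as terracing on gentle slopes, while vertical_step(1000, 16) is roughly 0.015 units.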

The lowest and highest points in the heightfield are mapped to the vertical coordinate values of 0 and 1, respectively. Two untransformed heightfields will be the same thickness regardless of their component sample values, since both will be mapped into the 0...1 vertical space. If space has been scaled vertically tenfold, then the heightfields will be mapped into a 0...10 vertical space.
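
The mapping described above can be sketched as follows. This is a simplified illustration, not any renderer's actual code; normalize_heights and its vertical_scale parameter are hypothetical names.

```python
def normalize_heights(samples, vertical_scale=1.0):
    """Map raw samples so the lowest becomes 0 and the highest vertical_scale."""
    lowest, highest = min(samples), max(samples)
    span = highest - lowest
    if span == 0:
        # A perfectly flat heightfield has no thickness at all.
        return [0.0 for _ in samples]
    return [(s - lowest) / span * vertical_scale for s in samples]
```

Both normalize_heights([100, 150, 200]) and normalize_heights([0, 5, 10]) give [0.0, 0.5, 1.0], showing that two untransformed heightfields end up the same thickness whatever their raw sample values were. Passing vertical_scale=10 corresponds to scaling space vertically tenfold.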

People often wonder how the color of a pixel in a bitmap relates to the height of a heightfield sample. The simple answer is, it doesn't. Although it's certainly useful to arrange a bitmap so that certain colors represent certain heights, this is merely convention on the user's part and has no bearing on how the renderer interprets the heightfield's sample values.

Take an 8-bit bitmap, for example. Each pixel is an 8-bit value ranging from 0 to 255 that refers to a palette of colors. A pixel with a value of twenty refers to the color stored in palette entry twenty. That color could be anything: black, red, yellow, whatever. All the renderer cares about is that the pixel has a value of twenty, and therefore a height value of twenty.

 

Heightfield Pros and Cons

Most scenes requiring terrain will use a heightfield. This is because a heightfield is the most efficient way to describe the numerous tiny undulations of a typical landscape at design time. When a heightfield's design is completed, it can be converted to a more efficient polygon mesh, but not all heightfields benefit from this conversion (e.g., those containing only a few connected areas of similar slope).

But heightfields have their limits. Because specific width and breadth information is omitted (leaving only height values), each height value must occupy a unique 2D location when viewed from above. The important corollary to this is that no 2D location in the heightfield can have more than one height value associated with it. This effectively rules out using heightfields to model rocks, asteroids, bridges, caves, or terrain with slopes equalling or exceeding ninety degrees. Some objects, however, can be convincingly modelled with two or more heightfields carefully placed. This limitation does not apply to mapped heightfields (see next topic).

Although this fact makes heightfields unsuitable for many types of objects, it also makes heightfields very easy to work with, since the height values can effectively be treated using standard bitmap-editing interfaces that work with 2D images. Formally, a heightfield has a well-defined 2D topology: each 2D coordinate refers to a unique location on the heightfield.

This feature also makes it easy to paint heightfields by mapping bitmaps onto them. All that's required is to project the bitmap onto the heightfield's groundplane, and every point on the heightfield will be textured.
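
A minimal sketch of such a groundplane projection is shown below. It assumes the bitmap is stretched to cover the whole heightfield and that both are stored as lists of rows; the function names are hypothetical.

```python
def planar_uv(x, z, cols, rows):
    """Texture coordinates (0..1) for the sample at grid location (x, z)."""
    # Height plays no part: the bitmap is projected straight down.
    return (x / (cols - 1), z / (rows - 1))


def texel_for_sample(image, x, z, cols, rows):
    """Nearest bitmap pixel covering the heightfield sample at (x, z)."""
    u, v = planar_uv(x, z, cols, rows)
    px = int(u * (len(image[0]) - 1) + 0.5)
    py = int(v * (len(image) - 1) + 0.5)
    return image[py][px]
```

Because the sample's elevation never enters the calculation, every point on the heightfield receives exactly one texture coordinate.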

 

UV Displacement Mapping of Heightfields

Since heightfields have a well-defined 2D topology, they are good candidates for UV mapping onto other 3D objects. A Mercator projection of a planet's terrain, for example, maps nicely onto a sphere to form a bumpy world. By converting mapped heightfields into polygon meshes, we can use heightfields to model otherwise unavailable objects, such as rocks and caves. In fact, all we are doing is using a heightfield as a texture map.

Some mappings are non-invertible or are otherwise highly distorted. A sphere is both: all the points in a heightfield's first row map to a sphere's north pole. It's not possible to derive a unique first-row heightfield coordinate from a pole.

A renderer can save memory by creating the final mesh at runtime. To do this, it must have the heightfield and also some information describing the object to be displaced. In the case of basic shapes, this information is quite small. A sphere, for example, is simply a radius about some center.

Basic shapes like spheres tend to have well-defined surface normals, which make displacement mapping easy since the heightfield elevations must displace mesh vertices along these normal vectors. They also tend to have well-defined UV mappings: a sphere uses longitude and latitude. Complex or arbitrary shapes are more difficult: usually, an artist must define an explicit UV map saying which UV coordinates relate to which object locations, and some or all surface normals may also have to be explicitly defined.
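
For the sphere case, the displacement can be sketched in a few lines of Python. The conventions here are assumptions chosen for illustration: U runs along longitude, V runs from the north pole (0) to the south pole (1), and Y is the vertical axis.

```python
import math


def displace_on_sphere(u, v, height, radius=1.0, center=(0.0, 0.0, 0.0)):
    """Position of heightfield sample (u, v) after mapping onto a sphere."""
    longitude = (u - 0.5) * 2.0 * math.pi   # -pi .. +pi
    latitude = (0.5 - v) * math.pi          # +pi/2 (north) .. -pi/2 (south)
    # On a sphere, the surface normal is just the direction from the center.
    normal = (math.cos(latitude) * math.cos(longitude),
              math.sin(latitude),
              math.cos(latitude) * math.sin(longitude))
    # Push the surface point outward along its normal by the sample height.
    distance = radius + height
    return tuple(c + distance * n for c, n in zip(center, normal))
```

Every sample in the first row (v = 0) lands on the same point at the north pole regardless of u, which is the non-invertibility mentioned earlier.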

 

Texturing Issues

Such straightforward mapping of an image to a heightfield, however, is problematic when slopes in the terrain become significantly greater than about thirty or forty degrees. The distance between adjacent height values becomes much greater in three dimensions than it is in two, and the pixels in the image are noticeably stretched in order to provide coverage.
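
The amount of stretching follows from simple trigonometry: a ground distance d covers a surface distance of d / cos(slope). The sketch below works this out; the function names are illustrative only.

```python
import math


def texture_stretch(slope_degrees):
    """How much a projected pixel is stretched along a slope (1.0 = none)."""
    return 1.0 / math.cos(math.radians(slope_degrees))


def stretch_between(ground_distance, height_difference):
    """The same factor, computed from two adjacent height samples."""
    return math.hypot(ground_distance, height_difference) / ground_distance
```

At thirty degrees the stretch is about 1.15 and at forty degrees about 1.31, but at sixty degrees each pixel is stretched to twice its length, and the factor grows without bound as the slope approaches ninety degrees.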

There are only five solutions to this problem that we know of:

 

Gridding in Linearly Sampled Heightfields

Another important problem with triangle-mesh (i.e., linearly sampled) heightfields is what's commonly referred to as gridding, i.e., the blocky or aliased appearance of slopes, usually those nearest to the viewer. The problem tends to be even worse in areas of significant change in slope, where the fact that the heightfield is composed of discrete samples becomes particularly obvious.

Phong shading is commonly employed to overcome gridding. Each component triangle is lit by interpolating the normals of its vertices across its surface, and using the result to determine the amount of reflected light at each point. This approach fails, however, when the triangles are significantly larger than the pixels of the device being rendered to.
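
A minimal sketch of the normal interpolation is given below. It assumes the point being shaded is described by barycentric weights (w0, w1, w2) that sum to one, and it uses a simple diffuse term as a stand-in for whatever lighting model the renderer actually applies; both function names are hypothetical.

```python
import math


def interpolate_normal(n0, n1, n2, w0, w1, w2):
    """Blend three vertex normals at a point inside the triangle."""
    x = n0[0] * w0 + n1[0] * w1 + n2[0] * w2
    y = n0[1] * w0 + n1[1] * w1 + n2[1] * w2
    z = n0[2] * w0 + n1[2] * w1 + n2[2] * w2
    # The blend of unit vectors is generally shorter than unit length.
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


def diffuse(normal, to_light):
    """Fraction of light reflected diffusely (both vectors unit length)."""
    dot = sum(n * l for n, l in zip(normal, to_light))
    return max(0.0, dot)
```

Because the normal changes smoothly from pixel to pixel, the lighting hides the flat facets, although the silhouette of the mesh remains as angular as before.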

The inherent problem is that the heightfield really is composed of discrete samples, and any digital image viewed close enough will exhibit such 'jaggies'. The ultimate solution, therefore, is to increase the heightfield's resolution. A good rule of thumb when choosing a resolution is to base it on the expected highest resolution of the device the heightfield will be rendered to.

One also needs to take into account how closely the heightfield will be viewed. It does little good to use a high resolution if the viewer is so close that each sample point is many device pixels apart. In such cases, the rest of the heightfield will not be visible (because it will be outside the field of view), and it becomes practical to use a separate, smaller heightfield for the 'close-up shot'.

Another popular solution is to use nonlinear sampling. RenderMan PatchMesh objects, for example, treat elevations as node points in spline curves, forming smooth spline patches instead of flat triangles. However, since triangles must be fed to the renderer for rasterization, the spline patches must be tessellated, slowing things down.

A similar technique (used by Terragen and World Construction Set) is to use fractal subdivision. This breaks up each heightfield triangle into more and smaller triangles. This not only gives the illusion of greater surface detail, but for terrain images, the resulting surfaces often look convincingly like rock. Again, this slows rendering down, because of the tessellation.
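
The general idea can be illustrated with midpoint displacement on a one-dimensional height profile. This is a simplified analogy only: it is not the algorithm used by Terragen or World Construction Set, and the function names, roughness parameter and seed are assumptions made for the sketch.

```python
import random


def subdivide_profile(heights, roughness, rng):
    """Insert a randomly displaced midpoint between each pair of samples."""
    result = []
    for a, b in zip(heights, heights[1:]):
        result.append(a)
        result.append((a + b) / 2 + rng.uniform(-roughness, roughness))
    result.append(heights[-1])
    return result


def fractal_detail(heights, levels, roughness=1.0, seed=0):
    """Subdivide repeatedly, halving the displacement at each level."""
    rng = random.Random(seed)   # seeded so the added detail is repeatable
    for _ in range(levels):
        heights = subdivide_profile(heights, roughness, rng)
        roughness /= 2
    return heights
```

The original samples are left untouched and only the in-between points are invented, which is why the large-scale shape of the terrain is preserved while the surface gains a rougher texture.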

Both of the preceding methods are variations on displacement shading, which creates more geometry than actually exists. Unlike bump mapping, which gives the illusion of wrinkled surfaces, displacement shading actually makes extra triangles for the rasterizer to draw. The thing to note is that you only have full control over the elevations in the heightfield -- any extra geometry is being created dynamically by the renderer, and thus can only be indirectly controlled using settings that apply (or tend to apply) globally to all the heightfield pixels.

The fractal subdivision technique, for example, looks good for rocky surfaces but makes sand and snow regions look too hard. The workaround is normally some kind of material map, which is usually an auxiliary texture map that says what material (snow, ice, rock, grass, mud, etc.) each heightfield pixel is covered with. Assuming the renderer has tessellators/shaders for each material, the resulting rendition can look very good. The best renderers are able to blend their material shaders smoothly, so that the material in adjacent heightfield cells transitions evenly (and logically) between coverings. In some older games (and even some new ones) you can see paved roads exhibiting jagged edges. In Bungie's combat simulator Halo, by contrast, you see very good transitions.

 

Efficient Rendering of Heightfields

Much work has been expended (and continues to be expended) on drawing heightfields quickly. For the person needing interactive (or simply prompt) display of heightfields, there is almost an embarrassment of riches in terms of rendering technique. They all pretty much emphasize a two-step program: drawing only that which is truly visible (visibility determination) and level-of-detail management (LOD rendering).

LOD management exploits the fact that heightfield cells which are farther from the eye occupy fewer screen pixels when drawn. If you approximate several distant cells with just one that covers them all, the illusion often holds (especially for action games where the camera is often in motion). Although cameras in flight simulators often have long steady moments, such programs tend to use realistic terrain which is mostly flat, and LOD techniques work well for such heightfields.

The classic visibility determination algorithm is the quadtree. This simply builds up a hierarchical set of bounding boxes where the heightfield's overall bounding box is recursively subdivided into four smaller boxes, over and over again, until the smallest boxes each contain only four cells. If you know that a given box isn't visible, then you know that all of its child boxes (and therefore the heightfield cells within them) aren't visible either. You simply test the visibility of the largest box and recurse through its children, skipping over the invisible ones. In the worst-case scenario, all of the boxes intersect the viewing frustum, but if you employ limited visibility (e.g., a fogged horizon), then this never happens.
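
The recursion can be sketched as below. For brevity the boxes are computed on the fly rather than stored, they are two-dimensional (a real implementation would also track each box's minimum and maximum height), and the visibility test is passed in as a function. All names, including the within_radius helper standing in for a fogged horizon, are hypothetical.

```python
import math


def visible_boxes(x0, z0, size, is_visible, min_size=2):
    """Return the smallest boxes (x, z, size) that pass the visibility test."""
    if not is_visible(x0, z0, size):
        return []                      # the whole subtree is skipped
    if size <= min_size:
        return [(x0, z0, size)]        # a 2x2 box holds four cells
    half = size // 2
    found = []
    for dz in (0, half):
        for dx in (0, half):
            found += visible_boxes(x0 + dx, z0 + dz, half, is_visible, min_size)
    return found


def within_radius(camera_x, camera_z, radius):
    """Visibility test for limited visibility: is any part of the box near?"""
    def test(x0, z0, size):
        # Nearest point of the box to the camera.
        nx = min(max(camera_x, x0), x0 + size)
        nz = min(max(camera_z, z0), z0 + size)
        return math.hypot(nx - camera_x, nz - camera_z) <= radius
    return test
```

With an 8x8 heightfield and a camera at one corner that can see only one unit, visible_boxes(0, 0, 8, within_radius(0, 0, 1)) returns the single box (0, 0, 2) after testing only nine boxes, instead of visiting all sixteen of the smallest boxes.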

Quadtrees also make LOD management possible. If you need to draw the cells in a distant box, you can opt to draw just a few polygons representing the box's corners instead.

The downside with quadtrees is that they take an enormous amount of memory, often some multiple of the heightfield itself. Quadtree LOD rendering also suffers from noticeable "popping": triangles abruptly changing size and position as you move nearer and farther from an approximated set of cells. Poor quadtree LOD implementations also suffer from gapping, where the edges of adjacent triangles fail to meet correctly.

Other methods offer adjustments to quadtree rendering or take different approaches altogether. The goals can be varied: some focus on the reduction or elimination of popping; others try to keep memory usage down. I haven't yet seen a method that works perfectly in all situations, because the demands of each often produce conflicting solutions.

The changing hardware landscape also makes some approaches more reasonable. With sufficient rasterizing speed, for example, LOD optimizations aren't needed -- it's more efficient to just send all the triangles to the screen. On the other hand, that may not be true if you want to draw something else that's very complex, such as finely detailed characters or buildings.

Another common approach is to not use the heightfield data directly. Instead, the heightfield is converted to an irregular polygon mesh where areas of similar slope can be safely replaced with large triangles. The main drawback is that it's hard (or impossible) to do any dynamic surface effects (like the Thumper weapon in Activision's Battlezone, which sends a visible seismic wave rippling through the ground). The main benefit is that the mesh often has a memory footprint orders of magnitude smaller than the heightfield.

The ROAM (Real-time Optimally Adapting Meshes) technique tries to tessellate the heightfield into the minimal number of triangles for the current camera location. It was developed by Mark Duchaineau et al. at the Los Alamos National Laboratory.

Using a precomputed bintree (each triangle recursively refers to two smaller triangles), and dynamic split-and-merge operations, ROAM can automatically split triangles into smaller ones as they come closer and vice versa. Since the only time triangles are merged and split is when the camera moves, performance remains constant despite heightfield size (although this would be true of any good automatic LOD scheme). The triangles are always right triangles, which makes the bintree fairly easy to compute, and the split/merge operations are likewise easier to perform. Gapping is also easier to deal with: if you split a triangle, there's only one neighbouring triangle to consider, and splitting it so that the two fit together properly is straightforward. Memory usage is still pretty high, but popping is somewhat easier to control, and it's even possible to support dynamic terrain changes.
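
The bintree split itself is easy to sketch. The Python below is only an illustration of the recursive right-triangle subdivision, not an implementation of ROAM: it omits the split and merge queues, the error metrics and the neighbour handling that prevents gaps. Triangles are (apex, left, right) tuples of 2D ground positions with the right angle at the apex; the function names and the detail parameter are assumptions.

```python
import math


def split(triangle):
    """Split a right isosceles triangle at the midpoint of its hypotenuse."""
    apex, left, right = triangle
    mid = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    # Both children are right isosceles triangles with their apex at mid.
    return (mid, apex, left), (mid, right, apex)


def refine(triangle, camera, detail=1.0, max_depth=8, depth=0):
    """Split triangles that look large from the camera; keep distant ones."""
    apex, left, right = triangle
    cx = (apex[0] + left[0] + right[0]) / 3
    cz = (apex[1] + left[1] + right[1]) / 3
    distance = math.hypot(cx - camera[0], cz - camera[1])
    size = math.hypot(right[0] - left[0], right[1] - left[1])
    if depth >= max_depth or size <= detail * distance:
        return [triangle]
    first, second = split(triangle)
    return (refine(first, camera, detail, max_depth, depth + 1) +
            refine(second, camera, detail, max_depth, depth + 1))
```

A distant camera leaves the root triangle whole, while a nearby one produces many small triangles close to the camera and progressively larger ones farther away.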

For landscape views that can assume certain fixed properties of the viewpoint, voxel rendering is popular. The FLY! terrain viewer, for example, can render textured terrain in real time by never allowing the camera to roll or bank. The rendering algorithm therefore knows that each elevation maps to a specific set of screen pixel columns.

 


Copyright 1998-2003 Daylon Graphics Ltd. All Rights Reserved.