Real-Time Voxel Coloring

Investigators: Andrew Prock, Chuck Dyer

Related Publications:
Photorealistic Scene Reconstruction by Voxel Coloring (Seitz, Dyer) --- in Proc. CVPR 97
Towards Real-Time Voxel Coloring (Prock, Dyer) --- in Proc. IUW 98

Source Code

The source code is available for public consumption. It was originally developed for the SGI O2 platform, and I make no guarantees about how easy it will be to port to other platforms. A Makefile is included that may be able to build the non-OpenGL executables on Solaris.

Note: the documentation is horrendous.


Abstract from "Towards Real-Time Voxel Coloring":

"Techniques for constructing three-dimensional scene models from two-dimensional images are often slow and unsuitable for interactive, real-time applications. In this research we explore three methods of enhancing the performance of the voxel coloring reconstruction method. The first approach uses texture mapping to leverage hardware acceleration. The second approach uses spatial coherence and a coarse-to-fine strategy to focus computation on the filled parts of scene space. Finally, the multi-resolution method is extended over time to enhance performance for dynamic scenes."


The original Voxel Coloring algorithm was designed for static scenes, so producing output quickly was not a priority. To use Voxel Coloring with dynamic scenes, several key elements of the original approach have been changed. Algorithmically, Voxel Coloring has been recast to take advantage of spatial and temporal coherence. Experimentally, the amount of input data has been reduced significantly. Together, these changes allow us to contemplate implementing Voxel Coloring in real time on modern desktop workstations.
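
For reference, the core operation in Voxel Coloring (Seitz and Dyer, CVPR 97) is a per-voxel photo-consistency test: a voxel is colored only if the pixels it projects to agree in color across the views that can see it. Below is a minimal sketch of such a test in C++, with agreement measured by thresholding the standard deviation of the color samples; the threshold, the RGB representation, and all names are illustrative assumptions, not the project's code.

    #include <cmath>
    #include <vector>

    struct RGB { float r, g, b; };

    // Returns true if the color samples gathered from the unoccluded
    // views agree closely enough to color the voxel. 'threshold' stands
    // in for the consistency test's cutoff (an assumed parameter).
    bool consistent(const std::vector<RGB>& samples, float threshold) {
        if (samples.size() < 2) return true;   // one view cannot disagree
        RGB mean = {0.0f, 0.0f, 0.0f};
        for (const RGB& s : samples) {
            mean.r += s.r; mean.g += s.g; mean.b += s.b;
        }
        const float n = static_cast<float>(samples.size());
        mean.r /= n; mean.g /= n; mean.b /= n;
        float var = 0.0f;
        for (const RGB& s : samples) {
            var += (s.r - mean.r) * (s.r - mean.r)
                 + (s.g - mean.g) * (s.g - mean.g)
                 + (s.b - mean.b) * (s.b - mean.b);
        }
        return std::sqrt(var / n) < threshold; // color the voxel if true
    }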

The most significant change has been the development of a coarse-to-fine, multiresolution approach to Voxel Coloring, which dramatically improves performance.

Coarse-To-Fine Processing

Octree methods are common in coarse-to-fine processing of volumes. We use a similar technique for voxel coloring. By decomposing large voxels into smaller voxels, and then coloring the subdivided set of voxels, computation can be focused on the significant portions of the scene.
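
As a sketch of the decomposition step (the types and names here are illustrative, not taken from the released source), each voxel retained at one level is split into eight children at twice the resolution:

    #include <vector>

    struct Voxel {
        int x, y, z;   // integer coordinates at this resolution level
        int level;     // 0 = coarsest
    };

    // Split one voxel into its eight children one level finer.
    std::vector<Voxel> subdivide(const Voxel& v) {
        std::vector<Voxel> children;
        children.reserve(8);
        for (int dx = 0; dx < 2; ++dx)
            for (int dy = 0; dy < 2; ++dy)
                for (int dz = 0; dz < 2; ++dz)
                    children.push_back({2 * v.x + dx, 2 * v.y + dy,
                                        2 * v.z + dz, v.level + 1});
        return children;
    }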

Because Voxel Coloring depends on statistical methods, a direct decomposition does not work correctly. Voxels that appear empty at one resolution may contain subvoxels that should be colored at a higher resolution. This problem is illustrated in the figure below:

To compensate for the abundance of false negatives, the algorithm performs a nearest-neighbor search to augment the set of voxels already colored. Such a search relies on a high degree of spatial coherence: in particular, it is assumed that all small details (high spatial frequency) occur close to details that can be detected at coarse resolution. The figure below illustrates the augmentation process through one iteration.
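
In code, one refinement-plus-augmentation step might look like the following sketch, which leans on that spatial-coherence assumption: every colored voxel and each of its 26 neighbors is subdivided into eight children, and the consistency test is then re-run over the resulting candidate set. The container and names are illustrative.

    #include <set>
    #include <tuple>

    using Key = std::tuple<int, int, int>;
    using VoxelSet = std::set<Key>;

    // Given the voxels colored at the previous (coarser) level, produce
    // the candidate set for the next level: all children of the colored
    // voxels plus all children of their 26-connected neighbors.
    VoxelSet refineAndAugment(const VoxelSet& colored) {
        VoxelSet candidates;
        for (const Key& k : colored) {
            auto [x, y, z] = k;
            // Visit the voxel itself and its 26 coarse-level neighbors.
            for (int dx = -1; dx <= 1; ++dx)
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dz = -1; dz <= 1; ++dz)
                        // Subdivide each into 8 fine-level children.
                        for (int cx = 0; cx < 2; ++cx)
                            for (int cy = 0; cy < 2; ++cy)
                                for (int cz = 0; cz < 2; ++cz)
                                    candidates.insert({2 * (x + dx) + cx,
                                                       2 * (y + dy) + cy,
                                                       2 * (z + dz) + cz});
        }
        return candidates;  // re-run the consistency test over this set
    }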

Depending on the final resolution of the scene, speedups can vary from 2 to 40 times that of the original algorithm; the higher the final resolution, the greater the speedup. The scene being colored consisted of eight images placed radially around the subject. The image resolution was 640x480, and the images were manually segmented.
The figure on the right illustrates the coarse-to-fine process as resolution increases from 32x32x32 to 256x256x256, doubling at each iteration. The chart on the left illustrates the performance gain from coarse-to-fine processing, and also includes timings for prewarping the images to eliminate radial distortion.

Dynamic Scene Processing Experiment

To investigate dynamic scene processing, a staging area was built for data capture. Four cameras were used to collect the data; each was mounted at a corner of the area and calibrated using the planar implementation of Tsai's algorithm. The walls of the staging area were then covered with blue matte paper to facilitate automatic segmentation. Scene reconstructions were created serially off-line and redisplayed as a 3D movie. Finally, a version of Voxel Coloring that takes advantage of temporal coherence was developed. The whole process is illustrated in the mpegs below. Dynamic Voxel Coloring is described below.
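
As an aside, a blue matte background reduces segmentation to little more than a per-pixel color test. A minimal sketch follows; the blue-dominance rule and the thresholds are assumptions, not the project's actual segmentation code.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Pixel { std::uint8_t r, g, b; };

    // Mark a pixel as background when blue clearly dominates red and
    // green (assumed rule; real thresholds would be tuned to the paper).
    bool isBackground(const Pixel& p) {
        return p.b > 96 && p.b > p.r + 32 && p.b > p.g + 32;
    }

    // Build a foreground mask for one frame stored row-major.
    std::vector<bool> segment(const std::vector<Pixel>& frame) {
        std::vector<bool> foreground(frame.size());
        for (std::size_t i = 0; i < frame.size(); ++i)
            foreground[i] = !isBackground(frame[i]);
        return foreground;
    }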

The MPEGs

The input sequence was made up of 16 frames taken from four cameras. The cameras were placed near the ceiling in the four corners of the room, and the background was covered with blue matte paper to facilitate segmentation. The resolution of the input images was 320x240. This mpeg shows the stream of input from each camera in turn.
Voxel Coloring makes use of occlusion information that is built up during scene traversal. This mpeg shows how a particular image is projected onto each layer in turn. Note that occluded pixels in the image are colored black as the mpeg progresses; these pixels have been accounted for in scene space and cannot contribute further to the reconstruction (a sketch of this bookkeeping appears after these descriptions).
Static colorings were then created from each of the 16 sets of input frames. This mpeg shows the static coloring at a scene resolution of 256x256x256 voxels. Notice that the reconstruction is hollow. Also note that photorealism is limited by the resolution of the input images (for example, the head was roughly 15 pixels high).
This mpeg displays the colorings in sequence. The motion of the subject over time adds a great deal of realism to the visualization. (smaller movie)
The final mpeg shows the voxel coloring algorithm running on a sequence of frames while displaying the output interactively. Each 3D frame is reconstructed in about half a second on an SGI O2 R5000. To achieve this real-time performance, the scene resolution was reduced to 64x64x64 voxels.
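
For the curious, here is a sketch of the per-image occlusion bookkeeping shown in the second mpeg above. Each image carries a per-pixel mask; a pixel is marked once it contributes to a colored voxel, and marked pixels are skipped on all later, more distant layers (these are the pixels drawn in black). The class and the sweep outline are illustrative, not the project's API.

    #include <vector>

    // One occlusion mask per input image.
    class OcclusionMask {
    public:
        OcclusionMask(int w, int h) : width_(w), used_(w * h, false) {}
        bool used(int x, int y) const { return used_[y * width_ + x]; }
        void mark(int x, int y)       { used_[y * width_ + x] = true; }
    private:
        int width_;
        std::vector<bool> used_;
    };

    // Sweep outline (as comments): layers are visited near to far, so a
    // pixel is always claimed by the closest consistent voxel first.
    //
    //   for each layer L, nearest to farthest:
    //     for each voxel V in L:
    //       gather the colors of V's unmarked footprint pixels per view;
    //       if the samples pass the consistency test:
    //         color V with their mean and mark those pixels in each mask;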

Dynamic Voxel Coloring

Coming soon.