Helmholtz Stereopsis: CS766 Project

The project presentation slides can be found here.

1. MOTIVATION

The problem of surface reconstruction from a collection of images has been well studied in computer vision. Most algorithms that address it rely on strong assumptions about the scene. For example, conventional stereo assumes that the scene has enough texture to establish correspondences, and photometric stereo assumes that all objects in the scene are Lambertian. These assumptions limit their usage in many practical scenarios.

Helmholtz Stereopsis is an algorithm for surface reconstruction that requires no a priori knowledge of the reflectance properties of the scene, thereby enabling surface reconstruction for arbitrary materials with unknown BRDFs (bidirectional reflectance distribution functions). The method exploits the symmetry of surface reflectance.

2. APPROACH

In this section, we briefly describe the Helmholtz Stereopsis algorithm.

Helmholtz Stereopsis is an active method for surface reconstruction: the camera and source positions are manipulated to acquire different images of the scene. The method uses the Helmholtz reciprocity constraint, which states that, for any two directions v_i and v_r, BRDF(v_i, v_r) = BRDF(v_r, v_i). In other words, the BRDF value does not change when the lighting and viewing directions are interchanged. To exploit this property, reciprocal pairs of images are captured.

Reciprocal pair: a pair of images captured by interchanging the camera and source positions.

Image-1: Camera is at O_r , and the scene point is lit by the source at O_i
Image-2: Camera is at O_i , and the scene point is lit by the source at O_r
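
For reference, the constraint that each reciprocal pair provides can be written out explicitly. The following is a sketch along the lines of the derivation in [1], assuming an isotropic point source and ignoring constant radiometric factors. The symbols p (scene point), n (unit surface normal), f (the BRDF) and e_1, e_2 (the intensities observed for p in Image-1 and Image-2) are notation introduced here for illustration.

```latex
\begin{align*}
v_i &= \frac{O_i - p}{\lVert O_i - p \rVert}, \qquad
v_r = \frac{O_r - p}{\lVert O_r - p \rVert} \\
e_1 &= f(v_i, v_r)\,\frac{n \cdot v_i}{\lVert O_i - p \rVert^{2}}
  && \text{(Image-1: camera at } O_r\text{, source at } O_i\text{)} \\
e_2 &= f(v_r, v_i)\,\frac{n \cdot v_r}{\lVert O_r - p \rVert^{2}}
  && \text{(Image-2: camera at } O_i\text{, source at } O_r\text{)} \\
0 &= \left( e_1\,\frac{v_r}{\lVert O_r - p \rVert^{2}}
          - e_2\,\frac{v_i}{\lVert O_i - p \rVert^{2}} \right) \cdot n
  && \text{(using } f(v_i, v_r) = f(v_r, v_i)\text{)}
\end{align*}
```

The BRDF cancels in the last line, so the constraint involves only measured intensities, known geometry and the unknown normal. Stacking the bracketed vector of each of the N reciprocal pairs as the rows of an N x 3 matrix W gives W n = 0, which is why W is expected to have rank at most 2 at the correct depth.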

The algorithm works as follows:

Capture N reciprocal pairs (2N images) using different camera-source locations
Consider image-1 of the 1st reciprocal pair as the base image
For each pixel in the base image:
- Iterate over a set of depth values d₁ , d₂ , ... , d_m
- For each such d_i , construct the W matrix as suggested in the original paper [1]
- From all the W(d_i) matrices, select the matrix which is the closest to being rank-2 (say W(d_q))
- The surface normal at the point is the 3rd right singular vector of W(d_q), i.e. the one associated with the smallest singular value
Generate the 3D surface using the depth and normal estimates
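
The per-pixel search above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the project's C++/OpenCV code: the function names, the `build_w` callback and the `eps` floor are hypothetical, the row formula follows the reciprocity constraint of [1] under a point-source assumption, and the ratio of the 2nd to the 3rd singular value is used as the rank-2 measure, as discussed in section 5.

```python
import numpy as np


def helmholtz_row(p, o_r, o_i, e1, e2):
    """Row of W for one reciprocal pair at the candidate 3D point p.

    e1: intensity of p in Image-1 (camera at o_r, source at o_i).
    e2: intensity of p in Image-2 (camera at o_i, source at o_r).
    """
    d_r = o_r - p
    d_i = o_i - p
    # e * v / |o - p|^2 with v = d / |d| is the same as e * d / |d|^3.
    return e1 * d_r / np.linalg.norm(d_r) ** 3 - e2 * d_i / np.linalg.norm(d_i) ** 3


def rank2_score_and_normal(W, eps=1e-12):
    """Return (sigma_2 / sigma_3, right singular vector of sigma_3)."""
    _, s, vt = np.linalg.svd(W)
    # Floor sigma_3 relative to sigma_1 so an exactly rank-2 W gives a finite score.
    floor = max(s[2], eps * s[0], np.finfo(float).tiny)
    return s[1] / floor, vt[2]


def search_depth(depths, build_w):
    """Pick the candidate depth whose W is closest to rank 2.

    build_w(d) is a hypothetical callback returning the N x 3 matrix W for
    depth d. In a full pipeline it would back-project the base pixel to depth
    d, project that point into all 2N images and sample the intensities.
    """
    best = None
    for d in depths:
        score, normal = rank2_score_and_normal(build_w(d))
        if best is None or score > best[0]:
            best = (score, d, normal)
    return best  # (score, depth, normal); the sign of the normal is ambiguous
```

The singular vector is only defined up to sign, so a real implementation would still need to flip the normal to face the base camera.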

3. IMPLEMENTATION

The implementation can be divided into three main parts: (1) Scene Generation, (2) Image Generation, and (3) Helmholtz Stereopsis.

Scene Generation:

We created tessellations of some primitive shapes (spheres, cubes and cuboids). These serve as building blocks for our scenes.
We considered two kinds of materials: (1) Lambertian and (2) MERL BRDF models [2].
Using the primitive shapes and BRDFs, we implemented a framework in OpenGL to render the scene given the camera and source positions.

Image Generation:

This part of our implementation focuses on capturing the reciprocal pairs.
Drawing inspiration from the experimental setup of the original paper [1], we modeled our image generator to capture reciprocal images with the camera and source positioned on a circle.
Consider a sphere enclosing the scene. Given the polar angle (θ) and the azimuth angle (φ), we placed the camera and source at (θ, φ) and (θ, φ + 120°) respectively. Fixing θ at 15°, we generated 20 reciprocal pairs by varying φ uniformly from 0° to 359°.
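
The camera/source placement described above can be sketched as follows. This is a minimal illustration rather than the project's code: the function names, the unit sphere radius, and the convention that θ is measured from the +z axis are assumptions. A uniform step of 360°/20 = 18° is used for φ.

```python
import math


def on_sphere(theta_deg, phi_deg, radius=1.0):
    """Point on a sphere around the origin; theta is measured from +z (assumed)."""
    t, f = math.radians(theta_deg), math.radians(phi_deg)
    return (radius * math.sin(t) * math.cos(f),
            radius * math.sin(t) * math.sin(f),
            radius * math.cos(t))


def reciprocal_positions(n_pairs=20, theta_deg=15.0, offset_deg=120.0, radius=1.0):
    """Return a list of (camera, source) positions, one entry per reciprocal pair."""
    pairs = []
    for k in range(n_pairs):
        phi = 360.0 * k / n_pairs  # 0, 18, ..., 342 degrees for 20 pairs
        cam = on_sphere(theta_deg, phi, radius)
        src = on_sphere(theta_deg, (phi + offset_deg) % 360.0, radius)
        pairs.append((cam, src))
    return pairs
```

Each entry yields two renders: one with the camera at `cam` and the source at `src`, and one with the two positions swapped.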

Helmholtz Stereopsis:

This part implements the Helmholtz Stereopsis algorithm described in Section 2.
We stored the estimated depth maps, normal maps and error maps along with the corresponding metadata for further analysis.
We experimented with various depth smoothing techniques to get better results.
We implemented this part in C++ with OpenCV.

4. RESULTS

This section summarizes the experiments we ran on our implementation of Helmholtz Stereopsis.

Lambertian cube:

The images below are the reciprocal image pairs. These images, along with the camera/source positions, serve as inputs to the algorithm. The top left image is treated as the base image. This is the reference viewpoint for which the depth map and normal map are estimated.

Figure (images omitted): Reciprocal pair 1, Reciprocal pair 2, Reciprocal pair 3.

Observe that the estimated depth map is noisy. This is mainly because the constraints exploited for reconstruction are necessary but not sufficient; we elaborate on this point in Section 5. The normals are color coded so that they can be shown as an RGB image: each component (x, y, z) of the estimated normal is mapped from [-1, 1] to [0, 255]. Though not perfect, the estimated normals are less noisy than the estimated depths. Hence, rather than using the estimated depth values directly, we recover a smoother depth map by integrating the estimated normals (Frankot-Chellappa [3]). In all the following sections, we only show and compare the estimated normal maps.
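
The color coding and the integration step can be sketched in NumPy as below. This is a hedged illustration, not the project's implementation: the function names are hypothetical, the integration is a basic Fourier-domain version of the Frankot-Chellappa idea [3] with implicit periodic boundaries, and the normals are assumed to be given in a frame where z points toward the camera, x runs along columns and y along rows.

```python
import numpy as np


def normals_to_rgb(normals):
    """Map each normal component from [-1, 1] to [0, 255]."""
    return np.round((np.clip(normals, -1.0, 1.0) + 1.0) * 127.5).astype(np.uint8)


def frankot_chellappa(p, q):
    """Integrate a gradient field (p = dz/dx, q = dz/dy) in the Fourier domain.

    Returns the zero-mean depth map whose gradients best match (p, q) in the
    least-squares sense. The constant offset cannot be recovered from gradients.
    """
    h, w = p.shape
    u, v = np.meshgrid(2 * np.pi * np.fft.fftfreq(w), 2 * np.pi * np.fft.fftfreq(h))
    denom = u ** 2 + v ** 2
    denom[0, 0] = 1.0  # avoid 0/0 at the DC term
    Z = (-1j * u * np.fft.fft2(p) - 1j * v * np.fft.fft2(q)) / denom
    Z[0, 0] = 0.0  # fix the unknown offset by forcing zero mean
    return np.real(np.fft.ifft2(Z))


def depth_from_normals(normals, min_nz=1e-3):
    """normals: H x W x 3 array of unit normals (n proportional to (-z_x, -z_y, 1))."""
    nz = np.maximum(normals[..., 2], min_nz)  # guard against grazing normals
    return frankot_chellappa(-normals[..., 0] / nz, -normals[..., 1] / nz)
```

Because the projection onto integrable surfaces is a least-squares fit over the whole image, isolated noisy normals are averaged out, which is consistent with the smoother depth map reported here.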

Figure (images omitted): Estimated depth map, Estimated normal map, Depth from normals.

Reciprocal pairs:

As stated previously, the Helmholtz constraints are necessary but not sufficient, which implies that wrong depth values can also satisfy them. To mitigate this, one can capture and use more than 3 reciprocal pairs: the probability that a wrong depth value satisfies all the constraints decreases as the number of constraints grows. The estimated normal maps below illustrate the advantage of using more reciprocal pairs, which is most clearly visible when the object of interest is a sphere. In all the following sections, results are derived using 20 reciprocal image pairs.

Figure (images omitted): Using 3 pairs, True normal map, Using 20 pairs.

Different materials:

We used OpenGL to capture the reciprocal image pairs, primarily so that we could model objects with arbitrary BRDFs. Below, we show the results for real-life materials such as plastic, gold and rubber. Notice that the method fails to estimate normals in saturated regions. We believe this is only because our setup is highly symmetric (the object is spherical and the camera/source positions lie on a circle). The method should be robust to object points being saturated in a few reciprocal pairs. A thorough validation of this remains future work.

Figure (images omitted): Plastic cube, Gold sphere and Rubber sphere, each shown next to its estimated normal map.

Compound scenes:

So far we have considered only one object in the scene. We now consider scenes with multiple objects.

Figure (images omitted): Image 1 and Image 2 of reciprocal pair 1, followed by the true normal map and the estimated normal map.

The reconstruction is of fairly high quality except in the region where the cube overlaps the sphere. This can be attributed to occlusions in the captured reciprocal images.

5. LIMITATIONS AND ISSUES

We used noise-free images in our experiments. Still, the results were not perfect. This can be attributed to the following two reasons:

Depth ambiguity: we use the ratio of the 2nd and 3rd singular values of the W matrix as a measure of how close it is to being rank 2 (in practice, noise in the acquired images makes it impossible to obtain an exactly rank-2 matrix). The larger the ratio, the closer the matrix is to rank 2. The figure below illustrates that even in the continuous scenario (no noise and no discretization) the ratio "peaks" at multiple depth values. These inherent depth ambiguities call for stronger constraints.

Discretization errors: mapping a continuous scene to a 2D pixelated image (spatial discretization) introduces errors in the constructed W matrix.

The rank measure is highly sensitive and deviates even for small errors. This, along with the time-consuming acquisition process, limits the use of Helmholtz Stereopsis in practical scenarios. However, the ability of the method to recover surface normals of arbitrary materials with unknown BRDFs far outweighs these limitations.
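
The sensitivity of the rank measure can be illustrated on synthetic data. The sketch below is not one of the project's experiments: it builds a random N x 3 matrix that is exactly rank 2, adds Gaussian noise of increasing magnitude, and reports the ratio of the 2nd to the 3rd singular value. The function names, noise levels and seeds are arbitrary choices made for illustration.

```python
import numpy as np


def rank2_ratio(W):
    """sigma_2 / sigma_3 of W; large values mean W is close to rank 2."""
    s = np.linalg.svd(W, compute_uv=False)
    # Floor sigma_3 relative to sigma_1 so an exactly rank-2 W gives a finite ratio.
    floor = max(s[2], np.finfo(float).eps * s[0], np.finfo(float).tiny)
    return s[1] / floor


def synthetic_rank2_matrix(n_rows=20, seed=0):
    """Random N x 3 matrix whose rows are all orthogonal to a fixed normal."""
    rng = np.random.default_rng(seed)
    normal = np.array([0.0, 0.0, 1.0])
    rows = rng.normal(size=(n_rows, 3))
    return rows - np.outer(rows @ normal, normal)  # remove the normal component


W = synthetic_rank2_matrix()
rng = np.random.default_rng(1)
noise_levels = (0.0, 1e-6, 1e-3)
ratios = [rank2_ratio(W + level * rng.normal(size=W.shape)) for level in noise_levels]
for level, ratio in zip(noise_levels, ratios):
    print(f"noise std {level:g}: sigma2/sigma3 = {ratio:.3g}")
```

In this toy setting the ratio drops by orders of magnitude once even small perturbations are added, because the 3rd singular value grows roughly in proportion to the noise level.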

6. REFERENCES

[1] T. Zickler, P.N. Belhumeur, and D.J. Kriegman. Helmholtz Stereopsis: Exploiting Reciprocity for Surface Reconstruction. In Proc. of the ECCV, 2002.
[2] MERL BRDF Database.
[3] R.T. Frankot and R. Chellappa. A method for enforcing integrability in shape from shading algorithms. IEEE Trans. Pattern Anal. Machine Intell., 1988.