CS 766 Final Project

SIRDS (single image random dot stereograms)

Rob Iverson

Fall 2000


abstract

I implement the SIRDS method of displaying 3D images on 2D displays, with a simple GUI for changing the colors of the images. I then optimize the depth resolution, creating better images. Finally, I use the system to make two simple animations, showing that it is possible to keep your eyes locked onto them.

background

A SIRDS, or single-image random-dot stereogram, is a method of fooling a person's eyes into perceiving depth in a flat image. The earliest notion of 3D from random dots was proposed in [Julesz 1962]; the first paper to show that a single image suffices was [Tyler 1990]; but the best reference is [Thimbleby 1992]. The basic idea is summarized in Figure 2 of Thimbleby's paper (page 19), which I was unable to reproduce here as a screen capture.

Basically, if we arrange the dots on a page such that the eyes are fooled into converging as though focusing on something farther away, then the illusion is realized.

perception

When you focus both eyes on a point in space, they turn toward each other, and each gets a slightly different perspective of the object. This difference is called stereo disparity, and it is one of the strongest depth cues we have. The geometry of SIRDS is simple: each eye points at a different section of the image. These two sections are constrained to be a certain distance apart, which fools your visual system into perceiving a single point on the other side of the paper or screen. Different regions of the image are constrained to be different distances apart, so your eyes report this artificially induced stereo disparity to the brain. Your eyes can lock onto this "point" in 3-space because there is a particular way the random dots in the image line up to look "right"; when your eyes converge or diverge to that exact alignment, everything falls into place and the effect is realized.

methods

The first SIRDS images were created in 1990 by Tyler and Clarke. Their algorithm is fundamentally the same as the one used in this project. One important difference is that their algorithm does not apply the constraints symmetrically; hence an image cannot be viewed upside-down or mirror-reversed. Thimbleby, Inglis, and Witten improved on this method by making it completely symmetric, which also removes some kinds of sideways bias in the construction.

One concept I wish to explore more fully is the effect of "patterns" that one can find in most SIRDS. They are generally repeated several times, separated by the same horizontal distance. These patterns actually correspond to small regions of points in the 3D scene. The distance between the patterns is the distance to which your eyes must converge to see the point in focus. This value is referred to as the "stereo separation" of the 3D point. Recall that there are two points in the image for any one point in the scene; the distance between them is this stereo separation. The stereo separation of points on the far plane should be half the distance between the eyes. Let's derive the separation of a point, given the normalized Z coordinate (refer to figure 2):

Let E be the separation between the eyes and D the distance from the eyes to the image plane, with the far plane placed a distance D behind the image plane (this is what makes the separation on the far plane come out to E/2, as noted above). The distance between the near and far planes is a fraction μ of D, or μD. The Z coordinate is normalized such that Z=0 at the far plane and Z=1 at the near plane. Thus for some 3D point P at Z, the distance from P to the image plane is D(1 - μZ), and the distance from P to the eyes is D + D(1 - μZ), or D(2 - μZ). By similar triangles we have:

    s / E = D(1 - μZ) / D(2 - μZ)

and thus:

    s = E (1 - μZ) / (2 - μZ)

is the formula for stereo separation of any point given its Z coordinate. From this we come to an interesting limitation: the depth resolution of the stereogram. Because the stereo separation is what drives our perception of depth, the maximum depth difference is between the near and far planes, which correspond to stereo separations of s_near and s_far. The only depth differences we can render come from separation differences between s_near and s_far, measured in pixels. With an eye separation of 2.5 inches, and some common values for μ and the output device resolution (in dots per inch), we obtain table 1, giving the maximum number of distinct depth planes.

  μ     DPI = 72   DPI = 150   DPI = 300   DPI = 600
 0.1        6         11          21          40
 0.2       11         22          43          84
 0.3       17         34          67         133
 0.4       24         48          95         189
 0.5       31         64         126         251
 0.6       40         81         162         322
 0.7       49        102         203         405
 0.8       61        126         251         501
 0.9       75        154         308         615
 1.0       91        189         376         751

Table 1: maximum number of distinct depth planes, by μ and device resolution.
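As a quick sanity check of table 1, here is a short Python sketch (names are mine), assuming the Thimbleby-style separation formula s = E(1 - μZ)/(2 - μZ) with an eye separation of 2.5 inches converted to pixels at the device resolution:

```python
def separation(Z, mu, E):
    """Stereo separation in pixels of a point at normalized depth Z
    (Z = 0 on the far plane, Z = 1 on the near plane)."""
    return E * (1 - mu * Z) / (2 - mu * Z)

def depth_planes(mu, dpi, eye_inches=2.5):
    """Maximum number of distinct depth planes: one plane per integer
    pixel separation between s_near and s_far, inclusive."""
    E = eye_inches * dpi                       # eye separation in pixels
    span = separation(0, mu, E) - separation(1, mu, E)
    return int(span + 1.5)                     # inclusive count, rounded half up

print(depth_planes(0.6, 72))    # the ~40-plane monitor limit from table 1
print(depth_planes(0.4, 600))
```

Each call reproduces the corresponding entry of table 1, so the table can be regenerated for any other device resolution.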

Note that high values of μ increase the difficulty of fusion; experimentally, we found values of 0.6 or lower to be ideal. This means that on a computer monitor, the absolute best we can manage is about 40 depth planes, which produces very noticeable artifacts. However, since modern printers can easily produce images at 300 dpi, and sometimes 600, it is plausible to have hundreds of depth planes, bringing image quality up to par with commercial books and posters. Unfortunately, we have not found a way to convince OpenGL to create really large buffers (on the order of 2000x3000). I really wanted images this big, so I wrote a simple rasterizer to generate the depth values to feed into the SIRDS algorithm; that is how the 300 dpi image at the bottom of this page was created.
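The rasterizer itself is too long to list here, but as a minimal sketch of the idea (the sphere scene and buffer dimensions are invented for illustration), a depth-only render of a single sphere under orthographic projection might look like:

```python
import numpy as np

def sphere_zbuffer(width, height, cx, cy, radius):
    """Depth buffer for a single sphere, orthographic projection.
    Output is normalized: 0 = far plane (background), 1 = near plane
    (the point of the sphere closest to the viewer)."""
    ys, xs = np.mgrid[0:height, 0:width]
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2        # squared distance to center
    z = np.zeros((height, width))               # background sits on far plane
    inside = d2 < radius * radius
    z[inside] = np.sqrt(radius * radius - d2[inside]) / radius
    return z

# A 3000x2000 buffer -- far larger than our OpenGL contexts allowed.
zbuf = sphere_zbuffer(3000, 2000, cx=1500, cy=1000, radius=600)
```

Since the buffer is just an array in memory, its size is limited only by RAM rather than by what the graphics driver will allocate.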

Here is a short summary of the algorithm:
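The figure summarizing the algorithm did not survive conversion to this page; as a rough sketch (in Python, with my own variable names), the symmetric constraint-propagation scheme of [Thimbleby 1992] for a single scanline looks something like this, with hidden-surface removal omitted for brevity:

```python
import random

def sirds_row(z_row, E=90, mu=0.5):
    """One scanline of a SIRDS. z_row holds normalized depths
    (0 = far plane, 1 = near plane); E is eye separation in pixels."""
    width = len(z_row)
    same = list(range(width))      # same[x]: pixel x must equal pixel same[x]

    def root(x):                   # chase constraint links to a representative
        while same[x] != x:
            x = same[x]
        return x

    for x in range(width):
        s = round(E * (1 - mu * z_row[x]) / (2 - mu * z_row[x]))
        left = x - s // 2          # the two image points for this 3D point,
        right = left + s           # placed symmetrically about x
        if 0 <= left and right < width:
            a, b = root(left), root(right)
            if a != b:
                same[max(a, b)] = min(a, b)    # union: force equal colors

    pix = [0] * width              # color each constraint class at random
    for x in range(width):
        r = root(x)
        pix[x] = pix[r] if r != x else random.randint(0, 1)
    return pix
```

Running this once per scanline of the depth buffer, and mapping the 0/1 values through the user's two chosen RGB colors, yields the images shown below.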

results

We have a simple modeling system in place that handles geometric primitives such as cones, spheres, teapots, and triangle meshes. The UI allows users to add any number of these shapes, move and scale them, and view them from any angle. When satisfied, the user prompts the system to create the SIRDS, which is generated and displayed immediately. Simple color-editing controls are available for changing the colors from the default black and white to any other RGB values. It is surprisingly easy to view the image, even when the colors don't seem that far apart.

There is also a facility for generating simple animations. Currently the only kind generated is camera rotation, but the general nature of our system would allow _any_ OpenGL program to generate a SIRDS at any point in its execution. Near the end of this web page there are two AVI files generated from the SIRDS data we create.

simple dinosaur

These images are the depth buffer and associated SIRD for a dinosaur mesh. Click on the image to get a full-sized copy.


Here is another image pair: a teapot. Note the inconsistencies near the handle.



You can see the simple GUI I have created; changes to the color choosers above cause immediate changes to the displayed SIRDS.



You can see below that using a paint program to create color gradients can give images that strikingly resemble the commercial "Magic Eye" pictures.


Since all of these images were created at the CRT resolution of 72 dpi, they are boring to print, so here is a high-resolution (300 dpi) image: big_sird.gif, along with a shrunken version of the depth-buffer image used to create it, in case you need a hint: big_zbuffer_small.gif. The depth values were generated by my program, using a simple renderer I wrote for this purpose.

And finally, I also set up the program to generate sequences of frames for simple animations. Even though the "animation" is just a camera rotation, we should be able to use the system to generate SIRDS from nearly any OpenGL program (with Condor, DynInst, or the GL Hijacker). You may have to stop the animation to focus your eyes properly on one frame. Note that once you have achieved the focus, your eyes stay locked in, letting you see the animation very clearly (perhaps more easily than a single still image).

First, a video I like to call "static": static.avi (3571k)

Next, a similar video called "more static": morestatic.avi (7036k)

references

[Julesz 1962] : Julesz, Miller, 1962. "Automatic stereoscopic presentation of functions of two variables." Bell System Technical Journal 41: 663-676, March.

[Thimbleby 1992] : Thimbleby, Inglis, Witten, 1992. "Displaying 3D Images: Algorithms for Single Image Random Dot Stereograms."

[Tyler 1990] : Tyler, Clarke, 1990. "The autostereogram." SPIE Stereoscopic Displays and Applications 1258: 182-196.