Smooth Panorama Stitching With Lie Algebras

Mikola Lysenko


Introduction

For this project, we were required to implement a panorama stitching program, with the option of creating a fully automatic image stitcher. Rather than follow the easier path of a cylindrical stitcher, I decided to take the road less travelled and try for a full AutoStitch workalike. In retrospect, this was perhaps too much as I have done very little these last few days except hack on this project. My health has suffered gravely and I am now a zombie like shell of my former self. However, despite missing the deadline by a few dozen hours, this project has been at least partially successful in meeting this ambitious goal.

The algorithm developed in this report solves the autostitching problem using a novel image warping algorithm. It automatically stitches arbitrary collections of images into low distortion panoramas regardless of camera configuration or scene geometry. This approach also works for cylindrical, spherical and many more types of panoramic images - even beyond those which are possible in AutoStitch. The key to making this system work is a clever application of the exponential map derived from the theory of Lie algebras which makes it possible to smoothly interpolate homographies.

Due to the time limitations of this class project, it was not possible to perform a thorough survey of related work, so instead the focus of this report is solely on the description of the technique. The outline of this document is as follows: first, the overall structure of the algorithm is summarized, followed by a description of the specific details. After this, potential improvements are discussed along with some conclusions about the basic approach. Finally, technical details about the accompanying software along with installation and usage instructions are elaborated.

Algorithm Overview

The algorithm consists of 4 distinct phases. The first phase performs pairwise image alignment. This is done using the standard SIFT/RANSAC technique pioneered by Brown and Lowe. This result is then used to compute an initial layout of the images on the viewing plane based. Finally, the images are warped using the new Lie algebra interpolation and then blended.

Pairwise Alignment

Here is a brief summary of the pairwise alignment algorithm discussed in class: Given two images, $ I_i, I_j$, we want to compute a homography, $ H_{i,j}$, mapping $ I_i$ to $ I_j$. To do this, we run SIFT on the pair of images to obtain a sequence of matching features, $ m_k = (x_k, y_k, x'_k, y'_k)$, where $ x_k, y_k$ are the coordinates of the $ k^{th}$ feature in $ I_i$ and $ x'_k, y'_k$ are the coordinates of its matching feature in $ I'_j$. From this we randomly sample 4 matchings within the feature and attempt to fit a homography using the least-squares solution of the following system:
$\displaystyle \textrm{argmin}_h \vert\vert M h\vert\vert^2$ $\displaystyle =$ 0 (1)

Where:
$\displaystyle H_{i,j}$ $\displaystyle =$ $\displaystyle \begin{pmatrix}h_{1,1} & h_{1,2} & h_{1,3} \ h_{2,1} & h_{2,2} & h_{2,3} \ h_{3,1} & h_{3,2} & h_{3,3} \end{pmatrix}$  
$\displaystyle h$ $\displaystyle =$ $\displaystyle \begin{pmatrix}h_{1,1} & h_{1,2} & h_{1,3} & h_{2,1} & h_{2,2} & h_{2,3} & h_{3,1} & h_{3,2} & h_{3,3} \end{pmatrix} ^T$  
$\displaystyle \begin{pmatrix}M_{2n} \ M_{2n+1} \end{pmatrix}$ $\displaystyle =$ $\displaystyle \begin{pmatrix}-x_n & -y_n & -1 & 0 & 0 & 0 & x'_n x_n & x'_n y_n & x'_n \ 0 & 0 & 0 & -x_n & -y_n & -1 & y'_n x_n & y'_n y_n & y'_n \end{pmatrix}$  

From this, we count the number of inliers of $ H_{i,j}$ and store this score in a table of weights, $ W_{i,j}$. The homography with the maximum score is taken to be $ H_{i,j}$.

Panorama Layout

After computing pairwise image alignment, we next compute a set of euclidean transformations, $ E_i$, from each image $ i$ to the viewing plane. For the purposes of this project, we decided to place images relative to each other according to a maximum spanning tree with edge weights determined by $ W_{i,j}$ (as computed in the alignment phase.) From this tree, we assign to the root of the tree the transformation $ E_0 = I$ and construct the remaining transformations by concatenating Euclidean approximations of the homographies. If $ (i,j)$ is a directed edge in the tree, then define the transformation $ E_j$ as follows:
$\displaystyle E_j$ $\displaystyle =$ $\displaystyle E_i * proj_E (H_{i,j})$  

Where $ proj_E$ is an operator which projects homographies onto the nearest Euclidean approximation. Several different estimates for $ proj_E$ were attempted during the course of this work. In the end, we resorted to using some problem specific knowledge. Because our final image is going to be a row of images, we simply project the homographies onto a single axis and store only translational components. This restriction may be relaxed to create arbitrary types of panoramas, such as those produced by AutoStitch. Finally, the arrangement of the scene tiles could be enhanced by taking into account user specific criteria or other aesthetic considerations.

Lie Warping

In this section, a Lie algebra based image warping method is summarized. For such a simple procedure, deriving and debugging it took a substantial amount of time and is largely the reason for this homework being late. It is hoped that the grader shall be merciful and understanding in this account. We begin with a high level summary of the relevant aspects of Lie groups:

Lie Groups & Algebras

A Lie group is a group which is also differentiable manifold. This property endows it with a nice local structure: specifically around every element in the group, there exists a projection onto a Lie algebra which linearizes the group operator. This fact has many useful consequences. For example, spherical interpolation of quaternions is an instance of this type of mapping and is incredibly useful for skinning, camera tweening, character animation and much more.

The mapping from a Lie algebra around the identity to its corresponding Lie group is given by:

$\displaystyle exp(g)$ $\displaystyle =$ $\displaystyle I + g + \frac{1}{2} g^2 + \frac{1}{6} g^3 + ... + \frac{1}{n!} g^n$ (2)

Where $ g$ is an element of the Lie algebra, and $ I$ is the identity. For matrices, mapping an element, $ G$, in the Lie group to the Lie algebra about the identity is given by:
$\displaystyle ln(G)$ $\displaystyle =$ $\displaystyle (G - I) - \frac{1}{2} (G - I)^2 + \frac{1}{3} (G - I)^3 + ... (-1)^{n+1}\frac{1}{n} (G - I)^n$ (3)

These operators are respectively known as the exponential and logarithmic maps, as they generalize the same operators from real and complex numbers to higher dimensions. To find the Lie algebra about an element $ X$ other than the identity, one may use the following identities:
$\displaystyle exp_X(g)$ $\displaystyle =$ $\displaystyle exp(g) * X$  
$\displaystyle ln_X(G)$ $\displaystyle =$ $\displaystyle ln(G * X^{-1})$  

It is important to note that the Lie algebra only approximates the group at a point. In general, the formula $ exp(a + b) = exp(a) * exp(b)$ will only hold if the group is commutative. This is the case for both the real and complex numbers, but certainly not for 3D rotations (aka $ SO(3)$). The degree of commutativity between two elements, $ a, b$, can be measured using the Lie bracket operator, which is defined as:

$\displaystyle [A,B]$ $\displaystyle =$ $\displaystyle A B - B A$ (4)

Where $ A,B$ are two elements of the Lie group. In a non-commutative Lie group, the bracket operator grows larger as $ A,B$ get farther away from each other. Intuitively, the bigger the bracket gets, the worse the Lie algebra approximates the group operator. As a consequence, we can consider the Lie algebra approximation to be good when two elements are nearby and bad when two elements are far away.

Because the Lie algebra is linear, it gives us a nice way to interpolate elements of the Lie group. Suppose we are given an element, $ A$, which we wish to divide into $ n$ evenly spaced intervals; in other words we wish to solve for $ a$ in this equation:

$\displaystyle A$ $\displaystyle =$ $\displaystyle a^n$ (5)

If $ A$ is suitably close to the identity, we can use the Lie algebra approximation to do the following trick:
$\displaystyle lnA)$ $\displaystyle =$ $\displaystyle ln(a^n)$ (6)
$\displaystyle ln(A)$ $\displaystyle =$ $\displaystyle n lg(a)$ (7)
$\displaystyle \frac{ln(A)}{n}$ $\displaystyle =$ $\displaystyle ln(a)$ (8)
$\displaystyle a$ $\displaystyle =$ $\displaystyle exp(\frac{ln(A)}{n})$ (9)

Thus, applying $ a$ $ n$-times is roughly the same as applying $ A$ once. In the next section, we apply this idea to solving for smooth interpolations of camera motions.

Interpolating Homographies

The set of homographies (ie linear automorphisms of projective plane, $ RP^2$) is represented by the set of invertible 3x3 matrices modulo a scaling factor, in other words $ GL(3,R) / R$. This object is a Lie group under the usual matrix multiplication operator. As a result, there is a natural way to interpolate between nearby homographies using the Lie algebra as described above. Given an image $ I_i$ and its corresponding placement $ E_i$, we wish compute an interpolated warp, $ T^{-1}_{i}(x)$, from the viewing plane into the coordinates of image $ i$. This done using a weighted sum of the transformation from the plane to some image $ j$ back to the image $ i$ at each point in space within the Lie algebra at $ E_i^{-1}$. The weights for this are represented using some function $ w_{i,j}(x)$ which gives:
$\displaystyle T^{-1}_{i}(x)$ $\displaystyle =$ $\displaystyle exp_{E_i^{-1} }(\displaystyle{\sum^n_{j=1} w_{i,j}(x) ln_{E_i^{-1} }(H_{j,i} E_j^{-1})})$ (10)

The only restriction on the choice of $ w_{j}$ is that for adjacent images $ i, j$ the coordinates are compatible. Currently, we phrase this condition as follows:

$\displaystyle T_{j}(x) T^{-1}_{i}(x)$ $\displaystyle =$ $\displaystyle E_j E_i^{-1}$ (11)

To satisify this constraint, we construct a Delaunay triangulation of the centers of the images within a plane. Then, $ w_{i,j}$ becomes the $ j^{th}$ barycentric coordinate of the triangle containing $ x$, or 0 if the edge $ i, j$ is not part of the site containing $ x$. In practice, this weighting produces plausible results, but it should be possible to define much better functions with more careful analysis.

Blending

At the moment, we have used a rather simplistic blending algorithm. From the interpolation phase, we store the weight associated with the identity transform in each pixel, then use this as the alpha term to blend the image into the map. With pre-multiplied alpha, the final compositing can be handled using a single post-processing phase. Better blending algorithms such as Laplacian smoothing or graph-cuts should produce higher quality images.

Results

Figure 1: Different types of panorama interpolation.

Image no_warp

Euclidean warped panorama.

Image proj_warp

Perspective warped panorama.

Image bad_algebra

Matrix interpolated warping about identity element.

Image lin_warp

Matrix interpolation about local Euclidean transform.

Image lie_warp

Correct Lie algebra based interpolation.

In this section, we examine the results of our new Lie algebra based interpolation scheme. Fig. 1 shows a comparison between perspective warping, non-interpolated Euclidean transforms, naive matrix interpolation and our new Lie algebra. The Euclidean transformation image shows the result of simply blending the images as placed by the arrangement phase of our algorithm. Overall, this does not produce a particularly smooth image. Perspective warping results in continuous images, but has bad distortion artifacts around the edges. Matrix interpolation is somewhat better, especially if the origin of the interpolation space is continuously moved. However, the Lie algebra method presented in this work has the fewest ghosting artifcats by far.

Figure 2: Some examples of non-cylindrical panoramas.

Lie algebra based warping also makes it possible to stitch panoramas from non-cylindrical camera configurations. Some examples of this are shown in Fig. 2. In each of these cases, the camera was held free hand and no knowledge of the camera parameters was used. Of course, the algorithm is not perfect. Images with large amounts of parallax or few features will give incorrect results. This can be seen to some extent in the tree image as the branches and background are slightly ghosted. However, if a good homography exists between two images, then it should be possible to warp them using this algorithm.

Conclusion

In conclusion, it is true that I was more than a day late in finishing this project. However, in spite of this fact (or perhaps because of) I did develop a new type image warping with several novel properties. As a result, I believe that this work could be considered reasonably successful for a classroom project. Nonetheless, many obvious improvements remain.

First, the image placement phase of our algorithm is basically a hack. It is probably possible to formulate the problem of euclidean projection rigorously and derive optimal placement. Likewise, the weighting functions used in warping are somewhat ad-hoc and could probably be improved with more careful analysis. Also a better blending technique such as graph-cuts or Laplacian smoothing should eliminate the vignetting artificats seen in the results and would greatly increase the visual appeal of our results.

Finally, it is not clear that the Lie algebra based interpolation is always necessary. For relatively small homographies, the errors from linear interpolation are not too bad and could be hidden outright with a decent blending algorithm. As a result, it is not clear that in many situations the Lie algebra based approach is necessary, even though strictly speaking it will always produce more accurate results. However, the cost of Lie algebra interpolation vs. matrix methods is not terribly high for an offline process and the improvement in image quality is likely worth the trade off. Plus, adding this method to an additional image stitching software package has few drawbacks as the implementation overhead (in lines of code) is very low (once you work out the math).

About the Software

This was my first experience working in MATLAB. As a result, some of the stylistic choices are probably a bit unusual. Nonetheless, the code mostly works and should be usable for creating simple panoramas. As a bonus feature, MATLAB has a pretty good implementation of the matrix logarithm/exponent, which makes interpolation much simpler. Also, the voronoi cell computation in the current version is a total hack and doesn't really work for a general stitcher. Fixing this should be easy (in theory), but it is outside the scope of this project.

Software Installation

To install the program you must download David Lowe's SIFT bindings for MATLAB. They are available here: . Once this is done, you need to add the code to your MATLAB path, which may be accomplished by typing in the MATLAB command prompt: path(path, 'path-to-SIFT-here');. After you have finished, extract all of the source code for this project into a single directory and navigate into it.

Software Usage

To use the sofware do the following:
  1. Place all of your pictures in a directory single directory, eg. "mypanorama". Make sure there are no other files in this directory.
  2. At the MATLAB command prompt, type pano_test('mypanorama'); to construct a panorama image from your data.
  3. Wait.
  4. Your panorama will be displayed on the screen and also saved to the file called "mypanorama.jpg".