CS 766 | Fall 2003
Reconstructing a three-dimensional object from several images is a well-studied problem in computer vision. An analogous problem is inferring an object's structure from realistic paintings of it. Solving this problem would make it possible to reconstruct objects from artistic records, or to analyze an artist's reliance on optical aids such as lenses or mirrors. Results of these digital reconstructions could support, or help to discredit, the theory that Renaissance artists used optical aids to project images onto their canvases, thereby enhancing the realism of their work. I specifically study Jan Vermeer, digitally reconstructing a chair that recurs in his paintings, and consider whether my results support the contention that such artists relied on mechanical devices.
The development of computer vision makes non-contact, in-process measurement available to continuous industrial manufacturing processes, bringing benefits such as improved part quality, greater automation, and lower labor costs. Many companies are already developing machine vision systems to meet industrial demand [1-2]. This project explores the application of computer vision techniques to monitoring part features in the injection molding process. Specifically, it involves photographing finished parts, detecting the features of interest in the acquired images, and comparing the feature information with the CAD geometry. Camera calibration is therefore a necessary first step in the experiments. The rectangular part frame, the feature of interest in this project, is then detected in the image by a line-extraction algorithm. Finally, the extracted part frame is compared to the ideal part geometry, and quality indices such as short shot, flash, and linear shrinkage are determined.
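The abstract does not name the line-extraction algorithm used; a Hough transform is one common choice for finding the straight edges of a rectangular frame. A minimal sketch with OpenCV, where the filename and all thresholds are illustrative assumptions:

```python
import cv2
import numpy as np

# Hypothetical input image of a molded part.
img = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Probabilistic Hough transform returns line segments (x1, y1, x2, y2).
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

# Keep near-horizontal and near-vertical segments as candidate frame edges.
frame_edges = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
        if min(angle, 180 - angle) < 5 or abs(angle - 90) < 5:
            frame_edges.append((x1, y1, x2, y2))
```

The surviving segments would then be fit to the four sides of the frame and compared against the CAD dimensions.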
Much recent work has gone into techniques for "filling in" portions of an image. Such techniques have many applications, ranging from the reconstruction of damaged photographs to the removal of large unwanted objects. This paper covers my implementation of one such technique, which combines image inpainting with texture synthesis: the method presented in "Object Removal by Exemplar-Based Inpainting" by A. Criminisi, P. Perez, and K. Toyama, Proc. Computer Vision and Pattern Recognition Conf., 2003.
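The heart of the Criminisi et al. method is a fill-order priority P(p) = C(p) · D(p) computed for each patch centered on the fill front, combining a confidence term with an isophote-driven data term. A minimal sketch of that rule; array names, the patch radius, and the normalization α are illustrative:

```python
import numpy as np

def priority(confidence, isophote, normal, filled, p, half=4, alpha=255.0):
    """Priority P(p) = C(p) * D(p) from Criminisi et al. (CVPR 2003).
    confidence: per-pixel confidence map; filled: 1 where already filled.
    isophote: image gradient rotated 90 degrees at p; normal: unit normal
    to the fill front at p. All names here are illustrative."""
    y, x = p
    patch_conf = confidence[y - half:y + half + 1, x - half:x + half + 1]
    patch_fill = filled[y - half:y + half + 1, x - half:x + half + 1]
    # Confidence term: fraction of reliable (already filled) information.
    C = (patch_conf * patch_fill).sum() / patch_fill.size
    # Data term: strength of the isophote flowing into the front.
    D = abs(np.dot(isophote, normal)) / alpha
    return C * D
```

At each step the patch with the highest priority is filled by copying its best-matching exemplar from the known region, so linear structures propagate into the hole before flat texture does.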
Active contours are a heavily studied topic in computer vision, used to detect object boundaries in an image. This project implements the level set approach to active contours developed by Ravikanth Malladi, James A. Sethian, and Baba C. Vemuri. The approach has the advantage that it makes no assumption about the topology of the objects in the image. The initial contour is propagated with a speed that combines a constant term with terms dependent on the image gradient and the local curvature. The propagation equations are solved using viscosity solutions to these Hamilton-Jacobi type equations. The discrete approximation uses entropy-satisfying upwind difference schemes, resulting in a final contour that can model arbitrarily complex shapes.
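The entropy-satisfying upwind update for the level set equation φ_t + F|∇φ| = 0 is compact enough to sketch. A minimal version on a grid with unit spacing and periodic boundaries (both simplifying assumptions), where F is the image- and curvature-dependent speed field:

```python
import numpy as np

def level_set_step(phi, F, dt=0.1):
    """One first-order upwind step of phi_t + F|grad phi| = 0."""
    dxm = phi - np.roll(phi, 1, axis=1)   # backward difference in x
    dxp = np.roll(phi, -1, axis=1) - phi  # forward difference in x
    dym = phi - np.roll(phi, 1, axis=0)   # backward difference in y
    dyp = np.roll(phi, -1, axis=0) - phi  # forward difference in y
    # Entropy-satisfying upwind gradient magnitudes (Osher-Sethian scheme).
    grad_plus = np.sqrt(np.maximum(dxm, 0)**2 + np.minimum(dxp, 0)**2 +
                        np.maximum(dym, 0)**2 + np.minimum(dyp, 0)**2)
    grad_minus = np.sqrt(np.minimum(dxm, 0)**2 + np.maximum(dxp, 0)**2 +
                         np.minimum(dym, 0)**2 + np.maximum(dyp, 0)**2)
    # F > 0 expands the front, F < 0 contracts it.
    return phi - dt * (np.maximum(F, 0) * grad_plus +
                       np.minimum(F, 0) * grad_minus)
```

Because the contour is the zero level set of φ rather than an explicit curve, it can split and merge freely, which is what frees the method from topological assumptions.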
In this project, we build a virtual city that mirrors a real city and can serve as a platform for demonstrating various computer vision techniques. We focus on one technique in particular, the affine 3D representation, and apply it to face reconstruction. While estimating the depth of a face using the affine 3D representation turned out not to be very successful, it opened an interesting direction: estimating relative depth ordering with the affine 3D representation. We present our experimental results on face reconstruction using the affine 3D representation and explain why it was not as successful as claimed in some of the literature. We also provide results on relative depth ordering estimation, which confirm our conjecture and expose an intriguing direction to explore.
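The abstract does not spell out which affine 3D construction was used; one standard choice is the Tomasi-Kanade factorization, which recovers affine structure (and hence relative depth, up to an unknown affine transform) from points tracked across views. A minimal sketch under that assumption:

```python
import numpy as np

def affine_structure(W):
    """Tomasi-Kanade affine factorization.
    W: 2F x P measurement matrix of P points tracked over F frames
    (x-coordinates in rows 0..F-1, y-coordinates in rows F..2F-1)."""
    W = W - W.mean(axis=1, keepdims=True)          # center each row
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    M = U[:, :3] * s[:3]    # 2F x 3 stacked affine camera matrices
    S = Vt[:3]              # 3 x P shape, recovered up to an affine transform
    return M, S
```

The affine ambiguity in S is exactly why absolute depth is hard to pin down, while the depth *ordering* along the third row of S is comparatively stable.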
With today's digital cameras, one can acquire a wealth of image data with little effort. However, the use of these images is relatively limited: most simply capture a scene at a given point in time. Moreover, scenes that do not represent an actual view of the environment are usually drawn or modeled from scratch. In this project, we investigate a cut-and-paste approach to generating novel scenes. Ideally, we would like to take a database of images and, given a scene description, composite a subset of the existing images into an image of the defined scene. This technique lies somewhere between image processing and virtual environment generation. The project presents some results of the technique and addresses some of its inherent challenges.
We formulate the problem of shape reconstruction from stereo images as a weighted area minimization process. Before exploring the stereo problem, we started with a simpler one: surface reconstruction from a computer-generated autostereogram. The minimal weighted-area rule can accurately reconstruct the surface structure embedded in an autostereogram, which suggested that the idea could apply to many other computer vision problems. Here we use it to reconstruct surfaces from stereo images. The approach implemented here is similar to the stereo reconstruction algorithm of Faugeras and Keriven [1], but the proposed scheme is computationally efficient and easy to implement.
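A hedged sketch of the underlying functional, following Faugeras and Keriven [1] (the exact cost used in this project may differ): the reconstructed surface S minimizes its area weighted by a photoconsistency cost Φ,

```latex
E(S) = \int_S \Phi(\mathbf{x})\, dA,
\qquad
\Phi(\mathbf{x}) = 1 - \rho\big( I_1(\pi_1(\mathbf{x})),\; I_2(\pi_2(\mathbf{x})) \big)
```

where ρ is a matching score between the two image patches (e.g. normalized cross-correlation) and π_i projects a surface point into image i. Where the images agree, Φ is small and the surface is cheap; area is penalized elsewhere, which regularizes the reconstruction.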
Delineating the boundaries of RF-ablator-induced coagulations (lesions) is an important clinical problem not well addressed by conventional imaging modalities, and automating this process is clearly desirable. Elastography, which estimates and images the local strain produced by a small, externally applied quasi-static compression, can be used to visualize thermal coagulation. Several studies have demonstrated that coagulation volumes computed from multiple planar slices through the region of interest are more accurate than volumes estimated by assuming simple shapes and incorporating single or orthogonal diameter estimates. This project presents an automated algorithm for segmenting thermal coagulations in three-dimensional elastographic data to obtain both area and volume information. The algorithm takes a coarse-to-fine approach to lesion segmentation using a gradient vector flow (GVF) active contour model, aided by prior knowledge of the geometry of typical thermal coagulations. Its performance has been shown to be comparable to manual delineation.
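The GVF field that drives this family of active contours (due to Xu and Prince) is computed by an iterative diffusion of the edge map's gradient. A minimal sketch, with the regularization weight, step size, and iteration count as illustrative assumptions:

```python
import numpy as np

def gvf(f, mu=0.2, n_iter=100, dt=0.5):
    """Gradient vector flow of an edge map f (e.g. gradient magnitude
    of the strain image). Returns the vector field (u, v)."""
    fy, fx = np.gradient(f)
    u, v = fx.copy(), fy.copy()
    mag2 = fx**2 + fy**2
    for _ in range(n_iter):
        # 5-point Laplacian via shifts (unit grid spacing assumed).
        lap_u = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        lap_v = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4 * v)
        # Smooth the field where edges are weak; stay faithful where strong.
        u += dt * (mu * lap_u - (u - fx) * mag2)
        v += dt * (mu * lap_v - (v - fy) * mag2)
    return u, v
```

The diffusion extends the capture range of the contour far from the lesion boundary, which is what lets a coarse initialization (from the assumed lesion geometry) converge reliably.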
In this project I propose a semi-automated method of orthophoto generation and explore an alternative to using stereo plotters for orientation. The method has the advantage of being fairly robust and very inexpensive. The main inaccuracies in the final product are due to the low-resolution DEMs used for orthophoto production, so improving the accuracy of the DEM would yield even better results and widen the technique's scope. The basic steps are (1) locating ground control manually (2 points per image in this case), (2) calculating the homography by matching corner features using correlation, (3) block adjustment to compute an initial approximation of the camera orientation, (4) bundle adjustment over the entire system to compute the orientation parameters precisely, and (5) reprojection using the pre-existing DEM. I have restricted myself to single-band images, but the technique extends easily to multiple bands. The results obtained were reasonable, with no visual artifacts. There is considerable scope for future improvement, some of which I hope to implement, such as faster bundle-adjustment routines and better matching algorithms for determining the homography between images.
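Step (2) can be sketched concretely: once corner features are matched by correlation, a homography follows from the direct linear transform (DLT). A minimal version, assuming the matches are already given (a robust implementation would add normalization and RANSAC):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H mapping src -> dst. src, dst: Nx2 arrays of matched
    points, N >= 4. Returns the 3x3 homography, scaled so H[2,2] = 1."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the null vector of A: the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

The resulting homographies seed the block adjustment of step (3), which in turn initializes the bundle adjustment of step (4).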
The images I work with are 'optical maps' of DNA molecules, produced by the Schwartz Lab (Genetics). To produce these images, DNA molecules (which are normally wound up inside the nucleus) are first 'stretched out' by chemical processes, after which a restriction enzyme is applied, which 'cuts' the molecules wherever a particular target sequence is present. In the final image, high intensity values should correspond to the remaining uncut DNA.
Ideally, each DNA molecule in the original sample should appear as a long, thin, stretched-out line with intermittent gaps corresponding to the cuts. These lines are not necessarily straight, but by the nature of the stretching process they should share the same general orientation. In practice, most DNA molecules break up into smaller pieces. The mix also contains short artificial sequences known as 'standards', which contain one or two known sites where the restriction enzyme should act.
The ultimate goal of the image-processing step is to identify individual stretches of DNA fragments, along with the positions along each fragment where 'cuts' occur. The lengths of these segments need to be determined in kilobases, which should be proportional to pixel length for a given image (with the constant of proportionality determined by the amount of stretching). The 'standards', whose lengths in kilobases are known, can be used to determine this constant if they can be identified. The images are of fairly good quality, but they contain several artifacts which need to be cleaned out.
So far, I have implemented only an algorithm that tentatively identifies points lying along the backbone (center) of DNA fragments, along with a rough orientation of the fragment at each point. It mostly avoids false positives (i.e., noise is handled well), but produces some false negatives (i.e., not all backbone points are identified). This information should help us infer the locations of the fragments.
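The abstract does not detail the backbone detector, so the following is only one plausible construction: ridge detection by Hessian eigenanalysis, where a strongly negative eigenvalue flags a bright line and the other eigenvector gives the local orientation. A minimal sketch with illustrative scale and threshold:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def backbone_points(img, sigma=2.0, thresh=0.5):
    """Mark candidate ridge (backbone) pixels and their orientations."""
    # Second derivatives of the Gaussian-smoothed image (the Hessian).
    Ixx = gaussian_filter(img, sigma, order=(0, 2))
    Iyy = gaussian_filter(img, sigma, order=(2, 0))
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    # Eigenvalues of the 2x2 Hessian at every pixel.
    tr, det = Ixx + Iyy, Ixx * Iyy - Ixy**2
    disc = np.sqrt(np.maximum(tr**2 / 4 - det, 0))
    lam_min = tr / 2 - disc
    # A bright ridge has strong negative curvature across the line.
    ridge = lam_min < -thresh
    # Eigenvector of the larger eigenvalue points along the ridge.
    theta = 0.5 * np.arctan2(2 * Ixy, Ixx - Iyy)
    return ridge, theta
```

Linking the surviving points along their orientations would then group them into fragments, which is the natural next step toward locating cuts.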
Pictorial structure models, originally introduced by Fischler and Elschlager, provide a statistical model of objects. Using these models, objects in an image can be recognized and their constituent parts located. Work by Felzenszwalb provides a probabilistic approach to training pictorial structure models and recognizing them in an image. For this paper, I implemented Felzenszwalb's approach to face recognition in order to gauge its effectiveness and determine the advantages and limitations of his method.
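Felzenszwalb's efficient matching rests on a generalized distance transform over part-placement costs: for squared deformation costs it runs in linear time per dimension. A minimal sketch of the 1D lower-envelope transform (applied row-wise then column-wise for 2D placements); variable names follow the usual presentation of the algorithm:

```python
import numpy as np

def dt_1d(f):
    """For every p, compute min over q of f[q] + (p - q)^2."""
    n = len(f)
    d = np.empty(n)
    v = np.zeros(n, dtype=int)            # parabola centers on the envelope
    z = np.full(n + 1, np.inf)            # envelope breakpoints
    z[0] = -np.inf
    k = 0
    for q in range(1, n):
        # Intersection of the parabola from q with the rightmost one kept.
        s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        while s <= z[k]:
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = np.inf
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d
```

This is what makes matching a whole tree of parts tractable: each part's best placement given its parent is found in time linear in the number of pixels rather than quadratic.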
Reflectance functions are approximated from data using kernel regression and used to classify materials. Classification algorithms are proposed to handle unseen materials. Experimental results show that some reflectance functions can be approximated quite accurately with kernel regression, and that accurate approximations can be used to classify materials. The kernel regression here uses convex optimization, which is simpler than the nonlinear techniques often used to fit more sophisticated reflectance models to data. Preliminary results suggest that classification can be extended to unseen materials, which has important implications for the scalability of the method.
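The abstract does not fix a particular kernel method; kernel ridge regression is one natural fit, since its training problem is convex (a regularized linear solve). A minimal sketch treating the reflectance function as a map from angle features to observed intensity, with a Gaussian kernel and illustrative hyperparameters:

```python
import numpy as np

def fit_krr(X, y, gamma=1.0, lam=1e-3):
    """Kernel ridge regression. X: NxD angle features (e.g. incoming and
    outgoing directions), y: N observed reflectance values."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)                 # Gaussian (RBF) kernel matrix
    # Convex problem: solve (K + lam*I) alpha = y.
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return alpha

def predict_krr(X_train, alpha, X_new, gamma=1.0):
    sq = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq) @ alpha
```

A material can then be classified by comparing its fitted function (or its predictions on a shared set of angles) against those of the known materials.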
Texture synthesis renders new texture images that are perceptually similar to given texture samples, and has long been an active research problem in computer vision. In this project, a recent patch-based method, "image quilting", for synthesizing new images of a texture from an existing texture image will be studied and implemented. The method was presented in the SIGGRAPH 2001 paper "Image Quilting for Texture Synthesis and Transfer" by A. Efros and W. Freeman. Its advantages and disadvantages will be studied by comparing it with statistical and other patch-based synthesis methods, and possible extensions and improvements will also be investigated.
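A central piece of image quilting is the minimum-error boundary cut between overlapping blocks: dynamic programming finds the cheapest seam through the overlap's squared-difference surface. A minimal sketch of the vertical-seam case (block selection and stitching omitted):

```python
import numpy as np

def min_error_cut(err):
    """err: HxW squared-difference surface over the overlap region.
    Returns the seam column index for each row."""
    H, W = err.shape
    cost = err.astype(float)
    # Accumulate minimal path cost, allowing moves of -1, 0, +1 columns.
    for i in range(1, H):
        left = np.roll(cost[i - 1], 1)
        left[0] = np.inf
        right = np.roll(cost[i - 1], -1)
        right[-1] = np.inf
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    # Backtrack from the cheapest endpoint in the last row.
    seam = np.empty(H, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(H - 2, -1, -1):
        j = seam[i + 1]
        lo, hi = max(j - 1, 0), min(j + 2, W)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam
```

Pixels left of the seam come from one block and pixels right of it from the other, so the transition follows the path where the two blocks already agree best.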
Morphing is a technique that produces smooth transitions between two images. View morphing works by prewarping two images prior to computing a morph and then postwarping the interpolated images. As an extension of view morphing, we take either three views, or a sequence of views from a linearly translating camera, as input and combine them to synthesize new views. Ideally, the program will track feature lines across all the input views and render the images from different angles. This technique can be applied to generating 3D face models.
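The prewarp/interpolate/postwarp structure described here is Seitz and Dyer's view morphing. A minimal sketch of that skeleton, assuming the rectifying and postwarp homographies are already known; the correspondence-driven warp that normally precedes blending is elided, so the middle step below is only a placeholder cross-dissolve:

```python
import numpy as np
import cv2

def view_morph(I0, I1, H0, H1, Hs, s, size):
    """I0, I1: input views; H0, H1: rectifying (prewarp) homographies;
    Hs: postwarp homography for the virtual view; 0 <= s <= 1;
    size: (width, height) of the working images. All assumed given."""
    # Prewarp so corresponding points lie on the same scanline.
    J0 = cv2.warpPerspective(I0, H0, size)
    J1 = cv2.warpPerspective(I1, H1, size)
    # Linear interpolation of parallel views is shape-preserving; a full
    # implementation warps along feature correspondences before blending.
    Js = (1 - s) * J0.astype(float) + s * J1.astype(float)
    # Postwarp to the desired virtual viewpoint.
    return cv2.warpPerspective(Js.astype(I0.dtype), Hs, size)
```

The prewarp is what makes the in-between images physically valid views: interpolation between parallel views corresponds to an actual camera moving along the baseline.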
Many video-surveillance applications involve humans and vehicles in scenes. In this project, an experimental, real-time video system for tracking pedestrians and moving vehicles will be implemented using a stationary camera. Background subtraction and consecutive-frame subtraction are both used to detect moving objects. Objects are modeled as rectangular templates in the image sequences, and the spatio-temporal coordinates of each object are tracked. A Kalman filter is used to update the background. The system's robustness is tested under a variety of conditions. The proposed approach is applicable to areas such as security surveillance and traffic monitoring.
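A minimal sketch combining the two detection cues named above, with a simple recursive background update standing in for the per-pixel Kalman filter (the thresholds and blend rate are illustrative assumptions):

```python
import numpy as np

def detect_moving(frame, prev_frame, background, alpha=0.02, tb=25, tf=15):
    """All inputs are float grayscale arrays of the same shape.
    Returns a motion mask and the updated background model."""
    fg_bg = np.abs(frame - background) > tb       # background subtraction
    fg_fd = np.abs(frame - prev_frame) > tf       # consecutive-frame diff
    moving = fg_bg & fg_fd
    # Update the background only where no motion was detected, so moving
    # objects are not absorbed into the model.
    background = np.where(moving, background,
                          (1 - alpha) * background + alpha * frame)
    return moving, background
```

Connected regions of the motion mask would then be boxed as the rectangular templates the abstract describes, and their coordinates tracked frame to frame.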