Computer Sciences Dept.

CS 534: Computational Photography

Fall 2016


HW #5 Projects



  • Kevin Atherton, Ted Imhoff-Smith and Zachary Ovanin
    Object Detection and Censoring: The Removal of Unwanted Artifacts
    Methods for object detection and segmentation, as well as image completion, have been investigated in prior research. Although effective methods for object detection and image completion have been developed, little research has addressed detecting and removing unwanted or potentially obscene objects from images without requiring human input. Combining object recognition and image completion techniques will allow for automatic censorship of unwanted or obscene objects within an image.
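
    As a rough illustration, the sketch below uses Python with OpenCV (an assumed toolchain, not necessarily the authors') and treats off-the-shelf inpainting as a stand-in for a full image completion method; the binary mask marking the unwanted object is assumed to come from the detection stage.

      import cv2
      import numpy as np

      def censor(image_path, mask_path, out_path="censored.png"):
          """Remove masked objects via inpainting.

          The mask (white = remove) stands in for the output of an
          object detector; no human input is needed at this stage."""
          img = cv2.imread(image_path)
          mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
          mask = (mask > 127).astype(np.uint8) * 255           # binarize
          mask = cv2.dilate(mask, np.ones((7, 7), np.uint8))   # cover object fringes
          result = cv2.inpaint(img, mask, 5, cv2.INPAINT_TELEA)
          cv2.imwrite(out_path, result)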

  • Ali Bramson and Tom Finzell
    Using SIFT Feature Detection and Image Warping Techniques to Look for Changes to Planetary Surfaces
    Galileo and New Horizons are both spacecraft missions that went to Jupiter at different times. Both took images of Jupiter's moons, allowing for a comparison of their surfaces to look for ongoing geologic activity. However, the two data sets were taken under different conditions using different instruments, making a direct comparison difficult. We propose to use Gaussian pyramids to match their resolutions. Then, using SIFT Feature Detection, we will find images taken of corresponding locations on the surface. We will map one data set to overlay the other, match their contrasts, and subtract the two images to get their differences, hopefully revealing changes to the surface that occurred between the two missions. This could be extended to stitching the images together to create a map of the entire surface.
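
    A minimal sketch of the alignment-and-subtraction step in Python with OpenCV (an assumed toolchain); resolution matching with Gaussian pyramids (e.g., cv2.pyrDown) and contrast matching would run before this.

      import cv2
      import numpy as np

      def difference_map(img_a, img_b):
          """Align img_b onto img_a with SIFT + RANSAC, then subtract."""
          gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
          gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
          sift = cv2.SIFT_create()
          kp_a, des_a = sift.detectAndCompute(gray_a, None)
          kp_b, des_b = sift.detectAndCompute(gray_b, None)
          matches = cv2.BFMatcher().knnMatch(des_b, des_a, k=2)
          good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
          src = np.float32([kp_b[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
          dst = np.float32([kp_a[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
          H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
          warped = cv2.warpPerspective(gray_b, H, gray_a.shape[::-1])
          return cv2.absdiff(gray_a, warped)   # bright pixels = candidate surface changes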

  • Shane Bric and Kevin Paprocki
    Restricted Re-Colorization using Bilateral Filtering
    Artistic filters are often used to change the look of a photo; we will take that to a new level. The first step in our algorithm is to smooth textures while keeping edges using a bilateral filter. After the input image has been smoothed, its colors will be changed to colors supplied by the user. The end result is a re-colorization with arbitrary colors that preserves the edges and form of the input image, yielding an artistic filter that can be modified to one's desire.
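
    A minimal sketch of the two steps in Python with OpenCV (an assumed toolchain); the hue and saturation arguments stand in for the user's input colors.

      import cv2

      def recolorize(img_bgr, hue, sat):
          """Edge-preserving smoothing, then impose a user-chosen color.

          hue (0-179, OpenCV's convention) and sat (0-255) come from the
          user; the filtered luminance is kept, so edges and form survive."""
          smooth = cv2.bilateralFilter(img_bgr, 9, 75, 75)   # smooth textures, keep edges
          hsv = cv2.cvtColor(smooth, cv2.COLOR_BGR2HSV)
          hsv[..., 0] = hue
          hsv[..., 1] = sat
          return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)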

  • Ben Burchfiel and Aaron Cahn
    Facial Recognition using Eigenfaces
    We will create a biometrics-oriented facial recognition system based on classifying faces as linear combinations of Eigenfaces. Because of the constrained nature of the problem, additional enhancements will be tested, such as skin color matching and hair detection.
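
    A sketch of the Eigenface core in Python with NumPy (an assumed toolchain), presuming the faces are already cropped, aligned, and flattened; the skin color matching and hair detection enhancements would be layered on separately.

      import numpy as np

      def train_eigenfaces(faces, k=20):
          """faces: (n, h*w) float array of flattened training faces."""
          mean = faces.mean(axis=0)
          centered = faces - mean
          _, _, Vt = np.linalg.svd(centered, full_matrices=False)
          basis = Vt[:k]                    # rows are the eigenfaces
          coeffs = centered @ basis.T       # each face as k weights
          return mean, basis, coeffs

      def identify(face, mean, basis, coeffs):
          """Nearest neighbor in eigenface coefficient space."""
          w = (face - mean) @ basis.T
          return int(np.argmin(np.linalg.norm(coeffs - w, axis=1)))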

  • Travis Burleson and Ramsey Kropp
    Automated Removal of Recurring Near-Occlusions from a Video Stream
    Several occupations require rapid entry into potentially hazardous closed spaces and would benefit from spatial information about the contents of the space. Recent research in computer vision and advances in computational speed have allowed the implementation of "structure from motion" methods to extract 3-dimensional structure from a video stream. These methods require a video stream that maintains an approximately constant orientation. A thrown video system suspended in a gimbal structure would provide this data, but the gimbal structure contains moving elements that would partially occlude the lens of the camera. Our project aims to automatically identify a moving object with some characteristic trait (color or a distinctive pattern), remove that object from any frames in which it is present, and use a combination of preceding and succeeding frame information and texture synthesis to fill in the voids left in those frames. We aim to use motion segmentation to attribute motion to either the camera or the object, remove the pixels associated with the moving object, and complete the final video sans the object. This is popularly known as "diminished reality," as the video stream contains less information after processing.
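
    A minimal sketch of the fill step in Python with NumPy (an assumed toolchain), using a characteristic color range as a stand-in for full motion segmentation; texture synthesis would handle pixels that no neighboring frame can supply.

      import numpy as np

      def remove_occluder(frames, lo, hi):
          """frames: list of HxWx3 uint8 arrays; lo/hi: per-channel bounds
          of the occluder's characteristic color. Occluded pixels are
          filled from the nearest frame in time where they are visible."""
          masks = [np.all((f >= lo) & (f <= hi), axis=2) for f in frames]
          out = [f.copy() for f in frames]
          for t, mask in enumerate(masks):
              for dt in range(1, len(frames)):
                  for s in (t - dt, t + dt):        # look outward in time
                      if 0 <= s < len(frames):
                          usable = mask & ~masks[s]
                          out[t][usable] = frames[s][usable]
                          mask = mask & ~usable
                  if not mask.any():
                      break
          return out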

  • Nathalie Cheng and Joe Kohlmann
    Doodle Segment Querying and Colorization
    Our project idea fuses concepts from two research papers: "Colorization using Optimization" by Levin et al. and "Fast Multiresolution Image Querying" by Jacobs et al.
    Given a user's doodle (saved as a GIF with an indexed color table), the project recognizes the segments of the image drawn using the same color. It uses these color-grouped segments to query a database of images (as in Jacobs et al.) based on shape alone (not color). The algorithm takes the resulting segment matches and colorizes them using the original doodle segment colors. Finally, it outputs the combined (and possibly blended) segments in a final image: a surreal recreation of the original doodle from real, color-shifted photographic data.

    Imagine a doodle containing a green plain in the background, a blue sky, a yellow stick figure and a red car. Each segment of the image appears in a distinct color, so the algorithm extracts them, performs image matching queries with each, color corrects the matches, and blends the final image together from real images of a plain, a sky, a person, and a car. The photographic matches are all colorized to create a vivid realization of the original doodle.
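
    A minimal sketch of the first stage in Python with PIL and NumPy (an assumed toolchain): splitting the doodle into one binary query shape per palette color. The querying, matching, and blending stages would follow.

      import numpy as np
      from PIL import Image

      def doodle_segments(gif_path):
          """Return {(r, g, b): binary mask} with one entry per palette
          color; each mask is a query shape, and its color key is what
          gets reapplied to that segment's matched photo."""
          img = np.array(Image.open(gif_path).convert("RGB"))
          colors = np.unique(img.reshape(-1, 3), axis=0)
          return {tuple(c): np.all(img == c, axis=2) for c in colors}
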
  • Kevin Cisler, Brent Clark and Stephanie Scherer-Johnson
    Manipulating Images with Image Matting and Texture Transfer
    Our project is inspired by the artwork of Liu Bolin, a Chinese artist whose works involve making himself blend into the background of his images by painting himself. For our project we will create a program that replicates Liu's artistic style. To do this we will extract a person from an image using image matting and then insert them into a new image using texture transfer, thereby creating a novel image in the style of Liu's work.

  • David Delventhal and Xiaowei Wang
    Hatching Rendering Based on a Single Photo
    There exist many algorithms to create a non-photorealistic image from a photo. Many of these algorithms make the photo look like a painting or a cartoon. This project, however, gives the photo a hatched shading style. Existing techniques for hatching rendering either require the user to draw the hatching manually or require a 3D model on which the algorithm renders the hatching. This project focuses specifically on photos of architecture.
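
    A minimal sketch of tone-driven hatching in Python with NumPy (an assumed toolchain and a simple stand-in for the full rendering style): darker tones accumulate more, and crossed, stroke layers.

      import numpy as np

      def hatch(gray, levels=(200, 150, 100, 50), spacing=6):
          """gray: HxW uint8 intensity image. Each threshold a pixel
          falls below adds another diagonal stroke layer; alternating
          directions produce cross-hatching in the darkest regions."""
          h, w = gray.shape
          yy, xx = np.mgrid[0:h, 0:w]
          out = np.full((h, w), 255, np.uint8)
          for i, thresh in enumerate(levels):
              diag = (xx + yy) if i % 2 == 0 else (xx - yy)
              strokes = (diag % spacing == 0)
              out[(gray < thresh) & strokes] = 0
          return out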

  • Andrew Dibble
    Reading Devanagari
    Here I build upon the algorithm for reading Devanagari advanced in the Ingalls and Ingalls paper "The Mahabharata: Stylistic Study, Computer Analysis, and Concordance." While they developed an application capable of accurately reading hand-written Devanagari, I will use 3 or 4 different standard fonts. Also, because of time constraints and because hundreds of templates are necessary to cover the full breadth of Devanagari (so long as we augment the alphabet with templates for compound characters, instead of lexing the compound parts out programmatically), I will be dealing with a representative subset of the script. Instead of simply distinguishing between the "skin" and "bones" of a character template (which pixels are sometimes black and which are always black for a character across all fonts), I will assign probabilities to each pixel. This should be an improvement upon the Ingalls and Ingalls algorithm. This project may also handle simple additions to the Ingalls and Ingalls algorithm for lighting, rotation, and scale invariance.
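
    A sketch of the per-pixel probability idea in Python with NumPy (an assumed toolchain); binarization and segmentation of the page into glyphs are presumed handled elsewhere.

      import numpy as np

      def build_templates(samples_by_char, eps=1e-3):
          """samples_by_char: {char: (n, h, w) binary arrays across fonts}.
          Each template stores P(pixel is ink | character) instead of a
          hard skin/bones split."""
          return {c: np.clip(s.mean(axis=0), eps, 1 - eps)
                  for c, s in samples_by_char.items()}

      def read_glyph(glyph, templates):
          """glyph: (h, w) binary array. Return the maximum-likelihood character."""
          def loglik(p):
              return np.sum(glyph * np.log(p) + (1 - glyph) * np.log(1 - p))
          return max(templates, key=lambda c: loglik(templates[c]))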

  • Samuel Farrell and Kenny Preston
    Transforming Sketches into Vector Art
    For our final project in CS 534, we plan to use what we've learned about computing the SSD of an image and extracting edge information (as well as possibly some manipulation of gradients) to write a program that automatically takes a scanned sketch as input and outputs an image in the style of a smooth, vector-based illustration (figures 1A and 1B omitted in HTML version). Transforming a sketch into a usable image for web design, or really any design, has had to be done in Photoshop or some other professional image editing software since the medium was invented, yet the images rarely originate in the digital realm - most are still based on sketches. There are many potential applications for this, including cataloging: scientific researchers are often left with sketches that have been poorly transferred to a digital medium. Using software such as this, not only would there be no loss when digitizing sketch work, but there could actually be an enhancement.

    We will begin our program with a simple normalization of the pixel levels (we will first assume the input is grayscale, but after we get our pilot application working we may include RGB support), followed by a lowpass filter. Then we will locate the edges and use this information to smartly apply swatches of 3x3 templates we will create that represent all possible combinations of edges, but rendered with a vectorized look that will clean up any noise. We will blur the places not marked as edges, and possibly normalize the levels to keep them consistent with the edges. Eventually, we hope to add RGB support, which would use much more of the gradient information of the original image.
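
    A minimal sketch of those first stages in Python with OpenCV (an assumed toolchain, with Canny standing in for whichever edge detector we settle on); the 3x3 template pass would consume the resulting edge map.

      import cv2

      def clean_sketch(gray):
          """Normalize levels, lowpass to suppress scan noise, find edges."""
          norm = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
          smooth = cv2.GaussianBlur(norm, (5, 5), 1.0)
          edges = cv2.Canny(smooth, 50, 150)
          return smooth, edges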

  • Jean Forde and Charles Gorman
    Defocus Magnification
    We are going to complete our project on defocus magnification, giving images a shallower depth of field. Our implementation will be based on the "Defocus Magnification" paper by Soonmin Bae and Frédo Durand (Eurographics 2007).

  • Seth Foss and Peggy Pan
    Modifying Code for Optical Character Recognition
    In our project, we want to focus on optical character recognition (OCR): taking an image of some text and extracting characters from it. Some OCR MATLAB code is available on the internet, but it turns out that orientation matters when scanning; the code we tried failed when the text was upside-down. What we want to do is modify the system to recognize characters in the image that are not oriented in the normal fashion, possibly extending to slight warping of the image. Words should be recognized as long as they are read from left to right in horizontal lines. If time permits, we would also like to add features such as a decision tree to determine the language of the text.
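
    A minimal sketch of the orientation search in Python with NumPy (an assumed toolchain); ocr_score is a hypothetical callback (e.g., the number of dictionary words recognized) supplied by the underlying OCR code.

      import numpy as np

      def best_orientation(img, ocr_score):
          """Try the four 90-degree orientations and keep the one the
          OCR scoring function likes best."""
          candidates = [np.rot90(img, k) for k in range(4)]
          scores = [ocr_score(c) for c in candidates]
          return candidates[int(np.argmax(scores))]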

  • John Fritsche and Hans Werner
    FACE-IT: Face Authoring Compiler Engine - Interactive Tool
    We intend to implement an application that would allow the user to create a plausible human face using components from a library of facial features (eyes, noses, mouths, etc). Using a web interface, the user would select a base facial structure and all of the desired facial features; a MATLAB algorithm would then automatically combine the components and blend them together into a cohesive human face. This type of functionality could be applied to a number of applications, both for entertainment and practical purposes. One amusing use of this application would be to swap the facial components of two faces, for example swapping the eyes of two celebrities (the first Google search result for "Bieber Buscemi eyes" gives a humorous example). A more practical application would be to use this system to construct a more realistic "police sketch" of a criminal by selecting the closest facial features.
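
    A minimal sketch of the blending step in Python with OpenCV (an assumed toolchain, using a feathered alpha mask as a simple stand-in for more sophisticated blending); the patch, mask, and placement are assumed to come from the feature library and the user's selections.

      import cv2
      import numpy as np

      def blend_feature(base, patch, mask, x, y):
          """Paste a facial-feature patch onto a base face with a
          feathered mask so the seam fades out. base: HxWx3 uint8;
          patch: hxwx3 uint8; mask: hxw uint8; (x, y): top-left corner."""
          h, w = mask.shape
          alpha = cv2.GaussianBlur(mask.astype(np.float32) / 255.0, (21, 21), 7)
          alpha = alpha[..., None]
          roi = base[y:y+h, x:x+w].astype(np.float32)
          base[y:y+h, x:x+w] = (alpha * patch + (1 - alpha) * roi).astype(np.uint8)
          return base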

  • Keith Gould and James Stoffel
    Pencil Sketch Simulation
    The music video for A-ha's "Take On Me" features an animation style that resembles pencil strokes. These effects were probably achieved by drawing each frame individually. Our process will allow this effect to be achieved much faster by using software. The process utilizes gradient edge detection followed by some filter operations to simulate the characteristics of pencil strokes. Finally, smoothing is applied for enhanced realism.
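
    A minimal sketch in Python with OpenCV (an assumed toolchain) using the common invert-blur-dodge pipeline, which approximates pencil strokes; the gradient-based stroke simulation described above would differ in detail.

      import cv2

      def pencil_sketch(img_bgr):
          """Grayscale, invert + blur, then a color-dodge blend: edges
          stay dark while flat regions wash out like paper."""
          gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
          blurred = cv2.GaussianBlur(255 - gray, (21, 21), 0)
          return cv2.divide(gray, 255 - blurred, scale=256)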

  • Daniel Greenheck
    Constructing Normal Maps from 2D Textures for Improved Realism in 3D Modeling
    The goal of this project is to apply a set of filters to a texture and in turn generate a deviated normal at each pixel. This is often called a "normal map" or a "bump map." This data is very useful in rendering 3D geometry because it allows you to alter the calculation of the light per pixel. Imagine a 3D cylinder displayed rotating about its axis, with the texture of stone wrapped around the surface of the cylinder. The features in the texture (stones, cracks, etc.) have their own 3D geometry in real life, but that information was lost when a 2D picture was taken. Because that information was lost, the stone texture on the cylinder looks very flat and unrealistic. Supplying a normal map gives the renderer extra information about the normal at each pixel. The result is a much more appealing representation of the stone texture, as long as it is not viewed at shallow angles. To extract a normal map from a texture, the image is first converted to a grayscale intensity image. A Sobel filter is then applied to extract gradient information about the texture. Since there are features of varying size within the texture, a Gaussian pyramid must be constructed to extract features in the low, mid and high frequency bands. Since it is difficult to know how much each frequency band should contribute to the normal map, the user will specify a weight for each. Once the gradient information is gathered, it is processed to extract the normals, which are represented in an RGB image; each color channel corresponds to the normal's component along one of the XYZ axes. This RGB image is passed to the renderer, which analyzes each pixel and applies the change in the normal. This is the basis for the algorithm. I would like to improve upon it to make it more intelligent when creating the normal map, e.g., determining which edges are actually "edges" and which ones are not.
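
    A condensed sketch of the pipeline in Python with OpenCV and NumPy (an assumed toolchain); the per-band weights correspond to the user-specified frequency contributions.

      import cv2
      import numpy as np

      def normal_map(gray, weights=(0.5, 0.3, 0.2), strength=2.0):
          """Blend Sobel gradients from successively coarser pyramid
          levels, then pack each pixel's unit normal [-dx, -dy, 1/strength]
          into an RGB image."""
          gray = gray.astype(np.float32) / 255.0
          size = (gray.shape[1], gray.shape[0])
          dx = np.zeros_like(gray)
          dy = np.zeros_like(gray)
          level = gray
          for w in weights:                 # one pyramid level per weight
              dx += w * cv2.resize(cv2.Sobel(level, cv2.CV_32F, 1, 0, ksize=3), size)
              dy += w * cv2.resize(cv2.Sobel(level, cv2.CV_32F, 0, 1, ksize=3), size)
              level = cv2.pyrDown(level)
          n = np.dstack((-dx, -dy, np.full_like(dx, 1.0 / strength)))
          n /= np.linalg.norm(n, axis=2, keepdims=True)
          return ((n * 0.5 + 0.5) * 255).astype(np.uint8)   # map [-1, 1] to RGB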

  • Kyle Halbach, Brian Hook and Mike Treffert
    Photomosaics
    From millions to one - this was the idea that became Robert Silvers's claim to fame. During his time at MIT, Silvers gained worldwide acclaim for his ability to mold countless pictures into one, a method known as photomosaic. This relatively new style of art utilizes a technique which transforms an input image into a grid of thumbnail images while preserving its overall appearance. The typical photomosaic algorithm scans through a large database of images for pictures that closely match each block of pixels in the main image. In this project, we have created a web-based application that will transform an uploaded photo into a photomosaic, while acknowledging a few user-defined settings that will influence the style used during the creation process.
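
    A minimal sketch of the core matching loop in Python with NumPy (an assumed toolchain), presuming the database images are pre-resized to the tile size.

      import numpy as np

      def photomosaic(target, thumbs, tile=16):
          """target: HxWx3 float array; thumbs: (n, tile, tile, 3) float
          array. Each block of the target is replaced by the thumbnail
          with the smallest sum of squared differences."""
          h = (target.shape[0] // tile) * tile
          w = (target.shape[1] // tile) * tile
          out = np.empty((h, w, 3), target.dtype)
          for y in range(0, h, tile):
              for x in range(0, w, tile):
                  block = target[y:y+tile, x:x+tile]
                  ssd = ((thumbs - block) ** 2).sum(axis=(1, 2, 3))
                  out[y:y+tile, x:x+tile] = thumbs[np.argmin(ssd)]
          return out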

  • En Tao Ko and Gabe Stanek
    Goldmember
    We plan to implement a series of manipulations to an image with the end goal of making an object or person in an image look like they are made of gold. We will do this by applying various filters to the input image. These filters include:
    1) Gaussian filter to blur the initial image
    2) A filter to increase the contrast of light and dark
    3) A pass that casts pixels within a specified area to gold; the darker the pixels are, the darker the gold color will be.
    Through this process, we hope to end with an image that appears to have a gold object in it.
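
    A rough sketch of the three filters in Python with OpenCV (an assumed toolchain); the mask marking the region to gild and the particular gold tone are our own placeholder choices.

      import cv2
      import numpy as np

      def goldify(img_bgr, mask):
          """mask: HxW bool array of the user-specified area."""
          blurred = cv2.GaussianBlur(img_bgr, (5, 5), 0)                 # 1) blur
          contrast = cv2.convertScaleAbs(blurred, alpha=1.4, beta=-40)   # 2) contrast
          gray = cv2.cvtColor(contrast, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255
          gold = np.array([30, 180, 230], np.float32)    # gold tone in BGR
          out = contrast.copy()
          out[mask] = (gray[mask][:, None] * gold).astype(np.uint8)  # 3) darker pixel, darker gold
          return out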

  • Wes Miller
    Expression Detection in Video
    Expression detection is useful as a non-invasive method of lie detection, as well as for predicting behavior. This program will implement expression detection to identify both micro- and macro-expressions in video footage, and to determine whether a detected expression is genuine or consciously produced by the subject.

  • William Pelrine and Borui Wang
    A Solution to Automatic Color Calibration among Different Devices
    The parameters defining the color output of a screen, such as the gamma values for red, green and blue, vary among manufacturers as well as among the monitors they make. To simplify and standardize this, a computer's operating system defines the color preferences of the screen by querying an ICC profile based on a global standard defined by the ICC. We observed, however, that with the same ICC color profile loaded on two computers with different screens, the color perceived by the human eye could still differ due to physical hardware differences between the screens. The color preference of a screen is crucial for users of software applications that involve photographic reproduction. Therefore, the need for screen color calibration arises from users who require their work to look similar in different hardware contexts, such as photo and video editors. The current software solutions for screen color calibration not only require a professional background and an understanding of color spaces, but also require purchasing additional expensive hardware to work with the software, such as Spectracal's CalMan, X-Rite EyeOne, and Colorvision Spider, with costs ranging from $299 to $3000. Our goal is to provide a simpler, cheaper solution that gives acceptable screen color calibration results for amateur users using only their own camera, display and possibly a printed color board. Our implementation includes:
    1) Writing a program that decodes, encodes and extracts information from the ICC binary file according to the tags defined by the ICC.
    2) Finding the relationship between the RGB color space and the color space used by the ICC, and determining the correlation of brightness and gamma values of all color channels in these two color spaces.
    3) Writing a MATLAB program that measures different image inputs from two screens and provides an offset in the ICC color space to map one screen's color to another.
    4) Writing a MATLAB program that takes differences between cameras into consideration.
    Note that step 4) is an optimization for our project, and it is possible that it will not be implemented in the final submission.
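
    A minimal sketch of the gamma-fitting idea behind steps 2) and 3) in Python with NumPy (our own illustration, not the ICC machinery): fit each channel's gamma from camera measurements of displayed test patches, then re-encode one screen's values to match the other.

      import numpy as np

      def fit_gamma(levels, measured):
          """levels: displayed values in (0, 1]; measured: normalized
          camera readings of the same patches. Least-squares fit of
          measured = levels**g in log space; run once per RGB channel."""
          mask = (levels > 0) & (measured > 0)
          return np.polyfit(np.log(levels[mask]), np.log(measured[mask]), 1)[0]

      def remap_to_match(img, g_this, g_target):
          """Raising x**(g_target/g_this) to g_this emits x**g_target,
          so this screen reproduces the target screen's output."""
          x = np.clip(img.astype(np.float32) / 255.0, 0.0, 1.0)
          return (255.0 * x ** (g_target / g_this)).astype(np.uint8)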

  • Scott Rick
    Photo Tourism using Video as Input
    My idea for my final project is to enable the use of video as input for photo tourism systems such as Microsoft's Photosynth or UW's PhotoCity game. I expect to be able to take different video formats and extract single photo frames to use as the normal input for photo tourism programs. My observation is that people often take videos of interesting tourist attractions in addition to photos. Breaking these videos into still shots will provide many extra pictures of a given scene from which to construct a 3D point cloud. People generally take only one or two still photos of a landmark, but a video clip should be able to provide many still images from many viewpoints. I expect that part of the challenge of my project will be to clean up motion blur and noise, as well as to compensate for the lower quality of video stills compared to photographs. The final result I am aiming for is a program that takes a video file as input and outputs a Photosynth photo tourism scene.
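
    A minimal sketch of the frame extraction and blur filtering step in Python with OpenCV (an assumed toolchain), using Laplacian variance as a standard sharpness proxy; the surviving frames would feed the photo tourism system.

      import cv2

      def extract_sharp_frames(video_path, step=10, blur_thresh=100.0):
          """Sample every step-th frame; discard motion-blurred stills."""
          cap = cv2.VideoCapture(video_path)
          frames, i = [], 0
          while True:
              ok, frame = cap.read()
              if not ok:
                  break
              if i % step == 0:
                  gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                  if cv2.Laplacian(gray, cv2.CV_64F).var() > blur_thresh:
                      frames.append(frame)
              i += 1
          cap.release()
          return frames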

  • Chad Sebranek and Mike Zwaska
    Social Photography: Making Connections through Pictures of Places
    The number of people one knows is very large, yet it is only possible to stay current with the lives of a small group. With the existence of Facebook, that group has become much larger, and it is easier to stay up-to-date with more and more people. Still, there are hundreds of acquaintances and friends that people do not keep up with, and these people could share similar experiences and have untapped useful information. Thus, the goal of this project is to expose these experiences and allow novel information to be extracted through the use of social photography. The common use of smart phones makes this program applicable to a mobile platform; however, it will be prototyped on a PC. The basic idea is to allow a user to take a picture or pictures of a commonly visited object or scene. With that image, the program will access the user's Facebook account and come up with a list of his or her friends (or friends of friends) who have been to the same place. The list of friends will be interactive, such that the user can select the matched image(s). Given a source image, the algorithm will first quickly reduce the set of all the user's friends' pictures to the possible matches, since the initial number of pictures will be large and the cost of determining a match is high. It then uses image-based feature detection to determine if there is a good match between the source image(s) and any of the remaining pictures acquired from the user's friends. The user has the option of reviewing the set of images to reduce incorrect matches. Finally, the determined matches will be linked back to the friends who took them, and the name-image links will then be output to the user. Finding people who have been to the same places as the user will provide a list of people the user could contact to learn more about commonly visited places and their sights. All in all, by clustering similar scenic photos across Facebook friends, this program will provide an extra sense of social connectedness to each user.

  • Jacob Stoeffler
    Photosaic
    Photosaic is an automated photomosaic generator with a web interface. The user provides the main photograph that the output image will resemble. They can then either provide the input images to be tiled or enter a keyword that will be used to find relevant images through online image services such as Flickr and Google Images. The user will also be able to choose the tile size and the output image size. The program will then generate the photomosaic and allow the user to download it. It will most likely use a combination of PHP and JavaScript to search the web for appropriate images and place them properly in the output image by minimizing the sum of squared differences for each tile location. As the project progresses, more features will likely be implemented, but this is the starting point.

  • Will Strinz
    Practical 3D Scanning using Structure from Motion
    Automatically creating three-dimensional models of objects from two-dimensional input data is a common problem in a number of industrial design settings. Most systems available today require expensive specialized hardware such as time-of-flight detectors or stereo cameras. Using structure-from-motion algorithms, input from a consumer-grade web camera can instead be used to reconstruct a reasonably accurate 3D model of an object, which can be further modified using existing editing software or exported for printing on an RPM (Rapid Prototyping and Manufacturing) printer.

  • Allie Terrell
    Social Photography on Facebook
    Facebook has offered users a photo sharing application for many years now; however, it has taken off in the past few years. As of May 2009, 220 million photos were added each week, with over 550,000 images served per second during peak hours. Facebook has undoubtedly become a popular way for friends to share their photos with others. But what kinds of photos dominate Facebook? Although cameraphones are now the most popular cameras on Flickr, does that hold true on Facebook as well? This research explores why people share photos on Facebook, and analyzes how the current interface could be augmented to allow for more cohesive sharing across friends to better propagate information.

