| CS 766 | Computer Vision | Fall 2005 |
Homework #2: 2D Skeleton Shape Representation using Shock Graphs
Due: Thursday, October 13
One useful way of modeling two-dimensional (2D)
shapes is by a skeleton description, initially studied
by H. Blum with the medial axis transform (MAT).
In this assignment you are to implement a method for computing a generalization of the
MAT called a shock graph. The algorithm you are to implement is given
in Section 3.3 of the paper
"Robust and Efficient Skeletal Graphs"
by P. Dimitrov, C. Phillips and K. Siddiqi, Proc. Computer Vision and Pattern
Recognition Conf., Vol. 1, 2000, 417-423.
In order to reduce the amount of coding necessary, I strongly suggest implementing the
algorithm in
Matlab, including
many relevant functions in the
Image Processing Toolkit.
The steps of the algorithm that you are to implement are now summarized.
Step 1: Compute the Euclidean Distance Transform
The first step of the shock graph construction
algorithm is to compute the (approximate)
Euclidean Distance Transform (EDT).
Do this in Matlab by first reading in your image file using the function
imread.
imread will read
most image file formats. Be sure to also read about
the various image classes that are supported in Matlab such as uint8,
uint16 and double, and carefully keep track of the image class
you are using at all steps of processing in order to not lose information.
Next, compute the EDT
using the Image Processing Toolkit
function
bwdist
to produce a real-valued Euclidean distance
transform array D.
Note: bwdist assumes that object pixels have value 0 and background
pixels have any non-zero value.
In order to reduce the effects of boundary aliasing due to the
digital boundary of the object, smooth D slightly
by blurring it using a Gaussian-shaped kernel.
Create this kernel using the Matlab function
fspecial.
For example, to create a 3 x 3 kernel with sigma = 0.5, do
G = fspecial('gaussian',[3 3],0.5).
Experiment with at least two different kernel sizes and values of
the parameter sigma to obtain the best final results you can.
(For example, use a 3 x 3 kernel with sigma=0.5 and use a 5 x 5 kernel
with sigma=1.0.)
Specify the final values you used in your documentation.
Convolve your computed kernel with D using the Matlab
function
imfilter.
The result should be a real-valued 2D array SD
containing smoothed values
of D for every pixel.
Create a separate, intermediate-result image that contains 255 (white) at each
background pixel and the SD value, rounded to the nearest
integer, at each object pixel. Linearly scale the SD values so that the
largest value becomes 0 (black) and an SD value of 0 becomes 255 (white).
You can do this in Matlab using the functions
stretchlim and
imadjust.
(The reason for inverting the background gray level is so that the printed images
look nicer.)
Save this intermediate-result image in an image file using
imwrite.
Print out and turn in this image.
(If you prefer, you can create, display and print this image within Matlab and
never create an intermediate image file.)
Step 2: Compute the Gradient Vector Field
Using the real-valued array SD,
the second step of the Dimitrov et al. shock
graph algorithm is to compute the gradient vector field, delD
from SD. One way to compute delD would be to
use a simple finite difference approximation, but this blurs across singularities,
which is precisely where the skeletal points lie.
(If you want to try out this method for curiosity and comparison purposes,
you can use the Matlab function
gradient.)
For this assignment you are to
compute delD as the unit vector -PQ/||PQ|| where P
is a given object pixel
and Q is (one of) its nearest background point(s).
Q can be computed at the
same time as the EDT using bwdist as follows:
[D,L] = bwdist(IMAGE) Used in this way, L is an array of
type double of the same size as IMAGE
specifying for each pixel (one of) its closest background pixel(s). Use the
value in L (after converting it from its raster-scan index value to
image coordinates) to define the point Q needed to compute
delD. Create an image of this vector field using Matlab's
quiver
function and print a copy of this image.
Step 3: Compute the Flux
The goal of this step is to compute the flux (also known as the divergence)
defined by Eq. (3) in the paper by Dimitrov et al. This is done as
described in the algorithm in the paper
by summing over all eight neighbors of each
point the values computed by the
inner product of delD and a unit outward normal
from the center point to the current 8-neighbor.
Since in our case delD is undefined for background pixels, modify
this step to compute instead the average flux over those 8-neighbors that
are in the object.
Note: You may want to use the Matlab function
divergence
and compare
your result with theirs; your code should use your own implementation of
this step, however.
Hint: For those of you who are a little rusty on your linear algebra, given
two vectors in R2, u = [u1, u2]T and v = [v1, v2]T,
the cosine of the angle between u and v
is equal to the inner product
(aka dot product) of the unit vectors u/||u|| and v/||v||.
The length (or norm) of a vector
u = [u1, u2]T
is defined as
||u|| = sqrt(u12 + u22).
Given two vectors, their inner
product is defined as u v = <u, v> =
uTv = u1v1 + u2v2,
which is a scalar value. In this assignment u and v correspond to the
vectors Ni and delD(Pi).
Flux(P) is the average of these (up to eight) scalar values.
Step 4: Thin
The final step is to compute the skeletal points by thinning, i.e., removing
object points in order by their flux values. Strong negative flux corresponds
to skeletal points. Thinning must not remove endpoints or change the topology
at any point during this process. The method for doing this is given in
the algorithm in the paper. You can create a heap class in Matlab for this
part.
Create an output image where
background pixels are white (255), object pixels are middle-gray (128),
and the skeleton pixels are black (0). Print and hand in this image.
Test Images
There are a variety of test images in the directory
http://www.lems.brown.edu/~dmc/main.html
Be sure to check the images you use to determine what intensity value is used for the
foreground object points because in some cases it might be 0, 1, or 255.
What to Hand In
Hand in
- a hardcopy listing of your
well-documented source code
- a description of any
parameters you used and their values
- hardcopy images of the
blurred EDT, the vector field, and the final skeleton results for at least
the following three test images:
~cs766-1/public/images/hw2/europeanhare.gif
~cs766-1/public/images/hw2/girl.gif
~cs766-1/public/images/hw2/brain_WM_seg_cor1.pbm
The third test image above is from a thresholded MRI image of a human brain's
white matter. It contains some very small regions which you can ignore (or edit out).
Thus you will hand in nine (9) images (or more if you want to show
some additional results).
For More Information
The following papers may also be consulted for other details related to
this algorithm.
- F. Leymarie and M. Levine,
"Fast Raster Scan Distance Propagation on the Discrete Rectangular Lattice,"
Proc. Computer Vision, Graphics and
Image Processing: Image Understanding 55, 1992, 84-94.
- K. Siddiqi et al.,
The Hamilton-Jacobi Skeleton, Proc. Int. Conf. Computer Vision, 1999.
- K. Siddiqi et al.,
The Hamilton-Jacobi Skeleton, Int. J. Computer Vision 48(3), 2002, 215-231.
- S. Bouix and K. Siddiqi,
Divergence-based Medial Surfaces, Proc. European Conf. Computer Vision, 2000.
- K. Siddiqi et al.,
Geometric Shock-Capturing ENO Schemes for Subpixel Interpolation, Computation, and Curve Evolution,
Graphical Models and Image Processing 59(5), 1997, 278-301.
For even more information, see K. Siddiqi's
homepage
and work by the
Brown University Vision Group.