Computer Vision: There may not be a Deep Blue machine for computer vision
Chaman Singh Verma
Acknowledgment: Images were downloaded from Google Images, and the
polygonal model data is from Cyberware.
There are many interesting problems in the computer vision field, but I
don't have answers to any of them. Some of the simple questions that I
frequently ask, in the hope that someday I or someone else will find a
better solution, are:
Idea#1: Bio-information using a Mobile Digital Camera
Imagine you are strolling in a botanical garden and are amazed by the
beauty of some flowers. Unfortunately, there are no signboards giving the
names of those flowers, but you have a mobile phone with a digital camera.
You are curious to know the name of a flower so that you can buy seeds
online. Luckily, a nearby university's biology department has a large
database of flowers and plants and supports a specialized search engine
for them.
Here is the way to go: put a healthy flower, leaf, seed, or fruit on clean
white paper (for color balancing), take photos, and send the images to the
flower search engine along with the following additional information to
reduce the complexity of the search problem.
- Region: Your location, because certain plants grow only in specific
regions.
- Month: The month in which the shot was taken. This information is used
to narrow down the search for seasonal plants.
- Plant or Tree: Specify whether the flower or leaves are from a plant or
a tree.
- Bounding box: Specify the actual width and height of the flower and
leaf.
Many well-known techniques, such as contour matching, invariant shape
descriptors, and color and texture analysis, will be required to solve
this particular problem. If some ambiguities are found, a short list of
probable candidates is given to the user and the search engine poses some
more questions.
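As a rough illustration of how the extra information (region, month, plant
or tree, size) could shrink the search before any image matching is done,
here is a minimal Python sketch. The record fields, the `shortlist`
function and the three database entries are all hypothetical and made up
for this example; a real search engine would combine such a filter with
the shape, color and texture matching mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantRecord:
    name: str
    regions: frozenset       # regions where the plant is known to grow
    bloom_months: frozenset  # months (1-12) in which it flowers
    habit: str               # "plant" or "tree"
    flower_width_cm: tuple   # (smallest, largest) expected flower width


def shortlist(records, region, month, habit, flower_width_cm):
    """Return names of records consistent with the user's metadata."""
    matches = []
    for rec in records:
        lo, hi = rec.flower_width_cm
        if (region in rec.regions
                and month in rec.bloom_months
                and habit == rec.habit
                and lo <= flower_width_cm <= hi):
            matches.append(rec.name)
    return matches


# Made-up entries for illustration only; not real botanical data.
DATABASE = [
    PlantRecord("flower A", frozenset({"north"}),
                frozenset({3, 4, 5}), "plant", (2.0, 6.0)),
    PlantRecord("flower B", frozenset({"north", "south"}),
                frozenset({4, 5, 6}), "tree", (1.0, 3.0)),
    PlantRecord("flower C", frozenset({"south"}),
                frozenset({4}), "plant", (2.0, 5.0)),
]

candidates = shortlist(DATABASE, region="north", month=4,
                       habit="plant", flower_width_cm=4.0)
print(candidates)
```

If more than one candidate survives the filter, that short list is what
would be shown to the user together with follow-up questions.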
Idea#2: Painting, Cartoon or Photograph?
We humans have great cognitive capabilities and can tell with certainty
whether a given image is a painting, a cartoon sketch or a photograph. We
can say with great confidence whether "the picture is real or not". But
what is this "real", and what goes on in our brains to classify images
with such aplomb? Can we express or quantify it mathematically? Does the
"real" thing have larger Shannon entropy, or some statistical correlations
that let us classify images instantly, or do we calculate and assemble
something in higher dimensions to get this answer? From a computer vision
perspective, my curiosity is: can a computer classify images with high
probability (if not with the certainty of human beings) and, if yes, can
the algorithmic efficiency compete with human perceptual skills?
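To make the Shannon entropy question concrete, the sketch below computes
the entropy (in bits) of an image's intensity histogram, H = -sum(p_i *
log2(p_i)), where p_i is the fraction of pixels with intensity i. The two
pixel lists are toy stand-ins invented for this example, and nothing here
shows that entropy alone separates paintings, cartoons and photographs; it
is only one candidate statistic that could be measured.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Entropy in bits of the intensity histogram of a flat pixel list."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())


# Toy data: a "cartoon" with two flat tones and a "photo" whose sixteen
# pixels all have different intensities.
flat_cartoon = [0] * 8 + [255] * 8
textured_photo = list(range(16))

print(shannon_entropy(flat_cartoon))    # two equally likely tones: 1 bit
print(shannon_entropy(textured_photo))  # sixteen equally likely: 4 bits
```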
Idea#3: Automatic Shape Completion of Angiography Images and 3D Tubular
Surface Reconstruction
Patient-specific numerical simulation of hemodynamics requires a close
approximation of the actual blood vessels rather than CAD-generated
geometrical models. Manual segmentation and 3D surface reconstruction from
medical images take an enormous amount of time, and therefore automatic or
semi-automatic procedures are often used in this process. There are some
inherent problems with modeling vessels from images: (1) many images have
a poor contrast ratio and (2) 2D images are cluttered with numerous
overlapping veins. Fortunately, there are some heuristics that may make
this 3D reconstruction easier: (1) vessels are smooth with no sharp
corners, (2) cross-sectional shapes are generally tubular, and (3)
multifurcations are rare in biological systems. Since medical imaging is a
commercially competitive field, I am sure that this problem has been
studied by various research groups, but what I am looking for are answers
to the following: (1) How reliable are the automatic methods (if they
exist)? (2) What tools are required for a semi-automatic process to reduce
the 3D surface reconstruction time from days to hours? (3) How are the
generated surfaces verified and certified before neurosurgeons use them in
practice?
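Heuristic (1) above, that vessels are smooth with no sharp corners, could
for instance be turned into a simple test on a traced centerline. The
sketch below measures the largest turning angle between consecutive
segments of a 2D polyline and rejects candidates that bend too sharply.
The function names and the 45-degree threshold are assumptions chosen for
illustration, not values taken from any clinical tool.

```python
import math


def max_turning_angle(points):
    """Largest angle in degrees between consecutive polyline segments."""
    worst = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0   # incoming segment
        bx, by = x2 - x1, y2 - y1   # outgoing segment
        # atan2(cross, dot) gives the signed angle between the segments.
        angle = abs(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
        worst = max(worst, math.degrees(angle))
    return worst


def looks_like_vessel(points, max_angle_deg=45.0):
    """Accept a centerline only if it never turns by more than the limit."""
    return max_turning_angle(points) <= max_angle_deg


smooth = [(0, 0), (1, 0), (2, 0.2), (3, 0.5)]   # gentle bend
sharp = [(0, 0), (1, 0), (1, 1)]                # right-angle corner

print(looks_like_vessel(smooth))
print(looks_like_vessel(sharp))
```

A check like this could help a semi-automatic tool flag places where
overlapping veins have been joined into one implausible path.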
Idea#4: Interactive polygonal reconstruction of 3D shapes from multiple
images
A polygonal representation is often required for interactive geometry
exploration. Large 3D scanners that can provide very high-resolution
polygonal models are now commercially available, but they are both
expensive and non-portable. Nowadays digital cameras and camcorders are
quite inexpensive, but achieving a high-resolution representation with
these gadgets may require a large number of images from different views.
Perhaps the most important thing is to capture salient features (such as
ridges and valleys) from a suitable direction so that 3D surface
reconstruction algorithms can use them to produce a perceptually
acceptable model. It is also desirable that the entire process be done in
reasonable time and with limited resources. Interaction with the user to
fill in missing parts or to remove topological ambiguities may also be
essential. The unanswered questions are: (1) How many images are required?
(2) Can we really generate a polygonal model of which even a child can
say: "Yes, it looks very similar to the one I saw in the temple"?
Idea#5: Automatic Detection of River Paths in Google Earth Images
Recently I was manually tracking the paths of the rivers Ganges and
Brahmaputra with Google Earth. Although it took great patience and time, I
was able to complete it and verified the paths against other available
geographic information. I was also able to find out that the Ganges starts
somewhere near Gangotri in the Himalayas and the Brahmaputra somewhere on
the Tibetan Plateau. It took hours to identify the main path of a river
that has a vast number of small and large tributaries. The tracking was
more time-consuming than identifying road systems because of the
meandering nature of rivers. I think that some automation is perhaps
possible for identifying river paths from Google or aerial images, with
some ideas taken from automatic road detection systems. Such an automatic
process could be extremely useful for GIS applications, urban planning,
and flood control.
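One small piece of such automation can be sketched under a strong
assumption: that a binary water mask (1 for water pixels, 0 for land) has
already been extracted from the image, which is the hard vision step and
is not shown here. Given the mask, a breadth-first search can follow a
meandering channel from a source pixel to a mouth pixel. The function
name, the mask and the two end points are invented for this example.

```python
from collections import deque


def trace_path(water, start, goal):
    """Shortest 8-connected path through water cells, or None."""
    rows, cols = len(water), len(water[0])
    parent = {start: None}   # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and water[nr][nc] and (nr, nc) not in parent):
                    parent[(nr, nc)] = cell
                    queue.append((nr, nc))
    return None


# Toy mask: a channel that meanders from the top-left corner downwards.
MASK = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

print(trace_path(MASK, start=(0, 0), goal=(3, 2)))
```

Separating the main channel from its tributaries would need extra
information, such as channel width, which this sketch ignores.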
Idea#6: Full or partial similarity detection in 3D objects
Problem: Very often 3D polygonal models are modified by affine
transformations, by mesh simplification to reduce complexity, and
sometimes by clipping or implanting something on the object. Therefore,
identifying the similarity between two objects both locally and globally
is important. Establishing correspondence between salient feature points,
lines and contours is essential to solve the problem. For partial
similarity detection, a consistent segmentation of the model is performed
and correspondence between segment pairs is established. Invariants such
as surface ridges and valleys are used for both data reduction and
identifying salient features.
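As a much simpler stand-in for the invariants discussed above, the sketch
below compares two sets of salient 3D points by their sorted pairwise
distances, scaled so that the largest distance is 1. This signature does
not change under rotation, translation, uniform scaling or reordering of
the points, but it is not invariant under a general affine map (shear or
non-uniform scaling), and it does not handle mesh simplification or
clipping. A matching signature is only a necessary condition for
similarity. All names and point sets are hypothetical.

```python
import math
from itertools import combinations


def shape_signature(points):
    """Sorted pairwise distances, scaled so the largest equals 1."""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    largest = dists[-1]
    return [d / largest for d in dists]


def signatures_match(a, b, tol=1e-6):
    """True if both point sets give the same signature within tol."""
    sa, sb = shape_signature(a), shape_signature(b)
    return (len(sa) == len(sb)
            and all(abs(x - y) <= tol for x, y in zip(sa, sb)))


def transform(p):
    """Rotate 90 degrees about z, scale by 2, then translate."""
    x, y, z = p
    return (-2 * y + 5, 2 * x - 1, 2 * z + 4)


original = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
moved = [transform(p) for p in original]
other = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]

print(signatures_match(original, moved))
print(signatures_match(original, other))
```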
Application: A large number of precious statues and artifacts are smuggled
out of India every year, and few mechanisms are available for law
enforcement agencies to find out whether an object is part of the national
treasure. Suppose there were a national database of such objects: if
officials could take a few photographs of an object and query the national
antique/historical database for similar items, it could help in combating
the menace of smuggling.