Copyright (C) 2010 Brandon M. Smith and Li Zhang

Author:       Brandon M. Smith, Dept. of Computer Sciences, UW-Madison
Email:        bmsmith@cs.wisc.edu
Personal web: http://www.cs.wisc.edu/~bmsmith
Project web:  http://www.cs.wisc.edu/~lizhang/projects/mvstereo/cvpr2009/

Description:
This is a C++ implementation of the multi-view stereo matching algorithm
described in:

    Brandon M. Smith, Li Zhang, Hailin Jin. Stereo Matching with
    Non-parametric Smoothness Priors in Feature Space. IEEE Computer
    Society Conference on Computer Vision and Pattern Recognition,
    June 2009.

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.

You should have received a copy of the GNU General Public License along
with this program. If not, see <http://www.gnu.org/licenses/>.


QUICK START
===========

The source code (located in the src folder) can be compiled in Windows using
Visual Studio 2008 or in Linux using the make command. An executable called
mvstereo (Linux), or mvstereo.exe (Windows), will be created in the bin
folder.

The user supplies the program with all parameters and input/output filenames
via an arguments text file. To run the program on the command line, type

    Linux:   mvstereo -af <arguments file>
    Windows: mvstereo.exe -af <arguments file>

Two example arguments text files exist in the bin directory:

    af_full_example.txt    contains all possible parameters.
    af_simple_example.txt  contains the minimum parameters required to run
                           the program, with all remaining parameters set
                           to default values.

The above two files will produce the same output using the example input
data in the data directory. All results will be saved to data/results.
The directory in which results are saved must exist before the program
executes.


DETAILED INSTRUCTIONS
=====================

At a minimum, the user must specify the following in the arguments file:

  * N input image filenames (where N is greater than or equal to 2),
  * N camera parameter filenames,
  * a set of N-1 camera pairings that span the set of camera viewpoints,
  * the inverse of the min and max depths, and
  * N output disparity/depth map filenames.

Input image filenames (only pgm, ppm, and pfm formats supported):

    -imFiles N ...

Camera parameter filenames:

    -camFiles N ...

Camera pairings:

    -camPairs N-1 ...

The inverse of the min and max depths:

    -zMinInv <...> -zMaxInv <...>

Output disparity/depth map filenames:

    -dptSmoothFiles N ...

Additionally, the following parameters can be set, where <...> is a number.
The default value for each of the parameters is given in
bin/af_full_example.txt and also in src/data.h. A default value will be used
if the parameter is not given in the arguments file (as in
bin/af_simple_example.txt).

    -runMstnbr <...>
    -runMvstereo <...>
    -runNoiseremoval <...>
    -runTrilatfilt <...>
    -numLayers <...>
    -dptScale <...>
    -initDptValue <...>
    -birchfield <...>
    -lambda <...>
    -lambdaVs <...>
    -tauSm <...>
    -tauPh <...>
    -sigmaX <...>
    -sigmaC <...>
    -sigmaD <...>
    -maxNbrRad <...>
    -numMst <...>
    -dataCostType <...>
    -maxIter <...>
    -nrThresh <...>
    -numThreads <...>
    -nbrEdgeFiles N ...
    -dptFiles N ...

The -birchfield parameter refers to the stereo technique developed by Stan
Birchfield and Carlo Tomasi
(http://www.ces.clemson.edu/~stb/research/stereo_p2p/).

There are four steps indicated by the -run<step> parameters (where <step> is
Mstnbr, Mvstereo, Noiseremoval, or Trilatfilt); they are executed in the
following order: MST neighborhood computation, multi-view stereo,
disparity/depth map noise removal, continuous refinement (using a trilateral
filter over the disparity/depth map). Any step(s) can be omitted (by setting
the corresponding -run<step> parameter to 0).
However, for the program to work correctly, the results of any omitted
step(s) must already have been computed, and the filenames of these results
must be given in the arguments file so that they can be read from disk.

The -dptScale parameter must be set so that -numLayers * -dptScale is less
than or equal to 256 because the result is saved as an 8-bit image.

The -lambdaVs parameter should be set very high, e.g., 100000, so that
violating the visibility constraint incurs a very large penalty.

The -maxNbrRad parameter determines the size of the search window around
each pixel in the MST neighborhood computation. It should be set based on
-sigmaX. Setting -maxNbrRad to 1 will generate results similar to the
standard graph cuts stereo algorithm. A large -maxNbrRad, e.g., > 40, will
greatly slow down the MST neighborhood computation.

The -nrThresh parameter is used in the disparity/depth map noise removal
step. For each pixel in the disparity/depth map, it indicates the number of
neighboring pixels that must be at the same disparity/depth as the center
pixel for the center pixel to avoid being overwritten by the local mode
average disparity/depth value. For example, if a single pixel has a much
larger disparity value than any of its neighbors, it will be overwritten by
the local mode average disparity value.

The -numThreads parameter defaults to the number of input images/viewpoints.
A separate thread is spawned for each image in the pre- and post-processing
steps (since each image can be processed independently).

Each line of the neighborhood edge set files has the following format:
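
    <center index> <neighbor index> <neighbor index> ...

where <center index> is the index of the pixel at the center of the
neighborhood, and each <neighbor index> is the index of a pixel connected to
the center pixel. The pixel index can be converted to image coordinates by

    x = index % <image width>
    y = floor(index / <image width>)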
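
The index conversion and line layout described above can be sketched in C++
as follows. This is a minimal illustration, not code taken from the src
folder: the names PixelCoord, indexToCoord, Neighborhood, and
parseNeighborhoodLine are hypothetical, and the sketch assumes that pixel
indices are row-major (so the divisor is the image width in pixels) and that
each line of an edge set file is plain text holding whitespace-separated
decimal integers.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Image coordinates of a pixel (hypothetical helper type).
struct PixelCoord {
    int x;  // column
    int y;  // row
};

// Convert a linear pixel index to (x, y).
// Assumption: indices are row-major, so the divisor is the image width.
PixelCoord indexToCoord(int index, int imageWidth) {
    assert(imageWidth > 0 && index >= 0);
    // For non-negative operands, integer division equals floor(index / width).
    return PixelCoord{index % imageWidth, index / imageWidth};
}

// In-memory form of one line of an edge set file (hypothetical).
struct Neighborhood {
    int center;                  // index of the center pixel
    std::vector<int> neighbors;  // indices of pixels connected to the center
};

// Parse one line: the first integer is the center index, the remaining
// integers are neighbor indices.
// Assumption: the line is whitespace-separated decimal text.
// Returns false if the line does not start with an integer.
bool parseNeighborhoodLine(const std::string& line, Neighborhood& out) {
    std::istringstream in(line);
    if (!(in >> out.center)) {
        return false;
    }
    out.neighbors.clear();
    int idx;
    while (in >> idx) {
        out.neighbors.push_back(idx);
    }
    return true;
}
```

For example, in an image that is 640 pixels wide, index 1283 corresponds to
x = 3, y = 2, since 1283 = 2 * 640 + 3.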