libvptree
This package provides a basic implementation of Vantage Point Trees (vp-trees) for nearest neighbor search. The vp-tree provides a general way to perform nearest neighbor search in arbitrary metric spaces. The data structure was developed by Yianilos and independently by Uhlmann as "Metric Trees." vp-trees were shown by Kumar et. al. to provide superior performance on common Computer Vision nearest neighbor computations. In addition to the flexibility of performing queries in any arbitrary metric space (for example the metric derived from using the Pyramid Match Kernel as an inner product), even in Euclidean spaces the vp-tree may be a good way to handle very high-dimensional nearest neighbor problems.
The code provided in this package allows for the following types of queries:
- k-nearest neighbor
- ε-neighbor
- Incremental nearest neighbor: Returns an arbitrary number of neighbors, one at a time, in order from closest to farthest
- Approximate k-NN: Visits a fixed number of nodes in order defined by a priority queue (like done in ANN by Mount & Arya)
Language bindings are provided for C++, Python, and Matlab. Example code in each language shows how to build and query a VP-tree using the provided interface.
Distributed under the Apache 2.0 License. This code was initially written as part of the construction of k-NN graphs for clustering in the paper Spectral Clustering with a Convex Regularizer on Millions of Images by Collins et. al. Researchers who wish to cite this work should refer to that paper and the Yianilos paper below.
Future Additions
- sp-trees (see Liu et. al.)
- Additional (better) ways to choose the vantage points.
References
- Yianilos, P.N. Data structures and algorithms for nearest neighbor search in general metric spaces. Symposium on Discrete Algorithms (SoDA), January 1993
- J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Information Processing Letters, November 1991.
- N. Kumar, L. Zhang, and S. K. Nayar. What is a Good Nearest Neighbors Algorithm for Finding Similar Patches in Images? European Conference on Computer Vision (ECCV), October 2008.
- M. Collins, J. Liu, J. Xu, L. Mukherjee, and V. Singh. Spectral Clustering with a Convex Regularizer on Millions of Images. European Conference on Computer Vision (ECCV), October 2014.
- T. Liu, A.W. Moore, A. Gray, and K. Yang. An Investigation of Practical Approximate Nearest Neighbor Algorithms. Advances in Neural Information Processing Systems (NIPS), December 2004.
- K. Grauman and T. Darrell. The pyramid match kernel: discriminative classification with sets of image features. International Conference on Computer Vision (ICCV), October 2005.