QUEST Classification Tree (version 1.9.2)

© Yu-Shan Shih 1997-2005

QUEST is a binary-split decision tree algorithm for classification and data mining developed by Wei-Yin Loh (University of Wisconsin-Madison) and Yu-Shan Shih (National Chung Cheng University, Taiwan). QUEST stands for Quick, Unbiased and Efficient Statistical Tree.
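The idea of a binary split chosen by a statistical test can be illustrated with a small sketch. This is not the QUEST program's code: it assumes numeric predictors only, ranks them by a hand-computed one-way ANOVA F statistic, and uses the midpoint between class means as a stand-in for the split point. The function names `anova_f` and `choose_binary_split` and the toy data are hypothetical.

```python
from statistics import mean

def anova_f(values, labels):
    """One-way ANOVA F statistic of a numeric predictor across class labels."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    n, k = len(values), len(groups)
    if k < 2:
        return 0.0  # a single class carries no information for a split
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf")  # classes are perfectly separated on this predictor
    return (between / (k - 1)) / (within / (n - k))

def choose_binary_split(columns, labels):
    """Select the predictor with the largest F, then split at a midpoint.

    Assumption: the midpoint between the smallest and largest class means
    is a simplification used here only for illustration.
    """
    name = max(columns, key=lambda c: anova_f(columns[c], labels))
    class_means = sorted(
        mean(v for v, c in zip(columns[name], labels) if c == k)
        for k in set(labels)
    )
    threshold = (class_means[0] + class_means[-1]) / 2
    return name, threshold

# Toy data (hypothetical): x1 separates the classes, x2 does not.
columns = {
    "x1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "x2": [5.0, 1.0, 3.0, 4.0, 2.0, 3.5],
}
labels = ["a", "a", "a", "b", "b", "b"]
name, threshold = choose_binary_split(columns, labels)
print(name, threshold)
```

Separating the choice of variable from the choice of split point is what the sketch is meant to show; see Loh and Shih (1997) for the actual procedure.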

The objective of QUEST is similar to that of the CART(TM) algorithm described in the book, Classification and Regression Trees, by Breiman, Friedman, Olshen and Stone (1984). [CART is a registered trademark of California Statistical Software, Inc.] The major differences are:

If there are no missing values in the data, QUEST can optionally use the CART algorithm to produce a tree with univariate splits.

See Table 1 for a feature comparison between QUEST and other classification tree algorithms.

Documentation:

  1. Loh, W.-Y. and Shih, Y.-S. (1997), Split selection methods for classification trees, Statistica Sinica, vol. 7, 815-840. [This is the definitive reference for QUEST.]
  2. Lim, T.-S., Loh, W.-Y., and Shih, Y.-S. (2000), A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms, Machine Learning, vol. 40, 203-228. [This paper compares the performance of version 1.7 of QUEST against other methods.] A separate appendix contains more detailed results. The datasets used in the study are in the gzipped tar archive (5.8 MB).
  3. QUEST User Manual in PDF format. The manual uses the example data and description files hepdat.txt and hepdsc.txt for illustration.
  4. Loh, W.-Y. and Vanichsetakul, N. (1988), Tree-structured classification via generalized discriminant analysis (with discussion), Journal of the American Statistical Association, vol. 83, 715-728. [This paper documents an older algorithm called FACT.]
  5. Shih, Y.-S. (1999), Families of splitting criteria for classification trees, Statistics and Computing, vol. 9, 309-315. [This paper documents the enlarged class of splitting criteria in version 1.8 of QUEST.] Downloadable from Shih's page.

Compiled binaries: The following files may be freely distributed but not sold for profit.

  • Intel and compatibles (Windows 9x/NT/2000) in pkzip format --- download (download pkunzip.exe)
  • Intel and compatibles (Linux 2.0) in gzip format --- download
  • Sun SPARCstation/Ultra (Sun Solaris OS 5) in gzip format --- download

Revision history: See the file history.txt.

Commercial implementations of earlier versions of the algorithm for the Windows platform are available from SPSS (AnswerTree) and StatSoft (STATISTICA).

Tree diagrams: The QUEST program can optionally produce LaTeX (MiKTeX) or allCLEAR source code for the tree diagrams. The LaTeX code, which requires the PSTricks package, can output PDF or PostScript files (the latter can be viewed and printed using Ghostscript and GSView).

Related algorithms with unbiased splits:

  • CRUISE: Classification trees with more than two splits per node
  • GUIDE: Piecewise-linear least-squares, quantile, and Poisson regression trees

License:

QUEST is free software. You may use the Program without restriction. You may copy and distribute the Program in executable form provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; and give any other recipients of the Program a copy of this license along with the Program.

Disclaimer of Warranty:

The copyright holder provides the Program "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the Program is with you. Should the Program prove defective, you assume the cost of all necessary servicing, repair or correction. In no event will the copyright holder be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the Program (including but not limited to loss of data or data being rendered inaccurate or losses sustained by you or third parties or a failure of the Program to operate with any other programs), even if such holder has been advised of the possibility of such damages.

Last modified: December 1, 2011 by Wei-Yin Loh