L. Oliphant, E. Burnside & J. Shavlik (2009).
Boosting First-Order Clauses for Large, Skewed Data Sets. Proceedings of the Nineteenth Conference on Inductive Logic Programming, Leuven, Belgium.
Creating an effective ensemble of clauses for large, skewed data sets requires finding a diverse, high-scoring set of clauses and then combining them so as to maximize predictive performance. We have adapted the RankBoost algorithm to maximize area under the recall-precision curve, a metric much better suited to highly skewed data sets than area under the ROC curve. We have also explored a range of weak hypotheses for our modified RankBoost algorithm beyond individual clauses. We provide results on four large, skewed data sets showing that our modified RankBoost algorithm outperforms the original on area under the recall-precision curve.
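To illustrate why the choice of metric matters on skewed data, the following minimal sketch compares area under the ROC curve with a step-wise approximation of area under the recall-precision curve (average precision) on a made-up ranking. It is not the paper's method or data: the function names, the toy labels and scores, and the use of average precision rather than an interpolated recall-precision area are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the recall-precision curve.

    Assumes scores are distinct, so the ranking is unambiguous.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Each positive adds 1/total_pos of recall at the current precision.
            area += (true_pos / rank) / total_pos
    return area


# Hypothetical skewed data: 10 positives among 1000 examples.
# The ranker places 10 negatives first, then all 10 positives, then the rest.
labels = [0] * 10 + [1] * 10 + [0] * 980
scores = [1000 - i for i in range(1000)]  # distinct, descending

roc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC: {roc:.3f}")
print(f"Average precision: {ap:.3f}")
```

On this toy ranking the ROC area is about 0.99 while the average precision is about 0.33: ten false positives at the top barely move the ROC area when negatives number in the hundreds, but they halve the precision at full recall.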
Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison