Knowledge-Based Data Classification, Approximation and Optimization
Project Supported by NSF under Grant IIS-0511905
Project Period: September 1, 2005 -- August 31, 2009
Office: 4385 Comp Sci & Stat
Phone: (608) 262-6593, (608) 262-1204
E-mail: olvi@cs.wisc.edu
Office: 4381 Comp Sci & Stat
Phone: (608) 262-4281
E-mail: ferris@cs.wisc.edu
Supported Graduate Students
Michael Thompson
Ted Wild
Geng Deng
Qian Li
Project Summary
Massive datasets occur in all types of settings ranging from the
highly scientific to the ubiquitous internet. Making sense of this
massive data requires sophisticated computer sciences techniques such
as data classification, approximation and optimization. All of these
techniques can be improved substantially by making effective use of
prior knowledge that is often readily available. For example doctors'
experience can be utilized in obtaining improved classifiers for
various types of important problems, such as medical diagnosis and
prognosis. Since the most powerful state-of the-art classifiers are
based on support vector machines, which in turn are formulated as
constrained or unconstrained optimization problems, it is our aim that
prior knowledge be incorporated into various optimization-based
applications such as classification and approximation problems as well
into the theory of optimization itself.
To a large degree, this proposal is motivated by the investigators'
extensive collaborative work with oncologists, surgeons and medical
physicists and the investigators' desire to make full use of the
expertise of such practitioners by incorporating it into computable
but rigorous models.
The intellectual merit of the proposed work lies in the use of
rigorous theory and problem analysis techniques that incorporate
domain specific information into general optimization problems.
The research will first
incorporate knowledge into a linear or nonlinear support vector
machine classifier and show that such incorporation is possible by
appending additional constraints to the original problem. This does
not seem to have been attempted before, and preliminary tests indicate
improvements in classifier accuracy. Secondly, prior knowledge will
be introduced into approximation problems.
Thus, in addition to given
discrete data that is normally used to generate an approximation to an unknown
function, prior knowledge in the form of inequalities on polyhedral
sets is also taken into account. Finally, prior knowledge will be
incorporated into general constrained or unconstrained optimization
problems, wherein the prior knowledge consists of new constraints to
be imposed on the behavior of the objective function on various
regions. The generality of these new techniques will facilitate the
integration of information from disparate sources, since the theory
allows multiple sets of prior information to be included concurrently.
Specific application to radiotherapy treatment planning problems will
ensure the computer science advancements are demonstrably useful in a
particular problem domain.
The work will have broader impacts in other areas of medical
science, public health and health care delivery.
The optimization, modeling, and
computational techniques will provide a boost to advances in
cancer diagnosis and prognosis, chemotherapy, and other treatment regimes.
The knowledge-based approach encompasses a broad
spectrum of important classification and approximation problems that
have wide applicability in science and engineering.
The work
will also raise the profile of data mining techniques in other areas
such as
surgery, pharmacology, and medical research, by demonstrating how
our methodologies can be utilized to incorporate prior knowledge into
both planning and design issues, and improving both efficiency of
delivery and effectiveness of treatment in many clinical settings.
By coupling
the education of several computer science and engineering
students with the proposed work, a new
group of multidisciplinary
researchers will be trained that will ensure the technical
advances are applied to further application domains.
Publications Supported by Grant IIS-0511905
- O. L. Mangasarian amd M. C. Ferris
-
Uniqueness of Integer Solution of Linear Equations
PDF Version
Data Mining Institute Technical Report 09-01, July 2009.
- M.C. Ferris, S.P. Dirkse, J.-H. Jagla, and A. Meeraus
-
Extending modeling systems: Structure and solution
- X. Ban, H.X. Liu, M.C. Ferris, and B. Ran
-
A link-node complementarity model and solution algorithm for dynamic user
equilibria with exact flow propagations
In Transportation Research Part B, 42(9):823--842, 2008.
- X. Ban, S. Lu, M. C. Ferris and H. Liu
-
Risk-Averse Second Best Toll Pricing
- X. Ban, M.C. Ferris, and H.X. Liu
-
Numerical studies on reformulation techniques for continuous network design with
asymmetric user equilibrium
International Journal of Operations Research and Information Systems, vol 1, forthcoming 2010
- X. Ban, M.C. Ferris, and L. Tang (Grad student)
-
Risk neutral second best toll pricing
- M. C. Ferris, S. P. Dirkse, J.-H. Jagla and A. Meeraus
-
An extended mathematical programming framework
- E. Bartholomew Fisher, K. Hedman, R. O'Neill , M. C. Ferris and S. S. Oren
-
Optimal Transmission Switching in Electrical Networks for Improved Economic Operations
Technical Report, Federal Energy Regulatory Commission
- O. L. Mangasarian
-
Knapsack Feasibility as an Absolute Value Equation Solvable by Successive Linear Programming
PDF Version
Data Mining Institute Technical Report 08-03, September 2008. Optimization Letters, to appear.
Online Version
- O. L. Mangasarian and E. W. Wild
-
Privacy-Preserving Random Kernel Classification of Checkerboard
Partitioned Data
PDF Version
Data Mining Institute Technical Report 08-02, September 2008.Annals of Information
Systems to appear.
- O. L. Mangasarian
-
A Generlaized Newton Method for Absolute Value Equations
PDF Version
Data Mining Institute Technical Report 08-01, May 2008.
Optimization Letters 3(1), January 2009, 101-108.
- O. L. Mangasarian and E. W. Wild
-
Privacy-Preserving Classification of Horizontally Partitioned Data
via Random Kernels
PDF Version
Data Mining Institute Technical Report 07-03, November 2007.
The 2008 4th International Conference on Data Mining - DMIN'08,July 14-17, Las
Vegas, Nevada. Proceedings of the 2008 International Conference on Data Mining,
DMIN08, Volume II, 473-479, R. Stahlbock, S.V. Crone
and S. Lessman, Editors.
2008 Best Academic Research Paper Award DMIN'08.
- O. L. Mangasarian, E. W. Wild and G. M. Fung
-
Privacy-Preserving Classification of Vertically Partitioned Data
via Random Kernels
PDF Version
Data Mining Institute Technical Report 07-02, September 2007.
ACM Transactions on Knowledge Discovery from Data (TKDD), to appear.
- O. L. Mangasarian and E. W. Wild
-
Exactness Conditions for a Convex Differentiable
Exterior Penalty for Linear Programming
PDF Version
Data Mining Institute Technical Report 07-01, July 2007. Optimization, to appear.
- O. L. Mangasarian and M. E. Thompson
-
Chunking for Massive Nonlinear Kernel Classification
PDF Version
Data Mining Institute Technical Report 06-07, December 2006.
Optimization Methods and Software 23, 2008, 365-274.
- O. L. Mangasarian and E. W. Wild
-
Nonlinear Knowledge in Kernel Machines
PDF Version
Data Mining Institute Technical Report 06-06, November 2006.
CRM Proceedings \& Lecture Notes, Volume 45, American Mathematical Society and
Centre de Recherches Math\'{e}matiques at the Universit\'{e} de
Montr\'{e}al, 2008, 181-198.
- O. L. Mangasarian, E. W. Wild and G. M. Fung
-
Proximal Knowledge-Based Classification
PDF Version
Data Mining Institute Technical Report 06-05, November 2006.
Statistical Analysis and Data Mining, to appear.
- O. L. Mangasarian & E. W. Wild
-
Nonlinear Knowledge-Based Classification
PDF Version
Data Mining Institute Technical Report 06-04, August 2006.
- O. L. Mangasarian & E. W. Wild
-
Feature Selection for Nonlinear Kernel Support Vector Machines
PDF Version
Data Mining Institute Technical Report 06-03, July 2006.
IEEE Seventh International Conference on Data Mining (ICDM'07)
October 28, 2007, Omaha, NE, Workshop Proceedings 231-236.
- O. L. Mangasarian
-
Absolute Value Equation Solution via Concave Minimization
PDF Version
Data Mining Institute Technical Report 06-02, March 2006. Optimization Letters 1(1),
2007, 3-8.
- O. L. Mangasarian and M. E. Thompson
-
Massive Data Classification via Unconstrained Support Vector Machines
PDF Version
Data Mining Institute Technical Report 06-01, March 2006. Journal of
Optimization Theory and Applications 131(3), December 2006, 315-325.
- O. L. Mangasarian and R. R. Meyer
-
Absolute Value Equations
PDF Version
Data Mining Institute Technical Report 05-06, December 2005. Linear
Algebra and Its Applications 419 (2006) 359-367.
- O. L. Mangasarian and E. W. Wild
-
Nonlinear Knowledge in Kernel Approximation
PDF Version
Data Mining Institute Technical Report 05-05, October 2005. Revised June 2006.
IEEE Transactions on Neural Networks, to appear.
- O. L. Mangasarian
-
Absolute Value Programming
PDF Version
Data Mining Institute Technical Report 05-04, September 2005.
Computational Optimization and Applications 36(1), January 2007, 43-53.
- O. L. Mangasarian
-
Exact 1-Norm Support Vector Machines via Unconstrained
Convex Differentiable Minimization
PDF Version
Data Mining Institute Technical Report 05-03, August 2005. Revised January 2006.
Journal of Machine Learning Research 7, 2006, 1517-1530.
- O. L. Mangasarian and E. W. Wild
-
Multiple Instance Classification via Successive Linear Programming
PDF Version
Data Mining Institute Technical Report 05-02, May 2005. Journal of Optimization Theory and Applications 137(1), 2008, to appear.
- M. C. Ferris, P. F. Brennan, L. Tang, J. Marquard, S. M. Robinson, and S. J. Wright
-
Creating operations research models to guide RHIO decision making
In American Medical Informatics Association 2007 Symposium
Proceedings, 2007.
- G. Deng and M. C. Ferris.
-
Extension of the DIRECT optimization algorithm for noisy functions
In B. Biller, S. Henderson, M. Hsieh, and J. Shortle, editors, Proceedings
of the 2007 Winter Simulation Conference, 2007.
- P. Prakash, G. Deng, M. C. Converse, J. G. Webster, , D. M. Mahvi, and M. C. Ferris
-
Design optimization of a robust sleeve antenna for hepatic microwave ablation
Physics in Medicine and Biology, 53:1057-1069, 2008.
- G. Deng and M. C. Ferris
-
Variable-number sample-path optimization
Mathematical Programming, forthcoming, 2008
- J. Wallace, A. Philpott, M. O'Sullivan, and M. Ferris
-
Optimal rig design using mathematical programming
In 2nd High Performance Yacht Design Conference, Auckland, 14-16 February, 2006, pages 185-192, 2006.
- G. Deng and M. C. Ferris
-
Adaptation of the UOBQYA algorithm for noisy functions
In B. Lawson, J. Liu, F. Perrone, and F. Wieland, editors, Proceedings of the 2006 Winter Simulation Conference, pages 312-319, Orlando, Florida, 2006. Omnipress.
- G. Deng and M. C. Ferris
-
Neuro-dynamic programming for fractionated radiotherapy planning
In C.J.S. Alves, P.M. Pardalos, and L.N. Vicente, editors, Optimization in Medicine, International Center for Mathematics, Springer Optimization and Its Applications, pages 49-74. Springer-Verlag, 2007.
-
X. Ban, H. X. Liu, M. C. Ferris, and B. Ran.
-
A general MPCC model and its solution algorithm for continuous network
design problem.
Mathematical And Computer Modelling, 43:493-505, 2006.
-
M. C. Ferris and G. Deng.
-
Classification-based global search: An application to a simulation for
breast cancer
In Proceedings of the NSF CMMI Engineering Research and Innovation
Conference, 2008.
-
M. C. Ferris, A. Sundaramoorthy, and C. T. Maravelias.
-
Simultaneous batching and scheduling using dynamic decomposition on a grid
Technical report, Computer Sciences Department, University of Wisconsin, 2007.
-
M. R. Bussieck, M. C. Ferris, and A. Meeraus.
-
Grid enabled optimization with GAMS
Technical report, Computer Sciences Department, University of Wisconsin, 2007.
-
E. Bartholomew Fisher, R. O'Neill, and M. C. Ferris.
-
Optimal transmission switching
IEEE Transactions on Power Systems, forthcoming, 2008.
-
M. C. Ferris, C. T. Maravelias, and A. Sundaramoorthy.
-
Using grid computing to solve hard planning and scheduling problems
In Proceedings of 18th European Symposium on Computer-Aided Process
Engineering, Lyon, France, June 1-4 2008.
Viewable Math Programming Papers and Reports
Viewable Data Mining Papers and Reports
Mathematical Programming at UW
Data Mining at UW
This page is updated periodically. Last updated on August 1, 2008.