University of Wisconsin Computer Sciences Header Map (repeated with 
textual links if page includes departmental footer) Useful Resources Research at UW-Madison CS Dept UW-Madison CS Undergraduate Program UW-Madison CS Graduate Program UW-Madison CS People Useful Information Current Seminars in the CS Department Search Our Site UW-Madison CS Computer Systems Laboratory UW-Madison Computer Sciences Department Home Page UW-Madison Home Page

M. Molla, P. Andreae & J. Shavlik (2004).
Building Genome Expression Models using Microarray Expression Data and Text. Department of Computer Sciences, University of Wisconsin, Machine Learning Research Group Working Paper 04-1.



This publication is available in PDF and available in Microsoft Word.

Abstract:

Microarray expression data is being generated by the gigabyte all over the world with undoubted exponential increases to come. Annotated genomic data is also rapidly pouring into public databases. Our goal is to develop automated ways of combining these two sources of information to produce insight into the operation of cells under various conditions. Our approach is to use machine-learning techniques to identify characteristics of genes that are up-regulated or down-regulated in a particular microarray experiment. We seek models that are both accurate and easy to interpret. This paper explores the effectiveness of two algorithms for this task: PFOIL (a standard machine-learning rule-building algorithm) and GORB (a new rule-building algorithm diviseddevised by us). We use a permutation test to evaluate the statistical significancequality of the learned models. The paper reports on experiments using actual E. coli microarray data, discussing the strengths and weaknesses of the two algorithms and demonstrating the trade-offs between accuracy and comprehensibility.


return Return to the publications of the Univ. of Wisconsin Machine Learning Research Group.

Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison


INFORMATION ~ PEOPLE ~ GRADS ~ UNDERGRADS ~ RESEARCH ~ RESOURCES

5355a Computer Sciences and Statistics ~ 1210 West Dayton Street, Madison, WI 53706
cs@cs.wisc.edu ~ voice: 608-262-1204 ~ fax: 608-262-9777