M. Craven & J. Shavlik (1993).
Machine Learning Approaches to Gene Recognition. IEEE Expert, 9.
(The on-line file is a variant of the journal article.)
This publication is available in PDF and available in postscript.
Currently, a major computational problem in molecular biology is to identify genes in uncharacterized DNA sequences. The variation, complexity, and incompletely-understood nature of genes make it impractical to hand-code algorithms to recognize them. Machine learning methods - which are able to form their own descriptions of genetic concepts - offer a promising approach to this problem. This article surveys machine-learning approaches to identifying genes in DNA. We discuss two broad classes of gene-recognition approaches: search by signal and search by content. For both classes, we define the specific tasks that they address, describe how these tasks have been framed as machine-learning problems, and survey some of the machine-learning algorithms that have been applied to them.
Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison
INFORMATION ~ PEOPLE ~ GRADS ~ UNDERGRADS ~ RESEARCH ~ RESOURCES
5355a Computer Sciences and Statistics ~ 1210 West Dayton Street, Madison, WI 53706
firstname.lastname@example.org ~ voice: 608-262-1204 ~ fax: 608-262-9777