C. Allex, J. Shavlik & F. Blattner (1999).
Neural Network Input Representations that Produce Accurate Consensus Sequences from DNA Fragment Assemblies. Bioinformatics, 15, pp. 723-728.
This publication is available in PDF and PostScript.
Motivation: Given inputs extracted from an aligned column of DNA bases and the underlying Perkin Elmer Applied Biosystems (ABI) fluorescent traces, our goal is to train a neural network to correctly determine the consensus base for the column. Choosing an appropriate network input representation is critical to success in this task. We empirically compare five representations; one uses only base calls and the others include trace information.
Results: We attained the most accurate results from networks that incorporate trace information into their input representations. Based on estimates derived from using 10-fold cross-validation, the best network topology produces consensus accuracies ranging from 99.26% to over 99.98% for coverages from two to six aligned sequences. With a coverage of six, it makes only three errors in 20,000 consensus calls. In contrast, the network that only uses base calls in its input representation has over double that error rate - eight errors in 20,000 consensus calls.
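The coverage-six figures quoted in the abstract can be checked with a little arithmetic. The following is only a minimal sketch: the function and variable names are illustrative assumptions, not taken from the paper, and the only data used are the error counts stated above.

```python
def accuracy_pct(errors: int, calls: int) -> float:
    """Return the percentage of consensus calls that are correct."""
    return 100.0 * (calls - errors) / calls


# Figures stated in the abstract for a coverage of six aligned sequences.
CALLS = 20_000
trace_errors = 3       # best network, trace information in its inputs
base_only_errors = 8   # network that uses only base calls

trace_acc = accuracy_pct(trace_errors, CALLS)
base_acc = accuracy_pct(base_only_errors, CALLS)
error_ratio = base_only_errors / trace_errors

print(f"trace-based accuracy:    {trace_acc:.3f}%")
print(f"base-call-only accuracy: {base_acc:.3f}%")
print(f"error ratio:             {error_ratio:.2f}x")
```

Three errors in 20,000 calls corresponds to the "over 99.98%" accuracy reported for the best network, and eight errors against three is a ratio above two, matching "over double that error rate".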
Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison