F. DiMaio, J. Shavlik & G. Phillips (2003).
Using Pictorial Structures to Identify Proteins in X-ray Crystallographic Electron Density Maps.
Working Notes of the ICML Workshop on Machine Learning in Bioinformatics, Washington DC, USA.
This publication is available in PDF.
The slides for this publication are available in Microsoft PowerPoint.
Abstract:
One of the most time-consuming steps in determining a protein's structure via X-ray crystallography is interpretation of the electron density map. This can be viewed as a computer-vision problem, since a density map is simply a three-dimensional image of a protein. However, because the space of conformations the protein can adopt is intractably large, building a protein model that matches the density map is extremely difficult. This paper describes the use of pictorial structures to build a flexible protein model from the protein's amino acid sequence. A pictorial structure represents an object as a collection of parts connected, pairwise, by deformable springs. Model parameters are learned from training data. Using an efficient algorithm to match the model to the density map, the most probable arrangement of the protein's atoms can be found in a reasonable running time. We test the algorithm on two different tasks. The first is an amino-acid sidechain-refinement task, in which the location of the protein's backbone is approximately known. The algorithm places the remaining atoms into the region of density quite accurately, placing 72% of atoms within 1.0 Å of their actual location (as determined by a crystallographer). The second is a classification task, in which the algorithm is used to predict the type of amino acid contained in an unknown region of density. In this task, the algorithm is 61% accurate in discriminating between four different amino acids.
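As an illustration of the "parts connected by deformable springs" idea in the abstract, the following is a minimal sketch of matching a chain-shaped pictorial structure by dynamic programming. It is not the authors' implementation: the function name `match_chain`, the discrete set of candidate locations, the quadratic spring cost, and the hand-written mismatch costs are all assumptions made for this example, whereas the paper works on 3D density maps with parameters learned from training data.

```python
import math

def match_chain(unary, positions, rest_lengths, stiffness=1.0):
    """Find the lowest-cost placement of a chain of parts (hypothetical helper).

    unary[i][k]     : mismatch cost of placing part i at candidate location k
    positions[k]    : 3-D coordinates of candidate location k
    rest_lengths[i] : preferred distance between part i and part i + 1
    Returns (total_cost, list of chosen location indices, one per part).
    """
    n_parts, n_locs = len(unary), len(positions)

    def spring(i, a, b):
        # Quadratic penalty for stretching or compressing the i-th spring.
        d = math.dist(positions[a], positions[b])
        return stiffness * (d - rest_lengths[i]) ** 2

    # best[i][k]: minimal cost of parts i..end, given that part i sits at k.
    best = [row[:] for row in unary]
    choice = [[0] * n_locs for _ in range(n_parts)]
    for i in range(n_parts - 2, -1, -1):
        for a in range(n_locs):
            costs = [spring(i, a, b) + best[i + 1][b] for b in range(n_locs)]
            b_min = min(range(n_locs), key=costs.__getitem__)
            choice[i][a] = b_min
            best[i][a] = unary[i][a] + costs[b_min]

    # Pick the best location for the first part, then follow the stored choices.
    k = min(range(n_locs), key=best[0].__getitem__)
    total = best[0][k]
    placement = [k]
    for i in range(n_parts - 1):
        k = choice[i][k]
        placement.append(k)
    return total, placement
```

The total cost is the sum of each part's mismatch cost and each spring's deformation cost, so a location that fits one part slightly worse can still win if it keeps the springs near their preferred lengths. This naive version costs time quadratic in the number of candidate locations per spring; the efficient matching mentioned in the abstract is not reproduced here.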
Computer Sciences Department
College of Letters and Science
University of Wisconsin - Madison