The Use of Pictorial Structures for Pattern Recognition
in X-ray Crystallographic Electron Density Maps

Frank DiMaio
dimaio@cs.wisc.edu
CS 766 Final Project
December 17, 2002



Abstract

In this paper, I investigate the use of pictorial structures in recognizing patterns in electron density maps.  Such maps are an intermediate step in protein structure determination via x-ray crystallography.  By building pictorial structures corresponding to individual amino acids, or short peptide sequences, patterns in the density map can be located.  This provides a good estimate of the atomic coordinates of atoms in the located pattern.  This technique could prove an important step in the development of a fully automated routine for interpreting electron density maps.  In implementing pictorial structure recognition for this problem, several non-trivial additions need to be made to the linear-time algorithm developed by Felzenszwalb and Huttenlocher.  The most significant of these involves building a 3D rotational model in which the structure is free to rotate around a connection (bond), but cannot rotate in any other direction.  The linear-time algorithm was implemented producing a program which runs quickly provided too many discretizations (buckets) are not used for variables in the six-dimensional conformation space.  The memory cost is quite high, however.  Results of the program are disappointing.  While a low-cost solution is found to the specifications provided, the solution is often far from optimal.


I. Introduction

A critically important problem in biochemistry and molecular biology involves the determination of a protein’s three dimensional shape.  Known as the protein’s folding, this shape is in most cases determined by the protein’s sequence alone  The issue is important because knowing a protein’s structure provides great insight into the mechanisms in which that protein is involved.  Knowledge of these mechanisms is needed for, say, treating a disease, or developing customized drugs.  Secondarily, knowing the structure of a protein allows biologists to get one step closer to the "holy grail" – a direct mapping from sequence to structure.

In the current state of structural biology, no algorithm mapping sequence to structure exists, and we are forced to use laboratory methods to elucidate the structure of proteins.  By far, the most common technique in use today for determining the structure of proteins and other large molecules is x-ray crystallography.  This is an experimental technique that allows visualization on the molecular scale.  At a high level, x-ray crystallography is similar to a very powerful light microscope.  However, a light microscope is limited by the long wavelengths of visible light.  These long waves simply pass through objects as small as an electron cloud, undisturbed.  On the other hand, the wavelength of an x-ray is small enough to be scattered by electron cloud surrounding atoms.  So an "x-ray microscope" would allow us to visualize objects at an atomic resolution.  One problem remains, though.  In light microscopes, a lens is used to collect the scattered light and focus it on a single point – but there is no such thing as an x-ray lens.  In this case, a computer algorithm acts as a virtual lens by reassembling scattered x-rays into a 3D plot of a molecule’s electron cloud.  This resultant plot is known as an electron density map.

The actual process of x-ray crystallography is much more difficult than this abstraction would make it seem.  X-rays don’t allow us to visualize a single molecule.  Instead, we must have many molecules, arranged in a regular pattern – a crystal.  Additionally, the reconstruction process, by which the electron density map is built, is not exact, and the resultant image may be noisy.  The crystal through which diffraction occurs often also is imperfect - affecting the quality of the final output as well.  Even with a fairly accurate map, it is not trivial to determine the locations of all the atoms in the density map.  This final representation, in which the coordinates of every atom have been determined, is most valuable to biochemists.  Yet getting to this final structure, given a density map, currently takes many weeks of work for all but the most perfect maps.  An automated method of such interpretation would be of great assistance to crystallographers.

I propose to investigate the use of pictorial structures [1] in order to automatically interpret electron density maps.  Pictorial structures are a representation of an object as a collection of simple parts, connected by deformable "springs".  By representing individual amino acids or short peptide sequences as pictorial structures, a dynamic programming algorithm [2] can be used to find the optimal location and configuration of the amino acid or polypeptide.  This optimal configuration could provide an important first step in automated density map interpretation.

The remainder of this paper is structured as follows.  Section 2 discusses some related efforts in automated density map interpretation.  Section 3 discusses the dynamic programming algorithm for pictorial structure recognition in more detail.  Sections 4 and 5 discuss the modifications, and difficulties, of extending this the algorithm to this three-dimensional dataset.  Finally, Section 6 shows some preliminary results, and Section 7 presents some concluding remarks.


II. Related Work

A number of attempts have been made in the past several decades to automate the interpretation of electron density maps.  By far the most successful is ARP/wARP [3,4].  The rather proprietary technique involves placing and moving atoms in the density map randomly, until they do a sufficiently good job explaining the density observed.  By using this Monte Carlo approach multiple times, and averaging the results, the output is quite good.  This method has been used successfully a number of times in the past, but it does have one fairly significant drawback: the density map must be of a fairly high resolution, about 2.5 Angstroms or lower.

Two other attempts that have been made are MAID and Textal.  In MAID [5], the computer approaches interpretation "as a human would" – by first finding the major secondary structures, alpha helices and beta sheets, and then filling in the remaining regions.  Textal [6], on the other hand, takes a pattern-recognition approach to solving the problem.  It uses a set of fifteen rotation-invariant features at each of four resolutions to try to determine the type of amino acid contained in a certain region.  The features used are mostly statistical, looking at the deviation and skew, for example, of the contained densities.  The technique was shown to be successful, provided it was given the locations of every alpha carbon, in advance.

Finally, a fourth attempt, fffear [7] uses Fast Fourier Transforms to find secondary structures in poor-quality density maps.  For example, it can find alpha helices in maps with as poor as 5.0 Angstrom resolution; beta sheets can be located in maps with resolutions of 4.0 Angstroms or better.


III. Pictorial Structures

First proposed in [1], pictorial structures are a way of representing an object as a collection of features – often small and simple features are used – and a set of deformable connections joining the features together.  They provide a number of distinct advantages over other feature-finding techniques.  One such advantage is when the object under scrutiny is not a rigid object, but is jointed.  Rather than requiring a number of detectors for each possible conformation, a single pictorial structure can represent the object.  Another advantage is that large complicated feature detection routines do not have to be used.  Since the features in the object are small and fairly simple, oftentimes a simple template matching algorithm can be used.  Even something as simple as average color, or color variance of a region can be used in the framework.  Pictorial structures also allow detectors to be more general: a single detector can be built to recognize "red cars", for example, regardless of size, position, orientation, or even style of car.

Although proposed three decades ago, it is only recently that an algorithm with a reasonable running-time has been developed for pictorial structure matching.  In [2], Felzenszwalb and Huttenlocher present a clever linear-time implementation for matching pictorial structures.  In their work, a pictorial structure is simply a graph G=(V,E).  The vertices V are "features" in the pictorial structure; an edge (vi, vj) exists in E if parts vi and vj are connected in the structure.  One restriction their implementation places on the structure is that G is a tree.

Felzenszwalb and Huttenlocher’s work makes use of two "tricks" to achieve linear running time.  First, they present a dynamic programming algorithm which finds the minimum-cost configuration for a pictorial structure.  A configuration refers to, for each node in the pictorial structure, a position, orientation, and scale.  The dynamic programming formulation is used to find the optimal cost, Bj, and position, Bj', of placing a subtree of nodes, given the position of the parent.  This value is computed for every possible configuration – i.e., every possible position, scale, and orientation of the node.  By computing this function for every node in the structure, starting at the leaves and traveling up to the root, we can find the optimal configuration.  This optimal configuration is found by simply take the minimal cost position (the minimum Br value) of the root node.  We can then pass the corresponding Br' location down the tree to find the optimal configuration of every node in the structure.

The second trick employed in the efficient implementation involves the clever use of the generalized distance transform to compute Bj (and Bj') in linear, rather than quadratic, time, with respect to the number of configurations of a part.  First, they restrict the pairwise cost function dij – the function that determines how well two locations in the graph match the model – to take the form

where ||.|| is some norm – in this paper the 1-norm - li and lj are two possible configurations, and Tij and Tji are invertible functions.  They then use the generalized distance transform to compute Bj for every possible configuration in linear time.  The generalized distance transform of a function f is given by

with G the space of configurations under consideration.  Notice that this becomes a classical distance transform if f(w) = 1B(w), where 1B(w) is an indicator function taking the value 1 if w in in an object B, and +inf otherwise.

By choosing the proper function f(w), we can represent the function Bj in terms of a distance transform.  Specifically, if we let

then Bj is a distance transform in Tij(li) space;  i.e,

Note that Bc is the value of B for each child of vjHere is the breakthrough – there exist linear-time algorithms for approximating a distance transform to within a constant error factor [8,9].  In this case, with periodic data (as a result of the periodicity of the rotation parameters), four passes through the complete data set are required.  This allows linear-time computation of Bj.  Furthermore, these algorithms are easily modified to compute Bj' as well.

With this framework in place, all that remains is to develop a good pairwise cost function.  In their paper, Felzenszwalb and Huttenlocher present cost functions for modeling two different types of joints: flexible revolute joints, and flexible prismatic joints.  This work will only concern itself with the former.  Flexible revolute joints (see Fig. 1) involve two features coming together around a central "pivot."  Rotations about the pivot are cheap, but a translation for either component away from their ideal location with respect to the pivot is expensive.  In this case, a conformation (location) is a set of four components: x and y translation, scaling, and rotation.  Given two locations li = (xi,yi,θi,si) and lj = (xj,yj,θj,sj), the deformation cost function is given by

There are a couple things to note in this equation. First, (xij, yij) is the joint position in the reference frame of object vi.  Similarly, (xji, yji) is the joint position in the reference frame of object vjqij is the ideal angle between the two components; sij is the ideal scaling.  The w terms are weight terms, which account for the importance of each dimension.  In the case of revolute joints, wx, wy, and ws are large, wθ is small.  Since we must express this equation in terms of the norm of the difference of a function T at each point, we have Tij = (x',y',θ',s') with

In the above, Rθ indicates the rotation matrix corresponding to a rotation θ.  Clearly, equation (1) holds given (5) and (6).

Thus, we now have an algorithm for finding objects given their pictorial structure.  The algorithm runs efficiently, taking only linear time with the number of configurations and number of nodes in our graph.  In practice, the algorithm runs sufficiently fast with real-world data.


IV. Extending Pictorial Structures to 3D: The Curse of Dimensionality

Extending Felzenszwalb and Huttenlocher’s work to recognize amino acids in electron density maps is a non-trivial endeavor.  Several challenges arise that need to be solved.  Translation can be dealt with trivially; simply add a dimension zij to the joint locations and extend (5) and (6) in the obvious way.  For this problem, we can ignore scaling, which actually allows us to simplify the problem somewhat.  The main difficulty of this extension deals with rotation.

In two dimensional vision tasks, as the Felzenszwalb and Huttenlocher work originally assumed, rotation can be summed up by a single parameter, θ.  The algorithm presented in the previous section simply used an angle θij to express the optimal rotation of parts configured around a joint.  Unfortunately, rotations in three dimensions are not nearly so simple.  Instead of one parameter, there are three rotational parameters required to arbitrarily rotate objects in three dimensions.  Additionally, for this task, it would be desirable to allow rotations around a bond to be permitted with minimal cost, but all other rotations and translations come with a fairly large penalty.

To handle rotations in three dimensions, I have opted to use Euler angles. These three parameters (φ,θ,ψ) can describe any possible rotation as three consecutive rotations: one about the z axis, one about the x axis, and another about the new z axis, z'.  See Fig. 2 for a graphical representation.  There are many flaws with using Euler angles as a rotational representation.  The rotations are not unique; there may be many ways of expressing the same rotation.  The rotations are not "smooth": linearly interpolating between two rotations will not give a smooth path, but rather will often zigzag on the unit sphere.  However, for this problem, Euler angles are sufficient.

One interesting aspect of this problem is that it is very desirable to allow rotations around a bond.  In real molecules, atoms are constantly rotating about bonds, which, when dealing with large molecules, can lead to radically different structures.  These different orientations of the same molecule are called rotamers.  There are often several distinct low-energy states that atoms fall into, however, rather than a continuum of bond rotations.  In x-ray crystallography, rotamers of amino acid residues are very important - so much so that large libraries have been compiled.  Current rotamer libraries describe about 6000 configurations the twenty amino acids can take.  Fig. 3 shows an example of a pair of rotamers.  Notice how a single rotation can cause a drastic change in the shape of the molecule.

In this problem, we are helped somewhat by the regularity of the structure of amino acid residues.  For the most part, each carbon atom is connected (bonded) to four different atoms.  The bonds form a tetrahedron around the molecule, each bond separated by 109.5 degrees.  Nitrogens and Oxygens usually have only three and two bonds, respectively, but they are separated by the same 109.5 degrees.  There are exceptions to this rule, but for most atoms this is the case.  Exceptions are treated in a similar manner to the technique described below.

Our technique first involves rooting the tree at an atom that does not have 4 connections.  This is usually quite simple.  Then, when we build the tree, we compute the (xij,yij,zij) pivot points such that incoming bonds, i.e. bonds going up the tree are straight up. The pivot point is chosen to be the center of the child in the tree, i.e. (xji,yji,zji) is (0,0,0).  Then the (xij,yij,zij) pivot extends the length of the bond. This enables the initial rotation of φ about the z axis to be around the bond.  By setting a low weight to this rotation, we allow free rotation about the bond.  Unfortunately, it does not allow certain minimal-energy conformations to be specified, but it still is a decent model.

Finally, we have to constrain the other two parameters. By tetrahedral geometry, the incoming and (up to) three outgoing points are spaced evenly around the unit sphere. In spherical coordinates, the three outgoing points are at ψ=0,120,240 deg.  These parameters give us our parameters θij and ψij.   In a similar fashion, we can build up the entire amino acid, or peptide sequence.

Finally, we are ready to derive the distance function.  It is a relatively simple extension to the distance function in the previous section,

where wrotate is the cost of rotating about a bond, worient is the cost of rotating around another axis, and wtrans is the cost of translating in x, y or z.   For this application, wrotate is kept low, while worient and wtrans are held high, to prevent movement other than bond rotations. style="mso-spacerun: yes">  This gives us our function Tij(li) = (x',y',z',φ',θ',ψ' ) where

In the above, REuler is the rotation matrix corresponding to a rotation (in Euler angles) of (φi,θi,ψi).  So the term (x',y',z') gives the corrected (rotated) location of the pivot point.  With the theoretical framework in place, we are ready to discuss the implementation of the pictorial structure algorithm.


V. Implementation

The efficient pictorial structure matching algorithm was implemented in C++ on Windows.  Implementation was fairly straightforward given the framework in the previous section.  There are a few things to note, however.  First off, our search is now a search through the six-dimensional space (x,y,z,φ,θ,ψ).  Three of these dimensions are periodic, and so, special treatment of boundary conditions need to be made during application of the distance transform, and an additional two passes need to be made of the space G during the distance transform. This is the same as in the original algorithm.

Aside for the increased search-space dimensionality, implementation of this algorithm is almost identical to the original algorithm. Because of the increased dimensionality of the problem, fewer "buckets" are allowed when discretizing the variables.   In the original algorithm, work was done on 320 by 240 images with 50x50x10x20 buckets for x, y, s, and θ, respectively.  In our case, we will be limited to images no larger than perhaps 20x20x20, with a similar magnitude of bucks in angular parameters.

For the matching function, two different approaches were tried. The first approach attempted to build a set of templates corresponding to atom locations.  Using k-means clustering, twelve 5x5x5 templates were build, using solved electron density maps as the training set. The match function treated each template and the actual density data in a 5x5x5 region as a vector, and found the angle between the two.  The best (closest) matching template was used to calculate the score.  The second approach was even simpler.  The match score was better the higher the density was at a given point.  Thus, the best matching points have a very large density.

Finally, a source code listing has been provided.  The work for this project was entirely included in a PictorialStruct class.  This class is a module in a much larger visualization package.  For brevity, only the source to the PictorialStruct class has been provided.


VI. Results

A pictorial structure was built corresponding to the amino acid lycine.   It was constructed in the fashion described in the previous section, such that it was allowed to rotate freely about its bonds.  Rather than search the entire protein space for an instance of the object, I instead limited the search to a sphere with a diameter of 5 Angstroms in the area of the residue. The complete set of parameters used for this problem is shown in Table 1. Running time was fairly snappy, taking about 20 seconds to run on a 300 MHz Pentium.  Template matching was used as our matching function.

The protein from which the data comes is a structure which is currently being solved here at the University of Wisconsin - Madison.  It it protein ak3, and has a map of 1.9 Angstrom resolution; it is a quite high resolution map.

Figure 4 shows a sample instance of a pictorial structure located.  Clearly, the results are less than ideal.  While the model does match with the density map somewhat, the quality of the overall match is a bit poor.  It is unclear at this time if altering the parameters would improve the quality of the search.  It is also possible that a bug in the implementation is effecting the quality of the results.  It is important to note, though, that the structure of the predicted output is good - it models the shape of an actual lysine quite nicely.

Finally, one additional thing to mention is that while the running time of the program is quick, the memory requirements are quite high.  This might make scaling the program up to search the entire protein space a bit difficult.  However, such a task might just require nothing more than a few clever tricks.


VII. Conclusion

Although the results are far from perfect, the idea of using pictorial structures for interpretation of electron density maps should not be thrown away.  Modifying the weights, and even the formulation of the problem, may improve the algorithms performance.  Also running the problem on a finer grid may help somewhat.  The quick speed of such an algorithm alone makes it a promising area of future research.  No other algorithm is able to so quickly consider so many configurations of a structure simultaneously.


References

[1] M.A. Fischler and R.A. Elschlager. (1973)  The representation and matching of pictorial structures. IEEE Trans. on Computers. 22:1, 67-92.

[2] P. Felzenszwalb and D. Huttenlocher. (2000) Efficient matching of pictorial structures. Proc. Computer Vision and Pattern Recognition Conf. 66-73.

[3] V.S. Lamzin and K.S. Wilson. (1993) Automated refinement of protein models. Acta Cryst. D49, 129-149.

[4] A. Perrakis, T.K. Sixma, K.S. Wilson, V.S. Lamzin. (1997) wARP: improvement and extension of crystallographic phases by weighted averaging of multiple refined dummy atomic models. Acta Cryst. D53:448-455.

[5] D.G. Levitt. (2001) A New Software Routine that Automates the Fitting of Protein X-Ray Crystallographic Electron Density Maps. Acta Cryst. D57, 1013-1019.

[6] T.R. Ioerger, T. Holton, J.A. Christopher, J.C. Sacchettini. (1999) TEXTAL: a pattern recognition system for interpreting electron density maps. Proc Int Conf Intell Syst Mol Biol. 130-137.

[7] K. Cowtan. (1998) Modified phased translation functions and their application to molecular-fragment location. Acta Cryst. 54:5, 750-756.

[8] G. Borgefors. (1984) Distance transformations in arbitrary dimensions. Computer Vision, Graphics, and Image Processing. 27:3, 321-345.

[9] G. Borgefors. (1986) Distance transformations in digital images. Computer Vision, Graphics, and Image Processing. 34:3, 344-371.