cis-Regulatory Module (CRM)-Finding Software

Source code for our cis-regulatory module (CRM)-finding algorithms can be found below. Yeast data from Lee et al. (Science, 2002) can be downloaded here.

Both algorithms learn a CRM as a set of motifs and a logical and spatial relationship among them. The learned CRM distinguishes between a set of positive DNA sequence examples (e.g. promoters of interest) and a set of negative (control) sequences.
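
As a rough illustration of what "a logical relationship among motifs" can mean, the sketch below combines motif-presence tests with AND, OR and NOT and scores the resulting rule on positive and negative sequences. It is not taken from either program: the helper names (contains, all_of, any_of, negate, accuracy) are hypothetical, exact substring matching stands in for a real motif model, spatial aspects are left out, and the sequences are made up.

```python
# Illustrative sketch only; not the representation used by the released software.

def contains(consensus):
    """Predicate: does the sequence contain this exact consensus string?"""
    return lambda seq: consensus in seq

def all_of(*preds):
    """Logical AND over motif predicates."""
    return lambda seq: all(p(seq) for p in preds)

def any_of(*preds):
    """Logical OR over motif predicates."""
    return lambda seq: any(p(seq) for p in preds)

def negate(pred):
    """Logical NOT of a motif predicate."""
    return lambda seq: not pred(seq)

def accuracy(rule, positives, negatives):
    """Fraction of sequences the rule labels correctly."""
    correct = sum(rule(s) for s in positives) + sum(not rule(s) for s in negatives)
    return correct / (len(positives) + len(negatives))

# Hypothetical CRM: TGAC AND (GGCC OR CCGG) AND NOT TATA.
rule = all_of(contains("TGAC"),
              any_of(contains("GGCC"), contains("CCGG")),
              negate(contains("TATA")))

positives = ["AATGACAAGGCCAA", "CCGGAATGACGG"]   # made-up promoters of interest
negatives = ["AATGACAAAA", "TGACGGCCTATA"]       # made-up control sequences

rule_accuracy = accuracy(rule, positives, negatives)
```

A learner in this setting searches over such rules (and over the motifs themselves) for one that separates the two sets well; the sketch only shows how a single candidate rule would be evaluated.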

Key differences between the two algorithms are listed under each one below (see the papers for details). In either case, the user may specify which logical and spatial aspects a learned CRM is allowed to include.

Contact: notocs.wisc.edu (Keith Noto, University of Wisconsin-Madison)

Noto and Craven, Learning Probabilistic Models of cis-Regulatory Modules that Represent Logical and Spatial Aspects, European Conference on Computational Biology (ECCB) 2006 (PDF)

Key points of this algorithm:

Learns motifs (PWMs) de novo.
Learns spatial preferences instead of hard constraints. For example, it learns a smooth probability distribution over possible distances between adjacent motifs instead of a maximum allowable distance constraint.
In fact, the algorithm learns a generalized hidden Markov model (HMM) representation of a CRM, and learns both model structure (number of motifs and logical relationship among them) and parameters (motif PWMs and spatial preferences).
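
To make the contrast between a spatial preference and a hard constraint concrete, here is a minimal sketch. It assumes a simple smoothed histogram over inter-motif distances, which is only a stand-in for whatever distribution the HMM actually uses; the function names, the pseudocount, and the observed distances are all hypothetical.

```python
# Illustrative sketch only; the distance model in the released software may differ.
from collections import Counter

def hard_constraint(distance, max_distance):
    """Hard constraint: a distance is either allowed (1.0) or not (0.0)."""
    return 1.0 if 0 <= distance <= max_distance else 0.0

def learn_distance_distribution(observed, max_len, pseudocount=1.0):
    """Soft preference: a smoothed probability for every distance in 0..max_len.

    The pseudocount keeps unseen distances possible, just less likely.
    """
    in_range = [d for d in observed if 0 <= d <= max_len]
    counts = Counter(in_range)
    total = len(in_range) + pseudocount * (max_len + 1)
    return [(counts[d] + pseudocount) / total for d in range(max_len + 1)]

# Made-up distances (in base pairs) between two adjacent motifs.
observed = [10, 12, 12, 15, 40]
dist = learn_distance_distribution(observed, max_len=50)
```

Under a hard limit of 20 bp the sequence with a 40 bp gap would simply be rejected, whereas the smoothed distribution still assigns it a small, nonzero probability that can be weighed against the rest of the evidence.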

Download:

Download latest version of source code and documentation (tar.gz)
View documentation online (includes installation instructions)
Previous releases (mostly edited for compatibility with g++ on various platforms):

Noto and Craven, A Specialized Learner for Inferring Structured cis-Regulatory Modules, BMC Bioinformatics, 2006 (PDF)

Key points of this algorithm:

Selects the relevant CRM motifs from a given set of candidate motifs. These candidates are defined as position weight matrices (PWMs) and may come from a database or be suggested by a motif-learning algorithm.
Learns constraints on the spatial relationships among motifs. For example, a CRM may include a maximum distance (in base pairs) between motifs.
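
The two points above can be sketched as follows: candidate PWMs are scanned against a sequence, and a two-motif CRM is accepted only if the motifs occur in order within a maximum distance. This is an assumption-laden illustration, not the program's scoring scheme: the helpers (pwm_from_consensus, log_odds, motif_hits, crm_matches), the uniform background, the threshold, and the example sequences are all hypothetical.

```python
# Illustrative sketch only; not the scoring used by the released software.
import math

def pwm_from_consensus(consensus, p_match=0.85):
    """Build a toy PWM: one {base: probability} column per position."""
    p_other = (1.0 - p_match) / 3
    return [{b: (p_match if b == c else p_other) for b in "ACGT"} for c in consensus]

def log_odds(pwm, site, background=0.25):
    """Log-odds score of a site against a uniform background."""
    return sum(math.log(col[base] / background) for col, base in zip(pwm, site))

def motif_hits(pwm, seq, threshold):
    """Start positions where the PWM scores at or above the threshold."""
    w = len(pwm)
    return [i for i in range(len(seq) - w + 1)
            if log_odds(pwm, seq[i:i + w]) >= threshold]

def crm_matches(seq, pwm_a, pwm_b, threshold, max_distance):
    """True if motif A is followed by motif B with a gap of at most max_distance bp."""
    end_of_a = [i + len(pwm_a) for i in motif_hits(pwm_a, seq, threshold)]
    starts_b = motif_hits(pwm_b, seq, threshold)
    return any(0 <= b - a <= max_distance for a in end_of_a for b in starts_b)

pwm_a = pwm_from_consensus("TGAC")
pwm_b = pwm_from_consensus("GGCC")

seq_near = "AATGACAAAGGCCAA"                  # 3 bp between the two motifs
seq_far = "AATGAC" + "A" * 20 + "GGCCAA"      # 20 bp between the two motifs

near_ok = crm_matches(seq_near, pwm_a, pwm_b, threshold=4.0, max_distance=5)
far_ok = crm_matches(seq_far, pwm_a, pwm_b, threshold=4.0, max_distance=5)
```

With these toy settings only exact consensus matches pass the threshold, so the first sequence satisfies the 5 bp constraint and the second does not.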

Download:

Download source code and documentation
View documentation online (includes installation instructions)