MotifScan was written in Fall 2001 as a program to aid in the discovery and visualization of protein motifs. Written entirely in Java Swing, the program takes its inputs from TIGR genomes and MEME results. MEME serves as the motif discovery tool, run on fragments isolated from protein binding experiments. Given the resulting motif together with the genome and gene coordinates files from TIGR, the program shows likely matches to the motif in all parts of the genome, along with the genes adjacent to each match.

This screenshot shows the main viewing window along with the mini-browser that aids in the exploration of genes adjacent to the selected motif.

Screenshot

USING

-Takes 3 Input Files
- MEME log matrix output. Generally this comes from running a MEME motif search and using the PSSM for the motif number of your choice. You must take the matrix, cut out the first line (the one with the text), and save the rest for input into MotifScan. Example:

-58 -1173 45 79
-106 209 -271 -1173
20 -1173 -1173 108
-1173 -271 145 52
-338 168 87 -1173
126 -72 -171 -180
-6 -72 119 -106
-507 162 101 -507
89 -507 -507 46
52 -271 61 -38
-1173 153 -113 20
-1173 -271 -1173 166
-1173 -1173 -1173 170
-1173 204 -39 -238
70 -72 -1173 32

This is a 15 bp motif. Each row is one position in the motif, and the columns give the log-probability of each base at that position: Column 1 = P(A), Column 2 = P(C), Column 3 = P(G), Column 4 = P(T).

You can also make your own matrix for a concrete motif (you know the sequence you're looking for) or a semi-concrete one (pyrimidines or purines, or a general idea of the sequence). To do this, see Appendix 1.
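The matrix format described above can be read and applied with a few lines of Java. This is only a minimal sketch, not MotifScan's actual code: the class name PssmSketch and its methods are made up for illustration, the file is assumed to hold whitespace-separated integers with four per line, and scoring a window by summing one entry per position is the usual way a PSSM is applied rather than something this document confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MotifScan.
class PssmSketch {
    // Column order stated in the text: A, C, G, T.
    static final String BASES = "ACGT";

    /** Parses rows of four whitespace-separated integers; blank lines are skipped. */
    public static int[][] parse(String text) {
        List<int[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length != 4) {
                throw new IllegalArgumentException("Expected 4 columns: " + line);
            }
            int[] row = new int[4];
            for (int i = 0; i < 4; i++) {
                row[i] = Integer.parseInt(fields[i]);
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    /** Sums the matrix entry for each base of a window exactly as long as the motif. */
    public static int score(int[][] matrix, String window) {
        if (window.length() != matrix.length) {
            throw new IllegalArgumentException("Window length must equal motif length");
        }
        int total = 0;
        for (int pos = 0; pos < matrix.length; pos++) {
            int col = BASES.indexOf(Character.toUpperCase(window.charAt(pos)));
            if (col < 0) {
                throw new IllegalArgumentException("Unknown base: " + window.charAt(pos));
            }
            total += matrix[pos][col];
        }
        return total;
    }

    public static void main(String[] args) {
        // First three rows of the example matrix above.
        int[][] m = parse("-58 -1173 45 79\n-106 209 -271 -1173\n20 -1173 -1173 108\n");
        System.out.println(score(m, "TCT")); // 79 + 209 + 108 = 396
    }
}
```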

-Sequence File. This is the whole genome that you want to search. MotifScan assumes that the genome entered runs from start to finish (i.e. from base 1 to the last base of the sequence).
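Searching the sequence for matches can be pictured as sliding a motif-length window along the genome. The sketch below is an assumption about how such a scan might look, not the program's real implementation: GenomeScanSketch and scan are hypothetical names, only the forward strand is scanned, and windows containing anything other than A, C, G or T are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MotifScan.
class GenomeScanSketch {
    static final String BASES = "ACGT";

    /**
     * Returns {position, score} pairs for every window scoring at least minScore.
     * Positions are 1-based, matching the "Base 1" convention of the Sequence File.
     */
    public static List<int[]> scan(int[][] matrix, String genome, int minScore) {
        List<int[]> hits = new ArrayList<>();
        int width = matrix.length;
        for (int start = 0; start + width <= genome.length(); start++) {
            int total = 0;
            boolean valid = true;
            for (int pos = 0; pos < width && valid; pos++) {
                int col = BASES.indexOf(Character.toUpperCase(genome.charAt(start + pos)));
                if (col < 0) {
                    valid = false; // skip windows with N or other symbols
                } else {
                    total += matrix[pos][col];
                }
            }
            if (valid && total >= minScore) {
                hits.add(new int[] {start + 1, total});
            }
        }
        return hits;
    }
}
```

For example, scanning "GGATCAATCG" with a three-row matrix that gives 100 to A, then T, then C and a minimum score of 300 reports matches at bases 3 and 7.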

-Coords File. This file holds the start and finish coordinates for the coding regions in the genome specified in the Sequence File. It has the form <<Start position>> <<End position>> <<Name>> <<junk>>. This is what .coord files look like from TIGR.
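A line of the Coords file in the form above can be split into its fields as follows. This is a sketch under stated assumptions: the names CoordsSketch and Gene are hypothetical, fields are assumed to be whitespace-separated, and treating a gene whose start is greater than its end as a reverse strand gene is a guess about the convention, not something stated here. The locus name in the usage line is invented.

```java
// Hypothetical helper, not part of MotifScan.
class CoordsSketch {
    static final class Gene {
        final int start;
        final int end;
        final String name;

        Gene(int start, int end, String name) {
            this.start = start;
            this.end = end;
            this.name = name;
        }

        /** Assumption: reverse strand genes are listed with start > end. */
        boolean isReverse() {
            return start > end;
        }
    }

    /** Reads "start end name junk..." and ignores everything after the name. */
    public static Gene parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 3) {
            throw new IllegalArgumentException("Expected start, end and name: " + line);
        }
        return new Gene(Integer.parseInt(fields[0]), Integer.parseInt(fields[1]), fields[2]);
    }

    public static void main(String[] args) {
        Gene g = parseLine("1200 300 GENE0001 extra fields"); // made-up example line
        System.out.println(g.name + " reverse=" + g.isReverse());
    }
}
```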


Features:
-Contig locations. The sequence window shows the forward and reverse coding regions (proteins). The forward strand proteins are on the row above the reverse strand proteins. The beginning of a protein is marked with a "|" and the end with a ">" for forward strand proteins or a "<" for reverse strand proteins. Clicking once on a region displays the name of that protein in the Protein text window.

-Dragging to find distance: Click on the sequence area and hold the button down and drag the mouse either to the right or the left. When you let go of the mouse, the distance covered since you clicked is given in the Selection Width Field. This facilitates finding distances to neighboring proteins.

-Web Interface: Double-click on any gene (contig) in the sequence area and a mini-browser will come up referencing that gene on the TIGR website. (This will not work for genomes that are not in TIGR unless the Coords file still references genes on TIGR.) It works by appending the contig name to the link: http://www.tigr.org/tigr-scripts/CMR2/GenePage.spl?locus= . The page cannot be expected to load if the name appended to the link does not exist on TIGR.
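The link construction described above amounts to a string concatenation. In this sketch the class and method names are hypothetical, and URL-encoding the name is an added precaution that the original program may not perform.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of MotifScan.
class TigrLinkSketch {
    // Base link quoted in the text above.
    static final String BASE = "http://www.tigr.org/tigr-scripts/CMR2/GenePage.spl?locus=";

    /** Appends the gene (contig) name from the Coords file to the base link. */
    public static String genePageUrl(String locusName) {
        return BASE + URLEncoder.encode(locusName, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(genePageUrl("GENE0001")); // made-up locus name
    }
}
```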

-Selection Tool: Use the slider to cut off lower scoring matches, which makes the more important matches easier to view.

-Width Tool: Enter a width in the field to the left of the Width button and click the button. From then on, the viewer will show that many bases on both sides of the matching motif.
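Showing a fixed number of bases on both sides of a match comes down to taking a substring and clamping it at the ends of the genome. A minimal sketch, with hypothetical names and 1-based match positions assumed:

```java
// Hypothetical helper, not part of MotifScan.
class FlankSketch {
    /**
     * Returns the match plus up to `width` bases on each side.
     * matchStart is 1-based; the range is clamped to the genome boundaries.
     */
    public static String withFlanks(String genome, int matchStart, int motifLength, int width) {
        int from = Math.max(0, matchStart - 1 - width);
        int to = Math.min(genome.length(), matchStart - 1 + motifLength + width);
        return genome.substring(from, to);
    }

    public static void main(String[] args) {
        // A 3 bp match at base 3, shown with 2 bases on each side.
        System.out.println(withFlanks("GGATCAATCG", 3, 3, 2)); // GGATCAA
    }
}
```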

-Show Percent: On the menu bar, use the percent menu to show relative amounts of matches. This lets you tune the amount of data you see: use a smaller percent to show fewer matches, and a larger one to show more.
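The two filtering ideas above (a score cut-off and a percentage of matches) can be sketched as follows. How MotifScan actually computes its percentage is not stated here, so keeping the highest-scoring share of matches, rounded up, is purely an assumption, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of MotifScan.
class MatchFilterSketch {
    /** Slider idea: keeps scores at or above the cut-off, in their original order. */
    public static List<Integer> aboveCutoff(List<Integer> scores, int cutoff) {
        List<Integer> kept = new ArrayList<>();
        for (int s : scores) {
            if (s >= cutoff) {
                kept.add(s);
            }
        }
        return kept;
    }

    /** Percent idea (assumed): keeps the highest-scoring `percent` of matches, rounding up. */
    public static List<Integer> topPercent(List<Integer> scores, double percent) {
        List<Integer> sorted = new ArrayList<>(scores);
        sorted.sort(Collections.reverseOrder());
        int keep = (int) Math.ceil(sorted.size() * percent / 100.0);
        keep = Math.max(0, Math.min(keep, sorted.size()));
        return new ArrayList<>(sorted.subList(0, keep));
    }
}
```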


Appendix 1:
To make a file that expresses a concrete motif:

Rules:
-Each line is one base pair in the motif.
-Each Column is the Log probability of getting that base at that position
-Column 1 = P(A), Column 2 = P(C), Column 3 = P(G), Column 4 = P(T).


Example:
-Since this uses log-probability, the numbers should be whole numbers, with their relative magnitude signifying certainty.

Sequence: ATC
Matrix: 100 0 0 0
0 0 0 100
0 100 0 0

Sequence: A[Pyrimidines C/T]C

Matrix: 100 0 0 0
0 50 0 50
0 100 0 0

Using self-made matrices, it is possible to formulate your own motifs and bypass MEME.
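Writing such a matrix by hand gets tedious for longer motifs, so here is a rough sketch of generating one. Everything in it is an assumption layered on the rules above: the bracket notation (for example A[CT]C), the class and method names, and the choice to split 100 evenly among the allowed bases, which generalizes the 100 and 50 values used in the examples. Columns follow the A, C, G, T order given in the rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MotifScan.
class ConcreteMotifSketch {
    static final String BASES = "ACGT";

    /** Builds one row per position; "[CT]" means either C or T at that position. */
    public static int[][] fromPattern(String pattern) {
        List<int[]> rows = new ArrayList<>();
        int i = 0;
        while (i < pattern.length()) {
            String allowed;
            if (pattern.charAt(i) == '[') {
                int close = pattern.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed '[' in " + pattern);
                }
                allowed = pattern.substring(i + 1, close);
                i = close + 1;
            } else {
                allowed = pattern.substring(i, i + 1);
                i++;
            }
            if (allowed.isEmpty()) {
                throw new IllegalArgumentException("Empty position in " + pattern);
            }
            int[] row = new int[4];
            for (char base : allowed.toUpperCase().toCharArray()) {
                int col = BASES.indexOf(base);
                if (col < 0) {
                    throw new IllegalArgumentException("Unknown base: " + base);
                }
                row[col] = 100 / allowed.length(); // split the weight evenly
            }
            rows.add(row);
        }
        return rows.toArray(new int[0][]);
    }

    /** Formats the matrix as four space-separated integers per line. */
    public static String toFileText(int[][] matrix) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : matrix) {
            sb.append(row[0]).append(' ').append(row[1]).append(' ')
              .append(row[2]).append(' ').append(row[3]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toFileText(fromPattern("A[CT]C")));
    }
}
```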

 

Download the installer. Please view the MotifScan1.doc file for help on using the program.

View Source Files. Try the Web Installer.