* Description

ProbeMatch is a sequence alignment program that finds alignments for short DNA sequences (36-50 bp). Unlike other programs such as eland and soap, which perform ungapped alignment allowing up to 2 substitutions, ProbeMatch performs *gapped* alignment, allowing up to 3 errors, including substitutions, insertions, and deletions.

* Installation

1. Run ./make in the probematch folder.
2. This will generate the binaries under the bin/ folder.

* Usage

- (1) Run the splitfasta program.

The splitfasta program splits a fasta file containing multiple sequences into a set of files, each of which contains no more than 16 sequences. Newly generated files are named with the file extension .sp.

    splitfasta -d [directory]
        All fasta files under the directory will be processed.

    splitfasta -f [file]
        The specified fasta file will be processed.

- (2) Run the probematch program.

usage: probematch [options]

    -i      input query file in fasta format
    -d      database directory that contains all database files
    -f      input database file in fasta format
    -o      output file
    -l      probe length
    -m      0: print a single top hit only
            1: print all multiple top hits
            Default: 0
    -h      0: print output in non-HTML format
            1: print output in HTML format
            2: print output in ELAND format
            Default: 0 (non-HTML format)
    -m      number of top hits to display. Controls the number of top
            hits written to the output file. For example, if the value
            is set to 5, the top 5 hits are displayed. Hits appear in
            descending order.
            Default: 5. Valid values: integers greater than 1.
    -x      0: non-expandable
            1: expandable. When set to expandable, the top-hit number is
            doubled in case the number of hits sharing the same score
            exceeds the top-hit number.
    --help  help

<< Note >>
- When using "-f", please make sure that the file contains no more than 16 sequences.

* Example

    > bin/splitfasta -f ./example/hs_ref_chrY.fa

This command splits the hs_ref_chrY.fa file into multiple fasta files, each of which contains no more than 16 sequences.
Each file is stored in the ./example/ directory with the file extension .sp.

    > bin/probematch -i ./example/query.fa -d ./example/ -o ./example/query.output -l 36

This command reads all .sp files under the ./example/ directory and searches query.fa against each .sp file.

* Limitation

- ProbeMatch handles DNA sequences between 36 and 50 bp in length.
- ProbeMatch searches against genomic sequences. Searching against other kinds of sequences, for example EST sequences, is not supported.

* Contact

teletia@cs.wisc.edu
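
* Appendix: sketch of the splitting step

The splitting step described under Usage (a multi-sequence fasta file broken into files of no more than 16 sequences each) can be illustrated with a short Python sketch. This is not the splitfasta source code, only a minimal illustration under stated assumptions: the function names (read_fasta, split_records, split_fasta_file) are hypothetical, and the output naming scheme "<input name>.<n>.sp" is assumed, since this README only states that the generated files carry the extension .sp.

```python
from itertools import islice
from pathlib import Path

MAX_SEQS = 16  # per-file sequence limit stated in this README


def read_fasta(text):
    """Yield (header, sequence) pairs from fasta-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records, max_seqs=MAX_SEQS):
    """Group records into lists holding at most max_seqs entries each."""
    it = iter(records)
    while True:
        batch = list(islice(it, max_seqs))
        if not batch:
            return
        yield batch


def split_fasta_file(path, max_seqs=MAX_SEQS):
    """Write each batch next to the input file; the naming is an assumption."""
    path = Path(path)
    records = read_fasta(path.read_text())
    written = []
    for n, batch in enumerate(split_records(records, max_seqs), start=1):
        out = path.with_name(f"{path.name}.{n}.sp")  # hypothetical naming scheme
        out.write_text("".join(f">{h}\n{s}\n" for h, s in batch))
        written.append(out)
    return written
```

Calling split_fasta_file("./example/hs_ref_chrY.fa") would, under these assumptions, write one .sp file per group of up to 16 sequences into ./example/ and return the list of paths it wrote. Sequence lines are joined into a single line per record in the output, which is a simplification of this sketch.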