* Description
ProbeMatch is a sequence alignment program that finds alignments for short DNA sequences (36-50 bp).
Unlike programs such as ELAND and SOAP, which perform ungapped alignment allowing up to 2 substitutions,
ProbeMatch performs *gapped* alignment, allowing up to 3 errors, including substitutions, insertions, and deletions.
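
To illustrate what it means to count substitutions, insertions, and deletions as errors, here is a minimal Python sketch of the classic edit-distance (Levenshtein) computation. It is not ProbeMatch's actual algorithm, which this file does not describe, and it compares two whole strings rather than locating a probe inside a genome. The names `edit_distance` and `within_error_limit` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def within_error_limit(probe: str, target: str, max_errors: int = 3) -> bool:
    """True if probe and target differ by at most max_errors edits."""
    return edit_distance(probe, target) <= max_errors
```

An ungapped aligner would only allow the substitution case; permitting the insert and delete cases is what makes the alignment gapped.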

* Installation
  1. Run ./make in the probematch folder.
  2. This generates the binaries under the bin/ folder.

* Usage
- (1) Run the splitfasta program.
  The splitfasta program splits a fasta file containing multiple sequences into a set of files, each of which
  contains no more than 16 sequences. Newly generated files are named with a file extension .sp.

  splitfasta -d [directory]
  All fasta files under the directory will be processed. 
  
  splitfasta -f [file] 
  The specified fasta file will be processed.

- (2) Run the probematch program.
  usage: probematch [options]
  -i <str>                input query file in fasta format
  -d <str>                database directory that contains all database files
  -f <str>                input database file in fasta format
  -o <str>                output file
  -l <positive int>       probe length
  -m <int>                0: print a single top hit only. 1: print all multiple
                          top hits. Default: 0.
  -h <int>                0: print output in non-HTML format.
                          1: print output in HTML format.
                          2: print output in ELAND format.
                          Default: 0 (non-HTML format).
  -m <int>                number of top hits to display. For example, if the
                          value is 5, the top 5 hits are displayed, in
                          descending order. Default: 5. Valid values: integers
                          greater than 1.
  -x <int>                0: non-expandable. 1: expandable.
                          If set to expandable, the number of top hits is
                          doubled when the number of hits sharing the same
                          score exceeds the top-hit number.
  --help                  help

  << Note >>
  - When using "-f <filename>", make sure that the file contains no more
    than 16 sequences.
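
  One way to check this limit before passing a file to "-f" is to count the
  fasta header lines, which begin with ">". The Python sketch below is not part
  of ProbeMatch; the names `count_fasta_records` and `check_database_file` are
  hypothetical.

```python
MAX_SEQUENCES = 16  # limit stated in the note above


def count_fasta_records(lines):
    """Count fasta records by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def check_database_file(path, limit=MAX_SEQUENCES):
    """Return (ok, count), where ok is True if the file is within the limit."""
    with open(path) as handle:
        count = count_fasta_records(handle)
    return count <= limit, count
```

  A file that fails the check can be split with splitfasta first.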

* Example

  >bin/splitfasta -f ./example/hs_ref_chrY.fa

  This command splits the hs_ref_chrY.fa file into multiple fasta files, each
  of which contains no more than 16 sequences. Each file is stored in the
  ./example/ directory with a file extension .sp.
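
  The splitting step can be pictured with the following Python sketch. It is
  not the splitfasta source code; it only mimics the behavior described above.
  The output naming scheme (input name, chunk number, then .sp) is an
  assumption, since this file states only the .sp extension, and the function
  names are hypothetical.

```python
from pathlib import Path


def read_fasta_records(path):
    """Yield each record (header line plus its sequence lines) as one string."""
    record = None  # stays None until the first header line is seen
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if record is not None:
                    yield "".join(record)
                record = [line]
            elif record is not None and line.strip():
                record.append(line)
    if record is not None:
        yield "".join(record)


def split_fasta(path, max_per_file=16):
    """Write chunks of at most max_per_file records; return the paths written."""
    path = Path(path)
    outputs = []
    chunk = []

    def flush():
        # Assumed naming scheme: <input name>.<chunk number>.sp
        out = path.with_name(f"{path.name}.{len(outputs) + 1}.sp")
        out.write_text("".join(chunk))
        outputs.append(out)
        chunk.clear()

    for record in read_fasta_records(path):
        chunk.append(record)
        if len(chunk) == max_per_file:
            flush()
    if chunk:
        flush()
    return outputs
```

  Under these assumptions, a file with 35 sequences yields three output files
  holding 16, 16, and 3 sequences.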

  >bin/probematch -i ./example/query.fa -d ./example/ -o ./example/query.output -l 36

  This command reads all .sp files under the ./example/ directory and searches
  query.fa against each .sp file.
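
  When the search is driven from a script, the same command line can be
  assembled programmatically. The Python sketch below uses only the -i, -d, -o,
  and -l options listed under Usage; the helper name `probematch_command` and
  the default binary path are assumptions for illustration.

```python
import shlex


def probematch_command(query, db_dir, output, probe_length,
                       binary="bin/probematch"):
    """Build the argument list for a probematch run against a database directory."""
    return [
        binary,
        "-i", query,              # query file in fasta format
        "-d", db_dir,             # directory holding the .sp database files
        "-o", output,             # output file
        "-l", str(probe_length),  # probe length
    ]


cmd = probematch_command(
    "./example/query.fa", "./example/", "./example/query.output", 36
)
print(shlex.join(cmd))  # shows the command as it would be typed in a shell
```

  The resulting list can be passed to subprocess.run(cmd, check=True) to
  execute the search.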
  
* Limitation
  - ProbeMatch handles DNA sequences between 36 and 50 bp in length.
  - ProbeMatch searches against genomic sequences. Searching against other
    sequence types, for example EST sequences, is not supported.
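
  The length limit can be checked on a query file before running a search.
  The Python sketch below is not part of ProbeMatch; it assumes a plain fasta
  layout in which a sequence may span several lines, and the names
  `sequence_lengths` and `out_of_range` are hypothetical.

```python
MIN_LEN, MAX_LEN = 36, 50  # supported range stated above, in bp


def sequence_lengths(lines):
    """Return {header: length} for fasta text given as an iterable of lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequences may span several lines
    return lengths


def out_of_range(lines, lo=MIN_LEN, hi=MAX_LEN):
    """Return the sequences whose length falls outside lo..hi, with lengths."""
    return {
        name: length
        for name, length in sequence_lengths(lines).items()
        if not lo <= length <= hi
    }
```

  For a query file on disk, out_of_range(open(path)) returns an empty
  dictionary when every sequence is within the supported range.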

* Contact
	teletia@cs.wisc.edu