| Author: | Titus Brown |
|---|---|
| Date: | March 13th, 2005 |
| Version: | 1.0 |
paircomp is a toolkit for doing ungapped comparisons of two DNA sequences. It contains a C++ library, several standalone command-line programs, and a Python interface to the C++ library.
You can download paircomp 1.0 here: paircomp-1.0.tar.gz.
The C++ library is located under lib/. It contains routines for creating and manipulating fixed-width window comparisons of two sequences, as well as functions for saving them to, and loading them from, files.
The command line programs are all written in C++ and located under bin/. Run the programs without arguments to see Command line parameters. The three binary programs are:
paircomp
paircomp does an O(N*M) comparison of two sequences, and saves the result in a simple format consisting of tab-delimited columns of four numbers: top position, bottom position, match length, and orientation (1/-1) of the bottom match with respect to the bottom sequence.
seqcomp
seqcomp does the same comparison as paircomp but produces an output format compatible with that of the original seqcomp. This program is present for backwards compatibility and should not be used.
find_patch
find_patch does an O(N+M) comparison of two sequences. It will slow down exponentially as window sizes increase and thresholds drop; it's designed for 10bp comparisons with a threshold of 90%.
The python library package is called paircomp and is kept under python/. To install it, first run 'make' in the top level directory; then, go into python/ and type python setup.py install.
See the Python API documentation for more information.
Elliot Bush pointed out that the current implementation of three-way filtering varies wildly in speed depending on the sequences compared. Filtering comparisons of sequences containing many low-complexity sequence repeats will be especially slow.
Tristan De Buysscher wrote the initial C version of seqcomp, and helped develop the format. The find_patch/O(N+M) algorithm was written by Shoudan Liang of the NASA Ames Research Center. Daniel Fu, Alok Saldanha, and Eliot Bush have contributed bug reports & documentation suggestions.