SeqComp


SeqComp is a simple C program that takes two FASTA-format sequence files and performs the basic analysis that "dot-plot" DNA comparison programs do. It has been written in C with a few basic optimizations so that it can handle large sequences (tens of kb, even a hundred kb) at large window sizes (20+ bp) in a reasonable amount of time (i.e., minutes). For example, a 30 kb vs. 30 kb comparison, with either a 30 bp or a 100 bp window, takes about a minute and a half on a 700 MHz Pentium III.
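To give a rough idea of what a windowed dot-plot comparison involves, here is a minimal sketch in C. It is not taken from the SeqComp source and may differ from what SeqComp actually does. The function name count_window_hits, its parameters, the tab-separated output, and the "at least min_match identical bases per window" scoring rule are all assumptions made for illustration. The sketch walks each diagonal of the comparison matrix and updates the match count as the window slides, so the cost per window does not grow with the window size. Comparison is case-sensitive and only the forward strand is considered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the SeqComp source.
 * Compares every window of `window` bases in `a` against every window
 * in `b` and counts the pairs with at least `min_match` identical
 * positions. If `out` is non-NULL, each hit is written as
 * "i<TAB>j<TAB>matches" (0-based window start positions). */
long count_window_hits(const char *a, const char *b,
                       long window, long min_match, FILE *out)
{
    long la, lb, d, i, j, k, matches, hits = 0;

    assert(a != NULL && b != NULL);
    la = (long)strlen(a);
    lb = (long)strlen(b);
    if (window <= 0 || la < window || lb < window)
        return 0;

    /* Walk each diagonal d = j - i of the comparison matrix. */
    for (d = -(la - window); d <= lb - window; d++) {
        i = (d < 0) ? -d : 0;
        j = (d < 0) ? 0 : d;

        /* Score the first window on this diagonal from scratch. */
        matches = 0;
        for (k = 0; k < window; k++)
            matches += (a[i + k] == b[j + k]);

        for (;;) {
            if (matches >= min_match) {
                hits++;
                if (out != NULL)
                    fprintf(out, "%ld\t%ld\t%ld\n", i, j, matches);
            }
            if (i + window >= la || j + window >= lb)
                break;
            /* Slide one base: drop the leading pair, add the trailing pair. */
            matches -= (a[i] == b[j]);
            matches += (a[i + window] == b[j + window]);
            i++;
            j++;
        }
    }
    return hits;
}
```

With this sketch, calling count_window_hits("ACGTACGT", "ACGTACGT", 4, 4, stdout) would report the five windows on the main diagonal plus the two off-diagonal repeats of "ACGT", for a return value of 7.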

It is written in standard ANSI C, so it should compile on any system with a decent C compiler. The input and output files are all plain text, with unix-style end-of-lines (unix, linux and MacOS X are fine with this). Because I lack access to a Windows machine with a C compiler, its Windows compatibility has not been tested. The program also uses command-line arguments, so how it would be used on Windows or Classic MacOS (9.x or lower) is unclear to me. However, the program is GPL'd, so anyone can feel free to make the changes (if any) needed for it to run properly in these cases.


  • seqcomp.tar.gz

        On unix/linux, untar the file with:

       prompt> tar -xzf seqcomp.tar.gz
        MacOS X should uncompress it automatically if downloaded through a browser; if not, use the command above in a terminal window.

  • Algorithm description -- what SeqComp does...
    Tristan De Buysscher, tristanfamily.caltech.edu .