PCAP Genome Assembly Program and CAP3 Sequence Assembly Program


PCAP and CAP3 code: Download

Assembly Papers

Huang, X. and Madan, A. (1999)
CAP3: A DNA Sequence Assembly Program. Genome Research, 9: 868-877.

Huang, X. and Yang, S.-P. (2005)
Generating a Genome Assembly with PCAP.
In Baxevanis, A. D., Davison, D. B., Page, R. D. M., Petsko, G. A., Stein, L. D. and Stormo, G. D. (eds)
Current Protocols in Bioinformatics, Volume 2, John Wiley & Sons, 11.3.1-23.
This book chapter gives a step-by-step description of how to use PCAP.

Huang, X., Wang, J., Aluru, S., Yang, S.-P. and Hillier, L. (2003)
PCAP: A Whole-Genome Assembly Program. Genome Research, 13: 2164-2170.

Huang, X., Yang, S.-P., Chinwalla, A., Hillier, L., Minx, P., Mardis, E. and Wilson, R. (2006)
Application of a Superword Array in Genome Assembly. Nucleic Acids Research, 34: 201-205.
Supplemental Data

Maize Partial Genome Assembly

A partial assembly of the maize genome has been produced with the PCAP program. The assembly was run on an SGI Altix with 30 GB of main memory and sixteen 900 MHz processors. We are grateful to Haruna Cofer of Silicon Graphics for making the computer available for the maize assembly project.

Chicken Whole Genome Assembly

An assembly of the chicken genome has been produced with PCAP and other programs by the Washington University Genome Sequencing Center. The assembly results are available from GenBank.

Chimpanzee Whole Genome Assembly

An assembly of the chimpanzee genome has been produced with PCAP and other programs by the Washington University Genome Sequencing Center. The assembly results are available from GenBank.

Mouse Whole Genome Assembly

A mouse whole-genome data set has been assembled by Xiaoqiu Huang in the Department of Computer Science and the Plant Sciences Institute at Iowa State University. The assembly was performed with the PCAP program (Parallel Contig Assembly Program) on a cluster of Compaq ES40 servers.

The assembly used the December release of the public whole-genome shotgun data from NCBI, which was produced by the Mouse Genome Sequencing Consortium. After 2 million low-quality reads were removed, the resulting data set of 30 million reads was used as input to the PCAP program.

The Compaq cluster was made available for the assembly project by Ray Hookway and Eamonn O'Toole at Compaq. A Compaq ES40 server was loaned to us for PCAP development and testing with the help of Nat Goodman. The PCAP program is being developed by Xiaoqiu Huang with his students at ISU. The project is supported by the National Human Genome Research Institute under NIH grant R01 HG01502-06.


Acknowledgements

We are grateful to Shiaw-Pyng Yang, LaDeana Hillier, Asif Chinwalla, Pat Minx, and Rick Wilson at the Washington University Genome Sequencing Center for their collaboration on the development of PCAP. We thank Liang Ye for assistance with setting up the PCAP/CAP3 download page.

Correspondence

Xiaoqiu Huang
Department of Computer Science
Iowa State University
226 Atanasoff Hall
Ames, IA 50011
Email: xqhuang@cs.iastate.edu