archive:bioinformatic_tools:abyss

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:abyss [2010/04/15 19:31]
galt
archive:bioinformatic_tools:abyss [2010/05/10 23:15]
jstjohn
Line 1: Line 1:
 ====== ABySS ======
- 
 ===== Overview =====
 ABySS[(cite:abyss>Jared T. Simpson, Kim Wong, Shaun D. Jackman, Jacqueline E. Schein, Steven J.M. Jones, and İnanç Birol. ABySS: A parallel assembler for short read sequence data. //Genome Res.// June 2009 19: 1117-1123; Published in Advance February 27, 2009, doi:[[http://dx.doi.org/10.1101/gr.089532.108|10.1101/gr.089532.108]].)] stands for **A**ssembly **By** **S**hort **S**equences.
  
-ABySS is a //de novo// parallel, paired-end, short read DNA sequence assembler. The single processor version can assemble genomes of up to 100 Mbases.[(cite:website>[[http://www.bcgsc.ca/platform/bioinfo/software/abyss]])] The parallel version uses MPI and can assemble larger genomes.[(cite:website)] It was used for assembly of a transcriptome from the tumor tissue of a patient with follicular lymphoma.[(cite:Biroletal>Inanç Birol, Shaun D. Jackman, Cydney B. Nielsen, Jenny Q. Qian, Richard Varhol, Greg Stazyk, Ryan D. Morin, Yongjun Zhao, Martin Hirst, Jacqueline E. Schein, Doug E. Horsman, Joseph M. Connors, Randy D. Gascoyne, Marco A. Marra, and Steven J. M. Jones. De novo transcriptome assembly with ABySS. //Bioinformatics// 25: 2872-2877. Advance Access published on November 1, 2009, doi:[[http://dx.doi.org/10.1093/bioinformatics/btp367|10.1093/bioinformatics/btp367]].)]
 +ABySS is a //de novo//, parallel, paired-end, short-read DNA sequence assembler. \\
 +The single-processor version can assemble genomes of up to 100 Mbp.[(cite:website>[[http://www.bcgsc.ca/platform/bioinfo/software/abyss]])]\\
 +The parallel version uses MPI and can assemble larger genomes.[(cite:website)] \\
 +It was used to assemble a transcriptome from the tumor tissue of a patient with follicular lymphoma.[(cite:Biroletal>İnanç Birol, Shaun D. Jackman, Cydney B. Nielsen, Jenny Q. Qian, Richard Varhol, Greg Stazyk, Ryan D. Morin, Yongjun Zhao, Martin Hirst, Jacqueline E. Schein, Doug E. Horsman, Joseph M. Connors, Randy D. Gascoyne, Marco A. Marra, and Steven J. M. Jones. De novo transcriptome assembly with ABySS. //Bioinformatics// 25: 2872-2877. Advance Access published on November 1, 2009, doi:[[http://dx.doi.org/10.1093/bioinformatics/btp367|10.1093/bioinformatics/btp367]].)]
  
-===== Installing =====
 +ABySS supports k-mer sizes greater than 31.
  
-  cd /campusdata/BME235/programs
-  wget http://www.bcgsc.ca/downloads/abyss/abyss-1.1.2.tar.gz
-  tar xfz abyss-1.1.2.tar.gz
-  mv abyss-1.1.2 abyss
-  mv abyss-1.1.2.tar.gz abyss/
-  cd abyss
  
 +Illumina also recommends ABySS for assembling large genomes. {{:bioinformatic_tools:abyss_technote_illumina.pdf|Illumina Technote Paper}}
  
-==== Website ====
-[[http://www.bcgsc.ca/platform/bioinfo/software/abyss]]
 
-==== Source with Binaries and Documentation ====
-[[http://www.bcgsc.ca/platform/bioinfo/software/abyss/releases]]
 +==== Installing ====
 +
 +Download the source archives for ABySS and its dependencies, then unpack them:
 +
 +<code>
 +cd /campusdata/BME235/programs
 +wget http://www.bcgsc.ca/downloads/abyss/abyss-1.1.2.tar.gz
 +wget http://www.open-mpi.org/software/ompi/v1.4/downloads/openmpi-1.4.1.tar.gz
 +wget http://google-sparsehash.googlecode.com/files/sparsehash-1.7.tar.gz
 +tar xfz abyss-1.1.2.tar.gz
 +tar xfz openmpi-1.4.1.tar.gz
 +tar xfz sparsehash-1.7.tar.gz
 +mv abyss-1.1.2.tar.gz abyss-1.1.2/
 +mv openmpi-1.4.1.tar.gz openmpi-1.4.1/
 +mv sparsehash-1.7.tar.gz sparsehash-1.7/
 +</code>
 + 
 +First, compile and install OpenMPI and Google sparsehash so that ABySS can be built against them.
 + 
 +<code>
 +cd /campusdata/BME235/programs/openmpi-1.4.1
 +./configure --prefix=/campusdata/BME235
 +make
 +make install
 +cd /campusdata/BME235/programs/sparsehash-1.7
 +./configure --prefix=/campusdata/BME235
 +make
 +make install
 +</code>
 + 
 +Next, apply a [[http://code.google.com/p/google-sparsehash/issues/detail?id=55|patch]] so that ABySS compiles correctly with support for Google sparsehash 1.7. This issue will be fixed in the next release of Google sparsehash. The attachment URL contains ''?'' and ''&'', so it must be quoted, and ''-O'' gives the download the file name that ''patch'' expects.
 +
 +<code>
 +cd /campusdata/BME235/include/google/sparsehash
 +wget -O deallocate.diff 'http://google-sparsehash.googlecode.com/issues/attachment?aid=-5666329961626930947&name=deallocate.diff'
 +patch < deallocate.diff
 +</code>
 + 
 +Now ABySS can be compiled with OpenMPI and Google sparsehash support. 
 + 
 +<code>
 +cd /campusdata/BME235/programs/abyss-1.1.2
 +./configure --prefix=/campusdata/BME235 CPPFLAGS=-I/campusdata/BME235/include
 +make
 +make install
 +</code>
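
Everything above is installed under the prefix /campusdata/BME235, so a shell that runs ABySS has to find the binaries and shared libraries there. The following is a minimal sketch, assuming the usual ''bin/'' and ''lib/'' layout that ''make install'' produces under a prefix; the ''PREFIX'' variable is only an illustration and is not part of the original notes.

```shell
#!/bin/bash
# Sketch: expose the custom install prefix to the current shell.
# PREFIX matches the --prefix passed to ./configure above; the bin/ and lib/
# subdirectories are the common autotools defaults (an assumption here).
PREFIX=${PREFIX:-/campusdata/BME235}

# Prepend $PREFIX/bin to PATH unless it is already listed.
case ":$PATH:" in
    *":$PREFIX/bin:"*) ;;
    *) PATH="$PREFIX/bin:$PATH" ;;
esac
export PATH

# Prepend $PREFIX/lib to LD_LIBRARY_PATH so shared libraries installed
# under the prefix can be found at run time. LD_LIBRARY_PATH may be unset.
case ":${LD_LIBRARY_PATH:-}:" in
    *":$PREFIX/lib:"*) ;;
    *) LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
esac
export LD_LIBRARY_PATH
```

Sourcing these lines from a login script or at the top of a job script keeps the setting in one place; running them twice does not duplicate the entries.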
 + 
 +==== Websites ====
 +[[http://www.bcgsc.ca/platform/bioinfo/software/abyss|ABySS]] \\
 +[[http://www.open-mpi.org|OpenMPI]] \\
 +[[http://code.google.com/p/google-sparsehash|Google sparsehash]]
 +
 +==== Sources with Binaries and Documentation ====
 +[[http://www.bcgsc.ca/platform/bioinfo/software/abyss/releases|ABySS]] \\
 +[[http://www.open-mpi.org/software/ompi|OpenMPI]] \\
 +[[http://code.google.com/p/google-sparsehash/downloads/list|Google sparsehash]]
 + 
 +===== Slug Assembly ===== 
 +In the directory:
 +  /campus/BME235/assemblies/slug/ABySS-assembly1
 +I ran the following command to start the assembly in parallel MPI mode. Note that the ABySS binaries were built with OpenMPI 1.4, but I am using mpirun 1.3. Once OpenMPI 1.4 is reinstalled with SGE support, I will rerun the assembly with it if there are problems. Here is the command executed to start the process:
 +  /campus/BME235/assemblies/slug/ABySS-assembly1
 +
 +And here are the contents of the script I use to run everything:
 + 
 +  #!/bin/bash
 +  #
 +  #$ -cwd
 +  #$ -j y
 +  #$ -S /bin/bash
 +  #$ -V
 +  #$ -l mem_free=15g
 +  #
 +  /opt/openmpi/bin/mpirun -np $NSLOTS abyss-pe -j j=2 np=$NSLOTS n=8 k=25 name=slugAbyss \
 +    lib='lane1 lane2 lane3 lane5 lane6 lane7 lane8' \
 +    lane1='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_1_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_1_2_all_qseq.fastq' \
 +    lane2='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_2_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_2_2_all_qseq.fastq' \
 +    lane3='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_3_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_3_2_all_qseq.fastq' \
 +    lane5='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_5_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_5_2_all_qseq.fastq' \
 +    lane6='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_6_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_6_2_all_qseq.fastq' \
 +    lane7='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_7_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_7_2_all_qseq.fastq' \
 +    lane8='/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_8_1_all_qseq.fastq /campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads/s_8_2_all_qseq.fastq'
 + 
 +Unfortunately, this command crashes. The error message says that LD_LIBRARY_PATH might need to be set to point to the shared MPI libraries. It would also probably be best to use our own build of "mpirun" once it is compiled with SGE support.
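
Since the error points at LD_LIBRARY_PATH, one thing to try is exporting it in the job script before launching anything. The sketch below also builds the per-lane arguments in a loop instead of spelling out every path. It is a dry run that only prints the command. Two things in it are assumptions rather than facts from this page: ''MPI_PREFIX'' is a guessed location for the OpenMPI build, and the sketch relies on my understanding that ''abyss-pe'' starts ''mpirun'' itself when ''np='' is given, so it drops the outer ''mpirun'' wrapper. The ''-j j=2'' options from the original script are left out.

```shell
#!/bin/bash
# Sketch: build the abyss-pe arguments for the slug lanes in a loop.
# READS and the lane numbers are taken from the job script above.
# MPI_PREFIX is an ASSUMED location for the OpenMPI build; adjust as needed.
READS=/campus/BME235/data/slug/Illumina/illumina_run_1/CeleraReads
MPI_PREFIX=${MPI_PREFIX:-/campusdata/BME235}

# Make the shared MPI libraries visible, as the error message suggests.
export LD_LIBRARY_PATH="$MPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

libs=""
set --    # reset the positional parameters; one laneN='read1 read2' per lane
for n in 1 2 3 5 6 7 8; do
    libs="${libs:+$libs }lane$n"
    set -- "$@" "lane$n=$READS/s_${n}_1_all_qseq.fastq $READS/s_${n}_2_all_qseq.fastq"
done

# Dry run: print the command instead of executing it; remove "echo" to run.
# ASSUMPTION: abyss-pe launches mpirun on its own when np= is set.
echo abyss-pe np="${NSLOTS:-1}" n=8 k=25 name=slugAbyss "lib=$libs" "$@"
```

Keeping the lane list in one place makes it harder to mistype a path, and the dry run shows exactly which arguments would be passed before any cluster time is spent.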
  
 ===== References =====
 <refnotes>notes-separator: none</refnotes>
 ~~REFNOTES cite~~
- 
- 
archive/bioinformatic_tools/abyss.txt · Last modified: 2015/07/28 06:23 by ceisenhart