



======Team 4 | ABySS ======

**A**ssembly **By** **S**hort **S**equences - a //de novo//, parallel, paired-end sequence assembler

=====Team composition=====

| Name | Email |
| Jared Copher | jcopher@ucsc.edu |
| Emilio Feal | efeal@ucsc.edu |
| Sidra Hussain | sihussai@ucsc.edu |

=====ABySS overview=====

ABySS is published by Canada's Michael Smith Genome Sciences Centre, and was the first //de novo// assembler for large genomes recommended by Illumina in [[http://www.illumina.com/Documents/products/technotes/technote_denovo_assembly_ecoli.pdf|this technical note]] when using only their data. The ABySS team is active on [[https://www.biostars.org/t/Abyss/|BioStars]], where they recommend that all technical questions be asked.

  * [[http://www.bcgsc.ca/platform/bioinfo/software/abyss | ABySS main site]]
  * [[http://genome.cshlp.org/content/19/6/1117.full.pdf| ABySS paper]]
  * [[https://github.com/bcgsc/abyss| ABySS manual and source code]]

=====Installing ABySS=====

The ABySS source code was downloaded from GitHub:

<code>
% lftpget https://github.com/bcgsc/abyss/archive/master.zip
</code>

ABySS needs to be configured with its dependencies:

<code>
% ./autogen.sh
# --enable-maxk must be a multiple of 32
% ./configure --prefix=/campusdata/BME235 \
    --enable-maxk=96 \
    --enable-dependency-tracking \
    --with-boost=/campusdata/BME235/include/boost \
    --with-mpi=/campusdata/BME235/include \
    CC=gcc-4.9.2 CXX=g++-4.9.2 \
    CPPFLAGS=-I/campusdata/BME235/include/sparsehash
</code>

ABySS can then be built and installed via the makefile:

<code>
% make
% make install
</code>

=====ABySS parameters=====

Parameters of the driver script, abyss-pe, and their [default value]:

  * a: maximum number of branches of a bubble [2]
  * b: maximum length of a bubble (bp) [10000]
  * c: minimum mean k-mer coverage of a unitig [sqrt(median)]
  * d: allowable error of a distance estimate (bp) [6]
  * e: minimum erosion k-mer coverage [sqrt(median)]
  * E: minimum erosion k-mer coverage per strand [1]
  * j: number of threads [2]
  * k: size of k-mer (bp) [no default]
  * l: minimum alignment length of a read (bp) [k]
  * m: minimum overlap of two unitigs (bp) [30]
  * n: minimum number of pairs required for building contigs [10]
  * N: minimum number of pairs required for building scaffolds [n]
  * p: minimum sequence identity of a bubble [0.9]
  * q: minimum base quality [3]
  * s: minimum unitig size required for building contigs (bp) [200]
  * S: minimum contig size required for building scaffolds (bp) [s]
  * t: minimum tip size (bp) [2k]
  * v: use v=-v for verbose logging, v=-vv for extra verbose [disabled]

Please see the abyss-pe manual page for more information on assembly parameters.

abyss-pe parameters may share names with existing environment variables. Such parameters cannot be used until the environment variables are unset. To detect such clashes, run the command:

<code>
abyss-pe env [options]
</code>

This command reports all abyss-pe parameters and where each one is set, but it does not run any ABySS programs.

=====Running ABySS=====

abyss-pe is a driver script implemented as a Makefile. Any option of make may be used with abyss-pe. Particularly useful options are:

<code>
-C dir, --directory=dir
</code>

Change to the directory dir and store the results there.

<code>
-n, --dry-run
</code>

Print the commands that would be executed, but do not execute them.

===Commands of abyss-pe===

  * default: Equivalent to "scaffolds scaffolds-dot stats".
  * unitigs: Assemble unitigs.
  * unitigs-dot: Output the unitig overlap graph.
  * pe-sam: Map paired-end reads to the unitigs and output a SAM file.
  * pe-bam: Map paired-end reads to the unitigs and output a BAM file.
  * pe-index: Generate an index of the unitigs used by abyss-map.
  * contigs: Assemble contigs.
  * contigs-dot: Output the contig overlap graph.
  * mp-sam: Map mate-pair reads to the contigs and output a SAM file.
  * mp-bam: Map mate-pair reads to the contigs and output a BAM file.
  * mp-index: Generate an index of the contigs used by abyss-map.
  * scaffolds: Assemble scaffolds.
  * scaffolds-dot: Output the scaffold overlap graph.
  * stats: Display assembly contiguity statistics.
  * clean: Remove intermediate files.
  * version: Display the version of abyss-pe.
  * versions: Display the versions of all programs used by abyss-pe.
  * help: Display a helpful message.

===Programs in pipeline===

abyss-pe uses the following programs, which must be found in your PATH:

  * ABYSS: de Bruijn graph assembler
  * ABYSS-P: parallel (MPI) de Bruijn graph assembler
  * AdjList: find overlapping sequences
  * DistanceEst: estimate the distance between sequences
  * MergeContigs: merge sequences
  * MergePaths: merge overlapping paths
  * Overlap: find overlapping sequences using paired-end reads
  * PathConsensus: find a consensus sequence of ambiguous paths
  * PathOverlap: find overlapping paths
  * PopBubbles: remove bubbles from the sequence overlap graph
  * SimpleGraph: find paths through the overlap graph
  * abyss-fac: calculate assembly contiguity statistics
  * abyss-filtergraph: remove shim contigs from the overlap graph
  * abyss-fixmate: fill the paired-end fields of SAM alignments
  * abyss-map: map reads to a reference sequence
  * abyss-scaffold: scaffold contigs using distance estimates
  * abyss-todot: convert graph formats and merge graphs

=====ABySS pipeline=====

{{ :bioinformatic_tools:abysspipeline.png?nolink |}}
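
As a rough sketch of how the make options and abyss-pe parameters described under "Running ABySS" and "ABySS parameters" might be combined, the shell function below prints one dry-run command per k value for a k-mer sweep. It only prints the commands and never runs abyss-pe. The function name abyss_sweep_cmds, the assembly name ecoli, and the read file names are hypothetical placeholders. The name= and in= parameters belong to abyss-pe but are not listed in the parameter list on this page, so check the abyss-pe manual page before relying on them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one abyss-pe command per k value.
# Arguments: assembly name, quoted list of read files, then the k values.
# Assumption: name= and in= are abyss-pe parameters not listed on this page.
abyss_sweep_cmds() {
    name=$1
    reads=$2
    shift 2
    for k in "$@"; do
        # -n asks make for a dry run; -C runs inside a per-k directory
        printf "abyss-pe -n -C k%s k=%s name=%s in='%s'\n" \
            "$k" "$k" "$name" "$reads"
    done
}

# Example: sweep three k values with placeholder read files
abyss_sweep_cmds ecoli '../reads1.fq ../reads2.fq' 32 64 96
```

Each k value must not exceed the --enable-maxk value chosen at configure time (96 in the installation above), and each directory passed to -C has to exist before the printed command is run. Dropping -n from a printed command would start the actual assembly.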

contributors/team_4_page.1431051932.txt.gz · Last modified: 2015/05/08 02:25 (external edit)