======Team 4 | ABySS======

**A**ssembly **By** **S**hort **S**equences - a //de novo//, parallel, paired-end sequence assembler

=====Team composition=====

| Name | Email |
| Jared Copher | |
| Emilio Feal | |
| Sidra Hussain | |

=====ABySS overview=====

ABySS is published by Canada's Michael Smith Genome Sciences Centre. It was the first //de novo// assembler for large genomes that Illumina recommended (in [[|this technical note]]) for assemblies built from Illumina data alone. The ABySS team is active on [[|BioStars]], where they recommend that all technical questions be asked.

  * [[ | ABySS main site]]
  * [[| ABySS paper]]
  * [[| ABySS manual and source code]]

=====Installing ABySS=====

The ABySS source code was downloaded from GitHub:

<code>
% lftpget
</code>

ABySS needs to be configured with its dependencies:

<code>
% ./
% ./configure --prefix=/campusdata/BME235 \
    --enable-dependency-tracking \
    --with-boost=/campusdata/BME235/include/boost \
    --with-mpi=/campusdata/BME235/include \
    CC=gcc-4.9.2 CXX=g++-4.9.2 \
    CPPFLAGS=-I/campusdata/BME235/include/sparsehash
</code>

ABySS can then be built and installed via the makefile:

<code>
% make
% make install
</code>

=====ABySS parameters=====

Parameters of the driver script, abyss-pe (default values are shown in square brackets):

  * a: maximum number of branches of a bubble [2]
  * b: maximum length of a bubble (bp) [10000]
  * c: minimum mean k-mer coverage of a unitig [sqrt(median)]
  * d: allowable error of a distance estimate (bp) [6]
  * e: minimum erosion k-mer coverage [sqrt(median)]
  * E: minimum erosion k-mer coverage per strand [1]
  * j: number of threads [2]
  * k: size of k-mer (bp)
  * l: minimum alignment length of a read (bp) [k]
  * m: minimum overlap of two unitigs (bp) [30]
  * n: minimum number of pairs required for building contigs [10]
  * N: minimum number of pairs required for building scaffolds [n]
  * p: minimum sequence identity of a bubble [0.9]
  * q: minimum base quality [3]
  * s: minimum unitig size required for building contigs (bp) [200]
  * S: minimum contig size required for building scaffolds (bp) [s]
  * t: minimum tip size (bp) [2k]
  * v: use v=-v for verbose logging, v=-vv for extra verbose [disabled]

Please see the abyss-pe manual page for more information on assembly parameters.

An abyss-pe parameter may have the same name as an existing environment variable. Such a parameter cannot be used until the environment variable is unset. To detect these cases, run the command:

<code>
abyss-pe env [options]
</code>

This command reports every abyss-pe parameter that is set, whatever its origin. It does not run any ABySS programs.

=====ABySS Programs=====

abyss-pe is a driver script implemented as a Makefile. Any option of make may be used with abyss-pe. Particularly useful options are:

<code>
-C dir, --directory=dir
</code>

Change to the directory dir and store the results there.

<code>
-n, --dry-run
</code>

Print the commands that would be executed, but do not execute them.

abyss-pe uses the following programs, which must be found in your PATH:

  * ABYSS: de Bruijn graph assembler
  * ABYSS-P: parallel (MPI) de Bruijn graph assembler
  * AdjList: find overlapping sequences
  * DistanceEst: estimate the distance between sequences
  * MergeContigs: merge sequences
  * MergePaths: merge overlapping paths
  * Overlap: find overlapping sequences using paired-end reads
  * PathConsensus: find a consensus sequence of ambiguous paths
  * PathOverlap: find overlapping paths
  * PopBubbles: remove bubbles from the sequence overlap graph
  * SimpleGraph: find paths through the overlap graph
  * abyss-fac: calculate assembly contiguity statistics
  * abyss-filtergraph: remove shim contigs from the overlap graph
  * abyss-fixmate: fill the paired-end fields of SAM alignments
  * abyss-map: map reads to a reference sequence
  * abyss-scaffold: scaffold contigs using distance estimates
  * abyss-todot: convert graph formats and merge graphs

=====ABySS pipeline=====

{{ :bioinformatic_tools:abysspipeline.png?nolink |}}
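
=====Example: checking for parameter name collisions=====

The clash between abyss-pe parameter names and environment variables, described under ABySS parameters, can also be spotted before abyss-pe is invoked at all. The following is a minimal shell sketch and is not part of ABySS: the function name check_abyss_env_collisions is hypothetical, and the list of names is simply the single-letter parameters from the parameter list on this page. It inspects only exported variables in the current shell, so "abyss-pe env" remains the authoritative check.

```shell
#!/bin/sh
# Hypothetical helper (not part of ABySS): report abyss-pe parameter names
# that are already exported as environment variables in the current shell.
check_abyss_env_collisions() {
    status=0
    # Single-letter abyss-pe parameters, as listed on this page.
    for name in a b c d e E j k l m n N p q s S t v; do
        # printenv exits non-zero when the variable is not in the environment.
        if value=$(printenv "$name"); then
            echo "collision: $name=$value"
            status=1
        fi
    done
    # Returns 1 if at least one collision was found, 0 otherwise.
    return "$status"
}
```

After sourcing the function, run check_abyss_env_collisions and unset any variable it reports (for example, unset k) before starting an assembly. The helper deliberately uses a few plain shell variables (name, value, status) that are not exported, so they do not themselves show up as collisions.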

