Team 4 Report: ABySS

ABySS stands for Assembly By Short Sequences.

Assembler Overview

  • Load kmers
  • Find adjacent kmers
  • Generate de Bruijn graphs
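The three steps above can be illustrated with a minimal sketch. It is not ABySS code: the function names and the toy reads are made up for illustration, and reverse complements are ignored to keep the example short.

```python
from collections import defaultdict

def load_kmers(reads, k):
    """Collect the set of distinct k-mers found in the reads."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return kmers

def build_de_bruijn(kmers):
    """Link each k-mer to the k-mers that overlap it by k-1 bases."""
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        for base in "ACGT":
            neighbor = kmer[1:] + base  # drop first base, append one base
            if neighbor in kmers:
                graph[kmer].append(neighbor)
    return dict(graph)

# Toy input (hypothetical reads, k = 4)
reads = ["ACGTTG", "CGTTGA"]
kmers = load_kmers(reads, 4)
graph = build_de_bruijn(kmers)
for kmer, neighbors in graph.items():
    print(kmer, "->", neighbors)
```

Adjacency is found by lookup rather than by comparing k-mers pairwise: each k-mer has only four possible successors, so it is enough to test whether each one is present in the set.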


  • Generate paths through the reads
  • Merge paths
  • Generate contigs
  • ParseAligns: Empirical fragment-size distribution
    1. Maximum Likelihood Estimator
    2. Use empirical paired-end size distribution
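As a rough illustration of an empirical fragment-size distribution, the sketch below tabulates the sizes of read pairs whose two reads align to the same contig. The input format, the function names, and the toy coordinates are assumptions made for this example and do not reflect the actual ParseAligns implementation. The mean and standard deviation shown are the maximum likelihood estimates under a normal model, which is only one possible choice.

```python
from collections import Counter
from statistics import mean, pstdev

def fragment_sizes(pairs):
    """pairs: (leftmost start, rightmost end) of read pairs aligned to one contig."""
    return [end - start for start, end in pairs]

def empirical_distribution(sizes):
    """Return the fraction of pairs observed at each fragment size."""
    counts = Counter(sizes)
    total = len(sizes)
    return {size: n / total for size, n in sorted(counts.items())}

# Hypothetical alignment coordinates
pairs = [(0, 200), (10, 210), (50, 260), (100, 290)]
sizes = fragment_sizes(pairs)
dist = empirical_distribution(sizes)

# Under a normal model, the MLE of the mean is the sample mean and the
# MLE of the standard deviation is the population standard deviation.
mu = mean(sizes)
sigma = pstdev(sizes)
print(dist, mu, sigma)
```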

ABySS Details

  • Distributes k-mers and the de Bruijn graph across a cluster
  • Each node announces the list of k-mers it has to nodes that hold their possible extensions
  • 8 bits of storage per k-mer: one bit for each possible A, C, G, or T extension in the forward and reverse directions
  • Finds paths through contigs that agree with distance estimates and then merges overlapping paths
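The 8-bit adjacency record and the distribution of k-mers across nodes can be sketched as follows. The bit layout and the hash function here are assumptions chosen for readability, not the ones ABySS uses. The point is that a k-mer's owner is computed from its sequence alone, so any node can work out where a neighboring k-mer lives without asking.

```python
BASES = "ACGT"

def extension_byte(kmer, kmers):
    """Pack the 8 possible neighbors of a k-mer into one byte.

    Bits 0-3: forward extensions (drop first base, append A/C/G/T).
    Bits 4-7: reverse extensions (prepend A/C/G/T, drop last base).
    This layout is an assumption made for the example.
    """
    flags = 0
    for i, base in enumerate(BASES):
        if kmer[1:] + base in kmers:
            flags |= 1 << i
        if base + kmer[:-1] in kmers:
            flags |= 1 << (4 + i)
    return flags

def owner_node(kmer, num_nodes):
    """Assign a k-mer to a node deterministically (base-4 value mod node count)."""
    value = 0
    for base in kmer:
        value = value * 4 + BASES.index(base)
    return value % num_nodes

# Toy k-mer set (hypothetical)
kmers = {"ACGT", "CGTT", "GTTG", "TTGA"}
flags = extension_byte("CGTT", kmers)
print(format(flags, "08b"), owner_node("CGTT", 5))
```

For CGTT, the forward bit for G is set (GTTG is present) and the reverse bit for A is set (ACGT is present), so the whole neighborhood fits in a single byte.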

Installing ABySS

  • Installation of single processor version was straightforward
  • Difficulty installing parallelized version
  • Developers are active in the community and the assembler has a long history, so documentation is abundant

Running ABySS

Single-processor version: straightforward

  • qsub
  • Embedded qsub
  • Exporting paths
  • abyss-pe [parameters], run in the parallel environment on campusrocks2

Primary parameters:

  • name: name of the assembly
  • k: size of the k-mer
  • in: input read files; for one library of paired-end data, in='reads1.fq reads2.fq'

Pipeline organized via a makefile (abyss-pe):

  • Autogenerated assembly statistics: contig and scaffold metrics

  • abyss-pe does not necessarily clean up the output of steps that failed.
  • It is better to clean up the leftover files manually.
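A small sketch of how the parameters listed above fit together on an abyss-pe command line. The helper function and the assembly name `assembly1` are hypothetical; the k value and read file names are taken from these notes. The command is only built and printed here, not run.

```python
import shlex

def abyss_pe_command(name, k, reads):
    """Build an abyss-pe argument list for one paired-end library."""
    # abyss-pe takes make-style name=value arguments; both read files
    # of the library go into a single space-separated in= value.
    return ["abyss-pe", f"name={name}", f"k={k}", "in=" + " ".join(reads)]

cmd = abyss_pe_command("assembly1", 55, ["reads1.fq", "reads2.fq"])

# Print a shell-quoted form that could be pasted into a qsub script.
print(shlex.join(cmd))
```

Keeping the command as a list means it can later be passed to `subprocess.run(cmd)` on a machine where abyss-pe is on the PATH, without worrying about quoting the space inside the in= value.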

Using ABySS, the plan

  • Use all libraries, after processing, but no error correction
  • Run SeqPrep
  • Run adapter trimming only
  • Run adapter trimming plus merging

Initial run

  • The initial run is located on Edser
  • k = 55, chosen arbitrarily
  • Did not work

For the future

  • Get parallel versions working
  • Finish data analysis (KmerGenie, FastQC, etc.)
  • Do assemblies
  • RNA-seq rescaffolding with Trans-ABySS
  • Meta-assembly
lecture_notes/04-26-2015.txt · Last modified: 2015/04/29 20:25 by jdhouser