Velvet, Dan Zerbino

De Bruijn Graphs

Velvet uses de Bruijn graphs to condense reads and resolve common sequencing problems. Compared to overlap-layout-consensus assembly, it collapses redundant reads: a word shared by many reads is stored as a single node, and only its occurrence count needs to be kept alongside it. This reduces the memory required to store the reads by a factor on the order of the coverage.
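As a rough illustration of the memory argument, the sketch below counts words over a toy read set. The genome string, the read length, the word length `k`, and the helper name `kmer_counts` are all made up for this example and are not taken from Velvet. Sequencing the same region at higher coverage raises the counts but not the number of distinct words that have to be stored.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-length word across all reads (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Toy data, chosen only for illustration.
genome = "ATGGCGTGCAATCC"
read_len, k = 6, 4

# Every read_len-long window of the genome, each "sequenced" 5 times.
windows = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
reads = windows * 5

counts = kmer_counts(reads, k)
total_words = sum(counts.values())
print(len(counts), "distinct words represent", total_words, "word occurrences")
```

Raising the multiplier from 5 to 50 changes `total_words` but leaves `len(counts)` the same, which is the saving described above.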

Specifically, the de Bruijn graph breaks each read into words (k-mers) and records the path each read traces through those words; new reads are mapped onto the existing words, which builds up the graph structure. Velvet then simplifies this graph by removing unjoined tips and by reducing parallel strands to the strand with the maximum coverage (this corrects errors caused by mismatched bases). Velvet leaves loops unresolved in the final structure, as these represent repeat regions. [1]
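A minimal sketch of graph construction and tip removal might look like the following. It is not Velvet's implementation: the names `build_graph` and `clip_tips`, the `max_len` cutoff, and the toy reads are assumptions made for illustration, and reverse complements and the merging of parallel strands are left out entirely.

```python
from collections import defaultdict

def build_graph(reads, k):
    """Map each word to the words that follow it in some read, with coverage counts."""
    edges = defaultdict(set)
    coverage = defaultdict(int)
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            coverage[kmer] += 1
        for a, b in zip(kmers, kmers[1:]):
            edges[a].add(b)
    return edges, coverage

def clip_tips(edges, coverage, max_len):
    """Drop dead-end chains of at most max_len words that leave a better-covered path."""
    preds = defaultdict(set)
    for a, succs in edges.items():
        for b in succs:
            preds[b].add(a)
    removed = set()
    dead_ends = [n for n in coverage if not edges.get(n)]
    for end in dead_ends:
        chain = [end]
        # Walk backwards while the chain stays linear.
        while len(preds[chain[-1]]) == 1:
            parent = next(iter(preds[chain[-1]]))
            if len(edges[parent]) > 1:
                # Reached the junction where this chain branches off.
                rivals = edges[parent] - {chain[-1]}
                short = len(chain) <= max_len
                weaker = any(coverage[r] > coverage[chain[-1]] for r in rivals)
                if short and weaker:
                    removed.update(chain)
                break
            chain.append(parent)
    kept_edges = {a: {b for b in succs if b not in removed}
                  for a, succs in edges.items() if a not in removed}
    kept_cov = {n: c for n, c in coverage.items() if n not in removed}
    return kept_edges, kept_cov

# Three error-free reads plus one read whose last base is wrong (A read as T).
reads = ["ATGGCGTGCAATCC"] * 3 + ["GCGTGCAT"]
edges, coverage = build_graph(reads, k=4)
kept_edges, kept_cov = clip_tips(edges, coverage, max_len=2)
print(sorted(edges["TGCA"]), "->", sorted(kept_edges["TGCA"]))
```

The erroneous read adds a single word, `GCAT`, that dangles off `TGCA`. Because the chain is short and the competing branch has higher coverage, the sketch removes it, while the genuine end of the sequence is kept because its chain is longer than `max_len`.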

Extensions

Shorty

Shorty uses the variance in paired read lengths to build larger contigs from small ones. It bears some resemblance to how PRICE does contig extension.

Oases

Oases performs splicing analysis on words with more than one connection, breaking them up into separate contigs. This resolves such branching nodes in the de Bruijn graph.

Columbus

Columbus combines mapping with de novo assembly. It uses a reference sequence to organize contigs, but allows for novel structures within the contigs.


1. Zerbino, D. and Birney, E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008. 18: 821-829. doi: 10.1101/gr.074492.107