Team 1: Meraculous

Team composition

Name Email
Charles Cole
Jake Houser
Kyle McGovern
Jennie Richardson

Meraculous overview

Meraculous is a de novo assembler developed at the US Department of Energy Joint Genome Institute, which is managed by Lawrence Berkeley National Laboratory. Meraculous was designed for deep paired-end short-read data (e.g., Illumina). Stated advantages include:

  • Multi-threaded and parallelized computation
  • Lightweight hash structure resulting in low RAM footprint
  • No error correction step for faster processing

After selecting a k-mer set, Meraculous produces contigs as the maximal linear sub-paths of the de Bruijn graph. This step avoids the explicit error correction used in other assemblers and relies on base quality scores instead. Meraculous then aligns the reads back to the contigs to identify useful read-pair information. Next, it uses paired reads and splinting singletons to build scaffolds by “ordering and orienting” the contigs. Finally, gaps are closed using paired-end placements.
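As a rough illustration of the contig step described above, the following Python sketch builds a de Bruijn graph from a set of k-mers and walks its maximal linear sub-paths. It is a toy under stated assumptions, not Meraculous's actual implementation: it ignores reverse complements, base quality scores, and k-mer frequency filtering, and the function names (`kmers`, `linear_paths`) are made up for this example.

```python
def kmers(seq, k):
    """Return all k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def linear_paths(reads, k):
    """Return the maximal linear sub-paths of a toy de Bruijn graph.

    Nodes are the k-mers seen in the reads; there is an edge u -> v
    when the last k-1 bases of u equal the first k-1 bases of v.
    Assumption: forward strand only, no quality or depth filtering.
    """
    nodes = set()
    for read in reads:
        nodes.update(kmers(read, k))

    def succs(u):
        return [u[1:] + b for b in "ACGT" if u[1:] + b in nodes]

    def preds(u):
        return [b + u[:-1] for b in "ACGT" if b + u[:-1] in nodes]

    def is_start(u):
        # A path starts at u unless u has exactly one predecessor
        # whose only successor is u.
        p = preds(u)
        return not (len(p) == 1 and len(succs(p[0])) == 1)

    contigs = []
    visited = set()
    for u in sorted(nodes):
        if not is_start(u):
            continue
        path, cur = u, u
        visited.add(u)
        while True:
            s = succs(cur)
            # Extend only across an unambiguous edge: one way out of
            # cur and one way into the next node.
            if len(s) != 1 or len(preds(s[0])) != 1 or s[0] in visited:
                break
            cur = s[0]
            visited.add(cur)
            path += cur[-1]
        contigs.append(path)
    # Note: isolated cycles have no start node and are skipped here.
    return contigs


if __name__ == "__main__":
    # Two reads that share a prefix and then diverge.
    print(linear_paths(["ATGGCGT", "ATGGCAA"], 3))
```

In this toy input the two reads share the prefix ATGGC and then diverge, so the walk stops at the branch and each branch becomes its own path.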

Other programs used


GCC is the GNU Compiler Collection (/campusdata/BME235/bin/gcc-4.9.2). Webpage:


KmerGenie estimates the best k-mer length for genome de novo assembly (/campusdata/BME235/bin/kmergenie-1.6972). Webpage:


Musket is a multistage k-mer spectrum based error corrector for Illumina short read data (/campusdata/BME235/bin/musket). Webpage:


Skewer is an adapter trimmer for Illumina paired-end sequences (/campusdata/BME235/bin/skewer-0.1.123-linux-x86_64). Webpage:


We were unable to get Meraculous to complete the bubble-popping step. When the assembly was killed, it had produced 28,610,138 contigs in total, of which 542,137 (1.89%) were longer than 1,000 bp and 15 (5.2e-5%) were longer than 10,000 bp.
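The percentages above can be rechecked from the raw counts. The short Python sketch below only redoes that arithmetic; the variable names are ours and are not part of any Meraculous output.

```python
# Contig counts reported when the assembly was killed.
total_contigs = 28_610_138
over_1kb = 542_137
over_10kb = 15


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"over 1,000 bp:  {percent(over_1kb, total_contigs):.2f}%")
    print(f"over 10,000 bp: {percent(over_10kb, total_contigs):.1e}%")
```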

Our team was dissolved to support other tasks.

Lecture slides

contributors/team_1_page.txt · Last modified: 2015/07/18 20:29 by ceisenhart