====== Team 1: Meraculous ======

=====Team composition=====

| Name | Email |
| Charles Cole | chkcole@ucsc.edu |
| Jake Houser | jdhouser@ucsc.edu |
| Kyle McGovern | kmcgover@dudek.org |
| Jennie Richardson | jemricha@ucsc.edu |

=====Meraculous overview=====

Meraculous is a //de novo// assembler first published by the US Department of Energy Joint Genome Institute and managed by the Lawrence Berkeley National Laboratory. Meraculous was designed for deep paired-end short reads (e.g., Illumina). Stated advantages include:

  * Multi-threaded and parallelized computation
  * Lightweight hash structure, resulting in a low RAM footprint
  * No explicit error correction step, for faster processing

After selecting a k-mer set, Meraculous produces a set of maximal linear sub-paths of the de Bruijn graph. This process avoids the explicit error correction step used in other assemblers, relying on base quality scores instead. It then aligns reads to the assembly in order to identify useful read-pair information. Next, it uses paired reads and splinting singletons to produce scaffolds by “ordering and orienting” a set of contigs. Finally, gaps are closed using paired-end placements.

=====Programs used=====

===GCC===

GCC is a compiler for the GNU operating system (/campusdata/BME235/bin/gcc-4.9.2). Webpage: https://gcc.gnu.org/.

===KmerGenie===

KmerGenie estimates the best k-mer length for //de novo// genome assembly (/campusdata/BME235/bin/kmergenie-1.6972). Webpage: http://kmergenie.bx.psu.edu/.

===Musket===

Musket is a multistage k-mer spectrum based error corrector for Illumina short read data (/campusdata/BME235/bin/musket). Webpage: http://musket.sourceforge.net/.

===Skewer===

Skewer is an adapter trimmer for Illumina paired-end sequences (/campusdata/BME235/bin/skewer-0.1.123-linux-x86_64). Webpage: http://sourceforge.net/projects/skewer/.

=====Lecture slides=====

{{:bme_235_meraculous_report_1.pdf| First report, Monday April 20th, 2015}}
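
=====Illustrative sketch: k-mers to contigs=====

The overview describes selecting a k-mer set and then producing maximal linear sub-paths of the de Bruijn graph. The toy Python sketch below may help make that step concrete. It is not Meraculous code: the function names, the example reads, and the simple count threshold used to select k-mers are all assumptions made for illustration. It also ignores reverse complements, base quality scores, parallelism, and isolated cycles.

```python
from collections import Counter

BASES = "ACGT"

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def select_kmers(counts, min_count):
    """Assumed selection rule: keep k-mers seen at least min_count times."""
    return {kmer for kmer, n in counts.items() if n >= min_count}

def successors(kmer, kmers):
    """k-mers in the set that overlap kmer by k-1 bases on the right."""
    return [kmer[1:] + b for b in BASES if kmer[1:] + b in kmers]

def predecessors(kmer, kmers):
    """k-mers in the set that overlap kmer by k-1 bases on the left."""
    return [b + kmer[:-1] for b in BASES if b + kmer[:-1] in kmers]

def maximal_linear_paths(kmers):
    """Spell out each maximal non-branching path of the de Bruijn graph."""
    def continues_left(kmer):
        # True when kmer has exactly one predecessor and that predecessor
        # has exactly one successor, so kmer is not the start of a path.
        preds = predecessors(kmer, kmers)
        return len(preds) == 1 and len(successors(preds[0], kmers)) == 1

    contigs = []
    visited = set()
    for start in sorted(kmers):
        if continues_left(start):
            continue
        contig = start
        current = start
        visited.add(start)
        while True:
            succs = successors(current, kmers)
            if len(succs) != 1:
                break  # dead end or branch: the path stops here
            nxt = succs[0]
            if len(predecessors(nxt, kmers)) != 1 or nxt in visited:
                break  # another path merges in, or we looped back
            contig += nxt[-1]
            visited.add(nxt)
            current = nxt
        contigs.append(contig)
    return contigs

# Hypothetical reads from the sequence ATGGCGTGCA, plus one read with an error.
reads = ["ATGGCGT", "GGCGTGCA", "ATGGCGTG", "GCGTGCA", "ATGGAGT"]
counts = count_kmers(reads, k=4)
contigs_filtered = maximal_linear_paths(select_kmers(counts, min_count=2))
contigs_unfiltered = maximal_linear_paths(select_kmers(counts, min_count=1))
print(contigs_filtered)
print(contigs_unfiltered)
```

In this toy input, the k-mers introduced by the erroneous read occur only once, so a threshold of 2 drops them and a single contig remains. With a threshold of 1 they create a branch that splits the assembly into shorter contigs. This is only meant to show why the choice of k-mer set matters before the graph is traversed.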