lecture_notes:04-20-2015

Differences

This shows you the differences between two versions of the page.

lecture_notes:04-20-2015 [2015/04/25 02:55] calef [Basic features]
lecture_notes:04-20-2015 [2015/04/25 02:58] calef [Meraculous algorithm]
Line 5:
  
   * Counts occurrences of each kmer in the data set.
-  * Removes kmers whose frequency are below the user threshold.
+  * Removes kmers whose frequency is below a threshold provided by the user.
   * For each kmer, counts the number of high-quality single-base extensions
-  * Classifies the 5' and 3' ends of each kmer
+  * Classifies the 5' and 3' ends of each kmer as U, F, or X, corresponding to having exactly one (unique), multiple (fork), or zero high-quality single-base extensions
   * Stores the extensions of kmers with a classification in a hash
-  * Removes non-reciprocal linkages between kmers
+  * Removes non-reciprocal U-U extensions between kmers (i.e., an extension where the facing end of one kmer is marked U but that of the other is marked F).
+  * Stores the linear subgraph of U-U extensions
   * Selects kmers at random and extends outwards to produce contigs
   * Aligns all reads to contigs via BLAST
   * Assembles contigs into scaffolds using paired-end data
   * Searches unaligned reads as potential gap-closers using mate-pair data
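
The first steps of the list above (counting, thresholding, end classification, and extension along U-U links) can be sketched in a few lines of Python. This is only an illustrative sketch, not the Meraculous implementation. The function names and the parameters min_count and min_ext are hypothetical. It assumes that U means exactly one extension (stored here as the extending base), F a fork, and X no extension, which is the reading under which U-U links form a linear chain. It also uses an extension's count as a stand-in for "high-quality" (the real tool uses base quality scores), ignores reverse complements, and extends only to the right from a given seed rather than outwards from a random one.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count occurrences of every k-mer in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def classify_extensions(reads, k, min_count=2, min_ext=2):
    """Map each kept k-mer to (left_code, right_code).

    A code is the extending base when exactly one extension is seen
    at least min_ext times (the U case), "F" when several are (fork),
    and "X" when none is. min_count and min_ext are hypothetical
    stand-ins for the user threshold and the quality filter.
    """
    counts = count_kmers(reads, k)
    kept = {kmer for kmer, n in counts.items() if n >= min_count}
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in kept:
                continue
            if i > 0:
                left[kmer][read[i - 1]] += 1
            if i + k < len(read):
                right[kmer][read[i + k]] += 1

    def code(ext_counts):
        good = [base for base, n in ext_counts.items() if n >= min_ext]
        if len(good) == 1:
            return good[0]
        return "F" if good else "X"

    return {kmer: (code(left[kmer]), code(right[kmer])) for kmer in kept}


def extend_right(seed, table):
    """Walk rightwards from seed along reciprocal unique extensions."""
    contig, kmer, seen = seed, seed, {seed}
    while True:
        base = table[kmer][1]
        if base in ("F", "X"):
            break  # fork or dead end: the contig stops here
        nxt = kmer[1:] + base
        # Reciprocity: the next k-mer must point back uniquely at this one.
        if nxt not in table or table[nxt][0] != kmer[0] or nxt in seen:
            break
        contig += base
        seen.add(nxt)
        kmer = nxt
    return contig
```

For example, with k = 4 and three copies of the read ACGTTGCA, extend_right("ACGT", table) walks the whole read. Adding three copies of ACGTTGGA creates a fork after GTTG, so the walk stops at ACGTTG.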
- 
 =====Meraculous limitations=====
   * The assembler relies on high-quality data in order to avoid error correction
lecture_notes/04-20-2015.txt · Last modified: 2015/04/25 03:07 by calef