Differences

This shows you the differences between two versions of the page.

--- archive:bioinformatic_tools:jellyfish [2011/05/16 20:37]
karplus Added numbers from Illumina run 1 counts
+++ archive:bioinformatic_tools:jellyfish [2015/07/28 06:23] (current)
ceisenhart ↷ Page moved from bioinformatic_tools:jellyfish to archive:bioinformatic_tools:jellyfish
@@ Line 1: / Line 1: @@
 ====== Jellyfish ======
+The current version installed on campusrocks is 1.1 (official release).
 Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA [[http://www.cbcb.umd.edu/software/jellyfish/]][(cite:jellyfish>Marçais, Guillaume and Kingsford, Carl. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 first published online January 7, 2011 doi:10.1093/bioinformatics/btr011)]
@@ Line 38: / Line 40: @@
 Total distinct: 2,298,220,805 (19-mers) 2,699,479,169 (22-mers)
 These counts were done before running SeqPrep, so include adapter reads.
+After running SeqPrep, using all the Illumina data produced:
+{{:bioinformatic_tools:fit-gamma-illumina-all-seqprep.png|}}
+We have 2,196,163,636 distinct 19-mers. If we use 2-or-less as the criterion for calling a k-mer a sequencing error, we get 1,222,498,009 distinct k-mers, close to our previous estimates.
+The fit-gamma-illumina-all-seqprep.gnuplot script gives an estimated coverage of 10.247. Dividing the total number of k-mers (23,731,306,715) by this approximate coverage gives a genome length of 2.3159 Gbases.
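The genome-length arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the division described in the text: the function name `estimate_genome_length` is made up for illustration, and the two inputs are the total k-mer count and fitted coverage quoted above, not values recomputed from the data.

```python
def estimate_genome_length(total_kmers: int, coverage: float) -> float:
    """Estimate genome length in bases as total k-mer count / k-mer coverage.

    Hypothetical helper for illustration; it assumes the coverage estimate
    comes from a separate fit (here, the gnuplot script mentioned above).
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_kmers / coverage


# Values quoted in the text above.
total_kmers = 23_731_306_715   # total 19-mers after SeqPrep
coverage = 10.247              # estimated coverage from the gamma fit

genome_length = estimate_genome_length(total_kmers, coverage)
genome_length_gbases = genome_length / 1e9
print(f"Estimated genome length: {genome_length_gbases:.4f} Gbases")
```

The estimate is only as good as the fitted coverage, so a different fit would shift the genome length proportionally.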
 ====== Gamma distribution is wrong ======

Banana Slug Genomics