This shows you the differences between two versions of the page.
Both sides previous revision Previous revision | Next revision Both sides next revision | ||
archive:bioinformatic_tools:quake [2011/05/09 18:04] eyliaw [Running Quake] |
archive:bioinformatic_tools:quake [2011/05/27 21:12] eyliaw |
||
---|---|---|---|
Line 34: | Line 34: | ||
They also recommend choosing k so that a given kmer has a probability of about 0.01 of occurring in a random sequence that is as long as the genome. That is, 2*G/4^k ~ 0.01, where G is the size of the sequenced genome and k is the size of the kmer. Solving for k gives k ~ log4(200*G), which is about 19 for our prediction of the banana slug genome size, {{https://banana-slug.soe.ucsc.edu/bioinformatic_tools:jellyfish|2.042e+09}}. | They also recommend choosing k so that a given kmer has a probability of about 0.01 of occurring in a random sequence that is as long as the genome. That is, 2*G/4^k ~ 0.01, where G is the size of the sequenced genome and k is the size of the kmer. Solving for k gives k ~ log4(200*G), which is about 19 for our prediction of the banana slug genome size, {{https://banana-slug.soe.ucsc.edu/bioinformatic_tools:jellyfish|2.042e+09}}. | ||
+ | |||
+ | ===== Potential Problems ===== | ||
+ | * Input files need to have an extension, or Quake will throw a substr error when trying to merge hidden files into a result. | ||
+ | * With paired-end input, Quake will output two files for each paired-end input file. One is the cor.fastq file, which contains corrected reads that are still paired. The other is the cor_single.fastq file, which contains reads where only one read of the pair could be corrected. You can treat the cor_single.fastq file as a single-end read file. |
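
The kmer size rule of thumb quoted above, 2*G/4^k ~ 0.01, can be checked numerically. The following is a minimal sketch in Python, not part of Quake itself; the function name ''quake_kmer_size'' and its parameters are hypothetical and chosen only for illustration.

```python
import math


def quake_kmer_size(genome_size: float, target_probability: float = 0.01) -> float:
    """Solve 2*G / 4**k = p for k (hypothetical helper, not a Quake function).

    genome_size        -- G, the estimated genome size in bases
    target_probability -- p, the desired probability of a kmer in a random sequence
    """
    # Rearranging 2*G / 4**k = p gives k = log4(2*G / p).
    # With p = 0.01 this is the log4(200*G) form used in the text.
    return math.log(2 * genome_size / target_probability, 4)


if __name__ == "__main__":
    # Genome size estimate taken from the text above.
    k = quake_kmer_size(2.042e9)
    # k is fractional; round to the nearest integer to get a usable kmer size.
    print(f"k = {k:.2f}, rounded: {round(k)}")
```

For the banana slug estimate of 2.042e+09 bases, the rounded result agrees with the value of about 19 given above. A smaller genome estimate lowers k only slowly, since k grows with the base-4 logarithm of G.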