This shows you the differences between two versions of the page.
archive:bioinformatic_tools:bwa [2011/05/23 02:25] svohr [After SeqPrep] |
archive:bioinformatic_tools:bwa [2011/06/08 18:22] svohr |
||
---|---|---|---|
Line 29: | Line 29: | ||
</code> | </code> | ||
+ | Note that BWA appears to accept gzipped read files, so there is no need to decompress them first, although the documentation does not mention this. | ||
===== Quirks ===== | ===== Quirks ===== | ||
The SAM-formatted alignments include a column that the BWA manual labels "inferred insert length", but the SAM specification describes it as the "template length", the distance from the leftmost mapped base to the rightmost mapped base. The second description seems to | The SAM-formatted alignments include a column that the BWA manual labels "inferred insert length", but the SAM specification describes it as the "template length", the distance from the leftmost mapped base to the rightmost mapped base. The second description seems to | ||
Line 87: | Line 88: | ||
{{:bioinformatic_tools:run2_bc08_seqprep_histogram_r2.png|}} | {{:bioinformatic_tools:run2_bc08_seqprep_histogram_r2.png|}} | ||
+ | |||
+ | ===== After Quake ===== | ||
+ | This process was repeated on the data after Quake correction. In addition to the paired reads and merged reads, Quake separates out the reads whose mate could not be successfully corrected. | ||
+ | |||
+ | {{:bioinformatic_tools:run1_quake_histogram.png|}} | ||
+ | {{:bioinformatic_tools:run2_bc07_quake_histogram.png|}} | ||
+ | {{:bioinformatic_tools:run2_bc08_quake_histogram.png|}} | ||
+ | |||
+ | ==== Mean Lengths ==== | ||
+ | |||
+ | Run 1 | ||
+ | | Template | 151.7 | | ||
+ | | Merged | 113.3 | | ||
+ | | Single | 65.7 | | ||
+ | |||
+ | Run 2 bc07 | ||
+ | | Template | 250.3 | | ||
+ | | Merged | 142.5 | | ||
+ | | Single | 88.8 | | ||
+ | |||
+ | Run 2 bc08 | ||
+ | | Template | 94.5 | | ||
+ | | Merged | 104.2 | | ||
+ | | Single | 61.3 | | ||
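The page does not say how these means were calculated. As a minimal sketch, assuming the "Template" value is the mean of the SAM template length column (TLEN, column 9) over properly paired reads, it could be computed as below. The function name ''mean_template_length'' and the choice to count each pair once via its positive-TLEN mate are illustrative assumptions, not the method used on this page.

```python
def mean_template_length(sam_lines):
    """Return the mean TLEN over properly paired reads, or None if there are none.

    Assumption: each pair is counted once, using the mate whose TLEN is
    positive (the leftmost segment, per the SAM specification).
    """
    total = 0
    count = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])  # column 2: bitwise FLAG
        tlen = int(fields[8])  # column 9: template length (TLEN)
        # 0x2 marks a read mapped in a proper pair
        if flag & 0x2 and tlen > 0:
            total += tlen
            count += 1
    return total / count if count else None


# Hypothetical records for illustration only (not data from these runs)
example = [
    "@SQ\tSN:ref\tLN:1000",
    "p1\t99\tref\t100\t60\t50M\t=\t200\t150\tACGT\tIIII",
    "p1\t147\tref\t200\t60\t50M\t=\t100\t-150\tACGT\tIIII",
    "p2\t99\tref\t300\t60\t50M\t=\t350\t100\tACGT\tIIII",
    "p2\t147\tref\t350\t60\t50M\t=\t300\t-100\tACGT\tIIII",
    "p3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(mean_template_length(example))
```

In practice the function could be given an open SAM file object, since it only iterates over lines.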
+ | |||
+ | |||
+ | ===== After Assembly ===== | ||
+ | This process was repeated using the [[bioinformatic_tools:soapdenovo|SOAPdenovo]] contigs from ''assemblies/slug/SOAPdenovo-assembly2/k47_w_454_contigs'' as the reference instead of the 454 reads. The distributions have the same shapes as those found using the 454 reads, although more pairs map because of the higher coverage of the contigs, and some template sizes are longer than before. For the run1 and run2 barcode 7 pairs, the results are somewhat circular, since the previous estimates were used to build the contigs to which we mapped the pairs. The barcode 8 pairs, however, were not used in the assembly because of their negative mean insert size (most pairs overlap, but not in a way that SeqPrep could merge). When these were mapped to the contigs, we saw the same pattern as before but with more reads mapping successfully. | ||
+ | |||
+ | {{:bioinformatic_tools:run1_assembly_histogram.png|}} | ||
+ | |||
+ | {{:bioinformatic_tools:run2_bc07_assembly_histogram.png|}} | ||
+ | |||
+ | {{:bioinformatic_tools:run2_bc08_assembly_histogram.png|}} | ||
+ | |||