This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
lecture_notes:05-13-2011 [2011/05/13 23:33] svohr created |
lecture_notes:05-13-2011 [2015/09/12 02:47] (current) 5.9.83.211 ↷ Links adapted because of a move operation |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Insert Length Analysis ====== | + | ====== Insert Length Analysis (lecture notes) ====== |
- | + | ||
- | We discussed the results from mapping the Illumina run 2 reads to the 454 reads using [[bioinformatic_tools:bwa|BWA]]. | + | |
+ | We discussed the results from mapping the Illumina run 2 reads to the 454 reads using [[archive:bioinformatic_tools:bwa|BWA]]. | ||
+ | There is a more complete analysis at [[archive:bioinformatic_tools:bwa#determining_paired-end_insert_size|Determining Paired-end Insert Size]]. | ||
===== Template Length ===== | ===== Template Length ===== | ||
{{:bioinformatic_tools:run2_insert_size_histogram.png|}} | {{:bioinformatic_tools:run2_insert_size_histogram.png|}} | ||
- | First, we determined what this histogram really shows in the distribution of template lengths, the length from the leftmost base mapped to the rightmost. This size includes the length of each read int the pair and the insert region between them. | + | First, we determined that what this histogram really shows is the distribution of template lengths: the length from the leftmost mapped base to the rightmost. This size includes the length of each read in the pair and the insert region between them. |
- | Both these distributions show smaller template lengths than the previous estimates (see [[computer_resources:data|computer_resources:data]]). The distribution for barcode 8 is especially odd because it | + | Both these distributions show smaller template lengths than the previous estimates (see [[archive:computer_resources:data|computer_resources:data]]). The distribution for barcode 8 is especially odd because it appears to be cut off at 100. The read lengths for both barcodes were around 100 bps so any template length less than 200 represents a pair that can be joined. It was decided that SeqPrep should be run prior to mapping the reads to avoid these pairs that overlap. |
- | appears to be cutoff at 100. The read lengths for both barcodes were around 100 bps so any template length less than 200 represents a pair that can be joined. It was decided that SeqPrep should be run prior to mapping the reads to avoid these pairs that overlap. | + | |
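The relationship between template length, read length, and overlap described above can be sketched in a few lines of Python. This is only an illustration, not the analysis used in class: the function names and the sample template lengths are hypothetical, and the 100 bp read length is the approximate figure quoted in the notes.

```python
def insert_size(template_len, read1_len=100, read2_len=100):
    """Bases between the two reads of a pair.

    Template length spans the leftmost to the rightmost mapped base,
    so it includes both reads plus the insert between them.
    A negative result means the two reads overlap by that many bases.
    """
    return template_len - read1_len - read2_len


def reads_overlap(template_len, read1_len=100, read2_len=100):
    """True when the pair overlaps and could be joined into one sequence."""
    return insert_size(template_len, read1_len, read2_len) < 0


def overlap_fraction(template_lengths, read1_len=100, read2_len=100):
    """Fraction of pairs whose reads overlap (0.0 for an empty list)."""
    if not template_lengths:
        return 0.0
    overlapping = sum(
        1 for t in template_lengths if reads_overlap(t, read1_len, read2_len)
    )
    return overlapping / len(template_lengths)


if __name__ == "__main__":
    # Hypothetical template lengths, for illustration only.
    sample = [150, 180, 250, 300]
    for t in sample:
        print(t, insert_size(t), reads_overlap(t))
    print("fraction overlapping:", overlap_fraction(sample))
```

With 100 bp reads, a template length of exactly 200 means the reads abut without overlapping, which is why the comparison is strict.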
===== 454 Coverage ===== | ===== 454 Coverage ===== | ||
- | One possible explanation for misshaped distribution is that BWA had difficulty aligning the Illumina reads to the reference due to the possible overlap of the 454 reads. We calculated the expected number of overlaps in the 454 reads. | + | One possible explanation for the misshapen distribution is that BWA had difficulty aligning the Illumina reads to the reference due to the possible overlap of the 454 reads. We calculated the expected number of overlaps in the 454 reads. |
<code> | <code> | ||
R = # of 454 reads ( ~500,000 ) | R = # of 454 reads ( ~500,000 ) |