This shows you the differences between two versions of the page.
lecture_notes:03-30-2011 [2011/04/01 19:07] svohr Added notes from coverage discussion. |
lecture_notes:03-30-2011 [2011/04/01 19:20] (current) svohr [Coverage] slight corrections |
||
---|---|---|---|
Line 33: | Line 33: | ||
* Useful when mapping to a reference. | * Useful when mapping to a reference. | ||
===== Coverage ===== | ===== Coverage ===== | ||
- | We briefly discussed how much sequence data would be required to assemble the genome. First, we considered the probability of seeing every base | + | We briefly discussed how much sequence data would be required to assemble the genome. First, we considered the probability of seeing a particular base ''i'' in a single read ''j''. |
- | in the genome | + | |
| | ||
P( seeing base i in read j ) = L/G | P( seeing base i in read j ) = L/G | ||
Line 42: | Line 41: | ||
P( never seeing base i ) = (1 - L/G)^R | P( never seeing base i ) = (1 - L/G)^R | ||
- | We can multiple ''L/G'' by ''R/R'' to get ''((L*R) / G) / R'' or ''C / R'' where ''C'' is our coverage of the genome. We take the limit of this as | + | We can multiply ''L/G'' by ''R/R'' to get ''((L*R) / G) / R'' or ''C / R'' where ''C'' is our coverage of the genome. We take the limit of this as |
''R'' goes to infinity: | ''R'' goes to infinity: | ||
lim R->inf (1 - C/R)^R = e^-C | lim R->inf (1 - C/R)^R = e^-C | ||
- | Thus we can expect to miss G*e^-C bases. | + | Thus we can expect to miss ''G*e^-C'' bases. |
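The estimate above can be checked numerically. The following is a minimal sketch, assuming ''L'' is the read length, ''G'' the genome length and ''R'' the number of reads, as the formulas suggest. The function names and the example values (a 1 Mb genome, 100-base reads) are illustrative only and do not come from the lecture.

```python
import math


def p_missed_exact(L, G, R):
    """Probability that one particular base is in none of R reads: (1 - L/G)^R."""
    return (1 - L / G) ** R


def p_missed_approx(L, G, R):
    """Large-R approximation e^-C, where C = L*R/G is the coverage."""
    C = L * R / G
    return math.exp(-C)


def expected_missed_bases(G, C):
    """Expected number of bases never seen at coverage C: G * e^-C."""
    return G * math.exp(-C)


if __name__ == "__main__":
    G = 1_000_000  # hypothetical genome length in bases
    L = 100        # hypothetical read length in bases
    for C in (1, 5, 10):
        R = C * G // L  # number of reads needed to reach coverage C
        print(
            f"C={C:2d}  exact={p_missed_exact(L, G, R):.6e}  "
            f"approx={p_missed_approx(L, G, R):.6e}  "
            f"missed={expected_missed_bases(G, C):.1f}"
        )
```

With many short reads the exact and approximate probabilities should agree closely, and the expected number of missed bases falls off exponentially as coverage increases.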
We cannot assemble an entire chromosome if we are missing bases. However, we can construct contiguous stretches of bases or //contigs// and later | We cannot assemble an entire chromosome if we are missing bases. However, we can construct contiguous stretches of bases or //contigs// and later |