User Tools

Site Tools


lecture_notes:06-01-2015

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
lecture_notes:06-01-2015 [2015/06/01 20:38]
charles
lecture_notes:06-01-2015 [2015/06/01 23:15] (current)
jaredc
Line 1: Line 1:
 +===== Suffix Arrays =====
 +
 +Example Suffix Array, A[ ], of "​$banana"​
 +
 +{{:​lecture_notes:​screen_shot_2015-06-01_at_3.42.20_pm.png|}}
 +
 +===== Burrows Wheeler Transform =====
 +
 +An example BWT of "​^BANANA|"​
 +
 +{{:​lecture_notes:​screen_shot_2015-06-01_at_3.41.29_pm.png|}}
 +
 +===== FM Index =====
 +
 +An FM-index is created by first taking the Burrows-Wheeler transform (BWT) of the input text. For example, the BWT of the string T = "​abracadabra"​ is "​ard$rcaaaabb",​ and here it is represented by the matrix M where each row is a rotation of the text that has been sorted. The transform corresponds to the last column labeled L.
 +
 +{{:​lecture_notes:​screen_shot_2015-06-01_at_3.44.14_pm.png|}}
 +
 + C[c] is a table that, for each character c in the alphabet, contains the number of occurrences of lexically smaller characters in the text.
 +
 +{{:​lecture_notes:​screen_shot_2015-06-01_at_3.43.25_pm.png|}}
 +
 +The function Occ(c, k) is the number of occurrences of character c in the prefix L[1..k]
 +
 +{{:​lecture_notes:​screen_shot_2015-06-01_at_3.43.33_pm.png|}}
 +
 +
 +The program [[http://​bowtie-bio.sourceforge.net/​index.shtml|Bowtie]] uses an FM index technique
 +  * [[http://​www.genomebiology.com/​2009/​10/​3/​R25|Ultrafast and memory-efficient alignment of short DNA sequences to the human genome]]
 +
 +
 <​code>​ <​code>​
 Charles M. - BWT Notes Charles M. - BWT Notes
Line 89: Line 120:
  
 [[http://​www.ncbi.nlm.nih.gov/​pubmed/​19451168|BWT paper]] [[http://​www.ncbi.nlm.nih.gov/​pubmed/​19451168|BWT paper]]
- 
- 
lecture_notes/06-01-2015.1433191100.txt.gz ยท Last modified: 2015/06/01 20:38 by charles