This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision Last revision Both sides next revision | ||
lecture_notes:06-01-2015 [2015/06/01 20:38] charles |
lecture_notes:06-01-2015 [2015/06/01 23:09] jaredc |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ===== Suffix Arrays ===== | ||
+ | |||
+ | Example Suffix Array, A[ ], of "$banana" | ||
+ | |||
+ | {{:lecture_notes:screen_shot_2015-06-01_at_3.42.20_pm.png|}} | ||
+ | |||
+ | ===== Burrows Wheeler Transform ===== | ||
+ | |||
+ | An example BWT of "^BANANA|" | ||
+ | |||
+ | {{:lecture_notes:screen_shot_2015-06-01_at_3.41.29_pm.png|}} | ||
+ | |||
+ | ===== FM Index ===== | ||
+ | |||
+ | An FM-index is created by first taking the Burrows-Wheeler transform (BWT) of the input text. For example, the BWT of the string T = "abracadabra" is "ard$rcaaaabb", and here it is represented by the matrix M where each row is a rotation of the text that has been sorted. The transform corresponds to the last column labeled L. | ||
+ | |||
+ | {{:lecture_notes:screen_shot_2015-06-01_at_3.44.14_pm.png|}} | ||
+ | |||
+ | C[c] is a table that, for each character c in the alphabet, contains the number of occurrences of lexically smaller characters in the text. | ||
+ | |||
+ | {{:lecture_notes:screen_shot_2015-06-01_at_3.43.25_pm.png|}} | ||
+ | |||
+ | The function Occ(c, k) is the number of occurrences of character c in the prefix L[1..k] | ||
+ | |||
+ | {{:lecture_notes:screen_shot_2015-06-01_at_3.43.33_pm.png|}} | ||
+ | |||
+ | |||
<code> | <code> | ||
Charles M. - BWT Notes | Charles M. - BWT Notes |