This shows you the differences between two versions of the page.
lecture_notes:04-22-2011 [2011/06/08 17:11] eyliaw [Suffix array] |
lecture_notes:04-22-2011 [2011/06/08 17:29] eyliaw [Suffix array] |
||
---|---|---|---|
Line 11: | Line 11: | ||
GOOGOL | GOOGOL | ||
Append end character: | Append end character: | ||
- | GOOGOL$ | + | X = GOOGOL$ |
Cyclic transformation: | Cyclic transformation: | ||
0 GOOGOL$ | 0 GOOGOL$ | ||
Line 21: | Line 21: | ||
6 $GOOGOL | 6 $GOOGOL | ||
Sort: | Sort: | ||
- | 6 $GOOGOL | + | 6 $GOOGO**L** |
- | 3 GOL$GOO | + | 3 GOL$GO**O** |
- | 0 GOOGOL$ | + | 0 GOOGOL**$** |
- | 5 L$GOOGO | + | 5 L$GOOG**O** |
- | 2 OGOL$GO | + | 2 OGOL$G**O** |
- | 4 OL$GOOG | + | 4 OL$GOO**G** |
- | 1 OOGOL$G | + | 1 OOGOL$**G** |
+ | |||
+ | S(i) = [6,3,0,5,2,4,1] | ||
+ | The Burrows-Wheeler transform takes the last character of each sorted cyclic string: | ||
+ | B(i) = LO$OOGG | ||
+ | ===== Searching ===== | ||
+ | We can use the FM-index to find the lower and upper bounds of the range of sorted rows that begin with a given substring. | ||
- | S = [6,3,0,5,2,4,1] | + | C_X(a) is the number of characters in X that are lexicographically smaller than a. |
- | The Burrows Wheeler transform takes the last column of the | + | $ 0 |
+ | G 1 | ||
+ | L 3 | ||
+ | O 4 | ||
+ | O_X(a,i) is the number of occurrences of a in B_X[0,i], where B_X is the Burrows-Wheeler transform of X.
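
The definitions in the new revision can be checked with a small Python sketch. It is an illustration under stated assumptions, not the lecture's own code: the function names (`suffix_array_and_bwt`, `count_table`, `occ`, `backward_search`) are hypothetical, the interval B_X[0,i] is assumed to include position i, and the search loop is the standard FM-index backward search written for the C_X convention used here (where `$` is counted, so C_X(G) = 1).

```python
def suffix_array_and_bwt(x):
    """Sort the cyclic rotations of x and return (S, B).

    S[k] is the start offset of the k-th smallest rotation and B is the
    string of last characters of the sorted rotations. This relies on '$'
    sorting before the uppercase letters, which holds for ASCII.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i:] + x[:i])
    # The last character of the rotation starting at i is x[i - 1].
    bwt = "".join(x[i - 1] for i in order)
    return order, bwt


def count_table(x):
    """C_X(a): number of characters in x lexicographically smaller than a."""
    return {a: sum(1 for c in x if c < a) for a in sorted(set(x))}


def occ(bwt, a, i):
    """O_X(a, i): occurrences of a in bwt[0..i], inclusive (assumed).

    For i = -1 the slice is empty, so the result is 0.
    """
    return bwt[: i + 1].count(a)


def backward_search(pattern, bwt, c):
    """Return inclusive (lower, upper) row bounds for pattern, or None.

    The pattern is processed from its last character to its first,
    narrowing the range of sorted rows that start with the suffix of the
    pattern read so far.
    """
    lower, upper = 0, len(bwt) - 1
    for a in reversed(pattern):
        if a not in c:
            return None
        lower = c[a] + occ(bwt, a, lower - 1)
        upper = c[a] + occ(bwt, a, upper) - 1
        if lower > upper:
            return None
    return lower, upper


X = "GOOGOL$"
S, B = suffix_array_and_bwt(X)
C = count_table(X)
bounds = backward_search("GO", B, C)
positions = sorted(S[k] for k in range(bounds[0], bounds[1] + 1))
```

For X = GOOGOL$ this reproduces S = [6,3,0,5,2,4,1], B = LO$OOGG and the C_X table above. Searching for GO narrows the range to rows 1 through 2, whose entries in S give the occurrences at offsets 0 and 3. Counting with `str.count` is linear in the length of B; an actual FM-index would precompute or sample O_X instead.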