Differences

This shows you the differences between two versions of the page.

--- lecture_notes:04-22-2011 [2011/06/08 17:29]
eyliaw [Suffix array]
+++ lecture_notes:04-22-2011 [2015/08/09 23:06] (current)
212.129.31.47 ↷ Links adapted because of a move operation
@@ Line 1: / Line 1: @@
 ====== Burrows Wheeler Aligner ======
-We discussed the [[bioinformatic_tools:bwa]].  It uses the Burrows Wheeler Transform to represent a prefix trie, allowing for short read alignment with mismatches and gaps.
+We discussed the [[archive:bioinformatic_tools:bwa]].  It uses the Burrows Wheeler Transform to represent a prefix trie, allowing for short read alignment with mismatches and gaps.
 ===== The prefix trie =====
@@ Line 21: / Line 21: @@
 $GOOGOL
 Sort:
-$GOOGO**L**
+$GOOGOL
-GOL$GO**O**
+GOL$GOO
-GOOGOL**$**
+GOOGOL$
-L$GOOG**O**
+L$GOOGO
-OGOL$G**O**
+OGOL$GO
-OL$GOO**G**
+OL$GOOG
-OOGOL$**G**
+OOGOL$G
    S(i) = [6,3,0,5,2,4,1]
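The sort above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `bwt_by_rotations` is made up for illustration, and sorting full rotations is a naive approach that is fine for a seven-character example but not how a real aligner builds its index.

```python
def bwt_by_rotations(text):
    """Return (sorted rotations, suffix array, BWT string) for a text ending in '$'."""
    n = len(text)
    # Pair every rotation with the position in text where it starts.
    rotations = [(text[i:] + text[:i], i) for i in range(n)]
    # '$' has a smaller code point than 'A'-'Z', so plain sorting puts it first.
    rotations.sort()
    rows = [rot for rot, _ in rotations]
    suffix_array = [start for _, start in rotations]
    # The BWT string is the last column of the sorted rotations.
    bwt = "".join(rot[-1] for rot in rows)
    return rows, suffix_array, bwt


rows, suffix_array, bwt = bwt_by_rotations("GOOGOL$")
print(suffix_array)  # [6, 3, 0, 5, 2, 4, 1]
print(bwt)           # LO$OOGG
```

The last character of each sorted row (the one the older revision marked in bold) is what ends up in the BWT string.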
@@ Line 36: / Line 36: @@
 C_X(a) is the number of characters lexicographically before a in X.
-   $ 0
+   G 0
-   G 1
+   L 2
-   L 3
+   O 3
-   O 4
 O_X(a,i) is the number of occurrences of a in B_X[0,i].
+^ i ^ B_X[0,i] ^ O_X(G,i) ^ O_X(L,i) ^ O_X(O,i) ^
+| 0 | L        | 0        | 1        | 0        |
+| 1 | LO       | 0        | 1        | 1        |
+| 2 | LO$      | 0        | 1        | 1        |
+| 3 | LO$O     | 0        | 1        | 2        |
+| 4 | LO$OO    | 0        | 1        | 3        |
+| 5 | LO$OOG   | 1        | 1        | 3        |
+| 6 | LO$OOGG  | 2        | 1        | 3        |
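Both tables can be derived directly from the BWT string. The sketch below assumes the convention on the "+" side of this diff, where the $ is not counted in C_X, and it treats i as a zero-based index into B_X, so O_X(a, i) covers the first i + 1 characters. The helper names `count_table` and `occurrence_table` are hypothetical.

```python
def count_table(bwt):
    """C(a): number of characters, ignoring '$', that sort before a."""
    letters = sorted(ch for ch in bwt if ch != "$")
    table = {}
    for rank, ch in enumerate(letters):
        # The first time a letter appears in sorted order, its rank is C(a).
        table.setdefault(ch, rank)
    return table


def occurrence_table(bwt):
    """O(a, i): occurrences of a in bwt[0..i], with i counted from 0."""
    alphabet = sorted(set(bwt) - {"$"})
    counts = dict.fromkeys(alphabet, 0)
    table = {a: [] for a in alphabet}
    for ch in bwt:
        if ch in counts:
            counts[ch] += 1
        # Record the running totals after reading this character.
        for a in alphabet:
            table[a].append(counts[a])
    return table


C = count_table("LO$OOGG")
O = occurrence_table("LO$OOGG")
print(C)       # {'G': 0, 'L': 2, 'O': 3}
print(O["O"])  # [0, 1, 1, 2, 3, 3, 3]
```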
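+There are then two recurrences that give the start and end of the suffix array interval of aW from the interval of W.
+   Start R_(aW) = C(a) + O(a, R_(W) - 1) + 1, where R_(W) = 0 if W is the empty string
+   End   R-(aW) = C(a) + O(a, R-(W)), where R-(W) = n - 1 if W is the empty string
+Here a is the character prepended to the string W, n is the length of X including the $, and O(a, -1) = 0. aW is a substring of X exactly when R_(aW) <= R-(aW).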
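To see the two recurrences at work, here is a rough, self-contained Python sketch of exact backward search on the GOOGOL$ example. Several things in it are assumptions made for illustration: the function name `backward_search`, the naive rotation sort, and counting occurrences by slicing instead of using a stored table. In this sketch `start` follows the recurrence with the `- 1` and `+ 1` terms and `end` follows the other one. The empty string is given the interval [0, n - 1], so the first step evaluates O on an empty prefix; starting from 1 would count B_X[0] = L and miss queries ending in L in this example.

```python
def backward_search(text, query):
    """Return (start, end, positions) for query in text, or None if absent.

    text must end with '$'. start and end are zero-based rows of the
    sorted rotation table; positions are offsets into text.
    """
    n = len(text)
    # Suffix array by naive rotation sort (fine for a toy example only).
    order = sorted(range(n), key=lambda i: text[i:] + text[:i])
    # The character before each sorted suffix forms the BWT string.
    bwt = "".join(text[i - 1] for i in order)

    # C(a): characters other than '$' that sort before a.
    letters = sorted(ch for ch in text if ch != "$")
    c = {}
    for rank, ch in enumerate(letters):
        c.setdefault(ch, rank)

    def occ(a, i):
        # Occurrences of a in bwt[0..i]; i = -1 gives the empty prefix.
        return bwt[: i + 1].count(a)

    start, end = 0, n - 1  # interval of the empty string
    for a in reversed(query):  # extend the match one character to the left
        if a not in c:
            return None
        start, end = c[a] + occ(a, start - 1) + 1, c[a] + occ(a, end)
        if start > end:
            return None
    positions = sorted(order[k] for k in range(start, end + 1))
    return start, end, positions


print(backward_search("GOOGOL$", "GO"))  # (1, 2, [0, 3])
print(backward_search("GOOGOL$", "L"))   # (3, 3, [5])
print(backward_search("GOOGOL$", "LG"))  # None
```

For "GO" the interval is rows 1 and 2 of the sorted table (GOL$GOO and GOOGOL$), which correspond to offsets 3 and 0 in the text.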
