User Tools

Site Tools


lecture_notes:04-22-2011

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
Next revision Both sides next revision
lecture_notes:04-22-2011 [2011/04/25 22:01]
eyliaw created
lecture_notes:04-22-2011 [2011/06/08 17:14]
eyliaw [Suffix array]
Line 1: Line 1:
-====== Burrows Wheeler ​Transform ​======+====== Burrows Wheeler ​Aligner ​====== 
 +We discussed the [[bioinformatic_tools:​bwa]]. ​ It uses the Burrows Wheeler Transform to represent a prefix trie, allowing for short read alignment with mismatches and gaps. 
 ===== The prefix trie ====== ===== The prefix trie ======
-The prefix trie is a tree built from possible prefixes, starting at the end of a branch and going backwards to the root.  Each node is the range in the suffix array where the +The prefix trie is a tree built from possible prefixes, starting at the end of a branch and going backwards to the root.  Each node is the range in the suffix array where that prefix may be found. ​ Each edge is a character in the prefix. ​ A string can be found--allowing for mismatches and gaps--in the prefix tree by going up each branch and comparing characters to the prefix until the all characters have been matched or the mismatch limit is met.
  
 ===== Suffix array ====== ===== Suffix array ======
-The suffix array is found by sorting the cyclic transformation lexicographically and taking their original indices. +The suffix array is found by appending an end character and sorting the cyclic transformation ​of the resulting string ​lexicographically and storing the original indices. 
-Range in the suffix array can be found by the R_ and R-+ 
 +Example: 
 +   ​GOOGOL 
 +Append end character:​ 
 +   ​GOOGOL$ 
 +Cyclic transformation:​ 
 +   0 GOOGOL$ 
 +   1 OOGOL$G 
 +   2 OGOL$GO 
 +   3 GOL$GOO 
 +   4 OL$GOOG 
 +   5 L$GOOGO 
 +   6 $GOOGOL 
 +Sort: 
 +   6 $GOOGO**L** 
 +   3 GOL$GO**O** 
 +   0 GOOGOL**$** 
 +   5 L$GOOG**O** 
 +   2 OGOL$G**O** 
 +   4 OL$GOO**G** 
 +   1 OOGOL$**G** 
 + 
 +   S(i) = [6,​3,​0,​5,​2,​4,​1] 
 +The Burrows Wheeler transform takes the last character of the sorted cyclic strings: 
 +   B(i) = LO$OOGG 
lecture_notes/04-22-2011.txt · Last modified: 2015/08/09 23:06 by 212.129.31.47