
====== Sanger String Graph Assembler ======

  * Written by [[http://people.pwf.cam.ac.uk/js779/|Jared Simpson]].
  * Currently only has a [[https://github.com/jts/sga|GitHub repository]].
  * Paper: Efficient construction of an assembly string graph using the FM-index [(cite:string_graph>Jared T. Simpson and Richard Durbin. Efficient construction of an assembly string graph using the FM-index. Bioinformatics 2010, 26(12): i367-i373. doi: [[http://dx.doi.org/10.1093/bioinformatics/btq217]])]

===== Methods =====

  * Uses the Burrows-Wheeler Transform (BWT)/Ferragina-Manzini (FM)-index to build a string graph.

===== String Graphs =====

  * Nodes are reads (reads that are substrings of other reads are condensed into their superstrings). Edges are overlaps between reads; the non-overlapping prefix is stored in the forward edge and the non-overlapping suffix is stored in the backward edge.
  * Condenses repeats like a de Bruijn graph.
  * More expensive to construct than a de Bruijn graph.

===== BWT/FM-index =====

  * Like BWA, it uses the FM-index, a compressed data structure from which the suffix array can be inferred.
  * The Burrows-Wheeler transform B_X is the array of characters that precede each suffix in the lexicographically sorted suffix array (equivalently, the last column of the sorted rotations of X).
  * The FM-index consists of two data structures:
    - C_X(a), the number of symbols in X that are lexicographically lower than the symbol a.
    - Occ_X(a, i), the number of occurrences of the symbol a in B_X[1, i].
  * Together these allow substring searching and can be extended to construct the string graph.

===== String Graph Construction with the FM-index =====

===== Installation =====

Installing SGA from GitHub is a major pain because it has so many dependencies. It needs:

  * google-sparsehash (also needed for ABySS)
  * hoard
  * bamtools, which in turn needs
    * cmake (newer than the version installed on campusrocks)
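
===== Example: FM-index Sketch =====

To make the BWT/FM-index definitions above concrete, here is a minimal Python sketch of B_X, C_X and Occ_X, together with the standard backward search for counting substring occurrences. It is a toy illustration only, not SGA's implementation: the function names are made up for this page, the suffix array is built by naive sorting, and Occ is stored as a full table rather than in compressed or sampled form. The text is assumed to end in a ''$'' sentinel that sorts before every other symbol.

```python
def suffix_array(x):
    """Suffix array of x by naive sorting (fine for a toy example only)."""
    return sorted(range(len(x)), key=lambda i: x[i:])


def bwt(x):
    """B_X: the character preceding each suffix, in sorted suffix order."""
    assert x.endswith("$"), "assumes a '$' sentinel terminates the text"
    # For the suffix starting at 0, x[-1] wraps around to the sentinel.
    return "".join(x[i - 1] for i in suffix_array(x))


def build_fm_index(x):
    """Return (B_X, C, Occ) for the sentinel-terminated text x."""
    b = bwt(x)
    alphabet = sorted(set(x))
    # C[a]: number of symbols in x lexicographically lower than a.
    c = {}
    total = 0
    for a in alphabet:
        c[a] = total
        total += x.count(a)
    # occ[a][i]: occurrences of a among the first i symbols of B_X.
    occ = {a: [0] * (len(b) + 1) for a in alphabet}
    for i, ch in enumerate(b):
        for a in alphabet:
            occ[a][i + 1] = occ[a][i] + (1 if ch == a else 0)
    return b, c, occ


def count_occurrences(pattern, c, occ, n):
    """Backward search: how many times pattern occurs in the indexed text."""
    lo, hi = 0, n  # half-open range of suffix-array rows matching so far
    for a in reversed(pattern):
        if a not in c:
            return 0
        lo = c[a] + occ[a][lo]
        hi = c[a] + occ[a][hi]
        if lo >= hi:
            return 0
    return hi - lo


b, c, occ = build_fm_index("banana$")
print(b)                                       # annb$aa
print(count_occurrences("ana", c, occ, len(b)))  # 2
```

Each step of the backward search narrows a range of suffix-array rows by prepending one symbol of the pattern, so a query costs time proportional to the pattern length, independent of the text length. The same interval-update step is what overlap detection between reads builds on.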

archive/bioinformatic_tools/sga.1307572945.txt.gz · Last modified: 2011/06/08 22:42 by karplus