This shows you the differences between two versions of the page.
lecture_notes:04-15-2015 [2015/04/15 21:05] chkcole created |
lecture_notes:04-15-2015 [2015/04/15 22:31] chkcole |
||
---|---|---|---|
Line 7: | Line 7: | ||
The rate-limiting step for this process is calculating the overlap between every pair of sequences, because the number of pairwise comparisons grows quadratically with the number of sequences in the data set. | The rate-limiting step for this process is calculating the overlap between every pair of sequences, because the number of pairwise comparisons grows quadratically with the number of sequences in the data set. | ||
- | Assembling genomes with a De Bruijn graph circumvents this problem by allowing the assembler to extend the genome independently of any other sequence in the data. In order to assemble the genome with a De Bruijn graph, you must select a k-mer size such that the genome being assembled contains few or no repeats when divided into k-mers of that size. | + | Assembling genomes with a De Bruijn graph circumvents this problem by allowing the assembler to extend the genome independently of any other sequence. In order to assemble the genome with a De Bruijn graph, you must select a k-mer size such that the genome being assembled contains few or no repeats when divided into k-mers of that size. |
+ | |||
+ | The graph is built by dividing each sequence into k-mers of a given length and constructing nodes such that each node contains a k-mer. A directed edge from one node to another means that the first k-mer can be extended into the second, i.e. the last k-1 bases of the first k-mer match the first k-1 bases of the second. |
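
The graph construction described in the added paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular assembler. The function names `kmers` and `build_de_bruijn` and the toy reads are hypothetical. The sketch also assumes that an edge is drawn only between k-mers that appear consecutively in the same read, which guarantees the k-1 base overlap.

```python
def kmers(seq, k):
    """Return the k-mers of seq in left-to-right order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_de_bruijn(reads, k):
    """Map each k-mer node to the set of k-mers it can be extended into.

    Assumption: edges link k-mers that are adjacent within a read, so the
    last k-1 bases of the source equal the first k-1 bases of the target.
    """
    graph = {}
    for read in reads:
        parts = kmers(read, k)
        # Register every k-mer as a node, even if it has no outgoing edge.
        for node in parts:
            graph.setdefault(node, set())
        # Connect each k-mer to the one that follows it in the read.
        for left, right in zip(parts, parts[1:]):
            graph[left].add(right)
    return graph


# Toy reads (hypothetical data) that share the k-mer "GTC".
reads = ["ACGTC", "GTCA"]
graph = build_de_bruijn(reads, 3)
```

Because both reads contain the k-mer `GTC`, they meet at a single node, so the path `ACG`, `CGT`, `GTC`, `TCA` spells out `ACGTCA` without any read ever being compared directly against another.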