This shows you the differences between two versions of the page.
lecture_notes:04-05-2010 [2010/04/07 18:27] galt |
lecture_notes:04-05-2010 [2010/04/16 01:16] (current) karplus fixed citations to use Refnotes syntax |
||
---|---|---|---|
Line 29: | Line 29: | ||
* SOLiD System Tools (Corona_lite, etc): Hyunsung and Chris | * SOLiD System Tools (Corona_lite, etc): Hyunsung and Chris | ||
* Newbler documentation: Galt and Herbert | * Newbler documentation: Galt and Herbert | ||
+ | * SOAPdenovo: Galt and Jenny | ||
- | + | Assembly Review Articles: | |
- | [[http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WG1-4YJ6GD8-1&_user=10&_coverDate=03%2F06%2F2010&_rdoc=1&_fmt=high&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1282691739&_rerunOrigin=google&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=32c08d11cc10fd1eefca0f8a8def738b|Review Article]] | + | * Jason R. Miller, Sergey Koren and Granger Sutton [(cite:Miller2010>Jason R. Miller, Sergey Koren, Granger Sutton, Assembly algorithms for next-generation sequencing data, Genomics, In Press, Corrected Proof, Available online 6 March 2010, ISSN 0888-7543, DOI: 10.1016/j.ygeno.2010.03.001 http://www.sciencedirect.com/science/article/B6WG1-4YJ6GD8-1/2/ae6c957910e4ea658cdebff4a0ce9793)] \\ Covers these assemblers: SSAKE, SHARCGS, VCAKE, Newbler, Celera, Euler, Velvet, ABySS, AllPaths, and SOAPdenovo. Compares the de Bruijn graph approach to overlap/layout/consensus. |
- | + | | |
- | Assembly algorithms for next-generation sequencing data | + | |
- | + | ||
- | Jason R. Miller, Sergey Korena and Granger Suttona | + | |
- | + | ||
- | SSAKE, SHARCGS, VCAKE, Newbler, Celera Assembler, Euler, Velvet, ABySS, AllPaths, and SOAPdenovo. | + | |
- | + | ||
- | More generally, it compares the two standard methods known as the de Bruijn graph approach and the overlap/layout/consensus approach to assembly. | + | |
=====Assembly Overview===== | =====Assembly Overview===== | ||
Line 64: | Line 58: | ||
* Expect half your reads to have an error in them. | * Expect half your reads to have an error in them. | ||
* Contiguous chromosomes with a low error rate (output from assemblers). | * Contiguous chromosomes with a low error rate (output from assemblers). | ||
- | * Miami standard for a finished genome should have an error rate of 1 x 10^-5 bases. | + | * Bermuda standard for a finished genome should have an error rate of 1 x 10^-5 bases.1) [(cite:Bermuda1>[[http://www.genome.gov/page.cfm?pageID=10506376]])] [(cite:Bermuda2>[[http://www.ornl.gov/sci/techresources/Human_Genome/research/bermuda.shtml]])] |
* To reduce error rate in short reads, stack up many reads and take the most common base at each position. | * To reduce error rate in short reads, stack up many reads and take the most common base at each position. | ||
* How much data do we have? | * How much data do we have? | ||
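
The read-stacking bullet above (take the most common base at each position) can be sketched as a per-column majority vote. This is only a minimal illustration, assuming the reads have already been aligned and padded to equal length with `-` for positions a read does not cover; the function name `consensus` and the sample reads are hypothetical and not part of the lecture.

```python
from collections import Counter


def consensus(aligned_reads):
    """Return the majority-vote base for each column of pre-aligned reads.

    Assumes every read has the same length and uses '-' where a read
    does not cover a position (an assumption made for this sketch).
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    result = []
    for column in zip(*aligned_reads):
        # Count only real bases; uncovered positions do not vote.
        counts = Counter(base for base in column if base != "-")
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)


# Five stacked reads; two of them carry a single-base error.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAT", "-CGTAC"]
print(consensus(reads))
```

With enough reads stacked on a position, an isolated sequencing error is outvoted by the correct base. Real assemblers also weigh base quality scores, which this sketch ignores.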
Line 103: | Line 97: | ||
- Can find repeat regions using paired-end data. | - Can find repeat regions using paired-end data. | ||
* Most resequencing projects map reads to scaffolds and create contigs based upon the mapping. Sections with missing read data can be assumed to be a deletion or an alteration to the existing scaffold. | * Most resequencing projects map reads to scaffolds and create contigs based upon the mapping. Sections with missing read data can be assumed to be a deletion or an alteration to the existing scaffold. | ||
+ | |||
+ | |||
+ | ===== References ===== | ||
+ | <refnotes>notes-separator: none</refnotes> | ||
+ | ~~REFNOTES cite~~ | ||
+ | |||
+ |