archive:computer_resources:assemblies

Differences

This shows you the differences between two versions of the page.


archive:computer_resources:assemblies [2010/04/22 12:29]
karplus Added T-shirt k-mer suggestion.
archive:computer_resources:assemblies [2010/04/22 13:28]
karplus corrected order of short repeats (using double-stranded counts)
Line 37: Line 37:
  
 ===== slug/ =====
-  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)  Much of the assembly is low-complexity regions (repetitions of short repeats (GA)*, (TA)*, (TTC)*, (AC)*, (TAG)*, (CGAA)*, (TATC)*, (CAA)*, ... ).  The most common 14-mer that is not a repeat of a short k-mer is TAGTTTACAGCTTG (so that is what we should put on the T-shirt).
+  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)  Much of the assembly is low-complexity regions (repetitions of short repeats (AT)*, (AAG)*, (AG)*, (AC)*, (AGT)*, (AGAT)*, (ACAT)*, (AAC)*, (AACG)*, ... ).  The most common 14-mer that is not a repeat of a short k-mer is TAGTTTACAGCTTG (so that is what we should put on the T-shirt).
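
Two points in this entry can be checked with a short sketch. First, "double-stranded counts" presumably means that a repeat unit and its reverse complement are counted as the same repeat, which is why (GA)* becomes (AG)* and (TTC)* becomes (AAG)* in the new revision. Second, the quoted genome size is consistent with dividing the total read bases by the estimated coverage. The helper names below are hypothetical, and picking the lexicographically smallest rotation over both strands is an assumed convention, not necessarily the one used for this page: it reproduces most of the new labels, but it would give ACT where the page lists (AGT)*, its reverse complement. How the 0.043x coverage itself was derived is in the README and is not reconstructed here.

```python
# Sketch only: function names are hypothetical, not taken from the README.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(unit):
    """Return every cyclic rotation of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]


def canonical_repeat_unit(unit):
    """Pick one representative for a tandem-repeat unit on either strand.

    Assumed convention: the lexicographically smallest string among all
    rotations of the unit and all rotations of its reverse complement.
    """
    unit = unit.upper()
    return min(rotations(unit) + rotations(reverse_complement(unit)))


def implied_genome_size(total_read_bases, coverage):
    """Genome size implied by total sequenced bases and a coverage estimate."""
    return total_read_bases / coverage


if __name__ == "__main__":
    # Repeat units as written in the older revision of the page.
    for old in ["GA", "TA", "TTC", "AC", "CGAA", "TATC", "CAA"]:
        print(old, "->", canonical_repeat_unit(old))

    # 138,351,643 read bases at an estimated 0.043x coverage.
    print(f"{implied_genome_size(138_351_643, 0.043):.2e} bp")
```

Merging the two strands this way means a repeat such as (TTC)* and its complement-strand form (GAA)* add to a single count, which can change the ranking of the most common repeats, as the revision comment describes.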
archive/computer_resources/assemblies.txt · Last modified: 2015/09/02 16:53 by 92.247.181.31