archive:computer_resources:assemblies

Differences

This shows you the differences between two versions of the page.

archive:computer_resources:assemblies [2010/04/22 13:43]
karplus moved velvet-assembly1 to test
archive:computer_resources:assemblies [2010/04/22 20:27]
karplus added largest contig size for newbler-assembly1
Line 42: Line 42:
  
 ===== slug/ =====
-  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)  Much of the assembly is low-complexity regions (repetitions of short repeats (AT)*, (AAG)*, (AG)*, (AC)*, (AGT)*, (AGAT)*, (ACAT)*, (AAC)*, (AACG)*, ... ).  The most common 14-mer that is not a repeat of a short k-mer is TAGTTTACAGCTTG (so that is what we should put on the T-shirt).
 +  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.
 +    * This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.
 +    * The longest contig is 5,783 bases.
 +    * From the total number of bases in the assembly, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)
 +    * Much of the assembly is low-complexity regions (repetitions of short repeats (AT)*, (AAG)*, (AG)*, (AC)*, (AGT)*, (AGAT)*, (ACAT)*, (AAC)*, (AACG)*, ... ).
 +    * The most common 14-mer that is not a repeat of a short k-mer is TAGTTTACAGCTTG (so that is what we should put on the T-shirt).
 +  * newbler-mapping1-lottia/ attempts a reference-based assembly with the //Lottia gigantea// genome as a reference.
 +    * The reference has 4,475 contigs with 359,512,207 bases.
 +    * The output has 183 contigs with 29,389 bases.
 +    * The longest contig is only 644 bases, far too small to be of much use.
 +  * newbler-mapping2-seahare/ attempts a reference-based assembly with the //Aplysia californica// genome as a reference.
 +    * The sea hare reference has 8,767 contigs, comprising 715,806,041 bases.
 +    * The output has 2,664 contigs, comprising 443,648 bases (still less than the de novo assembly).
 +    * The longest contig is only 1,876 bases.
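
The figures in the list above can be cross-checked with a few lines of Python. This is only a consistency sketch, not the calculation from the README (which is not reproduced here): it assumes coverage is simply total read bases divided by genome size, takes the 3.2E9 genome size as given rather than deriving it, and uses hypothetical names (`coverage`, `mean_length`, `ASSEMBLIES`) chosen for illustration.

```python
# Consistency check on the numbers quoted in the notes above.
# Assumption: coverage = total read bases / genome size.
# The genome size is the stated estimate; it is NOT derived here.

READ_COUNT = 499_873            # reads from 454_run1 and 454_run2
READ_BASES = 138_351_643        # total bases in those reads
GENOME_SIZE_ESTIMATE = 3.2e9    # basepairs, as stated in the notes

# name -> (number of contigs, total assembled bases, longest contig)
ASSEMBLIES = {
    "newbler-assembly1": (8_963, 2_910_773, 5_783),
    "newbler-mapping1-lottia": (183, 29_389, 644),
    "newbler-mapping2-seahare": (2_664, 443_648, 1_876),
}


def coverage(read_bases, genome_size):
    """Average depth: sequenced bases per base of genome."""
    return read_bases / genome_size


def mean_length(total_bases, count):
    """Mean length of a read or contig."""
    return total_bases / count


if __name__ == "__main__":
    depth = coverage(READ_BASES, GENOME_SIZE_ESTIMATE)
    print(f"coverage: {depth:.3f}x")
    print(f"mean read length: {mean_length(READ_BASES, READ_COUNT):.1f} bases")
    for name, (contigs, bases, longest) in ASSEMBLIES.items():
        fraction = bases / READ_BASES
        print(
            f"{name}: mean contig {mean_length(bases, contigs):.1f} bases, "
            f"longest {longest}, {fraction:.2%} of read bases assembled"
        )
```

Under that assumption the stated genome size and read total reproduce the quoted coverage of about 0.043x, and the mean contig lengths show that all three assemblies are close to the scale of a single read.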
archive/computer_resources/assemblies.txt · Last modified: 2015/09/02 16:53 by 92.247.181.31