archive:computer_resources:assemblies

Differences

This shows you the differences between two versions of the page.

archive:computer_resources:assemblies [2010/04/22 07:49]
galt Added Galt's work on velvet and SOAPdenovo
archive:computer_resources:assemblies [2010/04/22 12:29]
karplus Added T-shirt k-mer suggestion.
Line 33:
       * Final graph has 3602 nodes and n50 of 4851, max 94854, total 1767903, using 28785664/61262410 reads
   * SOAPdenovo
-    * SOAPdenovo-assembly1/ Assembling Pog 454 long reads with SOAPdenovo.  After being simply unable to get any version of the program to read a FASTA file despite documentation examples, I finally found a utility sff2fastq that made it possible to run SOAPdenovo on Pog 454 fastq.  I have not had time to optimize parameters yet.
-      * The largest contig made with default params was just 4k.
+    * SOAPdenovo-assembly1/ Assembling Pog 454 long reads with SOAPdenovo.  After being simply unable to get any version of the program to read a FASTA file despite documentation examples, I finally found a utility sff2fastq that made it possible to run SOAPdenovo on Pog 454 fastq.  I have not had time to optimize parameters yet.  The largest contig made with default params was just 4k.
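
The assembly summaries in this hunk quote an n50, a maximum, and a total length. As a reminder of what the n50 figure means, here is a minimal Python sketch of the conventional definition: the largest length L such that pieces of length L or longer hold at least half of the total. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the assemblies above, and an assembler may measure lengths in its own units.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or node) lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest piece down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Made-up lengths, purely for illustration.
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths), max(toy_lengths), sum(toy_lengths))
```

With the toy input the total is 30, so the running sum first reaches 15 at the second-longest piece.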
  
  
 ===== slug/ =====
-  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)
+  * newbler-assembly1/ first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads including 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)  Much of the assembly is low-complexity regions (repetitions of short repeats (GA)*, (TA)*, (TTC)*, (AC)*, (TAG)*, (CGAA)*, (TATC)*, (CAA)*, ... ).  The most common 14-mer that is not a repeat of a short k-mer is TAGTTTACAGCTTG (so that is what we should put on the T-shirt).
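
The README calculation is not reproduced on this page, so the following Python sketch does not attempt to recreate it. It only checks that the quoted figures are mutually consistent under the usual definition coverage = total read bases / genome size. The variable names are illustrative.

```python
# Figures quoted in the newbler-assembly1 entry above.
total_read_bases = 138_351_643
assembled_bases = 2_910_773
estimated_genome_size = 3.2e9   # base pairs, as estimated in the entry
estimated_coverage = 0.043      # as estimated in the entry

# Coverage implied by the genome-size estimate.
coverage = total_read_bases / estimated_genome_size
print(f"implied coverage: {coverage:.3f}x")

# Genome size implied by the coverage estimate.
genome_size = total_read_bases / estimated_coverage
print(f"implied genome size: {genome_size:.1e} bp")

# Fraction of read bases that ended up in contigs.
assembled_fraction = assembled_bases / total_read_bases
print(f"assembled fraction of read bases: {assembled_fraction:.1%}")
```

The two estimates agree with each other to the precision quoted, and only about 2% of the read bases were assembled, which fits the description of a very low-coverage data set.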
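
The search for the most common 14-mer that is not a repeat of a short k-mer can be sketched as below. This is a minimal illustration, not the method actually used for the entry above: the function names, the `max_unit=4` cutoff (chosen because the longest repeat units listed are four bases), and the toy sequences are all assumptions. It counts k-mers on the given strand only and does not treat N or reverse complements specially.

```python
from collections import Counter


def is_short_repeat(kmer, max_unit=4):
    """True if kmer is a tandem repeat of a unit of length <= max_unit."""
    return any(
        all(kmer[i] == kmer[i - unit] for i in range(unit, len(kmer)))
        for unit in range(1, max_unit + 1)
    )


def top_non_repeat_kmer(sequences, k=14, max_unit=4):
    """Return (kmer, count) for the most common k-mer that is not a short repeat."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    # most_common() yields k-mers in order of decreasing count.
    for kmer, count in counts.most_common():
        if not is_short_repeat(kmer, max_unit):
            return kmer, count
    return None


# Toy input, made up for illustration: one (GA)* run plus two sequences
# that share the 14-mer quoted in the entry above.
toy = [
    "GAGAGAGAGAGAGAGAGAGA",
    "CCTAGTTTACAGCTTGCC",
    "AATAGTTTACAGCTTGAA",
]
print(top_non_repeat_kmer(toy))
```

In the toy input the (GA)* windows are the most frequent 14-mers but are filtered out, so the shared non-repeat 14-mer is reported instead. On real contigs the input would be the assembled sequences read from a FASTA file.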
archive/computer_resources/assemblies.txt · Last modified: 2015/09/02 16:53 by 92.247.181.31