archive:computer_resources:assemblies

Differences

This shows you the differences between two versions of the page.

archive:computer_resources:assemblies [2011/06/03 21:37]
eyliaw [slug/]
archive:computer_resources:assemblies [2011/06/08 15:47]
karplus [slug/] added info about abyss iteration and bwa+samtools+bcftools
Line 100: Line 100:
     * so ran with filling -R to get 12k maxcontig.
     * Then ran the scaffolding steps with 200bp insert size.
-    * For all steps, used low default cutoffs since our 10x coverage
-    * is not high.  21k max scaffold size.  Estimated
-    * genome size is around 3G.  The 4 steps are
+    * For all steps, used low default cutoffs since our 10x coverage is not high.  21k max scaffold size.
+    * Estimated genome size is around 3G.
+    * The 4 steps are
       - pregraph (3.5 to 4.5 hours for 30 to 60 cpus)
       - contig (1.3 hours)
Line 117: Line 117:
       * We have an almost complete mitochondrial genome, and I'm hoping that a few more iterations or some tricky assembly will close it into a clean circular genome.
       * It turns out that a lot of the hard hand work and iterated searching to assemble the mitochondrion was not necessary, as the SOAPdenovo-assembly2/k63_w_454_contigs/ assembly now has a 14960-long contig (not scaffold!) which is an almost-full-length mitochondrion, roughly as good as the best I've managed to assemble so far.  I'll combine it with my efforts and see if I can eke out a few more bases.
+      * Iterating mapping reads with BWA and assembling them with SOAPdenovo made some progress, but there was a gap that just wouldn't close.
+      * Switching to abyss (version 1.2.7) for the assembly of the reads made a much larger contig (15535-long after pasting on a suggestion from one abyss assembly onto another).
+      * Iterating search and abyss assembly does not lengthen the large contig.  Cleaning up and calling the consensus with bwa+samtools+bcftools doesn't change things much either.  There seems to be a large variation in coverage (from 20x to 2300x, with a median of 225x), so I suspect that there is a repeat region at the beginning of the current contig that may have 10 repeats in it.
   * SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data.
     * SOAPdenovo 1.05 - can handle gzipped fastq files.
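
The coverage spread reported in the added notes above (20x to 2300x, median 225x) is the kind of summary that can be computed from per-base depth. The following is a minimal sketch, not the method actually used: it assumes the depth is available as `samtools depth`-style text (tab-separated contig, position, depth), and the function names `coverage_summary` and `copy_number_estimate` are hypothetical. Dividing a peak depth by the median is only a rough heuristic for how many copies of a repeat may have collapsed onto one stretch of the contig.

```python
import statistics


def coverage_summary(depth_lines):
    """Summarize per-base depth from lines of the form: contig<TAB>position<TAB>depth.

    Assumption: this three-column layout matches `samtools depth` output.
    """
    depths = []
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        depths.append(int(fields[2]))
    if not depths:
        raise ValueError("no depth records found")
    return {
        "min": min(depths),
        "median": statistics.median(depths),
        "max": max(depths),
    }


def copy_number_estimate(depth, median_depth):
    """Rough heuristic: depth relative to the median suggests the collapsed copy count."""
    if median_depth <= 0:
        raise ValueError("median depth must be positive")
    return depth / median_depth


if __name__ == "__main__":
    # Toy records (hypothetical contig name) using the extremes quoted in the notes.
    toy = ["mito\t1\t20", "mito\t2\t225", "mito\t3\t2300"]
    summary = coverage_summary(toy)
    print(summary)
    print(copy_number_estimate(summary["max"], summary["median"]))
```

With the figures from the notes, 2300 / 225 is a little over 10, which is at least consistent with the guess of about 10 repeats.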
Line 131: Line 134:
     * Statistics for each kmer size assembly (using illumina and 454 data, using both for contig and scaffolding):
       * k31:
-          1298372 scaffolds from 4814226 contigs sum up 632702276bp, with average length 487, 0 gaps filled
-          3611844 scaffolds&singleton sum up 1133413022bp, with average length 313
-          the longest is 10340bp,scaffold N50 is 442 bp, scaffold N90 is 148 bp
+         * 1,298,372 scaffolds from 4,814,226 contigs sum up 632,702,276bp, with average length 487, 0 gaps filled
+         * 3,611,844 scaffolds&singleton sum up 1,133,413,022bp, with average length 313
+         * the longest is 10,340bp,scaffold N50 is 442 bp, scaffold N90 is 148 bp
       * k47:
-          871819 scaffolds from 5306463 contigs sum up 530762874bp, with average length 608, 0 gaps filled
-          4203195 scaffolds&singleton sum up 1296678043bp, with average length 308
-          the longest is 14750bp,scaffold N50 is 458 bp, scaffold N90 is 140 bp
+         * 871,819 scaffolds from 5,306,463 contigs sum up 530,762,874bp, with average length 608, 0 gaps filled
+         * 4,203,195 scaffolds&singleton sum up 1,296,678,043bp, with average length 308
+         * the longest is 14,750bp,scaffold N50 is 458 bp, scaffold N90 is 140 bp
       * k63:
-          270887 scaffolds from 4022505 contigs sum up 139720415bp, with average length 515, 0 gaps filled
-          3710532 scaffolds&singleton sum up 690332560bp, with average length 186
-          the longest is 14897bp,scaffold N50 is 232 bp, scaffold N90 is 112 bp
+         * 270,887 scaffolds from 4,022,505 contigs sum up 139,720,415bp, with average length 515, 0 gaps filled
+         * 3,710,532 scaffolds&singleton sum up 690,332,560bp, with average length 186
+         * the longest is 14,897bp,scaffold N50 is 232 bp, scaffold N90 is 112 bp
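
For readers less familiar with the N50 and N90 figures in the statistics above, here is a small sketch of how such values are usually defined: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of all bases. This is the common textbook definition, not code taken from SOAPdenovo, and the function names `nx` and `average_length` are hypothetical. The reported averages look like truncated integer division of total bases by sequence count, which is what `average_length` assumes.

```python
def nx(lengths, fraction=0.5):
    """Return the length L such that sequences of length >= L hold at least
    `fraction` of the total bases (fraction=0.5 gives N50, 0.9 gives N90)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # only reachable through floating-point edge cases


def average_length(total_bases, count):
    """Average length as truncated integer division (assumed reporting convention)."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total_bases // count


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up lengths, not real scaffolds
    print("N50:", nx(toy_lengths, 0.5))
    print("N90:", nx(toy_lengths, 0.9))
    # Totals and counts copied from the k31 statistics above.
    print("k31 scaffold average:", average_length(632702276, 1298372))
```

The toy lengths are invented for illustration; only the k31 total and count in the last line come from the page.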
  
archive/computer_resources/assemblies.txt · Last modified: 2015/09/02 16:53 by 92.247.181.31