This shows you the differences between two versions of the page.
archive:computer_resources:assemblies [2011/06/02 19:26] eyliaw |
archive:computer_resources:assemblies [2011/06/03 22:12] eyliaw [slug/] |
||
---|---|---|---|
Line 115: | Line 115: | ||
- assemble them using SOAPdenovo. | - assemble them using SOAPdenovo. | ||
* It looks like the Illumina reads have about 228x coverage of the mitochondrion, but coverage is patchy, and it seems to be difficult to close the circle (at least with SOAPdenovo). | * It looks like the Illumina reads have about 228x coverage of the mitochondrion, but coverage is patchy, and it seems to be difficult to close the circle (at least with SOAPdenovo). | ||
- | * I have an almost complete mitochondrial genome, and I'm hoping that a few more iterations or some tricky assembly will close it into a clean circular genome. | + | * We have an almost complete mitochondrial genome, and I'm hoping that a few more iterations or some tricky assembly will close it into a clean circular genome. |
+ | * It turns out that much of the manual work and iterated searching to assemble the mitochondrion was unnecessary: the SOAPdenovo-assembly2/k63_w_454_contigs/ assembly now has a 14,960 bp contig (not scaffold!) that is an almost full-length mitochondrion, roughly as good as the best I've managed to assemble so far. I'll combine it with my efforts and see if I can eke out a few more bases. | ||
* SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | * SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | ||
* SOAPdenovo 1.05 - can handle gzipped fastq files. | * SOAPdenovo 1.05 - can handle gzipped fastq files. | ||
* Runs with k27, 31, 47, and 63 so far. 47 was the best overall. 63 got the longest contig (~14.9kb). | * Runs with k27, 31, 47, and 63 so far. 47 was the best overall. 63 got the longest contig (~14.9kb). | ||
* Run parameters: | * Run parameters: | ||
- | * pregraph: | + | - pregraph: |
- | - lowest count size of 2 (-d 2) | + | * lowest count size of 2 (-d 2) |
- | * contig: | + | - contig: |
- | - solve tiny repeats on (-R) | + | * solve tiny repeats on (-R) |
- | * map: | + | - map: |
- | - all default | + | * all default |
- | * scaff: | + | - scaff: |
- | - intra-scaffold gap closure on (-F) | + | * intra-scaffold gap closure on (-F) |
+ | * Statistics for the assembly at each k-mer size (using both Illumina and 454 data for contig building and scaffolding): | ||
+ | * k31: | ||
+ | * 1,298,372 scaffolds from 4,814,226 contigs sum up 632,702,276bp, with average length 487, 0 gaps filled | ||
+ | * 3,611,844 scaffolds&singleton sum up 1,133,413,022bp, with average length 313 | ||
+ | * the longest is 10,340 bp, scaffold N50 is 442 bp, scaffold N90 is 148 bp | ||
+ | * k47: | ||
+ | * 871,819 scaffolds from 5,306,463 contigs sum up 530,762,874bp, with average length 608, 0 gaps filled | ||
+ | * 4,203,195 scaffolds&singleton sum up 1,296,678,043bp, with average length 308 | ||
+ | * the longest is 14,750 bp, scaffold N50 is 458 bp, scaffold N90 is 140 bp | ||
+ | * k63: | ||
+ | * 270,887 scaffolds from 4,022,505 contigs sum up 139,720,415bp, with average length 515, 0 gaps filled | ||
+ | * 3,710,532 scaffolds&singleton sum up 690,332,560bp, with average length 186 | ||
+ | * the longest is 14,897 bp, scaffold N50 is 232 bp, scaffold N90 is 112 bp | ||
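
The run parameters listed in this revision (pregraph with -d 2, contig with -R, map with defaults, scaff with -F) could be scripted roughly as follows. This is only a sketch, not the commands actually used on this page: the binary name `SOAPdenovo63mer`, the config file name `assembly.config`, and the output prefixes `k27` ... `k63` are hypothetical placeholders, and the `-s`, `-K`, `-o`, and `-g` options are assumed from the usual SOAPdenovo 1.x command line, so check them against the usage message of the installed version. The function only builds the command lines; it does not run the assembler.

```python
import shlex
from typing import List


def soapdenovo_stage_commands(binary: str, config: str, prefix: str, k: int) -> List[List[str]]:
    """Build argv lists for the four SOAPdenovo stages without running them.

    binary, config, and prefix are placeholders chosen by the caller;
    only -d 2, -R, and -F come from the run parameters in the notes.
    """
    return [
        # pregraph: -d 2 is the setting the notes describe as "lowest count size of 2"
        [binary, "pregraph", "-s", config, "-K", str(k), "-d", "2", "-o", prefix],
        # contig: -R turns on solving of tiny repeats
        [binary, "contig", "-g", prefix, "-R"],
        # map: all defaults, so only the config file and graph prefix are passed
        [binary, "map", "-s", config, "-g", prefix],
        # scaff: -F turns on intra-scaffold gap closure
        [binary, "scaff", "-g", prefix, "-F"],
    ]


if __name__ == "__main__":
    # k-mer sizes tried so far according to the notes
    for k in (27, 31, 47, 63):
        for cmd in soapdenovo_stage_commands("SOAPdenovo63mer", "assembly.config", f"k{k}", k):
            print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review one run per k-mer size before committing cluster time; each list can later be passed to `subprocess.run(cmd, check=True)` unchanged.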