This shows you the differences between two versions of the page.
archive:computer_resources:assemblies [2011/05/29 23:48] karplus [slug/] added barcode-of-life description |
archive:computer_resources:assemblies [2011/06/03 21:43] eyliaw [slug/] |
||
---|---|---|---|
Line 111: | Line 111: | ||
* Looked for 454 reads that extended or joined contigs in scaffold | * Looked for 454 reads that extended or joined contigs in scaffold | ||
* Repeated (sometimes using more sensitive searches) until no more credible scaffolds from the SOAPdenovo-assembly1/k31/ assembly or 454 reads were found. | * Repeated (sometimes using more sensitive searches) until no more credible scaffolds from the SOAPdenovo-assembly1/k31/ assembly or 454 reads were found. | ||
- | * Next step (not done yet, as of 29 May 2011) is to find all Illumina reads that map to the mitochondrial draft and assemble them. | + | * The 454 coverage of the mitochondrion is so slight as to be nearly useless, so instead we can iterate: |
+ | - find all Illumina reads that map to the mitochondrial draft, using BWA | ||
+ | - assemble them using SOAPdenovo. | ||
+ | * It looks like the Illumina reads have about 228x coverage of the mitochondrion, but coverage is patchy, and it seems to be difficult to close the circle (at least with SOAPdenovo). | ||
+ | * We have an almost complete mitochondrial genome, and I'm hoping that a few more iterations or some tricky assembly will close it into a clean circular genome. | ||
+ | * It turns out that much of the manual work and iterated searching to assemble the mitochondrion was unnecessary: the SOAPdenovo-assembly2/k63_w_454_contigs/ assembly now has a 14960 bp contig (not scaffold!) that is an almost-full-length mitochondrion, roughly as good as the best I've managed to assemble so far. I'll combine it with my efforts and see if I can eke out a few more bases. | ||
+ | * SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | ||
+ | * SOAPdenovo 1.05 - can handle gzipped fastq files. | ||
+ | * Runs with k = 27, 31, 47, and 63 so far. k47 was the best overall; k63 produced the longest contig (~14.9 kb). | ||
+ | * Run parameters: | ||
+ | - pregraph: | ||
+ | * k-mer frequency cutoff of 2 (-d 2) | ||
+ | - contig: | ||
+ | * solve tiny repeats on (-R) | ||
+ | - map: | ||
+ | * all default | ||
+ | - scaff: | ||
+ | * intra-scaffold gap closure on (-F) | ||
+ | * Statistics for each k-mer size (using Illumina and 454 data for both contig building and scaffolding): | ||
+ | * k31: | ||
+ | * 1298372 scaffolds from 4814226 contigs sum up 632702276bp, with average length 487, 0 gaps filled | ||
+ | * 3611844 scaffolds&singleton sum up 1133413022bp, with average length 313 | ||
+ | * the longest is 10340bp,scaffold N50 is 442 bp, scaffold N90 is 148 bp | ||
+ | * k47: | ||
+ | * 871819 scaffolds from 5306463 contigs sum up 530762874bp, with average length 608, 0 gaps filled | ||
+ | * 4203195 scaffolds&singleton sum up 1296678043bp, with average length 308 | ||
+ | * the longest is 14750bp,scaffold N50 is 458 bp, scaffold N90 is 140 bp | ||
+ | * k63: | ||
+ | * 270887 scaffolds from 4022505 contigs sum up 139720415bp, with average length 515, 0 gaps filled | ||
+ | * 3710532 scaffolds&singleton sum up 690332560bp, with average length 186 | ||
+ | * the longest is 14897bp,scaffold N50 is 232 bp, scaffold N90 is 112 bp |
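For readers comparing the scaffold N50 and N90 figures above, here is a minimal sketch of how such a statistic is usually computed from a list of sequence lengths. It assumes the common definition (the length L such that sequences of length L or longer cover at least the given percentage of the total bases); SOAPdenovo's own bookkeeping may differ in detail. The function name `nxx` and the toy lengths are hypothetical, not taken from the assembly.

```python
def nxx(lengths, percent):
    """Return the Nxx statistic for a list of sequence lengths.

    Assumes the usual definition: walk the sequences from longest to
    shortest and return the length at which the running total first
    reaches `percent` percent of all bases (percent=50 gives N50).
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding near the threshold.
        if running * 100 >= percent * total:
            return length


# Toy example (hypothetical lengths, not from the slug assembly).
toy_lengths = [10, 8, 6, 4, 2]
n50 = nxx(toy_lengths, 50)
n90 = nxx(toy_lengths, 90)
print("N50 =", n50, "N90 =", n90)
```

Because a few long scaffolds can coexist with millions of short ones, a long "longest" value (such as the ~14.9 kb at k63) and a low N50 are not contradictory: N50 is weighted by where the bulk of the bases sit.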