archive:computer_resources:assemblies [2011/06/08 15:47] karplus [slug/] added info about abyss iteration and bwa+samtools+bcftools |
archive:computer_resources:assemblies [2011/06/21 23:27] karplus [slug/] moved mitochondrion assembly information to a new page |
||
---|---|---|---|
Line 107: | Line 107: | ||
- map (0.6 hours with 60cpus) - paired ends | - map (0.6 hours with 60cpus) - paired ends | ||
- scaff (1 hour with 60cpus) | - scaff (1 hour with 60cpus) | ||
- | * barcode-of-life/ attempt to assemble the mitochondrial genome, with particular emphasis on the gene for mitochondrial cytochrome c oxidase subunit I (CO1), which is used for the "barcode of life". [[http://www.boldsystems.org/|BOLD (barcode of life database)]] | + | * barcode-of-life/ attempt to assemble the mitochondrial genome, documented on its own page: [[computer_resources:assemblies:mitochondrion]] |
- | * Started with a search of SOAPdenovo-assembly1/k31/soapSlug.scafSeq for scaffolds that matched examples from other mollusks. | + | |
- | * Looked for 454 reads that extended or joined contigs in scaffold | + | |
- | * Repeated (sometimes using more sensitive searches) until no more credible scaffolds from the SOAPdenovo-assembly1/k31/ assembly or 454 reads were found. | + | |
- | * The 454 coverage of the mitochondrion is so slight as to be nearly useless, so instead we can iterate: | + | |
- | - find all Illumina reads that map to the mitochondrial draft, using BWA | + | |
- | - assemble them using SOAPdenovo. | + | |
- | * It looks like the Illumina reads have about 228x coverage of the mitochondrion, but coverage is patchy, and it seems to be difficult to close the circle (at least with SOAPdenovo). | + | |
- | * We have an almost complete mitochondrial genome, and I'm hoping that a few more iterations or some tricky assembly will close it into a clean circular genome. | + | |
- | * It turns out that a lot of the hard hand work and iterated searching to assemble the mitochondrion was not necessary, as the SOAPdenovo-assembly2/k63_w_454_contigs/ assembly now has a 14960-long contig (not scaffold!) which is an almost-full-length mitochondrion, roughly as good as the best I've managed to assemble so far. I'll combine it with my efforts and see if I can eke out a few more bases. | + | |
- | * Iterating mapping reads with BWA and assembling them with SOAPdenovo made some progress, but there was a gap that just wouldn't close. | + | |
- | * Switching to abyss (version 1.2.7) for the assembly of the reads made a much larger contig (15535-long after pasting on a suggestion from one abyss assembly onto another). | + | |
- | * Iterating search and abyss assembly does not lengthen the large contig. Cleaning up and calling the consensus with bwa+samtools+bcftools doesn't change things much either. There seems to be a large variation in coverage (from 20x to 2300x, with a median of 225x), so I suspect that there is a repeat region at the beginning of the current contig that may have 10 repeats in it. | + | |
* SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | * SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | ||
* SOAPdenovo 1.05 - can handle gzipped fastq files. | * SOAPdenovo 1.05 - can handle gzipped fastq files. |