This shows you the differences between two versions of the page.
archive:computer_resources:assemblies [2011/06/07 23:16] karplus [slug/] added abyss assembly info to barcode-of-life |
archive:computer_resources:assemblies [2011/06/08 15:47] karplus [slug/] added info about abyss iteration and bwa+samtools+bcftools |
||
---|---|---|---|
Line 100: | Line 100: | ||
* so ran with filling -R to get 12k maxcontig. | * so ran with filling -R to get 12k maxcontig. | ||
* Then ran the scaffolding steps with 200bp insert size. | * Then ran the scaffolding steps with 200bp insert size. | ||
- | * For all steps, used low default cutoffs since our 10x coverage | + | * For all steps, used low default cutoffs since our 10x coverage is not high. 21k max scaffold size. |
- | * is not high. 21k max scaffold size. Estimated | + | * Estimated genome size is around 3G. |
- | * genome size is around 3G. The 4 steps are | + | * The 4 steps are |
- pregraph (3.5 to 4.5 hours for 30 to 60 cpus) | - pregraph (3.5 to 4.5 hours for 30 to 60 cpus) | ||
- contig (1.3 hours) | - contig (1.3 hours) | ||
Line 119: | Line 119: | ||
* Iterating mapping reads with BWA and assembling them with SOAPdenovo made some progress, but there was a gap that just wouldn't close. | * Iterating mapping reads with BWA and assembling them with SOAPdenovo made some progress, but there was a gap that just wouldn't close. | ||
* Switching to abyss (version 1.2.7) for the assembly of the reads made a much larger contig (15535-long after pasting on a suggestion from one abyss assembly onto another). | * Switching to abyss (version 1.2.7) for the assembly of the reads made a much larger contig (15535-long after pasting on a suggestion from one abyss assembly onto another). | ||
+ | * Iterating search and abyss assembly does not lengthen the large contig. Cleaning up and calling the consensus with bwa+samtools+bcftools doesn't change things much either. There seems to be a large variation in coverage (from 20x to 2300x, with a median of 225x), so I suspect that there is a repeat region at the beginning of the current contig that may have 10 repeats in it. | ||
* SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | * SOAPdenovo-assembly2/ Assembly with new + old Illumina and 454 data. | ||
* SOAPdenovo 1.05 - can handle gzipped fastq files. | * SOAPdenovo 1.05 - can handle gzipped fastq files. |
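The guess of about 10 repeats in the abyss coverage note above is consistent with simple arithmetic on the reported depths: a collapsed repeat should pile up reads roughly in proportion to its copy number, so peak coverage divided by median coverage gives a rough copy count. A minimal Python sketch of that check, assuming read depth scales linearly with copy number and that the median reflects single-copy sequence; the function name `estimate_copy_number` is hypothetical and not part of abyss, bwa, samtools, or bcftools.

```python
def estimate_copy_number(region_coverage: float, baseline_coverage: float) -> float:
    """Rough copy-number estimate for a collapsed repeat.

    Assumption (not from the original notes): depth is proportional to
    copy number, and baseline_coverage represents single-copy sequence.
    """
    if baseline_coverage <= 0:
        raise ValueError("baseline coverage must be positive")
    return region_coverage / baseline_coverage


# Depths reported in the note: maximum 2300x, median 225x.
peak_coverage = 2300.0
median_coverage = 225.0

copies = estimate_copy_number(peak_coverage, median_coverage)
print(f"estimated copies: {copies:.1f}")
```

This is only a sanity check on the ratio; uneven coverage from other causes (for example sequencing or mapping bias) would give the same signal.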