This shows you the differences between two versions of the page.
archive:computer_resources:assemblies [2010/04/30 23:01] karplus added mention of trim9.joins for the homework. |
archive:computer_resources:assemblies [2010/05/19 20:45] galt Assembled Slug Genome from Illumina Paired with SOAPdenovo |
||
---|---|---|---|
Line 56: | Line 56: | ||
* The output has 2664 contigs, comprising 443,648 bases (still less than the de novo assembly). | * The output has 2664 contigs, comprising 443,648 bases (still less than the de novo assembly). | ||
* The longest contig is only 1876 bases. | * The longest contig is only 1876 bases. | ||
+ | * SOAPdenovo-assembly1/ First run of SOAPdenovo on Illumina paired-end reads. | ||
+ | * SOAPdenovo requires fastq input files. | ||
+ | * It was used to assemble the Panda genome by BGI. | ||
+ | * Used kolossus, which has 1TB and 64 CPUs. | ||
+ | * Ran with k=31 and k=23. k=31 was better (9k max contig), so ran with filling (-R) to get a 12k max contig. | ||
+ | * Then ran the scaffolding steps with 200bp insert size. | ||
+ | * For all steps, used low default cutoffs since our 10x coverage is not high. | ||
+ | * Max scaffold size is 21k. | ||
+ | * Estimated genome size is around 3G. | ||
+ | * The 4 steps are: | ||
+ | * 1. pregraph (3.5 to 4.5 hours with 30 to 60 CPUs) | ||
+ | * 2. contig (1.3 hours) | ||
+ | * 3. map (0.6 hours with 60 CPUs) - paired ends | ||
+ | * 4. scaff (1 hour with 60 CPUs) | ||
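
The four steps and their reported run times can be sketched as a small Python helper. This is only an illustration, not the commands actually used for this run: the binary name, the config file `slug.config`, and the output prefix `slug_k31` are hypothetical, and the flag spellings (`-s`, `-K`, `-p`, `-o`, `-g`, `-R`) are recalled from the SOAPdenovo manual and should be checked against the installed version. The script only builds and prints command lines; it does not run anything.

```python
# Sketch only: builds (does not run) command lines for the four SOAPdenovo steps.
# Assumptions (not from the notes above): binary name, config path, output
# prefix, and the exact flag spellings; verify them against your SOAPdenovo version.

def build_steps(config="slug.config", prefix="slug_k31", k=31, cpus=60,
                binary="SOAPdenovo"):
    """Return the four steps in order as (name, argv) pairs."""
    return [
        # 1. pregraph: build the k-mer graph from the fastq reads listed in the config
        ("pregraph", [binary, "pregraph", "-s", config, "-K", str(k),
                      "-p", str(cpus), "-o", prefix]),
        # 2. contig: the notes mention running with -R to get longer contigs
        ("contig", [binary, "contig", "-g", prefix, "-R"]),
        # 3. map: map the paired-end reads back onto the contigs
        ("map", [binary, "map", "-s", config, "-g", prefix, "-p", str(cpus)]),
        # 4. scaff: build scaffolds from the mapped pairs
        ("scaff", [binary, "scaff", "-g", prefix, "-p", str(cpus)]),
    ]

# Wall-clock hours reported in the notes, as (low, high) per step.
REPORTED_HOURS = {
    "pregraph": (3.5, 4.5),
    "contig": (1.3, 1.3),
    "map": (0.6, 0.6),
    "scaff": (1.0, 1.0),
}

def total_hours(hours=REPORTED_HOURS):
    """Sum the low and high estimates over all steps."""
    low = sum(lo for lo, _ in hours.values())
    high = sum(hi for _, hi in hours.values())
    return round(low, 1), round(high, 1)

if __name__ == "__main__":
    for name, argv in build_steps():
        print(name + ":", " ".join(argv))
    print("total hours: %.1f to %.1f" % total_hours())
```

Summing the reported times gives roughly 6.4 to 7.4 hours end to end, with pregraph dominating. The 200bp insert size would normally go in the config file rather than on the command line.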