archive:computer_resources:assemblies

Differences

This shows you the differences between two versions of the page.

archive:computer_resources:assemblies [2010/04/20 04:28]
karplus added newbler-assembly2/
archive:computer_resources:assemblies [2010/04/21 00:44]
karplus added map-colorspace3/
Line 6:
   * newbler-assembly1/ is an attempt to do a de novo assembly using the 454 tools (Newbler) version 2.3, starting with the entire set of reads (including any contaminants).  This resulted in 43 contigs and 2,449,932 bases.
   * newbler-clean1/ does not create an assembly; instead, it is an attempt to remove contaminant reads from the Pog 454 data by removing reads that map to //Helicobacter pylori// genomes.  The results are in newbler-clean1/sff_cleaned/no_Hyp.sff
-  * newbler-assembly2/ is a second de novo assembly using Newbler, starting from the cleaned reads of newbler-clean1/sff_cleaned/no_Hyp.sff
+  * newbler-assembly2/ is a second de novo assembly using Newbler, starting from the cleaned reads of newbler-clean1/sff_cleaned/no_Hyp.sff  It gets 42 contigs and 2,449,409 bases.
 +  * newbler-assembly3/ starts from the same sff file as newbler-assembly2/ but raises the expected coverage to 60 (close to the actual coverage).  It gets 41 contigs and 2,449,426 bases, still more than the old version of Newbler got after similar cleaning.  The contigs have been mapped to the finished genome (using megablast, blastn, blat, and pluck-scripts/find-dna-differences).  All the contigs map cleanly to the finished genome.  If a contig maps to more than one place, find-dna-differences may incorrectly report it as not mapping.
 +  * map-colorspace3/ uses the [[bioinformatic_tools:pluck-scripts|pluck-scripts]] script map-colorspace to map the SOLiD mate-pair reads onto the contigs of the newbler-assembly3/ run.  The intent is to find which contigs join to which others.  The numbering starts with 3, not 1, so that the map-colorspace directories correspond to the newbler-assembly directories that they are mapping onto.
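
The three Newbler runs listed above differ only slightly in contig count and total bases. As a rough illustration, the following Python sketch tabulates the reported numbers and derives the mean contig length and the change in total bases from one run to the next. The names `ASSEMBLIES` and `summarize` are hypothetical and are not part of pluck-scripts or the Newbler tools; only the contig and base counts come from this page.

```python
# Contig counts and total bases as reported in the list above.
ASSEMBLIES = [
    ("newbler-assembly1", 43, 2_449_932),
    ("newbler-assembly2", 42, 2_449_409),
    ("newbler-assembly3", 41, 2_449_426),
]


def summarize(assemblies):
    """Return one dict per run with the mean contig length and the
    change in total bases relative to the previous run in the list."""
    rows = []
    previous_bases = None
    for name, contigs, bases in assemblies:
        rows.append({
            "name": name,
            "contigs": contigs,
            "bases": bases,
            "mean_contig_length": bases / contigs,
            "delta_bases": None if previous_bases is None else bases - previous_bases,
        })
        previous_bases = bases
    return rows


if __name__ == "__main__":
    for row in summarize(ASSEMBLIES):
        delta = "n/a" if row["delta_bases"] is None else f"{row['delta_bases']:+d}"
        print(f"{row['name']}: {row['contigs']} contigs, {row['bases']:,} bases, "
              f"mean contig {row['mean_contig_length']:,.0f} bases, change {delta}")
```

Total bases alone say nothing about correctness; the mapping of contigs to the finished genome described above is what validates the assembly.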
  
 ===== slug/ =====
 +  * newbler-assembly1/ is a first attempt at de novo assembly using Newbler, using all the reads from 454_run1 and 454_run2.  This assembly of 499,873 reads totaling 138,351,643 bases produced only 2,910,773 bases assembled into 8,963 contigs.  From this low assembly number, I estimate the coverage to be about 0.043x and the genome size to be about 3.2E9 basepairs. (See the README file for the calculation.)
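
The README calculation is not reproduced on this page. As a consistency check only, the sketch below applies the standard relation coverage = total read bases / genome size to the numbers reported above. It assumes the 3.2E9 genome-size estimate as an input rather than deriving it, so it does not show how that estimate was obtained. The names used are hypothetical.

```python
# Numbers reported above for slug/newbler-assembly1/.
READS = 499_873
READ_BASES = 138_351_643
ASSEMBLED_BASES = 2_910_773
CONTIGS = 8_963

# Assumption: the genome-size estimate is taken from the text, not derived here.
GENOME_SIZE_ESTIMATE = 3.2e9


def coverage(read_bases, genome_size):
    """Average depth of coverage: total sequenced bases per genome base."""
    return read_bases / genome_size


mean_read_length = READ_BASES / READS
implied_coverage = coverage(READ_BASES, GENOME_SIZE_ESTIMATE)
mean_contig_length = ASSEMBLED_BASES / CONTIGS

if __name__ == "__main__":
    print(f"mean read length:   {mean_read_length:.1f} bases")
    print(f"implied coverage:   {implied_coverage:.3f}x")
    print(f"mean contig length: {mean_contig_length:.1f} bases")
```

The implied coverage agrees with the 0.043x quoted above. The mean contig length is only slightly larger than the mean read length, which is what one would expect when most contigs come from just a few overlapping reads.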
archive/computer_resources/assemblies.txt · Last modified: 2015/09/02 16:53 by 92.247.181.31