archive:bioinformatic_tools:gs_de_novo_assembler

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:gs_de_novo_assembler [2010/04/16 05:11]
karplus updated installation info and provided pointer to Makefile
archive:bioinformatic_tools:gs_de_novo_assembler [2010/04/20 14:17]
karplus Added warning about non-serial contig numbering, added -noace
Line 5: Line 5:
 It works in flow-space to reduce the impact of its most common
 sequencing error (uncertainty about the length of homopolymers).\\
-It claims it can assemble a 3GB genome in one day and can use pai --- //[[karplus@soe.ucsc.edu|Kevin Karplus]] 2010/04/15 22:07//red-end
+It claims it can assemble a 3GB genome in one day and can use paired-end
 information to construct scaffolds from contigs.  Currently the paired-end data must have at least 50 bases in each end, so only 454 paired-end libraries are accepted---it would be good if they relaxed that constraint so that their data could be mixed with data from other platforms.
  
Line 49: Line 49:
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ01.sff
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ02.sff
-runProject -e 50 .
+runProject -e 50 -noace -rst 0 .
 </code>
 Of course, different sff files will be used on different runs.
  
-A Makefile that illustrates the use of the SunGrid to avoid running on the head node is shown in /campusdata/BME235/assemblies/Pog/newbler-assembly1/Makefile
+A Makefile that illustrates the use of the SunGrid to avoid running on the head node is shown in /campusdata/BME235/assemblies/Pog/newbler-assembly2/Makefile
+
+Note: earlier versions of Newbler provided serially numbered contigs, but version 2.3 seems to skip numbers rather arbitrarily, so that the range of the numbers is larger than the size of the set of contigs.  Look at the counts (in assembly/454NewblerMetrics.txt) or run a program to count the contigs, rather than relying on the largest contig number.
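The contig check suggested in the note above can be sketched as follows. This is a minimal illustration, not part of the wiki page or of Newbler: the function name `contig_stats` is hypothetical, and the sketch assumes the contigs are in a FASTA file whose headers start with `>contig` followed by digits (for example a file such as 454AllContigs.fna in the assembly directory; treat both the file name and the header layout as assumptions and check them against your own output).

```python
import re
import sys
from typing import Iterable, Tuple


def contig_stats(fasta_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number_of_contigs, largest_contig_number).

    Assumes FASTA headers of the form '>contigNNNNN ...' (an assumption
    about the assembler output, not something stated on the wiki page).
    Headers that do not match still count as contigs but do not affect
    the largest number.
    """
    count = 0
    largest = 0
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        count += 1
        match = re.match(r">contig(\d+)", line)
        if match:
            largest = max(largest, int(match.group(1)))
    return count, largest


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path to the contig FASTA file.
    with open(sys.argv[1]) as handle:
        n_contigs, max_number = contig_stats(handle)
    print(f"contigs: {n_contigs}, largest contig number: {max_number}")
```

If the two reported values differ, the numbering has gaps, and the count rather than the largest number is the one to use.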
  
 == Mapping to existing genome ==
Line 65: Line 67:
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ01.sff
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ02.sff
-runProject -e 50 .
+runProject -e 50 -noace -rst 0 .
 </code>
  
archive/bioinformatic_tools/gs_de_novo_assembler.txt · Last modified: 2015/07/28 06:23 by ceisenhart