archive:bioinformatic_tools:gs_de_novo_assembler

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:gs_de_novo_assembler [2010/04/16 01:37]
karplus bug notice, installed wrong version
archive:bioinformatic_tools:gs_de_novo_assembler [2010/04/24 15:36]
karplus replaced -noace by -nobig and added explanations
Line 40: Line 40:
   * stopRun
  
-(but I haven't tested them yet --- //[[karplus@soe.ucsc.edu|Kevin Karplus]] 2010/04/15 16:02//)
-BUG: I installed the old version...I'll have to try again with the 2.3 version. --- //[[karplus@soe.ucsc.edu|Kevin Karplus]] 2010/04/15 18:36//
+Installed and being tested
+
  
 == De novo assembly ==
  
-The standard commands for de novo assembly are to create a new directory, and in that directory create a Makefile that includes a target to execute the following commands:
+The standard approach for de novo assembly is to create a new directory, and in that directory create a Makefile that includes a target to execute the following commands:
 <code>
 newAssembly .
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ01.sff
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ02.sff
-runProject -e 50 .
+runProject -e 50 -nobig .
 </code>
 Of course, different sff files will be used on different runs.
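Because the set of sff files changes from run to run, it can help to generate the command sequence rather than retype it. The sketch below only prints the commands (it does not invoke Newbler), so the output can be reviewed or pasted into a Makefile target. The function name `assembly_commands` and its argument order are illustrative assumptions, not part of Newbler.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the Newbler command sequence
# shown above for an arbitrary list of sff files.
# Usage: assembly_commands EXPECTED_COVERAGE SFF_FILE...
assembly_commands() {
    coverage=$1
    shift
    echo "newAssembly ."
    # One addRun line per sff file given on the command line.
    for sff in "$@"; do
        echo "addRun . $sff"
    done
    echo "runProject -e $coverage -nobig ."
}
```

For example, `assembly_commands 50 FUIPDCZ01.sff FUIPDCZ02.sff` prints the four command lines in the order used above.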
 +
 +The "-e" value is the expected coverage. For the Pog 454 data, that should be about 60. For the banana-slug data, it is very much smaller (0.05?).
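One common way to estimate an expected coverage is to divide the total number of sequenced bases by the estimated genome size. The page does not say how its figures were obtained, so the sketch below is an assumption about the arithmetic, and the numbers in the usage note are made up for illustration. The function name `expected_coverage` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: estimate coverage as total bases / genome size.
# Usage: expected_coverage TOTAL_BASES GENOME_SIZE
expected_coverage() {
    awk -v bases="$1" -v genome="$2" \
        'BEGIN { printf "%.2f\n", bases / genome }'
}
```

With illustrative inputs of 120000000 sequenced bases and a 2000000-base genome, `expected_coverage 120000000 2000000` prints `60.00`. The result is only as good as the genome-size estimate.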
 +
 +The -nobig parameter suppresses the generation of big output files.
 +
 +A Makefile that illustrates the use of the SunGrid to avoid running on the head node is shown in /campusdata/BME235/assemblies/Pog/newbler-assembly2/Makefile
 +
 +Note: earlier versions of Newbler numbered contigs serially, but version 2.3 seems to skip numbers rather arbitrarily, so the largest contig number can exceed the number of contigs. To find out how many contigs there are, look at the counts in assembly/454NewblerMetrics.txt or run a program to count the contigs, rather than relying on the largest contig number.
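A minimal sketch of "a program to count the contigs": count the FASTA header lines (those beginning with `>`) in the contig file. The function name `count_contigs` is hypothetical, and the file name in the usage note is an assumption, since the page names only 454NewblerMetrics.txt.

```shell
#!/bin/sh
# Hypothetical helper: count sequences in a FASTA file by counting
# lines that start with '>' (one header line per contig).
# Usage: count_contigs FASTA_FILE
count_contigs() {
    grep -c '^>' "$1"
}
```

For example, `count_contigs assembly/454AllContigs.fna`, assuming the contigs are written to a FASTA file of that name. A file whose headers are contig00001, contig00004, and contig00009 yields a count of 3, even though the largest contig number is 9.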
  
 == Mapping to existing genome ==
Line 64: Line 71:
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ01.sff
 addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ02.sff
-runProject -e 50 .
+runProject -e 50 -nobig .
 </code>
  
archive/bioinformatic_tools/gs_de_novo_assembler.txt · Last modified: 2015/07/28 06:23 by ceisenhart