archive:bioinformatic_tools:velvet

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:velvet [2010/04/23 07:09]
galt Added Daniel Z.'s phd thesis on velvet
archive:bioinformatic_tools:velvet [2015/07/28 06:27] (current)
ceisenhart ↷ Page moved from bioinformatic_tools:velvet to archive:bioinformatic_tools:velvet
Line 18: Line 18:
  
 Velvet has support for COLORSPACE, possibly the only de-novo short-read DBG assembler that does at this time.
 +The colorspace version of Velvet (the _de executables) expects all data to be double-encoded. Mixed-space input is not directly supported.
  
 Velvet has support for long-read data.
 +
 +Velvet accepts sequence data from FASTQ input files but does not use the quality information.
  
 ==== Color-Space ====
Line 54: Line 57:
  
 [[http://solidsoftwaretools.com/gf/project/denovo/|De-novo Tools for velvet from ABI for Solid]]
 +
 +==== Running ====
 +
 +Strategy:
 +  - Find the right value for k. For short reads, keep k small to maintain good k-mer coverage.
 +  - Find the right values for exp_cov and cov_cutoff. This is very important.
 +    * velvet-estimate-exp_cov.pl out/stats.txt makes a useful graph.
 +  - If you only have long reads, pass them as your short reads as well.
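
The coverage step can be approximated by hand. The following is a minimal Python sketch, not the algorithm used by velvet-estimate-exp_cov.pl: it builds a length-weighted coverage histogram from a tab-separated stats table and takes the heaviest bin as a starting guess for exp_cov. The column names ''lgth'' and ''short1_cov'', the function names, and the sample rows are assumptions made for illustration; check them against your own stats.txt header.

```python
import math
from collections import defaultdict

# Hypothetical sample in the assumed stats.txt layout (tab-separated, header first).
SAMPLE_STATS = "\n".join([
    "ID\tlgth\tout\tin\tlong_cov\tshort1_cov",
    "1\t100\t1\t1\t0\t60.2",
    "2\t500\t1\t1\t0\t60.9",
    "3\t50\t1\t1\t0\t2.5",
    "4\t200\t1\t1\t0\t13.0",
    "5\t10\t0\t0\t0\tInf",
])


def weighted_coverage_histogram(stats_text, cov_column="short1_cov", bin_width=1.0):
    """Return {coverage_bin: total node length} for a stats table given as text."""
    lines = [ln for ln in stats_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    length_idx = header.index("lgth")      # assumed name of the node-length column
    cov_idx = header.index(cov_column)     # assumed name of the coverage column
    histogram = defaultdict(int)
    for line in lines[1:]:
        fields = line.split("\t")
        try:
            length = int(fields[length_idx])
            coverage = float(fields[cov_idx])
        except (ValueError, IndexError):
            continue  # skip malformed rows
        if not math.isfinite(coverage):
            continue  # skip "Inf"/"nan" coverage values
        bin_start = int(coverage // bin_width) * bin_width
        histogram[bin_start] += length     # weight each node by its length
    return dict(histogram)


def estimate_exp_cov(histogram, min_coverage=1.0):
    """Return the coverage bin carrying the most total length, or None if empty."""
    candidates = {b: w for b, w in histogram.items() if b >= min_coverage}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    hist = weighted_coverage_histogram(SAMPLE_STATS)
    print("suggested exp_cov starting point:", estimate_exp_cov(hist))
```

Weighting by node length keeps many tiny, low-coverage error nodes from dominating the histogram. The result is only a starting point; the graph is still worth inspecting before settling on exp_cov and cov_cutoff.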
 +
 +For 454 long reads, this was our best result:
 +  velveth out 31 -short 454/?.TCA.454Reads.fna -long 454/?.TCA.454Reads.fna
 +  velvetg out -exp_cov 60 -cov_cutoff 13
 +  Final graph has 1755 nodes and n50 of 41723, max 142286, total 2468925, using 778257/782604 reads
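
To repeat a run like the one above across several values of k, the two commands can be generated programmatically. This is a hedged Python sketch: it only uses the flags shown above, while the read file name ''reads.fna'', the per-k output directory naming, and the function names are hypothetical. The exp_cov and cov_cutoff defaults are carried over from the run above and would likely need re-tuning for each k.

```python
import shlex


def velvet_commands(k, reads, out_prefix="out", exp_cov=60, cov_cutoff=13):
    """Build the velveth/velvetg argument lists for one k, long reads doubling as short."""
    out_dir = f"{out_prefix}_k{k}"  # hypothetical naming: one output directory per k
    velveth = ["velveth", out_dir, str(k), "-short", reads, "-long", reads]
    velvetg = ["velvetg", out_dir,
               "-exp_cov", str(exp_cov), "-cov_cutoff", str(cov_cutoff)]
    return velveth, velvetg


def sweep(k_values, reads):
    """Yield (velveth, velvetg) command pairs for each odd k in k_values."""
    for k in k_values:
        if k % 2 == 0:
            raise ValueError(f"k must be odd, got {k}")
        yield velvet_commands(k, reads)


def render(command):
    """Format an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in command)


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for velveth, velvetg in sweep(range(21, 32, 2), "reads.fna"):
        print(render(velveth))
        print(render(velvetg))
```

Printing rather than executing keeps the sketch free of side effects; the same argument lists could be passed to ''subprocess.run'' once the parameters look right.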
 +
 +
 +==== Failures ====
 +
 +=== VelvetOptimiser ===
 +The contributed utility VelvetOptimiser (in velvet/contrib/) is intended to help find
 +the critical parameters k, exp_cov, and cov_cutoff. Although it found k,
 +it got stuck at a local maximum for coverage and failed to produce anything useful.
 +
 +=== pseudoFlow ===
 +Wondering whether homopolymer errors in 454 data could cause trouble for the DBG,
 +I made a utility called pseudoFlow.c that takes all homopolymers longer than
 +6 and shortens them to 6, since 454 is accurate for run lengths in the range 1 to 6.
 +In any case, the pseudoFlow version of the data did not perform better;
 +in fact, it was a little worse.
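
The transformation described above can be illustrated in a few lines. This is a Python sketch of the idea only, not the original pseudoFlow.c, whose source is not shown here; the function names are hypothetical, and the FASTA handling (joining wrapped sequence lines into one line per record) is an assumption.

```python
import re


def cap_homopolymers(seq, max_run=6):
    """Shorten every run of one repeated character longer than max_run to max_run."""
    # (.) captures one character; \1{max_run,} requires at least max_run more
    # copies, so only runs of max_run + 1 or longer are matched and rewritten.
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, seq)


def cap_fasta(lines, max_run=6):
    """Yield header and capped sequence lines for each FASTA record in lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header
                yield cap_homopolymers("".join(chunks), max_run)
            header, chunks = line, []
        elif line:
            chunks.append(line)  # collect wrapped sequence lines of one record
    if header is not None:
        yield header
        yield cap_homopolymers("".join(chunks), max_run)


if __name__ == "__main__":
    example = [">read1", "ACGTTTTTTTTTA", ">read2", "AAAA", "AAAAC"]
    for out_line in cap_fasta(example):
        print(out_line)
```

Joining the wrapped lines before capping matters because a homopolymer run can span a line break in the input file.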
  
 ==== Installing ====
archive/bioinformatic_tools/velvet.1272006577.txt.gz · Last modified: 2010/04/23 07:09 by galt