This shows you the differences between two versions of the page.
archive:bioinformatic_tools:velvet [2010/04/23 07:09] galt Added Daniel Z.'s phd thesis on velvet |
archive:bioinformatic_tools:velvet [2015/07/28 06:27] (current) ceisenhart ↷ Page moved from bioinformatic_tools:velvet to archive:bioinformatic_tools:velvet |
||
---|---|---|---|
Line 18: | Line 18: | ||
Velvet has support for COLORSPACE, possibly the only de-novo short-read DBG assembler that does at this time. | Velvet has support for COLORSPACE, possibly the only de-novo short-read DBG assembler that does at this time. | ||
+ | The colorspace version of velvet (_de) expects all data to be double-encoded. Mixed-space data is not directly supported. | ||
Velvet has support for long-read data. | Velvet has support for long-read data. | ||
+ | |||
+ | Velvet will accept sequence data from fastq input files, but does not use the quality information. | ||
==== Color-Space ==== | ==== Color-Space ==== | ||
Line 54: | Line 57: | ||
[[http://solidsoftwaretools.com/gf/project/denovo/|De-novo Tools for velvet from ABI for Solid]] | [[http://solidsoftwaretools.com/gf/project/denovo/|De-novo Tools for velvet from ABI for Solid]] | ||
+ | |||
+ | ==== Running ==== | ||
+ | |||
+ | Strategy: | ||
+ | - Find the right value for k. For short reads, keep k small enough to maintain good k-mer coverage. | ||
+ | - Find the right values for exp_cov and cov_cutoff. These two settings have a large effect on the assembly. | ||
+ | * velvet-estimate-exp_cov.pl out/stats.txt makes a useful graph. | ||
+ | - If you only have long reads, pass them as your short reads as well. | ||
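
As a rough illustration of the coverage step above, the sketch below estimates exp_cov as the length-weighted mode of the short-read k-mer coverage in stats.txt. It is not what velvet-estimate-exp_cov.pl does internally. It assumes stats.txt is tab-delimited with a header row containing the columns lgth and short1_cov; the function name estimate_exp_cov and the min_cov threshold are made up for this example.

```python
import csv
import math
from collections import defaultdict


def estimate_exp_cov(stats_path, min_cov=1.0):
    """Return the integer coverage bin holding the most node length.

    Assumes a tab-delimited stats.txt with 'lgth' and 'short1_cov' columns.
    """
    weight = defaultdict(int)  # coverage bin -> summed node length
    with open(stats_path, newline="") as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            try:
                cov = float(row["short1_cov"])
                length = int(row["lgth"])
            except (KeyError, TypeError, ValueError):
                continue  # skip rows with missing or non-numeric fields
            if not math.isfinite(cov) or cov < min_cov:
                continue  # ignore Inf/NaN and near-zero coverage nodes
            weight[int(cov)] += length
    if not weight:
        raise ValueError("no usable nodes found in " + stats_path)
    # The bin with the largest total length approximates the coverage peak.
    return max(weight, key=weight.get)
```

Calling estimate_exp_cov("out/stats.txt") returns a single integer to try as -exp_cov. Looking at the full graph is still worthwhile, since a single mode hides repeats and low-coverage junk that inform the choice of cov_cutoff.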
+ | |||
+ | For 454 long reads, this was our best result: | ||
+ | velveth out 31 -short 454/?.TCA.454Reads.fna -long 454/?.TCA.454Reads.fna | ||
+ | velvetg out -exp_cov 60 -cov_cutoff 13 | ||
+ | Final graph has 1755 nodes and n50 of 41723, max 142286, total 2468925, using 778257/782604 reads | ||
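
To compare several values of k with the same invocation pattern as above, a small helper can build the velveth/velvetg command lines. This is only a sketch: the function name velvet_commands, the out_k<k> directory naming, and the file name reads.fna are made up for illustration. It assumes k should be odd, and it leaves the coverage options off unless they are given, because k-mer coverage changes with k and is best re-estimated from each run's stats.txt.

```python
import shlex


def velvet_commands(reads, k_values, exp_cov=None, cov_cutoff=None,
                    out_prefix="out"):
    """Build velveth/velvetg argument lists for each k (nothing is executed)."""
    cmds = []
    for k in k_values:
        if k % 2 == 0:
            raise ValueError("k is assumed to be odd, got %d" % k)
        out_dir = "%s_k%d" % (out_prefix, k)
        # Long reads are passed as both -short and -long, as in the run above.
        cmds.append(["velveth", out_dir, str(k),
                     "-short", reads, "-long", reads])
        velvetg = ["velvetg", out_dir]
        if exp_cov is not None:
            velvetg += ["-exp_cov", str(exp_cov)]
        if cov_cutoff is not None:
            velvetg += ["-cov_cutoff", str(cov_cutoff)]
        cmds.append(velvetg)
    return cmds


if __name__ == "__main__":
    # Print the commands for review; reads.fna is a placeholder file name.
    for cmd in velvet_commands("reads.fna", [27, 29, 31]):
        print(shlex.join(cmd))
```

The lists can be passed to subprocess.run(cmd, check=True) one at a time. No shell is involved in that case, so a glob such as the one in the command above would have to be expanded first, for example with glob.glob.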
+ | |||
+ | |||
+ | ==== Failures ==== | ||
+ | |||
+ | === VelvetOptimiser === | ||
+ | The contributed utility VelvetOptimiser (in velvet/contrib/) is intended to help find | ||
+ | the critical parameters k, exp_cov, and cov_cutoff. Although it found a value for k, | ||
+ | it got stuck at a local maximum while tuning coverage and failed to produce anything useful. | ||
+ | |||
+ | === pseudoFlow === | ||
+ | Wondering whether homopolymer errors in 454 data could cause trouble for the DBG, | ||
+ | I wrote a utility called pseudoFlow.c that shortens every homopolymer longer than | ||
+ | 6 bases to 6. We know that 454 is accurate for homopolymer lengths in the range 1 to 6. | ||
+ | In any case, the pseudoFlow version of the data did not perform better; | ||
+ | in fact, it was a little worse. | ||
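
The homopolymer-capping idea can be sketched in a few lines of Python. This is not the original pseudoFlow.c, only an illustration of the transformation described above; the function name cap_homopolymers and the max_run parameter are made up for this example.

```python
import re


def cap_homopolymers(seq, max_run=6):
    """Shorten every run of one repeated character longer than max_run."""
    # (.)\1{max_run,} matches a character followed by at least max_run
    # copies of itself, i.e. a run of max_run + 1 or more.
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, seq)
```

For example, cap_homopolymers("ACGTTTTTTTTTA") turns the run of nine T's into six and leaves the rest of the read untouched. The match is case-sensitive and works on one string at a time, so a sequence wrapped over several FASTA lines would need to be joined before capping.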
==== Installing ==== | ==== Installing ==== |