This shows you the differences between two versions of the page.
archive:bioinformatic_tools:velvet [2010/04/11 00:12] galt |
archive:bioinformatic_tools:velvet [2015/07/28 06:27] (current) ceisenhart ↷ Page moved from bioinformatic_tools:velvet to archive:bioinformatic_tools:velvet |
||
---|---|---|---|
Line 1: | Line 1: | ||
===== VELVET ===== | ===== VELVET ===== | ||
- | ====High Level Overview==== | ||
- | Velvet was developed by Ewan Birney and Daniel R. Zerbino for de-novo assembly of short-reads using de Bruijn graphs. | ||
- | Zerbino D, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. (2008) 18:821–829. | + | ==== Overview ==== |
- | [[http://nar.oxfordjournals.org/cgi/ijlink?linkType=ABST&journalCode=genome&resid=18/5/821|Free full text]] | + | Velvet was developed by Daniel R. Zerbino and Ewan Birney. |
+ | |||
+ | **Velvet: algorithms for de novo short read assembly using de Bruijn graphs** | ||
+ | [(cite:velvet>Daniel R. Zerbino and Ewan Birney.\\ | ||
+ | Velvet: Algorithms for de novo short read assembly using de Bruijn graphs\\ | ||
+ | Genome Res. May 2008 18: 821-829; Published in Advance March 18, 2008, \\ | ||
+ | doi:[[http://dx.doi.org/10.1101/gr.074492.107|10.1101/gr.074492.107]] | ||
+ | )] | ||
Velvet may be downloaded free from [[http://www.ebi.ac.uk/~zerbino/velvet/|here]] (GPL license). | Velvet may be downloaded free from [[http://www.ebi.ac.uk/~zerbino/velvet/|here]] (GPL license). | ||
- | There is a [[http://en.wikipedia.org/wiki/Velvet_(software)|wiki article]] about velvet. | + | On Wikipedia: [[wp>Velvet_(software)|Velvet]]. |
+ | [[http://www.ebi.ac.uk/training/ftp/PhDtheses/Daniel_Zerbino.pdf|Daniel Zerbino's PhD Thesis on Velvet]] | ||
- | == Installing == | + | Velvet supports colorspace data and is possibly the only de novo short-read de Bruijn graph (DBG) assembler that does at this time. |
+ | The colorspace versions of velvet (the _de programs) expect all data to be double-encoded. Mixed-space data is not directly supported. | ||
+ | |||
+ | Velvet has support for long-read data. | ||
+ | |||
+ | Velvet will accept sequence data from fastq input files, but does not use the quality information. | ||
+ | |||
+ | ==== Color-Space ==== | ||
+ | |||
+ | === DE double-encoded === | ||
+ | This is done by the pre-processor. | ||
+ | The primer base from the colorspace read is | ||
+ | removed, followed by the first color, since | ||
+ | it was tied to the primer-base. | ||
+ | In the case of mate-paired reads, | ||
+ | the F3 read is reversed. | ||
+ | Then the colors are all converted to bases | ||
+ | for software that doesn't parse colorspace inputs. | ||
+ | Thus double-encoded means reads encoded in colorspace, | ||
+ | and then re-encoded as if bases in base-space. | ||
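The double-encoding steps described above can be sketched in Python. This is only an illustration and not the ABI denovo_preprocessor itself: the function name `double_encode`, the `reverse` flag, and the color-to-letter mapping (0 to A, 1 to C, 2 to G, 3 to T, with a missing call `.` written as N) are assumptions made for the example.

```python
# Assumed mapping of colors to pseudo-bases; '.' (missing color call) becomes N.
COLOR_TO_BASE = str.maketrans("0123.", "ACGTN")


def double_encode(cs_read: str, reverse: bool = False) -> str:
    """Turn one colorspace read (primer base + colors) into a double-encoded read.

    cs_read -- e.g. "T0123012": one primer base followed by color calls.
    reverse -- set True for the read that must be reversed in a mate pair.
    """
    # Drop the primer base and the first color, which depends on the primer.
    colors = cs_read.strip()[2:]
    if reverse:
        # Plain reversal of the color string (no complementing).
        colors = colors[::-1]
    # Re-encode each color as if it were a base.
    return colors.translate(COLOR_TO_BASE)


if __name__ == "__main__":
    print(double_encode("T0123012"))
    print(double_encode("T0123012", reverse=True))
```

Note that the output is still colorspace information written with base letters; it has to be converted back after assembly before it can be read as real sequence.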
+ | |||
+ | === colorspace programs === | ||
+ | |||
+ | denovo_preprocessor | ||
+ | converts colorspace reads into double-encoded 24-base reads | ||
+ | that can be given to velvet_de. | ||
+ | |||
+ | velveth_de | ||
+ | colorspace version of velveth; hashes the reads. | ||
+ | |||
+ | velvetg_de | ||
+ | colorspace version of velvetg; creates the de Bruijn graph. | ||
+ | |||
+ | denovo_postprocessor | ||
+ | converts double-encoded velvet output to colorspace contigs. | ||
+ | |||
+ | denovo_adp - adapter program that converts colorspace | ||
+ | to base-space while reducing colorspace read errors as much as possible. | ||
+ | |||
+ | [[http://solidsoftwaretools.com/gf/project/denovo/|De-novo Tools for velvet from ABI for Solid]] | ||
+ | |||
+ | ==== Running ==== | ||
+ | |||
+ | Strategy: | ||
+ | - Find the right value for k. For short reads, remember to keep k small to maintain good k-mer coverage. | ||
+ | - Find the right values for exp_cov and cov_cutoff. This is very important. | ||
+ | * velvet-estimate-exp_cov.pl out/stats.txt makes a useful graph. | ||
+ | - If you only have long reads, use them also as your short reads. | ||
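To see why k has to stay small for short reads, the Velvet manual relates k-mer coverage Ck to nucleotide coverage C and read length L as Ck = C * (L - k + 1) / L. A minimal Python sketch of that relation follows; the function name `kmer_coverage` and the example numbers are illustrative assumptions, not values from our runs.

```python
def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """Return k-mer coverage Ck = C * (L - k + 1) / L.

    base_coverage -- nucleotide coverage C
    read_length   -- read length L in bases
    k             -- hash length given to velveth
    """
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


if __name__ == "__main__":
    # Same nucleotide coverage and k, two hypothetical read lengths.
    for read_length in (36, 100):
        print(read_length, kmer_coverage(60, read_length, 31))
```

With k close to the read length, each read contributes only a few k-mers, so most of the nucleotide coverage is lost in the graph.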
+ | |||
+ | For 454 long reads, this was our best result: | ||
+ | velveth out 31 -short 454/?.TCA.454Reads.fna -long 454/?.TCA.454Reads.fna | ||
+ | velvetg out -exp_cov 60 -cov_cutoff 13 | ||
+ | Final graph has 1755 nodes and n50 of 41723, max 142286, total 2468925, using 778257/782604 reads | ||
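For comparing runs, the n50 figure can be recomputed from any list of contig lengths. The sketch below uses the usual definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total length); the function name `n50` is hypothetical and this is not Velvet's own code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
```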
+ | |||
+ | |||
+ | ==== Failures ==== | ||
+ | |||
+ | === VelvetOptimiser === | ||
+ | The contributed (velvet/contrib/) utility VelvetOptimiser is intended to help find | ||
+ | the critical parameters k, exp_cov, and cov_cutoff. However, although it found k, | ||
+ | it got stuck at a local maximum when optimizing coverage and failed to produce anything useful. | ||
+ | |||
+ | === pseudoFlow === | ||
+ | Wondering whether homopolymer errors in 454 data could cause trouble for the DBG, | ||
+ | I made a utility called pseudoFlow.c that takes all homopolymers longer than | ||
+ | 6 and shortens them to 6, since 454 is known to be accurate for runs of length 1 to 6. | ||
+ | The pseudoFlow version of the data did not perform better; | ||
+ | in fact, it was a little worse. | ||
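The homopolymer capping idea can be illustrated in a few lines of Python. This is not the pseudoFlow.c source, only a sketch of the transformation described above; the function name `cap_homopolymers` and the `max_run` parameter are assumptions.

```python
import re


def cap_homopolymers(seq: str, max_run: int = 6) -> str:
    """Shorten every run of one repeated character longer than max_run to max_run."""
    # (.)\1{max_run,} matches a character followed by at least max_run copies
    # of itself, i.e. a run of max_run + 1 or more.
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, seq)


if __name__ == "__main__":
    print(cap_homopolymers("ACGT" + "A" * 9 + "C"))
```

The match is case-sensitive, so mixed-case input such as soft-masked sequence would need to be normalized first.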
+ | |||
+ | ==== Installing ==== | ||
ssh campusrocks.cse.ucsc.edu | ssh campusrocks.cse.ucsc.edu | ||
| | ||
cd /campusdata/BME235/programs | cd /campusdata/BME235/programs | ||
- | # but currently having problems with my group membership | ||
- | #cd $HOME | ||
- | # mkdir programs | ||
- | # cd programs | ||
- | # | ||
wget http://www.ebi.ac.uk/~zerbino/velvet/velvet_0.7.62.tgz | wget http://www.ebi.ac.uk/~zerbino/velvet/velvet_0.7.62.tgz | ||
tar xfz velvet_0.7.62.tgz | tar xfz velvet_0.7.62.tgz | ||
Line 32: | Line 102: | ||
cp velveth velvetg velveth_de velvetg_de /campusdata/BME235/bin/ | cp velveth velvetg velveth_de velvetg_de /campusdata/BME235/bin/ | ||
+ | |||
+ | ==== Examples ==== | ||
+ | |||
+ | [[http://kevin-gattaca.blogspot.com/2009/12/de-novo-assembly-with-abi-solid-reads.html|example of using velvet with solid]] | ||
+ | |||
+ | ==== Website ==== | ||
+ | [[http://www.ebi.ac.uk/~zerbino/velvet/]] | ||
+ | |||
+ | ==== Source with Binaries and Documentation ==== | ||
+ | [[http://www.ebi.ac.uk/~zerbino/velvet/velvet_0.7.62.tgz]] | ||
+ | |||
+ | ===== References ===== | ||
+ | <refnotes>notes-separator: none</refnotes> | ||
+ | ~~REFNOTES cite~~ | ||