archive:bioinformatic_tools:velvet

Differences

This shows you the differences between two versions of the page.

archive:bioinformatic_tools:velvet [2010/04/11 00:17]
galt
archive:bioinformatic_tools:velvet [2010/04/28 20:58]
galt
Line 1: Line 1:
 ===== VELVET =====
-====High Level Overview==== 
-Velvet was developed by Ewan Birney and Daniel R. Zerbino for de-novo assembly of short-reads using de Bruijn graphs. 
  
-Zerbino D, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821-829.
-[[http://nar.oxfordjournals.org/cgi/ijlink?linkType=ABST&journalCode=genome&resid=18/5/821|Free full text]]
 +==== Overview ====
 +Velvet was developed by Daniel R. Zerbino and Ewan Birney.
 + 
 +**Velvet: algorithms for de novo short read assembly using de Bruijn graphs**  
 +[(cite:velvet>Daniel R. Zerbino and Ewan Birney.\\
 +Velvet: Algorithms for de novo short read assembly using de Bruijn graphs\\
 +Genome Res. May 2008 18: 821-829; Published in Advance March 18, 2008, \\
 +doi:[[http://dx.doi.org/10.1101/gr.074492.107|10.1101/gr.074492.107]]
 +)]
  
 Velvet may be downloaded free from [[http://www.ebi.ac.uk/~zerbino/velvet/|here]] (GPL license).
  
-There is a [[http://en.wikipedia.org/wiki/Velvet_(software)|wiki article]] about velvet.
 +On Wikipedia: [[wp>Velvet_(software)|Velvet]].
  
 +[[http://​www.ebi.ac.uk/​training/​ftp/​PhDtheses/​Daniel_Zerbino.pdf|Daniel Zerbino'​s PhD Thesis on Velvet]]
  
 Velvet has support for COLORSPACE, possibly the only de-novo short-read DBG assembler that does at this time.
 +The colorspace versions of the velvet programs (velveth_de and velvetg_de) expect all data to be double-encoded. Mixed-space data is not directly supported.
  
 Velvet has support for long-read data.
  
-=== Installing ===
 +Velvet will accept sequence data from fastq input files, but does not use the quality information.
 + 
 +==== Color-Space ==== 
 + 
 +=== DE double-encoded === 
 +This is done by the pre-processor. 
 +The primer base from the colorspace read is  
 +removed, followed by the first color, since 
 +it was tied to the primer-base.  
 +In the case of mate-paired reads, 
 +the F3 read is reversed. 
 +Then the colors are all converted to bases 
 +for software that doesn'​t parse colorspace inputs. 
 +Thus double-encoded means reads encoded in colorspace,​ 
 +and then re-encoded as if bases in base-space. 
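The double-encoding steps described above can be sketched in a few lines of Python. This is only an illustration, not the code of the ABI pre-processor: the function name is hypothetical, and the mapping of colors 0, 1, 2, 3 to the letters A, C, G, T is an assumption about the encoding convention.

```python
# Assumed convention: colors 0,1,2,3 are rewritten as the letters A,C,G,T.
COLOR_TO_BASE = str.maketrans("0123", "ACGT")


def double_encode(csfasta_read, reverse=False):
    """Turn a colorspace read such as 'T0120312' into a double-encoded read.

    Hypothetical helper for illustration only.
    """
    # Drop the primer base and the first color, which is tied to the primer base.
    colors = csfasta_read[2:]
    if reverse:
        # Reverse the color order, as the text describes for the F3 read of a mate pair.
        colors = colors[::-1]
    # Rewrite each color as if it were a base.
    return colors.translate(COLOR_TO_BASE)


print(double_encode("T0120312"))  # colors 120312 -> CGATCG
```

The result looks like base-space sequence but still carries color information, which is why the assembled contigs have to be converted back afterwards.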
 + 
 +=== colorspace programs === 
 + 
 +denovo_preprocessor 
 +converts colorspace reads into double-encoded 24-base reads 
 +that can be given to velvet_de. 
 + 
 +velveth_de 
 +colorspace version of velveth hashes reads. 
 + 
 +velvetg_de 
 +colorspace version of velvetg creates de Bruijn graph. 
 + 
 +denovo_postprocessor 
 +converts velvet output double-encoded to colorspace contigs. 
 + 
 +denovo_adp - adapter program converts colorspace  
 +to base-space while reducing read errors in colorspace as much as possible. 
 + 
 +[[http://​solidsoftwaretools.com/​gf/​project/​denovo/​|De-novo Tools for velvet from ABI for Solid]] 
 + 
 +==== Running ==== 
 + 
 +Strategy:  
 +  - Find the right value for k.  For short reads remember to keep k small for good kmer coverage. 
 +  - Find the right values for exp_cov and cov_cutoff. This is very important.
 +    * velvet-estimate-exp_cov.pl out/​stats.txt makes a useful graph. 
 +  - If you only have long reads, use them also as your short reads. 
 + 
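The advice to keep k small for short reads follows from the relation between base coverage and k-mer coverage given in the Velvet manual, Ck = C * (L - k + 1) / L, where C is the base coverage and L the read length. A minimal Python sketch, in which the function name and the sample numbers (50x coverage, 35-base reads) are assumptions chosen for illustration:

```python
def kmer_coverage(base_coverage, read_length, k):
    """Expected k-mer coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k cannot exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


# Illustrative values only: the same data seen through three choices of k.
for k in (21, 25, 31):
    print(k, round(kmer_coverage(50, 35, k), 1))
```

With reads this short, each step up in k removes a large share of the k-mers per read, so the k-mer coverage drops quickly.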
 +For 454 long reads, this was our best result: 
 +  velveth out 31 -short 454/?​.TCA.454Reads.fna -long 454/?​.TCA.454Reads.fna 
 +  velvetg out -exp_cov 60 -cov_cutoff 13 
 +  Final graph has 1755 nodes and n50 of 41723, max 142286, total 2468925, using 778257/​782604 reads 
 + 
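The n50 figure in the summary line above is the usual assembly statistic: the length such that contigs of at least that length hold half of the total assembled length. A minimal Python sketch of the conventional definition follows; it is not taken from Velvet's source, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or node) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no lengths given")


print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # total 32, the two 8s cover half -> 8
```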
 + 
 +==== Failures ==== 
 + 
 +=== VelvetOptimiser === 
 +The contributed (velvet/contrib/) utility VelvetOptimiser is intended to help find 
 +the critical parameters k, exp_cov, and cov_cutoff.  However, although it found k,
 +it got stuck on a local maximum on coverage and failed to produce anything useful.
 + 
 +=== pseudoFlow === 
 +Wondering if homopolymer errors in 454 data could cause trouble for the DBG, 
 +I made a utility called pseudoFlow.c that takes all homopolymers longer than  
 +6 and shortens them to 6.  We know that in the range 1 to 6, 454 is accurate. 
 +In any case, the pseudoFlow version of the data did not perform better;
 +in fact, it was a little worse.
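The transformation described above, capping every homopolymer run at 6, can be sketched as follows. This is an illustrative Python equivalent, not the pseudoFlow.c source; the function name and the regular-expression approach are assumptions.

```python
import re


def cap_homopolymers(seq, max_run=6):
    """Shorten every run of one repeated character longer than max_run to max_run."""
    # One character followed by at least max_run repeats, i.e. a run of max_run + 1 or more.
    long_run = re.compile(r"(.)\1{%d,}" % max_run)
    return long_run.sub(lambda m: m.group(1) * max_run, seq)


print(cap_homopolymers("AC" + "T" * 9 + "G"))  # the run of 9 Ts becomes 6 Ts
```

Runs of 6 or fewer are left untouched, matching the range in which 454 calls are described as accurate.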
 + 
 +==== Installing ​====
  
   ssh campusrocks.cse.ucsc.edu
   
   cd /campusdata/BME235/programs
-  # but currently having problems with my group membership 
-  #cd $HOME 
-  # mkdir programs 
-  # cd programs 
-  #  
   wget http://www.ebi.ac.uk/~zerbino/velvet/velvet_0.7.62.tgz
   tar xfz velvet_0.7.62.tgz
Line 36: Line 102:
   cp velveth velvetg velveth_de velvetg_de /campusdata/BME235/bin/
  
 +
 +==== Examples ====
 +
 +[[http://​kevin-gattaca.blogspot.com/​2009/​12/​de-novo-assembly-with-abi-solid-reads.html|example of using velvet with solid]]
 +
 +==== Website ====
 +[[http://​www.ebi.ac.uk/​~zerbino/​velvet/​]]
 +
 +==== Source with Binaries and Documentation ====
 +[[http://​www.ebi.ac.uk/​~zerbino/​velvet/​velvet_0.7.62.tgz]]
 +
 +===== References =====
 +<​refnotes>​notes-separator:​ none</​refnotes>​
 +~~REFNOTES cite~~
  
archive/bioinformatic_tools/velvet.txt · Last modified: 2015/07/28 06:27 by ceisenhart