post-assembly_analysis:2015:rna_scaffolding

Differences

This shows you the differences between two versions of the page.

post-assembly_analysis:2015:rna_scaffolding [2015/08/31 19:58]
ceisenhart created
post-assembly_analysis:2015:rna_scaffolding [2015/08/31 21:11] (current)
ceisenhart
Line 1: Line 1:
 =====RNA scaffolding=====
-The RNA Scaffolding is currently in process.  Please contact Chris Eisenhart (ceisenhart@soe.ucsc.edu) with questions.
+The RNA Scaffolding is currently in process.  Please contact [[ceisenhart@soe.ucsc.edu | Chris Eisenhart]] with questions.
 
 The current pipeline can be broken down into three major steps;
 **data processing**, **transcriptome assembly**, and **genome scaffolding**
 
+The corresponding data files and wet lab procedures are documented online [[https://banana-slug.soe.ucsc.edu/data_overview:2015:rna-seq | here]].
 ====Data processing====
  
Line 20: Line 21:
  
 ====Current progress====
-Currently I am working with a small subset of the RNA seq data. I am running it through the pipeline to optimize the options and system usage.   Currently one full run has been done (completing L_RNA_scaffolder and generating a new fasta assembly file) while seven partial runs have been done (completing the transcriptome assembly).  I am still deciding what data processing is needed, I am debating running a RAM expensive de duplication to ensure that all duplicates are removed.
+I am working with a small subset of the RNA seq data, running it through the pipeline to optimize the options and system usage.   Currently one full run has been done (completing L_RNA_scaffolder and generating a new fasta assembly file) while seven partial runs have been done (completing the transcriptome assembly).  I am still deciding what data processing is needed, I am debating running a RAM expensive de duplication to ensure that all duplicates are removed.  These partial runs have been using 130+ GB of RAM at their peak, which means that without optimization the full run will crash even our terabyte-RAM machines.
 
 Currently I have one undergraduate from UC Berkeley working on the pipeline, [[darenliu@berkley.edu | Daren Liu ]].  Daren has been assisting me by writing a program for fastq de duplication, and a program for generating fasta statistics.
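
As an illustration of the fastq de duplication step mentioned above, here is a minimal Python sketch. It is not the program Daren wrote; the function names (`read_fastq`, `dedup_fastq`) are hypothetical, and it assumes "duplicate" means an identical read sequence in single-end, 4-line-per-record FASTQ. To limit memory it keeps a 16-byte digest per unique sequence instead of the sequence itself, which trades a negligible collision risk for a smaller footprint.

```python
import hashlib
from typing import Iterable, Iterator, Set, Tuple

# One FASTQ record: header, sequence, separator ("+"), quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Group an iterable of lines into 4-line FASTQ records.

    Assumes well-formed, unwrapped FASTQ (exactly 4 lines per record).
    A trailing incomplete record is silently dropped.
    """
    buf = []
    for line in lines:
        buf.append(line.rstrip("\n"))
        if len(buf) == 4:
            yield (buf[0], buf[1], buf[2], buf[3])
            buf = []


def dedup_fastq(records: Iterable[FastqRecord]) -> Iterator[FastqRecord]:
    """Yield only the first record seen for each distinct read sequence.

    Only a 16-byte BLAKE2b digest of each sequence is stored, so memory
    grows with the number of unique sequences, not their total length.
    """
    seen: Set[bytes] = set()
    for rec in records:
        digest = hashlib.blake2b(rec[1].encode("ascii"), digest_size=16).digest()
        if digest in seen:
            continue  # exact duplicate sequence: skip it
        seen.add(digest)
        yield rec
```

Because both functions are generators, a file can be streamed through them (for example `dedup_fastq(read_fastq(open(path)))`, where `path` is a placeholder) without loading all reads at once. The set of digests still grows with the number of unique reads, so a very large data set may need a further strategy such as partitioning reads by prefix.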
post-assembly_analysis/2015/rna_scaffolding.1441051114.txt.gz · Last modified: 2015/08/31 19:58 by ceisenhart