This shows you the differences between two versions of the page.
post-assembly_analysis:2015:rna_scaffolding [2015/08/31 19:58] ceisenhart created |
post-assembly_analysis:2015:rna_scaffolding [2015/08/31 20:00] ceisenhart |
||
---|---|---|---|
Line 20: | Line 20: | ||
====Current progress==== | ====Current progress==== | ||
- | Currently I am working with a small subset of the RNA-seq data. I am running it through the pipeline to optimize the options and system usage. So far one full run has been completed (finishing L_RNA_scaffolder and generating a new FASTA assembly file), while seven partial runs have been completed (finishing the transcriptome assembly). I am still deciding what data processing is needed; I am debating running a RAM-expensive deduplication to ensure that all duplicates are removed. | + | I am working with a small subset of the RNA-seq data, running it through the pipeline to optimize the options and system usage. So far one full run has been completed (finishing L_RNA_scaffolder and generating a new FASTA assembly file), while seven partial runs have been completed (finishing the transcriptome assembly). I am still deciding what data processing is needed; I am debating running a RAM-expensive deduplication to ensure that all duplicates are removed. The partial runs have been using 130+ GB of RAM at their peak, which means that without optimization the full run will crash even our terabyte-RAM machines. |
Currently I have one undergraduate from UC Berkeley working on the pipeline, [[darenliu@berkley.edu | Daren Liu ]]. Daren has been assisting me by writing a program for FASTQ deduplication and a program for generating FASTA statistics. | Currently I have one undergraduate from UC Berkeley working on the pipeline, [[darenliu@berkley.edu | Daren Liu ]]. Daren has been assisting me by writing a program for FASTQ deduplication and a program for generating FASTA statistics. |
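
Since the RAM cost of deduplication is the open question here, a minimal sketch of one memory-conscious approach may help frame it. This is not the program Daren is writing; the function names `read_fastq` and `dedup_fastq` are hypothetical, and the sketch assumes plain four-line FASTQ records and exact-sequence duplicates. It keeps a fixed-size 16-byte digest per distinct sequence instead of the sequence itself, at the cost of an extremely small chance of a hash collision dropping a non-duplicate read.

```python
import hashlib
from typing import Iterable, Iterator, Tuple

# (header, sequence, separator, quality) for one four-line FASTQ record
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records from an iterable of text lines."""
    it = iter(lines)
    while True:
        header = next(it, None)
        if header is None:
            return  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        sep = next(it, "").rstrip("\n")
        qual = next(it, "").rstrip("\n")
        yield header, seq, sep, qual


def dedup_fastq(records: Iterable[FastqRecord]) -> Iterator[FastqRecord]:
    """Yield the first record seen for each distinct sequence.

    Only a 16-byte digest of each sequence is stored, so memory grows
    with the number of distinct reads rather than with read length.
    """
    seen = set()
    for record in records:
        digest = hashlib.blake2b(record[1].encode("ascii"), digest_size=16).digest()
        if digest in seen:
            continue  # exact duplicate sequence (or, very rarely, a collision)
        seen.add(digest)
        yield record
```

Because both functions are generators, reads stream through one record at a time, for example `dedup_fastq(read_fastq(open(path)))` with `path` pointing at a FASTQ file; only the set of digests stays resident in memory.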