lecture_notes:05-20-2015

lecture_notes:05-20-2015 [2015/05/21 19:45]
nsaremi created
lecture_notes:05-20-2015 [2015/05/21 20:47] (current)
nsaremi
Line 10: Line 10:
  
  
====Running Discovar====
Used the fraction option to limit the input files to only a portion of the reads

Needed to specify threads and maximum memory for the run as well
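As a rough picture of what limiting the input to a portion of the reads means, here is a minimal Python sketch that keeps a random fraction of read pairs. It is not Discovar's own implementation, and the notes do not say how Discovar chooses which reads to keep; the function name `subsample_pairs`, the fixed seed, and the toy read names are all assumptions made for illustration.

```python
import random


def subsample_pairs(pairs, fraction, seed=0):
    """Keep each read pair with probability `fraction` (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return [pair for pair in pairs if rng.random() < fraction]


if __name__ == "__main__":
    # Toy read pairs; real input would come from FASTQ or BAM files.
    toy_pairs = [("read%d/1" % i, "read%d/2" % i) for i in range(1000)]
    kept = subsample_pairs(toy_pairs, 0.5)
    print(len(kept), "of", len(toy_pairs), "pairs kept")
```

Keeping mates together as a pair matters here, since a paired-end assembler needs both reads of each fragment.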
  
50% UCSF run showed much better results in N50 for contig and scaffold than 50% original data run and used less memory
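For reference, N50 is the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal Python sketch of the calculation follows; the function name and the example lengths are made up for illustration and are not taken from either run.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is the N50.
        if running >= half_total:
            return length
    return 0  # empty input


if __name__ == "__main__":
    example_lengths = [100, 80, 60, 40, 20]  # made-up lengths
    print(n50(example_lengths))
```

Computing this separately for the contig and scaffold length lists gives the two N50 values being compared between runs.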
  
Discovar performed much better with 2x250 reads vs 2x100 reads, with more scaffolds of longer length
  
Want to use full data set when there is more RAM available

====BLAST results====
  
8th longest scaffold, when nucleotide BLASTed, matched a transcript variant of sea hare
  
Metallothionein hit may be the result of having a cysteine-rich scaffold
  
Most common gene hit was ribosomal subunit 28S, which is a good sign because this gene is consistent across species
  
Want to run PRICE to find viral sequences that were found with BLAST

Would create an assembly for the viral sequence that was found and determine whether the sequence is integrated in the genome or is extranuclear
  
Can map contigs to scaffolds to see if any contig has a different coverage than normal coverage
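One simple way to look for contigs whose coverage differs from normal is to compare each contig's mean coverage to the median over all contigs. The Python sketch below assumes per-contig coverage has already been computed from a mapping; the function name, the 0.5x and 2x thresholds, and the example values are assumptions, not values from the class data.

```python
from statistics import median


def flag_coverage_outliers(coverage_by_contig, low=0.5, high=2.0):
    """Return {contig: ratio to median} for contigs outside [low, high] x median."""
    if not coverage_by_contig:
        return {}
    typical = median(coverage_by_contig.values())
    flagged = {}
    for name, cov in coverage_by_contig.items():
        # Guard against a zero median to avoid dividing by zero.
        ratio = cov / typical if typical else float("inf")
        if ratio < low or ratio > high:
            flagged[name] = ratio
    return flagged


if __name__ == "__main__":
    # Made-up mean coverage values per contig.
    example = {"c1": 30, "c2": 32, "c3": 29, "c4": 300, "c5": 5}
    for contig, ratio in flag_coverage_outliers(example).items():
        print(contig, round(ratio, 2))
```

The median is used rather than the mean so that a few very high-coverage contigs do not shift the baseline.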

====SSpace====
  
SSpace to do scaffolding after getting contigs

Scaffolds and contigs had been coming out as identical sequences

Used 50% UCSF contigs as input, using SW041 and SW042 files

Run with old BWA 0.5, will re-run with BWA 0.7 version

SSpace merged a few scaffolds, but only added more Ns

no change in scaffold N50
only affected shorter contigs
number of scaffolds decreased by 20-50

Probably due to not enough coverage of the assembly
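To quantify how many Ns a scaffolding step added, one option is to count the gap runs and total N bases in each scaffold before and after. This is a minimal Python sketch; the function name and the toy sequence are hypothetical, and it is not part of SSpace.

```python
import re


def gap_stats(sequence):
    """Return (number of N runs, total N bases) for one scaffold sequence."""
    runs = [m.end() - m.start() for m in re.finditer(r"[Nn]+", sequence)]
    return len(runs), sum(runs)


if __name__ == "__main__":
    toy_scaffold = "ACGTNNNNACGTNNAC"  # made-up sequence
    n_runs, n_bases = gap_stats(toy_scaffold)
    print(n_runs, "gaps,", n_bases, "N bases")
```

A scaffold set whose gap count rises while N50 stays flat matches the pattern described above, where merges only added Ns.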
  
====Mitochondrion assembly====
  
Looked for contig that might have been mitochondrial (previous class iteration)
Took reads that mapped to the 2012 consensus sequence
HiSeq SW018 and SW019 reads so far
Mito size estimate: 14 kb
Used Discovar with SW018 data that mapped to the 2012 sequence -> coverage 60X
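Mean coverage is the total number of mapped bases divided by the length of the target. The Python sketch below shows that arithmetic for a 14 kb target; the read count and the 250 bp read length are illustrative assumptions chosen to give 60X, not numbers reported for the SW018 data.

```python
def mean_coverage(read_lengths, target_length):
    """Return mean coverage: total mapped bases divided by target length."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return sum(read_lengths) / target_length


if __name__ == "__main__":
    mito_length = 14_000              # 14 kb estimate from the notes
    mapped_reads = [250] * 3360       # hypothetical: 3360 mapped reads of 250 bp
    print(mean_coverage(mapped_reads, mito_length))
```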
  
Want to use contigs built from read data rather than scaffold
Start with one contig that maps well to mito (use 12kb Discovar 18+19 output)
  
Mito genome does integrate into nuclear genome; over time it mutates and changes sequence, which results in lots of ambiguity in contig construction
Line 68: Line 73:
look at ends of contigs and compare, try to join Ns together
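Comparing contig ends can be sketched as a search for the longest exact match between the end of one contig and the start of the next. The Python example below is a simplified illustration with made-up sequences and hypothetical function names; it requires exact matches and ignores reverse complements and sequencing errors, which a real comparison would need to handle.

```python
def end_overlap(left, right, min_overlap=3):
    """Length of the longest suffix of `left` equal to a prefix of `right`."""
    min_overlap = max(min_overlap, 1)
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def join_contigs(left, right, min_overlap=3):
    """Merge two contigs on their end overlap, or return None if none is found."""
    size = end_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    a = "ACGTTGCA"   # made-up contig
    b = "TGCAGGTT"   # made-up contig sharing "TGCA" with the end of `a`
    print(end_overlap(a, b), join_contigs(a, b))
```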
  
Sea hare is 14kb, usually doesn't include hypervariable region that is very difficult to assemble
lecture_notes/05-20-2015.1432237535.txt.gz · Last modified: 2015/05/21 19:45 by nsaremi