lecture_notes:04-28-2010

Differences

This shows you the differences between two versions of the page.

lecture_notes:04-28-2010 [2010/05/01 22:34]
galt
lecture_notes:04-28-2010 [2010/05/02 09:22]
karplus added workaround for campusrocks filesystem problem
Line 3: Line 3:
 === Misc Notes: === === Misc Notes: ===
  
-**campusrocks is broken!**+**campusrocks is broken!** The head node has the file system mounted as /campusdata, but the client nodes have it mounted as /campus. The workaround is to use the trick in assemblies/Pog/map-colorspace5/Makefile
 +<code>
 +CWD ?= $(subst campusdata,campus,$(shell pwd))
 +</code>
 +Then instead of
 +<code>
 +        qsub -cwd
 +</code> use
 +<code>
 +        qsub -wd ${CWD}
 +</code>
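For anyone submitting jobs outside of make, the same path rewrite can be sketched in plain sh. This is only an illustration of the workaround above, not part of the class Makefile: the function names `to_client_path` and `submit_here` are made up here, and the sketch assumes the only difference between head node and client nodes is the `campusdata` vs. `campus` mount name.

```shell
#!/bin/sh
# Rewrite a head-node path into the path the client nodes see.
# Like $(subst campusdata,campus,...) in make, this replaces every
# occurrence of "campusdata", not just the first one.
to_client_path() {
    printf '%s\n' "$1" | sed 's/campusdata/campus/g'
}

# Hypothetical wrapper: submit a job with the rewritten working directory
# instead of relying on "qsub -cwd".
submit_here() {
    qsub -wd "$(to_client_path "$(pwd)")" "$@"
}
```

Usage would look like `submit_here myjob.sh`, where `myjob.sh` stands in for whatever job script is being submitted.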
  
 //Pog// has 2 repeats: ~1k & 1.1k \\ //Pog// has 2 repeats: ~1k & 1.1k \\
 use makefiles, not shell scripts! use makefiles, not shell scripts!
  
-SOLiD data formats:\\+**Sanger quality info**\\ 
 +Kevin found the location of the Sanger quality info.\\
 +The file extension is .as or something like that.\\
 +There are 3 different files from 3 different runs.\\
 + 
 +**SOLiD data formats**:\\
 .csfasta = colorspace with numbers\\ .csfasta = colorspace with numbers\\
 .de = changes #s to letters (0123 -> ACGT), but the letters are still colors, not bases! Very confusing.\\ .de = changes #s to letters (0123 -> ACGT), but the letters are still colors, not bases! Very confusing.\\
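To make the digit-to-letter relabeling concrete, here is a small sh sketch of the 0123 -> ACGT substitution. It is not the actual tool that writes .de files: the function name `colors_to_letters` is made up, and the notes do not say how the real format treats the leading primer base, so this sketch simply leaves every non-digit character alone.

```shell
#!/bin/sh
# Relabel color digits 0123 as letters ACGT on sequence lines only.
# Lines starting with ">" (read names) or "#" (comments) pass through
# unchanged, so digits in read names are not touched.
# The output letters are still colors, NOT nucleotides.
colors_to_letters() {
    sed '/^[>#]/!y/0123/ACGT/'
}
```

Usage would be something like `colors_to_letters < reads.csfasta`, where `reads.csfasta` is a placeholder file name.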
Line 117: Line 133:
  
 Took about 50 minutes for all.\\ Took about 50 minutes for all.\\
-For comparison, Newbler took 18 minutes and about 40 contigs.+For comparison, Newbler took 18 minutes and produced 31 non-overlapping contigs.
  
-Just qsub them with no arguments, and it runs everything.\\+Just qsub them with no arguments, and it runs everything. ​("​Them"?​ "​it"?​ What does this sentence mean? FIXME  --- //​[[karplus@soe.ucsc.edu|Kevin Karplus]] 2010/05/02 09:14//)
  
  
 === MIRA === === MIRA ===
 +
 +Mostly used the default settings.
 +
 +mira-assembly1/​
 +
 +Running is easy.
 +Parameters: fasta, denovo, and which instruments the reads came from (e.g. 454).
  
 Needs datafile named pog_in.[format].fa \\ Needs datafile named pog_in.[format].fa \\
-sff_extract script to create .qual files+Uses the sff_extract script to create .fasta and .fasta.qual files \\
 +and also the traceinfo_in.454.xml file.
  
-created 30 contigs ​>=500 (largest contig 640k) \\ +Time: 1 hour plus. 
-but... upon mapping to the reference genome, ​ \\+ 
 +Created 621 contigs, 30 larger than 500 (largest contig 640k) \\
 +The 500 cutoff is probably too large.\\
 +100 might be more reasonable.\\
 +Total consensus size is good.\\
 +But... upon mapping to the reference genome, ​ \\
 it turns out that while it is making big contigs, it's producing a chimeric assembly, in which the contigs join genomic regions that are not truly adjacent. it turns out that while it is making big contigs, it's producing a chimeric assembly, in which the contigs join genomic regions that are not truly adjacent.
-it’s getting bigger contigs because it’s joining them incorrectly! \\ +It’s getting bigger contigs because it’s joining them incorrectly! \\ 
-this is very bad; worse even than a lot of small contigs \\+This is very bad; worse even than a lot of small contigs \\ 
 + 
 +Not a de Bruijn graph (DBG) assembler.  Should find out more about how it actually works.\\
 +Good to know how it works so you know what to do with the parameters. 
 + 
 +Newbler may be able to take fasta+qual file. 
 + 
 +Mira might be worth fussing with on the parameters a bit more if it looks like 
 +it is doing a good job. 
 + 
 +Mira probably can't handle large genomes due to memory. 
 +Mira has a tool to estimate memory required. 
 +For a 3.2 Gbp genome it will need 1.1 TB of RAM.
  
  
lecture_notes/04-28-2010.txt · Last modified: 2010/05/02 09:22 by karplus