lecture_notes:04-28-2010

=== Misc Notes: ===
**campusrocks is broken!** \\
//Pog// has 2 repeats: ~1k & 1.1k \\
Use makefiles, not shell scripts!

SOLiD data formats: \\
.csfasta = colorspace, written as numbers \\
.de = changes the numbers to letters (0123 -> ACGT), but the letters still stand for colors, not bases! Very confusing. \\
.fa //is the real basespace//

=== Euler ===
Ran well the first time (it ran, at least). \\
Have to run it from the directory where you installed it. \\
No makefiles.

Result: \\
~2k contigs, which add up to a genome about 2x the expected length. Suspicious. \\
Are the contigs overlapping? \\
//Find out:// \\
Check blat_strict_match (the BLAT alignment of the contigs to the reference genome). \\
Look for "Q name" entries (contigs) that match the same "T start" positions on the reference genome. \\
//Answer:// yes, they appear to overlap a lot. The double coverage comes from contigs that overlap each other completely.

Things to try to improve the run: \\
- longer k-mers \\
- increase the frequency threshold (might help make up for read errors?)

"Error Correction via threading": \\
- took the reads that “they couldn’t make error free” \\
- made contigs out of these \\
- tried to map them back to the “error-free” contigs \\
- perhaps this is where it went wrong?

Tried to run on just the SOLiD data: started on Sunday, still running (Wed).

=== Celera Assembler: ===
Needs qual info (needs it for the Sanger reads, too), so it can't run unless you have the .qual files. \\
There seemed to be a script to convert Illumina -> their format, but it is not released yet.

Result: \\
With 454 data alone: 386 contigs \\
(newbler: ~40 contigs) \\
Took about 50 min.

=== Mira ===
Needs a datafile named pog_in.[format].fa \\
Used the sff_extract script to create the .qual files. \\
Created 30 contigs >=500 (largest contig 640k).

But upon mapping to the reference genome, it turns out that while it is making big contigs, it is producing a chimeric assembly: the contigs join genomic regions that are not truly adjacent. It’s getting bigger contigs because it’s joining them incorrectly! \\
This is very bad; worse even than a lot of small contigs.
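
A rough sketch of the overlap check from the Euler notes above (find contigs that land on the same part of the reference). It assumes the BLAT output is in the standard tab-separated PSL layout, where column 10 is the Q name, column 14 the T name, and columns 16 and 17 the T start and T end. The function names ''parse_psl'' and ''overlapping_contigs'' and the example coordinates are made up for illustration; this is not the script used in class.

```python
from collections import defaultdict


def parse_psl(lines):
    """Yield (q_name, t_name, t_start, t_end) for each PSL alignment line."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header lines have fewer columns or a non-numeric first field: skip them.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        yield fields[9], fields[13], int(fields[15]), int(fields[16])


def overlapping_contigs(hits):
    """Return (contig_a, contig_b, overlap_length) for hits sharing reference bases."""
    by_target = defaultdict(list)
    for q_name, t_name, t_start, t_end in hits:
        by_target[t_name].append((t_start, t_end, q_name))

    pairs = []
    for intervals in by_target.values():
        intervals.sort()  # sort by T start
        for i, (s1, e1, q1) in enumerate(intervals):
            for s2, e2, q2 in intervals[i + 1:]:
                if s2 >= e1:
                    break  # later hits start even further right, so no overlap
                if q1 != q2:
                    pairs.append((q1, q2, min(e1, e2) - s2))
    return pairs


if __name__ == "__main__":
    # Hypothetical hits: contig2 lies entirely inside contig1's span.
    example = [
        ("contig1", "ref", 0, 1000),
        ("contig2", "ref", 100, 900),
        ("contig3", "ref", 2000, 3000),
    ]
    for a, b, n in overlapping_contigs(example):
        print(a, b, n)
```

If most contigs show up in a pair with an overlap close to their own length, that would match the "double coverage" explanation. The same per-contig hit list could also be a starting point for spotting chimeric contigs (one contig hitting far-apart reference regions), though that needs more care than this sketch gives it.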

lecture_notes/04-28-2010.1272576713.txt.gz · Last modified: 2010/04/29 21:31 by learithe