lecture_notes:03-30-2011


====== Brief overview of goals and data input characteristics ======

Kevin laid out some of the logistics of the class.

A broad goal: for each chromosome in the slug, we want the full sequence in DNA bases. Since that is unlikely to be completed within one quarter, some smaller goals are to build contigs and to have an idea of the scaffold in which to arrange them.

Inputs: sequencing data from various machines. Some characteristics of these machines and techniques:

==== Sanger capillary ====
  * ~800bp reads[(cite:wikisanger>http://en.wikipedia.org/wiki/Microfluidic_Sanger_sequencing)].
  * Q (quality value) ~30
  * ~$1/read; expensive because primers must be attached to each read.

==== 454 ====
  * ~400bp reads[(cite:wiki454>http://en.wikipedia.org/wiki/454_Life_Sciences)].
  * Pyrosequencing
  * Q ~20
  * $5000/run/1M reads, no downscaling (numbers approximate).

==== SOLiD ====
  * 2x25bp or 1x50bp reads
  * Paired-end reads: the fragment is ligated to an adapter, and a restriction enzyme cleaves 25bp from the adapter.
  * Potential for double ligation: two unrelated sequences ligating to each other.
  * $2000/run/100M reads (numbers approximate).

==== Illumina ====
  * 2x50bp or 2x100bp reads (lengths uncertain)
  * Paired-end reads
  * Potential errors: innies (ligated region not between the sequenced regions) or chimeric reads (sequence passes through the ligated region)
  * Cheaper than SOLiD; the 10K Genomes project uses it.

==== Ion Torrent ====

==== Pac Bio ====
  * Very long, single-molecule reads (~10Kbp)
  * High error rates (~5%)
  * Useful when mapping to a reference.

===== References =====
<refnotes>notes-separator: none</refnotes>
~~REFNOTES cite~~
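
The quality values and approximate prices listed for the platforms above can be compared more directly with a little arithmetic. The sketch below is not from the lecture. It assumes that Q is a Phred-scaled quality value, so the per-base error probability is 10^(-Q/10), and that a SOLiD run's 100M reads each yield 50bp. The ''PLATFORMS'' table and the function names are hypothetical, and the prices are the rough figures from the notes.

```python
import math

# Rough figures copied from the notes; all are approximate.
# Assumption: each SOLiD read contributes 50bp (1x50 or 2x25).
PLATFORMS = {
    "Sanger": {"read_bp": 800, "cost_per_read": 1.0},
    "454": {"read_bp": 400, "cost_per_read": 5000 / 1_000_000},
    "SOLiD": {"read_bp": 50, "cost_per_read": 2000 / 100_000_000},
}


def phred_error_probability(q):
    """Per-base error probability, assuming Q is Phred-scaled."""
    return 10 ** (-q / 10)


def phred_from_error_rate(p):
    """Inverse of the above: the Q value matching an error rate p."""
    return -10 * math.log10(p)


def cost_per_base(read_bp, cost_per_read):
    """Dollars per sequenced base for one read."""
    return cost_per_read / read_bp


if __name__ == "__main__":
    for name, info in PLATFORMS.items():
        dollars = cost_per_base(info["read_bp"], info["cost_per_read"])
        print(f"{name}: ${dollars:.2e} per base")
    for q in (20, 30):
        print(f"Q{q}: error probability {phred_error_probability(q):.4f}")
```

Under the Phred assumption, Q ~30 (Sanger) means roughly one wrong base call in 1000 and Q ~20 (454) roughly one in 100. By the same conversion, the ~5% error rate noted for Pac Bio corresponds to a Q of about 13.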

lecture_notes/03-30-2011.1301629550.txt.gz · Last modified: 2011/04/01 03:45 by svohr