Brief overview of goals and data input characteristics

Kevin laid out some of the logistics of the class.

The broad goal: for each chromosome in the slug, obtain the full sequence of DNA bases. Since that is unlikely to be completed within one quarter, the smaller goals are to build contigs and to get an idea of the scaffold in which to arrange them.
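To make "building contigs" concrete, here is a toy sketch of joining overlapping reads into one longer sequence. It assumes error-free reads given in left-to-right order and uses exact suffix/prefix matching. The function names `overlap_length` and `merge_reads` and the example reads are made up for illustration; real assemblers must also handle sequencing errors, repeats, and reverse complements.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads into one contig if they overlap, else return None."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Hypothetical reads, already sorted by position for simplicity.
reads = ["ATGGCGTAC", "GTACCTTGA", "TTGAGGC"]
contig = reads[0]
for read in reads[1:]:
    merged = merge_reads(contig, read)
    if merged is not None:
        contig = merged
print(contig)
```

Scaffolding is the next step up: contigs that cannot be joined by overlap are ordered and oriented relative to each other, typically using paired reads that land in different contigs.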

Inputs: Sequencing data from various machines. Some of the characteristics of these machines/techniques:

Sanger capillary

  • ~800bp reads[1].
  • Q (quality value) ~30
  • ~$1/read, expensive because primers must be attached to each read.
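The quality values quoted in these notes (Q ~30, Q ~20) can be turned into error rates, assuming they are on the usual Phred scale, where Q = -10 log10(p) and p is the probability that a base call is wrong. The sketch below also assumes uniform quality along a read, which real reads do not have; the function names are illustrative only.

```python
def phred_to_error_prob(q):
    """Probability that a base call is wrong, given its Phred quality value."""
    return 10 ** (-q / 10)


def expected_errors(q, read_length):
    """Expected number of miscalled bases in a read of uniform quality q."""
    return phred_to_error_prob(q) * read_length


# Nominal values taken from the notes: Q30 at ~800bp and Q20 at ~400bp.
for label, q, length in [("Q30, 800bp read", 30, 800), ("Q20, 400bp read", 20, 400)]:
    print(label, phred_to_error_prob(q), expected_errors(q, length))
```

On this scale Q30 means about one error per 1000 bases and Q20 about one per 100, so a difference of 10 in Q is a tenfold difference in error rate.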


454

  • ~400bp reads[2].
  • Pyrosequencing
  • Q ~20
  • ~$5000/run for ~1m reads; a run cannot be scaled down (numbers approximate).


SOLiD

  • 2x25bp or 1x50bp reads
  • Paired-end reads: fragments are ligated to an adapter, then a restriction enzyme cuts 25bp away from the adapter.
  • Potential for double ligation, in which two unrelated sequences ligate together.
  • ~$2000/run for ~100m reads (numbers approximate).
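The approximate prices above are easier to compare as cost per sequenced base. The following is a rough sketch using only the figures in these notes. It assumes nominal read lengths, counts each "100m reads" entry as 50bp of sequence, and ignores quality differences and library preparation costs. The dictionary keys and the function name `cost_per_megabase` are illustrative.

```python
# Approximate figures copied from the notes; read lengths are nominal.
PLATFORMS = {
    "Sanger capillary": {"cost": 1.0, "reads": 1, "read_bp": 800},
    "pyrosequencing (~400bp reads)": {"cost": 5000.0, "reads": 1_000_000, "read_bp": 400},
    "2x25bp / 1x50bp reads": {"cost": 2000.0, "reads": 100_000_000, "read_bp": 50},
}


def cost_per_megabase(cost, reads, read_bp):
    """Dollars per million sequenced bases, ignoring quality and overhead."""
    return cost / (reads * read_bp) * 1_000_000


for name, figures in PLATFORMS.items():
    print(f"{name}: ${cost_per_megabase(**figures):,.2f} per Mbp")
```

With these inputs the per-base cost drops by orders of magnitude from capillary sequencing to the short-read platform, at the price of much shorter reads and, for pyrosequencing, lower quality values.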


Ion Torrent

Pac Bio


lecture_notes/03-30-2011.1301527900.txt.gz · Last modified: 2011/03/30 16:31 by eyliaw