SeqPrep lecture notes

Phred:

Systematic base-calling biases exist in Illumina data, i.e. a position can be read as one base when sequenced in one direction and as a different base when sequenced in the other.

How does SeqPrep do error correction?

Quality scores: Q_i = -10 * log_10(P(base i is wrong)). Given that the called base X_i is x, and writing Y for the true base:

  Q_i = -10 * log_10(1 - P(Y=x | X_i=x))
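
A minimal Python sketch of the Phred definition above, converting between a quality score Q and the error probability P(wrong). The function names are made up for illustration and are not taken from SeqPrep.

```python
import math


def phred_to_error_prob(q):
    """Error probability P(wrong) for a Phred quality score Q.

    Inverts Q = -10 * log_10(P), giving P = 10^(-Q/10).
    """
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score Q for an error probability P(wrong), 0 < P <= 1."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)
```

For example, Q = 30 corresponds to an error probability of 0.001, i.e. P(Y=x | X_i=x) = 0.999.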

Quality of our base call based on an alignment: Q1 (quality of read 1), Q2 (quality of read 2), Qa (alignment quality).

If we assume that the alignment is perfect: