lecture_notes:05-04-2015

Why a mate-pair library? Long-distance information to span contig gaps.

Problems:

  • low-complexity data
  • Lucigen mate-pair kit not very user friendly
  • only 5-10% of your DNA gets circularized like it's supposed to
  • only 10% of circular DNA contains junctions
  • less than 1 ng of every µg of input DNA is actual usable data
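
As a rough check on the efficiencies listed above, the two quoted fractions can be multiplied to bound how much junction-containing DNA survives. This is only a sketch: the function name and the assumption that the two steps multiply independently are illustrative, not from the lecture. It ignores any further downstream losses, which would push the usable amount lower still.

```python
def junction_dna_ng(input_ng, circularized_fraction, junction_fraction):
    """Upper bound on junction-containing DNA, assuming the two steps
    are independent (an assumption, not stated in the notes)."""
    return input_ng * circularized_fraction * junction_fraction


INPUT_NG = 1000.0  # 1 µg of starting DNA

# 5-10% circularizes correctly; 10% of circular DNA contains junctions.
low = junction_dna_ng(INPUT_NG, 0.05, 0.10)
high = junction_dna_ng(INPUT_NG, 0.10, 0.10)

print(f"junction-containing DNA per µg: {low:.1f} to {high:.1f} ng (upper bound)")
```

Even before other losses, these two steps alone leave at most about 1% of the input, which is why starting with lots of DNA matters.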

How to compensate:

  • start with lots of DNA
  • more efficient molecular biology
  • use Tn5 (from Chris Vollmers)
    • recognizes and loads specific sequence (an adapter)
    • cuts the DNA and ligates an adapter in the same step
    • very efficient - all sheared DNA has adapters
    • add a biotinylated linker to the end of the adapters
    • 2×75 data

Other Issues:

  • Lots of linker near the beginning of the reads
  • those reads need to be filtered out
  • We want AT LEAST 30bp of non-linker at the beginning
  • For Tn5 data, linker sequence is more likely to be farther into the read
  • That's a good thing! Almost always have at least 30bp before linker
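
The 30bp rule above can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in class: the linker sequence is a made-up placeholder, the function name is hypothetical, and only exact matches to the full linker are detected (real data would need tolerance for sequencing errors and partial linkers).

```python
MIN_NON_LINKER = 30  # from the notes: at least 30bp before the linker

# Placeholder sequence for illustration only; substitute the real linker.
EXAMPLE_LINKER = "ACGTTGCAGGTCCATG"


def trim_at_linker(read, linker, min_prefix=MIN_NON_LINKER):
    """Trim a read at the first exact linker match.

    Returns the sequence before the linker if it is at least min_prefix
    long, None if the linker starts too early, and the read unchanged
    if no linker is found.
    """
    pos = read.find(linker)
    if pos == -1:
        return read  # no linker seen; how to treat these is a separate choice
    if pos < min_prefix:
        return None  # too little non-linker sequence to be useful
    return read[:pos]


good = "A" * 35 + EXAMPLE_LINKER + "C" * 10
bad = "A" * 10 + EXAMPLE_LINKER + "C" * 35

print(trim_at_linker(good, EXAMPLE_LINKER))
print(trim_at_linker(bad, EXAMPLE_LINKER))
```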

What to do about reads where you don't see any linker?

  • might want to throw them out because we're not confident that they're actually mate pairs
  • but that throws out tons of data if you are sequencing less than 2×300

How to avoid chimeric circular DNA?

  • can't run it on a gel (circular DNA smears)
  • adjust insert size to ~4kb - chimeras are large and unlikely to circularize properly

2×75 data with long linker (60bp): we'll probably not read all the way through the linker, but we'll see bits of it.
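
With 75bp reads and at least 30bp of non-linker sequence up front, at most 45bp of a 60bp linker can appear, so the linker will show up as a prefix running off the 3' end of the read. One way to catch those bits is to look for the longest linker prefix that matches the read's suffix. The sketch below assumes exact matching and an arbitrary minimum overlap of 8bp; both choices, the function name, and the placeholder linker are illustrative assumptions.

```python
def linker_start(read, linker, min_overlap=8):
    """Return the index where the linker (full or truncated) begins in
    the read, or -1 if none is found.

    A truncated linker is a prefix of the linker that runs to the end
    of the read. min_overlap is an assumed threshold to limit chance
    matches.
    """
    pos = read.find(linker)
    if pos != -1:
        return pos  # the whole linker is inside the read
    longest = min(len(read), len(linker) - 1)
    # Try the longest possible overlap first, then shorter ones.
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(linker[:k]):
            return len(read) - k
    return -1


# Placeholder 16bp linker for illustration; the real one is about 60bp.
linker = "ACGTTGCAGGTCCATG"
read = "T" * 63 + linker[:12]  # 75bp read ending in 12bp of linker

start = linker_start(read, linker)
print(start, read[:start] if start != -1 else read)
```

Reads where the returned index is under 30 would then be filtered out, the same as reads containing a full linker too close to the start.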

lecture_notes/05-04-2015.txt · Last modified: 2015/05/08 11:55 by almussel