Genome Annotation



  • Repeat masking is done to facilitate conventional gene annotation efforts.
  • It also helps avoid false SNP calls and mapping ambiguities.
  • Hard masking: replacing repeats with Ns, e.g. ACGTNNNNNNNNNATGG
  • Soft masking: replacing repeats with lowercase letters, e.g. ACGTtagtagtagATGG
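
The two masking styles above can be sketched in a few lines of Python. This is an illustration only, not how RepeatMasker or any other tool implements masking; the function names and the half-open (start, end) repeat coordinates are assumptions made for the example.

```python
def soft_mask(seq, repeats):
    """Lowercase every repeat interval; all other bases stay uppercase.

    `repeats` is a list of half-open (start, end) positions. The coordinates
    are hypothetical; a real run would take them from a repeat finder.
    """
    chars = list(seq.upper())
    for start, end in repeats:
        chars[start:end] = [base.lower() for base in chars[start:end]]
    return "".join(chars)


def hard_mask(soft_masked):
    """Turn a soft-masked sequence into a hard-masked one (lowercase -> N)."""
    return "".join("N" if base.islower() else base for base in soft_masked)


soft = soft_mask("ACGTTAGTAGTAGATGG", [(4, 13)])
hard = hard_mask(soft)
```

Soft masking keeps the underlying sequence, so it can always be converted to hard masking later; the reverse is not possible because hard masking discards the repeat bases.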

Repeat Annotation:

  • Different types of repeats can be studied along with their levels of activity (evolutionary analyses)

Types of Repeats:

  • Low-complexity sequence: microsatellites, homopolymers, etc.
  • Transposable elements:
    • Class 1: retrotransposons; “copy & paste”; LTR, LINEs, SINEs
    • Class 2: DNA transposons; “cut & paste”; subclass 1 and subclass 2
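
Low-complexity repeats such as homopolymers and microsatellites are simple enough to find with a regular expression. The sketch below is a toy illustration, not the method used by any of the tools in these notes; the thresholds (motif of 1 to 6 bp, at least 4 tandem copies) and the function name are assumptions chosen for the example. Transposable elements need homology or de novo methods and are not covered by it.

```python
import re

# Assumed thresholds, for illustration only: a motif of 1-6 bp followed by
# at least 3 more identical copies (so 4 or more tandem copies in total).
TANDEM = re.compile(r"([ACGT]{1,6}?)\1{3,}")


def find_simple_repeats(seq):
    """Return (start, end, motif, copies) for each tandem run found.

    Coordinates are half-open. A motif length of 1 is a homopolymer;
    longer motifs are microsatellite-like runs.
    """
    hits = []
    for match in TANDEM.finditer(seq.upper()):
        motif = match.group(1)
        copies = (match.end() - match.start()) // len(motif)
        hits.append((match.start(), match.end(), motif, copies))
    return hits


microsatellite_hits = find_simple_repeats("ACGT" + "TAG" * 4 + "ATGG")
homopolymer_hits = find_simple_repeats("GGAAAAAACC")
```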

Repeat Content

  • Does not necessarily correlate with genome size
  • Some correlation is seen among genomes within the same group


  • Homology-based: RepeatMasker
  • De novo: RepeatModeler, WindowMasker, RepeatScout, Piler
  • De novo from reads: REPdenovo, TEDNA
  • NOTE: de novo tools run the risk of false positives, since highly conserved protein-coding genes can be reported as repeats.

Gene Annotation

Evidence-driven Annotation

  • Uses protein information, ESTs, and RNA-Seq as evidence

Ab initio Gene Prediction

  • Doesn't require evidence data
  • Requires training for the organism of interest
  • Most predictors find only the single most likely CDS
  • Do not report UTRs (incomplete gene models)
  • Do not accommodate alternative splice forms (spliceoforms)
  • Require a high-quality assembly (scaffold N50 ≈ average gene size)
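
The N50 rule of thumb in the last bullet can be checked from a list of scaffold lengths. Below is a minimal sketch of the standard N50 definition (the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly). The function name and example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths, in bp
scaffold_n50 = n50([80, 70, 50, 40, 30, 20])
```

Comparing this value with the average gene length of the organism gives a quick indication of whether most genes are likely to sit on a single scaffold.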

Combined Approach

  • The main challenge is collating different gene models and sources of evidence.

Annotation Metrics

  • Sensitivity, specificity, accuracy, AED (Annotation Edit Distance)
  • AED = 1 - ACC = 1 - 0.5 × (sensitivity + specificity)
  • AED is useful for identifying low-quality or inconsistent annotations (these can be manually curated later)
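
As a worked example of the AED formula above, here is a minimal base-level sketch. It assumes the gene-prediction convention in which sensitivity is the fraction of evidence bases covered by the annotation and specificity is the fraction of annotated bases supported by evidence. The function names and the half-open exon coordinates are hypothetical, and annotation pipelines may compute this differently in detail.

```python
def covered_bases(exons):
    """Set of positions covered by half-open (start, end) exon intervals."""
    return {pos for start, end in exons for pos in range(start, end)}


def aed(annotation_exons, evidence_exons):
    """AED = 1 - 0.5 * (sensitivity + specificity), at the base level."""
    annotated = covered_bases(annotation_exons)
    evidence = covered_bases(evidence_exons)
    if not annotated or not evidence:
        return 1.0  # nothing to compare, so treat as unsupported
    shared = len(annotated & evidence)
    sensitivity = shared / len(evidence)   # evidence bases that are annotated
    specificity = shared / len(annotated)  # annotated bases backed by evidence
    return 1.0 - 0.5 * (sensitivity + specificity)


perfect = aed([(0, 100)], [(0, 100)])
half_shifted = aed([(0, 100)], [(50, 150)])
unsupported = aed([(0, 100)], [(200, 300)])
```

An AED of 0 means the annotation agrees exactly with the evidence, and 1 means there is no supporting overlap, so models with high AED are the natural candidates for manual curation.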


  • Pipelines: MAKER2, PASA, Ensembl, NCBI
  • Evidence mapping: BLAST/BLAT, Exonerate (computationally expensive)
  • Ab initio gene predictors: Augustus, SNAP, GeneMark
  • Choosers and combiners: JIGSAW, GLEAN
  • Visualization & curation: Artemis, Apollo, JBrowse, IGV
lecture_notes/05-27-2015.txt · Last modified: 2015/05/28 12:48 by emfeal