Evidence-driven Annotation
Ab initio Gene Prediction
does not require evidence data
requires training for the organism of interest
most predictors find only the single most likely CDS
do not report UTRs (incomplete gene models)
do not accommodate alternative splice isoforms
requires a high-quality assembly (scaffold N50 ≈ average gene size or larger)
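The N50 rule of thumb above can be checked with a few lines of Python. This is a minimal sketch, not part of any annotation pipeline: `n50` and `n50_covers_typical_gene` are hypothetical helper names, and the scaffold lengths and gene size are made-up illustration values.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("cannot compute N50 of an empty assembly")


def n50_covers_typical_gene(scaffold_lengths: Sequence[int], avg_gene_size: int) -> bool:
    """Rule of thumb from the notes (threshold is an assumption):
    scaffold N50 should be about the average gene size or larger."""
    return n50(scaffold_lengths) >= avg_gene_size


# Made-up scaffold lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
print(n50_covers_typical_gene(toy_scaffolds, avg_gene_size=60))  # prints True
```

If the check fails, many genes are likely to be split across scaffolds, so predicted gene models will tend to be fragmented regardless of the predictor used.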
Combined Approach
Annotation Metrics
Sensitivity, specificity, accuracy, annotation edit distance (AED)
AED = 1 - ACC = 1 - 0.5 × (sensitivity + specificity)
AED is useful for identifying low-quality annotations that are inconsistent with the evidence (these can be manually curated later)
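The AED formula can be illustrated with a small nucleotide-level sketch. The notes do not define how sensitivity and specificity are measured, so this example assumes the common gene-prediction convention: sensitivity is the fraction of evidence bases covered by the annotation, and specificity is the fraction of annotated bases supported by evidence. The function names and coordinates are hypothetical.

```python
from typing import Iterable, Set, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def covered_bases(intervals: Iterable[Interval]) -> Set[int]:
    """Return the set of base positions covered by the given intervals."""
    bases: Set[int] = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(annotation: Iterable[Interval], evidence: Iterable[Interval]) -> float:
    """AED = 1 - 0.5 * (sensitivity + specificity), computed per base.

    Assumed definitions (not stated in the notes):
      sensitivity = shared bases / evidence bases
      specificity = shared bases / annotated bases
    """
    ann = covered_bases(annotation)
    evi = covered_bases(evidence)
    if not ann or not evi:
        return 1.0  # nothing to compare against: treat as unsupported
    shared = len(ann & evi)
    sensitivity = shared / len(evi)
    specificity = shared / len(ann)
    return 1.0 - 0.5 * (sensitivity + specificity)


# Made-up exon coordinates, for illustration only.
toy_annotation = [(101, 200), (301, 400)]  # 200 annotated bases
toy_evidence = [(151, 200), (301, 450)]    # 200 evidence bases, 150 shared
print(aed(toy_annotation, toy_evidence))   # prints 0.25
```

An AED of 0 means the annotation matches the evidence exactly, and an AED of 1 means there is no supporting overlap, so sorting gene models by AED gives a simple queue for manual curation.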
Tools
Pipelines: MAKER2, PASA, Ensembl, NCBI
Evidence Mapping: BLAST/BLAT, Exonerate (computationally expensive)
Ab initio Gene Predictors: Augustus, SNAP, GeneMark
Choosers and Combiners: JIGSAW, GLEAN
Visualization & Curation: Artemis, Apollo, JBrowse, IGV