lecture_notes:05-27-2015

Differences

This shows you the differences between two versions of the page.

lecture_notes:05-27-2015 [2015/05/28 12:27]
emfeal
lecture_notes:05-27-2015 [2015/05/28 12:48] (current)
emfeal
Line 1:
 ====== Genome Annotation ======
-===== Repeat Annotation =====
+===== Repeats =====
-==== Masking ====
+Masking:
   * Done to facilitate conventional gene annotation efforts.
   * Helps avoid false SNP calls and mapping ambiguities.
   * Hard Masking: replacing repeats with Ns {ACGTNNNNNNNNNATGG}
   * Soft Masking: replacing repeats with lowercase {ACGTtagtagtagATGG}
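The two masking styles above can be illustrated with a minimal Python sketch. The function name `mask_repeats` and the repeat coordinates are hypothetical; in practice the intervals would come from a repeat-finding tool rather than being typed in by hand.

```python
def mask_repeats(seq, intervals, hard=False):
    """Mask 0-based, half-open (start, end) repeat intervals in seq.

    hard=True replaces repeat bases with N (hard masking);
    hard=False lowercases them (soft masking).
    """
    chars = list(seq.upper())
    for start, end in intervals:
        # Clamp the interval so it never runs past the sequence ends.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = "N" if hard else chars[i].lower()
    return "".join(chars)


seq = "ACGTTAGTAGTAGATGG"
repeats = [(4, 13)]  # hypothetical coordinates of the TAG repeat

print(mask_repeats(seq, repeats, hard=True))   # ACGTNNNNNNNNNATGG
print(mask_repeats(seq, repeats, hard=False))  # ACGTtagtagtagATGG
```

Soft masking keeps the original bases recoverable, whereas hard masking discards them.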
-==== Repeat Annotation ====
+Repeat Annotation:
   * Different types of repeats can be studied along with their levels of activity (evolutionary analyses)
- === Types of Repeats ===
+Types of Repeats:
   * low-complexity sequence: microsatellites, homopolymers, etc.
-  == Transposable Elements ==
+  Transposable Elements:
-  * class 1: retrotransposon; "copy & paste"; LTR, LINES, SINES
+  *   * class 1: retrotransposon; "copy & paste"; LTR, LINES, SINES
-  * class 2: DNA transposons; "cut & paste"; subclass 1 and subclass 2
+  *   * class 2: DNA transposons; "cut & paste"; subclass 1 and subclass 2
- === Repeat Content ===
+Repeat Content
   * Does not necessarily correlate with genome size
   * some correlation within the same group
- === Tools ===
+Tools
   * Homology: RepeatMasker
   * de novo: RepeatModeler, WindowMasker, RepeatScout, Piler
Line 22:
   * NOTE: de novo tools run the risk of false positives from highly conserved protein-coding genes.
 ===== Gene Annotation =====
-==== Evidence-driven Annotation ====
+Evidence-driven Annotation
   * protein information, EST, **RNA-Seq**
-==== Ab initio Gene Prediction ====
+Ab initio Gene Prediction
   * doesn't require evidence data
   * requires training for organism of interest
Line 31:
   * does not accommodate spliceoforms
   * requires high-quality assembly (scaffold N50 ≈ avg gene size)
-==== Combined Approach ====
+Combined Approach
   * challenge of collating different models and sources of evidence.
-==== Annotation Metrics ====
+Annotation Metrics
   * Sensitivity, specificity, accuracy, AED
   * AED = 1 - ACC = 1 - 0.5 * (sensitivity + specificity)
   * AED useful for identifying low-quality, inconsistent annotations (can be manually curated later)
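The AED formula above can be worked through in a minimal Python sketch. It assumes sensitivity and specificity are measured at the nucleotide level as overlap fractions between an annotation and its supporting evidence; the notes do not define them, so this definition, the helper names, and the coordinates are all assumptions.

```python
def covered_positions(intervals):
    """Expand 0-based, half-open (start, end) intervals into a set of positions."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end))
    return positions


def aed(annotation, evidence):
    """AED = 1 - 0.5 * (sensitivity + specificity), from nucleotide overlap."""
    ann = covered_positions(annotation)
    evi = covered_positions(evidence)
    if not ann or not evi:
        return 1.0  # assumption: nothing to compare means no support
    overlap = len(ann & evi)
    sensitivity = overlap / len(evi)  # fraction of the evidence the annotation covers
    specificity = overlap / len(ann)  # fraction of the annotation the evidence supports
    return 1 - 0.5 * (sensitivity + specificity)


# Hypothetical exon coordinates: the second exon is only half supported.
annotation = [(100, 200), (300, 400)]
evidence = [(100, 200), (300, 350)]
print(aed(annotation, evidence))  # 0.125
```

An AED of 0 means the annotation and evidence agree exactly; values near 1 flag annotations with little support, which are the candidates for manual curation.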
-
-
-
+Tools
+  * Pipelines: Maker2, Pasa, Ensembl, NCBI
+  * Evidence Mapping: BLAST/BLAT, Exonerate (computationally expensive)
+  * ab initio gene predictors: Augustus, SNAP, GeneMark
+  * Choosers and Combiners: JigSaw, Glean
+  * Visualization & Curation: Artemis, Apollo, JBROWSE, IGV
lecture_notes/05-27-2015.txt · Last modified: 2015/05/28 12:48 by emfeal