Banana Slug Genomics



Lecture Notes 4/6/2015

Note Taker: Christopher Kan

A road map to the de novo assembly of the Banana Slug Genome - Stefan Prost

De novo vs. reference genome
- A reference can be biased by its own assembly, e.g. some areas may not be annotated or reads may not be available.
- De novo costs more

Scaffolds and contigs
- Contigs have little to no gaps
- Scaffolds can have missing regions, but the linear order of the contigs within each scaffold is known
- N50s for scaffolds and contigs are used as quality measures.
  * Sort the scaffolds or contigs from largest to smallest and sum their sizes until you reach 1/2 the total length of the assembly. The size of the last piece added is the N50. It's a way to obtain a median-esque measure of assembly quality.
- Ideally # scaffolds = # chromosomes

Definition: k-mer
- Short element of DNA of a fixed length k
- The elements can overlap
- Used by assemblers to summarize data

A priori knowledge of a genome
- Expected genome size
  * C-values from www.genomesize.com
  * The C-value is the haploid genome size in picograms
  * 1 pg ≈ 980 Mb (more precisely about 978 Mb)
  * Depending on the clade, information from related genomes can be used to provide a priori knowledge
    * Some clades have low variation and high synteny, e.g. birds
  * Genomes of 6-7 Gb become difficult
- Databases
  * www.gigadb.org
  * NCBI Genome
- Expected repeat content
  * Correlated with genome size
  * Small repeats and pseudogenes, genome duplications
- Expected heterozygosity
- Haploid, diploid, or polyploid?
  * No assembler can currently assemble polyploid genomes

Sequencing Technology
- 1st Gen
  * Sanger
- 2nd Gen (PCR needed)
  * Illumina
    * It took me a long time to understand how this works; these two videos helped me: [[https://www.youtube.com/playlist?list=PLfvYDg0hWvoqfF9z7bw7Zizeenj620r5c| Link]]
  * Roche 454
  * Ion Torrent
  * ABI SOLiD
- 3rd Gen (single-molecule sequencing)
  * HeliScope
  * PacBio RS II
    * Problems
      * Polymerase needs to be fast with low error
      * Poor yield from cell
        * Need to wash with low concentration to ensure most cells only have one molecule
      * Insertions and deletions: a base is missed, or one hangs around
        * These errors are random; this property is used for error correction
    * Light emission at the time of nucleotide incorporation
    * Real time; allows discernment of the 3D structure of the molecule based on the time between incorporations
    * Can circularize small DNA fragments and get multiple reads of ~3 kb, possibly up to 8 kb
  * MinION and GridION
    * Sequences by taking the molecule apart
    * The nanopore lets the molecule through based on a salt gradient
    * Sequence as the molecule goes through
      * The molecule is held by an exonuclease, which clips off one nucleotide at a time
      * Measure the charge at the nanopore
    * OR sequence the molecule as it goes through while it is held by a helicase
    * Some systematic errors, which are harder to correct
    * Can use a hairpin to run both sides of the DNA, so it is effectively paired
    * Hypothetically no restriction on size

Issues with 3rd Gen
- High error
- High cost
- Error correction is very computationally expensive

Note Taker: XXX
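The N50 rule, the k-mer definition, and the C-value conversion from the notes above can be sketched in a few lines of Python. This is only an illustration, not how any particular assembler does it: the function names (`n50`, `count_kmers`, `c_value_to_mb`) and the example contig lengths are made up for this sketch, and the conversion factor is the rounded 980 Mb per pg figure from the lecture.

```python
from collections import Counter

# Rounded conversion factor from the lecture notes (assumption: 1 pg ~ 980 Mb).
MB_PER_PG = 980


def n50(lengths):
    """N50 of a list of contig or scaffold lengths.

    Sort from largest to smallest, sum until the running total reaches
    half of the total assembly length, and return the last length added.
    """
    if not lengths:
        raise ValueError("need at least one contig or scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def c_value_to_mb(c_value_pg):
    """Convert a C-value in picograms to an approximate genome size in Mb."""
    return c_value_pg * MB_PER_PG


if __name__ == "__main__":
    # Hypothetical contig lengths, purely for illustration.
    contigs = [80, 70, 50, 40, 30, 20]
    print("N50:", n50(contigs))
    print("3-mers:", count_kmers("ACGTACG", 3))
    print("Genome size (Mb) for 2 pg:", c_value_to_mb(2))
```

Summing from the largest piece down is what makes N50 behave like a length-weighted median: a handful of long scaffolds raises it far more than many short ones.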
