Differences

This shows you the differences between two versions of the page.

--- lecture_notes:04-08-2015 [2015/04/09 23:05]
sihussai
+++ lecture_notes:04-08-2015 [2015/04/17 22:34] (current)
sihussai fixing capitalization
@@ Line 1: / Line 1: @@
-=====De novo Assembly II=====
+======De novo assembly II======
 **Guest lecturer: Stefan Prost, stefan.prost@berkley.edu**
-====Illumina Paired-end Sequencing Libraries====
+=====Illumina paired-end sequencing libraries=====
   * MiSeq has 300 bp reads
   * Paired ends read from both directions, so you get one read for each end (may or may not overlap depending on molecule and read size)
@@ Line 14: / Line 14: @@
     * not enough info for scaffolding
-====Illumina Mate-Pair Sequencing Libraries===
+=====Illumina mate-pair sequencing libraries=====
   * Idea: get paired reads that are much farther away (for more scaffolding info)
   * Basically, the same idea as paired-ends, except the middle section is missing and the ends are oriented the opposite way.
@@ Line 34: / Line 34: @@
-====BAC (Bacterial Artificial Chromosome) and Fosmid Libraries====
+=====BAC (Bacterial Artificial Chromosome) and fosmid libraries=====
   * Uncommon and expensive, but the gold standard
   * Bacterial F-plasmid takes an insert size of < 40 kb
@@ Line 40: / Line 40: @@
   * http://www.scq.ubc.ca/wp-content/plasmidtext.gif
-====Read Quality Assessment====
+=====Read quality assessment=====
   * Base quality: Phred scores reported by sequencer.
   * FASTQ files: FASTA files plus encoded Phred scores
@@ Line 49: / Line 49: @@
   * PacBio (Pacific Biosciences) doesn’t have GC/AT bias
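
To make the Phred encoding concrete, here is a minimal Python sketch that decodes a FASTQ quality string. It assumes the Phred+33 offset (the common Sanger / Illumina 1.8+ convention; older Illumina files may use +64), and the function names are illustrative, not from the lecture.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 is an assumption (Phred+33 encoding).
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Convert a Phred score Q into the probability the base call is wrong."""
    return 10 ** (-q / 10)


def mean_error_rate(quality_line, offset=33):
    """Average per-base error probability over one read."""
    scores = phred_scores(quality_line, offset)
    return sum(error_probability(q) for q in scores) / len(scores)
```

For example, under Phred+33 the character ''I'' decodes to Q40 (1 error in 10,000 calls) and ''5'' decodes to Q20 (1 error in 100 calls).
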
-===Tools===
+====Tools====
   * FastQC (Most popular tool to tell you about the read library)
     * FastQC reported an issue with our data with kmer count (related to adapter content)
@@ Line 56: / Line 56: @@
     * Estimates how difficult the assembly will be
-====Estimating Genome Size from Read Data====
+=====Estimating genome size from read data=====
 	G = (p * n * (l - k + 1)) / λ_k
 	G = Genome size
@@ Line 66: / Line 66: @@
   * To estimate genome size, you need to know: i) the total number of reads; ii) the length of the reads; and iii) the kmer distribution
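
The estimate can be sketched in Python as the total number of kmers in the reads divided by the peak kmer depth. This assumes n is the number of reads, l is the read length, and λ_k is the depth at the main peak of the kmer distribution; p is treated here as a plain scaling factor defaulting to 1, since its definition is not shown in this part of the notes. The function and parameter names are hypothetical.

```python
def estimate_genome_size(n_reads, read_length, k, peak_depth, p=1.0):
    """Estimate genome size G from read counts and the kmer depth peak.

    Each read of length l contributes (l - k + 1) kmers, so the reads hold
    n * (l - k + 1) kmers in total; dividing by the peak kmer depth gives G.
    p is an assumed scaling factor (default 1.0).
    """
    if read_length < k:
        raise ValueError("read_length must be at least k")
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    total_kmers = n_reads * (read_length - k + 1)
    return p * total_kmers / peak_depth
```

With 1,000,000 reads of 100 bp, k = 21, and a kmer depth peak at 40, each read holds 80 kmers, so the estimate is 80,000,000 / 40 = 2,000,000 bp.
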
-====Error Correction====
+=====Error correction=====
   * A large number of low-count kmers usually indicates sequencing errors
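
As a rough illustration of that idea, the Python sketch below counts kmers across reads and flags the rare ones as likely errors. The ''min_count'' cutoff of 2 is an assumption for this toy example; in practice the cutoff is usually chosen from the valley of the kmer spectrum, and real error correctors do considerably more than this.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every kmer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct kmers seen that often."""
    return Counter(counts.values())


def likely_error_kmers(counts, min_count=2):
    """Return kmers seen fewer than min_count times (assumed cutoff)."""
    return {kmer for kmer, c in counts.items() if c < min_count}
```

For the reads ''ACGTAC'', ''ACGTAC'', and ''ACGTTC'' with k = 4, the kmers ''CGTT'' and ''GTTC'' occur only once and are flagged, which matches the single-base difference in the third read.
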
 **Simulated contig length in the k-de Bruijn graph can be used to estimate the best kmer size for assembly, based on contig N50 length.**
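
For reference, contig N50 can be computed as in the Python sketch below: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the summed contig length. The helper ''best_k_by_n50'' and its input dictionary (kmer size mapped to simulated contig lengths) are hypothetical and only illustrate the comparison; how the contig lengths are simulated is not covered here.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def best_k_by_n50(lengths_by_k):
    """Pick the kmer size whose (simulated) contigs give the largest N50.

    lengths_by_k is a hypothetical dict: {k: [contig lengths for that k]}.
    """
    return max(lengths_by_k, key=lambda k: n50(lengths_by_k[k]))
```

For contigs of 100, 80, 60, 40, and 20 bp the total is 300 bp, and the running sum reaches 150 bp at the second contig, so the N50 is 80.
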

Banana Slug Genomics
