lecture_notes:06-02-2010

Differences

This shows you the differences between two versions of the page.

lecture_notes:06-02-2010 [2010/06/02 22:15]
hyjkim created
lecture_notes:06-02-2010 [2010/06/06 22:32] (current)
galt more rationale
Line 1: Line 1:
 +====== Sea Hare and Panda ======
 +
 +Looking at recent successful de-novo assemblies can help inform future sequencing and assembly plans for the Banana Slug.
 +
 +===== Sea Hare =====
 +
 +  Sea Hare is interesting because it is a recent de-novo mollusc assembly using 454 mate-pairs.
 +
   * Analyzing data from previously sequenced mollusk genomes
     * Broad Institute /ftp/pub/assemblies/invertebrates/aplysia (seahare)
Line 26: Line 34:
     * Estimate error after mapping.
     * These two measures should be correlated to each other.
 +
 +===== Panda =====
 +
 +  Panda is interesting because it is a recent de-novo assembly of a large
 +  genome of approximately the same size as banana slug (3 Gb). It was also
 +  assembled with SOAPdenovo, which we were able to use to assemble our slug data.
 +  Panda is also the only large genome known so far to have been assembled
 +  de-novo using only Illumina/Solexa reads.
  
 Panda Genome statistics
Line 61: Line 77:
  
   * Next slug to be sequenced should be photographed during dissection in order to identify the species.
 +
 +Good computational challenge:
 +  * Subdivide the short reads into the regions they group into; local de-novo assemblies can then be run on each subset of reads. This is done biologically with things like BACs.
 +  * Example (shorty): map reads to a contig, then map out to reads in other contigs, and then map back. This collects a bunch of reads that might belong together, and these can then be assembled.
 +  * Can use SOAPdenovo to get initial contigs, then map pieces onto the contigs and gather reads together. Store contigs in memory and stream data out to sub-assemblers. PhD-level question: can we make an efficient parallel assembler out of this? How do we stream through the reads and partition them efficiently? How can we handle all of this efficiently?
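
The read-partitioning idea in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used by shorty or SOAPdenovo: exact k-mer matching stands in for a real read mapper, only the forward strand is considered, and the function names (`kmers`, `build_index`, `partition_reads`) and the `min_hits` threshold are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every k-mer of seq (forward strand only, for brevity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(contigs: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the names of the contigs that contain it.

    The index is the only structure held in memory while reads stream by.
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in contigs.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def partition_reads(
    reads: Iterable[str],
    index: Dict[str, Set[str]],
    k: int,
    min_hits: int = 2,
) -> Dict[str, List[str]]:
    """Stream through reads once and bin each read by contig.

    A read joins the bin of every contig with which it shares at least
    min_hits k-mer occurrences, so a read spanning two contigs lands in
    both bins. Reads that match no contig are dropped in this sketch.
    """
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        hits: Dict[str, int] = defaultdict(int)
        for kmer in kmers(read, k):
            # .get avoids creating empty index entries for unseen k-mers
            for name in index.get(kmer, ()):
                hits[name] += 1
        for name, count in hits.items():
            if count >= min_hits:
                bins[name].append(read)
    return bins
```

Each bin could then be handed to a separate local assembly job. Because `reads` can be any iterable, a generator over a read file would keep memory use bounded by the contig index rather than by the read set.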
 +
lecture_notes/06-02-2010.1275516950.txt.gz · Last modified: 2010/06/02 22:15 by hyjkim