Banana Slug Genomics



====== 454 Presentation by Teri Mueller ======

This talk was hosted by Roche. It was a general overview of some of the capabilities of the 454 platform and its bundled software.

[[http://www.my454.com|454 website]] - Access to FLX updates and documentation. Register with UCSC account only.

  * Pre-talk comment: The GUI is accessible by X11 or VNC.

===== 454 Characteristics =====

  * 1 bead/1 DNA fragment: the filter will try to remove beads with >1 fragment (~20%).
  * 200 cycles per FLX Titanium run. The new machine does 400.
  * 1 flow: a single base. This is the unit of measure for pyrosequencing, as bases get added by flow.
  * Amount of light ~ # of bases.
  * The first 4 bases of each sequence are the key sequence. The first 3 are used to normalize the amount of light for 1 base.
  * Library sequences: TCAG standard / GATC rapid. Do not mix them! There are also control sequences (did not catch them).
  * 1 million reads per 2-well plate. Lane masks will decrease the number of reads.
  * Quality statistics can be viewed in the BaseQualityMetrics and QualityFilterMetrics files.
  * 10 hr runs. Processing can take up to 80 hrs on the default computer. This is mitigated by processing on a cluster.
  * 37 GB per run (~28 GB of raw images).

===== Software =====

  * Amplicon variant analyzer
  * GS Assembler
  * GS Reporter
  * GS RunProcessor - Image/signal processing
  * Formats:
    * .cwf: raw image
    * .fna: FASTA sequence file; the header has a specific format.
    * .qual: quality scores (like Phred, but offset)
    * .sff: standard 454 format. ''sfffile -s'' will split plate into runs.

===== Quality Filtering =====

  * Shotgun vs. Amplicon pipeline defaults
  * Keypass (read rejecting):
    * Checks for the key sequence.
    * ~20% rejection expected.
  * Dot (read rejecting):
    * Checks for too many negative flows.
    * 3 successive negative flows or N>5% of last positive flow.
  * Mix (read rejecting):
    * Checks for too many positive flows.
    * Indication of more than one sequence in a bead.
    * >70% positive reads.
  * Signal Intensity (read trimming):
    * Reduces the size of the read until <3% borderline reads.
  * Primer (read rejecting):
    * Discards overamplified short sequences.
  * Valley (read rejecting):
    * Discards scaled sum scores that are too close to the valleys between base count decision points.
    * Amplicon default: 4/700 (0.57%)
    * Shotgun default: 4/320 (1.25%)
  * Trim back (read trimming):
    * Like valley, except it trims instead of discarding until the ratio is acceptable.
    * Amplicon default: off
    * Shotgun default: on
  * Quality score trim (read trimming):
    * 40 base window: if the error rate is >1%, trim a base.
    * <40 bases: throws the sequence away.
    * (Even unfiltered, quality scores will reflect low quality areas.)

A good run: expect a read length mode of ~500 and a mean >300. ~50% of reads should pass the filters.
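The quality score trim described above can be sketched in a few lines of Python. This is only an illustration of the idea in these notes, not the actual GS RunProcessor implementation. It assumes the trim works from the 3' end, that "error rate" means the mean predicted error over the last 40 bases, and that the scores are already plain Phred values (any offset in the .qual file removed). The function names ''phred_to_error'' and ''quality_trim'' are made up for this example.

```python
WINDOW = 40       # window size in bases (from the notes)
MAX_ERROR = 0.01  # 1% error-rate threshold (from the notes)


def phred_to_error(q):
    """Convert a Phred-style quality score to an error probability."""
    return 10 ** (-q / 10)


def quality_trim(seq, quals, window=WINDOW, max_error=MAX_ERROR):
    """Trim bases from the 3' end until the last `window` bases have a
    mean predicted error rate of at most `max_error`.

    Returns the trimmed (seq, quals) pair, or None when fewer than
    `window` bases are left, in which case the read is discarded.
    Assumption: trimming from the 3' end and averaging over the window
    is one reading of the notes, not the documented 454 algorithm.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    errors = [phred_to_error(q) for q in quals]
    end = len(seq)
    while end >= window:
        mean_error = sum(errors[end - window:end]) / window
        if mean_error <= max_error:
            return seq[:end], quals[:end]
        end -= 1  # trim one base and re-check the window
    return None  # shorter than one window: throw the read away


if __name__ == "__main__":
    # Hypothetical read: 50 good bases (Q30) followed by 10 poor ones (Q5).
    seq = "ACGT" * 15
    quals = [30] * 50 + [5] * 10
    result = quality_trim(seq, quals)
    if result is None:
        print("read discarded")
    else:
        print("kept", len(result[0]), "of", len(seq), "bases")
```

With this reading of the rule, a low-quality tail is removed base by base, and reads that never reach an acceptable 40-base window are dropped entirely, which matches the "<40 bases" bullet above.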
