lecture_notes:04-16-2010


====== Newbler assembly of POG ======

====== Overview ======

Outlines how Kevin assembled the 454 data of Pyrobaculum oguniense (POG) using Newbler version 2.3.

===== Key points =====

  * Kevin installed Newbler version 2.3 on the Campusrocks cluster under /campusdata/BME235/programs/DataAnalysis_2.3.
  * The Newbler GUI is not installed because of problems unpacking it.
  * Kevin ran the assembly tool on the POG 454 data under /campusdata/BME235/assemblies/Pog.
  * The README file in that directory contains important information about the assembly.
  * Information about the installed tools is listed in bioinformatic_tools under [[https://banana-slug.soe.ucsc.edu/bioinformatic_tools:gs_de_novo_assembler | GS De Novo Assembler]]. That page also explains how to run the de novo and mapping assembly tools.
  * Currently, the tools are installed under /campusdata/BME235/bin/old_Newbler/.
  * Tools with the prefix "gs" are not supposed to be run directly.
  * Kevin has written several Python (version 2.6) scripts that aid in building and analyzing genomes. These scripts do not currently work on Campusrocks, because the installed Python is version 2.4; it is in the process of being updated to version 2.6.
  * The Newbler assembly tools take .sff files (flowgram and quality data) as input and convert them into .fna files (FASTA files with nucleotide information).
  * Newbler is good only with 454 data, and it is not good on reads shorter than 50 bases.
  * Example code to run the de novo tool on data is shown below. The code is taken from [[https://banana-slug.soe.ucsc.edu/bioinformatic_tools:gs_de_novo_assembler | GS De Novo Assembler]].

<code>
newAssembly .
addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ01.sff
addRun . /campusdata/BME235/data/Pog/454_run/sff/FUIPDCZ02.sff
runProject -e 50 .
</code>

  * Here -e 50 is an important parameter: it gives the expected coverage, and it defaults to 50.
  * Currently, only the de novo assembly has been done on POG; mapping has not been done yet.
  * Output: generated in a separate directory called "assembly". The main outputs are .fna files and .qual files. Look at /campusdata/BME235/assemblies/Pog/newbler-assembly1/assembly.
  * make.log keeps track of what happened.

====== Things to remember while running assembly tools ======

  * All assemblies should be listed under /campusdata/BME235/assemblies.
  * Include the .cshrc file in your path.
  * It's better to run the tool in the current working directory.
  * Create a README file in each new directory. It should contain everything needed to run the assembly tool.
  * Create a Makefile for each assembly tool. (The Makefile for the newbler_assembly tool is in /campusdata/BME235/assemblies/Pog/newbler-assembly1/.) You can use it as a template and modify the data source and the expected coverage as required. The Makefile should be considered "a book for lab protocols".
  * It's always better to have the Makefile append to make.log rather than overwrite it.
  * The wiki page for an assembly tool should contain a summary of how to run the tool and other things that might be useful to look at.
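
The run steps and the make.log advice above can be combined in a small wrapper script. The following is only a minimal sketch, not the Makefile kept in newbler-assembly1: the function names print_newbler_steps and run_newbler_steps are hypothetical, and the only Newbler calls used are the three commands quoted in these notes (newAssembly, addRun, and runProject -e). It assumes that the tools are on your path and that the .sff paths contain no spaces.

```shell
#!/bin/sh
# Sketch of a wrapper around the de novo steps from the notes.
# The helper names below are hypothetical, not part of Newbler.

# Print the Newbler commands for one de novo run, one per line.
# usage: print_newbler_steps EXPECTED_COVERAGE SFF_FILE...
print_newbler_steps() {
    coverage=$1
    shift
    echo "newAssembly ."
    for sff in "$@"; do
        echo "addRun . $sff"
    done
    echo "runProject -e $coverage ."
}

# Run those commands in the current directory. Each command and its
# output are appended to make.log (never overwritten), and the run
# stops at the first command that fails.
# Assumes no spaces in paths, since each line is split into words.
run_newbler_steps() {
    print_newbler_steps "$@" | while IFS= read -r cmd; do
        printf '$ %s\n' "$cmd" >> make.log
        $cmd >> make.log 2>&1 || exit 1
    done
}
```

From inside a new assembly directory, calling print_newbler_steps with a coverage value and the .sff paths shows what would be run, and run_newbler_steps with the same arguments runs it. Printing the commands first makes it easy to copy them into the README for that directory.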

lecture_notes/04-16-2010.1271463779.txt.gz · Last modified: 2010/04/17 00:22 by svasili