lecture_notes:05-23-2011

====== Installing the UCSC Genome Browser ======

Patricia Chan gave a lecture on setting up the UCSC Genome Browser.

===== UCSC Genome Browser =====

==== Requirements ====

  * 32- or 64-bit Linux/Unix system
  * CGI
  * MySQL database
  * Apache

The browser is written in C and JavaScript (jQuery).

To install or mirror a genome browser on a new server: http://genomewiki.ucsc.edu/index.php/Browser_Installation

==== Where are the data? ====

  * MySQL database
    * Each genome assembly has its own database.
    * Most track data are stored in MySQL.
  * ''/gbdb/<DB name>''
    * Each genome assembly has its own local directory.
    * Holds sequences, wiggle track data, and other large data sources (.bam files).

''centraldb''

  * Contains the information about all genomes in the browser.
  * ''centraldb.dbDb'' stores the information for each genome assembly.
  * ''centraldb.blatServers'' stores the info for two BLAT servers (one for DNA, one for protein sequences).
  * The database can be renamed in ''.hg.conf''.

Track Info

  * Stored in the ''trackDb'' table in each genome's MySQL database.
  * Based on ''trackDb.ra'' files as input source.
  * The global ''trackDb.ra'' contains tracks that apply to multiple genomes.
  * Similar to building a custom track on the genome browser.

Kent Code Base

  * Required to set up the browser.
  * The latest source code is in the Git repository.
  * Needs a CGI sandbox.
  * Run ''make utils'' in ''~/kent/src''.
  * Binaries are installed in ''~/bin/${MACHTYPE}''.

Browser Configuration File

  * ''.hg.conf''
  * Contains MySQL user accounts and passwords, ''centraldb'' info, and ''trackDb'' info.
  * Required by Kent applications to connect to MySQL.

===== Install the Genome Browser =====

==== Prepare Genome Sequences ====

  * Create a ''/gbdb/newGenome'' directory for the new genome assembly.
  * Convert the genome sequences from FASTA to 2bit format.
    * ''faToTwoBit chr1.fa [chr2.fa ...] /gbdb/newGenome/newGenome.2bit''
    * FASTA input files must use UNIX (LF) line endings.
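
The LF requirement above can be checked before running ''faToTwoBit''. A minimal sketch, assuming the FASTA files may have come from a Windows machine with CRLF line endings; ''has_crlf'' and ''normalize_fasta'' are hypothetical helper names, not part of the Kent tools:

```shell
# Sketch: make sure FASTA files use LF line endings before 2bit conversion.
# has_crlf and normalize_fasta are hypothetical helpers, not Kent utilities.

# Return 0 if the file contains at least one carriage return (CR) byte.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Strip CR bytes from the file in place; files without CR are left untouched.
# This handles CRLF (Windows) endings only. CR-only (classic Mac) files would
# end up with their lines joined, so they need a different conversion.
normalize_fasta() {
    nf_file="$1"
    if has_crlf "$nf_file"; then
        nf_tmp="$(mktemp)" || return 1
        # Write back with cat so the original file keeps its permissions.
        tr -d '\r' < "$nf_file" > "$nf_tmp" && cat "$nf_tmp" > "$nf_file"
        nf_status=$?
        rm -f "$nf_tmp"
        return "$nf_status"
    fi
}

# Example use, followed by the conversion command from the notes:
#   for fa in chr1.fa chr2.fa; do normalize_fasta "$fa"; done
#   faToTwoBit chr1.fa chr2.fa /gbdb/newGenome/newGenome.2bit
```

Files that are already clean are not rewritten, so the helper is safe to run over every input file.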

==== Setup Genome Database ====

  * Create a MySQL database for the genome assembly:
    * ''hgsql "" -e "create database if not exists newGenome"''
    * ''hgsql'' is a wrapper for passing SQL commands. The first argument is the database name (empty here because the database does not exist yet).
  * Create a group table for the new database:
    * ''cd ~/kent/src/hg/lib''
    * ''hgsql newGenome < grp.sql''
  * Create a chromInfo table:
    * ''faSize -detailed chr1.fa [chr2.fa ...] > chrominfo.tab''
    * ''hgsql newGenome < ~/kent/src/hg/lib/chromInfo.sql''
    * ''hgsql newGenome -e "load data local infile 'chrominfo.tab' into table chromInfo;"''
    * ''hgsql newGenome -e "update chromInfo set fileName = '/gbdb/newGenome/newGenome.2bit'"''

==== Make New Genome Available ====

  * Add an entry to the ''centraldb.dbDb'' table.
  * Add an entry to the ''centraldb.defaultDb'' table.
    * Sets the default assembly to use.
  * Add an entry to the ''centraldb.genomeClade'' table.
    * The clade the genome is associated with.
    * If the genome belongs to a clade that is not in the browser, add an entry to the ''centraldb.clade'' table.
  * Add a description of the genome in an HTML file (''/gbdb/newGenome/html/description.html'').
    * Free-form.

===== Configuration =====

==== Track Configuration ====

  * Each genome database needs a ''trackDb'' table.
    * The global ''trackDb.ra'' is in ''~/kent/src/hg/makeDb/trackDb''.
    * The genome-specific ''trackDb.ra'' is stored in ''~/kent/src/hg/makeDb/trackDb/<DB name>''.
    * Both can be stored in an alternate location.

==== Search Configuration ====

  * An ''hgFindSpec'' table is required for specifying search criteria.
  * Search criteria for each track are also loaded from ''trackDb.ra''.

==== Start BLAT Server ====

  * To run BLAT, a ''gfServer'' must be started for each genome.
  * Insert two records into the ''centraldb.blatServers'' table.
    * Make sure the port numbers are different.
  * If the BLAT server is not run locally:
    * ''rsync -v /gbdb/newGenome/newGenome.2bit blat_host:/gbdb/newGenome''
    * On the host machine, start the BLAT server in the background.
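
The two ''blatServers'' records could be generated as follows. This is only a sketch: ''blat_server_sql'' is a hypothetical helper, the port numbers are placeholders, and the column names (''db'', ''host'', ''port'', ''isTrans'', ''canPcr'') are assumed from the UCSC mirror documentation rather than taken from the lecture, so check them against your own ''centraldb'' schema.

```shell
# Sketch: print the INSERT statements for the DNA and protein BLAT servers.
# blat_server_sql is a hypothetical helper; column names are an assumption
# (see the note above) and the ports passed in are placeholders.
blat_server_sql() {
    bs_db="$1"; bs_host="$2"; bs_dna_port="$3"; bs_prot_port="$4"

    # The notes require distinct ports for the two servers.
    if [ "$bs_dna_port" = "$bs_prot_port" ]; then
        echo "error: DNA and protein BLAT ports must differ" >&2
        return 1
    fi

    bs_fmt="INSERT INTO blatServers (db, host, port, isTrans, canPcr) VALUES ('%s', '%s', %s, %s, %s);\n"
    # isTrans=0: untranslated (DNA) server, assumed usable for PCR (canPcr=1).
    printf "$bs_fmt" "$bs_db" "$bs_host" "$bs_dna_port" 0 1
    # isTrans=1: translated (protein) server.
    printf "$bs_fmt" "$bs_db" "$bs_host" "$bs_prot_port" 1 0
}

# Example use (not run here; assumes the central database is named centraldb):
#   blat_server_sql newGenome blat_host 17778 17779 | hgsql centraldb
# The servers themselves would then be started on blat_host, for example:
#   gfServer start blat_host 17778 /gbdb/newGenome/newGenome.2bit &
#   gfServer start blat_host 17779 -trans /gbdb/newGenome/newGenome.2bit &
```

Printing the SQL instead of executing it lets you review the records before piping them into ''hgsql''.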

===== Automation =====

  * The previous steps are automated in the Perl script ''make-browser''.

lecture_notes/05-23-2011.1306189905.txt.gz · Last modified: 2011/05/23 22:31 by svohr