Installing the UCSC Genome Browser

Patricia Chan gave a lecture on setting up the UCSC Genome Browser. Slides: genomebrowsersetup.pdf

UCSC Genome Browser

Requirements

32- or 64-bit Linux/Unix system
CGI
MySQL database
Apache

The browser is written in C and JavaScript (jQuery).

To install or mirror a genome browser on a new server: http://genomewiki.ucsc.edu/index.php/Browser_Installation

Where are the data?

MySQL Database
- Each genome assembly has its own database
- Most track data stored in MySQL

/gbdb/<DB name>
- Each genome assembly has its own local directory
- Holds sequences, wiggle track data, and other large data files (e.g. .bam files)

centraldb

Holds the information about all genomes served by the browser.
centraldb.dbDb stores the information for each genome assembly.
centraldb.blatServers stores the info for the two BLAT servers (one for DNA, one for protein sequences).
The central database can be renamed in .hg.conf.

Track Info

Track settings are stored in the trackDb table in each genome's MySQL database.
The table is built from trackDb.ra files.
The global trackDb.ra contains tracks that apply to multiple genomes.
Defining a track in trackDb.ra is similar to building a custom track in the Genome Browser.

Kent Code Base

Required to set up the browser.
The latest source code is in a Git repository.
A CGI sandbox is needed.
Run make utils in ~/kent/src to build the utilities.
Binaries are installed in ~/bin/${MACHTYPE}
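
A common stumbling block with the install directory above: bash often sets MACHTYPE to a long triplet such as x86_64-pc-linux-gnu, while the Kent build is generally reported to expect the short form (e.g. x86_64). A minimal sketch, assuming that convention applies to the checked-out source. The helper name short_machtype is hypothetical, and the build commands are left as comments:

```shell
# Hypothetical helper: reduce a MACHTYPE triplet to its first component.
short_machtype() {
    printf '%s\n' "${1%%-*}"
}

# Fall back to `uname -m` when the shell does not set MACHTYPE at all.
MACHTYPE="$(short_machtype "${MACHTYPE:-$(uname -m)}")"
export MACHTYPE

# Directory where, per the notes, the built binaries are installed.
kent_bin="$HOME/bin/$MACHTYPE"
echo "Kent binaries expected in: $kent_bin"

# On a real build host (not run here):
#   mkdir -p "$kent_bin" && export PATH="$kent_bin:$PATH"
#   cd ~/kent/src && make utils
```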

Browser Configuration File

.hg.conf
Contains MySQL user accounts and passwords, centraldb info, trackDb info.
Required by Kent applications to connect to MySQL.
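
For orientation, a configuration file of this kind might look like the sketch below. The key names (db.*, central.*) are the ones commonly shown in UCSC's example configuration, but treat them as assumptions and check them against the example file shipped with the source. All hosts, user names, and passwords are placeholders, and the file is written to a scratch path rather than to ~/.hg.conf:

```shell
# Write an example configuration to a scratch path; on a real install this
# would be ~/.hg.conf. Every value below is a placeholder.
conf="${HG_CONF_OUT:-$(mktemp)}"

cat > "$conf" <<'EOF'
# MySQL account used to read the genome databases
db.host=localhost
db.user=browser_ro
db.password=CHANGE_ME
# Name of the track table in each genome database
db.trackDb=trackDb

# Central database (called centraldb in these notes)
central.db=centraldb
central.host=localhost
central.user=browser_rw
central.password=CHANGE_ME
EOF

# The file holds passwords, so keep it readable by its owner only.
chmod 600 "$conf"
echo "wrote $conf"
```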

Install the Genome Browser

Prepare Genome Sequences

Create a /gbdb/newGenome directory for the new genome assembly.
Convert genome sequences from FASTA to 2bit format.
- faToTwoBit chr1.fa [chr2.fa …] /gbdb/newGenome/newGenome.2bit
FASTA input files must use UNIX (LF) line endings.
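
The preparation steps can be sketched as a short script. This is an illustration under assumptions: the sample FASTA file is made up, the output goes to a scratch directory unless GBDB_ROOT (a hypothetical variable) is set, and faToTwoBit is only called if it is on the PATH:

```shell
db="newGenome"
gbdb_root="${GBDB_ROOT:-$(mktemp -d)}"   # would be /gbdb on a real server
work="$(mktemp -d)"

# Made-up input with Windows (CRLF) line endings, for demonstration only.
printf '>chr1\r\nACGTACGT\r\n' > "$work/chr1.raw.fa"

# Strip carriage returns so the file has plain LF line endings.
tr -d '\r' < "$work/chr1.raw.fa" > "$work/chr1.fa"

# Per-assembly directory that will hold the 2bit file.
mkdir -p "$gbdb_root/$db"

if command -v faToTwoBit >/dev/null 2>&1; then
    faToTwoBit "$work/chr1.fa" "$gbdb_root/$db/$db.2bit"
else
    echo "faToTwoBit not found; skipping 2bit conversion" >&2
fi
```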

Setup Genome Database

Create a MySQL database for the genome assembly:
- hgsql "" -e "create database if not exists newGenome"
- hgsql is a wrapper for passing SQL commands. The first argument is the database name; it is empty here because the database does not exist yet.
Create a group table for the new database:
- cd ~/kent/src/hg/lib
- hgsql newGenome < grp.sql
Create a chromInfo table:
- faSize -detailed chr1.fa [chr2.fa …] > chrominfo.tab
- hgsql newGenome < ~/kent/src/hg/lib/chromInfo.sql
- hgsql newGenome -e 'load data local infile "chrominfo.tab" into table chromInfo;'
- hgsql newGenome -e 'update chromInfo set fileName = "/gbdb/newGenome/newGenome.2bit"'
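
Before loading chrominfo.tab it can help to check it against an independent count. The sketch below is not a Kent tool: fasta_sizes is a hypothetical awk helper that prints one "name<TAB>length" line per sequence, which is the shape the notes expect from faSize -detailed. It assumes LF line endings and takes the sequence name to be the first word after ">":

```shell
# Hypothetical stand-in for cross-checking chrominfo.tab: print
# "name<TAB>length" for every sequence in the given FASTA files.
fasta_sizes() {
    awk '
        /^>/ {
            if (name != "") printf "%s\t%d\n", name, len
            name = substr($1, 2)   # first word of the header, minus ">"
            len = 0
            next
        }
        { len += length($0) }      # sequence line: add its length
        END { if (name != "") printf "%s\t%d\n", name, len }
    ' "$@"
}

# Example run on a small made-up file.
demo="$(mktemp)"
printf '>chr1 demo\nACGT\nACGT\n>chr2\nNNAC\n' > "$demo"
fasta_sizes "$demo"
```

If faSize is installed, comparing its output with this one (for instance with diff) is a quick way to catch truncated or CRLF-damaged input before the table is loaded.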

Make New Genome Available

Add an entry into the centraldb.dbDb table
Add an entry into the centraldb.defaultDb table
- Set the default assembly to use.
Add an entry into the centraldb.genomeClade table
- The clade the genome is associated with.
- If the genome belongs to a clade that is not in the browser, add an entry to the centraldb.clade table.
Add a description of the genome in an HTML file (/gbdb/newGenome/html/description.html)
- Free-form HTML; no fixed layout is required.
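
The central-database entries above could be prepared as a SQL file and reviewed before loading. This is a sketch only: the column lists are assumptions based on commonly documented versions of these tables and should be checked with "describe" on the actual install. The organism, clade, position, and source values are placeholders:

```shell
db="newGenome"          # database name used throughout these notes
genome="New genome"     # placeholder display name
clade="other"           # placeholder clade name
sql_file="${SQL_OUT:-$(mktemp)}"

cat > "$sql_file" <<EOF
-- Column lists are assumptions; compare with "describe dbDb" etc. first.
INSERT INTO dbDb
    (name, description, nibPath, organism, defaultPos, active, orderKey,
     genome, scientificName, htmlPath, hgNearOk, hgPbOk, sourceName)
VALUES
    ('$db', 'Draft assembly', '/gbdb/$db', '$genome', 'chr1:1-10000', 1, 1,
     '$genome', 'Genus species', '/gbdb/$db/html/description.html', 0, 0,
     'placeholder source');
INSERT INTO defaultDb (genome, name) VALUES ('$genome', '$db');
INSERT INTO genomeClade (genome, clade, priority) VALUES ('$genome', '$clade', 1);
EOF

echo "review $sql_file, then load it with: hgsql centraldb < $sql_file"
```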

Configuration

Track Configuration

Each genome database needs a trackDb table.
The global trackDb.ra is in ~/kent/src/hg/makeDb/trackDb
The genome-specific trackDb.ra is stored in ~/kent/src/hg/makeDb/trackDb/<DB name>
- It can be stored in an alternate location.
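
For illustration, a single stanza in a genome-specific trackDb.ra might look like the one written below. The track name, labels, and type are made up, and the group value is assumed to match a row in the genome's group table. The file goes to a scratch path rather than into the source tree:

```shell
ra_file="${RA_OUT:-$(mktemp)}"

# One stanza per track; stanzas are separated by blank lines.
cat > "$ra_file" <<'EOF'
track exampleGenes
shortLabel Example Genes
longLabel Example gene predictions (placeholder track)
group genes
priority 1
visibility pack
type genePred
EOF

# Count the stanzas: each one starts with a "track" line.
grep -c '^track ' "$ra_file"
```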

Search Configuration

An hgFindSpec table is required for specifying search criteria.
Search criteria for each track are also loaded from trackDb.ra.

Start BLAT Server

To run BLAT, a gfServer must be started for each genome.
Insert 2 records into the centraldb.blatServers table.
Make sure the port numbers are different.
If the BLAT server is not run locally, copy the 2bit file to the BLAT host:
- rsync -v /gbdb/newGenome/newGenome.2bit blat_host:/gbdb/newGenome
On the host machine, start the BLAT server in the background.
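
The two blatServers records can be generated with a small helper that also enforces the distinct-port rule. This is a sketch under assumptions: blat_server_sql is a hypothetical function, the column list (db, host, port, isTrans, canPcr) should be checked against the actual table, and the port numbers are arbitrary examples. The gfServer invocations appear only as comments because their options vary between installs:

```shell
# Hypothetical helper: emit the two blatServers rows for one genome.
# $1 = database, $2 = host, $3 = translated (protein) port, $4 = DNA port
blat_server_sql() {
    if [ "$3" = "$4" ]; then
        echo "error: the two gfServer ports must differ" >&2
        return 1
    fi
    printf "INSERT INTO blatServers (db, host, port, isTrans, canPcr) VALUES ('%s', '%s', %d, 1, 0);\n" "$1" "$2" "$3"
    printf "INSERT INTO blatServers (db, host, port, isTrans, canPcr) VALUES ('%s', '%s', %d, 0, 1);\n" "$1" "$2" "$4"
}

blat_server_sql newGenome blat_host 17778 17779

# On blat_host, the matching servers might be started like this (not run here):
#   gfServer start blat_host 17778 -trans -mask /gbdb/newGenome/newGenome.2bit &
#   gfServer start blat_host 17779 -stepSize=5 /gbdb/newGenome/newGenome.2bit &
```

The output can be piped into hgsql centraldb once the statements have been reviewed.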

Automation

The previous steps are automated in the Perl script make-browser.

Banana Slug Genomics