Table of Contents

assemblies/

This directory has a subdirectory for each organism.

test/

Test assemblies provided by the tool makers, used to check that each installation is correct.

Pog/

Pyrobaculum oguniense assemblies

                      Assembled Pog 454 long reads with Ray,
                      a parallel implementation of the OpenAssembler.
                      This software seems to be Canadian.
                      The run took 3 hours, and the output was
                      not very good: the maximum contig size was about 12 kb.
                      Sadly, there are no parameters to tweak.

                      Assembled Pog 454 long reads with ABySS.
                      The best parameters found were k-mer size 36 and coverage cutoff 15:
                      #ABYSS -k 36 -c 15 both.fq
                      #Total size: mean 1844.8 sd 3479.7 min 36 (1179) max 32566 (556) median 204

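                      The notes do not record how the best ABySS parameters were found. A minimal
                      sketch of one way to enumerate candidate runs follows. It only builds command
                      lines in the same form as the one recorded above and does not run them. The
                      grid values in the usage note are hypothetical, and any output options are
                      left out because the notes do not show them.

```python
from itertools import product
from typing import Iterable, List


def abyss_commands(kmers: Iterable[int], cutoffs: Iterable[int],
                   reads: str = "both.fq") -> List[List[str]]:
    """Build one ABYSS argument list per (k-mer size, coverage cutoff) pair.

    The argument layout mirrors the command recorded in these notes
    (ABYSS -k 36 -c 15 both.fq); nothing is executed here.
    """
    return [
        ["ABYSS", "-k", str(k), "-c", str(c), reads]
        for k, c in product(kmers, cutoffs)
    ]


if __name__ == "__main__":
    # Hypothetical grid; the values actually tried are not recorded.
    for cmd in abyss_commands([32, 36, 40], [10, 15, 20]):
        print(" ".join(cmd))
```

                      Each printed line can be run by hand or from a makefile, and the resulting
                      contigs compared by size statistics.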
                      Assembled Pog 454 long reads with PCAP using default parameters. Sanger reads were not included.
                      The minimum depth of coverage for repeats had to be increased before the results were any good.
                      Assembled Pog 454 long reads again with the minimum depth of coverage for repeats set to 200 and the other parameters unmodified.
                      faSize output for contigs.bases:
                      2506151 bases (8 N's 2506143 real 2506143 upper 0 lower) in 219 sequences in 1 files
                      Total size: mean 11443.6 sd 65849.3 min 56 (Contig174.1) max 611479 (Contig0.1) median 195
                      N count: mean 0.0 sd 0.2
                      U count: mean 11443.6 sd 65849.3
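                      A summary in the style of the faSize "Total size" line can be approximated
                      with a short script. This is a sketch, not a reimplementation of faSize: the
                      function names are made up for illustration, and the use of the sample
                      standard deviation is an assumption that has not been checked against faSize.

```python
import statistics
from typing import Dict, Iterable


def contig_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record name to the length of its sequence."""
    lengths: Dict[str, int] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else f"seq{len(lengths)}"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def size_summary(lengths: Dict[str, int]) -> Dict[str, float]:
    """Summarize contig sizes: count, total, mean, sd, min, max, median."""
    sizes = list(lengths.values())
    return {
        "sequences": len(sizes),
        "bases": sum(sizes),
        "mean": statistics.mean(sizes),
        # Sample standard deviation (assumption, see the note above).
        "sd": statistics.stdev(sizes) if len(sizes) > 1 else 0.0,
        "min": min(sizes),
        "max": max(sizes),
        "median": statistics.median(sizes),
    }
```

                      For a file on disk, pass the open file handle, for example
                      size_summary(contig_lengths(open("contigs.bases"))).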
                      Using Kevin's makefile, the blat alignments showed large contigs that looked basically correct, except for contig 8.
                      However, many of them overlapped, unlike the Newbler output. This may be due to a
                      difference in how Newbler and PCAP handle the mixed population in the sample, in which
                      3 inverting regions are found at various frequencies.
                      Also, a cutoff should probably be applied after the 17th largest contig, because
                      most of the rest of the 219 contigs are small and probably represent noise.
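                      The suggested cutoff can be sketched as keeping only the largest contigs by
                      length. The function names are hypothetical, and the default of 17 is taken
                      from the note above rather than from any tool setting.

```python
from typing import Dict, List, Tuple


def largest_contigs(lengths: Dict[str, int],
                    keep: int = 17) -> List[Tuple[str, int]]:
    """Return the `keep` largest contigs as (name, length), largest first.

    Ties in length are broken by name so the result is deterministic.
    """
    ranked = sorted(lengths.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:keep]


def retained_fraction(lengths: Dict[str, int],
                      kept: List[Tuple[str, int]]) -> float:
    """Fraction of all assembled bases that the kept contigs account for."""
    total = sum(lengths.values())
    return sum(size for _, size in kept) / total if total else 0.0
```

                      Checking retained_fraction before discarding the small contigs shows how
                      much of the assembly the cutoff would drop.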

slug/