We went over some things that need updating in the wiki and plans for this weekend.
Document the scripts added to the bin folder.
Find out which lanes in 454 run 3 are banana slug runs. This turned out to be unnecessary: run “3” is not a separate run, just run 2 plus the non-banana-slug reads in other lanes.
Find insert lengths for the SeqPrep + Quake corrected Illumina data.
SOAPdenovo assembly try 1:
try 2:
try 3:
If the insert length is negative, don't treat the reads as PE reads.
This rested on a mistaken assumption about what SOAPdenovo wants. The number it wants is the total fragment length, which is what we are already estimating. We just need to look at the average length for the pairs (based on the histogram) after SeqPrep.
If Quake changes the distribution of reads (by trimming and discarding uncorrectable reads) it may be important to remap the new set of pairs to the 454 reads to get an improved estimate of fragment length.
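As a rough illustration of the averaging step discussed above, here is a minimal Python sketch that computes the mean fragment length from a length histogram. The function name, the dict-based histogram layout, and the example counts are all assumptions made for illustration; they are not the format of any existing script in the bin folder, and the numbers are not real data.

```python
def mean_fragment_length(histogram):
    """Return the count-weighted mean of a fragment-length histogram.

    histogram: dict mapping fragment length (bp) -> number of read pairs.
    This layout is an assumption for the sketch, not an existing file format.
    """
    total_pairs = sum(histogram.values())
    if total_pairs == 0:
        raise ValueError("histogram contains no pairs")
    weighted_sum = sum(length * count for length, count in histogram.items())
    return weighted_sum / total_pairs


# Hypothetical counts, for illustration only (not measured data).
example_histogram = {150: 10, 200: 30, 250: 10}
print(round(mean_fragment_length(example_histogram)))  # prints 200
```

The rounded value is the kind of number that would go into the SOAPdenovo library configuration as the average fragment length. If Quake shifts the distribution, the same calculation can be rerun on a histogram built from the remapped pairs.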
Newbler assemblies: no need for a new one, as there is no new 454 data.
It turns out that blastall does not seem to have any documented way to tell tblastx to use a different genetic code, though the NCBI web server has the option.
The work Kevin did so far is now in assemblies/slug/barcode-of-life.
Next step is to use BWA to find all the SeqPrep+Quake treated Illumina reads that map to what has been found so far of the barcode.
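Once BWA has produced SAM output for the reads against the partial barcode sequence, picking out the reads that mapped is a matter of checking the SAM FLAG field. The following is a minimal sketch, assuming plain-text SAM lines as input; the function name and the example records are hypothetical, and the BWA invocation itself is not shown.

```python
def mapped_read_names(sam_lines):
    """Collect names of reads that mapped, from an iterable of SAM lines.

    A read counts as mapped when bit 0x4 (segment unmapped) of the FLAG
    field is not set. Header lines (starting with '@') are skipped.
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        names.add(fields[0])
    return names


# Hypothetical records for illustration only (not real reads).
example_sam = [
    "@SQ\tSN:barcode_partial\tLN:650",
    "read1\t0\tbarcode_partial\t10\t37\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTAAAA\tIIIIIIII",
    "read3\t16\tbarcode_partial\t40\t37\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII",
]
print(sorted(mapped_read_names(example_sam)))  # prints ['read1', 'read3']
```

The resulting set of names can then be used to pull the matching reads (and their mates) out of the SeqPrep+Quake FASTQ files for a local assembly of the barcode region.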