UCSF_SW019

Sequencing data

Library	Run	Location	Notes
		/campusdata/BME235/Spring2015Data/UCSF_SW019/

Files

File	Size
SW019_TGGCAAT_R1.fastq	40G
SW019_TGGCAAT_R2.fastq	40G
../preqc/ucsf_sw019/ucsf_sw019.fastq	81G
../adapterAndPCRFreeFiles/SW019_noAdap_R1.fastq	39G
../adapterAndPCRFreeFiles/SW019_noAdap_R2.fastq	39G
../adapterAndPCRFreeFiles/UCSF_SW019_noAdap_noDup_R1.fastq	39G
../adapterAndPCRFreeFiles/UCSF_SW019_noAdap_noDup_R2.fastq	39G
../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_merged.fastq.gz	14G
../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R1_trimmed.fastq.gz	5G
../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R2_trimmed.fastq.gz	4.9G
ErrorCorrected/SW019_seqprep_dupRemoved_ec_R1.fastq	37G
ErrorCorrected/SW019_seqprep_dupRemoved_ec_R2.fastq	37G

5/23: The SeqPrep_newData R1 and R2 trimmed files differ in size (5G vs. 4.9G). How did this happen? I have had some programs crash when the R1 and R2 reads do not match one-to-one.
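A difference in compressed file size does not by itself mean the read counts differ, so it is worth checking the pairing directly. The following is a minimal sketch of such a check, not something taken from the original analysis. The function names (`open_fastq`, `read_names`, `check_pairs`) are hypothetical, and the code assumes standard 4-line FASTQ records whose mates share a read name, optionally with a `/1` or `/2` suffix.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_names(path):
    """Yield the read name from each 4-line FASTQ record (assumed layout)."""
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # only header lines carry the read name
            fields = line.split()
            # Drop the leading '@' and any comment after the first whitespace.
            name = fields[0][1:] if fields else ""
            # Drop an old-style mate suffix so that R1 and R2 names compare equal.
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def check_pairs(r1_path, r2_path):
    """Return (pairs_checked, ok).

    ok is False as soon as a name differs or one file runs out of reads
    before the other; pairs_checked counts the pairs that matched up to
    that point.
    """
    count = 0
    for n1, n2 in zip_longest(read_names(r1_path), read_names(r2_path)):
        if n1 != n2:
            return count, False
        count += 1
    return count, True
```

For example, `check_pairs("UCSF_SW019_TGGCAAT_R1_trimmed.fastq.gz", "UCSF_SW019_TGGCAAT_R2_trimmed.fastq.gz")` run in the merging/SeqPrep_newData directory would report whether the two trimmed files stay in step. On files of this size the scan will take a while, since every record is read.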

Fastqc results

sw019_tggcaat_l001_r1_001.fastq

sw019_tggcaat_l001_r2_001.fastq

Preqc (SGA preprocessing) results

Sun May 24

ucsf_sw019_preqc_report.pdf

Comments

The preqc report for the ucsf_sw019 reads looks similar to previous reports. However, the genome size estimate is low (1.9 Gb compared with 2.2 Gb). The ucsf_sw019 data set is not represented in the k-de Bruijn graph plots, which indicates insufficient coverage to make these predictions. This report provides a useful baseline for comparison with other preprocessing efforts.

Pre processing

The raw FASTQ files were run through a preprocessing pipeline. First, adapter sequences were removed from the FASTQ files using Skewer. The adapter-free files were then processed with FastUniq to remove PCR duplicates.
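To illustrate what the duplicate-removal step does, here is a minimal sketch of paired-end de-duplication by sequence comparison. It is not FastUniq's implementation, only the same basic idea: a read pair is treated as a duplicate when both of its sequences have already been seen together. The helper names `iter_fastq` and `dedup_pairs` are hypothetical, and the in-memory set would not scale to the 39G files listed above.

```python
def iter_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from FASTQ lines.

    Assumes standard 4-line records with no blank lines in between.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def dedup_pairs(r1_records, r2_records):
    """Yield the first occurrence of each distinct (R1 sequence, R2 sequence) pair.

    Input order is preserved. Later pairs with identical sequences in both
    mates are dropped as presumed PCR duplicates.
    """
    seen = set()
    for rec1, rec2 in zip(r1_records, r2_records):
        key = (rec1[1], rec2[1])  # index 1 is the sequence line
        if key in seen:
            continue
        seen.add(key)
        yield rec1, rec2
```

Keying on both mates matters: two pairs that share an R1 sequence but differ in R2 come from different fragments and are both kept.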

KmerGenie

KmerGenie was run on the merged and trimmed files.

predicted best k: 61

KmerGenie Merged SW018 SW019

SeqPrep results

The data files were trimmed using SeqPrep, both with and without merging. The output for the run without merging is in /campusdata/BME235/Spring2015Data/adapter_trimming/SeqPrep_newData and the output for the run with merging is in /campusdata/BME235/Spring2015Data/merging/SeqPrep_newData. The trimmed R1 and R2 files for the run with merging are significantly smaller than those from the non-merging run, presumably because read pairs that merged were written to the merged file rather than to the trimmed R1 and R2 output.

The adapters used for both runs were AGATCGGAAGAGCACACGTCTGAACTCCAG (-A option) and AGATCGGAAGAGCGTCGTGTAGGGAAAGAG (-B option).
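The two runs can be sketched as a small command builder. This is an illustration under stated assumptions, not the command line that was actually run: the exact invocation is not recorded on this page. The `-f`, `-r`, `-1`, `-2`, `-A`, `-B` and `-s` options are SeqPrep's input, output, adapter and merged-output options. The function name `seqprep_command`, the choice of input files, and the output naming (patterned on the file list above) are assumptions.

```python
import shlex

# Adapter sequences as given on this page.
ADAPTER_A = "AGATCGGAAGAGCACACGTCTGAACTCCAG"  # passed with -A
ADAPTER_B = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAG"  # passed with -B


def seqprep_command(r1, r2, out_prefix, merge=False):
    """Build a SeqPrep argument list for adapter trimming.

    With merge=True, a merged-reads output file is requested via -s.
    Output file names are hypothetical and derived from out_prefix.
    """
    cmd = [
        "SeqPrep",
        "-f", r1,
        "-r", r2,
        "-1", f"{out_prefix}_R1_trimmed.fastq.gz",
        "-2", f"{out_prefix}_R2_trimmed.fastq.gz",
        "-A", ADAPTER_A,
        "-B", ADAPTER_B,
    ]
    if merge:
        cmd += ["-s", f"{out_prefix}_merged.fastq.gz"]
    return cmd


# Print the command for the merging run so it can be reviewed before use.
# The input file names are assumed to be the raw files from the table above.
print(shlex.join(seqprep_command(
    "SW019_TGGCAAT_R1.fastq",
    "SW019_TGGCAAT_R2.fastq",
    "UCSF_SW019_TGGCAAT",
    merge=True,
)))
```

Building the argument list in one place keeps the adapters identical between the merging and non-merging runs, so the only difference between them is the `-s` option.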

Merged SW019 Libraries

All SW019 data sets that had been adapter-trimmed using SeqPrep were combined and run through FastUniq to remove duplicates, then error-corrected using Musket.

Banana Slug Genomics
