====== UCSF_SW019 ======

===== Sequencing data =====

| Library | Run | Location | Notes |
| | | /campusdata/BME235/Spring2015Data/UCSF_SW019/ | |

===== Files =====

| File | Size |
| SW019_TGGCAAT_R1.fastq | 40G |
| SW019_TGGCAAT_R2.fastq | 40G |
| ../preqc/ucsf_sw019/ucsf_sw019.fastq | 81G |
| ../adapterAndPCRFreeFiles/SW019_noAdap_R1.fastq | 39G |
| ../adapterAndPCRFreeFiles/SW019_noAdap_R2.fastq | 39G |
| ../adapterAndPCRFreeFiles/UCSF_SW019_noAdap_noDup_R1.fastq | 39G |
| ../adapterAndPCRFreeFiles/UCSF_SW019_noAdap_noDup_R2.fastq | 39G |
| ../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_merged.fastq.gz | 14G |
| ../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R1_trimmed.fastq.gz | 5G |
| ../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R2_trimmed.fastq.gz | 4.9G |
| ErrorCorrected/SW019_seqprep_dupRemoved_ec_R1.fastq | 37G |
| ErrorCorrected/SW019_seqprep_dupRemoved_ec_R2.fastq | 37G |

5/23 - The SeqPrep_newData R1 and R2 file sizes differ (5G vs. 4.9G). How did this happen? I have had some programs crash when R1 and R2 do not match 1 to 1.

===== FastQC results =====

{{::sw019_tggcaat_l001_r1_001.fastq.pdf|sw019_tggcaat_l001_r1_001.fastq}}
{{::sw019_tggcaat_l001_r2_001.fastq.pdf|sw019_tggcaat_l001_r2_001.fastq}}

===== Preqc (SGA preprocessing) results =====

Sun May 24

{{:ucsf_sw019_preqc_report.pdf|}}

==== Comments ====

The preqc report for the ucsf_sw019 reads looks similar to previous reports. However, the genome size estimate is low (1.9 Gb compared to 2.2 Gb). The ucsf_sw019 data set is not represented in the k-de Bruijn graphs, which indicates insufficient coverage to make these predictions. This report provides a useful baseline for comparison with other pre-processing efforts.

===== Pre-processing =====

The raw fastq files were put through a pre-processing pipeline. First, adapter sequences were removed from the fastq files using Skewer. The adapter-free files were then processed with FastUniq to remove PCR duplicates.
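
Because some downstream programs crash when R1 and R2 do not match 1 to 1 (see the 5/23 note above), it may be worth checking pairing after each pre-processing step. The following is a minimal sketch, not part of the pipeline described on this page. It assumes standard 4-line FASTQ records and read IDs that either carry a ''/1'' or ''/2'' suffix or are followed by a space-separated comment. The function names are made up for illustration.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_ids(path):
    """Yield one read ID per 4-line FASTQ record, without a /1 or /2 suffix."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header.strip():
                return  # end of file (or a trailing blank line)
            if not header.startswith("@"):
                raise ValueError(f"{path}: expected '@' header, got {header!r}")
            for _ in range(3):  # skip sequence, '+' and quality lines
                handle.readline()
            fields = header[1:].split()
            name = fields[0] if fields else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def check_pairs(r1_path, r2_path):
    """Return (pairs_in_sync, first_problem); first_problem is None if all match."""
    n = 0
    for id1, id2 in zip_longest(read_ids(r1_path), read_ids(r2_path)):
        if id1 is None or id2 is None:
            return n, f"record {n + 1}: one file ended before the other"
        if id1 != id2:
            return n, f"record {n + 1}: {id1!r} != {id2!r}"
        n += 1
    return n, None


# Example call (paths taken from the Files table, relative to the data directory):
# check_pairs("../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R1_trimmed.fastq.gz",
#             "../merging/SeqPrep_newData/UCSF_SW019_TGGCAAT_R2_trimmed.fastq.gz")
```

A small difference in file size alone does not prove the files are out of sync, since trimming can shorten R1 and R2 reads by different amounts. Comparing record counts and IDs is the more direct test.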
===== KmerGenie =====

Running KmerGenie on the merged and trimmed files predicted a best k of 61.

{{:ucsf_sw018_sw019_report.pdf|KmerGenie Merged SW018 SW019}}

===== SeqPrep results =====

The data files were trimmed using SeqPrep, both with and without merging. The output for the run without merging is in /campusdata/BME235/Spring2015Data/adapter_trimming/SeqPrep_newData and the output for the run with merging is in /campusdata/BME235/Spring2015Data/merging/SeqPrep_newData. The trimmed R1 and R2 files for the run with merging are significantly smaller than those from the non-merging run, presumably because read pairs that merge are written to the merged file rather than to the trimmed R1 and R2 outputs.

The adapters used for both runs were AGATCGGAAGAGCACACGTCTGAACTCCAG (-A option) and AGATCGGAAGAGCGTCGTGTAGGGAAAGAG (-B option).

===== Merged SW019 Libraries =====

All SW019 data sets that had been adapter trimmed using SeqPrep were merged, run through FastUniq to remove duplicates, and then error corrected using Musket.
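
One way to spot-check that adapter trimming worked on these files is to count how many reads still contain the start of either adapter. The sketch below is only an illustration and was not part of the work described above. It assumes 4-line FASTQ records. The function name, the seed length, and the read limit are arbitrary choices. The two adapters listed in the SeqPrep section share their first 13 bases, so a single 13-base seed covers both.

```python
import gzip

# Adapter sequences as given for the SeqPrep -A and -B options on this page.
ADAPTER_A = "AGATCGGAAGAGCACACGTCTGAACTCCAG"
ADAPTER_B = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAG"


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_adapter_hits(path, adapter, seed_len=13, max_reads=100000):
    """Return (reads_checked, reads_containing_seed) for the first max_reads reads.

    Only an exact match to the first seed_len bases of the adapter is counted,
    so this is a rough lower bound, not a replacement for a real trimmer.
    """
    seed = adapter[:seed_len]
    checked = hits = 0
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 1:  # the sequence is the second line of each record
                continue
            seq = line.strip()
            if not seq:
                continue
            checked += 1
            if seed in seq:
                hits += 1
            if checked >= max_reads:
                break
    return checked, hits


# Example call (path taken from the Files table):
# count_adapter_hits("ErrorCorrected/SW019_seqprep_dupRemoved_ec_R1.fastq", ADAPTER_A)
```

A hit count near zero on the trimmed files, compared with a clearly higher count on the raw files, would suggest that trimming behaved as intended. A 13-base seed can also match genomic sequence by chance, so a few hits are expected either way.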