======BS-MK======

===== Sequencing data =====

| Library | Run | Location | Notes |
| | | /campusdata/BME235/Spring2015Data/UCSF_BS-MK/ | |

===== Files =====

| File | Size |
| BS-MK_CATCCGG_R1.fastq | 83G |
| BS-MK_CATCCGG_R2.fastq | 83G |
| ../preqc/ucsf_bs-mk/ucsf_bs-mk.fastq | 172G |
| ../adapter_trimming/UCSF_reads_skewer_trimmed/BS_MK_noAdap_R1.fastq | 83G |
| ../adapter_trimming/UCSF_reads_skewer_trimmed/BS_MK_noAdap_R2.fastq | 83G |
| ../merging/SeqPrep_newData/UCSF_BS-MK_CATCCGG_merged.fastq.gz | 11G |
| ../merging/SeqPrep_newData/UCSF_BS-MK_CATCCGG_R1_trimmed.fastq.gz | 22G |
| ../merging/SeqPrep_newData/UCSF_BS-MK_CATCCGG_R2_trimmed.fastq.gz | 22G |

/campusdata/BME235/S15_assemblies/SOAPdenovo2/errorCorrectionTask/musket_run/pairedEndEC_k31_minmult3/BS-MK_seqprep_dupRemoved_ec_R1.fastq

/campusdata/BME235/S15_assemblies/SOAPdenovo2/errorCorrectionTask/musket_run/pairedEndEC_k31_minmult3/BS-MK_seqprep_dupRemoved_ec_R2.fastq

A grep search shows that within the raw data files only 0.06% of reads contain the full adapter sequence.

==== FastQC results ====

Results of the run are located here on the campusrocks2 server: /campusdata/BME235/S15_assemblies/SOAPdenovo2/Fastqc/UCSF_BS-MK_fastqc

{{:fastqc_bs-mk_catccgg_r1.pdf|}}

{{:fastqc_bs-mk_catccgg_r2.pdf|}}

**Skewer adapter-trimmed FastQC results**

Results of FastQC analysis on the adapter-trimmed (using Skewer) and PCR-duplicate-removed (using FastUniq) files:

forward adapter used: ACACTCTTTCCCTACACGACGCTCTTCCGATCT

reverse adapter used: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT

{{:fastqc_bs_mk_noadap_r1.pdf|}}

{{:fastqc_bs_mk_noadap_r2.pdf|}}

The "no-adap" sequences still appear to have a high-frequency k-mer at the beginning: GCTCTTCCGATCTA. This resembles the claimed -B option AGATCGGAAGAGCGTCGTGTAGGGAAAGAG, which reverse-complements to CTCTTTCCCTACACGACGCTCTTCCGATCT.

Was adapter trimming done correctly? Why did Skewer not remove the adapter sequence?

Skewer was not run properly.
It was redone with the adapters:

Forward: AGATCGGAAGAGCACACGTCTGAACTCCAG

Reverse: AGATCGGAAGAGCGTCGTGTAGGGAAAGAG

Here are the new FastQC results:

{{:bs_mk_noadap_r1.fastq_fastqc_report.pdf| fastqc report for v2 BS-MK R1 trimmed reads}}

{{:bs_mk_noadap_r2.fastq_fastqc_report.pdf| fastqc report for v2 BS-MK R2 trimmed reads}}

The k-mer content check still fails.

**SeqPrep adapter-trimmed FastQC results**

Results of FastQC analysis on the SeqPrep adapter-trimmed files:

{{::fastqc_ucsf_bs-mk_catccgg_r1_tr....fastq.gz_fastqc_report.pdf|}}

{{:fastqc_ucsf_bs-mk_catccgg_r2_tr....fastq.gz_fastqc_report.pdf|}}

These FastQC analyses show a huge amount of adapter at the beginnings of the reads. Was SeqPrep told about the adapters? What parameters was it run with?

===== Preqc (SGA preprocessing) results =====

Tues May 26

{{::ucsf_bs-mk_preqc_report.pdf|}}

Preqc report of the UCSF BS-MK and BS-tag data (pooled).

{{:ucsf_new_lib_preqc_report.pdf|}}

==== Comments ====

The preqc report for the ucsf_bs-mk reads looks similar to previous reports. This report provides a useful baseline for comparison with other pre-processing efforts.

=====SeqPrep results=====

The data files were trimmed using SeqPrep, both with and without merging. The output for the run without merging is in /campusdata/BME235/Spring2015Data/adapter_trimming/SeqPrep_newData, and the output for the run with merging is in /campusdata/BME235/Spring2015Data/merging/SeqPrep_newData. The trimmed R1 and R2 files for the run with merging are somewhat smaller than those from the non-merging run.

The adapters used for both runs were AGATCGGAAGAGCACACGTCTGAACTCCAG (-A option) and AGATCGGAAGAGCGTCGTGTAGGGAAAGAG (-B option).
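
The -A and -B adapters above can be related to the forward and reverse adapters first given to Skewer by taking reverse complements. The following is a minimal sketch of that check, not part of the original pipeline; the function and constant names are made up for illustration, and the sequences are copied from this page.

```python
# Complement table for unambiguous DNA bases (plus N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Adapters passed to SeqPrep and to the second Skewer run (from this page).
ADAPTER_A = "AGATCGGAAGAGCACACGTCTGAACTCCAG"
ADAPTER_B = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAG"

# Adapters passed to the first Skewer run (from this page).
SKEWER_V1_FORWARD = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
SKEWER_V1_REVERSE = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

if __name__ == "__main__":
    for name, adapter, first_run in [
        ("-A", ADAPTER_A, SKEWER_V1_REVERSE),
        ("-B", ADAPTER_B, SKEWER_V1_FORWARD),
    ]:
        rc = reverse_complement(adapter)
        # True when the first-run adapter ends with the reverse complement.
        print(name, rc, first_run.endswith(rc))
```

Both comparisons come out true: each adapter from the first Skewer run ends with the reverse complement of one of the -A/-B sequences. That would be consistent with the first run having been given the adapters in the opposite orientation, though this page does not state the cause.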
===FastQC on Seqprep, Fastuniq, Musket files===

SeqPrep adapter-removed files were run through FastUniq to remove PCR duplicates, then through Musket for error correction, and lastly FastQC for analysis.

{{::bs-mk_seqprep_dupremoved_ec_r1.pdf|}}

{{::bs-mk_seqprep_dupremoved_ec_r2.pdf|}}

Files are located here:

/campusdata/BME235/S15_assemblies/SOAPdenovo2/errorCorrectionTask/musket_run/pairedEndEC_k31_minmult3/BS-MK_seqprep_dupRemoved_ec_R1.fastq

/campusdata/BME235/S15_assemblies/SOAPdenovo2/errorCorrectionTask/musket_run/pairedEndEC_k31_minmult3/BS-MK_seqprep_dupRemoved_ec_R2.fastq
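
As a cross-check on the FastQC reports, the fraction of reads containing a full adapter sequence can be counted directly, in the spirit of the grep search on the raw data reported earlier on this page. The sketch below is an illustration only: the helper names are hypothetical, it assumes four-line FASTQ records (no wrapped sequence lines), and this page does not record which adapter string the original grep used, so the adapter in the demo is just one of the sequences listed above.

```python
import gzip
from typing import Iterable, Iterator, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, handling .gz files transparently."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def adapter_fraction(lines: Iterable[str], adapter: str) -> float:
    """Return the fraction of reads that contain the full adapter."""
    total = 0
    hits = 0
    for seq in read_sequences(lines):
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAG"
    # Tiny in-memory example: two records, one containing the adapter.
    demo = [
        "@read1", "ACGTACGT" + adapter, "+", "I" * (8 + len(adapter)),
        "@read2", "ACGTACGTACGT", "+", "I" * 12,
    ]
    print(f"{100 * adapter_fraction(demo, adapter):.2f}% of reads contain the adapter")
```

For a real file, the same function can be applied to an open handle, for example `with open_fastq(path) as fh: adapter_fraction(fh, adapter)`. Only exact, full-length matches are counted, so partial adapters at read ends are missed.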