===== Sequencing data =====

| Library | Run | Location | Notes |
| SW042 | | /campusdata/BME235/Spring2015Data/ | Mate pair library. Expected insert size is 5-6 kb. |

===== Files =====

| File | Size | Reads |
| SW042.r1.trimmed.fastq | 152M | 1,036,699 |
| SW042.r2.trimmed.fastq | 153M | 1,086,654 |
| Matepair_trimmed/skewer_run2_SW042_1-trimmed-pair1.fastq | 145M | 735,906 |
| Matepair_trimmed/skewer_run2_SW042_1-trimmed-pair2.fastq | 140M | 735,906 |
| Matepair_dupRemoved/skewer_42_dupRemoved_R1.fastq | 129M | 657,883 |
| Matepair_dupRemoved/skewer_42_dupRemoved_R2.fastq | 125M | 657,883 |

Note: Duplicates, concatemers, and linkers have already been removed in the “trimmed” files.

===== FastQC analysis =====

FastQC flags several summary statistics as potentially unusual, including the per-base sequence content and the k-mer content.

{{:sw042.r1.trimmed.fastq_fastqc_report.pdf| FastQC results for SW042.r1.trimmed.fastq}}

{{:sw042.r2.trimmed.fastq_fastqc_report.pdf| FastQC results for SW042.r2.trimmed.fastq}}

===== PreQC analysis =====

Run on SW042.r1.trimmed.fastq and SW042.r2.trimmed.fastq.

{{:preqc_report_sw042.pdf| PreQC for SW042}}

===== Insert size distribution =====

The distribution of insert sizes for inward-facing, outward-facing, and same-strand read pairs is shown below. Mate pairs should be outward facing.

{{:sw042_insert_size_distribution.jpg?200|}}

To generate this distribution, the mate pairs were mapped to all of the SOAPdenovo "run 1" contigs using bwa. The orientation of each read pair was then pulled from the resulting SAM file using a script from the Green lab.
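
The Green lab script itself is not reproduced on this page. As a rough illustration of how orientation can be read off a SAM file, the sketch below classifies each pair from the standard SAM FLAG bits and mate positions. The function names ''classify_pair'' and ''summarize'' are hypothetical, and the rules used here (only read-1 primary records, both mates on the same contig, insert size taken as the absolute TLEN) are assumptions, not necessarily what the original script did.

```python
from collections import Counter


def classify_pair(sam_line):
    """Return (orientation, insert_size) for a usable read-1 record, else None.

    Assumption: orientation is judged from the strand bits (0x10, 0x20)
    and the leftmost mapping positions of the read and its mate.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, pos = fields[2], int(fields[3])
    rnext, pnext, tlen = fields[6], int(fields[7]), int(fields[8])

    # Use each pair once: paired (0x1) and first in pair (0x40).
    if not flag & 0x1 or not flag & 0x40:
        return None
    # Skip unmapped read/mate, secondary, and supplementary records.
    if flag & (0x4 | 0x8 | 0x100 | 0x800):
        return None
    # Both mates must map to the same contig.
    if rnext not in ("=", rname):
        return None

    read_rev = bool(flag & 0x10)
    mate_rev = bool(flag & 0x20)
    if read_rev == mate_rev:
        return "same_strand", abs(tlen)

    # Inward: the forward-strand read lies upstream of the reverse-strand read.
    fwd_pos = pnext if read_rev else pos
    rev_pos = pos if read_rev else pnext
    orientation = "inward" if fwd_pos <= rev_pos else "outward"
    return orientation, abs(tlen)


def summarize(sam_lines):
    """Count pairs per orientation and collect their insert sizes."""
    counts = Counter()
    sizes = {}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        result = classify_pair(line)
        if result is None:
            continue
        orientation, size = result
        counts[orientation] += 1
        sizes.setdefault(orientation, []).append(size)
    return counts, sizes
```

Calling ''summarize(open("aln.sam"))'' (the file name is a placeholder) returns the per-orientation counts and the insert sizes that could then be plotted as histograms. Contigs shorter than the expected 5-6 kb insert cannot contain both mates, so pairs spanning two contigs are dropped by this sketch.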
===== Pre-processing =====

=== Adapter Removal with Skewer ===

Skewer was run in mate-pair mode with the junction sequence CTGTCTCTTATACACATCTAGATGTGTATAAGAGACAG:

  skewer-0.1.123-linux-x86_64 \
    -x CTGTCTCTTATACACATCTAGATGTGTATAAGAGACAG \
    -y CTGTCTCTTATACACATCTAGATGTGTATAAGAGACAG \
    -m mp \
    -j CTGTCTCTTATACACATCTAGATGTGTATAAGAGACAG \
    -t 32 \
    -o ${OUTDIR} \
    /campusdata/BME235/Spring2015Data/SW042.r1.trimmed.fastq \
    /campusdata/BME235/Spring2015Data/SW042.r2.trimmed.fastq

The adapter sequences were taken from the Illumina technical note [[http://www.illumina.com/documents/products/technotes/technote_nextera_matepair_data_processing.pdf| Nextera Mate Pair Kit]].

Files located here:

  /campusdata/BME235/Spring2015Data/Matepair_trimmed/skewer_run2_SW042_1-trimmed-pair1.fastq
  /campusdata/BME235/Spring2015Data/Matepair_trimmed/skewer_run2_SW042_1-trimmed-pair2.fastq

**FastQC results**

{{::skewer_run2_sw042_1-pair1fastq.pdf|}}

{{::skewer_run2_sw042_1-pair2fastq.pdf|}}

=== FastUniq to remove duplicates ===

FastUniq was run to remove any duplicates still remaining after adapter removal.

Files located here:

  /campusdata/BME235/Spring2015Data/Matepair_dupRemoved/skewer_42_dupRemoved_R1.fastq
  /campusdata/BME235/Spring2015Data/Matepair_dupRemoved/skewer_42_dupRemoved_R2.fastq

**FastQC results**

{{::fastqc_skewer_42_dupremoved_r1.pdf|}}

{{::fastqc_skewer_42_dupremoved_r2.pdf|}}
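
To make the duplicate-removal step concrete, here is a minimal sketch of paired-read deduplication. It is not FastUniq's implementation: it assumes a pair is a duplicate only when both read sequences match an earlier pair exactly, and it keeps the first occurrence. The helper names ''read_fastq'' and ''dedup_pairs'' are hypothetical.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    lines = iter(handle)
    while True:
        record = [line.rstrip("\n") for line in islice(lines, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def dedup_pairs(r1_handle, r2_handle):
    """Yield (read1, read2) records, skipping pairs whose sequences were seen.

    Assumption: exact sequence identity of both mates defines a duplicate.
    """
    seen = set()
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = (rec1[1], rec2[1])  # sequences of both mates
        if key in seen:
            continue
        seen.add(key)
        yield rec1, rec2
```

Keying on both mates together means that two pairs sharing only one identical read are both kept. The ''seen'' set holds every distinct pair in memory, which is workable at the scale of the files listed here (under a million pairs) but would need rethinking for much larger libraries.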