
archive:jolespin_virus


Created contig using these paired-end reads: <code> /campusdata/BME235/Spring2015Data/SW019_S2_L008_R1_001.fastq /campusdata/BME235/Spring2015Data/SW019_S2_L008_R2_001.fastq </code> Original contig sequence: <code> >variant-5304/0 4716 TCCACCGCCTTCGTCCAGTGGAGAATTCGTCGATTCAGAACAAGCCAAGAGAAGAAAGACAGATCCTCCTCCTACAACAT CAACACCAGAGCCGGGTACAGGAACCAGACACGGACTACGTAGTGGAACACCTTTGGTTACTCCGGTTAAACCAAACACA CCAGCAGCTCCAACAGCTGGGCCTAGTACCAGAACACCACAAAATACACCAGCCGGCTCACCAATGGCAGCATCAGTACC GGCGGCAAACATGGACACCAGTGGTGCCCCAGGAGGTGATGTAATGCCAGCAGGAGAGGCAGATGCGGCTGGATATCCAG TACCTGCAGGCATGGCAGGTTCAGGCGGTAATAGGTTTTTCACAGGATTTGGATCCCATACTCAAAAAGAACCAGATGGT TACAGCTCAGTAACCAGGTCCTACAGCAAGACCTTTCTTGTGCACACCAACTTTGACGGCACTCTAAAAGCACTGATCAA TATGGAACCTGGGGTCGCAGTACCAACAGGGATTACCAGTTCAGCAAAAGCAGAATGCATTGTGAACCATGGAGGGGTTA TGATTCCATACATGTATCAAAATTGCTCTACAGACCCTTGGGACTGGAATGTGCCTGATCACTTCATGGGCTGGCAATGC CAGGAGTATGGCTTTAAGGTGGCAGAAGCCCGCATAGAGACCCTAAACAATGACAAGCCCACACCAGAATCGGTGCCCGG GCCACCGCCCCCTAGAGCCAGAATGTGGGCTTTTGTGGATGTTGACAACGATTATGGGCTGGATGACCACAGTGGGATCC TCCAGCACAGTGACTTTTTCAGGGATCAAAATGCCCACAGCCCCAATGCCAATACGCAAGCTAAACTGCCTAACCAGCCA GACAGAAAATTTTTACTGGACAAGCAAGCTGCACAGCAATCATTGTGCCAAGCCTTTACTAGAGCTTCTGGAGCAGGCAC TGAGTCATTTGCTACTTATGAACCGAACTACGTGTGGGACATGTTCAAGTCAGATGGTTATGAAGAGTTTCAACTATGCG ACGGTGACATGCAGCTGGTCTACAAATACAACGGTCCGGTACAACGTTTTCAACACAACCATGACTCCCTTGCACTAGAC TGCTCAGGATTTACACCAGCCATGAACCTTCAAGATGATTTCTCCAACCTAGATGCTTTTGAGACTCAGCTTTGGCCAGG AGCCAGAGAGCTACAAAGGCCATCAGGAACAGAGACCAGCGGTCAGCTAGGCATGATCAACTATGTCAACCACAATTTTG TAAAGGACAGATTTAAAGTGGTGGATACTGACCCACTGTCCACAGTAGAACAGGACTACATAAACCAGACTGGGCAGATG CCTCTAATAGCCAACATGGGCACCAATCACTTGGGCACTAGAACTGCTACTAATCCCAACATGTCAGATGATAACAAAAC ATGGAAACAAACCTTCTCAAAAAGGCCCCCCATCTACATGTTTGGGGTTCACAAGGAGTGGGAATTTCAAACCACCGGTA CACACCCTTACAGGTACTATTTTTGTGCACGGGTGACCTACCACAGCAAAGTCAAGTTTCTAATTAACCAAAGAGGGTGG AAGCCAGTCATTCACGCAGGCTTTGGACTATATTCAGATTACAGCCCAATAAACAATCTTAACTGTGTATCCGTTCCAAG 
GGCACAAGGAGAAGTTGCCCGCGCAAATAGACGTACCAAGGGCCACAAAATGATTGGCACAGGCAGATTTACCGCCAGTT CTGGTCTATAAAAGCTCAGGGTAAGTAGAGTGCTAGTTATGTTCACTGCCGTTTTCAAGATGTTGAGGTCTTATTTAGCT CGTGATGAAGTTAGGTACCACATGGTTACTGTGGGTCACTGTAGTAGACCAATTATTCACCCCTCATACTTTGTACCGAG GGCTTTGGCCAAGGACAAGTCTGGAAATAAAGTTTCAGCATCAGCAGGTTATGACAAGAGCTACGAACTGGACCACTACT CTGTCTATGAGGAACTGGTAGCAGTTTGGAACACTCCCAATGTTCACTTGGCGGCTATGGATGCCAAACACAAACAAGTG ATCAATAACATTAAAGGCTTTACCAGAGATGAGAGATTCAAGGTGTTAATGGTGTTTCATGACAAGAGTTGTGCCAAACA GTACAAGTCAAATGGCAATCACCTTCATTTGGTGATAAAGACATTAGTACCGGTAATGAGTTCTGACAACAAATATAGAG CCATGATGAGAAGCATGAGCGGCATAGGTGGATACTGTAACACGGCTTTGCTAAAAGGTGACCGTTCATCTTTCCTGAGT TACCTGGCTTCTGACCCTGAGAAAATGTTTCTAGGGTGTCAAGATGCTGACCTTCTACAAGAGTTTAAAGATGCTGAAAA CTTTTCTGGTACGATTAAAGACTGGTTACTAGAAGATCAAAGTGACAAAACCAGTGCCATTAGAAGCTGGTCAGATGCCC TGCCTGTGCCTTCAGATGTGCTCGTTCCTTGTGATTTAAATGTAGCCAGCACCAGCGATACACAGATCCCTAAACACATG ACCAGTGAAAAGGCATCGGACACTGTGAAATTTCTATATGATGAACTGAAAAAGTTCCCTAATGCCCGGTCACTGACAGA CCTCATGGGCATGTATGGGGGCTGGACCCCAGTCTGGAGTGCCTTATGTAATGTTGGGGCCACTCAGGCTGGTAAGAATG CCTTTAACATGGCTTTACAAACAATACTGTTGGAGGCTAGCAAGATGACCCCACTAGCTACATGTGCAGAACTACAGGAT TCTATTGTCGGCTACATGACCCCCAGGCATTCAGTAGCCATGTTGAACGCTTGGTGCATTGAACAGGGCATATCTCCTCG CAAATGGTTTGCCTCTATGCACATGCTACTGTCCGGTAAAGGCAAAAAGAGAGTGGGAATTTATATGCAAGGTGAAGCCA ATTCAGGAAAGACTATGATCACCAATTCCTGCTTTGATTGTTTGAATGACATTGTTGGAAAAATGACCAAAGATGGCTTC CCCTTTCAACAACTTGGAAATAAGAGGATAGTTATTGGGGAAGAAGTGGCCATTACTACATCTAATTTGGAAAAGTTCAA AGACCTCATGTCTGGTGGCAATGTGACCTGCGAGCGAAAGTGTACCACCCCACAGTATTGTAAGCCCAATTTGGTTCTGT TGAACTCTAATGTAACCATCAAGGCTAACCTGTCTCAACATGAAGTGGTGATACTGAAAACCAGACTGTACCTGTTCGAA AACCTGAAGCGATCAGCTGTTATCAACAGCTGTTACGGACTCATTCACCCCAAGGCATATGCCCTGTGTGAGGGCATTAC CGATGATGATTACGCTGCCCTGATTTCCAATGAAACAGACCACTGGACAATGGACCCAGTTGAGATTCAAGGGTCCACTG ATGTGTTTGAAGATGTTTGGGATACGATCCCAAAGGATTATGAGGGTCCACCCTTGACCCCCATTTGTAATCAAATGGAC GTAGGGGAGATCCCCTGCTCACAGAAAAGATTTCGGCGTCGCCCATCTGATTTTGTTCATGAACCAGACTGGTTGCCTCA 
TGATGAATCCTGGCATCCAGATATGGAGACCCCTGTTACCAAGGTTCGCAAGTTCTTGGACTACCCTGCTGAAGACCTTC AGCACGATGAGCTTGTCCAGTCTGGTGATAAAGAGGTACAGTTTACAGATGAAGAAGTTTGTGATTTAATTGACTCTGAA ATTGAAGATGCCCTGCATTCACACTGTGCCATTTTTGTTTCAGAACTAGAGACCTCTGGGAATCACATTTACCATGAACC CATCGTCAACTTCAGAAGTGACGAGGACAGAGTCCCTTTCATTTCTAGATACGCCACGCTTTGGACAGCCGACCAGACTG ATTTTATTCTTGAAGAAGTGGACGAATTCACTTCTGGTTTTGATAACGCTGACGTCCCTTGCCTTCACTACAGGAGTAGA GCTCTCCCCCTTAACCAAAGGGGAACATTGCTTCAAGTTAACACCGTCAATGGCTCCATCACAAGACTGCTCGTCCCCCA ATTGCCAGACTTTCAAGGACGCCAACCAAAGTGCTTCATTTTTCAAGAACGGAGAAAGACTGCTTGTCCCTTTTCCATTT TGCCTCTGTACCCAGATGAATTCTACAGTGACAATACTTTTCTAATGATGTGCTACGCATATGTTATGCTGTGCACATAT GAACTTACACACGTGTACCCAGATTCACCTACAGAATACCCAGAAAGTCAAGAAATGACCGACATTGTACTGCCCAAAGA AGAAGATCCCATCAACACAAAGAACTGTTACTGGCAGGTCAGACAGAAGCTGAGACGCATCTTGGACGAAAAATATGTTG ATGAATATGAACTTTCATTCAAAAAAGTTTGGTCTTTTACTAGATTTGCTTGCCATCTTTGGATCAGTAATGATTTTTAG TGACATGACTTTTATATTTTCAGGATCCCTGACAGACAAAGACCATTAATATGCATTGCTTTTGTATTATTGTATTCTCA GAATTTCATTCAATAAAGTCCTTACAAAGGACACACAAAACCAATGTCATGAATGGCTTGTCCTTTTCCTCTCTGAGCCT TACAAGGCACCCTTTCTATACCTTTTGTGTGTGGGGGTAGGTCCTTTAAAGGGAAGGTACCACTTTTCCACATAAT </code> In this contig, I found this ORF <code> >variant-5304/0 4716|r0.611|2682|r0.611|894 
MFTAVFKMLRSYLARDEVRYHMVTVGHCSRPIIHPSYFVPRALAKDKSGNKVSASAGYDKSYELDHYSVYEELVAVWNTPNVHLAAMDAKHKQVINNIKGFTRDERFKVLMVFHDKSCAKQYKSNGNHLHLVIKTLVPVMSSDNKYRAMMRSMSGIGGYCNTALLKGDRSSFLSYLASDPEKMFLGCQDADLLQEFKDAENFSGTIKDWLLEDQSDKTSAIRSWSDALPVPSDVLVPCDLNVASTSDTQIPKHMTSEKASDTVKFLYDELKKFPNARSLTDLMGMYGGWTPVWSALCNVGATQAGKNAFNMALQTILLEASKMTPLATCAELQDSIVGYMTPRHSVAMLNAWCIEQGISPRKWFASMHMLLSGKGKKRVGIYMQGEANSGKTMITNSCFDCLNDIVGKMTKDGFPFQQLGNKRIVIGEEVAITTSNLEKFKDLMSGGNVTCERKCTTPQYCKPNLVLLNSNVTIKANLSQHEVVILKTRLYLFENLKRSAVINSCYGLIHPKAYALCEGITDDDYAALISNETDHWTMDPVEIQGSTDVFEDVWDTIPKDYEGPPLTPICNQMDVGEIPCSQKRFRRRPSDFVHEPDWLPHDESWHPDMETPVTKVRKFLDYPAEDLQHDELVQSGDKEVQFTDEEVCDLIDSEIEDALHSHCAIFVSELETSGNHIYHEPIVNFRSDEDRVPFISRYATLWTADQTDFILEEVDEFTSGFDNADVPCLHYRSRALPLNQRGTLLQVNTVNGSITRLLVPQLPDFQGRQPKCFIFQERRKTACPFSILPLYPDEFYSDNTFLMMCYAYVMLCTYELTHVYPDSPTEYPESQEMTDIVLPKEEDPINTKNCYWQVRQKLRRILDEKYVDEYELSFKKVWSFTRFACHLWISNDF*
</code>

I ran this in BLASTp and it matched several parvovirus NS1 proteins:

| Alignment | Max Score | Total Score | Query Coverage | E value | Identity | Accession |
|NS1 [Raccoon dog amdo ]|60.1|60.1|17%|7e-06|28%|AID57418.1|
|nonstructural protein [Turkey parvo 1078]|57.4|57.4|17%|6e-05|28%|ACA28962.1|
|NS1 [Chicken parvo ]|57.0|57.0|17%|7e-05|28%|AJB28744.1|

I chose to look at ACA28962.1. Region 360:519 of my input sequence mapped with high probability to this protein (and to the others as well).
The protein sequence for this region is:

<code>
PRKWFASMHMLLSGKGKKRVGIYMQGEANSGKTMITNSCFDCLNDIVGKMTKDGFPFQQLGNKRIVIGEEVAITTSNLEKFKDLMSGGNVTCERKCTTPQYCKPNLVLLNSNVTIKANLSQHEVVILKTRLYLFENLKRSAVINSCYGLIHPKAYALCEG
</code>

I extracted the corresponding DNA from the contig:

<code>
>variant-5304/0 ACA28962.1|360:519aa|mult3original|extract
CGCAAATGGTTTGCCTCTATGCACATGCTACTGTCCGGTAAAGGCAAAAAGAGAGTGGGAATTTATATGCAAGGTGAAGCCAATTCAGGAAAGACTATGATCACCAATTCCTGCTTTGATTGTTTGAATGACATTGTTGGAAAAATGACCAAAGATGGCTTCCCCTTTCAACAACTTGGAAATAAGAGGATAGTTATTGGGGAAGAAGTGGCCATTACTACATCTAATTTGGAAAAGTTCAAAGACCTCATGTCTGGTGGCAATGTGACCTGCGAGCGAAAGTGTACCACCCCACAGTATTGTAAGCCCAATTTGGTTCTGTTGAACTCTAATGTAACCATCAAGGCTAACCTGTCTCAACATGAAGTGGTGATACTGAAAACCAGACTGTACCTGTTCGAAAACCTGAAGCGATCAGCTGTTATCAACAGCTGTTACGGACTCATTCACCCCAAGGCATATGCCCTGTGTGAGGGC
</code>

I just ran PriceTI with the following command:

<code>
/afs/cats.ucsc.edu/users/b/jolespin/PriceTI/PriceTI \
  -fpp /campusdata/BME235/Spring2015Data/SW019_S2_L008_R1_001.fastq /campusdata/BME235/Spring2015Data/SW019_S2_L008_R2_001.fastq 100 95 \
  -icf /campusdata/BME235/virus/virus_seed.fa 1 1 5 \
  -nc 30 -dbmax 72 -mol 30 -tol 20 -mpi 80 \
  -target 90 2 1 1 \
  -o virus_assembly.fa -a 32
</code>

I'll update accordingly. -jolespin
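Mapping a protein region back onto the ORF DNA, as done above for residues 360:519, comes down to coordinate arithmetic: residue n (1-based) occupies nucleotides 3(n-1) to 3n-1 (0-based) of the ORF. The sketch below is one way to do that mapping and to check the extract by translating it back. It assumes a forward-strand, in-frame ORF and the standard genetic code; the function names `translate` and `protein_region_to_dna` are hypothetical helpers, not part of any tool used on this page. It may be worth running such a check on the extract above: that DNA begins with CGC (R), while the protein region begins with P, so the extract appears to start one codon late, which is what a 0-based/1-based mix-up in the start coordinate would produce.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate forward-strand, in-frame DNA; unrecognized codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def protein_region_to_dna(orf_dna, aa_start, aa_end):
    """Return the codons for residues aa_start..aa_end (1-based, inclusive).

    Assumes orf_dna begins at the first base of the ORF's first codon.
    """
    if not 1 <= aa_start <= aa_end:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    return orf_dna[(aa_start - 1) * 3:aa_end * 3]


# First 17 codons of the ORF, copied from the contig on this page.
orf_start = "ATGTTCACTGCCGTTTTCAAGATGTTGAGGTCTTATTTAGCTCGTGATGAA"

print(translate(orf_start))               # MFTAVFKMLRSYLARDE
region = protein_region_to_dna(orf_start, 2, 4)
print(region, translate(region))          # TTCACTGCC FTA
```

With the full ORF DNA in hand, `protein_region_to_dna(orf_dna, 360, 519)` followed by `translate` should reproduce the 160-residue protein region beginning with PRKW; if the translation begins with RKW instead, the start coordinate is off by one codon.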

archive/jolespin_virus.1432263517.txt.gz · Last modified: 2015/05/22 02:58 by jolespin