ID   FJ973561; SV 2; linear; genomic DNA; STD; VRL; 5104 BP.
XX
AC   FJ973561;
XX
DT   03-JUN-2009 (Rel. 101, Created)
DT   23-JUN-2014 (Rel. 121, Last updated, Version 4)
XX
DE   Human bocavirus 4 NI strain HBoV4-NI-385, complete genome.
XX
KW   .
XX
OS   Human bocavirus 4 NI
OC   Viruses; Parvoviridae; Parvovirinae; Bocaparvovirus;
OC   Primate bocaparvovirus 2.
XX
RN   [1]
RP   1-5104
RX   DOI; 10.1086/652416.
RX   PUBMED; 20415538.
RA   Kapoor A., Simmonds P., Slikas E., Li L., Bodhidatta L., Sethabutr O.,
RA   Triki H., Bahri O., Oderinde B.S., Baba M.M., Bukbuk D.N., Besser J.,
RA   Bartkus J., Delwart E.;
RT   "Human bocaviruses are highly diverse, dispersed, recombination prone, and
RT   prevalent in enteric infections";
RL   J. Infect. Dis. 201(11):1633-1643(2010).
XX
RN   [2]
RP   1-5104
RA   Kapoor A., Delwart E.;
RT   "Two new species of Human bocaviruses: complete genomes, molecular
RT   characterization and interspecies recombination";
RL   Unpublished.
XX
RN   [3]
RP   1-5104
RA   Kapoor A., Slikas B., Simmonds P., Mason C., Sethabutr O., Triki H.,
RA   Oderinde S., Mandu B., Nadeba B., Bartkus J., Delwart E.;
RT   ;
RL   Submitted (29-APR-2009) to the INSDC.
RL   Molecular Virology, Blood Systems Research Institute, 270, Masonic Avenue,
RL   San Francisco, CA 94118, USA
XX
RN   [4]
RC   Sequence update by submitter
RP   1-5104
RA   Kapoor A., Slikas B., Simmonds P., Mason C., Sethabutr O., Triki H.,
RA   Oderinde S., Mandu B., Nadeba B., Bartkus J., Delwart E.;
RT   ;
RL   Submitted (18-FEB-2010) to the INSDC.
RL   Molecular Virology, Blood Systems Research Institute, 270, Masonic Avenue,
RL   San Francisco, CA 94118, USA
XX
DR   MD5; 5bbe7fa08d7348c674992492b18c5ce6.
DR   EuropePMC; PMC2902747; 20415538.
DR   EuropePMC; PMC3078135; 21525999.
DR   EuropePMC; PMC3125170; 21738642.
DR   EuropePMC; PMC3487788; 23133667.
DR   EuropePMC; PMC3837656; 24209884.
DR   EuropePMC; PMC3838256; 24109231.
DR   EuropePMC; PMC5780762; 22531100.
DR   EuropePMC; PMC5861819; 29352084.
DR   EuropePMC; PMC6360332; 30766894.
XX
CC   On Feb 18, 2010 this sequence version replaced gi:237690172.
CC   Genome sequence lacks part of non-coding region.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..5104
FT                   /organism="Human bocavirus 4 NI"
FT                   /host="Homo sapiens"
FT                   /strain="HBoV4-NI-385"
FT                   /mol_type="genomic DNA"
FT                   /country="Nigeria"
FT                   /isolation_source="stool"
FT                   /collection_date="10-Apr-2007"
FT                   /db_xref="taxon:1511883"
FT   gene            162..2560
FT                   /gene="NS2"
FT   CDS             join(162..2073,2142..2560)
FT                   /codon_start=1
FT                   /gene="NS2"
FT                   /product="non structural protein NS2"
FT                   /db_xref="GOA:C5IY43"
FT                   /db_xref="InterPro:IPR001257"
FT                   /db_xref="InterPro:IPR014015"
FT                   /db_xref="InterPro:IPR027417"
FT                   /db_xref="UniProtKB/Swiss-Prot:C5IY43"
FT                   /protein_id="ACR15779.1"
FT                   /translation="MAFSAPVLRAFSQPTFTYVIKFPYNNWKEDEHLLWSLLAPGTESL
FT                   MIQLKNCAPHPEDDPIREDILCSLADLHYGAVFAKACYIATSTLMGQKQRTLFPRCDIV
FT                   CQSEIGSDFLHCHILVGGAGLSKRNAKISRATLLGLVMAELTQRCKLLLAHRPFEPAEA
FT                   TIYHELKRIEREAWSGHTGNWVQILQYKDKRGDLHAQPIDPLRFLKHYILPKNRLISPS
FT                   SKPDVCTSPDNWFILADKTYSHTIINGLPLLERNRKAYLQELESEVIPGPSAMAFGGRG
FT                   AWEQLPEVGEQRLITSNTSTAYKANKKEKLMLNLLDKCDELNLLVYEDLVSACPDLLLM
FT                   LEGQPGGARLIEQVLGMHHIKVCAKHTALSFLFHLHPDQLLTSSNKALKLLLIQGYNPL
FT                   QVGHAICCVLNKQMGKQNTICFYGPASTGKTNFAKAIVQGVRLYGCVNHLNKGFVFNDC
FT                   RQRLIIWWEECLMHQDWVEPAKCILGGTECRIDVKHKDSVLLQQTPVIISTNHDIYSVV
FT                   GGNTVSHVHAAPLKERVLQLNFMKQLPQTFGEISPSEIAELLQWCFNEYDCTLAGFKQK
FT                   WNLDKVPNSFPIGDLCPTHSQDFTLHENGFCSDCGGYLPHSADDSVYTDVASETTSGDY
FT                   DPGNLGDTDGEDSKSEASEVDYCPPKKRRVISATPPNSPVSGPSLSTFLDTWQSQPRDD
FT                   DELRIYEEQASQFQKNTKSTSEREEAQLGESQEPQPEPDPTAWGEKLGVCSSQQPGQPP
FT                   IVLYCFEDLRPSDEDEGENIGGD"
FT   gene            162..2084
FT                   /gene="NS1"
FT   CDS             162..2084
FT                   /codon_start=1
FT                   /gene="NS1"
FT                   /product="non structural protein NS1"
FT                   /db_xref="GOA:C5IY43"
FT                   /db_xref="InterPro:IPR001257"
FT                   /db_xref="InterPro:IPR014015"
FT                   /db_xref="InterPro:IPR027417"
FT                   /db_xref="UniProtKB/Swiss-Prot:C5IY43"
FT                   /protein_id="ACR15778.1"
FT                   /translation="MAFSAPVLRAFSQPTFTYVIKFPYNNWKEDEHLLWSLLAPGTESL
FT                   MIQLKNCAPHPEDDPIREDILCSLADLHYGAVFAKACYIATSTLMGQKQRTLFPRCDIV
FT                   CQSEIGSDFLHCHILVGGAGLSKRNAKISRATLLGLVMAELTQRCKLLLAHRPFEPAEA
FT                   TIYHELKRIEREAWSGHTGNWVQILQYKDKRGDLHAQPIDPLRFLKHYILPKNRLISPS
FT                   SKPDVCTSPDNWFILADKTYSHTIINGLPLLERNRKAYLQELESEVIPGPSAMAFGGRG
FT                   AWEQLPEVGEQRLITSNTSTAYKANKKEKLMLNLLDKCDELNLLVYEDLVSACPDLLLM
FT                   LEGQPGGARLIEQVLGMHHIKVCAKHTALSFLFHLHPDQLLTSSNKALKLLLIQGYNPL
FT                   QVGHAICCVLNKQMGKQNTICFYGPASTGKTNFAKAIVQGVRLYGCVNHLNKGFVFNDC
FT                   RQRLIIWWEECLMHQDWVEPAKCILGGTECRIDVKHKDSVLLQQTPVIISTNHDIYSVV
FT                   GGNTVSHVHAAPLKERVLQLNFMKQLPQTFGEISPSEIAELLQWCFNEYDCTLAGFKQK
FT                   WNLDKVPNSFPIGDLCPTHSQDFTLHENGFCSDCGGYLPHSADDSVYTDVASETTSGDY
FT                   DPGRL"
FT   gene            2313..2957
FT                   /gene="NP1"
FT   CDS             2313..2957
FT                   /codon_start=1
FT                   /gene="NP1"
FT                   /product="NP1"
FT                   /db_xref="GOA:C5IY45"
FT                   /db_xref="InterPro:IPR021075"
FT                   /db_xref="UniProtKB/Swiss-Prot:C5IY45"
FT                   /protein_id="ACR15780.1"
FT                   /translation="MSSESTKNRHRSSKRTPSPLQKERKRNWENRKSRSRSPIRRHGEK
FT                   NLEYAHHNNQDNRQSSYTASKTSDQAMKTKEKTSGGTRTNPYTVFSQHRANHSNAPGWC
FT                   GFYWHSTRLARNGTNNIFNEMKQKFQELQIDGKISWDTTRELLFTQKKTLDQGYRNMLY
FT                   HFRHSPDCPRCDYWDDVYRKHLANVSSQESEEVTDEEMLSAVESMETNASN"
FT   gene            2944..4956
FT                   /gene="VP1"
FT   CDS             2944..4956
FT                   /codon_start=1
FT                   /gene="VP1"
FT                   /product="capsid protein VP1"
FT                   /db_xref="GOA:C5IY46"
FT                   /db_xref="InterPro:IPR001403"
FT                   /db_xref="InterPro:IPR013607"
FT                   /db_xref="InterPro:IPR016184"
FT                   /db_xref="InterPro:IPR036952"
FT                   /db_xref="PDB:5US9"
FT                   /db_xref="UniProtKB/Swiss-Prot:C5IY46"
FT                   /protein_id="ACR15781.1"
FT                   /translation="MPPIKRQPGGWVLPGYKYLGPFNPLENGEPVNKADRAAQAHDKSY
FT                   SELIKSGKNPYLYFNKADEKFIDDLKDDWSLGGIIGSSFFKLKRAVAPALGNKERAQKR
FT                   HFYFANSNKGAKKTKNNEPKPGTSKMSENEIQDQQPSDSMDGQRGGGGGATGSVGGGKG
FT                   SGVGISTGGWVGGSYFTDSYVITKNTRQFLVKIQNNHQYKTELISPSTSQGKSQRCVST
FT                   PWSYFNFNQYSSHFSPQDWQRLTNEYKRFRPKGMHVKIYNLQIKQILSNGADTTYNNDL
FT                   TAGVHIFCDGEHAYPNATHPWDEDVMPELPYQTWYLFQYGYIPVIHELAEMEDSNAVEK
FT                   AICLQIPFFMLENSDHEVLRTGESTEFTFNFDCEWINNERAYIPPGLMFNPLVPTRRAQ
FT                   YIRRNNNPQTAESTSRIAPYAKPTSWMTGPGLLSAQRVGPATSDTGAWMVAVKPENASI
FT                   DTGMSGIGSGFDPPQGSLAPTNLEYKIQWYQTPQGTNNNGNIISNQPLSMLRDQALFRG
FT                   NQTTYNLCSDVWMFPNQIWDRYPITRENPIWCKKPRSDKHTTIDPFDGSLAMDHPPGTI
FT                   FIKMAKIPVPSNNNADSYLNIYCTGQVSCEIVWEVERYATKNWRPERRHTTFGLGIGGA
FT                   DNLNPTYHVDKNGTYIQPTTWDMCFPVKTNINKVL"
FT   gene            3331..4956
FT                   /gene="VP2"
FT   CDS             3331..4956
FT                   /codon_start=1
FT                   /gene="VP2"
FT                   /product="capsid protein VP2"
FT                   /db_xref="GOA:C5IY46"
FT                   /db_xref="InterPro:IPR001403"
FT                   /db_xref="InterPro:IPR013607"
FT                   /db_xref="InterPro:IPR016184"
FT                   /db_xref="InterPro:IPR036952"
FT                   /db_xref="PDB:5US9"
FT                   /db_xref="UniProtKB/Swiss-Prot:C5IY46"
FT                   /protein_id="ACR15782.1"
FT                   /translation="MSENEIQDQQPSDSMDGQRGGGGGATGSVGGGKGSGVGISTGGWV
FT                   GGSYFTDSYVITKNTRQFLVKIQNNHQYKTELISPSTSQGKSQRCVSTPWSYFNFNQYS
FT                   SHFSPQDWQRLTNEYKRFRPKGMHVKIYNLQIKQILSNGADTTYNNDLTAGVHIFCDGE
FT                   HAYPNATHPWDEDVMPELPYQTWYLFQYGYIPVIHELAEMEDSNAVEKAICLQIPFFML
FT                   ENSDHEVLRTGESTEFTFNFDCEWINNERAYIPPGLMFNPLVPTRRAQYIRRNNNPQTA
FT                   ESTSRIAPYAKPTSWMTGPGLLSAQRVGPATSDTGAWMVAVKPENASIDTGMSGIGSGF
FT                   DPPQGSLAPTNLEYKIQWYQTPQGTNNNGNIISNQPLSMLRDQALFRGNQTTYNLCSDV
FT                   WMFPNQIWDRYPITRENPIWCKKPRSDKHTTIDPFDGSLAMDHPPGTIFIKMAKIPVPS
FT                   NNNADSYLNIYCTGQVSCEIVWEVERYATKNWRPERRHTTFGLGIGGADNLNPTYHVDK
FT                   NGTYIQPTTWDMCFPVKTNINKVL"
XX
SQ   Sequence 5104 BP; 1700 A; 1077 C; 987 G; 1340 T; 0 other;
     ggtgatcata aacacgccca ggaagtgacg tatgacagcc aatcagcatt gagcatatag        60
     cctatataaa ccgatgcact tccgcatctc gtcagactgc atccggtctc cggcgagtga       120
     acatctctgg aaagagctcc acgcttgtgg tgagtgacac tatggccttt tctgctcctg       180
     tacttagagc tttttctcaa cctactttta cctatgttat taaatttcca tataataact       240
     ggaaagaaga tgaacactta ctatggagct tacttgctcc tgggactgaa agtctcatga       300
     ttcaactaaa aaactgcgca ccacatcctg aagatgatcc tatcagggaa gatattttat       360
     gctcactagc agatctacac tatggtgctg tttttgccaa agcttgctac atagctacat       420
     ctacactaat ggggcagaaa caaagaacac tctttccacg ctgcgacatt gtttgccagt       480
     ctgaaattgg ctcagacttt ctacactgtc acatacttgt tggaggagcc ggtcttagca       540
     agagaaatgc taaaatttca cgcgctacac ttttgggtct tgtgatggct gaactaacac       600
     aacgctgcaa gctacttctt gcacatcgtc catttgaacc agctgaagct actatctatc       660
     atgaacttaa acgcattgaa cgcgaagcat ggtcagggca tactggtaac tgggttcaga       720
     ttcttcaata caaagataaa cgaggtgatc ttcacgctca accaattgat cccttacgct       780
     ttctaaaaca ttacattcta ccaaaaaatc gattgatttc tccttccagc aaacctgacg       840
     tctgcacttc tccagataac tggtttattc tagctgacaa aacatactct cacactatta       900
     ttaatgggct tccgctgcta gaacgtaaca gaaaagccta tctacaagag ttagaaagtg       960
     aagtcatccc ggggccttct gccatggcct ttgggggacg tggtgcgtgg gaacaacttc      1020
     ctgaggtagg agaacaacgc ctaattactt ctaatacttc tactgcttat aaagctaaca      1080
     aaaaagaaaa attaatgtta aatttacttg ataaatgtga tgaacttaat ttgcttgtat      1140
     atgaagactt agttagtgct tgtcctgacc ttttactaat gcttgaagga caaccgggtg      1200
     gagcacgact aattgaacag gtgcttggca tgcaccatat taaagtatgt gctaaacata      1260
     ctgccttatc ttttttattt cacttacatc ctgatcaatt attaacttct agcaataaag      1320
     cactaaaact actattgatt caaggataca acccattaca agtagggcat gccatctgtt      1380
     gtgtacttaa caaacagatg ggcaagcaga acactatttg cttttatggc cctgcttcaa      1440
     caggcaaaac aaactttgca aaagctatag ttcagggcgt tcgcctttat ggctgtgtta      1500
     atcatttaaa caaggggttt gtttttaacg attgcagaca acgccttata atttggtggg      1560
     aagaatgttt aatgcatcaa gattgggtag aacctgctaa atgtatttta ggcggaactg      1620
     aatgtagaat tgatgttaaa cataaagata gtgttctcct tcaacaaaca ccagtaatca      1680
     tttccactaa ccatgacatc tactctgtag ttggtggcaa tactgtttct catgttcatg      1740
     cagcaccatt aaaagaacga gttcttcagc taaattttat gaaacaacta ccacaaacat      1800
     ttggagaaat ctctccaagt gaaattgcag aacttttgca atggtgcttt aatgagtacg      1860
     actgtactct tgctggcttt aaacaaaaat ggaacttaga caaagttcca aactcatttc      1920
     ctattggaga cctttgtcct acacattcac aggacttcac gcttcacgaa aacggattct      1980
     gctctgactg tggcggctat cttcctcata gcgctgacga ttctgtttac actgacgtgg      2040
     ctagtgaaac aaccagcggt gactacgacc caggtaggct ttaatacatt agctttacta      2100
     tttattactc ttgaagtttg cttatgtatt aactcctaca ggtaacctgg gggatacgga      2160
     cggagaggac tccaagtcag aagcatcgga agtggattat tgtccaccca agaaaaggcg      2220
     tgtgatttca gcaactccac caaacagtcc agtaagtggt ccaagccttt ctaccttttt      2280
     agatacttgg caatcacaac ctagagacga cgatgagctc agaatctacg aagaacaggc      2340
     atcgcagttc caaaagaaca ccaagtccac ttcagaaaga gaggaagcgc aactgggaga      2400
     atcgcaagag ccgcagccgg agcccgatcc gacggcatgg ggagaaaaac ttggagtatg      2460
     ctcatcacaa caaccaggac aaccgccaat cgtcctatac tgcttcgaag acctcagacc      2520
     aagcgatgaa gacgaaggag aaaacatcgg gggggactag aacaaatcct tatactgtat      2580
     tcagtcaaca cagggctaat cattcaaatg ctcctggctg gtgtgggttt tactggcatt      2640
     caactaggct tgctagaaat ggcactaata atatttttaa tgaaatgaaa caaaaatttc      2700
     aagaactaca aatagatggg aaaatcagtt gggatactac tagagaacta ttgtttactc      2760
     agaaaaaaac attagatcaa ggctacagaa acatgttgta ccactttaga cacagtcctg      2820
     attgtcctag atgtgattat tgggatgatg tttaccgtaa acacttagct aatgtctctt      2880
     cacaggaatc agaggaggtt acagacgaag aaatgctttc tgctgttgaa agcatggaaa      2940
     caaatgcctc caattaaacg ccaacctgga gggtgggtgc ttcctggtta taaatacctt      3000
     ggtccattta atcctcttga aaacggtgaa ccagttaata aagctgatcg tgctgctcaa      3060
     gctcatgata aatcatattc tgaactaata aaaagtggaa aaaatcctta cttatatttc      3120
     aataaagctg atgaaaaatt cattgacgat ttgaaagacg attggtctct tggtggcatt      3180
     attggctcaa gtttttttaa acttaaacgc gccgtggctc ctgctctagg aaataaagag      3240
     cgagctcaaa aaagacattt ctactttgca aactcaaata aaggtgctaa aaaaacaaaa      3300
     aacaacgaac ctaagccagg cacttcaaaa atgtctgaaa atgaaattca agaccaacaa      3360
     ccatcagatt ctatggatgg acaacgaggg ggcggaggag gtgcaactgg cagtgtggga      3420
     ggggggaaag gttctggtgt gggtatatcc acaggcggat gggtaggagg cagctatttt      3480
     actgactcat atgttataac aaaaaacacc agacaatttc tagttaaaat acaaaacaac      3540
     catcaataca aaacagaatt aatatcgcct tccacatctc aaggaaaatc acaaagatgc      3600
     gtcagcacgc cttggtctta ctttaacttt aatcaataca gcagtcattt ttcaccacaa      3660
     gactggcagc gattaacaaa cgaatataaa agattcagac ccaaaggcat gcatgttaaa      3720
     atatacaatt tacaaataaa acaaattctt tcaaatggtg ctgacactac atacaacaac      3780
     gacctcacag ctggtgttca cattttttgt gatggcgaac acgcatatcc aaatgcaaca      3840
     catccttggg atgaagacgt tatgccagag ctgccatacc aaacatggta tttgtttcaa      3900
     tatggatata ttccagttat acatgaactt gctgaaatgg aagactcaaa tgctgtagaa      3960
     aaagcaattt gcttacaaat accatttttt atgcttgaaa acagcgacca cgaagtttta      4020
     agaacaggtg aaagcacaga atttactttc aactttgact gtgaatggat aaacaatgaa      4080
     agagcataca ttcctccagg cttaatgttt aatccactag tacctactag aagagcacag      4140
     tacataagaa gaaacaacaa tcctcaaact gctgaaagca catccagaat tgctccatat      4200
     gcaaaaccta caagttggat gactggacca ggtttactca gtgcacaaag agtaggtcca      4260
     gctacttcag acacaggagc ctggatggtt gcagttaaac cagaaaacgc aagcattgac      4320
     acaggaatgt ctggaattgg aagtggattt gatccaccac aaggatcact agcaccaaca      4380
     aatctagaat acaaaatcca atggtaccaa acaccacaag gaacaaacaa caatggaaac      4440
     atcatatcta atcaaccact atctatgcta agagatcaag ctttatttag aggaaatcaa      4500
     acaacctata acctatgttc agatgtatgg atgtttccaa atcaaatttg ggacagatac      4560
     ccaataacca gagaaaatcc aatatggtgt aaaaaaccca gatcagacaa acacacaaca      4620
     attgatcctt ttgatggatc ccttgcaatg gatcatcctc caggcacaat ttttattaaa      4680
     atggcaaaaa ttccagttcc ttcaaacaac aatgcagact catacttaaa catttactgc      4740
     acagggcaag tcagctgtga aattgtctgg gaagttgaaa gatatgcaac aaagaactgg      4800
     agaccagaaa gaagacacac aacatttggt cttggaattg gaggagctga caacttaaat      4860
     ccaacctacc atgttgacaa aaacggaact tacattcaac caacaacatg ggacatgtgc      4920
     tttccagtta aaacaaacat caataaagtg ttgtaacctt ctaagcctct tttttgctta      4980
     tgcttataag ttcctctcca atggacaagt ggaaagaaaa gggtgactgt aatcccgagc      5040
     tcatgagttc gaggctacag tccgatggca gtggtgttgc cgtctcgaac ctagccgtta      5100
     cacc                                                                   5104
//