
Dbfetch Databases

Introduction

The databases available via dbfetch are listed below. Use the name in parentheses when:

  • Constructing a dbfetch URL (see Syntax).
  • Constructing an identifier file for upload (see Search Items).
  • Making a request via the web services (see WSDbfetch).
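As a rough illustration of how a database name from this page feeds into a request, the Python sketch below builds a dbfetch URL. The base URL and the query parameter names (`db`, `id`, `format`, `style`) are assumptions based on common dbfetch usage and are not defined on this page, so check the Syntax page before relying on them. The helper name `dbfetch_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for the dbfetch service; it is not stated on this page.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, ids, fmt="default", style="raw"):
    """Build a dbfetch URL for one or more identifiers from a single database.

    db    -- the short database name shown in parentheses, e.g. "uniprotkb"
    ids   -- one identifier or a list of identifiers
    fmt   -- one of the formats listed for that database
    style -- one of the styles listed for that format
    """
    if isinstance(ids, str):
        ids = [ids]
    # Parameter names are assumed; commas are kept literal for readability.
    query = urlencode(
        {"db": db, "id": ",".join(ids), "format": fmt, "style": style},
        safe=",",
    )
    return f"{DBFETCH_BASE}?{query}"


print(dbfetch_url("uniprotkb", ["P06213", "P29306"], fmt="fasta"))
```

The database name, format and style passed in should match one of the combinations listed in the tables on this page.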

An overview of each database follows. Each overview gives a short description, a link to the database, example identifiers, and the available data formats and result styles.

Databases

  1. EDAM (edam)
  2. ENA Coding (ena_coding)
  3. ENA Non-coding (ena_noncoding)
  4. ENA Sequence (ena_sequence)
  5. ENA Sequence Constructed (ena_sequence_con)
  6. ENA Sequence Constructed Expanded (ena_sequence_conexp)
  7. ENA/SVA (ena_sva)
  8. Ensembl Gene (ensemblgene)
  9. Ensembl Genomes Gene (ensemblgenomesgene)
  10. Ensembl Genomes Transcript (ensemblgenomestranscript)
  11. Ensembl Transcript (ensembltranscript)
  12. EPO Proteins (epo_prt)
  13. HGNC (hgnc)
  14. IMGT/HLA (imgthla)
  15. IMGT/LIGM-DB (imgtligm)
  16. InterPro (interpro)
  17. IPD-KIR (ipdkir)
  18. IPD-MHC (ipdmhc)
  19. IPRMC (iprmc)
  20. IPRMC UniParc (iprmcuniparc)
  21. JPO Proteins (jpo_prt)
  22. KIPO Proteins (kipo_prt)
  23. MEDLINE (medline)
  24. Patent DNA NRL1 (nrnl1)
  25. Patent DNA NRL2 (nrnl2)
  26. Patent Protein NRL1 (nrpl1)
  27. Patent Protein NRL2 (nrpl2)
  28. Patent Equivalents (patent_equivalents)
  29. PDB (pdb)
  30. RefSeq (nucleotide) (refseqn)
  31. RefSeq (protein) (refseqp)
  32. SGT (sgt)
  33. Taxonomy (taxonomy)
  34. Trace Archive (tracearchive)
  35. UniParc (uniparc)
  36. UniProtKB (uniprotkb)
  37. UniRef100 (uniref100)
  38. UniRef50 (uniref50)
  39. UniRef90 (uniref90)
  40. UniSave (unisave)
  41. USPTO Proteins (uspto_prt)

EDAM (edam)

http://edamontology.sourceforge.net/

EMBRACE Data and Methods (EDAM) Ontology.

Format    Styles    Example Identifiers
default default, html, raw Id: 0338, 1929, EDAM_operation:0338, EDAM_format:1929, 0000338, 0001929
obo default, html, raw Id: 0338, 1929, EDAM_operation:0338, EDAM_format:1929, 0000338, 0001929

Data resources: EMBOSS ontoget, EMBOSS ontotext

ENA Coding (ena_coding)

http://www.ebi.ac.uk/ena/

ENA Coding is a database of the nucleotide sequences of CDS (coding sequence) features, as annotated in the ENA Sequence database. ENA Coding records contain the nucleotide sequence of the CDS region, with accompanying annotation from the parent nucleotide entry and additional automatically generated annotation.

Format    Styles    Example Identifiers
default default, html, raw Accession: AAA59452
Sequence version: AAA59452.1
annot default, html, raw Accession: AAA59452
Sequence version: AAA59452.1
embl default, html, raw Accession: AAA59452
Sequence version: AAA59452.1
emblxml-1.1 default, raw Accession: AAA59452
Sequence version: AAA59452.1
emblxml-1.2 default, raw Accession: AAA59452
Sequence version: AAA59452.1
entrysize default, html, raw Accession: AAA59452
Sequence version: AAA59452.1
fasta default, html, raw Accession: AAA59452
Sequence version: AAA59452.1
seqxml default, raw Accession: AAA59452
Sequence version: AAA59452.1

Data resources: ENA Browser, NCBI BLAST blastdbcmd
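Not every format is offered in every style: in the ENA Coding table, the XML formats have no html style. One way to guard against requesting an unsupported combination is a small lookup table. The sketch below transcribes the ena_coding rows; the names `ENA_CODING_STYLES` and `is_supported` are hypothetical and not part of dbfetch.

```python
# Styles offered for each ena_coding format, transcribed from the table above.
ENA_CODING_STYLES = {
    "default": {"default", "html", "raw"},
    "annot": {"default", "html", "raw"},
    "embl": {"default", "html", "raw"},
    "emblxml-1.1": {"default", "raw"},
    "emblxml-1.2": {"default", "raw"},
    "entrysize": {"default", "html", "raw"},
    "fasta": {"default", "html", "raw"},
    "seqxml": {"default", "raw"},
}


def is_supported(fmt, style, table=ENA_CODING_STYLES):
    """Return True if the format/style pair appears in the lookup table."""
    return style in table.get(fmt, set())


print(is_supported("fasta", "html"))
print(is_supported("seqxml", "html"))
```

The same pattern applies to any other database on this page by transcribing its table into a dictionary.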

ENA Non-coding (ena_noncoding)

http://www.ebi.ac.uk/ena/

ENA Non-coding is a database of the nucleotide sequences of non-coding RNA features, as annotated in the ENA Sequence database. ENA Non-coding records contain the nucleotide sequence of the RNA feature, with accompanying annotation from the parent nucleotide entry and additional automatically generated annotation.

Format    Styles    Example Identifiers
default default, html, raw Id: AB012758.1:1..40:tRNA
annot default, html, raw Id: AB012758.1:1..40:tRNA
embl default, html, raw Id: AB012758.1:1..40:tRNA
emblxml-1.2 default, raw Id: AB012758.1:1..40:tRNA
entrysize default, html, raw Id: AB012758.1:1..40:tRNA
fasta default, html, raw Id: AB012758.1:1..40:tRNA
seqxml default, raw Id: AB012758.1:1..40:tRNA

Data resources: ENA Browser
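The example identifier `AB012758.1:1..40:tRNA` appears to combine a parent sequence version, a location range and a feature type. The Python sketch below splits an identifier of that shape into its parts. The layout is inferred from this single example and is an assumption; real identifiers may use more complex locations that this pattern does not handle. The names `NoncodingId` and `parse_noncoding_id` are hypothetical.

```python
import re
from typing import NamedTuple


class NoncodingId(NamedTuple):
    accession: str  # parent entry accession, e.g. "AB012758"
    version: int    # parent sequence version, e.g. 1
    start: int      # first base of the feature
    end: int        # last base of the feature
    feature: str    # feature type, e.g. "tRNA"


# Assumed layout: <accession>.<version>:<start>..<end>:<feature>
_PATTERN = re.compile(
    r"^(?P<acc>[^.:]+)\.(?P<ver>\d+):(?P<start>\d+)\.\.(?P<end>\d+):(?P<feature>.+)$"
)


def parse_noncoding_id(identifier):
    """Split a simple ENA Non-coding identifier into its assumed components."""
    match = _PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier layout: {identifier!r}")
    return NoncodingId(
        match["acc"],
        int(match["ver"]),
        int(match["start"]),
        int(match["end"]),
        match["feature"],
    )


print(parse_noncoding_id("AB012758.1:1..40:tRNA"))
```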

ENA Sequence (ena_sequence)

http://www.ebi.ac.uk/ena/

ENA Sequence (formerly known as EMBL-Bank) is Europe's primary nucleotide sequence resource. The main sources of the DNA and RNA sequences in the database are submissions from individual researchers, genome sequencing projects and patent applications.

Format    Styles    Example Identifiers
default default, html, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
annot default, html, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
embl default, html, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
emblxml-1.1 default, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
emblxml-1.2 default, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
entrysize default, html, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
fasta default, html, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
insdxml default, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1
seqxml default, raw Accession: M10051, K00650, D87894, AJ242600
Sequence version: J00231.1, K00650.1, D87894.1, AJ242600.1

Data resources: ENA Browser, ENA/SVA, NCBI BLAST blastdbcmd, SRS@DKFZ

ENA Sequence Constructed (ena_sequence_con)

http://www.ebi.ac.uk/ena/

The ENA Sequence Constructed database division represents complete genomes and other long sequences constructed from segment entries. Instead of containing the sequence, these entries detail how to assemble the sequence from other ENA Sequence entries.

Format    Styles    Example Identifiers
default default, html, raw Accession: CH003588
Sequence version: CH003588.1
annot default, html, raw Accession: CH003588
Sequence version: CH003588.1
embl default, html, raw Accession: CH003588
Sequence version: CH003588.1
emblxml-1.1 default, raw Accession: CH003588
Sequence version: CH003588.1
emblxml-1.2 default, raw Accession: CH003588
Sequence version: CH003588.1
entrysize default, html, raw Accession: CH003588
Sequence version: CH003588.1
fasta default, html, raw Accession: CH003588
Sequence version: CH003588.1
insdxml default, raw Accession: CH003588
Sequence version: CH003588.1
seqxml default, raw Accession: CH003588
Sequence version: CH003588.1

Data resources: ENA Browser, ENA/SVA

ENA Sequence Constructed Expanded (ena_sequence_conexp)

http://www.ebi.ac.uk/ena/

The ENA Sequence Constructed database division represents complete genomes and other long sequences constructed from segment entries. Expanded entries include the complete nucleotide sequence for the constructed entry.

Format    Styles    Example Identifiers
default default, html, raw Accession: AL672111
Sequence version: AL672111.1
annot default, html, raw Accession: AL672111
Sequence version: AL672111.1
embl default, html, raw Accession: AL672111
Sequence version: AL672111.1
emblxml-1.1 default, raw Accession: AL672111
Sequence version: AL672111.1
emblxml-1.2 default, raw Accession: AL672111
Sequence version: AL672111.1
entrysize default, html, raw Accession: AL672111
Sequence version: AL672111.1
fasta default, html, raw Accession: AL672111
Sequence version: AL672111.1
insdxml default, raw Accession: AL672111
Sequence version: AL672111.1
seqxml default, raw Accession: AL672111
Sequence version: AL672111.1

Data resources: ENA Browser

ENA/SVA (ena_sva)

http://www.ebi.ac.uk/cgi-bin/sva/sva.pl

The ENA Sequence Version Archive is a repository of all entries which have ever appeared in the EMBL Nucleotide Sequence Databank (EMBL-Bank) or ENA Sequence databases.

Format    Styles    Example Identifiers
default default, html, raw Accession: Y09633
Sequence version: Y09633.1, Y09633.4
annot default, html, raw Accession: Y09633
Sequence version: Y09633.1, Y09633.4
embl default, html, raw Accession: Y09633
Sequence version: Y09633.1, Y09633.4
entrysize default, html, raw Accession: Y09633
Sequence version: Y09633.1, Y09633.4
fasta default, raw Accession: Y09633
Sequence version: Y09633.1, Y09633.4

Data resources: ENA/SVA

Ensembl Gene (ensemblgene)

http://www.ensembl.org/

Gene sequences and annotations from the Ensembl genome databases for vertebrate species and model organisms. For other species, see Ensembl Genomes instead of Ensembl.

Format    Styles    Example Identifiers
default default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
csv default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
embl default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
fasta default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
genbank default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
gff2 default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
gff3 default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147
tab default, raw Id: ENSBTAG00000000988, ENSG00000139618, ENSMUSG00000041147

Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

Ensembl Genomes Gene (ensemblgenomesgene)

http://www.ensemblgenomes.org/

Gene sequences and annotations from the Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria. For vertebrate species and model organisms, see Ensembl instead of Ensembl Genomes.

Format    Styles    Example Identifiers
default default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
csv default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
embl default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
fasta default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
genbank default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
gff2 default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
gff3 default, raw Id: AAEL000001, AGAP006864, GB46163, b2736
tab default, raw Id: AAEL000001, AGAP006864, GB46163, b2736

Data resources: EnsemblGenomes UK

Ensembl Genomes Transcript (ensemblgenomestranscript)

http://www.ensemblgenomes.org/

Transcript sequences from the Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria. For vertebrate species and model organisms, see Ensembl instead of Ensembl Genomes.

Format    Styles    Example Identifiers
default default, raw Id: AAEL000001-RA, AGAP006864-RA, GB46163-RA, AAC75778
fasta default, raw Id: AAEL000001-RA, AGAP006864-RA, GB46163-RA, AAC75778

Data resources: EnsemblGenomes UK

Ensembl Transcript (ensembltranscript)

http://www.ensembl.org/

Transcript sequences from the Ensembl genome databases for vertebrate species and model organisms. For other species, see Ensembl Genomes instead of Ensembl.

Format    Styles    Example Identifiers
default default, raw Id: ENSAMET00000013126, ENSBTAT00000001311, ENST00000380152, ENSMUST00000044620
fasta default, raw Id: ENSAMET00000013126, ENSBTAT00000001311, ENST00000380152, ENSMUST00000044620

Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

EPO Proteins (epo_prt)

http://www.ebi.ac.uk/patentdata/proteins/

Protein sequences appearing in patents from the European Patent Office (EPO).

Format    Styles    Example Identifiers
default default, html, raw Accession: A00022
Sequence version: A00022.1
annot default, html, raw Accession: A00022
Sequence version: A00022.1
embl default, html, raw Accession: A00022
Sequence version: A00022.1
entrysize default, html, raw Accession: A00022
Sequence version: A00022.1
fasta default, html, raw Accession: A00022
Sequence version: A00022.1
seqxml default, raw Accession: A00022
Sequence version: A00022.1

Data resources: SimpleIndex, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret

HGNC (hgnc)

http://genenames.org/

HUGO Gene Nomenclature Committee (HGNC) approved gene name and symbol (short-form abbreviation) for each human gene.

Format    Styles    Example Identifiers
default default, html, raw Accession: 1101, 3566
Id: BRCA2, FACD
tab default, html, raw Accession: 1101, 3566
Id: BRCA2, FACD

Data resources: GeneNames.org

IMGT/HLA (imgthla)

http://www.ebi.ac.uk/imgt/hla/

Sequences of the human major histocompatibility complex (HLA) including the official sequences for the WHO Nomenclature Committee For Factors of the HLA System.

Format    Styles    Example Identifiers
default default, html, raw Accession: HLA00001
Id: HLA-A*01:01:01:01
annot default, html, raw Accession: HLA00001
Id: HLA-A*01:01:01:01
embl default, html, raw Accession: HLA00001
Id: HLA-A*01:01:01:01
entrysize default, html, raw Accession: HLA00001
Id: HLA-A*01:01:01:01
seqxml default, raw Accession: HLA00001
Id: HLA-A*01:01:01:01
fasta default, html, raw Accession: HLA00001
Id: HLA-A*01:01:01:01

Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IMGT/LIGM-DB (imgtligm)

http://imgt.cines.fr/cgi-bin/IMGTlect.jv

A comprehensive database of Immunoglobulins and T cell Receptors from human and other vertebrates.

Format    Styles    Example Identifiers
default default, html, raw Accession: A00673, AF003293
annot default, html, raw Accession: A00673, AF003293
embl default, html, raw Accession: A00673, AF003293
entrysize default, html, raw Accession: A00673, AF003293
seqxml default, raw Accession: A00673, AF003293
fasta default, html, raw Accession: A00673, AF003293

Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

InterPro (interpro)

http://www.ebi.ac.uk/interpro/

The InterPro database (Integrated Resource of Protein Domains and Functional Sites) is an integrated documentation resource for protein families, domains and functional sites. It was developed initially as a means of rationalising the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects, but now also includes the SMART, TIGRFAMs, PIR SuperFamilies and most recently SUPERFAMILY databases.

Format    Styles    Example Identifiers
default default, html, raw Id: IPR006212, IPR008266, IPR008958, IPR009030, IPR011009
interpro default, html, raw Id: IPR006212, IPR008266, IPR008958, IPR009030, IPR011009
interproxml default, raw Id: IPR006212, IPR008266, IPR008958, IPR009030, IPR011009
tab default, html, raw Id: IPR006212, IPR008266, IPR008958, IPR009030, IPR011009

Data resources: SimpleIndex

IPD-KIR (ipdkir)

http://www.ebi.ac.uk/ipd/kir/

A centralised repository for human Killer-cell Immunoglobulin-like Receptor (KIR) sequences.

Format    Styles    Example Identifiers
default default, html, raw Accession: KIR00001
Id: 3DL3*005
annot default, html, raw Accession: KIR00001
Id: 3DL3*005
embl default, html, raw Accession: KIR00001
Id: 3DL3*005
entrysize default, html, raw Accession: KIR00001
Id: 3DL3*005
seqxml default, raw Accession: KIR00001
Id: 3DL3*005
fasta default, html, raw Accession: KIR00001
Id: 3DL3*005

Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IPD-MHC (ipdmhc)

http://www.ebi.ac.uk/ipd/mhc/

Sequences of the major histocompatibility complex in a number of species.

Format    Styles    Example Identifiers
default default, html, raw Accession: MHC00001
Id: Aona-DQA1*27:01
annot default, html, raw Accession: MHC00001
Id: Aona-DQA1*27:01
embl default, html, raw Accession: MHC00001
Id: Aona-DQA1*27:01
entrysize default, html, raw Accession: MHC00001
Id: Aona-DQA1*27:01
seqxml default, raw Accession: MHC00001
Id: Aona-DQA1*27:01
fasta default, html, raw Accession: MHC00001
Id: Aona-DQA1*27:01

Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IPRMC (iprmc)

http://www.ebi.ac.uk/interpro/

InterPro Matches Complete (IPRMC) for UniProtKB proteins.

Format    Styles    Example Identifiers
default default, html, raw Id: A0PGH6, A0A2V8
gff2 default, html, raw Id: A0PGH6, A0A2V8
iprmc default, html, raw Id: A0PGH6, A0A2V8
iprmctab default, html, raw Id: A0PGH6, A0A2V8
iprmcxml default, raw Id: A0PGH6, A0A2V8
dasgff default, raw Id: A0PGH6, A0A2V8

Data resources: SimpleIndex, InterPro DAS, SRS@WBW

IPRMC UniParc (iprmcuniparc)

http://www.ebi.ac.uk/interpro/

InterPro Matches Complete (IPRMC) for UniParc proteins.

Format    Styles    Example Identifiers
default default, html, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D
gff2 default, html, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D
iprmc default, html, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D
iprmctab default, html, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D
iprmcxml default, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D
dasgff default, raw Id: UPI0000000001, UPI0000046364, UPI00001B3DCE, 28FE89850863372D

Data resources: SimpleIndex, InterPro UniParc Matches DAS

JPO Proteins (jpo_prt)

http://www.ebi.ac.uk/patentdata/proteins/

Protein sequences appearing in patents from the Japanese Patent Office (JPO).

Format    Styles    Example Identifiers
default default, html, raw Accession: E50010
annot default, html, raw Accession: E50010
embl default, html, raw Accession: E50010
entrysize default, html, raw Accession: E50010
fasta default, html, raw Accession: E50010
seqxml default, raw Accession: E50010
ddbj default, html, raw Accession: E50010

Data resources: EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret, DDBJ getentry

KIPO Proteins (kipo_prt)

http://www.ebi.ac.uk/patentdata/proteins/

Protein sequences appearing in patents from the Korean Intellectual Property Office (KIPO).

Format    Styles    Example Identifiers
default default, html, raw Accession: DI500001
annot default, html, raw Accession: DI500001
embl default, html, raw Accession: DI500001
entrysize default, html, raw Accession: DI500001
fasta default, html, raw Accession: DI500001
seqxml default, raw Accession: DI500001
ddbj default, html, raw Accession: DI500001

Data resources: EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret, DDBJ getentry

MEDLINE (medline)

http://www.nlm.nih.gov/pubs/factsheets/medline.html

MEDLINE contains bibliographic citations and author abstracts from more than 5,000 biomedical journals published in the United States and 70 other countries. The file contains over 19 million citations dating back to the mid-1940s and is updated weekly.

Format    Styles    Example Identifiers
default default, html, raw Id: 1, 2859121, 17567924
medlinefull default, html, raw Id: 1, 2859121, 17567924
medlineref default, html, raw Id: 1, 2859121, 17567924
bibtex default, raw Id: 1, 2859121, 17567924
endnote default, raw Id: 1, 2859121, 17567924
isi default, raw Id: 1, 2859121, 17567924
modsxml default, raw Id: 1, 2859121, 17567924
pubmedxml default, raw Id: 1, 2859121, 17567924
ris default, raw Id: 1, 2859121, 17567924
wordbibxml default, raw Id: 1, 2859121, 17567924

Data resources: Europe PMC, NCBI E-utilities

Patent DNA NRL1 (nrnl1)

http://www.ebi.ac.uk/patentdata/nr/

Non-redundant patent nucleotides level-1. Nucleotide sequences from patents clustered by 100% sequence identity over whole length.

Format    Styles    Example Identifiers
default default, html, raw Id: NRN_DJ207917
annot default, html, raw Id: NRN_DJ207917
entrysize default, html, raw Id: NRN_DJ207917
nrl1 default, html, raw Id: NRN_DJ207917
seqxml default, raw Id: NRN_DJ207917
fasta default, html, raw Id: NRN_DJ207917

Data resources: SimpleIndex, NCBI BLAST blastdbcmd

Patent DNA NRL2 (nrnl2)

http://www.ebi.ac.uk/patentdata/nr/

Non-redundant patent nucleotides level-2. Nucleotide sequences from patents clustered by patent family and then by 100% sequence identity over whole length.

Format    Styles    Example Identifiers
default default, html, raw Id: NRN006674C5
annot default, html, raw Id: NRN006674C5
entrysize default, html, raw Id: NRN006674C5
nrl2 default, html, raw Id: NRN006674C5
seqxml default, raw Id: NRN006674C5
fasta default, html, raw Id: NRN006674C5

Data resources: SimpleIndex, NCBI BLAST blastdbcmd

Patent Protein NRL1 (nrpl1)

http://www.ebi.ac.uk/patentdata/nr/

Non-redundant patent proteins level-1. Protein sequences from patents clustered by 100% sequence identity over whole length.

Format    Styles    Example Identifiers
default default, html, raw Id: NRP_AX013047
annot default, html, raw Id: NRP_AX013047
entrysize default, html, raw Id: NRP_AX013047
nrl1 default, html, raw Id: NRP_AX013047
seqxml default, raw Id: NRP_AX013047
fasta default, html, raw Id: NRP_AX013047

Data resources: SimpleIndex, NCBI BLAST blastdbcmd

Patent Protein NRL2 (nrpl2)

http://www.ebi.ac.uk/patentdata/nr/

Non-redundant patent proteins level-2. Protein sequences from patents clustered by patent family and then by 100% sequence identity over whole length.

Format    Styles    Example Identifiers
default default, html, raw Id: NRP00000001
annot default, html, raw Id: NRP00000001
entrysize default, html, raw Id: NRP00000001
nrl2 default, html, raw Id: NRP00000001
seqxml default, raw Id: NRP00000001
fasta default, html, raw Id: NRP00000001

Data resources: SimpleIndex, NCBI BLAST blastdbcmd

Patent Equivalents (patent_equivalents)

http://www.ebi.ac.uk/patentdata/

Patent number equivalents (families) and patent classifications for patents containing sequence data. The patent equivalents are obtained from the patent numbers cited in the major sequence databases (e.g. EMBL-Bank and Patent Proteins), which are then expanded into a set of patent equivalents forming a WIPO Simple Patent Family.

Format    Styles    Example Identifiers
default default, html, raw Id: 10517942
patent_equivalents default, html, raw Id: 10517942

Data resources: SimpleIndex

PDB (pdb)

http://www.ebi.ac.uk/pdbe/

Macromolecular structures from the Brookhaven Protein Data Bank (PDB). Contains protein and nucleotide structure and sequence data.

Format    Styles    Example Identifiers
default default, html, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
fasta default, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
annot default, html, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
mmcif default, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
pdb default, html, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
pdbml default, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa
rdfxml default, raw Id: 101D, 1GAG, 10MH, 3E3Q, 3E3Q_A, 3E3Q_a, 3E3QA, 3E3Qa

Data resources: EMBOSS seqret, PDB FTP@EMBL-EBI, PDBe, RCSB PDB, PDBj, wwPDB, PDBe FTP@EMBL-EBI

RefSeq (nucleotide) (refseqn)

http://www.ncbi.nlm.nih.gov/refseq/

The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.

Format    Styles    Example Identifiers
default default, html, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
annot default, html, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
entrysize default, html, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
fasta default, html, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
insdxml default, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
refseq default, html, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
seqxml default, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2
tinyseq default, raw Accession: AC_000014, NC_004952, NG_000869, NM_000231, NR_000003, NS_000193
Id: 110189662, 32455353, 149879847, 209529740
Sequence version: AC_000014.1, NC_004952.1, NG_000869.2, NM_000231.2

Data resources: NCBI E-utilities, SRS@DKFZ

RefSeq (protein) (refseqp)

http://www.ncbi.nlm.nih.gov/refseq/

The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.

Format    Styles    Example Identifiers
default default, html, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
annot default, html, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
entrysize default, html, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
fasta default, html, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
insdxml default, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
refseqp default, html, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
seqxml default, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1
tinyseq default, raw Accession: AP_000130, NP_001075859, XP_006514731, YP_001876438
Id: 56160459, 126723201
Sequence version: AP_000130.1, NP_001075859.1

Data resources: NCBI E-utilities, SRS@DKFZ

SGT (sgt)

http://targetdb.pdb.org/

Structural Genomics Targets (SGT) is a protein target registration database, providing information on the experimental progress and status of target amino acid sequences selected for structural determination.

Format    Styles    Example Identifiers
default default, raw Id: 1, 283832, APC4091, IGBMC-0038-000, IPRv2949cPALZ, YCL050c
annot default, html, raw Id: 1, 283832, APC4091, IGBMC-0038-000, IPRv2949cPALZ, YCL050c
fasta default, html, raw Id: 1, 283832, APC4091, IGBMC-0038-000, IPRv2949cPALZ, YCL050c
seqxml default, raw Id: 1, 283832, APC4091, IGBMC-0038-000, IPRv2949cPALZ, YCL050c
sgtxml default, raw Id: 1, 283832, APC4091, IGBMC-0038-000, IPRv2949cPALZ, YCL050c

Data resources: SimpleIndex, NCBI BLAST blastdbcmd, EMBOSS seqret

Taxonomy (taxonomy)

http://www.ncbi.nlm.nih.gov/Taxonomy/

Taxonomic classification of organisms for which there are sequences in the INSDC databases (i.e. DDBJ, EMBL-Bank and GenBank) and many other biological databases.

Format    Styles    Example Identifiers
default default, html, raw Id: 3702, 9606
taxonomy default, html, raw Id: 3702, 9606
enataxonomyxml default, raw Id: 3702, 9606
uniprottaxonomyrdfxml default, raw Id: 3702, 9606

Data resources: ENA Browser, UniProt.org, EMBOSS taxget, SRS@DKFZ

Trace Archive (tracearchive)

http://www.ebi.ac.uk/ena/

An archive of capillary electrophoresis trace data.

Format    Styles    Example Identifiers
default default, raw Id: TI1, TI1941166100
fasta default, raw Id: TI1, TI1941166100
fastq default, raw Id: TI1, TI1941166100
tracexml default, raw Id: TI1, TI1941166100

Data resources: ENA Browser

UniParc (uniparc)

http://www.uniprot.org/

The UniProt Archive (UniParc) contains available protein sequences collected from many different sources. The sequence data are archived to facilitate examination of changes to sequence data. Search UniParc if you want to examine the "history" of a particular sequence.

Format    Styles    Example Identifiers
default default, raw Accession: UPI0000000001, UPI0000046364, UPI00001B3DCE
fasta default, raw Accession: UPI0000000001, UPI0000046364, UPI00001B3DCE
seqxml default, raw Accession: UPI0000000001, UPI0000046364, UPI00001B3DCE
uniparc default, raw Accession: UPI0000000001, UPI0000046364, UPI00001B3DCE
uniprotrdfxml default, raw Accession: UPI0000000001, UPI0000046364, UPI00001B3DCE

Data resources: UniProt.org, NCBI BLAST blastdbcmd, EMBOSS seqret

UniProtKB (uniprotkb)

http://www.uniprot.org/

The UniProt Knowledgebase (UniProtKB) is the central access point for extensive curated protein information, including function, classification, and cross-references. Search UniProtKB to retrieve “everything that is known” about a particular sequence.

Format    Styles    Example Identifiers
default default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
annot default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
entrysize default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
fasta default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
gff3 default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
seqxml default, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
uniprot default, html, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
uniprotrdfxml default, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT
uniprotxml default, raw Accession: P06213, P29306, P68255
Name: INSR_HUMAN, 1433X_MAIZE, 1433T_RAT

Data resources: UniProt.org, NCBI BLAST blastdbcmd, EMBOSS entret, SRS@DKFZ

UniRef100 (uniref100)

http://www.uniprot.org/

The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.

Format    Styles    Example Identifiers
default default, raw Id: UniRef100_P06213
fasta default, raw Id: UniRef100_P06213
seqxml default, raw Id: UniRef100_P06213
uniprotrdfxml default, raw Id: UniRef100_P06213
uniref100 default, raw Id: UniRef100_P06213

Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ

UniRef50 (uniref50)

http://www.uniprot.org/

The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.

Format    Styles    Example Identifiers
default default, raw Id: UniRef50_P06213
fasta default, raw Id: UniRef50_P06213
seqxml default, raw Id: UniRef50_P06213
uniprotrdfxml default, raw Id: UniRef50_P06213
uniref50 default, raw Id: UniRef50_P06213

Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ

UniRef90 (uniref90)

http://www.uniprot.org/

The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.

Format    Styles    Example Identifiers
default default, raw Id: UniRef90_P06213
fasta default, raw Id: UniRef90_P06213
seqxml default, raw Id: UniRef90_P06213
uniprotrdfxml default, raw Id: UniRef90_P06213
uniref90 default, raw Id: UniRef90_P06213

Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ

UniSave (unisave)

http://www.ebi.ac.uk/uniprot/unisave/

The UniProtKB Sequence/Annotation Version Archive (UniSave) is a repository of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions.

Format    Styles    Example Identifiers
default default, raw Accession: P06213
Entry version: P06213.157, P06213.3
annot default, raw Accession: P06213
Entry version: P06213.157, P06213.3
entrysize default, html, raw Accession: P06213
Entry version: P06213.157, P06213.3
fasta default, raw Accession: P06213
Entry version: P06213.157, P06213.3
uniprot default, raw Accession: P06213
Entry version: P06213.157, P06213.3

Data resources: UniSave
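UniSave accepts either a plain accession (`P06213`) or an entry version (`P06213.157`), and several other databases on this page accept sequence versions of the same `accession.number` shape. The Python sketch below separates the two parts so that client code can tell which kind of identifier it holds. The helper name `split_versioned` is hypothetical, and treating the text after the last dot as the version number is an assumption drawn from the examples listed here.

```python
def split_versioned(identifier):
    """Return (accession, version), with version None when no numeric suffix exists."""
    accession, sep, version = identifier.rpartition(".")
    # No dot, an empty accession, or a non-numeric suffix: treat as unversioned.
    if not sep or not accession or not version.isdigit():
        return identifier, None
    return accession, int(version)


for example in ("P06213", "P06213.157", "Y09633.4"):
    print(example, split_versioned(example))
```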

USPTO Proteins (uspto_prt)

http://www.ebi.ac.uk/patentdata/proteins/

Protein sequences appearing in patents from the United States Patent and Trademark Office (USPTO).

Format    Styles    Example Identifiers
default default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
annot default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
embl default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
entrysize default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
fasta default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
seqxml default, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
genpept default, html, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
insdxml default, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1
tinyseq default, raw Accession: AAA00053
Name: I02590
Sequence version: AAA00053.1

Data resources: SimpleIndex, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret, NCBI E-utilities
