EBI Dbfetch Databases

Introduction

The databases available via dbfetch are listed below. The name in parentheses should be used when:
An overview of each database is also provided, which includes a short description of the database, a link to the database, a collection of example identifiers and details of the available data formats and result styles.

Databases

EDAM (edam)
http://edamontology.sourceforge.net/
EMBRACE Data and Methods (EDAM) Ontology.
Data resources: EMBOSS ontoget, EMBOSS ontotext

EMBL-Bank (embl)
EMBL Nucleotide Sequence Database, Europe's primary nucleotide sequence resource. The main sources of the DNA and RNA sequences in the database are submissions from individual researchers, genome sequencing projects and patent applications.
Data resources: ENA Browser, EMBL-SVA, NCBI BLAST blastdbcmd, SRS@DKFZ

EMBLCDS (emblcds)
EMBLCDS is a database of nucleotide sequences of the CDS (coding sequence) features, as annotated in the EMBL database. An EMBLCDS record contains the nucleotide sequence of the CDS region, accompanying annotation from the parent nucleotide entry and additional automatically generated annotation.
Data resources: ENA Browser, NCBI BLAST blastdbcmd

EMBLCON (emblcon)
The EMBLCON database division represents complete genomes and other long sequences constructed from segment entries.
Data resources: ENA Browser, EMBL-SVA

EMBLCONEXP (emblconexp)
The EMBLCON database division represents complete genomes and other long sequences constructed from segment entries. Expanded entries including the complete sequence.
Data resources: ENA Browser

EMBL-SVA (emblsva)
http://www.ebi.ac.uk/cgi-bin/sva/sva.pl
The EMBL Sequence Version Archive is a repository of all entries which have ever appeared in the EMBL Nucleotide Sequence Database.
Data resources: EMBL-SVA

Ensembl Gene (ensemblgene)
Ensembl genome databases for vertebrate species and model organisms; for other species see Ensembl Genomes instead of Ensembl. Gene sequences and annotations.
Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

Ensembl Genomes Gene (ensemblgenomesgene)
http://www.ensemblgenomes.org/
Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria; for vertebrate species and model organisms see Ensembl instead of Ensembl Genomes. Gene sequences and annotations.
Data resources: EnsemblGenomes UK

Ensembl Genomes Transcript (ensemblgenomestranscript)
http://www.ensemblgenomes.org/
Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria; for vertebrate species and model organisms see Ensembl instead of Ensembl Genomes. Transcript sequences.
Data resources: EnsemblGenomes UK

Ensembl Transcript (ensembltranscript)
Ensembl genome databases for vertebrate species and model organisms; for other species see Ensembl Genomes instead of Ensembl. Transcript sequences.
Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

EPO Proteins (epo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the European Patent Office (EPO).
Data resources: EB-eye, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret

HGNC (hgnc)
HUGO Gene Nomenclature Committee (HGNC) approved gene name and symbol (short-form abbreviation) for each human gene.
Data resources: GeneNames.org

IMGT/HLA (imgthla)
http://www.ebi.ac.uk/imgt/hla/
Sequences of the human major histocompatibility complex (HLA) including the official sequences for the WHO Nomenclature Committee For Factors of the HLA System.
Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IMGT/LIGM-DB (imgtligm)
http://imgt.cines.fr/cgi-bin/IMGTlect.jv
A comprehensive database of Immunoglobulins and T cell Receptors from human and other vertebrates.
Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

InterPro (interpro)
http://www.ebi.ac.uk/interpro/
The InterPro database (Integrated Resource of Protein Domains and Functional Sites) is an integrated documentation resource for protein families, domains and functional sites. It was developed initially as a means of rationalising the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects, but now also includes the SMART, TIGRFAMs, PIR SuperFamilies and most recently SUPERFAMILY databases.
Data resources:

IPD-KIR (ipdkir)
A centralised repository for human Killer-cell Immunoglobulin-like Receptor (KIR) sequences.
Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IPD-MHC (ipdmhc)
Sequences of the major histocompatibility complex in a number of species.
Data resources: EMBOSS entret, EMBOSS seqret, NCBI BLAST blastdbcmd

IPRMC (iprmc)
http://www.ebi.ac.uk/interpro/
InterPro Matches Complete (IPRMC) for UniProtKB proteins.
Data resources: SRS@WBW

IPRMC UniParc (iprmcuniparc)
http://www.ebi.ac.uk/interpro/
InterPro Matches Complete (IPRMC) for UniParc proteins.
Data resources:

JPO Proteins (jpo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the Japanese Patent Office (JPO).
Data resources: EB-eye, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret

KIPO Proteins (kipo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the Korean Intellectual Property Office (KIPO).
Data resources: EB-eye, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret

MEDLINE (medline)
http://www.nlm.nih.gov/pubs/factsheets/medline.html
MEDLINE contains bibliographic citations and author abstracts from more than 5,000 biomedical journals published in the United States and 70 other countries. The file contains over 19 million citations dating back to the mid-1940s and is updated weekly.
Data resources: Europe PMC, NCBI E-utilities

Patent DNA NRL1 (nrnl1)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent nucleotides level-1. Nucleotide sequences from patents clustered by 100% sequence identity over whole length.
Data resources: SRS@EMBL-EBI, NCBI BLAST blastdbcmd

Patent DNA NRL2 (nrnl2)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent nucleotides level-2. Nucleotide sequences from patents clustered by patent family and then by 100% sequence identity over whole length.
Data resources: SRS@EMBL-EBI, NCBI BLAST blastdbcmd

Patent Protein NRL1 (nrpl1)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent proteins level-1. Protein sequences from patents clustered by 100% sequence identity over whole length.
Data resources: SRS@EMBL-EBI, NCBI BLAST blastdbcmd

Patent Protein NRL2 (nrpl2)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent proteins level-2. Protein sequences from patents clustered by patent family and then by 100% sequence identity over whole length.
Data resources: SRS@EMBL-EBI, NCBI BLAST blastdbcmd

Patent Equivalents (patent_equivalents)
http://www.ebi.ac.uk/patentdata/
Patent number equivalents (families) and patent classifications for patents containing sequence data. The patent equivalents are obtained from the patent numbers cited in the major sequence databases (e.g. EMBL-Bank and Patent Proteins), which are then expanded into a set of patent equivalents forming a WIPO Simple Patent Family.
Data resources: SRS@EMBL-EBI

PDB (pdb)
Macromolecular structures from the Brookhaven Protein Data Bank (PDB). Contains protein and nucleotide structure and sequence data.
Data resources: EMBOSS seqret, PDB FTP@EMBL-EBI, PDBe, RCSB PDB, PDBj, wwPDB, PDBe FTP@EMBL-EBI

RefSeq (nucleotide) (refseqn)
http://www.ncbi.nlm.nih.gov/refseq/
The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.
Data resources: NCBI E-utilities, SRS@DKFZ

RefSeq (protein) (refseqp)
http://www.ncbi.nlm.nih.gov/refseq/
The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.
Data resources: NCBI E-utilities, SRS@DKFZ

SGT (sgt)
Structural Genomics Targets (SGT) is a protein target registration database, providing information on the experimental progress and status of target amino acid sequences selected for structural determination.
Data resources: SRS@EMBL-EBI, NCBI BLAST blastdbcmd, EMBOSS seqret

Taxonomy (taxonomy)
http://www.ncbi.nlm.nih.gov/Taxonomy/
Taxonomic classification of organisms for which there are sequences in the INSDC databases (i.e. DDBJ, EMBL-Bank and GenBank) and many other biological databases.
Data resources: EB-eye, ENA Browser, UniProt.org, EMBOSS taxget, SRS@DKFZ

Trace Archive (tracearchive)
An archive of capillary electrophoresis trace data.
Data resources: ENA Browser

UniParc (uniparc)
The UniProt Archive (UniParc) contains available protein sequences collected from many different sources. The sequence data are archived to facilitate examination of changes to sequence data. Search UniParc if you want to examine the "history" of a particular sequence.
Data resources: UniProt.org, NCBI BLAST blastdbcmd, EMBOSS seqret, SRS@EMBL-EBI

UniProtKB (uniprotkb)
The UniProt Knowledgebase (UniProtKB) is the central access point for extensive curated protein information, including function, classification, and cross-references. Search UniProtKB to retrieve “everything that is known” about a particular sequence.
Data resources: UniProt.org, NCBI BLAST blastdbcmd, EMBOSS entret, SRS@DKFZ, SRS@EMBL-EBI

UniRef100 (uniref100)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ, SRS@EMBL-EBI

UniRef50 (uniref50)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ, SRS@EMBL-EBI

UniRef90 (uniref90)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, NCBI BLAST blastdbcmd, SRS@DKFZ, SRS@EMBL-EBI

UniSave (unisave)
http://www.ebi.ac.uk/uniprot/unisave/
The UniProtKB Sequence/Annotation Version Archive (UniSave) is a repository of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions.
Data resources: UniSave

USPTO Proteins (uspto_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the United States Patent and Trademark Office (USPTO).
Data resources: EB-eye, EMBOSS entret, NCBI BLAST blastdbcmd, EMBOSS seqret
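
Example: building a dbfetch request

The names in parentheses in the listing are the values passed to dbfetch to select a database. The following is a minimal sketch of how such a request URL might be assembled in Python. The listing itself does not give the endpoint or the query parameters, so the base URL, the parameter names (db, id, format, style) and the format and style values used here are assumptions to check against the dbfetch documentation. The set of database names is a small subset copied from the listing, and the identifiers are illustrative only. The helper build_dbfetch_url is hypothetical and is not part of any EBI library.

```python
from urllib.parse import urlencode

# Assumption: the public dbfetch endpoint. It is not given in the listing.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"

# A subset of the database names shown in parentheses in the listing.
KNOWN_DATABASES = {
    "embl", "emblcds", "interpro", "pdb", "taxonomy",
    "uniparc", "uniprotkb", "uniref100", "uniref90", "uniref50",
}


def build_dbfetch_url(db, ids, fmt="default", style="raw"):
    """Return a dbfetch URL for one or more identifiers (hypothetical helper).

    The parameter names db, id, format and style are assumptions.
    """
    if db not in KNOWN_DATABASES:
        raise ValueError(f"unknown database name: {db!r}")
    if isinstance(ids, str):
        ids = [ids]
    if not ids:
        raise ValueError("at least one identifier is required")
    query = urlencode(
        {"db": db, "id": ",".join(ids), "format": fmt, "style": style},
        safe=",",  # keep the comma between identifiers unescaped
    )
    return f"{DBFETCH_BASE}?{query}"


if __name__ == "__main__":
    # Illustrative identifiers only.
    print(build_dbfetch_url("uniprotkb", ["P12345", "P67890"], fmt="fasta"))
```

The sketch only builds the URL and makes no network request, so it can be checked offline. Fetching the entry would be a separate step, for example with urllib.request.urlopen.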