EBI Dbfetch Databases

Introduction

The databases available via dbfetch are listed below; the name in parentheses should be used when:
An overview of each database is also provided, which includes a short description of the database, a link to the database, a collection of example identifiers and details of the available data formats and result styles.

Databases

EDAM (edam)
http://edamontology.sourceforge.net/
EMBRACE Data and Methods (EDAM) Ontology.
Data resources: SRS@EBI

EMBL-Bank (embl)
EMBL Nucleotide Sequence Database, Europe's primary nucleotide sequence resource. The main sources of the DNA and RNA sequences in the database are submissions from individual researchers, genome sequencing projects and patent applications.
Data resources: ENA Browser, SRS@EBI, Xembl, NCBI BLAST blastdbcmd

EMBLCDS (emblcds)
EMBLCDS is a database of nucleotide sequences of the CDS (coding sequence) features, as annotated in the EMBL database. An EMBLCDS record contains the nucleotide sequence of the CDS region, accompanying annotation from the parent nucleotide entry and additional automatically generated annotation.
Data resources: ENA Browser, SRS@EBI, NCBI BLAST blastdbcmd

EMBLCON (emblcon)
The EMBLCON database division represents complete genomes and other long sequences constructed from segment entries.
Data resources: ENA Browser, SRS@EBI

EMBLCONEXP (emblconexp)
The EMBLCON database division represents complete genomes and other long sequences constructed from segment entries. Expanded entries including the complete sequence.
Data resources: ENA Browser, SRS@EBI

EMBL-SVA (emblsva)
http://www.ebi.ac.uk/cgi-bin/sva/sva.pl
The EMBL Sequence Version Archive is a repository of all entries which have ever appeared in the EMBL Nucleotide Sequence Database.
Data resources: EMBL-SVA

Ensembl Gene (ensemblgene)
Ensembl genome databases for vertebrate species and model organisms; for other species, see Ensembl Genomes instead of Ensembl. Gene sequences and annotations.
Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

Ensembl Genomes Gene (ensemblgenomesgene)
http://www.ensemblgenomes.org/
Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria; for vertebrate species and model organisms, see Ensembl instead of Ensembl Genomes. Gene sequences and annotations.
Data resources: EnsemblGenomes UK

Ensembl Genomes Transcript (ensemblgenomestranscript)
http://www.ensemblgenomes.org/
Ensembl Genomes genome databases for metazoa, plants, fungi, protists and bacteria; for vertebrate species and model organisms, see Ensembl instead of Ensembl Genomes. Transcript sequences.
Data resources: EnsemblGenomes UK

Ensembl Transcript (ensembltranscript)
Ensembl genome databases for vertebrate species and model organisms; for other species, see Ensembl Genomes instead of Ensembl. Transcript sequences.
Data resources: Ensembl UK, Ensembl USA East, Ensembl USA West, Ensembl Asia

EPO Proteins (epo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the European Patent Office (EPO).
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

GenomeReviews (genomereviews)
http://www.ebi.ac.uk/GenomeReviews/
The Genome Reviews Database consists of curated versions of complete genome entries from the EMBL/GenBank/DDBJ nucleotide sequence database.
Data resources: SRS@EBI

GenomeReviews Gene (genomereviewsgene)
http://www.ebi.ac.uk/GenomeReviews/
Genome Reviews Gene records are available for all archaeal, bacterial, phage and eukaryotic genomes present in Genome Reviews. Each record represents one gene present in a Genome Reviews component record.
Data resources: SRS@EBI

GenomeReviews Transcript (genomereviewstranscript)
http://www.ebi.ac.uk/GenomeReviews/
Genome Reviews transcript records are available for all archaeal, bacterial, phage and eukaryotic genomes present in Genome Reviews. Each record represents one transcript present in a Genome Reviews component record.
Data resources: SRS@EBI

HGNC (hgnc)
HUGO Gene Nomenclature Committee (HGNC) approved gene name and symbol (short-form abbreviation) for each human gene.
Data resources: SRS@EBI

HGVBase (hgvbase)
The Human Genic Bi-Allelic Sequences Database is an attempt to summarize all known sequence variations in the human genome and to facilitate research into how genotypes affect common diseases, drug responses, and other complex phenotypes.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS seqret

IMGT/HLA (imgthla)
http://www.ebi.ac.uk/imgt/hla/
Sequences of the human major histocompatibility complex (HLA) including the official sequences for the WHO Nomenclature Committee For Factors of the HLA System.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

IMGT/LIGM-DB (imgtligm)
http://imgt.cines.fr/cgi-bin/IMGTlect.jv
A comprehensive database of Immunoglobulins and T cell Receptors from human and other vertebrates.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

InterPro (interpro)
http://www.ebi.ac.uk/interpro/
The InterPro database (Integrated Resource of Protein Domains and Functional Sites) is an integrated documentation resource for protein families, domains and functional sites. It was developed initially as a means of rationalising the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects, but now also includes the SMART, TIGRFAMs, PIR SuperFamilies and most recently SUPERFAMILY databases.
Data resources: SRS@EBI

IPD-KIR (ipdkir)
A centralised repository for human Killer-cell Immunoglobulin-like Receptor (KIR) sequences.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

IPD-MHC (ipdmhc)
Sequences of the major histocompatibility complex in a number of species.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

IPI (ipi)
The International Protein Index (IPI) provides non-redundant proteome sets for a selection of higher eukaryotes, e.g. Arabidopsis, Chicken, Mouse, Human, etc. Cross-references are provided to the various source databases.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

IPI History (ipihistory)
IPI History provides details of the history of identifiers in the International Protein Index (IPI) database. Includes details of entry creation, deletion and replacement.
Data resources: SRS@EBI

IPRMC (iprmc)
http://www.ebi.ac.uk/interpro/
InterPro Matches Complete (IPRMC) for UniProtKB proteins.
Data resources: SRS@EBI, SRS@WBW

IPRMC UniParc (iprmcuniparc)
http://www.ebi.ac.uk/interpro/
InterPro Matches Complete (IPRMC) for UniParc proteins.
Data resources: SRS@EBI

JPO Proteins (jpo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the Japanese Patent Office (JPO).
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

KIPO Proteins (kipo_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the Korean Intellectual Property Office (KIPO).
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

LiveLists (livelists)
ftp://ftp.ncbi.nlm.nih.gov/genbank/livelists/
NCBI LiveLists provides a mapping between NCBI gi numbers and INSDC (i.e. DDBJ, EMBL-Bank and GenBank) accessions.
Data resources: SRS@EBI

MEDLINE (medline)
http://www.ebi.ac.uk/Databases/MEDLINE/medline.html
MEDLINE contains bibliographic citations and author abstracts from more than 4,000 biomedical journals published in the United States and 70 other countries. The file contains over 11 million citations dating back to the mid-1960s, updated weekly.
Data resources: SRS@EBI, CiteXplore, NCBI E-utilities

Patent DNA NRL1 (nrnl1)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent nucleotides level-1. Nucleotide sequences from patents clustered by 100% sequence identity over whole length.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd

Patent DNA NRL2 (nrnl2)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent nucleotides level-2. Nucleotide sequences from patents clustered by patent family and then by 100% sequence identity over whole length.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd

Patent Protein NRL1 (nrpl1)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent proteins level-1. Protein sequences from patents clustered by 100% sequence identity over whole length.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd

Patent Protein NRL2 (nrpl2)
http://www.ebi.ac.uk/patentdata/nr/
Non-redundant patent proteins level-2. Protein sequences from patents clustered by patent family and then by 100% sequence identity over whole length.
Data resources: SRS@EBI, NCBI BLAST blastdbcmd

PDB (pdb)
Macromolecular structures from the Brookhaven Protein Data Bank (PDB). Contains protein and nucleotide structure and sequence data.
Data resources: EMBOSS seqret, RCSB FTP@EBI, SRS@EBI, PDBe FTP@EBI

RefSeq (nucleotide) (refseqn)
http://www.ncbi.nlm.nih.gov/refseq/
The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.
Data resources: SRS@EBI, NCBI E-utilities

RefSeq (protein) (refseqp)
http://www.ncbi.nlm.nih.gov/refseq/
The NCBI Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins.
Data resources: SRS@EBI, NCBI E-utilities

RESID (resid)
A comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.
Data resources: SRS@EBI

SGT (sgt)
Structural Genomics Targets (SGT) is a protein target registration database, providing information on the experimental progress and status of target amino acid sequences selected for structural determination.
Data resources: SRS@EBI, TargetDB, NCBI BLAST blastdbcmd, EMBOSS seqret

Taxonomy (taxonomy)
http://www.ncbi.nlm.nih.gov/Taxonomy/
Taxonomic classification of organisms for which there are sequences in the INSDC databases (i.e. DDBJ, EMBL-Bank and GenBank) and many other biological databases.
Data resources: SRS@EBI, ENA Browser, UniProt.org

Trace Archive (tracearchive)
An archive of capillary electrophoresis trace data.
Data resources: ENA Browser

UniParc (uniparc)
The UniProt Archive (UniParc) contains available protein sequences collected from many different sources. The sequence data are archived to facilitate examination of changes to sequence data. Search UniParc if you want to examine the "history" of a particular sequence.
Data resources: UniProt.org, SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS seqret

UniProtKB (uniprotkb)
The UniProt Knowledgebase (UniProtKB) is the central access point for extensive curated protein information, including function, classification, and cross-references. Search UniProtKB to retrieve “everything that is known” about a particular sequence.
Data resources: UniProt.org, SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret

UniRef100 (uniref100)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, SRS@EBI, NCBI BLAST blastdbcmd

UniRef50 (uniref50)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, SRS@EBI, NCBI BLAST blastdbcmd

UniRef90 (uniref90)
The UniProt Reference Clusters (UniRef) databases combine closely related sequences into a single record to speed searches. There are three different non-redundant databases with different sequence identity cut-offs. In UniRef100, UniRef90 and UniRef50 databases no pair of sequences in the representative set has >100%, >90% or >50% mutual sequence identity. The three UniRef databases allow the user to choose between a fast search and a truly comprehensive one.
Data resources: UniProt.org, SRS@EBI, NCBI BLAST blastdbcmd

UniSave (unisave)
The UniProtKB Sequence/Annotation Version Archive (UniSave) is a repository of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entry versions.
Data resources: UniSave

USPTO Proteins (uspto_prt)
http://www.ebi.ac.uk/patentdata/proteins/
Protein sequences appearing in patents from the United States Patent and Trademark Office (USPTO).
Data resources: SRS@EBI, NCBI BLAST blastdbcmd, EMBOSS entret
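
As a rough illustration of how the names in parentheses might be used, the Python sketch below builds a dbfetch request URL from a database name and one or more identifiers. The base URL, the query parameter names (`db`, `id`, `format`, `style`) and the helper `dbfetch_url` are assumptions made for this example and are not given in the listing; `P12345` is only a placeholder identifier. The sketch constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed service endpoint; the listing above does not state it.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, ids, fmt="default", style="raw"):
    """Build a request URL (hypothetical helper).

    db    -- database name as shown in parentheses, e.g. "uniprotkb"
    ids   -- one identifier or a list of identifiers
    fmt   -- data format name (assumed parameter)
    style -- result style name (assumed parameter)
    """
    if isinstance(ids, str):
        ids = [ids]
    params = {"db": db, "id": ",".join(ids), "format": fmt, "style": style}
    # Keep commas literal so multiple identifiers stay readable in the URL.
    return DBFETCH_BASE + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(dbfetch_url("uniprotkb", "P12345", fmt="fasta"))
```

Which formats and styles a given database accepts varies, so the values passed for `fmt` and `style` should be checked against that database's overview.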