Databases at the EBI

The main missions of the European Bioinformatics Institute (EBI) centre on building, maintaining and providing biological databases and information services to support data deposition and exploitation.

Nucleotide sequences

  • ENA (European Nucleotide Archive) – primary nucleotide data, incorporating EMBL-Bank
  • Ensembl – vertebrate genomes
  • Ensembl Genomes – non-vertebrate genomes
  • DGVa (Database of Genomic Variants archive) – genomic variants
  • EGA (European Genome-phenome Archive) – genetic and phenotypic data
  • IMGT/HLA – HLA system sequences
  • MetaGenomics – genomes from a specific environment
  • Patents – patent data resource
  • SVA (Sequence Version Archive) – historical repository of EMBL-Bank entries

Protein sequences

  • UniProt – protein databases, including UniProtKB/Swiss-Prot
  • InterPro – predictive protein signatures
  • IntEnz (Integrated relational Enzyme database) – enzyme data
  • PRIDE (PRoteomics IDEntifications) – proteomics data repository
  • UniProt-GOA (Gene Ontology Annotation) – GO annotations to UniProtKB
  • UniSave – historical repository of UniProtKB entry versions

Biological structures

  • PDBe (Protein Data Bank in Europe) – biological macromolecular structures
  • CSA (Catalytic Site Atlas) – active sites and catalytic residues
  • PDBsum – protein structure summaries
  • ProFunc – biochemical function prediction

Pathways & networks

  • BioModels – annotated biological models
  • IntAct – protein interactions
  • Reactome – biological pathways
  • Rhea – chemical reactions

Functional genomics

  • ArrayExpress – functional genomics experiments
  • BioSD (BioSample Database) – biological samples
  • Gene Expression Atlas – semantically enriched, condition-specific gene expression patterns

Small molecules

  • ChEBI (Chemical Entities of Biological Interest) – dictionary of small chemical compounds
  • ChEMBL – bioactive drug-like small molecules
  • EuroCarb – carbohydrate structures
  • GPCR SARfari – chemogenomics workbench focused on GPCRs
  • Kinase SARfari – chemogenomics workbench focused on kinases
  • PDBeChem – chemical components present in PDB entries
  • RESID – annotations and structures for protein modifications

Ontologies

  • GO (Gene Ontology) – standardised representation of gene and gene product attributes
  • QuickGO – web interface for browsing GO terms and annotations
  • GOA (Gene Ontology Annotation) – GO annotations to entries in UniProtKB
  • EFO (Experimental Factor Ontology) – ontology of experimental factors used in ArrayExpress
  • OLS (Ontology Lookup Service) – query multiple ontologies via a single interface
  • SBO (Systems Biology Ontology) – controlled vocabulary for systems biology

Literature

  • CiteXplore – combine biomedical literature searching with text mining tools
  • MEDLINE – bibliographic database from the US National Library of Medicine
  • UKPMC (UK PubMed Central) – access to biomedical and health research findings

Data browsing

  • EBI Search – user-friendly search across all our major databases
  • BioMart – unified access to biological databases worldwide
  • Dbfetch (Data Fetch Tool) – programmatic-style retrieval of entries from EBI databases
  • SRS (Sequence Retrieval System) – advanced querying of EBI databases
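
As an illustration of the programmatic retrieval that Dbfetch offers, the sketch below builds a request URL for one or more entries. The base URL, the parameter names (db, id, format, style) and the example accession are assumptions based on the commonly documented Dbfetch URL pattern and are not taken from this page; check them against the current EBI documentation before relying on them. The helper name dbfetch_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed service endpoint; verify against the current EBI documentation.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, entry_ids, fmt="fasta", style="raw"):
    """Build a Dbfetch-style URL for one identifier or a list of identifiers.

    The query parameter names used here are assumptions, not a confirmed API.
    """
    ids = [entry_ids] if isinstance(entry_ids, str) else list(entry_ids)
    if not ids:
        raise ValueError("at least one entry identifier is required")
    query = urlencode(
        {"db": db, "id": ",".join(ids), "format": fmt, "style": style},
        safe=",",  # keep the comma that separates multiple identifiers
    )
    return f"{DBFETCH_BASE}?{query}"
```

The function only constructs the URL and performs no network access. The resulting string can then be passed to any HTTP client, for example urllib.request.urlopen, to retrieve the entry text.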