We began with the world's first nucleotide sequence database: the EMBL Nucleotide Sequence Data Library (now EMBL Bank, part of the European Nucleotide Archive), established in 1980 at EMBL in Heidelberg, Germany. The original goal was to establish a central database of DNA sequences.

What began as a modest task of abstracting information from scientific literature soon grew into a major database activity, with researchers submitting their data directly and an ever-increasing demand for highly skilled informaticians to manage it all.

High-profile genome projects brought more attention to the project, and the commercial sector began to see the relevance of public data.

The EMBL Nucleotide Sequence Data Library clearly needed financial security to ensure its long-term viability.

EMBL-EBI is established in the UK

In 1992, EMBL Council voted to establish the EMBL-European Bioinformatics Institute (EMBL-EBI) and locate it on the Wellcome Trust Genome Campus in Hinxton, UK, where it would be in close proximity to the major sequencing efforts at the Wellcome Trust Sanger Institute.

The transition of two major bioinformatics services from Heidelberg to Hinxton began in 1992, and in September 1994 EMBL-EBI was firmly established in the UK.

The European Nucleotide Archive and the protein sequence resource UniProt (then known as Swiss-Prot–TrEMBL) were the original EMBL-EBI databases. Since then, EMBL-EBI has played a major part in the bioinformatics revolution.

We now provide the world’s most comprehensive range of molecular databases and offer an extensive user training programme. Our basic research programme has grown substantially, and remains closely tied to the evolution of our resources.

Bioinformatics today

Data at EMBL-EBI spans genomics, proteins, expression, small molecules, protein structures, systems, ontologies and the literature.

Researchers today depend on access to large data sets of many different types, spanning genes, proteins and the behaviour of small molecules.

Breakthrough methods such as DNA sequencing have changed the face of research in such a short time that they are considered to be disruptive technologies: they are so much better than what came before that it is difficult for researchers to adapt.

Life-science experiments are generating a flood of data every day, which is good news for researchers but poses practical challenges. The amount of data produced is doubling twice as quickly as computer storage and processing power, and this rate is increasing.
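
To get a feel for what this gap means in practice, the arithmetic can be sketched in a few lines of Python. The doubling times used here (12 months for data, 24 months for storage) are illustrative assumptions, not measured figures; only the two-to-one ratio comes from the statement above, and the sketch ignores the fact that the rate itself is increasing.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Return how many times a quantity has multiplied after `months`."""
    return 2.0 ** (months / doubling_time_months)


# Hypothetical doubling times, chosen only to reflect the 2:1 ratio.
DATA_DOUBLING_MONTHS = 12.0
STORAGE_DOUBLING_MONTHS = 24.0


def shortfall(months: float) -> float:
    """Ratio of data growth to storage growth after `months`."""
    data = growth_factor(months, DATA_DOUBLING_MONTHS)
    storage = growth_factor(months, STORAGE_DOUBLING_MONTHS)
    return data / storage


if __name__ == "__main__":
    for years in (2, 4, 6):
        factor = shortfall(years * 12)
        print(f"after {years} years, data has outgrown storage {factor:.0f}-fold")
```

Under these assumed numbers the shortfall itself doubles every two years, which is why the problem cannot be solved simply by waiting for hardware to catch up.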

Bioinformatics makes it possible to collect, store and add value to these data so that researchers in many fields can retrieve and analyse them efficiently. EMBL-EBI is one of very few places in the world that has the capacity and expertise to fulfil this important task.

It is all too easy to take for granted that data generated in publicly funded experiments will be stored, managed and kept freely available for researchers to query, well into the future.

Why bioinformatics matters

Biological data is the bedrock of life science research. Here are a few examples of how it can be used in beneficial ways:

  • Understanding plant genomes helps us identify which species will be most tolerant to drought, salt and pests while still providing optimum nutrition. EMBL-EBI hosts Ensembl Genomes, a service that lets researchers access and compare genome-scale data from agriculturally relevant species.
  • If we can identify patterns of genes that are active in different tumours, we can diagnose and treat cancers earlier.
  • Methicillin-resistant Staphylococcus aureus (MRSA) infection is a global problem. Small variations in DNA sequence can be used to track transmission, which can help identify the source of new outbreaks.
  • Resistance is growing to the one medicine used to treat schistosomiasis, a parasitic infection. Studying the schistosome genome will help identify the targets of existing drugs.
  • In order to develop new drugs, researchers need to identify targets and build on previous research by a vast number of individual R&D efforts. ChEMBL provides a freely available catalogue of bioactive, drug-like small molecules and the tools to explore them.
  • Short sections of DNA, called barcodes, are used to identify an organism. The Barcode of Life Initiative uses the European Nucleotide Archive to implement DNA barcoding as a global standard for identifying species, which will have applications in the protection of endangered species, sustaining natural resources through pest control, and food labelling.
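
The MRSA and barcoding examples above both rest on comparing sequences position by position. The following is a minimal sketch of that idea, assuming the sequences are already aligned and of equal length. The sequences and isolate names are invented for illustration, and real outbreak analysis works on whole genomes with dedicated alignment and phylogenetic tools rather than a simple count like this.

```python
from __future__ import annotations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_isolate(query: str, isolates: dict[str, str]) -> str:
    """Return the name of the isolate with the fewest differences from `query`."""
    return min(isolates, key=lambda name: snp_distance(query, isolates[name]))


# Toy, invented sequences; the isolate names are hypothetical.
ISOLATES = {
    "ward_A": "ACGTACGTAC",
    "ward_B": "ACGTTCGTAC",
    "community": "ACCTTCGAAC",
}
NEW_CASE = "ACGTTCGTAA"

if __name__ == "__main__":
    for name, seq in ISOLATES.items():
        print(name, snp_distance(NEW_CASE, seq))
    print("closest:", closest_isolate(NEW_CASE, ISOLATES))
```

The isolate with the fewest differences is the most plausible close relative of the new case, which is the intuition behind using small sequence variations to trace a source.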