The roots of the EMBL-EBI lie in the world's first nucleotide sequence database, the EMBL Nucleotide Sequence Data Library (now known as the European Nucleotide Archive), which was established in 1980 at the European Molecular Biology Laboratory in Heidelberg, Germany. The original goal was to establish a central database of DNA sequences, rather than have scientists submit sequences to journals.

What began as a modest task of abstracting information from scientific literature soon became a major database activity, with researchers submitting their data directly and a growing demand for highly skilled informaticians to manage it all. High-profile genome projects brought more attention to the project, and the commercial sector began to see the relevance of public data. The EMBL Nucleotide Sequence Data Library needed financial security to ensure its long-term viability.

In 1992, the EMBL Council voted to establish the EMBL-European Bioinformatics Institute (EMBL-EBI) and to locate it on the Wellcome Trust Genome Campus in Hinxton, UK, where it would be in close proximity to the major sequencing efforts at the Sanger Institute. The transition of two major bioinformatics services from Heidelberg to Hinxton began in 1992, and in September 1995 the EMBL-EBI was firmly established in the UK.

The European Nucleotide Archive (then the EMBL Data Bank) and the protein sequence resource UniProt (then known as Swiss-Prot–TrEMBL) were the original EMBL-EBI databases. Since then, the EMBL-EBI has played a major part in the bioinformatics revolution: we now provide the world’s most comprehensive range of molecular databases and offer an extensive user training programme. Our basic research programme has grown substantially and remains closely tied to the evolution of our resources.

Bioinformatics today

Researchers today depend on access to large data sets of many different types, from genes to protein interactions and the behaviour of small molecules. Breakthrough methods such as DNA sequencing have changed the face of research in such a short time that they are considered to be disruptive technologies. They are so much better than what came before that it is difficult for researchers to adapt.

A veritable flood of data is coming out of life science experiments every day, which is good news for researchers but poses some fascinating challenges. Consider that the amount of data produced is doubling twice as quickly as computer storage and processing power, and that this rate is increasing. It is all too easy to take for granted that data generated in publicly funded experiments will be stored, managed and kept freely available for researchers to query.

Bioinformatics makes it possible to collect, store and add value to these data so that researchers in many fields can retrieve and analyse them efficiently. The EMBL-European Bioinformatics Institute is one of very few places in the world that has the resources and expertise to fulfil this important task.

How bioinformatics impacts our lives

Biological data are the bedrock of life science research. Here are a few examples of how they can be used in beneficial ways:

  • Understanding plant genomes helps us identify which species will be most tolerant to drought, salt and pests while still providing optimum nutrition. EMBL-EBI hosts Ensembl Genomes, a service that lets researchers access and compare genome-scale data from agriculturally relevant species.
  • If we can identify patterns of genes that are active in different tumours, we can diagnose and treat cancers earlier.
  • Methicillin-resistant Staphylococcus aureus (MRSA) infection is a global problem. Small variations in DNA sequence can be used to track transmission, which helps identify the source of new outbreaks.
  • Resistance is growing to the one medicine used to treat schistosomiasis, a parasitic infection. Studying the schistosome genome will help identify the targets of existing drugs.
  • In order to develop new drugs, researchers need to identify targets and build on previous research by a vast number of individual R&D efforts. ChEMBL provides a freely available catalogue of bioactive, drug-like small molecules and the tools to explore them.
  • Short sections of DNA, known as barcodes, can be used to identify an organism. The Barcode of Life Initiative uses the European Nucleotide Archive to implement DNA barcoding as a global standard for identifying species, with applications in protecting endangered species, sustaining natural resources, pest control and food labelling.