International Genome Sample Resource

The 1000 Genomes Project is a fully open resource consisting of nearly 2500 sequenced individuals from five major world populations (Europeans, East Asians, Americans, Africans and South Asians). It is designed to capture and provide all common (>1%) genetic variation. Data is made freely and openly available in advance of publication.


Ensembl

The Ensembl project, founded in 1999 to support the results of the Human Genome Project, supports over 80 vertebrate species and provides resources such as reference gene sets, whole genome alignments, gene homology annotation, gene sequence alignments, variant annotation and regulatory regions. Many of these datasets have been adopted as authoritative references within the scientific community.

Ensembl Genomes

Ensembl Genomes is an integrating portal providing access to genome-scale data from across the taxonomic space. Using the same infrastructure developed in the context of the vertebrate-focused Ensembl project, it offers consistent interactive and programmatic user interfaces to data from important invertebrate metazoan, plant, fungal, protist and bacterial species. Key data types supported include genome sequence, (structural and functional) annotation of genes, regulatory elements, and polymorphisms; and comparative and evolutionary analyses.

Immuno Polymorphism Database

The Immuno Polymorphism Database (IPD) was developed in 2003 to provide a centralised system for the study of polymorphism in genes of the immune system. The IPD project was established by the HLA Informatics Group of the Anthony Nolan Research Institute in close collaboration with the European Bioinformatics Institute. The IMGT/HLA Database provides a specialist database for sequences of the human major histocompatibility complex (HLA) and includes the official sequences for the WHO Nomenclature Committee for Factors of the HLA System. The IMGT/HLA Database is part of the international ImMunoGeneTics project (IMGT).


TreeFam

TreeFam (Tree families database) is a database of phylogenetic trees of animal genes. It is a curated resource that aims to provide reliable information about ortholog and paralog assignments, as well as the evolutionary history of various gene families. TreeFam is also an ortholog database: it fits each gene tree into the universal species tree and identifies historical duplication, speciation and loss events.
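Fitting a gene tree into a species tree can be illustrated with a small sketch. The code below uses the standard lowest-common-ancestor (LCA) mapping, which is one common way to label duplications and speciations; it is not taken from TreeFam's own pipeline. The tree encoding (nested tuples), the function names and the example genes are all assumptions made for illustration, and loss events are not counted here.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Assumed encoding: a tree is a leaf name (str) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def collect_clades(tree: Tree, out: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Record the leaf set (clade) of every node in the species tree."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def lca(species: FrozenSet[str], clades: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Smallest species-tree clade that contains all the given species."""
    return min((c for c in clades if species <= c), key=len)


def label_events(gene_tree, species_of, clades, events):
    """Map a gene-tree node to a species clade and label internal nodes."""
    if isinstance(gene_tree, str):
        return frozenset([species_of[gene_tree]])
    child_maps = [label_events(c, species_of, clades, events) for c in gene_tree]
    mapped = lca(frozenset().union(*child_maps), clades)
    # A node mapping to the same clade as one of its children is a duplication.
    event = "duplication" if mapped in child_maps else "speciation"
    events.append((sorted(mapped), event))
    return mapped


def infer_events(gene_tree: Tree, species_tree: Tree, species_of: Dict[str, str]):
    """Return (mapped clade, event) for each internal gene-tree node, post-order."""
    clades: List[FrozenSet[str]] = []
    collect_clades(species_tree, clades)
    events: List[Tuple[List[str], str]] = []
    label_events(gene_tree, species_of, clades, events)
    return events


if __name__ == "__main__":
    # Hypothetical example: two human paralogs, one mouse and one zebrafish gene.
    species_tree = (("human", "mouse"), "zebrafish")
    species_of = {"h1": "human", "h2": "human", "m1": "mouse", "z1": "zebrafish"}
    gene_tree = ((("h1", "m1"), "h2"), "z1")
    for clade, event in infer_events(gene_tree, species_tree, species_of):
        print(event, clade)
```

In this toy example the node joining (h1, m1) with h2 maps to the same human-mouse clade as its child, so it is labelled a duplication; under this labelling h1 and m1 would be orthologs, while h2 would be a paralog of both.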


VectorBase

VectorBase is an NIH-NIAID Bioinformatics Resource Center providing genomic, phenotypic and population-centric data to the scientific community for invertebrate vectors of human pathogens.


PhytoPath

PhytoPath is a joint project of EMBL-EBI and Rothamsted Research to develop resources for fungal and oomycete plant pathogens, concentrating on genome annotation, comparative analysis and categorisation of infectious phenotypes. Developed from Ensembl Fungi, Ensembl Protists and the community-curated resource PHI-base, it makes genome analysis for phytopathogens available within the Ensembl system and integrates this with functional information about genes involved in pathogenesis.

TreeFam search

Searches TreeFam's HMM collection and allows a query sequence to be aligned (MAFFT) to, and inserted into the tree (RAxML) of, significantly matching models.


Gramene

Gramene is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. Its goal is to facilitate cross-species comparisons using information generated from projects supported by public funds. Gramene currently hosts annotated whole genomes for over two dozen plant species and partial assemblies for almost a dozen wild rice species in the Ensembl browser, genetic and physical maps with gene, EST and QTL locations, genetic diversity data sets, structure-function analysis of proteins, plant pathway databases (BioCyc and Plant Reactome platforms), and descriptions of phenotypic traits and mutations.