Using EBI Search

Here are the answers to some questions you may have about using the EBI search.

What are the different sections of the search results page?

After submitting a search, the results page is broken into four main sections:

A screenshot of the search results page

The largest section, on the right-hand side, contains the main search results. This is a list of the database records or pages found by the search, split by category and database. If your search was for a gene or protein, links to summaries are presented above the main search results in the section titled "Gene & Protein Summaries". Learn more about summaries.

In the left column of the page at the top is a list of categories. These are clickable links enabling you to navigate the main search results list by limiting it to items within a particular category. Learn more about navigating the results.

Below the categories is an expandable "Query Suggestions" panel. The panel lists clickable suggestions for additional search terms that further refine your search.

What kinds of things can I find?

All of the EBI's major database records are indexed by the search engine. This includes gene and protein sequences, protein families, structures, gene expression data, protein interactions, pathways and small molecules, to name a few.

You can also search across the academic literature, patents, static web pages on the EBI website and for EBI staff members.

The search is different. What are the changes?

Our new search service incorporates two major changes:

  1. A redesigned results page allows you to see the results of your query straight away, without needing to know which databases to search.
  2. Summaries of our gene- and protein-centric data have been added to the search results.

What are Gene and Protein Summaries?

These summaries are a new, useful way to explore the EBI's data from the perspective of a gene or protein, for certain key species. A summary collates data from several EBI databases to present an organised, relevant and easy-to-navigate view of a gene. It is arranged along the central dogma of molecular biology, and serves as a kind of homepage from which you can explore EBI resources that deal with specific areas of biology, such as Ensembl and UniProt. The summary page has a stable and unique location (URL), and can be exported or printed as a report.

The summary incorporates information about the gene and its genomic context, its expression within an organism and in response to experimental factors, a wide range of functional information about the protein along with its interaction partners and folded 3D structure. Peer-reviewed publications and patents relevant to the gene or protein are also included.

For each gene/protein, a summary comprises five individual sections that you can switch between. These are: gene, expression, protein, protein structure, and literature. You can also switch to another species in order to display equivalent information for a gene's orthologues.

An example of a summary for the 'TPI1' gene is below:

TPI1 summary information screenshot

Where does the information in summaries come from?

The data for our gene and protein summaries are sourced from the following EBI resources:

What can I enter in the search box?

You can search for database records by accession/identifier such as 'P15056', a gene symbol such as 'TPI1', any words in the description, a species, and other keywords. You can also explore a topic by searching with a keyword such as 'diabetes', and find EBI staff members by searching for their name.

You can combine multiple search terms by separating them with spaces, which will locate results containing all of the terms. You can also construct more complex queries; learn more in our technical documentation.
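As a rough illustration of how space-separated terms behave, the following Python sketch joins several terms into one query and checks whether a piece of text contains all of them. The function names build_query and matches_all_terms are hypothetical, and the simple case-insensitive word comparison is an assumption made for the example; the search engine's actual matching rules are not described on this page.

```python
def build_query(terms):
    """Join individual search terms with single spaces, as typed in the search box."""
    return " ".join(term.strip() for term in terms if term.strip())


def matches_all_terms(text, query):
    """Return True only if every space-separated query term occurs in the text.

    Assumption: a case-insensitive whole-word comparison stands in for the
    search engine's real matching rules.
    """
    words = set(text.lower().split())
    return all(term.lower() in words for term in query.split())


query = build_query(["TPI1", "human"])
print(query)
print(matches_all_terms("TPI1 triosephosphate isomerase 1 human", query))
print(matches_all_terms("TPI1 triosephosphate isomerase 1 mouse", query))
```

The second text is rejected because it lacks one of the two terms, which mirrors the "results containing all terms" behaviour.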

What are the "Discover more" links in search results?

When a record in the search results list is a gene, protein or protein structure, you can click this link to view a useful biological summary of relevant information for it, with links through to our source databases. See What are gene and protein summaries? for more details.

What are the "References" links in search results?

The "references" for a particular database record are its cross-references to different records in other EBI databases. For example, a gene in Ensembl is likely to have one or more references to proteins in UniProt which represent the gene's protein products.

What are the "View" links in some search results?

Some database records listed in the search results are available to view in more than one format or through more than one viewing application. For example, nucleotide sequences are commonly available to view on the European Nucleotide Archive's website, directly as flat files in EMBL format, etc.

What are the category links at the top left?

The links in this section function as filters over your search results. That is, each link represents a category of the type of records that can be searched, and clicking it restricts the search to only records within that category. Each category usually contains results from one or more databases, but there are also categories for the literature (Medline and Patents) and non-data web pages (e.g. press releases and staff).

Once you click a category to filter with, the search results to the right will be updated. Links will also appear underneath the category link you clicked representing subcategories or individual databases that you can choose to further limit the results. To return to searching all records, click "All results" above the category list.
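The filtering behaviour can be pictured with a small Python sketch. It assumes each result carries a category and a database label; the Result class, the filter_results function and the sample records are all hypothetical and are only meant to show how a category filter and an optional database filter narrow the list.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    category: str  # illustrative category name, e.g. "Literature"
    database: str  # illustrative database name, e.g. "Medline"


def filter_results(results, category=None, database=None):
    """Narrow a result list by category, then optionally by database.

    Passing neither argument corresponds to "All results".
    """
    selected = results
    if category is not None:
        selected = [r for r in selected if r.category == category]
    if database is not None:
        selected = [r for r in selected if r.database == database]
    return selected


results = [
    Result("Record A", "Literature", "Medline"),
    Result("Record B", "Literature", "Patents"),
    Result("Record C", "Protein sequences", "UniProt"),
]

print(len(filter_results(results)))
print([r.title for r in filter_results(results, category="Literature")])
print([r.title for r in filter_results(results, "Literature", "Patents")])
```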

What are the "query suggestions" links?

These links become visible when you expand the "query suggestions" panel on the left side of the page. They are suggestions for terms that you might add to your original query to further refine your search. The suggestions are derived automatically as terms that will each deliver a discrete subset of the existing results.
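One way to picture how such suggestions could be derived is sketched below in Python. This is not the search engine's actual method, which is not described here. The sketch assumes that a term is a useful suggestion when it appears in at least one, but not all, of the current results, so that adding it would return a smaller, non-empty subset. The function name suggest_terms and the sample texts are hypothetical.

```python
from collections import Counter


def suggest_terms(result_texts, query, limit=5):
    """Suggest extra terms that would narrow the current results.

    Assumption: a term qualifies if it occurs in some, but not all, of the
    current results and is not already part of the query.
    """
    query_terms = {t.lower() for t in query.split()}
    counts = Counter()
    for text in result_texts:
        # Count each term at most once per result.
        counts.update(set(text.lower().split()) - query_terms)
    total = len(result_texts)
    candidates = [(term, n) for term, n in counts.items() if 0 < n < total]
    # Most frequent first, then alphabetical for a stable order.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in candidates[:limit]]


texts = [
    "diabetes insulin human",
    "diabetes insulin mouse",
    "diabetes glucose human",
]
print(suggest_terms(texts, "diabetes"))
```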

How are results ordered?

Results are ordered by relevance to your search query. This incorporates both:

  1. How well your search terms match a field. For example, a field that exactly matches the terms is a better match than one that contains the terms among a long list of other terms.
  2. How important the matched field is in describing the record. For example, matches against an identifier are more relevant than those matching the description.
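The two factors above can be combined in many ways. The Python sketch below shows one simple possibility, multiplying a match-quality score by a per-field weight. The weights, field names, scoring formula and sample records are all assumptions made for illustration; the real ranking function is not documented on this page.

```python
# Hypothetical field weights: identifiers count for more than descriptions.
FIELD_WEIGHTS = {"identifier": 3.0, "name": 2.0, "description": 1.0}


def match_quality(field_value, query):
    """Score how well the query matches one field, from 0.0 to 1.0.

    An exact match scores 1.0. Otherwise the score is the fraction of the
    field's words taken up by query terms, so a match buried in a long
    list of other terms scores low.
    """
    field_words = field_value.lower().split()
    query_words = query.lower().split()
    if not field_words or not query_words:
        return 0.0
    if field_words == query_words:
        return 1.0
    if not all(word in field_words for word in query_words):
        return 0.0
    return len(query_words) / len(field_words)


def relevance(record, query):
    """Use the best weighted field score as the record's relevance."""
    return max(
        (FIELD_WEIGHTS.get(field, 1.0) * match_quality(value, query)
         for field, value in record.items()),
        default=0.0,
    )


# Made-up records: one mentions the query in its description only,
# the other has it as its identifier.
records = [
    {"identifier": "X0001", "description": "record that mentions P15056 among other terms"},
    {"identifier": "P15056", "description": "hypothetical record description"},
]
ranked = sorted(records, key=lambda r: relevance(r, "P15056"), reverse=True)
print([r["identifier"] for r in ranked])
```

Here the record whose identifier matches exactly is ranked ahead of the record that only mentions the term within a longer description.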

Why do some search queries not have Gene and Protein Summaries?

The search engine will only display links to summaries at the top of the search results if it is able to confidently identify a gene or protein from your query. To do this, it matches against:

  • A gene identifier, as used by Ensembl or Ensembl Genomes (depending on the species).
  • A protein identifier: either UniProt ID or UniProt accession.
  • A protein structure identifier from PDBe.
  • A gene symbol, or one of its synonyms.

It is not usually possible to identify a gene from only its long description.
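To make the idea of "confidently identifying" a query more concrete, here is a minimal Python sketch that classifies a query by its shape. The patterns are assumptions: the Ensembl pattern covers only vertebrate-style gene identifiers (identifiers for other species may follow different formats), the UniProt pattern covers accessions but not UniProt IDs, and the symbol lookup is a hypothetical one-entry set. The function name classify_query is also hypothetical, and the real search engine may work quite differently.

```python
import re

# Assumed patterns, for illustration only.
ENSEMBL_GENE = re.compile(r"ENS[A-Z]*G\d{11}")  # vertebrate-style Ensembl gene IDs
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")  # four-character structure codes

# Hypothetical lookup of gene symbols and synonyms, stored upper-cased.
KNOWN_SYMBOLS = {"TPI1"}


def classify_query(query):
    """Return the kind of identifier the query looks like, or None."""
    text = query.strip()
    if ENSEMBL_GENE.fullmatch(text):
        return "gene identifier"
    if UNIPROT_ACCESSION.fullmatch(text.upper()):
        return "protein identifier"
    if PDB_ID.fullmatch(text):
        return "protein structure identifier"
    if text.upper() in KNOWN_SYMBOLS:
        return "gene symbol"
    # Free text, such as a long description, is not recognised.
    return None


for example in ["P15056", "TPI1", "triosephosphate isomerase"]:
    print(example, "->", classify_query(example))
```

A long description falls through every check and returns None, which corresponds to the case where no summary links are shown.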

Which species can have Gene and Protein Summaries?

At present we are only able to offer summaries for genes and proteins from a core set of five species: human, mouse, yeast, nematode and fruit fly. More species may be added in future. In the meantime, the full range of species is still included in the main search results list.

Can you explain the list of summaries for my gene?

If you search for a gene, protein or structure, the results will include links to summaries for the gene in one or more species. Orthologous genes are grouped together in the list, even if they do not have the same name. Orthologues from different species that each match your search will be displayed by default. If there are more orthologues from other species, you can see them by clicking "View all organisms in this group".

Sometimes more than one group of genes will match your search. This is because they share a gene name or synonym, but are not orthologues. When this happens, the best match will be displayed first, and the others can be viewed by clicking the "More" button underneath it.

Why are there multiple Gene and Protein Summaries for some genes?

It is not always possible to show a 1:1 relationship between orthologues, either because of the underlying biology or because the ancestry of a group of genes is difficult to ascertain. When gene duplications occur within a species after a species divergence event, this can result in, for example, one human gene having three orthologues in the nematode.

How can I find more advanced or technical information?

Please consult the Search technical documentation.