Analyse gene list

This feature can be used to investigate a list of protein/small molecule identifiers with or without expression data (Figure 7). The tool performs over-representation and pathway-topology analysis on the input data.

Figure 7 Over-representation tool in Reactome interface.

Input format

Qualitative data

Your data should be a single column of identifiers, such as UniProt IDs, gene symbols or ChEBI IDs, beneath a header row that begins with a hash symbol (Figure 8). The identifiers are matched to Reactome pathways and tested for over-representation and pathway topology.
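To give a feel for what an over-representation test measures, here is a minimal sketch using the hypergeometric distribution, a common choice for this kind of analysis. This section does not state which statistical test the tool applies, so treat the test as an assumption; the function name and its parameters are hypothetical.

```python
from math import comb


def over_representation_p(hits, list_size, pathway_size, background_size):
    """Probability of seeing at least `hits` pathway members by chance.

    Assumes `list_size` identifiers are drawn without replacement from
    `background_size` molecules, `pathway_size` of which belong to the pathway
    (a one-sided hypergeometric test; an illustrative assumption, not
    necessarily the test the tool itself runs).
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    # Sum the probability mass for k = hits .. upper pathway members in the list.
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

A small p-value suggests the pathway contains more of the submitted identifiers than expected by chance. In practice the result would also need correcting for the number of pathways tested.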

Quantitative data

Your data should have a column of identifiers as before and one or more additional columns of numbers (expression or dose-response values) (Figure 9). This will be recognised as quantitative data and an additional overlay process will be run along with over-representation. The numbers are used to produce a coloured overlay for Reactome pathway diagrams.

Figure 8 Example input list that contains a mixture of UniProt IDs, gene names, NCBI gene IDs and KEGG small molecule IDs. Note the hash symbol at the beginning of the header row.

Figure 9 Example of expression values in a simulated microarray dataset. Note the hash symbol at the beginning of the header row.
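The two layouts can be sketched in a few lines of Python. This is an illustration only: the helper name `build_input`, the column names, the tab separator and the example identifiers are assumptions, not taken from the figures. The one detail drawn from the captions is the hash symbol at the start of the header row.

```python
def build_input(identifiers, samples=None, id_header="Identifier"):
    """Return input text: a '#'-prefixed header row, then one row per identifier.

    `samples` is an optional dict mapping a column name to a list of numbers,
    one per identifier. Columns are tab-separated (an assumption).
    """
    samples = samples or {}
    for name, column in samples.items():
        if len(column) != len(identifiers):
            raise ValueError(
                f"column {name!r} has {len(column)} values, expected {len(identifiers)}"
            )
    # Header row starts with a hash symbol, as in Figures 8 and 9.
    lines = ["#" + "\t".join([id_header, *samples])]
    for i, identifier in enumerate(identifiers):
        row = [identifier] + [str(column[i]) for column in samples.values()]
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


# Qualitative data: identifiers only.
qualitative = build_input(["P04637", "TP53", "CHEBI:15377"])

# Quantitative data: identifiers plus one or more numeric columns.
quantitative = build_input(
    ["P04637", "Q9Y6K9"],
    {"sample_1": [1.5, -0.3], "sample_2": [2.0, 0.1]},
)
```

Either string can be pasted into the submission form or written to a text file for upload.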

Identifier mapping

The submission process recognises many types of identifiers. As part of the pre-analysis, they are mapped to Reactome molecules. The ideal identifiers to use are UniProt IDs for proteins, ChEBI IDs for small molecules, and either HGNC gene symbols or Ensembl IDs for DNA/RNA molecules, as these are Reactome's main external reference sources.

Many other identifiers are recognised and mapped to appropriate Reactome molecules; a list of these can be found in our user guide.
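Before submitting, it can be useful to check locally how much of a list is already in one of the preferred identifier formats. The sketch below guesses the type of each identifier from its shape alone. It is not how the tool performs its mapping; the function name and the "other" fallback are hypothetical, and the Ensembl pattern is a simplified assumption.

```python
import re

# Shape-based patterns, tried in order. The UniProt pattern follows the
# published accession format; the other two are simplified assumptions.
PATTERNS = [
    (
        "UniProt",
        re.compile(
            r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
            r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
        ),
    ),
    ("ChEBI", re.compile(r"CHEBI:\d+")),
    ("Ensembl", re.compile(r"ENS[A-Z]*[GTP]\d{11}")),
]


def guess_identifier_type(identifier):
    """Return 'UniProt', 'ChEBI', 'Ensembl', or 'other' based on format only."""
    token = identifier.strip()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    # Gene symbols and all other identifier types fall through to here.
    return "other"
```

A match only shows that the string has the right shape; it does not confirm that the identifier exists or that it maps to a Reactome molecule.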

By default, all non-human identifiers are converted to their human equivalents. If you want to use non-human identifiers to search a computationally inferred non-human Reactome pathway, uncheck the "Project to human" box. You may also prefer to uncheck this box if your query consists of a mixture of human and microbial identifiers, representing an infection.