
Cellular Phenotype Database


Submitting data to the Cellular Phenotype database

The Cellular Phenotype database accepts data from high-throughput phenotypic studies. Such studies allow living cells to be screened under a wide range of experimental conditions and give access to a whole panel of cellular responses to a specific treatment. Substances such as small molecules and peptides, or techniques such as RNA interference (RNAi), can be applied to examine the effects, or phenotypes, that they induce in cells, with the aim of elucidating novel gene function as well as screening compounds for desirable therapeutic effects.

Submitting data associated with RNAi studies

All phenotypic data currently stored in the database were derived from RNAi treatment of cultured human cells, and the phenotypes were recorded by high-throughput live cell imaging.

For each RNAi-based study that you wish to submit, we kindly ask you to provide the following three files:

  1. The study description file, which you can create using this spreadsheet template. Instructions for filling it in can be found in the spreadsheet itself.

    With this spreadsheet we want to capture some generic information about the study, including title, description, and publication details (if applicable), and specific information about each screen included in the study (e.g. primary screen, validation screen). For each screen in the study, we request details such as target organism, materials used (e.g. cell line), names and descriptions of the experimental and analytical protocols used, as well as names and descriptions of the phenotypes observed.

    If you need additional fields to describe your data, feel free to add them to the spreadsheet, but provide the relevant annotation needed to understand the content of the new field(s).
  2. The library annotation file, including, at the absolute minimum, siRNA reagent IDs, sequences and position in the experimental layout (e.g. plate and position in the plate). This file should be saved in tab-delimited format.

    siRNA sequences are fundamental, as they are used to map siRNAs to the reference genome when a new release becomes available. We are therefore unable to accept datasets for which the siRNA sequences are not available.

  3. The processed data file (one for each screen in the study). This file should be saved in tab-delimited format. It should contain the screen results in the form of a table where each row corresponds to a plate position, and therefore to a siRNA ID. Additional columns must contain the phenotype(s) assigned to each position/siRNA ID and the score(s) (e.g. Z-score, p-value) used to assign a phenotype to that position/siRNA ID. In special cases when raw image data are also to be submitted, we require an additional column giving, for each position in the processed data file, the full path of the image(s) associated with that position/siRNA ID. This column can be split into several columns (one per channel), if needed.
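Before submitting, it can help to check that the two tab-delimited files agree with each other. The following is a minimal sketch of such a check, not an official validator. The column names (`sirna_id`, `sequence`, `plate`, `well`, `phenotype`, `z_score`), the helper functions, and the sample rows are all assumptions made for illustration; the database does not prescribe these headers.

```python
import csv
import io

# Hypothetical column names, chosen for this example only.
LIBRARY_COLUMNS = ["sirna_id", "sequence", "plate", "well"]
PROCESSED_COLUMNS = ["plate", "well", "sirna_id", "phenotype", "z_score"]


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def check_library(rows):
    """Report library rows with missing fields or non-nucleotide sequences."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        for column in LIBRARY_COLUMNS:
            if not (row.get(column) or "").strip():
                problems.append(f"line {line_no}: missing {column}")
        sequence = (row.get("sequence") or "").strip().upper()
        if sequence and set(sequence) - set("ACGUT"):
            problems.append(f"line {line_no}: unexpected characters in sequence")
    return problems


def check_processed(rows, library_rows):
    """Report processed rows that are incomplete or absent from the library."""
    known = {(r.get("plate"), r.get("well"), r.get("sirna_id")) for r in library_rows}
    problems = []
    for line_no, row in enumerate(rows, start=2):
        for column in PROCESSED_COLUMNS:
            if not (row.get(column) or "").strip():
                problems.append(f"line {line_no}: missing {column}")
        key = (row.get("plate"), row.get("well"), row.get("sirna_id"))
        if key not in known:
            problems.append(
                f"line {line_no}: {key[0]}/{key[1]}/{key[2]} is not in the library annotation"
            )
        try:
            float(row.get("z_score") or "")
        except ValueError:
            problems.append(f"line {line_no}: z_score is not numeric")
    return problems


# Made-up sample data: the second library row lacks a sequence, and the
# second processed row refers to a position that the library does not list.
library_text = (
    "sirna_id\tsequence\tplate\twell\n"
    "S1\tGCUACGAUCGAUCGAUCGA\tP1\tA01\n"
    "S2\t\tP1\tA02\n"
)
processed_text = (
    "plate\twell\tsirna_id\tphenotype\tz_score\n"
    "P1\tA01\tS1\tmitotic delay\t3.2\n"
    "P1\tB05\tS9\tcell death\t4.1\n"
)

library_rows = read_tsv(library_text)
library_problems = check_library(library_rows)
processed_problems = check_processed(read_tsv(processed_text), library_rows)
print(library_problems)
print(processed_problems)
```

To check real files, read them with `open(path, newline="")` and pass the resulting reader rows to the same functions, adjusting the column lists to match your own headers.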

Some examples of datasets already loaded in the Cellular Phenotype database are available below. If you need help with your submission or want more examples, contact us at this email.

Study title: Phenotypic profiling of the human genome by time-lapse microscopy reveals cell division genes
Study description: A systematic analysis of genes and proteins that are required for chromosome segregation and cell division in human cells was carried out. All 22,000 human genes were inactivated one by one in cultured human cells using RNAi. The cellular phenotypes were recorded by high-throughput live cell imaging. Automated analyses of the resulting images revealed that some 600 out of the 22,000 human genes play a role in mitosis.
Screens: Primary screen and validation screen
Download study description
View publication
View database entry

Study title: EMBL-secretion screen
Study description: An RNAi-based high content screening microscopy platform was used to screen a genome-wide library of over 51,000 small interfering RNAs (siRNAs) targeting approximately 22,000 human genes for interference with ER-to-plasma membrane transport of the well-characterized secretory cargo membrane protein tsO45G.
Screens: Primary screen
Download study description
View publication
View database entry

Submitting data associated with other study types

The Cellular Phenotype database is built using a flexible, open-source, document-oriented NoSQL database management system, MongoDB, and allows for the loading of data in different formats.

If the data that you wish to submit to us cannot be adequately represented with the format proposed above, please get in touch with us at phenotype@ebi.ac.uk.
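To illustrate why a document-oriented store can accommodate different study types, the sketch below represents two studies as plain Python dictionaries whose result records carry different fields. Everything in it is hypothetical: the field names, the sample values, and the `result_fields` helper are assumptions made for illustration and do not reflect the database's actual schema.

```python
import json

# Hypothetical study documents; field names are illustrative only.
rnai_study = {
    "title": "Example RNAi study",
    "screens": [
        {
            "name": "Primary screen",
            "results": [
                {"plate": "P1", "well": "A01", "sirna_id": "S1",
                 "phenotype": "mitotic delay", "z_score": 3.2},
            ],
        }
    ],
}

compound_study = {
    "title": "Example compound study",
    "screens": [
        {
            "name": "Primary screen",
            "results": [
                {"plate": "P1", "well": "A01", "compound_id": "C1",
                 "concentration_uM": 10, "phenotype": "cell death"},
            ],
        }
    ],
}


def result_fields(study):
    """Collect the distinct field names used by a study's result records."""
    return sorted(
        {key
         for screen in study["screens"]
         for result in screen["results"]
         for key in result}
    )


# Both documents serialise to JSON even though their result fields differ.
for study in (rnai_study, compound_study):
    print(study["title"], result_fields(study))
    json.dumps(study)
```

Because each document carries its own field names, a study that needs extra columns does not force a change on the records of other studies.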

[Figure: A generic view of our approach to data processing]

[Figure: Schema of the study collected]