
What kind of data should we share?

For published studies to be repeatable, it is not enough to make the raw data (the unprocessed data files obtained from the microarray scanner or the sequencing machine) available. Detailed information and protocols describing the overall study and the individual samples (known as metadata) must be provided as well.

What metadata are required?

Providing sufficient metadata is essential for understanding the associated data, making the data reusable and making the results reproducible. Several ‘minimum metadata reporting standards’ have been created with that aim. These include the MIAME (Minimum Information About a Microarray Experiment) guidelines for microarray data and MINSEQE (Minimum Information about Sequencing Experiments) for high-throughput sequencing data such as RNA-seq (Figure 2).

Both MIAME and MINSEQE emphasise the importance of providing not only the raw data but also the following:

  • general information about the experiment (e.g. a summary of the experiment and its goals, contact information, and any associated publication)
  • sample-data relationships (e.g. which raw data file relates to which sample, which hybridisations are technical replicates and which are biological replicates)
  • detailed sample annotation, including ‘obvious’ information such as the organism from which the samples were obtained, as well as highlighting the experimental factors and their values (e.g. listing ‘compound’ and ‘dose’ as factors in dose-response experiments, and specifying which compound was used and at what dose)
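
The points above can be checked mechanically before submission. The following is a minimal sketch, not an official MIAME or MINSEQE validator: the column names (sample, organism, raw_file), the function find_metadata_problems and the example table are all hypothetical and chosen only for illustration. It assumes the sample annotation is kept as a simple table with one row per raw data file, and it reports rows that lack the organism, a raw data file or a value for any declared experimental factor.

```python
import csv
import io

# Hypothetical column names for illustration only; these are not
# field names defined by MIAME, MINSEQE or any submission tool.
REQUIRED_COLUMNS = ["sample", "organism", "raw_file"]


def find_metadata_problems(rows, factors):
    """Return a list of human-readable problems in a sample annotation table.

    rows    -- iterable of dicts, one per raw data file
    factors -- names of the experimental factor columns (e.g. compound, dose)
    """
    problems = []
    seen_files = set()
    for row_no, row in enumerate(rows, start=1):
        # Every required column and every declared factor needs a value.
        for column in REQUIRED_COLUMNS + list(factors):
            if not (row.get(column) or "").strip():
                problems.append(f"row {row_no}: missing '{column}'")
        # Each raw data file should be linked to exactly one row.
        raw_file = (row.get("raw_file") or "").strip()
        if raw_file:
            if raw_file in seen_files:
                problems.append(f"row {row_no}: raw file '{raw_file}' listed twice")
            seen_files.add(raw_file)
    return problems


# Made-up dose-response annotation with 'compound' and 'dose' as factors.
EXAMPLE_TABLE = """sample,organism,raw_file,compound,dose
s1,Homo sapiens,s1.fastq.gz,aspirin,10 mg
s2,Homo sapiens,s2.fastq.gz,aspirin,
s3,,s2.fastq.gz,none,0 mg
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TABLE)))
for problem in find_metadata_problems(rows, factors=["compound", "dose"]):
    print(problem)
```

With the made-up table above, the sketch flags the missing dose in row 2, the missing organism in row 3 and the raw data file that is linked to two rows. A check like this only catches gaps in the table; it cannot tell whether the annotation itself is accurate.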

Figure 2 Minimum Information About a Microarray Experiment (MIAME) and Minimum Information about SEQuencing Experiments (MINSEQE).

Curation of functional genomics experiments is essential to ensure that public datasets contain enough metadata for the experiment to be understood without referring to an associated paper. Unfortunately, the metadata in public datasets are often insufficient and, even when extra information can be found in an associated publication, that information is often incomplete and in a non-standard format.

 

Tools to help annotate your experiments

Annotation tools such as Annotare help you annotate your experiments easily and so improve the quality of the metadata (see section ‘How to submit your data’).