BioStudies Team

Aggregating all publication data

The team develops and runs the BioStudies database that provides access to all data outputs of a life sciences study from a single place, thereby facilitating research transparency and reproducibility. We also provide the underlying infrastructure for the BioImage Archive. 

The BioStudies database holds descriptions of biological studies, links to data generated in these studies in key community databases at EMBL-EBI and elsewhere, and “orphan data” traditionally provided as supplementary materials. The database accepts data from a wide range of studies described in a simple format, and imposes no minimum requirements beyond those agreed by the respective community.

The overall goal of BioStudies is to facilitate transparency and reproducibility of research by aggregating all the outputs of a study (a ‘data package’) in a single place. We enable this aggregation across the various stages of research:

  • Ideally, scientists consider data management and publishing while the investigation is running, and we are partnering with projects like RISK-HUNT3R to make sure that well-structured data is captured, ready for release upon publication.
  • When data is captured during the manuscript preparation stage, we enable authors to submit supplementary information and cite it in the publication. Specialized community databases should be used where applicable; BioStudies enables linking to these and accepts orphan data that have no dedicated ‘home’ resource.
  • The BioStudies system can be used for an emerging public data infrastructure, as exemplified by the BioImage Archive.
  • We also create data packages after publication, by importing supplementary data and text-mined database links from Europe PMC, and curated data on figures from the SourceData project.
  • BioStudies can also help with publishing data that live in a resource being retired – an example is ArrayExpress.

Please contact us at biostudies@ebi.ac.uk for further information or collaboration ideas.

Data resources

BioStudies Database

The BioStudies database holds descriptions of biological studies, links to data from these studies in other databases at EMBL-EBI or outside, as well as data that do not fit in the structured archives at EMBL-EBI. The database can accept a wide range of types of studies described via a simple format…

EBI BioSamples Database

The BioSamples database aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of EMBL-EBI’s assay databases such as ArrayExpress, the European Nucleotide Archive or PRIDE, the proteomics identifications database. It provides links to as…

MageComet

MageComet is an online tool that facilitates curation of MAGE-TAB. It combines text-mining and the Experimental Factor Ontology (EFO) to create a semi-automatic environment for faster annotation, guided data manipulation and content summary.
