Access data

Noor has found a dataset that could be useful for the study of insulin-dependent diabetes and has discovered that accessing the cohort data requires submitting an application.

On the other hand, Sam wants to access synthetic cohort data, which is usually more straightforward than accessing real cohort data.
Once you have identified a cohort dataset that you would like to use in your study, you will need to apply for access to the data.
Access is frequently managed by a Data Access Committee (DAC). A DAC is composed of one or more named individuals who are responsible for deciding whether to release data to external requestors, based on consent and/or National Research Ethics terms. A DAC is typically, but not necessarily, formed from the same organisation that collected the samples and generated any associated analyses. Before approving access to a dataset, a DAC will usually require the requestor to agree to and sign a Data Access Agreement, which outlines the rules and responsibilities associated with accessing and using the data. You can find more information about DACs and their functions in the DAC section of the European Genome-Phenome Archive (EGA).
Keep in mind that one benefit of synthetic datasets is that, because the information they contain is not real, they require fewer access controls. Therefore, many of the remaining hands-on activities in this learning pathway use synthetic datasets, which avoids the need to sign Data Access Agreements and wait for DAC approval.
Remember that you can find several synthetic datasets from CINECA and other initiatives in the Introduction of the Learning pathway.
How do you request access to a dataset stored at EGA?
Follow the steps below to apply for access to either real or synthetic cohort data at the EGA. Keep in mind that the final step (step 7) can take a long time, depending on how quickly the DAC makes its access decision.
- First, register for a Life Science Login account, if you do not already have one.
- Request an EGA account by sending an email to helpdesk@ega-archive.org.
- Link your Life Science Login and your EGA account by logging in first with your EGA account and then with your Life Science Login account here: https://ega.ebi.ac.uk:8443/ega-openid-connect-server/ega-login
- Use your Life Science Login to connect to: https://data-access.sd.csc.fi
- Access the catalogue and find your dataset of interest, for example “CINECA Synthetic dataset EUROPE UK1” or “Human genomic and phenotypic synthetic data for the study of rare diseases”. Click “Apply”.
- Fill in and submit the data access request. You can find a demo of this step below.
- Wait for confirmation that your data access request has been approved. Once you receive it, you can access the dataset.
Alternatively, you can email the EGA Helpdesk and request access to the synthetic dataset of interest. If you do not have an EGA account already, one will be created for you. You should receive confirmation that your data access request is approved and you can access the dataset.
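As a minimal sketch of this email route, the Python snippet below drafts (but does not send) a request to the EGA Helpdesk using only the standard library. The helpdesk address and the dataset title come from the steps above; the function name `draft_access_request`, the sender address, the requester name and the wording of the message are hypothetical placeholders. The EGA Helpdesk may ask for additional details, so treat this as an illustration only.

```python
from email.message import EmailMessage

# Address given in the steps above.
EGA_HELPDESK = "helpdesk@ega-archive.org"


def draft_access_request(sender: str, full_name: str, dataset_title: str) -> EmailMessage:
    """Build (but do not send) an access-request email for the EGA Helpdesk.

    The wording is an assumption, not an official EGA template.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = EGA_HELPDESK
    msg["Subject"] = f"Access request: {dataset_title}"
    msg.set_content(
        "Dear EGA Helpdesk,\n\n"
        "I would like to request access to the synthetic dataset "
        f'"{dataset_title}".\n'
        "I do not have an EGA account yet.\n\n"
        f"Kind regards,\n{full_name}\n"
    )
    return msg


if __name__ == "__main__":
    draft = draft_access_request(
        sender="sam@example.org",  # hypothetical sender address
        full_name="Sam",  # hypothetical requester
        dataset_title="CINECA Synthetic dataset EUROPE UK1",
    )
    # Print the draft so it can be reviewed and pasted into a mail client.
    print(draft)
```

Printing the draft rather than sending it keeps the example free of mail-server configuration and lets you review the text before contacting the Helpdesk.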
Watch this video if you want to learn how to authorise access to the sensitive data from your research.
You will find an example of accessing data through a DAC in the tutorial of the ‘Federated Data analysis’ section of this learning pathway.