Date: Thursday 26 - Friday 27 March 2020
Venue: Aristo Amsterdam - Teleportboulevard 100, 1043 EJ, Amsterdam, Netherlands
Application opens: Thursday 31 October 2019
Application deadline: Friday 31 January 2020
Participation: First come, first served
On the first day of the course, we will introduce the tranSMART loading pipeline and file format requirements. In a hands-on session, we will import data while working through the data loading principles, steps and tooling.
On the second day, we will introduce the cBioPortal loading pipeline and file format requirements. In a hands-on session, we will clean up some real-world data and import it as a study in cBioPortal. We will provide servers with cBioPortal installed, on which participants can practice data loading.
The exercises will cater to participants with varying skill levels, from non-technical data managers to data engineers with more programming experience.
All training material will be made publicly available and will be shared with the participants.
This course is primarily aimed at data stewards and data managers involved in translational research. It will also be useful to researchers with an interest in data management or whose work involves a substantial amount of data management.
No programming experience is required.
Participants should bring their own laptops, which should comply with the following requirements:
• Ability to connect to wifi.
• Internet browsers: latest version of Google Chrome or Mozilla Firefox.
• Computer with MS Excel and PuTTY (PuTTY installer).
Please review the tranSMART documentation on GitHub before the course to get an idea of the data formats (see resources).
Please also review the cBioPortal "Steps to Load a Study" page and the structure of the cBioPortal file formats for importing (see resources).
Most of the classroom training and at least one exercise can be followed by participants without programming experience, using only Excel.
For the exercises on more advanced data transformation and independent data loading, participants will need a basic understanding of programming, preferably in Python. They will also need to be able to install the following on their machine: Python and public Python packages (e.g. pandas; R scripts are also an option), Jupyter Notebooks and Java-based executables.
After this course, you should be able to:
- Explain what tranSMART and cBioPortal are and describe their high-level data structure, including clinical data and a sample of high-dimensional data tables.
- Fill in a tranSMART clinical data template for a dataset already structured as one row per patient.
- Prepare and check files for import to cBioPortal.
- If you have programming experience, you should also be able to:
  - independently perform simple data transformations,
  - export tranSMART-ready loading files, and
  - load these onto a provided tranSMART server.
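To give an idea of what a "simple data transformation" can look like, the following is a minimal sketch in plain Python that reshapes a long-format table (one row per patient and variable) into one row per patient. The column names `patient_id`, `variable` and `value` and the sample data are hypothetical, chosen only for illustration; this is not the tranSMART or cBioPortal template format, which is covered in the course material.

```python
import csv
import io

# Hypothetical long-format input: one row per (patient, variable) pair.
LONG_FORMAT = """patient_id\tvariable\tvalue
P001\tage\t54
P001\tsex\tF
P002\tage\t61
P002\tsex\tM
"""


def to_one_row_per_patient(text):
    """Pivot tab-separated long-format rows into one dict per patient."""
    patients = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        pid = row["patient_id"]
        record = patients.setdefault(pid, {"patient_id": pid})
        record[row["variable"]] = row["value"]
    return list(patients.values())


def to_tsv(records):
    """Write records as tab-separated text; missing values become empty cells."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


wide = to_one_row_per_patient(LONG_FORMAT)
print(to_tsv(wide))
```

The same reshaping can be done with a pivot in pandas or in R; the standard library is used here only to keep the sketch free of extra installs.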
Day 1 – Thursday 26 March 2020
| 9:00-13:00 | tranSMART 16.2 data loading, followed by tranSMART app user-interface training | Netherlands Cancer Institute (NKI) |
| 14:00-17:30 | tranSMART 17.x data loading, followed by Glowing Bear user-interface training | The Hyve |
| 17:30 | End of day |
Day 2 – Friday 27 March 2020
| 9:00-13:00 | cBioPortal loading pipeline and file format requirements | NKI |
| 14:00-17:00 | cBioPortal loading pipeline and file format requirements (continued), followed by user-interface training | NKI |
| 17:00 | End of day |