Bioinformatics pioneers: Helen Parkinson

“So, I thought, if no one knows the answer, this would be a good thing for me to work on.” - Helen Parkinson, Head of Molecular Archives at EMBL-EBI

Helen Parkinson started her career as a researcher, studying Drosophila behaviour, molecular biology and medical genetics. Now, she leads EMBL-EBI’s Samples, Phenotypes and Ontologies team and has strategic oversight of two foundational resources: the European Nucleotide Archive (ENA), which is fully open, and the European Genome–phenome Archive (EGA), which contains controlled-access data.

We asked Helen how the twists and turns of her career brought her to this point, and what advice she has for others navigating the shifting landscape of science and technology careers.

Q&A with Helen Parkinson

This field is relatively new – how did you get into it?

I am a Drosophila biologist by training, and did my predoctoral research on circadian rhythms in flies. That was a really exciting field to be in at the time – so much so that this year, the work of Hall, Rosbash and Young on biological clocks was recognised with a Nobel Prize. The work I was doing required getting comfortable with a lot of different disciplines: genetics, molecular biology, statistics, sequencing and so forth. So it was great all-around training.

I shifted from fly genetics to human genetics, and for expediency started building databases for the gene we were trying to clone. This was before the human genome was finished, so it was all rather new. I found that I actually liked working with data more than working in the wet lab.

Since I was mostly doing sequence analysis, I came to EMBL-EBI – and you know, for someone who was doing a lot of analysis, it was like coming to the ‘mother ship’.

My PhD supervisor had already suggested that I might enjoy working on FlyBase after I finished my PhD. I wasn’t convinced at first, but it turned out he was right – I love working in bioinformatics. In fact, my team now has a collaborative project with FlyBase: Virtual Fly Brain.

I started working for the Nucleotide Archive [ENA] in 2000, after my second postdoc. I was really struck by the scale of it. I wasn’t just thinking about what I was doing, I was thinking about a huge database filled with very complicated datasets – and I was surprised by how different things looked from the ‘other side’.

You are now in a position of leadership at EMBL-EBI. How did that happen?

I moved into Alvis Brazma’s group when he started ArrayExpress, the archive for microarray data. Alvis thought I would be a good fit for what he wanted to do, which was really about standardising data. This was a new idea at the time, and when I started asking questions I eventually realised that no one knew the answers, particularly around ontologies. So, I thought, if no one knows the answer, this would be a good thing for me to work on.

Alvis really supported me with this. Eventually, I had a small group of people acquiring the data for ArrayExpress and making it useful. I was promoted to Team Leader and, over time, my group changed to focus more on ontologies, then samples and phenotype data.

You’re now able to shape projects and move them forward. What was the most important factor in getting here?

My timing was really good. When I got involved with microarrays, people were only just becoming active in the area. That was a really exciting time. I was lucky and Alvis was generous with his time and mentorship.

Do you find you’re able to do the same for people in your team now?

The challenge is creating opportunities for people. I had a baby when I was a Team Leader, and at that point, I had already achieved a lot. Not everybody is in that position. I am active in promoting measures to make sure people have access to a fair-minded workplace.

As a leader you always have to ask the question, why are we doing this? And you have to be confident that you are being even-handed. I recall that at one of our Sex in Science events, Paul Walton, whose Chemistry department at the University of York holds the prestigious Athena SWAN gold accreditation, said, “It doesn’t matter who is doing this, they have to be fair-minded and ask questions. If you don’t ask, you don’t get.”

What is your team working on now?

We’re involved in so many projects – it’s difficult to choose just one. The Human Cell Atlas, Open Targets, the International Mouse Phenotyping Consortium, HipSci, the Ontology Mapping Service – there are many others.

I’m very pleased with our work in the BioSamples database, which started in Alvis’s group. We basically wanted to simplify access to information about samples. People will use the same set of biological samples to generate many different types of data, which means that their study results go into separate databases.

For example, outputs from the HipSci project, which focuses on induced-pluripotent stem cells, are split between a controlled-access database, gene and protein sequence databases, and a gene-expression database. My team enables people to compile and explore these datasets by identifying the samples connecting them.

We have close to five million samples now, which I’m honestly very proud of. Some samples only have one set of data, but others have been sequenced many times over and so have a lot of datasets associated with them, in many different studies.

What are the big challenges in delivering molecular data?

If you can’t find a dataset, you can’t reuse it. So we’re always working on ways to put the principles of FAIR into practice – making sure the data we share is findable, accessible, interoperable and reusable.

One of our biggest challenges is getting good metadata – the descriptions – to make interoperability work. For example, each of the molecular archives at EMBL-EBI was built around datasets that were generated by a single technology – but biology doesn’t work like that anymore. It takes a lot of work to make these databases interoperable, so we share the approaches and tools we develop so that other people can use them.

What do you enjoy most about working at EMBL-EBI?

The fact that it is constantly challenging and I am never bored. And it’s diverse and international. I work on lots of projects focussing on different species. I love that I get to learn something new about biology every day. Sometimes there is a paradigm shift that moves on the data, the knowledge and the infrastructure, all together – the Human Cell Atlas is our latest challenge.

What advice would you give someone who was applying for a job here?

I wish I’d understood the diversity and complexity better, the range of possibilities. I wish that I’d known I could move between the teams to build a career. I would like to see more people advancing by collaborating with different teams. This approach helped me see the bigger picture, and made the path easier because I now understand why people do things the way they do.

Bioinformatics is a great career choice if you’re interested in lots of different things, because the skills you have can be applied in many different ways. Planning your career so you can access several areas of biology is a great thing.
