Robert Petryszak

Team Leader, Gene Expression

Robert Petryszak leads the Gene Expression team, which deals with the acquisition, curation, quality control, statistical analysis and visualisation of functional genomics data, as well as user support and training for ArrayExpress and the Expression Atlas. Previously, in the InterPro team, Robert led the development of a pipeline applying InterProScan software to all known protein sequences. Before joining EMBL-EBI in 2003, he enjoyed ten years of software development and technical management experience in a number of commercial and R&D technology organisations spanning expert systems, telecommunications, printing software and investment research. Robert received his MPhil in Computer Speech and Language Processing from the University of Cambridge in 1995.

ORCID iD: 0000-0001-6333-2182

Tel: +44 (0)1223 492 696 / Fax: +44 (0)1223 494 468

Petryszak team

Part of EMBL-EBI's mission is to archive experimental data, supporting publications in journals and facilitating research reproducibility. The Gene Expression team, led by Robert Petryszak, comprises biologists, bioinformaticians and software engineers. It is responsible for the acquisition, curation and quality control of data submitted to the ArrayExpress archive of functional genomics experiments, and for the associated user support.

EMBL-EBI is committed to making archived data accessible to life scientists through resources such as the Expression Atlas, a value-added database of information on gene and protein expression patterns under different biological conditions. The Gene Expression team is responsible for acquiring high-quality microarray, RNA-seq and proteomics data from EMBL-EBI archives such as ArrayExpress and PRIDE, and from external sources. Our team provides in-depth manual curation, annotation to ontologies, quality control and re-analysis using in-house, standardised statistical pipelines. This allows us to provide accessible data visualisations in the Expression Atlas.

The Gene Expression team is also responsible for training and outreach for ArrayExpress and the Expression Atlas. We work closely with EMBL-EBI's Bioinformatics training programme to develop online training tools and to deliver courses, workshops and conference presentations.

Expression Atlas update--an integrated database of gene and protein expression in humans, animals and plants.
Petryszak R, Keays M, Tang YA, Fonseca NA, Barrera E, Burdett T, Füllgrabe A, Fuentes AM, Jupp S, Koskinen S, Mannion O, Huerta L, Megy K, Snow C, Williams E, Barzine M, Hastings E, Weisser H, Wright J, Jaiswal P, Huber W, Choudhary J, Parkinson HE, Brazma A.
Nucleic acids research Volume 44 (2016) p.D746-52

Gramene 2016: comparative plant genomics and pathway resources.
Tello-Ruiz MK, Stein J, Wei S, Preece J, Olson A, Naithani S, Amarasinghe V, Dharmawardhana P, Jiao Y, Mulvaney J, Kumari S, Chougule K, Elser J, Wang B, Thomason J, Bolser DM, Kerhornou A, Walts B, Fonseca NA, Huerta L, Keays M, Tang YA, Parkinson H, Fabregat A, McKay S, Weiser J, D'Eustachio P, Stein L, Petryszak R, Kersey PJ, Jaiswal P, Ware D.
Nucleic acids research Volume 44 (2016) p.D1133-40


Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction.
Frankish A, Uszczynska B, Ritchie GR, Gonzalez JM, Pervouchine D, Petryszak R, Mudge JM, Fonseca N, Brazma A, Guigo R, Harrow J.
BMC genomics Volume 16 Suppl 8 (2015) p.S2

ArrayExpress update--simplifying data submissions.
Kolesnikov N, Hastings E, Keays M, Melnichuk O, Tang YA, Williams E, Dylag M, Kurbatova N, Brandizi M, Burdett T, Megy K, Pilicheva E, Rustici G, Tikhonov A, Parkinson H, Petryszak R, Sarkans U, Brazma A.
Nucleic acids research Volume 43 (2015) p.D1113-6


Expression Atlas update--a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments.
Petryszak R, Burdett T, Fiorelli B, Fonseca NA, Gonzalez-Porta M, Hastings E, Huber W, Jupp S, Keays M, Kryvych N, McMurry J, Marioni JC, Malone J, Megy K, Rustici G, Tang AY, Taubert J, Williams E, Mannion O, Parkinson HE, Brazma A.
Nucleic acids research Volume 42 (2014) p.D926-32


UniChem: a unified chemical structure cross-referencing and identifier tracking system.
Chambers J, Davies M, Gaulton A, Hersey A, Velankar S, Petryszak R, Hastings J, Bellis L, McGlinchey S, Overington JP.
Journal of cheminformatics Volume 5 (2013) p.3


Gene Expression Atlas update--a value-added database of microarray and sequencing-based functional genomics experiments.
Kapushesky M, Adamusiak T, Burdett T, Culhane A, Farne A, Filippov A, Holloway E, Klebanov A, Kryvych N, Kurbatova N, Kurnosov P, Malone J, Melnichuk O, Petryszak R, Pultsin N, Rustici G, Tikhonov A, Travillian RS, Williams E, Zorin A, Parkinson H, Brazma A.
Nucleic acids research Volume 40 (2012) p.D1077-81


Building a biological space based on protein sequence similarities and biological ontologies.
Kersey P, Lonsdale D, Mulder NJ, Petryszak R, Apweiler R.
Combinatorial chemistry & high throughput screening Volume 11 (2008) p.653-660


New developments in the InterPro database.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.
Nucleic acids research Volume 35 (2007) p.D224-8


Integr8 and Genome Reviews: integrated views of complete genomes and proteomes.
Kersey P, Bower L, Morris L, Horne A, Petryszak R, Kanz C, Kanapin A, Das U, Michoud K, Phan I, Gattiker A, Kulikova T, Faruque N, Duggan K, Mclaren P, Reimholz B, Duret L, Penel S, Reuter I, Apweiler R.
Nucleic acids research Volume 33 (2005) p.D297-302

The predictive power of the CluSTr database.
Petryszak R, Kretschmann E, Wieser D, Apweiler R.
Bioinformatics (Oxford, England) Volume 21 (2005) p.3604-3609


Sequence clustering as a method of protein functional annotation
Petryszak R, Kersey P.
Volume (0) p.11-23

Team members

Elisabet Barrera Casanova
Wojciech Bazant
Nuno Fonseca
Anja Füllgrabe
Laura Huerta Martinez
Maria Keays
Alfonso Munoz-Pomer Fuentes
Irene Papatheodorou
Amy Tang