Functional genomics research

Figure: A combined human and mouse gene expression data matrix (principal components 1 and 3). Each dot represents a sample, which is labelled by (a) species and (b) tissue type.

The Brazma research group complements the Functional Genomics service team, and focuses on developing new methods and algorithms and integrating new types of data across multiple platforms. We are particularly interested in cancer genomics and transcript isoform usage. We collaborate closely with the Marioni group and others throughout EMBL.

As a part of our participation in the GEUVADIS project (funded by the European Commission's Seventh Framework Programme), we analysed mRNA and small RNA from lymphoblastoid cell lines of 465 individuals who participated in the 1000 Genomes Project. Our group led the analysis of transcript isoform use and fusion gene discovery. By integrating RNA and DNA sequencing data, we were able to link gene expression and genetic variation, and to characterise mRNA and miRNA variation in several human populations. All of the data generated in the project are available through ArrayExpress.

The human transcriptome contains in excess of 100,000 different transcripts. We analysed transcript composition in 16 human tissues and five cell lines to show that, in a given condition, most protein-coding genes have one major transcript expressed at a significantly higher level than the others, and that in human tissues the major transcripts contribute almost 85% of the total mRNA. We also found that the same major transcript is often expressed in many tissues. These observations can help to prioritise candidate targets in proteomics research and to predict the functional impact of the changes detected in variation studies.
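As a rough illustration of the idea of a "major transcript", the sketch below picks the most abundant transcript for each gene and computes what fraction of the total measured mRNA those major transcripts account for. It is a minimal sketch under stated assumptions, not the method used in the study: the function names, gene and transcript identifiers, and abundance values are all hypothetical, and it applies no statistical test for whether the major transcript is significantly dominant.

```python
from collections import defaultdict


def major_transcripts(expression):
    """Return {gene: (major_transcript, share_of_gene_expression)}.

    `expression` maps (gene, transcript) pairs to an abundance estimate
    (for example TPM or FPKM) in a single condition. Hypothetical helper.
    """
    by_gene = defaultdict(dict)
    for (gene, transcript), value in expression.items():
        by_gene[gene][transcript] = value

    result = {}
    for gene, transcripts in by_gene.items():
        total = sum(transcripts.values())
        if total <= 0:
            continue  # gene not expressed in this condition
        major = max(transcripts, key=transcripts.get)
        result[gene] = (major, transcripts[major] / total)
    return result


def major_share_of_total(expression):
    """Fraction of all measured abundance contributed by major transcripts."""
    majors = major_transcripts(expression)
    total = sum(expression.values())
    major_sum = sum(expression[(gene, t)] for gene, (t, _) in majors.items())
    return major_sum / total if total > 0 else 0.0


# Made-up toy data for one condition (identifiers are not real annotations).
toy = {
    ("GENE_A", "A-001"): 80.0,
    ("GENE_A", "A-002"): 15.0,
    ("GENE_A", "A-003"): 5.0,
    ("GENE_B", "B-001"): 30.0,
    ("GENE_B", "B-002"): 20.0,
}

for gene, (transcript, share) in sorted(major_transcripts(toy).items()):
    print(f"{gene}: major transcript {transcript} ({share:.0%} of gene expression)")
print(f"Major transcripts account for {major_share_of_total(toy):.1%} of total abundance")
```

Repeating the calculation per tissue and comparing the selected transcript for each gene across tissues would show how often the same major transcript recurs.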

Future plans

Large-scale data integration and systems biology will remain a focus of our research. We will work to develop methods for RNA-seq data analysis and processing, and apply these to address important biological questions such as the role of alternative splicing and splicing mechanisms. With our collaborators from the International Cancer Genome Consortium, we will seek new insights into cancer genomes and their impact on functional changes in cancer development, including the discovery and analysis of fusion genes and their role in cancer development.

Selected publications

Rung, J. and Brazma, A. (2012) Reuse of public genome-wide gene expression data. Nat Rev Genet (in press)

Fonseca, N.A., et al. (2012) Tools for mapping high-throughput sequencing data. Bioinformatics 28, 3169-3177.

Goncalves, A., et al. (2012) Extensive compensatory cis-trans regulation in the evolution of mouse gene expression. Genome Res 22, 2376-2384.