Compara exercises


Exercise 1 – Orthologues, paralogues and gene trees for the human BRAF gene

(a) How many orthologues are predicted for this gene in primates? How much sequence identity does the Tarsier (Carlito syrichta) protein have to the human one? Click on the View Sequence Alignments link next to the Ensembl identifier to view a protein alignment in Clustal format.
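To check an identity figure by hand once you have the pairwise alignment, you can count matching columns yourself. The sketch below is a minimal illustration in Python, not the browser's own calculation: it assumes identity is measured over columns where neither sequence has a gap, which may differ from the denominator Ensembl uses, and the two sequences are made-up toy fragments rather than real BRAF protein sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (hypothetical, not real BRAF sequence).
toy_human = "MAALSGG-GG"
toy_other = "MAALSAGAGG"
identity = percent_identity(toy_human, toy_other)
print(f"{identity:.1f}% identity over ungapped columns")
```

Changing the denominator (for example, to the full length of one of the proteins) will change the percentage, so compare like with like when checking against the values in the orthologue table.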

(b) Go to the orthologue in marmoset. Is there a genomic alignment between marmoset and human? Is there a gene for both species in this region?

Exercise 2 – Zebrafish orthologues

Go to Ensembl and find the sardh gene in the zebrafish genome.

(a) Go to the Location tab for this gene. View the Alignments (image) and Alignments (text) for the 11 fish. Which fish genomes are represented in the alignment? Do all the fish show a gene in these alignments?

(b) Export the alignments (as Clustal).
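Once exported, a Clustal file can be read with a few lines of code, for example to list which sequences it contains. The following is a minimal sketch of a Clustal reader, assuming the usual layout (a header line starting with "CLUSTAL", blocks of "name sequence" lines, and a consensus line that begins with whitespace). The sequence names and bases in the toy input are hypothetical and do not come from a real export.

```python
def parse_clustal(text: str) -> dict:
    """Return {sequence name: full aligned sequence} from Clustal-format text."""
    sequences = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank line between blocks
        if line.startswith("CLUSTAL"):
            continue  # header line
        if line[0].isspace():
            continue  # consensus line (*, :, .) starts with whitespace
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name sequence" line
        name, chunk = parts[0], parts[1]
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


# Toy Clustal text with hypothetical names and bases.
toy_clustal = "\n".join([
    "CLUSTAL W (1.81) multiple sequence alignment",
    "",
    "species_a/1-12      ACGTAC",
    "species_b/1-12      ACGTTC",
    "                    **** *",
    "",
    "species_a/1-12      GGTACA",
    "species_b/1-12      GG-ACA",
    "                    ** ***",
])
alignment = parse_clustal(toy_clustal)
for name, seq in alignment.items():
    print(name, seq)
```

For real work, an established parser such as the one in Biopython is a safer choice than a hand-written reader.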

(c) Click on the Region in detail link at the left and turn on the tracks for multiple alignments, constrained elements and conservation score for 11 fish EPO_LOW_COVERAGE by configuring the page.
What is the difference between the 11 fish EPO_LOW_COVERAGE track and the 11 way GERP elements and scores tracks? Which regions of the gene do most of the constrained element blocks match up to?
Can you find more information on how the constrained elements track was generated?

Exercise 3 – Synteny

Go to Ensembl and find the Rhodopsin (RHO) gene for human. Go to the Location tab.

(a) Click Synteny at left. Are there any syntenic regions in dog? If so, which chromosomes are shown in this view?

(b) Stay in the Synteny view. Is there a homologue in dog for human RHO? Are there more genes in this syntenic block with homologues?

Exercise 4 – Whole genome alignments

(a) Find the BRCA2 (Breast cancer type 2 susceptibility protein) gene for human and go to the Region in detail page.

(b) Turn on the 32 amniota vertebrates Mercator-Pecan track. Does the degree of conservation between human and the various other species reflect their evolutionary relationship? Which parts of the BRCA2 gene seem to be the most conserved? Did you expect this?

(c) Have a look at the Conservation score and Constrained elements tracks for the set of 70 mammals and the set of 32 amniota vertebrates. Do these tracks confirm what you already saw in the tracks with pairwise alignment data?

(d) Retrieve the genomic alignment for a constrained element. Highlight the bases that match in >50% of the species in the alignment.
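The ">50%" highlighting can be reproduced outside the browser to see what it means column by column. This is a minimal sketch, not the browser's implementation: it assumes the percentage is taken over all species in the alignment (gapped rows count in the denominator but can never match), and the four rows are made-up toy sequences.

```python
from collections import Counter


def mark_majority_columns(rows, threshold=0.5):
    """Return a string with '*' under columns where one base occurs in
    more than `threshold` of all rows, and ' ' elsewhere.

    Assumption: the denominator is the total number of rows, gaps included.
    """
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same length")
    marks = []
    for column in zip(*rows):
        bases = [base.upper() for base in column if base != "-"]
        if not bases:
            marks.append(" ")  # column made up entirely of gaps
            continue
        _, count = Counter(bases).most_common(1)[0]
        marks.append("*" if count / len(column) > threshold else " ")
    return "".join(marks)


# Toy alignment rows (hypothetical, not a real constrained element).
toy_rows = ["ACGT-A", "ACGTTA", "ATGA-C", "GCCATT"]
marks = mark_majority_columns(toy_rows)
for row in toy_rows:
    print(row)
print(marks)
```

Note that a base present in exactly half of the species does not pass a strict ">50%" test, which is why only the first three columns are marked in the toy example.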

Extra Exercise 5 – Pan-taxonomic Compara

Find the NMA2179 gene in the genome of the bacterial strain Neisseria meningitidis Z2491 (TaxID 122587).

(a) What is the function of this gene?

(b) How many orthologues are predicted for this gene in bacteria? What is the maximum identity between a bacterial orthologue protein and the Neisseria meningitidis Z2491 protein?

(c) How many vertebrate species have predicted orthologues for NMA2179? How many orthologues are predicted in human and what is their type of relationship with NMA2179?

(d) Export the Gene Tree in Newick format.
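A Newick file is plain text, so a quick check of the export (for example, counting the leaves) needs very little code. The sketch below pulls out leaf names with a regular expression. It assumes plain Newick without quoted labels or bracketed comments, and the tree string is a made-up toy example, not the NMA2179 gene tree.

```python
import re


def newick_leaf_names(newick: str) -> list:
    """Return leaf labels from a plain Newick string.

    A leaf label directly follows '(' or ','; internal node labels follow ')'
    and are therefore not matched. Quoted labels and bracketed comments are
    not handled (assumption: the input does not contain them).
    """
    pattern = r"[(,]\s*([^\s(),:;][^(),:;]*)"
    return [name.strip() for name in re.findall(pattern, newick)]


# Toy tree with hypothetical gene names and branch lengths.
toy_tree = "((geneA_sp1:0.1,geneB_sp2:0.2)node1:0.05, geneC_sp3:0.3);"
leaves = newick_leaf_names(toy_tree)
print(len(leaves), "leaves:", ", ".join(leaves))
```

For anything beyond a quick count, a dedicated tree library is a better fit, since Newick has several dialects that a regular expression will not cover.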