Exercise solutions

Finding out about Ensembl species - solutions

 

Exercise 1 — Panda

(a) Select Panda from the drop-down species list, or click on View full list of all Ensembl species, then choose Panda from the list.

The assembly is ailMel1 (GCA_000004335.1).

(b) Click on More information and statistics. Statistics are shown in the tables on the left.

The length of the genome is 2,245,312,831 bp.

There are 19,343 coding genes.

 

Exercise 2 — Zebrafish

(a) Click on Zebrafish on the front page of Ensembl to go to the species homepage. News is in the top right.

What's new in Zebrafish release 90:

  • Microarray Probe Mapping Update
  • Add transcript models from new RNAseq to zebrafish core set
  • New zebrafish pri-miRNAs

(b) Under Other assemblies, two previous assemblies are listed, along with the releases in which you can find them.

Assembly Zv9 is available in the archived release 79 and assembly Zv8 is available in the archived release 59.

 

Exercise 3 — Mosquitos

(a) Go to metazoa.ensembl.org. Open the drop-down list or click on View full list of all Ensembl Metazoa species. Type Anopheles into the filter box in the top left.

There are two Anopheles species: Anopheles gambiae and Anopheles darlingi.

(b) Click on Anopheles gambiae, then on More information and statistics.

The genome was revised in April 2014.

 

Exercise 4 — Bacteria

Go to bacteria.ensembl.org and start typing the name Belliella baltica into the search species box. It will autocomplete, allowing you to select Belliella baltica DSM 15883 (TaxID 866536) from the drop-down list. Click on More information and statistics.

Belliella baltica has 3,680 coding genes and 53 non-coding genes.

 

Region in Detail view

Exercise 5 — Exploring a genomic region in human

(a) Go to the Ensembl homepage (ensembl.org).

Select Search: Human and type 13:31937000-32633000 in the text box (alternatively, leave the Search drop-down list as it is and type human 13:31937000-32633000 in the text box).

Click Go.

This genomic region is located on cytogenetic band q13.1. It is made up of eight contigs, indicated by the alternating light and dark blue bars in the Contigs track. Note that KF455761.1 is a tiny contig that splits AL137143.8 in two.
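The location string typed into the search box follows the pattern chromosome:start-end. As a minimal sketch, the Python below splits such a string into its parts and works out the size of the region. The names Region and parse_region are hypothetical helpers written for this illustration and are not part of any Ensembl tool. The sketch assumes coordinates are 1-based and inclusive, so the length is end - start + 1.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    """A genomic location in chromosome:start-end form (hypothetical helper)."""

    chromosome: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Both ends are counted, so add 1 to the difference.
        return self.end - self.start + 1


# Chromosome name, a colon, then start and end separated by a hyphen.
# Thousands separators (commas) are tolerated in the coordinates.
_REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_region(text: str) -> Region:
    """Parse a string such as '13:31937000-32633000' into a Region."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome:start-end location: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Region(match["chrom"], start, end)


region = parse_region("13:31937000-32633000")
print(region.chromosome, region.start, region.end, region.length)
```

For the region used in this exercise the length comes out at 696,001 bp.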

(b) Use your mouse to draw a box around the BRCA2 transcripts. Click on Jump to region in the pop-up menu.

(c) Click Configure this page in the side menu (or on the cogwheel icon in the top left-hand corner of the bottom image).

Type tilepath in the Find a track text box.

Select Tilepath.

Click on the (i) button to find out more.

The tilepath track shows the BAC clones that the assembly was based upon.

Save and close the new configuration by clicking on the tick (or anywhere outside the pop-up window).

No single clone contains the complete BRCA2 gene. The BAC clone RP11-37E23 contains most of the gene, but not its very 3' end (contained in RP11-298P3). This is reflected in the two contigs that make up the entire BRCA2 gene (the Contigs track is on by default). You may find this easier to see if you highlight the 3' exon of BRCA2.

(d) Click Export data in the side menu. Leave the default parameters as they are.

Click Next.

Click on Text.

Note that the sequence has a header that provides information about the genome assembly (GRCh38), the chromosome, the start and end coordinates and the strand. For example:

>13 dna:chromosome chromosome:GRCh38:13:32311910:32405865:1
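The fields in this header can also be pulled apart programmatically. The sketch below is illustrative only: ExportHeader, parse_export_header and rest_sequence_url are hypothetical names, and the field layout (coordinate system, assembly, sequence name, start, end, strand) is inferred from the example header above. The last function builds a URL for the Ensembl REST sequence/region endpoint, assuming the species alias human and the FASTA content type are what you want. Nothing is fetched over the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportHeader:
    """Fields of an exported sequence header (layout inferred from the example)."""

    seq_id: str
    seq_type: str
    coord_system: str
    assembly: str
    name: str
    start: int
    end: int
    strand: int

    @property
    def length(self) -> int:
        # Coordinates are treated as 1-based and inclusive.
        return self.end - self.start + 1


def parse_export_header(line: str) -> ExportHeader:
    """Parse a header line, with or without a leading '>' character."""
    fields = line.strip().lstrip(">").split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 space-separated fields, got {len(fields)}")
    seq_id, seq_type, location = fields
    parts = location.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 colon-separated fields in {location!r}")
    coord_system, assembly, name, start, end, strand = parts
    return ExportHeader(
        seq_id, seq_type, coord_system, assembly, name,
        int(start), int(end), int(strand),
    )


def rest_sequence_url(header: ExportHeader, species: str = "human") -> str:
    """Build a REST URL for the same region (assumed species alias and format)."""
    region = f"{header.name}:{header.start}..{header.end}:{header.strand}"
    return (
        f"https://rest.ensembl.org/sequence/region/{species}/{region}"
        "?content-type=text/x-fasta"
    )


header = parse_export_header(
    "13 dna:chromosome chromosome:GRCh38:13:32311910:32405865:1"
)
print(header.assembly, header.name, header.start, header.end, header.strand)
print(header.length)
print(rest_sequence_url(header))
```

Opening the printed URL in a browser, or passing it to an HTTP client, should return the sequence for the same coordinates, provided the current REST server uses the same assembly as the header.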

(e) Click Configure this page in the side menu.

Click Reset configuration.

Click on the tick.

 

Exercise 6 — Exploring assembly exceptions in human

(a) Go to the Ensembl homepage (ensembl.org).

Select Search: Human and type 21:32630000-32870000 in the text box (alternatively, leave the Search drop-down list as it is and type human 21:32630000-32870000 in the text box).

Click Go.

You will see a region highlighted in red in the middle of the view. Click on the thin dark red bar in any of the three views to see the label CHR_HSCHR21_3_CTG1_1:32769079-32843731. Click on What are assembly exceptions? to open a new window which explains assembly exceptions.
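To check how the assembly exception sits inside the region you searched for, the coordinates can be compared directly. This is a rough sketch: parse_location and is_within are hypothetical helpers, and it assumes the coordinates in the label are on the same scale as the chromosome 21 view, which is what the highlighted display suggests. Only the coordinates are compared; the sequence names differ.

```python
from typing import Tuple

Location = Tuple[str, int, int]


def parse_location(label: str) -> Location:
    """Split 'name:start-end' into (name, start, end).

    The split is made on the last colon, so sequence names that
    contain underscores (such as CHR_HSCHR21_3_CTG1_1) are kept whole.
    """
    name, _, span = label.rpartition(":")
    start_text, _, end_text = span.partition("-")
    if not name or not start_text or not end_text:
        raise ValueError(f"not a name:start-end location: {label!r}")
    return name, int(start_text), int(end_text)


def is_within(inner: Location, outer: Location) -> bool:
    """True if the inner coordinates fall entirely inside the outer ones."""
    _, inner_start, inner_end = inner
    _, outer_start, outer_end = outer
    return outer_start <= inner_start and inner_end <= outer_end


view = parse_location("21:32630000-32870000")
exception = parse_location("CHR_HSCHR21_3_CTG1_1:32769079-32843731")

print(is_within(exception, view))
print(exception[2] - exception[1] + 1)  # length in bp, ends inclusive
```

With these coordinates the labelled span is 74,653 bp long and lies wholly inside the 21:32630000-32870000 view.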

(b) Assembly exceptions are marked in the chromosome view at the top.

There are seven haplotypes on chromosome 21.

(c) Another option in the drop-down is Compare with reference. Click on this.

Scroll down the page to see the comparison between the haplotype and primary assembly. Aligned sequences are highlighted in pink and linked together in green.

The assembly exception CHR_HSCHR21_3_CTG1_1 contains an extra region compared to the primary assembly.