BioMart exercise answers



Answers

To get identical answers, follow along using the version 70 archive site. To see the answers you would get with the most recent version of Ensembl, use our main BioMart site instead.

Answer 6

Human

Click New.

Choose the ENSEMBL Genes 70 database.

Choose the Homo sapiens genes (GRCh37) dataset.

Click on Filters in the left panel.

Expand the GENE section by clicking on the + box.

Select ID list limit - RefSeq protein ID(s) and enter the list of IDs in the text box (either comma-separated or as a list).

HINT: You may have to scroll down the menu to see these.

Count shows 11 genes (remember one gene may have multiple splice variants coding for different proteins).

Click on Attributes in the left panel.

Select the Features attributes page.

Expand the External section by clicking on the + box.

Select HGNC symbol and RefSeq Protein ID from the External References section.

Click the Results button on the toolbar.

Select View All rows as HTML or export all results to a file. Tick the box Unique results only.
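The steps above can also be written down as a BioMart XML query instead of a series of clicks. The following is a minimal sketch in Python, not the official way to do it. The internal names used here (hsapiens_gene_ensembl, refseq_peptide, hgnc_symbol) are assumptions that do not appear in the steps above and may differ between releases; the XML button on the BioMart toolbar shows the names your own query uses. The IDs are placeholders for the list given in the exercise.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, unique_rows=True):
    """Return a BioMart XML query string for a single dataset."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        # Equivalent of ticking "Unique results only".
        "uniqueRows": "1" if unique_rows else "0",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        # A list of IDs is sent as one comma-separated value.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Placeholder IDs: substitute the RefSeq protein IDs from the exercise.
refseq_ids = ["REFSEQ_ID_1", "REFSEQ_ID_2"]

xml_query = build_biomart_query(
    dataset="hsapiens_gene_ensembl",               # assumed dataset name
    filters={"refseq_peptide": refseq_ids},        # assumed filter name
    attributes=["hgnc_symbol", "refseq_peptide"],  # assumed attribute names
)
print(xml_query)
```

Filters narrow down which genes are returned, and attributes choose the columns, which mirrors the Filters and Attributes panels in the web interface.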

 

Answer 7

Ciona savignyi

Click New.

Choose the ENSEMBL Genes 70 database.

Choose the Ciona savignyi genes (CSAV2.0) dataset.

Click on Filters in the left panel.

Expand the GENE section by clicking on the + box.

Enter the gene list in the ID List Limit box.

Click on Attributes in the left panel.

Select the Homologs attributes page.

Expand the Orthologs section by clicking on the + box.

Select Human Ensembl Gene ID.

Click Results (remember to tick the unique results only box).
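If you export the results without ticking the unique results only box, the file can contain repeated rows, for example when one gene has several transcripts but only gene-level columns were selected. The sketch below shows one way to remove exact duplicate rows from a tab-separated export afterwards. The column names and IDs are hypothetical placeholders, not real results.

```python
import csv
import io


def unique_rows(tsv_text):
    """Return (header, rows) with exact duplicate rows removed, order kept."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    seen = set()
    rows = []
    for row in reader:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return header, rows


# Hypothetical export with placeholder IDs (not real BioMart output).
example = (
    "Gene ID\tHuman Ensembl Gene ID\n"
    "GENE_A\tHUMAN_1\n"
    "GENE_A\tHUMAN_1\n"
    "GENE_B\tHUMAN_2\n"
)
header, rows = unique_rows(example)
print(len(rows))  # 2
```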

 

Answer 8

Human

(a) Choose Ensembl Variation 70 and Homo sapiens Structural Variation.

Filters: Region: Chromosome 1, Base pair start: 130408, Base pair end: 210597

Count shows 22 out of 3388380 structural variants.

Attributes: Structural Variation (SV) Information: DGVa Study Accession and Source Name

Structural Variation (SV) Location: Chromosome name, Sequence region start (bp) and Sequence region end (bp).

 

(b) Choose Ensembl Variation 70 and Homo sapiens Variation.

Filters: Filter by Variation ID and enter: rs1801500, rs1801368.

Attributes: Variation Name, Variant Alleles, Phenotype description, and Associated gene.

Help

You can view this same information in the Ensembl browser. Click on one of the variation IDs (names) in the result table. The variation tab should open in the Ensembl browser. Click Phenotype Data.

 

Answer 9

Human

(a) Click New.

Choose the ENSEMBL Genes 70 database.

Choose the Homo sapiens genes (GRCh37) dataset.

Click on Filters in the left panel.

Expand the GENE section by clicking on the + box.

Select ID list limit - Affy hg u133 plus 2 probeset ID(s) and enter the list of probeset IDs in the text box (either comma separated or as a list).

Count shows 25 genes match this list of probesets.

Click on Attributes in the left panel.

Select the Features attributes page.

Expand the GENE section by clicking on the + box.

In addition to the default selected attributes, select Description.

Expand the External section by clicking on the + box.

Select HGNC symbol from the External References section and AFFY HG U133-PLUS-2 from the Microarray Attributes section.

Click the Results button on the toolbar.

Select View All rows as HTML or export all results to a file. Tick the box Unique results only.

Your results should show that the 25 probes map to 25 Ensembl genes.

 

(b) Don’t change the Dataset and Filters; simply click on Attributes.

Select the Sequences attributes page.

Expand the SEQUENCES section by clicking on the + box.

Select Flank (Transcript) and enter 2000 in the Upstream flank text box.

Expand the Header information section by clicking on the + box.

Select, in addition to the default selected attributes, Description and Associated Gene Name.

Note: Flank (Transcript) will give the flanks for all transcripts of a gene with multiple transcripts. Flank (Gene) will give the flanks for one possible transcript in a gene (the most 5’ coordinates for upstream flanking).

Click the Results button on the toolbar.
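To make the note on flanks concrete, the sketch below works out which coordinates a 2000 bp upstream flank covers. It assumes 1-based inclusive coordinates and a strand of 1 (forward) or -1 (reverse); the transcript coordinates are hypothetical, and the function illustrates the arithmetic only, not how BioMart itself is implemented.

```python
def upstream_flank(start, end, strand, flank=2000):
    """Return (flank_start, flank_end) for the region 5' of a feature.

    Coordinates are 1-based and inclusive; strand is 1 or -1.
    """
    if strand == 1:
        # The 5' end is `start`, so the flank lies at lower coordinates.
        return max(1, start - flank), start - 1
    if strand == -1:
        # The 5' end is `end`, so the flank lies at higher coordinates.
        return end + 1, end + flank
    raise ValueError("strand must be 1 or -1")


# Two hypothetical transcripts of one forward-strand gene.
transcripts = [(10_000, 12_000), (10_500, 12_000)]

# Flank (Transcript): one flank per transcript.
per_transcript = [upstream_flank(s, e, 1) for s, e in transcripts]
print(per_transcript)  # [(8000, 9999), (8500, 10499)]

# Flank (Gene): a single flank from the most 5' coordinate of the gene.
gene_start = min(s for s, _ in transcripts)
gene_end = max(e for _, e in transcripts)
print(upstream_flank(gene_start, gene_end, 1))  # (8000, 9999)
```

On the reverse strand the flank is not clipped to the chromosome length here, since that length is not known to the function.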

 

(c) You can leave the Dataset and Filters the same, and go directly to the Attributes section:

Click on Attributes in the left panel.

Select the Homologs attributes page.

Expand the GENE section by clicking on the + box.

Select Associated Gene Name.

Deselect Ensembl Transcript ID.

Expand the ORTHOLOGS section by clicking on the + box.

Select Mouse Ensembl Gene ID, Mouse Chromosome Name, Mouse Chr Start (bp) and Mouse Chr End (bp).

Click the Results button on the toolbar.

Check the box Unique results only. Select View All rows as HTML or export all results to a file.

Your results should show that for most of the human genes at least one mouse orthologue has been identified.

 

Answer 10

Schizosaccharomyces pombe

Start at http://fungi.ensembl.org/biomart/martview

Select Ensembl Fungi Genes 15

Dataset: Schizosaccharomyces pombe genes.

Filters: Region: Chromosome III

         Gene: Limit to genes … with PomBase ID

         Gene type: protein_coding

Attributes: Ensembl Gene ID and Ensembl Transcript ID (defaults)

Count should show 920 genes.
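The three filters above can be combined in one BioMart XML query, sketched below in Python. Every internal name in it is an assumption, not something stated above: the dataset name spombe_eg_gene, the filter names chromosome_name, biotype and with_pombase_gene, the virtualSchemaName value, and the martservice address (guessed from the martview URL above). Check them against the XML button on the BioMart toolbar before relying on them. The sketch only builds the request URL and does not send it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

query = ET.Element("Query", {
    "virtualSchemaName": "default",  # assumption; may differ on this mart
    "formatter": "TSV",
    "header": "0",
    "uniqueRows": "1",
    "count": "1",  # ask for a count rather than the result rows
})
dataset = ET.SubElement(
    query, "Dataset", {"name": "spombe_eg_gene", "interface": "default"}  # assumed name
)
# Value filters: region and gene type (assumed filter names).
ET.SubElement(dataset, "Filter", {"name": "chromosome_name", "value": "III"})
ET.SubElement(dataset, "Filter", {"name": "biotype", "value": "protein_coding"})
# "Limit to genes with ..." filters are boolean: they take excluded="0"
# (keep only matching genes) instead of a value. The name is hypothetical.
ET.SubElement(dataset, "Filter", {"name": "with_pombase_gene", "excluded": "0"})
ET.SubElement(dataset, "Attribute", {"name": "ensembl_gene_id"})
ET.SubElement(dataset, "Attribute", {"name": "ensembl_transcript_id"})

xml_query = ET.tostring(query, encoding="unicode")
# Assumed service endpoint, derived from the martview URL in the steps above.
url = "http://fungi.ensembl.org/biomart/martservice?" + urlencode({"query": xml_query})
print(url)
```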