Exercises 1

1. A GenomeDB is used to link the Compara database to each of the Core species databases. Print the name, assembly version and genebuild version for all the GenomeDBs in the Compara database.

Hint: First you will need an adaptor of type "GenomeDB". Then use the fetch_all() method from the GenomeDB adaptor to bring back GenomeDB objects (these will be returned as an array-ref). Then get the name, assembly version and genebuild version from these GenomeDB objects.
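The steps in this hint can be sketched roughly as follows. This is a minimal sketch, not the sample script: it assumes the Ensembl Perl API is installed and that the public server ensembldb.ensembl.org is reachable with the anonymous user, neither of which is stated in the exercise.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: connect to the public Ensembl MySQL server as the anonymous user.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# 'Multi' and 'compara' select the multi-species Compara database.
my $genome_db_adaptor = $registry->get_adaptor('Multi', 'compara', 'GenomeDB');

# fetch_all() returns an array-ref of GenomeDB objects.
my $genome_dbs = $genome_db_adaptor->fetch_all();

foreach my $genome_db (@$genome_dbs) {
    # Some entries may lack an assembly or genebuild, so default to ''.
    print join("\t",
        $genome_db->name,
        $genome_db->assembly  // '',
        $genome_db->genebuild // '',
    ), "\n";
}
```

The same registry set-up can be reused for the remaining exercises; only the adaptor type passed to get_adaptor() changes.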

2. A DnaFrag represents a top-level SeqRegion in the Compara database. Print all the DnaFrags for chimp.

Hint: First you will need an adaptor of type DnaFrag. Then use the fetch_all_by_GenomeDB_region() method from the adaptor to bring back all the DnaFrags associated with a region (the region you want is "chromosome"). As the method name suggests, you will also need the GenomeDB object for chimp to pass to it.
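One possible shape for this, again only a sketch under assumptions: the public server and anonymous user are assumed, the chimp GenomeDB is looked up by its production name "pan_troglodytes" (the exercise does not say how to obtain it), and fetch_all_by_GenomeDB_region() is the method named in the hint, which may not be present in every API release.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl MySQL server, anonymous access.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $genome_db_adaptor = $registry->get_adaptor('Multi', 'compara', 'GenomeDB');
my $dnafrag_adaptor   = $registry->get_adaptor('Multi', 'compara', 'DnaFrag');

# Assumption: chimp is stored under the name "pan_troglodytes";
# with no assembly given, the default assembly is used.
my $chimp = $genome_db_adaptor->fetch_by_name_assembly('pan_troglodytes');

# Restrict the DnaFrags to the "chromosome" coordinate system.
my $dnafrags = $dnafrag_adaptor->fetch_all_by_GenomeDB_region($chimp, 'chromosome');

foreach my $dnafrag (@$dnafrags) {
    print join("\t", $dnafrag->coord_system_name, $dnafrag->name, $dnafrag->length), "\n";
}
```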

3. The MethodLinkSpeciesSet is a central component in the Compara database. It stores information connecting the various analyses (method_link_type) with a set of species (species_set).

a) (ii) Print the total number of MethodLinkSpeciesSet entries stored in the database.

Hint: use the MethodLinkSpeciesSet adaptor fetch_all() method.

(iii) Print a unique list of method_link_types and a count of their number in the database.

Hint: Fetch all the MethodLinkSpeciesSet objects with the adaptor, then call "method->type" on each object to get its method_link_type. To get a unique set of types, use the returned values as keys in a hash, and increment the hash value each time a type is seen; the value then holds the total number of entries of that type.
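The hash-counting idiom itself can be tried without a database connection. In the sketch below, the list @method_types is a hypothetical stand-in for the values that "method->type" would return; the type names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the values returned by $mlss->method->type
# across all MethodLinkSpeciesSet objects.
my @method_types = qw(EPO LASTZ_NET EPO PECAN LASTZ_NET LASTZ_NET);

# Each distinct type becomes one hash key; the value counts its occurrences.
my %count_for_type;
$count_for_type{$_}++ for @method_types;

# The keys form the unique list; sort them for stable output.
foreach my $type (sort keys %count_for_type) {
    print "$type\t$count_for_type{$type}\n";
}
```

In the real exercise, the loop over @method_types would be replaced by a loop over the array-ref returned by fetch_all(), incrementing the hash with each object's type.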

b) Print a list of the species and their internal ids (dbIDs) for the 12 eutherian mammal EPO alignments.

Hint: use the MethodLinkSpeciesSet adaptor fetch_by_method_link_type_species_set_name() method, and then the MethodLinkSpeciesSet (object) "species_set_obj()->genome_dbs" method (this brings back a list-ref of genome_db objects). The method_link_type is 'EPO' and the species_set_name is 'mammals'.
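A rough sketch of these calls is given below. It assumes the public server and anonymous user, and it uses "species_set_obj" exactly as named in the hint; other API releases may expose the species set under a different method name.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: public Ensembl MySQL server, anonymous access.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $mlss_adaptor = $registry->get_adaptor('Multi', 'compara', 'MethodLinkSpeciesSet');

# Method type 'EPO' and species set name 'mammals', as given in the hint.
my $mlss = $mlss_adaptor->fetch_by_method_link_type_species_set_name('EPO', 'mammals');

# genome_dbs returns a list-ref of GenomeDB objects.
foreach my $genome_db (@{ $mlss->species_set_obj->genome_dbs }) {
    print join("\t", $genome_db->name, $genome_db->dbID), "\n";
}
```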



Stephen explains the answers to these questions in this 12-minute video. You can download his sample scripts and outputs:

1. sample script and output

2. sample script and output

3a. sample script and output

3b. sample script and output