Getting data from the DGVa
DGVa data can be accessed in a number of ways:
Genome Variation Format files
A GVF file is available for each study stored in the DGVa and can be downloaded from the data download page. This type of file is useful if you want all of the data produced by a particular study: it contains accession numbers for all calls and regions published in that study. The list of GVF files available for download is usually updated on the last Thursday of the month.
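As a minimal sketch of how a downloaded GVF file might be read: GVF extends GFF3, so each feature line has nine tab-separated columns, with `key=value` attributes in the last column and pragma or comment lines starting with `#`. The sample lines and accessions below are invented for illustration and are not real DGVa records, and the assumption that the `ID` attribute carries the accession should be checked against the file you actually download.

```python
from collections import Counter


def parse_gvf_line(line):
    """Split one GVF feature line into its nine tab-separated columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, so_type, start, end, score, strand, phase, attrs = cols

    # Column 9 holds semicolon-separated key=value pairs.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()

    return {
        "seqid": seqid,
        "source": source,
        "type": so_type,
        "start": int(start),
        "end": int(end),
        "attributes": attributes,
    }


def read_gvf(lines):
    """Yield parsed features, skipping blank lines, pragmas and comments."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        yield parse_gvf_line(line)


# Invented sample data (hypothetical accessions, not real DGVa records).
SAMPLE = [
    "##gvf-version 1.06\n",
    "17\tDGVa\tcopy_number_gain\t1000\t2000\t.\t.\t.\tID=example_call_1;Name=example_call_1\n",
    "17\tDGVa\tcopy_number_loss\t3000\t4500\t.\t.\t.\tID=example_call_2;Name=example_call_2\n",
    "2\tDGVa\tcopy_number_gain\t500\t900\t.\t.\t.\tID=example_call_3;Name=example_call_3\n",
]

records = list(read_gvf(SAMPLE))
accessions = [r["attributes"]["ID"] for r in records]
type_counts = Counter(r["type"] for r in records)
```

For a real file, the same generator could be fed an open file handle in place of `SAMPLE`, for example `read_gvf(open(path))`, where `path` is wherever you saved the download.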
Ensembl genome browser
Structural variants and regions can be viewed graphically in their genomic context using the Ensembl genome browser. This is useful for exploring variation by genomic location, for example to see whether any structural variation has been reported in the region of human chromosome 17q21. An example of the Ensembl output is shown in Figure 3 (below).
Figure 3 Searching for structural variation on human chromosome 17q21.31. Clicking on any of the structural variation tracks (shown in black) opens an information box with links to further information.
BioMart
BioMart is a freely available, open source database system that provides access to a wide range of data sources. The Ensembl BioMart tool allows you to perform advanced searches on the structural variants currently incorporated into Ensembl and to download specific datasets. To run a query, specify the 'Database' (this should be the latest Ensembl Variation database) and the 'Dataset of interest' (e.g. 'Homo sapiens Structural Variation'), then choose your filters and attributes. Click the 'Count' button to calculate the number of records that will be returned, and the 'Results' button to retrieve them. Figure 4 (below) shows an example query output.
Figure 4 The output from an example query to list all variant calls (SSVs) reported on human chromosome 2. In this figure, the filters selected were 'Chromosome' = 2 (Region) and 'Limit to variants from this source' = DGVa (General Structural Variation Filters). The attributes selected were 'Chromosome Name' (Supporting Structural Variation (SSV) Placement) and 'Supporting Structural Variant Accession' (Supporting Structural Variation (SSV)).
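The query in Figure 4 can also be written down as a BioMart XML query, which is the form BioMart uses when a query is saved or submitted programmatically. The sketch below only builds that XML and does not contact any server. The dataset name and the filter and attribute names are placeholders assumed for illustration: BioMart uses its own internal names, which differ from the labels shown in the web interface, so the real ones should be copied from the XML that the BioMart interface generates for your query.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart XML query string for one dataset."""
    query = ET.Element(
        "Query",
        {
            "virtualSchemaName": "default",
            "formatter": "TSV",   # tab-separated output
            "header": "1",        # include a header row
            "uniqueRows": "1",    # drop duplicate rows
            "datasetConfigVersion": "0.6",
        },
    )
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})

    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# All names below are ASSUMED placeholders mirroring the Figure 4 query;
# replace them with the internal names reported by BioMart itself.
query_xml = build_biomart_query(
    dataset="PLACEHOLDER_structural_variation_dataset",
    filters={
        "PLACEHOLDER_chromosome_filter": 2,
        "PLACEHOLDER_source_filter": "DGVa",
    },
    attributes=[
        "PLACEHOLDER_ssv_chromosome_name",
        "PLACEHOLDER_ssv_accession",
    ],
)
print(query_xml)
```

Building the query with an XML library rather than by string concatenation means filter values are escaped correctly, which matters if a value ever contains characters such as `&` or `"`.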
Database of Genomic Variants (DGV)
The DGVa is the primary supplier of data to the Database of Genomic Variants (DGV), hosted by The Centre for Applied Genomics in Toronto, Canada. The DGV is a comprehensive catalogue of structural variation in healthy human individuals. It provides additional curation and interpretation services, and its variants can be viewed with DGV's custom genomic browser.