Database of Genomic Variants archive

The Database of Genomic Variants archive (DGVa) is a repository that archives, accessions and distributes publicly available genomic structural variants for all species.

 


The DGVa team accepts direct submissions from researchers and also curates data from the published literature. Objects accessioned by DGVa carry the prefix 'e'. As part of a regular exchange, DGVa sends its data to its partner archive, dbVar (hosted by the National Center for Biotechnology Information). dbVar in turn sends DGVa the data that it has archived and accessioned itself; those objects carry the prefix 'n'.

 

Accessioned objects

Stable identifiers (accession numbers) are provided for the study, the region in which the variation occurs, and the individual sample call. The archive also collects and stores important information relating to those objects.
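The prefix convention can be sketched as a small parser. This is only an illustration: it uses the 'e' and 'n' archive prefixes and the 'estd', 'esv' and 'essv' object prefixes given on this page, and it assumes that dbVar accessions follow the same pattern ('nstd', 'nsv', 'nssv') and that the remainder of an accession is purely numeric. The function name and the example accessions are made up for the sketch.

```python
import re
from typing import NamedTuple

# Archive of origin, keyed by the first letter of the accession.
ARCHIVES = {"e": "DGVa", "n": "dbVar"}

# Object type, keyed by the letters that follow the archive prefix.
OBJECT_TYPES = {"std": "study", "sv": "variant region", "ssv": "sample call"}

# Assumption: an accession is prefix + object code + digits, nothing else.
_ACCESSION = re.compile(r"^([en])(std|ssv|sv)(\d+)$")


class Accession(NamedTuple):
    archive: str
    object_type: str
    number: int


def parse_accession(text: str) -> Accession:
    """Split an accession such as 'essv42' into archive, object type and number."""
    match = _ACCESSION.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised accession: {text!r}")
    prefix, kind, digits = match.groups()
    return Accession(ARCHIVES[prefix], OBJECT_TYPES[kind], int(digits))


# Hypothetical accessions, used only to exercise the parser.
example_call = parse_accession("essv42")
example_study = parse_accession("nstd7")
```

Here `example_call.archive` is "DGVa" and `example_call.object_type` is "sample call", while `example_study` is attributed to dbVar.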

 

    Figure: Accessioned objects

The study (accession number prefixed with 'estd') is the container for all accessioned objects relating to that body of work. A variant region (accession number prefixed with 'esv') is based on the evidence (assertion) of one or more overlapping calls (accession number prefixed with 'essv') in individual samples (or groups of samples). Associated information for the call includes the experiment (the procedure that identified the presence of variation), the type (inversion, deletion, etc.), the placement (where it is located in the genome) and information about the sample in which it was observed. The assertion method is important for the region, as it explains how the researcher has asserted that this is a structurally variant region of the genome, e.g. all the calls overlap by at least 80%. As illustrated in the diagram above, three overlapping sample-level calls are used to assert the presence of a variant region. In some circumstances, a region can be based on the evidence of two or more overlapping regions.
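An assertion method such as "all the calls overlap by at least 80%" could be checked along the following lines. This is a sketch under stated assumptions, not the method used by any particular study: it interprets "overlap" as pairwise reciprocal overlap on the same chromosome, uses 1-based inclusive coordinates, and takes the region to be the outer span of its supporting calls. The `Call` type, the function names and the coordinates are all hypothetical.

```python
from itertools import combinations
from typing import NamedTuple, Sequence, Tuple


class Call(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Shared length as a fraction of the longer call (0.0 if the calls are disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    # The smaller of the two fractions is the one measured against the longer call.
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def supports_region(calls: Sequence[Call], threshold: float = 0.8) -> bool:
    """True if every pair of calls meets the reciprocal overlap threshold."""
    if not calls:
        return False
    return all(reciprocal_overlap(a, b) >= threshold for a, b in combinations(calls, 2))


def region_bounds(calls: Sequence[Call]) -> Tuple[str, int, int]:
    """Outer span of the supporting calls (assumed region placement)."""
    return calls[0].chrom, min(c.start for c in calls), max(c.end for c in calls)


# Three made-up sample-level calls, mirroring the three-call case in the diagram.
calls = [Call("1", 100, 199), Call("1", 110, 205), Call("1", 105, 200)]
asserted = supports_region(calls)
bounds = region_bounds(calls)
```

A single call trivially satisfies the check, which matches the statement that a region may rest on one or more calls. Other studies may define their assertion method quite differently, which is why the archive records it alongside the region.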

 

DGVa data can be accessed in a number of ways:


GVF files 

All accessioned objects for each study are packaged into a GVF file, available for download on our data download page. The list of studies available is usually updated on the last Thursday of the month.
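Once downloaded, a GVF file can be read with a few lines of code, because GVF follows the GFF3 layout: pragma and comment lines start with '#', and each feature line has nine tab-separated columns ending in a list of key=value attributes separated by semicolons. The reader below is a minimal sketch. The sample lines are invented for illustration and are not taken from a real DGVa file, and the attribute keys that DGVa actually writes may differ from those shown.

```python
from typing import Dict, Iterable, Iterator, NamedTuple
from urllib.parse import unquote


class GvfRecord(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    attributes: Dict[str, str]


def parse_gvf(lines: Iterable[str]) -> Iterator[GvfRecord]:
    """Yield one record per feature line, skipping pragmas, comments and blanks."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
        attributes: Dict[str, str] = {}
        for field in cols[8].split(";"):
            if "=" in field:
                key, value = field.split("=", 1)
                # GFF3-style attribute values may be percent-encoded.
                attributes[key.strip()] = unquote(value.strip())
        yield GvfRecord(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]), attributes)


# Invented sample content (hypothetical accessions and coordinates).
SAMPLE = [
    "##gvf-version 1.06",
    "1\tDGVa\tcopy_number_variation_region\t1000\t2000\t.\t.\t.\tID=1;Name=esv1",
    "1\tDGVa\tcopy_number_loss\t1050\t1980\t.\t.\t.\tID=2;Name=essv1;Parent=esv1",
]

records = list(parse_gvf(SAMPLE))
names = [record.attributes["Name"] for record in records]
```

To read a downloaded file instead of the sample list, pass an open file handle to `parse_gvf`, for example `with open(path) as handle: records = list(parse_gvf(handle))`, where `path` is wherever the file was saved.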

Ensembl

View structural variants in their genomic context with Ensembl's genome browser.

DGV

DGVa is the primary supplier of data to the Database of Genomic Variants (DGV), hosted by The Centre for Applied Genomics in Toronto, Canada. DGV provides extra curation and interpretation services and maintains the most comprehensive resource for genomic structural variation in humans.

BioMart

Download specific datasets, e.g. all structural variants residing within a specified genomic interval, from BioMart.

 

For more information about DGVa, please contact the DGVa helpdesk.