GWAS Catalog

The NHGRI-EBI Catalog of published genome-wide association studies

10th Anniversary of the NHGRI-EBI GWAS Catalog


The GWAS Catalog was launched in 2008 to provide a publicly available, curated resource of all published human genome-wide association studies (GWAS) and their association results. Over 10 years, the Catalog has grown from a single published GWAS on age-related macular degeneration (Klein et al., 2005) into the primary source of disease-related associations with genetic variants (watch the associations increase over time in this video). The Catalog now contains over 3,300 publications and almost 60,000 unique SNP-trait associations, with global reach and over 10,000 visitors per month. Extensive manual curation of GWAS by expert scientists ensures that the Catalog provides an accurate, comprehensive and structured summary of GWAS results. This vast amount of data underpins research into common disease, enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies.



The GWAS landscape has evolved over the last 10 years, with developments in study design and genotyping technologies. GWAS have illustrated the highly polygenic nature of complex traits and the opportunity to address the issue of missing heritability (Manolio et al., 2009). The field has thus moved towards larger sample sizes, greater numbers of genetic variants assayed, and ever more specific traits, endophenotypes and measurements. Consequently, we have seen an increase in the complexity of GWAS: publications now include multiple GWAS, combinatorial trait models, interaction effects, pleiotropy and multi-ancestry analyses. Here at the GWAS Catalog, we continue to adapt to ensure the resource contains the most relevant and up-to-date research, with the greatest utility to the scientific community.

Highlights over the last 10 years include releasing a new GWAS Catalog website in 2015 (www.ebi.ac.uk/gwas); mapping curated trait descriptions to ontology terms to enable enriched, ontology-driven search capabilities; improving the well-known GWAS Catalog diagram; and capturing ancestry more completely. Accurate characterization of ancestry is essential to interpret human genomics data. To facilitate this, we developed a framework to systematically describe detailed ancestry information (see our recently published framework and guidelines for reporting ancestry; Morales et al., 2018). Recently, with support from the scientific community, we started hosting summary statistics for GWAS Catalog studies. Integration of these full p-value datasets with the structured metadata in the Catalog will facilitate community analysis and re-use of these valuable data. In the future, we will make these datasets more accessible via a comprehensive, searchable database, providing a platform for downstream meta-analysis. Please submit your data to us to extend the utility of your results and the potential for follow-on studies.

We are also investigating expanding our scope to include targeted arrays and going beyond array-based genotyping to include sequencing-based genotyping, which enables a deeper interrogation of rare disease-associated variants. These developments bring many challenges, including how best to represent the increasing complexity of study designs, trait architecture and statistical analyses. With the help of GWAS Catalog publication authors and users, we are excited about what the next 10 years of GWAS hold.

References

Klein RJ, Zeiss C, Chew EY, Tsai JY, Sackler RS, Haynes C, Henning AK, SanGiovanni JP, Mane SM, Mayne ST, Bracken MB, Ferris FL, Ott J, Barnstable C, Hoh J. Complement factor H polymorphism in age-related macular degeneration. Science. 2005 Apr 15;308(5720):385-9. Epub 2005 Mar 10. PubMed PMID: 15761122; PubMed Central PMCID: PMC1512523.

Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA, Visscher PM. Finding the missing heritability of complex diseases. Nature. 2009 Oct 8;461(7265):747-53. doi: 10.1038/nature08494. Review. PubMed PMID: 19812666; PubMed Central PMCID: PMC2831613.

Morales J, Welter D, Bowler EH, Cerezo M, Harris LW, McMahon AC, Hall P, Junkins HA, Milano A, Hastings E, Malangone C, Buniello A, Burdett T, Flicek P, Parkinson H, Cunningham F, Hindorff LA, MacArthur JAL. A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol. 2018 Feb 15;19(1):21. doi: 10.1186/s13059-018-1396-2. PubMed PMID: 29448949; PubMed Central PMCID: PMC5815218.