Welcome to the GWAS Catalog RESTful API!

This documentation explains how to integrate GWAS Catalog data services into your applications (scientific pipelines, scripts, web applications, etc.). It is written with developers in mind and includes plenty of examples. If you have any questions, contact our helpdesk at gwas-info@ebi.ac.uk.

The GWAS Catalog API provides the tools you need to search for and identify variants linked to your disease of interest. Whether you're building a simple platform to ascertain disease-specific associations, searching across phenotypes for a specific variant, summarising associations within a population group, or identifying studies with full genome-wide summary statistics available for download, the API gives you the resources to do it.

🤔 What data can I find in the API?

This API provides access to the literature-curated top associations and their metadata (the same data that is available via the GWAS Catalog website; eligibility criteria are described elsewhere).
A second API enabling access to data from the full genome-wide summary statistics collection is under development.

🛠️ Where can I find a technical reference for the API?

You can find a detailed reference manual containing endpoints, schemas, parameters, syntax, definitions, and response codes here:

API Reference

The page includes a visual, interactive interface for exploring and testing the endpoints.

For more documentation about the extensive controlled vocabularies in use in the Catalog, including the API, we refer you to lists available for:

More documentation about population descriptors, including methods of assigning ancestry labels, is available here: https://www.ebi.ac.uk/gwas/population-descriptors

🚀 Help me get started!

Here are some features of the API that are useful to know about before you start:

Pagination

API responses are paginated to manage performance, scalability, and usability. The default page size is 20 records. When a query returns more results than fit on one page, you'll need to follow the next link in each response to retrieve the remaining pages.
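As a minimal sketch of following next links, the Python generator below walks through every page of a result set. It assumes a HAL-style response body in which records sit under an `_embedded` key and the URL of the following page sits at `_links.next.href`; check the API Reference for the actual response schema. The names `iter_records` and `fetch_json` are illustrative and not part of the API.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen


def fetch_json(url: str) -> dict:
    """Download one page and parse it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_records(first_url: str, fetch: Callable[[str], dict] = fetch_json) -> Iterator[dict]:
    """Yield every record across all pages, following next links."""
    url = first_url
    while url:
        page = fetch(url)
        # Assumed layout: records are grouped under "_embedded", keyed by resource name.
        for records in page.get("_embedded", {}).values():
            yield from records
        # Assumed layout: the next page's URL is at _links.next.href and is
        # absent on the last page, which ends the loop.
        url = page.get("_links", {}).get("next", {}).get("href")
```

Passing `fetch` as an argument keeps the paging logic separate from the HTTP call, so the same loop can be reused with a rate-limited or authenticated client.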

Throttling

Users of the API are limited to 15 queries per second to prevent server overload and ensure fair usage. If the limit is exceeded, the API will slow down subsequent calls.
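One way to stay under the limit is to space out requests on the client side. The sketch below is an assumption about how you might do this in Python, not something the API requires: a small `RateLimiter` class (an illustrative name) that enforces a minimum gap of 1/15 of a second between calls.

```python
import time


class RateLimiter:
    """Space out calls so that roughly max_per_second of them start each second."""

    def __init__(self, max_per_second: float = 15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock      # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call `limiter.wait()` immediately before each request. Choosing a value slightly below 15 leaves some headroom for timing jitter.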

Trait search

When querying by efo_trait (ontology trait mapping), you have two options:

  1. To retrieve only data annotated directly with the query term (e.g. querying for “asthma” returns data annotated with “asthma” but not “status asthmaticus”) - set show_child_traits=false.
  2. To retrieve data annotated directly with the query term AND any more specific child traits (e.g. querying for “asthma” returns data annotated with “asthma”, plus “status asthmaticus” and other subtypes) - set show_child_traits=true. This is the default setting on the GWAS Catalog trait pages.

If you prefer to first identify a wider range of relevant traits and then query for a list of them, the best place to start is the efo_trait endpoint. Searching here with a free-text term will return any traits whose names include the term. For example, searching for "COVID-19" could return "COVID-19", "COVID-19 symptoms measurement", "long COVID-19", "response to COVID-19 vaccine", and "time to remission of COVID-19 symptoms". You can then query your desired endpoint using the list of traits.

Note that trait examples are shown here with their efo_trait name for readability, but it’s often better to use the efo_id (e.g. MONDO_0004979) for precision. There’s more information about how we annotate traits in the ontology documentation.
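The trait options above could be combined into query URLs along the following lines. This is a sketch under stated assumptions: the base URL, the `associations` path, and the use of `efo_id` as a query parameter are guesses to be checked against the API Reference, and `trait_query_urls` is an illustrative helper name. Only `show_child_traits` is taken directly from the description above.

```python
from urllib.parse import urlencode

# Assumption: base URL of the API; confirm it in the API Reference.
BASE_URL = "https://www.ebi.ac.uk/gwas/rest/api/v2"


def trait_query_urls(efo_ids, include_children, endpoint="associations"):
    """Build one query URL per trait ID, with child traits switched on or off."""
    urls = []
    for efo_id in efo_ids:
        params = {
            "efo_id": efo_id,  # assumed parameter name
            "show_child_traits": "true" if include_children else "false",
        }
        urls.append(f"{BASE_URL}/{endpoint}?{urlencode(params)}")
    return urls


# Query one trait directly, without its more specific subtypes.
for url in trait_query_urls(["MONDO_0004979"], include_children=False):
    print(url)
```

Setting `show_child_traits` explicitly on every request avoids relying on whatever default the endpoint applies.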

Gene filter

When querying associations or single-nucleotide-polymorphisms by gene, you have two options:

  1. To use the set of genes defined as the gene(s) to which a GWAS Catalog variant maps, or the nearest upstream and downstream genes within 50 kb, according to Ensembl’s annotation - apply the option extended_geneset=false. This is the same annotation that’s shown in the GWAS Catalog web interface.
  2. To use an extended set of genes, defined as all Ensembl and RefSeq genes mapping within 50 kb upstream and downstream of each GWAS Catalog variant - apply the option extended_geneset=true. This is the annotation that was used in the V1 API.

By default, the query uses option 1.
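A gene query might be assembled as in the sketch below. The base URL and the `mapped_gene` parameter name are hypothetical and should be checked against the API Reference; the two resource names and `extended_geneset` come from the description above. `gene_query_url` is an illustrative helper, not part of the API.

```python
from urllib.parse import urlencode

# Assumption: base URL of the API; confirm it in the API Reference.
BASE_URL = "https://www.ebi.ac.uk/gwas/rest/api/v2"
GENE_RESOURCES = ("associations", "single-nucleotide-polymorphisms")


def gene_query_url(resource, gene, extended_geneset=False):
    """Build a gene-filtered query URL; option 1 (extended_geneset=false) is the default."""
    if resource not in GENE_RESOURCES:
        raise ValueError(f"gene filter applies to {GENE_RESOURCES}, got {resource!r}")
    params = {
        "mapped_gene": gene,  # hypothetical parameter name
        "extended_geneset": "true" if extended_geneset else "false",
    }
    return f"{BASE_URL}/{resource}?{urlencode(params)}"
```

For example, `gene_query_url("associations", "BRCA2", extended_geneset=True)` would request the wider Ensembl and RefSeq gene set described in option 2.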

🧑‍💻 Examples of a few key ways you can use the GWAS API

For hands-on, executable examples, please explore our collection of Jupyter Notebooks.

The notebooks work through scientific questions such as "Which variants are associated with a particular disease?", "Which studies of a trait have full summary statistics available?", "Which SNP has the strongest effect size for a particular disease?" and many more. They are a great place to start if you are new to the REST API.