Prepare your gene-based or copy number variant (CNV) summary statistics for submission to the GWAS Catalog
This tool checks that your summary statistics files contain valid data and automatically prepares them for submission.
If you need help with your submission, please email gwas-subs@ebi.ac.uk.
All validation runs locally in your browser. Your data is not uploaded to a server.
If you need to validate more than a few summary statistics files, the command line interface is a better option.
Thank you for submitting your data to the GWAS Catalog.
Different types of genetic variants have different data requirements. For example, structural variants like CNVs require start and end positions (a range), while SNPs only require a single position.
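The difference between a range and a single position can be sketched as a small check. This is only an illustration: the Position class, its field names, and the position_errors function are hypothetical and are not the names the validator itself uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Position:
    """Hypothetical record layout used only for this illustration."""
    chromosome: str
    start: int
    end: Optional[int] = None  # only ranges (e.g. CNVs) carry an end


def position_errors(variant_type: str, pos: Position) -> List[str]:
    """Return a list of problems with the position fields for one record."""
    errors = []
    if variant_type == "CNV":
        # Structural variants span a range, so both ends are needed.
        if pos.end is None:
            errors.append("CNV needs both a start and an end position")
        elif pos.end < pos.start:
            errors.append("end position is before start position")
    elif variant_type == "SNP":
        # A SNP sits at a single position, so a differing end makes no sense.
        if pos.end is not None and pos.end != pos.start:
            errors.append("SNP should have a single position")
    else:
        errors.append(f"variant type not covered by this sketch: {variant_type}")
    return errors
```

For example, position_errors("CNV", Position("1", 1000, 5000)) returns an empty list, while the same call without an end position reports one problem.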
SNP submissions are not supported by this tool yet.
Please see gwas-sumstats-tools for SNP submissions.
Only SNP, CNV, and gene-based genome-wide studies are currently validated by the GWAS Catalog.
To enquire about submitting a non-standard data type or pilot study, please contact gwas-subs@ebi.ac.uk.
What reference genome were your variants called against?
GWAS Catalog submissions are expected to be full genome-wide datasets, not just top hits. Files with too few rows may be reviewed or rejected during submission if they don't represent a complete genome-wide analysis.
Your file may not meet the minimum row count requirement.
You can still continue and validate your file, but submissions without genome-wide coverage may be reviewed or rejected. If you believe the row count is correct for your study design, please contact gwas-subs@ebi.ac.uk before submitting.
It's OK if quality control steps have reduced the number of rows in your final dataset below the minimum row count. Please make sure you are submitting the full set of variants that were analysed in your study, including those that did not reach genome-wide significance.
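If you want a rough idea of how a row count check behaves before you load a file, the sketch below counts data rows and compares them with a threshold. The function names are hypothetical, and the threshold is passed in by the caller because the actual minimum is not stated here.

```python
from typing import Iterable, Optional


def count_data_rows(lines: Iterable[str]) -> int:
    """Count non-blank lines, excluding the first one (assumed to be the header)."""
    rows = sum(1 for line in lines if line.strip())
    return max(rows - 1, 0)


def row_count_warning(n_rows: int, minimum_rows: int) -> Optional[str]:
    """Return a warning message if the file looks too short, otherwise None.

    minimum_rows is a placeholder supplied by the caller, not an official value.
    """
    if n_rows >= minimum_rows:
        return None
    return (
        f"File has {n_rows} data rows, fewer than the expected {minimum_rows}; "
        "check that it contains the full genome-wide results, not just top hits."
    )
```

A warning from a check like this is only a prompt to look again; as noted above, a file shortened by quality control can still be a complete submission.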
Different types of GWAS data are represented in different ways. Telling us what's in your file helps us validate it correctly.
This checklist describes the fields (column names) that should be present in your summary statistics table.
If you're not sure what to put in each field, please download the example file from here and adapt it to your data.
Column names look like this. Column names must appear exactly as shown below. Any differences or typos in your files will cause validation errors.
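Because column names must match exactly, differences in capitalisation or stray spaces are a common cause of errors. The sketch below shows one way to spot them in a header line. The function name is hypothetical, and the column names in the usage note are placeholders rather than the official checklist.

```python
from typing import Dict, Iterable, Optional


def find_column_problems(
    header_line: str, required: Iterable[str], delimiter: str = "\t"
) -> Dict[str, Optional[str]]:
    """Map each required column that is not matched exactly to a near miss.

    The value is the column in the file that differs only by case or
    surrounding whitespace, or None if nothing similar was found.
    """
    found = header_line.rstrip("\r\n").split(delimiter)
    # Index the file's columns by a normalised form to detect near misses.
    normalised = {name.strip().lower(): name for name in found}
    problems = {}
    for name in required:
        if name not in found:
            problems[name] = normalised.get(name.lower())
    return problems
```

With the placeholder list ["chromosome", "p_value", "beta"] and the header "Chromosome\tp_value \n", this reports that chromosome and p_value have near misses ("Chromosome" and "p_value " with a trailing space) and that beta is absent.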
Thresholding or rounding p values limits the downstream usability of the data.
If your data includes p values thresholded or rounded to zero, please provide negative log₁₀ p values instead.
If this isn't possible, you will be asked to include more details about the GWAS software that you used in the study metadata when you submit.
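Very small p values are often lost when they are read as ordinary floating point numbers, which is one way they end up as zero. The sketch below, which assumes your software writes p values as text such as "1e-400", converts them to negative log₁₀ values with Python's decimal module so that they do not underflow. The function name is hypothetical. A p value that has already been written out as 0 cannot be recovered this way.

```python
from decimal import Decimal


def neg_log10_p(p_text: str) -> float:
    """Convert a p value given as text to its negative log10 value.

    Decimal keeps tiny values such as 1e-400, which a float would turn into 0.0.
    """
    p = Decimal(p_text)
    if p <= 0 or p > 1:
        raise ValueError(f"p value must be greater than 0 and at most 1: {p_text}")
    return float(-p.log10())
```

For comparison, float("1e-400") evaluates to 0.0, whereas neg_log10_p("1e-400") keeps the information as 400.0.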
Accepted formats: .tsv, .csv, .txt (optionally gzip compressed: .tsv.gz, etc.)
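A quick way to check a file name against the formats above, and to open it whether or not it is compressed, is sketched below. The function names are hypothetical, and treating extensions as case-insensitive and files as UTF-8 text are assumptions rather than documented behaviour of this tool.

```python
import gzip

ACCEPTED_SUFFIXES = (".tsv", ".csv", ".txt")


def is_accepted_filename(filename: str) -> bool:
    """True for .tsv, .csv or .txt names, with an optional .gz on the end."""
    name = filename.lower()  # assumption: extensions are compared case-insensitively
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    return name.endswith(ACCEPTED_SUFFIXES)


def open_table(path: str):
    """Open a plain or gzip-compressed table as text (UTF-8 is assumed)."""
    if path.lower().endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

Under these assumptions, is_accepted_filename("study.tsv.gz") is True and is_accepted_filename("study.xlsx") is False.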