CTCF Quantitative Trait Loci

This page organises data from the paper Quantitative Genetics of CTCF Binding Reveal Local Sequence Effects and Different Modes of X-Chromosome Association. It complements the supplementary information in the paper.

Raw Data

The raw data for this project is available from the European Nucleotide Archive (ENA).

Binding Region Phenotypes

This is the phenotype file: a normalised phenotype matrix in which each row is a binding region and each column is a sample.

QTLs

These are the QTLs called at a 1% FDR threshold (q value <= 0.01). Only cluster variants are kept, defined as variants whose P value is within one order of magnitude of the P value of the lead variant for the same binding region.
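The filtering rule above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the paper: the function name select_cluster_variants and the input layout (one tuple per association with a region ID, variant ID, P value and q value) are assumptions made for the example, as is the choice to apply the q value threshold to each association before picking the lead variant.

```python
from collections import defaultdict


def select_cluster_variants(records, q_threshold=0.01, fold=10.0):
    """Keep significant associations close to the lead variant of each region.

    records: iterable of (region_id, variant_id, p_value, q_value) tuples.
    This layout is hypothetical; the published QTL file may differ.
    """
    # Group the associations that pass the FDR threshold by binding region.
    by_region = defaultdict(list)
    for region, variant, p, q in records:
        if q <= q_threshold:
            by_region[region].append((variant, p))

    kept = []
    for region, rows in by_region.items():
        # The lead variant is the one with the smallest P value in the region.
        lead_p = min(p for _, p in rows)
        for variant, p in rows:
            # "Within one order of magnitude" is read here as p <= 10 * lead_p.
            if p <= lead_p * fold:
                kept.append((region, variant, p))
    return kept


example = [
    ("r1", "v1", 1e-10, 1e-6),
    ("r1", "v2", 5e-10, 1e-5),
    ("r1", "v3", 1e-7, 0.005),
    ("r2", "v4", 1e-3, 0.2),
]
clusters = select_cluster_variants(example)
```

In this toy input, v3 passes the FDR threshold but is dropped because its P value is more than ten times that of the lead variant v1, and region r2 has no association passing the threshold.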

Allele Specific Summary

This is the summary data for the allele specific SNPs (i.e., the data behind Figures 4a and 4b).

Chromosome X binding

This is the summary data for CTCF binding on the X chromosome.

Additional Data Files

Here are some additional data files that people might find interesting.

Genotypes: genotype data in VCF format, stored separately for each chromosome. Starting from the 1000 Genomes Phase 1 release, only variants within 50 kb of a CTCF binding region, as defined in the phenotype file, are included. Variants with an allele frequency of less than 5% in the 51 samples in the study are excluded.
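As a rough illustration of the two genotype filters described above, the following Python sketch checks a single variant against a list of binding regions and a list of genotype strings. The function names, the (start, end) region layout and the use of the alternative allele frequency are assumptions for the example; the text does not say whether the 5% cutoff was applied to the alternative or the minor allele, and this is not the code used to build the files.

```python
def distance_to_region(pos, start, end):
    """Distance in bp from a position to a region; 0 if the position is inside."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def alt_allele_frequency(genotypes):
    """Alternative allele frequency from VCF GT strings such as '0|1' or '1/1'."""
    alleles = [
        allele
        for gt in genotypes
        for allele in gt.replace("/", "|").split("|")
        if allele != "."  # skip missing calls
    ]
    if not alleles:
        return 0.0
    return sum(1 for allele in alleles if allele != "0") / len(alleles)


def keep_variant(pos, genotypes, regions, max_dist=50_000, min_af=0.05):
    """True if the variant is near a binding region and common enough.

    regions: list of (start, end) tuples on the same chromosome (assumed layout).
    """
    near = any(distance_to_region(pos, s, e) <= max_dist for s, e in regions)
    return near and alt_allele_frequency(genotypes) >= min_af


regions = [(100_000, 100_500)]
common = ["0|1", "0|0", "1|1"]
rare = ["0|0"] * 50 + ["0|1"]
```

With these toy inputs, a common variant 48.5 kb downstream of the region is kept, while a variant 50.5 kb away or one seen on only 1 of 102 chromosomes is dropped.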

Allele specific sites per cell line. The columns are:

  • chr: Chromosome of SNP
  • position: Location of SNP
  • ref: Reference allele
  • alt: Alternative allele
  • ref_count: reference count
  • alt_count: alternative count
  • percent_ref: percent reference allele (0-1)
  • dbSNP: dbSNP ID
  • genotype: heterozygous (1|0) only in these files
  • low_count: minimum of ref_count and alt_count
  • pVal: Binomial P value of allele bias
  • Pval.adj: FDR (BH method) corrected binomial P value
  • coordinate: chr|position
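
The derived columns in the list above (percent_ref, low_count, pVal and Pval.adj) can be recomputed from ref_count and alt_count. The Python sketch below shows one way to do so using only the standard library. It assumes a two-sided exact binomial test against an expected reference fraction of 0.5, which the column descriptions do not state explicitly, and the function names are made up for the example.

```python
from math import comb


def binomial_two_sided_p(ref_count, alt_count):
    """Two-sided exact binomial P value under a null reference fraction of 0.5."""
    n = ref_count + alt_count
    k = min(ref_count, alt_count)
    # With p = 0.5 the distribution is symmetric, so double the lower tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def summarise_site(ref_count, alt_count):
    """Derived per-site columns, named as in the column list."""
    total = ref_count + alt_count
    return {
        "percent_ref": ref_count / total if total else float("nan"),
        "low_count": min(ref_count, alt_count),
        "pVal": binomial_two_sided_p(ref_count, alt_count),
    }


site = summarise_site(8, 2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

For a site with 8 reference and 2 alternative reads, this gives percent_ref 0.8, low_count 2 and a P value of 112/1024. The BH correction would be applied across all sites in a file, so adjusted values depend on how many sites are tested together.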