Comment[ArrayExpressAccession] E-GEOD-44748 MAGE-TAB Version 1.1 Public Release Date 2013-10-17 Investigation Title Genome-wide binding profiles of KLF3 and KLF3 mutants in MEF cells Comment[Submitted Name] Genome-wide binding profiles of KLF3 and KLF3 mutants in MEF cells Experiment Description Transcription factors are often regarded as being comprised of a DNA-binding domain and a functional domain. The two domains are considered separable and autonomous, with the DNA-binding domain directing the factor to its target genes and the functional domain imparting transcriptional regulation. We have examined a typical Zinc Finger (ZF) transcription factor from the Krüppel-like factor (KLF) family, KLF3. This factor has an N-terminal repression domain that binds the co-repressor C-terminal binding protein (CtBP), and a DNA-binding domain composed of three classical ZFs at its C-terminus. We established a system to compare the genomic occupancy profile of wildtype KLF3 with two mutants affecting the N-terminal functional domain: a mutant unable to contact its cofactor CtBP and a mutant lacking the entire N-terminal domain, but retaining the ZFs intact. We used chromatin immunoprecipitation followed by sequencing (ChIP-seq) to assess binding across the genome in murine embryonic fibroblasts. Our results define the in vivo recognition site for KLF3 and the two mutants as a typical CACCC-like element. Unexpectedly, we observe that mutations in the N-terminal functional domain severely affect DNA binding. In general, both mutations reduce binding but there are also instances where binding is retained or even increased. These results provide a clear demonstration that the correct localization of transcription factors to their target genes is not solely dependent on their DNA-contact domains. This informs our understanding of how transcription factors operate and is of relevance to the design of artificial ZF proteins. 
ChIP-seq was performed on the three samples, KLF3, ΔDL and DBD, in duplicate (biological replicates). Input samples were used as controls. Term Source Name ArrayExpress EFO Term Source File http://www.ebi.ac.uk/arrayexpress/ http://www.ebi.ac.uk/efo/efo.owl Person Last Name Burdach Burdach Funnell Artuz Sin Tan Pearson Crossley Person First Name Jon Jon Alister Crisbel Mak Lit Richard Merlin Person Mid Initials P M K Y C Person Email geo@ncbi.nlm.nih.gov Person Affiliation University of New South Wales Person Address BABS, University of New South Wales, UNSW, Sydney, NSW, Australia Person Roles submitter Protocol Name P-GSE44748-3 P-GSE44748-2 P-GSE44748-1 Protocol Description Libraries (6 inputs and 6 IP samples) were multiplexed into four lanes using sample-specific adapters such that there were 3 samples per lane. Samples were sequenced using 50 bp chemistry on the HiSeq 2000 (Illumina, San Diego, CA). Quality control on the sequence data was performed using FastQC v0.10.1 available from http://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Base-calling and demultiplexing were performed using the standard Illumina software packaged with the HiSeq 2000. Reads were aligned to the mm9/NCBI37 Mus musculus genome using Bowtie2 v2.0.0-beta7 (Langmead and Salzberg 2012). In the first round, Bowtie2 was set to --very-sensitive and -D 40. Non-aligned reads were subjected to a second round of alignment where the read could be soft clipped by running Bowtie2 with the switch --very-sensitive-local. Resulting alignments were sorted, merged and indexed using Samtools v0.1.18 (Li et al. 2009). Peak calling and downstream analysis were primarily performed using the HOMER software package v4.1 (available from http://biowhat.ucsd.edu/homer/ngs/index.html) (Heinz et al. 2010). 
The script findPeaks.pl was used for peak discovery using the paired input sample as a control with the settings -style factor, -F 5 and -L 5, requiring 5-fold enrichment over input and 5-fold enrichment over background (surrounding 10 kb) to call a peak. Peaks were subjected to an FDR cut-off of 0.001. Peaks were merged using mergePeaks with the switch -d given, meaning that peaks had to literally overlap in genomic space to be considered overlapping. Peak lists were annotated using annotatePeaks.pl with the HOMER annotation set for mm9/NCBI37. HOMER was used to quantify ChIP tag density at peak locations across the genome. Tags were counted within 400 bp around the peak centre (as peak widths could vary across the three different samples). All tag counts were normalised to 100M reads and were thus expressed as reads/100M reads to allow comparison across samples. HOMER was also used to create bedgraph files using the makeUCSCfile program. Genome_build: mm9/NCBI37 Supplementary_files_format_and_content: Processed data files include 1) a normalised ChIP tag density track (.bedgraph). This tag density track reflects the normalised tag density across the merged replicates. 2) Peak lists for each individual replicate (tab-delimited-txt). 3) Peak lists that reflect the overlap between replicates and include log2 normalised tag counts for each peak (tab-delimited-txt). 4) A matrix table that includes peaks across all three conditions and log2 normalised tag counts for each peak (tab-delimited-txt). ChIP was conducted in duplicate on Klf3-/- MEFs expressing recombinant Klf3, ΔDL or DBD. Approximately 5x10^7 cells were used for each experiment and ChIP was conducted as previously described (Schmidt et al. 2009) using an anti-V5 antibody (Cat# R960-CUS, Life Technologies, Carlsbad, CA). 
Library preparation was performed using the TruSeq DNA Sample Preparation Kit (Cat# FC-121-2001, Illumina, San Diego, CA) according to the manufacturer’s instructions with minor modifications. Adapter sequences were diluted 1/40 before use and, following adapter ligation, the library size extracted from the gel was 100-280 bp (excluding adapters), in line with the size of sonicated fragments. Cells were grown in DMEM supplemented with 10% FCS and 1X PSG Protocol Type normalization data transformation protocol nucleic acid library construction protocol growth protocol Experimental Factor Name GENOTYPE ANTIBODY Experimental Factor Type genotype antibody Publication Title Regions outside the DNA-binding domain are critical for proper in vivo specificity of an archetypal zinc finger transcription factor. Publication Author List Burdach J, Funnell AP, Mak KS, Artuz CM, Wienert B, Lim WF, Tan LY, Pearson RC, Crossley M PubMed ID 24106088 Publication DOI 10.1093/nar/gkt895 Comment[SecondaryAccession] GSE44748 Comment[GEOReleaseDate] 2013-10-17 Comment[ArrayExpressSubmissionDate] 2013-02-28 Comment[GEOLastUpdateDate] 2013-11-15 Comment[AEExperimentType] ChIP-seq Comment[AdditionalFile:Data1] GSE44748_All_peaks.txt Comment[AdditionalFile:Data2] GSE44748_DBD-1_peaks.txt Comment[AdditionalFile:Data3] GSE44748_DBD-2_peaks.txt Comment[AdditionalFile:Data4] GSE44748_DBD_merged_peaks.txt Comment[AdditionalFile:Data5] GSE44748_DDL-1_peaks.txt Comment[AdditionalFile:Data6] GSE44748_DDL-2_peaks.txt Comment[AdditionalFile:Data7] GSE44748_DDL_merged_peaks.txt Comment[AdditionalFile:Data8] GSE44748_KLF3-1_peaks.txt Comment[AdditionalFile:Data9] GSE44748_KLF3-2_peaks.txt Comment[AdditionalFile:Data10] GSE44748_KLF3_merged_peaks.txt Comment[SecondaryAccession] SRP019227 Comment[SequenceDataURI] http://www.ebi.ac.uk/ena/data/view/SRR771348-SRR771359 SDRF File E-GEOD-44748.sdrf.txt