Please note that we have stopped the regular imports of Gene Expression Omnibus (GEO) data into ArrayExpress. This may not be the latest version of this experiment.

E-GEOD-27381 - Multiplexed identification of the genomic targets of DNA-binding proteins

Released on 18 February 2011, last updated on 3 May 2014
Saccharomyces cerevisiae
Samples (22)
Protocols (3)
Transcription factors direct gene expression, and so there is much interest in mapping their genome-wide binding locations.  Current methods do not allow for the multiplexed analysis of TF binding, and this limits their throughput. We describe a novel method for determining the genomic target genes of multiple transcription factors simultaneously. DNA-binding proteins are endowed with the ability to direct transposon insertions into the genome near to where they bind. The transposon becomes a “Calling Card” marking the visit of the DNA-binding protein to that location. A unique sequence “barcode” in the transposon matches it to the DNA-binding protein that directed its insertion. The sequences of the DNA flanking the transposon (which reveal where in the genome the transposon landed) and the barcode within the transposon (which identifies the TF that put it there) are determined by massively-parallel DNA sequencing. To demonstrate the method’s feasibility, we determined the genomic targets of eight transcription factors in a single experiment. The Calling Card method promises to significantly reduce the cost and labor needed to determine the genomic targets of many transcription factors in different environmental conditions and genetic backgrounds. These data contain Ty5 insertion sites mapped by an Illumina GAII analyzer in the S. cerevisiae genome for the background strain without any Sir4 present (1 run), in strains expressing Sir4-tagged copies of three well-characterized TFs: Gal4, Leu3, and Gcn4 (1 run each), and a multiplex of eight Sir4-tagged TFs pooled in a single experiment (2 biological replicates), and insertions from the Thi2-Sir4 fusion expressed from its native locus in two conditions (1 run each). The format of each insertions file is [chromosome number] [position of genomic base] [direction of insertion] [number of reads at that position]. Raw sequencing data comes in two varieties. Paired-end data contains a 5 bp barcode at the beginning of read #2. Single-end data contains a 2 bp barcode on the beggining of read #1.
Experiment type
Exp. designProtocolsVariablesProcessedSeq. reads
Investigation descriptionE-GEOD-27381.idf.txt
Sample and data relationshipE-GEOD-27381.sdrf.txt
Processed data (1)