Comment[ArrayExpressAccession]	E-GEOD-34073
MAGE-TAB Version	1.1
Public Release Date	2013-06-10
Investigation Title	Uniform optimal framework for integrative next-gen sequence analysis
Comment[Submitted Name]	Uniform optimal framework for integrative next-gen sequence analysis
Experiment Description	Here, we have collapsed multiple analysis problems into two coherent categories, signal detection and signal estimation and adapted linear-optimal solutions from signal processing theory. Our algorithms for detection (DFilter) and estimation (EFilter) extend naturally to integration of multiple datasets. In benchmarking tests, DFilter outperformed assay-specific algorithms at identifying promoters from histone ChIP-seq, binding sites from transcription factor (TF) ChIP-seq and open chromatin regions from DNase- and FAIRE-seq data. EFilter similarly outperformed an existing method for predicting mRNA levels from histone ChIP-seq data (Spearman correlation: 0.81 - 0.89). We performed H3K4me3 and H3K36me3 ChIP-seq on e11.5 mouse forebrain and used DFilter and EFilter to predict promoters and developmental gene expression, uncovering plausible gene targets for SNPs associated with neurodevelopmental disorders.
Generated two histone modification ChIP-seq datasets in developing embryonic mouse forebrain and used them to make biological inferences.
Term Source Name	ArrayExpress	EFO
Term Source File	http://www.ebi.ac.uk/arrayexpress/	http://www.ebi.ac.uk/efo/efo.owl
Person Last Name	Kumar	Kumar	Prabhaker	Masafumi
Person First Name	Vibhor	Vibhor	Shyam	MURATANI
Person Email	kumarv1@gis.a-star.edu.sg
Person Affiliation	Genome Institute of Singapore
Person Address	CMB-6, Genome Institute of Singapore, 60 Biopolis Street, Genome, #02-01, Singapore, Singapore
Person Roles	submitter
Protocol Name	P-GSE34073-3	P-GSE34073-5	P-GSE34073-4	P-GSE34073-2	P-GSE34073-1
Protocol Description	Input-FB-CMM012__unique_hits.tags.txt; genome build: mm8 The tags were extended by 200 bp to make wiggle tracks for the sequenced libraries, binning the whole genome into 200 bp bins. DFilter was used to find peaks in the H3K4me3 ChIP-seq data. The peaks are provided in the supplementary material. The ChIP-seq data were also used by EFilter to predict gene expression in mouse forebrain. For EFilter, GC correction was done for both ChIP libraries using the corresponding input libraries.	Input-FB-CMM023_023_unique_hits.tags.txt; genome build: mm8 The tags were extended by 200 bp to make wiggle tracks for the sequenced libraries, binning the whole genome into 200 bp bins. DFilter was used to find peaks in the H3K4me3 ChIP-seq data. The peaks are provided in the supplementary material. The ChIP-seq data were also used by EFilter to predict gene expression in mouse forebrain. For EFilter, GC correction was done for both ChIP libraries using the corresponding input libraries.	forebrain-track-CMM026-H3K36me3.vstep.wig; genome build: mm8 H3K36me3-FB-CMM026_184_unique_hits.tags.txt; genome build: mm8 The tags were extended by 200 bp to make wiggle tracks for the sequenced libraries, binning the whole genome into 200 bp bins. DFilter was used to find peaks in the H3K4me3 ChIP-seq data. The peaks are provided in the supplementary material.
The ChIP-seq data were also used by EFilter to predict gene expression in mouse forebrain. For EFilter, GC correction was done for both ChIP libraries using the corresponding input libraries.	forebrain-track-CMM008-H3K4me3.vstep.wig; genome build: mm8 H3K4me3-FB-CMM008_138_unique_hits.tags.txt; genome build: mm8 The tags were extended by 200 bp to make wiggle tracks for the sequenced libraries, binning the whole genome into 200 bp bins. DFilter was used to find peaks in the H3K4me3 ChIP-seq data. The peaks are provided in the supplementary material. The ChIP-seq data were also used by EFilter to predict gene expression in mouse forebrain. For EFilter, GC correction was done for both ChIP libraries using the corresponding input libraries.	Mouse embryonic forebrain tissues were micro-dissected and pooled from 11.5 dpc embryos. Dissected tissues were mechanically dissociated by passing them through a cell strainer and simultaneously fixed in 1% formaldehyde (diluted from 37% stock solution, Sigma, F8775), following pre-treatment with 0.125% trypsin/versene solution. 36 forebrain pieces were used for each ChIP-seq library preparation. Cells were resuspended in SDS buffer and sonicated 12 times for 30 seconds at 30-second intervals with a Bioruptor water bath sonicator (Diagenode). Following sonication, samples were diluted with IP dilution buffer and incubated with 50 microliters of affinity resin coupled with 10 microliters of anti-H3K4me3 antibody (Upstate, Cat# 07-473, Lot# 32497). Following washing steps, chromatin was reverse-crosslinked for purification of DNA. H3K4me3 ChIPed DNA and 0.5% input DNA samples were amplified for 8 cycles with the GenomePlex Single Cell Whole Genome Amplification Kit (WGA4, Sigma) using a universal primer linked to a BpmI restriction site. Amplified DNA samples were digested with BpmI to remove the universal primer. After quantification of DNA with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen, P7589), 12 ng of the DNA sample was directly used for Illumina sequencing adaptor preparation.
Illumina sequencing was performed using the GAII platform at GIS. H3K36me3 data were obtained similarly using 15 microliters of anti-H3K36me3 antibody (Abcam, Cat# Ab9050, Lot# 707981). As the DNA yield of the H3K36me3 ChIP was much higher, the WGA amplification step was not performed and the ChIP DNA was used directly for Illumina sample preparation.
Protocol Type	normalization data transformation protocol	normalization data transformation protocol	normalization data transformation protocol	normalization data transformation protocol	nucleic acid library construction protocol
Experimental Factor Name	CHIP ANTIBODY VENDOR	CHIP ANTIBODY LOT#	CHIP ANTIBODY	CHIP ANTIBODY CATALOG#
Experimental Factor Type	chip antibody vendor	chip antibody lot#	chip antibody	chip antibody catalog#
Comment[SecondaryAccession]	GSE34073
Comment[GEOReleaseDate]	2013-06-10
Comment[ArrayExpressSubmissionDate]	2011-12-01
Comment[GEOLastUpdateDate]	2013-06-10
Comment[AEExperimentType]	ChIP-seq
Comment[SecondaryAccession]	SRP009567
Comment[SequenceDataURI]	http://www.ebi.ac.uk/ena/data/view/SRR387507-SRR387513
SDRF File	E-GEOD-34073.sdrf.txt