normalization data transformation protocol
Bowtie alignment protocol. Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. Reads that passed the default parameters of the Illumina quality filter were mapped with Bowtie 0.12.7. The reference sequence used to build the mapping index was the FlyBase reference r5.22. The reference index was created using the bowtie-build function with default parameters. Reads were mapped using -n 2 -l 28 -e 70 -k 1 -m 10 --best in Bowtie (-n 2, -l 28 and -e 70 are the defaults).
Parameters:
-n INT: number of mismatches allowed
-l INT: seed length
-e INT: maximum permitted total of quality values at all mismatched read positions
-m INT: suppress all alignments for a particular read or pair if more than INT reportable alignments exist
-k INT: report up to INT valid alignments
--best: report alignments in best-to-worst order
Reads were converted to SAM format using the bowtie2sam.pl script provided by the DCC.
SPP peak calling protocol. Peaks were called from the aligned reads using the SPP R package (ver. 1.10, Kharchenko et al., Nat. Biotech. 2008). The binding characteristics and tag shift were estimated using the SPP get.binding.characteristics() function. Reads showing local tag anomalies were removed using the SPP remove.local.tag.anomalies() function. Regions of enrichment were identified as continuous blocks of positions that exhibit significant enrichment of IP over input reads. The significance of enrichment was assessed within a 1 kb window around each position using a Poisson model. A position was considered significantly enriched if the IP read count was significantly higher (Z-score > 3) than the input read count, after adjusting for the total number of reads sequenced in the IP and input samples. For each replicate set, if the replicates were consistent (>80% agreement for the top 40% of peaks/enrichment clusters), reads from the replicates were combined and enrichment regions were called as described above.
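The enrichment criterion described above can be illustrated with a minimal Python sketch. This is not the SPP implementation: the protocol does not give the exact statistic, so the Z-score formula here (difference between the IP count and the depth-scaled input count, divided by its Poisson standard deviation) is an assumption, and the function and argument names are hypothetical.

```python
import math


def enrichment_z(ip_count, input_count, ip_total, input_total):
    """Illustrative Z-score for IP-over-input enrichment in one window.

    ip_count / input_count: reads falling in the window (e.g. 1 kb).
    ip_total / input_total: total reads sequenced in each sample.
    The formula is an assumed Poisson approximation, not SPP's own code.
    """
    # Adjust for sequencing depth: scale input counts to the IP library size.
    scale = ip_total / input_total
    expected = scale * input_count
    # For Poisson counts the variance equals the mean, so the variance of
    # (ip_count - scale * input_count) is estimated as below.
    variance = ip_count + scale ** 2 * input_count
    if variance == 0:
        return 0.0
    return (ip_count - expected) / math.sqrt(variance)


def is_enriched(ip_count, input_count, ip_total, input_total, z_threshold=3.0):
    """Apply the Z-score > 3 cutoff stated in the protocol."""
    return enrichment_z(ip_count, input_count, ip_total, input_total) > z_threshold
```

For example, with 1,000,000 IP reads and 2,000,000 input reads, a window holding 100 IP reads and 20 input reads passes the cutoff under this approximation, while a window holding 12 IP reads and 20 input reads does not.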
Processed data were obtained using the following parameters: genome version is FlyBase r5
