Comment[ArrayExpressAccession] E-GEOD-51556
MAGE-TAB Version 1.1
Public Release Date 2014-01-20
Investigation Title The dsRBP and inactive editor, ADR-1, utilizes dsRNA binding to regulate A-to-I RNA editing across the C. elegans transcriptome
Comment[Submitted Name] The dsRBP and inactive editor, ADR-1, utilizes dsRNA binding to regulate A-to-I RNA editing across the C. elegans transcriptome
Experiment Description Purpose: The purpose of this experiment is to expand the repertoire of C. elegans edited transcripts and identify the roles of ADR-1 as an indirect regulator of editing and ADR-2 as the only active deaminase in vivo. Methods: Strand-specific RNA sequencing of wild-type and adr mutant worms, followed by a novel RNA variant calling and comparative analysis pipeline. Results: Despite lacking deaminase function, ADR-1 affects editing of over 60 adenosines within the 3’ UTRs of 16 different mRNAs. Furthermore, ADR-1 interacts directly with ADR-2 substrates, even in the absence of ADR-2; and mutations within its dsRNA binding domains abolished both binding and editing regulation. Conclusions: ADR-1 acts as a major regulator of editing by binding ADR-2 substrates in vivo and raises the possibility that other dsRNA binding proteins, including the inactive human ADARs, regulate RNA editing by deaminase-independent mechanisms. Strand-specific RNA sequencing of wild-type and adr mutant worms, followed by a novel RNA variant calling and comparative analysis pipeline.
Term Source Name ArrayExpress EFO
Term Source File http://www.ebi.ac.uk/arrayexpress/ http://www.ebi.ac.uk/efo/efo.owl
Person Last Name Kakaradov Washburn Kakaradov Sundararaman Wheeler Hoon Yeo Hundley
Person First Name Boyko Michael Boyko Balaji Emily Shawn Gene Heather
Person Mid Initials C W A
Person Email geo@ncbi.nlm.nih.gov
Person Affiliation UCSD
Person Address Bioinformatics, UCSD, 2880 Torrey Pines Scenic Dr., La Jolla, CA, USA
Person Roles submitter
Protocol Name P-GSE51556-4 P-GSE51556-1 P-GSE51556-3 P-GSE51556-2
Protocol Description ssRNAseq: The adr-1(-);adr-2(-) sample was sequenced on one lane of Illumina's HiSeq 2000, yielding 216 million single-end 76nt reads. Each other sample was sequenced on a lane of Illumina GAII, yielding between 37 and 42 million reads of the same type. Illumina Casava1.8.2 software was used for basecalling. Mapping: Sequenced reads were mapped to the C. elegans reference genome (ce10) using the TopHat aligner (version 2.0.6), allowing only uniquely-mapped reads with up to two mismatches each, with command line options -Mx 1 and -N 2. Variant calling: Sites with RNA-DNA differences were identified by SAMtools mpileup (version 0.1.18), tallying up to 1000 alignments per site. Additional command line options used were -D -I and -g. Site filters: Annotated SNPs were obtained from Illumina's iGenomes collection for C. elegans (ce10) and unannotated variants were extracted from the adr-1(-);adr-2(-) RNA-Seq dataset. These genomic variants were filtered from the putative sites in all other strains, reducing the number of false-positive predictions.
Read filters: Each read aligned to one of the remaining putative sites was filtered out if: a) it was a suspected PCR duplicate, according to SAMtools rmdup (version 0.1.18); b) it had a junction overhang < 10nt according to its SAMtools CIGAR string; c) it had > 1 non-A2G or non-C2T mismatch or any short indel, per its MD tag; d) it had a mismatch less than 25nt away from either end of the read. Identify sites: Putative RNA editing sites were identified from A2G variants on the sense strand and T2C variants on the antisense strand that were covered by more than 5 reads that passed the filters in step 5, including the stringent 25nt threshold for filter 5d). Quantify sites: The extent of editing at each site and our confidence in that prediction were quantified by a novel extension of the classical Bayesian model used for genomic variants, which is described in more detail in the next section. To increase the accuracy and confidence of our predictions, we used additional reads from the relaxed version of filter 5d) that overlap the sites identified in step 6. Moreover, we dropped sites that exhibited editing in 100% of the reads (suggesting a genomic variant not filtered out by step 4) and those with very low editing (less than 10%), which would have been hard to distinguish from sequencing errors. The predicted RNA editing sites from each strain were characterized according to their position in annotated genic regions (introns, exons, 3'/5' UTRs, etc.) and according to their overlap with other strains. Supplementary_files_format_and_content: tab-delimited text file containing #READS and %EDITING values for each editing site. Genome_build: ce10. The injection mix used for generating transgenics contained the following: 1ng/μl of the transgene of interest, 20ng/μl of the dominant marker, and 79ng/μl of 1kb DNA ladder (NEB). Total RNA was extracted using Trizol reagent.
RNA libraries were prepared for strand-specific RNA sequencing using the published protocol (Parkhomchuk et al. Nucleic Acids Research. 2009 37(18):e123). Transgenic worm lines were generated by microinjection into the gonads of young adult worms of the appropriate genetic background.
Protocol Type normalization data transformation protocol sample treatment protocol nucleic acid library construction protocol growth protocol
Experimental Factor Name GENOTYPE
Experimental Factor Type genotype
Publication Title The dsRBP and inactive editor ADR-1 utilizes dsRNA binding to regulate A-to-I RNA editing across the C. elegans transcriptome.
Publication Author List Washburn MC, Kakaradov B, Sundararaman B, Wheeler E, Hoon S, Yeo GW, Hundley HA
PubMed ID 24508457
Publication DOI 10.1016/j.celrep.2014.01.011
Comment[SecondaryAccession] GSE51556
Comment[GEOReleaseDate] 2014-01-20
Comment[ArrayExpressSubmissionDate] 2013-10-22
Comment[GEOLastUpdateDate] 2014-03-28
Comment[AEExperimentType] RNA-seq of coding RNA
Comment[SecondaryAccession] SRP031831
Comment[SequenceDataURI] http://www.ebi.ac.uk/ena/data/view/SRR1015363-SRR1015370
SDRF File E-GEOD-51556.sdrf.txt
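
The thresholds stated in the "Identify sites" and "Quantify sites" steps of the protocol description can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the Bayesian model is not reproduced, the editing level is approximated by a plain edited/total read ratio, and a single read count is used for both the coverage and editing checks even though the protocol draws them from differently filtered read sets. The names Site and keep_site, and the example coordinates, are hypothetical.

```python
from dataclasses import dataclass

# Thresholds taken from the protocol text; everything else is an assumption.
MIN_COVERAGE = 5  # a site needs MORE than 5 filtered reads


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one putative editing site."""
    chrom: str
    pos: int
    edited_reads: int  # reads supporting A2G (or T2C on the antisense strand)
    total_reads: int   # all reads that passed the read filters


def keep_site(site: Site) -> bool:
    """Apply the coverage and editing-level cutoffs described in the protocol."""
    if site.total_reads <= MIN_COVERAGE:
        return False
    # Drop sites edited in 100% of reads (likely an unfiltered genomic variant).
    if site.edited_reads >= site.total_reads:
        return False
    # Drop sites with less than 10% editing (hard to tell from sequencing error).
    # Integer arithmetic avoids floating-point edge cases at exactly 10%.
    return site.edited_reads * 10 >= site.total_reads


if __name__ == "__main__":
    candidates = [
        Site("chrI", 100, 3, 10),   # 30% editing, 10 reads
        Site("chrI", 200, 10, 10),  # 100% editing
        Site("chrI", 300, 1, 20),   # 5% editing
        Site("chrI", 400, 2, 5),    # only 5 reads
    ]
    for s in candidates:
        print(s.chrom, s.pos, "keep" if keep_site(s) else "drop")
```

Only the first example site passes all three cutoffs; the others fail on full editing, low editing, and insufficient coverage, respectively.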