Comment[ArrayExpressAccession]	E-GEOD-46479
MAGE-TAB Version	1.1
Public Release Date	2013-08-01
Investigation Title	Ultra-deep RNA-seq of in vitro and in vivo transcripts by Escherichia coli RNA polymerase
Comment[Submitted Name]	Ultra-deep RNA-seq of in vitro and in vivo transcripts by Escherichia coli RNA polymerase
Experiment Description	We report a high-resolution Illumina RNA-seq method that can analyze non-coded base substitutions in mRNA at frequencies of 10^-4 to 10^-5 per base in vitro and in vivo. The RNA samples were generated by transcription of the pPR9 plasmid, which contains a 5.7 kb fragment of the E. coli rpoBC operon transcribed from a strong lambda phage PR promoter and terminated at an fd phage transcription terminator. The reference transcription reaction was performed in a buffer with 5 mM MgCl2 to determine the standard error rate (barcode 1). To reduce fidelity, we replaced Mg2+ with Mn2+ (barcode 2). To increase fidelity, we added GreA/GreB proteins for proofreading activity (barcodes 3 and 4). The same transcript was purified from E. coli cells (barcode 5).
Term Source Name	ArrayExpress	EFO
Term Source File	http://www.ebi.ac.uk/arrayexpress/	http://www.ebi.ac.uk/efo/efo.owl
Person Last Name	Imashimizu	Imashimizu	Kashlev	Oshima
Person First Name	Masahiko	Masahiko	Mikhail	Taku
Person Email	imashimizum@mail.nih.gov
Person Affiliation	NCI
Person Address	FNLCR, NCI, Bld539, Frederick, MD, USA
Person Roles	submitter
Protocol Name	P-GSE46479-3	P-GSE46479-2	P-GSE46479-1
Protocol Description	CASAVA version 1.4. A single large FASTQ file of high-quality reads (Q ≥ 30) was split into about 10 smaller files using the shell script splitReads.sh. The obtained reads were aligned and mapped to the pPR9 plasmid DNA sequences using Bowtie 0.12.7. The numbers of the bases A, T, G, and C, and of N, were counted at each position of the mapped reads using SAMtools 0.1.18, supplemented by a Perl script. Each type of error rate per position was determined as the number of sequence reads with a particular type of base substitution divided by the number of reads with the reference base at that DNA position. Genome_build: pPR9 plasmid (ref. Kashlev et al., 1989, PMID: 2547695). Supplementary_files_format_and_content: tab-delimited text file of transition error rate per position. Each position corresponds to a position in the sequenced region of the pPR9 plasmid (given in one of the attached txt files).	In vitro RNA preparation: The 5.7 kb RNA was purified from the digested DNA, NTPs, abortive oligo-RNA products, and proteins by acidic phenol extraction and a G-50 spin column, followed by ethanol precipitation. In vivo RNA preparation: The cells in 200 ml culture were harvested and resuspended in a solution containing 0.5% SDS, 20 mM sodium acetate (pH 5.5), and 10 mM EDTA. The suspended cells were mixed with an equal volume of pre-warmed saturated phenol (20 mM sodium acetate, 10 mM EDTA, pH 5.5) and incubated for 5 min at 60˚C. The mixture was centrifuged, and RNA and DNA were precipitated with ethanol from the supernatant. The pellet was dissolved in DNase I buffer with 10 U of DNase I and incubated for 30 min. RNA was separated from the digested DNA by acidic phenol extraction followed by G-50 Micro column (GE Healthcare) purification, and then precipitated with ethanol. The pellet was dissolved in diethylpyrocarbonate-treated water and used for cDNA synthesis. We established a method for preparing five different cDNA libraries, each with its own barcode, for Illumina sequencing. Each 6-nt barcode allows multiplexing all five in vitro and in vivo preparations in a single sequencing analysis. Our method introduces internal control sequences into the library that are subject to artifact errors but not to RNAP errors. The 5’ fragment of the 5.7 kb RNA transcripts was reverse transcribed to make the cDNA. The cDNA was subjected to PCR reactions that generated six 200 bp segments. The primers contained a specific barcode for each of the five starting preparations and the inner Illumina sequencing adapters. The second PCR step generated the final cDNA libraries for Illumina sequencing, using the first-step PCR product as a template and primers containing the outer sequencing adapters in their 5’ tails. mRNA-seq with barcode (Illumina TruSeq Index 1-5).	Cells were cultured in LB medium containing ampicillin at 28˚C. The overnight cell culture was inoculated into fresh medium at 1/70 (v/v) and incubated for 2 hr at 28˚C (OD600 reached 0.35) and then for 2 hr at 42˚C (OD600 reached 2.3) to induce the PR promoter.
Protocol Type	normalization data transformation protocol	nucleic acid library construction protocol	growth protocol
Experimental Factor Name	MRNA SYNTHESIS
Experimental Factor Type	mrna synthesis
Publication Title	Direct assessment of transcription fidelity by high-resolution RNA sequencing.
Publication Author List	Imashimizu M, Oshima T, Lubkowska L, Kashlev M
PubMed ID	23925128
Publication DOI	10.1093/nar/gkt698
Comment[SecondaryAccession]	GSE46479
Comment[GEOReleaseDate]	2013-08-01
Comment[ArrayExpressSubmissionDate]	2013-04-29
Comment[GEOLastUpdateDate]	2013-08-28
Comment[AEExperimentType]	RNA-seq of coding RNA
Comment[AEExperimentType]	other
Comment[AdditionalFile:Data1]	GSE46479_Seq_region.txt
Comment[AdditionalFile:Data2]	GSE46479_error_rate.txt
Comment[AdditionalFile:Data3]	GSE46479_pPR9_plasmid.txt
Comment[SecondaryAccession]	SRP021886
Comment[SequenceDataURI]	http://www.ebi.ac.uk/ena/data/view/SRR835953-SRR835957
SDRF File	E-GEOD-46479.sdrf.txt
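
Note on the error-rate definition: the data transformation protocol defines each per-position error rate as the number of reads carrying a particular substitution divided by the number of reads carrying the reference base at that position. The sketch below illustrates that arithmetic only. It is not the authors' SAMtools/Perl pipeline; the function names (`substitution_rates`, `transition_rate`) and the example counts are hypothetical, and it assumes per-position base counts have already been tallied from the mapped reads.

```python
# Illustrative sketch only: names and example counts are hypothetical,
# not taken from the submitters' scripts.

BASES = "ACGT"

# Transitions are purine<->purine (A<->G) and pyrimidine<->pyrimidine (C<->T).
TRANSITION_PARTNER = {"A": "G", "G": "A", "C": "T", "T": "C"}


def substitution_rates(ref_base, counts):
    """Return {alt_base: rate} for one reference position.

    rate = reads showing alt_base / reads showing the reference base,
    following the definition in the protocol description. Reads called
    as N are ignored. Returns an empty dict when no read carries the
    reference base, because the rate is undefined in that case.
    """
    ref_reads = counts.get(ref_base, 0)
    if ref_reads == 0:
        return {}
    return {
        alt: counts.get(alt, 0) / ref_reads
        for alt in BASES
        if alt != ref_base
    }


def transition_rate(ref_base, counts):
    """Return the transition error rate at one position, or None if undefined."""
    rates = substitution_rates(ref_base, counts)
    if not rates:
        return None
    return rates[TRANSITION_PARTNER[ref_base]]


if __name__ == "__main__":
    # Hypothetical base counts at a position whose reference base is A.
    example = {"A": 100000, "G": 5, "C": 1, "T": 0, "N": 2}
    print(substitution_rates("A", example))
    print(transition_rate("A", example))
```

Because the denominator is the count of reference-base reads rather than the total depth, positions with no reference-base reads are reported as undefined instead of zero.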