Comment[ArrayExpressAccession]	E-GEOD-38725
MAGE-TAB Version	1.1
Public Release Date	2013-02-19
Investigation Title	Stallion sperm transcriptome as revealed by microarray analysis and RNA sequencing
Comment[Submitted Name]	Stallion sperm transcriptome as revealed by microarray analysis and RNA sequencing
Experiment Description	Purpose: In order to understand the functional significance of sperm transcriptome in stallion fertility, the aim of this study was to generate a detailed body of knowledge about the sperm RNA profile that defines a normal fertile stallion. Methods: The 50 bp single-end ABI SOLiD raw reads were directly aligned with the horse reference sequence EcuCab2 using ABI aligner software (NovoalignCS version 1.00.09, novocraft.com) which uses multiple indexes in the reference genome, identifies candidate alignment locations for each primary read, and allows completion of the alignment. Results: Next generation sequencing (NGS) of total RNA from the sperm of two reproductively normal stallions generated about 70 million raw reads and more than 3 Gb of sequence per sample; over half of these aligned with the EcuCab2 reference genome. Altogether, 19,257 sequence tags with average coverage ≥1 (normalized number of transcripts) were mapped in the horse genome. Conclusion: The sequence of stallion sperm transcriptome is an important foundation for the discovery of transcripts of known and novel genes, and non-coding RNAs, thus improving the annotation of the horse genome sequence draft and providing markers for evaluating stallion fertility.
Reproductively fertile Stallion sperm transcriptome as revealed by RNA sequencing
Term Source Name	ArrayExpress	EFO
Term Source File	http://www.ebi.ac.uk/arrayexpress/	http://www.ebi.ac.uk/efo/efo.owl
Person Last Name	Das	Das	Raudsepp
Person First Name	Pranab	Pranab	Terje
Person Mid Initials	Jyoti	J
Person Email	pjdas@cvm.tamu.edu
Person Affiliation	Texas A&M University
Person Phone	979-458-0520
Person Fax	979-845-9972
Person Address	Department of Veterinary Integrative BioSciences, Texas A&M University, Room 314B Bldg. 1197, College Station, TX, USA
Person Roles	submitter
Protocol Name	P-GSE38725-1	P-GSE38725-2	P-GSE38725-3
Protocol Description	Fresh sperm ejaculates from reproductively normal stallions were collected using an artificial vagina. The ejaculates were first evaluated for sperm concentration, motility characteristics and morphological features, followed by purification from somatic cells and immature sperm by EquiPure™ (Nidacon International, Sweden) discontinuous gradient centrifugation. Total RNA was isolated from sperm with TRIzol reagent (Invitrogen).	Total RNA from the sperm of reproductively normal stallions was used for next generation sequencing (NGS) on the ABI SOLiD platform. A total of 500 ng of total RNA was used directly for SOLiD single-end RNA sequencing fragment library construction according to the ABI protocol.	The 50 bp single-end SOLiD raw reads were directly aligned with the horse reference sequence EcuCab2 using ABI aligner software (NovoalignCS version 1.00.09, novocraft.com) which uses multiple indexes in the reference genome, identifies candidate alignment locations for each primary read, and allows completion of the alignment. The single highest-scoring alignment for each raw read was mapped. Alignments in repetitive sequences were discarded by removing reads with multiple similarly scoring alignments (non-unique matches).
Sequence alignment and alignment clustering to define expressed loci and perform linear normalization across the two sperm RNA samples were carried out with the EXP software package (Cofactor Genomics). Gene expression level or average coverage (AC) was calculated by normalizing each sample to the fewest reads and directly comparing different loci. Expression level of a transcript was estimated from the number of reads that mapped to that transcript. Variability in sequencing depth between samples was addressed by using two biological replicates. Sequencing depth at each locus and differences in gene expression (AC) between the two sperm samples were calculated using a log (base 2) ratio, thus showing the association between the two samples. Genome_build: EcuCab2 Supplementary_files_format_and_content: tab-delimited text files include the chromosomal locations of 19,257 RNA-seq tags, combining two biological replicates, with their expression values (average coverage); average coverage ≥1
Protocol Type	specified_biomaterial_action	nucleic acid library construction protocol	feature_extraction
Publication Title	Stallion Sperm Transcriptome Comprises Functionally Coherent Coding and Regulatory RNAs as Revealed by Microarray Analysis and RNA-seq.
Publication Author List	Das PJ, McCarthy F, Vishnoi M, Paria N, Gresham C, Li G, Kachroo P, Sudderth AK, Teague S, Love CC, Varner DD, Chowdhary BP, Raudsepp T
PubMed ID	23409192
Publication DOI	10.1371/journal.pone.0056535
Comment[SecondaryAccession]	GSE38725
Comment[GEOReleaseDate]	2013-02-19
Comment[ArrayExpressSubmissionDate]	2012-06-14
Comment[GEOLastUpdateDate]	2013-02-19
Comment[AEExperimentType]	RNA-seq of coding RNA
Comment[AdditionalFile:Data1]	GSE38725_RNASeq.Sample1.Sample2.txt
Comment[SecondaryAccession]	SRP013752
Comment[SequenceDataURI]	http://www.ebi.ac.uk/ena/data/view/SRR513397-SRR513398
SDRF File	E-GEOD-38725.sdrf.txt