Comment[ArrayExpressAccession]	E-GEOD-50890
MAGE-TAB Version	1.1
Public Release Date	2013-12-01
Investigation Title	Efficient identification of microRNAs for classification of tumor origin (I)
Comment[Submitted Name]	Efficient identification of microRNAs for classification of tumor origin (I)
Experiment Description	Carcinomas of unknown primary origin constitute 3-5% of all newly diagnosed metastatic cancers, of which the primary source is difficult to classify with current histological methods. Effective cancer treatment depends on early and accurate identification of the tumor, which is why patients with metastases of unknown origin have poor prognosis and short survival. Because microRNA expression is highly tissue specific, the microRNA profile of a metastasis may be used to identify its origin. As a first step to realize this goal, we evaluated the potential of microRNA profiling for identification of the primary tumor of known metastases. 208 formalin-fixed paraffin-embedded samples representing 15 different histologies were profiled on an LNA-enhanced microarray platform, which allows for highly sensitive and specific detection of microRNA. Based on these data, we developed and cross-validated a novel classification algorithm, LASSO (Least Absolute Shrinkage and Selection Operator), which had an overall accuracy of 85%. When the classifier was applied on an independent test set of 48 metastases, the primary site was correctly identified in 42 cases (88% accuracy). Our findings suggest that microRNA expression profiling on paraffin tissue can efficiently predict the primary origin of a tumor, and may provide pathologists with a molecular diagnostic tool that can improve their capability to correctly identify the origin of hitherto unidentifiable metastatic tumors, and eventually, enable tailored therapy.
94 samples
Term Source Name	ArrayExpress	EFO
Term Source File	http://www.ebi.ac.uk/arrayexpress/	http://www.ebi.ac.uk/efo/efo.owl
Person Last Name	Søkilde	Søkilde	Litman
Person First Name	Rolf	Rolf	Thomas
Person Email	rolf.soekilde@gmail.com
Person Affiliation	Lund University, Sweden
Person Address	Department of Oncology, Lund University, Sweden, Scheelevägen 2, Lund, Sweden
Person Roles	submitter
Protocol Name	P-GSE50890-1	P-GSE50890-5	P-GSE50890-6	P-GSE50890-2	P-GSE50890-3	P-GSE50890-4	P-GSE50890-7
Protocol Description	All low-level analyses were carried out in the R environment, including importing and pre-processing of the data using the LIMMA package (http://bioconductor.org). Mean pixel intensities were used to calculate signal (foreground) spot intensities, and median pixel intensities were applied to estimate background intensity. After excluding flagged spots from the analysis, the “normexp” background correction method, with offset=10, was applied (23). For intra-slide normalization, the global Lowess (LOcally Weighted Scatterplot Smoothing) regression algorithm was applied, and log2 ratios of four intra-slide replicates were averaged. All expression data were deposited in the Rosetta Resolver (Rosetta Biosoftware, UK) data management and analysis system. ID_REF = VALUE = averaged log2 scaled Cy3/Cy5 ratios	For microarray analysis, we applied a common reference design, where the reference sample contains a mixture of total RNA representing all tissue types in the study. This allows for both one- and two-channel data analysis, as described in detail by Søkilde et al. (21). In the current study, we applied the two-channel ratio analysis, as this permits comparison across different array versions.
1 µg of total RNA from each sample was labeled using the miRCURY™ LNA microRNA Power labeling Kit (Exiqon, Vedbæk, Denmark) following a two-step protocol: First, Calf Intestinal Alkaline Phosphatase (CIAP) was applied to remove terminal 5’phosphates, and next, fluorescent labels were attached enzymatically to the 3’-end of the microRNAs. Sample specific RNA was labeled with Hy3 (green) fluorophore, while the common reference RNA pool was labeled with the Hy5 (red).	The Hy3 and Hy5 labeled RNA samples were mixed, and co-hybridized to miRCURY™ LNA Arrays v.Dx10 and v.11 (Exiqon, Vedbæk, Denmark), which contain Tm normalized capture probes targeting miRNAs from human, mouse and rat, as registered in miRBase v.19.0 at the Sanger Institute (22). Hybridization was carried out overnight for 16 hours at 65 °C in a Tecan HS4800 hybridization station (Tecan, Männedorf, Switzerland).	NA	NA	Total RNA was extracted from 20 µm FFPE sections with the High Pure miRNA Isolation Kit (Roche Applied Science, Mannheim, Germany) according to the manufacturer’s instructions. After elution in 40 µl RNase free water, the RNA concentration (A260 nm) and purity (A260/280 and A260/230 ratios) were assessed with a Nanodrop ND-1000 spectrophotometer (Thermo Scientific, Wilmington DE). The RNA was stored at -80 °C until further analysis.	After washing and drying, the microarray slides were scanned under ozone free conditions (ozone level < 2.0 ppb to minimize bleaching of the fluorescent dyes) in a G2565BA Microarray Scanner System, (Agilent, Santa Clara, CA). The resulting images were quantified using Imagene v. 8.0 (BioDiscovery, El Segundo, CA), and both automatic quality control (flagging of poor spots by the software) as well as manual, visual inspection was performed to ensure the highest possible data quality.
Protocol Type	normalization data transformation protocol	labelling protocol	hybridization protocol	sample treatment protocol	growth protocol	nucleic acid extraction protocol	array scanning protocol
Experimental Factor Name	ORGANISM PART
Experimental Factor Type	organism part
Comment[SecondaryAccession]	GSE50890
Comment[GEOReleaseDate]	2013-12-01
Comment[ArrayExpressSubmissionDate]	2013-09-16
Comment[GEOLastUpdateDate]	2013-12-01
Comment[AEExperimentType]	transcription profiling by array
SDRF File	E-GEOD-50890.sdrf.txt