MTBLS321: eRah: A computational tool integrating spectral deconvolution and alignment with quantification and identification of metabolites in GCMS-based metabolomics

Abstract

Gas chromatography coupled to mass spectrometry (GC-MS) has been a long-standing approach used for identifying small molecules due to the highly reproducible ionization process of electron impact ionization (EI). However, the use of GC-EI MS in untargeted metabolomics produces large and complex datasets characterized by coeluting compounds and extensive fragmentation of molecular ions caused by the hard electron ionization. In order to identify and extract quantitative information of metabolites across multiple biological samples, integrated computational workflows for data processing are needed. Here we introduce eRah, a free computational tool written in the open language R composed of five core functions: (i) noise filtering and baseline removal of GC-MS chromatograms, (ii) an innovative compound deconvolution process using multivariate analysis techniques based on compound match by local covariance (CMLC) and orthogonal signal deconvolution (OSD), (iii) alignment of mass spectra across samples, (iv) missing compound recovery, and (v) identification of metabolites by spectral library matching using publicly available mass spectra. eRah outputs a table with compound names, matching scores and the integrated area of compounds for each sample. The automated capabilities of eRah are demonstrated by the analysis of GC-TOF MS data from plasma samples of adolescents with hyperinsulinaemic androgen excess and healthy controls. The quantitative results of eRah are compared to centWave, the peak-picking algorithm implemented in the widely used XCMS package, MetAlign and ChromaTOF software. Significantly dysregulated metabolites are further validated using pure standards and targeted analysis by GC-QqQ MS, LC-QqQ and NMR.

 Authors: Xavier Domingo-Almenara, Oscar Yanes

  Release date: 22-Sep-2016

 Status: Public

Organism(s)

Homo sapiens

  Study Design

CHMO:gas chromatography-mass spectrometry

untargeted metabolites

eRah

  Experimental Factors

Disease

Protocol Description
Sample collection A total of 25 plasma samples (11 young, non-obese adolescents with hyperinsulinaemic androgen excess and 14 age-, weight- and ethnicity-matched healthy controls) were analyzed by GC-EI-qTOF-MS (Agilent Technologies) [1].

Ref: [1] Samino S, Vinaixa M, Díaz M, Beltran A, Rodríguez MA, Mallol R, Heras M, Cabre A, Garcia L, Canela N, de Zegher F, Correig X, Ibáñez L, Yanes O. Metabolomics reveals impaired maturation of HDL particles in adolescents with hyperinsulinaemic androgen excess. Sci Rep. 2015 Jun 23;5:11496. doi: 10.1038/srep11496.
Extraction Plasma aliquots (25 µl) were thawed at 4 °C. Samples were briefly vortex-mixed and each aliquot was supplemented with 20 µl of 1 µg/µl succinic-d4 acid (internal standard). Proteins were then precipitated by the addition of 475 µl cold methanol/water (8:1 vol/vol) followed by 3 min ultrasonication and 10 s vortex-mixing. Aliquots were subsequently maintained on ice for 10 min. After centrifugation for 10 min (19,000 g, 4 °C), 100 µl of supernatant was transferred to a GC autosampler vial and lyophilized. We incubated the lyophilized plasma residues with 50 µl methoxyamine in pyridine (40 µg/µl) for 30 min at 60 °C. To increase the volatility of the compounds, we silylated the samples using 30 µl N-methyl-N-trimethylsilyltrifluoroacetamide with 1% trimethylchlorosilane (Thermo Fisher Scientific) for 30 min at 60 °C.
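As a rough check on the extraction volumes, the sketch below estimates what fraction of the extract reaches the autosampler vial. It assumes the internal-standard addition is a 20 µl volume, that volumes are simply additive, and that no liquid is lost to the protein pellet. The function and variable names are illustrative and not part of the study.

```python
def transferred_fraction(volumes_ul, transferred_ul):
    """Fraction of the total extract volume that is carried forward."""
    total_ul = sum(volumes_ul)
    return transferred_ul / total_ul


plasma_ul = 25.0             # plasma aliquot
istd_ul = 20.0               # internal standard solution (assumed to be a volume)
istd_conc_ug_per_ul = 1.0    # succinic-d4 acid concentration
solvent_ul = 475.0           # cold methanol/water (8:1 vol/vol)
supernatant_ul = 100.0       # volume transferred to the GC vial

fraction = transferred_fraction([plasma_ul, istd_ul, solvent_ul], supernatant_ul)
plasma_equiv_ul = plasma_ul * fraction
istd_in_vial_ug = istd_ul * istd_conc_ug_per_ul * fraction

print(f"fraction transferred: {fraction:.3f}")
print(f"plasma equivalent in vial: {plasma_equiv_ul:.2f} µl")
print(f"internal standard in vial: {istd_in_vial_ug:.2f} µg")
```

Under these assumptions, roughly one fifth of each extract is lyophilized and derivatized.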
Chromatography Samples were analysed using an Agilent 7890A gas chromatograph coupled to a qTOF MS 7200 (Agilent, Santa Clara, CA). Derivatized samples (1 μl each) were injected into the gas chromatograph system through a split inlet equipped with a J&W Scientific DB5-MS+DG stationary phase column (30 m × 0.25 mm i.d., 0.1 μm film, Agilent Technologies). Helium was used as a carrier gas at a flow rate of 1 ml/min in constant flow mode. The injector split ratio was adjusted to 1:5, and the oven temperature was held at 70 °C for 1 min and then increased at 10 °C/min to 325 °C.
Mass spectrometry The qTOF MS 7200 (Agilent, Santa Clara, CA, USA) was operated in the electron impact ionization mode at 70 eV. Mass spectral data were acquired in full scan mode from m/z 35 to 700 with an acquisition rate of 5 spectra per second.
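The oven program and acquisition rate together give an approximate upper bound on the number of spectra per run. The sketch below is a back-of-the-envelope estimate that assumes no final hold at 325 °C, no solvent delay, and acquisition over the whole oven program; none of these are stated in the protocol, and the helper names are illustrative.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration of a linear temperature ramp in minutes."""
    return (end_c - start_c) / rate_c_per_min


def expected_spectra(run_minutes, spectra_per_second):
    """Number of spectra acquired over a run at a fixed acquisition rate."""
    return round(run_minutes * 60 * spectra_per_second)


initial_hold_min = 1.0                                   # 70 °C for 1 min
run_min = initial_hold_min + ramp_minutes(70.0, 325.0, 10.0)
n_spectra = expected_spectra(run_min, 5.0)               # 5 spectra per second

print(f"oven program: {run_min} min, about {n_spectra} spectra per sample")
```

With these assumptions the oven program lasts 26.5 min, which corresponds to about 7950 spectra per sample.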
Data transformation GC-MS data files were converted to .mzXML format using ProteoWizard software.
Metabolite identification Aligned compounds were identified by comparing the mean spectrum to reference spectra in an MS library. The mean spectrum is the mean of the compound spectra found in the deconvolution and alignment steps only. The current eRah package integrates the free and downloadable version of the MassBank repository, which after removing duplicated compounds contains a set of 500 unique compounds with EI GC-TOF mass spectra. However, users may import other libraries such as the Golm Metabolome Database (GMD), Fiehn, Human Metabolome Database (HMDB) or an internal database, as long as the library is available in an interpretable (i.e. readable) format, that is, a non-binary, non-coded format, which is consistent with public databases. By comparing the empirical spectra with a reference MS library, eRah generates a list of candidate metabolites along with a similarity match factor, determined using the cosine product (the normalized dot product of the two spectra).
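The matching step described above can be illustrated with a minimal Python sketch. It is not eRah's implementation (eRah is an R package and may scale or weight spectra differently). Spectra are represented here as dictionaries from nominal m/z to intensity, and the library entries "compound_A" and "compound_B" are hypothetical toy data.

```python
import math


def mean_spectrum(spectra):
    """Average intensity per m/z across spectra; absent peaks count as 0."""
    mzs = set().union(*spectra)
    return {mz: sum(s.get(mz, 0.0) for s in spectra) / len(spectra) for mz in mzs}


def cosine_match(a, b):
    """Cosine similarity (normalized dot product) between two spectra."""
    dot = sum(a.get(mz, 0.0) * b.get(mz, 0.0) for mz in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Spectra of one aligned compound in two samples (toy values).
sample_spectra = [
    {73: 100.0, 147: 50.0},
    {73: 80.0, 147: 70.0},
]

# Hypothetical reference library.
library = {
    "compound_A": {73: 90.0, 147: 60.0},
    "compound_B": {73: 10.0, 205: 100.0},
}

query = mean_spectrum(sample_spectra)
ranked = sorted(
    ((name, cosine_match(query, ref)) for name, ref in library.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The ranked list plays the role of the candidate list with match factors: a score near 1 indicates closely matching fragmentation patterns, while a score near 0 indicates little shared signal.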
Source Name | Organism | Organism part | Protocol REF | Sample Name | Disease
CON_BASA_567795 | Homo sapiens | blood plasma | Sample collection | CON_BASA_567795 | Healthy
CON_BASA_574488 | Homo sapiens | blood plasma | Sample collection | CON_BASA_574488 | Healthy
CON_BASA_580552 | Homo sapiens | blood plasma | Sample collection | CON_BASA_580552 | Healthy
CON_BASA_581904 | Homo sapiens | blood plasma | Sample collection | CON_BASA_581904 | Healthy
CON_BASA_587625 | Homo sapiens | blood plasma | Sample collection | CON_BASA_587625 | Healthy
CON_BASA_591439 | Homo sapiens | blood plasma | Sample collection | CON_BASA_591439 | Healthy
CON_BASA_602733 | Homo sapiens | blood plasma | Sample collection | CON_BASA_602733 | Healthy
CON_BASA_619640 | Homo sapiens | blood plasma | Sample collection | CON_BASA_619640 | Healthy
CON_BASA_651137 | Homo sapiens | blood plasma | Sample collection | CON_BASA_651137 | Healthy
CON_BASA_675644 | Homo sapiens | blood plasma | Sample collection | CON_BASA_675644 | Healthy
CON_BASA_677375 | Homo sapiens | blood plasma | Sample collection | CON_BASA_677375 | Healthy
CON_BASA_688344 | Homo sapiens | blood plasma | Sample collection | CON_BASA_688344 | Healthy
CON_BASA_723817 | Homo sapiens | blood plasma | Sample collection | CON_BASA_723817 | Healthy
CON_BASA_736592 | Homo sapiens | blood plasma | Sample collection | CON_BASA_736592 | Healthy
DIA_BASE_630974 | Homo sapiens | blood plasma | Sample collection | DIA_BASE_630974 | HIAE
DIA_BASE_635799 | Homo sapiens | blood plasma | Sample collection | DIA_BASE_635799 | HIAE
DIA_BASE_687353 | Homo sapiens | blood plasma | Sample collection | DIA_BASE_687353 | HIAE
DIA_BASE_701185 | Homo sapiens | blood plasma | Sample collection | DIA_BASE_701185 | HIAE
DIA_BASE_1164621 | Homo sapiens | blood plasma | Sample collection | DIA_BASE_1164621 | HIAE
PIO_BASE_603628 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_603628 | HIAE
PIO_BASE_604520 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_604520 | HIAE
PIO_BASE_649358 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_649358 | HIAE
PIO_BASE_654139 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_654139 | HIAE
PIO_BASE_830049 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_830049 | HIAE
PIO_BASE_976853 | Homo sapiens | blood plasma | Sample collection | PIO_BASE_976853 | HIAE
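The sample table can be checked against the group sizes given in the sample collection protocol (14 healthy controls, 11 HIAE). The short sketch below transcribes the sample names and disease labels from the table and tallies them; the variable names are illustrative.

```python
from collections import Counter

# Sample name -> disease label, transcribed from the sample table.
SAMPLES = {
    "CON_BASA_567795": "Healthy", "CON_BASA_574488": "Healthy",
    "CON_BASA_580552": "Healthy", "CON_BASA_581904": "Healthy",
    "CON_BASA_587625": "Healthy", "CON_BASA_591439": "Healthy",
    "CON_BASA_602733": "Healthy", "CON_BASA_619640": "Healthy",
    "CON_BASA_651137": "Healthy", "CON_BASA_675644": "Healthy",
    "CON_BASA_677375": "Healthy", "CON_BASA_688344": "Healthy",
    "CON_BASA_723817": "Healthy", "CON_BASA_736592": "Healthy",
    "DIA_BASE_630974": "HIAE", "DIA_BASE_635799": "HIAE",
    "DIA_BASE_687353": "HIAE", "DIA_BASE_701185": "HIAE",
    "DIA_BASE_1164621": "HIAE", "PIO_BASE_603628": "HIAE",
    "PIO_BASE_604520": "HIAE", "PIO_BASE_649358": "HIAE",
    "PIO_BASE_654139": "HIAE", "PIO_BASE_830049": "HIAE",
    "PIO_BASE_976853": "HIAE",
}

# Tally samples per disease group.
group_sizes = Counter(SAMPLES.values())
for group, size in sorted(group_sizes.items()):
    print(f"{group}: {size}")
print(f"total: {len(SAMPLES)}")
```

The tally gives 14 Healthy and 11 HIAE samples, 25 in total, in agreement with the protocol text.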

Assay 

Assay file name: a_pcos_gc-qtof_metabolite_profiling_mass_spectrometry.txt
Measurement: metabolite profiling
Technology: mass spectrometry
Platform: Agilent 7200 Accurate-Mass Q-TOF

Instrumentation

Sample Name | Protocol REF | Post Extraction | Derivatization | Extract Name | Protocol REF | Chromatography Instrument | Column model | Column type | Labeled Extract Name | Label | Protocol REF | Scan polarity | Scan m/z range | Instrument | Ion source | Mass analyzer | MS Assay Name | Raw Spectral Data File | Protocol REF | Normalization Name | Derived Spectral Data File | Protocol REF | Data Transformation Name | Metabolite Assignment File
CON_BASA_567795 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_567795 | CON_BASA_567795_50.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_574488 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_574488 | CON_BASA_574488.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_580552 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_580552 | CON_BASA_580552_50.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_581904 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_581904 | CON_BASA_581904.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_587625 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_587625 | CON_BASA_587625_50.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_591439 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_591439 | CON_BASA_591439.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_602733 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_602733 | CON_BASA_602733.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_619640 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_619640 | CON_BASA_619640.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_651137 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_651137 | CON_BASA_651137.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_675644 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_675644 | CON_BASA_675644.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_677375 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_677375 | CON_BASA_677375.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_688344 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_688344 | CON_BASA_688344.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_723817 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_723817 | CON_BASA_723817.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
CON_BASA_736592 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | CON_BASA_736592 | CON_BASA_736592.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_630974 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | DIA_BASE_630974 | DIA_BASE_630974.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_635799 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | DIA_BASE_635799 | DIA_BASE_635799.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_687353 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | DIA_BASE_687353 | DIA_BASE_687353.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_701185 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | DIA_BASE_701185 | DIA_BASE_701185.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_1164621 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | DIA_BASE_1164621 | DIA_BASE_1164621.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_603628 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_603628 | PIO_BASE_603628.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_604520 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_604520 | PIO_BASE_604520.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_649358 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_649358 | PIO_BASE_649358.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_654139 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_654139 | PIO_BASE_654139.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_830049 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_830049 | PIO_BASE_830049.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
PIO_BASE_976853 | Extraction |  | silylation |  | Chromatography | Agilent 7890A GC | DB-5ms GC (0.1 µm, 0.25 mm x 30 m; Agilent Technologies) | low polarity |  |  | Mass spectrometry | positive | 35-700 | Agilent 7200 Q-TOF | electron ionization | quadrupole time-of-flight | PIO_BASE_976853 | PIO_BASE_976853.mzXML | Data transformation |  |  | Metabolite identification |  | m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
List of study files

File
CON_BASA_581904.mzXML
CON_BASA_602733.mzXML
CON_BASA_675644.mzXML
CON_BASA_723817.mzXML
PIO_BASE_603628.mzXML
DIA_BASE_1164621.mzXML
a_pcos_gc-qtof_metabolite_profiling_mass_spectrometry.txt
i_Investigation.txt
DIA_BASE_630974.mzXML
CON_BASA_591439.mzXML
CON_BASA_651137.mzXML
DIA_BASE_687353.mzXML
CON_BASA_677375.mzXML
CON_BASA_736592.mzXML
m_pcos_gc-qtof_metabolite_profiling_mass_spectrometry_v2_maf.tsv
DIA_BASE_701185.mzXML
DIA_BASE_635799.mzXML
CON_BASA_619640.mzXML
CON_BASA_587625_50.mzXML
CON_BASA_688344.mzXML
CON_BASA_580552_50.mzXML
CON_BASA_574488.mzXML
CON_BASA_567795_50.mzXML
PIO_BASE_604520.mzXML
PIO_BASE_976853.mzXML
PIO_BASE_830049.mzXML
metexplore_mapping.json
PIO_BASE_649358.mzXML
PIO_BASE_654139Ç.mzXML
s_PCOS GC-qTOF.txt
PIO_BASE_654139.mzXML
audit

Validations marked with (*) have been allowed by the MetaboLights Curators.
Status | Description | Requirement | Group | Message
PASSES | Study Title | MANDATORY | STUDY | OK
PASSES | Study Description | MANDATORY | STUDY | OK
PASSES | Study text successfully parsed | OPTIONAL | STUDY | OK
PASSES | Study Contact(s) have listed email | MANDATORY | CONTACT | OK
PASSES | Sample(s) | MANDATORY | SAMPLES | OK
PASSES | Sample Name consistency check | MANDATORY | ASSAYS | OK
PASSES | Publication(s) associated with this Study | MANDATORY | PUBLICATION | OK
PASSES | Minimal Experimental protocol | MANDATORY | PROTOCOLS | OK
PASSES | Comprehensive Experimental protocol | OPTIONAL | PROTOCOLS | OK
PASSES | Extraction protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Data transformation protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Metabolite Identification protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Mass spectrometry protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Chromatography protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Sample Collection protocol description | MANDATORY | PROTOCOLS | OK
PASSES | Protocols text successfully parsed | OPTIONAL | PROTOCOLS | OK
PASSES | Organism name | MANDATORY | ORGANISM | OK
PASSES | Organism part | MANDATORY | ORGANISM | OK
PASSES | Study Factors | MANDATORY | FACTORS | OK
PASSES | Assay platform information | OPTIONAL | ASSAYS | OK
PASSES | Assay has raw files referenced | MANDATORY | FILES | OK
PASSES | Assay referenced raw files detection in filesystem | MANDATORY | FILES | OK
PASSES | Raw files in the Assay(s) have the correct format | MANDATORY | FILES | OK
PASSES | Assay(s) | MANDATORY | ASSAYS | OK
PASSES | All Assays have Metabolite Assignment File (MAF) referenced | OPTIONAL | FILES | OK
PASSES | Metabolite Assignment File (MAF) is present in Study folder | MANDATORY | FILES | OK
PASSES | Metabolite Assignment File (MAF) has correct format | MANDATORY | FILES | OK
PASSES | Metabolite Identification File (MAF) content | MANDATORY | FILES | OK
PASSES | ISA-Tab investigation file check | MANDATORY | ISATAB | OK
