Project PXD004574

PRIDE Assigned Tags:
Biomedical Dataset, Technical Dataset



Inference and quantification of peptidoforms in large sample cohorts by SWATH-MS: Twin blood plasma quantitative variance components


This data set is a reanalysis of human blood plasma samples from 36 pairs of monozygotic and 22 pairs of dizygotic twins, measured by SWATH-MS and sampled at two longitudinal time points (Liu et al., 2015, PMID:25652787). The data were used to determine the overall quantitative variability of 4322 peptidoforms in the sample cohort and to assign the measured variability to heritable, environmental, or longitudinal effects.

Sample Processing Protocol

All samples and data were used as originally described (Liu et al., 2015, PMID:25652787).

Data Processing Protocol

Assay library generation

All raw instrument data acquired in DDA mode were centroided and converted to mzXML using msconvert (ProteoWizard 3.0.7162). The non-redundant reviewed human protein FASTA was obtained from UniProtKB/Swiss-Prot (2015-02-23) and appended with iRT peptide sequences and pseudo-reverse decoys. In addition, a second FASTA was generated from the first with additional sequence variants of the human ApoE protein.

The files were searched with Comet (2015.02) using the default parameters for high mass accuracy instruments: peptide mass tolerance: 20 ppm (monoisotopic), isotope error enabled, fully tryptic digestion with max 2 missed cleavages, variable M (Oxidation), max variable mods: 5. Based on these parameters, a separate search was conducted for each additional variable modification type: MW (Oxidation), c (Amidated), NQ (Deamidated), Kn (Carbamyl), K (Label:13C(6)15N(2)), R (Label:13C(6)15N(4)), KnST (Formyl), EK (Carboxy), Kn (Acetyl), Y (Nitro), KRDEc (Methyl), STY (Phospho), Y (Sulfo), K (GG), KR (Dimethyl), ApoE. PeptideProphet (TPP 4.8.0) with parameters -dDECOY_ -OAPdlIwt was run independently per file, and iProphet was used to combine all results. The heavy isotope labeled forms of K and R were included because heavy peptide standards with established SRM assays were spiked into the plasma digest for SWATH-MS quantification.

SpectraST (TPP/SVN r7019, custom build with disabled hardcoded modifications) was used to generate a spectral library of all peptide identifications at iProphet FDR 0.2% with the following parameters: -cP0.9 -c_IRR -c_IRTirtkit.txt -cICID-QTOF -c_RDYDECOY -cAC -cM.

OpenMS was used for all following steps: ConvertTSVToTraML was used to convert the SpectraST MRM file to a TraML.
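To make the 20 ppm peptide mass tolerance of the Comet search concrete, the following minimal sketch converts a relative tolerance into an absolute mass window. The function name `ppm_window` and the example mass of 1500 Da are illustrative assumptions that do not appear in the original protocol, and the sketch does not model Comet's isotope error handling.

```python
def ppm_window(mass_da, tolerance_ppm=20.0):
    """Return the (lower, upper) mass bounds for a relative tolerance in ppm."""
    # 1 ppm corresponds to one millionth of the reference mass.
    delta = mass_da * tolerance_ppm / 1e6
    return mass_da - delta, mass_da + delta


# Hypothetical example: a peptide with a monoisotopic mass of 1500 Da.
low, high = ppm_window(1500.0)
print(f"{low:.4f} .. {high:.4f}")
```

For this hypothetical 1500 Da peptide, a 20 ppm tolerance corresponds to a window of plus or minus 0.03 Da; the absolute window scales linearly with the peptide mass.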
OpenSwathAssayGenerator was applied on the TraML with the following parameters: -swath_windows_file swath32.txt -allowed_fragment_charges 1,2,3,4 -enable_ms1_uis_scoring -max_num_alternative_localizations 20 -enable_identification_specific_losses -enable_identification_ms2_precursors. OpenSwathDecoyGenerator was applied to append decoys to the assays using the following parameters: -method shuffle -append -mz_threshold 0.1 -remove_unannotated. All OpenMS tools were executed using the modified chemistry parameters for the extended modification set (OpenMS.extended.params).

OpenSWATH / pyprophet / TRIC

OpenSwathWorkflow was run with the following parameters: -min_upper_edge_dist 1 -mz_extraction_window 0.05 -rt_extraction_window 600 -extra_rt_extraction_window 100 -min_rsq 0.95 -min_coverage 0.6 -use_ms1_traces -enable_uis_scoring -Scoring:uis_threshold_peak_area 0 -Scoring:uis_threshold_sn -1 -Scoring:stop_report_after_feature 5 -tr_irt hroest_DIA_iRT.TraML. The following subset of scores was used at the MS2 level: xx_swath_prelim_score library_corr yseries_score xcorr_coelution_weighted massdev_score norm_rt_score library_rmsd bseries_score intensity_score xcorr_coelution log_sn_score isotope_overlap_score massdev_score_weighted xcorr_shape_weighted isotope_correlation_score xcorr_shape. All MS1 and UIS scores were used for pyprophet. pyprophet was run on a concatenated file of all 13 runs with the following parameters: --final_statistics.emp_p --qvality.enable --qvality.generalized --ms1_scoring.enable --uis_scoring.enable --xeval.num_iter=10 --ignore.invalid_score_columns.
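The OpenSwathWorkflow parameters listed above can be kept as a structured list so that they are reproducible and easy to audit. The sketch below only assembles and prints an argument list; it assumes that stray spaces after hyphens in the recorded parameters are extraction artifacts, and it deliberately omits input and output arguments because the protocol does not list them, so the printed command is incomplete. The helper name `build_command` is hypothetical.

```python
import shlex

# Parameters as listed in the protocol; None marks a switch without a value.
OPENSWATH_PARAMS = [
    ("-min_upper_edge_dist", "1"),
    ("-mz_extraction_window", "0.05"),
    ("-rt_extraction_window", "600"),
    ("-extra_rt_extraction_window", "100"),
    ("-min_rsq", "0.95"),
    ("-min_coverage", "0.6"),
    ("-use_ms1_traces", None),
    ("-enable_uis_scoring", None),
    ("-Scoring:uis_threshold_peak_area", "0"),
    ("-Scoring:uis_threshold_sn", "-1"),
    ("-Scoring:stop_report_after_feature", "5"),
    ("-tr_irt", "hroest_DIA_iRT.TraML"),
]


def build_command(tool, params):
    """Flatten (flag, value) pairs into an argv-style list."""
    argv = [tool]
    for flag, value in params:
        argv.append(flag)
        if value is not None:
            argv.append(value)
    return argv


# Input/output arguments are not given in the protocol and are left out here.
argv = build_command("OpenSwathWorkflow", OPENSWATH_PARAMS)
print(shlex.join(argv))
```

Keeping the flags and values as pairs avoids the whitespace ambiguities of a free-text parameter string, for example between a flag and a negative value such as `-1`.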
TRIC was run with the following parameters: --file_format openswath --fdr_cutoff 0.01 --max_fdr_quality 0.2 --mst:useRTCorrection True --mst:Stdev_multiplier 3.0 --method LocalMST --max_rt_diff 30 --alignment_score 0.0001 --frac_selected 0 --realign_method lowess_cython --disable_isotopic_grouping --disable_isotopic_grouping --disable_isotopic_transfer --realign_runs lowess_cython --method singleShortestPath --do_single_run
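Because the TRIC parameters are recorded as a single free-text string, a small sanity check can help when reusing them. The sketch below tokenizes the string as listed above (assuming the space in "-- max_rt_diff" is an extraction artifact) and reports any flag that appears more than once. The helper name `repeated_flags` is hypothetical, and the check makes no claim about how TRIC itself treats repeated options.

```python
import shlex
from collections import Counter

# Parameter string as recorded in the protocol.
TRIC_PARAMS = (
    "--file_format openswath --fdr_cutoff 0.01 --max_fdr_quality 0.2 "
    "--mst:useRTCorrection True --mst:Stdev_multiplier 3.0 --method LocalMST "
    "--max_rt_diff 30 --alignment_score 0.0001 --frac_selected 0 "
    "--realign_method lowess_cython --disable_isotopic_grouping "
    "--disable_isotopic_grouping --disable_isotopic_transfer "
    "--realign_runs lowess_cython --method singleShortestPath --do_single_run"
)


def repeated_flags(param_string):
    """Return the sorted list of long options that occur more than once."""
    tokens = shlex.split(param_string)
    counts = Counter(tok for tok in tokens if tok.startswith("--"))
    return sorted(flag for flag, n in counts.items() if n > 1)


print(repeated_flags(TRIC_PARAMS))
```

On the recorded string this reports `--disable_isotopic_grouping` and `--method`. That may indicate that the parameters of more than one invocation were recorded together, but the record does not say, so the string is best reused with care.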


George Rosenberger, Columbia University
Ruedi Aebersold, Institute of Molecular Systems Biology, ETH Zurich, Switzerland (lab head)

Submission Date


Publication Date



    Liu Y, Buil A, Collins BC, Gillet LC, Blum LC, Cheng LY, Vitek O, Mouritsen J, Lachance G, Spector TD, Dermitzakis ET, Aebersold R. Quantitative variability of 342 plasma proteins in a human twin population. Mol Syst Biol. 2015 Feb 4;11(1):786. PubMed: 25652787

    Rosenberger G, Liu Y, Röst HL, Ludwig C, Buil A, Bensimon A, Soste M, Spector TD, Dermitzakis ET, Collins BC, Malmström L, Aebersold R. Inference and quantification of peptidoforms in large sample cohorts by SWATH-MS. Nat Biotechnol. 2017 Jun 12 PubMed: 28604659