Quantitative variability of 342 plasma proteins in a human twin population
The human plasma proteome is of high clinical significance and has been studied extensively because its molecular makeup is thought to provide a window into the health state of an individual. However, neither the quantitative variability of the plasma proteome nor the origins of that variation are known. To determine the relative contributions of heritability, environmental and longitudinal factors to plasma proteome variability, we systematically decomposed the biological variance of 1904 peptides defining 342 unique plasma protein profiles from 232 plasma samples that were collected at intervals of 2-7 years from monozygotic (MZ) and dizygotic (DZ) twins. The data were collected via SWATH mass spectrometry (SWATH-MS), an emerging technology characterized by a high degree of reproducibility and quantitative accuracy. The data indicate that abundance variability is an important feature of different proteins across the population, that the abundances of about 20% of plasma proteins are considerably heritable, and that the degrees of genetic control and aging effects vary across specific biological processes. Moreover, we identified 13 cis-SNPs that significantly influence the abundance level of specific plasma proteins. These results substantially extend the understanding of the impact of heritability on human proteomic dynamics and therefore have implications for the effective design of plasma-based biomarker studies.
Sample Processing Protocol
Crude plasma samples were centrifuged at 18,400 g for 10 min at 4 °C. The following sample preparation steps were performed in 96-well format plates, with five whole-process experimental replicates distributed across different plates. 5 µL of plasma from each sample was diluted to 50 µL and filtered through G-10 gel filtration cartridges (The Nest Group Inc.). Three external proteins were spiked in (bovine Alpha-1-acid glycoprotein at a targeted plasma concentration of 85 µg/mL, bovine Fetuin-B at 8.5 µg/mL, and human Prostate-specific antigen at 0.85 µg/mL) before 80 µL of 10 M urea in 100 mM ammonium bicarbonate was added to each sample for denaturation at 37 °C for 30 min. After reduction and alkylation with 10 mM tris(carboxyethyl)phosphine (Sigma-Aldrich) and 20 mM iodoacetamide (Sigma-Aldrich), the samples were diluted to 1 M urea and digested with sequencing-grade porcine trypsin (Promega) at a protease/protein ratio of 1:50 overnight at 37 °C. Digests were purified with Vydac C18 Silica MicroSpin columns (The Nest Group Inc.). An aliquot of retention time calibration peptides from the iRT-Kit (Biognosys) was spiked into each sample before all LC-MS analyses at a ratio of 1:30 (v/v) to correct relative retention times between runs. Selected heavy isotope-labeled internal standard peptides, chosen according to our previous study, were synthesized (JPT Peptide Technologies and Thermo Fisher) and spiked into each sample for SRM and SWATH-MS measurements.
SWATH-MS measurement was performed on direct digests of crude plasma samples (without fractionation or depletion) to maximize experimental reproducibility for the purpose of variance dissection. Peptides were directly injected onto a 20-cm PicoFrit emitter (New Objective, self-packed to 20 cm with Magic C18 AQ 3-μm 200-Å material) and then separated using a 120-min gradient from 2–35% (buffer A 0.1% (v/v) formic acid, 2% (v/v) acetonitrile, buffer B 0.1% (v/v) formic acid, 90% (v/v) acetonitrile) at a flow rate of 300 nL/min.
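The urea concentrations in the denaturation and digestion steps above can be checked with simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below assumes that volumes are additive and that the volumes of the spiked proteins and of the reduction/alkylation reagents are negligible; neither assumption is stated in the protocol, and the function names are illustrative only.

```python
def concentration_after_mixing(c_stock_molar, v_stock_ul, v_sample_ul):
    """Urea concentration (M) after adding the stock to a urea-free sample."""
    return c_stock_molar * v_stock_ul / (v_stock_ul + v_sample_ul)


def total_volume_for_target(c_stock_molar, v_stock_ul, c_target_molar):
    """Total volume (µL) at which the added urea reaches the target concentration."""
    return c_stock_molar * v_stock_ul / c_target_molar


# 80 µL of 10 M urea added to the 50 µL diluted plasma sample
# (assumption: spiked-protein volume ignored).
denaturing_conc = concentration_after_mixing(10.0, 80.0, 50.0)

# Total volume needed to bring the same amount of urea down to 1 M for digestion.
digestion_volume = total_volume_for_target(10.0, 80.0, 1.0)

print(f"urea during denaturation: {denaturing_conc:.2f} M")
print(f"total volume at 1 M urea: {digestion_volume:.0f} µL")
```

Under these assumptions, denaturation takes place at roughly 6 M urea, and diluting to 1 M corresponds to a total volume of about 800 µL per sample.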
In SWATH-MS mode, the instrument was specifically tuned to optimize the quadrupole settings for the selection of 25-m/z-wide precursor ion selection windows. Using an isolation width of 26 m/z (containing 1 m/z for the window overlap), a set of 32 overlapping windows was constructed covering the precursor mass range of 400–1,200 m/z. The effective isolation windows can be considered as being 399.5–424.5, 424.5–449.5, etc. SWATH MS2 spectra were collected from 100 to 2,000 m/z. The collision energy (CE) was optimized for each window according to the calculation for a charge 2+ ion centered on the window, with a spread of 15 eV. An accumulation time (dwell time) of 100 ms was used for all fragment-ion scans in high-sensitivity mode. For each SWATH-MS cycle, a survey scan in high-resolution mode was also acquired for 100 ms, resulting in a duty cycle of ~3.4 s.
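The window scheme above can be reconstructed with a short sketch. It starts from the stated effective windows (399.5–424.5, 424.5–449.5, ...) and assumes that each 26 m/z isolation window extends the effective window by half of the 1 m/z overlap on each side; this placement is an assumption, as the text gives only the width and the overlap. The accumulation time per cycle is summed from the stated scan times; the difference from the reported ~3.4 s is presumably instrument overhead, which is also an assumption.

```python
def effective_windows(first_lower=399.5, width=25.0, count=32):
    """Non-overlapping effective windows as (lower, upper) m/z pairs."""
    return [(first_lower + i * width, first_lower + (i + 1) * width)
            for i in range(count)]


def isolation_windows(effective, overlap=1.0):
    """Pad each effective window by half the overlap on both sides (assumed placement)."""
    half = overlap / 2.0
    return [(lower - half, upper + half) for lower, upper in effective]


effective = effective_windows()
isolation = isolation_windows(effective)

# 32 fragment-ion scans plus one survey scan, 100 ms each.
accumulation_s = (len(effective) + 1) * 0.100

print(effective[0], effective[1], effective[-1])
print(isolation[0], isolation[1])
print(f"summed accumulation time per cycle: {accumulation_s:.1f} s")
```

With these settings the 32 effective windows span 399.5–1,199.5 m/z, and adjacent isolation windows share exactly 1 m/z.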
Data Processing Protocol
Library generation by shotgun: Profile-mode .wiff files from shotgun data acquisition were centroided and converted to mzML format using the AB Sciex Data Converter v.1.3, and then converted to mzXML format using MSConvert v.3.04.238. The MS2 spectra were queried against the reviewed canonical SwissProt complete proteome database for human (Nov. 2012) appended with common contaminants and reversed sequence decoys (40,951 protein sequences including decoys). The SEQUEST database search (ref. 48) through Sorcerer PE version 4.2 included the following criteria: a static modification of 57.02146 Da on cysteines and a variable modification of 15.99491 Da for methionine oxidation. The mono-isotopic parent and fragment mass tolerances were set to 50 p.p.m.; semi-tryptic peptides and peptides with up to two missed cleavages were allowed. The identified peptides were processed and analyzed through the Trans-Proteomic Pipeline 4.5.2 (TPP) and were validated using the PeptideProphet score. All peptides were filtered at a false discovery rate (FDR) of 1%. The raw spectral libraries were generated from all valid peptide spectrum matches and then refined into non-redundant consensus libraries (ref. 44) using SpectraST. For each peptide, the retention time was mapped into the iRT space with reference to a linear calibration constructed for each shotgun run, as previously described (ref. 44). The MS assays, constructed from the six most intense transitions (Top 6) with a Q1 range from 400 to 1,200 m/z and excluding the precursor SWATH window, were used for targeted data analysis of SWATH maps.
Targeted data analysis for SWATH maps: SWATH-MS .wiff files were first converted to profile mzXML using ProteoWizard. The whole process of SWATH targeted data analysis was carried out using OpenSWATH running on an internal computing cluster. OpenSWATH utilizes a target-decoy scoring system like mProphet to estimate identification FDR.
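The target-decoy principle mentioned above can be illustrated with a minimal sketch. This is not the OpenSWATH or mProphet implementation, which combines several sub-scores with a semi-supervised classifier and uses more refined error estimates. Here the FDR at a score cutoff is simply approximated as the number of decoys divided by the number of targets passing that cutoff. The function names and the example scores are hypothetical.

```python
def decoy_fdr_at_cutoff(scores, is_decoy, cutoff):
    """Estimate FDR among targets scoring >= cutoff as (#decoys / #targets)."""
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= cutoff and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= cutoff and d)
    if targets == 0:
        # No targets pass: treat as uncontrolled unless nothing passes at all.
        return float("inf") if decoys else 0.0
    return decoys / targets


def lowest_cutoff_for_fdr(scores, is_decoy, max_fdr=0.01):
    """Most permissive observed score whose estimated FDR is <= max_fdr, or None."""
    for cutoff in sorted(set(scores)):
        if decoy_fdr_at_cutoff(scores, is_decoy, cutoff) <= max_fdr:
            return cutoff
    return None


# Hypothetical peak-group scores; True marks a decoy assay.
scores = [9.1, 8.7, 8.2, 7.9, 7.5, 6.0, 5.5, 5.1, 4.0, 3.2]
is_decoy = [False, False, False, False, False, True, False, True, False, True]

# A loose 20% threshold is used only because the toy data set is tiny.
cutoff = lowest_cutoff_for_fdr(scores, is_decoy, max_fdr=0.2)
print(cutoff, decoy_fdr_at_cutoff(scores, is_decoy, cutoff))
```

On real data the same idea would be applied to thousands of peak groups, at a much stricter threshold such as the 1% used in this study.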
The best-scoring classifier, built from the sample with the most protein identifications, was used in this study. Based on our final spectral library for the human plasma proteome, OpenSWATH first identified the peak groups from all individual SWATH maps at a global peptide FDR of 1% (enabled by an FDR cutoff of 0.0307% at the level of total peak groups) and aligned them between SWATH maps based on the clustering behavior of retention times in each run, using a non-linear alignment algorithm. Specifically, only those peptide peak groups identified in more than one-third of the samples were reported and considered for alignment, with a max FDR quality of 0.1 (the quality cutoff to still consider a feature for alignment) and/or the further constraint of an RT difference of less than 100 s in the LC gradient after iRT normalization.
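The reporting criteria above can be sketched as two simple checks: a detection-frequency filter (identified in more than one-third of the samples) and a retention time tolerance (less than 100 s after iRT normalization). This is an illustrative sketch, not the alignment code used in the study. The data layout, function names and peptide labels are hypothetical, and how the two criteria are combined is left open because the text states "and/or".

```python
from collections import defaultdict
from fractions import Fraction


def peptides_passing_frequency(detections, n_samples, min_fraction=Fraction(1, 3)):
    """Keep peptides identified in strictly more than min_fraction of the samples.

    detections: iterable of (peptide, sample_id) pairs for confident peak groups.
    """
    samples_per_peptide = defaultdict(set)
    for peptide, sample_id in detections:
        samples_per_peptide[peptide].add(sample_id)
    threshold = min_fraction * n_samples  # exact arithmetic, no rounding issues
    return {p for p, s in samples_per_peptide.items() if len(s) > threshold}


def within_rt_tolerance(rt_seconds, reference_rt_seconds, max_diff_seconds=100.0):
    """True if a peak group lies within the RT tolerance of its reference."""
    return abs(rt_seconds - reference_rt_seconds) < max_diff_seconds


# Hypothetical identifications across six runs.
detections = [
    ("PEPTIDE_A", "run1"), ("PEPTIDE_A", "run2"), ("PEPTIDE_A", "run3"),
    ("PEPTIDE_B", "run1"), ("PEPTIDE_B", "run4"),
]
kept = peptides_passing_frequency(detections, n_samples=6)
print(sorted(kept))
print(within_rt_tolerance(1510.0, 1450.0), within_rt_tolerance(1600.0, 1450.0))
```

In this toy example, a peptide seen in exactly one-third of the runs (two of six) is excluded, because the criterion is "more than" one-third.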
Yansheng Liu, Institute of Molecular Systems Biology, ETH Zurich
Ruedi Aebersold, Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Wolfgang-Pauli-Str. 16, 8093 Zurich, Switzerland (lab head)
Liu Y, Buil A, Collins BC, Gillet LC, Blum LC, Cheng LY, Vitek O, Mouritsen J, Lachance G, Spector TD, Dermitzakis ET, Aebersold R. Quantitative variability of 342 plasma proteins in a human twin population. Mol Syst Biol. 2015 Feb 4;11(2):786. PubMed: 25652787