yeast-UPS1 standard LC-MS/MS dataset
Proteomic workflows based on nanoLC-MS/MS data-dependent-acquisition analysis have progressed tremendously in recent years due to the technical improvement of mass spectrometers, and now allow to extensively characterize complex protein mixtures. High-resolution and fast sequencing instruments have enabled the use of label-free quantitative methods, which appear as an attractive way to analyze differential protein expression in complex biological samples. Classical label-free quantitative workflows are based either on spectral counting of MS/MS sequencing scans for each protein, or on the extraction of peptide ion peak area values in the LC-MS map composed of all the survey MS scans acquired during the chromatographic gradient. However, the computational processing of the data for label-free quantification still remains a challenge. Here, we provide a dual proteomic standard composed of an equimolar mixture of 48 human proteins (Sigma UPS1) spiked at different concentrations into a background of yeast cell lysate, that was used to benchmark several label-free quantitative workflows, involving different software packages developed in recent years. This experimental design allowed to finely assess their performances in terms of sensitivity and false discovery rate, by measuring the number of true and false-positive (respectively UPS1 or yeast background proteins found as differential). This dataset can also be used to benchmark other label-free workflows, adjust software parameter settings, improve algorithms for extraction of the quantitative metrics from raw MS data, or evaluate downstream statistical methods
Sample Processing Protocol
A yeast cell lysate was prepared in 8M urea / 0.1M ammonium bicarbonate buffer and this lysate was used to resuspend and perform a serial dilution of the UPS1 standard mixture (Sigma). Twenty µL of each of the resulting samples, corresponding to 9 different spiked levels of UPS1 (respectively 0,05 - 0,125 - 0,250 - 0,5 - 2.5 - 5 - 12,5 - 25 - 50 fmol of UPS1 /µg of yeast lysate), were reduced with DTT and alkylated with iodoacetamide. The urea concentration was lowered to 1M by dilution, and proteins were digested in solution by addition of 2% of trypsin overnight. Enzymatic digestion was stopped by addition of TFA (0.5% final concentration). Samples (2µg of yeast cell lysate + different spiked level of UPS1) were analyzed in triplicate by nanoLC-MS/MS using a nanoRS UHPLC system (Dionex, Amsterdam, The Netherlands) coupled to an LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) using a 105min gradient on a 15cm C18 column, and a top20 data-dependent acquisition method.
Data Processing Protocol
MS/MS data were searched in a yeast database from UniprotKB (S_cerevisiae_ 20121108.fasta, 7798 sequences) and a compiled database containing the UPS1 human sequences (48 sequences).Two series of database searches were perfomed with Mascot, resulting in 2 series of result files used in the different computational workflows described in the paper (Ramus et al, JPR). Serie 1 (used in workflows 1, 4 and 5, noted WF1-4-5_): the Mascot Daemon software (version 2.4; Matrix Science, London, UK) was used to perform database searches, using the Extract_msn.exe macro provided with Xcalibur (version 2.0 SR2; Thermo Fisher Scientific) to generate peaklists. Parameters used for creation of the peaklists were: parent ions in the mass range 400–4500, no grouping of MS/MS scans, and threshold at 1000. Peaklists were submitted to Mascot database searches (version 2.4.2). ESI-TRAP was chosen as the instrument, trypsin/P as the enzyme and 2 missed cleavages were allowed. Precursor and fragment mass error tolerances were set at 5 ppm and 0.8 Da, respectively. Peptide variable modifications allowed during the search were: acetyl (Protein N-ter), oxidation (M), whereas carbamidomethyl (C) was set as fixed modification. Serie 2 (used in workflows 3 and 8, noted WF3-8_): Data were processed automatically using Mascot Distiller software (version 220.127.116.11, Matrix Science). ESI-TRAP was chosen as the instrument, trypsin/P as the enzyme and 2 missed cleavages were allowed. Precursor and fragment mass error tolerances were set at 5 ppm and 0.8 Da, respectively. Peptide variable modifications allowed during the search were: acetylation (Protein N-ter), oxidation (M), whereas carbamidomethyl (C) was set as fixed modification.
Ramus C, Hovasse A, Marcellin M, Hesse AM, Mouton-Barbosa E, Bouyssié D, Vaca S, Carapito C, Chaoui K, Bruley C, Garin J, Cianférani S, Ferro M, Dorssaeler AV, Burlet-Schiltz O, Schaeffer C, Couté Y, Gonzalez de Peredo A. Spiked proteomic standard dataset for testing label-free quantitative software and statistical methods. Data Brief. 2015 Dec 17;6:286-94. eCollection 2016 Mar PubMed: 26862574