A Proteomic Chronology of Gene Expression through the Cell Cycle in Human Myeloid Leukemia Cells.
Technological advances have enabled the analysis of cellular protein and RNA levels with unprecedented depth and sensitivity, allowing for an unbiased re-evaluation of gene regulation during fundamental biological processes. Here, we have chronicled the dynamics of protein and mRNA expression levels across a minimally perturbed cell cycle in human myeloid leukemia cells using centrifugal elutriation combined with mass spectrometry-based proteomics and RNA-Seq, avoiding artificial synchronization procedures. We identify myeloid-specific gene expression and variations in protein abundance, isoform expression and phosphorylation at different cell cycle stages. We dissect the relationship between protein and mRNA levels for both bulk gene expression and for ~6,000 genes individually across the cell cycle, revealing complex, gene-specific patterns. This dataset, one of the deepest surveys to date of gene expression in human cells, is presented in an online, searchable database, the Encyclopedia of Proteome Dynamics (http://www.peptracker.com/epd/).
Sample Processing Protocol
Single Shot Analysis (PepTracker submissions 483, 581, 561). For protein extraction, NB4 cells were pelleted, washed twice with cold PBS and then lysed in 0.3 – 1.0 ml HES lysis buffer (2% SDS, 10 mM HEPES pH 7.4, 1 mM EDTA, 250 mM sucrose, Roche protease inhibitors, Roche PhosStop). Lysates were heated to 95 °C for 10 min and homogenized using a Qiashredder (Qiagen). A 200 μg aliquot of the lysate was further processed for LC-MS/MS analysis using a modification of the FASP protocol. Briefly, lysates were loaded onto pre-equilibrated 30 kDa-cutoff spin columns (Sartorius) and washed twice using denaturing urea buffer (8 M urea, 10 mM Tris, pH 7.4). Proteins were reduced with TCEP (25 mM in denaturing urea buffer) for 15 min at room temperature and alkylated with iodoacetamide (55 mM in denaturing urea buffer) in the dark for 45 min at room temperature. Lysates were then buffer-exchanged into 0.1 M triethylammonium bicarbonate, pH 8.5 (TEAB, Sigma) and digested with trypsin (1:50, Promega) overnight at 37 °C. Digestion efficiency was checked by SDS-PAGE analysis and protein staining with SimplyBlue SafeStain (Life Technologies). After collecting the first peptide flow-through, the spin column was washed twice with 0.1 M TEAB, then twice with 0.5 M NaCl. The flow-through and washes were combined and desalted using SepPak-C18 SPE cartridges (Waters). Peptides were then dried and resuspended in 5% formic acid for LC-MS/MS analysis.
hSAX Analysis (PepTracker submissions 2441 and 2441). For protein extraction, NB4 cells were pelleted, washed twice with cold PBS and then lysed in 0.3 – 1.0 ml urea lysis buffer (8 M urea, 100 mM Tris pH 7.4, Roche protease inhibitors, Roche PhosStop). Lysates were vigorously mixed for 30 min at room temperature and homogenized using a Branson Digital Sonifier (30% power, 30 s). Lysates were diluted with digest buffer (100 mM Tris pH 8.0 + 1 mM CaCl2) to reach 4 M urea, and then digested with 1:50 Lys-C (Wako Chemicals) overnight at 37 °C.
The digest was then split into two fractions. The first was retained as the Lys-C digest. The second was diluted with digest buffer to reach 0.8 M urea and double-digested with trypsin (1:50, Promega). Digestion efficiencies were checked by SDS-PAGE analysis and protein staining. The digests were then desalted using SepPak-C18 SPE cartridges, dried, and resuspended in 50 mM borate, pH 9.3. Peptides were separated by hSAX chromatography on a Dionex Ultimate 3000 HPLC system equipped with an AS24 strong anion exchange column. The separation used a borate buffer system, namely 10 mM sodium borate, pH 9.3 (Buffer A) and 10 mM sodium borate, pH 9.3 + 0.5 M sodium chloride (Buffer B), and peptides were eluted with an exponential gradient into 12 x 750 μl fractions. The peptide fractions were desalted using SepPak-C18 SPE plates and then resuspended in 5% formic acid for LC-MS/MS analysis.
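The urea dilution steps in the hSAX protocol (8 M to 4 M before Lys-C digestion, then to 0.8 M before trypsin digestion) follow the usual C1·V1 = C2·V2 relationship. The short Python sketch below works through that arithmetic. The function name and the example volumes are illustrative assumptions; the protocol states only the concentrations, not the volumes used or how the digest was split.

```python
def diluent_volume(c_start, c_target, v_start):
    """Volume of urea-free buffer to add so that v_start at c_start reaches c_target.

    Derived from C1 * V1 = C2 * (V1 + V_added). Units only need to be
    consistent (for example M and ml).
    """
    if not 0 < c_target <= c_start:
        raise ValueError("target concentration must be positive and not exceed the start")
    return v_start * (c_start / c_target - 1.0)


# Hypothetical volumes: 0.5 ml of lysate in 8 M urea, diluted to 4 M for Lys-C.
lysc_buffer_ml = diluent_volume(8.0, 4.0, 0.5)

# Hypothetical split: 0.5 ml of the 4 M digest, diluted to 0.8 M for trypsin.
trypsin_buffer_ml = diluent_volume(4.0, 0.8, 0.5)

print(f"Add {lysc_buffer_ml:.2f} ml digest buffer to reach 4 M urea")
print(f"Add {trypsin_buffer_ml:.2f} ml digest buffer to reach 0.8 M urea")
```

In other words, reaching 4 M requires one volume of digest buffer per volume of lysate, and reaching 0.8 M requires four volumes of digest buffer per volume of the 4 M digest.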
Data Processing Protocol
The RAW data files produced by the mass spectrometer were analysed using the quantitative proteomics software MaxQuant, version 220.127.116.11 with an integrated search engine, Andromeda. The database supplied to the search engine was a UniProt human protein database (Human Reference Proteome retrieved on August 19, 2012) combined with a commonly observed contaminants list. The initial mass tolerance was set to 7 p.p.m. and MS/MS mass tolerance was 0.5 Da. Enzyme was set to trypsin/P with up to 2 missed cleavages. Deamidation, oxidation of methionine and Gln->pyro-Glu were searched as variable modifications. Identification was set to a false discovery rate of 1%. To achieve reliable identifications, all proteins were accepted based on the criterion that the number of forward hits in the database was at least 100-fold higher than the number of reverse database hits, thus resulting in a false discovery rate of less than 1%. Protein isoforms and proteins that cannot be distinguished based on the peptides identified are grouped by MaxQuant and displayed on a single line with multiple UniProt identifiers. The label-free quantitation (LFQ) algorithm in MaxQuant was used for protein quantitation. The algorithm has been previously described (Luber, Cox et al. 2010). The MaxQuant data analysis was repeated with searches for the following post-translational modifications: Phospho(STY), Methyl/Di-Methyl (KR), and Acetyl (K). Protein quantitation was performed on unmodified peptides and peptides that have modifications that are known to occur during sample processing (pyro-Glu, deamidation). All resulting MS data were integrated and managed using PepTracker (http://www.PepTracker.com). The downstream data interpretation (protein and RNA data) of cell cycle stages in this study was performed primarily using the R language (version 0.95.262). An initial cleaning step was performed to improve the quality and value of the dataset.
This step involved removing proteins with fewer than 2 peptide identifications, those labeled as either contaminants or reverse hits, and those where data were missing in any of the fractions. Proteins were further filtered using a procedure analogous to a "checksum" function in computing. An algorithm was constructed to assess the self-consistency of the quantitation based on known relationships between the elutriated fractions. The intensities measured in the asynchronous NB4 cell population can be modeled as a linear combination of the intensities originating from the six elutriated fractions, normalized by the measured cell count in each elutriated fraction. For each protein, the theoretical linear combination of elutriated fraction intensities (scaled by cell number) should match the protein intensity measured experimentally in the asynchronous population. Similar factors were calculated between adjacent fractions (for example, F1 versus F2), using cell number and the proportions of cells in each phase, as determined by flow cytometry. These stringent criteria left a subset of the total proteins detected with very high data coverage across the 6 elutriated cell cycle fractions and high self-consistency in quantitation. Absolute protein abundances were estimated using the iBAQ algorithm, as previously described (Schwanhausser, Busse et al. 2011).
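The cleaning filter and the "checksum" comparison against the asynchronous population can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's actual algorithm: the record field names, the use of cell-count fractions as the linear-combination weights, and the two-fold tolerance are all hypothetical, and the adjacent-fraction comparison is not shown.

```python
from math import log2


def passes_cleaning(protein):
    """Initial cleaning: at least 2 peptides, not a contaminant or reverse hit,
    and an intensity present for every elutriated fraction.
    The dictionary keys are hypothetical field names."""
    return (
        protein["peptide_count"] >= 2
        and not protein["is_contaminant"]
        and not protein["is_reverse"]
        and all(value is not None for value in protein["fraction_intensities"])
    )


def predicted_async_intensity(fraction_intensities, cell_counts):
    """Cell-count-weighted mean of the fraction intensities (assumed weighting)."""
    if len(fraction_intensities) != len(cell_counts):
        raise ValueError("one cell count is needed per fraction")
    total_cells = sum(cell_counts)
    weighted = sum(i * n for i, n in zip(fraction_intensities, cell_counts))
    return weighted / total_cells


def is_self_consistent(fraction_intensities, cell_counts, async_intensity,
                       max_abs_log2_ratio=1.0):
    """True if predicted and measured asynchronous intensities agree within
    the tolerance (default two-fold, an arbitrary illustrative threshold)."""
    predicted = predicted_async_intensity(fraction_intensities, cell_counts)
    if predicted <= 0 or async_intensity <= 0:
        return False
    return abs(log2(predicted / async_intensity)) <= max_abs_log2_ratio


# Made-up example values for one protein across six elutriated fractions.
example = {
    "peptide_count": 5,
    "is_contaminant": False,
    "is_reverse": False,
    "fraction_intensities": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
}
cell_counts = [1e6] * 6

keep = passes_cleaning(example) and is_self_consistent(
    example["fraction_intensities"], cell_counts, async_intensity=14.0
)
print("retain protein:", keep)
```

Comparing on a log2 scale treats over- and under-prediction symmetrically, which is one reasonable design choice for intensity ratios; the record does not state the metric or threshold that was actually used.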
Ly T, Ahmad Y, Shlien A, Soroka D, Mills A, Emanuele MJ, Stratton MR, Lamond AI. A proteomic chronology of gene expression through the cell cycle in human myeloid leukemia cells. eLife. 2014 Jan 1;3:e01630. PubMed: 24596151