Project PXD005235

PRIDE Assigned Tags:
Biomedical Dataset



Genomic determinants of protein abundance variation in colorectal cancer cell lines


Assessing the impact of genomic alterations on protein networks is fundamental in identifying the mechanisms that shape cancer heterogeneity. We have used isobaric labelling to characterize the proteomic landscapes of 50 colorectal cancer cell lines and to decipher the functional consequences of somatic genomic variants. The robust quantification of over 9,000 proteins and 11,000 phosphopeptides on average, enabled the de novo construction of a functional protein correlation network which ultimately exposed the collateral effects of mutations on protein complexes. CRISPR-cas9 deletion of key chromatin modifiers confirmed that the consequences of genomic alterations can propagate through protein interactions in a transcript-independent manner. Lastly, we leveraged the quantified proteome to perform unsupervised classification of the cell lines and to build predictive models of drug response in colorectal cancer. Overall, we provide a deep integrative view of the functional network and the molecular structure underlying the heterogeneity of colorectal cancer cells.

Sample Processing Protocol

Protein digestion and TMT labeling PBS washed cell pellets were dissolved in 150 μL 0.1 M triethylammonium bicarbonate (TEAB), 0.1% SDS with pulsed probe sonication on ice for 20 sec and direct boiling at 95 °C for 10 min. This was performed twice and cellular debris removed by centrifugation at 12,000 rpm for 10 min. Protein concentration was measured by Bradford Protein Assay. Aliquots containing 100 μg of total protein were prepared for trypsin digestion. Cysteine disulfide bonds were reduced with 5 mM tris-2-carboxymethyl phosphine followed by 1 h incubation at 60 °C. Cysteine residues were blocked with 10 mM Iodoacetamide solution and 30 min at room temperature in dark. Trypsin was added at mass ratio 1:30 for overnight digestion. Peptides were diluted up to 100 μL with 0.1 M TEAB buffer. A 41 μL volume of anhydrous acetonitrile was added to each TMT 10-plex reagent vial and after vortex mixing the content of each TMT vial was transferred to each sample. The labeling reaction was quenched after 1 hour using 8 μL 5% hydroxylamine. Samples were combined and mixture dried with speedvac concentrator and stored at -20 °C u. The SW48 cell line was used as the reference sample to enable inter-experimental comparison and was newly cultured in parallel with the rest of the cell lines in each sample batch. High pH Reverse Phase peptide fractionation was performed with a XBridge C18 column on a Dionex Ultimate 3000 HPLC. Mobile phase (A) was composed of 0.1% ammonium hydroxide and mobile phase (B) was composed of 100% acetonitrile, 0.1% ammonium hydroxide. The TMT labelled peptide mixture was reconstituted in 100 μL mobile phase (A). The multi-step gradient elution method at 0.2 mL/min was: for 5 minutes isocratic at 5% (B), for 35 min gradient to 35% (B), for 5 min gradient to 80% (B), isocratic for 5 minutes and  re-equilibration to 5% (B). Signal was recorded at 215 and 280 nm, fractions were collected in a time dependent manner every 30 sec. The collected fractions were dried with SpeedVac concentrator and stored at -20 °C until the LC-MS analysis. For the replication sample set and the CRISPR/cas9 proteomic experiments, peptide fractionation was performed on reversed-phase OASIS HLB cartridges up to 10 fractions were collected for each. Phosphopeptide enrichment Peptide fractions were reconstituted in 10 uL of 20% isopropanol, 0.5% formic acid binding solution and were loaded on 10 uL of phosphopeptide enrichment IMAC resin , washed and conditioned with binding solution. The resin was washed three times with 40 uL of binding solution and centrifugation at 300 g after 2 h of binding and the flow-through solutions were collected. Phosphopeptides were eluted three times with 70 uL of 40% acetonitrile, 400 mM ammonium hydroxide solution. Both the eluents and flow-through solutions were dried in a speedvac and stored at -20 °C. LC-MS Analysis   LC-MS analysis was performed on the Dionex Ultimate 3000 UHPLC system coupled with the Orbitrap Fusion Tribrid Mass Spectrometer. Each peptide fraction was reconstituted in 40 μL 0.1% formic acid and a volume of 7 μL was loaded to the Acclaim PepMap 100, trapping column with the μlPickUp mode at 10 μL/min flow rate. The sample was then subjected to a multi-step gradient elution on the Acclaim PepMap  RSLC C18 capillary column retrofitted to an electrospray emitter. Mobile phase (A) was composed of 0.1% formic acid and mobile phase (B) was composed of 80% acetonitrile, 0.1% formic acid. The gradient separation method at flow rate 300 nL/min was as follows: for 95 min gradient to 42% B, for 5 min up to 95% B, for 8 min isocratic at 95% B, re-equilibration to 5% B in 2 min, for 10 min isocratic at 5% B. Precursors were selected with mass resolution of 120k, AGC  3×105 and IT 100 ms in the top speed mode within 3 sec and were isolated for CID fragmentation with quadrupole isolation width 0.7 Th. Collision energy was set at 35% with AGC 1×104 and IT 35 ms. MS3 quantification spectra were acquired with further HCD fragmentation of the top 10 most abundant CID fragments isolated with Synchronous Precursor Selection (SPS) excluding neutral losses of maximum m/z 30. Quadrupole isolation width was set at 0.5 Th, collision energy was applied at 45% and the AGC setting was at 6×104 with 100 ms IT. The HCD MS3 spectra were acquired within 120-140 m/z with 60k resolution. Targeted precursors were dynamically excluded for further isolation and activation for 45 seconds with 7 ppm mass tolerance. Phosphopeptide samples were analyzed with CID-HCD method at the MS2 level. MS level AGC was set at 6×105, IT was set at 150 ms and exclusion duration at 30sec. The fractions for the replication and CRISPR/cas9 sets were analysed with 180 min and 300 min LC-MS runs respectively and the analysis was repeated by setting an upper intensity threshold at 2-5×106 to capture lower abundant peptides.

Data Processing Protocol

Protein identification and quantification The acquired mass spectra were submitted to SequestHT search in Proteome Discoverer 1.4 for protein identification and quantification. The precursor mass tolerance was set at 20 ppm and the fragment ion mass tolerance was set at 0.5 Da for the CID and at 0.02 Da for the HCD spectra used for the phosphopeptide analysis. Spectra were searched for fully tryptic peptides with maximum 2 miss-cleavages and minimum length of 6 amino acids. TMT6plex at N-termimus, K and Carbamidomethyl at C were defined as static modifications. Dynamic modifications included oxidation of M and Deamidation of N,Q. Maximum two different dynamic modifications were allowed for each peptide with maximum two repetitions each. Search for phospho-S,T,Y was included only for the IMAC data. Peptide confidence was estimated with the Percolator node. Peptide FDR was set at 0.01 and validation was based on q-value and decoy database search. All spectra were searched against a UniProt fasta file containing 20,165 reviewed human entries. The Reporter Ion Quantifier node included a custom TMT-10plex Quantification Method with integration window tolerance 15 ppm, integration method the Most Confident Centroid at the MS3 level and missing channels were replaced by minimum intensity. Only peptides uniquely belonging to protein groups were used for quantification. Peptide Log2-ratios were computed against the SW48 cell line in each set and were averaged per protein and phosphopeptide. Proteins and phosphopeptides quantified in less than half of the samples were discarded and batch effects due to SW48 variation were regressed out to obtain Log2-scaled relative protein and phosphopeptide abundances. To detect true phosphorylation changes we regressed out the relative protein abundances from the respective phosphopeptide levels. For the identification of single amino acid variant peptides we constructed a protein fasta file by replacing the canonical amino acids with the respective ones encoded by 77k missense mutations using Ensembl gene IDs and protein sequences. The canonical sequences were also included in the proteogenomic search performed in Proteome Discoverer 2.1. The identified mutant peptides at <1% Percolator FDR were further filtered based on their TMT S/N, which was required to exhibit maximum intensity in the cell line where each variant peptide was expected to occur.   


James Wright, Functional Proteomics, Institute Cancer Research & Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute
Jyoti Choudhary, Wellcome Trust Sanger Institute ( lab head )

Submission Date


Publication Date




Cell Type

stem cell


colon cancer


Orbitrap Fusion


Not available

Experiment Type

Shotgun proteomics


    Roumeliotis TI, Williams SP, Gonçalves E, Alsinet C, Del Castillo Velasco-Herrera M, Aben N, Ghavidel FZ, Michaut M, Schubert M, Price S, Wright JC, Yu L, Yang M, Dienstmann R, Guinney J, Beltrao P, Brazma A, Pardo M, Stegle O, Adams DJ, Wessels L, Saez-Rodriguez J, McDermott U, Choudhary JS. Genomic Determinants of Protein Abundance Variation in Colorectal Cancer Cells. Cell Rep. 2017 20(9):2201-2214 PubMed: 28854368