Project PXD002726

PRIDE Assigned Tags:
Technical Dataset



Optimized data analysis avoiding trypsin artefacts


Most bottom-up proteomics experiments share two features: the use of trypsin to digest proteins for mass spectrometry, and the statistics-driven matching of measured peptide fragment spectra against spectra generated in silico from a protein database. While this extremely powerful approach, in combination with latest-generation mass spectrometers, facilitates very deep proteome coverage, its underlying assumptions have to be met to generate meaningful results. One of these assumptions is that the measured spectra indeed have a match in the search space, since the search engine will always report the best match. However, one of the most abundant proteins in the sample, the protease, is often not represented in the employed database. It is therefore widely accepted in the community to include the protease and other common contaminants in the database to avoid false positive matches. Although this approach accounts for unmodified trypsin peptides, the most widely employed trypsin preparations are chemically modified to prevent autolysis and premature activity loss of the protease. In this study we observed numerous spectra of modified trypsin-derived peptides in samples from our laboratory as well as in datasets downloaded from public repositories. In many cases the spectra were assigned to other proteins, often with good statistical significance. We therefore designed a new database search strategy employing an artificial amino acid, which accounts for these peptides with only a minimal increase in search space and a correspondingly minimal loss of statistical significance. Moreover, this approach can be easily implemented into existing workflows for many widely used search engines.

Sample Processing Protocol

Proteins from an in-gel digest of S. cerevisiae lysate were identified using LC-MS/MS.

Data Processing Protocol

Database searches were performed using Mascot version 2.4.1. Data were searched against the S. cerevisiae Swiss-Prot database. For the identification of dimethylated trypsin, a modified trypsin sequence in which all lysines were replaced by dimethylated lysines (J) was added to the database. A decoy database search was used for false discovery rate estimation. Peptides were matched using trypsin as the digestion enzyme. Peptide mass tolerance was set to 10 ppm (100 ppm for dataset 5) and fragment mass tolerance to 0.8 Da. A maximum of two missed cleavages was allowed. Carbamidomethylation of cysteine was set as a fixed modification and oxidation of methionine as a variable modification. To detect peptides containing the artificial amino acid J, the cleavage specificity for trypsin was changed to cleave after J, K and R, except when the next residue is P. Furthermore, to detect mixed peptides (i.e. peptides containing both dimethylated and unmodified lysine), loss of dimethylation of lysine (applicable only to J) was added as a variable modification.
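The sequence rewrite and the modified cleavage rule described above can be illustrated with a minimal sketch. It is not the submitters' code and does not reproduce Mascot's internal behaviour. The function names `dimethylate_lysines` and `digest` and the toy sequence are hypothetical, chosen only for illustration. The sketch assumes the usual reading of the trypsin rule: cleave C-terminal to J, K or R unless the next residue is P.

```python
def dimethylate_lysines(sequence: str) -> str:
    """Replace every lysine (K) with the artificial residue J (dimethylated lysine)."""
    return sequence.replace("K", "J")


def digest(sequence: str, cleave_after: str = "JKR", max_missed: int = 2) -> list[str]:
    """In silico digest: cleave after J, K or R unless the next residue is P.

    Returns all peptides with up to `max_missed` missed cleavages.
    """
    # Collect cleavage positions, including both sequence termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in cleave_after and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        # Join up to max_missed + 1 consecutive fragments.
        last = min(start + 2 + max_missed, len(sites))
        for end in range(start + 1, last):
            peptides.append(sequence[sites[start]:sites[end]])
    return peptides


# Toy sequence (hypothetical, not a real trypsin sequence).
toy = "AAKGGRPCCKDD"
modified = dimethylate_lysines(toy)
for peptide in digest(modified, max_missed=2):
    print(peptide)
```

Adding J to the cleavage residues matters because a standard K/R rule would not cut the rewritten database entry at its former lysine positions, so the in silico peptides would not correspond to the peptides generated from the original sequence.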


Katarina Fritz, Medical University of Graz
Ruth Birner-Gruenberger, Omics Center Graz Research Unit "Functional Proteomics and Metabolic Pathways" Institute of Pathology Medical University Graz, Austria ( lab head )

Submission Date


Publication Date



Not available


Not available

Assay count: 4



Schittmayer M, Fritz K, Liesinger L, Griss J, Birner-Gruenberger R. Cleaning out the Litterbox of Proteomic Scientists' Favorite Pet: Optimized Data Analysis Avoiding Trypsin Artifacts. J Proteome Res. 2016 Apr 1;15(4):1222-9. PubMed: 26938934


Showing 1 - 4 of 4 results
#  Accession  Title                                Proteins  Peptides  Unique Peptides  Spectra  Identified Spectra
1  56435      no assay title provided (mzIdentML)  30        1106      292              9297     613
2  56434      no assay title provided (mzIdentML)  40        1563      453              9297     1034
3  56433      no assay title provided (mzIdentML)  40        1622      475              9297     1080
4  56432      no assay title provided (mzIdentML)  146       4141      1269             21578    2882