PRIDE Assigned Tags: Biomedical Dataset
Dataset Belongs to: Malaria Host-Pathogen Interaction Center (MaHPIC)
Investigation of Plasmodium vivax trophozoite-schizont transition proteome
The apicomplexan parasite Plasmodium vivax reportedly caused 13.8 million cases of vivax malaria in 2015, yet much of its unique biology remains unknown. To expand our proteomic interrogation of the blood-stage interaction between the parasite and its animal model host, Saimiri boliviensis, we analyzed the proteome of infected host reticulocytes undergoing the transition from the trophozoite to the schizont stage. Two biological replicates, analyzed using five database search engines, identified 1923 P. vivax and 3188 S. boliviensis proteins. This project is part of the Malaria Host-Pathogen Interaction Center (MaHPIC), a transdisciplinary malaria systems biology research program supported by an NIH/NIAID contract (# HHSN272201200031C; see http://www.systemsbiology.emory.edu). The MaHPIC generates many data types (e.g., clinical, parasitological, metabolomics, functional genomics, lipidomics, proteomics, immune response) and mathematical models to iteratively test and develop hypotheses about the complex host-parasite dynamics in the course of malaria in non-human primates. It also generates metabolomics data through collaborations with investigators conducting clinical studies in malaria-endemic countries. The overarching goal is a better understanding of human disease, pathogenesis, and immunity. Within the MaHPIC, this project is known as 'Integral Supporting Project 05 (S05)'. Curation and maintenance of all data and metadata are the responsibility of the MaHPIC. This dataset was produced by Dave Anderson at SRI International.
Sample Processing Protocol
SOLUTIONS AND REAGENTS:
0.1% v/v formic acid in water (A)
0.1% v/v formic acid in HPLC grade acetonitrile (B)
Acetonitrile (MeCN)
Trifluoroacetic acid (TFA)
Individual solutions of ammonium formate in 90% A - 10% v/v B:
PROTEOME1 (5mM, 10mM, 15mM, 20mM, 25mM, 30mM, 50mM, 75mM, 100mM, 150mM, 200mM, 300mM, 500mM, 1500mM)
PROTEOME2 (2.5mM, 3.75mM, 5mM, 7.5mM, 10mM, 12.5mM, 15mM, 17.5mM, 20mM, 25mM, 30mM, 40mM, 50mM, 75mM, 100mM, 125mM, 150mM, 175mM, 200mM, 250mM, 300mM, 350mM, 400mM, 500mM, 1500mM)

2D LC/MS/MS:
Strong Cation Exchange (SCX) Column - self packed IntegraFrit capillary column (New Objective Inc.), ~10-15 cm x 75 um internal diameter; resin: PolyLC Inc. polysulfoethyl A, 300A pores, 5 um particles
C18 Reversed Phase (RP) Column - self packed PicoFrit capillary column (New Objective Inc.), ~20-27 cm x 75 um internal diameter, 15 um tip, uncoated; resin: Phenomenex Jupiter C18, 300A pores, 5 um

PROTOCOL:

SAMPLE PREPARATION
1. Receive dried FASP II-processed peptides in 200 ul polypropylene Agilent #5182-0549 microvial insert, in Agilent #5182-0716 glass microvials with screw-cap lids. Store dry at -20C until experiment.
2. Dissolve sample in ~12-16 ul solution A with repeated pipettings. Put insert into Agilent glass microvial, load into Agilent autosampler, temp = 4C.

MASS SPECTROMETRY
1. Tune and calibrate instrument using Thermo LTQ ESI Positive Ion Calibration Solution #88322.
2. At start of series of runs on a sample, optionally inject ~1/20 of sample on RP capillary column only, examine base peak intensity chromatogram, check backpressure after injection and after run; analyze data with Sequest for number of identified proteins. Repeat if problems.
3. Connect two capillary columns in sequence (SCX first).
Set up LTQ-XL Orbitrap ETD instrument with the following parameters in addition to tuning/calibration-set parameters:
spray voltage 2.1kV;
ion transfer capillary temperature 225-250C;
no sheath gas, no auxiliary gas;
CID fragmentation using 35% normalized collision energy;
full scan in Orbitrap, R=30000, range 350 - 2000 m/z, monoisotopic precursor selection enabled;
precursor peaks collected in profile mode, MS/MS peaks collected in centroid mode;
MS/MS in LTQ on top 10 most intense peaks if available;
reject 1+ ions for MS/MS except optionally for one of ~4 runs (to expand proteome by sequencing 1+ peptides);
dynamic exclusion = 60 sec, repeat count = 1, exclude precursors +/- 5 ppm;
Orbitrap full scan target ions = 5e5, LTQ MS/MS target ions = 1e4;
activation time 30ms, activation Q = 0.250;
min signal required to trigger MS/MS = 500 counts;
automatic gain control on polysiloxane internal lock masses when available: 593.157607, 519.138815, 445.120024, 391.284286, 371.101233 m/z

NANO HPLC
1. Set up Agilent 1200 nano-HPLC: autosampler temp = 4C, column temp = room temperature (~20C); use solutions A and B (above); flow rate 250 nl/min. except as noted. Loading sample: isocratic elution with 5% B; inject ~half of vol (~6-8ul of 12-16ul), load (25-30 min. for proteome1)(~35-40 min. for proteome2), dilute remaining sample to 12ul with A, inject half of new 12ul for 25-30min, repeat for 3rd, 4th loads; collect LC/MS/MS data during entire loading process.
2a. Individual salt step injections for proteome1: at start of each run, inject 2ul of individual ammonium formate stocks: 0 (optional depending on total protein load), 5, 10, 15, 20, 25, 30, 50, 75, 100, 150, 200, 300, 500, 1500 mM; may wash columns with final injection of 8ul B (collect this data as well). The HPLC elution gradients are: time(min)->%B [0->5, 60->5, 110->40, 130->95, 135->95, 136->5, 156->5].
2b. Individual salt step injections for proteome2: at start of each run, inject 2ul of individual ammonium formate stocks: 0mM (water), 2.5, 3.75, 5, 7.5, 10, 12.5, 15, 17.5, 20, 25, 30, 40, 50, 75, 100, 125, 150, 175, 200, 250, 300, 350, 400, 500, 1500, second 1500 mM, MeCN, 2nd MeCN washes; the final 1500mM and MeCN injections are to clean the SCX and RPC columns, however data is also collected from these runs. The HPLC elution gradients are: time(min)->%B [0->5, 30->5, 120->40, 121->95, 123->95, 124->5, 140->5].
3. Data is collected as Thermo format (proprietary) .raw files; save/store all runs in folder for analysis, including initial test RP runs if utilized.
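The two elution gradients in the NANO HPLC steps are given only as time->%B breakpoints. As a minimal sketch, the following Python reads them as a piecewise program and returns %B at any time point. It assumes linear ramps between consecutive breakpoints, which the protocol does not state explicitly; the names PROTEOME1_GRADIENT, PROTEOME2_GRADIENT, and percent_b are illustrative and not part of the original method.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints copied from the protocol text.
PROTEOME1_GRADIENT = [(0, 5), (60, 5), (110, 40), (130, 95),
                      (135, 95), (136, 5), (156, 5)]
PROTEOME2_GRADIENT = [(0, 5), (30, 5), (120, 40), (121, 95),
                      (123, 95), (124, 5), (140, 5)]


def percent_b(gradient, t):
    """Return %B at time t (min).

    Assumption: %B changes linearly between consecutive breakpoints.
    """
    times = [point[0] for point in gradient]
    if t < times[0] or t > times[-1]:
        raise ValueError("time is outside the gradient program")
    # Index of the last breakpoint at or before t.
    i = bisect_right(times, t) - 1
    if i == len(gradient) - 1:
        return float(gradient[-1][1])
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


# Midpoint of the 5% -> 40% ramp in each program.
mid_p1 = percent_b(PROTEOME1_GRADIENT, 85)
mid_p2 = percent_b(PROTEOME2_GRADIENT, 75)
```

Under the linear-ramp assumption, both programs reach 22.5% B at the midpoint of their 5% to 40% segment (85 min for proteome1, 75 min for proteome2).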
Data Processing Protocol
Proteins are identified using five different search engines, each run separately: the Crux algorithm (J Proteome Res 7, 3022, 2008) with tide-search and Percolator scoring as implemented in Crux 2.0; the Sequest algorithm (J Am Soc Mass Spectrom 5, 976, 1994) as implemented in Thermo's Proteome Discoverer software, v. 1.4; the Mascot algorithm (Electrophoresis 20, 3551, 1999) as implemented in Mascot Server v. 2.3.02, with peak lists (profile mode) of peptides created using Mascot Distiller v. 18.104.22.168; the Andromeda algorithm (J Proteome Res 10, 1794, 2011) v. 22.214.171.124 as implemented in MaxQuant v. 126.96.36.199; and the MSGF algorithm (Mol Cell Proteomics 9, 2840, 2010) as implemented in MSGF+ v. 10072 (June 30 2014 version). The reproducibility of identification of a protein by each of the five search engines is calculated using a custom Excel macro. Grouped proteins, or additional proteins co-identified with a listed protein by a particular search engine, are included in the output files for individual search engines when available. Details of the identification processes of the individual search engines are available in the DataProcessingSOP060517.docx document.
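The custom Excel macro itself is not described in this record. As a rough sketch of one plausible reading of "reproducibility", the Python below counts how many of the five search engines reported each protein. The function name engine_support and the accessions PROT_A, PROT_B, and PROT_C are hypothetical placeholders, not real identifiers from this dataset, and the real macro may compute something different.

```python
from collections import Counter


def engine_support(identifications):
    """Map each protein accession to the number of engines that reported it.

    `identifications` maps an engine name to the list of accessions it
    reported. This is an assumed input shape, not the macro's actual one.
    """
    counts = Counter()
    for proteins in identifications.values():
        # set() so a protein listed twice by one engine counts once.
        counts.update(set(proteins))
    return dict(counts)


# Hypothetical accessions, for illustration only.
example = {
    "Crux": ["PROT_A", "PROT_B"],
    "Sequest": ["PROT_A"],
    "Mascot": ["PROT_A", "PROT_C"],
    "Andromeda": ["PROT_A", "PROT_B"],
    "MSGF+": ["PROT_A", "PROT_B", "PROT_B"],
}
support = engine_support(example)
```

In this toy input, PROT_A is supported by all five engines, PROT_B by three, and PROT_C by one. A per-protein count like this makes it easy to filter for identifications confirmed by several engines.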
Anderson DC, Lapp SA, Barnwell JW, Galinski MR. A large scale Plasmodium vivax-Saimiri boliviensis trophozoite-schizont transition proteome. PLoS One. 2017;12(8):e0182561. PubMed: 28829774