PRIDE FAQ

 Frequently Asked Questions - FAQ

If you cannot find the answer to your question in this FAQ or other help pages then please contact us at pride-support@ebi.ac.uk for advice.

The FAQ is divided into 9 sections:

1. Submission options - What submission options do I have?

2. Submission process - How to do a submission?

3. Data access policy - How can the data be accessed?

4. Post-submission process - How do I modify, reference and make datasets public?

5. Quality Control - How can I annotate my data better? 

6. Recommendations for popular proteomics software tools and search engines - What fits my custom pipeline?

7. Recommendations for specific proteomics experiment types - What might fit my bleeding edge research?

8. Related journal guidelines - What are the known requirements of different proteomics journals?

9. Troubleshooting

 

1. Submission options - What submission options do I have?

 

What is ProteomeXchange?

What are the PX Submission options via PRIDE?

What is a Complete ProteomeXchange Submission?

What is a Partial ProteomeXchange Submission?

 

 

2. Submission process - How to do a submission?

 

Register 

How to manually set the proxy details with the PX Submission Tool?

What type of machine raw files are supported for a PX Submission?

Can PRIDE XML files without identifications be submitted for a Complete PX Submission?

How to export the summary file with the PX Submission Tool?

How to generate the PX Submission Summary File without the PX Submission Tool?

 

3. Data access policy - How can the data be accessed?

 

Are the uploaded datasets kept private by default upon a ProteomeXchange Submission?

How to access private datasets?

 

 

4. Post-submission process - How do I modify, reference and make datasets public?

 

How can I modify the original dataset?

How to reference a PX accession in a manuscript?

How to release a dataset publicly?

 

 

5. Quality Control - How can I annotate my data better?

 

What are the delta m/z values displayed in PRIDE Inspector Peptide View?

What to do when you see delta m/z outliers in PRIDE Inspector?

 

 

6. Recommendations for popular proteomics software tools and search engines - What fits my custom pipeline?

 

What kind of Scaffold files are recommended for a PX Submission?

What kind of MaxQuant files are recommended for a PX Submission?

What kind of Proteome Discoverer files are recommended for a PX Submission?

What kind of ProteinPilot files are recommended for a PX Submission?

 

 

7. Recommendations for specific proteomics experiment types - What might fit my bleeding edge research?

 

I have mass spectrometry imaging data, what should I do?

I have multi-omics data uploaded to different databases, what should I do?

How to handle newly discovered, unreported modifications?

 

8. Related journal guidelines - What are the known requirements of different proteomics journals?

 

Does my PX dataset fulfil the requirements of the journal Molecular and Cellular Proteomics?

Does my PX dataset fulfil the requirements of the journal Proteomics?

 

9. Troubleshooting

 

PX Submission Tool does not launch

PX Submission tool launches, but I cannot login / make a submission

How to switch the PX Submission Tool from Aspera to FTP?

PRIDE Converter 2 Tool does not launch or cannot convert my files

PRIDE Converter 2 Tool still has problems converting my files

How do I update my PRIDE account information?

PRIDE Inspector Java Web Start does not launch

PRIDE Inspector Tool does not launch

Aspera does not connect

Aspera connects, but transfers 0% of my files

I cannot log in with the PX Submission Tool

All my Java applications don't launch! I'm on a Mac...

 

1. Submission options - What submission options do I have?

 

What is ProteomeXchange?

The ProteomeXchange consortium has been set up to provide a single point of submission of MS proteomics data to the main existing proteomics repositories, and to encourage the data exchange between them for optimal data dissemination.

Current members accepting submissions are:

The PRIDE (PRoteomics IDEntifications) database at the European Bioinformatics Institute, focusing mainly on shotgun mass spectrometry proteomics data.

PeptideAtlas/PASSEL focusing on SRM/MRM datasets.

 

What are the PX Submission options via PRIDE?

The default way of submitting data to PRIDE is following the ProteomeXchange (PX) consortium (www.proteomexchange.org) guidelines. The figure below shows the overall submission process submitters will have to follow up to the point of uploading their datasets:

[Figure: new PRIDE PX submission workflow]

Each submitted PX dataset will contain:

- peptide/protein identification files (called ‘RESULT’),

- mass spectrometer output files (called ‘RAW’), which are either machine raw files or not heavily processed files in a XML standard format such as mzXML or mzML,

- other files like peak list files (called ‘PEAK’), search engine output files (called ‘SEARCH’), quantification files and different post-processing files, amongst others.

The current version of the pipeline does not explicitly support quantification results unless they are included in the PRIDE XML files. However, quantification result files can be submitted as accompanying (‘OTHER’) files.

There are two different submission workflows depending on whether peptide/protein identification results can be submitted in a format that can be handled by PRIDE (PRIDE XML at present, the data standard mzIdentML in the near future) or not. If PRIDE XML ‘RESULT’ files are provided a "Complete Submission" option is available. If PRIDE XML files are not available a "Partial Submission" can be done.

 

What is a Complete ProteomeXchange Submission?

You can choose between 2 main submission types depending on the availability of mzIdentML or PRIDE XML files as "Result" files for Complete Submissions. The recommended submission subtype is a Complete Submission, but alternatively Partial Submissions are accepted as well.

3A. Complete Submission: mzIdentML- or PRIDE XML-based

The 2 subtypes of Complete Submissions are either mzIdentML- or PRIDE XML-based. Complete Submissions mixing the 2 types of ‘RESULT’ files are not allowed. 

An mzIdentML-based Complete Submission requires 3 types of files:

  • Result files: mzIdentML 1.1 files with identifications provided. In the submission tool they should be tagged as “RESULT”. It is also recommended to check your mzIdentML files before submission using the PRIDE Inspector tool (the version supporting mzIdentML will be out in early January, 2014). mzIdentML version 1.0 files are not supported.
  • Peak list files: Since the mzIdentML files themselves do not contain the spectra information, it is mandatory to provide the peak list files (e.g. mgf files) that were used for the original search and are referenced in the mzIdentML file. These are different from the mandatory raw files. In the submission tool they should be tagged as “PEAK”, and the submission tool will try to automatically map the peak files to the mzIdentML file where they are listed.
  • Raw files: the MS instrument output files, for instance Thermo RAW files. As an alternative, lightly processed mzML, mzXML or mzData files are acceptable if MS1 level spectra information is available and the different peak processing steps are known. In the submission tool they should be tagged as “RAW”.
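
Before uploading, it can help to check that every peak list file referenced in an mzIdentML file is actually present in the submission folder. The following is a minimal Python sketch, not part of the PX Submission Tool. It assumes that the peak list files are referenced through the `location` attribute of the `SpectraData` elements, as in mzIdentML 1.1; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def referenced_spectra_files(root):
    """Return the base file names of all SpectraData locations under an mzIdentML root element."""
    names = []
    for elem in root.iter():
        # Tags look like '{namespace}SpectraData'; compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "SpectraData":
            location = elem.get("location", "")
            # Locations may be URIs or Windows/Unix paths; keep the last path segment.
            names.append(location.replace("\\", "/").rsplit("/", 1)[-1])
    return names


def missing_peak_files(root, available_names):
    """Return the referenced file names that are not among the available file names."""
    available = set(available_names)
    return [name for name in referenced_spectra_files(root) if name not in available]


# Usage with a file on disk (the path is a placeholder):
#   root = ET.parse("my_results.mzid").getroot()
#   print(missing_peak_files(root, ["run1.mgf", "run2.mgf"]))
```

An empty list means that every referenced peak list file name was found. File names are compared exactly, so differences in upper and lower case are reported as missing.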

Please check our Guide to generate mzIdentML files. It is possible that you are already using a pipeline/search engine where mzIdentML files are amongst the native search engine output formats. mzIdentML files can be created/exported already with numerous tools, please see a list here.

Besides the three mandatory file types above, there are optional and recommended file types that can be prepared and uploaded as well:

  • Search engine result files: The original output from your search engine or analysis pipeline that you used for further post-processing, such as Mascot .dat files, Trans-Proteomic Pipeline (TPP) pep.xml and/or prot.xml files, among many others. If your search engine generates mzIdentML files by default, you have already provided them as "Result" files. The search engine files should contain peptide/protein identification results. In the submission tool they should be tagged as “SEARCH”.
  • Quantification result files: In many cases current mass spectrometry proteomics studies involve a quantitative analysis of the peptides/proteins present in the samples. Quantification related files reporting peptide/protein quantitative values/ratios can be provided and tagged as "QUANT" in the submission tool.
  • Sequence database files: Sequence database file (usually in FASTA format) that was used to perform the mass spectral search. Sequence database files (both protein and DNA) are labelled as ‘FASTA’ in the tool.
  • Spectrum libraries: Spectral library file that was used for performing the mass spectrometry search. In the PX Submission Tool they should be tagged as ‘SPECTRUM_LIBRARY’.
  • Gel image files: In case two-dimensional gel electrophoresis has been used as a separation method the gel image files can be provided. In the submission tool they should be tagged as ‘GEL’.
  • Other files: Everything else that does not fit into the categories above, for instance protein inference files generated by post-processing of the search engine results, or R scripts used for data analysis. If you have used custom search databases you can provide those as well. In the submission tool they should be tagged as ‘OTHER’.

A PRIDE XML-based Complete Submission requires 2 types of files:

  • Result files: PRIDE XML files with identifications provided, which are fully supported by PRIDE. In the submission tool they should be tagged as “RESULT”. It is also recommended to check your PRIDE XML files before submission using the PRIDE Inspector tool.
  • Raw files: the MS instrument output files, for instance Thermo RAW files. As an alternative, lightly processed mzML, mzXML or mzData files are acceptable if MS1 level spectra information is available and the different peak processing steps are known. In the submission tool they should be tagged as “RAW” and mapped to the corresponding "RESULT" files.

We recommend creating PRIDE XML files with the PRIDE Converter 2 tool. Please take a moment to review our Guide to generate PRIDE XML files concerning the input files you can use for PRIDE XML generation. There are other tools that can produce PRIDE XML files but are not maintained by the PRIDE team, like PeptideShaker, Waters PLGS, Proteios, EasyProt, MIAPE Extractor (ProteoRed), or the original PRIDE Converter (no longer developed).

Besides the two mandatory file types above, there are optional and recommended file types that can be prepared and uploaded as well:

  • Peak list files: It is strongly recommended to provide the peak list files (e.g. mgf files) that were used for the original search. These are different from the mandatory raw files. In the submission tool they should be tagged as “PEAK”.
  • Search engine result files: the original output from your search engine or your analysis pipeline, such as Mascot .dat files, Trans-Proteomic Pipeline (TPP) pep.xml and/or prot.xml files or mzIdentML files, among many others. They should contain the peptide/protein identifications. In the submission tool they should be tagged as “SEARCH”.
  • Quantification result files: In many cases current mass spectrometry proteomics studies involve a quantitative analysis of the peptides/proteins present in the samples. Quantification related files reporting peptide/protein quantitative values/ratios can be provided and tagged as "QUANT" in the submission tool.
  • Sequence database files: Sequence database file (usually in FASTA format) that was used to perform the mass spectral search. Sequence database files (both protein and DNA) are labelled as ‘FASTA’ in the tool.
  • Spectrum libraries: Spectral library file that was used for performing the mass spectrometry search. In the PX Submission Tool they should be tagged as ‘SPECTRUM_LIBRARY’.
  • Gel image files: In case two-dimensional gel electrophoresis has been used as a separation method the gel image files can be provided. In the submission tool they should be tagged as ‘GEL’.
  • Other files: Everything else that does not fit into the categories above, for instance protein inference files generated by post-processing of the search engine results, or R scripts used for data analysis. If you have used custom search databases you can provide those as well. In the submission tool they should be tagged as ‘OTHER’.

In case of a Complete Submission a DOI (Digital Object Identifier) will be assigned to your dataset and its transparency level will be higher. That is good for your data and good for the community.

 

What is a Partial ProteomeXchange Submission?

You should only choose this option if your search results cannot be converted/exported to PRIDE XML or mzIdentML v1.1 (plus the accompanying spectra). It is not the recommended option, since it will significantly reduce the reusability of your dataset. ‘RAW’ files need to be provided together with search engine output files (‘SEARCH’). Uploading peak list (‘PEAK’), quantification and other types of files (‘QUANT’, ‘GEL’ or ‘OTHER’) is possible but not enforced. As a result, you will be issued with a ProteomeXchange accession number but not with a DOI. Once it is made public, your dataset will be available to download via FTP.

Partial Submission requires 2 types of files:

  • Search engine result files (called ‘SEARCH’): the original output files from your search engine or your analysis pipeline, such as Trans-Proteomic Pipeline (TPP) pep.xml and/or prot.xml files, or MaxQuant text output files, among many others. They should contain the peptide/protein identifications. In the submission tool they should be tagged as ‘SEARCH’.
  • Raw files (called ‘RAW’): MS instrument binary output files, such as Thermo RAW files or Bruker .baf files, or not heavily processed mzXML or mzML files. If your ‘RAW’ files are organized in directories instead of individual files, please compress each directory into one individual file (for instance a .zip file) before upload. In the submission tool they should be tagged as ‘RAW’.
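
Compressing directory-based raw data can be scripted when there are many runs. The following is a minimal Python sketch using only the standard library; the function name and the example directory names are hypothetical.

```python
import shutil
from pathlib import Path


def zip_raw_directory(raw_dir, output_dir):
    """Compress one raw data directory (e.g. 'sample1.d') into '<output_dir>/<name>.zip'."""
    raw_dir = Path(raw_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    archive_base = output_dir / raw_dir.name
    # root_dir/base_dir keep the directory name itself as the top-level entry in the archive.
    archive = shutil.make_archive(
        str(archive_base), "zip", root_dir=str(raw_dir.parent), base_dir=raw_dir.name
    )
    return Path(archive)


# Usage (paths are placeholders):
#   for directory in Path("raw_data").glob("*.d"):
#       print(zip_raw_directory(directory, "upload"))
```

The output directory should be outside the directory being compressed, so that the archive does not end up inside itself.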

Besides the two mandatory file types above, there are optional and recommended file types that can be prepared and uploaded as well:

 

  • Peak list files: It is strongly recommended to provide the peak list files (e.g. mgf files) that were used for the original search. These are different from the mandatory raw files. In the submission tool they should be tagged as ‘PEAK’.
  • Quantification related files: In many cases current mass spectrometry proteomics studies involve a quantitative analysis of the peptides/proteins present in the samples. Quantification related files reporting peptide/protein quantitative values/ratios can be provided and tagged as ‘QUANT’ in the submission tool.
  • Sequence database files: Sequence database file (usually in FASTA format) that was used to perform the mass spectral search. Sequence database files (both protein and DNA) are labelled as ‘FASTA’ in the tool.
  • Spectrum libraries: Spectral library file that was used for performing the mass spectrometry search. In the PX Submission Tool they should be tagged as ‘SPECTRUM_LIBRARY’.
  • Gel image files: In case two-dimensional gel electrophoresis has been used as a separation method the gel image files can be provided. In the submission tool they should be tagged as ‘GEL’.
  • Any other files: Everything else that does not fit into the categories above, for instance protein inference files generated by post-processing of the search engine results, or R scripts used for data analysis. If you have used custom search databases you can provide those as well. In the submission tool they should be tagged as ‘OTHER’.

Performing a Partial Submission means that a PX accession number will be assigned to your files, but PRIDE experiment accession numbers will not be issued. Also, no DOI will be assigned to your dataset.

 

 

2. Submission process - How to do a submission?

 

Register

If you do not already have a PRIDE account, create one here. Currently we do not send out automatic emails upon successful registration. Please contact pride-support@ebi.ac.uk if your login information is still not valid 24 hours after registration.

 

What type of machine raw files are supported for a PX Submission?

Thermo .RAW, AB SCIEX .wiff and .wiff.scan, Agilent .d/, Waters .raw/, imzML, Shimadzu .run/, Bruker .yep, Bruker .baf

As an alternative, lightly processed  mzML, mzXML, mzData files are acceptable if MS1 level spectra information is available and the different peak processing steps are known.


How to manually set the proxy details with the PX Submission Tool?

From version 1.0.4 onwards, the proxy details can be set manually in the PX Submission Tool. The tool's working directory contains a 'config' folder with a text file named 'config.props'. There, the proxy host and port can be set by uncommenting the two lines below and overwriting their values:

# px.proxy.host = localhost
# px.proxy.port = 8080
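
If the same change has to be made on several machines, the manual edit can be scripted. The following is a minimal Python sketch that works on the text of 'config.props'. The function name is hypothetical, and the host and port in the usage comment are placeholders for your own proxy details.

```python
import re


def set_proxy(config_text, host, port):
    """Return the config text with px.proxy.host and px.proxy.port uncommented and set."""
    values = {"px.proxy.host": host, "px.proxy.port": str(port)}
    lines = []
    seen = set()
    for line in config_text.splitlines():
        # Match the two proxy keys whether or not they are commented out with '#'.
        match = re.match(r"\s*#?\s*(px\.proxy\.(?:host|port))\s*=", line)
        if match:
            key = match.group(1)
            lines.append(f"{key} = {values[key]}")
            seen.add(key)
        else:
            lines.append(line)
    # Append any key that was not present in the file at all.
    for key, value in values.items():
        if key not in seen:
            lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


# Usage (path, host and port are placeholders):
#   from pathlib import Path
#   path = Path("config/config.props")
#   path.write_text(set_proxy(path.read_text(), "proxy.example.org", 3128))
```

All other lines of the file are left unchanged. Keeping a backup copy of 'config.props' before overwriting it is advisable.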

See also screenshot:

[Screenshot: PX Submission Tool FTP proxy]

 

 

 


Can PRIDE XML files without identifications be submitted for a Complete PX Submission?

No. PRIDE XML files without protein/peptide identifications unfortunately do not qualify for a Complete PX Submission. You will have to investigate the options to provide identifications as well.

If you are unable to produce PRIDE XML files with identifications included, you can still do a Partial PX Submission without any PRIDE XML files. You can provide your identification files as search files and map the raw files to them.

Alternatively, please check whether the software tools you are using can export mzIdentML 1.1. We recommend uploading those files together with the accompanying mgf peak list files that are referenced in them. Right now these can only be handled as a Partial Submission, but soon we will be able to natively support mzIdentML 1.1 as a Complete Submission. So if you have uploaded your peak list files as well, there is a chance that those datasets can be turned into Complete Submissions later by adding result file level metadata information.

 

How to export the summary file with the PX Submission Tool?

For some bulk submissions or for Aspera upload the summary.px file can still be generated and exported with the PX Submission Tool, although the files won't actually be uploaded with the tool itself.

The PX Submission Tool has an "Export Summary" button that you can click once you have done the file mappings; it exports the summary.px file. All the files must be available in a folder, because the tool takes the file names from this folder. The data will not actually be uploaded, only the summary file will be exported. See screenshot.

[Screenshot: exporting the summary file with the PX Submission Tool]

You don't have to start the actual file upload that way. You can send that small file to the database curators via email and upload the files via Aspera.

 

How to generate the PX Submission Summary File without the PX Submission Tool?

For bulk submissions where Aspera is used and there are too many files to handle with the PX Submission Tool GUI, the summary files can be generated by scripting. Details of the tab-delimited PX Submission format can be found in the ProteomeXchange_Submission_Summary_File_Format.pdf that is distributed with the PX Submission Tool and can be downloaded from here. Example submission summary files (Complete and Partial) can be downloaded from here.
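
As a starting point for such a script, the following minimal Python sketch builds a tab-delimited inventory of file names with a proposed file tag for each. This is not the official summary file format: the column layout, the extension-to-tag mapping and the function names are assumptions made for illustration only, and the real layout has to be taken from the format PDF mentioned above.

```python
import csv
import io
from pathlib import PurePath

# Hypothetical mapping from file extension to submission tag; adjust it to your own data.
TAG_BY_SUFFIX = {
    ".raw": "RAW",
    ".wiff": "RAW",
    ".mzid": "RESULT",
    ".mgf": "PEAK",
    ".dat": "SEARCH",
    ".fasta": "FASTA",
}


def build_inventory(file_names):
    """Return (file_id, tag, file_name) rows, with 'OTHER' for unknown extensions."""
    rows = []
    for file_id, name in enumerate(sorted(file_names), start=1):
        suffix = PurePath(name).suffix.lower()
        rows.append((file_id, TAG_BY_SUFFIX.get(suffix, "OTHER"), name))
    return rows


def to_tsv(rows):
    """Render the rows as tab-delimited text with a header line (illustrative layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(("file_id", "tag", "file_name"))
    writer.writerows(rows)
    return buffer.getvalue()


# Usage (the folder name is a placeholder):
#   import os
#   print(to_tsv(build_inventory(os.listdir("my_submission_folder"))))
```

The resulting table can then be reviewed by hand and transformed into the documented summary file layout.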

 

3. Data access policy - How can the data be accessed?

 

Are the uploaded datasets kept private by default upon a ProteomeXchange Submission?

Yes. All submissions to ProteomeXchange via PRIDE are private by default. The curators only make a particular dataset public once this has been requested by the submitter or once the paper referencing the dataset has been published.

 

How to access private datasets?

Submitted datasets are 'private' by default, which means you need to be logged in to view your data. We will, however, also create a PX reviewer account for your submission, which you can include in your manuscript. The PX reviewer account gives access to all of the files belonging to your submission. You can access the private dataset files in two ways: via the PRIDE Archive web page or via PRIDE Inspector.

PRIDE Archive web page

The new PRIDE Archive web site is available at http://www.ebi.ac.uk/pride/archive from the 6th of January, 2014. Registered submitters can use their personal accounts or the reviewer accounts to access and download the individual PX datasets. A separate reviewer account is generated for every submission.

Please navigate first to the login page available at http://www.ebi.ac.uk/pride/archive/login (see Figure):

[Figure: PRIDE Archive Login]

 

PRIDE Inspector

For this option you need the version of PRIDE Inspector that is going to be released in the second week of January.

The following applies for both Complete and Partial Submissions:

Open PRIDE Inspector by clicking on the pride-inspector-<version-number>.jar file in the tool's working directory -> Private Download -> ProteomeXchange -> PX reviewer account details. You can open the PRIDE XML and mzIdentML result files with PRIDE Inspector or just download all the files that you wish to investigate.

Inspector's Private Download

Downloading data with the reviewer account using PRIDE Inspector's Private Download option.

In case of Complete Submissions you can alternatively launch PRIDE Inspector with a WebStart URL provided in the automatic "Submission Complete" e-mail. This option only downloads the PRIDE XML and mzIdentML files into a target folder. Before you can use the PRIDE Inspector Java Web Start option to display your data, there is a waiting period of up to one day after you receive the automatic "Submission Complete" e-mail.

 

4. Post-submission process - How do I modify, reference and make datasets public?

 

How can I modify the original dataset?

If you need to add a small number of additional "other files" (like csv files, plain text files, spreadsheets or scripts), we can provide you with FTP details for the upload and add these files to the original dataset without you resubmitting the whole dataset. If you have used the PX Submission Tool and you need to add additional raw files and accompanying result or search files, you need to resubmit the whole dataset. Please follow the procedure here. In case of an Aspera bulk submission you have to upload the modified and missing files into a new subdirectory within your target directory and regenerate the submission summary file so that it includes all the old, modified and new files.

 

How to reference a PX accession in a manuscript?

By default we recommend adding the following statement to your manuscript (typically in "Materials and Methods" or just before/in the Acknowledgements):

"The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [1] with the dataset identifier <PXD000xxx>."

In case of Complete PX Submissions the DOI can be added to this:

"The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository [1] with the dataset identifier <PXD000xxx> and DOI 10.6019/<PXD000xxx>."

[1] and also for general PRIDE reference, please use: Vizcaino JA, Cote RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J, O'Kelly G, Schoenegger A, Ovelleiro D, Perez-Riverol Y, Reisinger F, Rios D, Wang R, Hermjakob H. The Proteomics Identifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res. 2013 Jan 1;41(D1):D1063-9. doi: 10.1093/nar/gks1262. Epub 2012 Nov 29. PubMed PMID:23203882.

Additionally we'd like to ask you to also put this information in a much abridged form into the abstract itself, like this: "The data have been deposited to the ProteomeXchange with identifier <PXD000xxx>." See for example this Chromosome-Centric Human Proteome Project dataset and paper: http://www.ncbi.nlm.nih.gov/pubmed/?term=23312004 and other examples on PubMed. A PX Identifier in the abstract makes the dataset much more visible and accessible.

 

How to release a dataset publicly?

By default, your data will be made publicly available after your manuscript has been accepted, or when we have your instructions to do so. While we may also receive acceptance notifications from some journals, we would like to ask all submitters to kindly notify us separately. Otherwise, it can happen that we do not know that your manuscript has already been published. You can notify us in two ways:

A) Via the new PRIDE Archive web site (http://www.ebi.ac.uk/pride/archive) that is going to be available from the 6th of January, 2014. Once you have logged in with your user account at http://www.ebi.ac.uk/pride/archive/login you can click the green “Publish” buttons located next to your unpublished datasets. Here you can provide details for your dataset and submit a web form, please see Figure.

[Figure: PRIDE Public Release Web Form]

B) By contacting pride-support@ebi.ac.uk.

Upon making the project public, a project page will be released over at ProteomeCentral (http://proteomecentral.proteomexchange.org) and from a particular dataset page an FTP location will be available.

 

5. Quality Control - How can I annotate my data better?

 

What are the delta m/z values displayed in PRIDE Inspector Peptide View?

PRIDE Inspector calculates the ‘delta m/z’ values for the reported identified peptides as the difference between the experimentally detected m/z value (corresponding to the precursor peptide ion m/z) and the theoretical m/z value calculated from the mass of the identified peptide. If the resulting value is outside of a normal range (depending on the accuracy of the mass spectrometer used), this is a good indication that something has gone wrong while annotating the data. For instance, an outlier value can indicate that the precursor charge was wrongly assigned, or that the protein modifications were not reported correctly. The delta m/z values are displayed in the PRIDE Inspector ‘Peptide View’. Currently, delta m/z values outside of the −4.0 to +4.0 m/z units range are highlighted in red, while the normal values are highlighted in green. See details in this article.
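
The calculation can be illustrated with a minimal Python sketch. It is not PRIDE Inspector's own code: it assumes that the theoretical m/z is derived from the neutral monoisotopic peptide mass (including modifications) by adding one proton mass per charge, and the function names are hypothetical. The ±4.0 limit is the range quoted above.

```python
PROTON_MASS = 1.007276  # Da, rounded mass of a proton


def theoretical_mz(neutral_mass, charge):
    """m/z of a peptide ion with the given neutral mass and positive charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def delta_mz(experimental_mz, neutral_mass, charge):
    """Experimental precursor m/z minus theoretical m/z."""
    return experimental_mz - theoretical_mz(neutral_mass, charge)


def is_outlier(delta, limit=4.0):
    """True if the delta m/z lies outside the -limit to +limit range."""
    return abs(delta) > limit


# A peptide of neutral mass 1000.0 Da measured as a 3+ ion but annotated as 2+
# gives a large negative delta m/z and is flagged as an outlier.
measured = theoretical_mz(1000.0, 3)
print(is_outlier(delta_mz(measured, 1000.0, 2)))
```

In this sketch a wrongly assigned charge shifts the theoretical m/z by far more than the limit, while a correctly annotated peptide stays close to zero.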


 

What to do when you see delta m/z outliers in PRIDE Inspector?

PRIDE Inspector calculates the ‘delta m/z’ values for the reported identified peptides as the difference between the experimentally detected m/z value (corresponding to the precursor peptide ion m/z) and the theoretical m/z value calculated from the mass of the identified peptide. If the resulting value is outside of a normal range (depending on the accuracy of the mass spectrometer used), this is a good indication that something has gone wrong while annotating the data. See details here and in the following open access article.

If the distribution of all the ‘delta m/z’ values from the whole experiment (MS run) is taken into account, it can give a clear indication that something has gone wrong in the experimental set up, or that there has been a mistake in the reporting of the final results at a global level. This chart is available in the ‘Summary charts’ view in PRIDE Inspector.

So what can you do in case you see outliers:

First decide whether there is a consistent pattern, for instance if the same charge state yields the same consistent mass delta. Usually charge state misassignments result in much bigger 'delta m/z' values in the range of the (multiples of the) whole peptide masses while modification problems result in much lower 'delta m/z' values.
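
One simple way to look for such a pattern is to summarise the delta m/z values per charge state. The following is a minimal Python sketch; it assumes the charge and delta m/z of each peptide identification have been exported as pairs, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_delta_by_charge(psms):
    """psms: iterable of (charge, delta_mz) pairs. Returns {charge: median delta m/z}."""
    grouped = defaultdict(list)
    for charge, delta in psms:
        grouped[charge].append(delta)
    return {charge: median(values) for charge, values in sorted(grouped.items())}


# Invented example values: charge 2 is centred on zero, charge 3 is consistently shifted.
example = [(2, 0.01), (2, -0.02), (2, 0.0), (3, 5.3), (3, 5.4), (3, 5.2)]
print(median_delta_by_charge(example))
```

A median close to zero for one charge state and a clearly shifted median for another suggests a systematic problem tied to that charge state rather than random error.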

If it looks like a modification related problem, check whether you have left out some modifications and whether the modifications that you picked during the conversion process in PRIDE Converter 2 are the correct ones (for instance, do not use the upper, more generic terms in the ontology, since they do not have a mass delta). It is also possible that the modifications are not represented in PSI-MOD at all. In that case, those terms need to be added to the PSI-MOD ontology.

 

6. Recommendations for popular proteomics software tools and search engines - What fits my custom pipeline?

 

What kind of Scaffold files are recommended for a PX Submission?

Scaffold 4.0 can export mzIdentML 1.1 and we recommend uploading those files with the accompanying mgf peak list files that are referenced in them. This way an mzIdentML-based Complete Submission can be done.

For older Scaffold versions we recommend exporting the binary .sf3 result files as ProtXML files and using them for a Partial Submission as "search" files.

 

What kind of MaxQuant files are recommended for a PX Submission?

 

MaxQuant output is not supported by PRIDE Converter 2, so a Partial ProteomeXchange Submission is recommended, where search/identification files are uploaded alongside the machine raw files.

If you are using the latest version of MaxQuant (1.3.0.5), a txt folder is generated by default. You can simply zip this txt folder and upload it as a search/identification file.

If this is not practical, we recommend uploading the following text output files:

peptides.txt

modifiedPeptides.txt

proteinGroups.txt

msms.txt

parameters.txt

and your Experimental Design Template file saved as a tab separated file.
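
Collecting these files into one archive can be scripted. The following is a minimal Python sketch using the standard library; the function name and the paths in the usage comment are hypothetical, and the Experimental Design Template file has to be added separately because its file name is not fixed here.

```python
import zipfile
from pathlib import Path

# The text output files recommended above.
RECOMMENDED = (
    "peptides.txt",
    "modifiedPeptides.txt",
    "proteinGroups.txt",
    "msms.txt",
    "parameters.txt",
)


def zip_maxquant_tables(txt_dir, archive_path, names=RECOMMENDED):
    """Zip the recommended tables found in txt_dir; return the names that were missing."""
    txt_dir = Path(txt_dir)
    missing = [name for name in names if not (txt_dir / name).is_file()]
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            if name not in missing:
                # arcname stores the file without its directory path.
                archive.write(txt_dir / name, arcname=name)
    return missing


# Usage (paths are placeholders):
#   missing = zip_maxquant_tables("combined/txt", "maxquant_search.zip")
#   print("Missing files:", missing)
```

Any file names returned as missing should be checked before upload.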

 

What kind of Proteome Discoverer files are recommended for a PX Submission?

Technically, PRIDE Converter 2 does support Proteome Discoverer .msf files, but only from version 1.2 and up. If there are problems with the conversion, we recommend doing a Partial Submission instead of a Complete Submission. For a Partial Submission, please export human readable pep.xml files from the binary .msf files with Proteome Discoverer and upload those as search files as well.

 

What kind of ProteinPilot files are recommended for a PX Submission?

If you have used ProteinPilot for the search, the only submission option currently available is a Partial PX Submission. For ProteinPilot we strongly recommend providing human readable search/identification files besides the binary group file. Please export the group files into xml files using the following command line feature:

"Command Line Control and Open Results

To support users and third-party software vendors that want to integrate ProteinPilot™ Software, it is possible to script searches via command line and decrypt the .group file results into clear XML for full access to all the data it contains."

Here is a howto on the conversion process from one of the PX submitters:

1. Create a txt file in Notepad entitled say "group2XML_Example.bat.txt" and save it in the ProteinPilot folder (where the group2xml.exe is located).

2. Rename "group2XML_Example.bat.txt" to "group2XML_Example.bat", giving it a Windows batch file extension.

3. Open this batch file in Notepad and type in the following command line instructions:

group2XML.exe XML <full path to the .group file to be converted> <full path to the .xml file the .group file will be converted into>

for instance

group2XML.exe XML "C:\AB SCIEX\ProteinPilot Data\Results\Example.group" "C:\AB SCIEX\ProteinPilot Data\Results\Example.xml"

The command has the following argument structure: group2XML.exe <Type> <Result.group> <Output.file>

where:

- <Type> specifies the type of output

- <Result.group> is a .group file created by ProteinPilot Software

- <Output.file> is the name of the file to be created 

4. Save and close the file.

5. Double-click on the file to run the conversion.
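
When several .group files need converting, the same command can be generated for each file instead of being typed by hand. The following is a minimal Python sketch, not part of ProteinPilot. It assumes that group2XML.exe accepts the <Type> <Result.group> <Output.file> arguments described above; the helper names build_group2xml_command and convert_all are hypothetical.

```python
import subprocess
from pathlib import Path, PureWindowsPath


def build_group2xml_command(group_file, exe="group2XML.exe"):
    """Return the argument list for converting one .group file to XML.

    The output file is placed next to the input with an .xml extension,
    mirroring the Example.group -> Example.xml case shown above.
    """
    group_path = PureWindowsPath(str(group_file))
    xml_path = group_path.with_suffix(".xml")
    # Passing a list avoids quoting problems with spaces in the paths.
    return [exe, "XML", str(group_path), str(xml_path)]


def convert_all(results_dir, exe="group2XML.exe"):
    """Run the conversion for every .group file in results_dir (Windows only)."""
    for group_file in sorted(Path(results_dir).glob("*.group")):
        # Run this from the ProteinPilot folder, or pass the full path as exe.
        subprocess.run(build_group2xml_command(group_file, exe), check=True)
```

Calling convert_all on the results folder from the ProteinPilot installation directory would run one conversion per .group file and stop at the first failure.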

 

7. Recommendations for specific proteomics experiment types - What might fit my bleeding edge research?

 

I have mass spectrometry imaging data, what should I do?

MS imaging datasets are supported through dedicated file types, but they can only be submitted as Partial submissions.

These are the main points to consider for this type of submission:

(i) Additional file tags have been created: metadata information about the images (labeled as ‘MS_IMAGE_DATA’) and an optical image (labeled as ‘OPTICAL’).

(ii) It is mandatory to provide the MS raw data (called ‘RAW’).
- It is recommended to submit MS imaging data in imzML format as it offers the most flexible options for viewing, but proprietary data formats are also accepted.
- There is the possibility to submit two different mass spectral related files for one dataset, as required for several MS imaging data formats (e.g. imzML and Analyze). The mass spectral data file (*.ibd for imzML or *.img file in Analyze format) must be labeled as ‘RAW’. The file that contains metadata (such as pixel dimensions and additional information) must be labeled as ‘MS_IMAGE_DATA’ (e.g. *.imzml file for imzML or *.hdr file for Analyze).
- If an *.ibd file (imzML format) is submitted as ‘RAW’, an ‘MS_IMAGE_DATA’ file (*.imzml) is mandatory.
- However, in the case of proprietary ‘RAW’ formats that consist of only one file, an ‘MS_IMAGE_DATA’ file is not required.

(iii) In addition, PRIDE requires a mandatory ‘SEARCH’ file for Partial submissions, which corresponds to the processed results. There is currently no strict definition for the format of this mandatory file, but it should contain a list of m/z values, names of (tentatively) identified compounds and additional information that were used to generate the MS images in the published work.

(iv) The inclusion of an optical image (‘OPTICAL’) of the measured sample is also supported, which can allow validation and/or interpretation. The ‘OPTICAL’ file could contain a photograph of the imaged sample or an adjacent section that shows comparable spatial features. Native samples, classical histological techniques (H&E, toluidine) or immunohistochemistry staining (antibody staining) can be provided for this purpose.
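
The tagging rules in point (ii) can be checked before starting a submission. The following is a minimal Python sketch that covers only the imzML and Analyze extensions named above; the function names suggest_tag and missing_imzml are hypothetical, and proprietary formats are deliberately left unmapped.

```python
from pathlib import PurePath

# Extension-to-tag mapping taken from the imzML and Analyze examples above.
TAG_BY_EXTENSION = {
    ".ibd": "RAW",              # imzML mass spectral data
    ".img": "RAW",              # Analyze mass spectral data
    ".imzml": "MS_IMAGE_DATA",  # imzML metadata
    ".hdr": "MS_IMAGE_DATA",    # Analyze metadata
}


def suggest_tag(filename):
    """Return the suggested file tag, or None if the extension is not listed."""
    return TAG_BY_EXTENSION.get(PurePath(filename).suffix.lower())


def missing_imzml(filenames):
    """Return True if an .ibd file is present without any .imzml file."""
    suffixes = {PurePath(name).suffix.lower() for name in filenames}
    return ".ibd" in suffixes and ".imzml" not in suffixes
```

A file for which suggest_tag returns None (for instance a single-file proprietary raw format, the ‘SEARCH’ file or the ‘OPTICAL’ image) still has to be labelled by hand.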

I have multi-omics data uploaded to different databases, what should I do?

A new, exciting wave of research investigates the same samples with multiple 'omics' technologies, for instance DNA and RNA sequencing, mass spectrometry and NMR. In this way, transcriptomics, proteomics and metabolomics datasets, for instance, are combined to support one study. In such 'multi-omics' or 'cross-omics' studies, the different datasets are deposited into different repositories. When uploading a proteomics dataset that belongs to a multi-omics study to PX, it is strongly recommended to provide the database identifiers or accessions of the other 'omics' datasets (for instance transcriptomics data uploaded to ArrayExpress, metabolomics data uploaded to MetaboLights) in the experiment description, or at least to indicate that those datasets are being prepared for submission to a particular database. In future releases of the PX Submission Tool we are going to provide a separate metadata field for cross-references to other dataset accessions.

How to handle newly discovered, unreported modifications?

We recommend using the PSI-MOD ontology to report protein modifications in PRIDE XML files as result files for Complete PX Submissions. If the protein modifications you have used or discovered during your study have not yet been reported in the ontology, we recommend adding the modification to the ontology first. For that, we need the proposed name, delta mass and/or a reference or web URL (if it is a commercial product).

 

8. Related journal guidelines - What are the known requirements of different proteomics journals?

Does my PX dataset fulfil the requirements of the journal Molecular and Cellular Proteomics?

The authoritative and up-to-date source for the precise MCP requirements is the journal website; the following information is given for convenience:

Data deposition: The journal Molecular and Cellular Proteomics recommends that all the MS data associated with a manuscript be submitted to a public data repository such as PRIDE/ProteomeXchange. All types of PX submissions provide stable data availability.

Annotated spectra: MCP requires researchers to provide annotated spectra in several cases such as those proteins identified by one or two peptides, or for all peptide identifications containing post-translational modifications (PTMs). In the case of PX submissions:

- Complete submissions meet this requirement. It is possible to see annotated spectra in PRIDE Inspector.

- In the case of Partial submissions, this requirement is met if a free spectral viewer is available for the search engine output files. For instance, for Scaffold output files, there is a free viewer available. There are other cases when this is possible. We will update this page as other tools become available for particular cases.

Does my PX dataset fulfil the requirements of the journal Proteomics?

The authoritative and up-to-date source for the precise Proteomics requirements is the journal website; the following information is given for convenience:

The journal Proteomics requires data availability in the case of the "Dataset Briefs" articles and recommends it for the other article types. All PX submissions provide stable data availability to fulfil this requirement.

 

9. Troubleshooting

PX Submission Tool does not launch

  1. Ensure Java is installed on your computer: http://www.java.com
  2. Download the latest version of the PX Submission Tool: http://www.proteomexchange.org/submission
  3. Extract the PX Submission Tool to a directory.
  4. From the tool's directory, double-click the jar file, e.g. px-submission-tool-2.0.1.jar
  5. If 4.) doesn't work, open a command line to the tool's directory and run the command: java -jar <px-submission-tool-X.Y.Z.jar>, e.g. java -jar px-submission-tool-2.0.1.jar
  6. If this still doesn't work, please contact PRIDE directly: pride-support@ebi.ac.uk and attach the 'px_submission.log' file found in the 'log' sub-directory.

PX Submission tool launches, but I cannot login / make a submission

Are you connected to the Internet via a proxy? If so, open the 'config.props' file in the 'config' sub-directory with a plain text editor and make the changes described in the instructions within the file.

 

How to switch the PX Submission Tool from Aspera to FTP?

Should there be problems with the Aspera upload, submitters can switch to the slower FTP file transfer protocol. To do so, change the line ‘px.upload.protocol = aspera’ to ‘px.upload.protocol = ftp’ in the plain text file config.props, located in the ‘config’ subdirectory of the PX Submission Tool’s working directory.
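
The same change can be scripted, for example when the tool is installed on several machines. The following is a minimal Python sketch, assuming the file contains a line of the form ‘px.upload.protocol = aspera’ as quoted above; the function names and the default path are hypothetical, and it is worth keeping a backup copy of config.props before running it.

```python
import re
from pathlib import Path

# Matches the key and the '=' sign, leaving the value to be replaced.
_PROTOCOL_LINE = re.compile(
    r"^([ \t]*px\.upload\.protocol[ \t]*=[ \t]*)\S+", re.MULTILINE
)


def switch_upload_protocol(config_text, protocol="ftp"):
    """Return config_text with the px.upload.protocol value replaced."""
    new_text, count = _PROTOCOL_LINE.subn(
        lambda match: match.group(1) + protocol, config_text
    )
    if count == 0:
        raise ValueError("px.upload.protocol line not found")
    return new_text


def update_config_file(path="config/config.props", protocol="ftp"):
    """Rewrite the config file in place with the new protocol value."""
    config_path = Path(path)
    config_path.write_text(switch_upload_protocol(config_path.read_text(), protocol))
```

All other lines of the file are left untouched, and the function raises an error rather than silently doing nothing if the protocol line is missing.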

PRIDE Converter 2 Tool does not launch or cannot convert my files

  1. Ensure your operating system is 64 bit, and you have 64 bit Java installed on your computer: http://www.java.com
  2. Download the latest version of the PRIDE Converter 2 Tool: https://code.google.com/p/pride-converter-2/
  3. Extract the PRIDE Converter 2 Tool to a directory.
  4. From the tool's directory, double-click the jar file, e.g. pride-converter-2.0-SNAPSHOT.jar
  5. If 4.) doesn't work, open a command line to the tool's directory and run the command: java -jar <pride-converter-X.Y-SNAPSHOT.jar>, e.g. java -jar pride-converter-2.0-SNAPSHOT.jar
  6. Increase the allocated memory for the tool. With a plain text editor open the 'converter.properties' file then amend the line:

jvm.args=-Xms64M -Xmx1024M
to a higher allowance, e.g.: jvm.args=-Xms128M -Xmx3G

PRIDE Converter 2 Tool still has problems converting my files

Please contact PRIDE directly: pride-support@ebi.ac.uk. It might be the case that your files are simply incompatible with the PRIDE Converter 2 tool.

How do I update my PRIDE account information?

Log into the PRIDE website with your user account details:

www.ebi.ac.uk/pride/archive/login

And then click 'edit profile' to change your details and/or password.

PRIDE Inspector Java Web Start does not launch

Due to Oracle's recent Java update, you may experience security related problems with Java Web Start. Please add "http://www.ebi.ac.uk" to the URLs listed in the Security tab of your Java Control Panel.

PRIDE Inspector Tool does not launch

  1. Ensure Java is installed on your computer: http://www.java.com
  2. Download the latest version of the PRIDE Inspector Tool: https://code.google.com/p/pride-toolsuite/wiki/PRIDEInspector
  3. Extract the PRIDE Inspector Tool to a directory.
  4. From the tool's directory, double-click the jar file, e.g. pride-inspector-2.0.0.jar
  5. If 4.) doesn't work, open a command line to the tool's directory and run the command: java -jar <pride-inspector-X.Y.Z.jar>, e.g. java -jar pride-inspector-2.0.0.jar
  6. If this still doesn't work, please contact PRIDE directly: pride-support@ebi.ac.uk and attach the 'pride_inspector.log' file found in the 'log' sub-directory.

Aspera does not connect

Ensure that the following port allows outbound traffic on your router, firewall, or network: TCP 33001.

Aspera connects, but transfers 0% of my files

Ensure that the following port allows outbound traffic on your router, firewall, or network: UDP 33001.
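
Whether outbound TCP traffic on port 33001 is allowed can be tested from the affected machine. The following is a minimal Python sketch; the host name to test against is not given in this FAQ, so it has to be supplied by the user, and the function name tcp_port_reachable is hypothetical. UDP is connectionless, so a similar check cannot confirm that UDP 33001 is open; that still has to be verified in the firewall or router settings.

```python
import socket


def tcp_port_reachable(host, port=33001, timeout=5.0):
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A result of False does not prove that a firewall is responsible, since the server itself may be unavailable, but a result of True rules out a blocked TCP port.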

I cannot log in with the PX Submission tool

Please confirm that you can log into the main PRIDE website with your account:
http://www.ebi.ac.uk/pride/archive/login
If the problem persists, try changing your password and then logging in again.

All my Java applications don't launch! I'm on a Mac...

You may need to run your Java application by right-clicking the .jar file and choosing "open", or by updating your Gatekeeper security settings to allow applications downloaded from 'anywhere' to be run. See the Apple Support website for more details.