Proteomics Identifications Database

Populating the cvParam and userParam elements in PRIDE 2.1 XML

Introduction

The PRIDE 2.1 XML schema incorporates the PSI mzData schema to hold information describing the sample and instrumentation and to hold processed peak list data.   This schema makes heavy use of <cvParam ...> and <userParam ...> elements to allow additional, structured annotation to be added.

This mechanism has been extended for use in the wider PRIDE 2.1 XML schema.

You may wish to view the XML Spy PRIDE schema documentation, which describes the schema in detail and includes detailed descriptions of the cvParam element and the userParam element.

The Elements Described

The simpler of these two elements, userParam, is used to hold any additional annotation that is not related to any available controlled vocabulary, ontology or database. Examples of possible use:

<userParam name="MS run time (mins)" value="350"/>

<userParam name="Failure occurred during MS run - first sample lost."/>

Note that the name attribute is mandatory; the value attribute is optional.
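The rule above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. This is only an illustrative helper, not part of any PRIDE tooling: the function name make_user_param is an assumption introduced here.

```python
import xml.etree.ElementTree as ET

def make_user_param(name, value=None):
    """Build a userParam element: name is mandatory, value is optional.

    make_user_param is a hypothetical helper, not part of any PRIDE library.
    """
    if not name:
        raise ValueError("userParam requires a non-empty name attribute")
    attrs = {"name": name}
    if value is not None:
        attrs["value"] = str(value)  # XML attribute values are always strings
    return ET.Element("userParam", attrs)

# The two examples given above, rebuilt programmatically.
run_time = make_user_param("MS run time (mins)", 350)
note = make_user_param("Failure occurred during MS run - first sample lost.")

print(ET.tostring(run_time, encoding="unicode"))
print(ET.tostring(note, encoding="unicode"))
```

The value attribute is simply omitted when no value is supplied, which matches the second example.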

The use of the userParam element is discouraged wherever the cvParam element can be used instead, because cvParam allows for searchable, queryable annotation based upon a CV.  Examples of possible cvParam use:

<cvParam cvLabel="NEWT" accession="9606" name="HUMAN"/>

<cvParam cvLabel="PSI" accession="PSI:1000008" name="IonizationType" value="ESI"/>

Note that the cvLabel, accession and name attributes are mandatory; the value attribute is optional.
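Before submitting, it can be helpful to check that every cvParam carries its mandatory attributes. The following is a minimal sketch of such a check, assuming each element is available as an XML fragment; the function name missing_cv_param_attrs is hypothetical, and the check does not replace validation against the schema.

```python
import xml.etree.ElementTree as ET

# Mandatory attributes as stated in the text; value is optional.
REQUIRED_CV_PARAM_ATTRS = ("cvLabel", "accession", "name")

def missing_cv_param_attrs(xml_fragment):
    """Return the mandatory cvParam attributes absent from the fragment.

    Hypothetical helper for illustration only.
    """
    element = ET.fromstring(xml_fragment)
    if element.tag != "cvParam":
        raise ValueError(f"expected a cvParam element, got {element.tag!r}")
    return [attr for attr in REQUIRED_CV_PARAM_ATTRS if attr not in element.attrib]

complete = '<cvParam cvLabel="NEWT" accession="9606" name="HUMAN"/>'
incomplete = '<cvParam cvLabel="PSI" accession="PSI:1000008"/>'  # name left out on purpose

print(missing_cv_param_attrs(complete))    # []
print(missing_cv_param_attrs(incomplete))  # ['name']
```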

In both the mzData and PRIDE 2.1 XML schemata a cvLookup element is provided to reference any CVs or ontologies that are used. If the two example cvParam entries given above are included in the XML file, then the following cvLookup entries should also be present:

<cvLookup cvLabel="NEWT" version="2005-06-30" address="http://www.ebi.ac.uk/newt"/>

<cvLookup cvLabel="PSI" version="0.0" fullName="The PSI Ontology" address="http://psidev.sourceforge.net/ontology/"/>

Note that the cvLabel, version and address attributes are mandatory; the fullName attribute is optional.  Following submission, this data will be curated to ensure that CV references are kept consistent across PRIDE.
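The pairing between cvParam and cvLookup entries can be checked mechanically. The sketch below lists every cvLabel that is used by a cvParam but has no matching cvLookup. The wrapper element named example and the function name undeclared_cv_labels are assumptions made for illustration; a real file would use the element nesting defined by the schema.

```python
import xml.etree.ElementTree as ET

def undeclared_cv_labels(root):
    """Return cvLabel values used by cvParam elements that lack a cvLookup.

    Hypothetical helper for illustration only.
    """
    declared = {lookup.get("cvLabel") for lookup in root.iter("cvLookup")}
    # Skip cvParam elements with no cvLabel at all; they are a separate problem.
    used = {param.get("cvLabel") for param in root.iter("cvParam") if param.get("cvLabel")}
    return sorted(used - declared)

# 'example' is a stand-in wrapper, not a real PRIDE element.
doc = ET.fromstring("""
<example>
  <cvLookup cvLabel="NEWT" version="2005-06-30" address="http://www.ebi.ac.uk/newt"/>
  <cvParam cvLabel="NEWT" accession="9606" name="HUMAN"/>
  <cvParam cvLabel="PSI" accession="PSI:1000008" name="IonizationType" value="ESI"/>
</example>
""")

print(undeclared_cv_labels(doc))  # ['PSI']
```

Here the PSI cvLookup entry has been left out deliberately, so PSI is reported as undeclared.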


cvLabel entries

To ensure consistency across PRIDE, the PRIDE controlled vocabulary (OBO format) includes terms defining possible entries in the cvLabel field of both the cvLookup and cvParam elements.

These terms can all be found under the root term PRIDE:0000028 'CvLabel' and include terms such as GO, ISBN, BTO, CL, DOID, NEWT, PRIDE, PSI, PubMed, RESID and UNIMOD.

Please use the terms themselves (not the CV accession) in your cvLabel attributes.

Please also note that NEWT should be used for both NEWT and NCBI taxonomy terms, as the NCBI taxonomy is a subset of NEWT.
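The two rules above (use the term rather than its accession, and use NEWT for NCBI taxonomy terms) could be applied in code roughly as follows. This is a sketch under stated assumptions: the alias spellings in NCBI_TAXONOMY_ALIASES are invented for illustration, the function name normalise_cv_label is hypothetical, and the label set holds only the labels named in this document, not the full list under PRIDE:0000028.

```python
# Labels named in this document; the authoritative list is the set of
# child terms of PRIDE:0000028 in the PRIDE CV, so this set is partial.
KNOWN_CV_LABELS = {
    "GO", "ISBN", "BTO", "CL", "DOID", "NEWT",
    "PRIDE", "PSI", "PubMed", "RESID", "UNIMOD",
}

# Hypothetical spellings a submitter might use for the NCBI taxonomy.
NCBI_TAXONOMY_ALIASES = {"NCBI", "NCBITaxon", "NCBI taxonomy"}

def normalise_cv_label(label):
    """Return the cvLabel to write, folding NCBI taxonomy aliases into NEWT."""
    if label in NCBI_TAXONOMY_ALIASES:
        return "NEWT"
    if ":" in label:
        # Looks like an accession such as PRIDE:0000028 rather than a term.
        raise ValueError(f"{label!r} looks like a CV accession; use the term itself")
    return label

for candidate in ("NCBITaxon", "GO", "MeSH"):
    label = normalise_cv_label(candidate)
    status = "listed" if label in KNOWN_CV_LABELS else "not in the partial list"
    print(f"{candidate} -> {label} ({status})")
```

A label missing from the partial set is reported rather than rejected, because the complete set of permitted labels is defined by the PRIDE CV itself.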


Context of the cvParam and userParam elements and recommended CVs or ontologies

cvParam and userParam elements appear at various points in both the mzData and PRIDE 2.1 XML schemata.  While the contents of the userParam elements are effectively unrestricted, the cvParam elements are populated in relation to a suitable CV or ontology.  In addition, the set of terms that may be used in a particular context is logically restricted.

Suitable controlled vocabularies and ontologies and their context in PRIDE 2.1 XML are described below.


The PRIDE Controlled Vocabulary

The PRIDE controlled vocabulary has been developed to provide consistent annotation for any cvParam element in PRIDE XML that is not covered by an existing, external CV.  The PRIDE CV is structured with root terms that indicate the context of each child term in PRIDE XML.  At the time of writing, these contextualizing root terms are as follows:

Term ID | Term | Context in PRIDE 2.0 XML
PRIDE:0000006 | ExperimentAdditionalParameter | ExperimentCollection/Experiment/additional/cvParam
PRIDE:0000004 | IdentificationAdditionalParameter | ExperimentCollection/Experiment/Identification/additional/cvParam
PRIDE:0000002 | ModificationItemAdditionalParameter | ExperimentCollection/Experiment/Identification/PeptideItem/ModificationItem/additional/cvParam
PRIDE:0000003 | PeptideItemAdditionalParameter | ExperimentCollection/Experiment/Identification/PeptideItem/additional/cvParam
PRIDE:0000001 | ProtocolStepDescriptionAdditionalParameter | ExperimentCollection/Experiment/Protocol/ProtocolSteps/StepDescription
PRIDE:0000000 | ReferenceAdditionalParameter | ExperimentCollection/Experiment/Reference/additional/cvParam
PRIDE:0000017 | SampleDescriptionAdditionalParameter | ExperimentCollection/Experiment/mzData/description/admin/sampleDescription/cvParam

Some (but not all) of these root terms currently have child terms that may be used as cvParam terms to annotate the relevant aspect of the PRIDE entry.

External CVs, Ontologies and Databases for Annotating PRIDE Data

There are many mature and widely used ontologies, CVs and databases that are relevant to the identification data that can be held in PRIDE.  Recommended external ontologies and their context and use in PRIDE are described in the table below:

Context in PRIDE 2.0 XML | Ontology / CV / Database | cvLabel | Purpose
ExperimentCollection/Experiment/Reference/additional/cvParam | PubMed database | PubMed | Provides a cross reference to the citation.
ExperimentCollection/Experiment/mzData/description/admin/sampleDescription/cvParam | NCBI taxonomy / NEWT | NEWT | Provides an unambiguous identification of the species.
ExperimentCollection/Experiment/mzData/description/admin/sampleDescription/cvParam | MeSH (Medical Subject Headings) | MeSH | Details of the gross anatomical term related to the sample. Details of any disease associated with the organism from which the sample is derived.**
ExperimentCollection/Experiment/mzData/description/admin/sampleDescription/cvParam | GO | GO | Details of the sub-cellular location of the sample, if appropriate.
ExperimentCollection/Experiment/Identification/PeptideItem/ModificationItem/additional/cvParam | RESID | RESID | Database of naturally occurring post translational modifications.
ExperimentCollection/Experiment/Identification/PeptideItem/ModificationItem/additional/cvParam | UNIMOD | UNIMOD | Database of modifications to amino acids, including artefactual modifications arising from mass spectrometry.
ExperimentCollection/Experiment/mzData and child elements | PSI Ontology | PSI | Ontology designed specifically to annotate mzData.  Covers many aspects of sample, instrumentation and spectra.

** If you have screened the organism(s) that are the source of the sample and wish to describe them as free of disease, you can use the PRIDE ontology term PRIDE:0000018 'DiseaseFree' in the context of the ExperimentCollection/Experiment/mzData/description/admin/sampleDescription/cvParam element.
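As an illustration of the sampleDescription context, the sketch below builds a fragment that identifies the species through NEWT and marks the source organism as disease free. It shows only the sampleDescription element with its cvParam children; the enclosing mzData structure and any other content the schema requires are omitted, so this is not a complete or validated PRIDE entry.

```python
import xml.etree.ElementTree as ET

# Fragment only: the parent elements (mzData/description/admin) are omitted.
sample = ET.Element("sampleDescription")

# Species, using the NEWT example given earlier in this document.
ET.SubElement(sample, "cvParam", {
    "cvLabel": "NEWT",
    "accession": "9606",
    "name": "HUMAN",
})

# Source organism screened and found free of disease (PRIDE:0000018).
ET.SubElement(sample, "cvParam", {
    "cvLabel": "PRIDE",
    "accession": "PRIDE:0000018",
    "name": "DiseaseFree",
})

print(ET.tostring(sample, encoding="unicode"))
```

Both cvLabel values used here (NEWT and PRIDE) would also need corresponding cvLookup entries in the file.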

A new resource has been deployed at the EBI to facilitate the use of ontologies and controlled vocabularies. The Ontology Lookup Service offers a simple and intuitive interface to browse and search ontologies. We encourage users who wish to submit data to make use of it while constructing the PRIDE XML files they wish to submit.


Stretching the use of cvParam: Books and Literature

To make a reference to published literature, it is recommended that you include the PubMed id as in the following example:

<cvParam cvLabel="PubMed" accession="108562573" name="-"/>

Please note: because the 'name' attribute is mandatory, it has been set to a hyphen in the example above.

If you wish to reference a book, you can achieve this as follows:

<cvParam cvLabel="ISBN" accession="0-470-85681-5" name="-"/>
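The two reference styles above follow the same pattern, so they could be produced by one small helper. This is a minimal sketch: the function name make_reference_param is an assumption introduced here, and the identifiers are simply the ones used in the examples above.

```python
import xml.etree.ElementTree as ET

REFERENCE_CV_LABELS = ("PubMed", "ISBN")

def make_reference_param(cv_label, identifier):
    """Build a cvParam that cites a PubMed record or a book.

    Hypothetical helper; name is set to "-" as in the examples above,
    since the attribute is mandatory.
    """
    if cv_label not in REFERENCE_CV_LABELS:
        raise ValueError(f"expected one of {REFERENCE_CV_LABELS}, got {cv_label!r}")
    return ET.Element("cvParam", {
        "cvLabel": cv_label,
        "accession": str(identifier),
        "name": "-",
    })

article = make_reference_param("PubMed", 108562573)
book = make_reference_param("ISBN", "0-470-85681-5")

print(ET.tostring(article, encoding="unicode"))
print(ET.tostring(book, encoding="unicode"))
```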


