PSI-MI and PSI-PAR validation

The validator can validate files against two different data models:

Molecular interactions (PSI-MI ontology)

The validator executes a set of rules based on the PSI-MI ontology to check your file(s). It can check five levels, each higher level including the checks of the levels below it. The 'Customized rules' level is slightly different: it includes only the first two levels (file syntax and controlled vocabulary usage).

  • File syntax
  • Usage of controlled-vocabulary defined in the PSI-MI ontology
  • PSI-MI basic checks
  • MIMIx-compliance
  • IMEx-compliance

Protein Affinity Reagents (PSI-PAR ontology)

The validator executes a set of rules based on the PSI-PAR ontology to check your file(s). It can check two levels, the higher level including the checks of the lower one.

  • File syntax
  • Usage of controlled-vocabulary defined in the PSI-PAR ontology

The 'XML syntax' scope is not specific to the chosen data model because both data models use the same XML schema: PSI-XML 2.5.

Validation Scope

The validation levels described below define the stringency of the validation rules applied to a dataset. As the image below illustrates, the currently supported levels are built on top of one another. The exception is the 'Customized rules' scope, which includes XML syntax and controlled vocabulary usage but whose object rules can be a mix of PSI-MI, MIMIx and IMEx rules.

PSI-MI Validation Scopes

WARNING: All the MIMIx rules are used in the IMEx scope except for one: the taxId '-5' for 'in silico' is a valid taxId for MIMIx but not for IMEx, because IMEx relies on experimental molecular interactions.
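The layering of scopes can be sketched in a few lines of Python. This is only an illustration of the description above, not the validator's implementation: the names `PSI_MI_SCOPES`, `included_scopes` and `in_silico_taxid_allowed` are hypothetical.

```python
# Ordered PSI-MI scopes, from least to most stringent (names taken from the text).
PSI_MI_SCOPES = [
    "File syntax",
    "Controlled vocabulary usage",
    "PSI-MI basic checks",
    "MIMIx",
    "IMEx",
]


def included_scopes(scope):
    """Return the given scope together with every lower scope it builds on."""
    if scope == "Customized rules":
        # Only the first two levels are always included; the object rules
        # are chosen by the user.
        return PSI_MI_SCOPES[:2]
    index = PSI_MI_SCOPES.index(scope)
    return PSI_MI_SCOPES[: index + 1]


def in_silico_taxid_allowed(scope):
    """taxId -5 ('in silico') is accepted under MIMIx but rejected under IMEx."""
    return scope != "IMEx"
```

For example, `included_scopes("MIMIx")` returns the first four scopes, while `included_scopes("Customized rules")` returns only the two syntax and vocabulary levels.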

PSI-PAR Validation Scopes

File syntax

If the file is a PSI-XML 2.5 file, checks that the document to validate is properly defined according to the following XML schemas:

You can also look at the auto-generated documentation of the schema 2.5.4.

If the file is a MITAB file, checks that the document to validate is properly defined according to the following MITAB specifications:

Note: If your document does not pass this validation stage (that is, it reports FATAL messages), none of the levels below can be executed.
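The gating behaviour described in the note can be sketched as follows. The function names, the message dictionaries and the two placeholder checks are assumptions made for illustration; the real validator's API is not shown in this document.

```python
def syntax_check(document):
    """Placeholder syntax level: report FATAL if the text does not look like XML."""
    if not document.strip().startswith("<"):
        return [{"severity": "FATAL", "text": "not an XML document"}]
    return []


def vocabulary_check(document):
    """Placeholder higher level that always reports one warning."""
    return [{"severity": "WARN", "text": "placeholder vocabulary message"}]


LEVELS = [
    ("File syntax", syntax_check),
    ("Controlled vocabulary usage", vocabulary_check),
]


def run_levels(levels, document):
    """Run levels in order; skip all later levels if the first reports FATAL."""
    messages = []
    for position, (_name, check) in enumerate(levels):
        level_messages = check(document)
        messages.extend(level_messages)
        fatal = any(m["severity"] == "FATAL" for m in level_messages)
        if position == 0 and fatal:
            break  # file syntax failed, so no further level is executed
    return messages
```

With these placeholders, `run_levels(LEVELS, "plain text")` yields only the FATAL syntax message, because the vocabulary level is never reached.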

Controlled Vocabulary Usage

This ensures that the use of controlled vocabularies is correct. That is, for each controlled vocabulary place-holder, it checks that one uses the appropriate set of terms as defined in the PSI-MI ontology.

The rule currently looks at the following elements (the ontology used is given in parentheses):

  • Interaction detection method (PSI-MI)
  • Interaction type (PSI-MI)
  • Participant identification methods (PSI-MI)
  • Participant's experimental role (PSI-MI)
  • Participant's biological role (PSI-MI)
  • Participant's feature detection method (PSI-MI)
  • Participant's feature type (PSI-MI or PSI-MOD)
  • Interactor's type (PSI-MI)
  • Participant's feature range status (PSI-MI)
  • Participant's experimental preparation (PSI-MI)
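One way to picture a controlled-vocabulary check is as a test that a term sits under the expected root term in the ontology. The sketch below uses a toy parent map: only MI:0002 (participant identification method) appears in this document, and the `TOY:*` identifiers and the function name are made up for illustration.

```python
# Toy ontology excerpt: child term -> parent term.
# Only MI:0002 comes from the text; the TOY:* identifiers are invented.
PARENTS = {
    "TOY:0001": "MI:0002",
    "TOY:0002": "TOY:0001",
    "TOY:0003": "TOY:9999",
}


def is_descendant_or_self(term, root):
    """Walk up the parent chain and report whether `root` is reached."""
    seen = set()
    while term is not None and term not in seen:
        if term == root:
            return True
        seen.add(term)  # guard against accidental cycles in the toy data
        term = PARENTS.get(term)
    return False
```

Here `is_descendant_or_self("TOY:0002", "MI:0002")` is true, while `"TOY:0003"` is rejected because its ancestry never reaches the expected root.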

PSI-MI basic checks

This scope includes a few basic rules which are not specified in MIMIx or IMEx but are important for the consistency of the file. It checks the consistency of the database cross references using the PSI-MI ontology, the consistency of the participant's features as specified in the PSI-XML 2.5 documentation, and the interactor sequences (only for DNA, RNA or protein sequences).
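As a rough illustration of the cross reference consistency check, an accession can be matched against a per-database regular expression. The patterns below are simplified examples chosen for this sketch (the UniProtKB one follows the accession format published by UniProt, without isoform suffixes); they are not the expressions the validator actually reads from the PSI-MI ontology, and `accession_matches` is a hypothetical name.

```python
import re

# Illustrative patterns only; the validator's own expressions may differ.
ACCESSION_PATTERNS = {
    "pubmed": r"[0-9]+",
    "uniprotkb": (
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
}


def accession_matches(database, accession):
    """Return True/False for known databases, None when no pattern is known."""
    pattern = ACCESSION_PATTERNS.get(database)
    if pattern is None:
        return None
    # fullmatch requires the whole accession to fit the pattern
    return re.fullmatch(pattern, accession) is not None
```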

Description of the rules executed with the scope 'PSI-MI'

Scope | Rule | Description
PSI-MI | Alias syntax check | Checks that each alias has a valid name and that, if it has an MI alias type, the type is a valid MI term.
PSI-MI | Interaction and Participant's Confidence syntax check | Checks that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, checks that it is a valid MI term.
PSI-MI | Experiment's publication check | Checks that each experiment has a publication.
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features.
PSI-MI | Interaction's experiment check | Checks that each interaction evidence has an experiment.
PSI-MI | Interactor name's check | Checks that each interactor has a valid name.
PSI-MI | Interactor sequence check | Checks that interactors of type biopolymer have a valid sequence.
PSI-MI | Parameter's syntax check | Checks that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If an MI identifier is provided for the parameter type or unit, checks that it is a valid MI identifier.
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession that matches the regular expression of the database.
PSI-MI | Database cross reference syntax check | Checks that each database cross reference has a non-empty database and a non-empty database accession. If MI identifiers are provided for the database and qualifier, checks that they are valid MI identifiers.
PSI-MI | Controlled vocabulary name's check | Checks that each controlled vocabulary term has a valid name.
PSI-MI | Checksum syntax check | Checks that each checksum has a valid name and that, if it has an MI method, the method is a valid MI term.
PSI-MI | MI file grammar check | Listens to MI file parsing events and reports grammar errors.
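The bounds part of the feature range consistency check can be sketched as below. This is a minimal illustration, assuming a range is a pair of 1-based positions on a sequence of known length; reading "not overlapping" as "the start must not come after the end" is an assumption of this sketch, and `range_problems` is a hypothetical name.

```python
def range_problems(start, end, sequence_length):
    """Return a list of human-readable problems for one feature range."""
    problems = []
    for label, position in (("start", start), ("end", end)):
        if position < 1:
            problems.append(f"{label} position {position} is below 1")
        elif position > sequence_length:
            problems.append(
                f"{label} position {position} is beyond the sequence length"
            )
    # Assumption: a start placed after the end counts as an invalid range.
    if start > end:
        problems.append("start position is after end position")
    return problems
```

A range of 5 to 20 on a 100-residue sequence produces no problems, whereas 0 to 120 produces one message for each out-of-bounds position.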

MIMIx guidelines

This checks that the MIMIx guidelines (Minimum Information required for reporting a Molecular Interaction eXperiment) are respected.

Description of the rules executed with the scope 'MIMIx'

Scope | Rule | Description
MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism.
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly.
MIMIx | Interactor database reference check | Checks that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided.
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism.
MIMIx | Organism Check | Checks that each organism has a valid NCBI taxid.
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002).
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title).
PSI-MI | Alias syntax check | Checks that each alias has a valid name and that, if it has an MI alias type, the type is a valid MI term.
PSI-MI | Interaction and Participant's Confidence syntax check | Checks that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, checks that it is a valid MI term.
PSI-MI | Experiment's publication check | Checks that each experiment has a publication.
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features.
PSI-MI | Interaction's experiment check | Checks that each interaction evidence has an experiment.
PSI-MI | Interactor sequence check | Checks that interactors of type biopolymer have a valid sequence.
PSI-MI | Interactor name's check | Checks that each interactor has a valid name.
PSI-MI | Parameter's syntax check | Checks that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If an MI identifier is provided for the parameter type or unit, checks that it is a valid MI identifier.
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession that matches the regular expression of the database.
PSI-MI | Database cross reference syntax check | Checks that each database cross reference has a non-empty database and a non-empty database accession. If MI identifiers are provided for the database and qualifier, checks that they are valid MI identifiers.
PSI-MI | Controlled vocabulary name's check | Checks that each controlled vocabulary term has a valid name.
PSI-MI | Checksum syntax check | Checks that each checksum has a valid name and that, if it has an MI method, the method is a valid MI term.
PSI-MI | MI file grammar check | Listens to MI file parsing events and reports grammar errors.
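The logic of the MIMIx experiment bibliographic reference check (a pubmed or doi reference, or else complete publication details) can be sketched as follows. The dictionary layout and the key names `xrefs`, `contact_email`, `author_list` and `publication_title` are assumptions made for this example, not the validator's data model.

```python
def has_bibliographic_reference(experiment):
    """True if the experiment has a pubmed/doi xref or full publication details."""
    xref_databases = {xref["db"] for xref in experiment.get("xrefs", [])}
    if xref_databases & {"pubmed", "doi"}:
        return True
    # Otherwise all three publication details must be present and non-empty.
    required_details = ("contact_email", "author_list", "publication_title")
    return all(experiment.get(key) for key in required_details)
```

An experiment with only an author list and a title fails this sketch, because the contact email is missing and no pubmed or doi reference is present.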

IMEx Curation standards

This ensures that the curation standards agreed by the International Molecular Exchange Consortium (IMEx) are respected.

The curation manual can be accessed here.

Description of the rules executed with the scope 'IMEx'

Scope | Rule | Description
IMEx | Dependency Check : Experiment Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Dependency Check : Participant's feature type and participant's feature range status | Checks that each association between participant's feature type and participant's feature range status is valid and respects IMEx curation rules.
IMEx | Dependency Check : Participant's feature type and feature detection method | Checks that each association between participant's feature type and feature detection method is valid and respects IMEx curation rules.
IMEx | Participant's feature Type Check | Checks that each participant's feature has a feature type with a valid PSI-MI/PSI-MOD cross reference.
IMEx | Binding domain Check | Checks that each binding domain contains more than three amino acids.
IMEx | Dependency Check : Feature Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Negative interaction check | Negative interactions are currently outside the remit of IMEx and should be removed from the file.
IMEx | Interaction Identity Xref Check | Checks that the interaction defines an identity cross reference to an interaction database.
IMEx | Interaction Type Check | Checks that each interaction has at least one interaction type and that every interaction type has a valid PSI-MI cross reference.
IMEx | Dependency Check : Interaction Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Interaction Imex-primary cross reference check | Checks that the imexIDs of each interaction are correct.
IMEx | Dependency Check : Interaction detection method and interaction type | Checks that each association between interaction detection method and interaction type is valid and respects IMEx curation rules.
IMEx | Interaction Figure Legend Check | Checks that each interaction has at least one figure-legend attached to it.
IMEx | Interactor type check | The interactor's type cannot be set to 'nucleic acid' or 'small molecule', as these are currently outside the remit of IMEx.
IMEx | Dependency Check : interactor Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Protein identity check | Checks that each protein has an identity cross reference to a sequence database: UniProtKB or RefSeq.
IMEx | TaxId Organism Check | Checks that each organism has a valid NCBI taxid.
IMEx | Cell line XRef Check | Checks that each organism cell line (if present) has a CABRI or Cell type ontology cross reference with the qualifier 'identity'. If no such cross reference can be found, one pubmed primary-reference should be added so that information about this cell type can be retrieved.
IMEx | Tissue XRef Check | Checks that each organism tissue (if present) has a BRENDA or Tissue List cross reference with the qualifier 'identity'.
IMEx | Dependency Check : Interaction detection method and participant's experimental role | Checks that each association between interaction detection method and participant's experimental role is valid and respects IMEx curation rules.
IMEx | Dependency Check : Interaction detection method and participant's biological role | Checks that each association between interaction detection method and participant's biological role is valid and respects IMEx curation rules.
IMEx | Dependency Check : Interaction detection method and participant identification method | Checks that each association between interaction detection method and participant identification method is valid and respects IMEx curation rules.
IMEx | Publication Imex-primary cross reference check | Checks that the publication imexID is correct.
MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism.
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly.
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism.
MIMIx | Interactor database reference check | Checks that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided.
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002).
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title).
PSI-MI | Alias syntax check | Checks that each alias has a valid name and that, if it has an MI alias type, the type is a valid MI term.
PSI-MI | Interaction and Participant's Confidence syntax check | Checks that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, checks that it is a valid MI term.
PSI-MI | Experiment's publication check | Checks that each experiment has a publication.
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features.
PSI-MI | Interaction's experiment check | Checks that each interaction evidence has an experiment.
PSI-MI | Interactor name's check | Checks that each interactor has a valid name.
PSI-MI | Interactor sequence check | Checks that interactors of type biopolymer have a valid sequence.
PSI-MI | Parameter's syntax check | Checks that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If an MI identifier is provided for the parameter type or unit, checks that it is a valid MI identifier.
PSI-MI | Database cross reference syntax check | Checks that each database cross reference has a non-empty database and a non-empty database accession. If MI identifiers are provided for the database and qualifier, checks that they are valid MI identifiers.
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession that matches the regular expression of the database.
PSI-MI | Controlled vocabulary name's check | Checks that each controlled vocabulary term has a valid name.
PSI-MI | Checksum syntax check | Checks that each checksum has a valid name and that, if it has an MI method, the method is a valid MI term.
PSI-MI | MI file grammar check | Listens to MI file parsing events and reports grammar errors.
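Two of the simpler IMEx rules above lend themselves to a short sketch: the binding domain length and the interactor type restriction. The function names and the representation of a binding domain as inclusive 1-based start and end positions are assumptions of this example.

```python
# Interactor types that the text lists as outside the remit of IMEx.
EXCLUDED_INTERACTOR_TYPES = {"nucleic acid", "small molecule"}


def binding_domain_long_enough(start, end):
    """True if the inclusive range start..end covers more than three residues."""
    length = end - start + 1
    return length > 3


def interactor_type_allowed(interactor_type):
    """True unless the type is one of the excluded interactor types."""
    return interactor_type not in EXCLUDED_INTERACTOR_TYPES
```

Under this reading, a domain spanning positions 10 to 12 (three residues) is rejected and one spanning 10 to 13 is accepted.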

Dependency details

Some of the IMEx rules check dependencies between two controlled-vocabulary terms:

  • Interaction detection method and interaction type
  • Interaction detection method and participant's biological role
  • Interaction detection method and participant's experimental role
  • Interaction detection method and participant identification method
  • Database cross reference and reference qualifier
  • Feature type and feature detection method
  • Feature type and feature range status

The documentation describing these dependencies is available here.
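A dependency check between two controlled-vocabulary terms can be pictured as a lookup in a table of permitted pairs. The pairs below are placeholders invented for this sketch; the real associations are those defined in the dependency documentation, and `dependency_respected` is a hypothetical name.

```python
# Placeholder table of permitted (first term, second term) associations.
# These names are invented; they are not real PSI-MI terms.
ALLOWED_PAIRS = {
    ("toy detection method A", "toy interaction type X"),
    ("toy detection method A", "toy interaction type Y"),
    ("toy detection method B", "toy interaction type X"),
}


def dependency_respected(first_term, second_term, allowed=ALLOWED_PAIRS):
    """True if the association of the two terms appears in the allowed table."""
    return (first_term, second_term) in allowed
```

The same lookup shape would apply to each of the seven dependencies listed above, with a separate table per dependency.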

Customized rules

This scope lets you select which object rules, among the PSI-MI, MIMIx and IMEx rules, are used to validate the file.

List of rules which can be selected with the scope 'customized rules'

Scope | Rule | Description
PSI-MI | Alias syntax check | Checks that each alias has a valid name and that, if it has an MI alias type, the type is a valid MI term.
PSI-MI | Interaction and Participant's Confidence syntax check | Checks that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, checks that it is a valid MI term.
PSI-MI | Experiment's publication check | Checks that each experiment has a publication.
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features.
PSI-MI | Interaction's experiment check | Checks that each interaction evidence has an experiment.
PSI-MI | Interactor name's check | Checks that each interactor has a valid name.
PSI-MI | Interactor sequence check | Checks that interactors of type biopolymer have a valid sequence.
PSI-MI | Parameter's syntax check | Checks that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If an MI identifier is provided for the parameter type or unit, checks that it is a valid MI identifier.
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession that matches the regular expression of the database.
PSI-MI | Database cross reference syntax check | Checks that each database cross reference has a non-empty database and a non-empty database accession. If MI identifiers are provided for the database and qualifier, checks that they are valid MI identifiers.
PSI-MI | Controlled vocabulary name's check | Checks that each controlled vocabulary term has a valid name.
PSI-MI | Checksum syntax check | Checks that each checksum has a valid name and that, if it has an MI method, the method is a valid MI term.
PSI-MI | MI file grammar check | Listens to MI file parsing events and reports grammar errors.

MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism.
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly.
MIMIx | Interactor database reference check | Checks that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided.
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism.
MIMIx | Organism Check | Checks that each organism has a valid NCBI taxid.
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002).
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title).

IMEx | Dependency Check : Experiment Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Dependency Check : Participant's feature type and participant's feature range status | Checks that each association between participant's feature type and participant's feature range status is valid and respects IMEx curation rules.
IMEx | Dependency Check : Participant's feature type and feature detection method | Checks that each association between participant's feature type and feature detection method is valid and respects IMEx curation rules.
IMEx | Participant's feature Type Check | Checks that each participant's feature has a feature type with a valid PSI-MI/PSI-MOD cross reference.
IMEx | Binding domain Check | Checks that each binding domain contains more than three amino acids.
IMEx | Dependency Check : Feature Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Negative interaction check | Negative interactions are currently outside the remit of IMEx and should be removed from the file.
IMEx | Interaction Identity Xref Check | Checks that the interaction defines an identity cross reference to an interaction database.
IMEx | Interaction Type Check | Checks that each interaction has at least one interaction type and that every interaction type has a valid PSI-MI cross reference.
IMEx | Dependency Check : Interaction Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Interaction Imex-primary cross reference check | Checks that the imexIDs of each interaction are correct.
IMEx | Dependency Check : Interaction detection method and interaction type | Checks that each association between interaction detection method and interaction type is valid and respects IMEx curation rules.
IMEx | Interaction Figure Legend Check | Checks that each interaction has at least one figure-legend attached to it.
IMEx | Interactor type check | The interactor's type cannot be set to 'nucleic acid' or 'small molecule', as these are currently outside the remit of IMEx.
IMEx | Dependency Check : interactor Cross reference database and cross reference qualifier | Checks that each association between database and qualifier respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have the qualifier 'identity'.
IMEx | Protein identity check | Checks that each protein has an identity cross reference to a sequence database: UniProtKB or RefSeq.
IMEx | TaxId Organism Check | Checks that each organism has a valid NCBI taxid.
IMEx | Cell line XRef Check | Checks that each organism cell line (if present) has a CABRI or Cell type ontology cross reference with the qualifier 'identity'. If no such cross reference can be found, one pubmed primary-reference should be added so that information about this cell type can be retrieved.
IMEx | Tissue XRef Check | Checks that each organism tissue (if present) has a BRENDA or Tissue List cross reference with the qualifier 'identity'.
IMEx | Dependency Check : Interaction detection method and participant's experimental role | Checks that each association between interaction detection method and participant's experimental role is valid and respects IMEx curation rules.
IMEx | Dependency Check : Interaction detection method and participant's biological role | Checks that each association between interaction detection method and participant's biological role is valid and respects IMEx curation rules.
IMEx | Dependency Check : Interaction detection method and participant identification method | Checks that each association between interaction detection method and participant identification method is valid and respects IMEx curation rules.
IMEx | Publication Imex-primary cross reference check | Checks that the publication imexID is correct.
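The idea behind the 'Customized rules' scope, running only the object rules the user picked, can be sketched as a registry keyed by rule name. The two rule functions, the dictionary-based experiment and the key names `publication` and `host_organism` are assumptions made for this example; only the rule names come from the list above.

```python
def rule_experiment_has_publication(experiment):
    """Simplified stand-in for the PSI-MI 'Experiment's publication check'."""
    if experiment.get("publication"):
        return []
    return ["experiment has no publication"]


def rule_experiment_has_host_organism(experiment):
    """Simplified stand-in for the MIMIx 'Experiment Host Organism Check'."""
    if experiment.get("host_organism"):
        return []
    return ["experiment has no host organism"]


# Registry of selectable rules, keyed by the rule names used in the list.
RULES = {
    "Experiment's publication check": rule_experiment_has_publication,
    "Experiment Host Organism Check": rule_experiment_has_host_organism,
}


def validate_with_selected_rules(experiment, selected_rule_names):
    """Run only the selected rules and collect their messages in order."""
    messages = []
    for name in selected_rule_names:
        messages.extend(RULES[name](experiment))
    return messages
```

Selecting only "Experiment's publication check" for an experiment that lacks both fields reports a single message, since the host organism rule is never run.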