PSI-MI and PSI-PAR validation
The validator can use two different data models to validate files:
Molecular interactions (PSI-MI ontology)
The validator executes a set of rules based on the PSI-MI ontology to check your file(s). It can check five possible levels, with each lower level included in the higher ones. The 'Customized rules' level is slightly different, as it includes the first two levels (file syntax and controlled vocabulary usage).
- File syntax
- Usage of controlled-vocabulary defined in the PSI-MI ontology
- PSI-MI basic checks
- MIMIx-compliance
- IMEx-compliance
Protein Affinity Reagents (PSI-PAR ontology)
The validator executes a set of rules based on the PSI-PAR ontology to check your file(s). It can check two possible levels, with the lower level included in the higher one.
- File syntax
- Usage of controlled-vocabulary defined in the PSI-PAR ontology
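As a rough illustration of how lower levels are included in higher ones, the sketch below models each level list as an ordered sequence. The names `VALIDATION_LEVELS` and `levels_included` are hypothetical and are not part of the validator; the 'Customized rules' level is left out because it does not follow the same chain.

```python
# Hypothetical model of the cumulative levels listed above (not the validator's API).
VALIDATION_LEVELS = {
    "PSI-MI": [
        "File syntax",
        "Controlled vocabulary usage",
        "PSI-MI basic checks",
        "MIMIx",
        "IMEx",
    ],
    "PSI-PAR": [
        "File syntax",
        "Controlled vocabulary usage",
    ],
}


def levels_included(model: str, level: str) -> list[str]:
    """Return the chosen level together with every lower level it includes."""
    ordered = VALIDATION_LEVELS[model]
    return ordered[: ordered.index(level) + 1]
```

Under this model, choosing 'MIMIx' for a PSI-MI file would also run file syntax, controlled vocabulary usage and the PSI-MI basic checks.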
Validation Scope
The validation levels described below define the stringency of the validation rules applied to a dataset. As the image below illustrates, the currently supported levels are built on top of one another, except for the 'Customized rules' scope: it includes XML syntax and controlled vocabulary usage, but its object rules can be any mix of PSI-MI, MIMIx and IMEx rules.
PSI-MI Validation Scopes
WARNING: All the MIMIx rules are used in the IMEx scope except for one: the taxId '-5' for 'in silico' is a valid taxId for MIMIx but not for IMEx, because IMEx relies on experimental molecular interactions.
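The exception described in this warning can be sketched as follows. This is a minimal illustration, not the validator's code: the function name is hypothetical, and only the '-5' case is modelled; every other taxId is deliberately left unjudged.

```python
IN_SILICO_TAXID = -5  # taxId for 'in silico', as stated in the warning above


def in_silico_violations(taxids: list[int], scope: str) -> list[int]:
    """Return the positions of 'in silico' organisms rejected by the given scope.

    Only the -5 exception is modelled here; other taxIds are not checked.
    """
    if scope != "IMEx":
        # MIMIx accepts -5, so nothing is reported for it.
        return []
    return [i for i, taxid in enumerate(taxids) if taxid == IN_SILICO_TAXID]
```

For example, `in_silico_violations([9606, -5], "IMEx")` flags the second organism, while the same list under "MIMIx" produces no violation.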
PSI-PAR Validation Scopes
File syntax
If the file is a PSI-XML 2.5 file, the validator checks that the document is properly defined according to the following XML schemas:
You can also look at the auto-generated documentation of schema 2.5.4.
If the file is a MITAB file, the validator checks that the document is properly defined according to the following MITAB specifications:
Note: If your document does not pass this validation stage (that is, the validator reports FATAL messages), none of the levels below can be executed.
Controlled Vocabulary Usage
This ensures that controlled vocabularies are used correctly. That is, for each controlled vocabulary placeholder, it checks that the file uses the appropriate set of terms as defined in the PSI-MI ontology.
The rule currently looks at:
Interaction detection method | PSI-MI |
Interaction type | PSI-MI |
Participant identification methods | PSI-MI |
Participant's experimental role | PSI-MI |
Participant's biological role | PSI-MI |
Participant's feature detection method | PSI-MI |
Participant's feature type | PSI-MI or PSI-MOD |
Interactor's type | PSI-MI |
Participant's feature range status | PSI-MI |
Participant's experimental preparation | PSI-MI |
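The table above can be read as a mapping from each placeholder to the ontologies it accepts. The sketch below encodes that mapping and tests a term identifier against it. It is a simplification under stated assumptions: the names are hypothetical, and the ontology is inferred from the identifier prefix only ('MI:' for PSI-MI, 'MOD:' for PSI-MOD), whereas the validator checks that the term belongs to the appropriate set of terms in the ontology.

```python
# Placeholder -> accepted ontologies, copied from the table above.
ALLOWED_ONTOLOGIES = {
    "interaction detection method": {"PSI-MI"},
    "interaction type": {"PSI-MI"},
    "participant identification method": {"PSI-MI"},
    "participant experimental role": {"PSI-MI"},
    "participant biological role": {"PSI-MI"},
    "participant feature detection method": {"PSI-MI"},
    "participant feature type": {"PSI-MI", "PSI-MOD"},
    "interactor type": {"PSI-MI"},
    "participant feature range status": {"PSI-MI"},
    "participant experimental preparation": {"PSI-MI"},
}

# Assumption: the identifier prefix is enough to tell the ontology apart.
PREFIX_TO_ONTOLOGY = {"MI": "PSI-MI", "MOD": "PSI-MOD"}


def ontology_allowed(placeholder: str, term_id: str) -> bool:
    """Check that a term id comes from an ontology accepted by the placeholder."""
    prefix = term_id.split(":", 1)[0]
    ontology = PREFIX_TO_ONTOLOGY.get(prefix)
    return ontology in ALLOWED_ONTOLOGIES.get(placeholder, set())
```

With this mapping, a 'MOD:' identifier is accepted for a participant's feature type but rejected for an interaction type.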
PSI-MI basic checks
This scope includes a few basic rules which are not specified in MIMIx or IMEx but are important for the consistency of the file. It checks the consistency of the database cross references against the PSI-MI ontology, the consistency of the participant's features as specified in the PSI-XML 2.5 documentation, and the interactor sequences (only for DNA, RNA or protein sequences).
Description of the rules executed with the scope 'PSI-MI'
PSI-MI | Alias syntax check | Check that each alias has a valid name and if it has a MI alias type, it must have a valid MI term for alias type. |
PSI-MI | Interaction and Participant's Confidence syntax check | Check that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, check that it is a valid MI term. |
PSI-MI | Experiment's publication check | Check that each experiment has a publication. |
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features. |
PSI-MI | Interaction's experiment check | Check that each interaction evidence has an experiment. |
PSI-MI | Interactor name's check | Check that each interactor has a valid name. |
PSI-MI | Interactor sequence check | Check that interactors of type biopolymer have a valid sequence. |
PSI-MI | Parameter's syntax check | Check that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If the MI identifier of the parameter's type or unit is provided, check that it is a valid MI identifier. |
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession which matches the regular expression of the database. |
PSI-MI | Database cross reference syntax check | Check that each database cross reference has a non empty database and a non empty database accession. Checks that if MI identifiers are provided for database and qualifiers, they are valid MI identifiers. |
PSI-MI | Controlled vocabulary name's check | Check that each controlled vocabulary term has a valid name. |
PSI-MI | Checksum syntax check | Check that each checksum has a valid name and if it has a MI method, it must have a valid MI term for checksum method. |
PSI-MI | MI file grammar check | Rule that listens to MI file parsing events and reports grammar errors. |
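To make the 'Feature range consistency Check' more concrete, here is a minimal sketch of the bounds part of that rule. The function name and messages are hypothetical. Treating a start position that lies after the end position as invalid is an assumption about what the rule rejects, and compliance with the feature range status is not modelled.

```python
def range_problems(start: int, end: int, sequence_length: int) -> list[str]:
    """List the problems found in one feature range (positions are 1-based)."""
    problems = []
    if start < 1 or end < 1:
        problems.append("position below 1")
    if start > sequence_length or end > sequence_length:
        problems.append("position beyond sequence length")
    if start > end:
        # Assumption: a range whose start lies after its end is invalid.
        problems.append("start after end")
    return problems
```

A range of 5 to 20 on a 100-residue sequence yields no problem, whereas 0 to 120 on the same sequence is reported as out of bounds on both sides.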
MIMIx guidelines
This checks that the guidelines on the Minimum Information required for reporting a Molecular Interaction eXperiment (MIMIx) are respected.
Description of the rules executed with the scope 'MIMIx'
MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism. |
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly. |
MIMIx | Interactor database reference check | Check that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided. |
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism. |
MIMIx | Organism Check | Checks that each organism has a valid NCBI taxid |
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002). |
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title). |
PSI-MI | Alias syntax check | Check that each alias has a valid name and if it has a MI alias type, it must have a valid MI term for alias type. |
PSI-MI | Interaction and Participant's Confidence syntax check | Check that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, check that it is a valid MI term. |
PSI-MI | Experiment's publication check | Check that each experiment has a publication. |
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features. |
PSI-MI | Interaction's experiment check | Check that each interaction evidence has an experiment. |
PSI-MI | Interactor sequence check | Check that interactors of type biopolymer have a valid sequence. |
PSI-MI | Interactor name's check | Check that each interactor has a valid name. |
PSI-MI | Parameter's syntax check | Check that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If the MI identifier of the parameter's type or unit is provided, check that it is a valid MI identifier. |
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession which matches the regular expression of the database. |
PSI-MI | Database cross reference syntax check | Check that each database cross reference has a non empty database and a non empty database accession. Checks that if MI identifiers are provided for database and qualifiers, they are valid MI identifiers. |
PSI-MI | Controlled vocabulary name's check | Check that each controlled vocabulary term has a valid name. |
PSI-MI | Checksum syntax check | Check that each checksum has a valid name and if it has a MI method, it must have a valid MI term for checksum method. |
PSI-MI | MI file grammar check | Rule that listens to MI file parsing events and reports grammar errors. |
IMEx Curation standards
This ensures that the curation standards agreed by the International Molecular Exchange Consortium (IMEx) are respected.
The curation manual can be accessed here.
Description of the rules executed with the scope 'IMEx'
IMEx | Dependency Check : Experiment Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Dependency Check : Participant's feature type and participant's feature range status | Checks that each association participant's feature type - participant's feature range status is valid and respects IMEx curation rules. |
IMEx | Dependency Check : Participant's feature type and feature detection method | Checks that each association participant's feature type - feature detection method is valid and respects IMEx curation rules. |
IMEx | Participant's feature Type Check | Checks that each participant's feature has a feature type with a valid PSI MI/PSI-MOD cross reference. |
IMEx | Binding domain Check | Checks that each binding domain contains more than three amino acids. |
IMEx | Dependency Check : Feature Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Negative interaction check | Negative interactions are currently outside of the remit of IMEx and should be removed from the file. |
IMEx | Interaction Identity Xref Check | Checks that the interaction defines an identity cross reference to an interaction database. |
IMEx | Interaction Type Check | Checks that each interaction has at least one interaction type and that all interaction types have a valid PSI-MI cross reference. |
IMEx | Dependency Check : Interaction Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Interaction Imex-primary cross reference check | Checks that the imexIDs of each interaction are correct. |
IMEx | Dependency Check : Interaction detection method and interaction type | Checks that each association interaction detection method - interaction type is valid and respects IMEx curation rules. |
IMEx | Interaction Figure Legend Check | Checks that each interaction has at least one figure-legend attached to it. |
IMEx | Interactor type check | Interactor's type cannot be set to 'nucleic acid' or 'small molecule' as these are currently outside the remit of IMEx. |
IMEx | Dependency Check : interactor Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Protein identity check | Check that each protein has an identity cross reference to the sequence database: UniProtKB or RefSeq |
IMEx | TaxId Organism Check | Checks that each organism has a valid NCBI taxid |
IMEx | Cell line XRef Check | Checks that each organism cell line (if present) has a CABRI or Cell type ontology cross reference with a qualifier 'identity'. If no cross references can be found, one pubmed primary-reference should be added to be able to retrieve information about this cell type. |
IMEx | Tissue XRef Check | Checks that each organism tissue (if it is present) has a BRENDA or Tissue List cross reference with a qualifier 'identity' |
IMEx | Dependency Check : Interaction detection method and participant's experimental role | Checks that each association interaction detection method - participant's experimental role is valid and respects IMEx curation rules |
IMEx | Dependency Check : Interaction detection method and participant's biological role | Checks that each association interaction detection method - participant's biological role is valid and respects IMEx curation rules |
IMEx | Dependency Check : Interaction detection method and participant identification method | Checks that each association interaction detection method - participant identification method is valid and respects IMEx curation rules. |
IMEx | Publication Imex-primary cross reference check | Checks that a publication imexID is correct. |
MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism. |
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly. |
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism. |
MIMIx | Interactor database reference check | Check that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided. |
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002). |
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title). |
PSI-MI | Alias syntax check | Check that each alias has a valid name and if it has a MI alias type, it must have a valid MI term for alias type. |
PSI-MI | Interaction and Participant's Confidence syntax check | Check that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, check that it is a valid MI term. |
PSI-MI | Experiment's publication check | Check that each experiment has a publication. |
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features. |
PSI-MI | Interaction's experiment check | Check that each interaction evidence has an experiment. |
PSI-MI | Interactor name's check | Check that each interactor has a valid name. |
PSI-MI | Interactor sequence check | Check that interactors of type biopolymer have a valid sequence. |
PSI-MI | Parameter's syntax check | Check that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If the MI identifier of the parameter's type or unit is provided, check that it is a valid MI identifier. |
PSI-MI | Database cross reference syntax check | Check that each database cross reference has a non empty database and a non empty database accession. Checks that if MI identifiers are provided for database and qualifiers, they are valid MI identifiers. |
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession which matches the regular expression of the database. |
PSI-MI | Controlled vocabulary name's check | Check that each controlled vocabulary term has a valid name. |
PSI-MI | Checksum syntax check | Check that each checksum has a valid name and if it has a MI method, it must have a valid MI term for checksum method. |
PSI-MI | MI file grammar check | Rule that listens to MI file parsing events and reports grammar errors. |
Dependency details
Some of the IMEx rules check dependencies between two controlled-vocabulary terms:
- Interaction detection method and interaction type
- Interaction detection method and participant's biological role
- Interaction detection method and participant's experimental role
- Interaction detection method and participant identification method
- Database cross reference and reference qualifier
- Feature type and feature detection method
- Feature type and feature range status
The documentation describing these dependencies is available here.
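A dependency between two controlled-vocabulary terms can be pictured as a lookup table of allowed pairs. The sketch below is only an illustration of that idea: the function name is hypothetical, the table entries are invented placeholders rather than real IMEx dependencies, and letting a first term with no entry pass is an assumption.

```python
# Invented placeholder pairs; the real dependencies are in the documentation referred to above.
EXAMPLE_DEPENDENCIES = {
    "detection method A": {"interaction type X", "interaction type Y"},
    "detection method B": {"interaction type Y"},
}


def dependency_respected(
    first_term: str,
    second_term: str,
    allowed: dict[str, set[str]],
) -> bool:
    """Return True when the pair (first_term, second_term) is allowed.

    Assumption: a first term without an entry has no constraint and always passes.
    """
    if first_term not in allowed:
        return True
    return second_term in allowed[first_term]
```

The same function could serve each of the dependency pairs listed above, with one table per pair of term categories.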
Customized rules
This scope lets you select which object rules, among the PSI-MI, MIMIx and IMEx rules, are used to validate the file.
List of rules which can be selected with the scope 'customized rules'
PSI-MI | Alias syntax check | Check that each alias has a valid name and if it has a MI alias type, it must have a valid MI term for alias type. |
PSI-MI | Interaction and Participant's Confidence syntax check | Check that each interaction confidence and participant confidence has a confidence type and a confidence value. If an MI term is provided for the confidence type, check that it is a valid MI term. |
PSI-MI | Experiment's publication check | Check that each experiment has a publication. |
PSI-MI | Feature range consistency Check | Checks that each participant's feature range is valid: not out of bounds (below 1 or beyond the sequence length), not overlapping, and compliant with the feature range status. WARNING: the statuses 'c-terminal' and 'n-terminal' can no longer be used for n-terminal/c-terminal features whose exact positions are not known. It is recommended to use the new terms 'c-terminal range' (MI:1039) and 'n-terminal range' (MI:1040) for such features. |
PSI-MI | Interaction's experiment check | Check that each interaction evidence has an experiment. |
PSI-MI | Interactor name's check | Check that each interactor has a valid name. |
PSI-MI | Interactor sequence check | Check that interactors of type biopolymer have a valid sequence. |
PSI-MI | Parameter's syntax check | Check that each parameter (interaction parameters and participant parameters) has a parameter type and a parameter factor. If the MI identifier of the parameter's type or unit is provided, check that it is a valid MI identifier. |
PSI-MI | Database cross reference Check | Checks that each database cross reference uses a valid database accession which matches the regular expression of the database. |
PSI-MI | Database cross reference syntax check | Check that each database cross reference has a non empty database and a non empty database accession. Checks that if MI identifiers are provided for database and qualifiers, they are valid MI identifiers. |
PSI-MI | Controlled vocabulary name's check | Check that each controlled vocabulary term has a valid name. |
PSI-MI | Checksum syntax check | Check that each checksum has a valid name and if it has a MI method, it must have a valid MI term for checksum method. |
PSI-MI | MI file grammar check | Rule that listens to MI file parsing events and reports grammar errors. |
MIMIx | Experiment Host Organism Check | Checks that each experiment has a host organism. |
MIMIx | Confidence Score Definition Check | Checks that the interaction defines its confidence score (if any) correctly. |
MIMIx | Interactor database reference check | Check that each interactor has a cross reference to an appropriate reference database. If not, a sequence should be provided. |
MIMIx | Interactor Organism Check | Checks that each protein and gene has an organism. |
MIMIx | Organism Check | Checks that each organism has a valid NCBI taxid |
MIMIx | Participant Identification Method check | Checks that each participant has at least one Participant Identification Method (MI:0002). |
MIMIx | Experiment bibliographic reference check | Checks that each experiment has a publication reference (pubmed or doi) or valid publication details (contact email, author list and publication title). |
IMEx | Dependency Check : Experiment Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Dependency Check : Participant's feature type and participant's feature range status | Checks that each association participant's feature type - participant's feature range status is valid and respects IMEx curation rules. |
IMEx | Dependency Check : Participant's feature type and feature detection method | Checks that each association participant's feature type - feature detection method is valid and respects IMEx curation rules. |
IMEx | Participant's feature Type Check | Checks that each participant's feature has a feature type with a valid PSI MI/PSI-MOD cross reference. |
IMEx | Binding domain Check | Checks that each binding domain contains more than three amino acids. |
IMEx | Dependency Check : Feature Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Negative interaction check | Negative interactions are currently outside of the remit of IMEx and should be removed from the file. |
IMEx | Interaction Identity Xref Check | Checks that the interaction defines an identity cross reference to an interaction database. |
IMEx | Interaction Type Check | Checks that each interaction has at least one interaction type and that all interaction types have a valid PSI-MI cross reference. |
IMEx | Dependency Check : Interaction Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Interaction Imex-primary cross reference check | Checks that the imexIDs of each interaction are correct. |
IMEx | Dependency Check : Interaction detection method and interaction type | Checks that each association interaction detection method - interaction type is valid and respects IMEx curation rules. |
IMEx | Interaction Figure Legend Check | Checks that each interaction has at least one figure-legend attached to it. |
IMEx | Interactor type check | Interactor's type cannot be set to 'nucleic acid' or 'small molecule' as these are currently outside the remit of IMEx. |
IMEx | Dependency Check : interactor Cross reference database and cross reference qualifier | Checks that each database - qualifier association respects IMEx curation rules. For example, for each feature, all the InterPro cross references should have a qualifier 'identity'. |
IMEx | Protein identity check | Check that each protein has an identity cross reference to the sequence database: UniProtKB or RefSeq |
IMEx | TaxId Organism Check | Checks that each organism has a valid NCBI taxid |
IMEx | Cell line XRef Check | Checks that each organism cell line (if present) has a CABRI or Cell type ontology cross reference with a qualifier 'identity'. If no cross references can be found, one pubmed primary-reference should be added to be able to retrieve information about this cell type. |
IMEx | Tissue XRef Check | Checks that each organism tissue (if it is present) has a BRENDA or Tissue List cross reference with a qualifier 'identity' |
IMEx | Dependency Check : Interaction detection method and participant's experimental role | Checks that each association interaction detection method - participant's experimental role is valid and respects IMEx curation rules |
IMEx | Dependency Check : Interaction detection method and participant's biological role | Checks that each association interaction detection method - participant's biological role is valid and respects IMEx curation rules |
IMEx | Dependency Check : Interaction detection method and participant identification method | Checks that each association interaction detection method - participant identification method is valid and respects IMEx curation rules. |
IMEx | Publication Imex-primary cross reference check | Checks that a publication imexID is correct. |
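Choosing object rules for the 'customized rules' scope amounts to filtering the list above by rule name. The sketch below shows one way this could look. It is not the validator's interface: `Rule`, `AVAILABLE_RULES` and `select_rules` are hypothetical names, and only three rules from the list are included to keep the example short.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    scope: str
    name: str


# A small excerpt of the selectable rules listed above.
AVAILABLE_RULES = [
    Rule("PSI-MI", "Interactor sequence check"),
    Rule("MIMIx", "Experiment Host Organism Check"),
    Rule("IMEx", "Binding domain Check"),
]


def select_rules(wanted_names: list[str]) -> list[Rule]:
    """Return the requested rules in list order, rejecting unknown names."""
    wanted = set(wanted_names)
    unknown = wanted - {rule.name for rule in AVAILABLE_RULES}
    if unknown:
        raise ValueError(f"unknown rule(s): {sorted(unknown)}")
    return [rule for rule in AVAILABLE_RULES if rule.name in wanted]
```

A selection can freely mix scopes, for example one PSI-MI rule together with one IMEx rule, which matches the description of this scope.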