efovalidator validates the Experimental Factor Ontology (EFO). It reads EFO in OWL-format, validates the contents and writes a log file of validation errors. Optionally, some simple errors and inconsistencies are fixed.
efovalidator detects various types of error and inconsistency that can creep into an ontology through manual editing and automated updating. This includes syntactic issues such as malformed URIs, and semantic issues such as clashes between labels and synonyms. It can also perform a set of supplied tests for whether specific classes and relations exist. Optionally, it will perform some simple error correction. efovalidator can be incorporated into an ontology production process to assist with quality assurance, highlighting errors for correction that might otherwise go uncorrected.
Download efovalidator from http://www.ebi.ac.uk/fgpt/sw/downloads/.
efovalidator [-h] [-i file] [-u url] [-l log] [-t file] [-o file] [-c file] [-n file] [-v] [-x]
efovalidator is invoked using the script validate.sh. For example, to read EFO from the file efo.owl, validate it and log validation messages to efo.log:
$ validate.sh -i efo.owl -l efo.log
When you first use efovalidator, you must generate validate.sh for your specific environment using the bundled script build.sh, which takes the path to Java and the efovalidator installation directory as arguments, e.g.:
$ build.sh /usr/bin/java/ /usr/local/efovalidator/
build.sh creates validate.sh which you can then use.
|-h||--help||Print (to stdout) usage information and the bug-reporting address, then exit.|
|-i||--infile||file||Read EFO from OWL-format file|
|-l||--log||file||Validate EFO and log validation messages to file|
|-o||--outfile||file||Fix validation errors and write a fixed EFO to OWL-format file|
|-t||--test||file||Unit test EFO. Unit testing messages will be logged to the file specified by -l|
|-u||--url||url||Read EFO in OWL-format from url|
|-c||--capitals||file||Read valid terms with capitals from file.|
|-n||--namespaces||file||Read valid namespaces from file.|
|-v||--verbose||Turn on output (to stdout) with summary of validation report.|
|-x||--xref||Process cross-references to other ontologies.|
Modes of Operation
|Mode||Description||Option||Option argument|
|Validate||Validate an ontology||-l||Validation message log file|
|Validate and fix||Validate and fix ontology||-o||Fixed (output) ontology file|
|Unit test||Unit test an ontology||-t||Unit test definition file|
The application returns the following values to the operating system:
0 (No critical errors but possible warnings)
1 (Command invoked cannot execute validation)
A warning is raised when a class synonym is non-unique or duplicates a label. All other failed validation checks are "critical".
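In a production script, the return value can be used to decide whether to continue. The following is a minimal sketch, not part of efovalidator: the function names describe_status and run_and_report are hypothetical, and only the two return values listed above are interpreted; any other value is reported generically.

```shell
# Describe an efovalidator return value, using only the two values
# documented above. Other values are not interpreted (assumption:
# the log file is the place to look in that case).
describe_status() {
    case "$1" in
        0) echo "no critical errors (warnings are possible)" ;;
        1) echo "validation could not be executed" ;;
        *) echo "status $1 is not described here; check the log file" ;;
    esac
}

# Hypothetical wrapper: run a command, report its status, and pass the
# status back to the caller.
# Usage: run_and_report ./validate.sh -i efo.owl -l efo.log
run_and_report() {
    status=0
    "$@" || status=$?
    echo "efovalidator: $(describe_status "$status")"
    return "$status"
}
```

Because run_and_report returns the original status, it can be used directly in a conditional, e.g. `run_and_report ./validate.sh -i efo.owl -l efo.log || exit 1`.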
Depending on the status, efovalidator will print one of the following messages to stdout:
"No critical validation errors"
"Command invoked cannot execute validation"
"Possible validation error(s). Please see the log file: logFileName"
"Critical validation error(s). Please see the log file: logFileName"
To see a list of all command-line options, run efovalidator with the -h option:
$ ./validate.sh -h
-c,--capitals <file> Read valid terms with capitals from <file>.
-h,--help Print (to stdout) usage information and the bug-reporting address, then exit.
-i,--infile <file> Read EFO from OWL-format <file>.
-l,--log <file> Validate EFO and log validation messages to <file>.
-n,--namespaces <file> Read valid namespaces from <file>.
-o,--outfile <file> Fix validation errors and write a fixed EFO to OWL-format <file>.
-t,--testlog <file> Unit test EFO and log unit testing messages to <file>.
-u,--url <url> Read EFO in OWL-format from <url>.
-v,--verbose Turn on output (to stdout) with summary of validation report.
-x,--xref Process cross-references to other ontologies.
To validate EFO loaded from an input file:
$ ./validate.sh -i efo.owl
To validate and fix EFO loaded from an input file:
$ ./validate.sh -i efo.owl -o efo.fix
To unit test EFO loaded from a URL:
$ ./validate.sh -u http://bar.ebi.ac.uk:8080/trac/export/head/branches/curator/ExperimentalFactorOntology/ExFactorInOWL/releasecandidate/efo_release_candidate.owl -t efo.tests
To unit test and validate EFO loaded from URL (URL and log file specified in properties file):
$ ./validate.sh -u -t efo.tests -l
Validation checks will include:
The EFO input file is in valid OWL format.
A single ontology version number is specified, in an
A date is specified in an <rdfs:comment> element with a value prefixed by "Date:", e.g.:
<rdfs:comment>Date: 15th March 2012</rdfs:comment>
Every class ID is a URI formed from a valid namespace name. Valid namespaces include the default (EFO) namespace and the namespaces of acceptable imports:
The above are default valid namespaces, but these can be configured.
EFO IDs used in class IRIs are valid. The IRI fragment of IRIs in the EFO namespace must have the form:
XXXXXXX is a 7-digit number.
Annotations of EFO URIs are valid. Where a class was imported from an ontology and replaced an existing EFO class, the new class ID will use the imported namespace and the old ID (an EFO URI) may be retained as an annotation, using the <EFO_URI> annotation.
In such cases, the <EFO_URI> annotation must include the EFO namespace and a valid EFO ID which does not exist in EFO.
Every class has a single label, specified in the
Class labels are unique.
Class synonyms are unique. Synonyms are defined in the
Class labels and synonyms collectively are unique. A synonym for a class should never be the label of another class.
The number of synonyms for a class is not excessive (the limit is currently 20).
Obsolete classes are marked as organizational using the <organizational_class> element as follows:
TRUE are also acceptable.
Annotation (on the ontology or a class) has no value, e.g.
Annotation (on the ontology or a class) has no leading or trailing whitespace.
Term capitalisation. Labels or synonyms with capitals are now reported (as non-critical errors).
Unit tests specified in the unit test definition file are performed. These may include checks of whether a set of OWL classes persists, and checks of whether sub-class relationships are defined.
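Two of the simpler checks in this list can be illustrated in shell. This is only a sketch and not how efovalidator itself is implemented: the function names are hypothetical, and the "EFO_" prefix in front of the 7-digit number is an assumption, since the exact fragment form is not shown in this text.

```shell
# Check an IRI fragment against the assumed form "EFO_" + 7 digits.
# Assumption: the "EFO_" prefix; only the 7-digit number is stated above.
is_valid_efo_fragment() {
    case "$1" in
        EFO_[0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if an annotation value has leading or trailing whitespace.
has_outer_whitespace() {
    case "$1" in
        [[:space:]]*|*[[:space:]]) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_valid_efo_fragment EFO_0000001` succeeds, while `is_valid_efo_fragment EFO_123` fails; `has_outer_whitespace " asthma"` succeeds because of the leading space.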
efovalidator can be configured with a list of valid capitalisations (which are allowed to slip through unreported):
$ ./validate.sh -i efo_release_candidate.owl -c efo.caps
efo.caps looks like this:
This allows reporting on trivial variations in capitals e.g. "Drug Usage" or "Cancers, Stomach" but not on valid, significant cases, e.g. "C28H45O2R", "MCF7 cell", "CV system" etc.
Case sensitivity when checking terms
The checks for unique terms (labels and synonyms) are case-sensitive within a single class, but case-insensitive between classes. This prevents erroneous reporting of label/synonym duplication within a single class where only the capitalisation differs. Case-sensitive checking between different classes, although useful in some cases, isn't implemented because it would miss many bad duplications.
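The between-class part of this rule can be sketched as follows. This is an illustration under stated assumptions, not efovalidator's own code: the function name and the tab-separated input format (class identifier, then term) are hypothetical. Terms are lower-cased before comparison, and a term is reported only if it occurs in more than one class, so variants that differ only in capitalisation within one class are not reported.

```shell
# Read "class<TAB>term" lines from stdin and print (lower-cased) every
# term that is used by more than one class, ignoring capitalisation.
find_cross_class_duplicates() {
    awk -F '\t' '
        {
            key = tolower($2)
            pair = $1 SUBSEP key
            if (!(pair in seen)) {   # count each class once per term
                seen[pair] = 1
                classes[key]++
            }
        }
        END {
            for (key in classes)
                if (classes[key] > 1) print key
        }
    ' | sort
}
```

For instance, if class_a has the term "Drug Usage" and class_b has "drug usage", the function prints "drug usage"; a class that lists both "MCF7 cell" and "mcf7 cell" on its own produces no output.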
efovalidator may be configured using Java properties defined in a file (.efovalidator.properties) in the invocation directory, e.g.:
efovalidator.infile = efo.owl
efovalidator.url = http://bar.ebi.ac.uk:8080/trac/export/head/branches/curator/ExperimentalFactorOntology/ExFactorInOWL/releasecandidate/efo_release_candidate.owl
efovalidator.outfile = efofix.owl
efovalidator.log = efo.log
efovalidator.testlog = efotest.log
efovalidator.verbose = false
efovalidator.xref = false
This provides default values for the command-line options. For example, given the properties file above, the following command line would read EFO from efo.owl and write a log file called efo.log:
$ ./validate.sh -i -l
The properties correspond to, and are overridden by, the command-line options of the same name. For example, here EFO will be read from efo2.owl; the value set using -i takes precedence over the value of efovalidator.infile:
$ ./validate.sh -i efo2.owl -l
The properties file is written (to the current working directory) whenever efovalidator is run. The values written are those set on the command-line in the last run, or default values (if not set).
Developers downloading the code may also configure efovalidator using Spring XML. The Spring configuration has a lower precedence than both command-line options and Java properties. Default property values are defined in a bundled file (efovalidator.properties).
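The precedence between a command-line value and a properties file can be pictured with the sketch below. It is an illustration only, not efovalidator's implementation: get_property and resolve_option are hypothetical helper names, and the parsing handles only simple "key = value" lines such as those in the example file above.

```shell
# Print the value of a key from a simple "key = value" properties file.
# $1 = properties file, $2 = key. The last matching line wins.
get_property() {
    awk -v key="$2" '
        /^[ \t]*[#!]/ { next }                 # skip comment lines
        {
            pos = index($0, "=")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) found = value
        }
        END { if (found != "") print found }
    ' "$1"
}

# A non-empty command-line value overrides the properties file.
# $1 = command-line value (may be empty), $2 = properties file, $3 = key
resolve_option() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        get_property "$2" "$3"
    fi
}
```

With the example properties file, `resolve_option "" .efovalidator.properties efovalidator.infile` prints efo.owl, while `resolve_option efo2.owl .efovalidator.properties efovalidator.infile` prints efo2.owl.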
You can configure the valid namespaces like this:
$ ./validate.sh -i efo_release_candidate.owl -n efo.ns
where efo.ns looks like this:
Input file format (OWL-format ontology)
efovalidator requires an input file in any of the standard serialization formats of the Web Ontology Language (OWL).
Input file format (unit test file)
The unit test definition file format is as follows:
Output file format
The log file format is as follows:
There are a couple of validators that perform complementary (syntactic) checks:
Issues and support
Issues and feature requests
To request a new feature or if you think you've found a bug, please use the JIRA Tracker.
If you need help using this tool, please email email@example.com.
For more information or to get involved please email Jon Ison.
This tool was developed by Jon Ison and the EBI Functional Genomics Production Team.
We gratefully acknowledge the support of our funders.