This document describes the metadata validations performed during Webin submissions. The following conventions are used in this document:
- '@': denotes an XML attribute
- CAPITAL: capital letters denote an XML tag
- submission.xml: the XML file containing the Submission object
- study.xml: the XML file containing the Study objects
- sample.xml: the XML file containing the Sample objects
- experiment.xml: the XML file containing the Experiment objects
- run.xml: the XML file containing the Run objects
- analysis.xml: the XML file containing the Analysis objects
Common XML validation
- The submitted XMLs are validated against the latest XML schemas.
- The @center_name and @broker_name values must be identical to the information registered in the submission account.
- The @alias is mandatory and should be unique within a submission account.
- If the submission account is a broker account and if the @broker_name has been given then it must match the broker name registered for the submission account.
- If the submission account is a broker account and if the @broker_name has not been given then the @broker_name attribute will be created based on information from the submission account.
- If the submission account is a normal account and if the @center_name has been given then it must match the center name registered for the submission account.
- If the submission account is a normal account and if the @center_name has not been given then the @center_name attribute will be created based on information from the submission account.
- The @center_name must be one of the pre-registered centers.
- The @broker_name must be one of the pre-registered brokers.
- The @refname and @refcenter used in RefNameGroup XML Type (e.g. in <EXPERIMENT_REF> in experiment.xml) must uniquely identify a previously submitted Object or an Object within the same submission. Alternatively @accession can be used to refer to other Objects. In this case the @accession must refer to a previously submitted Object.
- If an IDENTIFIERS/PRIMARY_ID value is provided then it must match the @accession; if it is not provided then IDENTIFIERS/PRIMARY_ID will be created with the @accession value.
- If an IDENTIFIERS/SUBMITTER_ID value is provided then it must match the @alias; if it is not provided then IDENTIFIERS/SUBMITTER_ID will be created with the @alias value.
- If an IDENTIFIERS/SUBMITTER_ID@namespace value is provided then it must match the @center_name; if it is not provided then IDENTIFIERS/SUBMITTER_ID@namespace will be created with the @center_name value.
- If <*_SET> (e.g. <SUBMISSION_SET>) is missing from the submitted XML then this element will be automatically added. All Objects are expected to be nested within a <*_SET> element.
- New objects can be added only by using the <ADD> action in submission.xml.
- Existing objects can be updated only by using the <MODIFY> action in submission.xml.
- The <SUPPRESS> action in submission.xml is not supported.
- If the <ADD> action is used in submission.xml then the @alias of each object being submitted must be unique and must not already exist for the given @center_name.
- If the <MODIFY> action is used in submission.xml then the @alias of each object being updated must already have been submitted by the @center_name.
- Objects mirrored from NCBI are verified to contain an @accession with either an SR or a DR prefix.
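Several of the rules above follow the same pattern: an attribute must match the registered value if it is given, and is created from the account information if it is not. The following is a minimal Python sketch of that pattern, not the Webin implementation. The helper name match_or_fill and the value EXAMPLE_CENTER are hypothetical.

```python
import xml.etree.ElementTree as ET


def match_or_fill(element, attribute, registered_value):
    """Apply the 'must match if given, otherwise created' rule to one attribute.

    Hypothetical helper: registered_value stands for the value held by the
    submission account (for example its registered center name).
    """
    given = element.get(attribute)
    if given is None:
        # Attribute not given: create it from the account information.
        element.set(attribute, registered_value)
    elif given != registered_value:
        # Attribute given but different from the registered value: reject.
        raise ValueError(
            f"@{attribute}={given!r} does not match the registered value "
            f"{registered_value!r}"
        )
    return element


# A study submitted without @center_name gets the registered value.
study = ET.fromstring('<STUDY alias="my_study"/>')
match_or_fill(study, "center_name", "EXAMPLE_CENTER")
print(study.get("center_name"))
```

The same helper could be applied to @broker_name for a broker account.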
Submission XML validation
- Only a single <SUBMISSION> is allowed in submission.xml.
- The @source and @schema are mandatory for the <ADD>, <MODIFY> and <VALIDATE> actions.
- The @HoldUntilDate in <HOLD> should be in the date format yyyy-mm-dd.
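The submission rules above could be checked along the following lines. This is an illustrative Python sketch only. The function names and error messages are assumptions, and the example document is deliberately invalid.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_hold_date(value):
    """Return True if value is a real calendar date written as yyyy-mm-dd."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_submission(xml_text):
    """Return a list of error messages for the submission rules (hypothetical)."""
    root = ET.fromstring(xml_text)
    errors = []
    # iter() also yields the root, so a bare <SUBMISSION> root is counted too.
    if len(list(root.iter("SUBMISSION"))) > 1:
        errors.append("only a single <SUBMISSION> is allowed")
    for tag in ("ADD", "MODIFY", "VALIDATE"):
        for action in root.iter(tag):
            for attribute in ("source", "schema"):
                if action.get(attribute) is None:
                    errors.append(f"<{tag}> is missing @{attribute}")
    for hold in root.iter("HOLD"):
        value = hold.get("HoldUntilDate")
        if value is not None and not is_valid_hold_date(value):
            errors.append("@HoldUntilDate must use the format yyyy-mm-dd")
    return errors


# Invalid on purpose: <ADD> lacks @schema and the date is written dd-mm-yyyy.
example = """
<SUBMISSION alias="my_submission">
  <ACTIONS>
    <ACTION><ADD source="study.xml"/></ACTION>
    <ACTION><HOLD HoldUntilDate="31-01-2030"/></ACTION>
  </ACTIONS>
</SUBMISSION>
"""
print(check_submission(example))
```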
Study XML validation
- The @existing_study_type is mandatory for <STUDY_TYPE>.
- <STUDY_TITLE> is mandatory.
- If @existing_study_type="Other" then @new_study_type must be provided within <STUDY_TYPE>.
- <STUDY_ABSTRACT> must be provided.
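As a rough illustration, the study rules above might be expressed as the Python sketch below. The function name check_study is hypothetical. The sketch also assumes that an empty element counts as not provided, which the rules above do not state, and it searches at any depth so that it does not depend on the exact nesting of the elements.

```python
import xml.etree.ElementTree as ET


def check_study(study):
    """Return a list of error messages for one <STUDY> element (hypothetical)."""
    errors = []
    for tag in ("STUDY_TITLE", "STUDY_ABSTRACT"):
        node = study.find(f".//{tag}")
        # Assumption: an element with no text is treated as not provided.
        if node is None or not (node.text or "").strip():
            errors.append(f"<{tag}> must be provided")
    for study_type in study.iter("STUDY_TYPE"):
        existing = study_type.get("existing_study_type")
        if existing is None:
            errors.append("@existing_study_type is mandatory for <STUDY_TYPE>")
        elif existing == "Other" and not study_type.get("new_study_type"):
            errors.append("@new_study_type is required when @existing_study_type is Other")
    return errors


# Invalid on purpose: no <STUDY_ABSTRACT>, and "Other" without @new_study_type.
study = ET.fromstring("""
<STUDY alias="my_study">
  <DESCRIPTOR>
    <STUDY_TITLE>Example title</STUDY_TITLE>
    <STUDY_TYPE existing_study_type="Other"/>
  </DESCRIPTOR>
</STUDY>
""")
print(check_study(study))
```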
Sample XML validation
- An assigned INSDC taxonomic identifier cannot be unassigned.
- The <TAXON_ID> must be provided with a valid associated <SCIENTIFIC_NAME> and <COMMON_NAME>.
- If the Sample Object being submitted is connected to a Study Object, then the Sample Object will have the status of that Study Object (e.g. the sample will become private if the associated study is private).
Experiment XML validation
- The @NOMINAL_LENGTH is mandatory for <PAIRED> experiments.
- If the Experiment Object being submitted refers to a public Study Object, then the Experiment Object will have the same status as the Study Object.
- If <SPOT_DESCRIPTOR/READ_SPEC> with RELATIVE_ORDER is given then @follows_read_index and @precedes_read_index must be provided.
- <SPOT_LENGTH> must exist for ILLUMINA and for ABI_SOLID platforms.
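The structural experiment rules above can be sketched in Python as follows. This is an assumption-laden illustration rather than the actual validator. The function name check_experiment and the error messages are hypothetical, and the status rule is not covered because it depends on the referenced Study.

```python
import xml.etree.ElementTree as ET


def check_experiment(experiment):
    """Return a list of error messages for one <EXPERIMENT> element (hypothetical)."""
    errors = []
    for paired in experiment.iter("PAIRED"):
        if paired.get("NOMINAL_LENGTH") is None:
            errors.append("@NOMINAL_LENGTH is mandatory for <PAIRED>")
    for order in experiment.iter("RELATIVE_ORDER"):
        for attribute in ("follows_read_index", "precedes_read_index"):
            if order.get(attribute) is None:
                errors.append(f"<RELATIVE_ORDER> is missing @{attribute}")
    platform = experiment.find(".//PLATFORM")
    if platform is not None:
        has_spot_length = experiment.find(".//SPOT_LENGTH") is not None
        for name in ("ILLUMINA", "ABI_SOLID"):
            if platform.find(name) is not None and not has_spot_length:
                errors.append(f"<SPOT_LENGTH> must exist for the {name} platform")
    return errors


# Invalid on purpose: <PAIRED> has no @NOMINAL_LENGTH and there is no <SPOT_LENGTH>.
experiment = ET.fromstring("""
<EXPERIMENT alias="my_experiment">
  <DESIGN>
    <LIBRARY_DESCRIPTOR>
      <LIBRARY_LAYOUT><PAIRED/></LIBRARY_LAYOUT>
    </LIBRARY_DESCRIPTOR>
  </DESIGN>
  <PLATFORM><ILLUMINA/></PLATFORM>
</EXPERIMENT>
""")
print(check_experiment(experiment))
```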
Run XML validation
- The @run_center must be one of the pre-registered center names.
- The data file denoted by @filename in run.xml must be unique within a submission.
- The <SPOT_DESCRIPTOR>, <PLATFORM> and <PROCESSING> elements are not supported.
- In <FILE> the @checksum_method, @checksum, @filename and @filetype are mandatory.
- The <READ_LABEL> must exist in the referenced experiment.
- The Run Object being submitted will get the same status as the associated Study Object.
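The file-related run rules above might be checked as in the Python sketch below. It is a minimal illustration under stated assumptions. The function name check_runs, the error messages and the example file name are hypothetical, and the checksum shown is only a placeholder value. Rules that need the submission account or the referenced experiment (@run_center, <READ_LABEL>) are not covered.

```python
import xml.etree.ElementTree as ET

REQUIRED_FILE_ATTRIBUTES = ("checksum_method", "checksum", "filename", "filetype")
UNSUPPORTED_RUN_ELEMENTS = ("SPOT_DESCRIPTOR", "PLATFORM", "PROCESSING")


def check_runs(run_set):
    """Return a list of error messages for all <RUN> elements (hypothetical)."""
    errors = []
    seen_filenames = set()
    for run in run_set.iter("RUN"):
        for tag in UNSUPPORTED_RUN_ELEMENTS:
            if run.find(f".//{tag}") is not None:
                errors.append(f"<{tag}> is not supported")
        for file_node in run.iter("FILE"):
            for attribute in REQUIRED_FILE_ATTRIBUTES:
                if file_node.get(attribute) is None:
                    errors.append(f"<FILE> is missing @{attribute}")
            filename = file_node.get("filename")
            if filename is not None:
                # Each data file may appear only once within the submission.
                if filename in seen_filenames:
                    errors.append(f"duplicate @filename {filename!r}")
                seen_filenames.add(filename)
    return errors


# Invalid on purpose: run_2 reuses the file name and omits @checksum.
run_set = ET.fromstring("""
<RUN_SET>
  <RUN alias="run_1">
    <DATA_BLOCK><FILES>
      <FILE filename="reads.fastq.gz" filetype="fastq"
            checksum_method="MD5" checksum="d41d8cd98f00b204e9800998ecf8427e"/>
    </FILES></DATA_BLOCK>
  </RUN>
  <RUN alias="run_2">
    <DATA_BLOCK><FILES>
      <FILE filename="reads.fastq.gz" filetype="fastq" checksum_method="MD5"/>
    </FILES></DATA_BLOCK>
  </RUN>
</RUN_SET>
""")
print(check_runs(run_set))
```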