SRA 1.2 XML metadata format

The SRA 1.2 XML metadata format replaced the SRA 1.1 XML metadata format in November 2010. The SRA 1.2 metadata format is designed to be backwards compatible. For examples of how to prepare the SRA XMLs, please refer to Preparing XMLs.

 

XML Schema | XML Schema Document | Description
SRA.submission.xsd | SRA.submission.pdf | A submission contains submission actions to be performed by the archive. For small studies the submission accession number can be quoted in place of the study accession number.
SRA.sample.xsd | SRA.sample.pdf | A sample contains information about the sample upon which the sequencing experiments are based. Samples can be used in any number of sequencing experiments.
SRA.study.xsd | SRA.study.pdf | A study contains information about the sequencing project. Studies can contain any number of sequencing experiments and analyses.
SRA.experiment.xsd | SRA.experiment.pdf | An experiment contains information about the sequencing experiment. Experiments are associated with runs, which contain the actual sequencing results.
SRA.run.xsd | SRA.run.pdf | A run contains the sequencing results from sequencing experiments. Each run can contain all or part of the results for a particular experiment. For example, each Illumina Genome Analyzer lane is typically represented by a single run.
SRA.analysis.xsd | SRA.analysis.pdf | An analysis contains secondary analysis results computed from the primary sequencing results.
SRA.common.xsd | SRA.common.pdf | Common types.
EGA.dac.xsd | EGA.dac.pdf | A European Genome-phenome Archive (EGA) data access committee (DAC). Required for authorized access submissions.
EGA.policy.xsd | EGA.policy.pdf | A European Genome-phenome Archive (EGA) data access policy. Required for authorized access submissions.
EGA.dataset.xsd | EGA.dataset.pdf | A European Genome-phenome Archive (EGA) data set. Required for authorized access submissions.

Deprecated fields in SRA Experiment

  • EXPERIMENT.DESIGN.LIBRARY_DESCRIPTOR.LIBRARY_SOURCE value 'NON GENOMIC'
  • EXPERIMENT.DESIGN.LIBRARY_DESCRIPTOR.LIBRARY_STRATEGY value 'BARCODE'
  • EXPERIMENT.PROCESSING.BASE_CALLS element
  • EXPERIMENT.PROCESSING.QUALITY_SCORES element
  • EXPERIMENT.expected_number_spots attribute
  • EXPERIMENT.expected_number_reads attribute
  • EXPERIMENT.DESIGN.SPOT_DESCRIPTOR.SPOT_DECODE_METHOD element
  • EXPERIMENT.DESIGN.SPOT_DESCRIPTOR.SPOT_DECODE_SPEC.NUMBER_OF_READS_PER_SPOT element
  • EXPERIMENT.PLATFORM.ILLUMINA.CYCLE_SEQUENCE element
  • EXPERIMENT.PLATFORM.ILLUMINA.CYCLE_COUNT element
  • EXPERIMENT.PLATFORM.ILLUMINA.INSTRUMENT_MODEL value 'Solexa 1G Genome Analyzer'
  • EXPERIMENT.PLATFORM.ABI_SOLID.CYCLE_COUNT element
  • EXPERIMENT.PLATFORM.LS454.instrument_model values 'GS 20', 'GS FLX', '454 Titanium'
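
When migrating existing SRA 1.1 experiment XMLs, it can help to scan them for the deprecated fields listed above. The following is a minimal sketch, not an official ENA tool: the function name find_deprecated and the sample XML are hypothetical, the dotted field names are assumed to map directly onto nested child elements, and only a subset of the deprecated values is checked.

```python
import xml.etree.ElementTree as ET

# Deprecated element paths relative to an EXPERIMENT element (from the list above).
DEPRECATED_ELEMENTS = [
    "PROCESSING/BASE_CALLS",
    "PROCESSING/QUALITY_SCORES",
    "DESIGN/SPOT_DESCRIPTOR/SPOT_DECODE_METHOD",
    "DESIGN/SPOT_DESCRIPTOR/SPOT_DECODE_SPEC/NUMBER_OF_READS_PER_SPOT",
    "PLATFORM/ILLUMINA/CYCLE_SEQUENCE",
    "PLATFORM/ILLUMINA/CYCLE_COUNT",
    "PLATFORM/ABI_SOLID/CYCLE_COUNT",
]

# Deprecated attributes on the EXPERIMENT element itself.
DEPRECATED_ATTRIBUTES = ["expected_number_spots", "expected_number_reads"]

# Deprecated values of otherwise valid elements (subset of the list above).
DEPRECATED_VALUES = {
    "DESIGN/LIBRARY_DESCRIPTOR/LIBRARY_SOURCE": {"NON GENOMIC"},
    "DESIGN/LIBRARY_DESCRIPTOR/LIBRARY_STRATEGY": {"BARCODE"},
    "PLATFORM/ILLUMINA/INSTRUMENT_MODEL": {"Solexa 1G Genome Analyzer"},
}


def find_deprecated(experiment):
    """Return a list of deprecated fields used in one EXPERIMENT element."""
    findings = []
    for attr in DEPRECATED_ATTRIBUTES:
        if attr in experiment.attrib:
            findings.append(f"EXPERIMENT.{attr} attribute")
    for path in DEPRECATED_ELEMENTS:
        if experiment.find(path) is not None:
            findings.append("EXPERIMENT." + path.replace("/", ".") + " element")
    for path, values in DEPRECATED_VALUES.items():
        node = experiment.find(path)
        text = (node.text or "").strip() if node is not None else ""
        if text in values:
            findings.append("EXPERIMENT." + path.replace("/", ".") + f" value '{text}'")
    return findings


# Hypothetical, heavily abbreviated experiment used only for illustration.
SAMPLE = """
<EXPERIMENT alias="exp1" expected_number_spots="1000">
  <DESIGN>
    <LIBRARY_DESCRIPTOR>
      <LIBRARY_STRATEGY>BARCODE</LIBRARY_STRATEGY>
      <LIBRARY_SOURCE>GENOMIC</LIBRARY_SOURCE>
    </LIBRARY_DESCRIPTOR>
  </DESIGN>
  <PLATFORM>
    <ILLUMINA>
      <INSTRUMENT_MODEL>Illumina HiSeq 2000</INSTRUMENT_MODEL>
      <CYCLE_COUNT>36</CYCLE_COUNT>
    </ILLUMINA>
  </PLATFORM>
</EXPERIMENT>
"""

findings = find_deprecated(ET.fromstring(SAMPLE.strip()))
for finding in findings:
    print(finding)
```

The sample is not schema-valid on its own; it only carries enough structure to trigger the three checks. Validation against SRA.experiment.xsd remains the authoritative test.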

Deprecated fields in SRA Run

  • RUN.DATA_BLOCK.total_spots attribute
  • RUN.DATA_BLOCK.total_reads attribute
  • RUN.DATA_BLOCK.number_channels attribute
  • RUN.DATA_BLOCK.format_code attribute
  • RUN.instrument_model attribute
  • RUN.run_file attribute
  • RUN.total_data_blocks attribute

Deprecated fields in SRA Study

  • STUDY.DESCRIPTOR.CENTER_NAME element
  • STUDY.DESCRIPTOR.PROJECT_ID element: STUDY.DESCRIPTOR.RELATED_STUDIES.RELATED_STUDY should be used instead
  • STUDY.DESCRIPTOR.RELATED_STUDIES.STUDY element

Deprecated fields in SRA Submission

  • SUBMISSION.submission_id attribute
  • SUBMISSION.handle attribute
  • SUBMISSION.ACTIONS.ACTION.HOLD.HoldForPeriod attribute

Unsupported fields in SRA Run

The SPOT_DESCRIPTOR, PLATFORM and PROCESSING elements have been added to SRA 1.2 Run but are not currently supported by SRA EBI. Please specify this information in the SRA Experiment instead.
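
A run XML can be checked for these unsupported elements before submission. This is a sketch only: the helper name unsupported_run_elements and the sample run are hypothetical, and the three elements are assumed to appear as direct children of RUN.

```python
import xml.etree.ElementTree as ET

# Elements that exist in the SRA 1.2 Run schema but should be given
# in the Experiment instead (per the paragraph above).
UNSUPPORTED_IN_RUN = ("SPOT_DESCRIPTOR", "PLATFORM", "PROCESSING")


def unsupported_run_elements(run):
    """Return the tags of direct RUN children that are not supported."""
    return [child.tag for child in run if child.tag in UNSUPPORTED_IN_RUN]


# Hypothetical, abbreviated run used only for illustration.
RUN_XML = """<RUN alias="run1">
  <EXPERIMENT_REF refname="exp1"/>
  <PLATFORM><ILLUMINA/></PLATFORM>
  <DATA_BLOCK/>
</RUN>"""

flagged = unsupported_run_elements(ET.fromstring(RUN_XML))
print(flagged)
```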

New mandatory fields in SRA Experiment

  • EXPERIMENT.PLATFORM.ILLUMINA.SEQUENCE_LENGTH element
  • EXPERIMENT.PLATFORM.ABI_SOLID.SEQUENCE_LENGTH element
  • EXPERIMENT.DESIGN.SPOT_DESCRIPTOR.SPOT_DECODE_SPEC.SPOT_LENGTH element (for ILLUMINA & ABI_SOLID platforms)
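
The new mandatory fields can be checked in a similar way. The sketch below assumes the dotted names map onto nested child elements; the function name missing_mandatory and the sample XML are hypothetical, and the check does not replace validation against SRA.experiment.xsd.

```python
import xml.etree.ElementTree as ET

SPOT_LENGTH_PATH = "DESIGN/SPOT_DESCRIPTOR/SPOT_DECODE_SPEC/SPOT_LENGTH"


def missing_mandatory(experiment):
    """Return the new mandatory SRA 1.2 fields absent from an EXPERIMENT."""
    missing = []
    for platform in ("ILLUMINA", "ABI_SOLID"):
        node = experiment.find(f"PLATFORM/{platform}")
        if node is None:
            continue  # the rules below apply only to these two platforms
        if node.find("SEQUENCE_LENGTH") is None:
            missing.append(f"EXPERIMENT.PLATFORM.{platform}.SEQUENCE_LENGTH")
        if experiment.find(SPOT_LENGTH_PATH) is None:
            missing.append("EXPERIMENT." + SPOT_LENGTH_PATH.replace("/", "."))
    return missing


# Hypothetical, abbreviated experiment: SEQUENCE_LENGTH is present,
# SPOT_LENGTH is not.
SAMPLE_EXPERIMENT = """<EXPERIMENT alias="exp1">
  <DESIGN>
    <SPOT_DESCRIPTOR>
      <SPOT_DECODE_SPEC/>
    </SPOT_DESCRIPTOR>
  </DESIGN>
  <PLATFORM>
    <ILLUMINA>
      <INSTRUMENT_MODEL>Illumina HiSeq 2000</INSTRUMENT_MODEL>
      <SEQUENCE_LENGTH>100</SEQUENCE_LENGTH>
    </ILLUMINA>
  </PLATFORM>
</EXPERIMENT>"""

missing = missing_mandatory(ET.fromstring(SAMPLE_EXPERIMENT))
print(missing)
```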

New platforms in SRA Experiment

  • COMPLETE_GENOMICS
  • PACBIO_SMRT

New instrument models in SRA Experiment

  • Illumina Genome Analyzer IIx
  • Illumina HiSeq 2000
  • AB SOLiD 4 System
  • AB SOLiD 4hq System
  • AB SOLiD PI System
  • 454 GS Junior
  • 454 GS FLX Titanium

New file types in SRA Run

  • bam: BAM file submissions are now supported

New library strategy and selection terms in SRA Experiment

  • Methylation-Sensitive Restriction Enzyme sequencing strategy:
    <LIBRARY_STRATEGY>MRE-Seq</LIBRARY_STRATEGY>
