Read domain 1.3 XML metadata format

The SRA 1.3 XML metadata format replaced the SRA 1.2 XML metadata format in August 2011. It is designed to be backwards compatible, with only a few exceptions.


XML Schema            Description
SRA.submission.xsd    A submission contains submission actions to be performed by the archive.
SRA.sample.xsd        A sample contains information about the sample upon which the sequencing experiments are based. Samples can be used in any number of sequencing experiments. A study contains information about the sequencing project. Studies can contain any number of sequencing experiments and analyses.
SRA.experiment.xsd    An experiment contains instrument, library and spot information about the sequencing experiment. Experiments are associated with studies, samples and runs. Runs contain the actual sequencing reads. A run contains the sequencing reads from sequencing experiments. Each run can contain all or part of the results for a particular experiment.
SRA.analysis.xsd      An analysis contains secondary analysis results computed from the primary sequencing reads. Analyses are associated with studies.
SRA.common.xsd        Common types used in other SRA XML Schemas.
EGA.dac.xsd           A European Genome-phenome Archive (EGA) data access committee (DAC). Required for authorized access submissions.
EGA.policy.xsd        A European Genome-phenome Archive (EGA) data access policy. Required for authorized access submissions.
EGA.dataset.xsd       A European Genome-phenome Archive (EGA) data set. Required for authorized access submissions.

For examples of how to prepare the SRA XMLs, please refer to Preparing XMLs.

Removed instrument models (SRA.common.xsd)

  • 454 Titanium (use 454 GS FLX Titanium)
  • GS 20 (use 454 GS 20)
  • GS FLX (use 454 GS FLX)
  • Solexa 1G Genome Analyzer (use Illumina choices)
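
The renames above could be applied to existing metadata with a small lookup table. The following is only a sketch: the helper name `migrate_instrument_model` is hypothetical and not part of any ENA tool, and because the list only says "use Illumina choices" for the Solexa model, that case is deliberately left to a manual decision.

```python
# Hypothetical helper (not part of any ENA tool): maps instrument model names
# removed in SRA 1.3 to the replacements named in the list above.
REMOVED_MODELS = {
    "454 Titanium": "454 GS FLX Titanium",
    "GS 20": "454 GS 20",
    "GS FLX": "454 GS FLX",
}


def migrate_instrument_model(model: str) -> str:
    """Return the SRA 1.3 name for a model; unknown names pass through unchanged."""
    if model == "Solexa 1G Genome Analyzer":
        # The list gives no single replacement ("use Illumina choices"),
        # so no automatic mapping is attempted here.
        raise ValueError("choose an Illumina instrument model manually")
    return REMOVED_MODELS.get(model, model)
```

Names that were not removed, such as the new models listed further down, are returned as they are.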

New instrument platforms (SRA.common.xsd) *1


New instrument models (SRA.common.xsd) *1

  • Illumina HiSeq 1000
  • Illumina MiSeq
  • AB SOLiD 5500xl
  • AB SOLiD 5500
  • Ion Torrent PGM
  • Complete Genomics
  • PacBio RS

New library sources (SRA.common.xsd)

  • METATRANSCRIPTOMIC has been added as a new library source

Removed library strategies (SRA.common.xsd)

  • Deprecated library strategy BARCODE has been removed

Gap descriptor (SRA.common.xsd) *1

The GapDescriptor element has been introduced for Experiment and Run to define the placement of gaps relative to a reference sequence. It was introduced to describe the Complete Genomics spot layout and is expected to become a replacement for LIBRARY_LAYOUT.

Changes to Study

  • CENTER_PROJECT_NAME has been made optional
  • Deprecated RELATED_STUDIES/STUDY has been removed

Changes to Sample (SRA.sample.xsd)

  • TAXON_ID has been made mandatory
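
Since TAXON_ID is now mandatory, sample XML prepared for SRA 1.2 may need checking before resubmission. The sketch below assumes that TAXON_ID sits under SAMPLE/SAMPLE_NAME and that each SAMPLE carries an `alias` attribute; both are assumptions about the document layout that this page does not state, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def samples_missing_taxon_id(xml_text: str) -> list:
    """Return the aliases of SAMPLE elements that have no non-empty TAXON_ID.

    Assumes the layout SAMPLE/SAMPLE_NAME/TAXON_ID (an assumption, not taken
    from this page). Works for a single SAMPLE or a set of them.
    """
    root = ET.fromstring(xml_text)
    missing = []
    # iter() also yields the root itself when the root is a SAMPLE element.
    for sample in root.iter("SAMPLE"):
        taxon = sample.find("SAMPLE_NAME/TAXON_ID")
        if taxon is None or not (taxon.text or "").strip():
            missing.append(sample.get("alias", "<no alias>"))
    return missing
```

An empty result means every sample in the document carries a TAXON_ID; it does not validate the value against the taxonomy.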

Changes to Submission (SRA.submission.xsd)

  • Deprecated 'submission_id' attribute has been removed
  • Deprecated 'handle' attribute has been removed
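
Older submission XML that still carries the two removed attributes could be cleaned up along these lines. This is a minimal sketch that assumes the attributes appear on an element named SUBMISSION; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes removed from the submission schema in SRA 1.3 (see the list above).
DEPRECATED_SUBMISSION_ATTRS = ("submission_id", "handle")


def strip_deprecated_submission_attrs(xml_text: str) -> str:
    """Return the XML with the removed attributes dropped from every SUBMISSION."""
    root = ET.fromstring(xml_text)
    # Assumption: the attributes live on elements named SUBMISSION.
    for submission in root.iter("SUBMISSION"):
        for name in DEPRECATED_SUBMISSION_ATTRS:
            submission.attrib.pop(name, None)
    return ET.tostring(root, encoding="unicode")
```

Other attributes and all child elements are left untouched.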

Changes to Experiment (SRA.experiment.xsd)

  • The SPOT_DESCRIPTOR element has been made optional and is no longer required for file formats which can be interpreted without external spot layout information *1
  • The PROCESSING element has been made optional
  • GAP_DESCRIPTOR is now available on the level of experiment *1

Changes to Run

  • Only a single DATA_BLOCK is supported in a run
  • An optional 'unencrypted_checksum' attribute has been added to the FILE element to hold the unencrypted file checksum for encrypted (EGA) files *1
  • SPOT_DESCRIPTOR is now supported on the level of run *1
  • GAP_DESCRIPTOR is now available on the level of run *1
  • Added new filetype option PacBio_HDF5 *1
  • Added CompleteGenomics_native file type *1
  • Deprecated _seq.txt, _prb.txt, _sig2.txt, _qhg.txt filetype options have been removed

*1 Also backported to SRA XML 1.2.
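
Two of the Run changes above, the single DATA_BLOCK limit and the removed filetype options, lend themselves to a simple pre-submission check. The sketch below assumes DATA_BLOCK is a direct child of RUN and that each FILE element carries a `filetype` attribute; the function name is hypothetical and the check is not a substitute for validating against the schema.

```python
import xml.etree.ElementTree as ET

# Filetype options removed in SRA 1.3, as listed above.
REMOVED_FILETYPES = {"_seq.txt", "_prb.txt", "_sig2.txt", "_qhg.txt"}


def check_run(run: ET.Element) -> list:
    """Return human-readable problems found in a RUN element."""
    problems = []
    # Assumption: DATA_BLOCK elements are direct children of RUN.
    blocks = run.findall("DATA_BLOCK")
    if len(blocks) > 1:
        problems.append(f"{len(blocks)} DATA_BLOCK elements; only one is supported")
    # Assumption: the file type is given in the 'filetype' attribute of FILE.
    for file_el in run.iter("FILE"):
        filetype = file_el.get("filetype")
        if filetype in REMOVED_FILETYPES:
            problems.append(f"removed filetype: {filetype}")
    return problems
```

An empty list means neither of the two problems was found.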

Changes to Analysis (SRA.analysis.xsd)

  • DATA_BLOCK has been made optional to support updates without files
  • Only a single DATA_BLOCK is supported in an analysis
  • The PROCESSING elements have been removed
  • Removed data_block name attribute from RUN_LABELS and SEQ_LABELS
  • Removed gi attribute from SEQ_LABELS
  • Removed TARGET element and added SAMPLE_REF and RUN_REF elements
