Identifying objects

Each object is uniquely identified within a submission account using the alias attribute. Once an object has been submitted, no other object of the same type can use the same alias within the submission account.

Objects can refer to other objects within a submission account using the refname attribute. For example, if a sample has alias="sample1" an experiment can reference this sample by using refname="sample1".
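The alias and refname relationship can be checked locally before submitting. The following is a minimal Python sketch, not part of the submission service: the trimmed-down XML strings and the function name unresolved_sample_refs are hypothetical, and only the alias and refname attributes described above are used. A refname may also point at a previously submitted object, so a reference reported here is not necessarily an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down documents: only the attributes needed to show
# how refname points at alias are included.
SAMPLE_XML = """<SAMPLE_SET>
    <SAMPLE alias="sample1"/>
    <SAMPLE alias="sample2"/>
</SAMPLE_SET>"""

EXPERIMENT_XML = """<EXPERIMENT_SET>
    <EXPERIMENT alias="exp1">
        <DESIGN><SAMPLE_DESCRIPTOR refname="sample1"/></DESIGN>
    </EXPERIMENT>
    <EXPERIMENT alias="exp2">
        <DESIGN><SAMPLE_DESCRIPTOR refname="sample3"/></DESIGN>
    </EXPERIMENT>
</EXPERIMENT_SET>"""


def unresolved_sample_refs(sample_xml, experiment_xml):
    """Return (experiment alias, refname) pairs whose refname matches no sample alias."""
    aliases = {s.get("alias") for s in ET.fromstring(sample_xml).iter("SAMPLE")}
    missing = []
    for exp in ET.fromstring(experiment_xml).iter("EXPERIMENT"):
        for ref in exp.iter("SAMPLE_DESCRIPTOR"):
            if ref.get("refname") not in aliases:
                missing.append((exp.get("alias"), ref.get("refname")))
    return missing


print(unresolved_sample_refs(SAMPLE_XML, EXPERIMENT_XML))
```

Here exp2 is reported because no sample in the same file set has alias="sample3".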

Identifying submitters

The center_name attribute defines the submitting institution. The center name is required to match the one registered for the submission account.

If the submitter is brokering a submission for another institute, the center name should reflect the institute where the data was generated. Brokers should request a special broker account and provide their center name acronym in the broker_name attribute. Brokers can also point to an equivalent object in their resource and to their customer using the BROKER_OBJECT_ID and BROKER_CUSTOMER_NAME attributes. The submission tool used by the broker can be captured using the BROKER_SUBMISSION_TOOL attribute.

If the work has been contracted to another party, the run_center or analysis_center attributes can be used to provide this information.

Metadata XMLs

The metadata model consists of several objects each represented using XML ... more information.

Read submissions

A typical read submission consists of five XMLs:

  • Submission XML
  • Study XML
  • Sample XML
  • Experiment XML
  • Run XML

A submission does not have to contain all five XMLs. For example, it is possible to submit only samples or studies to be referenced in the future. Please note that, whatever the submission scenario, you are always required to provide a Submission XML.

Technical reads (e.g. barcodes, adaptors or linkers) should be removed prior to submission. If they are included then a spot descriptor is required so that the technical reads can be identified.

Supported file types in Run XML for read submissions:

Format                      Filetype
CRAM (recommended format)   cram
BAM (recommended format)    bam
Fastq                       fastq
SFF                         sff
PacBio format               PacBio_HDF5
Oxford Nanopore format      OxfordNanopore_native
Complete Genomics format    CompleteGenomics_native

Assembly submissions

A typical assembly submission consists of four XMLs:

  • Submission XML
  • Study XML
  • Sample XML
  • Analysis XML (SEQUENCE_ASSEMBLY)

Supported file types in Analysis XML for assembly submissions after 26 September 2017:

Format                                                    Filetype
Sequences in fasta format                                 fasta
Sequences and functional annotation in flat file format   flatfile
Chromosomes and scaffolds in AGP format                   agp
List of chromosomes                                       chromosome_list
List of unlocalised scaffolds and contigs                 unlocalised_list

Supported file types in Analysis XML for assembly submissions prior to 26 September 2017 (these do not validate against the analysis XML schema but are supported in the programmatic submission service):

Format                            Filetype
Contigs in Fasta format           contig_fasta
Contigs in Flat File format       contig_flatfile
Scaffolds in Fasta format         scaffold_fasta
Scaffolds in Flat File format     scaffold_flatfile
Scaffolds in AGP format           scaffold_agp
Chromosomes in Fasta format       chromosome_fasta
Chromosomes in Flat File format   chromosome_flatfile
Chromosomes in AGP format         chromosome_agp
List of chromosomes               chromosome_list
List of unlocalised contigs       unlocalised_contig_list
List of unlocalised scaffolds     unlocalised_scaffold_list

Alignment submissions

A typical re-aligned read submission consists of four XMLs:

  • Submission XML
  • Study XML
  • Sample XML
  • Analysis XML (REFERENCE_ALIGNMENT)

Previously submitted studies and samples can also be referenced.

Supported file types in Analysis XML for alignment submissions:

Format   Filetype
BAM      bam

Annotation submissions

A typical annotation submission consists of four XMLs:

  • Submission XML
  • Study XML
  • Sample XML
  • Analysis XML (SEQUENCE_ANNOTATION)

Previously submitted studies and samples can also be referenced.

Supported file types in Analysis XML for annotation submissions:

Format                Filetype   File suffix
Tab separated table   tab        .tab, .tab.gz

Variation submissions

A typical variation submission consists of four XMLs:

  • Submission XML
  • Study XML
  • Sample XML
  • Analysis XML (SEQUENCE_VARIATION)

Previously submitted studies and samples can also be referenced.

Supported file types in Analysis XML for variation submissions:

Format   Filetype
VCF      vcf
VCF      vcf_aggregate

Submission XML

The submission XML is used to submit, update, release or validate other objects. The XML documents that can be used in association with a submission, and their schemas, are listed below:

Document      Schema
Submission    SRA.submission.xsd
Study         SRA.study.xsd
Sample        SRA.sample.xsd
Experiment    SRA.experiment.xsd
Run           SRA.run.xsd
Analysis      SRA.analysis.xsd
EGA DAC       EGA.dac.xsd
EGA Policy    EGA.policy.xsd
EGA Dataset   EGA.dataset.xsd

Public release of objects

Data and other objects associated with a study are released to the public only when the study is made public. Please note that samples may also be made public independently of studies. After public release, withdrawal from public access is only possible by contacting us at datasubs@ebi.ac.uk.

Studies can be kept confidential for up to two years by using the HOLD action. If the HOLD action is not used, or if the RELEASE action is used, the submitted studies become public immediately, together with all associated data and other objects.

Submitting objects

New objects are submitted using the ADD action.

An example of a submission XML used to submit new objects using the ADD action is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<SUBMISSION_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.submission.xsd">
    <SUBMISSION>
        <ACTIONS>
            <ACTION>
                <ADD/>
            </ACTION>
            <ACTION>
                <!-- Choose either the RELEASE or the HOLD action and delete the other.
                     RELEASE is for immediate public release. HOLD is for an embargo
                     (maximum 2 years). By default RELEASE is assumed. -->
                <RELEASE/>
                <HOLD HoldUntilDate="TODO: hold until date 2010-01-01"/>
            </ACTION>
        </ACTIONS>
    </SUBMISSION>
</SUBMISSION_SET>
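A submission XML of this shape can also be generated rather than edited by hand. The Python sketch below is only an illustration under stated assumptions: the function names build_submission and two_years_after are hypothetical, the xsi schema attributes are omitted for brevity, and the two-year limit is assumed to be counted in calendar years from the day of submission. The HoldUntilDate value is written in the YYYY-MM-DD form used in the example above.

```python
import xml.etree.ElementTree as ET
from datetime import date


def two_years_after(day):
    """Return the same calendar day two years later (28 Feb for a 29 Feb start)."""
    try:
        return day.replace(year=day.year + 2)
    except ValueError:  # 29 February has no counterpart two years later
        return day.replace(year=day.year + 2, day=28)


def build_submission(hold_until=None, today=None):
    """Build a submission XML string with ADD plus either HOLD or RELEASE."""
    today = today or date.today()
    root = ET.Element("SUBMISSION_SET")
    actions = ET.SubElement(ET.SubElement(root, "SUBMISSION"), "ACTIONS")
    ET.SubElement(ET.SubElement(actions, "ACTION"), "ADD")
    second = ET.SubElement(actions, "ACTION")
    if hold_until is None:
        # No embargo requested: ask for immediate public release.
        ET.SubElement(second, "RELEASE")
    else:
        # Assumption: the embargo must end in the future and within two years.
        if not today < hold_until <= two_years_after(today):
            raise ValueError("hold date must be within two years from today")
        ET.SubElement(second, "HOLD", HoldUntilDate=hold_until.isoformat())
    return ET.tostring(root, encoding="unicode")


print(build_submission(date(2018, 6, 1), today=date(2017, 10, 11)))
```

Passing no hold date produces the RELEASE variant, so the two mutually exclusive actions never end up in the same document.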

Updating objects

Object updates are done using the MODIFY action.

An example of a submission XML used to update existing objects is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<SUBMISSION_SET>
    <SUBMISSION>
        <ACTIONS>
            <ACTION>
                <MODIFY/>
            </ACTION>
        </ACTIONS>
    </SUBMISSION>
</SUBMISSION_SET>

Releasing objects

If studies have previously been submitted with the HOLD action, they can be made public immediately by using the RELEASE action.

For example, the following XML publishes the study ERP001835 with associated data and metadata:

<?xml version="1.0" encoding="UTF-8"?>
<SUBMISSION_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.submission.xsd">
    <SUBMISSION>
        <ACTIONS>
            <ACTION>
                <RELEASE target="ERP001835"/>
            </ACTION>
        </ACTIONS>
    </SUBMISSION>
</SUBMISSION_SET>

Validating objects

Objects can be validated by using the VALIDATE action instead of the ADD action.

<?xml version="1.0" encoding="UTF-8"?>
<SUBMISSION_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.submission.xsd">
    <SUBMISSION>
        <ACTIONS>
            <ACTION>
                <VALIDATE/>
            </ACTION>
        </ACTIONS>
    </SUBMISSION>
</SUBMISSION_SET>

Study XML

The study XML is used to describe the sequencing study including a title, a study type and an abstract as it would appear in a publication.

An example of a study XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<STUDY_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.study.xsd">
    <STUDY alias="TODO: UNIQUE NAME FOR STUDY"
        center_name="TODO: CENTER NAME">
        <DESCRIPTOR>
            <STUDY_TITLE>TODO: STUDY TITLE AS IT COULD APPEAR IN A PUBLICATION</STUDY_TITLE>
            <STUDY_TYPE existing_study_type="TODO: CHOOSE FROM CONTROLLED VOCABULARY"/>
            <STUDY_ABSTRACT>TODO: STUDY ABSTRACT AS IT COULD APPEAR IN A
                PUBLICATION</STUDY_ABSTRACT>
        </DESCRIPTOR>
        <STUDY_ATTRIBUTES>
            <STUDY_ATTRIBUTE>
                <TAG>TODO: TAG NAME</TAG>
                <VALUE>TODO: TAG VALUE</VALUE>
            </STUDY_ATTRIBUTE>
            <STUDY_ATTRIBUTE>
                <TAG>TODO: TAG NAME</TAG>
                <VALUE>TODO: TAG VALUE</VALUE>
            </STUDY_ATTRIBUTE>
            <!-- You can generate your own fields and values here using STUDY_ATTRIBUTE 
                tag-value pairs. Please delete any unused attributes and add as many as 
                required. -->
        </STUDY_ATTRIBUTES>
    </STUDY>
    <!-- If you are submitting more than one study, replicate the block <STUDY> to </STUDY> 
        here, as many times as necessary. -->
</STUDY_SET>

Please use the following notation when including PubMed citations in Study XML:

<STUDY_LINKS>
    <STUDY_LINK>
        <XREF_LINK>
            <DB>PUBMED</DB>
            <ID>18987735</ID>
        </XREF_LINK>
    </STUDY_LINK>
</STUDY_LINKS>

Sample XML

The sample XML is used to describe the sequenced samples. The mandatory fields are minimal and include information about the taxonomy of the sample. However, since the sample is one of the most important objects to be described biologically, it is highly recommended that “TAG-VALUE” pairs are generated to describe the sample in as much detail as possible. We recommend the adoption of GSC (Genomic Standards Consortium) terms for the TAG names where possible. For a full list of terms in the specific standards please visit the GSC wiki.

An example of a sample XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.sample.xsd">
    <SAMPLE alias="TODO: UNIQUE NAME FOR SAMPLE" 
        center_name="TODO: CENTER NAME">                     
        <TITLE>TODO: A SHORT INFORMATIVE DESCRIPTION OF THE SAMPLE</TITLE>
        <SAMPLE_NAME>
            <TAXON_ID>TODO: PROVIDE NCBI TAXID FOR ORGANISM (e.g. 9606 for human)</TAXON_ID>
            <!-- For complete prokaryotic genomes, a taxid should be generated for the strain.
                 Please contact us so we can generate this on your behalf. -->
            <SCIENTIFIC_NAME>TODO: SCIENTIFIC NAME AS APPEARS IN NCBI TAXONOMY FOR THE
                TAXON_ID (e.g. Homo sapiens)</SCIENTIFIC_NAME>
            <COMMON_NAME>TODO: OPTIONAL COMMON NAME AS APPEARS IN NCBI TAXONOMY FOR 
                THE TAXON_ID (e.g. human)</COMMON_NAME>
        </SAMPLE_NAME>
        <DESCRIPTION>TODO: A LONGER DESCRIPTION OF SAMPLE AND HOW IT DIFFERS FROM 
            OTHER SAMPLES</DESCRIPTION>
        <SAMPLE_ATTRIBUTES>
            <SAMPLE_ATTRIBUTE>
                <TAG>TODO: TAG NAME</TAG>
                <VALUE>TODO: TAG VALUE</VALUE>
                <UNITS>TODO: OPTIONAL UNIT</UNITS>
            </SAMPLE_ATTRIBUTE>
            <SAMPLE_ATTRIBUTE>
                <TAG>TODO: TAG NAME</TAG>
                <VALUE>TODO: TAG VALUE</VALUE>
                <UNITS>TODO: OPTIONAL UNIT</UNITS>
            </SAMPLE_ATTRIBUTE>
            <!-- You can generate your own fields and values here using SAMPLE_ATTRIBUTE 
                tag-value pairs. An example tag could be "Isolation Source" and the value 
                could be "Seawater". You can also use the UNITS element to include 
                scientific units. E.g., TAG "Age" VALUE "5" UNITS "Years". Please refer
                to online documentation for further help with sample tag-value pairs.
                Please delete any unused attributes and add as many as required. -->
        </SAMPLE_ATTRIBUTES>
    </SAMPLE>
    <!-- If you are submitting more than one sample, replicate the block <SAMPLE> to </SAMPLE> 
         here, as many times as necessary. -->
</SAMPLE_SET> 
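When many tag-value pairs have to be written, the SAMPLE_ATTRIBUTES block can be generated from tabular data. The following Python sketch is an illustration only: the function name sample_attributes is hypothetical, and the example pairs are the ones mentioned in the template comment above ("Isolation Source" / "Seawater" and "Age" / "5" / "Years"). UNITS is written only when a unit is given, since it is optional.

```python
import xml.etree.ElementTree as ET


def sample_attributes(pairs):
    """Build a SAMPLE_ATTRIBUTES element from (tag, value, units) tuples.

    units may be None, in which case no UNITS element is written.
    """
    parent = ET.Element("SAMPLE_ATTRIBUTES")
    for tag, value, units in pairs:
        attribute = ET.SubElement(parent, "SAMPLE_ATTRIBUTE")
        ET.SubElement(attribute, "TAG").text = tag
        ET.SubElement(attribute, "VALUE").text = value
        if units is not None:
            ET.SubElement(attribute, "UNITS").text = units
    return parent


element = sample_attributes([
    ("Isolation Source", "Seawater", None),
    ("Age", "5", "Years"),
])
ET.indent(element, space="    ")  # pretty-printing; requires Python 3.9 or later
print(ET.tostring(element, encoding="unicode"))
```

The printed fragment can be pasted inside a SAMPLE element in place of the placeholder attributes.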

Experiment XML

The experiment XML is used to describe the experimental setup including instrument and library preparation details, and any additional information required to correctly interpret the submitted data. Where any of these values differ between runs, a new experiment object must exist. Each experiment references a study and a sample. Please note that pooled data must be demultiplexed by barcode and any technical reads must be removed before submission.

An example of an experiment XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<EXPERIMENT_SET xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.experiment.xsd">
    <EXPERIMENT alias="TODO: UNIQUE NAME FOR EXPERIMENT"
        center_name="TODO: CENTER NAME">
        <TITLE>TODO: TITLE OF EXPERIMENT</TITLE>
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <DESIGN>
            <DESIGN_DESCRIPTION>TODO: DETAILS ABOUT THE SETUP AND GOALS OF THE 
                EXPERIMENT AS SUPPLIED BY INVESTIGATOR</DESIGN_DESCRIPTION>
            <SAMPLE_DESCRIPTOR refname="TODO: SAMPLE ALIAS OR ACCESSION"/>
            <LIBRARY_DESCRIPTOR>
                <LIBRARY_NAME>TODO: NAME OF LIBRARY</LIBRARY_NAME>
                <LIBRARY_STRATEGY>TODO: CHOOSE FROM CONTROLLED VOCABULARY</LIBRARY_STRATEGY>
                <LIBRARY_SOURCE>TODO: CHOOSE FROM CONTROLLED VOCABULARY</LIBRARY_SOURCE>
                <LIBRARY_SELECTION>TODO: CHOOSE FROM CONTROLLED VOCABULARY</LIBRARY_SELECTION>
                <LIBRARY_LAYOUT>
                    <TODO: CHOOSE LIBRARY LAYOUT FROM CONTROLLED VOCABULARY/>
                </LIBRARY_LAYOUT>
                <LIBRARY_CONSTRUCTION_PROTOCOL>TODO: PROTOCOL BY WHICH THE LIBRARY WAS
                    CONSTRUCTED</LIBRARY_CONSTRUCTION_PROTOCOL>
            </LIBRARY_DESCRIPTOR>         
        </DESIGN>
        <PLATFORM>
            <TODO: CHOOSE PLATFORM FROM CONTROLLED VOCABULARY>
                <INSTRUMENT_MODEL>TODO: CHOOSE FROM CONTROLLED VOCABULARY</INSTRUMENT_MODEL>
            </TODO: CHOOSE PLATFORM FROM CONTROLLED VOCABULARY>
        </PLATFORM>
    </EXPERIMENT>
    <!-- If you are submitting more than one experiment, replicate the block <EXPERIMENT> 
        to </EXPERIMENT> here, as many times as necessary. -->
</EXPERIMENT_SET>


Run XML

The run XML is used to associate data files with experiments. Please note that pooled data must be demultiplexed by barcode and any technical reads must be removed before submission. The md5 checksums for the files can be provided within the Run XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of a run XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<RUN_SET  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="ftp://ftp.sra.ebi.ac.uk/meta/xsd/sra_1_5/SRA.run.xsd">
    <RUN alias="TODO: UNIQUE NAME FOR RUN" center_name="TODO: CENTER NAME">    
        <EXPERIMENT_REF refname="TODO: EXPERIMENT ALIAS OR ACCESSION"/>
        <DATA_BLOCK>
            <FILES>
                <FILE filename="TODO: FILENAME1"
                    filetype="TODO: CHOOSE FROM CONTROLLED VOCABULARY"
                    checksum_method="MD5" checksum="TODO: CHECKSUM1"/>
            </FILES>
        </DATA_BLOCK>
    </RUN>
    <!-- If you are submitting more than one run, replicate the block <RUN> to </RUN>
        here, as many times as necessary. -->
</RUN_SET>
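The checksum value for the FILE element, or the accompanying '.md5' file, can be produced with any MD5 tool. Below is a minimal Python sketch using the standard hashlib module; the function names md5_of_file and write_md5_sidecar are hypothetical, and the assumption that the '.md5' file contains only the hexadecimal digest is ours, since the exact file content is not specified here.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for large data files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5_sidecar(path):
    """Write '<filename>.md5' next to the data file and return its path.

    Assumption: the sidecar file holds just the hex digest and a newline.
    """
    path = Path(path)
    sidecar = path.with_name(path.name + ".md5")
    sidecar.write_text(md5_of_file(path) + "\n")
    return sidecar
```

The value returned by md5_of_file is what would go into the checksum attribute when checksum_method="MD5" is used.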

Analysis XML for assembly submissions

Analysis XML can be used to submit genome assemblies. Only one assembly can be submitted in each analysis. Each analysis can contain one or more data files and must be associated with a single study and a single sample. The md5 checksums for the files can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ASSEMBLY" 
        center_name="TODO: CENTER NAME">
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION"/>
        <ANALYSIS_TYPE>
            <SEQUENCE_ASSEMBLY>
                <NAME>TODO: UNIQUE NAME FOR ASSEMBLY</NAME>
                <PARTIAL>TODO: TRUE or FALSE</PARTIAL>
                <COVERAGE>TODO: NUMERIC SEQUENCING COVERAGE</COVERAGE>
                <PROGRAM>TODO: ASSEMBLY PROGRAM</PROGRAM>
                <PLATFORM>TODO: SEQUENCING PLATFORM</PLATFORM>
            </SEQUENCE_ASSEMBLY>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME1"
                filetype="TODO: CHOOSE FROM CONTROLLED VOCABULARY"
                checksum_method="MD5" checksum="TODO: CHECKSUM1"/>
        </FILES>
        <ANALYSIS_ATTRIBUTES>
            <ANALYSIS_ATTRIBUTE>
                <TAG>TODO: add any tag and value pairs</TAG>
                <VALUE>TODO: add any tag and value pairs</VALUE>
            </ANALYSIS_ATTRIBUTE>
        </ANALYSIS_ATTRIBUTES>
    </ANALYSIS>
</ANALYSIS_SET>

Analysis XML for alignment submissions

Analysis XML can be used to submit BAM alignments. Only one BAM file can be submitted in each analysis and the samples used within the BAM must be associated with submitted samples. In addition, the analysis must be associated with a study. Optimally the BAM file would be associated with an INSDC reference assembly and sequences either by using accessions (as for the reference sequences in the example below) or by using commonly used labels (as for the reference assembly in the example below). The BAM index can be submitted together with the BAM. If the BAM index file is not submitted then it will be created by ENA. The md5 checksums for the files can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ANALYSIS" 
        center_name="TODO: CENTER NAME">
        <TITLE>TODO: a descriptive title for the analysis shown in search results</TITLE>
        <DESCRIPTION>TODO: a detailed description of the analysis</DESCRIPTION>
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION 1"
            label="TODO: sample name in the BAM file"/>        
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION 2"
            label="TODO: sample name in the BAM file"/>        
        <ANALYSIS_TYPE>
            <REFERENCE_ALIGNMENT>
                <ASSEMBLY>
                    <STANDARD refname="TODO: INSDC assembly name (e.g. GRCh37 or GRCh37.p1)"
                        accession="TODO: INSDC assembly accession (e.g. GCA_000001405.1)"/>
                </ASSEMBLY>
                <SEQUENCE accession="TODO: INSDC sequence accession and version"
                    label="TODO: reference sequence name in the BAM file"/>
                <SEQUENCE accession="TODO: INSDC sequence accession and version"
                    label="TODO: reference sequence name in the BAM file"/>
            </REFERENCE_ALIGNMENT>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME.bam" filetype="bam" checksum_method="MD5"
                checksum="TODO: CHECKSUM" unencrypted_checksum="TODO: CHECKSUM"/>
        </FILES>
        <ANALYSIS_ATTRIBUTES>
            <ANALYSIS_ATTRIBUTE>
                <TAG>TODO: add any tag and value pairs</TAG>
                <VALUE>TODO: add any tag and value pairs</VALUE>
            </ANALYSIS_ATTRIBUTE>
        </ANALYSIS_ATTRIBUTES>
    </ANALYSIS>
</ANALYSIS_SET>

Analysis XML for annotation submissions

Analysis XML can be used to submit annotations in tabulated files. Only one annotation file can be submitted in each analysis. Each analysis must be associated with a single study and a single sample. The md5 checksums for the files can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ANALYSIS" 
        center_name="TODO: CENTER NAME">
        <TITLE>TODO: a descriptive title for the analysis shown in search results</TITLE>
        <DESCRIPTION>TODO: a detailed description of the analysis</DESCRIPTION>
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION 1"/>        
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION 2"/>        
        <ANALYSIS_TYPE>
            <SEQUENCE_ANNOTATION/>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME.tab" filetype="tab" checksum_method="MD5"
                checksum="TODO: CHECKSUM" unencrypted_checksum="TODO: CHECKSUM" checklist="TODO: CHECKLIST NAME"/>
        </FILES>
        <ANALYSIS_ATTRIBUTES>
            <ANALYSIS_ATTRIBUTE>
                <TAG>TODO: add any tag and value pairs</TAG>
                <VALUE>TODO: add any tag and value pairs</VALUE>
            </ANALYSIS_ATTRIBUTE>
        </ANALYSIS_ATTRIBUTES>
    </ANALYSIS>
</ANALYSIS_SET>

Analysis XML for variation submissions

Analysis XML can be used to submit VCF files to the European Variation Archive (EVA). Only one VCF file can be submitted in each analysis and the samples used within the VCF file must be associated with submitted samples. In addition, the analysis must be associated with a study. Optimally the VCF file would be associated with an INSDC reference assembly and sequences either by using accessions (as for the reference sequences in the example below) or by using commonly used labels (as for the reference assembly in the example below). The md5 checksums for the files can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ANALYSIS" center_name="TODO: CENTER NAME">
        <TITLE>TODO: a descriptive title for the analysis shown in search results</TITLE>
        <DESCRIPTION>TODO: a detailed description of the analysis</DESCRIPTION>
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION"
            label="TODO: the sample name in the VCF file"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION"
            label="TODO: the sample name in the VCF file"/>
        <ANALYSIS_TYPE>
            <SEQUENCE_VARIATION>
                <ASSEMBLY>
                    <STANDARD refname="TODO: INSDC assembly name (e.g. GRCh37 or GRCh37.p1)"
                        accession="TODO: INSDC assembly accession (e.g. GCA_000001405.1)"/>
                </ASSEMBLY>
                <SEQUENCE accession="TODO: INSDC sequence accession and version"
                    label="TODO: the reference sequence name in the VCF file"/>
                <EXPERIMENT_TYPE>Use one of: "Whole genome sequencing", "Exome sequencing", "Genotyping by array"</EXPERIMENT_TYPE>
            </SEQUENCE_VARIATION>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME.vcf" filetype="vcf" checksum_method="MD5"
                checksum="TODO: CHECKSUM" unencrypted_checksum="TODO: CHECKSUM"/>
        </FILES>
        <ANALYSIS_ATTRIBUTES>
            <ANALYSIS_ATTRIBUTE>
                <TAG>TODO: add any tag and value pairs</TAG>
                <VALUE>TODO: add any tag and value pairs</VALUE>
            </ANALYSIS_ATTRIBUTE>
        </ANALYSIS_ATTRIBUTES>
    </ANALYSIS>
</ANALYSIS_SET>

Analysis XML for Bionano genome maps

Analysis XML can be used to submit Bionano genome maps to the European Nucleotide Archive (ENA). The submission consists of bnx, cmap, xmap, smap and coord files. The md5 checksums for the files can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ANALYSIS" center_name="TODO: CENTER NAME">
        <TITLE>TODO: a descriptive title for the analysis shown in search results</TITLE>
        <DESCRIPTION>TODO: a detailed description of the analysis</DESCRIPTION>
        <STUDY_REF refname="TODO: STUDY ALIAS OR ACCESSION"/>
        <SAMPLE_REF refname="TODO: SAMPLE ALIAS OR ACCESSION"/>
        <ANALYSIS_TYPE>
            <GENOME_MAP>
                <PROGRAM>Irys</PROGRAM>
                <PLATFORM>BioNano</PLATFORM>
            </GENOME_MAP>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME.bnx" filetype="BioNano_native"
                checksum_method="MD5" checksum="TODO: CHECKSUM"/>
            <!-- TODO: ADD ADDITIONAL FILES -->
        </FILES>
        <ANALYSIS_ATTRIBUTES>
            <ANALYSIS_ATTRIBUTE>
                <TAG>TODO: add any tag and value pairs</TAG>
                <VALUE>TODO: add any tag and value pairs</VALUE>
            </ANALYSIS_ATTRIBUTE>
        </ANALYSIS_ATTRIBUTES>
    </ANALYSIS>
</ANALYSIS_SET>

Analysis XML for CRAM reference sequence submissions

Analysis XML can be used to submit reference sequences into the CRAM reference registry. Only one Fasta file can be submitted in each analysis. The md5 checksums for the file can be provided within the Analysis XML or in files with the same name as the submitted files postfixed with '.md5'.

An example of an analysis XML is provided below:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="TODO: UNIQUE NAME FOR ANALYSIS" center_name="TODO: CENTER NAME">
        <ANALYSIS_TYPE>
            <REFERENCE_SEQUENCE/>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TODO: FILENAME.fasta.gz"
                filetype="fasta"
                checksum_method="MD5" checksum="TODO: CHECKSUM"/>
        </FILES>
    </ANALYSIS>
</ANALYSIS_SET>
