Weekly submitted reads reports

Weekly reports on submitted sequence reads are available in each submitter's drop box, in the /report subdirectory. All reports are compressed, tab-separated text files.
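As a minimal sketch, a report of this kind could be loaded in Python as shown below. The function name read_report is illustrative. The sketch assumes gzip compression (suggested by the .gz extension), UTF-8 text, and a first line that names the columns; none of these details is confirmed by this page.

```python
import csv
import gzip


def read_report(path):
    """Read a gzip-compressed, tab-separated report into a list of dicts.

    Assumption: the first line of the file is a header row holding the
    column names (FILE_NAME, CHECKSUM, ...). All values are kept as strings.
    """
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each returned row is a dictionary keyed by column name, so a call such as read_report("submitted_run_files.txt.gz") would give rows from which row["STATUS"] can be read, provided the header assumption holds.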

Please contact datasubs@ebi.ac.uk if you have any questions or suggestions concerning these reports.

submitted_run_files.txt.gz

This report provides information about the submitted run files and has the following columns:

Column Description
FILE_NAME Name of the submitted file.
CHECKSUM_METHOD File checksum method (e.g. MD5).
CHECKSUM File checksum.
BYTES File size in bytes.
SUBMISSION_DATE Date when the file was submitted.
STATUS Archival status of the file, one of:
  • PUBLIC: file has been archived and is publicly available. Please note that suppressed files will remain publicly available.
  • CONFIDENTIAL: file has been archived but is not publicly available.
  • NOT ARCHIVED: file is waiting to be archived.
RELEASE_DATE Date when the file will become public. The file is made public when the study associated with it becomes public.
STUDY_ID Study accession assigned by ENA.
EXPERIMENT_ID Experiment accession assigned by ENA.
RUN_ID Run accession assigned by ENA.
SAMPLE_ID Sample accession assigned by ENA. Please note that if multiple samples are associated with a run then one row will be created in the report for each sample.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
EXPERIMENT_SUBMITTER_ID Experiment identifier (alias) provided by the submitter (in some cases generated by the archive).
RUN_SUBMITTER_ID Run identifier (alias) provided by the submitter (in some cases generated by the archive).
SAMPLE_SUBMITTER_ID Sample identifier (alias) provided by the submitter (in some cases generated by the archive).
ERROR File processing error (e.g. invalid file checksum).
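Because a run with several samples appears as one row per sample, the same file can occur more than once in this report. The sketch below regroups such rows by run and collects the distinct files that carry an error. It assumes the rows are dictionaries keyed by the column names above (for example from csv.DictReader) and that an empty ERROR field means no error was reported; the identifiers in the example rows are placeholders, not real accessions.

```python
from collections import defaultdict


def samples_by_run(rows):
    """Map each RUN_ID to the sorted list of SAMPLE_ID values listed for it."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["RUN_ID"]].add(row["SAMPLE_ID"])
    return {run_id: sorted(samples) for run_id, samples in grouped.items()}


def files_with_errors(rows):
    """Return distinct (FILE_NAME, ERROR) pairs whose ERROR field is non-empty.

    Assumption: an empty ERROR field means the file was processed cleanly.
    """
    return sorted({(row["FILE_NAME"], row["ERROR"]) for row in rows if row.get("ERROR")})


# Placeholder rows for illustration only.
example_rows = [
    {"FILE_NAME": "a.fastq.gz", "RUN_ID": "RUN_1", "SAMPLE_ID": "SAMPLE_1", "ERROR": ""},
    {"FILE_NAME": "a.fastq.gz", "RUN_ID": "RUN_1", "SAMPLE_ID": "SAMPLE_2", "ERROR": ""},
    {"FILE_NAME": "b.fastq.gz", "RUN_ID": "RUN_2", "SAMPLE_ID": "SAMPLE_1",
     "ERROR": "invalid file checksum"},
]
```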

submitted_analysis_files.txt.gz

This report provides information about the submitted analysis files and has the following columns:

Column Description
FILE_NAME Name of the submitted file.
CHECKSUM_METHOD File checksum method (e.g. MD5).
CHECKSUM File checksum.
BYTES File size in bytes.
SUBMISSION_DATE Date when the file was submitted.
STATUS Archival status of the file, one of:
  • PUBLIC: file has been archived and is publicly available. Please note that suppressed files will remain publicly available.
  • CONFIDENTIAL: file has been archived but is not publicly available.
  • NOT ARCHIVED: file is waiting to be archived.
RELEASE_DATE Date when the file will become public. The file is made public when the study associated with it becomes public.
STUDY_ID Study accession assigned by ENA.
ANALYSIS_ID Analysis accession assigned by ENA.
SAMPLE_ID Sample accession assigned by ENA. Please note that if multiple samples are associated with an analysis then one row will be created in the report for each sample.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
ANALYSIS_SUBMITTER_ID Analysis identifier (alias) provided by the submitter (in some cases generated by the archive).
SAMPLE_SUBMITTER_ID Sample identifier (alias) provided by the submitter (in some cases generated by the archive).
ERROR File processing error (e.g. invalid file checksum).
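The CHECKSUM_METHOD and CHECKSUM columns make it possible to compare a local copy of a file with what the archive recorded. The following is a sketch only: it assumes that CHECKSUM holds a hexadecimal digest and that the method name (e.g. MD5), once lowercased, is an algorithm name known to Python's hashlib. The helper names are illustrative.

```python
import hashlib


def file_checksum(path, method="MD5", chunk_size=1 << 20):
    """Compute the hex digest of a local file, reading it in chunks."""
    digest = hashlib.new(method.lower())  # assumption: e.g. "MD5" -> "md5"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, row):
    """Compare a local file against the CHECKSUM recorded in a report row.

    Assumption: CHECKSUM is a hex digest; the comparison ignores case.
    """
    expected = row["CHECKSUM"].strip().lower()
    return file_checksum(path, row["CHECKSUM_METHOD"]) == expected
```

A mismatch on a file whose ERROR column mentions an invalid checksum would suggest that the upload was corrupted or that the checksum supplied at submission was wrong.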

submitted_studies.txt.gz

This report provides information about the submitted studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA that has an ERP prefix. ERP-prefixed accessions are scheduled to be replaced by PRJ-prefixed accessions (existing ERP accessions will become secondary to PRJ accessions).
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
SUBMISSION_DATE Date when the study was submitted.
STATUS Status of the study, one of:
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
NEW_STUDY_ID Study accession assigned by ENA that has a PRJ prefix. PRJ-prefixed accessions are scheduled to replace ERP-prefixed accessions (existing ERP accessions will become secondary to PRJ accessions).
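Since this report lists both the ERP accession and its PRJ replacement, a lookup between the two can be built directly from its rows. The sketch below assumes the rows are dictionaries keyed by the column names above and that NEW_STUDY_ID may be empty for some studies; the function name and the accessions in the example are placeholders.

```python
def erp_to_prj(rows):
    """Map each ERP study accession to its PRJ accession, where one is given."""
    return {
        row["STUDY_ID"]: row["NEW_STUDY_ID"]
        for row in rows
        if row.get("NEW_STUDY_ID")  # skip studies without a PRJ accession
    }


# Placeholder accessions for illustration only.
example_studies = [
    {"STUDY_ID": "ERP_A", "NEW_STUDY_ID": "PRJ_A"},
    {"STUDY_ID": "ERP_B", "NEW_STUDY_ID": ""},
]
```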

submitted_runs_by_study.txt.gz

This report provides summary level information about run data submitted for studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
STATUS Status of the study, one of:
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
EXPERIMENT_COUNT Number of experiments associated with the study *1.
RUN_COUNT Number of runs associated with the study *1.
SAMPLE_COUNT Number of samples associated with the study *1.
PROCESSED_RUN_COUNT Number of processed runs *1. A run counts as processed once ENA has transformed it into a standard data product.
PROCESSED_RUN_PERCENT Percentage of processed runs.
PROCESSED_READ_COUNT Number of reads in processed runs.
PROCESSED_BASE_COUNT Number of bases in processed runs.
WITHDRAWN_EXPERIMENT_COUNT Number of withdrawn experiments *2.
WITHDRAWN_RUN_COUNT Number of withdrawn runs *2.
WITHDRAWN_SAMPLE_COUNT Number of withdrawn samples *2.

*1: Excluding withdrawn objects (see *2).
*2: An object has been withdrawn if its status has been changed to CANCELLED, to SUPPRESSED without a RELEASE_DATE, or to KILLED without a RELEASE_DATE.
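The withdrawal rule in footnote *2 can be written as a small predicate. This is a sketch that assumes a missing RELEASE_DATE shows up as an empty string or None; the date used in the usage note below is a placeholder.

```python
def is_withdrawn(status, release_date):
    """Apply footnote *2 to a status and an optional release date.

    CANCELLED objects are always withdrawn. SUPPRESSED and KILLED objects
    are withdrawn only when no RELEASE_DATE is given.
    Assumption: an absent RELEASE_DATE is an empty string or None.
    """
    if status == "CANCELLED":
        return True
    return status in ("SUPPRESSED", "KILLED") and not release_date
```

For example, is_withdrawn("SUPPRESSED", "") is true, while is_withdrawn("SUPPRESSED", "2015-06-01") is false because the object is due to become public again.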

submitted_analyses_by_study.txt.gz

This report provides summary level information about analysis data submitted for studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
STATUS Status of the study, one of:
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
ANALYSIS_COUNT Number of analyses associated with the study *1.
SAMPLE_COUNT Number of samples associated with the study *1.
WITHDRAWN_ANALYSIS_COUNT Number of withdrawn analyses *2.
WITHDRAWN_SAMPLE_COUNT Number of withdrawn samples *2.

*1: Excluding withdrawn objects (see *2).
*2: An object has been withdrawn if its status has been changed to CANCELLED, to SUPPRESSED without a RELEASE_DATE, or to KILLED without a RELEASE_DATE.

unsubmitted_files.txt.gz

This report lists all files that the submitter uploaded more than two months ago and has not yet submitted. All uploaded files must be submitted within two months. Once submitted, the uploaded files are moved into a permanent archive by ENA.

Column Description
FILE_NAME The name of the uploaded file (including relative path within the data drop box).
FILE_SIZE The size of the file in bytes.
UPLOAD_DATE The date when the file was uploaded.
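To see how much data is still waiting to be submitted, the rows of this report can be summarised by count and total size. The sketch assumes the rows are dictionaries keyed by the column names above and that FILE_SIZE holds a plain integer; the function name and the example rows are illustrative.

```python
def summarize_unsubmitted(rows):
    """Return (file count, total bytes, name of the largest file or None)."""
    total_bytes = sum(int(row["FILE_SIZE"]) for row in rows)
    largest = max(rows, key=lambda row: int(row["FILE_SIZE"]), default=None)
    largest_name = largest["FILE_NAME"] if largest is not None else None
    return len(rows), total_bytes, largest_name


# Placeholder rows for illustration only.
example_unsubmitted = [
    {"FILE_NAME": "batch1/a.fastq.gz", "FILE_SIZE": "1500"},
    {"FILE_NAME": "batch1/b.fastq.gz", "FILE_SIZE": "2500"},
]
```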
