Weekly submitted reads reports

Weekly submitted sequence read reports are available in the submitter drop boxes in the /report subdirectory. All reports are compressed, tab-separated text files.

Please contact datasubs@ebi.ac.uk if you have any questions or suggestions concerning these reports.
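
Because every report is a compressed, tab-separated text file, it can be read with standard library tools. The following is a minimal Python sketch, not an official ENA tool. It assumes that the compression is gzip (as the .gz file names suggest) and that the first line of each report is a header row carrying the column names listed on this page, which the page does not state explicitly. The function name read_report is hypothetical.

```python
import csv
import gzip


def read_report(path):
    """Return the rows of a gzip-compressed, tab-separated report as dicts.

    Assumption: the first line of the file is a header row whose fields
    are the column names (e.g. FILE_NAME, STATUS).
    """
    # "rt" opens the compressed file in text mode; newline="" is what the
    # csv module expects so that it can handle line endings itself.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Under these assumptions, calling read_report on a downloaded copy of submitted_run_files.txt.gz would give one dict per report row, with all values as strings.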

submitted_assemblies.txt.gz

This report provides information about submitted assemblies and has the following columns:

Column Description
ANALYSIS_ID Analysis accession assigned by ENA.
STUDY_ID Study accession assigned by ENA.
PROJECT_ID Project accession assigned by ENA.
SAMPLE_ID Sample accession assigned by ENA.
ASSEMBLY_NAME The name of the assembly.
CONTIG_ACC The contig accession number range.
SCAFFOLD_ACC The scaffold accession number range.
CHROMOSOME_ACC The chromosome accession number range.
SUBMISSION_DATE Date when the assembly was submitted.
RELEASE_DATE Date when the assembly will become public.

submitted_run_files.txt.gz

This report provides information about the submitted run files and has the following columns:

Column Description
FILE_NAME Name of the submitted file.
CHECKSUM_METHOD File checksum method (e.g. MD5).
CHECKSUM File checksum.
BYTES File size in bytes.
SUBMISSION_DATE Date when the file was submitted.
STATUS
  • PUBLIC: file has been archived and is publicly available. Please note that suppressed files will remain publicly available.
  • CONFIDENTIAL: file has been archived but is not publicly available.
  • NOT ARCHIVED: file is waiting to be archived.
RELEASE_DATE Date when the file will become public. The file is made public when the study associated with it becomes public.
STUDY_ID Study accession assigned by ENA.
EXPERIMENT_ID Experiment accession assigned by ENA.
RUN_ID Run accession assigned by ENA.
SAMPLE_ID Sample accession assigned by ENA. Please note that if multiple samples are associated with a run then one row will be created in the report for each sample.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
EXPERIMENT_SUBMITTER_ID Experiment identifier (alias) provided by the submitter (in some cases generated by the archive).
RUN_SUBMITTER_ID Run identifier (alias) provided by the submitter (in some cases generated by the archive).
SAMPLE_SUBMITTER_ID Sample identifier (alias) provided by the submitter (in some cases generated by the archive).
ERROR File processing error (e.g. invalid file checksum).
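
Since a run with multiple samples produces one row per sample, the same file can appear on several rows. The sketch below shows one way to collapse those rows and to pick out files with a processing error. It assumes rows are dicts keyed by the column names above and that the ERROR field is empty when no error occurred. Both function names are hypothetical.

```python
from collections import defaultdict


def samples_per_file(rows):
    """Map each FILE_NAME to the sorted list of its distinct SAMPLE_ID values."""
    samples = defaultdict(set)
    for row in rows:
        samples[row["FILE_NAME"]].add(row["SAMPLE_ID"])
    return {name: sorted(ids) for name, ids in samples.items()}


def files_with_errors(rows):
    """Return the sorted names of files whose ERROR field is non-empty.

    Assumption: an empty (or missing) ERROR value means no error.
    """
    flagged = set()
    for row in rows:
        if (row.get("ERROR") or "").strip():
            flagged.add(row["FILE_NAME"])
    return sorted(flagged)
```

Collecting sample accessions in a set first means a file is counted once even when it is repeated on several rows.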

submitted_analysis_files.txt.gz

This report provides information about the submitted analysis files and has the following columns:

Column Description
FILE_NAME Name of the submitted file.
CHECKSUM_METHOD File checksum method (e.g. MD5).
CHECKSUM File checksum.
BYTES File size in bytes.
SUBMISSION_DATE Date when the file was submitted.
STATUS
  • PUBLIC: file has been archived and is publicly available. Please note that suppressed files will remain publicly available.
  • CONFIDENTIAL: file has been archived but is not publicly available.
  • NOT ARCHIVED: file is waiting to be archived.
RELEASE_DATE Date when the file will become public. The file is made public when the study associated with it becomes public.
STUDY_ID Study accession assigned by ENA.
ANALYSIS_ID Analysis accession assigned by ENA.
SAMPLE_ID Sample accession assigned by ENA. Please note that if multiple samples are associated with an analysis then one row will be created in the report for each sample.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
ANALYSIS_SUBMITTER_ID Analysis identifier (alias) provided by the submitter (in some cases generated by the archive).
SAMPLE_SUBMITTER_ID Sample identifier (alias) provided by the submitter (in some cases generated by the archive).
ERROR File processing error (e.g. invalid file checksum).

submitted_studies.txt.gz

This report provides information about the submitted studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA that has an ERP prefix. ERP-prefixed accessions are scheduled to be replaced by PRJ-prefixed accessions (existing ERP accessions will become secondary to PRJ accessions).
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
SUBMISSION_DATE Date when the study was submitted.
STATUS
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
NEW_STUDY_ID Study accession assigned by ENA that has a PRJ prefix. PRJ-prefixed accessions are scheduled to replace ERP-prefixed accessions (existing ERP accessions will become secondary to PRJ accessions).

submitted_runs_by_study.txt.gz

This report provides summary level information about run data submitted for studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
STATUS
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
EXPERIMENT_COUNT Number of experiments associated with the study *1.
RUN_COUNT Number of runs associated with the study *1.
SAMPLE_COUNT Number of samples associated with the study *1.
PROCESSED_RUN_COUNT Number of processed runs *1. A run counts as processed once ENA has transformed it into a standard data product.
PROCESSED_RUN_PERCENT Percentage of processed runs.
PROCESSED_READ_COUNT Number of reads in processed runs.
PROCESSED_BASE_COUNT Number of bases in processed runs.
WITHDRAWN_EXPERIMENT_COUNT Number of withdrawn experiments *2.
WITHDRAWN_RUN_COUNT Number of withdrawn runs *2.
WITHDRAWN_SAMPLE_COUNT Number of withdrawn samples *2.

*1: Excluding withdrawn objects (see *2).
*2: An object has been withdrawn if its status has been changed to CANCELLED, to SUPPRESSED without a RELEASE_DATE, or to KILLED without a RELEASE_DATE.
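
The withdrawal rule in footnote *2 can be written as a small predicate. This is an illustrative sketch only: the function name is_withdrawn is hypothetical, and it assumes that a missing RELEASE_DATE shows up as an empty string or None and that status values are spelled exactly as listed above.

```python
def is_withdrawn(status, release_date):
    """Apply footnote *2 to one object's STATUS and RELEASE_DATE values."""
    # CANCELLED objects are withdrawn regardless of any release date.
    if status == "CANCELLED":
        return True
    # Assumption: an absent release date is an empty string or None.
    has_release_date = bool(release_date and release_date.strip())
    # SUPPRESSED or KILLED objects are withdrawn only without a release date.
    return status in ("SUPPRESSED", "KILLED") and not has_release_date
```

For example, a SUPPRESSED object with a RELEASE_DATE is not treated as withdrawn, because it is due to become public again.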

submitted_analyses_by_study.txt.gz

This report provides summary level information about analysis data submitted for studies and has the following columns:

Column Description
STUDY_ID Study accession assigned by ENA.
STUDY_SUBMITTER_ID Study identifier (alias) provided by the submitter.
STATUS
  • CONFIDENTIAL: the study is pre-publication confidential. Study will become public when RELEASE_DATE expires.
  • CANCELLED: the study has been cancelled before becoming public.
  • PUBLIC: the study is public.
  • SUPPRESSED: the study is no longer public but data remains available by accession number. If RELEASE_DATE is given then the study will become public again when this date expires.
  • KILLED: the study is no longer public and data is not available through any means. If RELEASE_DATE is given then the study will become public again when this date expires.
Please refer to the ENA data availability policy for full details.
RELEASE_DATE Date when the study will become public.
ANALYSIS_COUNT Number of analyses associated with the study *1.
SAMPLE_COUNT Number of samples associated with the study *1.
WITHDRAWN_ANALYSIS_COUNT Number of withdrawn analyses *2.
WITHDRAWN_SAMPLE_COUNT Number of withdrawn samples *2.

*1: Excluding withdrawn objects (see *2).
*2: An object has been withdrawn if its status has been changed to CANCELLED, to SUPPRESSED without a RELEASE_DATE, or to KILLED without a RELEASE_DATE.

unsubmitted_files.txt.gz

This report lists all files that were uploaded by the submitter more than two months ago but have not yet been submitted. All uploaded files must be submitted within two months. Once submitted, the uploaded files are moved into a permanent archive by ENA.

Column Description
FILE_NAME The name of the uploaded file (including relative path within the data drop box).
FILE_SIZE The size of the file in bytes.
UPLOAD_DATE The date when the file was uploaded.
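
Because FILE_NAME includes the relative path within the drop box, the outstanding data can be summarised per directory. The sketch below is one possible approach; it assumes rows are dicts keyed by the column names above, that paths use "/" as the separator, and that FILE_SIZE is a plain integer string. The function name unsubmitted_bytes_by_directory is hypothetical.

```python
from collections import Counter


def unsubmitted_bytes_by_directory(rows):
    """Sum FILE_SIZE per directory of FILE_NAME ("." for top-level files)."""
    totals = Counter()
    for row in rows:
        # Everything before the last "/" is the directory part of the path.
        directory, _, _ = row["FILE_NAME"].rpartition("/")
        totals[directory or "."] += int(row["FILE_SIZE"])
    return dict(totals)
```

A summary like this can help decide which uploads to submit first.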
