Downloading read and analysis data

Sequencing read and analysis data are available for download over the FTP and Aspera protocols in their original submitted format. Read data are also available in the archive-generated fastq format described here.

Submitted data files

Submitted read data files

Submitted read data files are organised by submission accession number under the vol1/ directory on ftp.sra.ebi.ac.uk:

ftp://ftp.sra.ebi.ac.uk/vol1/<submission accession prefix>/<submission accession>

where <submission accession prefix> is the first six characters (letters and digits) of the SRA Submission accession.

For example, the files submitted in the SRA Submission ERA007448 are available at: ftp://ftp.sra.ebi.ac.uk/vol1/ERA007/ERA007448/.

Submitted analysis data files

Submitted analysis data files are organised by analysis accession number under the vol1/ directory on ftp.sra.ebi.ac.uk:

ftp://ftp.sra.ebi.ac.uk/vol1/<analysis accession prefix>/<analysis accession>

where <analysis accession prefix> is the first six characters (letters and digits) of the SRA Analysis accession.

For example, the files submitted in the SRA Analysis ERZ454001 are available at: ftp://ftp.sra.ebi.ac.uk/vol1/ERZ454/ERZ454001/.
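Both layouts above follow the same rule, so the directory can be derived from the accession alone. The following is a minimal sketch in POSIX shell; the function name ena_submitted_dir is hypothetical and not part of any ENA tool, and it assumes the accession is passed in exactly as issued.

```shell
# Hypothetical helper: print the vol1/ directory URL for a submission (ERA...)
# or analysis (ERZ...) accession. The prefix is the first six characters.
ena_submitted_dir() {
    acc=$1
    prefix=$(printf '%s' "$acc" | cut -c1-6)
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/%s/%s/\n' "$prefix" "$acc"
}

ena_submitted_dir ERA007448   # submission example from the text
ena_submitted_dir ERZ454001   # analysis example from the text
```

The two calls should print the same directories as the examples given above.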

Manifest files for submitted data

ENA produces md5 manifest files so that users can confirm the integrity of data submitted to and downloaded from the archive. Manifest files are produced for submitted files associated with runs and analyses.

The manifest can be used to verify file integrity with the command:

md5sum

Please note that, to validate the content of a run after downloading the data files, the subfolder structure (organised by file format) must be preserved. The same is true for any analyses that contain subfolders.

For example, the integrity of the files submitted for run ERR1438847 can be confirmed as follows.

1. Download the manifest file:

ftp://ftp.sra.ebi.ac.uk/vol1/ERA645/ERA645809/ERR1438847.md5

The content of the manifest file is:

fafe406ac8d98725474048db7f617668 fastq/PHESPV0057.R2.fastq.gz
7d2062f9040e0282287162938f4d9276 fastq/PHESPV0057.R1.fastq.gz

2. Download the data files. Please note that the manifest refers to file-format-specific subfolders containing the data files:

ftp://ftp.sra.ebi.ac.uk/vol1/ERA645/ERA645809/fastq/PHESPV0057.R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/ERA645/ERA645809/fastq/PHESPV0057.R2.fastq.gz

To run the md5sum command, these files should be downloaded into a local fastq subfolder next to the manifest file.

3. Execute the md5sum command:

md5sum -c ERR1438847.md5
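The steps above can be scripted. The sketch below is one possible approach, not an ENA-provided tool: the function name manifest_urls is hypothetical, and it assumes each manifest line is a checksum followed by whitespace and a relative path, as in the example shown. It only prints the download URLs; the part that needs network access is left as a commented usage outline.

```shell
# Hypothetical helper: print one download URL per manifest entry.
# $1: URL of the directory holding the manifest; $2: local manifest file.
manifest_urls() {
    base=${1%/}                      # drop a trailing slash, if any
    while read -r checksum path; do
        if [ -n "$path" ]; then
            printf '%s/%s\n' "$base" "$path"
        fi
    done < "$2"
}

# Usage outline (needs network access, so left commented out):
#   base=ftp://ftp.sra.ebi.ac.uk/vol1/ERA645/ERA645809
#   wget "$base/ERR1438847.md5"
#   manifest_urls "$base" ERR1438847.md5 | while read -r url; do
#       wget -x -nH --cut-dirs=3 "$url"   # keeps the fastq/ subfolder locally
#   done
#   md5sum -c ERR1438847.md5
```

The wget options -x, -nH and --cut-dirs=3 are used here so that only the file-format subfolder (fastq/) is recreated locally, which is the layout md5sum -c expects.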

Archive generated fastq files

Archive-generated fastq files are organised by run accession number under the vol1/fastq directory on ftp.sra.ebi.ac.uk:

ftp://ftp.sra.ebi.ac.uk/vol1/fastq/<dir1>[/<dir2>]/<run accession>

<dir1> is the first six characters (letters and digits) of the run accession (e.g. ERR000 for ERR000916).

<dir2> is absent if the run accession has six digits. For example, fastq files for run ERR000916 are in directory: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR000/ERR000916/.

If the run accession has seven digits then <dir2> is 00 + the last digit of the run accession. For example, fastq files for run SRR1016916 are in directory: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR101/006/SRR1016916/.

If the run accession has eight digits then <dir2> is 0 + the last two digits of the run accession.

If the run accession has nine digits then <dir2> is the last three digits of the run accession.
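The <dir1>/<dir2> rules above can be expressed as a small function. This is a sketch in POSIX shell under stated assumptions: the name ena_fastq_dir is hypothetical, the accession is assumed to be letters followed by six to nine digits, and the eight- and nine-digit accessions in the usage lines are made-up values used only to illustrate the rule.

```shell
# Hypothetical helper: print the archive-generated fastq directory for a run.
ena_fastq_dir() {
    acc=$1
    dir1=$(printf '%s' "$acc" | cut -c1-6)          # first six characters
    digits=$(printf '%s' "$acc" | tr -cd '0-9')     # numeric part only
    tail=$(printf '%s' "$digits" | cut -c7-)        # digits after the sixth
    case ${#digits} in
        6) dir2= ;;                                 # no <dir2> level
        7) dir2=00$tail ;;                          # 00 + last digit
        8) dir2=0$tail ;;                           # 0 + last two digits
        9) dir2=$tail ;;                            # last three digits
        *) echo "unsupported run accession: $acc" >&2; return 1 ;;
    esac
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/%s/%s%s/\n' \
        "$dir1" "${dir2:+$dir2/}" "$acc"
}

ena_fastq_dir ERR000916     # six digits, example from the text
ena_fastq_dir SRR1016916    # seven digits, example from the text
ena_fastq_dir ERR12345678   # eight digits, made-up accession
ena_fastq_dir SRR123456789  # nine digits, made-up accession
```

The zero padding is done with string concatenation rather than a numeric printf format, so that digit groups such as 08 are not misread as octal numbers by the shell.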

Archive-generated fastq files are not available for the following data formats submitted to ENA:

  • BAM files containing @PG:longranger
  • CRAM files containing @PG:longranger
  • Complete Genomics native (data folder) submissions
  • PacBio native (HDF5) submissions
  • Many ONT native format submissions

Downloading files using FTP

Files can be downloaded through ftp.sra.ebi.ac.uk using any FTP client.

Example using wget:

wget ftp://ftp.sra.ebi.ac.uk/vol1/ERA012/ERA012008/sff/library08_GJ6U61T06.sff

Example using ftp:

ftp ftp.sra.ebi.ac.uk
Name: anonymous
Password: enter your e-mail address
ftp> cd vol1/ERA012/ERA012008/sff
ftp> get library08_GJ6U61T06.sff

Downloading files using Globus GridFTP

Files can be downloaded through the Globus ebi#public endpoint, from the 'ena' subfolder:

Globus ebi#public ENA endpoint

Downloading files using ENA FTP Downloader

The ENA FTP Downloader is a standalone application that you can download here.

You can download files for a given accession, or upload an Advanced Search or portal API report to perform a bulk download of all files for a given set of criteria.

 

Downloading files using Aspera

The Aspera ascp command-line client can be downloaded here. Please select the correct operating system. The ascp command-line client is distributed as part of the Aspera Connect high-performance transfer browser plug-in.

Your command should look similar to this on Unix:

ascp -QT -l 300m -P33001 -i <aspera connect installation directory>/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:<file or files to download> <download location>

 

and on Mac OS X:

ascp -QT -l 300m -P33001 -i <aspera connect installation directory>/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:<file or files to download> <download location>

 

On Windows, please use quotes to avoid errors caused by spaces in the file path:

"%userprofile%\AppData\Local\Programs\Aspera\Aspera Connect\bin\ascp" -QT -l 300m -P33001 -i "%userprofile%\AppData\Local\Programs\Aspera\Aspera Connect\etc\asperaweb_id_dsa.openssh" era-fasp@fasp.sra.ebi.ac.uk:<file or files to download> <download location>

Note: The asperaweb_id_dsa.openssh key file was introduced in Aspera Connect plug-in 3.3.3. Earlier versions can still use asperaweb_id_dsa.putty.

Unix Examples:

ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERA012/ERA012008/sff/library08_GJ6U61T06.sff .

ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz .

 

ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERA012/ERA012008/sff/ .
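In the examples on this page, the Aspera source path matches the path part of the corresponding FTP URL. Assuming that holds in general, an FTP URL can be converted into an ascp source argument as sketched below; the function name ftp_to_aspera is hypothetical.

```shell
# Hypothetical helper: turn an ftp.sra.ebi.ac.uk URL into an ascp source.
# Assumes the same /vol1/... path is served by both hosts, as in the examples.
ftp_to_aspera() {
    case $1 in
        ftp://ftp.sra.ebi.ac.uk/*)
            printf 'era-fasp@fasp.sra.ebi.ac.uk:/%s\n' \
                "${1#ftp://ftp.sra.ebi.ac.uk/}" ;;
        *)
            echo "not an ftp.sra.ebi.ac.uk URL: $1" >&2
            return 1 ;;
    esac
}

ftp_to_aspera ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz
```

The printed value can then be used in place of <file or files to download> in the ascp commands shown earlier.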


