Downloading read and analysis data
Sequencing read and analysis data are available for download through the FTP and Aspera protocols in their original format. Read data are also available in the archive generated fastq formats described here.
Submitted read data files
Submitted read data files are organised by submission accession number under the vol1/ directory in ftp.sra.ebi.ac.uk:
ftp://ftp.sra.ebi.ac.uk/vol1/<submission accession prefix>/<submission accession>
where <submission accession prefix> contains the first 6 letters and numbers of the SRA Submission accession.
For example, the files submitted in the SRA Submission ERA007448 are available at: ftp://ftp.sra.ebi.ac.uk/vol1/ERA007/ERA007448/.
Submitted analysis data files
Submitted analysis data files are organised by analysis accession number under the vol1/ directory in ftp.sra.ebi.ac.uk:
ftp://ftp.sra.ebi.ac.uk/vol1/<analysis accession prefix>/<analysis accession>
where <analysis accession prefix> contains the first 6 letters and numbers of the SRA Analysis accession.
For example, the files submitted in the SRA Analysis ERZ454001 are available at: ftp://ftp.sra.ebi.ac.uk/vol1/ERZ454/ERZ454001/.
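Submission and analysis accessions follow the same path rule, so it can be sketched as one small shell helper. The function name ena_submitted_url is an assumption made for illustration and is not part of any ENA tool.

```shell
# Hypothetical helper (not an ENA tool): build the vol1 URL for a
# submission (e.g. ERA...) or analysis (e.g. ERZ...) accession.
ena_submitted_url() {
    acc=$1
    # The prefix directory is the first six characters of the accession.
    prefix=$(printf '%s' "$acc" | cut -c1-6)
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/%s/%s/\n' "$prefix" "$acc"
}

ena_submitted_url ERA007448   # submission example from the text
ena_submitted_url ERZ454001   # analysis example from the text
```

The two calls print the same URLs as the examples given for ERA007448 and ERZ454001.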
Manifest files for submitted data
ENA produces md5 manifest files so that users can confirm the integrity of data submitted to and downloaded from the archive. Manifest files are produced for submitted files associated with runs and analyses.
The manifest can be used to verify file integrity by running md5sum -c on the manifest file.
Please note that to validate the content of a run after downloading the data files the subfolder structure (organised by file format) should be preserved. The same is true for any analyses which contain subfolders.
For example, the integrity of the files submitted for run ERR1438847 can be confirmed as follows.
1. Download the manifest file ERR1438847.md5.
The manifest file lists a checksum for each data file, with paths that refer to the file format specific subfolders containing the data files.
2. Download the data files. To run the md5sum command, these files should be downloaded into the fastq subfolder.
3. Execute the md5sum command:
md5sum -c ERR1438847.md5
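The effect of the subfolder requirement can be seen in a small self-contained sketch. It uses a throwaway file and a locally generated manifest rather than real ENA data; the file names example.fastq and example.md5 are placeholders chosen for illustration.

```shell
# Self-contained demonstration with a throwaway file, not real ENA data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Mimic a run whose data files live in a file format specific subfolder.
mkdir fastq
printf 'example read data\n' > fastq/example.fastq

# Stand-in for a downloaded manifest: md5sum writes one
# "<checksum>  <relative path>" line per file.
md5sum fastq/example.fastq > example.md5

# Verification succeeds while the subfolder layout matches the manifest.
md5sum -c example.md5

# Moving the file out of its subfolder breaks the relative path,
# so the same check now fails.
mv fastq/example.fastq .
md5sum -c example.md5 || echo "verification failed: subfolder structure not preserved"
```

Because md5sum -c resolves each path relative to the current directory, the command should be run from the directory that holds the manifest and its subfolders.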
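Archive generated fastq files
Archive generated fastq files are organised by run accession number under the vol1/fastq directory in ftp.sra.ebi.ac.uk:
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/<dir1>[/<dir2>]/<run accession>
where <dir1> is the first six letters and numbers of the run accession (e.g. ERR000 for ERR000916).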
<dir2> does not exist if the run accession has six digits. For example, fastq files for run ERR000916 are in directory: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR000/ERR000916/.
If the run accession has seven digits then the <dir2> is 00 + the last digit of the run accession. For example, fastq files for run SRR1016916 are in directory: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR101/006/SRR1016916/.
If the run accession has eight digits then the <dir2> is 0 + the last two digits of the run accession.
If the run accession has nine digits then the <dir2> is the last three digits of the run accession.
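The directory rules above can be sketched as a shell function. The name ena_fastq_url is hypothetical, and the sketch assumes a run accession made of a three-letter prefix (such as ERR or SRR) followed by six to nine digits; the eight and nine digit accessions in the usage lines are made up to exercise the rules.

```shell
# Hypothetical helper (not an ENA tool): print the fastq directory URL
# for a run accession, assuming a three-letter prefix plus 6-9 digits.
ena_fastq_url() {
    acc=$1
    dir1=$(printf '%s' "$acc" | cut -c1-6)    # first six characters, e.g. ERR000
    digits=$(printf '%s' "$acc" | cut -c4-)   # numeric part after the prefix
    case ${#digits} in
        6) dir2="" ;;                                        # no <dir2> level
        7) dir2="00$(printf '%s' "$digits" | cut -c7)/" ;;   # 00 + last digit
        8) dir2="0$(printf '%s' "$digits" | cut -c7-8)/" ;;  # 0 + last two digits
        9) dir2="$(printf '%s' "$digits" | cut -c7-9)/" ;;   # last three digits
        *) echo "unsupported run accession: $acc" >&2; return 1 ;;
    esac
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/%s/%s%s/\n' "$dir1" "$dir2" "$acc"
}

ena_fastq_url ERR000916      # six digits, example from the text
ena_fastq_url SRR1016916     # seven digits, example from the text
ena_fastq_url ERR12345678    # eight digits, made-up accession
ena_fastq_url ERR123456789   # nine digits, made-up accession
```

The first two calls reproduce the directories given for ERR000916 and SRR1016916.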
Archive generated fastq files are not available for the following data formats submitted to ENA:
- BAM files containing @PG:longranger
- CRAM files containing @PG:longranger
- Complete genomics native (data folder) submissions
- PacBio native (HDF5) submissions
- Many ONT native format submissions
Files can be downloaded through ftp.sra.ebi.ac.uk using any FTP client.
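Example using wget:
wget ftp://ftp.sra.ebi.ac.uk/vol1/ERA012/ERA012008/sff/library08_GJ6U61T06.sff
Example using ftp:
ftp ftp.sra.ebi.ac.uk
Name: anonymous
Password: enter your e-mail address
ftp> cd vol1/ERA012/ERA012008/sff
ftp> get library08_GJ6U61T06.sff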
Files can be downloaded through the Globus ebi#public endpoint from the 'ena' subfolder.
The ENA FTP Downloader is a standalone application that you can download here.
You can download files for a given accession, or upload an Advanced Search or portal API report to perform a bulk download of all files for a given set of criteria.
The Aspera ascp command line client can be downloaded here. Please select the version for your operating system. The ascp command line client is distributed as part of the Aspera Connect high-performance transfer browser plug-in.
Your command should look similar to this on Unix:
ascp -QT -l 300m -P33001 -i <aspera connect installation directory>/etc/asperaweb_id_dsa.openssh email@example.com:<file or files to download> <download location>
and on Mac OSX:
ascp -QT -l 300m -P33001 -i <aspera connect installation directory>/asperaweb_id_dsa.openssh firstname.lastname@example.org:<file or files to download> <download location>
On Windows, please quote the paths to avoid errors caused by spaces in the file path:
"%userprofile%\AppData\Local\Programs\Aspera\Aspera Connect\bin\ascp" -QT -l 300m -i "%userprofile%\AppData\Local\Programs\Aspera\Aspera Connect\etc\asperaweb_id_dsa.openssh" email@example.com:<file or files to download> <download location>
Note: The asperaweb_id_dsa.openssh public key was introduced in Aspera Connect plugin 3.3.3. Earlier versions can still use asperaweb_id_dsa.putty.
ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh firstname.lastname@example.org:/vol1/ERA012/ERA012008/sff/library08_GJ6U61T06.sff .
ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh email@example.com:/vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz .
ascp -QT -l 300m -P33001 -i /etc/asperaweb_id_dsa.openssh firstname.lastname@example.org:/vol1/ERA012/ERA012008/sff/ .
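Since only the key location, the remote path and the download location change between these commands, the call can be wrapped in a small function. This is a sketch under stated assumptions: ena_ascp, ASPERA_KEY, ENA_ASPERA_USER and DRY_RUN are names invented here and are not part of Aspera Connect. The key path and user are copied as written in the examples above and should be replaced with the values for your installation.

```shell
# Hypothetical wrapper (not part of Aspera Connect): assemble the ascp call
# with the same options as the examples above.
ena_ascp() {
    remote_path=$1   # e.g. /vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz
    dest=${2:-.}     # download location, current directory by default
    if [ -z "${ASPERA_KEY:-}" ] || [ -z "${ENA_ASPERA_USER:-}" ]; then
        echo "set ASPERA_KEY and ENA_ASPERA_USER first" >&2
        return 1
    fi
    # When DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} ascp -QT -l 300m -P33001 -i "$ASPERA_KEY" \
        "$ENA_ASPERA_USER:$remote_path" "$dest"
}

# Assumed settings, copied as written in the examples above.
ASPERA_KEY=/etc/asperaweb_id_dsa.openssh
ENA_ASPERA_USER=email@example.com
DRY_RUN=1

ena_ascp /vol1/fastq/ERR036/ERR036000/ERR036000_1.fastq.gz
```

With DRY_RUN set, the function only prints the assembled command, which makes it easy to check the key path and remote path before starting a transfer. Unset DRY_RUN to run ascp itself.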