ENA accession numbers

A set of defined rules describes the format of ENA accession numbers. The regular expression for each accession number type is listed below. Note the ".\d+" at the end of the patterns for assembled/annotated and protein coding sequences, which denotes the sequence version.

Accession number type            Accession number format
Assembled/Annotated sequences    [A-Z]{1}\d{5}.\d+
                                 [A-Z]{2}\d{6}.\d+
                                 [A-Z]{4}S?\d{8,9}.\d+
Protein coding sequences         [A-Z]{3}\d{5}.\d+
Traces                           TI\d+
Studies                          (E|D|S)RP\d{6,}
                                 PRJ(E|D|N)\d+
Samples                          ERS\d{6,}
                                 SAM(E|D|N)[A-Z]?\d+
Experiments                      (E|D|S)RX\d{6,}
Runs                             (E|D|S)RR\d{6,}
Analyses                         (E|D|S)RZ\d{6,}

Latest ENA News

20 Aug 2014: Read data through Globus GridFTP
Read data can now be downloaded using Globus GridFTP through the ebi#ena Globus Online public endpoint.

18 Aug 2014: Changes to SRA XML 1.5
Small changes to the Experiment XML, Analysis XML, EGA Dataset XML and EGA DAC XML were deployed on 11th of August 2014.

1 Jul 2014: ENA release 120
Release 120 of ENA's assembled/annotated sequences is now available.

23 May 2014: Change to date format for advanced search
From 16th June 2014, the date format used in the advanced search will be changed to ISO format (YYYY-MM-DD).

20 May 2014: Update to the ENA SAMPLE checklist
From 10th of June 2014 the ENA SAMPLE checklist XML will be updated and the older version will be deprecated.