Read domain retrieval tutorial

This is a step-by-step tutorial for browsing read domain (formerly the Sequence Read Archive, SRA) metadata using the European Nucleotide Archive's browser (ENA Browser). We will concentrate on the study described in the paper 'Population genomics of domestic and wild yeasts' by Liti et al., Nature 2009. This paper cites a submission made to ENA's read domain: 'The Solexa data were submitted to the Sequence Read Archive with accession number ERA000011'.

Step 1. Search read domain metadata

Please type ERA000011 in the search box in the header of any ENA page and click the Search button. You can also go directly to the submission page by using the URL: http://www.ebi.ac.uk/ena/data/view/ERA000011.
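
The direct URL above follows a simple pattern: a fixed base followed by the accession. A minimal Python sketch of that pattern is shown here. The function name ena_view_url and the accession check (the letters 'ER', one more letter, then digits) are assumptions inferred from the two accessions used in this tutorial, not an official ENA rule.

```python
import re

# Base URL as it appears in this tutorial.
ENA_VIEW_BASE = "http://www.ebi.ac.uk/ena/data/view/"

# Assumed accession shape, inferred from ERA000011 and ERP000001 only.
ACCESSION_PATTERN = re.compile(r"ER[A-Z]\d+")


def ena_view_url(accession: str) -> str:
    """Build the browser URL for an accession (hypothetical helper)."""
    cleaned = accession.strip().upper()
    if not ACCESSION_PATTERN.fullmatch(cleaned):
        raise ValueError(f"Unexpected accession format: {accession!r}")
    return ENA_VIEW_BASE + cleaned


if __name__ == "__main__":
    print(ena_view_url("ERA000011"))  # submission page
    print(ena_view_url("ERP000001"))  # study page
```

The same helper can be reused for the study accession in Step 2, since both pages share the same URL base.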

Question 1. Who is the submitter?

Step 2. Look at read domain metadata

You can see three tabs on the page: ‘Navigation’, ‘Fastq Files’ and ‘Submitted Files’. The ‘Navigation’ tab contains links to the objects submitted in this submission. The ‘Fastq Files’ tab contains a list of archive generated fastq files, or ‘Not available’ if a fastq product has not been created. The ‘Submitted Files’ tab contains the list of original read domain files submitted to the ENA.

Please click the ‘Navigation’ tab and go to the study ERP000001. You can also go directly to the study page by using the URL: http://www.ebi.ac.uk/ena/data/view/ERP000001.

Question 2. What is the title of the study?

The title is displayed at the top of the page after the accession number.

Question 3. How many different samples were used in this study?

Links to the samples are displayed in the ‘Navigation’ tab. The associated sample accessions are displayed as a range.
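
Counting the samples in a range can also be done programmatically. The sketch below expands a range of accessions into individual accessions, assuming the range is written as two accessions joined by a hyphen with the same prefix and zero-padded numbers. That format, the function name expand_accession_range and the example range are all assumptions for illustration; they are not the actual range shown for this study.

```python
import re
from typing import List


def expand_accession_range(accession_range: str) -> List[str]:
    """Expand 'PREFIX000001-PREFIX000003' into a list of accessions.

    The 'start-end' format is an assumption made for this sketch.
    """
    start, sep, end = accession_range.partition("-")
    if not sep:
        # A single accession rather than a range.
        return [start.strip()]

    m_start = re.fullmatch(r"([A-Z]+)(\d+)", start.strip())
    m_end = re.fullmatch(r"([A-Z]+)(\d+)", end.strip())
    if not m_start or not m_end or m_start.group(1) != m_end.group(1):
        raise ValueError(f"Cannot parse accession range: {accession_range!r}")

    prefix = m_start.group(1)
    width = len(m_start.group(2))  # keep the zero padding
    first, last = int(m_start.group(2)), int(m_end.group(2))
    if last < first:
        raise ValueError("Range end is smaller than range start")

    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical range, not the one displayed for study ERP000001.
    samples = expand_accession_range("ERS000101-ERS000103")
    print(len(samples), samples)
```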

Question 4. What sequencing platform was used for the first ten runs?

The ‘Fastq Files’ and ‘Submitted Files’ tabs contain a table of all runs associated with the study. Each row corresponds to one run. Data submitted to the read domain are associated with runs; runs are associated with experiments, and experiments are associated with studies and samples. The ‘Instrument Model’ column contains the sequencing platform.
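
The relationships between runs, experiments, studies and samples can be pictured as a small object model. The Python sketch below is only an illustration of those links: the class and field names are hypothetical, attaching the instrument model and library strategy to the experiment is an assumption of this sketch, and all accessions other than ERP000001 are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Study:
    accession: str


@dataclass
class Sample:
    accession: str


@dataclass
class Experiment:
    accession: str
    study: Study      # each experiment points to a study
    sample: Sample    # ...and to a sample
    instrument_model: str = ""   # assumed to live on the experiment
    library_strategy: str = ""   # assumed to live on the experiment


@dataclass
class Run:
    accession: str
    experiment: Experiment                           # each run points to an experiment
    files: List[str] = field(default_factory=list)   # data files attached to the run


def instrument_model_of(run: Run) -> str:
    """Follow run -> experiment to find the sequencing platform."""
    return run.experiment.instrument_model


if __name__ == "__main__":
    study = Study("ERP000001")
    sample = Sample("SAMPLE_PLACEHOLDER")
    experiment = Experiment("EXPERIMENT_PLACEHOLDER", study, sample,
                            instrument_model="Example Model")
    run = Run("RUN_PLACEHOLDER", experiment, files=["reads.fastq.gz"])
    print(run.accession, "->", run.experiment.study.accession,
          "on", instrument_model_of(run))
```

Reading the table row by row corresponds to walking from each run up to its experiment, and from there to the study and sample.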

Question 5. Which library strategy was used for the runs on the first page?

Please click the ‘Select columns’ link and select ‘Library Strategy’. This will display an additional column in the table containing the library strategy.

Question 6. Are all archive generated fastq files available for download for this study?

Step 3. Download read domain data

You can view and download the full list of files associated with the study by clicking the ‘TEXT’ links directly above the list of files in the ‘Fastq Files’ and ‘Submitted Files’ tabs. Alternatively, you can use the FileDownloader (Figure 1), available by clicking the ‘Bulk download Fastq/Submitted files’ link above the tabs. The FileDownloader currently uses the FTP protocol, but we plan to add support for a faster protocol shortly.
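
Once the file list has been saved as text, it can be checked for runs without an archive generated fastq file. The sketch below assumes the list is tab-delimited with a header row; the column names run_accession and fastq_files, the function name runs_missing_fastq and the sample rows are all hypothetical and should be adjusted to match the actual download.

```python
import csv
import io
from typing import List

# Hypothetical listing; the real file's columns and values may differ.
SAMPLE_LISTING = (
    "run_accession\tinstrument_model\tfastq_files\n"
    "RUN_A\tExample Model\tRUN_A.fastq.gz\n"
    "RUN_B\tExample Model\tNot available\n"
    "RUN_C\tExample Model\t\n"
)


def runs_missing_fastq(listing_text: str,
                       run_column: str = "run_accession",
                       fastq_column: str = "fastq_files") -> List[str]:
    """Return run accessions whose fastq entry is empty or 'Not available'."""
    reader = csv.DictReader(io.StringIO(listing_text), delimiter="\t")
    missing = []
    for row in reader:
        value = (row.get(fastq_column) or "").strip()
        if value == "" or value.lower() == "not available":
            missing.append(row[run_column])
    return missing


if __name__ == "__main__":
    print(runs_missing_fastq(SAMPLE_LISTING))
```

To use it on a saved file, read the file contents into a string and pass that string to runs_missing_fastq.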

Figure 1. FileDownloader (read domain file downloader)

Answers

Question 1. The Wellcome Trust Sanger Institute.

Question 2. Population genomics of domestic and wild yeasts.

Question 3. Twelve; three Saccharomyces cerevisiae strains and nine Saccharomyces paradoxus strains.

Question 4. Illumina Genome Analyzer.

Question 5. WGS; whole genome sequencing.

Question 6. No. Some archive generated fastq files are reported as ‘Not available’. If you require these files, please click the ‘Send Feedback’ link at the top right corner of the page and request that the files be generated. The archive generated fastq file format is described here.
