Read domain retrieval tutorial

This is a step-by-step tutorial for browsing read domain (formerly the Sequence Read Archive (SRA)) metadata using the European Nucleotide Archive's browser (ENA Browser). We will concentrate on the study described in the paper 'Population genomics of domestic and wild yeasts' by Liti et al. (Nature, 2009). This paper cites a submission made to ENA's read domain: 'The Solexa data were submitted to the Sequence Read Archive with accession number ERA000011'.

Step 1. Search read domain metadata

Please type ERA000011 in the search box in the header of any ENA page and click the Search button. You can also go directly to the submission page by using the URL: http://www.ebi.ac.uk/ena/data/view/ERA000011.
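The direct URLs in this tutorial share one pattern: a fixed base followed by the accession. As a minimal sketch, assuming that pattern holds for other record types as well, the link can be built programmatically. The helper name ena_view_url is hypothetical and not part of any ENA tool.

```python
# Base of the record-view URLs quoted in this tutorial.
ENA_VIEW_BASE = "http://www.ebi.ac.uk/ena/data/view/"


def ena_view_url(accession: str) -> str:
    """Build the record page URL for an accession (hypothetical helper)."""
    accession = accession.strip()
    if not accession:
        raise ValueError("accession must not be empty")
    return ENA_VIEW_BASE + accession


# Submission accession cited in the paper.
print(ena_view_url("ERA000011"))
```

The same helper applied to a study accession such as ERP000001 yields the study page URL used in Step 2.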

Question 1. Who is the submitter?

Step 2. Look at read domain metadata

You can see two tabs on the page: ‘Navigation’ and ‘Read Files’. For some record pages, you will also see an ‘Analysis Files’ tab. The ‘Navigation’ tab contains links to the objects included in this submission. The ‘Read Files’ tab contains a table of all run objects (and associated objects and selected data) with download links to any FASTQ and submitted files. The ‘Analysis Files’ tab contains a table of all analysis objects (and associated objects and selected data) with download links to any submitted files.

Please click the ‘Navigation’ tab and go to the study ERP000001. You can also go directly to the study page by using the URL: http://www.ebi.ac.uk/ena/data/view/ERP000001.

Question 2. What is the title of the study?

The title is displayed at the top of the page after the accession number.

Question 3. How many different samples were used in this study?

Links to the samples are displayed in the ‘Navigation’ tab. The associated sample accessions are displayed as a range.

Question 4. What sequencing platform was used for the first ten runs?

The ‘Read Files’ tab contains a table of all runs associated with the study. Each row corresponds to one run. Data submitted to the read domain are associated with runs; runs are associated with experiments, and experiments are associated with studies and samples. The ‘Instrument Model’ column contains the sequencing platform.
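The chain of associations described above can be sketched as a small data model. This is an illustration only: the class and field names are assumptions, placing the instrument model on the experiment is an assumption, and every value except the study accession ERP000001 is a placeholder rather than a real record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    accession: str


@dataclass(frozen=True)
class Sample:
    accession: str


@dataclass(frozen=True)
class Experiment:
    accession: str
    study: Study            # each experiment belongs to a study
    sample: Sample          # and is linked to a sample
    instrument_model: str   # assumption: platform recorded at this level


@dataclass(frozen=True)
class Run:
    accession: str
    experiment: Experiment  # each run belongs to an experiment


study = Study("ERP000001")
sample = Sample("SAMPLE_A")  # placeholder accession
experiment = Experiment("EXPERIMENT_A", study, sample, "EXAMPLE_INSTRUMENT")
run = Run("RUN_A", experiment)  # placeholder accession

# Walk from a run (one table row) up to its study and platform.
print(run.experiment.study.accession)
print(run.experiment.instrument_model)
```

Reading one row of the ‘Read Files’ table corresponds to following these links from a run outward.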

Question 5. Which library strategy was used for the runs on the first page?

Please click the ‘Select columns’ link and select ‘Library Strategy’. This will display an additional column in the table containing the library strategy.

Question 6. Are all archive-generated FASTQ files available for download for this study?

Step 3. Download read domain data

You can view and download the full list of files associated with the study by clicking the ‘TEXT’ links directly above the list of files in the ‘Read Files’ and ‘Analysis Files’ tabs. Alternatively, you can use the FileDownloader, available by clicking the ‘Download Files’ button at the top of the tabs. Note that there are some browser compatibility issues; at this time we recommend using Firefox.
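Once the file list has been saved from a ‘TEXT’ link, it can be inspected with a short script. The sketch below assumes the download is tab-separated with a header row, that it has columns named run_accession and fastq_ftp, and that multiple file links in one cell are separated by semicolons. All of these are assumptions to check against the actual file, and the inline sample rows are placeholders, not real records.

```python
import csv
import io

# Placeholder stand-in for the downloaded text report.
SAMPLE_REPORT = (
    "run_accession\tfastq_ftp\n"
    "RUN_A\thost.example/RUN_A_1.fastq.gz;host.example/RUN_A_2.fastq.gz\n"
    "RUN_B\t\n"
)


def fastq_links(report_text: str) -> dict:
    """Map each run accession to its list of FASTQ links (may be empty)."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    links = {}
    for row in reader:
        field = row.get("fastq_ftp") or ""
        # Drop empty pieces so a blank cell yields an empty list.
        links[row["run_accession"]] = [u for u in field.split(";") if u]
    return links


links = fastq_links(SAMPLE_REPORT)
# Runs with no FASTQ link listed, relevant to Question 6.
missing = [run for run, urls in links.items() if not urls]
print(links)
print(missing)
```

To use a real download, read the saved file into a string and pass it to fastq_links in place of SAMPLE_REPORT.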

Answers

Question 1. The Wellcome Trust Sanger Institute.

Question 2. Population genomics of domestic and wild yeasts.

Question 3. Twelve; three Saccharomyces cerevisiae strains and nine Saccharomyces paradoxus strains.

Question 4. Illumina Genome Analyzer.

Question 5. WGS; whole genome sequencing.

Question 6. Yes. The archive-generated FASTQ file format is described here.
