How to search ENA with text

Text search

You can search ENA using free text, such as:

  • a gene name, for example, P53;
  • a disease name, such as diabetes mellitus;
  • an ENA accession number, for example, BN000065;
  • a UniProt accession number, for example, Q00987;
  • a keyword, for example, mRNA cap structure;
  • a data class, for example, CON;
  • a taxonomic division, for example, HUM.

If you search using an ENA accession number, your search results will take you directly to the appropriate entry. If you search using any other text, your search results will consist of a list of entries (Figure 15). The browser also supports accession ranges and lists.


Figure 15. Searching ENA using an accession number or simply any free text.

Notes

[A] The text search box accepts any free text.

[B] Searching using an ENA accession number (e.g. BN000065) takes you directly to that entry.

[C] Searching using any other free text (e.g. kinase) provides a list of relevant entries.

Steps

1. Open the ENA Browser in a new window.

2. Type the search term human into the text search box [A].

3. Click ‘Search’ [D] to obtain search results.

Information

Search terms consisting of two or more words are searched together, so every word must be found for an entry to match. For example, searching for lung cancer is equivalent to searching for lung AND cancer (the AND is inserted automatically). To search for terms independently, insert an OR between them: for example, searching for lung OR cancer reports entries that contain lung, cancer, or both (caution: this returns a very long list).
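The AND/OR behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the ENA Browser's actual implementation: the function names expand_query and matches are hypothetical, the operators are assumed to be written in upper case, and AND is assumed to bind more tightly than OR.

```python
def expand_query(query: str) -> str:
    """Insert an explicit AND between adjacent terms that have no operator between them."""
    operators = {"AND", "OR"}  # assumption: operators are upper case, as in the examples
    expanded = []
    for token in query.split():
        # Two plain terms in a row get an implicit AND between them.
        if expanded and token not in operators and expanded[-1] not in operators:
            expanded.append("AND")
        expanded.append(token)
    return " ".join(expanded)


def matches(query: str, text: str) -> bool:
    """Return True if the text satisfies the query (assumption: AND binds tighter than OR)."""
    words = set(text.lower().split())
    # Each OR-alternative is a group of terms that must all be present.
    alternatives = expand_query(query).split(" OR ")
    return any(
        all(term.lower() in words for term in alternative.split(" AND "))
        for alternative in alternatives
    )


print(expand_query("lung cancer"))              # lung AND cancer
print(matches("lung cancer", "lung tissue"))     # False: 'cancer' is missing
print(matches("lung OR cancer", "lung tissue"))  # True: 'lung' alone is enough
```

The sketch shows why lung OR cancer returns a far longer list than lung cancer: an entry needs only one of the terms to qualify.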

Text search results page

When you search ENA using any term other than an ENA accession number, you obtain a list of search results grouped by data type (Figure 16).

Assembled nucleotide entries from EMBL-Bank include:

  • Annotated Nucleotide Sequences: entries from the STD, EST, GSS, HTG, HTC, PAT, STS and TSA data classes;
  • Whole Genome Shotgun Sequences: entries from the WGS data class;
  • Genomic Constructed Sequences: entries from the CON data class;
  • Protein-coding Sequences: entries from the CDS data class.
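The grouping of data classes into result categories can be written down as a simple lookup table. The sketch below is illustrative only: the names RESULT_GROUPS, CLASS_TO_GROUP and group_for are hypothetical and are not part of any ENA tool. The high-throughput genome class is written with its standard code, HTG.

```python
from typing import Optional

# Result category -> EMBL-Bank data classes, as listed in this section.
RESULT_GROUPS = {
    "Annotated Nucleotide Sequences": ["STD", "EST", "GSS", "HTG", "HTC", "PAT", "STS", "TSA"],
    "Whole Genome Shotgun Sequences": ["WGS"],
    "Genomic Constructed Sequences": ["CON"],
    "Protein-coding Sequences": ["CDS"],
}

# Inverted index: data class -> result category.
CLASS_TO_GROUP = {
    data_class: group
    for group, classes in RESULT_GROUPS.items()
    for data_class in classes
}


def group_for(data_class: str) -> Optional[str]:
    """Return the result category for a data class code, or None if it is not listed."""
    return CLASS_TO_GROUP.get(data_class.upper())


print(group_for("CON"))  # Genomic Constructed Sequences
print(group_for("est"))  # Annotated Nucleotide Sequences
```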

Raw nucleotide data from SRA can be viewed grouped by a study, sample, experiment, run or submission.

Other data includes:

  • Projects, which group together both assembled and raw data for a specific sequencing project;
  • Taxonomy, which groups together both assembled and raw data for a taxonomic group.


Figure 16. ENA browser results page for a text search on ‘human’.

Notes

[A] Assembled Nucleotide Sequences provides a list of relevant entries grouped by data class from EMBL-Bank Release and EMBL-Bank Update (data deposited after the current release date).

[B] Raw Nucleotide Sequences provides a list of SRA (next-generation sequencing data) entries grouped by study, sample, experiment, run or submission.

[C] Other provides a list of assembled and raw data grouped by the sequencing project ('Projects') or taxonomy ('Taxa').

[D] Expand to view entries in each category.

Information

Trace Archive data can be obtained by searching on either the TI accession or TI name.