Help - Databases and Programs


The EBI provides a range of search engines for querying many types of molecular biology databases, including biological ontologies, nucleotide sequence, protein sequence, proteomic, protein function/interactions, gene expression & microarray, gene ontologies, biological structure, scientific literature & text mining and metabolic pathway databases. See the 'EBI Database Queries' page, or go to the databases page, browse to your chosen database and see which options are offered there. This page covers sequence comparisons: it lists which programs are available for each database.

The databases are accessible through a variety of tools. Once a suitable program has been chosen on a tool's page (where the tool offers a program option), a choice of databases to query against becomes available. The databases consist of the following types:

Database Description Tool Program
ASD
asd
The Alternative Splicing Database (ASD) Project aims to understand the mechanism of alternative splicing on a genome-wide scale by creating a database of alternative splice events and the resultant isoform splice patterns of genes from human, and other model species. WU-BLAST2 - ASD Nucleotide WU-BLASTN, WU-TBLASTX, WU-TBLASTN.
WU-BLAST2 - ASD Protein WU-BLASTP, WU-BLASTX.
FASTA - ASD Nucleotide FASTA3, tfastx3, tFASTY3.
FASTA - ASD Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
DSSP
dssp
Dictionary of Secondary Structure of Proteins. Defines the secondary structure of proteins given a set of 3D coordinates. DSSP can be accessed via our FTP site and searched via SRS (Sequence Retrieval System).
EMBL
embl
EMBL Nucleotide Sequence Database, Europe's primary nucleotide sequence resource. The main sources of the DNA and RNA sequences in the database are submissions from individual researchers, genome sequencing projects and patent applications.
trans
FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
NCBI-BLAST2 - Nucleotide blastn.
WU-BLAST2 - Nucleotide WU-BLASTN, WU-TBLASTX, WU-TBLASTN.
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
EB-eye  
EMBLNEW
embl
EMBL Updates. FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
NCBI-BLAST2 - Nucleotide blastn.
WU-BLAST2 - Nucleotide WU-BLASTN, WU-TBLASTX, WU-TBLASTN.
EB-eye  
EMBLALL
embl
EMBL + Updates. FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
EB-eye  
EMBL Divisions
(e.g.HUMAN)
embl
EMBL by Division. FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
EEST
embl
EMBL ESTs. FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
NCBI-BLAST2 - Nucleotide blastn.
WU-BLAST2 - Nucleotide WU-BLASTN, WU-TBLASTX, WU-TBLASTN.
EMVEC
embl
EMBL Vectors. BLAST2 EVEC - Nucleotide blastn.
FASTA - Nucleotide FASTA3, tfastx3, tFASTY3.
NCBI-BLAST2 - Nucleotide blastn.
WU-BLAST2 - Nucleotide WU-BLASTN.
EMBL SVA
embl sva
The EMBL Sequence Version Archive is a repository of all entries which have ever appeared in the EMBL Nucleotide Sequence Database. EMBL-SVA fetch - retrieves single or batch entries. EMBL-SVA fetch program.
Dbfetch  
EPO Patent Protein Sequences
epo
Patented sequences from the European Patent Office. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Ensembl
ensembl
Maintains automatic annotation of large eukaryotic genomes. Ensembl Genome Browser Export, BLAST, Data Mining, SSAHA.
EB-eye  
HGVbase
hgvbase
HGVbase (Human Genome Variation database) consists of all known sequence variations in the human genome. FASTA - Nucleotide FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
SNP-FASTA3 FASTA3, tfastx3, tFASTY3.
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
IMGT/HLA
hla
The International ImMunoGeneTics database, comprising the IMGT/HLA database of the human MHC complex. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST.
IMGT/LIGM
imgt
The International ImMunoGeneTics database at the Laboratoire d'ImmunoGénétique Moléculaire, comprising the IMGT/LIGM database of immunoglobulins and T-cell receptors. WU-BLAST2 - Nucleotide WU-BLASTN, WU-TBLASTN, WU-TBLASTX.
NCBI-BLAST2 - Nucleotide blastn.
IntAct
intact
The IntAct sequence database is derived from UniProt entries and data from MassSpec experiments submitted to the IntAct protein-interaction database. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
Position specific iterative BLAST PSI-BLAST
EB-eye  
InterPro
interpro
An integrated documentation resource for protein families, domains and functional sites. Text Search
SRS Search SRS Program.
InterProScan InterProScan Program.
Dbfetch Dbfetch Program.
EB-eye  
IPI
ipi
International Protein Index, a non-redundant human proteome set constructed from UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, Ensembl and RefSeq.
FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST
Dbfetch Dbfetch Program.
JPO Patent Protein Sequences
jpo
Patented sequences from the Japanese Patent Office. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Ligand-Gated Ion Channel Database
ligc
The LGICdb contains entries of ligand-activated ion channel subunits covering the three different superfamilies of extracellularly activated ligand-gated ion channel subunits. Can be searched via Text Search.  
FASTA - LGIC Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
FASTA - LIGC Nucleotide FASTA3, tfastx3, tFASTY3.
Parasite-Genome
parasites
A parasite-specific nucleotide sequence database. Parasite Genomes WU-BLAST2 WU-BLASTN, WU-TBLASTX, WU-TBLASTN.
PDB
pdb
Protein sequences from the Protein Data Bank (formerly the Brookhaven Protein Data Bank), the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
EB-eye  
Prints
prints
A fingerprint is a group of conserved motifs used to characterise a protein family. Prints is a compendium of such protein fingerprints. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs, their full diagnostic potency deriving from the mutual context afforded by motif neighbours. FingerPRINTScan FingerPRINTScan program.
PROSITE
prosite
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs. PPSearch PPSearch Program.
Structural Genomics Targets Database
msd
Structural Genomics targets include public domain SG and PDB targets, including pre-released sequences. The server will soon include SPINE targets. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
UniProtKB/Swiss-Prot
Swiss-Prot

UniProtKB/Swiss-Prot Protein Database, a curated protein sequence database that provides a high level of annotation, a minimal level of redundancy and a high level of integration with other databases.
FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Can be searched via SRS (Sequence Retrieval System) SRS Program.
UniProt Search - Full text, advanced search, set manipulation and search filtering.
UniProtKB/TrEMBL
TrEMBL
UniProtKB/TrEMBL is a computer-annotated supplement to UniProtKB/Swiss-Prot. UniProtKB/TrEMBL contains the translations of all coding sequences (CDS) present in the EMBL Nucleotide Sequence Database, which are not yet integrated into UniProtKB/Swiss-Prot.
Can be searched via SRS (Sequence Retrieval System) SRS Program.
UniProt Search - Full text, advanced search, set manipulation and search filtering.  
UniProtKB
UniProtKB
Curated protein information, including function, classification, and cross-references. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
SRS - Sequence Retrieval System  
UniProt Search - Full text, advanced search, set manipulation and search filtering.
EB-eye  
UniRef
UniRef
The UniRef (UniProt Reference Clusters) databases combine closely related sequences into a single record to speed searches. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
SRS - Sequence Retrieval System  
UniRef Search - Full text, advanced search, set manipulation and search filtering.
UniParc
uniparc
A comprehensive non-redundant protein sequence archive. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST 
Dbfetch - Dbfetch program for fetching up to 50 whole entries Dbfetch Program.
SRS - Sequence Retrieval System  
UniParc Search - Full text, advanced search, set manipulation and search filtering.
USPTO Patent Protein Sequences
uspop
Patented sequences from the United States Patent and Trademark Office. FASTA - Protein FASTA3, FASTAX3, FASTY3, FASTF3, FASTS3.
NCBI-BLAST2 - Protein BLASTP, BLASTX.
WU-BLAST2 - Protein WU-BLASTP, WU-BLASTX.
Pattern Hit Initiated BLAST PSI-BLAST, PHI-BLAST
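
The table above can also be read as a lookup from database to tool to program. The following is a minimal Python sketch of that idea, not an EBI interface: the names AVAILABLE, tools_for and programs_for are hypothetical, and only the sequence-search rows of four databases are copied in, with program names spelled as in the table. It illustrates that the program choice depends on the database, for example that EMBLALL is listed with FASTA only while EMBL is also listed with both BLAST tools.

```python
# Program lists copied from the table above (sequence-search rows only).
NUCLEOTIDE_FASTA = ["FASTA3", "tfastx3", "tFASTY3"]
PROTEIN_FASTA = ["FASTA3", "FASTAX3", "FASTY3", "FASTF3", "FASTS3"]

# Hypothetical lookup structure: database name -> tool -> programs.
# Only a handful of rows are included for illustration.
AVAILABLE = {
    "EMBL": {
        "FASTA": NUCLEOTIDE_FASTA,
        "NCBI-BLAST2": ["blastn"],
        "WU-BLAST2": ["WU-BLASTN", "WU-TBLASTX", "WU-TBLASTN"],
    },
    "EMBLALL": {
        "FASTA": NUCLEOTIDE_FASTA,
    },
    "EMVEC": {
        "FASTA": NUCLEOTIDE_FASTA,
        "NCBI-BLAST2": ["blastn"],
        "WU-BLAST2": ["WU-BLASTN"],
    },
    "UniProtKB": {
        "FASTA": PROTEIN_FASTA,
        "NCBI-BLAST2": ["BLASTP", "BLASTX"],
        "WU-BLAST2": ["WU-BLASTP", "WU-BLASTX"],
    },
}


def tools_for(database):
    """Return the tools listed for a database, sorted by name.

    Raises KeyError if the database is not in this partial copy of the table.
    """
    return sorted(AVAILABLE[database])


def programs_for(database, tool):
    """Return the programs a tool offers for a database, or [] if none are listed."""
    return list(AVAILABLE[database].get(tool, []))


if __name__ == "__main__":
    for name in AVAILABLE:
        for tool in tools_for(name):
            print(name, tool, ", ".join(programs_for(name, tool)))
```

Keying the structure by database first mirrors how the table is organised: each database row carries its own tool and program list, so a tool missing from a row simply yields an empty list.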