|Annotation||The process of attaching additional information to biological entities. Annotation can be structural (i.e. identification of the elements of a sequence, such as protein-coding regions or the location of regulatory motifs) or functional (i.e. adding biological information to the identified elements, such as the biological function of a protein domain or an entire protein, or the molecular interactions or regulatory role of a nucleotide sequence). Annotation can either be applied automatically or can be manually added (in a process called 'curation') from various sources, such as the scientific literature. At EMBL-EBI, we use a combination of automatic and manual annotation to enrich our databases. |
|DIGE||Differential In-Gel Electrophoresis is a method of comparing the relative abundance of proteins in two or more samples. Samples are labelled with fluorescent dyes, combined and separated by gel electrophoresis (usually in two dimensions: isoelectric point (pI) and mass). The same protein from different samples will co-migrate on the gel. The gel can then be scanned at the wavelength that excites each fluorescent dye, recording the intensity of the fluorescence in each protein spot. This allows the relative abundance of a protein in the initial samples to be quantified. Usually, protein spots with differential abundance are excised from the gel for identification by Tandem-MS. |
|Ensembl||Ensembl is a joint project between the EMBL-EBI and the Wellcome Trust Sanger Institute that aims to develop a system that maintains automatic annotation of large eukaryotic genomes. All the software and data are free to access without any constraints. The project is primarily funded by the Wellcome Trust. It is a comprehensive source of stable annotation with confirmed gene predictions that have been integrated from external data sources. Ensembl annotates known genes and predicts new ones, with functional annotation from InterPro, OMIM, SAGE and gene families. |
|GOA||The UniProt Gene Ontology Annotation (GOA) program aims to provide high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB). |
|High-throughput technologies||High-throughput technologies (HTT) use automated equipment to address biological questions. High-throughput research can be defined as the automation of experiments such that large-scale repetition becomes feasible. |
|IntAct||An EBI-hosted database of molecular interactions. Most of the interactions hosted in IntAct are protein–protein interactions. IntAct represents the interactions with a high level of detail, following the guidelines of the International Molecular Exchange (IMEX) Consortium. http://www.ebi.ac.uk/intact/ |
|InterPro||The EBI’s integrated resource for protein motifs, families and domains. It provides a single, consistent interface of protein signatures contributed by ten different databases, each of which uses a slightly different method for deriving protein signatures. |
|Mass spectrometry||Mass spectrometry (MS) is a technique for visualising the masses of species in a sample. Species are introduced into the mass spectrometer by various methods, which result in the accumulation of charge on the molecules. The data are visualised as mass spectra, in which the charged species are arranged by the ratio of their mass to the number of charges (m/z) against a measure of relative intensity. |
|PRIDE||An EBI-hosted database of experimental evidence for peptide and protein identifications. |
|Proteome||A proteome is a set of proteins produced in an organism, system, or biological context. We may refer to, for instance, the proteome of a species (for example, Homo sapiens) or an organ (for example, the liver). The proteome is not constant; it differs from cell to cell and changes over time. To some degree, the proteome reflects the underlying transcriptome; however, protein activity is also modulated by many factors in addition to rates of production. |
|Proteomics||Proteomics is the large-scale study of proteomes. A proteome is a set of proteins produced in an organism, system, or biological context. We may refer to, for instance, the proteome of a species (for example, Homo sapiens) or an organ (for example, the liver). |
|Reactome||A database of the biological processes known to be associated with particular proteins in humans. It is a collaboration between the EBI, The Ontario Institute For Cancer Research, Cold Spring Harbor Laboratory and New York University Medical Center. www.reactome.org |
|Subcellular||Relating to a component of the cell, for example the mitochondrion, the nucleus or the plasma membrane. |
|Tandem-MS||Tandem-MS is a mass spectrometry based technique for identifying proteins in a biological sample. The proteins are first digested into shorter peptides using a protease such as trypsin before mass spectrometric analysis. Tandem-MS refers to the method of scanning inside the mass spectrometer (MS). A first MS scan is used to visualise the peptide ions. Ions of interest are then selected inside the MS and fragmented, usually via collision with neutral gas molecules. A second MS scan (hence 'Tandem') acquires data on the fragment spectrum. The fragment spectra can be identified by manual peak assignment, or in a high-throughput manner by searching against theoretical spectra generated from a database of known protein sequences, such as UniProtKB. |
|Transcriptome||The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA produced in one or a population of cells.
(Source: Wiktionary) |
|UniProt||UniProt – Universal Protein Resource: The world's most comprehensive catalogue of information on proteins and a central repository of protein sequence and function, created by joining the information contained in UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, and PIR. http://www.ebi.ac.uk/uniprot/ |
|UniProtKB||UniProtKB (UniProt Knowledgebase) is the central access point for extensive curated protein information, including function, classification, and cross-reference. |
|accession number||A unique, relatively stable identifier given to a database record, which allows you to track different versions of that record over time in a single data repository.
For example, in the ArrayExpress Archive, experiments and array designs are given unique accession numbers in the format E-XXXX-n for experiments and A-XXXX-n for array designs. XXXX is a four-letter code indicating the source of submission and n is a number, e.g. E-MEXP-568. Some experiments also have secondary accession numbers.
In the UniProt database, proteins have unique UniProt Accession Numbers (e.g. P04637) and UniProt Protein IDs (e.g. P53_HUMAN). UniProt accessions are unique to specific protein isoforms in specific species, and are used as the standard method for uniquely referencing a protein in EBI resources. UniProt accessions cross-link the entries in various UniProt databases. Most often, researchers will find it useful to follow the UniProt accession back to an entry in UniProtKB/Swiss-Prot to view a curated summary of known information about that protein.
There is an 'ID Mapping' tool on the UniProt homepage which can be useful for converting Accession Numbers to corresponding identifiers in other databases. |
|curation||In the context of biological databases, curation is the process of interpreting and representing biological data using standardised annotation, controlled vocabularies and standardised formats, so the data can be stored and made available to the scientific community. |
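
The accession number formats described in the 'accession number' entry can be checked with simple pattern matching. The sketch below is illustrative only: the ArrayExpress pattern is written directly from the E-XXXX-n / A-XXXX-n description above, while the UniProt pattern is an assumption based on the commonly published accession format and should be checked against the current UniProt documentation. The function name `classify_accession` and the identifier `A-ABCD-1` are hypothetical.

```python
import re

# ArrayExpress format as described in the glossary entry:
# E-XXXX-n (experiment) or A-XXXX-n (array design), where XXXX is a
# four-letter code and n is a number.
ARRAYEXPRESS_RE = re.compile(r"^(?P<kind>[EA])-(?P<code>[A-Z]{4})-(?P<number>\d+)$")

# Assumption: pattern for UniProt accession numbers such as P04637.
# Verify against the UniProt documentation before relying on it.
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_accession(identifier):
    """Return a rough label for the kind of accession number given."""
    match = ARRAYEXPRESS_RE.match(identifier)
    if match:
        if match.group("kind") == "E":
            return "ArrayExpress experiment"
        return "ArrayExpress array design"
    if UNIPROT_RE.match(identifier):
        return "UniProt accession"
    return "unrecognised"


if __name__ == "__main__":
    # A-ABCD-1 is a made-up identifier used only to exercise the pattern.
    for example in ["E-MEXP-568", "A-ABCD-1", "P04637", "P53_HUMAN"]:
        print(example, "->", classify_accession(example))
```

Note that a UniProt Protein ID such as P53_HUMAN is deliberately not treated as an accession number here, since the entry above distinguishes the two kinds of identifier.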