|Annotation||The process of attaching additional information to biological entities. Annotation can be structural (i.e. identification of the elements from a sequence, such as protein coding regions or the location of regulatory motifs) or functional (i.e. adding biological information to the identified elements, such as the biological function of a protein domain or an entire protein, or the molecular interactions or regulatory role of a nucleotide sequence). Annotation can either be applied automatically or can be manually added (in a process called 'curation') from various sources, such as the scientific literature. At EMBL-EBI, we use a combination of automatic and manual annotation to enrich our databases. |
|ChEBI||Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds (i.e. excluding biopolymers such as proteins and nucleic acids). http://www.ebi.ac.uk/chebi |
|Computationally infer||To ascertain or conclude by automated means and without human oversight. |
|Cross-reference||An instance within a database which refers to related or synonymous information in another database. Biological databases cross-reference each other using accession numbers and/or IDs as a way of linking their related knowledge together. |
|Digital Object Identifier||A Digital Object Identifier [no-glossary](DOI)[/no-glossary] is a unique alphanumeric string that is used to identify content. The DOI can be associated with metadata, including a URL to the document. A DOI is useful because it is permanent, whereas a document's location and other metadata may change. http://www.doi.org/ |
|Ensembl Compara||The database that holds all the comparative genomic information in Ensembl. This includes pairwise and multiple whole-genome alignments, syntenies, genome conservation analyses, protein families and gene trees from which orthologues and paralogues are inferred. |
|Extracellular||The part of a multicellular organism outside the cells proper, usually taken to be outside the plasma membranes, and occupied by fluid. |
|GPCRs||G protein-coupled receptors are a large and diverse group of proteins that are involved in many biological processes, including photoreception, regulation of the immune system, and nervous system transmission. |
|Gene ontology||Gene Ontology (GO) is a controlled vocabulary used to describe the biology of a gene product in any organism. There are 3 independent sets of vocabularies, or ontologies, that describe: the molecular function of a gene product, the biological process in which the gene product participates and the cellular component where the gene product can be found (http://www.geneontology.org). |
|IntAct||An EBI-hosted database of molecular interactions. Most of the interactions hosted in IntAct are protein–protein interactions. IntAct represents the interactions with a high level of detail, following the guidelines of the International Molecular Exchange (IMEX) Consortium. http://www.ebi.ac.uk/intact/ |
|Ontology||A formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts. |
|Protein–protein interaction||Protein–protein interactions occur when two or more proteins bind together, often to carry out their biological function. |
|Reactome||A database of the biological processes known to be associated with particular proteins in humans. It is a collaboration between the EBI, The Ontario Institute For Cancer Research, Cold Spring Harbor Laboratory and New York University Medical Center. www.reactome.org |
|Reactome Pathway Browser||The Reactome Pathway Browser is the primary means of navigating and viewing pathway graphics in Reactome. The pathway graphics are interactive and the browser includes tools for analysing the pathway. |
|Small molecules||Low molecular weight organic compounds which are not polymers. |
|UniProt||UniProt – Universal Protein Resource: The world's most comprehensive catalogue of information on proteins and a central repository of protein sequence and function, created by joining the information contained in UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, and PIR. http://www.ebi.ac.uk/uniprot/ |
|accession number||A unique, relatively stable identifier given to a database record, which allows you to track different versions of that record over time in a single data repository.
For example, in the ArrayExpress Archive, experiments and array designs are given unique accession numbers in the format E-XXXX-n for experiments and A-XXXX-n for array designs. XXXX is a four-letter code indicating the source of submission and n is a number, e.g. E-MEXP-568. Some experiments also have secondary accession numbers.
In the UniProt database, proteins have unique UniProt Accession Numbers (e.g. P04637) and UniProt Protein IDs (e.g. P53_HUMAN). UniProt accessions are unique to specific protein isoforms in specific species, and are used as the standard method for uniquely referencing a protein in EBI resources. UniProt accessions cross-link the entries in various UniProt databases. Most often, researchers will find it useful to follow the UniProt accession back to an entry in UniProtKB/Swiss-Prot to view a curated summary of known information about that protein.
There is an 'ID Mapping' tool on the UniProt homepage which can be useful for converting Accession Numbers to corresponding identifiers in other databases. |
|curation||In the context of biological databases, curation is the process of interpreting and representing biological data using standardised annotation, controlled vocabularies and standardised formats, so the data can be stored and made available to the scientific community. |
|curator||A professional scientist who collects, annotates, and validates information that is disseminated by biological and model organism databases. The role of a biocurator encompasses quality control of primary biological research data intended for publication, extracting and organizing data from original scientific literature, and describing the data with standard annotation protocols and vocabularies that enable powerful queries and biological database inter-operability. Curators communicate with researchers to ensure the accuracy of curated information and to foster data exchanges with research laboratories. |
|database schema||The structure of a database system, described in a formal language supported by the database management system (DBMS). It refers to the organization of data as a blueprint of how a database will be constructed (divided into database tables). The formal definition of a database schema is a set of formulas (sentences) called integrity constraints imposed on a database. These integrity constraints ensure compatibility between parts of the schema. All constraints are expressible in the same language. A database can be considered a structure in realization of the database language. The states of a created conceptual schema are transformed into an explicit mapping, the database schema. This describes how real-world entities are modeled in the database. |
|gene||A molecular unit of heredity of a living organism. Genes hold the information to build and maintain an organism's cells and pass genetic traits to offspring. All organisms have many genes corresponding to various biological traits, some of which are immediately visible, such as eye color or number of limbs, and some of which are not, such as blood type or increased risk for specific diseases, or the thousands of basic biochemical processes that comprise life. |
|gene expression||The process by which information from a gene is used in the synthesis of a functional product. Gene expression is the most fundamental level at which the genotype gives rise to the phenotype. The genetic code stored in DNA is "interpreted" by gene expression, and the properties of the expression give rise to the organism's phenotype. |
|manually infer||To ascertain or come to a conclusion via manually defined qualifiers or parameters. |
|orthologue||Genes that are found in different species that evolved from a common ancestral gene by speciation. E.g. the human gene BRCA2 and the mouse gene Brca2 are orthologues. Often, orthologues retain the same function in the course of evolution (see paralogue for comparison). |
|relational database||A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed easily.
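
Several entries above (accession number, cross-reference, database schema, relational database) describe parts of the same idea, so a small sketch may help tie them together. The example below uses Python's built-in `sqlite3` module to define a two-table schema in which a record is identified by its accession number and linked to identifiers in other databases. The table names, column names, and the `EXAMPLE_DB` / `EX-0001` values are hypothetical and chosen only for illustration. They do not reflect how UniProt or any EMBL-EBI resource is actually structured.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce REFERENCES

# The schema: formally described tables plus integrity constraints.
conn.executescript("""
CREATE TABLE protein (
    accession  TEXT PRIMARY KEY,       -- accession number identifies the record
    protein_id TEXT NOT NULL UNIQUE    -- human-readable protein ID
);
CREATE TABLE cross_reference (
    accession   TEXT NOT NULL REFERENCES protein(accession),
    external_db TEXT NOT NULL,         -- name of the other database
    external_id TEXT NOT NULL          -- identifier used in that database
);
""")

# Accession and protein ID taken from the glossary entry above.
conn.execute("INSERT INTO protein VALUES (?, ?)", ("P04637", "P53_HUMAN"))
# Placeholder cross-reference: NOT a real database or identifier.
conn.execute(
    "INSERT INTO cross_reference VALUES (?, ?, ?)",
    ("P04637", "EXAMPLE_DB", "EX-0001"),
)


def lookup(accession):
    """Return (protein_id, [(external_db, external_id), ...]) or None."""
    row = conn.execute(
        "SELECT protein_id FROM protein WHERE accession = ?", (accession,)
    ).fetchone()
    if row is None:
        return None
    xrefs = conn.execute(
        "SELECT external_db, external_id FROM cross_reference "
        "WHERE accession = ? ORDER BY external_db, external_id",
        (accession,),
    ).fetchall()
    return row[0], xrefs


def violates_schema():
    """Try to cross-reference a record that does not exist."""
    try:
        conn.execute(
            "INSERT INTO cross_reference VALUES (?, ?, ?)",
            ("NOT_A_RECORD", "EXAMPLE_DB", "EX-0002"),
        )
    except sqlite3.IntegrityError:
        return True  # the integrity constraint rejected the row
    return False


print(lookup("P04637"))
print(violates_schema())
```

In this sketch the accession number is the primary key, so looking up a record and following its cross-references are both simple queries. The foreign-key constraint is one example of an integrity constraint in the schema: a cross-reference cannot be stored for an accession that has no record.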