|Amino acid||Amino acids are the chemical units or "building blocks" of the body that make up proteins.
You can find out more about amino acids on the Wikipedia page: http://en.wikipedia.org/wiki/Amino_acid |
|Annotation||The process of attaching additional information to biological entities. Annotation can be structural (i.e. identification of the elements from a sequence, such as protein coding regions or the location of regulatory motifs) or functional (i.e. adding biological information to the identified elements, such as the biological function of a protein domain or an entire protein, or the molecular interactions or regulatory role of a nucleotide sequence). Annotation can either be applied automatically or can be manually added (in a process called 'curation') from various sources, such as the scientific literature. At EMBL-EBI, we use a combination of automatic and manual annotation to enrich our databases. |
|Controlled vocabulary||A controlled vocabulary makes a database easier to search by drawing together all of the different words and phrases used to describe a concept under a single word or phrase. Synonyms are also listed and searchable so that you do not need to know the selected word or phrase in advance. |
|GOA||The UniProt Gene Ontology Annotation (GOA) program aims to provide high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB). |
|Gene ontology||Gene Ontology (GO) is a controlled vocabulary used to describe the biology of a gene product in any organism. There are three independent sets of vocabularies, or ontologies, that describe: the molecular function of a gene product, the biological process in which the gene product participates, and the cellular component where the gene product can be found (http://www.geneontology.org). |
|IMEx||The International Molecular Exchange Consortium is a collaboration between several protein interaction databases that have agreed to share curation effort. IMEx datasets are curated from the literature to a consistent standard using an agreed set of rules. The curation rules can be found on the IMEx homepage. |
|IntAct||An EBI-hosted database of molecular interactions. Most of the interactions hosted in IntAct are protein–protein interactions. IntAct represents the interactions with a high level of detail, following the guidelines of the International Molecular Exchange (IMEx) Consortium. http://www.ebi.ac.uk/intact/ |
|InterPro||The EBI’s integrated resource for protein motifs, families and domains. It provides a single, consistent interface of protein signatures contributed by ten different databases, each of which uses a slightly different method for deriving protein signatures. |
|PSICQUIC||PSICQUIC is a project led by the HUPO Proteomics Standards Initiative (HUPO-PSI) that aims to standardise programmatic access to molecular interaction databases. A single query can access multiple databases using the PSICQUIC web service. |
|Proteomics||Proteomics is the large-scale study of proteomes. A proteome is a set of proteins produced in an organism, system, or biological context. We may refer to, for instance, the proteome of a species (for example, Homo sapiens) or an organ (for example, the liver). |
|Reactome||A database of the biological processes known to be associated with particular proteins in humans. It is a collaboration between the EBI, The Ontario Institute For Cancer Research, Cold Spring Harbor Laboratory and New York University Medical Center. www.reactome.org |
|UniProtKB||UniProtKB (UniProt Knowledgebase) is the central access point for extensive curated protein information, including function, classification, and cross-references. |
|XML||Extensible Markup Language (XML) defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
You can find out more about XML on the Wikipedia page: http://en.wikipedia.org/wiki/XML |
|accession number||A unique, relatively stable identifier given to a database record, which allows you to track different versions of that record over time in a single data repository.
For example, in the ArrayExpress Archive, experiments and array designs are given unique accession numbers in the format E-XXXX-n for experiments and A-XXXX-n for array designs. XXXX is a four-letter code indicating the source of submission and n is a number, e.g. E-MEXP-568. Some experiments also have secondary accession numbers.
In the UniProt database, proteins have unique UniProt Accession Numbers (e.g. P04637) and UniProt Protein IDs (e.g. P53_HUMAN). UniProt accessions are unique to specific protein isoforms in specific species, and are used as the standard method for uniquely referencing a protein in EBI resources. UniProt accessions cross-link the entries in various UniProt databases. Most often, researchers will find it useful to follow the UniProt accession back to an entry in UniProtKB/Swiss-Prot to view a curated summary of known information about that protein.
There is an 'ID Mapping' tool on the UniProt homepage which can be useful for converting Accession Numbers to corresponding identifiers in other databases. |
|curation||In the context of biological databases, curation is the process of interpreting and representing biological data using standardised annotation, controlled vocabularies and standardised formats, so the data can be stored and made available to the scientific community. |
|gene||A molecular unit of heredity of a living organism. Genes hold the information to build and maintain an organism's cells and pass genetic traits to offspring. All organisms have many genes corresponding to various biological traits, some of which are immediately visible, such as eye color or number of limbs, and some of which are not, such as blood type or increased risk for specific diseases, or the thousands of basic biochemical processes that comprise life. |
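
The ArrayExpress accession format described in the "accession number" entry above can be sketched as a simple pattern check. This is a minimal illustration rather than an official validator: the function name `parse_arrayexpress_accession` and the regular expression are hypothetical, built only from the format stated in the glossary (E-XXXX-n for experiments, A-XXXX-n for array designs), with the added assumption that XXXX is exactly four uppercase letters and n is a run of digits.

```python
import re

# Assumption: XXXX is four uppercase letters and n is one or more digits,
# as in the glossary example E-MEXP-568. Real accessions may deviate.
ARRAYEXPRESS_ACCESSION = re.compile(
    r"^(?P<kind>[EA])-(?P<code>[A-Z]{4})-(?P<number>\d+)$"
)


def parse_arrayexpress_accession(accession):
    """Return (record_type, code, number), or None if the pattern does not fit."""
    match = ARRAYEXPRESS_ACCESSION.match(accession.strip())
    if match is None:
        return None
    # "E" marks an experiment, "A" marks an array design.
    record_type = "experiment" if match.group("kind") == "E" else "array design"
    return record_type, match.group("code"), int(match.group("number"))


print(parse_arrayexpress_accession("E-MEXP-568"))  # ('experiment', 'MEXP', 568)
print(parse_arrayexpress_accession("P04637"))      # None (a UniProt-style accession)
```

A check like this only confirms that a string has the expected shape; it does not confirm that the record exists in the repository.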