Leading manufacturer of oligonucleotide arrays. |
|Annotation||The process of attaching additional information to biological entities. Annotation can be structural (i.e. identification of the elements from a sequence, such as protein coding regions or the location of regulatory motifs) or functional (i.e. adding biological information to the identified elements, such as the biological function of a protein domain or an entire protein, or the molecular interactions or regulatory role of a nucleotide sequence). Annotation can either be applied automatically or can be manually added (in a process called 'curation') from various sources, such as the scientific literature. At EMBL-EBI, we use a combination of automatic and manual annotation to enrich our databases. |
|ArrayExpress||ArrayExpress is a database of functional genomics experiments. It can be queried and its data downloaded. www.ebi.ac.uk/arrayexpress/ |
|ArrayExpress Archive||The ArrayExpress Archive is a database of functional genomics data supported by scientific publications. |
The term indicates either a single hybridisation in a microarray experiment or a single sequencing run in a high-throughput sequencing (HTS) experiment. |
|BAM||BAM is a common file format for next-generation sequencing and analysis tools. It is the compressed binary version of a SAM file. |
|ChIP-chip||A technique that combines chromatin immunoprecipitation ("ChIP") with microarray technology ("chip"). |
|Chromatin immunoprecipitation||ArrayExpress definition:
Chromatin immunoprecipitation (ChIP) is a method used to determine the location of DNA binding sites on the genome for a particular protein of interest. |
|Controlled vocabulary||A controlled vocabulary makes a database easier to search by drawing together all of the different words and phrases used to describe a concept under a single word or phrase. Synonyms are also listed and searchable, so you do not need to know the selected word or phrase in advance. |
|Domain||Independently stable tertiary structures of proteins. They are distinct functional and/or structural units and can evolve, exist and function independently. |
|EGA||European Genome-phenome Archive. A repository for genotype experiments, including information such as population and family studies. http://www.ebi.ac.uk/ega/page.php |
|European Nucleotide Archive||The European Nucleotide Archive (ENA) is a comprehensive databank of primary nucleotide sequence information. ENA provides access to both assembled sequence and unassembled (raw) sequence reads, but places them in separate databases in order to optimise accessibility and analysis. http://www.ebi.ac.uk/ena/
The complete set of assays performed in a study. |
|Ligand||A substance that specifically and reversibly binds to a biomacromolecule to form a larger complex and alters its activity or function. |
A simple tab-delimited, spreadsheet-based format that is used for annotating and communicating microarray data in a MIAME-compliant fashion. |
|MIAME||Minimum Information About a Microarray Experiment, as recommended by the Functional Genomics Data Society (FGED): http://www.mged.org/ |
Minimum Information about a high-throughput SeQuencing Experiment (http://www.mged.org/MINSEQE/). |
|Metadata||A term used to describe data that provide additional information about a particular data set. This information can include how, when and where the data set was generated and what standards were used. In the proteomics context, the addition of metadata such as peptide and protein identifications and quantification of their expression values gives meaning to a simple collection of mass spectra output files. |
|Methylation||Methylation is the addition of a methyl group to a substrate or the substitution of an atom, or group, by a methyl group.
You can find out more about methylation on the Wikipedia page: http://en.wikipedia.org/wiki/Methylation |
A DNA microarray consists of an arrayed series of thousands of microscopic spots of DNA oligonucleotides, called features, each containing picomoles of a specific DNA sequence, known as a probe (or reporter). This can be a short section of a gene or other DNA element that is used to hybridise a cDNA or cRNA sample (called the target) under high-stringency conditions. Probe-target hybridisation is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labelled targets to determine the relative abundance of nucleic acid sequences in the target. DNA microarrays can be used to measure changes in expression levels, to detect single-nucleotide polymorphisms (SNPs), and to detect genomic gains and losses, among other applications. |
|Processed data||Data that have been subjected to processing, such as normalisation, or other manipulation. |
|RMA||Robust Multichip Average (RMA) is a three-step normalisation procedure for Affymetrix data. The three steps are background correction, quantile normalisation and summarisation. |
RNA-Seq refers to the use of high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content. |
|Raw data||Data that have not been subjected to processing or any other manipulation. |
A biological material used in the study, e.g. a mouse, a tumor sample, a bacterial culture, a group of seedlings. You'll need at least one sample for each condition studied. If your experiment includes biological replicates, create a sample for each biological replicate. If your experiment uses a common reference, create this as a sample too. |
|Uracil||Uracil is one of the four nucleobases in the nucleic acid of RNA and is represented by the letter U.
You can learn more about uracil on the Wikipedia page: http://en.wikipedia.org/wiki/Uracil |
|accession number||A unique, relatively stable identifier given to a database record, which allows you to track different versions of that record over time in a single data repository.
For example, in the ArrayExpress Archive, experiments and array designs are given unique accession numbers in the format E-XXXX-n for experiments and A-XXXX-n for array designs. XXXX is a four-letter code indicating the source of submission and n is a number, e.g. E-MEXP-568. Some experiments also have secondary accession numbers.
In the UniProt database, proteins have unique UniProt Accession Numbers (e.g. P04637) and UniProt Protein IDs (e.g. P53_HUMAN). UniProt accessions are unique to specific protein isoforms in specific species, and are used as the standard method for uniquely referencing a protein in EBI resources. UniProt accessions cross-link the entries in various UniProt databases. Most often, researchers will find it useful to follow the UniProt accession back to an entry in UniProtKB/Swiss-Prot to view a curated summary of known information about that protein.
There is an 'ID Mapping' tool on the UniProt homepage, which can be useful for converting Accession Numbers to the corresponding identifiers in other databases.
|curation||In the context of biological databases, curation is the process of interpreting and representing biological data using standardised annotation, controlled vocabularies and standardised formats, so the data can be stored and made available to the scientific community. |
|curator||A professional scientist who collects, annotates, and validates information that is disseminated by biological and model organism databases. The role of a biocurator encompasses quality control of primary biological research data intended for publication, extracting and organizing data from original scientific literature, and describing the data with standard annotation protocols and vocabularies that enable powerful queries and biological database interoperability. Curators communicate with researchers to ensure the accuracy of curated information and to foster data exchanges with research laboratories. |
|gene||A molecular unit of heredity of a living organism. Genes hold the information to build and maintain an organism's cells and pass genetic traits to offspring. All organisms have many genes corresponding to various biological traits, some of which are immediately visible, such as eye color or number of limbs, and some of which are not, such as blood type or increased risk for specific diseases, or the thousands of basic biochemical processes that comprise life. |
|gene expression||The process by which information from a gene is used in the synthesis of a functional product. Gene expression is the most fundamental level at which the genotype gives rise to the phenotype. The genetic code stored in DNA is "interpreted" by gene expression, and the properties of the expression give rise to the organism's phenotype.
|high throughput sequencing||Next-generation sequencing (NGS), or high-throughput sequencing (HTS), technologies parallelise the sequencing process, producing thousands or millions of sequences at once.
You can find out more about NGS/HTS on the Wikipedia page: http://en.wikipedia.org/wiki/Next-generation_sequencing#High-throughput_sequencing |
|miRNA||microRNA. A short single-stranded, non-coding RNA that may function in the regulation of gene expression. |
|morphology||Morphology is the branch of linguistics that studies patterns of word-formation within and across languages, and attempts to formulate rules that model the knowledge of the speakers of those languages. For example, English speakers recognize that the words dog, dogs, and dog-catcher are closely related. English speakers recognize these relations from their tacit knowledge of the rules of word-formation in English. They intuit that dog is to dogs as cat is to cats; similarly, dog is to dog-catcher as dish is to dishwasher. [From wikipedia] |