Leading manufacturer of oligonucleotide arrays. |
|Annotation||The process of attaching additional information to biological entities. Annotation can be structural (i.e. identification of the elements of a sequence, such as protein-coding regions or the location of regulatory motifs) or functional (i.e. adding biological information to the identified elements, such as the biological function of a protein domain or an entire protein, or the molecular interactions or regulatory role of a nucleotide sequence). Annotation can either be applied automatically or be added manually (in a process called 'curation') from various sources, such as the scientific literature. At EMBL-EBI, we use a combination of automatic and manual annotation to enrich our databases. |
|ArrayExpress||ArrayExpress is a database of functional genomics experiments that can be queried and whose data can be downloaded. www.ebi.ac.uk/arrayexpress/ |
|ArrayExpress Archive||The ArrayExpress Archive is a database of functional genomics data supported by scientific publications. |
|Assay||The term indicates either a single hybridisation in a microarray experiment or a single sequencing run in a high-throughput sequencing (HTS) experiment. |
|Bioconductor||Bioconductor is an open-source, open-development software project for the analysis and comprehension of genomic data (http://www.bioconductor.org/). |
|Digital Object Identifier||A Digital Object Identifier (DOI) is a unique alphanumeric string that is used to identify content. The DOI can be associated with metadata, including a URL to the document. A DOI is useful because it is permanent, whereas a document's location and other metadata may change. http://www.doi.org/ |
|Experiment||The complete set of assays performed in a study. |
|Expression Atlas||Expression Atlas provides information on gene expression patterns under different biological conditions. |
|MAGE-TAB||A simple tab-delimited, spreadsheet-based format that is used for annotating and communicating microarray data in a MIAME-compliant fashion. |
|MIAME||Minimum Information About a Microarray Experiment, as recommended by the Functional Genomics Data (FGED) Society: http://www.mged.org/ |
|MIAMExpress||MIAMExpress is a MIAME-compliant web-based tool for submitting microarray data and microarray designs to the ArrayExpress (AE) database. It is suitable for submitting both single-channel (e.g. Affymetrix) and two-channel data. MIAMExpress can be used for experiments of up to 50 hybridisations. |
|MINSEQE||Minimum Information about a high-throughput SeQuencing Experiment (http://www.mged.org/MINSEQE/). |
|Metadata||A term used to describe data that provides additional information about a particular data set. This information can include: how, when and where the data set was generated and what standards were used. In the proteomics context the addition of metadata such as peptide and protein identifications and quantification of their expression values gives meaning to a simple collection of mass spectra output files. |
|Microarray||A DNA microarray consists of an arrayed series of thousands of microscopic spots of DNA oligonucleotides, called features, each containing picomoles of a specific DNA sequence, known as a probe (or reporter). This can be a short section of a gene or other DNA element that is used to hybridise a cDNA or cRNA sample (called the target) under high-stringency conditions. Probe-target hybridisation is usually detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets to determine the relative abundance of nucleic acid sequences in the target. DNA microarrays can be used to measure changes in expression levels, to detect single-nucleotide polymorphisms (SNPs), to detect genomic gains and losses, etc. |
|Processed data||Data that have been subjected to processing, such as normalisation, or other manipulation. |
|RMA||Robust Multichip Average (RMA) is a three-step normalisation procedure for Affymetrix data. The three steps are background correction, quantile normalisation and summarisation. |
|Raw data||Data that have not been subjected to processing or any other manipulation. |
|Sample||A biological material used in a study, e.g. a mouse, a tumour sample, a bacterial culture, a group of seedlings. |
|Tiling arrays||ArrayExpress definition:
Tiling arrays are a subtype of microarray chips. They differ from traditional microarrays in the nature of their probes: short fragments are designed to cover the entire genome or contiguous regions of the genome, whereas traditional DNA microarrays designed to look at gene expression use a few probes for each known or predicted gene. Depending on the probe lengths and spacing, different degrees of resolution can be achieved. Beyond individual gene expression analysis, tiling arrays are used in transcriptome mapping, ChIP-chip, MeDIP-chip and DNase-chip studies and array CGH, among others. |
|XML||Extensible Markup Language (XML) defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
You can find out more about XML on the Wikipedia page: http://en.wikipedia.org/wiki/XML |
|accession number||A unique, relatively stable identifier given to a database record, which allows you to track different versions of that record over time in a single data repository.
For example, in the ArrayExpress Archive, experiments and array designs are given unique accession numbers in the format E-XXXX-n for experiments and A-XXXX-n for array designs. XXXX is a four-letter code indicating the source of the submission and n is a number, e.g. E-MEXP-568. Some experiments also have secondary accession numbers.
In the UniProt database, proteins have unique UniProt Accession Numbers (e.g. P04637) and UniProt Protein IDs (e.g. P53_HUMAN). UniProt accessions are unique to specific protein isoforms in specific species, and are used as the standard method for uniquely referencing a protein in EBI resources. UniProt accessions cross-link the entries in various UniProt databases. Most often, researchers will find it useful to follow the UniProt accession back to an entry in UniProtKB/Swiss-Prot to view a curated summary of known information about that protein.
There is an 'ID Mapping' tool on the UniProt homepage which can be useful for converting Accession Numbers to corresponding identifiers in other databases. |
|curator||A professional scientist who collects, annotates, and validates information that is disseminated by biological and model organism databases. The role of a biocurator encompasses quality control of primary biological research data intended for publication, extracting and organizing data from original scientific literature, and describing the data with standard annotation protocols and vocabularies that enable powerful queries and biological database inter-operability. Curators communicate with researchers to ensure the accuracy of curated information and to foster data exchanges with research laboratories. |
|data matrices||ArrayExpress definition:
In a gene expression data matrix, each row represents a gene and each column represents an experimental condition or array. An entry in the data matrix usually represents the expression level or expression ratio of a gene under a given experimental condition. In addition to numerical values, the matrix can also contain additional columns for gene annotation or additional rows for sample annotation. |
|gene expression||The process by which information from a gene is used in the synthesis of a functional product. Gene expression is the most fundamental level at which the genotype gives rise to the phenotype. The genetic code stored in DNA is "interpreted" by gene expression, and the properties of the expression give rise to the organism's phenotype. |
|high throughput sequencing||Next-generation sequencing (NGS) or high-throughput sequencing (HTS) technologies parallelise the sequencing process, producing thousands or millions of sequences at once.
You can find out more about NGS/HTS on the Wikipedia page: http://en.wikipedia.org/wiki/Next-generation_sequencing#High-throughput_sequencing |
|single nucleotide polymorphism||A single base pair of DNA that is polymorphic (has alternate alleles) with respect to a population. |
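
The ArrayExpress accession format described in the 'accession number' entry (E-XXXX-n for experiments, A-XXXX-n for array designs) can be illustrated with a short Python sketch. This is only an illustration built from the glossary wording: the pattern assumes exactly four uppercase letters and a plain integer, and the names `parse_accession`, `Accession` and `KINDS` are hypothetical helpers, not part of any ArrayExpress software. Real accessions may follow rules that this simplified pattern does not cover.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, taken from the glossary text only: a prefix letter
# (E = experiment, A = array design), a four-letter code, and a number,
# separated by hyphens.
ACCESSION_RE = re.compile(r"([EA])-([A-Z]{4})-(\d+)")

KINDS = {"E": "experiment", "A": "array design"}


class Accession(NamedTuple):
    kind: str    # "experiment" or "array design"
    code: str    # four-letter code
    number: int  # the numeric part (n)


def parse_accession(text: str) -> Optional[Accession]:
    """Split an accession into its parts, or return None if it does not match."""
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        return None
    prefix, code, number = match.groups()
    return Accession(KINDS[prefix], code, int(number))


if __name__ == "__main__":
    print(parse_accession("E-MEXP-568"))  # the example from the glossary
    print(parse_accession("P04637"))      # a UniProt accession: no match
```

Returning `None` for non-matching strings, rather than raising an error, makes it easy to filter a mixed list of identifiers, such as one that also contains UniProt accessions.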