MIxS – Minimum information about any (x) sequence
MIxS is a unified standard developed by the Genomic Standards Consortium (GSC) for reporting minimum information about any (x) nucleotide sequence. It consists of the MIGS, MIMS and MIMARKS* standards and describes fourteen environments.
MIGS, MIMS and MIMARKS share common mandatory core descriptors, differ in standard-specific elements and can be tailored to a particular environment by a subset of relevant environment-specific information components, as summarised in the figure below (Yilmaz et al., 2011).
Fourteen environmental checklists, developed by environment-specific community experts, provide an extensive range of environmental and epidemiological contextual data fields for accurate and consistent description of the sequenced sample, its environment and the sequencing experiments performed with the sample.
* MIGS - Minimum Information about a Genome Sequence; MIMS - Minimum Information about a Metagenome Sequence; MIMARKS - Minimum Information about a MARKer gene Sequence. EU - eukarya; BA - bacteria or archaea; PL - plasmid; VI - virus; ORG - organelle
MIxS checklist - MANDATORY information
The MIGS checklist, the MIMS checklist and the MIMARKS checklist, collectively called MIxS, share a set of core descriptors, which are attributes mandatory for all checklists. The MIxS shared mandatory descriptors are described in detail in the table here.
MIMARKS checklist for marker gene data
GSC MIxS terms can be reported as a structured comment during submission of assembled sequences. Guidelines for an assembled and annotated sequence submission are available here. To reach MIMARKS compliance, the structured comment of the assembled sequence must include (1) the MIxS shared mandatory descriptors, (2) the MIMARKS-specific descriptor, which is the target gene, and (3) the environment-specific mandatory descriptors. Please note that each MIxS environment has none, one or two mandatory descriptors.
The 16S rRNA marker gene can be submitted using the submission template MIMARKS-Survey 16S rRNA sequences. Guidelines for MIMARKS-compliant 16S rRNA marker gene data submission are included here.
MIMS checklist for metagenome and metatranscriptome data
GSC MIxS terms are included in the GSC MIxS sample checklists designed for reporting sample metadata of metagenomics and metatranscriptomics studies. Instructions for submission of read data and sample metadata are here. To reach MIMS compliance, the sample metadata must contain (1) the MIxS shared mandatory descriptors (see here) and (2) the environment-specific mandatory descriptors, listed for each environment here. Please note that each MIxS environment has none, one or two mandatory descriptors.
When providing sample metadata please choose the GSC MIxS environment appropriate to your study. All MIxS environmental packages are available. Note that the ENA Micro B3 checklist is also MIMS-compliant. A video tutorial outlines the interactive MIMS-compliant sample metadata submission using the ENA Micro B3 checklist as an example.
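As a rough illustration of the MIMS rule described above, the following Python sketch reports which mandatory descriptors are missing from a sample record. It assumes that the first table on this page lists the shared mandatory descriptors and that the last table lists the environment-specific ones. The function name `missing_mims_descriptors`, the dictionary layout and the sample values are hypothetical and are not part of any ENA or GSC tool.

```python
# Shared mandatory descriptors, using the structured comment names from the
# first table on this page (assumed here to be the shared mandatory set).
SHARED_MANDATORY = [
    "investigation_type", "project_name", "lat_lon", "geo_loc_name",
    "collection_date", "biome", "feature", "material", "env_package",
    "seq_meth",
]

# Environment-specific mandatory descriptors, copied from the last table on
# this page. Environments absent from that table are assumed to have none.
ENV_MANDATORY = {
    "air": ["alt"],
    "microbial mat/biofilm": ["depth", "elev"],
    "sediment": ["depth", "elevation"],
    "soil": ["depth", "elevation"],
    "water": ["depth"],
}


def missing_mims_descriptors(sample):
    """Return the mandatory descriptor names that are absent or empty."""
    env = sample.get("env_package")
    required = SHARED_MANDATORY + ENV_MANDATORY.get(env, [])
    missing = []
    for name in required:
        value = sample.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Made-up sample record for illustration only; "elevation" is left out.
sample = {
    "investigation_type": "metagenome",
    "project_name": "Example soil survey",
    "lat_lon": "38.98 -77.11",
    "geo_loc_name": "Example country: example region",
    "collection_date": "2008-01-23",
    "biome": "example biome term",
    "feature": "example feature term",
    "material": "soil",
    "env_package": "soil",
    "seq_meth": "pyrosequencing",
    "depth": "0.1",
}

print(missing_mims_descriptors(sample))  # prints ['elevation']
```

A check of this kind only confirms that the fields are present; it does not verify that their values follow the formats given in the definitions.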
MIGS checklist for genome sequence data
GSC MIxS terms can be reported as a structured comment during submission of assembled sequences. Guidelines for an assembled and annotated sequence submission are available here. To reach MIGS compliance, the structured comment of the assembled sequence must include (1) the MIxS shared mandatory descriptors, (2) the MIGS-specific descriptors and (3) the environment-specific mandatory descriptors. Please note that each MIxS environment has none, one or two mandatory descriptors.
MIxS checklist - OPTIONAL information
The majority of GSC MIxS descriptors are optional. A complete list of these attributes is beyond the scope of this page but is available from the downloadable spreadsheets here.
|Structured comment name||Item name||Definition|
|investigation_type||investigation type||Nucleic Acid Sequence Report is the root element of all MIGS/MIMS compliant reports as standardized by the Genomic Standards Consortium. This field is either eukaryote, bacteria, virus, plasmid, organelle, metagenome, miens-survey or miens-culture.|
|project_name||project name||Name of the project within which the sequencing was organized.|
|lat_lon||geographic location (latitude and longitude)||The geographical origin of the sample as defined by latitude and longitude. The values should be reported in decimal degrees and in the WGS84 system.|
|geo_loc_name||geographic location (country and/or sea,region)||The geographical origin of the sample as defined by the country or sea name followed by specific region name. Country or sea names should be chosen from the INSDC country list (http://insdc.org/country.html), or the GAZ ontology (v1.446) (http://purl.bioontology.org/ontology/GAZ)|
|collection_date||collection date||The time of sampling, either as an instance (single point in time) or interval. If no exact time is available, the date/time can be right truncated, i.e. all of these are valid times: 2008-01-23T19:23:10+00:00; 2008-01-23T19:23:10; 2008-01-23; 2008-01; 2008. Except for 2008-01 and 2008, all of these are ISO8601 compliant.|
|biome||environment (biome)||The environmental biome level covers the major classes of ecologically similar communities of plants, animals, and other organisms. Biomes are defined based on factors such as plant structures, leaf types, plant spacing, and other factors like climate. Examples include: desert, taiga, deciduous woodland, or coral reef. EnvO (v1.53) terms listed under environmental biome can be found from the link: http://www.environmentontology.org/Browse-EnvO|
|feature||environment (feature)||Environmental feature level includes geographic environmental features. Examples include: harbor, cliff, or lake. EnvO (v1.53) terms listed under environmental feature can be found from the link: http://www.environmentontology.org/Browse-EnvO|
|material||environment (material)||The environmental material level refers to the matter that was displaced by the sample, prior to the sampling event. Environmental matter terms are generally mass nouns. Examples include: air, soil, or water. EnvO (v1.53) terms listed under environmental matter can be found from the link: http://www.environmentontology.org/Browse-EnvO|
|env_package||environmental package||MIGS/MIMS/MIMARKS extension for reporting of measurements and observations obtained from one or more of the environments where the sample was obtained. All environmental packages listed here are further defined in separate subtables. By giving the name of the environmental package, a selection of fields can be made from the subtables and can be reported.|
|seq_meth||sequencing method||Sequencing method used; e.g. Sanger, pyrosequencing, ABI-solid.|
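The format rules for lat_lon and collection_date in the table above can be sketched as simple checks. This is a minimal Python illustration, not an official validator: it assumes that lat_lon is written as two whitespace-separated signed decimal numbers (latitude first), which the definition above does not state, and it handles only single time points, not intervals. The function names `parse_lat_lon` and `is_valid_collection_date` are hypothetical.

```python
import re
from datetime import datetime

# Right-truncated forms taken from the collection_date examples above.
DATE_PATTERN = re.compile(
    r"\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}:\d{2}([+-]\d{2}:\d{2})?)?)?)?"
)
DATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",  # 2008-01-23T19:23:10+00:00
    "%Y-%m-%dT%H:%M:%S",    # 2008-01-23T19:23:10
    "%Y-%m-%d",             # 2008-01-23
    "%Y-%m",                # 2008-01
    "%Y",                   # 2008
]


def parse_lat_lon(value):
    """Parse 'LAT LON' in decimal degrees; raise ValueError if invalid.

    The whitespace-separated layout is an assumption of this sketch.
    """
    parts = value.split()
    if len(parts) != 2:
        raise ValueError("expected two numbers: latitude and longitude")
    lat, lon = float(parts[0]), float(parts[1])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude or longitude out of range")
    return lat, lon


def is_valid_collection_date(value):
    """Return True if value is one of the right-truncated forms above."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)  # also rejects e.g. month 13
            return True
        except ValueError:
            continue
    return False


print(parse_lat_lon("38.98 -77.11"))
for example in ["2008-01-23T19:23:10+00:00", "2008-01", "2008-13"]:
    print(example, is_valid_collection_date(example))
```

The regular expression enforces the layout of the listed examples, while `datetime.strptime` rejects impossible calendar values such as month 13. The `%z` directive accepts offsets written with a colon from Python 3.7 onwards.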
|Structured comment name||Item name||Definition|
|ploidy||ploidy||The ploidy level of the genome (e.g. allopolyploid, haploid, diploid, triploid, tetraploid). It has implications for the downstream study of duplicated genes and regions of the genomes (and perhaps for difficulties in assembly). For terms, please select terms listed under class ploidy (PATO:0001374) of Phenotypic Quality Ontology (PATO), and for a browser of PATO (v1.269) please refer to http://purl.bioontology.org/ontology/PATO|
|num_replicons||number of replicons||Reports the number of replicons in a nuclear genome of eukaryotes, in the genome of a bacterium or archaeon, or the number of segments in a segmented virus. Always applied to the haploid chromosome count of a eukaryote.|
|estimated_size||estimated size||The estimated size of the genome prior to sequencing. Of particular importance in the sequencing of (eukaryotic) genome which could remain in draft form for a long or unspecified period.|
|ref_biomaterial||reference for biomaterial||Primary publication if isolated before genome publication; otherwise, primary genome report.|
|propagation||propagation||This field is specific to different taxa. For phages: lytic/lysogenic; for plasmids: incompatibility group. (Note: there is a strong opinion that phage propagation should be named obligately lytic or temperate, therefore we also give this choice.)|
|assembly||assembly||How was the assembly done (e.g. with a text based assembler like phrap or a flowgram assembler); estimated error rate associated with the finished sequences (e.g. error rate of 1 in 1000 bp); and the method of calculation.|
|finishing_strategy||finishing strategy||Whether the genome project was intended to produce a complete or draft genome; the fold coverage of the sequencing, expressed as 2x, 3x, 18x etc.; and how many contigs were produced for the genome.|
|isol_growth_condt||isolation and growth condition||Publication reference in the form of pubmed ID (pmid), digital object identifier (doi) or url for isolation and growth condition specifications of the organism/material.|
|Environmental package||Structured comment name||Item name||Definition|
|air||alt||altitude||The altitude of the sample is the vertical distance between Earth's surface above sea level and the sampled position in the air|
|microbial mat/biofilm||depth||depth||Depth is defined as the vertical distance below surface, e.g. for microbial mat samples depth is measured from mat surface. Depth can be reported as an interval for subsurface samples.|
|microbial mat/biofilm||elev||elevation||The elevation of the sampling site as measured by the vertical distance from mean sea level.|
|sediment||depth||depth||Depth is defined as the vertical distance below surface, e.g. for sediment samples depth is measured from sediment surface. Depth can be reported as an interval for subsurface samples.|
|sediment||elevation||elevation||The elevation of the sampling site as measured by the vertical distance from mean sea level.|
|soil||depth||depth||Depth is defined as the vertical distance below surface, e.g. for soil samples depth is measured from soil surface. Depth can be reported as an interval for subsurface samples.|
|soil||elevation||elevation||The elevation of the sampling site as measured by the vertical distance from mean sea level.|
|water||depth||depth||Depth is defined as the vertical distance below surface, e.g. for water samples depth is measured from water surface. Depth can be reported as an interval for subsurface samples.|