Minimum information about species barcode nucleotide sequence
The Species BARCODE Data Standard is a biodiversity standard formulated by the Consortium for the Barcode of Life (CBOL) for reporting minimum information about species barcode nucleotide sequences. CBOL specifies requirements for reporting sample provenance and sequence quality, with the aim of creating a reference library of barcode DNA sequences integrated with related biodiversity information such as taxonomy, specimen vouchers and geo-references. Ultimately, DNA barcoding is intended to serve as a global standard for species identification.
The International Barcode of Life project (iBOL) is developing a DNA barcode reference library that will serve as a DNA-based identification system for multicellular life.
The Barcode of Life Data Systems (BOLD) is the central informatics platform for DNA barcoding, providing acquisition, storage, analysis and publication of DNA barcode records.
A suitable species barcode marker has to meet several criteria. Ideally, the barcode marker (1) can be easily amplified in one read following a standardised protocol, (2) is flanked on both sides by highly conserved regions for reliable primer annealing, and (3) can identify organisms at the species level.
Currently, CBOL approves the following loci as effective barcodes:
- for metazoa, the cytochrome c oxidase 1 (cox1) gene region
- for land plants, a two-locus barcode, the ribulose-bisphosphate carboxylase (rbcL) and maturase K (matK) gene regions (with a recommendation to also collect non-coding regions, such as the chloroplast trnH-psbA spacer region)
- for fungi, the ribosomal internal transcribed spacer (ITS) region
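The approved loci listed above can be captured as a small lookup table. The following is a minimal sketch only; the names `APPROVED_BARCODE_LOCI` and `barcode_loci` are hypothetical and do not belong to any CBOL or BOLD tooling.

```python
# Hypothetical lookup of CBOL-approved barcode loci, keyed by taxonomic group.
# The group labels and locus abbreviations are taken from the list above.
APPROVED_BARCODE_LOCI = {
    "metazoa": ["cox1"],
    "land plants": ["rbcL", "matK"],  # two-locus barcode
    "fungi": ["ITS"],
}


def barcode_loci(group):
    """Return the approved barcode loci for a group, or an empty list."""
    return list(APPROVED_BARCODE_LOCI.get(group.strip().lower(), []))


print(barcode_loci("Land plants"))
```

Returning an empty list for an unknown group keeps the caller's code simple; a stricter variant could raise an error instead.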
INSDC records that meet the criteria of the Species BARCODE Data Standard carry the keyword ‘BARCODE’.
MIMARKS includes the Species BARCODE Data Standard, which means that a MIMARKS-compliant dataset is also Species BARCODE compliant.
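As an illustration of how the keyword might be checked programmatically, the sketch below scans a record for the keyword. It assumes the record is in EMBL flat-file format, where keyword lines start with `KW` and list semicolon-separated keywords ending in a period. The function name `has_barcode_keyword` and the sample text are invented for this example.

```python
def has_barcode_keyword(flatfile_text):
    """Return True if a KW line of an EMBL-style record lists BARCODE."""
    keywords = []
    for line in flatfile_text.splitlines():
        if line.startswith("KW"):
            # Drop the line code, then the trailing period of the keyword list.
            body = line[2:].strip().rstrip(".")
            keywords.extend(k.strip() for k in body.split(";") if k.strip())
    return "BARCODE" in keywords


# Invented one-line sample, not taken from a real record.
sample = "KW   BARCODE.\n"
print(has_barcode_keyword(sample))
```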
Species BARCODE data submission
A checklist for submitting Species BARCODE sequences of the cytochrome c oxidase 1 (cox1) gene region is available from the Webin submission tool.
|Organism name||Formal taxonomic name of this metazoan organism, or informal name if unpublished/unidentified.||Arabidopsis thaliana|
|Bio-repository data||Reference to the physical specimen from which the sequence was obtained (e.g. curated museum collection, living specimen); can be structured or unstructured.||structured YMUK:12345|
|Country||Political name of country or ocean in which a sequenced sample or isolate was collected.||France, Mediterranean Sea|
Mitochondrial translation table for this organism. Choose between vertebrate (table 2) and invertebrate (table 5) codes.
|Codon Start (required to determine reading frame)||Position of the first base of the first complete codon in the sequence; translation starts from this base.||3|
|Forward Primer Name||Name of the forward direction PCR primer.||ArthFW1|
|Forward Primer Sequence||Sequences should be given in the IUPAC degenerate-base alphabet, except for the modified bases; those must be included within angle brackets.||GACATTGKG<I>T|
|Reverse Primer Name||Name of the reverse direction PCR primer.||ArthRV1|
|Reverse Primer Sequence||Sequences should be given in the IUPAC degenerate-base alphabet, except for the modified bases; those must be included within angle brackets.||CATGRTTAGAC|
|Latitude/Longitude||Geographical coordinates of the location where the specimen was collected, in decimal degrees (to 2 places).||47.94, -12.45|
|Identified by||The person that identified the organism/sample.||John White|
|Collector||Name of the person who originally collected the sample/organism.||John White|
|Collection Date||Date of collection of the original sample/organism.||12-Apr-2013|
|Strain Name||Name or identifier of the strain. Often used for mice and fly lines.||BALB/c|
|Breed||The recognised breed name of the organism.||Friesian Holstein|
|A name of the individual sample.||MP7|
|Clone Identifier||Identifier given to each clone in a sequenced library.||lib_1_9|
|Geographical Area||Political name of the area of country or ocean in which the sequenced sample or isolate was collected.||North Atlantic Ridge|
|Locality||More specific geographic location where the sequenced material was sourced. Must have 'Geographical Area' selected.||Loch Ness|
|Isolation Source||Physical geography of the sampling/isolation site.||rainforest canopy|
|Natural Host||The natural host (scientific name) of the organism from which the sequenced material was taken.||Canis lupus familiaris|
|Developmental Stage||Developmental stage of the organism, either a named stage, or a measurement of time.||blastula|
|Cell Type||Cell type from which the sequence was generated.||palisade cell|
|Tissue Type||Tissue type from which the sequence was obtained.||root|
|Sex||Sex of the organism from which the sequence was obtained.||male|
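Several of the checklist fields above have formats that can be checked before submission. The following is a minimal sketch, not the Webin validator: all function names are hypothetical, the form allowed inside angle brackets for modified bases is an assumption, and the date pattern is inferred from the `12-Apr-2013` example (it relies on English month abbreviations in the active locale).

```python
import re
from datetime import datetime

# IUPAC nucleotide codes, including degenerate bases. Modified bases are
# assumed to be written as an alphanumeric name in angle brackets, e.g. <I>.
PRIMER_RE = re.compile(r"(?:[ACGTRYSWKMBDHVN]|<[A-Za-z0-9]+>)+")


def valid_primer(seq):
    """Check a primer against the IUPAC alphabet plus bracketed bases."""
    return PRIMER_RE.fullmatch(seq) is not None


def valid_codon_start(value):
    """The first complete codon can only begin at base 1, 2 or 3."""
    return str(value).strip() in {"1", "2", "3"}


def parse_lat_lon(text):
    """Parse 'lat, lon' in decimal degrees and range-check both values."""
    lat_s, lon_s = text.split(",")
    lat, lon = float(lat_s), float(lon_s)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {text!r}")
    return lat, lon


def parse_collection_date(text):
    """Parse a date written like the checklist example, e.g. 12-Apr-2013."""
    return datetime.strptime(text, "%d-%b-%Y").date()


print(valid_primer("GACATTGKG<I>T"))
print(valid_codon_start(3))
print(parse_lat_lon("47.94, -12.45"))
print(parse_collection_date("12-Apr-2013"))
```

The example values are the ones given in the checklist table. A real submission pipeline would likely need to accept further date and coordinate formats than this sketch covers.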