Reference databases for biodiversity studies

A reference database is a curated collection of genetic sequences from known organisms, often reviewed by taxonomic experts to ensure accuracy. Data from public repositories like NCBI or BOLD require curation due to frequent misidentifications. An ideal reference database offers broad taxonomic coverage, includes high-quality sequences (such as those obtained through Sanger sequencing), and provides accurate taxonomic annotations. Additionally, associated metadata enhances the utility of the database for biogeographic studies.
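As a rough illustration of the kind of checks curation can involve, the sketch below filters a set of reference records by sequence length, ambiguous bases, and completeness of taxonomic annotation. The `ReferenceRecord` structure, the list of required ranks, and the thresholds are assumptions made for this example; they are not taken from any particular database's curation pipeline.

```python
from dataclasses import dataclass

# Ranks every retained record must carry (an assumption for this sketch).
REQUIRED_RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


@dataclass
class ReferenceRecord:
    """Hypothetical container for one reference sequence and its annotation."""
    seq_id: str
    sequence: str
    taxonomy: dict  # maps rank name -> taxon name


def is_high_quality(record, min_length=100, max_ambiguous_fraction=0.0):
    """Return True if the record passes basic length, base-call, and taxonomy checks."""
    seq = record.sequence.upper()
    # Reject empty or too-short sequences.
    if not seq or len(seq) < min_length:
        return False
    # Count anything that is not an unambiguous base (e.g. N, R, Y).
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    if ambiguous / len(seq) > max_ambiguous_fraction:
        return False
    # Require a non-empty name at every expected rank.
    return all(record.taxonomy.get(rank) for rank in REQUIRED_RANKS)


def curate(records, **thresholds):
    """Keep only the records that pass is_high_quality, preserving input order."""
    return [r for r in records if is_high_quality(r, **thresholds)]
```

Automated filters like this only catch formal problems. Misidentified specimens still need review by someone who knows the taxonomic group.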

The table below lists a few commonly used reference databases:

| Database name | Target gene | Target organisms | Reference | Database link |
|---|---|---|---|---|
| SILVA | 16S & 18S rRNA | Bacteria, Archaea, Protists, Fungi, Metazoa | Quast et al. 2013 | Silva (arb-silva.de) |
| PR2 | 18S rRNA | Protists, All eukaryotes | Guillou et al. 2013 | The PR2 databases (pr2-database.org) |
| UNITE | ITS | Fungi, All eukaryotes | Kõljalg et al. 2019 | UNITE (ut.ee) |
| MaarjAM | 18S rRNA, ITS | Arbuscular mycorrhizal fungi | Öpik et al. 2010 | https://maarjam.ut.ee/ |
| Midori2 | COI | All eukaryotes | Leray et al. 2012 | Home (reference-midori.info) |
| MitoFish | 12S | Fish | Zhu et al. 2023 | MitoFish: Mitochondrial Genome Database of Fish (u-tokyo.ac.jp) |
| 12S vertebrateClassifier | 12S | Vertebrates | Porter 2021 | https://github.com/terrimporter/12SvertebrateClassifier |
| Diat.barcode | 18S, rbcL | Diatoms | Rimet et al. 2019 | GitHub – fkeck/diatbarcode: Access the diat.barcode database with R |

Now that you’ve explored the metabarcoding workflow, conducted bioinformatics analysis with DADA2, and learned about assigning taxonomy, take a moment to reflect on your own project. Consider how you can adapt these methods in a way that is suitable and effective for your research. What insights or techniques can you incorporate?

Next, we shift our focus to another important aspect of data-driven research: statistics. Big data analyses need thorough statistical examination to ensure that your conclusions are valid. In the upcoming section, you will explore how to apply statistical methods for biological interpretation.