MGnify Proteins BETA

Explore proteins predicted from metagenomic assemblies.


This service is currently in beta and under active development. We welcome feedback and issue reports, which you can share via our contact form.

MGnify Protein Database

The MGnify Protein Database is searchable by accession or sequence.

Protein sequences are derived from the analysis of publicly available metagenomic assemblies within MGnify using our combined gene caller, which uses both Prodigal and FragGeneScan. Each sequence is assigned an MGYP accession. MGYPs are non-redundant: proteins with exactly the same sequence are assigned the same MGYP accession.
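
The non-redundancy rule can be illustrated with a minimal Python sketch. It is not the MGnify pipeline: the function names, the sequential numbering, the 12-digit zero-padded accession width, and the whitespace and case normalisation are all assumptions made for illustration. It only shows that exactly identical sequences map to one accession.

```python
from typing import Dict, Iterable, List, Tuple


def normalise(sequence: str) -> str:
    """Remove whitespace and upper-case the sequence (an assumed normalisation step)."""
    return "".join(sequence.split()).upper()


def assign_accessions(
    sequences: Iterable[str], prefix: str = "MGYP", width: int = 12
) -> Tuple[List[str], Dict[str, str]]:
    """Give each distinct sequence one accession; exact duplicates reuse it.

    The prefix, width, and sequential numbering are hypothetical choices,
    not the scheme used by the real database.
    """
    accession_by_sequence: Dict[str, str] = {}
    assigned: List[str] = []
    for raw in sequences:
        seq = normalise(raw)
        if seq not in accession_by_sequence:
            # First time this exact sequence is seen: mint a new accession.
            number = len(accession_by_sequence) + 1
            accession_by_sequence[seq] = f"{prefix}{number:0{width}d}"
        assigned.append(accession_by_sequence[seq])
    return assigned, accession_by_sequence


if __name__ == "__main__":
    # Toy predicted proteins; the first and third are identical.
    predicted = ["MKTAYIAK", "MKTAYIAKQ", "MKTAYIAK"]
    assigned, table = assign_accessions(predicted)
    for seq, acc in zip(predicted, assigned):
        print(acc, seq)
```

In this sketch a sequence that differs by even one residue receives a different accession, which is what "exactly the same sequence" implies.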

Current release info

Name: 2024_04
Date: May 10, 2024
Count: 717,738,164

Search our non-redundant protein database using HMMER



If you use the MGnify Protein Database, please cite:

Richardson L, Allen B, Baldi G, Beracochea M, Bileschi ML, Burdett T, Burgin J, Caballero-Pérez J, Cochrane G, Colwell LJ, Curtis T, Escobar-Zepeda A, Gurbich TA, Kale V, Korobeynikov A, Raj S, Rogers AB, Sakharova E, Sanchez S, Wilkinson DJ, Finn RD. MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Research (2023). doi:10.1093/nar/gkac1080