Sequence search
MGnify maintains a non-redundant database of predicted proteins obtained from the analysis of assemblies. The sequences can be searched on the MGnify website via the ‘Sequence search’ tab (Figure 12A).
Users can enter a protein query sequence in FASTA format and choose the database they wish to search against. The sequence database can be filtered either by sequence type (full-length, partial, or all proteins) or by the biome of the source study. Once the parameters have been specified and the query has been run, the user is presented with a table listing the matching proteins and their corresponding E-values. Using the ‘Customise’ button, the results can be reformatted to include, for example, the bit-score for the match, a graphical representation of the match positions, a link to the corresponding protein in UniProt (if such a protein exists), and the samples and runs from which the protein is derived (Figure 12B). Full documentation on the sequence search facility can be found here.
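Because the query must be supplied in FASTA format, it can be useful to check a sequence locally before pasting it into the search form. The following is a minimal sketch of such a check in Python; it is not part of MGnify. The function name `parse_fasta_protein`, the accepted residue alphabet, and the single-record restriction are assumptions made for illustration, and the example sequence is made up.

```python
# Hypothetical pre-submission check; not an MGnify tool or API.
# Assumed alphabet: the 20 standard residues plus B, Z, X, U, O and '*'.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def parse_fasta_protein(text):
    """Return (header, sequence) for a single-record protein FASTA string."""
    # Drop blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    header = lines[0][1:].strip()
    body = lines[1:]
    # Assumption: one query sequence per search.
    if any(ln.startswith(">") for ln in body):
        raise ValueError("expected a single sequence record")
    sequence = "".join(body).upper()
    if not sequence:
        raise ValueError("no sequence found after the header")
    invalid = sorted(set(sequence) - VALID_RESIDUES)
    if invalid:
        raise ValueError("unexpected residue codes: " + "".join(invalid))
    return header, sequence


# Made-up example query, split over two lines as FASTA permits.
query = """>example_query hypothetical protein
MKTAYIAKQRQISFVKSHFSRQ
LEERLGLIEVQAPILSRVGDGT
"""

header, sequence = parse_fasta_protein(query)
print(header, len(sequence))
```

A query that passes this check is syntactically well formed, but the check says nothing about whether the search will return matches.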
