2Can Support Portal - Protein Function
PPSearch - Introduction
PPSearch is a useful tool when we want to search for protein motifs in our query sequence. It rapidly compares our query sequence against all the patterns that are stored in the PROSITE pattern database.
PROSITE is a database of protein families and domains. Within PROSITE, motifs are encoded as regular expressions which are often called patterns. The process used to derive these patterns involves the construction of a multiple alignment of known homologues and then manual inspection to identify the conserved regions. These conserved regions are then reduced to single consensus expressions.
Examples of some of these PROSITE patterns are:
- Actin pattern: [FY]-[LIV]-G-[DE]-E-A-Q-x-[RKQ](2)-G
- Nuclear receptor: C-x(2)-C-x-[DE]-x(5)-[HN]-[FY]-x(4)-C-x(2)-C-x(2)-F-F-x-R
- cAMP phosphorylation: [RK](2)-x-[ST]
When amino acids appear in square brackets, any one of the listed residues can be matched at that position. The 'x' stands for any of the 20 amino acids. A number in parentheses indicates how many times the preceding element is repeated; for example, [RK](2) matches two consecutive residues that are each either R or K.
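To make the notation concrete, here is a minimal Python sketch that translates a pattern written in this syntax into an ordinary regular expression. It is not how PPSearch itself is implemented; the function name `prosite_to_regex` and the toy sequence are hypothetical, and only the syntax elements described above (literal residues, square brackets, 'x', and a repeat count) are handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supports only: single residues, [..] residue sets, 'x' wildcards,
    and an optional (n) repeat count after any of these.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as "[RK](2)" into its residue part and count.
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        if residue == "x":
            residue = "."  # 'x' matches any single residue
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)

regex = prosite_to_regex("[RK](2)-x-[ST]")
print(regex)  # [RK]{2}.[ST]

# Toy sequence made up for illustration, not a real protein.
match = re.search(regex, "MKRRASLL")
print(match.group(), match.start() + 1)  # RRAS 3
```

The reported position is 1-based, as is usual when numbering residues in a protein sequence.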
PPSearch searches the patterns in the PROSITE database looking for matches to the query sequence. Sometimes a query sequence may match more than one pattern.
Many protein families have more than one conserved region and are characterised by more than one motif. The matches that are found in our query sequence help us to determine to which family our protein sequence belongs and which domains are present in our protein.
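The idea of scanning one query against many patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two regular expressions are hand-translated from patterns listed earlier, the dictionary labels and the `scan` function are hypothetical, and the query sequence is made up. The real tool works against the full PROSITE pattern collection.

```python
import re

# Regex equivalents of two example patterns, translated by hand.
# The keys are illustrative labels, not PROSITE identifiers.
PATTERNS = {
    "cAMP phosphorylation": r"[RK]{2}.[ST]",
    "Actin": r"[FY][LIV]G[DE]EAQ.[RKQ]{2}G",
}

def scan(sequence, patterns=PATTERNS):
    """Return (name, start, matched_text) for every hit; start is 1-based."""
    hits = []
    for name, regex in patterns.items():
        # Wrapping the pattern in a lookahead reports overlapping hits too.
        for m in re.finditer(f"(?=({regex}))", sequence):
            hits.append((name, m.start() + 1, m.group(1)))
    return hits

# Made-up query sequence, not a real protein.
query = "MKRRSSAYLGDEAQAKRGWW"
for hit in scan(query):
    print(hit)
# ('cAMP phosphorylation', 2, 'KRRS')
# ('cAMP phosphorylation', 3, 'RRSS')
# ('Actin', 8, 'YLGDEAQAKRG')
```

Here the single query matches two different patterns, and one of them twice at overlapping positions, which is why a results list rather than a single answer is the natural output.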
We will next consider an example of using PPSearch.