Searching a protein sequence
An amino acid sequence in FASTA format can be submitted to the InterPro sequence search to determine its protein family, domain organisation or functional annotations. The search is run using the InterProScan software.
In this exercise, we will use a sequence from an unknown protein:
>UNKNOWN_SEQ
MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEH
AEFLHCKSKKFTDFDEVRQEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLIDLPGITKVPVGDQPPDIEYQIKDMI
LQFISRESSLILAVTPANMDLANSDALKLAKEVDPQGLRTIGVITKLDLMDEGTDARDVLENKLLPLRRGYIGVVNRSQK
DIEGKKDIRAALAAERKFFLSHPAYRHMADRMGTPHLQKTLNQQLTNHIRESLPALRSKLQSQLLSLEKEVEEYKNFRPD
DPTRKTKALLQMVQQFGVDFEKRIEGSGDQVDTLELSGGARINRIFHERFPFELVKMEFDEKDLRREISYAIKNIHGVRT
GLFTPDLAFEAIVKKQVVKLKEPCLKCVDLVIQELINTVRQCTSKLSSYPRLREETERIVTTYIREREGRTKDQILLLID
IEQSYINTNHEDFIGFANAQQRSTQLNKKRAIPNQGEILVIRRGWLTINNISLMKGGSKEYWFVLTAESLSWYKDEEEKE
KKYMLPLDNLKIRDVEKGFMSNKHVFAIFNTEQRNVYKDLRQIELACDSQEDVDSWKASFLRAGVYPEKDQAENEDGAQE
NTFSMDPQLERQVETIRNLVDSYVAIINKSIRDLMPKTIMHLMINNTKAFIHHELLAYLYSSADQSSLMEESADQAQRRD
DMLRMYHALKEALNIIGDISTSTVSTPVPPPVDDTWLQSASSHSPTPQRRPVSSIHPPGRPPAVRGPTPGPPLIPVPVGA
AASFSAPPIPSRPGPQSVFANSDLFPAPPQIPSRPVRIPPGIPPGVPSRRPPAAPSRPTIIRPAEPSLLD
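Before pasting a sequence into the search box, it can be useful to check that it is a well-formed FASTA record. The following is a minimal Python sketch of such a check, not part of InterPro or InterProScan. The function names `parse_fasta` and `invalid_residues` are hypothetical, and the check assumes only the 20 standard amino acid letters are expected (ambiguity codes such as X or B would be reported). Only the first line of the exercise sequence is used here to keep the example short.

```python
# Assumption: only the 20 standard amino acid one-letter codes are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA record must start with a '>' header line")
    header = lines[0][1:]
    # Sequence lines are joined so that line wrapping does not matter.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def invalid_residues(sequence):
    """Return a sorted list of characters that are not standard residues."""
    return sorted(set(sequence) - VALID_RESIDUES)


example = """>UNKNOWN_SEQ
MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEH
"""

header, sequence = parse_fasta(example)
print(header, len(sequence), invalid_residues(sequence))
```

An empty list from `invalid_residues` means every character is a standard amino acid code; anything else points to characters worth checking before submission.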
While waiting for the search to complete, carry on reading the section below.
Interpreting the results
Once the search is finished, clicking on the job identifier in the Results column opens the Results page. General information about the query protein is shown at the top, including, when available, the protein family membership under the corresponding section.
Matches to InterPro entries and member database signatures are displayed in the sequence viewer, categorised by InterPro entry type (e.g. Family, Domain). The match locations of the individual member database signatures are shown below the overall InterPro match. For more details on how to interact with the viewer, click on the (i) icons in the figure below.
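To make the layout of the viewer more concrete, here is a small Python sketch of the same idea: matches grouped by entry type and ordered by their position on the sequence. It does not read real InterProScan output. The entry names, coordinates and the `group_by_type` function are all made-up placeholders for illustration.

```python
from collections import defaultdict

# Hypothetical matches: names and coordinates are placeholders, not real results.
matches = [
    {"entry": "ENTRY_B", "type": "domain", "start": 520, "end": 620},
    {"entry": "ENTRY_A", "type": "family", "start": 1, "end": 700},
    {"entry": "ENTRY_C", "type": "domain", "start": 30, "end": 290},
]


def group_by_type(matches):
    """Group matches by entry type, each group sorted by start position."""
    grouped = defaultdict(list)
    for match in matches:
        grouped[match["type"]].append(match)
    for type_matches in grouped.values():
        type_matches.sort(key=lambda m: m["start"])
    return dict(grouped)


for entry_type, type_matches in group_by_type(matches).items():
    print(entry_type)
    for m in type_matches:
        print(f"  {m['entry']}: {m['start']}-{m['end']}")
```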
Additionally, GO term annotations, which give an indication of the protein's function and location, are shown at the bottom of the results page when available.
Figure 4. Example of a protein sequence search result page.
Looking at the results obtained from your sequence search, answer the following questions.
Click on the IPR027188 accession in the list of InterPro entries on the right-hand side of the sequence viewer to access the Dynamin-2 InterPro entry page.