
Searching a protein sequence

An amino acid sequence in FASTA format can be submitted to the InterPro sequence search to determine its protein family, domain organisation or functional annotations. The search uses the InterProScan software.

In this exercise, we will use a sequence from an unknown protein:

>UNKNOWN_SEQ
MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEH
AEFLHCKSKKFTDFDEVRQEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLIDLPGITKVPVGDQPPDIEYQIKDMI
LQFISRESSLILAVTPANMDLANSDALKLAKEVDPQGLRTIGVITKLDLMDEGTDARDVLENKLLPLRRGYIGVVNRSQK
DIEGKKDIRAALAAERKFFLSHPAYRHMADRMGTPHLQKTLNQQLTNHIRESLPALRSKLQSQLLSLEKEVEEYKNFRPD
DPTRKTKALLQMVQQFGVDFEKRIEGSGDQVDTLELSGGARINRIFHERFPFELVKMEFDEKDLRREISYAIKNIHGVRT
GLFTPDLAFEAIVKKQVVKLKEPCLKCVDLVIQELINTVRQCTSKLSSYPRLREETERIVTTYIREREGRTKDQILLLID
IEQSYINTNHEDFIGFANAQQRSTQLNKKRAIPNQGEILVIRRGWLTINNISLMKGGSKEYWFVLTAESLSWYKDEEEKE
KKYMLPLDNLKIRDVEKGFMSNKHVFAIFNTEQRNVYKDLRQIELACDSQEDVDSWKASFLRAGVYPEKDQAENEDGAQE
NTFSMDPQLERQVETIRNLVDSYVAIINKSIRDLMPKTIMHLMINNTKAFIHHELLAYLYSSADQSSLMEESADQAQRRD
DMLRMYHALKEALNIIGDISTSTVSTPVPPPVDDTWLQSASSHSPTPQRRPVSSIHPPGRPPAVRGPTPGPPLIPVPVGA
AASFSAPPIPSRPGPQSVFANSDLFPAPPQIPSRPVRIPPGIPPGVPSRRPPAAPSRPTIIRPAEPSLLD
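
Before pasting a sequence into a search form, it can help to check that the FASTA text is well formed. The following is a minimal Python sketch, not part of InterPro or InterProScan. The function names and the allowed alphabet (the 20 standard residues plus common IUPAC ambiguity codes) are assumptions made for illustration, and the characters the InterPro search actually accepts may differ.

```python
# Assumed alphabet: 20 standard amino acids plus common IUPAC ambiguity codes.
ALLOWED_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY" + "BJOUXZ")


def parse_fasta(text):
    """Return (header, sequence) for a FASTA string holding one record."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must start with a '>' header line")
    if any(line.startswith(">") for line in lines[1:]):
        raise ValueError("expected a single FASTA record")
    header = lines[0][1:]
    # Sequence lines are joined so that line wrapping does not matter.
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def invalid_residues(sequence):
    """Return the sorted list of characters outside the assumed alphabet."""
    return sorted(set(sequence) - ALLOWED_RESIDUES)
```

For example, calling `parse_fasta` on the record above should give the header `UNKNOWN_SEQ` and the ten sequence lines joined into one string, and `invalid_residues` on that string should return an empty list if every character is in the assumed alphabet.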

Open the InterPro website (http://www.ebi.ac.uk/interpro/) in a new tab and navigate to the InterPro Search page. Select the by sequence tab, and paste the sequence into the text box labelled FASTA Sequence. Press Search. You will be redirected to the Results/Your InterProScan searches page, where you can see the status of the search and the list of sequence searches performed in the last 7 days.
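
The exercise uses the web form, but the same kind of search can in principle be scripted. The sketch below only builds the HTTP requests and does not send them. It assumes the EMBL-EBI job service for InterProScan 5 is reachable at the base URL shown, that it accepts `email` and `sequence` form fields, and that it exposes a `status` endpoint per job. These details are assumptions that should be checked against the current EMBL-EBI documentation, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumed base URL of the EMBL-EBI InterProScan 5 job service; verify before use.
BASE_URL = "https://www.ebi.ac.uk/Tools/services/rest/iprscan5"


def build_submit_request(fasta_text, email):
    """Build (but do not send) a POST request that submits one sequence."""
    # Field names 'email' and 'sequence' are assumptions about the service.
    form = urllib.parse.urlencode({"email": email, "sequence": fasta_text})
    return urllib.request.Request(
        f"{BASE_URL}/run", data=form.encode("utf-8"), method="POST"
    )


def build_status_request(job_id):
    """Build (but do not send) a GET request that asks for a job's status."""
    return urllib.request.Request(f"{BASE_URL}/status/{urllib.parse.quote(job_id)}")
```

Passing either request to `urllib.request.urlopen` would send it. Under the assumptions above, the submission response would carry a job identifier to use in later status checks, much as the Results page lists a job identifier for each search.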

While waiting for the search to complete, carry on reading the section below.

Interpreting the results

Once the search is finished, clicking on the job identifier in the Results column opens the Results page. General information about the query protein is shown at the top, including the protein family membership under the corresponding section, when available.

Matches to InterPro entries and member database signatures are displayed in the sequence viewer. The results are grouped by InterPro entry type (e.g. Family, Domain). The match locations of the various member database signatures are shown below the overall InterPro match. For more details on how to interact with the viewer, click on the (i) icons in the figure below.
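
The grouping that the sequence viewer applies can be illustrated with a small Python sketch. The data structure below is a simplified, hypothetical one chosen for this example (it is not the InterProScan output format), and the accessions and coordinates used in the usage note are made-up placeholders.

```python
from collections import defaultdict


def group_by_entry_type(matches):
    """Group match records by entry type, ordered by start position."""
    grouped = defaultdict(list)
    for match in matches:
        grouped[match["type"]].append(match)
    # Within each type, list matches from the N-terminus to the C-terminus.
    for entries in grouped.values():
        entries.sort(key=lambda m: m["start"])
    return dict(grouped)
```

With placeholder records such as `{"type": "Domain", "accession": "EXAMPLE_B", "start": 300, "end": 420}`, the function returns a dictionary keyed by entry type, which mirrors how the viewer lists Family matches separately from Domain matches.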

Additionally, GO term annotations, which give an indication of the protein's function and location, are shown at the bottom of the results page when available.

Figure 4. Example of a protein sequence search result page.

 

By now the search should have completed. Click on the job identifier in the Results column of the Results/Your InterProScan searches section on the InterPro website to view the search output.

Looking at the results obtained from your sequence search, answer the following questions.

 

In the next section, we are going to continue the exploration of the Dynamin-2 protein family.

Click on the IPR027188 accession in the list of InterPro entries on the right-hand side of the sequence viewer to access the Dynamin-2 InterPro entry page.