This exercise will cover the basic types of protein structures as
represented in the Protein Data Bank. For a detailed explanation of
protein structure components, please see this excellent
introduction in Wikipedia.
Fold refers to a global type of arrangement, like helix-bundle or beta-barrel.
Although there are now over 40,000 experimentally
determined structures in the PDB, the number of unique folds that these
proteins adopt is limited, and all proteins can be classified into one
or more fold categories, which are annotated in databases like CATH and SCOP. Similar
functions are often associated with particular folds,
so the fold classification serves as an important tool in
understanding the possible function of a protein.
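To make the idea of broad structural classes concrete, the sketch below assigns a coarse class (all-alpha, all-beta or mixed alpha-beta, the three groups used in this exercise) from a per-residue secondary-structure string. The function name, the 'H'/'E' encoding and the 10% cutoff are assumptions chosen for illustration. CATH and SCOP rely on far more than secondary-structure content, so this is not how either database classifies proteins.

```python
from collections import Counter


def secondary_structure_class(ss, minor_cutoff=0.10):
    """Assign a coarse class from a per-residue secondary-structure string.

    ss uses 'H' for helix and 'E' for strand; any other character is coil.
    minor_cutoff is an illustrative threshold, not a CATH or SCOP rule.
    """
    if not ss:
        raise ValueError("empty secondary-structure string")
    counts = Counter(ss)
    helix = counts["H"] / len(ss)
    strand = counts["E"] / len(ss)
    if helix >= minor_cutoff and strand < minor_cutoff:
        return "all-alpha"
    if strand >= minor_cutoff and helix < minor_cutoff:
        return "all-beta"
    if helix >= minor_cutoff and strand >= minor_cutoff:
        return "alpha-beta"
    return "little regular secondary structure"


# Hypothetical strings, not derived from any PDB entry.
print(secondary_structure_class("HHHHHHHHHH--HHHHHHHH"))
print(secondary_structure_class("HHHHHH--EEEEEE--HHHH"))
```

A real assignment string would come from a secondary-structure program run on the coordinates. The point here is only that the three classes differ in which element types are present.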
All alpha-helix proteins: There are many different
families of proteins which are composed of only alpha-helices. Please
see extra details here.
Some examples to explore are given below.
PDB Entry: 1IRD
Start with the MSD home page service at http://www.ebi.ac.uk/msd. In the space provided for Get PDB by id, type in 1ird and click on Start Search. The browser will take you to the 'Atlas' pages for this structure. The atlas pages provide information about various facets of the deposited structure, including links to external sites.
1ird is the structure of human hemoglobin bound to carbon monoxide. Choose the visualization link on the page and then click on the small image of the protein. The protein is composed only of alpha-helices. You can also view the structure interactively by choosing either the Astexviewer or Rasmol link on the visualization page.
Go back and look at the Sequence section on the atlas page for this entry. You can view the Pfam classification for this family of proteins by choosing the link on the page to PF00042 or by clicking here. Also look at the GO (Gene Ontology) entry for this protein on the Sequence page. The GO entry lists both the processes and the functions this protein is involved in.
The structural classification of this protein can be viewed from the Similarity pages in the atlas pages. This protein belongs to the globin type of all alpha-helix proteins. Other entries in the PDB for the same protein are listed on the same page. Clicking on any of the related PDB entry links will take you to the summary pages for that PDB entry.
In the ligands section of the atlas pages, you can view more details of the compounds that this protein is associated with in this entry. The ligands present in this structure are carbon monoxide and heme. You can view the interactions the heme group makes with the protein by choosing the "View the interactions of HEM with 1ird" link. This will take you to our MSDsite service and will show that HEM in this structure interacts with histidine, CMO (carbon monoxide), tyrosine and others.
Color codes provide information about the type of interaction.
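The kind of ligand interaction list described above can be approximated with a simple distance search. The sketch below reports every residue that has at least one atom within a cutoff distance of any ligand atom. The function name, the 4.0 Å cutoff and all coordinates and residue labels are assumptions made up for illustration. They are not taken from 1ird, and this is not a description of how MSDsite detects or classifies interactions.

```python
import math


def contacting_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return sorted residue labels with any atom within `cutoff` of the ligand.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    ligand_atoms:  list of (x, y, z) tuples
    Distances are in the same unit as the coordinates (angstroms here).
    """
    hits = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            hits.add(label)
    return sorted(hits)


# Made-up coordinates and labels, NOT real data from any PDB entry.
protein = [
    ("HIS 87", (0.0, 0.0, 2.1)),
    ("TYR 42", (3.5, 0.0, 0.0)),
    ("LEU 10", (12.0, 9.0, 4.0)),
]
ligand = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]

print(contacting_residues(protein, ligand))
```

A plain distance cutoff only says which residues are close to the ligand. Distinguishing hydrogen bonds, metal coordination or hydrophobic contacts needs atom types and geometry as well.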
All beta-sheet proteins: These proteins are
composed of only beta-sheets. This group is fairly large and comprises
proteins with widely varying functions, from sugar-binding to metabolic
transport to antibodies. Some examples are given below for you to
explore. For each example, look at the structure as above and
pay attention to the Pfam, CATH and SCOP classification of each
entry to get a feel for the structure.
PDB entry: 1A0S
This protein is a beta-barrel protein and is involved in the transport
of maltodextrin across the outer membrane of Gram-negative bacteria.
Other proteins share a similar topology. More information
about related proteins can be found in the Pfam
entry. This family consists of proteins that are
collectively called porins.
Look at the GO and Pfam entries for this protein.
Question 1: Which compound or compounds is
this protein associated with, and what are the interactions between the
compounds and the protein? (Hint: Look at the ligands page!)
Question 2: Which other entries in
the PDB contain the same protein? (Hint: Look at the Similarity page.)
PDB entry: 1BKZ
1BKZ is the structure of a protein called galectin-7. This protein
belongs to a specific family of proteins called galectins (or S-lectins),
almost all of which bind to galactoside sugars. It
belongs to a different family of all beta-sheet proteins (Link
to SCOP entry). Look at the various links on the right side
of the atlas page for this entry for more information and try to answer
the questions below.
Question 1: What is the general function of this protein as annotated in the GO hierarchy?
Question 2: This structure is not bound
to any sugars or other compounds, but are there any other PDB entries for the
same protein which have ligands bound? Which sugars do the other
entries for the same protein associate with? (Hint: Look at the
Related PDB entries and their summary pages!)
Question 3: Look at the summary
page for PDB entry 1w6o. Do the interactions of GAL in 2gal have
anything in common with the interactions of LAT in 1w6o?
(Hint: Look at the interactions of GAL with PDB entry 2GAL and the
interactions of LAT with 1w6o.)
Alpha-Beta proteins: This
is the most populous category in protein fold classification. (Link to
SCOP (a/b) and Link to
SCOP (a+b)). SCOP has a total of 415 folds of proteins which are
composed of alpha-helices and beta-sheets in different topologies. Most
enzymes fall into one of these families. Let us look at a few examples.
PDB entry: 1AFL
1AFL is the structure of pancreatic ribonuclease from Bos taurus
(bovine). Ribonucleases make up a large family of proteins with similar
enzymatic functions and structures, and include members that are
implicated in angiogenesis
(blood vessel growth in cancers). Ribonucleases cleave RNA.
Read more about the function of ribonucleases from the InterPro and
Pfam entries for this structure from the Sequence link on the atlas
page. There are over 150 structures of
pancreatic ribonuclease, many determined in complex with various enzymatic
inhibitors. Look at the Ligands page for this entry. This protein is
bound to a compound called ATR, which is a modified ribonucleotide that
binds to the active site and inhibits the activity of the enzyme. View
the interactions of ATR with 1AFL from the ligands page. On the
Similarity page, click on "Compare PDB entry 1AFL to the entire PDB
using MSDfold". This will start the MSDfold service, which will
compare the structure of 1AFL with the whole PDB and
present a page of results. At
the bottom of the results page, choose "Sort by Seq%" and wait for the
page to be regenerated. At the bottom right of the page, choose the
"Last Page" button to go to the last page. Click on result 308, which
has 29% sequence identity and 78% identity in structure, and choose View
superpose. This will open a graphics window showing the structural
alignment of 1AFL with 2I5S (an onconase). You can rotate the aligned
structures. The two structures are very similar in fold but share very
low sequence identity. Now look at the atlas pages for 2I5S by clicking
here. Close the graphical window and the MSDfold windows.
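The sequence identity figure quoted above is a percentage computed over an alignment. As a minimal sketch, the function below counts identical residues over the aligned (non-gap) columns of two pre-aligned sequences. The function name and the choice of denominator are assumptions. Tools differ in how they define percent identity, and this is not necessarily the definition MSDfold uses.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not the real 1AFL or 2I5S sequences.
print(round(percent_identity("KETA-AK", "KDTAGAR"), 1))
```

Two proteins can score low on this measure and still superpose well in three dimensions, which is the situation the 1AFL and 2I5S comparison illustrates.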
Question 1: What are the similarities in function between 1AFL and 2I5S? (Hint: Look at the GO entries for both structures!)
Question 2: What interactions does
ATR make with 1AFL?
Question 3: What is the structural
characteristic of this class of proteins? (Hint: Look at the SCOP
entry for this protein.)
Question 4: What is the EC number for
this class of protein? (Hint: Look at the Brenda classification for
this protein from the main summary pages.) What can you tell about the
overall nature of proteins which have the first two EC numbers in common
with this entry?
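EC numbers are hierarchical: the four dot-separated fields go from the general class of reaction down to the specific enzyme, so two enzymes that share more leading fields catalyse more closely related reactions. The sketch below counts how many leading fields two EC numbers share. The function name is an assumption for illustration, and the EC numbers used are well-known examples (1.1.1.1 alcohol dehydrogenase, 1.1.1.27 L-lactate dehydrogenase, 3.4.21.4 trypsin) rather than the answer to the question above.

```python
def shared_ec_levels(ec_a, ec_b):
    """Count how many leading EC fields two EC numbers share (0 to 4).

    A '-' field means "unspecified" and is never counted as shared.
    """
    shared = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b or a == "-":
            break
        shared += 1
    return shared


print(shared_ec_levels("1.1.1.1", "1.1.1.27"))   # same class, subclass and sub-subclass
print(shared_ec_levels("1.1.1.1", "3.4.21.4"))   # different classes
```

Applied to the exercise, two entries with a result of 2 or more share their first two EC numbers.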
PDB entry: 1KWW
Look at 1KWW, the structure of a mannose-binding protein from rat.
These proteins bind sugars
(usually mannose) in the presence of metal ions such as calcium. Look
over the structure carefully and try to answer the questions below.
Question 1: What best describes this
family of proteins? (Hint: Look at the Pfam and GO entries for this
protein.)
Question 2: Which residues are generally
involved in the interaction between calcium (CA) and the protein, and
what is the predominant nature of this interaction?
Question 3: Compare the binding
site of MFU in 1KWW with the binding site of FUC in PDB entry
3KMB. Can you identify the binding site for sugars by looking at the
residues which interact with the ligand in both cases? (Hint: Look at
the ligand interactions for both MFU in 1KWW and FUC in 3KMB by looking
at the atlas pages for both these entries.)
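Once you have noted the interacting residues for each ligand, comparing two binding sites comes down to comparing two sets of residue labels. The sketch below is one minimal way to do that. The function name and the residue labels are placeholders made up for illustration. They are not the residues of 1KWW or 3KMB, which you should read from the ligand interaction pages.

```python
def compare_binding_sites(site_a, site_b):
    """Split two collections of residue labels into shared and unique parts."""
    a, b = set(site_a), set(site_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Placeholder labels only; replace with the residues listed for each ligand.
site_first_entry = ["GLU 10", "ASN 12", "ASP 30", "HIS 44"]
site_second_entry = ["GLU 10", "ASN 12", "ASP 30", "LYS 51"]

result = compare_binding_sites(site_first_entry, site_second_entry)
print(result["shared"])
```

Comparing labels directly only makes sense when both entries use the same residue numbering, as is usually the case for different structures of the same protein.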