LIGPLOT of Val-Trp dipeptide
DIMPLOT of interface between chains A and B in 3sdp
|2.||Paths and directories|
|3.||Generating a LIGPLOT/DIMPLOT diagram|
|4.||Editing the plot|
|8.||Adding missing H-bonds|
|9.||Removing unwanted H-bonds|
|10.||Plots from structural alignments|
|11.||Structural alignment file formats|
Edit the text fields as follows:
i. PDB paths
PDB files are most conveniently identified simply by the 4-character PDB code or identifier. If your system contains one or more locations where PDB files are stored, you can define a "template" path which will allow LigPlot+ to find the relevant PDB file from its 4-character code alone. The files can be gzipped or not.
Alternatively, if you are connected to the internet when you run LigPlot+, the required file can be retrieved by ftp from a given ftp site.
The 4 characters of the PDB code in each template are represented as abcd. These appear in the template within square brackets.
For example, the template
    C:/roman/pdbsum/pdb/pdb[abcd].ent
means that PDB code 1ral would be translated as:
    C:/roman/pdbsum/pdb/pdb1ral.ent
Similarly, the ftp template
    ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/data/structures/divided/pdb/[bc]/pdb[abcd].ent.gz
would map PDB code 1ral to:
    ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/data/structures/divided/pdb/ra/pdb1ral.ent.gz
Note that here the PDB files are stored in subdirectories whose names correspond to the middle part of the PDB code (here shown as "[bc]").
The above ftp address retrieves gzipped PDB files from the PDBe ftp site at the EBI. A full list of possible ftp addresses is given below. You can add others.
|PDBe||ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/data/structures/divided/pdb/[bc]/pdb[abcd].ent.gz|
|PDBj||ftp://ftp.pdbj.org/pub/pdb/data/structures/divided/pdb/[bc]/pdb[abcd].ent.gz|
|RCSB PDB||ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb/[bc]/pdb[abcd].ent.gz|
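The template substitution described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the code LigPlot+ itself uses; the function name expand_pdb_template is hypothetical, and converting the PDB code to lower case is an assumption based on the examples above.

```python
import re


def expand_pdb_template(template: str, pdb_code: str) -> str:
    """Fill every [..] placeholder in a path template from a 4-character PDB code."""
    code = pdb_code.lower()  # assumption: file names use lower-case codes
    if len(code) != 4:
        raise ValueError(f"PDB code must have 4 characters, got {pdb_code!r}")
    # Map the placeholder letters a, b, c, d to the characters of the code.
    letters = dict(zip("abcd", code))

    def fill(match: re.Match) -> str:
        return "".join(letters[ch] for ch in match.group(1))

    # A placeholder is any run of the letters a-d inside square brackets,
    # e.g. [abcd] for the whole code or [bc] for its middle part.
    return re.sub(r"\[([abcd]+)\]", fill, template)


print(expand_pdb_template("C:/roman/pdbsum/pdb/pdb[abcd].ent", "1ral"))
# prints C:/roman/pdbsum/pdb/pdb1ral.ent
```

The same call with one of the ftp templates listed above fills both the "[bc]" subdirectory and the "[abcd]" file name.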
ii. The Het Group Dictionary
LigPlot+ gets its information about ligand molecules from the Het Group Dictionary. This defines the atom names, connectivities and bond orders of every Het Group in the PDB. You can download a copy of this dictionary from the wwPDB at:
It is a good idea to download it regularly to always have an up-to-date copy.
Record the full path and filename of the Het Group Dictionary here so that the programs know where to find it.
SDF Dictionary. Alternatively, if your ligand definitions are in a .sdf file, you can provide the name of your file here. There are two important considerations:
iii. Temporary directory
LigPlot+ needs a directory where it can put its temporary working files (which it deletes once it has finished with them). Use this parameter to define a directory to which you have write-access and which LigPlot+ can use for its working files.
iv. Name of RasMol executable
Enter the full path and name of the RasMol executable file. This will allow you to display the 3D coords of the given LIGPLOT/DIMPLOT diagrams. If you do not have RasMol, enter NONE here, although it is recommended that you download and install a copy of the program.
Linux users might find it more convenient to enter the RasMol path using the following format:
    xterm -e rasmol -script
This will open up an X-term window as well as a RasMol window. The former can be used for entering RasMol commands.
v. Name of PyMOL executable
Similarly, if you have PyMOL installed, enter the full path and name of the executable here. As in the RasMol option, this will allow you to display the 3D coords of the given LIGPLOT/DIMPLOT diagrams. If you do not have PyMOL, enter NONE here.
Save. Once you have entered all the parameters, press the Save button to store your paths. Should you need to alter the paths at some later date, you can do so via the Program paths menu item in the Edit menu:
Alternatively, you can browse for a PDB file in your local directory system by clicking the Browse button.
When the file has been loaded, you will see a summary of the ligands and metals in the PDB file, if any, as shown below for PDB entry 1a95.
Clicking on the LIGPLOT or DIMPLOT tabs allows you to choose which of the two programs to run.
Alternatively, you can enter your own residue range in the box. The residue range can be one of the following:
|18 20 A||Residue range: start-residue-no. end-residue-no. chain-id. If the chain-id is blank it should be omitted.|
|NAG 18 MAN 20 A||Residue names and numbers, plus chain id.|
|-n818 108 -n818 108 A||Where the residue name is numeric, prefix it with "-n" to indicate that it is the residue name rather than the residue number.|
|"MTE" 1 " G" 2 B||Where a residue name includes leading blanks (eg as in the old definition of the nucleic acid bases, A, C, G and T), enclose it in quotes and include the spaces.|
|"-nMTE" 1 "-n G" 2 B||An alternative for the case above.|
|1-136 A & 338-371 A||A domain can consist of residue ranges from more than one chain.|
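To make the residue-range rules above concrete, here is a minimal Python sketch of how such a string could be split into its parts. It is not LigPlot+'s own parser: the names ResidueRange and parse_residue_range are hypothetical, and the sketch covers only the single-range forms (optional residue names, the "-n" prefix, quoted names and an optional chain-id), not the multi-range domain form that uses "&".

```python
import shlex
from typing import NamedTuple, Optional


class ResidueRange(NamedTuple):
    start_name: Optional[str]   # None when no residue name was given
    start_number: int
    end_name: Optional[str]
    end_number: int
    chain: str                  # empty string when the chain-id is omitted


def _is_int(token: str) -> bool:
    try:
        int(token)
        return True
    except ValueError:
        return False


def parse_residue_range(text: str) -> ResidueRange:
    # shlex honours double quotes, so '" G"' keeps its leading blank.
    tokens = shlex.split(text)
    pos = 0

    def read_residue():
        nonlocal pos
        name = None
        if pos < len(tokens):
            token = tokens[pos]
            if token.startswith("-n"):      # explicit "this is a name" prefix
                name = token[2:]
                pos += 1
            elif not _is_int(token):        # anything non-numeric is a name
                name = token
                pos += 1
        if pos >= len(tokens) or not _is_int(tokens[pos]):
            raise ValueError(f"expected a residue number in {text!r}")
        number = int(tokens[pos])
        pos += 1
        return name, number

    start_name, start_number = read_residue()
    end_name, end_number = read_residue()
    rest = tokens[pos:]
    if len(rest) > 1:
        raise ValueError(f"unexpected extra fields in {text!r}")
    chain = rest[0] if rest else ""
    return ResidueRange(start_name, start_number, end_name, end_number, chain)


print(parse_residue_range('"MTE" 1 " G" 2 B'))
```

With this sketch the two quoted forms in the table give the same result, since the "-n" prefix is stripped after the quotes have been removed.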
Then to run either program, click on the Run button.
When the plot has been generated it will appear on screen as shown below. This shows the PCP 301 ligand in PDB entry 1a95.
The meaning of the items on the plot is as follows:
You can change colours, sizes, etc as described later.
Sometimes a plot contains thin purple lines. These are either covalent bonds between protein and ligand, or "elastic" bonds within the ligand. The latter occur when the ligand is a cyclic peptide; the program has to make one of the bonds "elastic" in order to be able to flatten the ligand into 2D.
The plot can now be edited interactively, as described next.
Note that, to switch between moving whole residues and moving single atoms, click the Move residue/move atom button at the bottom-right of the frame. This toggles between the two modes.
Note also that the hydrophobic contact groups cannot be moved in the Move atom mode.
Use the right mouse button to click-and-drag on any blank area of the plot to zoom in and out of the plot.
Note for Mac users. If your mouse only has a single button, hold down the SHIFT key while clicking to get the same functionality as a right-button click.
To rotate about this atom, click-and-drag any other atom in the residue. Release the button when you have reached the desired position.
To deselect the atom, right- or left-click on any blank part of the plot, or on any other residue.
Firstly, select the bond by clicking on it with the right mouse button. A marker will appear over the selected bond.
To flip about this bond, click with the left mouse-button on any atom on the side of the residue to be flipped. A single click will perform the flip. Clicking a second time on any of the flipped atoms will reverse the flip.
To "unselect" the flip-bond, click with the right or middle mouse button on any blank area of the plot. The marker will disappear.
Select the text item by right-clicking on it. A box containing the text string will appear. You can edit the text as required. Click the OK button to accept the new version, or the Cancel button to retain the previous text label.
After the plots have been generated and overlaid, you can edit them by moving their components independently and switching between them. The currently active plot is shown in full colour in the foreground, while all background plots are greyed out. Before printing you can "switch off" the display of the background plots such that only the foreground plot appears. This allows you to produce a set of clean plots with equivalent components in equivalent positions on them. Alternatively, if you use the "Write PostScript" option, you can have each plot printed on a separate page.
The first plot is generated in the usual way. Each subsequent plot is fitted to the first by a sequence-based comparison of the two binding sites. This identifies equivalent residues in the 3D structures and these equivalences drive the generation of the second LIGPLOT. If the program is unable to match the binding sites based on their sequences, it tries to fit the ligands using a graph matching procedure. For very distantly related proteins it is better to use a structural alignment. Section 7 describes how to import such an alignment.
The new plot, when generated, will become the foreground plot; any current plot(s) will go into the background.
The left-hand example above shows a plot of the interactions between guanine 600 in PDB entry 2pwu and the protein residues. The right-hand plot shows superposed on it a plot of a similar molecule (9DG) bound to PDB entry 1q2r. The red circles and ellipses identify the residues on the latter plot that are equivalent to the underlying residues from the first plot.
LigPlot+ attempts, as best it can, to place residues in the new plot on top of the equivalent residues in the old one. It does this by first performing a simple sequence alignment between the set of interacting residues from each structure. The equivalenced residues in this alignment are then used to superpose the 3D coordinates from both PDB files. This may throw up more pairs of equivalenced residues, being those that significantly overlap in the superposition. The residue equivalences are then used to "drive" the generation of the new LIGPLOT diagram; the 2D locations from the first plot restrain the positions of the corresponding residues in the second.
The superposition may not always be successful if either the two proteins are quite different, or the second ligand is very large relative to the first.
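The first step described above, a simple sequence alignment between the two sets of interacting residues, can be illustrated with a short Python sketch. This is a stand-in only: the manual does not say which alignment method LigPlot+ uses, so the sketch substitutes Python's difflib.SequenceMatcher, which pairs identical residue names in order. The function name and the residue lists are hypothetical, and the later superposition step is not shown.

```python
from difflib import SequenceMatcher


def equivalent_residues(site_a, site_b):
    """Pair residues of two binding sites by aligning their residue-name sequences.

    Each site is a list of (residue_name, residue_number) tuples in sequence order.
    """
    names_a = [name for name, _ in site_a]
    names_b = [name for name, _ in site_b]
    matcher = SequenceMatcher(None, names_a, names_b, autojunk=False)
    pairs = []
    # Every matching block is a run of identical residue names in both sites.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            pairs.append((site_a[block.a + k], site_b[block.b + k]))
    return pairs


# Hypothetical binding sites, for illustration only.
first = [("ASP", 102), ("TYR", 106), ("LEU", 231), ("GLY", 230)]
second = [("ASP", 89), ("TYR", 93), ("GLY", 217)]
for res_a, res_b in equivalent_residues(first, second):
    print(res_a, "<->", res_b)
```

Here LEU 231 has no partner in the second site, so it is left unpaired; in LigPlot+ such residues may still be picked up later if they overlap in the 3D superposition.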
The "Split screen" button at the bottom of the frame separates the two (or more) plots, and arranges them in the window.
Two plots, as in the above example, are shown side by side:
The "Merge back" button will restore the plots to overlap mode.
You can change how the equivalent residues are shown on the plot by using the On/Off parameters to switch off the red circles and, instead, ticking the "Highlight equivalent side chains" option. The resultant plot will look like:
Here, the equivalent residues have a red underlay beneath their bonds and atoms. Equivalent residues engaged in hydrophobic interactions are shown in thicker lines.
Inverting the highlights
In some cases, it might be useful to highlight the residues in the proteins that are different, rather than those that are equivalent. In this case, one of the On/Off parameters allows you to select the non-equivalenced residues for highlighting.
You can then move any of the plots on screen by clicking and dragging any of their components (ie atoms, bonds, text items, etc). Remember to revert to "Move residue" mode when you're done to return to normal editing. When you merge the plots back, the "Move plot" option is lost.
Clicking on one of these buttons will pop up a RasMol or PyMOL window containing the superposed ligands, in 3D, with H-bonds added in cyan and atoms involved in non-bonded contacts represented by dot surfaces (in RasMol) or transparent surfaces (in PyMOL).
Then, deselect the Show inactive plots option, circled in red below.
The colour options differ slightly depending on whether your plot is a LIGPLOT or a DIMPLOT. For the former, bonds and atoms are classified as ligand or non-ligand. For the latter, the two classes are: interface 1 and interface 2.
If you have two or more plots overlaid, there is an extra checkbox (see example on the left below) which allows you to apply the new parameters to the currently selected plot only, or to all plots. Thus you can, say, have the bonds or labels shown in a different colour in each plot. The background parameters always apply to all plots.
The items are grouped into: Labels, Atoms, Bonds and Miscellaneous.
The items are grouped into: Label sizes, Atom and residue sizes and Bond widths.
The left-hand panel above shows the LIGPLOT options while the right-hand panel contains the DIMPLOT options.
The items are grouped into: Background, Labels, Atoms and Bonds.
The left-hand panel above shows the LIGPLOT options while the right-hand panel contains the DIMPLOT options.
Note that, for ligand and non-ligand bonds there is a special extra colour labelled "ATOM". If this "colour" is selected, the bonds will be coloured such that each half is of the colour of the atom bonded at that end.
The four numeric parameters relate to the HBPLUS program used by LigPlot+ to compute all the potential hydrogen bonds and non-bonded contacts. The first two values correspond to the maximum hydrogen-acceptor and donor-acceptor distances for defining what is a hydrogen bond. By increasing these values you can "bring in" additional interactions which may be just outside the program's criteria for an H-bond.
The next two numeric parameters correspond to the range of distances defining non-bonded contacts (ie contacts between atoms that are neither covalently bonded, nor interacting via hydrogen bonds). The default range is 2.9-3.9 Å.
Below the numeric parameters are two groups of radio buttons. The first set allows you to specify which atom-atom contacts are to be included as non-bonded contacts: a) between hydrophobic atoms only (ie C or S), b) between a hydrophobic atom and any other, or c) between any type of atom.
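The way the distance range and the three radio-button options combine can be sketched as follows. This is an illustration of the rules as stated above, not the HBPLUS implementation: the function name and the mode strings are hypothetical, treating the distance limits as inclusive is an assumption, and the check that the two atoms are neither covalently bonded nor hydrogen-bonded is left out.

```python
HYDROPHOBIC_ELEMENTS = {"C", "S"}  # "hydrophobic atoms" as defined in the text


def is_nonbonded_contact(element1, element2, distance, mode,
                         min_dist=2.9, max_dist=3.9):
    """Decide whether two atoms count as a non-bonded contact.

    mode is one of (hypothetical names for the three radio buttons):
      "hydrophobic-hydrophobic"  both atoms must be C or S
      "hydrophobic-any"          at least one atom must be C or S
      "any-any"                  any pair of atoms
    """
    # Assumption: both ends of the distance range are inclusive.
    if not (min_dist <= distance <= max_dist):
        return False
    h1 = element1.upper() in HYDROPHOBIC_ELEMENTS
    h2 = element2.upper() in HYDROPHOBIC_ELEMENTS
    if mode == "hydrophobic-hydrophobic":
        return h1 and h2
    if mode == "hydrophobic-any":
        return h1 or h2
    if mode == "any-any":
        return True
    raise ValueError(f"unknown mode: {mode!r}")


print(is_nonbonded_contact("C", "O", 3.5, "hydrophobic-any"))
```

A C...O pair at 3.5 Å is therefore a contact under the second and third options but not under the first.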
The second group of radio buttons relates to the treatment of any CONECT records that may be in your PDB file. The CONECT records define which atoms are covalently bonded to one another. They are generally used for defining the connectivity of the ligand. Sometimes, these records are incorrect and give unfeasibly long bonds.
You can choose how the program should treat the CONECT records. The default is to use them if they look sensible - that is, if they don't give ridiculously long bond lengths. The second option is to ignore these records altogether; the program will compute the ligand's covalent bonds itself, using set distance cut-offs. The third option is to accept all CONECT records, irrespective of the bond lengths they give.
If you need your plot to have an H-bond that HBPLUS has missed, you can add it as follows:
Each "HHB" or "NNB" record should be formatted exactly as shown below, where the dots represent blank spaces.
    Key   <----Atom 1 --->     <----Atom 2 --->    Dist
    HHB...RRR.C.NNNNI.AAAA.....RRR.C.NNNNI.AAAA....DDDD
|where||HHB||defines the record type: HHB for H-bonds, NNB for non-bonded contacts|
|RRR||is the 3-character residue name|
|C||is chain identifier (which may be blank)|
|NNNN||is the right-justified residue number (1-9999)|
|I||is the insertion code, or blank if none|
|AAAA||is the 4-character atom name in standard PDB format|
|DDDD||is the distance between the two atoms (to 2 places of decimals)|
Some example lines are given below:
    HHB...RRR.C.NNNNI.AAAA.....RRR.C.NNNNI.AAAA....DDDD
    HHB   HIS A  231   N       ASP A  226   OD2    2.77
    HHB   HIS A  231   ND1     ASP A  226   OD1    2.76
    HHB   HIS A  231   NE2     THR A 1317   O      2.80
    HHB   THR A 1317   N       ALA A  113   O      2.72
    HHB   ALA A  113   N       PHQ A  317   O2     3.31
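Because the record layout is column-sensitive, it can be safer to generate such lines with a small script than to type them by hand. The following Python sketch follows the field widths in the template above; the names AtomRef and format_contact_record are hypothetical, and the caller is assumed to supply the atom name already in its 4-character PDB form (for example " OD2" or " N  ").

```python
from dataclasses import dataclass


@dataclass
class AtomRef:
    res_name: str        # 3-character residue name (RRR)
    chain: str           # chain identifier (C), "" if blank
    res_num: int         # residue number (NNNN), right-justified on output
    atom_name: str       # 4-character atom name (AAAA) in PDB format
    ins_code: str = ""   # insertion code (I), "" if none


def _atom_field(atom: AtomRef) -> str:
    chain = atom.chain or " "
    ins = atom.ins_code or " "
    # RRR.C.NNNNI.AAAA -> 16 characters
    return f"{atom.res_name:>3} {chain} {atom.res_num:>4d}{ins} {atom.atom_name:<4}"


def format_contact_record(key: str, atom1: AtomRef, atom2: AtomRef,
                          distance: float) -> str:
    """Build one record; key is the 3-character record type, e.g. HHB or NNB."""
    return f"{key}   {_atom_field(atom1)}     {_atom_field(atom2)}    {distance:4.2f}"


line = format_contact_record(
    "HHB",
    AtomRef("HIS", "A", 231, " ND1"),
    AtomRef("ASP", "A", 226, " OD1"),
    2.76,
)
print(line)
```

The printed line should have the same columns as the second example record above; comparing its length with that of the template line is a quick sanity check.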
So, for example, to remove the H-bond between the NH2 of Arg286(A) and the O7 of your ligand, LIG1(B), you would add the following line to your PDB file:
    -HB...RRR.C.NNNNI.AAAA.....RRR.C.NNNNI.AAAA....DDDD
    -HB   ARG A  286   NH2     LIG B    1   O7     0.00
One way to help the program is to supply it with a structural alignment between the proteins, letting it know which residues are equivalent in 3D.
At present, LigPlot+ only accepts structural alignments in the following formats:
Import the alignment
Import using File-Import from the menu bar.
A pop-up window will show you the list of PDB codes in the file and allow you to select which one you want to plot first.
After the first plot has been generated, you can add others on top of it via either File-Add or File-Open-PDB file.
1. CORA format (.cora or .aln)
The format must be CORA v.1.1, which is as follows:
    #FM CORA_FORMAT 1.1
    3 6insE0 1igl00 1bqt00
    73
       1    0    1    0  0  0    1  A  0    0  0  0   0    0    0   0
       2    0    1    0  0  0    2  Y  0    0  0  0   0    0    0   0
       3    0    2    1B F  0    3  R  0    0  0  0   0    0    0   0
       4    0    3    2B V  H    4  P  0    1  G  0   0    1    0   2
       5    1    3    3B N  H    5  S  0    2  P  0   0    1    0   6
       6    0    3    4B Q  H    6  E  0    3  E  0   0    1    0   2
       7    2    3    5B H  H    7  T  0    4  T  0   0    1    0   5
    ---------+---------+---------+---------+---------+---------+---------+
    1234567890123456789012345678901234567890123456789012345678901234567890
             1         2         3         4         5         6         7

        START       PROT 1     PROT 2     PROT N          END
    <------------><---------><---------><---------><---------------->
    ddddxddddxddddxddddcxcxxcxddddcxcxxcxddddcxcxxcxxxcxddddxddddxxdd
       1    0    1    0  0  0    1  A  0    0  0  0   0    0    0   0
       2    0    1    0  0  0    2  Y  0    0  0  0   0    0    0   0
       3    0    2    1B F  0    3  R  0    0  0  0   0    0    0   0
       4    0    3    2B V  H    4  P  0    1  G  0   0    1    0   2
       5    1    3    3B N  H    5  S  0    2  P  0   0    1    0   6
       6    0    3    4B Q  H    6  E  0    3  E  0   0    1    0   2
       7    2    3    5B H  H    7  T  0    4  T  0   0    1    0   5
    ---------+---------+---------+---------+---------+---------+---------+
    1234567890123456789012345678901234567890123456789012345678901234567890
             1         2         3         4         5         6         7
2. CAF format (.caf)
This is one of the formats output by the CATHEDRAL structural alignment program.
    HH CATHEDRAL Alignment 2.02
    CC Date: Tue Feb 22 21:07:49 2011
    CC Author: cathedral
    CC Protein1 Protein2 Len1 Len2 Score Align %Ov %Seq RMSD
    RR 1byq 2wi7 213 209 93.05 208 97 99 1.38
    RR 1byq 3d36 213 121 72.64 115 53 17 9.95
    RR 2wi7 3d36 209 121 72.58 114 54 16 9.68
             1         11        21        31        41        51
    pdb|3d36 --------------------VDIQATLAPFSVIGEREKFRQCLLNVMKNAIEAMPN----
    pdb|3d36 SSSSS HHHHHHHHHHHHHHHHH
    pdb|1byq PMEEEEVETFAFQAEIAQLMSLIINTFYS-------NK-EIFLRELISNSSDALDKIRYE
    pdb|1byq SSSSS HHHHHHHHHHHH HHHHHHHHHHHHHHHHHHHH
    pdb|2wi7 -----EVETFAFQAEIAQLMSLIINTFYS-------NK-EIFLRELISNSSDALDKIRYE
    pdb|2wi7 SSSS HHHHHHHHHHHH HHHHHHHHHHHHHHHHHHHH
    ...............::::::::: :: :::::::::*:::*:::....
    ---------+---------+---------+---------+---------+---------+---------+
    1234567890123456789012345678901234567890123456789012345678901234567890
             1         2         3         4         5         6         7
3. Multiple FASTA format (.fasta)
The standard FASTA multiple alignment format:
    >pdb|3d36
    --------------------VDIQATLAPFSVIGEREKFRQCLLNVMKNAIEAMPN----
    ------------GGTLQVYVSI---DNGRVLIRIADTGVGMTKEQLERLGEPYFTTKGVK
    G---------------TGLGMMVVYRIIES-MNGTIRIESEIH-----------------
    ----------------KGTTVSIYLPLAS-------------------------------
    -------------
    >pdb|1z5a
    ----------KEKFTSLSPAEFFKRNPELAGFPNPARALYQTVRELIENSLDATDVHGI-
    ------------LPNIKITIDLIDDARQIYKVNVVDNGIGIPPQEVPNAFGRVLYSSKYV
    NRQTR---------GMYGLGVKAAVLYSQMHQDKPIEIETSPVNSKRIYTFKLKIDINKN
    EPIIVERGSVENTRGFHGTSVAISI--PGDWPKAKSRIYEYIKRTYIITPYAEFIFKDPE
    GNVTYYPRLTNKI
    >pdb|1byq
    PMEEEEVETFAFQAEIAQLMSLIINTFYS-------NK-EIFLRELISNSSDALDKIRYE
    TLTDPSKLDSGKELHINLIPNKQD-----RTLTIVDTGIGMTKADLINNLGTIAKSGTKA
    FMEALQAGADISMIGQFGVGFYSAYLVA-----EKVTVITKHNDD-EQYAWESSAG----
    --GSFTVRTDTGEPMGRGTKVILHLKEDQTEYLEERRIKEIVKKHSQFI-GYPITLFVE-
    -------------
The key thing here is the format of the protein name line which must take one of the following forms:
    >pdb|1z5a
where "A" is the chain identifier and "01" is the domain number.
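To see what residue equivalences a multiple FASTA alignment encodes, the file can be read and compared column by column. The Python sketch below is an illustration only and not LigPlot+'s import code; the function names and the tiny example alignment are hypothetical. It reports equivalences as 0-based positions in each ungapped sequence, which are not the same thing as PDB residue numbers.

```python
def read_fasta_alignment(text):
    """Return {name: aligned_sequence} from multiple-FASTA text."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]          # e.g. "pdb|1byq"
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {n: "".join(parts) for n, parts in chunks.items()}


def equivalent_positions(seq_a, seq_b, gap="-"):
    """List (index_in_a, index_in_b) for every column where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    ia = ib = 0                      # running positions in the ungapped sequences
    for ca, cb in zip(seq_a, seq_b):
        if ca != gap and cb != gap:
            pairs.append((ia, ib))
        if ca != gap:
            ia += 1
        if cb != gap:
            ib += 1
    return pairs


# Hypothetical two-protein alignment, for illustration only.
example = """>pdb|1abc
--VDIQ
AT-
>pdb|2xyz
PMVE-Q
ATL
"""
aln = read_fasta_alignment(example)
print(equivalent_positions(aln["pdb|1abc"], aln["pdb|2xyz"]))
```

Columns in which either protein has a gap contribute no equivalence, which is how the alignment tells the program that a residue has no counterpart in the other structure.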