
PDBsum1 Operating Manual

PDBsum1

Running PDBsum1

PDBsum1 is run from the command line.
 
Before you can run it, you need to create an environment variable called CATHPARAM that points to the file: pdbsum1/params/CATHPARAM. This is the file you edited during installation (see Installation Instructions).
 

Defining the CATHPARAM environment variable

a. Linux

In Linux, if you use the C shell (csh or tcsh), CATHPARAM is defined via the command:
setenv CATHPARAM [fullpath]/pdbsum1/params/CATHPARAM
where [fullpath] points to the location of the PDBsum1 programs.
 
If you use the bash shell, you need to use the command:
export CATHPARAM=[fullpath]/pdbsum1/params/CATHPARAM

b. Mac

On a Mac, the command is the same as for the Linux bash shell above (it also works in the Z shell):
export CATHPARAM=[fullpath]/pdbsum1/params/CATHPARAM
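
Once the variable is defined, a quick check can confirm that it is visible in the current shell and points to a readable file. The following is a minimal sketch for bash or the Z shell and is not part of PDBsum1; the function name check_cathparam is made up for this illustration.

```shell
# Hypothetical helper (not supplied with PDBsum1): report whether
# CATHPARAM is set and points to a file that can be read.
check_cathparam() (
    if [ -z "${CATHPARAM:-}" ]; then
        echo "CATHPARAM is not set" >&2
        exit 1      # leaves the function's subshell with status 1
    fi
    if [ ! -r "$CATHPARAM" ]; then
        echo "CATHPARAM points to a missing or unreadable file: $CATHPARAM" >&2
        exit 1
    fi
    echo "CATHPARAM OK: $CATHPARAM"
)

# Usage, after defining CATHPARAM as described above:
#   check_cathparam
```

The function body runs in a subshell, so a failed check returns a non-zero status without closing your terminal session.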

c. Windows

Under Windows, the CATHPARAM environment variable is defined from the Command Prompt by:
setx CATHPARAM "[fullpath]\pdbsum1\params\CATHPARAM"
This will set CATHPARAM to this value every time you open a new Command Prompt window, but will not set it in the current window. To set it for the current window, enter:
set CATHPARAM=[fullpath]\pdbsum1\params\CATHPARAM

 

Setting up an alias for pdbsum1

To make your life easier, you can set up an alias so that typing pdbsum1 will run the program without you having to enter the full path to the appropriate executable directory.

a. Linux

In Linux, under the C shell, you can do this by:
alias pdbsum1 [fullpath]/pdbsum1/exe_linux/pdbsum1
where [fullpath] points to the location of the PDBsum1 programs.
 
If you use the bash shell, you need to use the command:
alias pdbsum1='[fullpath]/pdbsum1/exe_linux/pdbsum1'

b. Mac

On macs, the command will depend on whether you have the bash or the Z shell. To tell which one you have, type:
echo $0
For bash, the command is:
alias pdbsum1='[fullpath]/pdbsum1/exe_mac/pdbsum1'
For the Z shell, use:
alias pdbsum1='[fullpath]/pdbsum1/exe_mac/pdbsum1'

c. Windows

Under Windows, the easiest option is to add the exe_win directory to your %PATH% parameter as follows:
set PATH=%PATH%;"[fullpath]\pdbsum1\exe_win"

 

Saving the environment variables

As the CATHPARAM variable and the pdbsum1 alias (or PATH setting) need to be defined as above in each window you open, you will find it convenient to have them set up automatically.
 
This is done by adding the relevant commands to your .cshrc, .zshrc, .bashrc, or autoexec.bat file - whichever is appropriate for your operating system (ie linux C shell, Z shell, bash shell, or Windows, respectively).
 
For example, on Windows, you would create a file called autoexec.bat in the C:\ directory containing the following lines:
setx CATHPARAM "C:\pdbsum1\params\CATHPARAM"
set PATH=%PATH%;"C:\pdbsum1\exe_win"
After creating this file you may need to reboot your windows machine for the settings to take effect.
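
On Linux or Mac, one way to add a command to your startup file without duplicating it on repeated runs is sketched below. This is an illustration only: the helper name add_line_once and the /home/user location in the usage comment are assumptions, so substitute your own [fullpath] and startup file.

```shell
# Hypothetical helper: append a line to a startup file only if an
# identical line is not already there.
#   $1 = startup file (eg ~/.bashrc or ~/.zshrc)
#   $2 = the exact line to add
add_line_once() {
    # -x matches whole lines, -F treats the text literally
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (the path below is a placeholder, not a real default):
#   add_line_once "$HOME/.bashrc" \
#       'export CATHPARAM=/home/user/pdbsum1/params/CATHPARAM'
```

The new setting takes effect in windows opened after the file has been changed.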
 
Check that all is OK by opening a new window and typing: pdbsum1. This should give the message:
Usage is:

pdbsum1 {file [pdb_code] | -l dataset.lst} [-nomaxchn]  [-relinks]  [-help]

where
     * file           = Name of single PDB file to be processed
     * pdb_code       = Optional 4-character identifier
     * -l dataset.lst = Name of file listing PDB files to be processed
     * -nomaxchn      = Clefts to be generated even if more than 4 chains in structure
     * -relinks       = Make all image links relative (if directories to be moved)

 

Running the program

There are two ways of running the program:
  1. Single PDB file - where the analyses are to be run on only one file. The command is:
    pdbsum1 filename
    where filename is the name of the PDB file. The file can be a standard text file, or a gzipped version.
     
  2. Multiple files - where you run the program on many files. The files should be listed in a simple text file, one filename per line, including the full path to the files if they are not in the current directory. The program is run by the command:
    pdbsum1 -l dataset.lst
    where dataset.lst is the file listing the PDB files to be processed, and may look like:
    
    /nfs/data/pdb/3p/pdb13pk.ent.gz
    /nfs/data/pdb/8g/pdb18gs.ent.gz
    /nfs/data/pdb/aa/pdb1aaw.ent.gz
    /nfs/data/pdb/co/pdb1coh.ent.gz
    /nfs/data/pdb/sd/pdb1sdx.ent.gz
    /nfs/data/pdb/or/pdb2or1.ent.gz
    
    The results will be listed in a file called index.html in your PDBsum1 results directory (see Results below).
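
If your PDB files sit in a directory tree, the list file for the -l option can be generated rather than typed. The sketch below is one possible approach for Linux or Mac and is not part of PDBsum1: the function name make_dataset_list is invented here, and the set of file extensions it collects (.ent, .pdb and their gzipped forms) is an assumption to adjust to your own files.

```shell
# Hypothetical helper: write the full path of every PDB file found
# under a directory to a list file, one filename per line.
#   $1 = directory to search
#   $2 = list file to create (eg dataset.lst)
make_dataset_list() (
    # Resolve the directory to an absolute path so that every
    # entry in the list carries its full path.
    dir=$(cd "$1" && pwd) || exit 1
    find "$dir" -type f \( -name '*.ent' -o -name '*.ent.gz' \
        -o -name '*.pdb' -o -name '*.pdb.gz' \) | sort > "$2"
)

# Usage:
#   make_dataset_list /nfs/data/pdb dataset.lst
#   pdbsum1 -l dataset.lst
```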
There is one extra, optional, parameter: -nomaxchn which relates to the Clefts analysis. As the calculation of the clefts can take a very long time for large structures, the calculations will not be performed if the structure contains more than 4 protein chains. If you add -nomaxchn to the command, the clefts will be calculated however many chains the structure contains:
pdbsum1 filename -nomaxchn
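
To see in advance whether a structure is likely to exceed the 4-chain limit, you can count the chain identifiers in the file. The sketch below is a rough check under stated assumptions, not the test PDBsum1 itself applies: it counts every distinct chain identifier (column 22) found on ATOM records, so nucleic acid chains are included as well as protein chains. The function name count_chains is made up for this example.

```shell
# Hypothetical helper: print the number of distinct chain identifiers
# on the ATOM records of a PDB file (plain text or gzipped).
count_chains() {
    case $1 in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk '/^ATOM  / { seen[substr($0, 22, 1)] = 1 }
                END { n = 0; for (c in seen) n++; print n }'
}

# Usage:
#   n=$(count_chains filename)
#   if [ "$n" -gt 4 ]; then echo "add -nomaxchn to calculate clefts"; fi
```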
Note that, on Windows, you may see new windows popping open and then closing as the program runs. This is fine. If you accidentally click on one of these and the window freezes, just press RETURN and the program should resume.
 

PDB codes

Each PDB file processed is identified by a 4-character 'PDB code'. If the name of your file is of the form:
pdbXXXX.ent or pdbXXXX.pdb
the program will use XXXX as the PDB code.
 
Alternatively, you can assign a PDB code when you run the program by:
pdbsum1 filename XXXX
where XXXX is the PDB code.
 
When using the dataset.lst file to run on multiple PDB files, you can assign individual PDB codes in the file as follows:

/nfs/data/pdb/3p/pdb13pk.ent.gz pdb1
/nfs/data/pdb/8g/pdb18gs.ent.gz pdb2
/nfs/data/pdb/aa/pdb1aaw.ent.gz pdb3
/nfs/data/pdb/co/pdb1coh.ent.gz pdb4
/nfs/data/pdb/sd/pdb1sdx.ent.gz pdb5
/nfs/data/pdb/or/pdb2or1.ent.gz pdb6
where pdb1, pdb2, etc, are the PDB codes.
 
If the program cannot identify a suitable code from your file name, and no code is supplied as described above, it will automatically generate a code starting from a001, a002, and so on.
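
The file-naming rule above can be illustrated with a short shell function. This is only a sketch of the rule as described in this manual, not the code PDBsum1 uses internally; the function name pdb_code_from_name is invented, and treating a trailing .gz as ignorable is an assumption based on the gzipped examples shown earlier.

```shell
# Hypothetical helper: print the 4-character code for a file named
# pdbXXXX.ent or pdbXXXX.pdb (optionally gzipped); fail otherwise.
pdb_code_from_name() (
    base=$(basename "$1")
    base=${base%.gz}                  # ignore a trailing .gz
    case $base in
        pdb????.ent|pdb????.pdb)
            code=${base#pdb}          # drop the leading "pdb"
            printf '%s\n' "${code%.*}" # drop the extension
            ;;
        *)
            exit 1                    # name does not follow the pattern
            ;;
    esac
)

# Usage:
#   pdb_code_from_name /nfs/data/pdb/3p/pdb13pk.ent.gz   # prints 13pk
#   pdb_code_from_name model.pdb || echo "supply a code on the command line"
```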
 

PDB file format

The file(s) you run should be in correct PDB format. This is defined in: PDB file format
 
Some docking and molecular modelling programs fail to adhere to this format. PDBsum1 tries to correct certain types of errors (eg missing chain identifiers, duplicate atom names, incorrect base names), but may not catch them all. If the programs fail to run correctly, you may need to check the file format.
 
You will find it useful to make sure the file contains a TITLE and a DBREF record, as described below.

a. TITLE record

You can use this record for the name of the protein. This will appear on the PDBsum1 results page and will help identify different structures if you have processed many. The format of this record is given in: TITLE record

b. DBREF record

This record can tell PDBsum1 what the UniProt accession code of your protein is. The program will then obtain the Pfam domains and generate a domain diagram like this:

Figure 1. Pfam domains for the given UniProt sequence and PDB structure's coverage.

 
The format of the record is given in: DBREF record
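
Before a run, you may want to check which files already contain these two records. The following is a minimal sketch for Linux or Mac under stated assumptions: the function names has_record and check_records are invented for this example, and the check only looks for lines beginning with the record name (so DBREF1 and DBREF2 lines also count as DBREF). It does not validate the record's format.

```shell
# Hypothetical helper: succeed if a PDB file (plain text or gzipped)
# has at least one line starting with the given record name.
#   $1 = PDB file, $2 = record name (eg TITLE)
has_record() {
    case $1 in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep "^$2" > /dev/null
}

# Report any of the TITLE and DBREF records missing from a file.
check_records() {
    for rec in TITLE DBREF; do
        has_record "$1" "$rec" || echo "$1: no $rec record"
    done
}

# Usage:
#   check_records filename
```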
 

Results

The results of the PDBsum1 run are listed in the index.html file in your PDBsum1 results directory - as defined in the PDBSUM1_RESULTS_DIR parameter in your CATHPARAM file (see Installation instructions).
 
The results page will look like (a) or (b) below depending on whether you ran on a list of PDB files using the "-l dataset.lst" option, or only on a single file, respectively.
 
Figure 2. List of results from each run. (a) Run 1 was for 9 PDB structures, while (b) Run 2 was for a single file.

 
Clicking the links will take you to the results page for that entry. For example, for the first structure, 13pk, the results page looks like:
 

Figure 3. Results page for 13pk.

 
Clicking on the PDBsum1 logo in the top right-hand corner will return you to the list of results as in Fig.2 above, whereas clicking on the same logo in the lists illustrated in Fig.2 will list all the PDBsum1 runs to date:
 

Figure 4. List of PDBsum1 runs to date.

 

Viewing the structures

To view each structure in 3D, you need to have installed either the RasMol or PyMOL molecular viewer, or both, and your browser needs to be configured to load the appropriate one when you click on its icon:
 
  RasMol. Before you click on this icon for the very first time you will need to edit the appropriate rasscriptstart script file to indicate where the RasMol executable is (see Installation instructions).
 
Then, when you first click on this icon, your browser will ask what you want done with the file (ie whether to save it or to open it in an application). You need to say to Open it using the rasscriptstart script file. This will open the RasMol program which will then interpret the script that PDBsum1 has generated and render the structure correctly.
 
An alternative method is to: save the file, open RasMol, and then issue the command:
RasMol> script filename
where filename is the name (and full path) of the file you just saved.
 
  PyMOL. When you first click on this icon, your browser will ask what to do with the file (ie whether to save it or to open it with an application). You need to say to Open it using PyMOL, providing the full path to the PyMOL executable.

 
After configuring your browser as above, you should be able to check that your browser has the correct responses to both file types. For example, in the about:preferences page in Firefox, this will look like:
 

Figure 5. Application preferences in Firefox.

 
If you use a different browser, the following Google search will find information on how to view/edit the above application preferences:
browser-name MIME types
where browser-name is the name of your browser (eg Chrome).
 

Linking to LigPlot+

If you have LigPlot+ installed, you can import the LIGPLOT diagrams on each PDBsum1 Ligands page into the program by configuring the link as follows.
 
  LigPlot+. When you first click on this icon, your browser will ask what to do with the file (ie whether to save it or to open it with an application). You need to say to Open it using LigPlot+, providing the full path to the LigPlus.jar executable (which is in the LigPlus directory).

 
The list of applications will now look like:
 

Figure 6. Application preferences in Firefox with DRW file added.

 
where DRW file represents the file extension of the file read in by LigPlot+.
 

Documentation and acknowledgements

Descriptions of the outputs can be found on the PDBsum website:
https://www.ebi.ac.uk/pdbsum
Acknowledgements:
PDBsum1 acknowledgements