CCP4 Coordinate Library Project

Object interface: Looking for contacts.

A contact is a pair of atoms that lie within a specified distance range of each other. The Library offers several functions for finding such pairs of contacting atoms.

Function                          Purpose
CMMDBManager::SeekContacts        Finding contacts between vector of atoms and one atom from the same vector.
CMMDBManager::SeekContacts        Finding contacts between atom and vector of atoms.
CMMDBManager::SeekContacts        Finding contacts between two vectors of atoms.
SortContacts                      Sorting contacts.
CMMDBManager::MakeBricks          Bricking the coordinate space and assigning atoms to the bricks
CMMDBManager::GetBrickDimension   Retrieving the number of bricks in X, Y and Z - directions
CMMDBManager::GetBrick            Retrieving a brick
CMMDBManager::GetBrickCoor        Getting coordinates of a brick that contains given atom
CMMDBManager::GetBrickCoor        Getting coordinates of a brick that contains given point of space
CMMDBManager::RemoveBricks        Removing bricking


void CMMDBManager::SeekContacts ( 
PPCAtom  Atom, 
int alen,
int atomNum,
realtype dist1,
realtype dist2,
int seqDist,
RPSContact contact,
int & ncontacts,
int maxlen,
long group )
PURPOSE
Finding contacts between vector of atoms and one atom from the same vector.
ARGUMENTS
PPCAtom Atom
A vector of atoms; it may be obtained, for example, as a result of atom selection with CMMDBManager::GetSelIndex function. All atoms must be present in the coordinate hierarchy; no outside atoms are allowed. The vector may contain NULL pointers.

int alen
Number of atoms in vector Atom. The vector's index space is 0..alen-1.

int atomNum
Index of the 1st contacting atom in vector Atom. All other atoms in the vector are checked for contact conditions with this one.

realtype dist1
Minimal contact distance, in angstroms.

realtype dist2
Maximal contact distance, in angstroms.

int seqDist
The minimal sequence distance of contacts. The contacting atoms are required to be in residues separated by not less than seqDist inter-residue spaces if they are in the same chain. Thus, seqDist=0 allows contacting atoms to be in the same residue, seqDist=1 requires them to be in different residues.

RPSContact contact
Array of found contacts. For ith contacting pair, contact[i].id1 is set to atomNum and contact[i].id2 is set to the index of 2nd contacting atom, which it has in the input vector Atom. The distance between contacting atoms is returned in contact[i].dist. A special field contact[i].group will contain the contact group ID (see below, parameter group).

int & ncontacts
Number of found contacts. The array contact is indexed as 0..ncontacts-1. If ncontacts>0 on input, it is assumed that ncontacts contacts were already found and newly found contacts are added to them.

int maxlen
Maximal number of contacts to find. If maxlen<=0 (default), array contact is allocated dynamically by SeekContacts to hold all found contacts.
If maxlen>0, SeekContacts neither allocates nor deallocates array contact. In this case, no more than maxlen contacts are returned.
In either case it is the application, rather than the Manager, that is responsible for deallocating array contact after use.

long group
A contact group ID. This ID is simply stored in contact[i].group fields and may be useful if contacts are calculated in multiple calls to the function (e.g. in the course of generation of symmetry mates). This parameter has default value of 0.
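
The way ncontacts, maxlen and group interact across repeated calls can be pictured with a small stand-alone model. This is only a sketch and not library code: Contact and appendContact are hypothetical names, a std::vector stands in for the contact array, and it is an assumption that a positive maxlen limits the total number of stored contacts rather than only the newly found ones.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the contact record described above.
struct Contact {
  int    id1;    // index of the 1st contacting atom
  int    id2;    // index of the 2nd contacting atom
  long   group;  // contact group ID passed by the caller
  double dist;   // distance in angstroms
};

// Appends 'c' after the contacts already stored, unless a positive 'maxlen'
// cap has been reached. Returns true if the contact was stored.
// Assumption: the cap limits the total number of stored contacts.
bool appendContact(std::vector<Contact>& contacts, const Contact& c, int maxlen) {
  assert(c.dist >= 0.0);  // a distance can never be negative
  if (maxlen > 0 && static_cast<int>(contacts.size()) >= maxlen) return false;
  contacts.push_back(c);
  return true;
}
```

In this model, a second batch of contacts tagged with a different group value simply lands behind the first batch, so the group field tells the two batches apart afterwards.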

DESCRIPTION

The function attempts to find atoms in vector Atom, which are within the distance dist1<=r<=dist2 from atom Atom[atomNum], and which belong to residues separated by at least seqDist inter-residue spaces. Indices of contacting atoms (i.e. their positions in input array Atom) are returned in dynamically-allocated or static array contact.

NOTE 1: The number of inter-residue spaces between given residues is calculated from the actual number of residues between them, and not from the residues' sequence numbers.

NOTE 2: A coordinate file may be missing some residues or even whole parts of a chain. If this is the case and if the missing residue(s) fall within the sequence distance of seqDist from one of the contacting atoms, the algorithm assumes that not less than seqDist residues are missing. A gap of missing residues is assumed if C-alpha atoms of neighbouring residues are separated by more than 4 angstroms.
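
The distance criterion dist1<=r<=dist2 can be illustrated without the Library. The sketch below is a plain brute-force scan under stated assumptions: Point3, Contact and seekContactsBruteForce are hypothetical names, atoms are reduced to bare coordinates, and the sequence-distance filter is left out on purpose. It is not how the Library implements the search.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal atom: Cartesian coordinates in angstroms.
struct Point3 {
  double x, y, z;
};

// Hypothetical contact record mirroring the id1/id2/dist fields.
struct Contact {
  int    id1;
  int    id2;
  double dist;
};

// Returns all contacts between atoms[atomNum] and every other non-null atom
// whose distance r satisfies dist1 <= r <= dist2.
std::vector<Contact> seekContactsBruteForce(const std::vector<const Point3*>& atoms,
                                            int atomNum, double dist1, double dist2) {
  assert(atomNum >= 0 && atomNum < static_cast<int>(atoms.size()));
  std::vector<Contact> found;
  const Point3* a = atoms[atomNum];
  if (!a) return found;  // the reference entry itself may be NULL
  for (int i = 0; i < static_cast<int>(atoms.size()); ++i) {
    const Point3* b = atoms[i];
    if (i == atomNum || !b) continue;  // skip self and NULL entries
    const double dx = a->x - b->x, dy = a->y - b->y, dz = a->z - b->z;
    const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (r >= dist1 && r <= dist2) found.push_back(Contact{atomNum, i, r});
  }
  return found;
}
```

The returned id2 values are positions in the input vector, which matches the convention that contact indices refer to the vector passed in rather than to serial numbers in the coordinate file.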


EXAMPLE

Looking for 5-angstrom contacts between the C-alpha atom at index 33 of the selection from chain A (this is the residue at position 33 in the chain, counted as 0..nResidues-1, and not the residue with sequence number 33) and the C-alpha atoms of all other residues of the same chain that are more than one residue position away from it:

CMMDBManager MMDB;
int          RC,selHnd,alen,ncontacts,i;
PPCAtom      Atom;
PSContact    contact;
char         S1[100];
char         S2[100];

  // read coordinate file
  RC = MMDB.ReadCoorFile ( CoorFileName );
  if (RC) {
    // ... check for errors here
    exit(1);
  }

  // select C-alpha atoms of chain A:
  selHnd = MMDB.NewSelection();
  MMDB.Select       ( selHnd,STYPE_ATOM,1,"A",
                      ANY_RES,"*",ANY_RES,"*",
                      "*","CA","C",SKEY_NEW );
  MMDB.GetSelIndex  ( selHnd,Atom,alen );

  // get contacts:
  contact   = NULL;  // prepare for dynamical allocation
  ncontacts = 0;     // of vector contact
  MMDB.SeekContacts ( Atom,      // vector of selected atoms
                      alen,      // number of selected atoms
                      33,        // 1st contacting atom nr 33
                      0.0,       // minimal contact distance
                      5.0,       // maximal contact distance
                      2,         // sequence distance
                      contact,   // vector of contacts
                      ncontacts, // number of contacts
                      0,         // allocate contact dynamically
                      0          // zero group ID
                    );

  // print contacts
  if (ncontacts>0)  {
    printf ( " Found %i contacts:\n",ncontacts );
    for (i=0;i<ncontacts;i++)
      printf ( " %s <-> %s   %10.4f A\n",
               Atom[contact[i].id1]->GetAtomID(S1),
               Atom[contact[i].id2]->GetAtomID(S2),
               contact[i].dist );
  } else
    printf ( " No contact found.\n" );

  // dispose array contact:
  if (contact)  delete[] contact;


void CMMDBManager::SeekContacts ( 
PCAtom  A, 
PPCAtom Atom,
int alen,
realtype dist1,
realtype dist2,
int seqDist,
RPSContact contact,
int & ncontacts,
int maxlen,
long group )
PURPOSE
Finding contacts between atom and vector of atoms.
ARGUMENTS
PCAtom A
1st contacting atom. The atom must be present in the coordinate hierarchy; an application may not supply an "outside" atom here.

PPCAtom Atom
A vector of atoms; it may be obtained, for example, as a result of atom selection with CMMDBManager::GetSelIndex function. All atoms must be present in the coordinate hierarchy; no outside atoms are allowed. The vector may contain NULL pointers.

int alen
Number of atoms in vector Atom. The vector's index space is 0..alen-1.

realtype dist1
Minimal contact distance, in angstroms.

realtype dist2
Maximal contact distance, in angstroms.

int seqDist
The minimal sequence distance of contacts. The contacting atoms are required to be in residues separated by not less than seqDist inter-residue spaces if they are in the same chain. Thus, seqDist=0 allows contacting atoms to be in the same residue, seqDist=1 requires them to be in different residues.

RPSContact contact
Array of found contacts. For ith contacting pair, contact[i].id1 is set to -1, and contact[i].id2 is set to the index of 2nd contacting atom, which it has in the input vector Atom. The distance between contacting atoms is returned in contact[i].dist. A special field contact[i].group will contain the contact group ID (see below, parameter group).

int & ncontacts
Number of found contacts. The array contact is indexed as 0..ncontacts-1. If ncontacts>0 on input, it is assumed that ncontacts contacts were already found and newly found contacts are added to them.

int maxlen
Maximal number of contacts to find. If maxlen<=0 (default), array contact is allocated dynamically by SeekContacts to hold all found contacts.
If maxlen>0, SeekContacts neither allocates nor deallocates array contact. In this case, no more than maxlen contacts are returned.
In either case it is the application, rather than the Manager, that is responsible for deallocating array contact after use.

long group
A contact group ID. This ID is simply stored in contact[i].group fields and may be useful if contacts are calculated in multiple calls to the function (e.g. in the course of generation of symmetry mates). This parameter has default value of 0.

DESCRIPTION

The function attempts to find atoms in vector Atom, which are within the distance dist1<=r<=dist2 from atom A, and which belong to residues separated by at least seqDist inter-residue spaces. Indices of contacting atoms (i.e. their positions in input array Atom) are returned in dynamically-allocated or static array contact.

NOTE 1: The number of inter-residue spaces between given residues is calculated from the actual number of residues between them, and not from the residues' sequence numbers.

NOTE 2: A coordinate file may be missing some residues or even whole parts of a chain. If this is the case and if the missing residue(s) fall within the sequence distance of seqDist from one of the contacting atoms, the algorithm assumes that not less than seqDist residues are missing. A gap of missing residues is assumed if C-alpha atoms of neighbouring residues are separated by more than 4 angstroms.
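
The two notes above can be read as a small decision rule. The following sketch is one interpretation of that rule and not the Library's code: Point3, dist3 and passesSeqDist are hypothetical names, residues are identified by their position in the chain (not by sequence number), and a chain break between the two residues is taken to satisfy the seqDist requirement.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Point3 {
  double x, y, z;
};

static double dist3(const Point3& a, const Point3& b) {
  const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// caPos[k] is the C-alpha position of the residue at chain position k.
// Returns true if the residues at positions p and q may form a contact
// under the seqDist rule. A C-alpha to C-alpha step longer than
// gapThreshold between them is treated as a gap of missing residues.
bool passesSeqDist(const std::vector<Point3>& caPos, int p, int q, int seqDist,
                   double gapThreshold = 4.0) {
  const int n = static_cast<int>(caPos.size());
  assert(p >= 0 && p < n && q >= 0 && q < n);
  const int lo = std::min(p, q);
  const int hi = std::max(p, q);
  if (hi - lo >= seqDist) return true;  // far enough apart by position
  for (int k = lo; k < hi; ++k)
    if (dist3(caPos[k], caPos[k + 1]) > gapThreshold) return true;  // chain gap
  return false;
}
```

With seqDist=0 every pair passes, and with seqDist=1 two atoms of the same residue are rejected, which agrees with the description of the seqDist argument.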


EXAMPLE

Looking for 5-angstrom contacts between the C-alpha atom of the residue with sequence number 34 of chain A and the C-alpha atoms of all other residues of the same chain that are at least two residue positions away from it:

CMMDBManager MMDB;
int          RC,selHnd,alen,ncontacts,i;
PCAtom       A;
PPCAtom      Atom;
PSContact    contact;
char         S1[100];
char         S2[100];

  // read coordinate file
  RC = MMDB.ReadCoorFile ( CoorFileName );
  if (RC) {
    // ... check for errors here
    exit(1);
  }

  // select C-alpha atoms of chain A:
  selHnd = MMDB.NewSelection();
  MMDB.Select      ( selHnd,STYPE_ATOM,1,"A",
                     ANY_RES,"*",ANY_RES,"*",
                     "*","CA","C",SKEY_NEW );
  MMDB.GetSelIndex ( selHnd,Atom,alen );

  // get 1st contacting atom:
  strcpy ( S1,"/1/A/34/CA[C]" );
  A = MMDB.GetAtom ( S1 );
  if (!A)  {
    // atom not found, identify the reason:
    printf ( " Atom '%s' not found: ",S1 );
    switch (MMDB.AtomExtrCode)  {
      case AEXTR_NoModel   : printf ( "no such model.\n" );           break;
      case AEXTR_NoChain   : printf ( "no such chain in model.\n" );  break;
      case AEXTR_NoResidue : printf ( "no such residue in chain\n" ); break;
      case AEXTR_NoAtom    : printf ( "no such atom in residue\n" );  break;
      case AEXTR_WrongPath : printf ( "wrong atom ID syntax\n" );     break;
      default              : printf ( "unknown error code\n" );
    }
    exit(2);
  }

  // get contacts:
  contact   = NULL;  // prepare for dynamical allocation
  ncontacts = 0;     // of vector contact
  MMDB.SeekContacts ( A,         // 1st contacting atom
                      Atom,      // vector of selected atoms
                      alen,      // number of selected atoms
                      0.0,       // minimal contact distance
                      5.0,       // maximal contact distance
                      2,         // sequence distance
                      contact,   // vector of contacts
                      ncontacts, // number of contacts
                      0,         // allocate contact dynamically
                      0          // zero group ID
                    );

  // print contacts
  if (ncontacts>0)  {
    printf ( " Found %i contacts:\n",ncontacts );
    for (i=0;i<ncontacts;i++)
      printf ( " %s <-> %s   %10.4f A\n",
               S1,Atom[contact[i].id2]->GetAtomID(S2),
               contact[i].dist );
  } else
    printf ( " No contact found.\n" );

  // dispose array contact:
  if (contact)  delete[] contact;


void CMMDBManager::SeekContacts ( 
PPCAtom  Atom1, 
int alen1,
PPCAtom Atom2,
int alen2,
realtype dist1,
realtype dist2,
int seqDist,
RPSContact contact,
int & ncontacts,
int maxlen,
mat44 * TMatrix,
long group )
PURPOSE
Finding contacts between two vectors of atoms.
ARGUMENTS
PPCAtom Atom1
1st vector of contacting atoms. All atoms must be present in the coordinate hierarchy; an application may not supply "outside" atoms here. The vector may contain NULL pointers.

int alen1
Number of atoms in vector Atom1. The vector's index space is 0..alen1-1.

PPCAtom Atom2
2nd vector of contacting atoms. Both vectors may be obtained, for example, as a result of atom selection with CMMDBManager::GetSelIndex function. All atoms must be present in the coordinate hierarchy; no outside atoms are allowed. The vector may contain NULL pointers.

int alen2
Number of atoms in vector Atom2. The vector's index space is 0..alen2-1.

realtype dist1
Minimal contact distance, in angstroms.

realtype dist2
Maximal contact distance, in angstroms.

int seqDist
The minimal sequence distance of contacts. The contacting atoms are required to be in residues separated by not less than seqDist inter-residue spaces if they are in the same chain. Thus, seqDist=0 allows contacting atoms to be in the same residue, seqDist=1 requires them to be in different residues.

RPSContact contact
Array of found contacts. For ith contacting pair, contact[i].id1 is set to the index of contacting atom from vector Atom1, and contact[i].id2 - to the index of contacting atom from vector Atom2. The distance between contacting atoms is returned in contact[i].dist. A special field contact[i].group will contain the contact group ID (see below, parameter group).

int & ncontacts
Number of found contacts. The array contact is indexed as 0..ncontacts-1. If ncontacts>0 on input, it is assumed that ncontacts contacts were already found and newly found contacts are added to them.

int maxlen
Maximal number of contacts to find. If maxlen<=0 (default), array contact is allocated dynamically by SeekContacts to hold all found contacts.
If maxlen>0, SeekContacts neither allocates nor deallocates array contact. In this case, no more than maxlen contacts are returned.
In either case it is the application, rather than the Manager, that is responsible for deallocating array contact after use.

mat44 * TMatrix
A transformation matrix that should be applied to 2nd set of atoms Atom2. This may be used for finding contacts between symmetry mates.
If TMatrix is set to NULL (default), this parameter is ignored.
Even if the transformation is applied, the coordinates of atoms in Atom2 do not change upon return from the function.

long group
A contact group ID. This ID is simply stored in contact[i].group fields and may be useful if contacts are calculated in multiple calls to the function (e.g. in the course of generation of symmetry mates). This parameter has default value of 0.
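
The remark that coordinates in Atom2 do not change when TMatrix is applied can be pictured as transforming a temporary copy. The sketch below only illustrates that idea: Point3, Mat44, applyTransform and contactDistance are hypothetical names, and the matrix layout (rotation in the upper-left 3x3 block, translation in the last column, bottom row 0 0 0 1) is an assumption rather than a documented property of the Library's mat44 type.

```cpp
#include <cassert>
#include <cmath>

struct Point3 {
  double x, y, z;
};

// Hypothetical 4x4 matrix wrapper; the layout is an assumption (see above).
struct Mat44 {
  double m[4][4];
};

// Returns a transformed copy of p; p itself is left untouched.
Point3 applyTransform(const Mat44& t, const Point3& p) {
  // affine matrix expected under the assumed layout
  assert(t.m[3][0] == 0.0 && t.m[3][1] == 0.0 && t.m[3][2] == 0.0 && t.m[3][3] == 1.0);
  Point3 r;
  r.x = t.m[0][0] * p.x + t.m[0][1] * p.y + t.m[0][2] * p.z + t.m[0][3];
  r.y = t.m[1][0] * p.x + t.m[1][1] * p.y + t.m[1][2] * p.z + t.m[1][3];
  r.z = t.m[2][0] * p.x + t.m[2][1] * p.y + t.m[2][2] * p.z + t.m[2][3];
  return r;
}

// Distance between a and b, with b optionally moved by *tmatrix first.
// A null tmatrix means "no transformation", as for the TMatrix argument.
double contactDistance(const Point3& a, const Point3& b, const Mat44* tmatrix) {
  const Point3 moved = tmatrix ? applyTransform(*tmatrix, b) : b;
  const double dx = a.x - moved.x, dy = a.y - moved.y, dz = a.z - moved.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```

Because the second atom is taken by const reference and only a copy is moved, the caller's coordinates stay as they were, which is the behaviour described for Atom2.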

DESCRIPTION

The function attempts to find all such pairs of atoms {Atom1[i],Atom2[j]} that are within the distance dist1<=r<=dist2 from each other and belong to residues separated by at least seqDist inter-residue spaces. Indices of contacting atoms (i.e. their positions in input arrays Atom1 and Atom2) are returned in dynamically-allocated or static array contact.

The function employs the bricking algorithm and therefore considerably outperforms the trivial scheme of applying the atom-to-vector version of CMMDBManager::SeekContacts to each atom of vector Atom1 whenever that vector holds more than 2 atoms.
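
To show why bricking helps, here is a minimal spatial-grid search for contacts between two coordinate sets. It is a sketch of the general technique only and makes no claim about the Library's internals: Point3, Contact, CellKey, cellOf and seekContactsBricked are hypothetical names, the cell edge is set to dist2 as a simplifying choice, and the sequence-distance filter is omitted.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <tuple>
#include <vector>

struct Point3 {
  double x, y, z;
};

struct Contact {
  int    id1;
  int    id2;
  double dist;
};

typedef std::tuple<int, int, int> CellKey;

// Integer cell ("brick") indices of point p for a cubic cell of edge 'cell'.
static CellKey cellOf(const Point3& p, double cell) {
  return CellKey(static_cast<int>(std::floor(p.x / cell)),
                 static_cast<int>(std::floor(p.y / cell)),
                 static_cast<int>(std::floor(p.z / cell)));
}

std::vector<Contact> seekContactsBricked(const std::vector<Point3>& set1,
                                         const std::vector<Point3>& set2,
                                         double dist1, double dist2) {
  assert(dist2 > 0.0 && dist1 <= dist2);
  const double cell = dist2;  // a partner within dist2 is at most one cell away
  std::map<CellKey, std::vector<int> > bricks;
  for (int j = 0; j < static_cast<int>(set2.size()); ++j)
    bricks[cellOf(set2[j], cell)].push_back(j);

  std::vector<Contact> found;
  for (int i = 0; i < static_cast<int>(set1.size()); ++i) {
    const CellKey c = cellOf(set1[i], cell);
    // Only the 27 cells around set1[i] can hold atoms within dist2.
    for (int dx = -1; dx <= 1; ++dx)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dz = -1; dz <= 1; ++dz) {
          const CellKey key(std::get<0>(c) + dx, std::get<1>(c) + dy,
                            std::get<2>(c) + dz);
          std::map<CellKey, std::vector<int> >::const_iterator it = bricks.find(key);
          if (it == bricks.end()) continue;
          for (std::size_t k = 0; k < it->second.size(); ++k) {
            const int j = it->second[k];
            const double ex = set1[i].x - set2[j].x;
            const double ey = set1[i].y - set2[j].y;
            const double ez = set1[i].z - set2[j].z;
            const double r = std::sqrt(ex * ex + ey * ey + ez * ez);
            if (r >= dist1 && r <= dist2) found.push_back(Contact{i, j, r});
          }
        }
  }
  return found;
}
```

Each atom of the first set is compared only with atoms in neighbouring cells, so the work grows roughly with the number of atoms times the local density instead of with the product of the two vector lengths.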

NOTE 1: The number of inter-residue spaces between given residues is calculated from the actual number of residues between them, and not from the residues' sequence numbers.

NOTE 2: A coordinate file may be missing some residues or even whole parts of a chain. If this is the case and if the missing residue(s) fall within the sequence distance of seqDist from one of the contacting atoms, the algorithm assumes that not less than seqDist residues are missing. A gap of missing residues is assumed if C-alpha atoms of neighbouring residues are separated by more than 4 angstroms.


EXAMPLE

Looking for 5-angstrom contacts between C-alpha atoms of chain A with all sulphur atoms. Only contacts between not-the-same residues should be considered.

CMMDBManager MMDB;
int          RC,selHnd1,selHnd2,alen1,alen2,ncontacts,i;
PPCAtom      Atom1,Atom2;
PSContact    contact;
char         S1[100];
char         S2[100];

  // read coordinate file
  RC = MMDB.ReadCoorFile ( CoorFileName );
  if (RC) {
    // ... check for errors here
    exit(1);
  }

  // select C-alpha atoms of chain A:
  selHnd1 = MMDB.NewSelection();
  MMDB.Select      ( selHnd1,STYPE_ATOM,1,"A",
                     ANY_RES,"*",ANY_RES,"*",
                     "*","CA","C",SKEY_NEW );
  MMDB.GetSelIndex ( selHnd1,Atom1,alen1 );

  // select all sulphurs:
  selHnd2 = MMDB.NewSelection();
  MMDB.Select      ( selHnd2,STYPE_ATOM,1,"*",
                     ANY_RES,"*",ANY_RES,"*",
                     "*","*","S",SKEY_NEW );
  MMDB.GetSelIndex ( selHnd2,Atom2,alen2 );

  // get contacts:
  contact   = NULL;  // prepare for dynamical allocation
  ncontacts = 0;     // of vector contact
  MMDB.SeekContacts ( Atom1,     // 1st vector of atoms
                      alen1,     // length of 1st vector
                      Atom2,     // 2nd vector of atoms
                      alen2,     // length of 2nd vector
                      0.0,       // minimal contact distance
                      5.0,       // maximal contact distance
                      1,         // sequence distance: different residues only
                      contact,   // vector of contacts
                      ncontacts, // number of contacts
                      0,         // allocate contact dynamically
                      NULL,      // no transformation matrix
                      0          // zero group ID
                    );

  // print contacts
  if (ncontacts>0)  {
    printf ( " Found %i contacts:\n",ncontacts );
    for (i=0;i<ncontacts;i++)
      printf ( " %s <-> %s   %10.4f A\n",
               Atom1[contact[i].id1]->GetAtomID(S1),
               Atom2[contact[i].id2]->GetAtomID(S2),
               contact[i].dist );
  } else
    printf ( " No contact found.\n" );

  // dispose array contact:
  if (contact)  delete[] contact;


void SortContacts ( 
PSContact  contact, 
int ncontacts,
int sortmode )
PURPOSE
Sorting contacts.
ARGUMENTS
PSContact contact
Vector of contacts to sort.

int ncontacts
Length of vector contact.

int sortmode
The sort mode. This parameter may take the following values:

Value   Description
CNSORT_1INC   sorting by increasing index of 1st contacting atom contact[i].id1.
CNSORT_1DEC   sorting by decreasing index of 1st contacting atom contact[i].id1.
CNSORT_2INC   sorting by increasing index of 2nd contacting atom contact[i].id2.
CNSORT_2DEC   sorting by decreasing index of 2nd contacting atom contact[i].id2.
CNSORT_DINC   sorting by increasing contact distance contact[i].dist.
CNSORT_DDEC   sorting by decreasing contact distance contact[i].dist.


DESCRIPTION

The function sorts contacts found in the vector contact according to the sort mode specified by sortmode.

NOTE : This function is a standalone procedure rather than a member of CMMDBManager or any other class of the Library.
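
The six sort modes amount to six orderings of the same records. The sketch below shows how an equivalent ordering could be expressed with std::sort; it is not the Library's SortContacts. Contact, SortMode and sortContacts are hypothetical names, and the enumerators merely mirror the CNSORT_ modes listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Contact {
  int    id1;
  int    id2;
  double dist;
};

// Hypothetical counterparts of CNSORT_1INC ... CNSORT_DDEC.
enum SortMode { By1Inc, By1Dec, By2Inc, By2Dec, ByDistInc, ByDistDec };

void sortContacts(std::vector<Contact>& contacts, SortMode mode) {
  std::sort(contacts.begin(), contacts.end(),
            [mode](const Contact& a, const Contact& b) -> bool {
              switch (mode) {
                case By1Inc:    return a.id1 < b.id1;
                case By1Dec:    return a.id1 > b.id1;
                case By2Inc:    return a.id2 < b.id2;
                case By2Dec:    return a.id2 > b.id2;
                case ByDistInc: return a.dist < b.dist;
                case ByDistDec: return a.dist > b.dist;
              }
              assert(false);  // unknown sort mode
              return false;
            });
}
```

Note that std::sort is not stable, so contacts that compare equal under the chosen key may end up in any relative order; whether the Library's routine preserves that order is not stated in this document.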


void CMMDBManager::MakeBricks ( 
PPCAtom  atmvec, 
int avlen,
realtype Margin,
realtype BrickSize )
PURPOSE
Bricking the coordinate space and assigning atoms to the bricks
ARGUMENTS
PPCAtom atmvec
A vector of atoms; it may be obtained, for example, as a result of atom selection with CMMDBManager::GetSelIndex function. The atoms do not have to be associated with a coordinate hierarchy. The vector may contain NULL pointers.

int avlen
Length of vector atmvec. The vector must be indexed as 0..avlen-1.

realtype Margin
The margin (in angstroms) that is added in all X, Y and Z directions to the minimal box enclosing all atoms before bricking (see below).

realtype BrickSize
Size of bricks in angstroms. All bricks are cubes. This parameter has a default value of 6.

DESCRIPTION

The function cuts the volume occupied by the atoms given in vector atmvec into cubic bricks and, for each brick, makes a list of the atoms contained in it. The bricked volume is the minimal rectangular box that contains all atoms, padded by a layer at least Margin angstroms wide on every side, and sized so that an integer number of bricks of size BrickSize fits along each edge.

NOTE 1: The function removes previously existing bricking, if there was any.

NOTE 2: The contact-seeking function CMMDBManager::SeekContacts removes previously existing bricking.
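
The sizing rule in the description above can be turned into simple arithmetic. The sketch below shows one way to satisfy it and is not taken from the Library: Point3, GridDims and brickGridDims are hypothetical names, and rounding the padded extent up to a whole number of bricks is an assumption about how the "integer number of bricks" condition is met.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Point3 {
  double x, y, z;
};

struct GridDims {
  int nx, ny, nz;
};

// Number of bricks along each axis for the bounding box of 'atoms',
// padded by 'margin' on every side and rounded up to whole bricks.
GridDims brickGridDims(const std::vector<Point3>& atoms, double margin,
                       double brickSize) {
  assert(!atoms.empty() && margin >= 0.0 && brickSize > 0.0);
  Point3 lo = atoms[0], hi = atoms[0];
  for (std::size_t i = 1; i < atoms.size(); ++i) {
    lo.x = std::min(lo.x, atoms[i].x);  hi.x = std::max(hi.x, atoms[i].x);
    lo.y = std::min(lo.y, atoms[i].y);  hi.y = std::max(hi.y, atoms[i].y);
    lo.z = std::min(lo.z, atoms[i].z);  hi.z = std::max(hi.z, atoms[i].z);
  }
  GridDims d;
  d.nx = std::max(1, static_cast<int>(std::ceil((hi.x - lo.x + 2.0 * margin) / brickSize)));
  d.ny = std::max(1, static_cast<int>(std::ceil((hi.y - lo.y + 2.0 * margin) / brickSize)));
  d.nz = std::max(1, static_cast<int>(std::ceil((hi.z - lo.z + 2.0 * margin) / brickSize)));
  return d;
}
```

For example, atoms spanning 10 angstroms along X with a 6-angstrom margin and 6-angstrom bricks give a padded extent of 22 angstroms, which this rule rounds up to 4 bricks.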


EXAMPLE

Brick all C-alpha atoms of chain A and print the atoms found in the brick that contains the C-alpha of the 33rd selected residue:

CMMDBManager MMDB;
int          RC,selHnd,alen, nx,ny,nz, i;
PPCAtom      Atom;
PCBrick      Brick;
char         S[100];

  // read coordinate file
  RC = MMDB.ReadCoorFile ( CoorFileName );
  if (RC) {
    // ... check for errors here
    exit(1);
  }

  // select C-alpha atoms of chain A:
  selHnd = MMDB.NewSelection();
  MMDB.Select       ( selHnd,STYPE_ATOM,1,"A",
                      ANY_RES,"*",ANY_RES,"*",
                      "*","CA","C",SKEY_NEW );
  MMDB.GetSelIndex  ( selHnd,Atom,alen );

  MMDB.MakeBricks   ( Atom,alen,6.0,6.0 );

  // report the grid size first: nx,ny,nz are reused below
  MMDB.GetBrickDimension ( nx,ny,nz );
  printf ( " total bricks:\n"
           " %5i on X\n"
           " %5i on Y\n"
           " %5i on Z\n",nx,ny,nz );

  // locate the brick that holds the 33rd selected C-alpha
  MMDB.GetBrickCoor ( Atom[32], nx,ny,nz );
  Brick = MMDB.GetBrick ( nx,ny,nz );

  if (!Brick)  {
    printf ( " *****  no brick found: mmdb misfunction\n" );
  } else  {
    printf ( " atoms found in brick (%i,%i,%i):\n\n"
             " sel.No.     Coordinate ID\n",nx,ny,nz );
    for (i=0;i<Brick->nAtoms;i++)
      printf ( " %4i  %s\n",Brick->id[i],
               Brick->Atom[i]->GetAtomID(S) );
  }

  MMDB.RemoveBricks    ();
  MMDB.DeleteSelection ( selHnd );
  


void CMMDBManager::GetBrickDimension ( 
int &  nxmax, 
int & nymax,
int & nzmax )
PURPOSE
Retrieving the number of bricks in X, Y and Z - directions
ARGUMENTS
int & nxmax
Number of bricks in X-direction

int & nymax
Number of bricks in Y-direction

int & nzmax
Number of bricks in Z-direction

DESCRIPTION

The function returns the number of bricks in the X, Y and Z directions created by function CMMDBManager::MakeBricks. The bricks are indexed as [0..nxmax-1, 0..nymax-1, 0..nzmax-1]. The function returns zeros if bricks were not created.


PCBrick CMMDBManager::GetBrick ( 
int  nx, 
int ny,
int nz )
PURPOSE
Retrieving a brick
ARGUMENTS
int nx
X-index of the brick to retrieve

int ny
Y-index of the brick to retrieve

int nz
Z-index of the brick to retrieve

DESCRIPTION

The function returns a pointer to the brick with indices nx, ny and nz in the X, Y and Z directions, respectively. If such a brick does not exist, if the indices are wrong (beyond the valid range) or if bricking was not done, the function returns NULL.

NOTE : The application must not dispose of or alter the bricks.
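
The NULL-on-failure contract described above is easy to mirror in a toy container, which may help when writing callers that must check the result. This is only a sketch: Brick, BrickGrid and its dense storage are hypothetical and say nothing about how the Library actually stores its bricks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical brick: the selection indices of the atoms it contains.
struct Brick {
  std::vector<int> atomIds;
};

// Hypothetical dense grid of bricks indexed as [0..nx-1, 0..ny-1, 0..nz-1].
class BrickGrid {
 public:
  BrickGrid(int nx, int ny, int nz)
      : nx_(nx), ny_(ny), nz_(nz), bricks_(cellCount(nx, ny, nz)) {}

  // Returns the brick at (ix,iy,iz), or nullptr if any index is out of
  // range, which is always the case for an empty 0x0x0 grid.
  Brick* get(int ix, int iy, int iz) {
    if (ix < 0 || ix >= nx_ || iy < 0 || iy >= ny_ || iz < 0 || iz >= nz_)
      return nullptr;
    return &bricks_[(static_cast<std::size_t>(ix) * ny_ + iy) * nz_ + iz];
  }

 private:
  static std::size_t cellCount(int nx, int ny, int nz) {
    assert(nx >= 0 && ny >= 0 && nz >= 0);
    return static_cast<std::size_t>(nx) * ny * nz;
  }

  int nx_, ny_, nz_;
  std::vector<Brick> bricks_;
};
```

A caller written against this model tests the returned pointer before dereferencing it, which is the same check an application needs after a brick lookup that may return NULL.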


void CMMDBManager::GetBrickCoor ( 
PCAtom  A, 
int & nx,
int & ny,
int & nz )
PURPOSE
Getting coordinates of a brick that contains given atom
ARGUMENTS
PCAtom A
The atom whose brick is to be found.

int & nx
X-index of the brick containing atom A

int & ny
Y-index of the brick containing atom A

int & nz
Z-index of the brick containing atom A

DESCRIPTION

The function returns indices nx, ny and nz of the brick that contains atom A. If such a brick does not exist, or if bricking was not done, the function returns a negative value for nx.


void CMMDBManager::GetBrickCoor ( 
realtype  x, 
realtype y,
realtype z,
int & nx,
int & ny,
int & nz )
PURPOSE
Getting coordinates of a brick that contains given point of space
ARGUMENTS
realtype x
X-coordinate (in angstroms) of the point whose brick is to be found

realtype y
Y-coordinate (in angstroms) of the point whose brick is to be found

realtype z
Z-coordinate (in angstroms) of the point whose brick is to be found

int & nx
X-index of the brick containing point (x,y,z)

int & ny
Y-index of the brick containing point (x,y,z)

int & nz
Z-index of the brick containing point (x,y,z)

DESCRIPTION

The function returns indices nx, ny and nz of the brick that contains point (x,y,z) (given in angstroms). If such a brick does not exist, or if bricking was not done, the function returns a negative value for nx.
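
Mapping a point to brick indices is a matter of dividing its offset from the corner of the bricked box by the brick size. The sketch below illustrates this under stated assumptions: BrickIndex and brickCoor are hypothetical names, the box origin (x0,y0,z0) and the grid dimensions are passed in explicitly because this document does not say how the Library stores them, and all three indices are set to -1 on failure although only a negative nx is documented.

```cpp
#include <cassert>
#include <cmath>

struct BrickIndex {
  int nx, ny, nz;
};

// Indices of the brick containing point (x,y,z) in a grid whose lower corner
// is (x0,y0,z0), with cubic bricks of edge brickSize and nxmax*nymax*nzmax
// bricks in total. A negative nx signals that no brick contains the point.
BrickIndex brickCoor(double x, double y, double z,
                     double x0, double y0, double z0,
                     double brickSize, int nxmax, int nymax, int nzmax) {
  assert(brickSize > 0.0);
  BrickIndex b;
  b.nx = static_cast<int>(std::floor((x - x0) / brickSize));
  b.ny = static_cast<int>(std::floor((y - y0) / brickSize));
  b.nz = static_cast<int>(std::floor((z - z0) / brickSize));
  if (b.nx < 0 || b.nx >= nxmax || b.ny < 0 || b.ny >= nymax ||
      b.nz < 0 || b.nz >= nzmax) {
    b.nx = b.ny = b.nz = -1;  // outside the bricked volume, or no bricking
  }
  return b;
}
```

Passing zero dimensions models the case where bricking was not done: every point then falls outside the grid and the returned nx is negative.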


void CMMDBManager::RemoveBricks ( 
 )
PURPOSE
Removing bricking
DESCRIPTION

The function removes the bricks created by function CMMDBManager::MakeBricks. There is no particular need to call this function other than to free memory: the bricks are disposed of automatically when necessary.


