SSM vs. others: 1SAR:A

 
Materials from this page cannot be reproduced without permission from the authors.
Comparisons were made in November 2002, using the then-current versions of VAST, CE, DALI and DEJAVU, and SSM v1.22 of 20/11/2002.
 

CONTENTS
  1. VAST
  2. CE (Combinatorial Extension)
  3. DALI
  4. DEJAVU
  5. Conclusion
 
1SAR:A (96 residues)
RIBONUCLEASE SA (E.C.3.1.4.8)
The longer helix and 6 strands were used for SSE matching.
 

1.  V A S T    (server)

Figure 1SAR:A-1 shows the Ca-alignment lengths obtained from SSM and VAST for different structural neighbours (chosen by VAST). As the figure shows, VAST results allow clear identification of highly similar (PDB entries 1-27), less similar (28-222) and remote structural neighbours. SSM fully agrees with VAST on highly similar structures but gives noticeably longer Ca-alignments for less similar and remote neighbours, with a barely visible borderline between these two groups.


  Figure 1SAR:A-1.
Length of Ca-alignment as a function of PDB entry, obtained by SSM (black line) and VAST (red line). Details of the calculations are given here.
 

Analysis of Figure 1SAR:A-2 reveals that the longer SSM alignments almost always come at the expense of larger RMSDs. It should therefore be concluded that SSM and VAST employ different criteria for resolving the compromise between alignment length and RMSD.


  Figure 1SAR:A-2.
RMSD of Ca-alignment corresponding to data in Figure 1SAR:A-1. Details of the calculations are given here.
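
The trade-off between alignment length and RMSD can be illustrated with a minimal sketch. It assumes that the aligned Ca atoms have already been superposed (no fitting is performed), and the coordinates are hypothetical values chosen only to show the effect; the function name `rmsd` is not taken from any of the servers.

```python
import math

def rmsd(pairs):
    """Root-mean-square deviation over aligned Ca pairs.

    Each pair is ((x1, y1, z1), (x2, y2, z2)) in Angstroms. The coordinates
    are assumed to be already superposed; no fitting is done here.
    """
    if not pairs:
        raise ValueError("need at least one aligned pair")
    total = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(total / len(pairs))

# Hypothetical data: four well-fitting pairs, each 1 A apart ...
core = [((float(i), 0.0, 0.0), (float(i), 1.0, 0.0)) for i in range(4)]
# ... and two loosely fitting pairs, each 4 A apart.
extension = [((float(i), 0.0, 0.0), (float(i), 4.0, 0.0)) for i in range(4, 6)]

short_rmsd = rmsd(core)              # alignment of 4 pairs
long_rmsd = rmsd(core + extension)   # alignment of 6 pairs

print(f"4 pairs: RMSD = {short_rmsd:.2f} A")
print(f"6 pairs: RMSD = {long_rmsd:.2f} A")
```

Extending the alignment by the two loosely fitting pairs makes it longer but raises the RMSD, which is the kind of compromise each server has to resolve by its own criteria.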
 

Despite the visible differences between SSM and VAST alignment lengths and RMSDs seen in Figures 1SAR:A-1 and 1SAR:A-2, the match indexes of SSM and VAST alignments are very close to each other (cf. Figure 1SAR:A-3). In fact, they replicate each other very well, down to quite small details, with SSM's match index being slightly higher than that of VAST for more remote structural neighbours. The borderlines between structures with different similarity levels are expressed equally well in the match indexes from both SSM and VAST.


  Figure 1SAR:A-3.
Match Index corresponding to data shown in Figure 1SAR:A-1. Details of the calculations are given here.
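
This page does not state the match index formula (it is given behind the "Details of the calculations" link), so the sketch below does not reproduce it. As an illustration only, it uses a Q-score-style measure of the form N_align^2 / ((1 + (RMSD/R0)^2) * N1 * N2), with R0 = 3 Å taken as an assumption, to show how a longer alignment with a higher RMSD and a shorter alignment with a lower RMSD can receive nearly the same quality score. All numbers are hypothetical.

```python
def q_score(n_aligned, rmsd, n1, n2, r0=3.0):
    """Q-score-style quality measure (an assumption, NOT this page's match index).

    n_aligned: number of aligned Ca pairs
    rmsd:      RMSD of the alignment, in Angstroms
    n1, n2:    number of residues in the two structures
    r0:        length scale balancing RMSD against alignment length (assumed 3 A)
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("chain lengths must be positive")
    return n_aligned ** 2 / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)

# Hypothetical example: a 96-residue query against a 110-residue neighbour.
short_tight = q_score(n_aligned=60, rmsd=1.5, n1=96, n2=110)
long_loose = q_score(n_aligned=70, rmsd=2.4, n1=96, n2=110)

print(f"shorter, lower RMSD : {short_tight:.3f}")
print(f"longer, higher RMSD : {long_loose:.3f}")
```

With a score of this kind, two servers can report visibly different lengths and RMSDs for the same pair of structures while still agreeing on the overall quality of the alignment.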
 

P-values of SSM and VAST alignments, shown in Figure 1SAR:A-4, demonstrate similar trends; however, SSM produces considerably lower P-values for the closest structural neighbours. Interestingly, although the borderline between less similar and remote structural neighbours (at PDB entry 222 in the Figure), pronounced in VAST results, is barely visible in SSM's alignment lengths (cf. Figure 1SAR:A-1), it is clearly present in SSM's P-values.


  Figure 1SAR:A-4.
P-values corresponding to matches shown in Figure 1SAR:A-1. Details of the calculations are given here.
 

As expected from the P-values, Z-scores of SSM Ca-alignments (Figure 1SAR:A-5) are generally higher than those obtained from VAST. This is particularly true for the closest structural neighbours. The borderline between less similar and remote structural neighbours (at PDB entry 222 in the Figure) is clearly present in SSM's Z-scores, perhaps even more pronounced than in VAST results.


  Figure 1SAR:A-5.
Z-scores corresponding to matches shown in Figure 1SAR:A-1. Details of the calculations are given here.
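
The expected link between lower P-values and higher Z-scores can be sketched under a simple assumption: that the Z-score is the point on a standard normal distribution whose one-sided upper tail equals the P-value. This page does not state how any of the servers actually derive their Z-scores, so the function `z_from_p` below is hypothetical and purely illustrative.

```python
from statistics import NormalDist

def z_from_p(p_value):
    """Z-score whose one-sided upper Gaussian tail equals p_value.

    Assumes a standard normal distribution; by symmetry the upper-tail
    quantile is minus the lower-tail quantile, which avoids computing 1 - p
    for very small p.
    """
    if not 0.0 < p_value < 1.0:
        raise ValueError("P-value must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(p_value)

# Hypothetical P-values for a close, a less similar and a remote neighbour.
for p in (1e-6, 1e-3, 0.2):
    print(f"P = {p:g}  ->  Z = {z_from_p(p):.2f}")
```

Under this assumption the mapping is monotonic, so a server that assigns lower P-values to the closest neighbours will also assign them higher Z-scores.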
 

 

 

2.  C E (Combinatorial Extension)    (server)

Unusually for Combinatorial Extension, in this particular example of 1SAR:A it gives fewer structural neighbours than VAST (cf. Figures 1SAR:A-6 and 1SAR:A-1). Unlike the VAST results, Ca-alignments from CE do not allow one to differentiate between less similar and remote structural neighbours; however, the 25 closest structures are clearly separated. SSM agrees with CE on all highly similar structures but, as usual, gives shorter alignments for less similar and remote neighbours.


  Figure 1SAR:A-6.
Length of Ca-alignment as a function of PDB entry, obtained by SSM (black line) and CE (red line). Details of the calculations are given here.
 

Analysis of the Ca-alignment RMSDs (Figure 1SAR:A-7) shows that the longer CE alignments always result in larger RMSDs. Comparison of Figures 1SAR:A-6 and 1SAR:A-7 shows a remarkable correlation of almost all pronounced spikes in the two figures, reflecting the compromise between alignment length and RMSD.


  Figure 1SAR:A-7.
RMSD of Ca-alignment corresponding to data in Figure 1SAR:A-6. Details of the calculations are given here.
 

The match indexes of SSM and CE alignments agree very well, down to small details (cf. Figure 1SAR:A-8). It may be seen from the Figure that SSM makes slightly better alignments for more remote structural neighbours, which is not obvious from visual inspection of Figures 1SAR:A-6 and 1SAR:A-7.


  Figure 1SAR:A-8.
Match Index corresponding to data shown in Figure 1SAR:A-6. Details of the calculations are given here.
 

Z-scores of SSM and CE alignments, shown in Figure 1SAR:A-9, are in fact very close to each other once the factor of 2, which we always apply to Z-scores from CE, is taken into account. The SSM curve looks much spikier than that of CE, the latter being bizarrely straight over most of the range.


  Figure 1SAR:A-9.
Z-scores corresponding to matches shown in Figure 1SAR:A-6. Details of the calculations are given here.
 

 

 

3.  D A L I    (server)

Compared with VAST and CE, DALI produces fewer matches (cf. Figure 1SAR:A-10). It fails to align all residues for the closest structural neighbours 1SAR:A and 1SAR:B. As seen from Figure 1SAR:A-10, SSM and DALI generally agree on the length of Ca-alignments; however, DALI tends towards longer alignments for remote structural neighbours.


  Figure 1SAR:A-10.
Length of Ca-alignment as a function of PDB entry, obtained by SSM (black line) and DALI (red line). Details of the calculations are given here.
 

As Figure 1SAR:A-11 shows, DALI's longer alignments come at the cost of higher RMSDs, as compared to SSM. As seen from the Figure, RMSDs from DALI are always higher than those from SSM, with one of them being probably higher than reasonable (more than 5Å).


  Figure 1SAR:A-11.
RMSD of Ca-alignment corresponding to data in Figure 1SAR:A-10. Details of the calculations are given here.
 

The above conclusion is fully corroborated by the data presented in Figure 1SAR:A-12. As seen from the Figure, the match indexes of SSM and DALI alignments nearly coincide. This indicates that the principal quality of the alignments given by these servers is essentially the same.


  Figure 1SAR:A-12.
Match Index corresponding to data shown in Figure 1SAR:A-10. Details of the calculations are given here.
 

DALI's Z-scores (Figure 1SAR:A-13) are higher than those from SSM for highly similar structures, and lower for the remote ones. Z-scores from DALI and SSM show very similar trends, although they differ in absolute values. As is typically found here, Z-scores from DALI agree much better with the negative logarithm of SSM's P-values (shown by the black line in Figure 1SAR:A-13).


  Figure 1SAR:A-13.
Z-scores corresponding to matches shown in Figure 1SAR:A-10. Details of the calculations are given here.
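
The negative-logarithm transform of P-values mentioned above can be sketched as follows. The base of the logarithm is not stated on this page; base 10 is assumed here, and the P-values are hypothetical.

```python
import math

def minus_log_p(p_values, base=10.0):
    """Return -log(P) for each P-value (base 10 is an assumption)."""
    result = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("P-values must lie in (0, 1]")
        result.append(-math.log(p) / math.log(base))
    return result

# Hypothetical P-values, ordered from the closest to the most remote neighbour.
p_values = [1e-12, 1e-6, 1e-3, 0.1]
scores = minus_log_p(p_values)

for p, s in zip(p_values, scores):
    print(f"P = {p:g}  ->  -log10(P) = {s:.1f}")
```

The transform turns P-values spanning many orders of magnitude into a score that grows with significance, which makes it directly comparable in shape with a Z-score curve.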
 

 

 

4.  D E J A V U    (server)

DEJAVU failed to recognize most of the closest structural neighbours, including 1SAR:A itself, and gave 1RGE as the closest prototype of the input. As seen from Figure 1SAR:A-14, most of DEJAVU's output represents very remote structures, which differs markedly from VAST, CE, DALI and SSM. Most Ca-alignments from DEJAVU are considerably shorter than those computed by SSM.


  Figure 1SAR:A-14.
Length of Ca-alignment as a function of PDB entry, obtained by SSM (black line) and DEJAVU (red line). Details of the calculations are given here.
 

It is therefore to be expected that DEJAVU produces lower RMSDs, which is fully confirmed by the data in Figure 1SAR:A-15. The Figure gives the impression that DEJAVU imposes high penalties on RMSDs above 2Å. RMSDs from SSM are considerably higher but stay within a reasonable range (less than 5Å).


  Figure 1SAR:A-15.
RMSD of Ca-alignment corresponding to data in Figure 1SAR:A-14. Details of the calculations are given here.
 

The match index calculated from SSM results virtually coincides with that derived from DEJAVU's output (cf. Figure 1SAR:A-16). Given the difference in alignment lengths and RMSDs in Figures 1SAR:A-14 and 1SAR:A-15, one may therefore conclude that SSM and DEJAVU differ substantially in balancing the compromise between alignment length and RMSD, at a similar principal quality of alignments.


  Figure 1SAR:A-16.
Match Index corresponding to data shown in Figure 1SAR:A-14. Details of the calculations are given here.
 

P-values of SSM and DEJAVU alignments are shown in Figure 1SAR:A-17. As seen from the Figure, there is a similarity in the SSM and DEJAVU assessment of statistical significance of the results. While DEJAVU gives lower P-values for remote structures, SSM assigns a higher statistical significance (lower P-values) to highly similar structures.


  Figure 1SAR:A-17.
P-values corresponding to matches shown in Figure 1SAR:A-14. Details of the calculations are given here.
 

Z-scores of SSM and DEJAVU alignments, presented in Figure 1SAR:A-18, show a slightly better agreement than P-values (cf. Figure 1SAR:A-17). Z-scores from DEJAVU are approximately twice as high as those from SSM.


  Figure 1SAR:A-18.
Z-scores corresponding to matches shown in Figure 1SAR:A-14. Details of the calculations are given here.
 

 

 

5.  Conclusion

The principal quality of 3D Ca-alignments, as measured by match index, is in good agreement between the results produced by SSM and those obtained from other servers. However, the servers differ in resolving the compromise between alignment length and RMSD. In that respect, SSM agrees reasonably well with VAST, CE, DALI and DEJAVU, to the degree of their difference from each other.

Thus, for remote structures SSM produces longer alignments than VAST but shorter ones than CE. Correspondingly, VAST's RMSDs are lower, and those from CE higher, than those from SSM. DALI tends towards longer alignments at higher RMSDs for remote structures, as compared with SSM. DEJAVU's output consists mostly of remote structural neighbours, with shorter alignment lengths and lower RMSDs than SSM's. P-values and Z-scores demonstrate similar trends but differ in absolute values. This allows one to use them for rating the matches, expecting that the correspondence in the rating will be preserved, on average, between different servers.