Introduction

This tutorial explains how to use the advanced search system MSDPro. MSDPro is a graphical interface to our EMSD database that lets you build complex questions by nesting logical statements in a hierarchy. Multiple AND, OR and NOT relationships can be defined between a large number of search terms.

This tutorial walks through some very simple queries and starts with the basics of using the mouse controls to move and combine query statements.

  1. Getting started
  2. Basic control to move and draw the query
  3. The Question or query boxes
  4. A database search
  5. A complex query

Getting started

Because the query builder is an applet that uses functionality not available in earlier versions of Java, you need to have a version of Java 1.4.1 installed as a browser plugin. The query builder startup page checks the version of Java you have installed and, if it is not suitable, provides a link that allows you to update it. Note that you may need administrative permission to install a Java upgrade.

The MSDPro query interface can be started from this link. An applet will be downloaded after a short delay and you will be presented with a graphical interface containing five main areas as shown below.

Figure 1

Basic control to move and draw the query

There are two ways the search questions can be moved from the question list into the query builder :-

1)  By drag and drop

Drag and drop involves the following actions:-

  • Place the mouse pointer over the first query question (ID code)  (figure 1 – panel 1).
  • Press the left mouse button and keep it pressed.
  • Move the mouse pointer, while still pressing the left mouse button, into the query builder pane (figure 1 – panel 4); you will see that the mouse pointer now has an additional symbol against the arrow showing a box and a "+" sign.
  • When the mouse pointer is at the centre of the query builder panel, let go of the mouse button.

This action is called "drag and drop". A new box will appear that contains a text entry box, a selection drop-down, help text and OK/CANCEL/RESET buttons. We will explain this in a moment.

2)  By double click

The double click action involves the following:-

  • Place the mouse pointer over the first query question (ID code) in panel 1.
  • Press and let go of the mouse button twice in rapid succession. A press and release of the mouse button is known as a click; clicking twice is a "double click".

A new box will appear at the centre of the query builder panel. Note that all subsequent question boxes opened this way appear in the same place, so "drag and drop" is recommended because it lets you place the question box in the desired position.

The Question or query boxes

Both the drag and drop action and the double click action open a question box. All search questions from the 1st panel require you to enter some information that will be used as a search target. The ID code question box opens a simple text dialog box that requests an ID code (figure 2). An ID code is the reference number/name that is unique to a piece of biological information: its bar code. There are 7 different ID codes that can be typed in this box; we are going to use the default (PDB) code with a name of 7dfr. Type the word 7dfr on the keyboard and it will appear in the white text field in this dialog. If no text appears in this box as you type, move the mouse cursor to the white text box and click the left mouse button once.

Note that ID codes for different biological data can be selected by clicking with the left mouse button on the button labelled PDB on the right of the question panel. If you click on this button you will see a list of 7 different possible codes. Please pick PDB to close this list again, so that PDB remains as the button label (see figure 2).

Figure 2

Select the OK button. The question box will close, leaving a small query box labelled ID code with a blue marker next to it. The marker is used to open the question box again, and it also shows when the content of the question box is invalid by turning red.

  • Question boxes can be opened using the red/blue marker.
  • A red marker indicates that the contents of the question box are not valid; blue indicates that the target is valid.

Try to invalidate this question by using an ID code with the wrong number of characters. What happens when you click OK?

How does the query builder know this information about ID codes?
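As a starting point for thinking about this question, here is a minimal Python sketch of the kind of format check a query builder could apply. It assumes that a classic PDB ID code is exactly four characters, a digit followed by three letters or digits. The names `is_valid_pdb_id` and `marker_colour` are hypothetical, and MSDPro's actual validation rules are not described in this tutorial.

```python
import re

# Assumption: a classic PDB ID code is exactly four characters,
# a digit followed by three letters or digits (for example "7dfr").
# This is an illustration only, not MSDPro's real rule set.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_valid_pdb_id(code: str) -> bool:
    """Return True if `code` has the shape of a four-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(code.strip()) is not None


def marker_colour(code: str) -> str:
    """Mimic the tutorial's marker: blue for a valid target, red otherwise."""
    return "blue" if is_valid_pdb_id(code) else "red"


print(marker_colour("7dfr"))   # valid shape
print(marker_colour("7df"))    # too short
```

A check like this only tests the shape of the code. Whether an entry with that code actually exists can only be answered by the database.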

A database search

You now have a single question box. Make sure the query box marker is blue; otherwise repeat the previous section. At the base of the MSDPro interface you will see an option list with 4 different key words (and, or, not, search – see figure 3). The default checked option is search. These 4 options are "operators": each applies an operation to a target query.

Figure 3

We need to use the search operator to search for the ID code question. Make sure that search is selected (black dot in figure 3), then move the mouse pointer to beyond the top left of the ID code question box in the search area (figure 1 – panel 4). Now press the left mouse button and, while keeping it pressed, move the cursor to beyond the bottom right of the ID code question box. That is, we use a click and drag action to lasso the question box. The result will be a blue box with a title that says search and a marker to the right of the title bar.

  • As you lasso the question box you will see a lasso outline to aid in surrounding the question box.
  • If the question box is surrounded by a "valid" lasso then the operator (search) box will be dark blue
  • If the operator has not correctly surrounded the question box then the search box will be pale blue and the title bar will be dark blue.
  • A search lasso can surround any number of question boxes, and these are combined by logical "AND".
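The idea of a "valid" lasso in the list above can be pictured as a rectangle containment test. The following Python sketch assumes that a lasso is valid for a question box when the dragged rectangle fully encloses the box. The `Rect` class, the function names and the coordinates are hypothetical and are not taken from MSDPro.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen coordinates (y grows downwards)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies completely inside this rectangle."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def lasso_from_drag(x0: float, y0: float, x1: float, y1: float) -> Rect:
    """Build a rectangle from the press and release points of a drag.

    The corners are normalised so the drag can go in any direction.
    """
    return Rect(min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def enclosed_boxes(lasso: Rect, boxes: dict[str, Rect]) -> list[str]:
    """Names of all question boxes fully surrounded by the lasso."""
    return [name for name, rect in boxes.items() if lasso.contains(rect)]


# Hypothetical layout: one question box in the query builder panel.
boxes = {"ID code": Rect(100, 100, 180, 130)}
lasso = lasso_from_drag(90, 90, 200, 150)
print(enclosed_boxes(lasso, boxes))
```

Every box returned by `enclosed_boxes` would then be combined by logical "AND" under the search operator.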

To carry out the search for the ID code 7dfr, click the gray marker box on the blue search-operator title bar. The pale gray marker box will turn dark gray, and stay dark gray until the results are returned in the results panel (figure 1 – panel 5). In this case there will be a single result, as shown in figure 4.

Figure 4

To view the atlas page for PDB ID code 7dfr, click the blue word 7dfr within the results panel. This will open a browser window containing summary information about this structure (figure 5). Documentation on using the atlas pages can be found elsewhere and is not covered here. To close the browser window when you have finished viewing the atlas pages, pick the [X] window button at the top right of the window, as shown in figure 5.

Figure 5

How do you think you can view the molecule structure and sequence for this query result? (Hint: see figure 4.)

A complex query

This part of the tutorial creates a query that combines various questions using different logic. First we have to tell MSDPro that we want to start a new query. At the bottom left of MSDPro you will see [New query] and [close query] (figure 1). Click the former, and a new empty query window will appear with the tag Query 2. We are now going to find all the high-resolution structures that either have a similar fold to 7dfr or bind methotrexate, and that are not multimeric.

  1. To define "has similar fold to 7dfr" we will use SSM (secondary structure matching) as a question box. Scroll down the list of search questions (figure 1 – panel 1) until SSM can be seen (it is near the bottom of the list). Click and drag this question into the search panel and the question panel will open. It has a number of options; we will use the defaults. The top left text field (white box) is currently empty. Type 7dfr into this PDB code field and click the [OK] button at the bottom of the question box. You should now have a small gray rectangle labelled SSM, with a blue marker, in the query panel.

Figure 6

  2. To define "binds methotrexate" we will use the Associated small molecule question, which is approximately half way down the scrolling list of search questions. Click and drag this into the search panel, and it will open to reveal a text entry box and a two-option selection list. Type MTX into the text field and change the option list: click the text box [Molecule Name] to reveal the list, then click the [3 letter code] option. You should have a question box containing the text string MTX and the target type 3 letter code. Click the [OK] button.

Figure 7

  3. To define "high resolution" we will use the resolution question box. Click and drag this to the search panel. The box will open to reveal two input fields for floating point numbers (default 0.4 and 100). Change the value 100 to 2, so that the resolution range 0.4 Å to 2 Å is defined; the query will then select only structures within this range.

Figure 8

  4. To define NOT "multimeric", click and drag the Assembly type question into the search panel. It contains a single option list currently labelled Monomeric. Click this button to reveal a scrolling list of options; the second is multimeric, so select this one.

Figure 9

You should now have 4 question boxes scattered in the search panel. These questions can be moved around using a click and drag action, which you are likely to need when combining them with operators. We now need to combine them using logical operators.

First we are going to "NOT" the Assembly type. At the bottom of the MSDPro window, check the "NOT" option in the list (and, or, not, search – see figure 3), then click and drag an area that covers the whole of Assembly type. A red box will appear with a not title bar. Note that not is a unary operator: it can only surround a single question box, or a collection of question boxes combined with either and or or operators.

Next we will "OR" the SSM question with the Associated small molecule question. Select the or operator at the bottom of the MSDPro window, and click and drag an area that includes SSM and Associated small molecule. It is often easier to make an "OR" box big enough to hold these elements, then click and drag the questions into it. Notice that the or box will be pale green until you place 2 questions into it, when it becomes bright green.

Finally we need to "AND" the 3 questions (where 1 question is an "OR" of two items and 1 question is a "NOT" of 1 item). We can take a short cut here: the search operator is implicitly an and operator, so we can use it to surround all the questions, which combines them by "AND" and allows a search to take place. You should now have a query panel similar to that in figure 10.

Figure 10
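The logic of the finished query can also be written out as a small expression tree. The Python sketch below is only an illustration of how the operators nest: the helper functions, field names and the four toy records are invented for this example and do not reflect the EMSD schema or real search results.

```python
from typing import Callable

Predicate = Callable[[dict], bool]


# Operators: each takes predicates and returns a new predicate.
def AND(*terms: Predicate) -> Predicate:
    return lambda entry: all(t(entry) for t in terms)


def OR(*terms: Predicate) -> Predicate:
    return lambda entry: any(t(entry) for t in terms)


def NOT(term: Predicate) -> Predicate:
    # Unary: wraps exactly one question or one combined group.
    return lambda entry: not term(entry)


# Question boxes (hypothetical field names).
def ssm_similar_to(pdb_id: str) -> Predicate:
    return lambda entry: pdb_id in entry["similar_folds"]


def binds(three_letter_code: str) -> Predicate:
    return lambda entry: three_letter_code in entry["ligands"]


def resolution_between(low: float, high: float) -> Predicate:
    return lambda entry: low <= entry["resolution"] <= high


def assembly_is(kind: str) -> Predicate:
    return lambda entry: entry["assembly"] == kind


# The search operator acts as the outer AND.
query = AND(
    OR(ssm_similar_to("7dfr"), binds("MTX")),
    resolution_between(0.4, 2.0),
    NOT(assembly_is("multimeric")),
)

# Invented toy records, used only to exercise the logic.
toy_entries = [
    {"id": "toy1", "similar_folds": {"7dfr"}, "ligands": set(),
     "resolution": 1.8, "assembly": "monomeric"},
    {"id": "toy2", "similar_folds": set(), "ligands": {"MTX"},
     "resolution": 1.5, "assembly": "multimeric"},
    {"id": "toy3", "similar_folds": set(), "ligands": {"MTX"},
     "resolution": 2.5, "assembly": "monomeric"},
    {"id": "toy4", "similar_folds": set(), "ligands": {"MTX"},
     "resolution": 1.9, "assembly": "monomeric"},
]

hits = [entry["id"] for entry in toy_entries if query(entry)]
print(hits)  # ['toy1', 'toy4']
```

Here toy2 is rejected by the NOT box and toy3 by the resolution range, which mirrors how each box in the query panel narrows the hit list.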

Once the blue search operator is in place, click the gray button on the search title bar. This will query the database. Because this query uses a structure alignment (SSM), which is a relatively slow calculation, the hit list will take a number of seconds to return. Note, however, that our database caches (remembers) any identical query made in the last 24 hours, so you may find that the query returns within 1 second because it was asked recently.
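The caching behaviour described above can be pictured with a small time-limited cache. This is a minimal Python sketch that assumes queries are identified by their exact text and expire after 24 hours. The `QueryCache` class and its parameters are hypothetical, and the real server-side mechanism is not documented in this tutorial.

```python
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # "the last 24 hours"


class QueryCache:
    """Remember the result of each identical query for a limited time."""

    def __init__(self, run_query, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._run_query = run_query   # the slow function that does the real search
        self._ttl = ttl
        self._clock = clock           # injectable so the cache can be tested
        self._entries = {}            # query text -> (timestamp, result)

    def search(self, query_text):
        now = self._clock()
        cached = self._entries.get(query_text)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]          # fast path: asked recently
        result = self._run_query(query_text)  # slow path: run the search
        self._entries[query_text] = (now, result)
        return result
```

With a cache of this kind, the first SSM query pays the full alignment cost, while an identical query within the time limit returns the stored hit list immediately.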

  • Picking the blue word view (figure 4) will open the AstexViewer@MSD-EBI viewer and allow you to view a single structure and sequence. See the relevant tutorial on using this tool.
  • Picking the [Browse results] button from the results panel will directly open the atlas page for these results. If there is more than one result then the hit list will be opened within a browser window from which it can be saved as a text file or XML file.
  • The [select] toggle box on each result record defines which structures are viewed when the [view] button is picked. [select all] will select all the results, and [select none] will de-select all the results. The select state of each hit changes the initial display state of the hit list within the AstexViewer@MSD-EBI viewer when displaying multiple hits.
  • Picking the [view] button will open the AstexViewer@MSD-EBI viewer with the currently selected molecules.

You should have 14 hits to this query as of 20 August 2003 (see figure 10), though there may be more hits at a later date. If there are fewer, or significantly more, than this, please return to the last section and repeat the query.

It is recommended that you try the Viewer tutorial to understand some of the basic features of this facility. This completes the basic tutorial for using the MSDPro search system. There are a number of scientific tutorials to work through that concentrate on solving real problems. These are recommended to get the best out of our search systems.