IDA search tool documentation
This documentation relates to the search by domain organisation page.
You will find here a short description of the tool, some basic information on how to use it, and the most common questions that our users ask us.
The InterPro Domain Architecture (IDA) tool allows you to search the InterPro database with a particular set of domains, and returns all of the domain organisations and associated proteins that match the query. This makes it easy to rapidly identify all of the different domain combinations where one type of domain co-occurs with another, or a particular domain is followed by another (e.g., an SH3 domain is found C-terminal to a protein kinase domain, or vice versa), and to list the proteins that match each domain architecture.
The tool uses a specialised algorithm, developed in-house, that rapidly searches through all domain matches within InterPro and returns proteins that match the domains and ordering specified in the query.
To start a query, use the filter panel on the left-hand side, and either click on the ‘Add/remove domains’ button or click in the domain viewer box, where it says ‘Click here to add domains’. A pop-up window (see Figure 1) will then list all of the domain entries in the InterPro database. This interactive list can be refined using the free text ‘Search’ box in the top right of the pop-up window. Alternatively, the list can be narrowed down alphabetically or numerically, using the ‘Select’ dropdown menu on the top left-hand side. Once an appropriate domain has been identified from the list, it can be added to or removed from the query using the plus or minus buttons next to its name. The same domain can be added multiple times in this way; the number of copies currently in the query is displayed in the box to the right of the plus and minus buttons.
Once all of the required domains have been selected, pressing the ‘Apply’ button will perform the appropriate database query. Each selected domain will be shown as a square in the domain viewer on the left-hand side, and the matching domain organisations, along with the number of proteins matched by each one, will be displayed on the results page (see Figure 2).
The results page shows the proteins in UniProt that match the query, grouped by IDA, with different IDAs displayed on separate rows. For each IDA there is a summary view, showing the positions where the domains match on a representative protein sequence, with exact amino acid coordinates available on mouse-over. Each IDA is also listed with the total number of matching UniProt proteins. Clicking on this number returns the full list of proteins, which will become available to download in FASTA format in future versions of the tool.
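To illustrate what "grouped by IDA" means, here is a minimal Python sketch. It is not the tool's own code: the function name group_by_architecture, the short domain labels and the protein identifiers P1 to P3 are all hypothetical, and a domain architecture is assumed to be simply the N- to C-terminal list of domains on a protein.

```python
from collections import defaultdict

def group_by_architecture(protein_domains):
    """Group protein identifiers by their ordered domain architecture.

    protein_domains maps a protein identifier to its list of domain
    labels, ordered from N- to C-terminus (hypothetical input format).
    Returns (architecture, proteins) pairs, largest group first.
    """
    groups = defaultdict(list)
    for protein, domains in protein_domains.items():
        # The ordered tuple of domains acts as the architecture key.
        groups[tuple(domains)].append(protein)
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)

# Hypothetical example data: three proteins, two distinct architectures.
example = {
    "P1": ["SH3", "kinase"],
    "P2": ["SH3", "kinase"],
    "P3": ["kinase", "SH3"],
}
rows = group_by_architecture(example)
```

Each element of rows corresponds to one row of a results table: an architecture together with the proteins that share it, so the length of the protein list plays the role of the per-IDA protein count.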
Activating ‘Order sensitivity’ via the checkbox in the domain organisation panel (see Figure 3) means that the order in which the domains are placed (from left to right on the panel, indicating N- to C-terminal) is reflected in the search results. The domains can be reordered by dragging and dropping their graphical representations, which will automatically update the search results. The domain selection can be reconfigured using the ‘Add/remove domains’ button. Individual domains can also be removed from the selection by dragging their graphical representation to the bin icon or by clicking on the [x] icon next to the domain’s name and InterPro accession number in the domain organisation panel.
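The difference between an order-sensitive and an order-insensitive query can be sketched in Python as follows. This is an illustration under stated assumptions, not the in-house algorithm used by the tool: the function name matches and the domain labels are hypothetical, and the sketch assumes that an order-sensitive query allows other domains to occur between the queried ones (the tool may treat this differently).

```python
from collections import Counter

def matches(architecture, query, order_sensitive=False):
    """Return True if a protein's domain architecture satisfies a query.

    architecture: domain labels of one protein, N- to C-terminal.
    query: domain labels selected by the user, left to right.
    """
    if order_sensitive:
        # Subsequence test: each queried domain must be found after the
        # previous one; the shared iterator remembers the position.
        remaining = iter(architecture)
        return all(domain in remaining for domain in query)
    # Order ignored: the protein must contain at least as many copies
    # of each domain as were added to the query.
    needed = Counter(query)
    present = Counter(architecture)
    return all(present[domain] >= count for domain, count in needed.items())

# Hypothetical architectures checked against the query SH3 followed by kinase.
query = ["SH3", "kinase"]
forward = matches(["SH3", "SH2", "kinase"], query, order_sensitive=True)
reverse = matches(["kinase", "SH3"], query, order_sensitive=True)
unordered = matches(["kinase", "SH3"], query, order_sensitive=False)
```

With these inputs, forward and unordered evaluate to True and reverse to False: the protein with the kinase domain first is only returned when order is ignored.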
What does the colour of domains mean?
Each InterPro domain has a specific colour. The colour of a domain in the domain organisation viewer will match the colour of that domain in the resulting protein sequence summary views. If two domains have the same colour, this can mean that they belong to the same hierarchy (domain relationship).
Future versions of the IDA search tool will include some of the following features:
- option to insert gaps and overlaps between domains, and to exclude domains
- possibility to apply multiple filters to refine the search (e.g. by species or sequence length)
- export full list of matching proteins
- possibility to sort columns and change number of records per page, for list of proteins table
- switch button to group/ungroup proteins by IDA
More questions? How to contribute?
Please don't hesitate to contact us if you have more questions, or if there is something that doesn't work properly when you use the domain organisation search.