On Monday 28th March 2011 the WSInterProScan service was decommissioned and replaced by the following services:

The service documentation and clients below are historical and provided solely for reference purposes.


InterProScan is a tool that combines different protein signature recognition methods native to the InterPro member databases into one resource with look up of corresponding InterPro and GO annotation. This service allows you to query your protein sequence against InterPro.

For more information about this tool see:

We kindly ask all users of SOAP Web Services to submit jobs in batches of up to 25 sequences at a time, and not to submit more until processing has completed and the results have been retrieved for that batch. This should enable users, as well as the EBI maintainers, to deal more easily with local and remote network outages as well as scheduled or unscheduled downtime.
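The batching request above could be honoured with a small helper along the following lines. This is only a sketch: `submit` and `wait_for_result` are hypothetical callables that the caller would supply (for example thin wrappers around a SOAP client) and are not part of the service.

```python
from typing import Callable, Iterable, Iterator, List

MAX_BATCH_SIZE = 25  # batch size requested in the text above


def batches(sequences: Iterable[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive lists holding at most `size` sequences each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for seq in sequences:
        batch.append(seq)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def process_in_batches(
    sequences: Iterable[str],
    submit: Callable[[str], str],           # hypothetical: returns a job ID
    wait_for_result: Callable[[str], str],  # hypothetical: blocks until the result is ready
    size: int = MAX_BATCH_SIZE,
) -> List[str]:
    """Submit one batch, collect all of its results, then start the next batch."""
    results: List[str] = []
    for batch in batches(sequences, size):
        job_ids = [submit(seq) for seq in batch]
        results.extend(wait_for_result(job_id) for job_id in job_ids)
    return results
```

Because each batch is fully collected before the next one is submitted, an outage costs at most one batch of work.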

Service provision happens on a fair-share basis. Overzealous usage of a particular resource will be dealt with in accordance with the EBI's Terms of Use.

Nucleotide Sequences

Due to resource limitations the InterProScan service no longer accepts nucleotide sequence submissions.

To process nucleotide sequences using InterProScan:

1. Translate your nucleotide sequence
The standalone version of InterProScan uses EMBOSS sixpack to perform the translation and filter the resulting open reading frame (ORF) sequences by length. Alternative tools such as EMBOSS transeq are also available, but may require an additional filtering process to limit the ORF sequences to those above a certain length. These tools are available as part of our EMBOSS webservices, Soaplab and as part of the EMBOSS package.

2. Filter ORFs by sequence length.
Short sequences (<80 aa) are unlikely to have any signature matches, so unless there is additional evidence that the sequence occurs, short sequences can be discarded. The EMBOSS sixpack tool provides an option to do length filtering when performing the translation.

3. Keep only ORFs with significant hits from sequence similarity searches.
The signatures used by InterProScan are based on known protein sequences, so a further filtering step is to perform a BLAST or FASTA sequence similarity search with the ORF translations against the UniProtKB or UniParc protein sequence databases, keeping only the sequences which have hits with e-values <0.001. Where an exact match to the sequence is found, you can go directly to the InterPro Matches databases to get the signature matches for the sequence.

Note: the standalone version of InterProScan can perform the translation and ORF length filtering as part of the submission and is recommended if you need to perform large numbers of analyses and have access to the required resources. See InterProScan Readme for details.
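The two filtering steps (2 and 3) can be sketched as follows. The sketch assumes the translation and the similarity search have already been run elsewhere. The dictionaries mapping ORF names to sequences and to best-hit e-values are hypothetical inputs chosen for illustration, not the output format of EMBOSS, BLAST or FASTA.

```python
MIN_ORF_LENGTH = 80   # residues; shorter ORFs are discarded (step 2)
MAX_E_VALUE = 0.001   # hits must have an e-value below this (step 3)


def filter_by_length(orfs, min_length=MIN_ORF_LENGTH):
    """Keep ORF translations that have at least `min_length` residues."""
    return {name: seq for name, seq in orfs.items() if len(seq) >= min_length}


def filter_by_evalue(orfs, best_evalues, max_evalue=MAX_E_VALUE):
    """Keep ORFs whose best similarity-search hit is below the e-value cut-off.

    `best_evalues` maps an ORF name to the lowest e-value found for it.
    ORFs without any hit are absent from the mapping and are dropped.
    """
    return {
        name: seq
        for name, seq in orfs.items()
        if name in best_evalues and best_evalues[name] < max_evalue
    }
```

Running the length filter first keeps the number of similarity searches down, since short ORFs never reach that step.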


Sample clients are provided for a number of programming languages. For details of how to use these clients, download the client and run the program without any arguments.

Language Download Requirements
C# .NET Executable: wsinterproscan.exe; Source: wsinterproscan.cs A .NET runtime environment. If building from source development tools are also required. See the .NET tutorial for details.
Java Executable jar: WSInterProScan.jar; Source: Axis 1.4; All dependencies, including Axis 1.4 and Commons-CLI, are available in
Perl SOAP::Lite
PHP wsinterproscan_cli_nusoap.php nuSOAP/nusoap-for-php5
wsinterproscan_cli_php_soap.php PHP SOAP
Python SOAPpy ZSI 2.0
Ruby wsinterproscan.rb soap4r
Taverna 1.x EBI_InterProScan Taverna
VB.NET .NET Executable: wsinterproscan.exe; Source: wsinterproscan.vb A .NET runtime environment. If building from source development tools are also required. See the .NET tutorial for details.

For further details see WSInterProScan Clients.


In addition to these sample clients, users have submitted workflows using these services to the myExperiment workflow repository. See workflows using the WSInterProScan Web Service for a list.


Service API


getAppNames()

Get details of the signature methods available in InterProScan.

Arguments: none

Returns: an array of outData objects describing the available options.

runInterProScan(params, content)

Submits an InterProScan job to the service.


Arguments:

  • params an instance of the inputParams data structure.
  • content a list of 'data' structures describing the query sequence data.

Returns: a string containing the job ID (jobid).


Get the status of a job.


Arguments:

  • jobid the job identifier of the job to check the status of.

Returns: a string indicating the status of the job. Current values are:

  • DONE: job has finished, and the results can then be retrieved.
  • ERROR: the job failed or no results were found
  • NOT_FOUND: the job id is no longer available (job results are deleted after 24 h)
  • PENDING: the job is in a queue waiting processing
  • RUNNING: the job is currently being processed
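A client typically checks the status repeatedly until the job leaves the queue. The loop below is a minimal sketch of that pattern using the status values listed above. `get_status` is a hypothetical callable standing in for the status call of whichever SOAP client is used, and the interval and retry limit are arbitrary illustrative defaults.

```python
import time

FINISHED = {"DONE", "ERROR", "NOT_FOUND"}  # no further change expected
IN_PROGRESS = {"PENDING", "RUNNING"}       # worth checking again later


def wait_for_job(jobid, get_status, interval=5.0, max_checks=720, sleep=time.sleep):
    """Check the job status until it is finished and return the final status.

    Raises TimeoutError if the job is still in progress after `max_checks`
    checks, and ValueError if the service reports an unknown status.
    """
    for _ in range(max_checks):
        status = get_status(jobid)
        if status in FINISHED:
            return status
        if status not in IN_PROGRESS:
            raise ValueError(f"unexpected status {status!r} for job {jobid}")
        sleep(interval)  # pause between checks to avoid hammering the service
    raise TimeoutError(f"job {jobid} did not finish after {max_checks} checks")
```

Only a return value of DONE means that results can be retrieved; ERROR and NOT_FOUND end the loop as well, so the caller should inspect the returned status.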


getResults(jobid)

Get details of the result types available.


Arguments:

  • jobid the job identifier of the job to get result types for.

Returns: an array of WSFile structures describing the available result types for the job.

poll(jobid, type)

Wait until the job has finished and get the specified type of result data.


Arguments:

  • jobid the job identifier of the job to get the result from.
  • type a string specifying the type of result to retrieve. See getResults(jobid) and WSFile for details of how to obtain valid values.

Returns: a base64 encoded string containing the result data. Depending on the SOAP library and programming language used the result may be returned in decoded form.
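Since the returned data may or may not already be decoded, client code usually needs a small normalisation step. The following sketch shows one way to do this. The `is_base64` flag is a hypothetical parameter that the caller sets according to how their SOAP library behaves; the sketch does not try to detect the encoding automatically.

```python
import base64


def decode_result(payload, is_base64=True):
    """Return the result data as bytes.

    Set `is_base64` to True when the SOAP library hands back the raw
    base64 text, and to False when it has already decoded the data.
    """
    if is_base64:
        # b64decode accepts str or bytes and, by default, ignores characters
        # outside the base64 alphabet, such as line breaks.
        return base64.b64decode(payload)
    if isinstance(payload, bytes):
        return payload
    return payload.encode("utf-8")
```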

doIprscan(params, content)

Deprecated. Use runInterProScan(params, content) instead.

polljob(jobid, outformat)

Deprecated. Use poll(jobid, type) instead.


outData

Structure containing option descriptions for a parameter.

Attribute | Type | Description
name | string | Option value that should be passed as the parameter value.
print_name | string | Display name for use in user interfaces such as menus.
selected | string | Flag indicating if the option value is selected by default. Values: yes or no.
data_type | string | Not used
input_type | string | Not used
search_type | string | Not used


data

Structure containing the input data for the job.

Attribute | Type | Description | Default
type | string | Type of content being used. Valid values are: 'sequence' and 'dbfetch' | required
content | string | A sequence entry identifier in db:id format or a formatted sequence (fasta recommended) | required

For example to specify an input sequence in Java:

// Describe the input as a formatted sequence (fasta) rather than a dbfetch identifier
Data inSeq = new Data();
inSeq.setType("sequence");
inSeq.setContent(">TestSequence\nASAMPLESEQ\n");
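The same structure can be checked before submission. The sketch below builds a plain dictionary with the two attributes described above and rejects values the table does not allow. The helper name `make_data` and the use of a dictionary are assumptions made for illustration; a real client would use whatever type its SOAP library generates.

```python
VALID_CONTENT_TYPES = ("sequence", "dbfetch")  # values listed in the table above


def make_data(content_type, content):
    """Build one input entry, validating it against the documented rules."""
    if content_type not in VALID_CONTENT_TYPES:
        raise ValueError(f"type must be one of {VALID_CONTENT_TYPES}")
    if not content.strip():
        raise ValueError("content is required")
    if content_type == "dbfetch":
        # Entry identifiers are given in db:id format.
        db, sep, entry_id = content.partition(":")
        if not (db and sep and entry_id):
            raise ValueError("dbfetch content must be in db:id format")
    return {"type": content_type, "content": content}


# Usage: a formatted sequence and a placeholder (not real) entry identifier.
query = [
    make_data("sequence", ">TestSequence\nASAMPLESEQ\n"),
    make_data("dbfetch", "exampledb:EXAMPLE_ID"),
]
```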


inputParams

A structure containing the parameters required to run the job.

Attribute | Type | Description | Default
app | string | A space separated list of InterPro signature methods to run. Valid method names can be obtained using the getAppNames() method. | all methods
crc | boolean | Use a CRC based look-up to get results from InterPro Matches if the sequence is known | false
seqtype | string | The type of the input sequence. Only valid value is 'P' for protein input sequence. See Nucleotide Sequences for details of how to handle nucleotide sequences | required
trlen | int | Deprecated |
trtable | int | Deprecated |
goterms | boolean | Fetch additional GO term annotations | false
outformat | string | Not used, see poll(jobid, type) for details of how to retrieve specific result types |
async | boolean | Asynchronous submission (recommended) | false
email | string | Valid email address. See Why do you need my e-mail address? | required
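As an illustration of how these attributes fit together, the sketch below assembles a parameter set as a plain dictionary. The function name `build_params`, the dictionary representation and the very loose email check are assumptions for this example. Asynchronous submission is switched on by default here because the table recommends it, even though the service default is false.

```python
def build_params(email, app=None, crc=False, goterms=False, run_async=True):
    """Assemble job parameters using the attribute names from the table above.

    `app` is an optional list of signature method names; when omitted the
    attribute is left out so that the service default (all methods) applies.
    """
    if "@" not in email:  # deliberately minimal sanity check
        raise ValueError("a valid email address is required")
    params = {
        "seqtype": "P",  # protein is the only accepted sequence type
        "email": email,
        "crc": bool(crc),
        "goterms": bool(goterms),
        "async": bool(run_async),
    }
    if app:
        params["app"] = " ".join(app)  # space separated list
    return params
```

The deprecated and unused attributes (trlen, trtable, outformat) are simply not set.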


WSFile

Structure describing a result type. Returned by the getResults(jobid) method.

Attribute | Type | Description
type | string | Symbolic name of the result type. Used with the poll(jobid, type) method.
ext | string | Recommended file extension for this result type.
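The two attributes are enough to retrieve every available result and give each one a sensible file name. The sketch below assumes a `poll` callable with the same arguments as the poll(jobid, type) method. The `WSFile` dataclass, the file naming scheme and the result type names used in the usage lines are illustrative placeholders, not values defined by the service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class WSFile:
    type: str  # symbolic name passed to poll(jobid, type)
    ext: str   # recommended file extension


def result_filename(jobid: str, result: WSFile) -> str:
    """Build a file name from the job ID, result type and extension."""
    return f"{jobid}.{result.type}.{result.ext}"


def fetch_all(jobid: str, result_types: List[WSFile], poll: Callable[[str, str], bytes]) -> Dict[str, bytes]:
    """Retrieve each result type and key the data by its file name."""
    return {result_filename(jobid, r): poll(jobid, r.type) for r in result_types}


# Usage with a stand-in for the real poll call and made-up result types.
def fake_poll(jobid: str, result_type: str) -> bytes:
    return f"{jobid}:{result_type}".encode("ascii")


files = fetch_all("job-1", [WSFile("typeA", "txt"), WSFile("typeB", "xml")], fake_poll)
```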


If you have any questions or comments, or you plan to use this service as part of a course or for a high number of submissions, please contact EMBL-EBI Support.

services/archive/pfa/wsinterproscan.txt · Last modified: 2013/04/23 16:30 by hpm