
Taverna

About Taverna

The Taverna project aims to provide a language and software tools that make workflow and distributed computing technology easy to use within the eScience community. Taverna allows a biologist or bioinformatician to construct highly complex analyses over public and private data and computational resources. The screenshot below shows the Taverna Workbench in action.

You can find more information and download the workbench on the project web site: http://taverna.sourceforge.net/

Taverna and EBI Web Services

Adding Services

To add an EBI web service to the list of available services, right-click the “Available processors” node in the services panel, and from the context menu select “Add new WSDL scavenger…”. Type the WSDL URL for the service you want from the WSDL list or the service page into the dialog, and click OK.

For detailed instructions on adding WSDL scavengers, see the myGrid Taverna pages.

Polling for Results

As described in the introduction, many of the EMBL-EBI services use a three-step process to launch a job and retrieve its results. For asynchronous submissions the steps are:

  1. Submit the job (e.g. runInterProScan) with the async flag set
  2. Poll for the job status using checkStatus
  3. If the job has completed successfully:
    1. Get the list of available result types (getResults)
    2. Get the results of the required type (poll)
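
Outside Taverna, the same three steps can be written as an explicit loop. The following is a minimal Java sketch, not the EBI client code. The `JobService` interface, its `run` method, the `FakeService` stub, the `RUNNING` status and the result type names `text` and `xml` are hypothetical stand-ins for a generated web service client; only the operation names checkStatus, getResults and poll and the DONE status are taken from this page.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class AsyncJobSketch {

    /** Hypothetical stand-in for a generated web service client. */
    interface JobService {
        String run(String input);               // submit with the async flag set, returns a job identifier
        String checkStatus(String jobId);       // status string, "DONE" once the job has finished
        List<String> getResults(String jobId);  // list of available result types
        String poll(String jobId, String type); // result data for one type
    }

    /** Submit, wait until the status is DONE, then fetch the first available result type. */
    static String runAndFetch(JobService service, String input, long delayMillis, int maxChecks) {
        String jobId = service.run(input);
        for (int i = 0; i < maxChecks; i++) {
            if ("DONE".equals(service.checkStatus(jobId))) {
                List<String> types = service.getResults(jobId);
                return service.poll(jobId, types.get(0));
            }
            try {
                Thread.sleep(delayMillis); // wait before the next status check
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for " + jobId, e);
            }
        }
        throw new IllegalStateException("Job " + jobId + " not finished after " + maxChecks + " checks");
    }

    /** In-memory fake that reports RUNNING twice and then DONE, for demonstration only. */
    static class FakeService implements JobService {
        private final Deque<String> statuses =
                new ArrayDeque<>(Arrays.asList("RUNNING", "RUNNING", "DONE"));
        public String run(String input) { return "job-1"; }
        public String checkStatus(String jobId) { return statuses.poll(); }
        public List<String> getResults(String jobId) { return Arrays.asList("text", "xml"); }
        public String poll(String jobId, String type) { return type + " result for " + jobId; }
    }

    public static void main(String[] args) {
        // Short delay so the demonstration finishes quickly.
        System.out.println(runAndFetch(new FakeService(), "example input", 10, 5));
    }
}
```

The `maxChecks` limit keeps the loop from running forever if a job never reaches DONE.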

In Taverna this is slightly tricky: no explicit loop mechanism exists, so you must use the implicit retry-if-fail policy instead:

  1. Create a nested workflow and open it for editing
  2. In the nested workflow you need to create a workflow input for the job identifier (e.g. JobID).
  3. From the required service add the checkStatus processor and link this with the job identifier input.
  4. Optionally: add a workflow output for the job status and link the output of checkStatus to it. This can be useful for debugging.
  5. Add a beanshell processor to map the status code returned by checkStatus into a boolean value:
    1. Give it a name, for example Is_done
    2. Add an input (e.g. job_status)
    3. Add an output (e.g. is_done)
    4. Add a script. For example: if (job_status.equals("DONE")) { is_done = "true"; } else { is_done = "false"; }
  6. Add a Fail_if_False processor and link the output of your beanshell (e.g. is_done) to it.
  7. Check the 'critical' checkbox for the Fail_if_False processor, so that a failure of this processor causes the nested workflow to fail.
  8. In the parent workflow configure the retry behavior for the nested workflow:
    1. Set the retry delay (e.g. 30000 to poll every 30 seconds)
    2. Set the maximum number of retries (e.g. 10)
  9. Connect the output jobid of the service's run processor (e.g. runInterProScan) to the input of the nested workflow
  10. Add a coordination link ('coordinate from') from the service's poll processor to the nested workflow, so the poll is only run when the nested workflow completes successfully.
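
To see why retry-if-fail behaves like a polling loop, the steps above can be modelled in plain Java. This is an illustrative sketch only, not Taverna code: `RetrySketch`, `attemptsUntilDone` and `maxWaitMillis` are hypothetical names, the `RUNNING` status is a placeholder, and it assumes that the nested workflow is run once and then retried up to the configured maximum, with a fixed delay before each retry.

```java
import java.util.function.Supplier;

class RetrySketch {

    /** Same mapping as the Is_done script: only DONE counts as finished. */
    static boolean isDone(String jobStatus) {
        return "DONE".equals(jobStatus);
    }

    /**
     * Models the parent workflow re-running the nested workflow.
     * Returns the number of attempts used, or -1 if every attempt failed.
     * Assumption: one initial attempt plus up to maxRetries retries.
     */
    static int attemptsUntilDone(Supplier<String> checkStatus, int maxRetries, long retryDelayMillis) {
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            if (isDone(checkStatus.get())) {
                return attempt; // nested workflow succeeds, so the poll processor may run
            }
            if (attempt <= maxRetries) {
                try {
                    Thread.sleep(retryDelayMillis); // Fail_if_False failed: wait, then retry
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1; // retries exhausted, so the parent workflow fails
    }

    /** Upper bound on the total delay, ignoring the time taken by each status check. */
    static long maxWaitMillis(int maxRetries, long retryDelayMillis) {
        return maxRetries * retryDelayMillis;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Placeholder status source: RUNNING for the first two checks, then DONE.
        Supplier<String> status = () -> ++calls[0] < 3 ? "RUNNING" : "DONE";
        System.out.println(attemptsUntilDone(status, 10, 10)); // 3
        System.out.println(maxWaitMillis(10, 30000));          // 300000
    }
}
```

Under these assumptions, the example settings (a 30000 ms delay and 10 retries) give a job roughly five minutes to complete before the workflow fails, so longer-running jobs need a larger delay or more retries.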

Soaplab

For Soaplab services the approach depends on the type of processor used:

  • Soaplab processors, as provided by Taverna, can be configured to poll for results rather than waiting; see the Soaplab FAQ.
  • WSDL processors use a method similar to the one described above. However, the method and status names are different; see the Soaplab documentation for details.

Example Workflows

A number of Taverna workflows demonstrating how to use the EBI web services are available from the myExperiment workflow repository. For example:

For most of the web services, example workflows on myExperiment are linked from the corresponding clients page.

More complex examples combining multiple services are also available on myExperiment, for example:

 
tutorials/07_workflows/taverna.txt · Last modified: 2010/02/09 21:41 by hpm