
Taverna

About Taverna

The Taverna project aims to provide a language and software tools that make workflow and distributed computing technology easy to use within the eScience community. Taverna allows a biologist or bioinformatician to construct highly complex analyses over public and private data and computational resources.

You can find more information and download the workbench on the project web site: http://www.taverna.org.uk/

Taverna and Asynchronous Web Services

As described in Introduction to Web Services, services where the processing can take a long time (e.g. a sequence search or a protein function prediction) use a submission pattern which involves obtaining an identifier for the submitted job and polling the job status before attempting to retrieve results. In order to use these services in Taverna the following method is used.

Adding Services

  • Taverna 1.x: add the web service to the list of available services by right-clicking the “Available processors” node in the services panel, and selecting “Add new WSDL scavenger…” from the menu. Type, or copy and paste, the WSDL URL for the service you want from the service page into the dialog, and click the “OK” button.
  • Taverna 2.x: add the web service to the list of available services by clicking the “Import new services” button in the “Service panel” and from the resulting menu selecting the “WSDL service…” option. Type, or copy and paste, the WSDL URL for the service you want to use from the corresponding service page into the dialog, and click the “Add” button.

The service should now appear in the list of available services, with a label like: “WSDL @ http://www.ebi.ac.uk/Tools/services/soap/ncbiblast?wsdl”.

Polling for Results

From the introduction you know that many of the EMBL-EBI services use a three-step process to launch a job and retrieve the results. For asynchronous submissions the steps are:

  1. Submit the job (e.g. run) to get a job identifier
  2. Poll the job status using getStatus
  3. When/if the job has completed successfully:
    1. Get the list of available result types (getResultTypes)
    2. Get the results of the required type (getResult)
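
As a rough illustration outside Taverna, the steps above can be sketched in Java. The JobService interface is hypothetical: it only mirrors the operation names listed above with simplified signatures and is not the generated SOAP client of any real service. The sketch assumes that the status "RUNNING" means the job has not finished yet, and it leaves the name of the success status to the caller because status names differ between services.

```java
/** Hypothetical, simplified view of an asynchronous job service (not a real SOAP client). */
interface JobService {
    String run(String input);                    // step 1: submit, returns a job identifier
    String getStatus(String jobId);              // step 2: current job status
    String[] getResultTypes(String jobId);       // step 3.1: available result types
    String getResult(String jobId, String type); // step 3.2: result of one type
}

final class AsyncJobClient {
    /**
     * Submits a job, polls while the status is "RUNNING" (at most maxPolls
     * status checks), then fetches the result of the first available type.
     * successStatus is supplied by the caller (an assumption, e.g. "DONE").
     */
    static String runAndFetch(JobService service, String input, String successStatus,
                              long pollDelayMillis, int maxPolls) {
        String jobId = service.run(input);
        String status = service.getStatus(jobId);
        for (int polls = 1; "RUNNING".equals(status) && polls < maxPolls; polls++) {
            pause(pollDelayMillis);
            status = service.getStatus(jobId);
        }
        if (!successStatus.equals(status)) {
            throw new IllegalStateException("Job " + jobId + " has status " + status);
        }
        String[] types = service.getResultTypes(jobId);
        if (types.length == 0) {
            throw new IllegalStateException("Job " + jobId + " has no result types");
        }
        return service.getResult(jobId, types[0]);
    }

    /** Sleeps between polls; an interrupt is turned into an unchecked exception. */
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while polling", e);
        }
    }

    public static void main(String[] args) {
        // In-memory fake for demonstration only: reports RUNNING twice, then DONE.
        final int[] statusCalls = {0};
        JobService fake = new JobService() {
            public String run(String input) { return "job-1"; }
            public String getStatus(String jobId) {
                return ++statusCalls[0] < 3 ? "RUNNING" : "DONE";
            }
            public String[] getResultTypes(String jobId) { return new String[] {"out"}; }
            public String getResult(String jobId, String type) { return jobId + ":" + type; }
        };
        System.out.println(runAndFetch(fake, "example input", "DONE", 100L, 10));
    }
}
```

In Taverna the loop in runAndFetch is not written by hand; it is provided by the retry or looping configuration described in the following sections.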

The mechanism to do this changes between Taverna 1.x and Taverna 2.x, so they are described separately.

Taverna 1.x

Since no explicit loop mechanism is available, the implicit retry-if-fail policy is used to poll the job status:

  1. Create a nested workflow and open it for editing
  2. In the nested workflow you need to create a workflow input for the job identifier (e.g. JobID).
  3. From the required service add the getStatus processor and link this with the job identifier input (JobID).
  4. Optional: add a workflow output for the job status and link the output of getStatus to it. This can be useful for debugging.
  5. Add a beanshell processor to map the status code returned by getStatus into a boolean value:
    1. Give it a name, for example: Is_done
    2. Add an input (e.g. job_status)
    3. Add an output (e.g. is_done)
    4. Add a script. For example: if (job_status.equals("DONE")) { is_done = "true"; } else { is_done = "false"; }
  6. Add a Fail_if_false processor and link the output of your beanshell (e.g. is_done) to it.
  7. Check the 'critical' checkbox for the Fail_if_false processor. This means that this processor failing will cause the nested workflow to fail.
  8. In the parent workflow configure the retry behaviour for the nested workflow:
    1. Set the retry delay (e.g. 30000 to poll every 30 seconds)
    2. Set the maximum number of retries (e.g. 10)
  9. Connect the output 'jobid' of the service's run processor (e.g. run) to the JobID input of the nested workflow
  10. Add a coordination link ('coordinate from') from the service's result retrieval processor (e.g. getResult or poll) to the nested workflow, so the results are only retrieved once the nested workflow has completed successfully.
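
The control flow that these steps set up can be made explicit with a small Java sketch. It is not Taverna code and all names are hypothetical: isDone repeats the Beanshell mapping from step 5, nestedWorkflow stands in for the nested workflow (it fails, as Fail_if_false does, when the job is not done), and pollWithRetries plays the role of the retry settings from step 8. The sketch assumes one initial attempt followed by up to maxRetries retries; how Taverna counts attempts is not stated here.

```java
/** Hypothetical stand-in for the service's getStatus operation. */
interface StatusSource {
    String getStatus(String jobId);
}

final class RetryPollingSketch {
    /** Same mapping as the Is_done Beanshell script: only "DONE" counts as finished. */
    static boolean isDone(String jobStatus) {
        return "DONE".equals(jobStatus);
    }

    /** One run of the nested workflow: fails (throws) unless the job is done. */
    static void nestedWorkflow(StatusSource service, String jobId) {
        if (!isDone(service.getStatus(jobId))) {
            throw new IllegalStateException("job " + jobId + " is not done yet");
        }
    }

    /**
     * Retry behaviour of the parent workflow (assumed semantics): one initial
     * attempt plus up to maxRetries retries, waiting retryDelayMillis before
     * each retry. Returns true if the nested workflow eventually succeeded.
     */
    static boolean pollWithRetries(StatusSource service, String jobId,
                                   long retryDelayMillis, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                pause(retryDelayMillis);
            }
            try {
                nestedWorkflow(service, jobId);
                return true; // job finished: results can now be retrieved
            } catch (IllegalStateException notDone) {
                // nested workflow failed: fall through and retry
            }
        }
        return false; // retries exhausted while the job was still not done
    }

    /** Sleeps between attempts; an interrupt is turned into an unchecked exception. */
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

Under these assumptions, the example settings from step 8 (a 30000 ms delay and 10 retries) give the job roughly five minutes to finish before polling gives up, so longer-running jobs need a larger delay or more retries.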

Taverna 2.x

An explicit loop mechanism is available in Taverna 2.x so polling uses this mechanism instead.

  1. Set up the workflow inputs and outputs
  2. Add the run processor for the service
  3. Create the required “XML splitter” processors for the service input and output
  4. Connect the workflow inputs through the required XML splitters to the run processor
  5. Create a nested workflow:
    1. Add processor from: “Available Services”, “Service Templates”, “Nested workflow”
    2. This is a new embedded workflow, so just click the “Import workflow” button
    3. This opens the nested workflow for editing.
    4. Create a workflow input for the job identifier (e.g. JobID).
    5. Create a workflow output for the job status (e.g. JobStatus).
    6. Add the getStatus processor for the service.
    7. Create any required “XML splitter” processors for getStatus.
    8. Through the XML splitters connect the getStatus processor to the workflow input and output.
    9. Save the nested workflow.
  6. In the parent workflow connect the run processor output, through any required XML splitters, to the input of the nested workflow
  7. Configure the looping behaviour for the Nested workflow:
    1. From the parent workflow select the Nested workflow and click the “Details” tab.
    2. Expand the “Advanced” section and click on the “Add looping” button
    3. Select: JobStatus “is not equal to” “RUNNING” as the loop condition, so the nested workflow is repeated until the status is no longer RUNNING
    4. Set “delay” to 3 seconds
    5. Click the “OK” button
  8. Add the desired getResult processors with XML splitters and connect them to the workflow outputs and to the output of the run processor
  9. Select the getResult processor(s) and right-click, select “Run after” and pick the Nested workflow used to get the status.

Then run the workflow and watch as the job is submitted, the status polled and, once the job is finished, the results retrieved.

Soaplab

For Soaplab services the approach depends on the type of processor used:

  • Soaplab processors, as provided by Taverna, can be configured to poll for results rather than waiting; see the Soaplab FAQ.
  • WSDL processors use a similar method to that described above. However, the methods and status names are different; see the Soaplab documentation for details.

Example Workflows

A number of Taverna workflows demonstrating how to use the EMBL-EBI web services are available from the myExperiment workflow repository.

For many of the web services, example workflows in myExperiment are linked from the corresponding clients page.

More complex examples combining multiple services are also available on myExperiment.


tutorials/07_workflows/taverna.txt · Last modified: 2013/03/25 16:52 by hpm