The Taverna project aims to provide a language and software tools that make workflow and distributed compute technology easy to use within the eScience community. Taverna allows a biologist or bioinformatician to construct highly complex analyses over public and private data and computational resources. The screenshot below shows the Taverna Workbench in action.
You can find more information and download the workbench on the project web site: http://www.taverna.org.uk/
As described in Introduction to Web Services, services where the processing can take a long time (e.g. a sequence search or a protein function prediction) use a submission pattern: the client obtains an identifier for the submitted job and polls the job status before attempting to retrieve the results. To use these services in Taverna, first add the service to the workbench as follows.
Taverna 1.x: add the web service to the list of available services by right-clicking the “Available processors” node in the services panel and selecting “Add new WSDL scavenger…” from the menu. Type, or copy and paste, the WSDL URL for the service you want to use from the corresponding service page into the dialog, and click the “OK” button.
Taverna 2.x: add the web service to the list of available services by clicking the “Import new services” button in the “Service panel” and selecting the “WSDL service…” option from the resulting menu. Type, or copy and paste, the WSDL URL for the service you want to use from the corresponding service page into the dialog, and click the “Add” button.
The service should now appear in the list of available services, with a label like: “WSDL @ http://www.ebi.ac.uk/Tools/services/soap/ncbiblast?wsdl”.
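To see which operations Taverna will offer as processors after the import, it can help to inspect the WSDL itself. The following is a minimal sketch, assuming a WSDL 1.1 document whose operations are declared inside a portType element. The class name WsdlOperationsSketch and the embedded sample document are made up for illustration; a real service WSDL is much larger.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class WsdlOperationsSketch {
    /** Namespace of WSDL 1.1 elements (assumption: the service uses WSDL 1.1). */
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    /** Made-up, heavily trimmed sample document; not a real service description. */
    static final String SAMPLE_WSDL =
        "<definitions xmlns=\"http://schemas.xmlsoap.org/wsdl/\" name=\"Example\">"
        + "<portType name=\"JobPort\">"
        + "<operation name=\"run\"/>"
        + "<operation name=\"getStatus\"/>"
        + "<operation name=\"getResultTypes\"/>"
        + "<operation name=\"getResult\"/>"
        + "</portType>"
        + "</definitions>";

    /** Returns the operation names declared in all portType elements, in document order. */
    static List<String> operationNames(String wsdlXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace-based lookups below
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(wsdlXml.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList portTypes = doc.getElementsByTagNameNS(WSDL_NS, "portType");
            for (int i = 0; i < portTypes.getLength(); i++) {
                Element portType = (Element) portTypes.item(i);
                NodeList operations = portType.getElementsByTagNameNS(WSDL_NS, "operation");
                for (int j = 0; j < operations.getLength(); j++) {
                    names.add(((Element) operations.item(j)).getAttribute("name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse WSDL", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(operationNames(SAMPLE_WSDL));
    }
}
```

To inspect a real service, the text downloaded from its WSDL URL would be passed to operationNames in place of the sample.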
From the introduction you know that many of the EMBL-EBI services use a three-step process to launch a job and retrieve the results. For asynchronous submissions the process is:
Submit the job (e.g. run) to get a job identifier
Poll the job status using getStatus
When/if the job has completed successfully:
Get the list of available result types (getResultTypes)
Get the results of the required type (getResult)
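Outside Taverna, the same three steps can be sketched in plain Java. This is an illustration only, not the EMBL-EBI client code: the JobService interface is a hypothetical stand-in for a generated SOAP client, FakeJobService is an in-memory fake so the example runs without a network, and the status strings "RUNNING" and "DONE" are the ones used in the examples on this page (a given service may report different values).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a generated SOAP client; method names mirror the steps above. */
interface JobService {
    String run(String input);                    // submit, returns a job identifier
    String getStatus(String jobId);              // e.g. "RUNNING" or "DONE"
    List<String> getResultTypes(String jobId);   // identifiers of the available results
    String getResult(String jobId, String type); // one result of the requested type
}

/** In-memory fake that reports RUNNING a fixed number of times, then DONE. */
final class FakeJobService implements JobService {
    private int remainingRunningPolls;

    FakeJobService(int runningPolls) {
        this.remainingRunningPolls = runningPolls;
    }

    public String run(String input) {
        return "job-1"; // made-up identifier
    }

    public String getStatus(String jobId) {
        return remainingRunningPolls-- > 0 ? "RUNNING" : "DONE";
    }

    public List<String> getResultTypes(String jobId) {
        return Arrays.asList("out", "xml"); // made-up result types
    }

    public String getResult(String jobId, String type) {
        return "result of " + jobId + " as " + type;
    }
}

final class AsyncJobSketch {
    /** Submits a job, polls while it is RUNNING, then fetches one result type. */
    static String submitPollFetch(JobService service, String input, String resultType,
                                  long pollDelayMillis, int maxPolls) {
        String jobId = service.run(input);                         // step 1: submit
        String status = service.getStatus(jobId);                  // step 2: poll
        for (int polls = 1; "RUNNING".equals(status) && polls < maxPolls; polls++) {
            pause(pollDelayMillis);
            status = service.getStatus(jobId);
        }
        if (!"DONE".equals(status)) {
            throw new IllegalStateException("Job " + jobId + " stopped with status " + status);
        }
        if (!service.getResultTypes(jobId).contains(resultType)) { // step 3a: list result types
            throw new IllegalArgumentException("No result of type " + resultType);
        }
        return service.getResult(jobId, resultType);               // step 3b: fetch the result
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while polling", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(submitPollFetch(new FakeJobService(2), "ACGT", "out", 10L, 20));
    }
}
```

The maxPolls bound is a design choice of this sketch: it stops a client from polling forever if a job never leaves the running state.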
In Taverna, the mechanism for implementing this process differs between Taverna 1.x and Taverna 2.x, so the two are described separately.
Taverna 1.x: since no explicit loop mechanism is available, the implicit retry-if-fail policy is used to poll the job status.
Create a nested workflow and open it for editing
In the nested workflow you need to create a workflow input for the job identifier (e.g. JobID).
From the required service add the getStatus processor and link this with the job identifier input (JobID).
Optional: add a workflow output for the job status and link the output of getStatus to it. This can be useful for debugging.
Add a beanshell processor to map the status code returned by getStatus into a boolean value:
Give it a name, for example: Is_done
Add an input (e.g. job_status)
Add an output (e.g. is_done)
Add a script. For example: if (job_status.equals("DONE")) { is_done = "true"; } else { is_done = "false"; }
Add a Fail_if_false processor and link the output of your beanshell processor (e.g. is_done) to it.
Check the 'critical' checkbox for the Fail_if_false processor. This means that if this processor fails, the nested workflow fails as well.
In the parent workflow configure the retry behaviour for the nested workflow:
Set the retry delay (e.g. 30000 to poll every 30 seconds)
Set the maximum number of retries (e.g. 10)
Connect the output 'jobid' of the service's run processor (e.g. run) to the JobID input of the nested workflow
Add a coordination link ('coordinate from') from the service's poll processor to the nested workflow, so the poll is only run when the nested workflow completes successfully.
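The retry-if-fail polling described above can be modelled in plain Java to make the control flow explicit. This is a sketch of the idea, not Taverna's implementation: the class and method names are made up, the nested workflow is reduced to a single method that throws when the job is not done, and the first run is assumed not to count towards the maximum number of retries.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

final class RetryIfFailSketch {
    /** Same mapping as the Beanshell script: the string "true" only for a finished job. */
    static String isDone(String jobStatus) {
        return jobStatus.equals("DONE") ? "true" : "false";
    }

    /** One run of the nested workflow: getStatus, then Is_done, then Fail_if_false. */
    static void nestedWorkflow(Supplier<String> getStatus) {
        if (!"true".equals(isDone(getStatus.get()))) {
            throw new IllegalStateException("job not finished yet");
        }
    }

    /** Parent-side retry policy; returns how many runs of the nested workflow were needed. */
    static int runWithRetries(Supplier<String> getStatus, long retryDelayMillis, int maxRetries) {
        int runs = 1;
        while (true) {
            try {
                nestedWorkflow(getStatus);
                return runs; // success: downstream processors may now run
            } catch (IllegalStateException notFinished) {
                if (runs > maxRetries) {
                    throw notFinished; // retries exhausted, the failure propagates
                }
                pause(retryDelayMillis);
                runs++;
            }
        }
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        // Simulated status sequence; a real workflow would call the service's getStatus.
        Iterator<String> statuses = Arrays.asList("RUNNING", "RUNNING", "DONE").iterator();
        System.out.println(runWithRetries(statuses::next, 10L, 10));
    }
}
```

Under these assumptions, a retry delay of 30000 ms with at most 10 retries gives up after roughly five minutes, so jobs expected to run longer need a larger delay or retry count.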
Taverna 2.x: an explicit loop mechanism is available, so polling uses it instead.
Set up the workflow inputs and outputs
Add the run processor for the service
Create the required “XML splitter” processors for the service input and output
Connect the workflow inputs through the required XML splitters to the run processor
Create a nested workflow:
Add processor from: “Available Services”, “Service Templates”, “Nested workflow”
This is a new embedded workflow, so just click the “Import workflow” button
This opens the nested workflow for editing.
Create a workflow input for the job identifier (e.g. JobID).
Create a workflow output for the job status (e.g. JobStatus).
Add the getStatus processor for the service.
Create any required “XML splitter” processors for getStatus.
Through the XML splitters connect the getStatus processor to the workflow input and output.
Save the nested workflow.
In the parent workflow, connect the run processor output, through any required XML splitters, to the input of the nested workflow.
Configure the looping behaviour for the Nested workflow:
From the parent workflow select the Nested workflow and click the “Details” tab.
Expand the “Advanced” section and click on the “Add looping” button
Select: JobStatus “is not equal to” “RUNNING” as the loop condition
Set “delay” to 3 seconds
Click the “OK” button
Add the desired getResult processors with XML splitters and connect them to the workflow outputs and to the output of the run processor
Select the getResult processor(s), right-click, select “Run after” and pick the Nested workflow used to get the status.
Then run the workflow and watch as the job is submitted, the status polled and, once the job is finished, the results retrieved.
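The looping configuration can be read as a do-while loop: the nested workflow runs once, and is run again after the delay until the condition on JobStatus holds. The sketch below models that reading in plain Java. It is an assumption about the behaviour rather than Taverna code; the names are invented, and "ERROR" is a made-up status used only to show what happens when a job fails.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class LoopUntilSketch {
    /** Invokes the step at least once and repeats it, after a delay, until the condition holds. */
    static String loopUntil(Supplier<String> step, Predicate<String> condition, long delayMillis) {
        String output = step.get();
        while (!condition.test(output)) {
            pause(delayMillis);
            output = step.get();
        }
        return output;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while looping", e);
        }
    }

    public static void main(String[] args) {
        // Simulated JobStatus values; "ERROR" is a hypothetical failure status.
        Iterator<String> statuses = Arrays.asList("RUNNING", "RUNNING", "ERROR").iterator();
        // Loop condition from the steps above: JobStatus "is not equal to" "RUNNING".
        String finalStatus = loopUntil(statuses::next, s -> !s.equals("RUNNING"), 10L);
        System.out.println(finalStatus);
    }
}
```

Note that this condition ends the loop for any status other than "RUNNING", including a failure, so the getResult processors may still run for a job that did not complete successfully. Unlike the retry-based approach in Taverna 1.x, the loop in this sketch has no upper bound on the number of iterations.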
For Soaplab services the approach depends on the type of processor used:
Soaplab processors, as provided by Taverna, can be configured to poll for results rather than waiting; see the Soaplab FAQ.
WSDL processors use a method similar to the one described above. However, the method and status names are different; see the Soaplab documentation for details.
A number of Taverna workflows demonstrating how to use the EMBL-EBI web services are available from the myExperiment workflow repository. For example:
For many of the web services, example workflows in myExperiment are linked from the corresponding clients page.
More complex examples combining multiple services are also available on myExperiment, for example: