
Introduction to Web Services

What is a Web Service?

The term web service can mean a number of different things:

  1. In the broadest sense a web service is any service available on the World Wide Web.
  2. More commonly a web service is any service which is based on web technologies and is intended for use by computer programs rather than people.
  3. In some cases the term web service is used specifically to refer to services which use particular web services technologies (e.g. SOAP or REST).

Throughout this guide the term “Web Service” will be used to refer to services intended for use by programs.

While many bioinformatics resources are available on the web, they are usually only available via web interfaces which are targeted at interactive use via a web browser. This limits their utility in applications which require systematic access to a resource but lack the local resources to install and maintain the data and software required to support the resource's functionality, and in cases where a data resource or analytical tool needs to be integrated into a web portal or workbench application. In these cases Web Services offer a method for accessing the resource remotely. This has the advantage of delegating the maintenance costs for the resource to the service provider, rather than having to absorb these costs locally, and significantly reduces the development and deployment costs for applications which need to consume data or results from the resource.

Client/Server Model

Web Services use the traditional client/server model: a server, which may be remote, provides resources which are requested and consumed by a client which the user interacts with. A simple example of this is browsing the web, where the web browser is a client which displays pages and allows the user to interact with them, and the web server(s) provide the data for the pages (HTML, images, etc.) when it is requested by the client.
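
The model can be illustrated with a minimal sketch that uses only the Python standard library: a server running on the local machine provides a resource, and a client requests and consumes it. The handler name, the path and the response text are made up for illustration and do not correspond to any real service.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a small text resource."""

    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request logging


def fetch_greeting():
    """Start the server on a free local port, then act as the client."""
    server = HTTPServer(("127.0.0.1", 0), GreetingHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send a request and consume the response.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        body = response.read().decode("ascii")
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_greeting() returns the HTTP status code and the text served by the handler. A web service client works the same way, except that the response is data meant for a program rather than a page meant for a person.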

Synchronous and Asynchronous Access

While many resources will return a result almost immediately, and thus are suitable for synchronous requests, where the client makes the request and waits for the server to send the response containing the result, some types of analysis take longer. In these cases a synchronous request is likely to run into timeouts and may make the client unresponsive. To address these issues, services which provide access to such resources offer a mechanism for making asynchronous requests. This usually takes the form of a collection of methods which allow the client to:

  1. Submit a job and get a job identifier
  2. Get the status of a job. Typically this returns a status code indicating whether the job is pending, running, finished or has failed.
  3. Get the results of a finished job

The nomenclature for these methods and the job status codes depends on the service, and additional methods may also be provided to give more control over the life cycle of the job.
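
The three steps above can be sketched as a small client-side helper. This is a minimal sketch rather than the API of any particular service: submit, get_status and get_result are hypothetical callables standing in for whatever methods a real service provides, and the lower-case status strings are placeholders for the service's own codes.

```python
import time


def run_async_job(submit, get_status, get_result, params,
                  poll_interval=3.0, max_polls=100, sleep=time.sleep):
    """Submit a job, poll until it stops pending/running, then fetch the result.

    submit, get_status and get_result are hypothetical callables wrapping a
    real service's methods; the status strings are placeholders as well.
    """
    job_id = submit(params)                  # 1. submit, get a job identifier
    for _ in range(max_polls):
        status = get_status(job_id)          # 2. ask for the job status
        if status not in ("pending", "running"):
            break
        sleep(poll_interval)                 # wait before asking again
    else:
        # The loop ran out of polls without the job reaching a final state.
        raise TimeoutError("job %s is still not finished" % job_id)
    if status != "finished":
        raise RuntimeError("job %s ended with status %s" % (job_id, status))
    return get_result(job_id)                # 3. get the result
```

Waiting between status checks keeps the load on the server low, and the limit on the number of polls stops the client from waiting forever on a job that never completes. Both values are assumptions and should follow the guidance of the service being used.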

Dispatcher

The old Dispatcher based services at EMBL-EBI (e.g. WSFasta, WSInterProScan, WSNCBIBlast and WSWUBlast) provided four methods for interacting with a job:

  1. Submit the job with a “run” method, which is named after the service (e.g. runFasta for WSFasta)
  2. Get the status of the job (checkStatus). Returns a status code indicating the status of the job. Possible values are:
    • RUNNING: the job is currently being processed
    • PENDING: the job is in a queue waiting to be processed
    • NOT_FOUND: the job cannot be found, commonly seen when the job is no longer available
    • ERROR: the job failed
    • DONE: job has finished, and the results can then be retrieved.
  3. Discover the available result types for a finished job (getResults)
  4. Get the result of the specified type for the job (poll)

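The general workflow is: submit the job with the run method, call checkStatus until the status is no longer RUNNING or PENDING and, if the status is DONE, call getResults to list the available result types and poll to retrieve each result required.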

JDispatcher

The JDispatcher 1) based services at EMBL-EBI (e.g. FASTA (REST), FASTA (SOAP), NCBI BLAST (REST) and NCBI BLAST (SOAP)) provide four methods for interacting with a job:

  1. Submit the job (run)
  2. Get the status of the job (getStatus). Returns a status code indicating the status of the job. Possible values are:
    • RUNNING: the job is currently being processed.
    • FINISHED: job has finished, and the results can then be retrieved.
    • ERROR: an error occurred attempting to get the job status. In most cases the job will be okay, but something is preventing the status from being accessed. A common workaround is to wait for 5 seconds and try again.
    • FAILURE: the job failed. Additional error details may be in the available output.
    • NOT_FOUND: the job cannot be found. Common causes of this are an incorrect job identifier or the job has expired and the results have been removed.
  3. Discover the available result types for a finished job (getResultTypes)
  4. Get the result of the specified type for the job (getResult)
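
The handling of the status codes listed above, in particular the difference between ERROR (the status could not be read) and FAILURE (the job itself failed), can be sketched as follows. This is an illustration under stated assumptions, not EMBL-EBI client code: get_status is a hypothetical callable wrapping the service's getStatus method, and the polling interval and the limit on consecutive ERROR responses are arbitrary choices. Only the 5 second wait after an ERROR comes from the description above.

```python
import time

# Status codes after which waiting any longer is pointless.
TERMINAL_FAILURES = ("FAILURE", "NOT_FOUND")


def wait_until_finished(get_status, job_id, poll_interval=3.0,
                        error_wait=5.0, max_error_retries=3, sleep=time.sleep):
    """Poll a JDispatcher-style job until its status is FINISHED.

    get_status is a hypothetical callable wrapping the getStatus method;
    poll_interval and max_error_retries are assumptions, not service rules.
    """
    errors_in_a_row = 0
    while True:
        status = get_status(job_id)
        if status == "FINISHED":
            return
        if status in TERMINAL_FAILURES:
            raise RuntimeError("job %s ended with status %s" % (job_id, status))
        if status == "ERROR":
            # The status could not be read; the job itself is usually fine,
            # so wait and ask again, but give up after repeated errors.
            errors_in_a_row += 1
            if errors_in_a_row > max_error_retries:
                raise RuntimeError("cannot read the status of job %s" % job_id)
            sleep(error_wait)
        else:
            # RUNNING, or any other non-final code: keep waiting.
            errors_in_a_row = 0
            sleep(poll_interval)
```

Once the function returns, the client would call getResultTypes to discover what is available and getResult for each type it needs.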

Soaplab

Soaplab 2) based services (for example see http://wsembnet.vital-it.ch/) provide many methods for interacting with a job. To run a job asynchronously there are three relevant methods:

  1. Submit the job (createAndRun for generic service or run for typed service)
  2. Get the status of the job (getStatus). Returns a status code indicating the status of the job. Possible values are:
    • CREATED: job has been created but not yet executed
    • RUNNING: job is currently being processed
    • COMPLETED: job has finished, its results can be retrieved
    • TERMINATED_BY_ERROR: job was terminated due to an error
    • TERMINATED_BY_REQUEST: job was terminated either by user request or by the Soaplab job manager
  3. Get all the results of the job (getResults)
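
Since the three frameworks described above use different names for similar job states, a client that talks to more than one of them may find it convenient to translate the codes into a shared vocabulary. The sketch below shows one way to do this. The JobState names, the framework keys and the grouping of the codes are assumptions made for illustration (for example, treating TERMINATED_BY_REQUEST as a failure); only the status codes themselves come from the lists above.

```python
from enum import Enum


class JobState(Enum):
    """Hypothetical shared vocabulary for job states."""
    WAITING = "waiting"    # accepted but not yet executing
    RUNNING = "running"
    DONE = "done"          # results can be retrieved
    FAILED = "failed"      # ended without usable results
    UNKNOWN = "unknown"    # state could not be determined


STATUS_MAP = {
    "dispatcher": {
        "PENDING": JobState.WAITING,
        "RUNNING": JobState.RUNNING,
        "DONE": JobState.DONE,
        "ERROR": JobState.FAILED,
        "NOT_FOUND": JobState.UNKNOWN,
    },
    "jdispatcher": {
        "RUNNING": JobState.RUNNING,
        "FINISHED": JobState.DONE,
        "FAILURE": JobState.FAILED,
        "ERROR": JobState.UNKNOWN,  # status lookup failed, the job may be fine
        "NOT_FOUND": JobState.UNKNOWN,
    },
    "soaplab": {
        "CREATED": JobState.WAITING,
        "RUNNING": JobState.RUNNING,
        "COMPLETED": JobState.DONE,
        "TERMINATED_BY_ERROR": JobState.FAILED,
        "TERMINATED_BY_REQUEST": JobState.FAILED,
    },
}


def normalise_status(framework, code):
    """Translate a framework-specific status code into a JobState."""
    return STATUS_MAP[framework].get(code, JobState.UNKNOWN)
```

Note that ERROR means different things in the two EMBL-EBI frameworks: for Dispatcher it reports a failed job, while for JDispatcher it only reports that the status could not be obtained.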

Web Service Technologies

Many technologies exist for accessing remote services over a computer network, for example CORBA 3), DCE/RPC, DRb, ICE 7), ILU 6), Java RMI, ONC RPC and XMPP 5).

Some of these are programming language specific (e.g. DRb and Java RMI), have operating system dependencies (e.g. DCE/RPC and ONC RPC) or use special network protocols (e.g. CORBA and ILU) which have issues with network proxies and firewalls. Today the most commonly used technologies for Web Services are Representational State Transfer (REST) and the Simple Object Access Protocol (SOAP). Both are programming language and operating system neutral, and use common network protocols which are well supported by proxies and firewalls.
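
To make the point about common network protocols concrete, the sketch below builds (but does not send) the same hypothetical "get job status" call in both styles. In each case the result is an ordinary HTTP request. The URL layout, the getStatus operation, the jobId element and the service namespace are invented for illustration and are not taken from any real service description; only the SOAP 1.1 envelope namespace is standard.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def rest_status_request(base_url, job_id):
    """REST style: the job status is a resource with its own URL, read with GET."""
    url = "%s/status/%s" % (base_url, job_id)  # hypothetical URL layout
    return urllib.request.Request(url, method="GET")


def soap_status_request(endpoint, service_ns, job_id):
    """SOAP style: an XML envelope naming the operation is POSTed to one endpoint."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    operation = ET.SubElement(body, "{%s}getStatus" % service_ns)  # hypothetical operation
    ET.SubElement(operation, "jobId").text = job_id                # hypothetical parameter
    data = ET.tostring(envelope, encoding="utf-8")
    headers = {"Content-Type": "text/xml; charset=utf-8", "SOAPAction": '""'}
    return urllib.request.Request(endpoint, data=data, headers=headers, method="POST")
```

Either request object could be passed to urllib.request.urlopen to send it. In practice SOAP clients are usually generated from the service's WSDL rather than assembled by hand as shown here.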


1) Goujon, M., McWilliam, H., Li, W., Valentin, F., Squizzato, S., Paern, J. and Lopez, R. (2010)
A new bioinformatics analysis tools framework at EMBL-EBI
Nucleic Acids Research; DOI: 10.1093/nar/gkq313; PubMed: 20439314
2) Senger M., Rice P., Bleasby A. and Uludag M. (2009)
Soaplab: Open Source Web Services Framework for Bioinformatics Programs
The 10th Annual Bioinformatics Open Source Conference; PDF
3) Common Object Request Broker Architecture (CORBA) - http://www.corba.org/
5) Extensible Messaging and Presence Protocol (XMPP) - http://xmpp.org/
6) Inter-Language Unification (ILU) - ftp://ftp.parc.xerox.com/pub/ilu/ilu.html
7) Internet Communications Engine (ICE) - http://zeroc.com/ice.html
 
tutorials/01_intro.txt · Last modified: 2014/03/05 11:23 by hpm