The standard way to access Hypertext Transfer Protocol (HTTP) 1) 2) based services from Python is to use urllib or urllib2. Since REST Web Services are based on HTTP, urllib and urllib2 can be used to access any REST service.


The urllib and urllib2 modules are part of the standard Python distribution and should be installed as part of the Python installation.


HTTP GET is the simplest of the HTTP requests and is used to get a document given a URL. To use GET, the URL of the required Web Service resource is needed. Depending on the service this may be a static URL or, more commonly, a URL constructed from the parameters of the request. The following examples illustrate the process.

WSDbfetch (REST)

Using the WSDbfetch (REST) service to fetch a database entry:

import urllib2

# Defaults
dbName = 'uniprotkb'
entryId = 'wap_rat'
format = None
# Construct URL
baseUrl = ''
url = baseUrl + '/' + dbName + '/' + entryId
if format is not None:
    url += '/' + format
# Get the entry
fh = urllib2.urlopen(url)
result = fh.read()
fh.close()
# Print the entry
print result,
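
As a sketch, the URL construction above can be wrapped in a small helper that also percent-encodes each path segment. The function name build_entry_url and the base URL http://example.org/dbfetch are placeholders chosen for illustration and are not part of the service. The import fallback is only there so the snippet also runs on Python 3, where quote lives in urllib.parse.

```python
try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3


def build_entry_url(base_url, db_name, entry_id, fmt=None):
    """Join the base URL, database name, entry id and optional format."""
    parts = [db_name, entry_id]
    if fmt is not None:
        parts.append(fmt)
    # Percent-encode each segment so characters such as spaces are URL-safe.
    path = '/'.join(quote(p, safe='') for p in parts)
    return base_url.rstrip('/') + '/' + path


# Placeholder base URL, for illustration only.
url = build_entry_url('http://example.org/dbfetch', 'uniprotkb', 'wap_rat', 'fasta')
print(url)
```

The resulting URL can be passed to urlopen() exactly as in the example above.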


While HTTP GET is great for retrieving information there are restrictions on the amount of data that can be sent using GET. Thus for transferring large amounts of data or complex parameters an alternative method has to be used. Since HTTP POST sends the data independently of the URL, POST is used in circumstances where complex or large data needs to be transferred.
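
The difference between the two methods can be illustrated without contacting a server. In this sketch the URL and the parameter names are hypothetical. With GET the encoded parameters become part of the URL, while with POST they are passed as the request body; a Request object that carries data is sent as a POST. The import fallback lets the snippet run on Python 3 as well, where these names live in urllib.parse and urllib.request.

```python
try:
    from urllib import urlencode                # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode          # Python 3
    from urllib.request import Request

# Hypothetical endpoint and parameters, for illustration only.
url = 'http://example.org/service/run'
params = [('email', 'user@example.org'), ('stype', 'protein')]

# Encode the parameters as application/x-www-form-urlencoded text.
encoded = urlencode(params)

# GET: the parameters are appended to the URL as a query string.
get_req = Request(url + '?' + encoded)

# POST: the parameters travel in the request body (bytes), not in the URL.
post_req = Request(url, encoded.encode('ascii'))

print(get_req.get_method())   # method chosen when no data is supplied
print(post_req.get_method())  # method chosen when data is supplied
```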


WU-BLAST (REST) requires a POST to submit the search parameters:

import sys, urllib, urllib2

# Base URL for service
baseUrl = ''
# Query sequence
seq = """>Q8E5Q5_STRA3
...
"""
# Structure containing parameters
params = {
    # ...
}
# Submit job
submitUrl = baseUrl + '/run'
postData = urllib.urlencode(params)
# Errors are indicated by HTTP status codes.
try:
    fh = urllib2.urlopen(submitUrl, postData)
    jobId = fh.read()
    fh.close()
except urllib2.HTTPError, ex:
    # Trap exception and output the document to get error message.
    print >>sys.stderr, ex.read()
    # Re-throw exception so that no job identifier is printed.
    raise
# Print job identifier
print jobId

HTTP Status Messages

Services may use many different methods for reporting errors to the client. One common method is to use HTTP status codes to indicate the error and a custom status message to describe it. Unfortunately urllib2 overrides HTTP status messages and replaces them with standardised messages derived from the HTTP specification. One possible workaround is to report the message from the error document received:

# Errors are indicated by HTTP status codes.
try:
    # Make the request.
    fh = urllib2.urlopen(submitUrl, postData)
    jobId = fh.read()
    fh.close()
except urllib2.HTTPError, ex:
    # Trap exception and output the document to get error message.
    print >>sys.stderr, ex.read()
    # Re-throw exception to get stack trace.
    raise
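
The same workaround can be sketched as a reusable helper. The function name error_message is hypothetical, and the error response is simulated by constructing an HTTPError by hand, so the snippet runs without network access. It assumes the service returns its error description as text in the response body, and falls back to the standard status line when the body is empty. The import fallback lets it run on Python 3 too, where HTTPError lives in urllib.error.

```python
import io
try:
    from urllib2 import HTTPError          # Python 2
except ImportError:
    from urllib.error import HTTPError     # Python 3


def error_message(ex):
    """Return the error document sent by the server, if any."""
    body = ex.read()
    if body:
        # Assumes a text error document; undecodable bytes are replaced.
        return body.decode('utf-8', 'replace').strip()
    # No document received: fall back to the standardised status message.
    return '%d %s' % (ex.code, ex.msg)


# Simulated error response (hypothetical URL and message).
fake = HTTPError('http://example.org/run', 400, 'Bad Request', {},
                 io.BytesIO(b'Invalid parameters: email'))
print(error_message(fake))
```

In a real client the helper would be called from the except block, in place of printing ex.read() directly.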


Generally urllib and urllib2 will automatically configure the required proxy settings from the system settings:

  • Environment variables on UNIX and UNIX-like systems: http_proxy, ftp_proxy and no_proxy
  • “Internet Settings” on MS Windows
  • “Network System Preferences” on MacOS X

If required, these system settings can be overridden; see the Python documentation for details.
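
As a minimal sketch of such an override, a ProxyHandler can be given an explicit mapping and used to build an opener. The proxy address below is a placeholder, not a real server. The import fallback lets the snippet run on Python 3, where these names live in urllib.request.

```python
try:
    from urllib import getproxies                       # Python 2
    from urllib2 import ProxyHandler, build_opener
except ImportError:
    from urllib.request import (                        # Python 3
        ProxyHandler, build_opener, getproxies)

# Proxy settings detected from the system (may be an empty dictionary).
print(getproxies())

# Placeholder proxy address, for illustration only.
proxies = {'http': 'http://proxy.example.org:8080'}
handler = ProxyHandler(proxies)

# Requests made with opener.open(url) would go through the proxy above
# instead of the system settings.
opener = build_opener(handler)
```

Passing an empty dictionary to ProxyHandler disables the automatically detected proxies, and install_opener(opener) makes the opener the default used by urlopen().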


HTTP clients usually identify themselves. This allows services to handle specific clients differently if necessary, and gives service providers information about how their services are being used. By default the HTTP User-Agent header (see RFC2616 section 14.43) is set to something like:

  • urllib: Python-urllib/1.17
  • urllib2: Python-urllib/2.5
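
The default value for the installed Python can be checked by inspecting the headers that a freshly built opener adds to every request. This is a small sketch; the version number printed depends on the Python installation. The import fallback lets it run on Python 3, where build_opener lives in urllib.request.

```python
try:
    from urllib2 import build_opener            # Python 2
except ImportError:
    from urllib.request import build_opener     # Python 3

# A new opener carries the default headers sent with every request.
opener = build_opener()
default_headers = dict(opener.addheaders)

# The header name is stored as 'User-agent'.
print(default_headers['User-agent'])
```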

If additional identification of the client is required, then a more specific product token (see RFC2616 section 3.8) should be added to the beginning of the User-Agent string.

Note: while the HTTP specification does not define a limit on the size of HTTP headers, web server implementations often do limit the maximum size of an HTTP header to 8KB or 16KB. If the server limit for an HTTP header is exceeded a “400 Bad Request” will be returned by the server.


For urllib the user agent can be set for all requests by modifying the URLopener used to create the connections. For example:

# Modify the user-agent to add a more specific prefix (see RFC2616 section 14.43)
import urllib
class AppURLopener(urllib.FancyURLopener):
    version = 'Example-Client/1.0 Python-urllib/%s' % urllib.__version__
urllib._urlopener = AppURLopener()


For urllib2 the user agent has to be specified in a User-Agent header for each request, for example:

# Modify the user-agent to add a more specific prefix (see RFC2616 section 14.43)
import urllib2, sys
user_agent = 'Example-Client/1.0 Python-urllib/%s' % sys.version[:3]
http_headers = { 'User-Agent' : user_agent }
# 'url' is the address of the resource being requested
req = urllib2.Request(url, None, http_headers)
# Make the request
fh = urllib2.urlopen(req)
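
Since the header has to be set on every request, it can help to build requests through one function. The sketch below uses a hypothetical helper name (make_request) and a placeholder URL. It takes the Python version from sys.version_info rather than slicing sys.version, because the slice truncates two-digit minor versions such as 3.10. The import fallback lets it run on Python 3, where Request lives in urllib.request.

```python
import sys
try:
    from urllib2 import Request                # Python 2
except ImportError:
    from urllib.request import Request         # Python 3


def make_request(url, product_token, data=None):
    """Build a request whose User-Agent starts with the given product token."""
    python_version = '%d.%d' % sys.version_info[:2]
    user_agent = '%s Python-urllib/%s' % (product_token, python_version)
    return Request(url, data, {'User-Agent': user_agent})


# Placeholder URL and product token, for illustration only.
req = make_request('http://example.org/service', 'Example-Client/1.0')

# Request stores header names capitalised as 'User-agent'.
print(req.get_header('User-agent'))
```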

Sample Clients

Most REST Web Services at EMBL-EBI have sample clients which provide command-line access to the service and example code. For Python some of the clients are based on urllib/urllib2.

Service Sample client
WSDbfetch (REST)

Further Reading

1) RFC1945 - Hypertext Transfer Protocol -- HTTP/1.0
2) RFC2616 - Hypertext Transfer Protocol -- HTTP/1.1
tutorials/06_programming/python/rest/urllib.txt · Last modified: 2013/06/21 09:35 by hpm