
Jakarta Commons HttpClient 3.x

One way to access Hypertext Transfer Protocol (HTTP) 1) 2) based services from Java 3) is to use the Jakarta Commons HttpClient 4). Since REST Web Services are based on HTTP, Jakarta Commons HttpClient can be used to access any REST service.

Installation

Jakarta Commons HttpClient 3.x is available for download from http://hc.apache.org/httpclient-3.x/.

Apache Ivy

If using Apache Ivy the following dependency declaration can be used:

<dependency org="commons-httpclient" name="commons-httpclient" rev="3.1" />

Apache Maven

If using Apache Maven the following dependency declaration can be used:

<dependency>
	<groupId>commons-httpclient</groupId>
	<artifactId>commons-httpclient</artifactId>
	<version>3.1</version>
	<scope>compile</scope>
</dependency>

HTTP GET

HTTP GET is the simplest of the HTTP requests and is used to get a document given a URL. To use GET, the URL of the required Web Service resource is needed. Depending on the service this may be a static URL or, more commonly, a URL constructed from the parameters of the request. The following examples illustrate the process using the dbfetch and WSDbfetch (REST) services.

dbfetch

The dbfetch service (http://www.ebi.ac.uk/Tools/dbfetch/dbfetch) provides a generic interface to retrieve data entries given an identifier (Id or accession) from a wide range of biological databases available at EMBL-EBI. Two styles of URL can be used to access dbfetch:

  1. Parametrised URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db={DB}&id={IDS}&format={FORMAT}&style={STYLE}
  2. Document style URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/{DB}/{IDS}/{FORMAT}

The dbfetch documentation (http://www.ebi.ac.uk/Tools/dbfetch/dbfetch) details the valid values for the database name ({DB}), data format ({FORMAT}) and data style ({STYLE}). The identifier list ({IDS}) is a comma-separated list of entry identifiers, which can be Ids, names or accessions. For example, to retrieve the rat and mouse WAP proteins from UniProtKB:

  1. Parametrised URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db=uniprotkb&id=WAP_RAT,WAP_MOUSE&format=uniprot&style=raw
  2. Document style URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/uniprotkb/WAP_RAT,WAP_MOUSE/uniprot
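
Both URL styles can be assembled from the same set of parameters. The following is a minimal sketch using only the Java standard library; the class and method names (DbfetchUrls, joinIds, parametrisedUrl, documentUrl) are hypothetical, and it assumes the values contain only characters that are safe in a URL, so no escaping is applied.

```java
/** Hypothetical helper that builds dbfetch URLs (illustration only). */
class DbfetchUrls {
	// Base URL of the dbfetch service, as given in the text above
	static final String BASE_URL = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch";

	/** Join entry identifiers into the comma-separated {IDS} list. */
	static String joinIds(String[] ids) {
		StringBuilder sb = new StringBuilder();
		for (int i = 0; i < ids.length; i++) {
			if (i > 0) {
				sb.append(',');
			}
			sb.append(ids[i]);
		}
		return sb.toString();
	}

	/** Parametrised URL: values are passed in the query string. */
	static String parametrisedUrl(String db, String[] ids, String format, String style) {
		return BASE_URL + "?db=" + db + "&id=" + joinIds(ids)
			+ "&format=" + format + "&style=" + style;
	}

	/** Document style URL: values are passed as path elements. */
	static String documentUrl(String db, String[] ids, String format) {
		return BASE_URL + "/" + db + "/" + joinIds(ids) + "/" + format;
	}

	public static void main(String[] args) {
		String[] ids = { "WAP_RAT", "WAP_MOUSE" };
		System.out.println(parametrisedUrl("uniprotkb", ids, "uniprot", "raw"));
		System.out.println(documentUrl("uniprotkb", ids, "uniprot"));
	}
}
```

Either string can then be passed to an HTTP GET request.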

Using the dbfetch document style URL to fetch the UniProtKB entry WAP_RAT (examples/rest/httpclient/DbfetchHttpclientGet.java):

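import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;

public class DbfetchHttpclientGet {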
	/** Get a web page using HTTP GET.
	 * 
	 * @param urlStr The URL of the page to be retrieved.
	 * @return A string containing the page.
	 */
	public static String getHttpUrl(String urlStr) {
		// Data obtained from service, to be returned
		String retVal = null;
		// Create a client
		HttpClient client = new HttpClient();
		// Create a HTTP GET request
		GetMethod method = new GetMethod(urlStr);
		// Allow redirects to be followed
		method.setFollowRedirects(true);
		try {
			// Execute the request using the client
			int statusCode = client.executeMethod(method);
			// Check the response status code
			if (statusCode != HttpStatus.SC_OK) {
				System.err.println("Method failed: " + method.getStatusLine());
		    }
			// Get the page data, allowing for character encoding
			BufferedReader bis = new BufferedReader(
				new InputStreamReader(
					method.getResponseBodyAsStream(),
					method.getResponseCharSet())
			);
			int bufLen = 8 * 1024;
			char[] charBuf = new char[ bufLen ];
			StringBuffer strBuf = new StringBuffer();
			int count;
			while( (count = bis.read(charBuf) ) != -1) {
				strBuf.append(charBuf, 0, count);
			}
			bis.close();
			retVal = strBuf.toString();
		}
		catch(HttpException ex) {
			System.err.println(ex.getMessage());
		}
		catch(IOException ex) {
			System.err.println(ex.getMessage());
		}
		finally {
			// Clean-up the connection
			method.releaseConnection();
		}
		// Return the response data
		return retVal;
	}
 
	/** Execution entry point
	 * 
	 * @param args Command-line arguments (not used)
	 */
	public static void main(String[] args) {
		// Parameters for the dbfetch call
		String dbName = "uniprotkb"; // Database name (e.g. UniProtKB)
		String id = "WAP_RAT"; // Entry identifier, name or accession
		String format = "uniprot"; // Data format
 
		// Construct the dbfetch URL
		// Document style base URL 
		String dbfetchBaseUrl = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/";
		// Add the database name, identifiers and format to the URL
		String dbfetchUrl = dbfetchBaseUrl + dbName + "/" + id + "/" + format;
 
		// Get the page and print it.
		System.out.print(getHttpUrl(dbfetchUrl));
	}
}
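
The read loop in getHttpUrl does not depend on HttpClient itself, so it can be factored out and exercised without a network connection. The following is a minimal sketch, assuming a hypothetical helper class StreamText; in the client above it would be called with method.getResponseBodyAsStream() and method.getResponseCharSet().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

/** Hypothetical helper: read a whole stream into a string (illustration only). */
class StreamText {
	/**
	 * Read all characters from a stream, decoding with the given charset.
	 * The stream is closed when reading finishes or fails.
	 */
	static String readAll(InputStream in, String charsetName) throws IOException {
		BufferedReader reader = new BufferedReader(new InputStreamReader(in, charsetName));
		try {
			char[] charBuf = new char[8 * 1024];
			StringBuilder strBuf = new StringBuilder();
			int count;
			// read() returns -1 at end of stream
			while ((count = reader.read(charBuf)) != -1) {
				strBuf.append(charBuf, 0, count);
			}
			return strBuf.toString();
		} finally {
			reader.close();
		}
	}

	public static void main(String[] args) throws IOException {
		// Stand-in for a response body; no network access is needed
		byte[] data = "example response\n".getBytes("UTF-8");
		System.out.print(readAll(new ByteArrayInputStream(data), "UTF-8"));
	}
}
```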

Exercise 1: RESTful dbfetch

In the sample project a dbfetch client is provided (examples/rest/httpclient/DbfetchHttpclientGet.java). Starting from this client use dbfetch to get the EMBL-Bank entries with accessions: M28668, M60493 and M76128.

See the dbfetch and WSDbfetch REST documentation for details of the valid values for the parameters and the structure of the request URL.

Sample solution: solutions/rest/httpclient/Q1DbfetchHttpclientGet.java

HTTP POST

While HTTP GET is well suited to retrieving information, the request parameters must fit in the URL, and clients, servers and proxies limit the URL length in practice. For large amounts of data or complex parameters an alternative method is needed. HTTP POST sends the data in the request body, independently of the URL, and so is used where complex or large data needs to be transferred.
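
To make concrete what sending the data independently of the URL means: with a form-style POST the parameters travel in the request body as application/x-www-form-urlencoded text rather than in the query string. The sketch below builds such a body with the standard library's URLEncoder. The class name FormBody is hypothetical, UTF-8 is an assumed encoding, and this illustrates the format only; it is not the code HttpClient uses internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: build a form-encoded request body (illustration only). */
class FormBody {
	/**
	 * Encode name/value pairs as name1=value1&name2=value2, escaping
	 * reserved characters in both names and values.
	 *
	 * @param pairs Array of {name, value} pairs
	 */
	static String encode(String[][] pairs) throws UnsupportedEncodingException {
		StringBuilder body = new StringBuilder();
		for (int i = 0; i < pairs.length; i++) {
			if (i > 0) {
				body.append('&');
			}
			body.append(URLEncoder.encode(pairs[i][0], "UTF-8"));
			body.append('=');
			body.append(URLEncoder.encode(pairs[i][1], "UTF-8"));
		}
		return body.toString();
	}

	public static void main(String[] args) throws UnsupportedEncodingException {
		String[][] pairs = {
			{ "db", "uniprotkb" },
			{ "id", "WAP_RAT,WAP_MOUSE" },
			{ "format", "uniprot" }
		};
		// The URL stays the same however long the identifier list becomes
		System.out.println(encode(pairs));
	}
}
```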

dbfetch

The dbfetch service accepts HTTP POST requests as well as HTTP GET requests. This is useful when requesting a long list of identifiers.

For example (examples/rest/httpclient/DbfetchHttpclientPost.java):

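import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.NameValuePair;
import org.apache.commons.httpclient.methods.PostMethod;

public class DbfetchHttpclientPost {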
	/**
	 * Get a web page using HTTP POST.
	 * 
	 * @param urlStr
	 *            The URL of the page to be retrieved.
	 * @param postData
	 *            Array of name value pairs describing the data for the POST
	 * @return A string containing the page.
	 */
	public static String getHttpUrlPost(String urlStr, NameValuePair[] postData) {
		// Data obtained from service, to be returned
		String retVal = null;
		// Create a client
		HttpClient client = new HttpClient();
		// Create a HTTP POST request
		PostMethod method = new PostMethod(urlStr);
		// Add the POST data to the request
		method.setRequestBody(postData);
		try {
			// Execute the request using the client
			int statusCode = client.executeMethod(method);
			// Handle redirect response (cannot use setFollowRedirects(true)
			// with POST).
			// See http://hc.apache.org/httpclient-legacy/redirects.html
			if (statusCode == HttpStatus.SC_MOVED_PERMANENTLY
					|| statusCode == HttpStatus.SC_MOVED_TEMPORARILY
					|| statusCode == HttpStatus.SC_TEMPORARY_REDIRECT) {
				// Get the location to go to.
				Header locationHeader = method.getResponseHeader("location");
				if (locationHeader != null) {
					String redirectLocation = locationHeader.getValue();
					// Release the connection used by the original request
					method.releaseConnection();
					// Try request against new location
					method = new PostMethod(redirectLocation);
					method.setRequestBody(postData);
					statusCode = client.executeMethod(method);
				}
			}
			// Check the response status code
			if (statusCode != HttpStatus.SC_OK) {
				System.err.println("Method failed: " + method.getStatusLine());
			}
			// Get the page data, allowing for character encoding
			BufferedReader bis = new BufferedReader(new InputStreamReader(
					method.getResponseBodyAsStream(),
					method.getResponseCharSet()));
			int bufLen = 8 * 1024;
			char[] charBuf = new char[bufLen];
			StringBuffer strBuf = new StringBuffer();
			int count;
			while ((count = bis.read(charBuf)) != -1) {
				strBuf.append(charBuf, 0, count);
			}
			bis.close();
			retVal = strBuf.toString();
		} catch (HttpException ex) {
			System.err.println(ex.getMessage());
		} catch (IOException ex) {
			System.err.println(ex.getMessage());
		} finally {
			// Clean-up the connection
			method.releaseConnection();
		}
		// Return the response data
		return retVal;
	}
 
	/**
	 * Execution entry point
	 * 
	 * @param args
	 *            Command-line arguments (not used)
	 */
	public static void main(String[] args) {
		// Parameters for the dbfetch call
		String dbName = "uniprotkb"; // Database name (e.g. UniProtKB)
		String id = "WAP_RAT"; // Entry identifier, name or accession
		String format = "uniprot"; // Data format
 
		// Parameter style URL for dbfetch
		String dbfetchUrl = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch";
		// Construct the POST data for the parameters
		NameValuePair[] postData = {
				new NameValuePair("db", dbName), // Database
				new NameValuePair("id", id), // Entry identifier(s)
				new NameValuePair("format", format), // Data format
				new NameValuePair("style", "raw") // Result style
		};
 
		// Get the page and print it.
		System.out.print(getHttpUrlPost(dbfetchUrl, postData));
	}
}

Proxies

In some environments it is necessary to configure an HTTP proxy before a client can connect to external services. Jakarta Commons HttpClient supports proxy configuration via the HostConfiguration:

// Create a client
HttpClient client = new HttpClient();
// Configure HTTP proxy.
HostConfiguration hostConf = client.getHostConfiguration();
hostConf.setProxy("proxy.example.org", 8080);

If support for the Java system properties (e.g. -Dhttp.proxyHost=proxy.example.org -Dhttp.proxyPort=8080) is required:

// Create a client
HttpClient client = new HttpClient();
// Configure HTTP proxy from system properties.
if(System.getProperty("http.proxyHost") != null) {
	String proxyHost = System.getProperty("http.proxyHost");
	// http.proxyPort is optional; fall back to the Java default of 80
	int proxyPort = Integer.parseInt(System.getProperty("http.proxyPort", "80"));
	HostConfiguration hostConf = client.getHostConfiguration();
	hostConf.setProxy(proxyHost, proxyPort);
}
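
Property values supplied on the command line may be missing or malformed, so it can help to validate them before handing them to HostConfiguration. The following is a minimal sketch using only the standard library; the class ProxySettings and its method fromProperties are hypothetical. It assumes a fallback to port 80 (the documented Java default for http.proxyPort) when only the host is set.

```java
import java.util.Properties;

/** Hypothetical holder for validated proxy settings (illustration only). */
class ProxySettings {
	final String host;
	final int port;

	ProxySettings(String host, int port) {
		this.host = host;
		this.port = port;
	}

	/**
	 * Read http.proxyHost and http.proxyPort from the given properties.
	 *
	 * @return The proxy settings, or null if no proxy host is set
	 * @throws IllegalArgumentException if the port is not a number in 1..65535
	 */
	static ProxySettings fromProperties(Properties props) {
		String host = props.getProperty("http.proxyHost");
		if (host == null || host.trim().isEmpty()) {
			return null;
		}
		// NumberFormatException is a subclass of IllegalArgumentException
		int port = Integer.parseInt(props.getProperty("http.proxyPort", "80").trim());
		if (port < 1 || port > 65535) {
			throw new IllegalArgumentException("Invalid http.proxyPort: " + port);
		}
		return new ProxySettings(host.trim(), port);
	}

	public static void main(String[] args) {
		ProxySettings proxy = fromProperties(System.getProperties());
		System.out.println(proxy == null
			? "No proxy configured"
			: "Proxy: " + proxy.host + ":" + proxy.port);
	}
}
```

A non-null result would then be applied with hostConf.setProxy(proxy.host, proxy.port).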

User-Agent

HTTP clients usually provide information about what they are, allowing services to handle specific clients differently if necessary, and giving service providers some information about how their services are being used. By default HttpClient sets the HTTP User-Agent header (see RFC2616 section 14.43) to something like Jakarta Commons-HttpClient/3.1, where the version number (3.1) is the version of HttpClient. If additional identification of the client is required a more specific product token (see RFC2616 section 3.8) should be added to the beginning of the User-Agent string:

// Modify the user-agent to add a more specific prefix (see RFC2616 section 14.43)
HttpClientParams httpClientParams = new HttpClientParams();
httpClientParams.setParameter("http.useragent", 
	"Example-Client/1.0 (" + System.getProperty("os.name") + ") " 
	+ httpClientParams.getParameter("http.useragent"));
HttpClient httpClient = new HttpClient(httpClientParams);

Note: while the HTTP specification does not define a limit on the size of HTTP headers, web server implementations often do limit the maximum size of an HTTP header to 8KB or 16KB. If the server limit for an HTTP header is exceeded a “400 Bad Request” will be returned by the server.
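
The composition of the User-Agent value can be sketched independently of HttpClient. The class UserAgentBuilder and its methods below are hypothetical, and the 8KB limit is only an assumed example of a server limit; actual limits vary between servers.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for composing a User-Agent value (illustration only). */
class UserAgentBuilder {
	// Assumed server limit for a single header; real servers differ
	static final int MAX_HEADER_BYTES = 8 * 1024;

	/**
	 * Put a product token (product/version), with an optional comment,
	 * in front of an existing User-Agent value.
	 */
	static String prepend(String product, String version, String comment, String baseAgent) {
		StringBuilder sb = new StringBuilder();
		sb.append(product).append('/').append(version);
		if (comment != null && !comment.isEmpty()) {
			sb.append(" (").append(comment).append(')');
		}
		if (baseAgent != null && !baseAgent.isEmpty()) {
			sb.append(' ').append(baseAgent);
		}
		return sb.toString();
	}

	/** Check the size of the complete header line against the assumed limit. */
	static boolean fitsHeaderLimit(String userAgent) {
		String headerLine = "User-Agent: " + userAgent + "\r\n";
		return headerLine.getBytes(StandardCharsets.ISO_8859_1).length <= MAX_HEADER_BYTES;
	}

	public static void main(String[] args) {
		String userAgent = prepend("Example-Client", "1.0",
			System.getProperty("os.name"), "Jakarta Commons-HttpClient/3.1");
		System.out.println(userAgent);
		System.out.println("Within assumed limit: " + fitsHeaderLimit(userAgent));
	}
}
```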


1) RFC1945 - Hypertext Transfer Protocol – HTTP/1.0 - http://www.faqs.org/rfcs/rfc1945.html
2) RFC2616 - Hypertext Transfer Protocol – HTTP/1.1 - http://www.faqs.org/rfcs/rfc2616.html
4) Jakarta Commons HttpClient 3.x - http://hc.apache.org/httpclient-3.x/
 
tutorials/06_programming/java/rest/jakarta_commons_httpclient_3.x.txt · Last modified: 2014/02/26 11:27 by hpm