
java.net

One way to access Hypertext Transfer Protocol (HTTP) 1) 2) based services from Java 3) is to use the java.net.URL and java.net.HttpURLConnection classes in the java.net package. Since REST Web Services are based on HTTP, these two classes can be used to access any REST service.

Installation

The java.net package is included as part of the Java distribution and does not require separate installation.

HTTP GET

HTTP GET is the simplest of the HTTP requests and is used to get a document given a URL. To use GET, the URL of the required Web Service resource is needed. Depending on the service this may be a static URL or, more commonly, a URL that has to be constructed from the parameters of the request. The following examples illustrate the process using the dbfetch, WSDbfetch (REST) and SRS services.

dbfetch

The dbfetch service (http://www.ebi.ac.uk/Tools/dbfetch/dbfetch) provides a generic interface to retrieve data entries given an identifier (Id or accession) from a wide range of biological databases available at EMBL-EBI. Two styles of URL can be used to access dbfetch:

  1. Parametrised URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db={DB}&id={IDS}&format={FORMAT}&style={STYLE}
  2. Document style URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/{DB}/{IDS}/{FORMAT}

The dbfetch documentation (http://www.ebi.ac.uk/Tools/dbfetch/dbfetch) details the valid values for the database name ({DB}), data format ({FORMAT}) and data style ({STYLE}). The identifier list ({IDS}) is a comma-separated list of entry identifiers, which can be Ids, names or accessions. For example, to retrieve the rat and mouse WAP proteins from UniProtKB:

  1. Parametrised URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db=uniprotkb&id=WAP_RAT,WAP_MOUSE&format=uniprot&style=raw
  2. Document style URL:
    http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/uniprotkb/WAP_RAT,WAP_MOUSE/uniprot
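
Both URL styles can be assembled from the same parameters. The following is a minimal sketch, not part of the sample project: the class name DbfetchUrlSketch and its helper methods are hypothetical, and it assumes that each identifier can be percent-encoded on its own and joined with literal commas, as in the example URLs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class DbfetchUrlSketch {
    // dbfetch base URL as given in the text
    static final String BASE_URL = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch";

    /** Percent-encode a single value. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 support is required of every Java platform
            throw new IllegalStateException(ex);
        }
    }

    /** Encode each identifier separately and join them with literal commas. */
    static String joinIds(List<String> ids) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String id : ids) {
            if (!first) {
                sb.append(',');
            }
            sb.append(enc(id));
            first = false;
        }
        return sb.toString();
    }

    /** Build the parametrised style URL. */
    static String parametrisedUrl(String db, List<String> ids, String format, String style) {
        return BASE_URL + "?db=" + enc(db) + "&id=" + joinIds(ids)
            + "&format=" + enc(format) + "&style=" + enc(style);
    }

    /** Build the document style URL. */
    static String documentUrl(String db, List<String> ids, String format) {
        return BASE_URL + "/" + enc(db) + "/" + joinIds(ids) + "/" + enc(format);
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("WAP_RAT", "WAP_MOUSE");
        System.out.println(parametrisedUrl("uniprotkb", ids, "uniprot", "raw"));
        System.out.println(documentUrl("uniprotkb", ids, "uniprot"));
    }
}
```

Running the sketch only prints the two URLs for the rat and mouse WAP entries; no request is sent.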

The following client uses the dbfetch document style URL to fetch the UniProtKB entry WAP_RAT (examples/rest/javanet/RestDbfetchGet.java):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class RestDbfetchGet {
	/** Get a web page using HTTP GET.
	 * 
	 * @param urlStr The URL of the page to be retrieved as a string.
	 * @return A string containing the page data, or null if the request failed.
	 */
	public static String getHttpUrl(String urlStr) {
		// Data obtained from service, to be returned
		String retVal = null;
		// Get data using HTTP GET
		try {
			URL url = new URL(urlStr);
			// try-with-resources closes the reader and the underlying stream
			try (BufferedReader inBuf = new BufferedReader(new InputStreamReader(url.openStream()))) {
				StringBuilder strBuf = new StringBuilder();
				String line;
				// readLine() returns null at the end of the stream. ready() only
				// reports whether a read would block, so it can end the loop early.
				while ((line = inBuf.readLine()) != null) {
					strBuf.append(line).append(System.lineSeparator());
				}
				retVal = strBuf.toString();
			}
		}
		catch(IOException ex) {
			System.err.println(ex.getMessage());
		}
		// Return the response data
		return retVal;
	}

	/** Execution entry point
	 * 
	 * @param args Command-line arguments
	 */
	public static void main(String[] args) {
		// Parameters for dbfetch call
		String dbName = "uniprot"; // Database name (e.g. UniProtKB)
		String id = "WAP_RAT"; // Entry identifier, name or accession
		String format = "uniprot"; // Data format

		// Construct the dbfetch URL
		// dbfetch document style base URL 
		String dbfetchBaseUrl = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/";
		// Add the database name, identifiers and format to the URL
		String dbfetchUrl = dbfetchBaseUrl + dbName + "/" + id + "/" + format;

		// Get the page and print it.
		System.out.print(getHttpUrl(dbfetchUrl));
	}
}

Exercise 1: RESTful dbfetch

In the sample project a dbfetch client is provided (examples/rest/javanet/RestDbfetchGet.java). Starting from this client, use dbfetch to get the EMBL-Bank entries with the accessions M28668, M60493 and M76128.

See the dbfetch and WSDbfetch REST documentation for details of the valid values for the parameters and the structure of the request URL.

Sample solution: solutions/rest/javanet/Q1RestDbfetchGet.java

SRS

While dbfetch provides a useful interface for entry retrieval, it is not a general query system. One option for performing queries is SRS (http://srs.ebi.ac.uk/). SRS offers a URL based interface which can be used to perform complex multi-database queries in a single request.

The simplest form of an SRS URL retrieves a list of entry identifiers in DB:ID format. For example, to retrieve the entries in UniProtKB which contain a term matching “azurin*”:

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?[uniprot-all:azurin*]

By default only the first 30 entries are returned. To get the number of entries matching the query the “cResult” page can be used:

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+cResult+[uniprot-all:azurin*]

This returns the number of entries matched by the query:

170 entries for [uniprot-all:azurin*]
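
A client usually needs this count as a number. The sketch below extracts it from the response text; the class name SrsCountSketch is hypothetical, and the pattern assumes the response always starts with the count followed by the word "entries", as in the single sample line shown above. Other response forms are not covered.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SrsCountSketch {
    // Assumed response form: "<count> entries for <query>"
    private static final Pattern COUNT = Pattern.compile("^\\s*(\\d+)\\s+entries\\b");

    /** Extract the entry count from the text returned by the cResult page. */
    static int parseEntryCount(String response) {
        Matcher m = COUNT.matcher(response);
        if (!m.find()) {
            throw new IllegalArgumentException("Unexpected cResult response: " + response);
        }
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        // Sample response text taken from the example above
        System.out.println(parseEntryCount("170 entries for [uniprot-all:azurin*]"));
    }
}
```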

Given the number of entries found, the results can be retrieved in chunks by using -bv to specify the number of the first entry in the chunk and -lv to specify the length of the chunk. For example, to get the first two chunks of 30 entries for the query:

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-bv+1+-lv+30+[uniprot-all:azurin*]
http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-bv+31+-lv+30+[uniprot-all:azurin*]
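
The chunk URLs follow a regular pattern, so they can be generated from the entry count. A minimal sketch, assuming the count is already known and that the query string is placed in the URL exactly as written in the examples above; the class name SrsChunkSketch and the method chunkUrls are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class SrsChunkSketch {
    // wgetz URL as used in the examples in the text
    static final String WGETZ_URL = "http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz";

    /**
     * Build one URL per chunk of results.
     *
     * @param query     SRS query string, e.g. "[uniprot-all:azurin*]"
     * @param total     number of entries matched by the query
     * @param chunkSize number of entries per chunk (-lv)
     */
    static List<String> chunkUrls(String query, int total, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> urls = new ArrayList<String>();
        // -bv is the 1-based number of the first entry in the chunk
        for (int first = 1; first <= total; first += chunkSize) {
            urls.add(WGETZ_URL + "?-bv+" + first + "+-lv+" + chunkSize + "+" + query);
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : chunkUrls("[uniprot-all:azurin*]", 170, 30)) {
            System.out.println(url);
        }
    }
}
```

For 170 entries and a chunk length of 30 this yields six URLs, starting at entries 1, 31, 61, 91, 121 and 151. The last chunk simply requests more entries than remain.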

As well as getting the identifiers of the entries matching the query, the complete entries can be obtained using -e, or a specific view of the data using -view. For example, to get a summary of the results of our query the “SeqSimpleView” could be used:

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-view+SeqSimpleView+[uniprot-all:azurin*]

or, to get FASTA formatted sequences, the “FastaSeqs” view:

http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-view+FastaSeqs+[uniprot-all:azurin*]

For more information about SRS URLs see the Linking to SRS guide.

Exercise 2: REST and SRS

In the sample project a dbfetch client is provided (examples/rest/javanet/RestDbfetchGet.java). Starting from this client, use SRS to find the number of entries in the EMBL Coding Sequences database (EMBLCDS) which contain the gene name CFTR.

See the Linking to SRS guide for details of how to construct a URL for SRS and the SRS Query Language Quick Guide for details of how to construct the query string.

Hint: use the SRS web interface (http://srs.ebi.ac.uk/) to perform the query and copy the query string created into the URL you construct.

Sample solution: solutions/rest/javanet/Q2RestSrsGet.java

HTTP POST

While HTTP GET works well for retrieving information, there are restrictions on the amount of data that can be sent in a GET request, because the parameters are carried in the URL. HTTP POST sends the data in the request body, independently of the URL, so POST is used where large amounts of data or complex parameters need to be transferred.

dbfetch

The dbfetch service accepts HTTP POST requests as well as HTTP GET requests. This is useful when requesting a long list of identifiers.

A POST request is a bit more complex since the request “method” has to be explicitly set and the POST data has to be provided (examples/rest/javanet/RestDbfetchPost.java):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class RestDbfetchPost {
 /** Get a web page using HTTP POST.
  * 
  * @param urlStr The URL of the page to be retrieved.
  * @param postStr String containing POST encoded data
  * @return A string containing the entry, or null if the request failed.
  */
  public static String getHttpUrl(String urlStr, String postStr) {
    // Data obtained from service, to be returned
    String retVal = null;
    // Get data using HTTP POST
    try {
      // Create connection to URL
      URL url = new URL(urlStr);
      HttpURLConnection conn = (HttpURLConnection)url.openConnection();
      conn.setDoOutput(true);
      conn.setRequestMethod("POST");
      // Send POST data; closing the writer flushes it
      try (OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream())) {
        wr.write(postStr);
      }
      // Get the response
      try (BufferedReader inBuf = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
        StringBuilder strBuf = new StringBuilder();
        String line;
        // readLine() returns null at the end of the stream. ready() only
        // reports whether a read would block, so it can end the loop early.
        while ((line = inBuf.readLine()) != null) {
          strBuf.append(line).append(System.lineSeparator());
        }
        retVal = strBuf.toString();
      }
    }
    catch(IOException ex) {
      System.err.println(ex.getMessage());
    }
    // Return the response data
    return retVal;
  }

  /** Execution entry point
   * 
   * @param args Command-line arguments
   */
  public static void main(String[] args) {
    String dbName = "uniprot"; // Database name (e.g. UniProtKB)
    String id = "wap_rat"; // Entry identifier, name or accession
    String format = "uniprot"; // Entry format.

    // Base URL for dbfetch
    String dbfetchUrlStr = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch";
    // Construct POST data
    String dbfetchPostStr = null;
    try {
      // Ensure appropriate encoding is used.
      dbfetchPostStr = URLEncoder.encode("db", "UTF-8") + "=" + URLEncoder.encode(dbName, "UTF-8")
        + "&" + URLEncoder.encode("id", "UTF-8") + "=" + URLEncoder.encode(id, "UTF-8")
        + "&" + URLEncoder.encode("format", "UTF-8") + "=" + URLEncoder.encode(format, "UTF-8")
        + "&" + URLEncoder.encode("style", "UTF-8") + "=" + URLEncoder.encode("raw", "UTF-8");
    }
    catch(IOException ex) {
      // URLEncoder.encode throws UnsupportedEncodingException, a subclass of IOException
      System.err.println(ex.getMessage());
    }

    // Get the page and print it.
    System.out.print(getHttpUrl(dbfetchUrlStr, dbfetchPostStr));
  }
}
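
The POST data in the client above is assembled by hand, one parameter at a time. When the parameters are held in a map, for instance with a list of identifiers joined by commas, the encoding can be done in a loop. This is a sketch only: the class name FormDataSketch and the method encodeForm are hypothetical, and it assumes the service decodes percent-encoded commas in the id value, as form decoding normally does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class FormDataSketch {
    /** Encode name/value pairs as application/x-www-form-urlencoded data. */
    static String encodeForm(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 support is required of every Java platform
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("db", "uniprotkb");
        params.put("id", "WAP_RAT,WAP_MOUSE"); // comma-separated identifier list
        params.put("format", "uniprot");
        params.put("style", "raw");
        System.out.println(encodeForm(params));
    }
}
```

The resulting string could be passed as the postStr argument of getHttpUrl in the client above. The comma in the identifier list is sent as %2C.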

Proxies

In some environments it is necessary to configure an HTTP proxy before a client can connect to external services. Java supports the configuration of proxies through:

  • Java system properties:
    • Provided to the JVM:
      java -Dhttp.proxyHost=proxy.example.org -Dhttp.proxyPort=8080 ExampleClientClass
    • Set in client code:
      System.setProperty("http.proxyHost", "proxy.example.org");
      System.setProperty("http.proxyPort", "8080");
  • Using the java.net.Proxy and java.net.ProxySelector classes.
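
The last option can be sketched as follows. The system properties apply to the whole JVM, whereas a java.net.Proxy object is passed to URL.openConnection and so applies to a single connection. The class name ProxySketch and its methods are hypothetical; proxy.example.org and port 8080 are the placeholder values used above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

// Hypothetical helper class, for illustration only.
class ProxySketch {
    /** Describe an HTTP proxy; the host name is not resolved at this point. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    /** Prepare a connection that will go through the given proxy once it is used. */
    static HttpURLConnection openVia(String urlStr, Proxy proxy) throws IOException {
        // openConnection only creates the connection object; no network
        // traffic takes place until the request is actually made.
        return (HttpURLConnection) URI.create(urlStr).toURL().openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = httpProxy("proxy.example.org", 8080);
        HttpURLConnection conn = openVia("http://www.ebi.ac.uk/Tools/dbfetch/dbfetch", proxy);
        System.out.println(proxy.type() + " proxy prepared for " + conn.getURL());
    }
}
```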

For details and examples see:

User-Agent

HTTP clients usually provide information about what they are, allowing services to handle specific clients differently if necessary, and giving service providers some information about how their services are being used. By default Java sets the HTTP User-Agent header (see RFC2616 section 14.43) to something like Java/1.6.0_13, where the version number (1.6.0_13) is the version of Java. If additional identification of the client is required a more specific product token (see RFC2616 section 3.8) should be added to the beginning of the User-Agent string:

// Modify the user-agent to add a more specific prefix (see RFC2616 section 14.43)
String clientUserAgent = "Example-Client/1.0 (" + System.getProperty("os.name") + ")";
if (System.getProperty("http.agent") != null) {
	System.setProperty("http.agent", clientUserAgent + " " + System.getProperty("http.agent"));
}
else {
	System.setProperty("http.agent", clientUserAgent);
}

Note: this method of setting the user-agent is global and affects all HTTP requests made using the core Java packages. Third-party packages may use other mechanisms to set the user-agent and so may be unaffected by this change.
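
Where a global change is not wanted, the header can instead be set on an individual connection with URLConnection.setRequestProperty. The sketch below is an illustration under that assumption; the class name UserAgentSketch and its methods are hypothetical, and Example-Client/1.0 is the placeholder product token used above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, for illustration only.
class UserAgentSketch {
    /** Put the client's product token in front of an existing agent string, if any. */
    static String userAgent(String productToken, String existingAgent) {
        if (existingAgent == null || existingAgent.isEmpty()) {
            return productToken;
        }
        return productToken + " " + existingAgent;
    }

    /** Prepare a connection that sends the given User-Agent for this request only. */
    static HttpURLConnection openWithUserAgent(String urlStr, String agent) {
        try {
            // No network traffic takes place until the request is actually made
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlStr).toURL().openConnection();
            conn.setRequestProperty("User-Agent", agent);
            return conn;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        String agent = userAgent("Example-Client/1.0", System.getProperty("http.agent"));
        HttpURLConnection conn =
            openWithUserAgent("http://www.ebi.ac.uk/Tools/dbfetch/dbfetch", agent);
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```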


1) RFC1945 - Hypertext Transfer Protocol – HTTP/1.0 - http://www.faqs.org/rfcs/rfc1945.html
2) RFC2616 - Hypertext Transfer Protocol – HTTP/1.1 - http://www.faqs.org/rfcs/rfc2616.html
 
tutorials/06_programming/java/rest/java.net.txt · Last modified: 2011/05/08 13:22 by hpm