WWW::OpenSearch

The OpenSearch specification (http://www.opensearch.org/) defines an API for interacting with search engines. This includes:

  • An XML format for a document describing how to interact with the search engine.
  • Extensions to the RSS or Atom web feed formats to carry search results.
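
In practice the description document gives a client a URL template in which placeholders such as {searchTerms} are replaced by the query values; a trailing "?" marks a placeholder as optional. The following is a minimal sketch of that substitution, not the code WWW::OpenSearch uses internally. The template, the example.org host and the helper names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical template, as it might appear in the "template" attribute of a
# Url element in a description document (example.org is a placeholder).
my $template =
    'http://example.org/search?q={searchTerms}&page={startPage?}&format=rss';

# Percent-encode a value for use in a query string. Simplification: this
# assumes plain ASCII input; real code would encode UTF-8 bytes.
sub uri_escape_simple {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Work out the replacement for one placeholder: the escaped value if one was
# supplied, an empty string for an optional placeholder, otherwise an error.
sub placeholder_value {
    my ($name, $optional, $param) = @_;
    return uri_escape_simple($param->{$name}) if defined $param->{$name};
    return '' if $optional;
    die "Missing required parameter '$name'\n";
}

# Replace every {name} or {name?} placeholder in the template.
sub fill_template {
    my ($tmpl, %param) = @_;
    $tmpl =~ s/\{([A-Za-z:]+)(\??)\}/placeholder_value($1, $2, \%param)/ge;
    return $tmpl;
}

# Prints: http://example.org/search?q=CFTR%20mouse&page=&format=rss
print fill_template($template, searchTerms => 'CFTR mouse'), "\n";
```

Here the optional startPage placeholder is left empty because no value was supplied, while omitting searchTerms would raise an error.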

OpenSearch is designed to support aggregation of search results. It was originally defined so that the A9 search engine could use external search engines without a specific module having to be developed for each engine's API. The availability of interface description documents has led to the adoption of the OpenSearch description format as the standard for defining search plug-ins in web browsers (e.g. Firefox). To aid in this, the specification defines a method for browsers to discover the search plug-in associated with a web page using a “search” link.
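
The discovery mechanism relies on a link element in the page header with rel="search" and the type application/opensearchdescription+xml. As a rough sketch of how a client might find such links, the following uses only core Perl and a regular expression. The page fragment and its URLs are made up for illustration, and a real client would use a proper HTML parser rather than pattern matching.

```perl
use strict;
use warnings;

# Hypothetical page fragment; the href and title values are placeholders.
my $html = <<'HTML';
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="search" type="application/opensearchdescription+xml"
      href="http://example.org/opensearch.xml" title="Example Search">
</head><body></body></html>
HTML

# Return the href of every link element that advertises an OpenSearch
# description document.
sub find_opensearch_links {
    my ($page) = @_;
    my @found;
    while ($page =~ /<link\b([^>]*)>/gi) {
        my $attrs = $1;
        # Collect quoted name="value" pairs (unquoted values are ignored).
        my %attr;
        while ($attrs =~ /([\w-]+)\s*=\s*(["'])(.*?)\2/gs) {
            $attr{lc $1} = $3;
        }
        next unless lc($attr{rel}  // '') eq 'search';
        next unless lc($attr{type} // '')
            eq 'application/opensearchdescription+xml';
        push @found, $attr{href} if defined $attr{href};
    }
    return @found;
}

# Prints: http://example.org/opensearch.xml
print "$_\n" for find_opensearch_links($html);
```

The URL found this way is what would then be passed to WWW::OpenSearch->new to obtain the search interface.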

Since OpenSearch uses web feeds to describe results, search results can also be read in feed readers, allowing users to be notified of changes in search results.

Installation

The WWW::OpenSearch modules can be installed by:

  1. Using the system package manager to install/update the appropriate package, for example libwww-opensearch-perl for Debian-based Linux systems (e.g. Ubuntu or Bio-Linux).
  2. Using CPAN to install/update WWW::OpenSearch (CPAN).
  3. Downloading the WWW::OpenSearch distribution and installing it manually.

For an overview of how to install a module, see Installing Perl Modules.
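
Whichever method is used, it can be useful to confirm that the modules load before running the example. The following is one possible check, not part of the WWW::OpenSearch distribution; the module_version helper is a name chosen here for illustration.

```perl
use strict;
use warnings;

# Try to load a module by name. Returns its version string, 'unknown' if it
# loads but declares no version, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    return $module->VERSION // 'unknown';
}

# Report on the modules needed by the example in the next section.
for my $module (qw(WWW::OpenSearch XML::Feed)) {
    my $version = module_version($module);
    print defined $version
        ? "$module $version is installed\n"
        : "$module is not installed\n";
}
```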

Using an OpenSearch Description

Given an OpenSearch description, a search can easily be performed and a summary of the results retrieved. For example, the following script uses an OpenSearch interface to SRS to get the entries containing the terms “CFTR mouse” from UniProtKB (examples/REST/WWW-OpenSearch/srs-uniprot_opensearch.pl):

# Load modules
use strict;
use warnings;
use WWW::OpenSearch;
use XML::Feed;
use XML::Feed::Enclosure;

# The feed returned by SRS uses multiple "enclosure" tags to provide
# pointers to additional data formats available from SRS. Thus XML::Feed
# has to be configured to allow multiple enclosures per item in the feed.
$XML::Feed::MULTIPLE_ENCLOSURES = 1;
 
# Pointer to OpenSearch XML description for UniProtKB in SRS
my $opensearch_url = 
    'http://srs.ebi.ac.uk/srsbin/cgi-bin/opensearch?db=uniprot';
 
# Get search interface from the description
my $engine = WWW::OpenSearch->new($opensearch_url);
 
# The service description contains some general information:
# A name for the search service.
print $engine->description->ShortName, "\n";
# A description.
print $engine->description->Description, "\n";
 
# Perform a search
my $response = $engine->search('CFTR mouse');
 
# The result is an RSS (or Atom) feed, with some additional tags. These
# provide additional details, such as the number of hits found.
print $response->pager->total_entries, ' items found', "\n";
 
# The results are paged, so to get all the results loop through the pages.
while ($response) {
    # The "item" elements in the feed contain summary data about the hit.
    foreach my $item ($response->feed->items) {
        # For this feed the content is the SRS identifier and the title
        # a description of the entry.
        print $item->content->body, "\t", $item->title, "\n";
        # A link to the data for the hit. In this case the main entry view
        # in SRS.
        print "\t", $item->link, "\n";
        # Additional data can be associated with the item using an
        # "enclosure". In this case multiple enclosures are used to
        # point to other entry formats, e.g. plain text flatfile or
        # flatfile with hyperlinks. The type provides the MIME type of
        # the data.
        foreach my $enclosure ($item->enclosure) {
            print "\t", $enclosure->type, "\t", $enclosure->url, "\n";
        }
    }
    # Move on to the next page of results; the loop ends when there is none.
    $response = $response->next_page;
}


tutorials/06_programming/perl/rest/www-opensearch.txt · Last modified: 2010/02/20 15:00 by hpm