What is it?
Whatizit is a text processing system that lets you run text-mining tasks on text. The
available tasks are defined by the pipelines in the drop-down list of the window above, and
the text can be pasted into the text area. A description of each individual task/pipeline can
be found by following the link next to the submit button. Whatizit is also a retrieval/search
engine for Medline abstracts: instead of providing the text by copy and paste, you can launch
a Medline search. The abstracts that match your search criteria are retrieved and processed
by a pipeline of your choice. Take a look at the query syntax page to see how queries are written.
Whatizit is great at identifying molecular biology terms and linking them to
publicly available databases. Identified terms are wrapped with XML tags
that carry additional information, such as the primary keys to the databases where
all the relevant information is kept. The wrapping XML is translated into HTML
hypertext links. This service is highly appreciated by people who are reading
literature and need to quickly find more information about a particular term,
e.g. its Uniprot id.
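
The wrapping step described above can be illustrated with a small Python sketch. It is not Whatizit's own code: the `<term>` tag, its `db` and `id` attributes, the vocabulary entries and the `example.org` link scheme are all assumptions made up for this illustration.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical vocabulary: surface form -> (database name, primary key).
# The keys are placeholders, not real database identifiers.
VOCABULARY = {
    "p53": ("uniprot", "EXAMPLE_ID_1"),
    "apoptosis": ("go", "EXAMPLE_ID_2"),
}


def annotate(text, vocabulary):
    """Wrap each known term in a <term> tag carrying its database key."""
    if not vocabulary:
        return escape(text)
    # Longest terms first, so a long name wins over a shorter prefix.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    pieces = []
    last = 0
    for match in pattern.finditer(text):
        # Copy the untouched text before the match, XML-escaped.
        pieces.append(escape(text[last:match.start()]))
        db, key = vocabulary[match.group(1)]
        pieces.append(
            f"<term db={quoteattr(db)} id={quoteattr(key)}>"
            f"{escape(match.group(1))}</term>"
        )
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)


def to_html(annotated):
    """Translate <term> tags into hyperlinks (the URL scheme is invented)."""
    return re.sub(
        r'<term db="([^"]*)" id="([^"]*)">(.*?)</term>',
        r'<a href="https://example.org/\1/\2">\3</a>',
        annotated,
    )
```

Calling `to_html(annotate("p53 regulates apoptosis.", VOCABULARY))` turns both terms into links while leaving the rest of the sentence unchanged.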
Whatizit is also available as 1) a webservice and 2) a streamed servlet. The webservice
allows you to enrich the content of your website in a way similar to Wikipedia.
The streamed servlet allows you to process large amounts of text.
In general, any vocabulary in the range of up to 500k terms can be easily integrated
into a Whatizit pipeline. Whatizit is also very good at identifying formalized language
patterns, i.e. specialized, syntactically formalized technical notation. You can talk to us
about particular needs. The annotation speed of a given pipeline is almost independent of
the size of the vocabulary behind it. Annotation is currently based on pattern matching; as a
result, quite a few spurious matches are highlighted, because many terms, e.g. protein names,
resemble normal English words or acronyms that also have other meanings. We are actively
working on the disambiguation of these terms. In addition, several vocabularies can be
integrated into a single pipeline, as is the case for Swissprot and GO terms in the
whatizitSwissprotGo pipeline.
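
The properties above can be pictured with a minimal Python sketch. It assumes a plain hash-table lookup over word n-grams, which is not necessarily how Whatizit matches terms; the vocabularies, keys and function names are hypothetical. It shows how two vocabularies can share one lookup table, why each probe costs roughly the same whatever the vocabulary size, and how a term that coincides with an ordinary English word yields a spurious match.

```python
# Hypothetical vocabularies: term -> (source, primary key). All entries are invented.
PROTEINS = {
    "cat": ("protein", "HYPOTHETICAL_1"),
    "tumor protein": ("protein", "HYPOTHETICAL_2"),
}
GO_TERMS = {
    "cell death": ("go", "HYPOTHETICAL_3"),
}


def merge_vocabularies(*vocabularies):
    """Combine several term -> (source, key) mappings into one lookup table."""
    merged = {}
    for vocab in vocabularies:
        for term, entry in vocab.items():
            # A term may appear in several vocabularies, so keep a list.
            merged.setdefault(term.lower(), []).append(entry)
    return merged


def find_terms(text, lookup, max_words=3):
    """Return (word_index, term, entries) for every n-gram found in lookup."""
    words = text.lower().split()
    hits = []
    for i in range(len(words)):
        # Try the longest candidate first; each probe is a single dict lookup,
        # so the cost per word does not grow with the size of the vocabulary.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in lookup:
                hits.append((i, candidate, lookup[candidate]))
                break
    return hits
```

With `lookup = merge_vocabularies(PROTEINS, GO_TERMS)`, the sentence "the cat triggers cell death" yields a hit for "cell death" and also one for "cat". The second hit is the kind of spurious match that disambiguation would have to remove when the text is about the animal.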
Examples of already integrated vocabularies are Swissprot, the
Gene Ontology (GO), the NCBI taxonomy, and Medline Plus. If you have
any questions, suggestions or requests, do not hesitate to contact us.