
What is it?

Whatizit is a text processing system that lets you run text-mining tasks on text. The available tasks are defined by the pipelines in the drop-down list of the window above, and the text can be pasted into the text area. A description of each individual task/pipeline can be found by following the link next to the submit button. Whatizit is also a Medline abstract retrieval/search engine: instead of providing the text by copy and paste, you can launch a Medline search. The abstracts that match your search criteria are retrieved and processed by a pipeline of your choice. See query syntax for details on how to write a query.

Whatizit is well suited to identifying molecular biology terms and linking them to publicly available databases. Identified terms are wrapped in XML tags that carry additional information, such as the primary keys of the database entries where the relevant information is kept. The wrapping XML is then translated into HTML hypertext links. This service is highly appreciated by people who are reading literature and need to quickly find more information about a particular term, e.g. its UniProt ID.
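The translation from wrapping XML to HTML links can be sketched as follows. This is only an illustration under stated assumptions, not Whatizit's actual output format: the `<term db="..." id="...">` tag, the `URL_TEMPLATES` table, the `example.org` URLs, and the function name `xml_to_html` are all hypothetical.

```python
import re
from html import escape

# Hypothetical URL templates; the real database URLs are not given here.
URL_TEMPLATES = {
    "uniprot": "https://example.org/uniprot/{id}",
    "go": "https://example.org/go/{id}",
}

# Hypothetical annotation tag: <term db="..." id="...">surface text</term>
TERM_TAG = re.compile(r'<term db="([^"]+)" id="([^"]+)">(.*?)</term>')


def xml_to_html(annotated: str) -> str:
    """Replace each annotation tag with a hypertext link to its database entry."""

    def to_link(match: re.Match) -> str:
        db, key, text = match.groups()
        template = URL_TEMPLATES.get(db)
        if template is None:
            # Unknown database: drop the tag and keep the plain text.
            return text
        url = template.format(id=key)
        return f'<a href="{escape(url, quote=True)}">{text}</a>'

    return TERM_TAG.sub(to_link, annotated)


example = 'The <term db="uniprot" id="P04637">p53</term> protein'
linked = xml_to_html(example)
```

The primary key carried by the tag is all that is needed to build the link, which is why the annotation step and the rendering step can be kept separate.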

Whatizit is also available as 1) a webservice and 2) a streamed servlet. The webservice allows you to enrich content within your website with links, in a similar way to Wikipedia. The streamed servlet allows you to process large amounts of text.

In general, any vocabulary of up to about 500k terms can be easily integrated into a Whatizit pipeline. Whatizit is also well suited to identifying formalized language patterns, that is, specialized technical notation with a formalized syntax. You can talk to us about particular needs. Annotation is currently based on pattern matching, and the annotation speed of a given pipeline is almost independent of the size of the vocabulary behind it. A consequence of pattern matching is that quite a few spurious matches are highlighted, because many terms (e.g. protein names) resemble normal English words or acronyms that also have other meanings. We are actively working on the disambiguation of these terms. In addition, several vocabularies can be integrated into a single pipeline, as is the case for Swissprot and GO terms in the whatizitSwissprotGo pipeline.
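As a rough illustration of why vocabulary-based pattern matching can be nearly independent of vocabulary size, the sketch below merges two small vocabularies into one token-level trie and annotates text with the longest match at each position. It is a minimal sketch, not Whatizit's implementation: the function names `build_trie` and `annotate`, the tiny vocabularies, and the identifiers in them are assumptions chosen for illustration.

```python
import re

TOKEN = re.compile(r"\w+")


def build_trie(vocabularies):
    """Merge several {term: key} vocabularies into one token-level trie."""
    trie = {}
    for source, terms in vocabularies.items():
        for term, key in terms.items():
            node = trie
            for token in TOKEN.findall(term.lower()):
                node = node.setdefault(token, {})
            # None marks the end of a term; tokens are always strings.
            node.setdefault(None, []).append((source, key))
    return trie


def annotate(text, trie):
    """Return (matched text, [(source, key), ...]) for the longest matches, left to right."""
    tokens = TOKEN.findall(text)
    hits, i = [], 0
    while i < len(tokens):
        node, j, best = trie, i, None
        # Each step is one dictionary lookup, regardless of vocabulary size.
        while j < len(tokens) and tokens[j].lower() in node:
            node = node[tokens[j].lower()]
            j += 1
            if None in node:
                best = (j, node[None])
        if best:
            end, entries = best
            hits.append((" ".join(tokens[i:end]), entries))
            i = end
        else:
            i += 1
    return hits


# Illustrative vocabularies only; the identifiers are example values.
vocabularies = {
    "swissprot": {"tumor protein p53": "P04637", "cat": "P04040"},
    "go": {"apoptotic process": "GO:0006915"},
}
hits = annotate(
    "Tumor protein p53 regulates the apoptotic process; the cat sat.",
    build_trie(vocabularies),
)
```

In this sketch the work per token is bounded by the length of the longest term, not by the number of terms. The last hit also shows the spurious-match problem: the ordinary English word "cat" is tagged because it coincides with a protein name in the vocabulary.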

Examples of already integrated vocabularies are Swissprot, the Gene Ontology (GO), the NCBI taxonomy, and Medline Plus. If you have any questions, suggestions, or requests, do not hesitate to contact us.
