5,146 research outputs found
Compressed Text Indexes:From Theory to Practice!
A compressed full-text self-index represents a text in a compressed form and
still answers queries efficiently. This technology represents a breakthrough
over the text indexing techniques of the previous decade, whose indexes
required several times the size of the text. Although it is relatively new,
this technology has matured up to a point where theoretical research is giving
way to practical developments. Nonetheless this requires significant
programming skills, a deep engineering effort, and a strong algorithmic
background to dig into the research results. To date only isolated
implementations and focused comparisons of compressed indexes have been
reported, and they missed a common API, which prevented their re-use or
deployment within other applications.
The goal of this paper is to fill this gap. First, we present the existing
implementations of compressed indexes from a practitioner's point of view.
Second, we introduce the Pizza&Chili site, which offers tuned implementations
and a standardized API for the most successful compressed full-text
self-indexes, together with effective testbeds and scripts for their automatic
validation and test. Third, we show the results of our extensive experiments on
these codes with the aim of demonstrating the practical relevance of this novel
and exciting technology
Matching Queries to Frequently Asked Questions: Search Functionality for the MRSA Web-Portal
As part of the long-term EUREGIO MRSA-net project a system was developed which enables health care workers and the general public to quickly find answers to their questions regarding the MRSA pathogen. This paper focuses on how these questions can be answered using Information Retrieval (IR) and Natural Language Processing (NLP) techniques on a Frequently-Asked-Questions-style (FAQ) database
- …