Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications such as search engines, has familiarized casual users with keyword queries as a way to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
Truly Subquadratic-Time Extension Queries and Periodicity Detection in Strings with Uncertainties
Strings with don't care symbols, also called partial words, and more general indeterminate strings are a natural representation of strings containing uncertain symbols. A considerable effort has been made to obtain efficient algorithms for pattern matching and periodicity detection in such strings. Among those, a number of algorithms have been proposed that behave well on random data, but still their worst-case running time is Theta(n^2). We present the first truly subquadratic-time solutions for a number of such problems on partial words that can also be adapted to indeterminate strings over a constant-sized alphabet. We show that longest common compatible prefix queries (which correspond to longest common extension queries in regular strings) can be answered on-line in O(n * sqrt(n * log(n))) time after O(n * sqrt(n * log(n)))-time preprocessing. We also present O(n * sqrt(n * log(n)))-time algorithms for computing the prefix array and two types of border array of a partial word.
Generic Subsequence Matching Framework: Modularity, Flexibility, Efficiency
Subsequence matching appears to be an ideal approach for solving many problems related to the fields of data mining and similarity retrieval. It has been shown that almost any data class (audio, image, biometrics, signals) is or can be represented by some kind of time series or string of symbols, which can be seen as an input for various subsequence matching approaches. The variety of data types, specific tasks and their partial or full solutions is so wide that the choice, implementation and parametrization of a suitable solution for a given task might be complicated and time-consuming; a possibly fruitful combination of fragments from different research areas may be neither obvious nor easy to realize. The leading authors of this field also mention the implementation bias that makes a proper comparison of competing approaches difficult. Therefore we present a new generic Subsequence Matching Framework (SMF) that tries to overcome the aforementioned problems with a uniform frame that simplifies and speeds up the design, development and evaluation of subsequence matching related systems. We identify several relatively separate subtasks that are solved differently across the literature, and SMF makes it possible to combine them in a straightforward manner, achieving new quality and efficiency. This framework can be used in many application domains and its components can be reused effectively. Its strictly modular architecture and openness also enable the integration of efficient solutions from different fields, for instance efficient metric-based indexes. This is an extended version of a paper published at DEXA 2012.