83 research outputs found
Quasi-SLCA based Keyword Query Processing over Probabilistic XML Data
The probabilistic threshold query is one of the most common queries in
uncertain databases: a result satisfying the query must also have a
probability that meets the threshold requirement. In this paper, we investigate
probabilistic threshold keyword queries (PrTKQ) over XML data, which have not
been studied before. We first introduce the notion of quasi-SLCA and use it to
represent results for a PrTKQ under possible world
semantics. Then we design a probabilistic inverted (PI) index that can be used
to quickly return the qualified answers and filter out the unqualified ones
based on our proposed lower/upper bounds. After that, we propose two efficient
and comparable algorithms: a Baseline Algorithm and a PI index-based Algorithm. To
speed up the algorithms, we also utilize a probability density
function. An empirical study using real and synthetic data sets has verified
the effectiveness and the efficiency of our approaches.
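The bound-based filtering mentioned in this abstract can be illustrated with a generic sketch. This is not the paper's PI index or its quasi-SLCA computation: the `Candidate` type, the `threshold_filter` function, and the `exact_probability` callback are hypothetical names, and the bounds are assumed to be given. The sketch only shows the general idea that a lower bound at or above the threshold accepts a candidate immediately, an upper bound below the threshold rejects it, and only the remaining candidates need an exact probability computation.

```python
from typing import Callable, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate result with precomputed probability bounds."""
    node_id: str
    lower: float  # assumed lower bound on the result probability
    upper: float  # assumed upper bound on the result probability


def threshold_filter(
    candidates: Iterable[Candidate],
    threshold: float,
    exact_probability: Callable[[Candidate], float],
) -> List[str]:
    """Return ids of candidates whose probability meets the threshold.

    The (possibly expensive) exact computation is only invoked when the
    bounds alone cannot decide the outcome.
    """
    accepted = []
    for cand in candidates:
        if cand.lower >= threshold:
            # Lower bound already qualifies: accept without exact computation.
            accepted.append(cand.node_id)
        elif cand.upper < threshold:
            # Upper bound cannot reach the threshold: prune.
            continue
        elif exact_probability(cand) >= threshold:
            # Bounds are inconclusive: fall back to the exact value.
            accepted.append(cand.node_id)
    return accepted
```

In this sketch, the tighter the bounds are, the fewer candidates reach the exact computation, which is the general motivation for bound-based pruning.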
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications such as search engines, has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
TopX : efficient and versatile top-k query processing for text, structured, and semistructured data
TopX is a top-k retrieval engine for text and XML data. Unlike Boolean engines, it stops query processing as soon as it can safely determine the k top-ranked result objects according to a monotonic score aggregation function with respect to a multidimensional query. The main contributions of the thesis fall into four areas, each backed by previous publications at international conferences or workshops:
• Top-k query processing with probabilistic guarantees.
• Index-access optimized top-k query processing.
• Dynamic and self-tuning, incremental query expansion for top-k query
processing.
• Efficient support for ranked XML retrieval and full-text search.
Our experiments demonstrate the viability and improved efficiency of our approach compared to existing related work for a broad variety of retrieval scenarios.
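The early-stopping behaviour described in this abstract can be illustrated with a minimal sketch of the classic threshold algorithm for top-k retrieval over score-sorted lists. This is not TopX's implementation: the function name `threshold_topk`, the summation as the monotonic aggregation function, and the assumption that a missing entry scores 0.0 are all choices made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

ScoredList = List[Tuple[str, float]]  # (doc id, score), sorted by score descending


def threshold_topk(lists: List[ScoredList], k: int) -> Tuple[List[Tuple[str, float]], int]:
    """Return the top-k (doc, total score) pairs and the scan depth reached.

    Scans all lists in parallel, one position per round. After each round,
    the sum of the scores at the current position bounds the total score of
    any document not yet seen. Once the k-th best seen score reaches that
    bound, no unseen document can enter the top k, so scanning stops.
    """
    lookup: List[Dict[str, float]] = [dict(lst) for lst in lists]  # random access
    seen: Dict[str, float] = {}
    max_depth = max((len(lst) for lst in lists), default=0)
    depth = 0
    while depth < max_depth:
        bound = 0.0
        for lst in lists:
            if depth < len(lst):
                doc, score = lst[depth]
                bound += score
                if doc not in seen:
                    # Aggregate the full score via random access to every list.
                    seen[doc] = sum(lk.get(doc, 0.0) for lk in lookup)
        depth += 1
        top = heapq.nlargest(k, seen.items(), key=lambda item: item[1])
        if len(top) == k and top[-1][1] >= bound:
            return top, depth  # safe to stop early
    return heapq.nlargest(k, seen.items(), key=lambda item: item[1]), depth


# Illustrative data: two hypothetical per-term score lists.
term_a = [("d1", 0.9), ("d2", 0.8), ("d3", 0.1), ("d4", 0.05)]
term_b = [("d2", 0.9), ("d1", 0.7), ("d4", 0.2), ("d3", 0.1)]
best, depth_reached = threshold_topk([term_a, term_b], k=1)
```

With the illustrative lists above, the scan stops before reaching the end of either list, because the bound on unseen documents drops below the best score already found.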
31st International Conference on Very Large Data Bases
A news report on the 31st International Conference on Very Large Data Bases, which has taken place
- …