06472 Abstracts Collection - XQuery Implementation Paradigms
From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 "XQuery Implementation Paradigms" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.
Reasoning & Querying – State of the Art
Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications such as search engines, has familiarized casual users with using keyword queries to retrieve information. Unlike this easy-to-use form of querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
Content-Aware DataGuides for Indexing Large Collections of XML Documents
XML is well-suited for modelling structured data with
textual content. However, most indexing approaches perform
structure and content matching independently, combining
the retrieved path and keyword occurrences in a third
step. This paper shows that retrieval in XML documents can
be accelerated significantly by processing text and structure
simultaneously during all retrieval phases. To this end,
the Content-Aware DataGuide (CADG) enhances the well-known
DataGuide with (1) simultaneous keyword and path
matching and (2) a precomputed content/structure join. Extensive
experiments prove the CADG to be 50-90% faster
than the DataGuide for various sorts of query and document,
including difficult cases such as poorly structured
queries and recursive document paths. A new query classification
scheme identifies precise query characteristics with
a predominant influence on the performance of the individual
indices. The experiments show that the CADG is applicable
to many real-world applications, in particular large
collections of heterogeneously structured XML documents.
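The idea of matching keywords and paths together can be illustrated with a minimal sketch. It is not the CADG data structure from the paper: the function names `build_index` and `match`, the whitespace tokenization, and the suffix-based path matching are all assumptions made for illustration. The sketch builds a plain path summary in the spirit of a DataGuide and, alongside it, a map from each keyword to the label paths under which it occurs, so a query can be answered in one lookup instead of a separate content/structure join.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Return (paths, path_keywords) for one XML document.

    paths:         label path -> number of elements on that path
    path_keywords: keyword    -> set of label paths whose direct text contains it
    """
    root = ET.fromstring(xml_text)
    paths = defaultdict(int)
    path_keywords = defaultdict(set)

    def visit(elem, prefix):
        path = prefix + "/" + elem.tag
        paths[path] += 1
        # Hypothetical tokenization: lowercase, split on whitespace.
        for word in (elem.text or "").lower().split():
            path_keywords[word].add(path)
        for child in elem:
            visit(child, path)

    visit(root, "")
    return dict(paths), dict(path_keywords)


def match(path_keywords, path_suffix, keyword):
    """Label paths that end with path_suffix and contain keyword in their text."""
    candidates = path_keywords.get(keyword.lower(), ())
    return sorted(p for p in candidates if p.endswith(path_suffix))


doc = (
    "<lib>"
    "<book><title>XML indexing</title></book>"
    "<article><title>RDF stores</title></article>"
    "</lib>"
)
paths, path_keywords = build_index(doc)
print(match(path_keywords, "/title", "xml"))
```

Here the keyword lookup already narrows the candidate paths before any structural test is applied, which is the intuition behind processing text and structure simultaneously.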
Type-Based Detection of XML Query-Update Independence
This paper presents a novel static analysis technique to detect XML
query-update independence, in the presence of a schema. Rather than types, our
system infers chains of types. Each chain represents a path that can be
traversed on a valid document during query/update evaluation. The resulting
independence analysis is precise, although it raises a challenging issue:
recursive schemas may lead to inferring infinitely many chains. A sound and
complete approximation technique ensuring a finite analysis in any case is
presented, together with an efficient implementation performing the chain-based
analysis in polynomial space and time. Comment: VLDB201
Four Lessons in Versatility or How Query Languages Adapt to the Web
Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled new kinds of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposure has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several, each of considerable complexity. With Xcerpt we have developed a rule- and pattern-based query language that aims to shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply to querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”.
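The interval labelings mentioned above can be illustrated for the simplest case, tree-shaped data. The following is a minimal sketch of the classic start/end interval scheme, not Xcerpt's evaluation algorithm; the names `label_tree` and `is_descendant` and the adjacency-dictionary input are assumptions, and the extension to RDF graphs described in the abstract is not covered. Each node receives an interval during one depth-first traversal, after which an ancestor/descendant test reduces to two integer comparisons.

```python
def label_tree(children, root):
    """Assign each node a (start, end) interval by depth-first traversal.

    children: dict mapping a node to the list of its child nodes.
    A node's interval strictly contains the intervals of all its descendants.
    """
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        labels[node] = (start, counter)
        counter += 1

    visit(root)
    return labels


def is_descendant(labels, node, ancestor):
    """True if node lies strictly below ancestor in the labelled tree."""
    start, end = labels[node]
    anc_start, anc_end = labels[ancestor]
    return anc_start < start and end < anc_end


tree = {"a": ["b", "c"], "b": ["d"]}
labels = label_tree(tree, "a")
print(labels)
print(is_descendant(labels, "d", "a"))
```

Labelling takes time and space linear in the size of the tree, and each structural test afterwards is constant time. The recursive traversal is kept for brevity; very deep documents would need an iterative version.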
The ViP2P Platform: XML Views in P2P
The growing volumes of XML data sources on the Web or produced by
enterprises, organizations etc. raise many performance challenges for data
management applications. In this work, we are concerned with the distributed,
peer-to-peer management of large corpora of XML documents, based on distributed
hash table (or DHT, in short) overlay networks. We present ViP2P (standing for
Views in Peer-to-Peer), a distributed platform for sharing XML documents based
on a structured P2P network infrastructure (DHT). At the core of ViP2P stand
distributed materialized XML views, defined by arbitrary XML queries, filled in
with data published anywhere in the network, and exploited to efficiently
answer queries issued by any network peer. ViP2P allows user queries to be
evaluated over XML documents published by peers in two modes. First, a
long-running subscription mode, when a query can be registered in the system
and receive answers incrementally when and if published data matches the query.
Second, queries can also be asked in an ad-hoc, snapshot mode, where results
are required immediately and must be computed based on the results of other
long-running, subscription queries. ViP2P innovates over other similar
DHT-based XML sharing platforms by using a very expressive structured XML query
language. This expressivity leads to a very flexible distribution of XML
content in the ViP2P network, and to efficient snapshot query execution. ViP2P
has been tested in real deployments of hundreds of computers. We present the
platform architecture, its internal algorithms, and demonstrate its efficiency
and scalability through a set of experiments. Our experiments exceed those of
similar competitor systems by orders of magnitude in terms of data volumes,
network size and data dissemination throughput. Comment: RR-7812 (2011
RDF Querying
Reactive Web systems, Web services, and Web-based publish/subscribe
systems communicate events as XML messages, and in
many cases require composite event detection: it is not sufficient to react
to single event messages, but events have to be considered in relation to
other events that are received over time.
Emphasizing language design and formal semantics, we describe the
rule-based query language XChangeEQ for detecting composite events.
XChangeEQ is designed to completely cover and integrate the four complementary
querying dimensions: event data, event composition, temporal
relationships, and event accumulation. Semantics are provided as
model and fixpoint theories; while this is an established approach for rule
languages, it has not been applied to event queries before.
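What composite event detection means in practice can be shown with a small sketch. It is written in Python rather than in XChangeEQ, whose syntax is not given here, and it covers only one narrow pattern: an event of one type followed by an event of another type with the same key within a time window. The function name `detect_pairs`, the `(timestamp, type, key)` tuple layout, and the sample event types are assumptions made for illustration.

```python
def detect_pairs(events, first, second, window):
    """Detect `first` followed by `second` for the same key within `window`.

    events: iterable of (timestamp, event_type, key), sorted by timestamp.
    Returns a list of (key, first_timestamp, second_timestamp) matches.
    """
    pending = {}  # key -> timestamp of the latest unmatched `first` event
    matches = []
    for timestamp, event_type, key in events:
        if event_type == first:
            pending[key] = timestamp
        elif event_type == second and key in pending:
            # Temporal relationship: the pair only counts inside the window.
            if timestamp - pending[key] <= window:
                matches.append((key, pending[key], timestamp))
            del pending[key]
    return matches


stream = [
    (0, "order", "o1"),
    (1, "order", "o2"),
    (3, "payment", "o1"),
    (10, "payment", "o2"),
    (11, "payment", "o3"),
]
print(detect_pairs(stream, "order", "payment", window=5))
```

No single message in the stream answers the query on its own; the result depends on relating events received at different times, which is the situation the abstract describes.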