An XML Query Engine for Network-Bound Data

Author: Halevy Alon Y
Ives Zachary G
Weld Daniel S
Publication venue: ScholarlyCommons
Publication date: 01/01/2001

XML has become the lingua franca for data exchange and integration across administrative and enterprise boundaries. Nearly all data providers are adding XML import or export capabilities, and standard XML Schemas and DTDs are being promoted for all types of data sharing. The ubiquity of XML has removed one of the major obstacles to integrating data from widely disparate sources, namely the heterogeneity of data formats. However, general-purpose integration of data across the wide area also requires a query processor that can query data sources on demand, receive streamed XML data from them, and combine and restructure the data into new XML output, while providing good performance for both batch-oriented and ad-hoc, interactive queries. This is the goal of the Tukwila data integration system, the first system that focuses on network-bound, dynamic XML data sources. In contrast to previous approaches, which must read, parse, and often store entire XML objects before querying them, Tukwila can return query results even as the data is streaming into the system. Tukwila is built with a new system architecture that extends adaptive query processing and relational-engine techniques into the XML realm, as facilitated by a pair of operators that incrementally evaluate a query’s input path expressions as data is read. In this paper, we describe the Tukwila architecture and its novel aspects, and we experimentally demonstrate that Tukwila provides better overall query performance and faster initial answers than existing systems, and has excellent scalability.

CiteSeerX

ScholarlyCommons@Penn

Overview of query optimization in XML database systems

Author: Abdel Kader R.
van Keulen Maurice
Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 12/11/2007

University of Twente Research Information

XPath: Looking Forward

Author: Bry François
Furche Tim
Meuss Holger
Schaffert Dan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2002

The location path language XPath is of particular importance for XML applications since it is a core component of many XML processing standards such as XSLT or XQuery. In this paper, based on axis symmetry of XPath, equivalences of XPath 1.0 location paths involving reverse axes, such as ancestor and preceding, are established. These equivalences are used as rewriting rules in an algorithm for transforming location paths with reverse axes into equivalent reverse-axis-free ones. Location paths without reverse axes, as generated by the presented rewriting algorithm, enable efficient SAX-like streamed data processing of XPath.

CiteSeerX

Open Access LMU

Pathfinder: relational XQuery over multi-gigabyte XML inputs in interactive time

Author: Boncz P.A. (Peter)
Grust T.
Manegold S. (Stefan)
Rittinger J.
Teubner J. (Jens)
Publication venue: CWI
Publication date: 01/01/2005

Using a relational DBMS as back-end engine for an XQuery processing system leverages relational query optimization and scalable query processing strategies provided by mature DBMS engines in the XML domain. Though a lot of theoretical work has been done in this area and various solutions have been proposed, no complete systems have been made available so far to give the practical evidence that this is a viable approach. In this paper, we describe the purely relational XQuery processor Pathfinder that has been built on top of the extensible RDBMS MonetDB. Performance results indicate that the system is capable of evaluating XQuery queries efficiently, even if the input XML documents become huge. We additionally present further contributions such as loop-lifted staircase join, techniques to derive order properties and to reduce sorting effort in the generated relational algebra plans, as well as methods for optimizing XQuery joins, which, taken together, enabled us to reach our performance and scalability goal.

CWI's Institutional Repository

XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme

Author: A. Balmin
A. Chebotko
D. Florescu
D. Kossmann
H. Gupta
H. Gupta
H.V. Jagadish
M. Atay
M.R. Garey
R. Chirkova
S. Abiteboul
S. Chaudhuri
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Query evaluation in an XML database requires reconstructing XML subtrees rooted at nodes found by an XML query. Since XML subtree reconstruction can be expensive, one approach to improve query response time is to use reconstruction views - materialized XML subtrees of an XML document, whose nodes are frequently accessed by XML queries. For this approach to be efficient, the principal requirement is a framework for view selection. In this work, we are the first to formalize and study the problem of XML reconstruction view selection. The input is a tree $T$, in which every node $i$ has a size $c_i$ and profit $p_i$, and the size limitation $C$. The target is to find a subset of subtrees rooted at nodes $i_1,\cdots, i_k$ respectively such that $c_{i_1}+\cdots +c_{i_k}\le C$, and $p_{i_1}+\cdots +p_{i_k}$ is maximal. Furthermore, there is no overlap between any two subtrees selected in the solution. We prove that this problem is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) as a solution.
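The problem statement in this abstract can be illustrated with a small sketch. The code below is not the paper's FPTAS; it is a plain exact dynamic program that assumes non-negative integer sizes and runs in time that grows with the capacity, which is consistent with the problem being NP-hard in general. The function name `select_views`, the dictionary-based tree encoding, and the example numbers are all hypothetical.

```python
def select_views(children, size, profit, root, capacity):
    """Maximum total profit of non-overlapping subtrees whose sizes sum to <= capacity.

    Illustrative exact DP (assumes non-negative integer sizes); NOT the paper's FPTAS.
    children: dict mapping a node to a list of its child nodes
    size, profit: dicts mapping a node i to c_i and p_i
    """
    def best(v):
        # table[b] = best profit achievable inside the subtree of v with budget b
        table = [0] * (capacity + 1)
        for child in children.get(v, []):
            child_table = best(child)
            # Knapsack-style merge: split budget b between earlier children and this child
            table = [
                max(table[b - k] + child_table[k] for k in range(b + 1))
                for b in range(capacity + 1)
            ]
        # Alternative: select the whole subtree rooted at v, which rules out
        # selecting any descendant (no overlap allowed)
        for b in range(size[v], capacity + 1):
            table[b] = max(table[b], profit[v])
        return table

    return best(root)[capacity]


# Hypothetical example tree: node 0 has children 1 and 2; node 1 has children 3 and 4
children = {0: [1, 2], 1: [3, 4]}
size = {0: 10, 1: 6, 2: 3, 3: 2, 4: 3}
profit = {0: 12, 1: 7, 2: 4, 3: 3, 4: 5}

print(select_views(children, size, profit, root=0, capacity=8))
```

With capacity 8, the root (size 10) does not fit and nodes 1 and 2 together (size 9) do not fit either, so the best choice in this example is the three disjoint subtrees rooted at 2, 3, and 4.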

arXiv.org e-Print Archive

Crossref

The XQueC Project: Compressing and Querying XML

Author: Arion Andrei
Bonifati Angela
Manolescu Ioana
Pugliese Andrea
Publication venue: Dagstuhl Seminar Proceedings. 08261 - Structure-Based Compression of Complex Massive Data
Publication date: 01/01/2008

Dagstuhl Research Online Publication Server

The relational XQuery puzzle: a look-back on the pieces found so far

Author: Teubner Jens
Publication date: 18/06/2018

Given the tremendous versatility of relational database implementations toward a wide range of database problems, it seems only natural to consider them as back-ends for XML data processing. Yet, the assumptions behind the language XQuery are considerably different to those in traditional RDBMSs. The underlying data model is a tree, data and results carry an intrinsic order, queries are described using explicit iteration and, after all, problems are everything else but regular. Solving the relational XQuery puzzle, therefore, has challenged a number of research groups over the past years. The purpose of this article is to summarize and assess some of the results that have been obtained during this period to solve the puzzle. Our main focus is on the Pathfinder XQuery compiler, a full reference implementation of a purely relational XQuery processor. As we dissect its components, we relate them to other work in the field and also point to open problems and limitations in the context of relational XQuery processing.

RERO DOC Digital Library

Scalable XQuery type matching

Author: Jens Teubner
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2008

XML Schema awareness has been an integral part of the XQuery language since its early design stages. Matching XML data against XML types is the main operation that backs up XQuery type expressions, such as typeswitch, instance of, or certain XPath operators. This interaction is particularly vital in data-centric XQuery applications, where data come with detailed type information from an XML Schema document. So far there has been little work on the optimization of those operations. This work presents an efficient implementation of the runtime aspects of XML Schema support. We propose type ranks as a novel and uniform way to implement all facets of type matching in the W3C XQuery Recommendation. As a concise encoding of the type hierarchy defined by an XML Schema document, type ranks minimize the cost of checking the runtime type of XQuery singleton items. By aggregating type ranks, we leverage the grouping capabilities of modern DBMS implementations to efficiently execute type matching on XQuery sequences. In addition, we improve on the complexity bounds that existing approaches incur for typeswitch expressions. Experiments on an off-the-shelf database system demonstrate the potential of our approach.

CiteSeerX

Crossref