An extended preorder index for optimising XPath expressions
Many of the problems with native XML databases relate to query performance, and consequently it can be difficult to convince traditional database users of the benefits of using semi-structured or unstructured databases. At present, there is still no index structure that efficiently supports both structural queries and the traditional data-centric and content queries. This paper presents an extended index structure based on the preorder traversal rank and the level (or depth) rank of each node in a document tree. The extended index fully supports the navigation of all XPath axes while efficiently supporting data-centric queries. The ability to start path traversals from arbitrary nodes in a document tree also enables the extended index to support the evaluation of path traversals embedded in XQuery expressions. Furthermore, an encoding technique is presented in which properties of the level ranking may be exploited to provide efficient and optimised level-based XPath evaluations
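As a rough illustration of the idea in this abstract, the following minimal Python sketch assigns each node a preorder rank and a level, then answers a few XPath axes from those two numbers alone. It is not the paper's index or encoding: the names `Node`, `build_index`, `descendants`, `children` and `parent` are hypothetical, and a real index would use sorted storage rather than linear scans over a list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"]


def build_index(root):
    """Return (pre, level, name) tuples in document (preorder) order."""
    index = []

    def visit(node, level):
        # The preorder rank is simply the position at which the node is visited.
        index.append((len(index), level, node.name))
        for child in node.children:
            visit(child, level + 1)

    visit(root, 0)
    return index


def descendants(index, pre):
    """Descendant axis: entries after `pre` until the level drops back."""
    level = index[pre][1]
    result = []
    for entry in index[pre + 1:]:
        if entry[1] <= level:
            break  # left the subtree of the context node
        result.append(entry)
    return result


def children(index, pre):
    """Child axis: descendants exactly one level below the context node."""
    level = index[pre][1]
    return [e for e in descendants(index, pre) if e[1] == level + 1]


def parent(index, pre):
    """Parent axis: nearest preceding entry exactly one level above."""
    level = index[pre][1]
    for entry in reversed(index[:pre]):
        if entry[1] == level - 1:
            return entry
    return None  # the root has no parent


# Hypothetical example document: <a><b><c/><d/></b><e/></a>
doc = Node("a", [Node("b", [Node("c", []), Node("d", [])]), Node("e", [])])
index = build_index(doc)
```

Because every axis function takes an arbitrary preorder rank as its context node, a traversal can start anywhere in the tree, which is the property the abstract relies on for path expressions embedded in XQuery.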
Desirable properties for XML update mechanisms
The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages have resulted in significant research into the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents have more recently prompted research into dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and to present our findings through an evaluation matrix for most of the existing schemes that provide update functionality
Pattern based processing of XPath queries
As the popularity of areas including document storage and distributed systems continues to grow, the demand for high performance XML databases is increasingly evident. This has led to a number of research efforts aimed at exploiting the maturity of relational database systems in order to increase XML query performance. In our approach, we use an index structure based on a metamodel for XML databases combined with relational database technology to facilitate fast access to XML document elements. The query process involves transforming XPath expressions to SQL which can be executed over our optimised query engine. As there are many different types of XPath queries, varying processing logic may be applied to boost performance not only to individual XPath axes, but across multiple axes simultaneously. This paper describes a pattern based approach to XPath query processing, which permits the execution of a group of XPath location steps in parallel
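To make the XPath-to-SQL transformation concrete, here is a minimal sketch under stated assumptions. It is not the paper's metamodel or its pattern-based engine: the `node(pre, parent, name)` table, the `xpath_to_sql` function and the sample rows are all hypothetical, and only absolute paths built from child steps with plain element names are handled. The point it illustrates is that a group of location steps can be compiled into one SQL statement instead of being evaluated step by step.

```python
import sqlite3


def xpath_to_sql(path):
    """Translate an absolute child-axis path such as '/a/b/c' into one query.

    Returns the SQL text and the list of parameters to bind to it.
    """
    steps = [s for s in path.split("/") if s]
    if not steps:
        raise ValueError("expected at least one location step")
    tables = ["node AS n0"]
    where = ["n0.parent IS NULL", "n0.name = ?"]  # first step matches the root
    for i in range(1, len(steps)):
        # Each further child step becomes one self-join on the parent column.
        tables.append(f"JOIN node AS n{i} ON n{i}.parent = n{i - 1}.pre")
        where.append(f"n{i}.name = ?")
    last = len(steps) - 1
    sql = (
        f"SELECT n{last}.pre FROM " + " ".join(tables)
        + " WHERE " + " AND ".join(where)
        + f" ORDER BY n{last}.pre"
    )
    return sql, steps


# Hypothetical shredded document: <a><b><c/><d/></b><b><c/></b></a>
rows = [
    (0, None, "a"),
    (1, 0, "b"),
    (2, 1, "c"),
    (3, 1, "d"),
    (4, 0, "b"),
    (5, 4, "c"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (pre INTEGER PRIMARY KEY, parent INTEGER, name TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)

sql, params = xpath_to_sql("/a/b/c")
matches = conn.execute(sql, params).fetchall()
```

Element names are passed as bound parameters rather than spliced into the SQL string, so the generated statement stays the same for every path of a given length.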
Level-based indexing for optimising XML queries
Many of the problems with native XML databases relate to query performance, and consequently it can be difficult to convince traditional database users of the benefits of using semi-structured or unstructured databases. In particular, the ongoing development of the XQuery language requires that performance-related issues are resolved. At present, there is still no index structure that efficiently supports both navigational and structural queries and the traditional data-centric and content queries. This thesis presents a new extended index structure based on the preorder traversal rank and the level (or depth) rank of each node in a document tree. The extended index fully supports the navigation of all XPath axes while efficiently supporting data-centric queries. The ability to start path traversals from arbitrary nodes in a document tree also enables the extended index to support the evaluation of path traversals embedded in XQuery expressions. Furthermore, an encoding technique for this extended index structure is presented, whereby properties of a level ranking may be exploited to provide efficient and optimised path traversals and, in certain cases, optimal solutions to path traversals
A node partitioning strategy for optimising the performance of XML queries
For ease of communication between heterogeneous systems, the eXtensible Markup Language (XML) has been widely adopted as a data storage format. However, XML query processing presents issues both in terms of query performance and updatability. Thus, many are choosing to shred XML data into relational databases in order to benefit from their mature technology. The problem with this approach is that (often complex and time-consuming) data transformation processes are required to transform XML data to relational tables and vice versa. Additionally, many of the benefits of XML data can be lost during these processes. In this dissertation, we present a process that partitions nodes within an XML document into disjoint subsets. Briefly, as there are fewer partitions than there are nodes, a more efficient join operation can be performed between partitions, thus reducing the number of inefficient node comparisons. The number and size of partitions vary depending on the structure and layout of the XML document, and the number of partitions impacts query performance. Therefore, we also provide a partition classification process, which significantly reduces the number of partitions because each partition class represents many equivalent partitions within the XML document. In this dissertation, we will demonstrate that our approach outperforms similar approaches for a large subset of XML queries by eliminating complex join operations (where possible) during the query process
Parameterized XPath Views
We present a new approach for accelerating the execution of XPath expressions using parameterized materialized XPath views (PXV). While the approach is generic, we show how it can be utilized in an XML extension for relational database systems. Furthermore, we discuss an algorithm for automatically determining the best PXV candidates to materialize based on a given workload. We evaluate our approach and show the superiority of our cost-based algorithm for determining PXV candidates over frequent-pattern-based algorithms