
14 research outputs found

Worst Case Optimal Joins on Relational and XML data

Author: Al-Khalifa Shurug
Lu Jiaheng
Publication date: 10/06/2018

In the recent data management ecosystem, one of the greatest challenges is data variety. Data comes in multiple formats, such as relational and (semi-)structured data. A traditional database handles a single data format, so its ability to deal with different formats is limited. To overcome this limitation, we propose a multi-model processing framework for relational and semi-structured data (i.e., XML) and design a worst-case optimal join algorithm. The salient feature of our algorithm is that it guarantees that the intermediate results are no larger than the worst-case join results. Preliminary results show that our multi-model algorithm significantly outperforms the baseline join methods in terms of running time and intermediate result size. Peer reviewed.

Crossref

Helsingin yliopiston digitaalinen arkisto

TIMBER: A native XML database

Author: Adriane Chapman
Andrew Nierman
Cong Yu
Divesh Srivastava
H. V. Jagadish
Jignesh M. Patel
Laks V. S. Lakshmanan
Nuwee Wiwatwattana
Shurug Al-Khalifa
Stelios Paparizos
Yuqing Wu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2002

This paper describes the overall design and architecture of the Timber XML database system currently being implemented at the University of Michigan. The system is based upon a bulk algebra for manipulating trees, and natively stores XML. New access methods have been developed to evaluate queries in the XML context, and new cost estimation and query optimization techniques have also been developed. We present performance numbers to support some of our design decisions. We believe that the key intellectual contribution of this system is a comprehensive set-at-a-time query processing ability in a native XML store, with all the standard components of relational query processing, including algebraic rewriting and a cost-based optimizer. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/42328/1/20110274.pd

CiteSeerX

Southampton (e-Prints Soton)

Crossref

Deep Blue Documents at the University of Michigan

XML QUERY EVALUATION

Author: Shurug A. Al-Khalifa
Publication date: 01/01/2005

XML is now widely used, and management of XML data has become important. To this end, there has been work on the native management of XML data in a database to utilize the different capabilities of such a system, such as transaction management and indexing structures. At the heart of such a native XML database is the query evaluator, which provides access methods specifically tailored for XML data manipulation. The design of efficient access methods is the topic of this thesis. The most frequently used operation in an XML database is called structural join. Almost all XML queries contain at least one structural join. The structural join returns matches to a pattern from an XML document. We introduce a new efficient family of algorithms to address this task. These algorithms use a stack data structure that exploits the hierarchy of XML in favor of performance. We then develop variants that permit the combination of other operators, including projection, set difference, and universal quantification, with the structural join operation for greater efficiency. An important value provided by XML is the seamless representation of text and structured data. Querying the text with regard to the structure yields fast and accurate results. However, standard database query paradigms are not suitable for querying text. We introduce the TIX algebra for this purpose, and develop new access methods capable of efficiently computing and combining scores associated with intermediate results. In such applications, one is typically interested in only a few results with the highest scores. We develop new access methods to find results that score within a margin of error from the actual top results. These new access methods outperform getting actual top results by at least an order of magnitude. Ph.D. Applied Sciences. Computer science. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/124730/2/3163746.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

Multi-level operator combination in XML query processing

Author: H. V. Jagadish
Shurug Al-Khalifa
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2004

Crossref

Querying Structured Text in an XML Database

Author: Cong Yu
H. V. Jagadish
Shurug Al-Khalifa
Publication date: 01/01/2003

XML databases often contain documents comprising structured text. Therefore, it is important to integrate "information retrieval style" query evaluation, which is well-suited for natural language text, with standard "database style" query evaluation, which handles structured queries efficiently. Relevance scoring is central to information retrieval. In the case of XML, this operation becomes more complex because the data required for scoring could reside not only directly in an element itself but also in its descendant elements.

CiteSeerX

Crossref

Structural Joins: A Primitive for Efficient XML Query Pattern Matching

Author: Divesh Srivastava
H. V. Jagadish
Jignesh M. Patel
Nick Koudas
Shurug Al-Khalifa
Yuqing Wu

XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree-structured relationships. The primitive tree-structured relationships are parent-child and ancestor-descendant, and finding all occurrences of these relationships in an XML database is a core operation for XML query processing.

CiteSeerX

The Michigan benchmark: Towards XML query performance diagnostics

Author: H. V. Jagadish
Jignesh M. Patel
Kanda Runapongsa
Shurug Al-Khalifa
Yun Chen
Publication venue: Morgan Kaufmann
Publication date: 01/01/2002

We propose a micro-benchmark for XML data management to aid engineers in designing improved XML processing engines. This benchmark is inherently different from application-level benchmarks, which are designed to help users choose between alternative products. We primarily attempt to capture the rich variety of data structures and distributions possible in XML, and to isolate their effects, without imitating any particular application. The benchmark specifies a single data set against which carefully specified queries can be used to evaluate system performance for XML data with various characteristics. We have used the benchmark to analyze the performance of three database systems: two native XML DBMSs, and a commercial ORDBMS. The benchmark reveals key strengths and weaknesses of these systems. We find that commercial relational techniques are effective for XML query processing in many cases, but are sensitive to query rewriting, and require better support for efficiently determining indirect structural containment.

CiteSeerX