Dynamic recomposition of documents from distributed data sources
Dynamic recomposition of documents refers to the on-the-fly creation of documents. A document can be generated from several documents that are stored at distributed data sites. Each source can be queried and the results obtained in the form of XML. These XML documents can be combined after a series of transformation operations to obtain the target document. The resultant document can be stored statically or in the form of a command, which can be invoked later to recompose the document dynamically. In addition, when a change is made to a document, only the change needs to be stored rather than the modified document in its entirety.
The purpose of this research was to provide a way to recompose dynamic documents. A solution is proposed at the level of algebra for the update and recomposition of documents stored at distributed data sources. The representation of a document by a command, i.e., a composition operator and/or an editing command along with one or more path expressions, has also been researched. The construction of a dynamic document has three phases. The first is information retrieval. The second builds the actual document: it filters the retrieved data by selecting the relevant subset of a document, applies update operations, and finally orders and assembles the document. The final phase consists of displaying, storing, or exchanging the document over the web through a convenient means.
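The three phases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algebra proposed in the work: the `SOURCES` dictionary stands in for distributed data sites, and the `COMMAND` structure, the function names, and the sample XML are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for distributed sources: each "site" returns XML text.
SOURCES = {
    "site_a": "<doc><sec id='2'><title>Methods</title></sec>"
              "<sec id='9'><title>Draft</title></sec></doc>",
    "site_b": "<doc><sec id='1'><title>Intro</title></sec></doc>",
}

# Hypothetical stored command: path expressions plus the recorded changes.
# Only the change (a new title) is stored, not the modified document.
COMMAND = {
    "select": [("site_a", "sec[@id='2']"), ("site_b", "sec")],
    "updates": {"2": "Methodology"},
}


def retrieve(site):
    """Phase 1: query a source and obtain its result as an XML tree."""
    return ET.fromstring(SOURCES[site])


def recompose(command):
    """Phase 2: filter, update, order, and assemble the target document."""
    selected = []
    for site, path in command["select"]:
        selected.extend(retrieve(site).findall(path))  # filter by path
    for sec in selected:                               # apply stored updates
        new_title = command["updates"].get(sec.get("id"))
        if new_title is not None:
            sec.find("title").text = new_title
    selected.sort(key=lambda sec: int(sec.get("id")))  # order
    target = ET.Element("doc")
    target.extend(selected)                            # assemble
    return target


def deliver(tree):
    """Phase 3: serialize the document for display, storage, or exchange."""
    return ET.tostring(tree, encoding="unicode")


document = recompose(COMMAND)
titles = [sec.find("title").text for sec in document]
```

Invoking `recompose(COMMAND)` again at a later time rebuilds the document from the sources, which is the sense in which the stored command represents the document.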
Incremental Maintenance Of Materialized XQuery Views
Keeping views fresh by maintaining the consistency between materialized views and their base data in the presence of base updates is a critical problem for many applications, including data warehousing and data integration. While heavily studied for traditional databases, the maintenance of XML views remains largely unexplored. Maintaining XML views is complex due to the richness of the XML data model and the powerful capabilities of XML query languages, such as XQuery.
This dissertation proposes a comprehensive solution for the general problem of maintaining materialized XQuery views. Our solution is the first to enable the maintenance of a large class of XQuery views including XPath expressions, FLWOR expressions, and Element Constructors. These views may contain arbitrary result construction and arbitrary grouping and join operations. Our solution also supports the unique order requirements of XQuery including source document order and query order.
The contributions of this dissertation include: (i) an efficient solution for supporting order in XML query processing and view maintenance, (ii) an identifier-based technique for enabling incremental construction of XML views, (iii) a mechanism for modeling and validating source XML updates, (iv) a counting algorithm for supporting view maintenance on delete and modify updates, (v) an algebraic solution for propagating bulk XML updates, and (vi) an efficient mechanism for refreshing materialized XML views on propagated updates. We provide proofs of correctness of our proposed techniques for materialized XQuery maintenance.
We have implemented a prototype of our view maintenance solution on top of the Rainbow XML query engine, developed at WPI. Our experiments confirm that our approach is practical and efficient for maintaining materialized XQuery views, even when handling heterogeneous batches of possibly large source updates.
Our solution follows the widely adopted propagate-apply framework for view maintenance common to all mainstream query engines. That is, our solution produces incremental maintenance plans in the same algebraic language used to define the views. These plans can thus be optimized and executed by standard query processing techniques. Being compatible with standard frameworks paves the way for our XML view maintenance solution to be easily adopted by existing database engines.
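To make the propagate-apply idea and the role of counting concrete, here is a minimal sketch of the classic counting technique for a duplicate-eliminating view, written in Python. It is not the dissertation's algorithm and does not handle XML order or structure; the functions `propagate` and `apply_delta`, the `authors_view` definition, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def propagate(base_changes, view_fn):
    """Propagate phase: turn signed base changes into signed view deltas."""
    delta = Counter()
    for sign, item in base_changes:
        for derived in view_fn(item):
            delta[derived] += sign
    return delta


def apply_delta(view_counts, delta):
    """Apply phase: adjust derivation counts, dropping entries at zero."""
    for key, change in delta.items():
        view_counts[key] += change
        if view_counts[key] <= 0:
            del view_counts[key]
    return view_counts


def authors_view(book):
    """Hypothetical view definition: the distinct authors of all books."""
    return [book["author"]]


books = [
    {"title": "A", "author": "Ann"},
    {"title": "B", "author": "Ann"},
    {"title": "C", "author": "Bob"},
]

# Initial materialization: every base item arrives as an insertion (+1).
view = apply_delta(Counter(), propagate([(+1, b) for b in books], authors_view))

# Delete one of Ann's books and Bob's only book (-1 each).
deletions = [(-1, books[0]), (-1, books[2])]
view = apply_delta(view, propagate(deletions, authors_view))
```

After the deletions, "Ann" stays in the view because one derivation remains, while "Bob" disappears because his count reached zero. The count is what lets a delete be processed without recomputing the view from the base data.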
Managing Uncertainty and Ontologies in Databases
Nowadays a vast amount of data is generated in Extensible Markup Language (XML). However, applications in some domains need to store and manipulate uncertain information, e.g., when sensor inputs are noisy or the data itself is uncertain. Another major change in applications and web data is the increasing use of ontologies to describe the semantics of data, i.e., the semantic relationships between the terms in the databases.
As such information is usually absent from traditional databases, there is tremendous opportunity to ask new kinds of queries that could not be handled in the past. This poses new challenges in how to manipulate and maintain such new kinds of database systems.
In this dissertation, we will see how we can (i) incorporate and manipulate uncertainty in databases, and (ii) efficiently compute aggregates and maintain views on ontology databases.
First, I explain applications that require manipulating uncertain information in XML databases and maintaining web ontology databases written in Resource Description Framework (RDF). I then introduce the probabilistic semistructured PXML data model with two formal semantics. I describe a set of algebraic operations and their efficient implementation. Aggregations of PXML instances are studied with two semantics proposed: possible-worlds semantics and expectation semantics. Efficient algorithms with pruning are given and evaluated to show their feasibility. I introduce PIXML, an interval probability version of PXML, and develop a formal semantics for it. A query language and its operational semantics are given and proved to be sound and complete. Based on XML, RDF is a language used to describe web ontologies. RDQL, an RDF query language, is extended to support view definition and aggregations. Two sets of algorithms are given to maintain non-aggregate and aggregate views. Experimental results show that they are efficient compared with standard relational view maintenance algorithms.
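The difference between the two aggregate semantics can be illustrated with a small Python sketch. It assumes a much simpler model than PXML: a flat list of values, each present independently with a given probability. It also enumerates every possible world exhaustively, with none of the pruning the dissertation describes. The function names and the sample readings are hypothetical.

```python
from fractions import Fraction
from itertools import product


def sum_distribution(items):
    """Possible-worlds semantics: the probability of each SUM value."""
    dist = {}
    for world in product([True, False], repeat=len(items)):
        prob = Fraction(1)
        total = 0
        for present, (value, p) in zip(world, items):
            prob *= p if present else 1 - p
            if present:
                total += value
        dist[total] = dist.get(total, Fraction(0)) + prob
    return dist


def expected_sum(items):
    """Expectation semantics: one number, the expected value of SUM."""
    return sum(value * p for value, p in items)


# Hypothetical uncertain readings: (value, probability that it is present).
readings = [(10, Fraction(1, 2)), (20, Fraction(1, 4))]

distribution = sum_distribution(readings)
expectation = expected_sum(readings)
```

Under possible-worlds semantics the answer is a distribution over the values 0, 10, 20, and 30; under expectation semantics it is the single value 10, which equals the mean of that distribution. The enumeration grows exponentially with the number of items, which is why pruning matters in practice.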
An algebraic approach for incremental maintenance of materialized XQuery views
Modern data sources, including structural and semi-structural sources, often export XML views over base data, and at times may materialize their views by storing the XML query result to provide faster data access. It is typically more efficient to maintain a view by incrementally propagating the base changes to the view than by re-computing it from scratch. Techniques for the incremental maintenance of relational views have been extensively studied in the literature. However, the maintenance of views created using XQuery is as of now unexplored. In this paper we propose an algebraic approach for incremental XQuery view maintenance. In our approach, an update to the XML source is transformed into a set of well defined update primitives which are propagated through the XML algebra tree. This algebraic update propagation process generates incremental update primitives to be applied to the result view. We briefly discuss our XQuery view maintenance system implementation. Our experiments confirm that incremental view maintenance is indeed faster than re-computation.
Categories and Subject Descriptors: H.2.3 [Database Management]: Languages - Query languages
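The propagation of update primitives through an operator plan can be sketched as follows. This is a simplified Python illustration, not the paper's XML algebra: the plan is a linear pipeline rather than a tree, the items are flat records rather than XML nodes, and the `Select` and `Project` classes and the helper functions are hypothetical names.

```python
class Select:
    """Keeps only the primitives whose item satisfies the predicate."""

    def __init__(self, predicate):
        self.predicate = predicate

    def propagate(self, primitives):
        return [(op, item) for op, item in primitives if self.predicate(item)]


class Project:
    """Rewrites each primitive's item into the shape the view stores."""

    def __init__(self, fn):
        self.fn = fn

    def propagate(self, primitives):
        return [(op, self.fn(item)) for op, item in primitives]


def propagate_through(plan, primitives):
    """Push source update primitives bottom-up through every operator."""
    for operator in plan:
        primitives = operator.propagate(primitives)
    return primitives


def apply_primitives(view, primitives):
    """Apply the resulting incremental primitives to the materialized view."""
    for op, item in primitives:
        if op == "insert":
            view.append(item)
        elif op == "delete":
            view.remove(item)
    return view


# Hypothetical view: titles of books cheaper than 20.
plan = [Select(lambda b: b["price"] < 20), Project(lambda b: b["title"])]
view = ["XML Basics"]  # materialized from {"title": "XML Basics", "price": 10}

source_updates = [
    ("insert", {"title": "XQuery Guide", "price": 15}),
    ("insert", {"title": "Big Tome", "price": 90}),
    ("delete", {"title": "XML Basics", "price": 10}),
]
view_updates = propagate_through(plan, source_updates)
view = apply_primitives(view, view_updates)
```

Only two of the three source updates reach the view, because the selection filters out the expensive book before projection. The view is refreshed by touching just the affected entries, without re-evaluating the query over the whole source.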