18,258 research outputs found

    Materialized view maintenance for XML documents

    Master's thesis (Master of Science)

    Self Maintenance of Materialized XQuery Views via Query Containment and Re-Writing

    In recent years, XML (the eXtensible Markup Language) has become the de facto standard for publishing and exchanging information on the web and in enterprise data integration systems. Materialized views are often used in information integration systems to present a unified schema for efficient querying of distributed and possibly heterogeneous data sources. Along similar lines, ACE-XQ, an XQuery-based semantic caching system, shows the significant performance gains achieved by caching query results (as materialized views) and using these materialized views along with query containment techniques to answer future queries over distributed XML data sources. To keep the data in these materialized views of ACE-XQ up to date, the views must be maintained, i.e., whenever the base data changes, the corresponding cached data in the materialized view must also be updated. This thesis builds on the query containment ideas of ACE-XQ and proposes an efficient approach for self-maintenance of materialized views. Our experimental results illustrate the significant performance improvement achieved by this strategy over view re-computation in a variety of situations.

    On the Abductive or Deductive Nature of Database Schema Validation and Update Processing Problems

    We show that database schema validation and update processing problems such as view updating, materialized view maintenance, integrity constraint checking, integrity constraint maintenance, or condition monitoring can be classified as problems of either abductive or deductive nature, according to the reasoning paradigm that inherently suits them. This is done by performing abductive and deductive reasoning on the event rules [Oli91], a set of rules that define the difference between consecutive database states. In this way, we show that it is possible to provide methods able to deal with all these problems as a whole. We also show how some existing general deductive and abductive procedures may be used to reason on the event rules, so that these procedures can deal with all database schema validation and update processing problems considered in this paper.

    Incremental Maintenance of a Materialized view in Data Warehousing : An Effective Approach

    A view is a derived relation defined in terms of base relations. A view can be materialized by storing its extent in the database. Materialized views can be indexed, and accessing a materialized view is much faster than recomputing the view from scratch. A data warehouse stores large amounts of information collected from different data sources. To speed up query processing, a warehouse usually contains a large number of materialized views. When the data sources are updated, the views need to be updated as well. The process of keeping a view up to date is called materialized view maintenance. Accessing base relations for view maintenance can be difficult, because the relations may be in use by other users. Materialized view maintenance is therefore an important issue in data warehousing, and for the same reasons so is the self-maintainability of views. In this paper we show that a materialized view can be maintained without accessing the view itself by materializing additional relations at the data warehouse site. We develop a cost-effective approach to reduce the burden of view maintenance and also prove that the proposed approach is optimal compared with other approaches. An incremental evaluation algorithm for computing changes to relational materialized views is presented.
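
    To make the idea of incremental maintenance concrete, the following is a minimal sketch and not the algorithm from the paper above. It assumes a hypothetical view of the form SELECT key, SUM(amount) ... GROUP BY key and keeps a per-group row count as auxiliary data, so that inserts and deletes can be applied to the view without rereading the base relation. The names `SumByKeyView`, `apply_delta`, and `recompute` are illustrative only.

```python
from collections import defaultdict


class SumByKeyView:
    """Hypothetical materialized view: SUM(amount) grouped by key.

    A per-group row count is stored as auxiliary data so that deletions
    can remove empty groups without consulting the base relation.
    """

    def __init__(self, rows=()):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)
        self.apply_delta(inserted=rows)

    def apply_delta(self, inserted=(), deleted=()):
        # Inserted rows add to the group's sum and count.
        for key, amount in inserted:
            self.sums[key] += amount
            self.counts[key] += 1
        # Deleted rows are assumed to exist in the base relation.
        for key, amount in deleted:
            self.sums[key] -= amount
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.sums[key]
                del self.counts[key]

    def rows(self):
        return dict(self.sums)


def recompute(base_rows):
    """Full recomputation from the base relation, used only for comparison."""
    result = {}
    for key, amount in base_rows:
        result[key] = result.get(key, 0) + amount
    return result


base = [("a", 10), ("b", 5), ("a", 7)]
view = SumByKeyView(base)

# Apply a batch of changes to the view only.
view.apply_delta(inserted=[("b", 3), ("c", 1)], deleted=[("a", 10)])
new_base = [("b", 5), ("a", 7), ("b", 3), ("c", 1)]
print(view.rows() == recompute(new_base))
```

    The cost of `apply_delta` depends on the size of the change batch, while `recompute` scans the whole base relation.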

    Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views

    Materialized views (MVs), stored pre-computed results, are widely used to facilitate fast queries on large datasets. When new records arrive at a high rate, it is infeasible to continuously update (maintain) MVs, and a common solution is to defer maintenance by batching updates together. Between batches, the MVs become increasingly stale, with incorrect, missing, and superfluous rows leading to increasingly inaccurate query results. We propose Stale View Cleaning (SVC), which addresses this problem from a data cleaning perspective. In SVC, we efficiently clean a sample of rows from a stale MV and use the clean sample to estimate aggregate query results. While approximate, the estimated query results reflect the most recent data. As sampling can be sensitive to long-tailed distributions, we further explore an outlier indexing technique to give increased accuracy when the data distributions are skewed. SVC complements existing deferred maintenance approaches by giving accurate and bounded query answers between maintenance. We evaluate our method on a generated dataset from the TPC-D benchmark and a real video distribution application. Experiments confirm our theoretical results: (1) cleaning an MV sample is more efficient than full view maintenance, (2) the estimated results are more accurate than using the stale MV, and (3) SVC is applicable for a wide variety of MVs.
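
    The sample-and-correct idea can be illustrated with a rough sketch. This is not the SVC estimator itself: it assumes a hypothetical view keyed by row id with one numeric column, selects a deterministic hash-based sample of keys, and corrects the stale SUM by the scaled difference observed on the sample. The names `in_sample` and `corrected_sum` are assumptions, and the full `fresh` dictionary is passed in only for simplicity; a real system would compute up-to-date values for the sampled keys only.

```python
import hashlib


def in_sample(key, ratio):
    """Deterministically decide whether a key belongs to the sample."""
    digest = hashlib.sha256(str(key).encode()).hexdigest()
    return (int(digest, 16) % 10_000) < ratio * 10_000


def corrected_sum(stale, fresh, ratio):
    """Estimate the up-to-date SUM from the stale view plus a sampled correction.

    stale, fresh: dicts mapping row key -> numeric value.
    Only keys selected by in_sample() are looked up in `fresh`.
    """
    stale_total = sum(stale.values())
    sampled_keys = [k for k in set(stale) | set(fresh) if in_sample(k, ratio)]
    # A missing row counts as 0, which covers inserted and deleted rows.
    diff = sum(fresh.get(k, 0) - stale.get(k, 0) for k in sampled_keys)
    return stale_total + diff / ratio


stale = {1: 10, 2: 20, 3: 30}   # view as of the last maintenance
fresh = {1: 10, 2: 25, 4: 5}    # row 2 updated, row 3 deleted, row 4 inserted

# With a sampling ratio of 1.0 every row is cleaned, so the estimate is exact.
print(corrected_sum(stale, fresh, 1.0))
```

    With a smaller ratio the result is an estimate whose accuracy depends on how evenly the changes are spread across rows.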

    Incremental View Maintenance For Collection Programming

    In the context of incremental view maintenance (IVM), delta query derivation is an essential technique for speeding up the processing of large, dynamic datasets. The goal is to generate delta queries that, given a small change in the input, can update the materialized view more efficiently than via recomputation. In this work we propose the first solution for the efficient incrementalization of positive nested relational calculus (NRC+) on bags (with integer multiplicities). More precisely, we model the cost of NRC+ operators and classify queries as efficiently incrementalizable if their delta has a strictly lower cost than full re-evaluation. Then, we identify IncNRC+, a large fragment of NRC+ that is efficiently incrementalizable, and we provide a semantics-preserving translation that takes any NRC+ query to a collection of IncNRC+ queries. Furthermore, we prove that incremental maintenance for NRC+ is within the complexity class NC0, and we showcase how recursive IVM, a technique that has provided significant speedups over traditional IVM in the case of flat queries [25], can also be applied to IncNRC+. Comment: 24 pages (12 pages plus appendix)
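
    As a small illustration of a delta query on bags with integer multiplicities, the sketch below uses the standard delta rule for a join of flat relations, Δ(R ⋈ S) = ΔR ⋈ S + R ⋈ ΔS + ΔR ⋈ ΔS. It covers flat bags only, not the nested collections of NRC+, and the helper names `bag_add`, `bag_join`, and `delta_join` are illustrative assumptions. Bags are dictionaries from tuples to multiplicities, where a negative multiplicity in a delta denotes a deletion.

```python
from collections import defaultdict


def bag_add(a, b):
    """Bag union with integer multiplicities; entries that cancel are dropped."""
    out = defaultdict(int)
    for bag in (a, b):
        for tup, mult in bag.items():
            out[tup] += mult
    return {t: m for t, m in out.items() if m != 0}


def bag_join(r, s):
    """Join bags of (key, value) tuples on key; multiplicities multiply."""
    out = defaultdict(int)
    for (k1, v1), m1 in r.items():
        for (k2, v2), m2 in s.items():
            if k1 == k2:
                out[(k1, v1, v2)] += m1 * m2
    return {t: m for t, m in out.items() if m != 0}


def delta_join(r, s, dr, ds):
    """Change to bag_join(r, s) caused by the updates dr and ds."""
    return bag_add(bag_add(bag_join(dr, s), bag_join(r, ds)), bag_join(dr, ds))


r = {("x", 1): 2}
s = {("x", "p"): 1, ("y", "q"): 1}
dr = {("y", 9): 1, ("x", 1): -1}   # insert ("y", 9), delete one copy of ("x", 1)
ds = {("x", "p"): 1}               # insert one more copy of ("x", "p")

view = bag_join(r, s)
maintained = bag_add(view, delta_join(r, s, dr, ds))
recomputed = bag_join(bag_add(r, dr), bag_add(s, ds))
print(maintained == recomputed)
```

    When the deltas are small relative to R and S, each of the three delta terms touches far fewer tuples than a full re-evaluation of the join.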

    Adapting Materialized Views after Redefinitions: Techniques and a Performance Study

    We consider a variant of the view maintenance problem: How does one keep a materialized view up to date when the view definition itself changes? Can one do better than recomputing the view from the base relations? Traditional view maintenance tries to maintain the materialized view in response to modifications to the base relations; we try to "adapt" the view in response to changes in the view definition. Such techniques are needed for applications where the user can change queries dynamically and see the changes in the results fast. Data archaeology, data visualization, and dynamic queries are examples of such applications. We consider all possible redefinitions of SQL SELECT-FROM-WHERE-GROUPBY-HAVING, UNION, and MINUS views, and show how these views can be adapted using the old materialization for the cases where it is possible to do so. We identify extra information that can be kept with a materialization to facilitate redefinition. Multiple simultaneous changes to a view can be handled without necessarily materializing intermediate results. We identify guidelines for users and database administrators that can be used to facilitate efficient view adaptation. We perform a systematic experimental evaluation of our proposed techniques. Our evaluation indicates that adaptation is more efficient than rematerialization in most cases. Certain adaptation techniques can be up to 1,000 times better. We also point out the physical layouts that can benefit adaptation.
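
    One of the simplest adaptation cases can be sketched as follows, as an illustration rather than the paper's technique: when a redefinition only adds a conjunct to the WHERE clause, and the attribute it tests is stored in the view, the new view can be obtained by filtering the old materialization. The function names and the sample rows are hypothetical.

```python
def materialize(base, predicate):
    """Compute a selection view from scratch over the base relation."""
    return [row for row in base if predicate(row)]


def adapt_by_tightening(old_view, extra_predicate):
    """Adapt a selection view after an extra WHERE conjunct is added.

    Valid only when the new condition is (old condition AND extra_predicate)
    and every attribute used by extra_predicate is kept in the view.
    """
    return [row for row in old_view if extra_predicate(row)]


base = [
    {"id": 1, "price": 5, "region": "EU"},
    {"id": 2, "price": 20, "region": "EU"},
    {"id": 3, "price": 30, "region": "US"},
    {"id": 4, "price": 50, "region": "EU"},
]

# Original view: WHERE price > 10
old_view = materialize(base, lambda r: r["price"] > 10)

# Redefined view: WHERE price > 10 AND region = 'EU'
adapted = adapt_by_tightening(old_view, lambda r: r["region"] == "EU")
rematerialized = materialize(
    base, lambda r: r["price"] > 10 and r["region"] == "EU"
)
print(adapted == rematerialized)
```

    The adaptation scans only the old view, which is typically much smaller than the base relation. Loosening a condition, by contrast, requires rows that the old materialization does not contain.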

    ViewDF: a Flexible Framework for Incremental View Maintenance in Stream Data Warehouses

    Because of increasing data sizes and demands for low latency in modern data analysis, traditional data warehousing technologies are being pushed beyond their limits. Several stream data warehouse (SDW) systems, which are warehouses that ingest append-only data feeds and support frequent refresh cycles, have been proposed, including different methods to improve the responsiveness of the systems. Materialized views are critical in large-scale data warehouses due to their ability to speed up queries. Thus an SDW maintains layers of materialized views. Materialized view maintenance in SDW systems introduces new challenges. However, some of the existing SDW systems do not address the maintenance of views, while others employ view maintenance techniques that are not efficient. This thesis presents ViewDF, a flexible framework for incremental maintenance of materialized views in SDW systems that generalizes existing techniques and enables new optimizations for views defined with operators that are common in stream analytics. We give a special view definition (ViewDF) that enhances the traditional way of creating views in SQL by being able to reference any partition of any table. We describe a prototype system based on this idea, which allows users to write ViewDFs directly and can automatically translate a broad class of queries into ViewDFs. Several optimizations are proposed, and experiments show that our proposed system can improve view maintenance time by a factor of two or more in practical settings.

    Derivation of incremental equations for PNF nested relations

    Incremental view maintenance techniques are required for many new types of data models that are increasingly used in industry. One of these models is the nested relational model, which is used for modelling complex objects in databases. In this paper we derive a group of expressions for incrementally evaluating query expressions in the nested relational model. We also present an algorithm to propagate base relation updates to a materialized view when the view is defined as a complex query.

    A Data warehouse within a Federated database architecture

    Research in heterogeneous databases [Sheth & Larson 90] has provided methods to integrate disparate databases into a single unifying architecture: the federated database model. But these methods are limited inasmuch as: 1) the federated schema is non-materialized, which means that queries have to be evaluated in the individual databases, resulting in slower response times, and 2) data from external sources are not integrated within the federated schema. We propose to extend the federated architecture to include a data warehouse [Inmon 94, Kimball 96] modeled as a materialized view [Hanson 87] of the underlying federated schema. In addition, we employ view maintenance techniques to maintain the data warehouse against changes in the underlying operational sources. We adopt a deferred view maintenance [Colby et al 96] approach, rather than the immediate approach adopted by the Stanford WHIPS project [Hammer et al 95]. This approach is preferred because a great deal of decision-making may not require current data, but for decisions that do require it, the model provides a mechanism to obtain it without adding too much overhead. For example, a data warehouse at the central office of a large chain of stores would like to have access to current inventory levels at individual stores before deciding on a promotion. Similarly, access to both historical and current stock price data helps an investment company re-create point-in-time snapshots to help predict movements in stock prices. This approach provides the following advantages:
    • A unified architecture that ties the data warehouse to multiple heterogeneous databases.
    • A method of maintaining the data warehouse as an integrated materialized view of the underlying data sources.
    • Flexible access to current data residing in the data sources.
    • Ease of maintenance against any change to the schema in the data source or warehouse.