Materialized view maintenance for XML documents
Master's (Master of Science)
Self Maintenance of Materialized XQuery Views via Query Containment and Re-Writing
In recent years XML, the eXtensible Markup Language, has become the de facto standard for publishing and exchanging information on the web and in enterprise data integration systems. Materialized views are often used in information integration systems to present a unified schema for efficient querying of distributed and possibly heterogeneous data sources. Along similar lines, ACE-XQ, an XQuery-based semantic caching system, shows the significant performance gains achieved by caching query results (as materialized views) and using these materialized views together with query containment techniques to answer future queries over distributed XML data sources. To keep the data in these materialized views of ACE-XQ up to date, the views must be maintained, i.e., whenever the base data changes, the corresponding cached data in the materialized view must also be updated. This thesis builds on the query containment ideas of ACE-XQ and proposes an efficient approach for self-maintenance of materialized views. Our experimental results illustrate the significant performance improvement achieved by this strategy over view re-computation in a variety of situations.
On the Abductive or Deductive Nature of Database Schema Validation and Update Processing Problems
We show that database schema validation and update processing problems such
as view updating, materialized view maintenance, integrity constraint checking,
integrity constraint maintenance or condition monitoring can be classified as
problems of either abductive or deductive nature, according to the reasoning
paradigm that inherently suits them. This is done by performing abductive and
deductive reasoning on the event rules [Oli91], a set of rules that define the
difference between consecutive database states. In this way, we show that it is
possible to provide methods able to deal with all these problems as a whole. We
also show how some existing general deductive and abductive procedures may be
used to reason on the event rules. In this way, we show that these procedures
can deal with all database schema validation and update processing problems
considered in this paper.
Incremental Maintenance of a Materialized view in Data Warehousing: An Effective Approach
A view is a derived relation defined in terms of base relations. A view can be materialized by storing its extent in the database. Materialized views can be indexed, and accessing a materialized view is much faster than recomputing the view from scratch. A data warehouse stores a large amount of information collected from different data sources. To speed up query processing, a warehouse usually contains a large number of materialized views. When the data sources are updated, the views need to be updated as well. The process of keeping a view up to date is called materialized view maintenance. Accessing base relations for view maintenance can be difficult, because the relations may be in use by other users. Materialized view maintenance is therefore an important issue in data warehousing, and so is the self-maintainability of views. In this paper we show that a materialized view can be maintained without accessing the view itself by materializing additional relations at the data warehouse site. We develop a cost-effective approach that reduces the burden of view maintenance and prove that the proposed approach is optimal compared with other approaches. We also present an incremental evaluation algorithm that computes changes to materialized views in relational databases.
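The idea of materializing additional relations at the warehouse can be illustrated with a minimal sketch. It is not the paper's algorithm: the join view, the relation names (`orders`, `customers`), and the class `SelfMaintainedJoinView` are hypothetical and chosen only to show how auxiliary copies let the warehouse apply source deltas without querying the sources.

```python
from collections import defaultdict


class SelfMaintainedJoinView:
    """Hypothetical view: orders JOIN customers ON cust_id, kept current from deltas only."""

    def __init__(self):
        # Auxiliary relations stored at the warehouse (assumed for this sketch).
        self.aux_customers = {}              # cust_id -> name
        self.aux_orders = defaultdict(set)   # cust_id -> {(order_id, amount)}
        # Materialized view rows: (order_id, name, amount).
        self.view = set()

    def insert_customer(self, cust_id, name):
        # A new customer may match orders that arrived earlier.
        self.aux_customers[cust_id] = name
        for order_id, amount in self.aux_orders[cust_id]:
            self.view.add((order_id, name, amount))

    def insert_order(self, order_id, cust_id, amount):
        # Join the inserted order against the auxiliary customers relation.
        self.aux_orders[cust_id].add((order_id, amount))
        name = self.aux_customers.get(cust_id)
        if name is not None:
            self.view.add((order_id, name, amount))

    def delete_order(self, order_id, cust_id, amount):
        # Remove the order and the view row it contributed, if any.
        self.aux_orders[cust_id].discard((order_id, amount))
        name = self.aux_customers.get(cust_id)
        if name is not None:
            self.view.discard((order_id, name, amount))
```

Each update touches only the delta and the locally stored auxiliary relations, so no round trip to the operational sources is needed. The price is the extra storage for the auxiliary copies.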
Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views
Materialized views (MVs), stored pre-computed results, are widely used to
facilitate fast queries on large datasets. When new records arrive at a high
rate, it is infeasible to continuously update (maintain) MVs and a common
solution is to defer maintenance by batching updates together. Between batches
the MVs become increasingly stale with incorrect, missing, and superfluous rows
leading to increasingly inaccurate query results. We propose Stale View
Cleaning (SVC) which addresses this problem from a data cleaning perspective.
In SVC, we efficiently clean a sample of rows from a stale MV, and use the
clean sample to estimate aggregate query results. While approximate, the
estimated query results reflect the most recent data. As sampling can be
sensitive to long-tailed distributions, we further explore an outlier indexing
technique to give increased accuracy when the data distributions are skewed.
SVC complements existing deferred maintenance approaches by giving accurate and
bounded query answers between maintenance. We evaluate our method on a
generated dataset from the TPC-D benchmark and a real video distribution
application. Experiments confirm our theoretical results: (1) cleaning an MV
sample is more efficient than full view maintenance, (2) the estimated results
are more accurate than using the stale MV, and (3) SVC is applicable for a wide
variety of MVs.
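A rough sketch of estimating an aggregate from a cleaned sample follows. It shows one plausible correction-style estimator for SUM and should not be read as the SVC implementation. The function names, the hash-based sampling scheme, and the assumption that the set of candidate keys is known (for example, from an update log) are all illustrative.

```python
import zlib


def in_sample(key, ratio):
    """Deterministic hash sampling: keep a key if its hash bucket falls below the ratio."""
    return (zlib.crc32(str(key).encode()) % 10_000) < ratio * 10_000


def estimate_sum(stale_view, clean_row, candidate_keys, ratio):
    """Estimate SUM over the up-to-date view from a cleaned sample of rows.

    stale_view: dict key -> value (the stale materialization)
    clean_row: function key -> up-to-date value, or None if the row no longer exists
    candidate_keys: keys that may appear in the stale or the fresh view (assumed known)
    ratio: sampling ratio in (0, 1]
    """
    stale_total = sum(stale_view.values())
    correction = 0.0
    for key in candidate_keys:
        if in_sample(key, ratio):
            fresh = clean_row(key) or 0.0
            # Covers incorrect rows (changed value), missing rows (absent from
            # the stale view), and superfluous rows (absent from the fresh view).
            correction += fresh - stale_view.get(key, 0.0)
    # Scale the sampled correction up to the full view.
    return stale_total + correction / ratio
```

With `ratio=1.0` every row is cleaned and the result is exact. Smaller ratios clean fewer rows and return an estimate whose accuracy depends on how the corrections are distributed.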
Incremental View Maintenance For Collection Programming
In the context of incremental view maintenance (IVM), delta query derivation
is an essential technique for speeding up the processing of large, dynamic
datasets. The goal is to generate delta queries that, given a small change in
the input, can update the materialized view more efficiently than via
recomputation. In this work we propose the first solution for the efficient
incrementalization of positive nested relational calculus (NRC+) on bags (with
integer multiplicities). More precisely, we model the cost of NRC+ operators
and classify queries as efficiently incrementalizable if their delta has a
strictly lower cost than full re-evaluation. Then, we identify IncNRC+, a large
fragment of NRC+ that is efficiently incrementalizable and we provide a
semantics-preserving translation that takes any NRC+ query to a collection of
IncNRC+ queries. Furthermore, we prove that incremental maintenance for NRC+ is
within the complexity class NC0 and we showcase how recursive IVM, a technique
that has provided significant speedups over traditional IVM in the case of flat
queries [25], can also be applied to IncNRC+.
Comment: 24 pages (12 pages plus appendix)
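To make the notion of a delta query concrete, here is a minimal sketch for the flat (non-nested) case only. It uses the standard delta rule for a join over bags with integer multiplicities, Δ(R ⋈ S) = ΔR ⋈ S + R ⋈ ΔS + ΔR ⋈ ΔS. It does not implement NRC+ or IncNRC+, and the dictionary encoding of bags and the function names are assumptions made for illustration.

```python
def bag_add(a, b):
    """Add two bags (dict: tuple -> integer multiplicity), dropping zero entries."""
    out = dict(a)
    for k, m in b.items():
        out[k] = out.get(k, 0) + m
    return {k: m for k, m in out.items() if m != 0}


def bag_join(r, s):
    """Join bags of (key, value) pairs on key; multiplicities multiply."""
    out = {}
    for (k1, x), m in r.items():
        for (k2, y), n in s.items():
            if k1 == k2:
                t = (k1, x, y)
                out[t] = out.get(t, 0) + m * n
    return {t: m for t, m in out.items() if m != 0}


def delta_join(r, dr, s, ds):
    """Delta of r JOIN s for changes dr and ds (negative multiplicities are deletions)."""
    return bag_add(bag_add(bag_join(dr, s), bag_join(r, ds)), bag_join(dr, ds))
```

Adding `delta_join(r, dr, s, ds)` to the old join result gives the same bag as re-evaluating the join on the updated inputs. When `dr` and `ds` are small, each term of the delta scans far fewer tuple pairs than the full join.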
Adapting Materialized Views after Redefinitions: Techniques and a Performance Study
We consider a variant of the view maintenance problem: How does one keep a materialized view up-to-date when the view definition itself changes? Can one do better than recomputing the view from the base relations? Traditional view maintenance tries to maintain the materialized view in response to modifications to the base relations; we try to "adapt" the view in response to changes in the view definition. Such techniques are needed for applications where the user can change queries dynamically and see the changes in the results fast. Data archaeology, data visualization, and dynamic queries are examples of such applications. We consider all possible redefinitions of SQL SELECT-FROM-WHERE-GROUPBY-HAVING, UNION, and MINUS views, and show how these views can be adapted using the old materialization for the cases where it is possible to do so. We identify extra information that can be kept with a materialization to facilitate redefinition. Multiple simultaneous changes to a view can be handled without necessarily materializing intermediate results. We identify guidelines for users and database administrators that can be used to facilitate efficient view adaptation. We perform a systematic experimental evaluation of our proposed techniques. Our evaluation indicates that adaptation is more efficient than rematerialization in most cases. Certain adaptation techniques can be up to 1,000 times better. We also point out the physical layouts that can benefit adaptation.
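One simple case of view adaptation can be sketched as follows. The example assumes a hypothetical view defined by a threshold predicate on an `amount` attribute and assumes that attribute is kept in the materialization. The function `adapt_threshold_view` is illustrative and is not taken from the paper.

```python
def adapt_threshold_view(old_view, old_min, new_min, base_rows):
    """Adapt a view {row | row['amount'] > threshold} when the threshold changes.

    old_view: rows materialized under the old threshold old_min
    base_rows: the base relation, consulted only when the predicate is loosened
    """
    if new_min >= old_min:
        # Tighter predicate: the new view is a subset of the old materialization,
        # so filtering the stored rows is enough.
        return [r for r in old_view if r["amount"] > new_min]
    # Looser predicate: keep the old rows and fetch only the missing band
    # (new_min, old_min] from the base relation.
    extra = [r for r in base_rows if new_min < r["amount"] <= old_min]
    return old_view + extra
```

When the predicate is tightened the base relation is never read. When it is loosened, only the rows in the newly admitted band are fetched instead of recomputing the whole view.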
ViewDF: a Flexible Framework for Incremental View Maintenance in Stream Data Warehouses
Because of increasing data sizes and demands for low latency in modern data analysis, traditional data warehousing technologies are being pushed beyond their limits. Several stream data warehouse (SDW) systems, which are warehouses that ingest append-only data feeds and support frequent refresh cycles, have been proposed, including different methods to improve the responsiveness of the systems. Materialized views are critical in large-scale data warehouses due to their ability to speed up queries. Thus an SDW maintains layers of materialized views. Materialized view maintenance in SDW systems introduces new challenges. However, some of the existing SDW systems do not address the maintenance of views, while others employ view maintenance techniques that are not efficient. This thesis presents ViewDF, a flexible framework for incremental maintenance of materialized views in SDW systems that generalizes existing techniques and enables new optimizations for views defined with operators that are common in stream analytics. We give a special view definition (ViewDF) to enhance the traditional way of creating views in SQL by being able to reference any partition of any table. We describe a prototype system based on this idea, which allows users to write ViewDFs directly and can automatically translate a broad class of queries into ViewDFs. Several optimizations are proposed, and experiments show that our proposed system can improve view maintenance time by a factor of two or more in practical settings.
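The benefit of letting a view partition reference other partitions, including an earlier partition of the view itself, can be sketched with a sliding-window sum. This is only an illustration under assumed names (`refresh_window_sum`, integer partition ids, one number per partition). It does not reproduce ViewDF syntax or the prototype's behavior.

```python
def refresh_window_sum(view, source, t, width):
    """Compute view partition t as the sum of source partitions t-width+1 .. t.

    view and source are dicts mapping a partition id to a number.
    """
    if t - 1 in view:
        # Incremental: reuse the previous view partition, add the newest source
        # partition, and subtract the one that slid out of the window.
        view[t] = view[t - 1] + source[t] - source.get(t - width, 0)
    else:
        # No previous view partition to build on: compute from the source partitions.
        view[t] = sum(source.get(i, 0) for i in range(t - width + 1, t + 1))
    return view[t]
```

The incremental branch reads two source partitions and one view partition regardless of the window width, whereas the fallback reads `width` source partitions.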
Derivation of incremental equations for PNF nested relations
Incremental view maintenance techniques are required for many new types of data models that are increasingly used in industry. One of these is the nested relational model, which is used for modelling complex objects in databases. In this paper we derive a group of expressions for incrementally evaluating query expressions in the nested relational model. We also present an algorithm to propagate base relation updates to a materialized view when the view is defined as a complex query.
A Data warehouse within a Federated database architecture
Research in heterogeneous databases [Sheth & Larson 90] has provided methods to integrate disparate databases into a single unifying architecture, the federated database model. But these methods are limited inasmuch as: 1) the federated schema is non-materialized, which means that queries have to be evaluated in the individual databases, resulting in slower response times, and 2) data from external sources are not integrated within the federated schema. We propose to extend the federated architecture to include a data warehouse [Inmon 94, Kimball 96] modeled as a materialized view [Hanson 87] of the underlying federated schema. In addition, we employ view maintenance techniques to maintain the data warehouse against changes in the underlying operational sources. We adopt a deferred view maintenance [Colby et al 96] approach, rather than the immediate approach adopted by the Stanford WHIPS project [Hammer et al 95]. This approach is preferred because a great deal of decision-making may not require current data, but for decisions that do require current data, the model provides a mechanism to obtain it without adding too much overhead. For example, a data warehouse at the central office of a large chain of stores would like to have access to current inventory levels at individual stores before deciding on a promotion. Similarly, access to both historical and current stock prices helps an investment company re-create point-in-time snapshots to help predict movements in stock prices. This approach provides the following advantages:
• A unified architecture that ties the data warehouse to multiple heterogeneous databases.
• A method of maintaining the data warehouse as an integrated materialized view of the underlying data sources.
• Flexible access to current data residing in the data sources.
• Ease of maintenance against any change to the schema in the data source or warehouse.
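The contrast between deferred and immediate maintenance can be sketched in a few lines. The class below is a hypothetical illustration, not the architecture proposed in the abstract: source changes are only logged, and they are applied either at a scheduled refresh or when a reader explicitly asks for current data.

```python
class DeferredView:
    """Hypothetical warehouse view maintained lazily from a log of source changes."""

    def __init__(self):
        self.rows = {}      # materialized view: key -> value
        self.pending = []   # logged source changes not yet applied

    def log_change(self, key, value):
        """Record a source update; a value of None means the row was deleted."""
        self.pending.append((key, value))

    def refresh(self):
        """Apply all logged changes in arrival order, then clear the log."""
        for key, value in self.pending:
            if value is None:
                self.rows.pop(key, None)
            else:
                self.rows[key] = value
        self.pending.clear()

    def read(self, key, current=False):
        """Return the stored value; refresh first if current data is required."""
        if current:
            self.refresh()
        return self.rows.get(key)
```

A decision that tolerates stale data calls `read(key)` and pays nothing for maintenance. A decision such as checking current inventory before a promotion calls `read(key, current=True)` and triggers the refresh on demand.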