Online View Selection for the Web
View materialization has been shown to ameliorate the scalability
problem of data-intensive web servers. However, unlike data warehouses
which are off-line during updates, most web servers maintain their
back-end databases online and perform updates concurrently with user
accesses. In such environments, the selection of views to materialize
must be performed online; both performance and data freshness should
be considered. In this paper, we discuss the Online View Selection
problem: select which views to materialize in order to maximize
performance while maintaining freshness at acceptable levels. We
define Quality of Service and Quality of Data metrics and present
OVIS(theta), an adaptive algorithm for the Online View Selection
problem. OVIS(theta) evolves the materialization decisions to match
the constantly changing access/update patterns on the Web. The
algorithm is also able to identify infeasible freshness levels,
effectively avoiding saturation at the server. We performed extensive
experiments under various workloads, which showed that our online
algorithm comes close to the optimal off-line selection algorithm.
Also UMIACS-TR-2002-2
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22-26, 201
Update propagation strategies for improving the quality of data on the Web
Dynamically generated web pages are ubiquitous today but their high demand for resources creates a huge scalability problem at the servers. Traditional web caching is not able to solve this problem since it cannot provide any guarantees as to the freshness of the cached data. A robust solution to the problem is web materialization, where pages are cached at the web server and constantly updated in the background, resulting in fresh data accesses on cache hits. In this work, we define Quality of Data metrics to evaluate how fresh the data served to the users is. We then focus on the update scheduling problem: given a set of views that are materialized, find the best order to refresh them, in the presence of continuous updates, so that the overall Quality of Data (QoD) is maximized. We present a QoD-aware Update Scheduling algorithm that is adaptive and tolerant to surges in the incoming update stream. We performed extensive experiments using real traces and synthetic ones, which show that our algorithm consistently outperforms FIFO scheduling by up to two orders of magnitude.
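The update scheduling problem described in this abstract can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: it assumes one refresh per time step, measures QoD as the access-weighted fraction of time that views are fresh, and compares FIFO ordering against a simple greedy rule that refreshes the most frequently accessed stale view first. The names `simulate`, `fifo`, and `by_popularity` are hypothetical and exist only for this illustration.

```python
def simulate(updates, access_weight, pick, steps):
    """Toy model of refreshing materialized views (an assumption, not the paper's model).

    updates:       dict mapping a time step to the views that become stale then
    access_weight: dict mapping each view to its relative access frequency
    pick:          function(pending, access_weight) -> index of the view to refresh next
    steps:         number of time steps; one refresh is applied per step

    Returns the access-weighted fraction of time that views were fresh.
    """
    pending = []   # stale views awaiting a refresh, in arrival order
    stale = set()
    fresh_score = 0.0
    total = sum(access_weight.values()) * steps
    for t in range(steps):
        for view in updates.get(t, []):
            # Repeated updates to an already stale view are coalesced.
            if view not in stale:
                stale.add(view)
                pending.append(view)
        if pending:
            view = pending.pop(pick(pending, access_weight))
            stale.discard(view)
        # Accesses during this step are fresh only for views that are not stale.
        fresh_score += sum(w for v, w in access_weight.items() if v not in stale)
    return fresh_score / total


def fifo(pending, access_weight):
    """Refresh views in the order their updates arrived."""
    return 0


def by_popularity(pending, access_weight):
    """Greedy heuristic: refresh the most frequently accessed stale view first."""
    return max(range(len(pending)), key=lambda i: access_weight[pending[i]])


if __name__ == "__main__":
    weights = {"a": 1, "b": 1, "c": 10}   # view "c" receives most accesses
    burst = {0: ["a", "b", "c"]}          # all three views become stale at step 0
    print("FIFO QoD:      ", simulate(burst, weights, fifo, steps=3))
    print("Popularity QoD:", simulate(burst, weights, by_popularity, steps=3))
```

In this toy setting the popularity-ordered schedule scores higher than FIFO because the heavily accessed view is refreshed first. The gap reported in the abstract comes from the authors' own algorithm and traces, not from this sketch.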
Reduction of Materialized View Staleness Using Online Updates
Updating the materialized views stored in data warehouses usually implies making the warehouse unavailable to users. We propose MAUVE, a new algorithm for online incremental view updates that uses timestamps and allows consistent read-only access to the warehouse while it is being updated. The algorithm propagates the updates to the views more often than the typical once a day in order to reduce view staleness. We have implemented MAUVE on top of the Informix Universal Server and used a synthetic workload generator to experiment with various update workloads and different view update frequencies. Our results show that all kinds of update streams benefit from view updates that are more frequent than once a day. However, there is a clear maximum for the view update frequency, for which view staleness is minimal. 1 Introduction Data warehouses contain data replicated from several external sources, collected to answer decision support queries. The replicated data is often copied in re..