82 research outputs found

    Online View Selection for the Web

    View materialization has been shown to ameliorate the scalability problem of data-intensive web servers. However, unlike data warehouses which are off-line during updates, most web servers maintain their back-end databases online and perform updates concurrently with user accesses. In such environments, the selection of views to materialize must be performed online; both performance and data freshness should be considered. In this paper, we discuss the Online View Selection problem: select which views to materialize in order to maximize performance while maintaining freshness at acceptable levels. We define Quality of Service and Quality of Data metrics and present OVIS(theta), an adaptive algorithm for the Online View Selection problem. OVIS(theta) evolves the materialization decisions to match the constantly changing access/update patterns on the Web. The algorithm is also able to identify infeasible freshness levels, effectively avoiding saturation at the server. We performed extensive experiments under various workloads, which showed that our online algorithm comes close to the optimal off-line selection algorithm. Also UMIACS-TR-2002-2

    Update propagation strategies for improving the quality of data on the Web

    Dynamically generated web pages are ubiquitous today, but their high demand for resources creates a huge scalability problem at the servers. Traditional web caching is not able to solve this problem since it cannot provide any guarantees as to the freshness of the cached data. A robust solution to the problem is web materialization, where pages are cached at the web server and constantly updated in the background, resulting in fresh data accesses on cache hits. In this work, we define Quality of Data metrics to evaluate how fresh the data served to the users is. We then focus on the update scheduling problem: given a set of views that are materialized, find the best order to refresh them, in the presence of continuous updates, so that the overall Quality of Data (QoD) is maximized. We present a QoD-aware Update Scheduling algorithm that is adaptive and tolerant to surges in the incoming update stream. We performed extensive experiments using real traces and synthetic ones, which show that our algorithm consistently outperforms FIFO scheduling by up to two orders of magnitude.

    Reduction of Materialized View Staleness Using Online Updates

    Updating the materialized views stored in data warehouses usually implies making the warehouse unavailable to users. We propose MAUVE, a new algorithm for online incremental view updates that uses timestamps and allows consistent read-only access to the warehouse while it is being updated. The algorithm propagates the updates to the views more often than the typical once a day in order to reduce view staleness. We have implemented MAUVE on top of the Informix Universal Server and used a synthetic workload generator to experiment with various update workloads and different view update frequencies. Our results show that all kinds of update streams benefit from more frequent view updates, instead of just once a day. However, there is a clear maximum for the view update frequency, for which view staleness is minimal. 1 Introduction Data warehouses contain data replicated from several external sources, collected to answer decision support queries. The replicated data is often copied in re..