    Optimized Generation and Maintenance of Materialized View using Adaptive Mechanism

    A data warehouse stores an enormous amount of data gathered from multiple data sources and is mainly used by managers for analysis, so making this data available quickly is essential. A materialized view returns the result of a query faster than computing the same result from the base tables. Materializing every view is not possible because of the storage space and maintenance cost involved. It is therefore necessary to select the materialized views that minimize both query response time and maintenance cost. This paper suggests an effective approach for the selection and maintenance of materialized views. DOI: 10.17762/ijritcc2321-8169.15050

    Optimized cost effective approach for selection of materialized views in data warehousing

    A data warehouse processes a given set of queries efficiently by using multiple materialized views. Because of constraints on space and maintenance cost, materializing all views is infeasible. Selecting which views to materialize is therefore one of the critical decisions in designing a data warehouse for optimal efficiency. The goal is to select a suitable set of views that minimizes the total cost associated with the materialized views. In this paper, we present a framework, an optimized version of our previous work, for selecting views to materialize under a given storage space constraint. It aims to achieve the best combination of good query response, low query processing cost, and low view maintenance cost. The framework considers all the cost metrics associated with materialized view selection: query execution frequencies, base-relation update frequencies, query access costs, view maintenance costs, and the system's storage space constraints. Because it selects the most cost-effective views to materialize, the framework optimizes maintenance, storage, and query processing cost. The outcome is an efficient data warehousing system.
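
The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the paper's framework: it assumes each candidate view's benefit is independent of the other views (real views interact), and it uses a simple greedy ranking by net benefit per unit of storage. The class `CandidateView`, its fields, and the function `select_views` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    """Hypothetical description of one view that could be materialized."""
    name: str
    size: float                         # storage the view would occupy
    query_frequency: float              # how often queries that can use it run
    cost_saved_per_query: float         # access cost saved each time it is used
    update_frequency: float             # how often its base relations change
    maintenance_cost_per_update: float  # cost of refreshing it per update

    def net_benefit(self) -> float:
        # Frequency-weighted query savings minus frequency-weighted upkeep.
        return (self.query_frequency * self.cost_saved_per_query
                - self.update_frequency * self.maintenance_cost_per_update)


def select_views(candidates, space_budget):
    """Greedy heuristic (an assumption, not the paper's algorithm):
    take views in order of net benefit per unit of storage while they fit."""
    ranked = sorted(
        (v for v in candidates if v.size > 0 and v.net_benefit() > 0),
        key=lambda v: v.net_benefit() / v.size,
        reverse=True,
    )
    chosen, used = [], 0.0
    for view in ranked:
        if used + view.size <= space_budget:
            chosen.append(view)
            used += view.size
    return chosen
```

A view whose maintenance cost exceeds its query savings is never selected, and the storage budget caps the total size of the chosen set. Greedy ranking is only a heuristic and gives no optimality guarantee in general.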

    In-memory caching for multi-query optimization of data-intensive scalable computing workloads

    In modern large-scale distributed systems, analytics jobs submitted by various users often share similar work. Instead of optimizing jobs independently, multi-query optimization techniques can be employed to save a considerable amount of cluster resources. In this work, we introduce a novel method that combines in-memory cache primitives with multi-query optimization to improve the efficiency of data-intensive, scalable computing frameworks. By carefully selecting and exploiting common (sub)expressions while satisfying memory constraints, our method transforms a batch of queries into a new, more efficient one that avoids unnecessary recomputations. To find feasible and efficient execution plans, our method uses a cost-based optimization formulation akin to the multiple-choice knapsack problem. Experiments on a prototype implementation of our system show significant benefits of work sharing for TPC-DS workloads.
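
For readers unfamiliar with the multiple-choice knapsack problem mentioned in this abstract, the following is a minimal textbook-style sketch, not the authors' formulation. It assumes each group stands for one shared (sub)expression, each option in a group is one way of caching it with an integer memory cost and an estimated benefit, and at most one option per group may be chosen. The function name and the data layout are hypothetical.

```python
def multiple_choice_knapsack(groups, capacity):
    """Pick at most one (weight, value) option per group so that the total
    weight stays within `capacity` and the total value is maximal.

    groups   -- list of lists of (weight, value); weights are non-negative ints
    capacity -- non-negative int, e.g. the memory budget in some fixed unit
    Returns (best_value, chosen), where chosen[i] is the index of the option
    taken from groups[i], or None if that group is skipped.
    """
    # best[c] = best value achievable with total weight <= c so far.
    best = [0] * (capacity + 1)
    picks = []  # picks[g][c] = option index used for group g at budget c
    for options in groups:
        new_best = best[:]               # default: skip this group
        pick = [None] * (capacity + 1)
        for idx, (weight, value) in enumerate(options):
            for c in range(weight, capacity + 1):
                candidate = best[c - weight] + value
                if candidate > new_best[c]:
                    new_best[c] = candidate
                    pick[c] = idx
        picks.append(pick)
        best = new_best

    # Walk back through the groups to recover which options were taken.
    chosen = [None] * len(groups)
    c = capacity
    for g in range(len(groups) - 1, -1, -1):
        idx = picks[g][c]
        chosen[g] = idx
        if idx is not None:
            c -= groups[g][idx][0]
    return best[capacity], chosen
```

The dynamic program runs in time proportional to the capacity multiplied by the total number of options, so the memory budget should be expressed in coarse units (for example, megabytes rather than bytes).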

    Research on Materialized View Selection

    This paper defines the view selection problem in the field of data warehouses and discusses related issues such as the cost model, benefit function, cost computation, constraints, and view indexes. It then reviews three categories of view selection methods, namely static, dynamic, and hybrid methods, and introduces representative work for each category. Finally, it discusses future research directions in this area. Supported by the National Natural Science Foundation of China under Grant No. 60473051 and the National High-Tech Research and Development Plan of China (863) under Grant Nos. 2007AA01Z191, 2006AA01Z230.

    Automatic physical database design : recommending materialized views

    This work discusses physical database design, focusing on the problem of selecting materialized views to improve the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that in most cases our approaches perform better than previous ones in terms of effectiveness and efficiency.

    Information Technology for Building Hybrid Distributed Data Warehouses

    This dissertation solves the pressing scientific and practical problem of creating an information technology for building hybrid distributed data warehouses that takes into account the properties of the data and the statistics of query execution against the warehouse. The problem of building data warehouses with regard to the properties of the data and of the executed queries is analyzed, and the relevance of solving it is substantiated. Requirements for an information technology for building hybrid distributed warehouses are defined. The concept of multibase data warehouses is introduced, and conceptual, logical, and physical models of such warehouses are developed, together with procedures for transitions between levels. Data integration into the warehouse is described by means of procedures for transforming data elements and operations and for choosing data representation models. The placement of data across nodes and the data replication routes are determined by the criterion of minimum total cost of storing and processing data, using a modified genetic algorithm. Based on the proposed models and methods, an information technology for building hybrid distributed warehouses has been created that solves the stated scientific problem. The technology has been applied in developing information and information-analytical systems of the Ministry of Finance of Ukraine. The results of its deployment confirmed that it meets the stated requirements.