2,575 research outputs found

    Clustering-Based Materialized View Selection in Data Warehouses

    Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering) to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited.
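
    The greedy, cost-driven selection step mentioned in this abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' algorithm: the `CandidateView` class, the `greedy_select` function, the view names, and all sizes and benefits are hypothetical, and each view's benefit is assumed to be fixed, whereas a real cost model would re-estimate benefits as views are added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    """Hypothetical candidate view with precomputed cost estimates."""
    name: str
    size: float     # assumed storage cost of materializing the view (> 0)
    benefit: float  # assumed reduction in workload access cost


def greedy_select(candidates, storage_budget):
    """Pick views by benefit per unit of storage until the budget runs out."""
    chosen = []
    remaining = storage_budget
    # Rank once by benefit density; benefits are treated as independent here.
    ranked = sorted(candidates, key=lambda v: v.benefit / v.size, reverse=True)
    for view in ranked:
        if view.benefit > 0 and view.size <= remaining:
            chosen.append(view)
            remaining -= view.size
    return chosen


# Illustrative numbers only; they do not come from the paper.
candidates = [
    CandidateView("sales_by_month", size=40.0, benefit=120.0),
    CandidateView("sales_by_region", size=25.0, benefit=50.0),
    CandidateView("sales_by_product_day", size=90.0, benefit=150.0),
]
selected = greedy_select(candidates, storage_budget=100.0)
print([v.name for v in selected])
```

    With a budget of 100, the sketch keeps the two densest views and skips the third because it no longer fits, which is the kind of trade-off a storage-constrained selection has to make.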

    A workload‑driven approach for view selection in large dimensional datasets

    The information explosion the world has witnessed in the last two decades has forced businesses to adopt a data-driven culture in order to be competitive. These data-driven businesses have access to countless sources of information, and face the challenge of making sense of overwhelming amounts of data in an efficient and reliable manner, which implies the execution of read-intensive operations. In the context of this challenge, a framework for the dynamic read-optimization of large dimensional datasets has been designed, and on top of it a workload-driven mechanism for automatic materialized view selection and creation has been developed. This paper presents an extensive description of this mechanism, along with a proof-of-concept implementation of it and its corresponding performance evaluation. Results show that the proposed mechanism is able to derive a limited but comprehensive set of views leading to a drop in query latency ranging from 80% to 99.99% at the expense of 13% of the disk space used by the base dataset. This way, the devised mechanism enables speeding up query execution by building materialized views that match the actual demand of query workloads.

    Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation

    Materialized views are important for optimizing Business Intelligence (BI) systems when they are designed without data cubes. Selecting candidate queries for materialized views from a large number of queries is a challenging task. Most past work finds frequent queries in the past workload and creates materialized views from them, either by manually analyzing the workload or by applying approximate string matching algorithms to the query text. Most existing methods suggest complete queries but ignore query components such as subqueries when creating materialized views. This paper presents a novel method to determine on which queries and query components materialized views can be created to optimize aggregate and join queries, by mining a database of query execution plans that are in the form of binary trees. The proposed algorithm showed a significant improvement in the number of optimized queries because it selects the queries to optimize based on the execution plan tree of the query rather than on the query text used by traditional methods. To select a correct set of queries to be optimized using materialized views, the paper proposes an efficient specialized frequent tree component mining algorithm with novel heuristics to prune the search space. These frequent components are used to determine the possible set of candidate queries for the creation of materialized views. Experimentation on standard, real, and synthetic data sets, together with the theoretical basis, proved that the proposed method is able to optimize a large number of queries with fewer materialized views and showed a significant improvement in performance compared to traditional methods.
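
    The idea of mining execution plans shaped as binary trees for frequent components can be sketched as below. This is only an illustration under strong assumptions, not the paper's algorithm: plans are hypothetical nested tuples `(operator, left, right)`, only complete rooted subtrees are counted, support is the number of plans containing a subtree, and none of the paper's pruning heuristics are modeled. The names `subtree_keys` and `frequent_subtrees` are invented for this sketch.

```python
from collections import Counter


def subtree_keys(plan, out):
    """Return a canonical key for `plan` and add every subtree's key to `out`."""
    if plan is None:
        return "-"
    op, left, right = plan
    key = f"({op} {subtree_keys(left, out)} {subtree_keys(right, out)})"
    out.add(key)
    return key


def frequent_subtrees(plans, min_support):
    """Count in how many plans each complete subtree occurs; keep frequent ones."""
    support = Counter()
    for plan in plans:
        seen = set()  # count each subtree at most once per plan
        subtree_keys(plan, seen)
        support.update(seen)
    return {key: n for key, n in support.items() if n >= min_support}


def scan(table):
    """Leaf node of a hypothetical plan: a base table scan."""
    return (f"Scan[{table}]", None, None)


# A join shared by two hypothetical aggregate queries.
join_sc = ("Join[sales.cid=customer.id]", scan("sales"), scan("customer"))
plans = [
    ("Aggregate[sum(amount) by region]", join_sc, None),
    ("Aggregate[count(*) by city]", join_sc, None),
    ("Filter[year=2020]", scan("sales"), None),
]
frequent = frequent_subtrees(plans, min_support=2)
for key, n in sorted(frequent.items()):
    print(n, key)
```

    In this toy workload the shared join subtree appears in two plans, so it surfaces as a candidate for materialization even though no complete query repeats. The bare table scans are also reported as frequent; a practical miner would presumably discard such trivial components.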

    Efficient Approach for View Selection for Data Warehouse Using Tree Mining and Evolutionary Computation

    Selection of a proper set of views to materialize plays an important role in database performance. There are many methods of view selection, which use different techniques and frameworks to select an efficient set of views for materialization. In this paper, we present a new efficient, scalable method for view selection under given storage constraints using a tree mining approach and evolutionary optimization. The tree mining algorithm is designed to determine the exact frequency of (sub)queries in the historical SQL dataset. The query cost model achieves the objective of maximizing the performance benefits from the final view set, which is derived from the frequent view set given by the tree mining algorithm. The performance benefit of a query is defined as a function of query frequency, query creation cost, and query maintenance cost. The experimental results show that the proposed method is successful in recommending a solution that is fairly close to the optimal solution.

    Optimized Generation and Maintenance of Materialized View using Adaptive Mechanism

    A data warehouse stores an enormous amount of data gathered from multiple data sources and is mainly used by managers for analysis. Making this data available quickly is therefore essential. With a materialized view, the result of a query can be obtained in less time than by accessing the base tables. Materializing all views is not possible because of the storage space and maintenance cost required. It is therefore necessary to select materialized views that minimize query response time and maintenance cost. In this paper, an effective approach is suggested for the selection and maintenance of materialized views. DOI: 10.17762/ijritcc2321-8169.15050