9 research outputs found

    A solution to the materialized view selection problem in data warehousing

    Get PDF
    One of the most important decisions in the physical designing of a data warehouse is the selection of materialized views and indexes to be created. The problem is to select an appropriate set of views and indexes to storage that minimizes the total query response time, as long as the cost of maintaining them, given a constraint of some resource like storage space, is kept as low as possible.In this work, we have developed a new algorithm for the general problem of se-lection of views considering indexes, as an extension to a well-known algorithm. We present a heuristic for selection of views and indexes to optimize total que-ry response under a materialization time constraint. Finally, we present an ex-perimental comparison of our proposal with the considered state-of-art ap-proach.XI Workshop Bases de Datos y Minería de DatosRed de Universidades con Carreras de Informática (RedUNCI

    A solution to the materialized view selection problem in data warehousing

    Get PDF
    One of the most important decisions in the physical designing of a data warehouse is the selection of materialized views and indexes to be created. The problem is to select an appropriate set of views and indexes to storage that minimizes the total query response time, as long as the cost of maintaining them, given a constraint of some resource like storage space, is kept as low as possible.In this work, we have developed a new algorithm for the general problem of se-lection of views considering indexes, as an extension to a well-known algorithm. We present a heuristic for selection of views and indexes to optimize total que-ry response under a materialization time constraint. Finally, we present an ex-perimental comparison of our proposal with the considered state-of-art ap-proach.XI Workshop Bases de Datos y Minería de DatosRed de Universidades con Carreras de Informática (RedUNCI

    A solution to the materialized view selection problem in data warehousing

    Get PDF
    One of the most important decisions in the physical designing of a data warehouse is the selection of materialized views and indexes to be created. The problem is to select an appropriate set of views and indexes to storage that minimizes the total query response time, as long as the cost of maintaining them, given a constraint of some resource like storage space, is kept as low as possible.In this work, we have developed a new algorithm for the general problem of se-lection of views considering indexes, as an extension to a well-known algorithm. We present a heuristic for selection of views and indexes to optimize total que-ry response under a materialization time constraint. Finally, we present an ex-perimental comparison of our proposal with the considered state-of-art ap-proach.XI Workshop Bases de Datos y Minería de DatosRed de Universidades con Carreras de Informática (RedUNCI

    A workload‑driven approach for view selection in large dimensional datasets

    Get PDF
    The information explosion the world has witnessed in the last two decades has forced businesses to adopt a data-driven culture for them to be competitive. These data-driven businesses have access to countless sources of information, and face the challenge of making sense of overwhelming amounts of data in a efficient and reliable manner, which implies the execution of read-intensive operations. In the context of this challenge, a framework for the dynamic read-optimization of large dimensional datasets has been designed, and on top of it a workload-driven mechanism for automatic materialized view selection and creation has been developed. This paper presents an extensive description of this mechanism, along with a proof-of-concept implementation of it and its corresponding performance evaluation. Results show that the proposed mechanism is able to derive a limited but comprehensive set of views leading to a drop in query latency ranging from 80% to 99.99% at the expense of 13% of the disk space used by the base dataset. This way, the devised mechanism enables speeding up query execution by building materialized views that match the actual demand of query workloads

    Data mining-based materialized view and index selection in data warehouses

    No full text
    International audienceMaterialized views and indexes are physical structures for accelerating data access that are casually used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view-index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes

    Data mining-based materialized view and index selection in data warehouses

    No full text
    International audienceMaterialized views and indexes are physical structures for accelerating data access that are casually used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view-index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes

    Optimization of Progressive Queries via Materialized Views for Large Databases

    Full text link
    There is an increasing demand to efficiently process emerging types of queries, such as progressive queries (PQ), on large scale databases from numerous contemporary applications including telematics, e-commerce, and social media. Unlike a conventional query, a PQ consists of a set of interrelated step-queries (SQ). A user formulates a new SQ on the fly based on the result(s) from the previously executed SQ(s). Processing PQs raises a number of new challenges. Existing database management systems were not designed to efficiently process such queries. In this dissertation, we propose a suite of novel materialized-view based techniques to efficiently process PQs. First, we propose a dynamic materialized-view based approach to efficiently processing a special type of PQs, called monotonic linear PQs. We introduce a so-called superior relationship graph to capture superior relationships among SQs of such a PQ and suggest a method to estimate the benefit of keeping the result of an SQ as a materialized view using the graph. To efficiently construct the superior relationship graph, we propose two algorithms: generating-based and pruning-based. To improve the view searching efficiency and quality, we design an algorithm with a special storage structure to store and manage the materialized views. Second, to handle generic PQs, we define a so-called multiple query dependency graph to capture the data source dependency relationships that exist among SQs and external tables of a generic PQ. Using the graph, a mathematical benefit estimation model, which takes both the impact and the effectiveness of materialization into consideration, is derived. A greedy method and a dynamic programming method to solve the view maintenance problem are proposed. Third, to efficiently find usable materialized views from the view space/set for answering a given SQ, we suggest a dynamic materialized view index method. A special index tree structure with nodes ordered by a two-level priority rule that facilitates efficient locating of different types of nodes is designed. Bitmaps encoded with special methods are also used to refine the pruning of unusable views during a search. Fourth, to support PQs in a big data environment like Hadoop, we propose an index based technique for performing a new column family join operation on Hbase tables. To efficiently process such a join operation, we suggest a multiple freedom family index. A parallel MapReduce algorithm to construct the index is developed. To perform a column family join on two Hbase tables using the indexes, we present two partitioning methods to balance the workload among map nodes in a MapReduce algorithm. The introduced column family join operation and its relevant processing technique can ensure the closure property that is essential to the processing of PQs. To examine the performance of the proposed techniques, we performed extensive empirical and theoretical analyses. Our studies show that the proposed techniques are quite promising in efficiently processing PQs. To our knowledge, our work is the first to apply the materialized-view based approach to efficiently processing progressive queries on large databases.Ph.D.College of Engineering and Computer ScienceUniversity of Michigan-Dearbornhttp://deepblue.lib.umich.edu/bitstream/2027.42/110311/1/ChaoZhu_Thesis_final.pdfDescription of ChaoZhu_Thesis_final.pdf : Dissertatio
    corecore