27 research outputs found

    A solution to the materialized view selection problem in data warehousing

    One of the most important decisions in the physical design of a data warehouse is the selection of materialized views and indexes to be created. The problem is to select a set of views and indexes to store that minimizes the total query response time while keeping the cost of maintaining them as low as possible, subject to a constraint on some resource such as storage space. In this work, we have developed a new algorithm for the general problem of selection of views considering indexes, as an extension to a well-known algorithm. We present a heuristic for selection of views and indexes to optimize total query response under a materialization time constraint. Finally, we present an experimental comparison of our proposal with the considered state-of-the-art approach. XI Workshop Bases de Datos y Minería de Datos. Red de Universidades con Carreras de Informática (RedUNCI)
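
    To illustrate the selection problem stated in this abstract, here is a minimal sketch of a generic greedy heuristic that picks views by benefit per unit of storage. It is not the authors' algorithm: it ignores indexes and maintenance cost, assumes every view has a positive size, and all names and figures in it (`total_cost`, `greedy_select`, the view names, the costs) are hypothetical.

```python
def total_cost(queries, chosen):
    """Sum, over all queries, of the cheapest available way to answer each one."""
    return sum(
        min([q["base"]] + [cost for view, cost in q["via"].items() if view in chosen])
        for q in queries
    )


def greedy_select(view_sizes, queries, space_budget):
    """Greedily add the view with the best benefit per unit of space that still fits."""
    chosen, remaining = set(), space_budget
    while True:
        current = total_cost(queries, chosen)
        best, best_ratio = None, 0.0
        for view, size in view_sizes.items():
            if view in chosen or size > remaining:
                continue
            # Benefit = reduction in total query cost if this view is also stored.
            benefit = current - total_cost(queries, chosen | {view})
            ratio = benefit / size  # sizes are assumed to be positive
            if ratio > best_ratio:
                best, best_ratio = view, ratio
        if best is None:  # nothing fits, or nothing reduces the cost any further
            return chosen
        chosen.add(best)
        remaining -= view_sizes[best]


# Hypothetical workload: "base" is the cost without any view,
# "via" maps a view name to the query cost when that view is stored.
view_sizes = {"by_day": 50, "by_month": 10, "by_region": 20}
queries = [
    {"base": 100, "via": {"by_day": 40, "by_month": 5}},
    {"base": 100, "via": {"by_day": 40}},
    {"base": 80, "via": {"by_region": 10}},
]
selected = greedy_select(view_sizes, queries, space_budget=60)
print(sorted(selected), total_cost(queries, selected))
```

    A greedy choice like this is not guaranteed to be optimal; it only shows how a storage constraint and a query-cost objective interact.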

    Avaliação de algoritmos para a selecção de vistas materializadas em ambientes de data warehousing [Evaluation of algorithms for materialized view selection in data warehousing environments]

    Competition in the business world demands tighter monitoring of all the variables involved in business activities. Decision support systems emerged to ground the decision-making process in facts, and not only in the intuition of decision makers. These systems are now a key tool in decision making, because they reconcile and integrate all available information on a single technological platform. Any technique that improves the performance of these systems is therefore welcome. Among the available techniques, this work focuses on view materialization as a method for optimizing query processing. Materializing a view means evaluating its definition ahead of time and storing the resulting tuples in a table. Query response time is lower when the results of intermediate operations such as selections, projections, joins and aggregations are already stored in a table, because answering the query then reduces to scanning the materialized view. This article presents a preliminary study towards the development of a materialized view management system for data warehousing environments. It essentially compares the behaviour of two materialized view selection algorithms, BPUS and A*, both exhaustive (deterministic) search algorithms.

    Heuristic Algorithms for Designing a Data Warehouse with SPJ Views

    Automating and Optimizing Data-Centric What-If Analyses on Native Machine Learning Pipelines

    Software systems that learn from data with machine learning (ML) are used in critical decision-making processes. Unfortunately, real-world experience shows that the pipelines for data preparation, feature encoding and model training in ML systems are often brittle with respect to their input data. As a consequence, data scientists have to run different kinds of data-centric what-if analyses to evaluate the robustness and reliability of such pipelines, e.g., with respect to data errors or preprocessing techniques. These what-if analyses follow a common pattern: they take an existing ML pipeline, create a pipeline variant by introducing a small change, and execute this pipeline variant to see how the change impacts the pipeline's output score. The application of existing analysis techniques to ML pipelines is technically challenging as they are hard to integrate into existing pipeline code and their execution introduces large overheads due to repeated work. We propose mlwhatif to address these integration and efficiency challenges for data-centric what-if analyses on ML pipelines. mlwhatif enables data scientists to declaratively specify what-if analyses for an ML pipeline, and to automatically generate, optimize and execute the required pipeline variants. Our approach employs pipeline patches to specify changes to the data, operators and models of a pipeline. Based on these patches, we define a multi-query optimizer for efficiently executing the resulting pipeline variants jointly, with four subsumption-based optimization rules. Subsequently, we detail how to implement the pipeline variant generation and optimizer of mlwhatif. For that, we instrument native ML pipelines written in Python to extract dataflow plans with re-executable operators. We experimentally evaluate mlwhatif, and find that its speedup scales linearly with the number of pipeline variants in applicable cases, and is invariant to the input data size. In end-to-end experiments with four analyses on more than 60 pipelines, we show speedups of up to 13x compared to sequential execution, and find that the speedup is invariant to the model and featurization in the pipeline. Furthermore, we confirm the low instrumentation overhead of mlwhatif.

    Online View Selection for the Web

    View materialization has been shown to ameliorate the scalability problem of data-intensive web servers. However, unlike data warehouses which are off-line during updates, most web servers maintain their back-end databases online and perform updates concurrently with user accesses. In such environments, the selection of views to materialize must be performed online; both performance and data freshness should be considered. In this paper, we discuss the Online View Selection problem: select which views to materialize in order to maximize performance while maintaining freshness at acceptable levels. We define Quality of Service and Quality of Data metrics and present OVIS(theta), an adaptive algorithm for the Online View Selection problem. OVIS(theta) evolves the materialization decisions to match the constantly changing access/update patterns on the Web. The algorithm is also able to identify infeasible freshness levels, effectively avoiding saturation at the server. We performed extensive experiments under various workloads, which showed that our online algorithm comes close to the optimal off-line selection algorithm. Also UMIACS-TR-2002-2

    “AccessBIM” - A Model of Environmental Characteristics for Vision Impaired Indoor Navigation and Way Finding

    The complexity of modern indoor environments has made navigation difficult for individuals with vision impairment. Hence, this thesis presents the AccessBIM framework, which is an optimized database that facilitates generation of a real-time floor plan with path determination. The AccessBIM framework has the potential to play an integral role in improving the independence and quality of life for people with vision impairment whilst also decreasing the cost to the community related to caretakers.