6 research outputs found

    A metadata-based platform for view computation in multi-source information systems

    A Multi-Source Information System (MSIS) consists of a set of independent data sources and a set of views or queries that define the users' requirements. Its differences from classical information systems introduce new design activities and motivate the development of new techniques. In this article we study a particular case of an MSIS, a Data Warehouse (DW), and propose a meta-model to represent its metadata from two points of view: the representation of the schemas, and the inter-schema relationships that allow a view to be computed from the source data. The meta-model is the core of a general platform for MSIS development. The platform allows design and maintenance tools to be integrated easily through a common data model that centralizes the data flow and the integrity-control routines between the tools. Track: Databases. Red de Universidades con Carreras en Informática (RedUNCI).
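    A hedged illustration of the two metadata viewpoints described above: schema representation and inter-schema mappings recording how a view is derived from source data. This is a minimal sketch only; all class names, fields, and the example query are hypothetical and are not taken from the paper's meta-model.

        # Minimal sketch of a metadata repository holding schemas and
        # inter-schema mappings (hypothetical names, not the paper's model).
        from __future__ import annotations
        from dataclasses import dataclass, field

        @dataclass
        class Schema:
            name: str                                  # a source table or a DW view
            attributes: list[str] = field(default_factory=list)

        @dataclass
        class Mapping:
            target: Schema                             # the view being defined
            sources: list[Schema]                      # schemas it is derived from
            derivation: str                            # e.g. a query or transformation rule

        @dataclass
        class MetadataRepository:
            schemas: dict[str, Schema] = field(default_factory=dict)
            mappings: list[Mapping] = field(default_factory=list)

            def register(self, schema: Schema) -> None:
                self.schemas[schema.name] = schema

            def lineage(self, view_name: str) -> list[str]:
                """Names of the source schemas a given view is computed from."""
                return [s.name
                        for m in self.mappings if m.target.name == view_name
                        for s in m.sources]

        # Tiny usage example: one source schema feeding one DW view.
        repo = MetadataRepository()
        src = Schema("src_sales", ["store_id", "amount"])
        view = Schema("dw_sales_by_store", ["store_id", "total_amount"])
        repo.register(src)
        repo.register(view)
        repo.mappings.append(Mapping(view, [src],
            "SELECT store_id, SUM(amount) FROM src_sales GROUP BY store_id"))
        print(repo.lineage("dw_sales_by_store"))       # ['src_sales']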

    View Materialization for Nested GPSJ Queries

    View materialization is a central issue in logical design of data warehouses since it is one of the most powerful techniques to improve response times for the workload. Most approaches in the literature only focus on the aggregation patterns required by the queries in the workload; in this paper we propose an original approach to materialization in which the workload is characterized by the presence of complex queries which cannot be effectively described only by their aggregation pattern. In particular, we consider queries represented by nested GPSJ (Generalized Projection / Selection / Join) expressions, in which sequences of aggregate operators may be applied to measures and selection predicates may be formulated, at different granularities, on both dimensions and measures. Other specific issues taken into account are related to the need for materializing derived measures as well as support measures to make algebraic operators distributive. Based on this query model, we propose an efficient algorithm to determine a restricted set of candidate views for materialization, to be fed into an optimization algorithm. Finally, the effectiveness of our approach is discussed with reference to a sample workload.
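    A toy sketch of what a nested GPSJ-style query looks like may help: an inner aggregation, a selection predicate on the aggregated measure (which cannot be expressed on the raw rows), and an outer aggregate operator at a coarser granularity. The pandas data and column names below are invented for illustration; the paper itself works at the algebraic/SQL level.

        import pandas as pd

        sales = pd.DataFrame({
            "region": ["N", "N", "N", "S", "S", "S"],
            "store":  ["s1", "s1", "s2", "s3", "s3", "s4"],
            "month":  ["2024-01", "2024-02", "2024-01", "2024-01", "2024-02", "2024-01"],
            "amount": [100, 120, 80, 200, 150, 60],
        })

        # Inner GPSJ block: aggregate the measure at (region, store, month) granularity.
        monthly = (sales
                   .groupby(["region", "store", "month"], as_index=False)["amount"]
                   .sum())

        # Selection predicate formulated on the aggregated (derived) measure.
        monthly = monthly[monthly["amount"] >= 100]

        # Outer GPSJ block: a second aggregate operator at a coarser granularity.
        per_region = monthly.groupby("region", as_index=False)["amount"].mean()
        print(per_region)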

    Automatic physical database design: recommending materialized views

    This work discusses physical database design, focusing on the problem of selecting materialized views for improving the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints; the results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions, and suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements; in this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that our approaches perform better than previous ones in most cases, in terms of both effectiveness and efficiency.
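    The general idea of deriving candidate views from commonalities among workload queries can be sketched as follows. Each query is abstracted as a set of referenced "features" (tables and predicates), pairwise intersections become candidate views, and a greedy pass picks candidates whose estimated benefit exceeds their maintenance cost. The actual approach in the work uses query rewriting and arithmetic-constraint reasoning; the feature model and cost numbers below are illustrative assumptions only.

        from itertools import combinations

        def candidate_views(queries: dict[str, frozenset[str]]) -> set[frozenset[str]]:
            """Commonalities between pairs of queries, taken as candidate views."""
            cands = set()
            for (_, f1), (_, f2) in combinations(queries.items(), 2):
                common = f1 & f2
                if common:
                    cands.add(common)
            return cands

        def greedy_select(cands, benefit, maint_cost, budget):
            """Pick candidates with positive net benefit until the budget is spent."""
            chosen, spent = [], 0.0
            for v in sorted(cands, key=lambda v: benefit(v) - maint_cost(v), reverse=True):
                cost = maint_cost(v)
                if benefit(v) > cost and spent + cost <= budget:
                    chosen.append(v)
                    spent += cost
            return chosen

        # Toy usage: three queries, two of which share a join/selection pattern.
        workload = {
            "q1": frozenset({"orders", "lineitem", "date>=2024"}),
            "q2": frozenset({"orders", "lineitem", "region='EU'"}),
            "q3": frozenset({"orders", "customer"}),
        }
        views = candidate_views(workload)
        picked = greedy_select(views, benefit=lambda v: 10 * len(v),
                               maint_cost=lambda v: 3.0, budget=10.0)
        print(picked)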

    A comparison of statistical machine learning methods in heartbeat detection and classification

    In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital, as some heartbeat irregularities are time consuming to detect. Therefore, analysis of electrocardiogram (ECG) signals is an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval- and amplitude-based features together with a few samples from the ECG signal as a feature vector. We study a variety of classification algorithms, focusing especially on a type of arrhythmia known as ventricular ectopic beats (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and choice of the classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contributions are the evaluation of existing classifiers over a range of sampling rates, the recommendation of a detection methodology to employ in a practical setting, and the extension of the notion of a mixture of experts to a larger class of algorithms.
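    A minimal sketch of the kind of pipeline the abstract describes: a feature vector combining interval/amplitude features with a few raw ECG samples, fed to a standard classifier. Synthetic data stands in for the MIT-BIH records, and the feature layout and choice of classifier are assumptions for illustration, not the authors' exact setup.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import classification_report

        rng = np.random.default_rng(0)
        n_beats = 1000
        rr_intervals = rng.normal(0.8, 0.1, (n_beats, 2))    # pre/post RR intervals (s)
        r_amplitude = rng.normal(1.0, 0.2, (n_beats, 1))     # R-peak amplitude (mV)
        raw_samples = rng.normal(0.0, 0.3, (n_beats, 16))    # a few samples around the R peak
        X = np.hstack([rr_intervals, r_amplitude, raw_samples])
        y = rng.integers(0, 2, n_beats)                      # 0 = normal, 1 = VEB (synthetic labels)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
        print(classification_report(y_te, clf.predict(X_te)))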

    Loading a Data Warehouse from the design trace

    Data Warehousing is the term generally used for the technology behind decision-support systems and OLAP applications. In particular, a Data Warehouse (DW) is the repository of integrated, subject-oriented, non-volatile, time-variant data that supports decision making in a company or organization. The structure of the DW is obtained as the result of a design process, generally guided by some methodology. The process of extracting the data from where they reside and transforming them for storage in the DW is called the loading process; the process of keeping these data up to date is called refreshment. The loading and refreshment of a DW that was designed using some methodology is the focus of this thesis. This work addresses the problem of loading and refreshing the DW by reusing the knowledge generated during its conceptual and logical design. In particular, it builds on an existing algorithm that generates the relational database schema of a DW from a conceptual design, an integrated source database, and design guidelines. Using the information available from that algorithm, this work analyzes the results that would be obtained with a naive approach (relying exclusively and directly on that algorithm), identifies the errors that such an approach could produce, and proposes a solution that performs better and resolves the errors found. The proposal continues the CSI group's line of work in conceptual and logical Data Warehouse design, complementing the existing techniques and algorithms with specific solutions to the loading and refreshment problems not previously addressed in that work.
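    The core idea, reusing the mappings recorded while the DW schema was generated (the design trace) to emit the loading statements instead of writing them by hand, can be sketched as follows. The trace format and SQL templates are invented for illustration; the thesis works against a specific schema-generation algorithm and handles the error cases that a naive generation like this one would miss.

        from dataclasses import dataclass

        @dataclass
        class TraceEntry:
            dw_table: str          # target table produced by the design algorithm
            source_query: str      # how its rows derive from the integrated source
            columns: list          # target column names

        def generate_load_sql(trace):
            """Naively turn each trace entry into an INSERT ... SELECT statement."""
            stmts = []
            for e in trace:
                cols = ", ".join(e.columns)
                stmts.append(f"INSERT INTO {e.dw_table} ({cols}) {e.source_query};")
            return stmts

        # Toy usage with a single trace entry.
        trace = [TraceEntry("dw_sales",
                            "SELECT store_id, month, SUM(amount) "
                            "FROM src_sales GROUP BY store_id, month",
                            ["store_id", "month", "total_amount"])]
        for stmt in generate_load_sql(trace):
            print(stmt)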