9 research outputs found

    A metadata-based platform for view computation in multi-source information systems

    Get PDF
    A Multi-Source Information System (MSIS) consists of a set of independent data sources and a set of views or queries that define the users' requirements. Its differences from classical information systems introduce new design activities and motivate the development of new techniques. In this article we study a particular case of an MSIS, a Data Warehouse (DW), and propose a meta-model to represent its metadata from two points of view: the representation of the schemas, and the inter-schema relationships that allow a view to be computed from the source data. The meta-model is the core of a general platform for MSIS development. The platform enables the easy integration of design and maintenance tools through a common data model that centralizes the data flow and the integrity-control routines between the tools. Track: Databases. Red de Universidades con Carreras en Informática (RedUNCI).
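
    A minimal Python sketch of the kind of meta-model described above, with invented class and attribute names: schemas plus inter-schema mappings that record how each view is derived from source data. It is only an illustration of the idea, not the platform's actual model.

        # Hypothetical sketch of a metadata meta-model: source/view schemas plus
        # inter-schema mappings describing how a view is computed from sources.
        from dataclasses import dataclass, field
        from typing import List

        @dataclass
        class Schema:
            name: str
            attributes: List[str]

        @dataclass
        class InterSchemaMapping:
            view: Schema                 # target view schema
            sources: List[Schema]        # source schemas it is derived from
            expression: str              # derivation expression, e.g. an SQL query

        @dataclass
        class MetadataRepository:
            schemas: List[Schema] = field(default_factory=list)
            mappings: List[InterSchemaMapping] = field(default_factory=list)

            def views_over(self, source_name: str) -> List[str]:
                """Names of views whose derivation uses the given source schema."""
                return [m.view.name for m in self.mappings
                        if any(s.name == source_name for s in m.sources)]

        # Example: one source table feeding one DW view.
        sales = Schema("sales", ["product", "store", "amount"])
        v_totals = Schema("v_totals", ["product", "total_amount"])
        repo = MetadataRepository(
            schemas=[sales, v_totals],
            mappings=[InterSchemaMapping(
                v_totals, [sales],
                "SELECT product, SUM(amount) FROM sales GROUP BY product")])
        print(repo.views_over("sales"))   # ['v_totals']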

    Automatic physical database design : recommending materialized views

    Get PDF
    This work discusses physical database design while focusing on the problem of selecting materialized views for improving the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that, in most cases, our approaches perform better than previous ones in terms of effectiveness and efficiency.
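
    The Python sketch below illustrates, under heavily simplified assumptions (queries modelled only as the sets of relations they join, toy benefit and cost functions), the general idea of deriving candidate views from commonalities among queries and then greedily selecting a set under a budget. It is not the thesis's algorithm.

        # Illustrative sketch: candidate views from query commonalities,
        # then greedy selection under a space budget.
        from itertools import combinations

        # Each query is modelled, very roughly, as the set of relations it joins.
        queries = {
            "q1": frozenset({"sales", "product", "store"}),
            "q2": frozenset({"sales", "product", "date"}),
            "q3": frozenset({"sales", "store", "date"}),
        }

        # Candidate views: pairwise commonalities between queries.
        candidates = {a & b for a, b in combinations(queries.values(), 2) if a & b}

        def benefit(view):
            """Toy benefit: how many queries could be rewritten using this view."""
            return sum(1 for q in queries.values() if view <= q)

        def cost(view):
            """Toy storage/maintenance cost: number of relations in the view."""
            return len(view)

        # Greedy selection under a budget, highest benefit-per-cost first.
        budget, chosen = 4, []
        for v in sorted(candidates, key=lambda v: benefit(v) / cost(v), reverse=True):
            if cost(v) <= budget:
                chosen.append(v)
                budget -= cost(v)

        print([sorted(v) for v in chosen])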

    Polyflow: a Polystore-compliant mechanism to provide interoperability to heterogeneous provenance graphs

    Get PDF
    Many scientific experiments are modeled as workflows. Workflows usually output massive amounts of data. To guarantee the reproducibility of workflows, they are usually orchestrated by Workflow Management Systems (WfMSs), which capture provenance data. Provenance represents the lineage of a data fragment throughout its transformations by activities in a workflow. Provenance traces are usually represented as graphs. These graphs allow scientists to analyze and evaluate the results produced by a workflow. However, each WfMS has a proprietary provenance format and captures provenance at a different level of granularity. This becomes a challenge in more complex scenarios in which the scientist needs to interpret provenance graphs generated by multiple WfMSs and workflows. To first understand the research landscape, we conduct a Systematic Literature Mapping, assessing existing solutions under several different lenses. With a clearer understanding of the state of the art, we propose a tool called Polyflow, which is based on the concept of Polystore systems, integrating several databases of heterogeneous origin by adopting a global ProvONE schema. Polyflow allows scientists to query multiple provenance graphs in an integrated way. Polyflow was evaluated by experts using provenance data collected from real experiments that generate phylogenetic trees through workflows. The experiment results suggest that Polyflow is a viable solution for interoperating heterogeneous provenance data generated by different WfMSs, from both a usability and a performance standpoint.
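
    A hypothetical Python sketch of the mediation idea: records from two invented WfMS-specific provenance formats are normalized to a shared ProvONE-like vocabulary so that one lineage query can span both graphs. All field and record names below are made up for illustration; this is not Polyflow's implementation.

        # Two WfMS-specific provenance exports with different shapes (invented).
        wfms_a = [  # e.g. a relational export: (activity, input, output)
            ("align_sequences", "reads.fasta", "aligned.fasta"),
        ]
        wfms_b = [  # e.g. a document export
            {"task": "build_tree", "used": "aligned.fasta", "generated": "tree.nwk"},
        ]

        def to_provone(records, kind):
            """Normalize WfMS-specific records to (activity, used, generated) triples."""
            if kind == "a":
                return [{"activity": a, "used": u, "generated": g} for a, u, g in records]
            return [{"activity": r["task"], "used": r["used"],
                     "generated": r["generated"]} for r in records]

        graph = to_provone(wfms_a, "a") + to_provone(wfms_b, "b")

        def lineage(artifact):
            """Walk the integrated graph backwards from an artifact to its sources."""
            for step in graph:
                if step["generated"] == artifact:
                    return [step["activity"]] + lineage(step["used"])
            return []

        print(lineage("tree.nwk"))  # ['build_tree', 'align_sequences']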

    A comparison of statistical machine learning methods in heartbeat detection and classification

    Get PDF
    In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital, as some heartbeat irregularities are time consuming to detect. Therefore, analysis of electrocardiogram (ECG) signals is an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval- and amplitude-based features together with a few samples from the ECG signal as a feature vector. We studied a variety of classification algorithms, focusing especially on a type of arrhythmia known as the ventricular ectopic beat (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and the choice of classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contributions are the evaluation of existing classifiers over a range of sampling rates, the recommendation of a detection methodology to employ in a practical setting, and the extension of the notion of a mixture of experts to a larger class of algorithms.
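
    A rough Python sketch, on synthetic data, of the kind of feature vector described above: RR intervals and an amplitude value plus a few raw samples around each annotated beat, fed to an off-the-shelf classifier. The data, window width, and choice of a random forest are assumptions for illustration, not the paper's setup.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)
        ecg = rng.normal(size=20000)                    # stand-in for an ECG record
        r_peaks = np.arange(200, 19800, 300)            # stand-in beat annotations
        labels = rng.integers(0, 2, size=len(r_peaks))  # 0 = normal, 1 = ectopic (toy)

        def beat_features(signal, peaks, i, width=8):
            pre_rr = peaks[i] - peaks[i - 1]            # interval features
            post_rr = peaks[i + 1] - peaks[i]
            amp = signal[peaks[i]]                      # amplitude feature
            samples = signal[peaks[i] - width: peaks[i] + width]  # raw samples
            return np.concatenate(([pre_rr, post_rr, amp], samples))

        X = np.array([beat_features(ecg, r_peaks, i) for i in range(1, len(r_peaks) - 1)])
        y = labels[1:-1]

        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
        print(clf.predict(X[:5]))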

    Lower bounds in distributed computing

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 167-170). Distributed computing is the study of achieving cooperative behavior between independent computing processes with possibly conflicting goals. Distributed computing is ubiquitous in the Internet, wireless networks, multi-core and multi-processor computers, teams of mobile robots, etc. In this thesis, we study two fundamental distributed computing problems, clock synchronization and mutual exclusion. Our contributions are as follows. 1. We introduce the gradient clock synchronization (GCS) problem. As in traditional clock synchronization, a group of nodes in a bounded-delay communication network try to synchronize their logical clocks by reading their hardware clocks and exchanging messages. We say the distance between two nodes is the uncertainty in message delay between the nodes, and the clock skew between the nodes is their difference in logical clock values. GCS studies clock skew as a function of distance. We show that, surprisingly, every clock synchronization algorithm exhibits some execution in which two nodes at distance one apart have Ω(log D / log log D) clock skew, where D is the maximum distance between any pair of nodes. 2. We present an energy-efficient and fault-tolerant clock synchronization algorithm suitable for wireless networks. The algorithm synchronizes nodes to each other, as well as to real time. It satisfies a relaxed gradient property; that is, it guarantees that, under certain reasonable operating parameters, nearby nodes are well synchronized most of the time. 3. We study the mutual exclusion (mutex) problem, in which a set of processes in a shared memory system compete for exclusive access to a shared resource. We prove a tight Ω(n log n) lower bound on the time for n processes to each access the resource once. Our novel proof technique is based on separately lower bounding the amount of information needed for solving mutex, and upper bounding the amount of information any mutex algorithm can acquire in each step. We hope that our results offer fresh ways of looking at classical problems, and point to interesting new open problems. by Rui Fan.
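
    A compact LaTeX restatement of the two lower bounds claimed in the abstract; the exact form of the first bound is reconstructed here, so treat it as a paraphrase rather than a quotation.

        % Gradient clock synchronization: in some execution, two nodes at
        % distance 1 suffer clock skew
        \[
          \Omega\!\left(\frac{\log D}{\log\log D}\right),
        \]
        % where D is the maximum distance (message-delay uncertainty) between
        % any pair of nodes.
        %
        % Mutual exclusion: the total time for n processes to each access the
        % shared resource once is
        \[
          \Omega(n \log n).
        \]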

    Doctors and computers

    Get PDF
    The twin concerns of the thesis are (a) to develop a labour process analysis that is able to account for professional work and (b) in so doing to explain the reasons for hospital doctors' various responses to the introduction of computer systems into medical work. This thesis constitutes a study of hospital doctors' (clinicians') use of information technology in their clinic work. The first part reviews the literature and general developments in medical computing in relation to a theoretical analysis of the organisation and control of the clinic/medical labour process. The second part consists of an ethnographic study of the introduction of computer-based medical information systems into three hospitals: two being case studies of renal units and associated clinics, and the third a study of an outpatients' department at a small acute hospital. The computer systems involved either replaced or supplemented the traditional form of the medical records, and for this reason it was possible to focus on the role of these organisational records in the maintenance and reproduction of dominance and subordination within the labour process of clinic/medical work.

    Studies in theory and method in sociolinguistics

    Get PDF
    PhD Thesis. Problems raised in a pilot linguistic survey of a street in Newcastle upon Tyne (Pellowe 1967) are here treated positively. An informal normative model of the hearer's treatment of the speaker's output is developed in terms both of psychological processing and of social interpretation. This model is then interpreted methodologically and used to generate an analytical framework and a set of meta-interpretive procedures. These are tested in various ways on samples of speech from members of the Tyneside speech community, on experimental groups of hearers and speakers, and on various miscellaneous data. The generality, replicability and accountability of the methods are examined, and the consequences of the model and its techniques are contrasted with those of other studies.

    Constructing GPSJ view graphs

    No full text
    A data warehouse collects and maintains integrated information from heterogeneous data sources for OLAP and decision support. An important task in data warehouse design is the selection of views to materialize in order to minimize the response time and maintenance cost of generalized project-select-join (GPSJ) queries. We discuss how to construct GPSJ view graphs: directed acyclic graphs used to compactly encode and represent the different possible ways of evaluating a set of GPSJ queries. Our view graph construction algorithm, GPSJVIEWGRAPHBUILDER, incrementally constructs GPSJ view graphs based on a set of merge rules. We provide a set of merge rules for constructing GPSJ view graphs in the presence of duplicate-sensitive and duplicate-insensitive aggregates. The merging algorithm used in GPSJVIEWGRAPHBUILDER ensures that each node is correctly added to the view graph, and employs the merge rules to ensure that relationships between nodes from different queries are incorporated into the view graph.
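
    An illustrative Python sketch (not the paper's GPSJVIEWGRAPHBUILDER) of the underlying data structure: a view graph whose nodes are candidate views keyed by their relations, group-by attributes, and aggregates, so that identical views arising from different queries are merged rather than duplicated.

        class ViewGraph:
            def __init__(self):
                self.nodes = {}          # key -> node info
                self.edges = set()       # (child, parent): child derivable from parent

            def add_view(self, relations, group_by, aggregates):
                key = (frozenset(relations), frozenset(group_by), frozenset(aggregates))
                # Merge rule: an identical view requested by another query reuses the node.
                node = self.nodes.setdefault(key, {"queries": 0})
                node["queries"] += 1
                return key

            def add_edge(self, child, parent):
                self.edges.add((child, parent))

        g = ViewGraph()
        # Two GPSJ queries over a sales schema sharing one aggregation node.
        base = g.add_view({"sales", "product"}, {"product"}, {"SUM(amount)"})
        q1 = g.add_view({"sales", "product"}, {"product", "month"}, {"SUM(amount)"})
        q2 = g.add_view({"sales", "product"}, {"product"}, {"SUM(amount)"})  # merged with base
        g.add_edge(base, q1)   # the coarser view can be computed from the finer one

        print(len(g.nodes), g.nodes[base]["queries"])   # 2 nodes, base shared by 2 queries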