976 research outputs found

    Modeling and querying spatio-temporal clinical databases with multiple granularities

    In many research fields, researchers need to store, manage, and query spatio-temporal data. Such data are conventional alphanumeric data enriched with one or more temporal, spatial, and spatio-temporal components that, with various possible meanings, locate them in time and/or space. Areas in which such spatio-temporal data must be collected and managed include, for example, land and natural resource management, epidemiology, archaeology, and geography. More specifically, in epidemiological research, for example, spatio-temporal data can represent various aspects of diseases and their characteristics, such as their origin, spread, and evolution, and the risk factors potentially connected to diseases and their development. The spatio-temporal components of the data can be regarded as "meta-data" that can be exploited to introduce new kinds of analysis on the data themselves. These "meta-data" can be managed within various frameworks proposed in the literature. One of the concepts proposed for this purpose is that of granularities. In the literature there is broad consensus on the concept of temporal granularity, for which frameworks based on different approaches exist. By contrast, there is no general consensus on the definition of a complete framework, like the one for temporal granularities, for spatial and spatio-temporal granularities. This thesis aims to fill this gap by proposing a framework for spatial granularities and, building on it and on the one already present in the literature for temporal granularities, a framework for spatio-temporal granularities.
The proposed frameworks are intended to be complete; for this reason, besides the definitions of the concepts of spatial and spatio-temporal granularity, they also include the definition of several notions related to granularities, such as relationships and operations between granularities. Relationships make it possible to know how different granularities are related to one another, and also to build a hierarchy of them. This information is then useful for knowing whether and how data associated with and represented at different granularities can be compared. Operations, instead, make it possible to create new granularities from other granularities already defined in the system, by manipulating or selecting some of their components. Building on these frameworks, the aim of the thesis then shifts to showing how granularities can be used to enrich existing spatio-temporal databases for better and richer management and querying. To this end, we propose a database for managing data about temporal, spatial, and spatio-temporal granularities. All the components of a granularity, as defined in the proposed frameworks, can be represented in the proposed database. The database can then be used to extend an existing spatio-temporal database by adding to the tuples of the latter references to the granularities in which those data can be located in time and/or space. To demonstrate how this can be done, the thesis introduces the database developed and used by the Verona Community-based Psychiatric Service (Servizio Psichiatrico Territoriale, SPT). This database stores information on all patients who have come into contact with the SPT over the last 30 years and all information on their contacts with the service (for example, telephone calls, home visits, and hospital admissions).
Part of this information has a spatio-temporal component and can therefore be analyzed by studying its trends and patterns in time and space. In the thesis we therefore extend this psychiatric database by linking it to the one proposed for managing granularities. At this point the psychiatric data can also be queried on the basis of granularity-based spatio-temporal constraints. Querying spatio-temporal data associated with granularities requires a query language that includes, besides spatio-temporal structures, operators, and functions for handling the spatio-temporal components of the data, constructs for using granularities in queries. Therefore, starting from a spatio-temporal query language already present in the literature, in this thesis we also propose a query language that lets a user retrieve data from a spatio-temporal database on the basis of granularity-based constraints as well. The language is introduced by giving its syntax and semantics. Moreover, to show the actual role of granularities in querying a clinical database, we present several examples of queries, written in the proposed query language, on the psychiatric database of the Verona SPT. Such granularity-based spatio-temporal queries can be useful to researchers for epidemiological analyses of psychiatric data.
In several research fields, temporal, spatial, and spatio-temporal data have to be managed and queried for several purposes. These data are usually composed of classical data enriched with a temporal and/or a spatial qualification. For instance, in epidemiology spatio-temporal data may represent surveillance data, origins of disease and outbreaks, and risk factors.
In order to better exploit the temporal and spatial dimensions, spatio-temporal data can be managed by treating their spatio-temporal dimensions as meta-data useful for retrieving information. One way to manage spatio-temporal dimensions is by using spatio-temporal granularities. This dissertation aims to show how this is possible, in particular for epidemiological spatio-temporal data. For this purpose, in this thesis we propose a framework for the definition of spatio-temporal granularities (i.e., partitions of a spatio-temporal dimension) with the aim of improving the management and querying of spatio-temporal data. The framework includes the theoretical definitions of spatial and spatio-temporal granularities (while for temporal granularities we refer to the framework proposed by Bettini et al.) and all related notions useful for their management, e.g., relationships and operations over granularities. Relationships are useful for relating granularities and then knowing how data associated with different granularities can be compared. Operations allow one to create new granularities from already defined ones, manipulating or selecting their components. We show how granularities can be represented in a database and can be used to enrich an existing spatio-temporal database. For this purpose, we conceptually and logically design a relational database for temporal, spatial, and spatio-temporal granularities. The database stores all data about granularities and the related information defined in the theoretical framework. This database can be used for enriching other spatio-temporal databases with spatio-temporal granularities. We introduce the spatio-temporal psychiatric case register, developed by the Verona Community-based Psychiatric Service (CPS), for storing and managing information about psychiatric patients, their personal information, and their contacts with the CPS that occurred over the last 30 years.
The case register includes both clinical and statistical information about contacts, which are also temporally and spatially qualified. We show how the case register database can be enriched with spatio-temporal granularities both by extending its structure and by introducing a spatio-temporal query language dealing with spatio-temporal data and spatio-temporal granularities. Thus, we propose a new spatio-temporal query language, defining its syntax and semantics, that includes ad hoc features and constructs for dealing with spatio-temporal granularities. Finally, using the proposed query language, we report several examples of spatio-temporal queries on the psychiatric case register, showing the "usage" of granularities and their role in spatio-temporal queries useful for epidemiological studies.
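The notions of granularity, relationship, and operation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a purely temporal, finite setting in which a granularity is a mapping from integer indices to granules (sets of day numbers). The names `make_granularity`, `finer_than`, and `group` are hypothetical and are not taken from the thesis; the "finer than" test follows the usual definition in the temporal granularity literature (every granule of one granularity is contained in some granule of the other).

```python
from typing import Dict, FrozenSet

# Assumption: a granularity maps an index to a granule, i.e. a set of day numbers.
Granularity = Dict[int, FrozenSet[int]]


def make_granularity(size: int, horizon: int) -> Granularity:
    """Partition days 0..horizon-1 into consecutive granules of `size` days."""
    return {
        i: frozenset(range(start, min(start + size, horizon)))
        for i, start in enumerate(range(0, horizon, size))
    }


def finer_than(g: Granularity, h: Granularity) -> bool:
    """Relationship: every granule of g is contained in some granule of h."""
    return all(any(gi <= hj for hj in h.values()) for gi in g.values())


def group(g: Granularity, k: int) -> Granularity:
    """Operation: build a coarser granularity by merging k consecutive granules."""
    keys = sorted(g)
    return {
        n: frozenset().union(*(g[i] for i in keys[pos:pos + k]))
        for n, pos in enumerate(range(0, len(keys), k))
    }


days = make_granularity(1, 28)       # 28 one-day granules
weeks = make_granularity(7, 28)      # 4 seven-day granules
fortnights = group(weeks, 2)         # 2 fourteen-day granules derived from weeks
```

Here `finer_than(days, weeks)` holds while `finer_than(weeks, days)` does not, which is the kind of information a hierarchy of granularities would record before comparing data stored at different levels.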

    Análise colaborativa de grandes conjuntos de séries temporais

    The recent expansion of metrification on a daily basis has led to the production of massive quantities of data, and in many cases, these collected metrics are only useful for knowledge building when seen as a full sequence of data ordered by time, which constitutes a time series. To find and interpret meaningful behavioral patterns in time series, a multitude of analysis software tools have been developed. Many of the existing solutions use annotations to enable the curation of a knowledge base that is shared between a group of researchers over a network. However, these tools lack appropriate mechanisms to handle a high number of concurrent requests and to properly store massive data sets and ontologies, as well as suitable representations for annotated data that are visually interpretable by humans and explorable by automated systems. The goal of the work presented in this dissertation is to iterate on existing time series analysis software and build a platform for the collaborative analysis of massive time series data sets, leveraging state-of-the-art technologies for querying, storing and displaying time series and annotations. A theoretical and domain-agnostic model was proposed to enable the implementation of a distributed, extensible, secure and high-performance architecture that handles various annotation proposals simultaneously and avoids any data loss from overlapping contributions or unsanctioned changes. Analysts can share annotation projects with peers, restricting a set of collaborators to a smaller scope of analysis and to a limited catalog of annotation semantics. Annotations can express meaning not only over a segment of time, but also over a subset of the series that coexist in the same segment. A novel visual encoding for annotations is proposed, where annotations are rendered as arcs traced only over the affected series' curves in order to reduce visual clutter.
Moreover, the implementation of a full-stack prototype with a reactive web interface was described, directly following the proposed architectural and visualization model while applied to the HVAC domain. The performance of the prototype under different architectural approaches was benchmarked, and the usability of the interface was tested. Overall, the work described in this dissertation contributes a more versatile, intuitive and scalable time series annotation platform that streamlines the knowledge-discovery workflow.
Master's in Informatics Engineering
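The annotation model summarized in the abstract above, where an annotation applies to a time segment and to a subset of the series in that segment, can be sketched as a small data structure. This is only an illustration under assumptions: the class `Annotation`, its fields, and the methods `covers` and `overlaps` are hypothetical names, and the sketch does not reproduce the dissertation's actual schema or its conflict-handling mechanism.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: a label over a time segment and a subset of series."""
    label: str
    start: float                 # segment start (any consistent time unit)
    end: float                   # segment end, inclusive
    series: FrozenSet[str]       # identifiers of the affected series

    def covers(self, series_id: str, t: float) -> bool:
        """True if the annotation applies to this series at time t."""
        return series_id in self.series and self.start <= t <= self.end

    def overlaps(self, other: "Annotation") -> bool:
        """True only if both the time segments and the series subsets intersect."""
        shares_time = self.start <= other.end and other.start <= self.end
        shares_series = bool(self.series & other.series)
        return shares_time and shares_series


# Illustrative HVAC-style series names; the values are made up for the example.
a = Annotation("compressor fault", 10.0, 20.0, frozenset({"temp", "power"}))
b = Annotation("door open", 15.0, 25.0, frozenset({"humidity"}))
c = Annotation("sensor drift", 18.0, 30.0, frozenset({"temp"}))
```

Because `a` and `b` share a time window but no series, `a.overlaps(b)` is false, whereas `a.overlaps(c)` is true; scoping annotations to a subset of series in this way is what allows them to be drawn only over the affected curves.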

    Temporalized logics and automata for time granularity

    Suitable extensions of the monadic second-order theory of k successors have been proposed in the literature to capture the notion of time granularity. In this paper, we provide the monadic second-order theories of downward unbounded layered structures, which are infinitely refinable structures consisting of a coarsest domain and an infinite number of finer and finer domains, and of upward unbounded layered structures, which consist of a finest domain and an infinite number of coarser and coarser domains, with expressively complete and elementarily decidable temporal logic counterparts. We obtain this result in two steps. First, we define a new class of combined automata, called temporalized automata, which can be proved to be the automata-theoretic counterpart of temporalized logics, and show that relevant properties, such as closure under Boolean operations, decidability, and expressive equivalence with respect to temporal logics, transfer from component automata to temporalized ones. Then, we exploit the correspondence between temporalized logics and automata to reduce the task of finding the temporal logic counterparts of the given theories of time granularity to the easier one of finding their temporalized automata counterparts.
    Comment: Journal: Theory and Practice of Logic Programming. Journal acronym: TPLP. Category: Paper for Special Issue (Verification and Computational Logic). Submitted: 18 March 2002, revised: 14 January 2003, accepted: 5 September 200

    Design and implementation of serverless architecture for i2b2 on AWS cloud and Snowflake data warehouse

    Informatics for Integrating Biology and the Bedside (i2b2) is an open-source medical tool for cohort discovery that allows researchers to explore and query clinical data. The i2b2 platform is designed to adopt any patient-centric data model and is used at over 400 healthcare institutions worldwide for querying patient data. The platform consists of a web client, core servers, and a database. Despite the available installation guidelines, the system's complex architecture, with numerous dependencies and configuration parameters, makes it difficult to install a functional i2b2 platform. Maintaining the scalability, security, and availability of the application is also challenging and requires a lot of resources. Our aim was to deploy i2b2 for the University of Missouri (UM) System in the cloud and to reduce the complexity and effort of the installation and maintenance process. Our solution encapsulated the complete installation process of each component using Docker and deployed the containers in an AWS Virtual Private Cloud (VPC) using several AWS PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) services. We deployed the application as a service on AWS Fargate, an on-demand, serverless, auto-scaling compute engine. We also enhanced the functionality of the i2b2 services and developed Snowflake JDBC driver support for the i2b2 backend services, which enables them to query the Snowflake analytical database directly. In addition, we created the i2b2-data-installer package to load PCORnet CDM and ACT ontology data into the i2b2 database. The i2b2 platform at the University of Missouri holds 1.26B facts for 2.2M patients from UM Cerner Millennium data.
    Includes bibliographical references.

    Two Essays on Analytical Capabilities: Antecedents and Consequences

    Although organizations are rapidly embracing business analytics (BA) to enhance organizational performance, only a small proportion have managed to build analytical capabilities. While BA continues to draw attention from academics and practitioners, theoretical understanding of the antecedents and consequences of analytical capabilities remains limited and lacks a systematic view. To address this research gap, the two essays investigate: (a) an organization's core information processing mechanisms and their impact on analytical capabilities, (b) the sequential approach to the integration of IT-enabled business processes and its impact on analytical capabilities, and (c) network position and its impact on analytical capabilities. Drawing upon Information Processing Theory (IPT), the first essay investigates the relationship between an organization's core information processing mechanisms (i.e., electronic health records (EHRs), clinical information standards (CIS), and collaborative information exchange (CIE)) and analytical capabilities. We use data from two sources (HIMSS Analytics 2013 and AHA IT Survey 2013) to test the theorized relationships empirically in the healthcare context. Using competitive progression theory, the second essay investigates whether an organization's sequential approach to the integration of IT-enabled business processes is associated with increased analytical capabilities. We use data from three sources (HIMSS Analytics 2013, AHA IT Survey 2013, and CMS 2014) to test whether sequential integration of EHRs (i.e., reflecting the unique organizational path of integration) has a significant impact on a hospital's analytical capability. Together, the two essays advance our understanding of the factors that enable a firm's analytical capabilities. We discuss in detail the theoretical and practical implications of the findings and the opportunities for future research.

    Spatio-structural granularity of biological material entities

    Background: With the continuously increasing demands on knowledge and data management that databases have to meet, ontologies and the theories of granularity they use become more and more important. Unfortunately, currently used theories and schemes of granularity unnecessarily limit the performance of ontologies due to two shortcomings: (i) they do not allow the integration of multiple granularity perspectives into one granularity framework; (ii) they are not applicable to cumulative-constitutively organized material entities, which cover most of the biomedical material entities.
    Results: The above-mentioned shortcomings are responsible for the major inconsistencies in currently used spatio-structural granularity schemes. By using the Basic Formal Ontology (BFO) as a top-level ontology and Keet's general theory of granularity, a granularity framework is presented that is applicable to cumulative-constitutively organized material entities. It provides a scheme for granulating complex material entities into their constitutive and regional parts by integrating various compositional and spatial granularity perspectives. Within a scale-dependent resolution perspective, it even allows distinguishing different types of representations of the same material entity. Within other scale-dependent perspectives, which are based on specific types of measurements (e.g., weight, volume), it is also possible to organize instances of material entities independently of their parthood relations and only according to increasing measures. All granularity perspectives are connected to one another through overcrossing granularity levels, together forming an integrated whole that uses the compositional object perspective as an integrating backbone. This granularity framework allows structural granularity values to be assigned consistently to all different types of material entities.
    Conclusions: The framework presented here provides a spatio-structural granularity framework for all domain reference ontologies that model cumulative-constitutively organized material entities. With its multi-perspective approach, it allows querying an ontology stored in a database at one's own desired levels of detail: the contents of a database can be organized according to diverse granularity perspectives, which in turn provide different views on its content (i.e., data, knowledge), each organized into different levels of detail.

    Representing and Reasoning about Temporal Granularities

    Interactive Data Exploration of Distributed Raw Files: A Systematic Mapping Study

    When exploring large amounts of data without a clear target, providing an interactive experience becomes really difficult, since this tentative inspection usually defeats any early decision on data structures or indexing strategies. This is also true in the physics domain, specifically in high-energy physics, where the huge volume of data generated by the detectors is normally explored via C++ code using batch processing, which introduces considerable latency. An interactive tool, when integrated into the existing data management systems, can add great value to the usability of these platforms. Here, we intend to review the current state of the art of interactive data exploration, aiming at satisfying three requirements: access to raw data files, stored in a distributed environment, and with reasonably low latency. This paper follows the guidelines for systematic mapping studies, which are well suited for gathering and classifying available studies. We summarize the results after classifying the 242 papers that passed our inclusion criteria. While there are many proposed solutions that tackle the problem in different manners, there is little evidence available about their implementation in practice. Almost all of the solutions found by this paper cover a subset of our requirements, with only one partially satisfying the three. Solutions for data exploration abound. It is an active research area and, considering the continuous growth of data volume and variety, the problem is only going to become harder. There is a niche for research on a solution that covers our requirements, and the required building blocks are there.