Modeling and querying spatio-temporal clinical databases with multiple granularities
In many research fields, researchers need to store, manage, and query spatio-temporal data. Such data are classical alphanumeric data enriched with one or more temporal, spatial, and spatio-temporal components that, with several possible meanings, locate them in time and/or space. Domains in which such spatio-temporal data must be collected and managed include, for example, land and natural-resource management, epidemiology, archaeology, and geography. More specifically, in epidemiological research, for example, spatio-temporal data can represent various aspects of diseases and their characteristics, such as their origin, spread, and evolution, and the risk factors potentially connected to diseases and their development. The spatio-temporal components of the data can be regarded as "meta-data" that can be exploited to introduce new kinds of analysis on the data themselves. These "meta-data" can be managed within several frameworks proposed in the literature. One of the concepts proposed for this purpose is that of granularities. There is broad consensus in the literature on the concept of temporal granularity, for which frameworks based on different approaches exist. By contrast, there is no general consensus on the definition of a complete framework, like that for temporal granularities, for spatial and spatio-temporal granularities. This thesis aims to fill this gap by proposing a framework for spatial granularities and, building on this and on the one already present in the literature for temporal granularities, a framework for spatio-temporal granularities.
The proposed frameworks are intended to be complete; for this reason, besides the definitions of spatial and spatio-temporal granularity, they also include the definition of several related concepts, such as relationships and operations between granularities. Relationships make it possible to know how different granularities are related to one another, and to build a hierarchy of them. This information is then useful for determining whether and how data associated with and represented at different granularities can be compared. Operations, in turn, make it possible to create new granularities from others already defined in the system, by manipulating or selecting some of their components.
Building on these frameworks, the thesis then turns to showing how granularities can be used to enrich existing spatio-temporal databases for better and richer management and querying. To this end, we propose a database for managing data about temporal, spatial, and spatio-temporal granularities. The proposed database can represent all the components of a granularity as defined in the proposed frameworks. The database can then be used to extend an existing spatio-temporal database by adding to its tuples references to the granularities in which those data can be located in time and/or space.
To demonstrate how this can be done, the thesis introduces the database developed and used by the Verona Community-based Psychiatric Service (Servizio Psichiatrico Territoriale, SPT). This database stores information on all patients who have come into contact with the SPT over the last 30 years and all information on their contacts with the service (for example, telephone calls, home visits, and hospital admissions). Part of this information has a spatio-temporal component and can therefore be analysed by studying its trends and patterns in time and space. In the thesis we therefore extend this psychiatric database by linking it to the one proposed for managing granularities. The psychiatric data can then also be queried on the basis of granularity-based spatio-temporal constraints.
Querying spatio-temporal data associated with granularities requires a query language that includes, besides spatio-temporal structures, operators, and functions for handling the spatio-temporal components of the data, constructs for using granularities in queries. Therefore, starting from a spatio-temporal query language already present in the literature, in this thesis we also propose a query language that allows a user to retrieve data from a spatio-temporal database on the basis of granularity-based constraints as well. The language is introduced by giving its syntax and semantics. Moreover, to show the actual role of granularities in querying a clinical database, we present several example queries, written in the proposed query language, on the psychiatric database of the Verona SPT. Such granularity-based spatio-temporal queries can be useful to researchers for epidemiological analyses of psychiatric data.
In several research fields, temporal, spatial, and spatio-temporal data have to be managed and queried for several purposes. These data are usually composed of classical data enriched with a temporal and/or a spatial qualification. For instance, in epidemiology spatio-temporal data may represent surveillance data, origins of disease and outbreaks, and risk factors. In order to better exploit the time and spatial dimensions, spatio-temporal data could be managed considering their spatio-temporal dimensions as meta-data useful to retrieve information. One way to manage spatio-temporal dimensions is by using spatio-temporal granularities. This dissertation aims to show how this is possible, in particular for epidemiological spatio-temporal data.
For this purpose, in this thesis we propose a framework for the definition of spatio-temporal granularities (i.e., partitions of a spatio-temporal dimension) with the aim to improve the management and querying of spatio-temporal data. The framework includes the theoretical definitions of spatial and spatio-temporal granularities (while for temporal granularities we refer to the framework proposed by Bettini et al.) and all related notions useful for their management, e.g., relationships and operations over granularities. Relationships are useful for relating granularities and then knowing how data associated with different granularities can be compared. Operations allow one to create new granularities from already defined ones, manipulating or selecting their components.
We show how granularities can be represented in a database and can be used to enrich an existing spatio-temporal database. For this purpose, we conceptually and logically design a relational database for temporal, spatial, and spatio-temporal granularities. The database stores all data about granularities and the related information we defined in the theoretical framework. This database can be used for enriching other spatio-temporal databases with spatio-temporal granularities. We introduce the spatio-temporal psychiatric case register, developed by the Verona Community-based Psychiatric Service (CPS), for storing and managing information about psychiatric patients, their personal information, and their contacts with the CPS that occurred in the last 30 years. The case register includes both clinical and statistical information about contacts, which are also temporally and spatially qualified. We show how the case register database can be enriched with spatio-temporal granularities, both by extending its structure and by introducing a spatio-temporal query language dealing with spatio-temporal data and spatio-temporal granularities. Thus, we propose a new spatio-temporal query language, defining its syntax and semantics, that includes ad-hoc features and constructs for dealing with spatio-temporal granularities. Finally, using the proposed query language, we report several examples of spatio-temporal queries on the psychiatric case register, showing the "usage" of granularities and their role in spatio-temporal queries useful for epidemiological studies.
Análise colaborativa de grandes conjuntos de séries temporais
The recent expansion of metrification on a daily basis has led to the production
of massive quantities of data, and in many cases, these collected metrics
are only useful for knowledge building when seen as a full sequence of
data ordered by time, which constitutes a time series. To find and interpret
meaningful behavioral patterns in time series, a multitude of analysis software
tools have been developed. Many of the existing solutions use annotations
to enable the curation of a knowledge base that is shared between a group
of researchers over a network. However, these tools also lack appropriate
mechanisms to handle a high number of concurrent requests and to properly
store massive data sets and ontologies, as well as suitable representations
for annotated data that are visually interpretable by humans and explorable by
automated systems. The goal of the work presented in this dissertation is to
iterate on existing time series analysis software and build a platform for the
collaborative analysis of massive time series data sets, leveraging state-of-the-art technologies for querying, storing and displaying time series and annotations.
A theoretical and domain-agnostic model was proposed to enable
the implementation of a distributed, extensible, secure and high-performance
architecture that handles various annotation proposals simultaneously and
avoids any data loss from overlapping contributions or unsanctioned changes.
Analysts can share annotation projects with peers, restricting a set of collaborators
to a smaller scope of analysis and to a limited catalog of annotation
semantics. Annotations can express meaning not only over a segment of time,
but also over a subset of the series that coexist in the same segment. A novel
visual encoding for annotations is proposed, where annotations are rendered
as arcs traced only over the affected series’ curves in order to reduce visual
clutter. Moreover, the implementation of a full-stack prototype with a reactive
web interface was described, directly following the proposed architectural and
visualization model while applied to the HVAC domain. The performance of
the prototype under different architectural approaches was benchmarked, and
the interface was tested for usability. Overall, the work described in this dissertation
contributes a more versatile, intuitive and scalable time series
annotation platform that streamlines the knowledge-discovery workflow.
The recent expansion of daily metrification has led to the production of
massive quantities of data, and in many cases these metrics are useful for
knowledge building only when seen as a sequence of data ordered by time,
which constitutes a time series. To find meaningful behavioral patterns in
time series, a wide variety of analysis software has been developed. Many of
the existing solutions use annotations to enable the curation of a knowledge
base that is shared among researchers over a network. However, these tools
lack appropriate mechanisms to handle a high number of concurrent requests
and to store massive data sets and ontologies, as well as appropriate
representations for annotated data that are visually interpretable by humans
and explorable by automated systems. The goal of the work presented in this
dissertation is to iterate on existing time series analysis software and
build a platform for the collaborative analysis of large time series data
sets, using state-of-the-art technologies to query, store and display time
series and annotations. A theoretical and domain-agnostic model was proposed
to enable the implementation of a distributed, extensible, secure and
high-performance architecture that handles several annotation proposals
simultaneously and avoids any data loss from overlapping contributions or
unsanctioned changes. Analysts can share annotation projects with colleagues,
restricting a set of collaborators to a smaller analysis window and to a
limited catalog of annotation semantics. Annotations can express meaning not
only over an interval of time, but also over a subset of the series that
coexist in the same interval. A new visual encoding for annotations is
proposed, in which annotations are drawn as arcs traced only over the curves
of the affected series in order to reduce visual noise. In addition, the
implementation of a full-stack prototype with a reactive web interface was
described, directly following the proposed architecture and visualization
model as applied to the HVAC domain. The performance of the prototype under
different architectural decisions was evaluated, and the interface was tested
for usability. Overall, the work described in this dissertation contributes a
more versatile, intuitive and scalable approach to a time series annotation
platform that simplifies the knowledge-discovery workflow.
Master's degree in Informatics Engineering (Mestrado em Engenharia Informática)
Temporalized logics and automata for time granularity
Suitable extensions of the monadic second-order theory of k successors have
been proposed in the literature to capture the notion of time granularity. In
this paper, we provide the monadic second-order theories of downward unbounded
layered structures, which are infinitely refinable structures consisting of a
coarsest domain and an infinite number of finer and finer domains, and of
upward unbounded layered structures, which consist of a finest domain and an
infinite number of coarser and coarser domains, with expressively complete and
elementarily decidable temporal logic counterparts.
We obtain such a result in two steps. First, we define a new class of
combined automata, called temporalized automata, which can be proved to be the
automata-theoretic counterpart of temporalized logics, and show that relevant
properties, such as closure under Boolean operations, decidability, and
expressive equivalence with respect to temporal logics, transfer from component
automata to temporalized ones. Then, we exploit the correspondence between
temporalized logics and automata to reduce the task of finding the temporal
logic counterparts of the given theories of time granularity to the easier one
of finding temporalized automata counterparts of them.
Comment: Journal: Theory and Practice of Logic Programming Journal Acronym:
TPLP Category: Paper for Special Issue (Verification and Computational Logic)
Submitted: 18 March 2002, revised: 14 January 2003, accepted: 5 September
200
Design and implementation of serverless architecture for i2b2 on AWS cloud and Snowflake data warehouse
Informatics for Integrating Biology and the Bedside (i2b2) is an open-source medical tool for cohort discovery that allows researchers to explore and query clinical data. The i2b2 platform is designed to adopt any patient-centric data model and is used at over 400 healthcare institutions worldwide for querying patient data. The platform consists of a webclient, core servers and a database. Despite the availability of installation guidelines, the complex architecture of the system, with numerous dependencies and configuration parameters, makes it difficult to install a functional i2b2 platform. Maintaining the scalability, security, and availability of the application is also challenging and requires a lot of resources. Our aim was to deploy i2b2 for the University of Missouri (UM) System in the cloud as well as to reduce the complexity and effort of the installation and maintenance process. Our solution encapsulated the complete installation process of each component using Docker and deployed the containers in the AWS Virtual Private Cloud (VPC) using several AWS PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) services. We deployed the application as a service in AWS Fargate, an on-demand, serverless, auto-scalable compute engine. We also enhanced the functionality of the i2b2 services and developed Snowflake JDBC driver support for the i2b2 backend services, which enabled them to query directly from the Snowflake analytical database. In addition, we created the i2b2-data-installer package to load PCORnet CDM and ACT ontology data into the i2b2 database. The i2b2 platform at the University of Missouri holds 1.26B facts on 2.2M patients from UM Cerner Millennium data.
Includes bibliographical references
Two Essays on Analytical Capabilities: Antecedents and Consequences
Although organizations are rapidly embracing business analytics (BA) to enhance organizational performance, only a small proportion have managed to build analytical capabilities. While BA continues to draw attention from academics and practitioners, theoretical understanding of the antecedents and consequences of analytical capabilities remains limited and lacks a systematic view. To address this research gap, the two essays investigate: (a) an organization's core information processing mechanisms and their impact on analytical capabilities, (b) the sequential approach to integration of IT-enabled business processes and its impact on analytical capabilities, and (c) network position and its impact on analytical capabilities.
Drawing upon the Information Processing Theory (IPT), the first essay investigates the relationship between an organization's core information processing mechanisms (i.e., electronic health records (EHRs), clinical information standards (CIS), and collaborative information exchange (CIE)) and analytical capabilities. We use data from two sources (HIMSS Analytics 2013 and AHA IT Survey 2013) to test the theorized relationships empirically in the healthcare context. Using the competitive progression theory, the second essay investigates whether an organization's sequential approach to the integration of IT-enabled business processes is associated with increased analytical capabilities. We use data from three sources (HIMSS Analytics 2013, AHA IT Survey 2013, and CMS 2014) to test whether sequential integration of EHRs (i.e., reflecting the unique organizational path of integration) has a significant impact on a hospital's analytical capability. Together the two essays advance our understanding of the factors that underlie the enabling of a firm's analytical capabilities. We discuss in detail the theoretical and practical implications of the findings and the opportunities for future research.
Spatio-structural granularity of biological material entities
Background: With the continuously increasing demands on knowledge and data management that databases have to meet, ontologies and the theories of granularity they use become more and more important. Unfortunately, currently used theories and schemes of granularity unnecessarily limit the performance of ontologies due to two shortcomings: (i) they do not allow the integration of multiple granularity perspectives into one granularity framework; (ii) they are not applicable to cumulative-constitutively organized material entities, which cover most of the biomedical material entities.
Results: The above-mentioned shortcomings are responsible for the major inconsistencies in currently used spatio-structural granularity schemes. By using the Basic Formal Ontology (BFO) as a top-level ontology and Keet's general theory of granularity, a granularity framework is presented that is applicable to cumulative-constitutively organized material entities. It provides a scheme for granulating complex material entities into their constitutive and regional parts by integrating various compositional and spatial granularity perspectives. Within a scale-dependent resolution perspective, it even allows distinguishing different types of representations of the same material entity. Within other scale-dependent perspectives, which are based on specific types of measurements (e.g. weight, volume, etc.), the possibility of organizing instances of material entities independent of their parthood relations and only according to increasing measures is provided as well. All granularity perspectives are connected to one another through overcrossing granularity levels, together forming an integrated whole that uses the compositional object perspective as an integrating backbone. This granularity framework allows structural granularity values to be assigned consistently to all different types of material entities.
Conclusions: The framework presented here provides a spatio-structural granularity framework for all domain reference ontologies that model cumulative-constitutively organized material entities. With its multi-perspective approach it allows querying an ontology stored in a database at one's own desired levels of detail: the contents of a database can be organized according to diverse granularity perspectives, which in turn provide different views on its content (i.e. data, knowledge), each organized into different levels of detail.
Interactive Data Exploration of Distributed Raw Files: A Systematic Mapping Study
When exploring large amounts of data without a clear target, providing an interactive experience
becomes really difficult, since this tentative inspection usually defeats any early decision on data structures
or indexing strategies. This is also true in the physics domain, specifically in high-energy physics, where
the huge volume of data generated by the detectors is normally explored via C++ code using batch
processing, which introduces a considerable latency. An interactive tool, when integrated into the existing
data management systems, can add a great value to the usability of these platforms. Here, we intend to
review the current state-of-the-art of interactive data exploration, aiming at satisfying three requirements:
access to raw data files, stored in a distributed environment, and with a reasonably low latency. This paper
follows the guidelines for systematic mapping studies, which is well suited for gathering and classifying
available studies. We summarize the results after classifying the 242 papers that passed our inclusion criteria.
While there are many proposed solutions that tackle the problem in different manners, there is little evidence
available about their implementation in practice. Almost all of the solutions found by this paper cover a
subset of our requirements, with only one partially satisfying the three. The solutions for data exploration
abound. It is an active research area and, considering the continuous growth of data volume and variety,
the problem is only going to become harder. There is a niche for research on a solution that covers our requirements, and the
required building blocks are there.