718 research outputs found
A BIM - GIS Integrated Information Model Using Semantic Web and RDF Graph Databases
In recent years, 3D virtual indoor and outdoor urban modelling has become an essential geospatial information framework for civil and engineering applications such as emergency response, evacuation planning, and facility management. Multi-sourced and multi-scale 3D urban models are in high demand among architects, engineers, and construction professionals to achieve these tasks and provide relevant information to decision support systems. Spatial modelling technologies such as Building Information Modelling (BIM) and Geographical Information Systems (GIS) are frequently used to meet such high demands. However, sharing data and information between these two domains is still challenging. At the same time, the existing semantic or syntactic strategies for inter-communication between BIM and GIS do not fully support the exchange of rich semantic and geometric information from BIM into GIS or vice versa. This research study proposes a novel approach for integrating BIM and GIS using semantic web technologies and Resource Description Framework (RDF) graph databases. The originality of the suggested solution comes from integrating BIM and GIS models into a semantically unified data model using a semantic framework and ontology engineering approaches. The new model is named the Integrated Geospatial Information Model (IGIM). It is constructed in three stages. The first stage generates BIMRDF and GISRDF graphs from BIM and GIS datasets. The second stage integrates the BIM and GIS semantic models into a unified graph, IGIMRDF. Lastly, the information in the unified IGIMRDF graph is filtered using a graph query language and graph data analytics tools. The linkage between BIMRDF and GISRDF is completed through SPARQL endpoints defined by queries using elements and entity classes with similar or complementary information from properties, relationships, and geometries, identified by an ontology-matching process during model construction.
The resulting model (or sub-model) can be managed in a graph database system and used in the backend as a data-tier serving web services feeding a front-tier domain-oriented application. A case study was designed, developed, and tested using the semantic integrated information model for validating the newly proposed solution, architecture, and performance
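The three stages described above (per-domain graphs, graph integration, graph querying) can be sketched with a toy in-memory triple store. This is only an illustration under stated assumptions, not the authors' implementation: every identifier (the `bim:`, `gis:`, and `igim:` prefixes, the entity names, the `igim:sameFeatureAs` predicate) is hypothetical, matching on equal names is a stand-in for the ontology-matching process, and plain Python sets stand in for an RDF graph database and SPARQL.

```python
# Toy triple store: each graph is a set of (subject, predicate, object) tuples.
# All identifiers below are hypothetical and exist only for illustration.

# Stage 1 stand-in: graphs that would be generated from BIM and GIS datasets.
bim_rdf = {
    ("bim:Building_A", "rdf:type", "bim:IfcBuilding"),
    ("bim:Building_A", "bim:name", "Town Hall"),
    ("bim:Building_A", "bim:contains", "bim:Wall_12"),
    ("bim:Wall_12", "rdf:type", "bim:IfcWall"),
}
gis_rdf = {
    ("gis:Feature_7", "rdf:type", "gis:Building"),
    ("gis:Feature_7", "gis:name", "Town Hall"),
    ("gis:Feature_7", "gis:footprintWKT", "POLYGON((0 0, 10 0, 10 8, 0 8, 0 0))"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def link_by_name(bim, gis):
    """Assumed matching rule: link BIM and GIS entities that share a name."""
    links = set()
    for b_subj, _, b_name in match(bim, p="bim:name"):
        for g_subj, _, _ in match(gis, p="gis:name", o=b_name):
            links.add((b_subj, "igim:sameFeatureAs", g_subj))
    return links


# Stage 2 stand-in: the unified graph is the union of both graphs plus links.
igim_rdf = bim_rdf | gis_rdf | link_by_name(bim_rdf, gis_rdf)


def footprint_of(graph, bim_building):
    """Stage 3 stand-in: two-hop query from a BIM building to a GIS footprint."""
    for _, _, gis_feature in match(graph, s=bim_building, p="igim:sameFeatureAs"):
        for _, _, wkt in match(graph, s=gis_feature, p="gis:footprintWKT"):
            return wkt
    return None


print(footprint_of(igim_rdf, "bim:Building_A"))
```

In a real system the pattern matching would be expressed as SPARQL queries against a graph database; the point of the sketch is only that a query can cross from BIM entities to GIS geometry once the two graphs are linked.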
Digital Traces of the Mind::Using Smartphones to Capture Signals of Well-Being in Individuals
General context and questions: Adolescents and young adults typically use their smartphone several hours a day. Although there are concerns about how such behaviour might affect their well-being, the popularity of these powerful devices also opens novel opportunities for monitoring well-being in daily life. If successful, monitoring well-being in daily life provides novel opportunities to develop future interventions that provide personalized support to individuals at the moment they require it (just-in-time adaptive interventions). Taking an interdisciplinary approach with insights from communication, computational, and psychological science, this dissertation investigated the relation between smartphone app use and well-being and developed machine learning models to estimate an individual’s well-being based on how they interact with their smartphone. To elucidate the relation between smartphone trace data and well-being and to contribute to the development of technologies for monitoring well-being in future clinical practice, this dissertation addressed two overarching questions. RQ1: Can we find empirical support for theoretically motivated relations between smartphone trace data and well-being in individuals? RQ2: Can we use smartphone trace data to monitor well-being in individuals?
Aims: The first aim of this dissertation was to quantify the relation between the collected smartphone trace data and momentary well-being at the sample level, but also for each individual, following recent conceptual insights and empirical findings in psychological, communication, and computational science. A strength of this personalized (or idiographic) approach is that it allows us to capture how individuals might differ in how smartphone app use is related to their well-being. Considering such interindividual differences is important to determine if some individuals might potentially benefit from spending more time on their smartphone apps whereas others do not or even experience adverse effects. The second aim of this dissertation was to develop models for monitoring well-being in daily life. The present work pursued this transdisciplinary aim by taking a machine learning approach and evaluating to what extent we might estimate an individual’s well-being based on their smartphone trace data. If such traces can be used for this purpose by helping to pinpoint when individuals are unwell, they might be a useful data source for developing future interventions that provide personalized support to individuals at the moment they require it. With this aim, the dissertation follows current developments in psychoinformatics and psychiatry, where considerable research resources are invested in using smartphone traces and similar data (obtained with smartphone sensors and wearables) to develop technologies for detecting whether an individual is currently unwell or will be in the future.
Data collection and analysis: This work combined novel data collection techniques (digital phenotyping and experience sampling methodology) for measuring smartphone use and well-being in the daily lives of 247 student participants. For a period of up to four months, a dedicated application installed on participants’ smartphones collected smartphone trace data. In the same time period, participants completed a brief smartphone-based well-being survey five times a day (for 30 days in the first month and 30 days in the fourth month; up to 300 assessments in total). At each measurement, this survey comprised questions about the participants’ momentary level of procrastination, stress, and fatigue, while sleep duration was measured in the morning. Taking a time-series and machine learning approach to analysing these data, I provide the following contributions: Chapter 2 investigates the person-specific relation between passively logged usage of different application types and momentary subjective procrastination; Chapter 3 develops machine learning methodology to estimate sleep duration using smartphone trace data; Chapter 4 combines machine learning and explainable artificial intelligence to discover smartphone-tracked digital markers of momentary subjective stress; and Chapter 5 uses a personalized machine learning approach to evaluate whether smartphone trace data contain behavioral signs of fatigue. Collectively, these empirical studies provide preliminary answers to the overarching questions of this dissertation.
Summary of results: With respect to the theoretically motivated relations between smartphone trace data and well-being (RQ1), we found that different patterns in smartphone trace data, from time spent on social network, messenger, video, and game applications to smartphone-tracked sleep proxies, are related to well-being in individuals. The strength and nature of this relation depend on the individual and the app usage pattern under consideration. The relation between smartphone app use patterns and well-being is limited in most individuals, but relatively strong in a minority. Whereas some individuals might benefit from using specific app types, others might experience decreases in well-being when spending more time on these apps. With respect to the question whether we might use smartphone trace data to monitor well-being in individuals (RQ2), we found that smartphone trace data might be useful for this purpose in some individuals and to some extent. They appear most relevant in the context of sleep monitoring (Chapter 3) and have the potential to be included as one of several data sources for monitoring momentary procrastination (Chapter 2), stress (Chapter 4), and fatigue (Chapter 5) in daily life.
Outlook: Future interdisciplinary research is needed to investigate whether the relationship between smartphone use and well-being depends on the nature of the activities performed on these devices, the content they present, and the context in which they are used. Answering these questions is essential to unravel the complex puzzle of developing technologies for monitoring well-being in daily life.
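The difference between a sample-level and a person-specific (idiographic) analysis can be illustrated with a minimal sketch. The records below are invented toy values, not data from the dissertation, and a plain Pearson correlation is used as a stand-in; the abstract does not specify which time-series or machine learning models were actually applied.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Hypothetical records: (participant, minutes of app use, momentary stress rating).
records = [
    ("p1", 10, 2), ("p1", 20, 3), ("p1", 30, 4), ("p1", 40, 5),
    ("p2", 10, 5), ("p2", 20, 4), ("p2", 30, 3), ("p2", 40, 2),
    ("p3", 10, 3), ("p3", 20, 3), ("p3", 30, 3), ("p3", 40, 3),
]


def per_person_correlation(rows):
    """Compute one correlation per participant instead of one for the sample."""
    by_person = defaultdict(lambda: ([], []))
    for pid, use, stress in rows:
        by_person[pid][0].append(use)
        by_person[pid][1].append(stress)
    return {pid: pearson(xs, ys) for pid, (xs, ys) in by_person.items()}


# Sample-level view: pool all observations and ignore who produced them.
pooled = pearson([r[1] for r in records], [r[2] for r in records])
individual = per_person_correlation(records)

print("pooled:", pooled)
print("per person:", individual)
```

In this toy data the pooled correlation is essentially zero even though two participants show strong relations in opposite directions, which is the kind of interindividual difference a sample-level analysis alone would hide.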
Exemplars as a least-committed alternative to dual-representations in learning and memory
Despite some notable counterexamples, the theoretical and empirical exchange between the fields of learning and memory is limited. In an attempt to promote further theoretical exchange, I explored how learning and memory may be conceptualized as distinct algorithms that operate on the same representations of past experiences. I review representational and process assumptions in learning and memory, using the examples of evaluative conditioning and false recognition, and identify important similarities in the theoretical debates. Based on my review, I identify global matching memory models and their exemplar representation as a promising candidate for a common representational substrate that satisfies the principle of least commitment. I then present two cases in which exemplar-based global matching models, which take characteristics of the stimulus material and context into account, suggest parsimonious explanations for empirical dissociations in evaluative conditioning and false recognition in long-term memory. These explanations suggest reinterpretations of findings that are commonly taken as evidence for dual-representation models. Finally, I report that the same approach also provides a natural unitary account of false recognition in short-term memory, a finding which challenges the assumption that short-term memory is insulated from long-term memory. Taken together, this work illustrates the broad explanatory scope and the integrative yet parsimonious potential of exemplar-based global matching models
Semantic-aware Retrieval Standards based on Dirichlet Compound Model to Rank Notifications by Level of Urgency
There is a growing number of notifications generated from a wide range of sources. However, to our knowledge, there is no well-known generalizable standard for detecting the most urgent notifications. Establishing reusable standards is crucial for applications in which the recommendation (notification) is critical due to the level of urgency and sensitivity (e.g. the medical domain). To tackle this problem, this thesis aims to establish Information Retrieval (IR) standards for the notification (recommendation) task by taking semantic dimensions (terms, opinions, concepts and user interaction) into consideration. The technical research contributions of this thesis include, but are not limited to, the development of a semantic IR framework based on the Dirichlet Compound Model (DCM), namely FDCM; the extension of FDCM to the recommendation scenario (RFDCM); and the proposal of novel opinion-aware ranking models. Transparency, explainability and generalizability are some of the benefits that a mathematically well-defined solution such as DCM offers. The FDCM framework is based on a robust aggregation parameter which effectively combines the semantic retrieval scores using Query Performance Predictors (QPPs). Our experimental results confirm the effectiveness of this approach in recommendation systems and semantic retrieval. One of the main findings of this thesis is that the concept-based extension (term-only + concept-only) of FDCM consistently outperformed both the term-only and concept-only baselines on biomedical data. Moreover, we show that semantic IR is beneficial for collaborative filtering and could therefore help data scientists to develop hybrid and consolidated IR systems comprising content-based and collaborative filtering aspects of recommendation
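For readers unfamiliar with the Dirichlet Compound Model mentioned above, the following sketch evaluates the standard Dirichlet compound multinomial (Pólya) log-probability of a vector of term counts. It shows only the textbook distribution; the FDCM and RFDCM frameworks, their aggregation parameter, and their use of QPPs are not specified in the abstract and are not reproduced here. The function name `dcm_log_pmf` and the example parameter values are assumptions made for illustration.

```python
from math import exp, lgamma


def dcm_log_pmf(counts, alphas):
    """Log-probability of term counts under a Dirichlet compound multinomial.

    counts: non-negative integer count per vocabulary term.
    alphas: positive Dirichlet parameter per vocabulary term.
    """
    if len(counts) != len(alphas):
        raise ValueError("counts and alphas must have the same length")
    n = sum(counts)
    alpha_sum = sum(alphas)
    # Multinomial coefficient: n! / prod(x_w!)
    log_p = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Normalising ratio: Gamma(A) / Gamma(n + A), with A the sum of the alphas
    log_p += lgamma(alpha_sum) - lgamma(n + alpha_sum)
    # Per-term ratio: Gamma(x_w + alpha_w) / Gamma(alpha_w)
    log_p += sum(lgamma(x + a) - lgamma(a) for x, a in zip(counts, alphas))
    return log_p


# Two-term vocabulary with small, symmetric alphas (illustrative values only).
alphas = [0.5, 0.5]
p_repeat = exp(dcm_log_pmf([2, 0], alphas))  # the same term drawn twice
p_mixed = exp(dcm_log_pmf([1, 1], alphas))   # one draw of each term

print(p_repeat, p_mixed)
```

With small alphas the model gives a repeated term more probability than a multinomial with the same mean would (0.375 versus 0.25 in this example), which is the word burstiness property that often motivates DCM-based retrieval models.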
Making Presentation Math Computable
This open-access book addresses the issue of translating mathematical expressions from LaTeX to the syntax of Computer Algebra Systems (CAS). Over the past decades, especially in the domain of Science, Technology, Engineering, and Mathematics (STEM), LaTeX has become the de facto standard to typeset mathematical formulae in publications. Since scientists are generally required to publish their work, LaTeX has become an integral part of today's publishing workflow. On the other hand, modern research increasingly relies on CAS to simplify, manipulate, compute, and visualize mathematics. However, existing LaTeX import functions in CAS are limited to simple arithmetic expressions and are, therefore, insufficient for most use cases. Consequently, the workflow of experimenting and publishing in the sciences often includes time-consuming and error-prone manual conversions between presentational LaTeX and computational CAS formats. To address the lack of a reliable and comprehensive translation tool between LaTeX and CAS, this thesis makes the following three contributions. First, it provides an approach to semantically enhance LaTeX expressions with sufficient semantic information for translations into CAS syntaxes. Second, it demonstrates LaCASt, the first context-aware LaTeX-to-CAS translation framework. Third, the thesis provides a novel approach to evaluate the performance of LaTeX-to-CAS translations on large-scale datasets with an automatic verification of equations in digital mathematical libraries
Enriching information extraction pipelines in clinical decision support systems
Official Doctoral Programme in Information and Communication Technologies (Programa Oficial de Doutoramento en Tecnoloxías da Información e as Comunicacións). 5032V01
[Abstract] Multicentre health studies are important to increase the impact of medical research findings due to the number of subjects that they are able to engage. To simplify the execution of these studies, the data-sharing process should be effortless, for instance, through the use of interoperable databases. However, achieving this interoperability is still an ongoing research topic, mainly due to data governance and privacy issues. In the first stage of this work, we propose several methodologies to optimise the harmonisation pipelines of health databases. This work focused on harmonising heterogeneous data sources into a standard data schema, namely the OMOP CDM, which has been developed and promoted by the OHDSI community. We validated our proposal using data sets of Alzheimer’s disease patients from distinct institutions. In the following stage, aiming to enrich the information stored in OMOP CDM databases, we investigated solutions to extract clinical concepts from unstructured narratives, using information retrieval and natural language processing techniques. The validation was performed using datasets provided in scientific challenges, namely the National NLP Clinical Challenges (n2c2). In the final stage, we aimed to simplify the protocol execution of multicentre studies by proposing novel solutions for profiling, publishing and facilitating the discovery of databases. Some of the developed solutions are currently being used in three European projects aiming to create federated networks of health databases across Europe
Automated retrieval and analysis of published biomedical literature through natural language processing for clinical applications
The size of the existing academic literature corpus and the incredible rate of new publications
create both a need and an opportunity to harness computational approaches to data and
knowledge extraction across all research fields. Elements of this challenge can be met by
developments in automation for retrieval of electronic documents, document classification
and knowledge extraction. In this thesis, I detail studies of these processes in three related
chapters. Although the focus of each chapter is distinct, they contribute to my aim of
developing a generalisable pipeline for clinical applications in Natural Language Processing
in the academic literature. In chapter one, I describe the development of “Cadmus”, an open-source system developed in Python to generate corpora of biomedical text from the published
literature. Cadmus comprises three main steps: Search query & meta-data collection,
document retrieval, and parsing of the retrieved text. I present an example of full-text
retrieval for a corpus of over two hundred thousand articles using a gene-based search query
with quality control metrics for this retrieval process and a high-level illustration of the utility
of full text over metadata for each article. For a corpus of 204,043 articles, the retrieval rate
was 85.2% with institutional subscription access and 54.4% without. Chapter two details
the development of a custom-built Naïve Bayes supervised machine learning document classifier.
This binary classifier is based on calculating the relative enrichment of biomedical terms
between two classes of documents in a training set.
The classifier is trained and tested upon a manually classified set of over 8000 abstracts and
full-text articles to identify articles containing human phenotype descriptions. 10-fold
cross-validation of the model showed a recall of 85%, specificity of 99%, precision
of 0.76, F1 score of 0.82 and accuracy of 90%. Chapter three illustrates the clinical
applications of automated retrieval, processing, and classification by considering the
published literature on Paediatric COVID-19. Case reports and similar articles were classified
into “severe” and “non-severe” classes, and term enrichment was evaluated to find
biomarkers associated with, or predictive of, severe paediatric COVID-19. Time series
analysis was employed to illustrate emerging disease entities like the Multisystem
Inflammatory Syndrome in Children (MIS-C) and consider unrecognised trends through
literature-based discovery
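The classifier described in chapter two can be approximated by a minimal sketch: a binary Naïve Bayes model whose per-term weights are the log-ratio of smoothed term frequencies between the two classes, plus the evaluation metrics named above. This is an assumption about what "relative enrichment" could look like, not the thesis's actual implementation; the add-one smoothing, the function names, and the tiny training documents are all invented for illustration.

```python
from collections import Counter
from math import log


def train(docs):
    """docs: list of (tokens, label) pairs with label 1 (positive) or 0."""
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter()
    for tokens, label in docs:
        counts[label].update(tokens)
        n_docs[label] += 1
    if n_docs[0] == 0 or n_docs[1] == 0:
        raise ValueError("both classes need at least one training document")
    vocab = set(counts[0]) | set(counts[1])
    totals = {c: sum(counts[c].values()) for c in (0, 1)}
    # Log-enrichment of each term in class 1 relative to class 0,
    # with add-one smoothing (an assumed choice).
    enrichment = {
        t: log((counts[1][t] + 1) / (totals[1] + len(vocab)))
        - log((counts[0][t] + 1) / (totals[0] + len(vocab)))
        for t in vocab
    }
    log_prior_odds = log(n_docs[1] / n_docs[0])
    return enrichment, log_prior_odds


def classify(tokens, enrichment, log_prior_odds):
    """Predict class 1 when the summed log-odds are positive; unknown terms add 0."""
    score = log_prior_odds + sum(enrichment.get(t, 0.0) for t in tokens)
    return 1 if score > 0 else 0


def evaluate(y_true, y_pred):
    """Recall, specificity, precision, F1 and accuracy for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * recall, precision + recall),
        "accuracy": ratio(tp + tn, len(y_true)),
    }


# Invented toy documents: label 1 = contains a human phenotype description.
training = [
    (["seizure", "hypotonia", "patient"], 1),
    (["microcephaly", "patient", "seizure"], 1),
    (["mouse", "knockout", "assay"], 0),
    (["cell", "assay", "expression"], 0),
]
model = train(training)
print(classify(["patient", "seizure"], *model))
print(classify(["assay", "mouse"], *model))
```

Summing log-enrichment values keeps each prediction traceable to the individual terms that drove it, which is one reason such a simple model can be attractive for screening large literature corpora.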
Geographic information extraction from texts
A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity not only to discuss recent advances, new ideas, and concepts but also to identify research gaps in geographic information extraction
Technology for comprehensive life-cycle support of semantically compatible next-generation intelligent computer systems
This publication describes the current version of an open technology for the ontological design, production, and operation of semantically compatible hybrid intelligent computer systems (the OSTIS Technology). It proposes a standardization of intelligent computer systems, as well as of the methods and tools for designing them. This standardization is a key factor in ensuring the semantic compatibility of intelligent computer systems and their components, which in turn substantially reduces the effort required to develop such systems.
The book is intended for anyone interested in problems of artificial intelligence, as well as for specialists in intelligent computer systems and knowledge engineering. It can be used by undergraduate, master's, and postgraduate students in the "Artificial Intelligence" specialty.
8 tables, 223 illustrations, 665 references
A critical analysis of the Working Memory model of Eye Movement Desensitization and Reprocessing
Eye Movement Desensitization and Reprocessing (EMDR) is one of the foremost interventions for posttraumatic stress disorder (PTSD). Treatment aims to desensitise and reprocess trauma memories by combining imaginal exposure to the trauma with concurrent bilateral stimulation, usually in the form of eye movements (EMs). Multiple explanations have been proposed to account for the therapeutic effect of EMs in EMDR. This thesis examined a leading theoretical account: the working memory (WM) hypothesis.
To investigate the theory that EMs desensitise negative imagery in EMDR by taxing visuospatial WM, a series of experiments were conducted in which healthy subjects formed a visual image depicting a negative autobiographical memory while performing an EM task, an auditory task - designed to place similar demands on the central executive – and/or keeping both eyes stationary. We reliably found that EMs did not reduce image vividness and emotionality more than auditory interference. Evidence was mixed regarding the effect of EMs compared to fixation, although null-results may be explained by the use of a less powerful between-subjects design. These findings challenge the view that EMs interfere with distressing imagery in EMDR by taxing visuospatial WM, and are more consistent with the view that the general cognitive load of EMs can fully explain their desensitising effect on imagery in EMDR.
An important gap in current understanding of EMDR is how the WM interference created by EMs contributes to the reprocessing of trauma memories. A novel procedure was developed for use in laboratory settings to test the prediction that EMs facilitate memory reprocessing. In an initial study, healthy participants allowed their mind to wander between sets of negative recall with concurrent EMs or fixation. Preliminary results showed that EMs did not facilitate mind wandering, although this may have reflected limitations in the study design. This novel procedure provides an avenue for future research on a revised model of how WM interference contributes to important processes in EMDR, beyond the immediate desensitisation of imagery