
    The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.

    Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.
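
    As an illustration of the kind of programmatic access such APIs provide, the sketch below queries a search endpoint for entities in the knowledge graph. The endpoint URL and response fields are assumptions made for illustration only; the actual interface is documented at monarchinitiative.org.

```python
# A minimal sketch of programmatic access to a Monarch-style knowledge graph.
# The base URL and the "items"/"id"/"name" response fields are assumptions;
# consult the Monarch API documentation for the actual interface.
import requests

BASE = "https://api-v3.monarchinitiative.org/v3/api"  # assumed endpoint

def search_entities(term: str, limit: int = 5) -> list[dict]:
    """Search the knowledge graph for entities (genes, diseases, phenotypes)."""
    resp = requests.get(f"{BASE}/search", params={"q": term, "limit": limit})
    resp.raise_for_status()
    return resp.json().get("items", [])

for item in search_entities("Marfan syndrome"):
    print(item.get("id"), "-", item.get("name"))
```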

    A clinical decision support system for detecting and mitigating potentially inappropriate medications

    Background: Medication errors are a leading cause of preventable harm to patients. In older adults, especially those over 65, the impact of ageing on the therapeutic effectiveness and safety of drugs is a significant concern. Consequently, certain medications, called Potentially Inappropriate Medications (PIMs), can be dangerous in the elderly and should be avoided. Identifying PIMs is time-consuming and error-prone for health professionals and patients, as the criteria underlying the definition of PIMs are complex and subject to frequent updates. Moreover, the criteria are not available in a representation that health systems can interpret and reason with directly. Objectives: This thesis aims to demonstrate the feasibility of using an ontology/rule-based approach in a clinical knowledge base to identify PIMs, and to show how constraint solvers can be used effectively to suggest alternative medications and administration schedules that resolve or minimise the undesirable side effects of PIMs. Methodology: To address these objectives, we propose a novel integrated approach that uses formal rules to represent the PIM criteria and inference engines to perform the reasoning, presented in the context of a Clinical Decision Support System (CDSS). The approach aims to detect, resolve, or minimise the undesirable side effects of PIMs through an ontology (knowledge base) and inference engines incorporating multiple reasoning approaches. Contributions: The main contribution lies in the framework to formalise PIMs, including the steps required to define guideline requisites and to create inference rules that detect inappropriate medications and propose alternative drugs. No formalisation of the selected guideline (the Beers Criteria) can be found in the literature; hence, this thesis provides a novel ontology for it. Moreover, our process of minimising undesirable side effects offers a novel approach that enhances and optimises the drug rescheduling process, providing a more accurate way to minimise the effect of drug interactions in clinical practice.
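
    A minimal sketch of the rule-based detection idea, assuming a simplified hand-written rule format; the thesis instead encodes the Beers Criteria in an ontology and applies inference engines. The suggested alternative below is illustrative, not a clinical recommendation.

```python
# A minimal sketch of rule-based PIM detection with a simplified rule format.
from dataclasses import dataclass

@dataclass
class Patient:
    age: int
    medications: list[str]

# Illustrative rule: first-generation antihistamines such as diphenhydramine
# are listed by the Beers Criteria as potentially inappropriate for adults >= 65.
PIM_RULES = [
    {
        "drug": "diphenhydramine",
        "condition": lambda p: p.age >= 65,
        "rationale": "strong anticholinergic; risk of confusion and falls",
        "alternative": "loratadine",  # hypothetical suggestion for illustration
    },
]

def detect_pims(patient: Patient) -> list[dict]:
    """Return every rule that fires for this patient's medication list."""
    return [
        rule for rule in PIM_RULES
        if rule["drug"] in patient.medications and rule["condition"](patient)
    ]

for alert in detect_pims(Patient(age=72, medications=["diphenhydramine", "metformin"])):
    print(f"PIM: {alert['drug']} ({alert['rationale']}); consider {alert['alternative']}")
```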

    Trusted Provenance with Blockchain - A Blockchain-based Provenance Tracking System for Virtual Aircraft Component Manufacturing

    The importance of provenance in the digital age has led to significant interest in utilizing blockchain technology for tamper-proof storage of provenance data. This thesis proposes a blockchain-based provenance tracking system for the certification of aircraft components. The aim is to design and implement a system that can ensure the trustworthy, tamper-resistant storage of provenance documents originating from an aircraft manufacturing process. To achieve this, the thesis presents a systematic literature review, which provides a comprehensive overview of existing work in the field of provenance and blockchain technology. After identifying strategies for storing provenance data on a blockchain, a system was designed to meet the requirements of stakeholders in the aviation industry, gathered systematically through stakeholder interviews. The system was implemented using a combination of smart contracts and a graphical user interface to provide tamper-resistant, traceable storage of relevant data on a transparent blockchain. An evaluation against the requirements identified during the requirements engineering process found that the proposed system meets all of them. Overall, this thesis offers insight into a potential application of blockchain technology in the aviation industry and provides a valuable resource for researchers and industry professionals seeking to leverage blockchain technology for provenance tracking and certification purposes.
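
    A minimal sketch of the tamper-evidence principle underlying such a system: anchor the hash of each provenance document in an append-only ledger and later verify documents against it. The in-memory list stands in for the thesis's smart-contract storage, and the component identifier and document are hypothetical.

```python
# A minimal sketch of hash anchoring for tamper-evident provenance storage.
# The in-memory "ledger" stands in for a real blockchain / smart contract.
import hashlib
import time

ledger: list[dict] = []  # stand-in for on-chain storage

def anchor_document(doc: bytes, component_id: str) -> str:
    """Record the document's hash (not the document itself) on the ledger."""
    digest = hashlib.sha256(doc).hexdigest()
    ledger.append({"component": component_id, "hash": digest, "ts": time.time()})
    return digest

def verify_document(doc: bytes, component_id: str) -> bool:
    """A document is authentic if its hash matches an anchored entry."""
    digest = hashlib.sha256(doc).hexdigest()
    return any(e["component"] == component_id and e["hash"] == digest for e in ledger)

cert = b"inspection report for component A320-XYZ"  # hypothetical document
anchor_document(cert, "A320-XYZ")
print(verify_document(cert, "A320-XYZ"))         # True
print(verify_document(cert + b"!", "A320-XYZ"))  # False: tampering detected
```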

    Mining Butterflies in Streaming Graphs

    This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. A data-driven heuristic is used to architect the systems. To this end, the growth patterns of bipartite streaming graphs are first mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) utilized to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly in the case of concept drift detection. It displays robust realization of streaming growth patterns independent of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm the simultaneous effectiveness and efficiency of sGrapp and sGradd. sGrapp achieves a mean absolute percentage error of at most 0.05 and 0.14 for the cumulative butterfly count in streaming graphs with uniform and non-uniform temporal distributions, respectively, and a processing throughput of 1.5 million data records per second. Its throughput is 160x that of the baselines, and its estimation error is 0.02x theirs. sGradd demonstrates improving performance over time, achieves a zero false-detection rate both when no drift is present and when a drift has already been detected, and detects sequential drifts within zero to a few seconds of their occurrence, regardless of drift intervals.
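
    For context, a butterfly is a (2,2)-biclique: two vertices on one side of a bipartite graph that share two common neighbors on the other side. The sketch below is a brute-force exact count on a static graph, assuming an in-memory edge list; it makes no attempt at sGrapp's streaming approximation and serves only to pin down the quantity being estimated.

```python
# A minimal sketch of exact butterfly ((2,2)-biclique) counting on a static
# bipartite graph; a brute-force baseline, not sGrapp's streaming algorithm.
from itertools import combinations
from math import comb

def count_butterflies(edges: list[tuple[str, str]]) -> int:
    """Count pairs of left vertices together with pairs of shared right neighbors."""
    neighbors: dict[str, set[str]] = {}
    for u, v in edges:  # edges given as (left, right) pairs
        neighbors.setdefault(u, set()).add(v)
    total = 0
    for u, w in combinations(neighbors, 2):
        shared = len(neighbors[u] & neighbors[w])
        total += comb(shared, 2)  # each pair of shared neighbors closes one butterfly
    return total

# Two left vertices sharing two right neighbors form exactly one butterfly.
print(count_butterflies([("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2")]))  # 1
```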

    Method versatility in analysing human attitudes towards technology

    Various research domains are facing new challenges brought about by growing volumes of data. To make optimal use of these data, and to increase the reproducibility of research findings, method versatility is required. Method versatility is the ability to flexibly apply widely varying data analytic methods depending on the study goal and the dataset characteristics. Method versatility is an essential characteristic of data science, but in other areas of research, such as educational science or psychology, its importance is yet to be fully accepted. Versatile methods can enrich the repertoire of specialists who validate psychometric instruments, conduct data analysis of large-scale educational surveys, and communicate their findings to the academic community, activities which correspond to three stages of the research cycle: measurement, research per se, and communication. In this thesis, studies related to these stages share the common theme of human attitudes towards technology, a topic that has become vitally important in our age of ever-increasing digitization. The thesis is based on four studies, in which method versatility is introduced in four different ways: the consecutive use of methods, the toolbox choice, the simultaneous use, and the range extension. In the first study, different methods of psychometric analysis are used consecutively to reassess the psychometric properties of a recently developed scale measuring affinity for technology interaction. In the second, the random forest algorithm and hierarchical linear modeling, as tools from the machine learning and statistical toolboxes, are applied to data analysis of a large-scale educational survey related to students' attitudes to information and communication technology. In the third, the challenge of selecting the number of clusters in model-based clustering is addressed by the simultaneous use of model fit, cluster separation, and stability-of-partition criteria, so that generalizable, separable clusters can be selected in data related to teachers' attitudes towards technology (see the sketch after this abstract). The fourth reports the development and evaluation of a scholarly knowledge graph-powered dashboard aimed at extending the range of scholarly communication means. The findings of the thesis can be helpful for increasing method versatility in various research areas. They can also facilitate methodological advancement of academic training in data analysis and aid further development of scholarly communication in accordance with open science principles.
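
    A minimal sketch of combining selection criteria for the number of clusters in model-based clustering, in the spirit of the third study: model fit via BIC together with a cluster-separation measure (the silhouette score is assumed here as a stand-in); the stability-of-partition criterion is omitted, and the data are synthetic.

```python
# A minimal sketch: choose the number of mixture components k by looking at
# model fit (BIC, lower is better) and cluster separation (silhouette, higher
# is better) simultaneously rather than relying on a single criterion.
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Three well-separated synthetic 2-D clusters of 100 points each.
X = np.vstack([rng.normal(loc, 0.5, size=(100, 2)) for loc in (0, 4, 8)])

for k in range(2, 6):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    labels = gmm.predict(X)
    print(f"k={k}: BIC={gmm.bic(X):.1f}, silhouette={silhouette_score(X, labels):.2f}")
# Prefer a k where BIC is low AND the resulting clusters remain well separated.
```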

    Predicate Matrix: an interoperable lexical knowledge base for predicates

    The Predicate Matrix is a new lexical-semantic resource resulting from the integration of multiple knowledge sources, among them FrameNet, VerbNet, PropBank, and WordNet. The Predicate Matrix provides an extensive and robust lexicon that improves the interoperability between the aforementioned semantic resources. Its creation is based on the integration of SemLink and new mappings obtained with automatic methods that link semantic knowledge at the lexical and role levels. We have also extended the Predicate Matrix to cover nominal predicates (English, Spanish) and predicates in other languages (Spanish, Catalan, and Basque). As a result, the Predicate Matrix provides a multilingual lexicon that enables interoperable semantic analysis across multiple languages.
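
    A minimal, hypothetical illustration of the kind of alignment such a lexicon offers: one entry mapping the same predicate and role across resources, so that a role label from one resource can be translated into another. The identifiers below follow the conventions of the respective resources but are not taken from the Predicate Matrix itself.

```python
# A minimal sketch of an interoperable predicate lexicon: one (lemma, role)
# entry aligned across FrameNet, VerbNet, PropBank, and WordNet.
# Identifiers are illustrative, not extracted from the Predicate Matrix.
PREDICATE_MATRIX = {
    ("give", "Agent"): {
        "verbnet": ("give-13.1", "Agent"),
        "framenet": ("Giving", "Donor"),
        "propbank": ("give.01", "ARG0"),
        "wordnet": "give%2:40:00::",
    },
}

def map_role(lemma: str, vn_role: str, target: str):
    """Translate a VerbNet-style role into its counterpart in another resource."""
    entry = PREDICATE_MATRIX.get((lemma, vn_role))
    return entry.get(target) if entry else None

print(map_role("give", "Agent", "propbank"))  # ('give.01', 'ARG0')
```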

    Computational design of dynamic receptor-peptide signaling complexes applied to chemotaxis.

    Engineering protein biosensors that sensitively respond to specific biomolecules by triggering precise cellular responses is a major goal of diagnostics and synthetic cell biology. Previous biosensor designs have largely relied on binding structurally well-defined molecules. In contrast, approaches that couple the sensing of flexible compounds to intended cellular responses would greatly expand potential biosensor applications. Here, to address these challenges, we develop a computational strategy for designing signaling complexes between conformationally dynamic proteins and peptides. To demonstrate the power of the approach, we create ultrasensitive chemotactic receptor-peptide pairs capable of eliciting potent signaling responses and strong chemotaxis in primary human T cells. Unlike traditional approaches that engineer static binding complexes, our dynamic structure design strategy optimizes contacts with multiple binding and allosteric sites accessible through dynamic conformational ensembles to achieve strongly enhanced signaling efficacy and potency. Our study suggests that a conformationally adaptable binding interface coupled to a robust allosteric transmission region is a key evolutionary determinant of peptidergic GPCR signaling systems. The approach lays a foundation for designing peptide-sensing receptors and signaling peptide ligands for basic and therapeutic applications.

    Linking provenance and its metadata in multi-organizational environments

    Reproducibility issues are widely reported in the life sciences. In response, scientific communities have called for enhanced provenance information documenting the complete research life cycle, starting from the acquisition of biological or environmental material and ending with the translation of research results into practice. The integrity and trustworthiness of such provenance can be achieved by applying versioning mechanisms and cryptographic techniques, such as hashes or digital signatures, which constitute provenance metadata. However, the available provenance literature lacks an analysis of mechanisms for the exchange of provenance and its metadata between organizations, as well as a grounded proposal for linking provenance and its metadata. In this work, we provide an in-depth analysis of approaches for coupling provenance information and its metadata with documented research objects in the context of multi-organizational processes, leading to a categorization of possible approaches, a description of their key properties, and the derivation of requirements for underlying provenance models. We address the requirements by proposing a mechanism for linking provenance and its metadata that extends the Common Provenance Model, the open conceptual foundation for the ISO 23494 provenance standard series, currently under development. The concepts are demonstrated and validated on two complex use cases. This work is intended as a harmonized source of information on provenance coupling in the context of exchanging provenance between organizations, which can be consulted when designing or choosing a provenance solution; this type of usage is exemplified by the extension of the Common Provenance Model as another step toward a provenance standard for the life sciences.
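
    A minimal sketch of one way to link a provenance document with its integrity metadata, assuming a simple JSON bundle and an HMAC in place of a full digital-signature infrastructure; the Common Provenance Model defines richer structures than this illustration reproduces.

```python
# A minimal sketch of bundling a provenance record with its integrity
# metadata (hash + keyed signature) so both can be exchanged together.
import hashlib
import hmac
import json

SIGNING_KEY = b"organization-secret"  # stand-in for a real signing key / PKI

def attach_metadata(provenance: dict) -> dict:
    """Serialize the record canonically, then derive its hash and signature."""
    payload = json.dumps(provenance, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"provenance": provenance, "metadata": {"sha256": digest, "hmac": signature}}

def verify(bundle: dict) -> bool:
    """Recompute hash and signature; any edit to the provenance breaks both."""
    payload = json.dumps(bundle["provenance"], sort_keys=True).encode()
    ok_hash = hashlib.sha256(payload).hexdigest() == bundle["metadata"]["sha256"]
    ok_sig = hmac.compare_digest(
        hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest(),
        bundle["metadata"]["hmac"],
    )
    return ok_hash and ok_sig

bundle = attach_metadata({"sample": "S-001", "acquired": "2023-05-01"})
print(verify(bundle))  # True; tampering with the provenance makes this False
```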