266 research outputs found

    Agent-based modeling: a systematic assessment of use cases and requirements for enhancing pharmaceutical research and development productivity.

    A crisis continues to brew within the pharmaceutical research and development (R&D) enterprise: productivity continues declining as costs rise, despite ongoing, often dramatic scientific and technical advances. To reverse this trend, we offer various suggestions for both the expansion and broader adoption of modeling and simulation (M&S) methods. We suggest strategies and scenarios intended to enable new M&S use cases that directly engage R&D knowledge generation and build actionable mechanistic insight, thereby opening the door to enhanced productivity. What M&S requirements must be satisfied to access and open the door, and begin reversing the productivity decline? Can current methods and tools fulfill the requirements, or are new methods necessary? We draw on the relevant, recent literature to provide and explore answers. In so doing, we identify essential, key roles for agent-based and other methods. We assemble a list of requirements necessary for M&S to meet the diverse needs distilled from a collection of research, review, and opinion articles. We argue that to realize its full potential, M&S should be actualized within a larger information technology framework--a dynamic knowledge repository--wherein models of various types execute, evolve, and increase in accuracy over time. We offer some details of the issues that must be addressed for such a repository to accrue the capabilities needed to reverse the productivity decline

    Generation and Applications of Knowledge Graphs in Systems and Networks Biology

    The acceleration in the generation of data in the biomedical domain has necessitated the use of computational approaches to assist in its interpretation. However, these approaches rely on the availability of high-quality, structured, formalized biomedical knowledge. This thesis has two goals: to improve methods for curation and semantic data integration in order to generate high-granularity biological knowledge graphs, and to develop novel methods for using prior biological knowledge to propose new biological hypotheses. The first two publications describe an ecosystem for handling biological knowledge graphs encoded in the Biological Expression Language throughout the stages of curation, visualization, and analysis. The next two publications describe the reproducible acquisition and integration of high-granularity knowledge with low contextual specificity from structured biological data sources on a massive scale, and support the semi-automated curation of new content at high speed and precision. After building the ecosystem and acquiring content, the last three publications in this thesis demonstrate three different applications of biological knowledge graphs in modeling and simulation. The first demonstrates the use of agent-based modeling for simulation of neurodegenerative disease biomarker trajectories using biological knowledge graphs as priors. The second applies network representation learning to prioritize nodes in biological knowledge graphs based on corresponding experimental measurements to identify novel targets. Finally, the third uses biological knowledge graphs and develops algorithms to deconvolute the mechanism of action of drugs, which could also serve to identify drug repositioning candidates. Ultimately, this thesis lays the groundwork for production-level applications of drug repositioning algorithms and other knowledge-driven approaches to analyzing biomedical experiments

    Semantic data integration and knowledge graph creation at scale

    Contrary to data, knowledge is often abstract. Concrete knowledge can be achieved through the inclusion of semantics in the data models, highlighting the role of data integration. The massive growth in the volume of data in recent years has increased the demand for scaling up data management techniques; materializing data integration, also known as knowledge graph creation, falls into that category. In this thesis, we investigate efficient methods and techniques for materializing data integration. We formalize the process of materializing data integration. We formally define the characteristics of a materialized data integration system that merges the data operators and sources. Owing to this formalism, both layers of data integration, data-level and schema-level integration, are formalized in the context of mapping assertions. We explore optimization opportunities for improving the materialization of data integration systems. We recognize three angles, including intra- and inter-mapping assertions, from which the materialization can be improved. Accordingly, we propose source-based, mapping-based, and inter-mapping assertion groups of optimization techniques. We utilize our proposed techniques in three real-world projects. We illustrate how applying these optimization techniques contributes to meeting the objectives of these projects. Furthermore, we study the parameters impacting the performance of materialization of data integration. Relying on reported parameters and presumably impacting parameters, we build four groups of testbeds. We empirically study the performance of these testbeds in the presence and absence of our proposed techniques, in terms of execution time. We observe that the savings can be up to 75%. Lastly, we contribute to facilitating the process of declarative data integration system definition. We propose two data operation function signatures in Function Ontology (FnO).
The first set of functions is designed to perform the task of entity alignment by resorting to an entity and relation linking tool. The second library consists of domain-specific functions to align genomic entities by harmonizing their representations. Finally, we introduce a tool equipped with a user interface to facilitate the process of defining declarative mapping rules by allowing users to explore the data sources and unified schema while defining their correspondences.

    Advancing translational research with the Semantic Web

    Background: A fundamental goal of the U.S. National Institutes of Health (NIH) "Roadmap" is to strengthen Translational Research, defined as the movement of discoveries in basic research to application at the clinical level. A significant barrier to translational research is the lack of uniformly structured data across related biomedical domains. The Semantic Web is an extension of the current Web that enables navigation and meaningful use of digital resources by automatic processes. It is based on common formats that support aggregation and integration of data drawn from diverse sources. A variety of technologies have been built on this foundation that, together, support identifying, representing, and reasoning across a wide range of biomedical data. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG), set up within the framework of the World Wide Web Consortium, was launched to explore the application of these technologies in a variety of areas. Subgroups focus on making biomedical data available in RDF, working with biomedical ontologies, prototyping clinical decision support systems, working on drug safety and efficacy communication, and supporting disease researchers navigating and annotating the large amount of potentially relevant literature.
    Results: We present a scenario that shows the value of the information environment the Semantic Web can support for aiding neuroscience researchers. We then report on several projects by members of the HCLSIG, in the process illustrating the range of Semantic Web technologies that have applications in areas of biomedicine.
    Conclusion: Semantic Web technologies present both promise and challenges. Current tools and standards are already adequate to implement components of the bench-to-bedside vision. On the other hand, these technologies are young. Gaps in standards and implementations still exist, and adoption is limited by typical problems with early technology, such as the need for a critical mass of practitioners and installed base, and growing pains as the technology is scaled up. Still, the potential of interoperable knowledge sources for biomedicine, at the scale of the World Wide Web, merits continued work.

    Computational Advances in Drug Safety: Systematic and Mapping Review of Knowledge Engineering Based Approaches

    Drug Safety (DS) is a domain with significant public health and social impact. Knowledge Engineering (KE) is the Computer Science discipline elaborating on methods and tools for developing “knowledge-intensive” systems, depending on a conceptual “knowledge” schema and some kind of “reasoning” process. The present systematic and mapping review aims to investigate KE-based approaches employed for DS and highlight the introduced added value as well as trends and possible gaps in the domain. Journal articles published between 2006 and 2017 were retrieved from PubMed/MEDLINE and Web of Science® (873 in total) and filtered based on a comprehensive set of inclusion/exclusion criteria. The 80 finally selected articles were reviewed on full-text, while the mapping process relied on a set of concrete criteria (concerning specific KE and DS core activities, special DS topics, employed data sources, reference ontologies/terminologies, and computational methods, etc.). The analysis results are publicly available as online interactive analytics graphs. The review clearly depicted increased use of KE approaches for DS. The collected data illustrate the use of KE for various DS aspects, such as Adverse Drug Event (ADE) information collection, detection, and assessment. Moreover, the quantified analysis of using KE for the respective DS core activities highlighted room for intensifying research on KE for ADE monitoring, prevention and reporting. Finally, the assessed use of the various data sources for DS special topics demonstrated extensive use of dominant data sources for DS surveillance, i.e., Spontaneous Reporting Systems, but also increasing interest in the use of emerging data sources, e.g., observational healthcare databases, biochemical/genetic databases, and social media. 
Various exemplar applications were identified with promising results, e.g., improvement in Adverse Drug Reaction (ADR) prediction, detection of drug interactions, and novel ADE profiles related to specific mechanisms of action. Nevertheless, since the reviewed studies mostly concerned proof-of-concept implementations, more intense research is required to increase the maturity level that is necessary for KE approaches to reach routine DS practice. In conclusion, we argue that efficiently addressing DS data analytics and management challenges requires the introduction of high-throughput KE-based methods for effective knowledge discovery and management, ultimately resulting in the establishment of a continuous learning DS system

    Discovering lesser known molecular players and mechanistic patterns in Alzheimer's disease using an integrative disease modelling approach

    Convergence of exponentially advancing technologies is driving medical research with life-changing discoveries. In contrast, repeated failures of high-profile drugs to battle Alzheimer's disease (AD) have made it one of the least successful therapeutic areas. This failure pattern has provoked researchers to grapple with their beliefs about Alzheimer's aetiology. Thus, the growing realisation that Amyloid-β and tau are not 'the' but rather 'one of the' factors necessitates the reassessment of pre-existing data to add new perspectives. To enable a holistic view of the disease, integrative modelling approaches are emerging as a powerful technique. Combining data at different scales and modes could considerably increase the predictive power of the integrative model by filling biological knowledge gaps. However, the reliability of the derived hypotheses largely depends on the completeness, quality, consistency, and context-specificity of the data. Thus, there is a need for agile methods and approaches that efficiently interrogate and utilise existing public data. This thesis presents the development of novel approaches and methods that address intrinsic issues of data integration and analysis in AD research. It aims to prioritise lesser-known AD candidates using highly curated and precise knowledge derived from integrated data. Here much of the emphasis is put on quality, reliability, and context-specificity. This thesis work showcases the benefit of integrating well-curated and disease-specific heterogeneous data in a semantic web-based framework for mining actionable knowledge. Furthermore, it introduces the challenges encountered while harvesting information from literature and transcriptomic resources. A state-of-the-art text-mining methodology is developed to extract miRNAs and their regulatory roles in diseases and genes from the biomedical literature.
To enable meta-analysis of biologically related transcriptomic data, a highly-curated metadata database has been developed, which explicates annotations specific to human and animal models. Finally, to corroborate common mechanistic patterns — embedded with novel candidates — across large-scale AD transcriptomic data, a new approach to generate gene regulatory networks has been developed. The work presented here has demonstrated its capability in identifying testable mechanistic hypotheses containing previously unknown or emerging knowledge from public data in two major publicly funded projects for Alzheimer's, Parkinson's and Epilepsy diseases

    Conceptualization of Computational Modeling Approaches and Interpretation of the Role of Neuroimaging Indices in Pathomechanisms for Pre-Clinical Detection of Alzheimer Disease

    With swift advancements in next-generation sequencing technologies alongside the voluminous growth of biological data, a variety of data resources such as databases and web services has been created to facilitate data management, accessibility, and analysis. However, the burden of interoperability between dynamically growing data resources is an increasingly rate-limiting step in biomedicine, specifically concerning neurodegeneration. Over the years, massive investments and technological advancements for dementia research have resulted in large proportions of unmined data. Accordingly, there is an essential need for intelligent as well as integrative approaches to mine available data and substantiate novel research outcomes. Semantic frameworks provide a unique possibility to integrate multiple heterogeneous, high-resolution data resources with semantic integrity using standardized ontologies and vocabularies for context-specific domains. In the current work, (i) the functionality of a semantically structured terminology for mining pathway-relevant knowledge from the literature, called Pathway Terminology System, is demonstrated and (ii) a context-specific, high-granularity semantic framework for neurodegenerative diseases, known as NeuroRDF, is presented. Neurodegenerative disorders are especially complex as they are characterized by widespread manifestations and the potential for dramatic alterations in disease progression over time. Early detection and prediction strategies through clinical pointers can provide promising solutions for effective treatment of AD. In the current work, we have presented the importance of bridging the gap between clinical and molecular biomarkers to effectively contribute to dementia research. Moreover, we address the need for a formalized framework called NIFT to automatically mine relevant clinical knowledge from the literature for substantiating high-resolution cause-and-effect models

    Causal Inference in Healthcare: Approaches to Causal Modeling and Reasoning through Graphical Causal Models

    In the era of big data, researchers have access to large healthcare datasets collected over a long period. These datasets hold valuable information, frequently investigated using traditional Machine Learning algorithms or Neural Networks. These algorithms perform well at finding patterns in datasets (as a predictive machine); however, the models lack the interpretability needed for use in the healthcare sector (as an explainable machine). Without exploring underlying causal relationships, the algorithms fail to explain their reasoning. Causal Inference, a relatively new branch of Artificial Intelligence, deals with interpretability and portrays causal relationships in data through graphical models. It explores the issue of causality and works towards explainability of underlying causal models deeply buried in data. For this dissertation work, the research goal is to use Causal Inference to build an applied framework that lets researchers leverage observational datasets in understanding causal relationships between features. To achieve that, we focus on specific objectives such as (a) the addition of background knowledge to causal structure learning algorithms, (b) the proposal of new causal inference methodologies, (c) generation of theories connecting causality to standard statistical analyses (e.g., Odds Ratio, Survival Analysis), and (d) application of proposed approaches in real-world healthcare problems. This dissertation encapsulates the tasks mentioned above, through various new methodologies and experiments under the rubric of the Structural Theory of Causation. We discuss the common research theme in causal inference, historical development, the structural theory of causation, and underlying assumptions.
Finally, we explore the impact of these proposed methodologies on a real-world treatment controversy concerning delirium patients, by examining the efficacy of antipsychotic drugs prescribed to treat delirium in the ICU, using a curated observational healthcare dataset

    An ontology for formal representation of medication adherence-related knowledge : case study in breast cancer

    Indiana University-Purdue University Indianapolis (IUPUI). Medication non-adherence is a major healthcare problem that negatively impacts the health and productivity of individuals and society as a whole. Reasons for medication non-adherence are multifaceted, with no clear-cut solution. Adherence to medication remains a difficult area to study, due to inconsistencies in representing medication-adherence behavior data that pose a challenge to humans and today’s computer technology when interpreting and synthesizing such complex information. A consistent conceptual framework for medication adherence is needed to facilitate domain understanding, sharing, and communication, as well as to enable researchers to formally compare the findings of studies in systematic reviews. The goal of this research is to create a common language that bridges human and computer technology by developing a controlled structured vocabulary of medication adherence behavior, the “Medication Adherence Behavior Ontology” (MAB-Ontology), using breast cancer as a case study to inform and evaluate the proposed ontology and to demonstrate its application to a real-world situation. The intention is for MAB-Ontology to be developed against the background of a philosophical analysis of terms such as belief and desire, and to be human- and computer-understandable and interoperable with other systems that support scientific research. The design process for MAB-Ontology was carried out using the METHONTOLOGY method combined with the Basic Formal Ontology (BFO) principles of best practice. This approach introduces a novel knowledge acquisition step that guides capturing medication-adherence-related data from different knowledge sources, including adherence assessment, adherence determinants, adherence theories, adherence taxonomies, and tacit knowledge source types.
These sources were analyzed using a systematic approach in which a set of questions was applied to all source types to guide data extraction and inform domain conceptualization. A set of intermediate representations involving tables and graphs was used to allow for domain evaluation before implementation. The resulting ontology included 629 classes, 529 individuals, 51 object properties, and 2 data properties. The intermediate representation was formalized into OWL using Protégé. The MAB-Ontology was evaluated through competency questions, a use-case scenario, and face validity, and was found to satisfy the requirement specification. This study provides a unified method for developing a computer-based adherence model that can be applied among various disease groups and different drug categories