
    Towards an In Silico Approach to Personalized Pharmacokinetics

    Computational and human-based methods for knowledge discovery over knowledge graphs

    The modern world has evolved alongside the large-scale exploitation of data and information. Every day, increasing volumes of data from various sources and in various formats are stored, making it challenging to manage and integrate them to discover new knowledge. The appropriate use of data in various sectors of society, such as education, healthcare, e-commerce, and industry, provides advantages for decision support in these areas. However, knowledge discovery becomes challenging since data may come from heterogeneous sources in which important information is hidden. Thus, new approaches that adapt to the challenges of knowledge discovery in such heterogeneous data environments are required. The semantic web and knowledge graphs (KGs) are becoming increasingly relevant on the road to knowledge discovery. This thesis tackles the problem of knowledge discovery over KGs built from heterogeneous data sources. We provide a neuro-symbolic artificial intelligence system that integrates symbolic and sub-symbolic frameworks to exploit the semantics encoded in a KG and its structure. The symbolic system relies on existing approaches of deductive databases to make explicit the implicit knowledge encoded in a KG. The proposed deductive database DSDS can derive new statements to ego networks given an abstract target prediction. Thus, DSDS minimizes data sparsity in KGs. In addition, a sub-symbolic system relies on knowledge graph embedding (KGE) models. KGE models are commonly applied in the KG completion task to represent entities in a KG in a low-dimensional vector space. However, KGE models are known to suffer from data sparsity, and a symbolic system helps to overcome this limitation. The proposed approach discovers knowledge given a target prediction in a KG and extracts unknown implicit information related to the target prediction.
As a proof of concept, we have implemented the neuro-symbolic system on top of a KG for lung cancer to predict polypharmacy treatment effectiveness. The symbolic system implements a deductive system that deduces pharmacokinetic drug-drug interactions encoded as a set of rules in a Datalog program. Additionally, the sub-symbolic system predicts treatment effectiveness using a KGE model, which preserves the KG structure. An ablation study on the components of our approach is conducted, considering state-of-the-art KGE methods. The observed results provide evidence for the benefits of the neuro-symbolic integration of our approach, where the neuro-symbolic system for an abstract target prediction exhibits improved results. The improvement occurs because the symbolic system increases the prediction capacity of the sub-symbolic system. Moreover, the proposed neuro-symbolic artificial intelligence system is evaluated in the Industry 4.0 (I4.0) domain, demonstrating its effectiveness in determining relatedness among standards and analyzing their properties to detect unknown relations in the I4.0KG. The results achieved allow us to conclude that the proposed neuro-symbolic approach for an abstract target prediction improves the prediction capability of KGE models by minimizing data sparsity in KGs.
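
The deductive step described in this abstract can be illustrated with a minimal sketch. The rule, the predicate names (`inhibits`, `metabolized_by`), and the facts below are assumptions chosen for illustration; they are not the rules or data of the thesis, and the set comprehension only mimics how a single Datalog rule would derive new statements from existing ones.

```python
# Hypothetical facts for illustration only; not taken from the thesis.
inhibits = {("ketoconazole", "CYP3A4")}
metabolized_by = {("irinotecan", "CYP3A4"), ("warfarin", "CYP2C9")}


def deduce_pk_ddis(inhibits, metabolized_by):
    """Apply one Datalog-style rule (an assumed example rule):

    pk_ddi(A, B) :- inhibits(A, E), metabolized_by(B, E), A != B.
    """
    return {
        (a, b)
        for (a, enzyme_a) in inhibits
        for (b, enzyme_b) in metabolized_by
        if enzyme_a == enzyme_b and a != b
    }


# Derived statements could then be added to the graph before training
# an embedding model, which is how deduction can reduce sparsity.
derived = deduce_pk_ddis(inhibits, metabolized_by)
print(derived)
```

In this toy setting, only the pair sharing an enzyme is derived; a real deductive database would evaluate many such rules to a fixed point.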

    Drug interaction prediction using ontology-driven hypothetical assertion framework for pathway generation followed by numerical simulation

    Background: In accordance with the increasing amount of information concerning individual differences in drug response and molecular interaction, the role of in silico prediction of drug interaction on the pathway level is becoming more and more important. However, in view of the interferences for the identification of new drug interactions, most conventional information models of a biological pathway would have limitations. As a reflection of real-world biological events triggered by a stimulus, it is important to facilitate the incorporation of known molecular events for inferring (unknown) possible pathways and hypothetic drug interactions. Here, we propose a new Ontology-Driven Hypothetic Assertion (OHA) framework including pathway generation, drug interaction detection, simulation model generation, numerical simulation, and hypothetic assertion. Potential drug interactions are detected from drug metabolic pathways dynamically generated by molecular events triggered after the administration of certain drugs. Numerical simulation makes it possible to estimate the degree of side effects caused by the predicted drug interactions. New hypothetic assertions of the potential drug interactions and simulation are deduced from the Drug Interaction Ontology (DIO) written in Web Ontology Language (OWL). Results: The concept of the Ontology-Driven Hypothetic Assertion (OHA) framework was demonstrated with known interactions between irinotecan (CPT-11) and ketoconazole. Four drug interactions that involved cytochrome P450 (CYP3A4) and albumin as potential drug interaction proteins were automatically detected from the Drug Interaction Ontology (DIO). The effects of the two interactions involving CYP3A4 were quantitatively evaluated with numerical simulation. The co-administration of ketoconazole may increase the AUC and Cmax of SN-38 (the active metabolite of irinotecan) to 108% and 105%, respectively. We also estimated the potential effects of genetic variations: the AUC and Cmax of SN-38 may increase to 208% and 165%, respectively, with the genetic variation UGT1A1*28/*28, which reduces the expression of UGT1A1 down to 30%. Conclusion: These results demonstrate that the Ontology-Driven Hypothetic Assertion framework is a promising approach for in silico prediction of drug interactions. Future research on the in silico prediction of individual differences in drug response and of drug interactions after the administration of multiple drugs includes: expansion of the Drug Interaction Ontology to other drugs, incorporation of a virtual population model for genetic variation analysis, and refinement of the pathway generation rules, the drug interaction detection rules, and the numerical simulation models.

    Leveraging syntactic and semantic graph kernels to extract pharmacokinetic drug drug interactions from biomedical literature

    BACKGROUND: Information about drug-drug interactions (DDIs) supported by scientific evidence is crucial for establishing computational knowledge bases for applications like pharmacovigilance. Since new reports of DDIs are rapidly accumulating in the scientific literature, text-mining techniques for automatic DDI extraction are critical. We propose a novel approach for automated pharmacokinetic (PK) DDI detection that incorporates syntactic and semantic information into graph kernels, to address the problem of sparseness associated with syntactic-structural approaches. First, we used a novel all-path graph kernel using a shallow semantic representation of sentences. Next, we statistically integrated fine-granular semantic classes into the dependency and shallow semantic graphs. RESULTS: When evaluated on the PK DDI corpus, our approach significantly outperformed the original all-path graph kernel that is based on dependency structure. Our system that combined the dependency graph kernel with semantic classes achieved the best F-scores of 81.94% for in vivo PK DDIs and 69.34% for in vitro PK DDIs. Further, combining the shallow semantic graph kernel with semantic classes achieved the highest precisions of 84.88% for in vivo PK DDIs and 74.83% for in vitro PK DDIs. CONCLUSIONS: We presented a graph-kernel-based approach that combines syntactic and semantic information for extracting pharmacokinetic DDIs from biomedical literature. Experimental results showed that our proposed approach could extract PK DDIs from literature effectively and significantly enhanced the performance of the original all-path graph kernel based on dependency structure.

    Identifying Novel Drug Indications through Automated Reasoning

    Background: With the large amount of pharmacological and biological knowledge available in the literature, finding novel drug indications for existing drugs using in silico approaches has become increasingly feasible. Typical literature-based approaches generate new hypotheses in the form of protein-protein interaction networks by linking concepts based on their co-occurrences within abstracts. However, this kind of approach tends to generate too many hypotheses, and identifying new drug indications from large networks can be a time-consuming process. Methodology: In this work, we developed a method that acquires the necessary facts from literature and knowledge bases, and identifies new drug indications through automated reasoning. This is achieved by encoding the molecular effects caused by drug-target interactions and links to various diseases and drug mechanisms as domain knowledge in AnsProlog, a declarative language that is useful for automated reasoning, including reasoning with incomplete information. Unlike other literature-based approaches, our approach is more fine-grained, especially in identifying indirect relationships for drug indications. Conclusion/Significance: To evaluate the capability of our approach in inferring novel drug indications, we applied our method to 943 drugs from DrugBank and asked whether any of these drugs have potential anti-cancer activities based on information on their targets and molecular interaction types alone. A total of 507 drugs were found to have the potential to be used for cancer treatments. Among the potential anti-cancer drugs, 67 out of 81 drugs (a recall of 82.7%) are indeed known cancer drugs. In addition, 144 out of 289 drugs (a recall of 49.8%) are non-cancer drugs that are currently tested in clinical trials for cancer treatments.
These results suggest that our method is able to infer drug indications (original or alternative) based on their molecular targets and interactions alone and has the potential to discover novel drug indications for existing drugs. The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.004094
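
The two recall figures quoted in this abstract follow directly from the reported counts. The short check below is only a worked arithmetic example; the function name `recall` is an assumption for illustration.

```python
def recall(retrieved_relevant: int, total_relevant: int) -> float:
    """Recall = relevant items retrieved / all relevant items."""
    return retrieved_relevant / total_relevant


# Counts as reported in the abstract.
known_cancer_drugs = recall(67, 81)
trial_drugs = recall(144, 289)

print(f"{known_cancer_drugs:.1%}")  # 82.7%
print(f"{trial_drugs:.1%}")         # 49.8%
```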

    Extracting Drug-Drug Interactions with Character-Level and Dependency-Based Embeddings

    The DDI track of the TAC-2018 challenge addresses the problem of information retrieval of drug-drug interactions from structured product labeling documents with discontinuous and overlapping entities. In this paper, we present our participation in the event extraction subtask (Task 1). We used a supervised long short-term memory (LSTM) network with conditional random field decoding (LSTM-CRF), together with automatic exploration of word and character features. Additional dependency-based information was integrated into the word embeddings to allow better word representation. Our system scored above the median.

    Structuring the Unstructured: Unlocking pharmacokinetic data from journals with Natural Language Processing

    The development of a new drug is an increasingly expensive and inefficient process. Many drug candidates are discarded due to pharmacokinetic (PK) complications detected at clinical phases. It is critical to accurately estimate the PK parameters of new drugs before being tested in humans since they will determine their efficacy and safety outcomes. Preclinical predictions of PK parameters are largely based on prior knowledge from other compounds, but much of this potentially valuable data is currently locked in the format of scientific papers. With an ever-increasing amount of scientific literature, automated systems are essential to exploit this resource efficiently. Developing text mining systems that can structure PK literature is critical to improving the drug development pipeline. This thesis studied the development and application of text mining resources to accelerate the curation of PK databases. Specifically, the development of novel corpora and suitable natural language processing architectures in the PK domain were addressed. The work presented focused on machine learning approaches that can model the high diversity of PK studies, parameter mentions, numerical measurements, units, and contextual information reported across the literature. Additionally, architectures and training approaches that could efficiently deal with the scarcity of annotated examples were explored. The chapters of this thesis tackle the development of suitable models and corpora to (1) retrieve PK documents, (2) recognise PK parameter mentions, (3) link PK entities to a knowledge base and (4) extract relations between parameter mentions, estimated measurements, units and other contextual information. Finally, the last chapter of this thesis studied the feasibility of the whole extraction pipeline to accelerate tasks in drug development research. 
The results of this thesis demonstrate the potential of text mining approaches to automatically generate PK databases that can aid researchers in the field and ultimately accelerate the drug development pipeline. Additionally, the thesis presents contributions to biomedical natural language processing by developing suitable architectures and corpora for multiple tasks, tackling novel entities and relations within the PK domain.

    Semantic resources in pharmacovigilance: a corpus and an ontology for drug-drug interactions

    International Mention in the doctoral degree. Nowadays, with the increasing use of several drugs for the treatment of one or more different diseases (polytherapy) in large populations, the risk of drug combinations that have not been studied in pre-authorization clinical trials has increased. This provides a favourable setting for the occurrence of drug-drug interactions (DDIs), a common adverse drug reaction (ADR) representing an important risk to patient safety and an increase in healthcare costs. Their early detection is, therefore, a main concern in the clinical setting. Although there are different databases supporting healthcare professionals in the detection of DDIs, the quality of these databases is very uneven, and the consistency of their content is limited. Furthermore, these databases do not scale well to the large and growing volume of pharmacovigilance literature in recent years. In addition, large amounts of current and valuable information are hidden in published articles, scientific journals, books, and technical reports. Thus, the large number of DDI information sources has overwhelmed most healthcare professionals because it is not possible to remain up to date on everything published about DDIs. Computational methods can play a key role in the identification, explanation, and prediction of DDIs on a large scale, since they can be used to collect, analyze and manipulate large amounts of biological and pharmacological data. Natural language processing (NLP) techniques can be used to retrieve and extract DDI information from pharmacological texts, supporting researchers and healthcare professionals in the challenging task of searching for DDI information among different and heterogeneous sources. However, these methods rely on the availability of specific resources providing the domain knowledge, such as databases, terminological vocabularies, corpora, ontologies, and so forth, which are necessary to address the Information Extraction (IE) tasks.
In this thesis, we have developed two semantic resources for the DDI domain that make an important contribution to the research and development of IE systems for DDIs. We have reviewed and analyzed the existing corpora and ontologies relevant to this domain and, based on their strengths and weaknesses, have developed the DDI corpus and the ontology for drug-drug interactions (named DINTO). The DDI corpus has proven to fulfil the characteristics of a high-quality gold standard, and has demonstrated its usefulness as a benchmark for the training and testing of different IE systems in the SemEval-2013 DDIExtraction shared task. Meanwhile, DINTO has been used and evaluated in two different applications. Firstly, it has been proven that the knowledge represented in the ontology can be used to infer DDIs and their different mechanisms. Secondly, we have provided a proof of concept of the contribution of DINTO to NLP, by providing the domain knowledge to be exploited by an IE pilot prototype. From these results, we believe that these two semantic resources will encourage further research into the application of computational methods to the early detection of DDIs.
This work has been partially supported by the Regional Government of Madrid under the Research Network MA2VICMR [S2009/TIC-1542], by the Spanish Ministry of Education under the project MULTIMEDICA [TIN2010-20644-C03-01] and by the European Commission Seventh Framework Programme under the TrendMiner project [FP7-ICT287863]. Official Doctoral Programme in Computer Science and Technology. President: Asunción Gómez Pérez. Secretary: María Belén Ruiz Mezcua. Member: Mariana Neve

    Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

    Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often generalize poorly in low-resource settings and cannot make selective predictions on unknown cases, instead guessing among seen relations, which hinders their applicability. We present NBR, which reformulates biomedical RE as natural language inference (NLI) through indirect supervision. By converting relations to natural language hypotheses, NBR is capable of exploiting semantic cues to alleviate annotation scarcity. By incorporating a ranking-based loss that implicitly calibrates abstinent instances, NBR learns a clearer decision boundary and is instructed to abstain on uncertain instances. Extensive experiments on three widely used biomedical RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in both full-set and low-resource regimes. Our analysis demonstrates that indirect supervision benefits biomedical RE even when a domain gap exists, and combining NLI knowledge with biomedical knowledge leads to the best performance gains. Comment: 16 pages; ACL 2023; code in https://github.com/luka-group/NLI_as_Indirect_Supervisio