3,537 research outputs found

    Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction

    Full text link
    Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDIs prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply the advanced AI and deep learning, the developer and user meet various challenges such as the availability and encoding of data resources, and the design of computational methods. This review summarizes chemical structure based, network based, NLP based and hybrid methods, providing an updated and accessible guide to the broad researchers and development community with different domain knowledge. We introduce widely-used molecular representation and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDIs prediction.Comment: Accepted by Briefings in Bioinformatic

    Biological Applications of Knowledge Graph Embedding Models

    Get PDF
    Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Despite the high predictive accuracy of these approaches, they have limited scalability due to their dependency on time-consuming path exploratory procedures. In recent years, owing to the rapid advances of computational technologies, new approaches for modelling graphs and mining them with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph s inherent structure. These approaches were used to analyse knowledge graphs from different domains where they showed superior performance and accuracy compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biology applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug target interactions and polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems.The work presented in this paper was supported by the CLARIFY project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 875160, and by Insight research centre supported by the Science Foundation Ireland (SFI) grant (12/RC/2289_2)peer-reviewed2021-02-1

    Integrative bioinformatics and graph-based methods for predicting adverse effects of developmental drugs

    Get PDF
    Adverse drug effects are complex phenomena that involve the interplay between drug molecules and their protein targets at various levels of biological organisation, from molecular to organismal. Many factors are known to contribute toward the safety profile of a drug, including the chemical properties of the drug molecule itself, the biological properties of drug targets and other proteins that are involved in pharmacodynamics and pharmacokinetics aspects of drug action, and the characteristics of the intended patient population. A multitude of scattered publicly available resources exist that cover these important aspects of drug activity. These include manually curated biological databases, high-throughput experimental results from gene expression and human genetics resources as well as drug labels and registered clinical trial records. This thesis proposes an integrated analysis of these disparate sources of information to help bridge the gap between the molecular and the clinical aspects of drug action. For example, to address the commonly held assumption that narrowly expressed proteins make safer drug targets, an integrative data-driven analysis was conducted to systematically investigate the relationship between the tissue expression profile of drug targets and the organs affected by clinically observed adverse drug reactions. Similarly, human genetics data were used extensively throughout the thesis to compare adverse symptoms induced by drug molecules with the phenotypes associated with the genes encoding their target proteins. One of the main outcomes of this thesis was the generation of a large knowledge graph, which incorporates diverse molecular and phenotypic data in a structured network format. To leverage the integrated information, two graph-based machine learning methods were developed to predict a wide range of adverse drug effects caused by approved and developmental therapies

    Analyzing adverse drug reaction using statistical and machine learning methods: A systematic review

    Get PDF
    Background: Adverse drug reactions (ADRs) are unintended negative drug-induced responses. Determining the association between drugs and ADRs is crucial, and several methods have been proposed to demonstrate this association. This systematic review aimed to examine the analytical tools by considering original articles that utilized statistical and machine learning methods for detecting ADRs. Methods: A systematic literature review was conducted based on articles published between 2015 and 2020. The keywords used were statistical, machine learning, and deep learning methods for detecting ADR signals. The study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement (PRISMA) guidelines. Results: We reviewed 72 articles, of which 51 and 21 addressed statistical and machine learning methods, respectively. Electronic medical record (EMR) data were exclusively analyzed using the regression method. For FDA Adverse Event Reporting System (FAERS) data, components of the disproportionality method were preferable. DrugBank was the most used database for machine learning. Other methods accounted for the highest and supervised methods accounted for the second highest. Conclusions: Using the 72 main articles, this review provides guidelines on which databases are frequently utilized and which analysis methods can be connected. For statistical analysis, >90% of the cases were analyzed by disproportionate or regression analysis with each spontaneous reporting system (SRS) data or electronic medical record (EMR) data; for machine learning research, however, there was a strong tendency to analyze various data combinations. Only half of the DrugBank database was occupied, and the k-nearest neighbor method accounted for the greatest proportion.ope

    Computational and human-based methods for knowledge discovery over knowledge graphs

    Get PDF
    The modern world has evolved, accompanied by the huge exploitation of data and information. Daily, increasing volumes of data from various sources and formats are stored, resulting in a challenging strategy to manage and integrate them to discover new knowledge. The appropriate use of data in various sectors of society, such as education, healthcare, e-commerce, and industry, provides advantages for decision support in these areas. However, knowledge discovery becomes challenging since data may come from heterogeneous sources with important information hidden. Thus, new approaches that adapt to the new challenges of knowledge discovery in such heterogeneous data environments are required. The semantic web and knowledge graphs (KGs) are becoming increasingly relevant on the road to knowledge discovery. This thesis tackles the problem of knowledge discovery over KGs built from heterogeneous data sources. We provide a neuro-symbolic artificial intelligence system that integrates symbolic and sub-symbolic frameworks to exploit the semantics encoded in a KG and its structure. The symbolic system relies on existing approaches of deductive databases to make explicit, implicit knowledge encoded in a KG. The proposed deductive database DSDS can derive new statements to ego networks given an abstract target prediction. Thus, DSDS minimizes data sparsity in KGs. In addition, a sub-symbolic system relies on knowledge graph embedding (KGE) models. KGE models are commonly applied in the KG completion task to represent entities in a KG in a low-dimensional vector space. However, KGE models are known to suffer from data sparsity, and a symbolic system assists in overcoming this fact. The proposed approach discovers knowledge given a target prediction in a KG and extracts unknown implicit information related to the target prediction. As a proof of concept, we have implemented the neuro-symbolic system on top of a KG for lung cancer to predict polypharmacy treatment effectiveness. The symbolic system implements a deductive system to deduce pharmacokinetic drug-drug interactions encoded in a set of rules through the Datalog program. Additionally, the sub-symbolic system predicts treatment effectiveness using a KGE model, which preserves the KG structure. An ablation study on the components of our approach is conducted, considering state-of-the-art KGE methods. The observed results provide evidence for the benefits of the neuro-symbolic integration of our approach, where the neuro-symbolic system for an abstract target prediction exhibits improved results. The enhancement of the results occurs because the symbolic system increases the prediction capacity of the sub-symbolic system. Moreover, the proposed neuro-symbolic artificial intelligence system in Industry 4.0 (I4.0) is evaluated, demonstrating its effectiveness in determining relatedness among standards and analyzing their properties to detect unknown relations in the I4.0KG. The results achieved allow us to conclude that the proposed neuro-symbolic approach for an abstract target prediction improves the prediction capability of KGE models by minimizing data sparsity in KGs

    Knowledge-augmented Graph Machine Learning for Drug Discovery: A Survey from Precision to Interpretability

    Full text link
    The integration of Artificial Intelligence (AI) into the field of drug discovery has been a growing area of interdisciplinary scientific research. However, conventional AI models are heavily limited in handling complex biomedical structures (such as 2D or 3D protein and molecule structures) and providing interpretations for outputs, which hinders their practical application. As of late, Graph Machine Learning (GML) has gained considerable attention for its exceptional ability to model graph-structured biomedical data and investigate their properties and functional relationships. Despite extensive efforts, GML methods still suffer from several deficiencies, such as the limited ability to handle supervision sparsity and provide interpretability in learning and inference processes, and their ineffectiveness in utilising relevant domain knowledge. In response, recent studies have proposed integrating external biomedical knowledge into the GML pipeline to realise more precise and interpretable drug discovery with limited training instances. However, a systematic definition for this burgeoning research direction is yet to be established. This survey presents a comprehensive overview of long-standing drug discovery principles, provides the foundational concepts and cutting-edge techniques for graph-structured data and knowledge databases, and formally summarises Knowledge-augmented Graph Machine Learning (KaGML) for drug discovery. A thorough review of related KaGML works, collected following a carefully designed search methodology, are organised into four categories following a novel-defined taxonomy. To facilitate research in this promptly emerging field, we also share collected practical resources that are valuable for intelligent drug discovery and provide an in-depth discussion of the potential avenues for future advancements
    • 

    corecore