Neurosymbolic AI for Reasoning on Graph Structures: A Survey
Neurosymbolic AI is an increasingly active area of research which aims to
combine symbolic reasoning methods with deep learning to generate models with
both high predictive performance and some degree of human-level
comprehensibility. As knowledge graphs are becoming a popular way to represent
heterogeneous and multi-relational data, methods for reasoning on graph
structures have attempted to follow this neurosymbolic paradigm. Traditionally,
such approaches have utilized either rule-based inference or generated
representative numerical embeddings from which patterns could be extracted.
However, several recent studies have attempted to bridge this dichotomy in ways
that facilitate interpretability, maintain performance, and integrate expert
knowledge. Within this article, we survey a breadth of methods that perform
neurosymbolic reasoning tasks on graph structures. To better compare the
various methods, we propose a novel taxonomy by which we can classify them.
Specifically, we propose three major categories: (1) logically-informed
embedding approaches, (2) embedding approaches with logical constraints, and
(3) rule-learning approaches. Alongside the taxonomy, we provide a tabular
overview of the approaches and links to their source code, if available, for
more direct comparison. Finally, we discuss the applications on which these
methods were primarily used and propose several prospective directions toward
which this new field of research could evolve.
Comment: 21 pages, 8 figures, 1 table, currently under review. Corresponding
GitHub page here: https://github.com/NeSymGraph
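As a rough illustration of the rule-based inference that the survey abstract above contrasts with embedding methods, the following minimal Python sketch applies a single composition rule to a set of triples until nothing new can be derived. The function name forward_chain, the relation names, and the toy facts are all assumptions made for illustration; none of them come from the survey itself.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def forward_chain(triples: Set[Triple], body1: str, body2: str, head: str) -> Set[Triple]:
    """Apply the rule (x, body1, y) AND (y, body2, z) => (x, head, z)
    repeatedly until no new triples are derived."""
    known = set(triples)
    while True:
        derived = {
            (x, head, z)
            for (x, r1, y) in known if r1 == body1
            for (y2, r2, z) in known if r2 == body2 and y2 == y
        }
        new = derived - known
        if not new:
            return known
        known |= new


# Hypothetical toy knowledge graph, for illustration only.
toy_graph = {
    ("aspirin", "inhibits", "COX1"),
    ("COX1", "involved_in", "inflammation"),
}
closure = forward_chain(toy_graph, "inhibits", "involved_in", "affects")
```

Here closure contains the two original triples plus the derived triple ("aspirin", "affects", "inflammation"). An embedding-based method would instead score candidate triples numerically, which is the dichotomy the surveyed approaches try to bridge.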
Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction
Recent advances and achievements of artificial intelligence (AI) as well as
deep and graph learning models have established their usefulness in biomedical
applications, especially in drug-drug interactions (DDIs). A DDI is a
change in the effect of one drug due to the presence of another drug in the
human body, and DDIs play an essential role in drug discovery and clinical
research. Predicting DDIs through traditional clinical trials and experiments
is an expensive and time-consuming process. To correctly apply advanced AI and
deep learning, developers and users face various challenges such as the
availability and encoding of data resources and the design of computational
methods. This review summarizes chemical structure based, network based, NLP
based and hybrid methods, providing an updated and accessible guide for the
broad research and development community with different domain knowledge. We
introduce widely used molecular representations and describe the theoretical
frameworks of graph neural network models for representing molecular
structures. We present the advantages and disadvantages of deep and graph
learning methods by performing comparative experiments. We discuss the
potential technical challenges and highlight future directions of deep and
graph learning models for accelerating DDI prediction.
Comment: Accepted by Briefings in Bioinformatics
Network-driven strategies to integrate and exploit biomedical data
In the quest to understand complex biological systems, the scientific community has been delving into protein, chemical and disease biology, populating biomedical databases with a wealth of data and knowledge. The field of biomedicine has now entered a Big Data era, in which computationally driven research can benefit greatly from existing knowledge to better understand and characterize biological and chemical entities. And yet, the heterogeneity and complexity of biomedical data call for proper integration and representation of this knowledge, so that it can be exploited effectively and efficiently.
In this thesis, we aim to develop new strategies to leverage current biomedical knowledge, so that meaningful information can be extracted and fused into downstream applications. To this end, we have capitalized on network analysis algorithms to integrate and exploit biomedical data in a wide variety of scenarios, providing a better understanding of pharmacoomics experiments while helping accelerate the drug discovery process. More specifically, we have (i) devised an approach to identify functional gene sets associated with drug response mechanisms of action, (ii) created a resource of biomedical descriptors able to anticipate cellular drug response and identify new drug repurposing opportunities, (iii) designed a tool to annotate biomedical support for a given set of experimental observations, and (iv) reviewed different chemical and biological descriptors relevant for drug discovery, illustrating how they can be used to provide solutions to current challenges in biomedicine.
Applications of Artificial Intelligence in Battling Against Covid-19: A Literature Review
© 2020 Elsevier Ltd. All rights reserved.
Colloquially known as coronavirus, the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), which causes CoronaVirus Disease 2019 (COVID-19), has become a matter of grave concern for every country around the world. The rapid growth of the pandemic has wreaked havoc and prompted the need for immediate reactions to curb its effects. To manage the problems, many researchers in a variety of areas of science have started studying the issue. Artificial Intelligence (AI) is among the areas of science that have found great application in tackling the problem in many respects. Here, we present an overview of the applications of AI in a variety of fields including diagnosis of the disease via different types of tests and symptoms, monitoring patients, identifying the severity of a patient's condition, processing COVID-19 related imaging tests, epidemiology, pharmaceutical studies, etc. The aim of this paper is to perform a comprehensive survey of the applications of AI in battling the difficulties the outbreak has caused. Thus, we aim to cover every way in which AI approaches have been employed and all the research up to the writing of this paper. We try to organize the works so that the overall picture is comprehensible. Such a picture, although full of details, is very helpful in understanding where AI sits in the current pandemonium. We also conclude the paper with ideas on how the problems can be tackled in a better way and provide some suggestions for future work.
Peer reviewed
BioBridge: Bridging Biomedical Foundation Models via Knowledge Graphs
Foundation models (FMs) are able to leverage large volumes of unlabeled data
to demonstrate superior performance across a wide range of tasks. However, FMs
developed for biomedical domains have largely remained unimodal, i.e.,
independently trained and used for tasks on protein sequences alone, small
molecule structures alone, or clinical data alone. To overcome this limitation
of biomedical FMs, we present BioBridge, a novel parameter-efficient learning
framework, to bridge independently trained unimodal FMs to establish multimodal
behavior. BioBridge achieves this by utilizing Knowledge Graphs (KG) to learn
transformations between one unimodal FM and another without fine-tuning any
underlying unimodal FMs. Our empirical results demonstrate that BioBridge can
beat the best baseline KG embedding methods (on average by around 76.3%) in
cross-modal retrieval tasks. We also find that BioBridge demonstrates
out-of-domain generalization ability by extrapolating to unseen modalities or
relations. Additionally, we show that BioBridge presents itself as a
general purpose retriever that can aid biomedical multimodal question answering
as well as enhance the guided generation of novel drugs.
Mining Patents with Large Language Models Demonstrates Congruence of Functional Labels and Chemical Structures
Predicting chemical function from structure is a major goal of the chemical
sciences, from the discovery and repurposing of novel drugs to the creation of
new materials. Recently, new machine learning algorithms are opening up the
possibility of general predictive models spanning many different chemical
functions. Here, we consider the challenge of applying large language models to
chemical patents in order to consolidate and leverage the information about
chemical functionality captured by these resources. Chemical patents contain
vast knowledge on chemical function, but their usefulness as a dataset has
historically been neglected due to the impracticality of extracting
high-quality functional labels. Using a scalable ChatGPT-assisted patent
summarization and word-embedding label cleaning pipeline, we derive a Chemical
Function (CheF) dataset, containing 100K molecules and their patent-derived
functional labels. The functional labels were validated to be of high quality,
allowing us to detect a strong relationship between functional label and
chemical structural spaces. Further, we find that the co-occurrence graph of
the functional labels contains a robust semantic structure, which allowed us in
turn to examine functional relatedness among the compounds. We then trained a
model on the CheF dataset, allowing us to assign new functional labels to
compounds. Using this model, we were able to retrodict approved Hepatitis C
antivirals, uncover an antiviral mechanism undisclosed in the patent, and
identify plausible serotonin-related drugs. The CheF dataset and associated
model offer a promising new approach to predict chemical functionality.
Comment: Under review
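The co-occurrence graph of functional labels mentioned in the abstract above can be sketched in a few lines of Python. This is only a minimal illustration under assumed inputs: the function name label_cooccurrence and the toy label sets are hypothetical, and the actual CheF pipeline may build and weight its graph differently.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Tuple


def label_cooccurrence(label_sets: Iterable[Iterable[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often each unordered pair of labels is attached to the same molecule.

    Each key is an alphabetically ordered label pair (an edge of the graph);
    each value is the number of molecules carrying both labels (the edge weight).
    """
    counts: Counter = Counter()
    for labels in label_sets:
        # Deduplicate and sort so each pair is counted once per molecule
        # and always appears in the same order.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical per-molecule functional labels, for illustration only.
molecules = [
    ["antiviral", "protease inhibitor"],
    ["antiviral", "protease inhibitor", "hcv"],
    ["serotonin", "antidepressant"],
]
edges = label_cooccurrence(molecules)
```

In this toy input the pair ("antiviral", "protease inhibitor") receives weight 2, while the serotonin-related labels form a separate component. Structure of this kind is what would let related functions be grouped.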