11 research outputs found
Large-scale computational drug repositioning to find treatments for rare diseases
© 2018, The Author(s). Rare, or orphan, diseases are conditions afflicting a small subset of people in a population. Although these disorders collectively pose significant health care problems, drug companies require government incentives to develop drugs for rare diseases due to extremely limited individual markets. Computer-aided drug repositioning, i.e., finding new indications for existing drugs, is a cheaper and faster alternative to traditional drug discovery, offering a promising avenue for orphan drug research. Structure-based matching of drug-binding pockets is among the most promising computational techniques to inform drug repositioning. In order to find new targets for known drugs ultimately leading to drug repositioning, we recently developed eMatchSite, a new computer program to compare drug-binding sites. In this study, eMatchSite is combined with virtual screening to systematically explore opportunities to reposition known drugs to proteins associated with rare diseases. The effectiveness of this integrated approach is demonstrated for a kinase inhibitor, which is a confirmed candidate for repositioning to synapsin Ia. The resulting dataset comprises 31,142 putative drug-target complexes linked to 980 orphan diseases. The modeling accuracy is evaluated against the structural data recently released for tyrosine-protein kinase HCK. To illustrate how potential therapeutics for rare diseases can be identified, we discuss the possibility of repurposing a steroidal aromatase inhibitor to treat Niemann-Pick disease type C. Overall, the exhaustive exploration of the drug repositioning space exposes new opportunities to combat orphan diseases with existing drugs. DrugBank/Orphanet repositioning data are freely available to the research community at https://osf.io/qdjup/
BionoiNet: Ligand-binding site classification with off-the-shelf deep neural network
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. Motivation: Fast and accurate classification of ligand-binding sites in proteins with respect to the class of binding molecules is invaluable not only to the automatic functional annotation of large datasets of protein structures but also to projects in protein evolution, protein engineering and drug development. Deep learning techniques, which have already been successfully applied to address challenging problems across various fields, are inherently suitable to classify ligand-binding pockets. Our goal is to demonstrate that off-the-shelf deep learning models can be employed with minimum development effort to recognize nucleotide- and heme-binding sites with an accuracy comparable to that of highly specialized, voxel-based methods. Results: We developed BionoiNet, a new deep learning-based framework implementing a popular ResNet model for image classification. BionoiNet first transforms the molecular structures of ligand-binding sites to 2D Voronoi diagrams, which are then used as the input to a pretrained convolutional neural network classifier. The ResNet model generalizes well to unseen data, achieving an accuracy of 85.6% for nucleotide- and 91.3% for heme-binding pockets. BionoiNet also computes significance scores of pocket atoms, called BionoiScores, to provide meaningful insights into their interactions with ligand molecules. BionoiNet is a lightweight alternative to computationally expensive 3D architectures
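The idea of turning a binding site into a 2D Voronoi image can be illustrated with a minimal sketch. It assumes the pocket atoms have already been projected onto a plane; the coordinates, the function name `voronoi_raster`, and the 8x8 image size are hypothetical and are not taken from BionoiNet, whose actual projection and coloring scheme is not described in the abstract. Each pixel is simply labeled with the index of its nearest atom, which yields a discrete Voronoi diagram.

```python
from math import hypot


def voronoi_raster(points, width, height):
    """Label every pixel with the index of its nearest seed point.

    The result is a discrete (rasterized) Voronoi diagram: all pixels
    sharing a label form the Voronoi cell of that seed.
    """
    grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Measure distances from the pixel centre.
            cx, cy = x + 0.5, y + 0.5
            nearest = min(
                range(len(points)),
                key=lambda i: hypot(points[i][0] - cx, points[i][1] - cy),
            )
            row.append(nearest)
        grid.append(row)
    return grid


# Hypothetical 2D projections of three pocket atoms (illustrative values only).
atoms_2d = [(1.0, 1.0), (6.0, 2.0), (3.0, 6.0)]
image = voronoi_raster(atoms_2d, width=8, height=8)

for row in image:
    print("".join(str(label) for label in row))
```

In a pipeline of the kind the abstract describes, each cell would presumably be colored by some atomic property before the image is passed to a pretrained image classifier; that step is omitted here.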
Unlocking the Potential of Kinase Targets in Cancer: Insights from CancerOmicsNet, an AI-Driven Approach to Drug Response Prediction in Cancer
Deregulated protein kinases are crucial in promoting cancer cell proliferation and driving malignant cell signaling. Although these kinases are essential targets for cancer therapy due to their involvement in cell development and proliferation, only a small part of the human kinome has been targeted by drugs. A comprehensive scoring system is needed to evaluate and prioritize clinically relevant kinases. We recently developed CancerOmicsNet, an artificial intelligence model employing graph-based algorithms to predict the cancer cell response to treatment with kinase inhibitors. The performance of this approach has been evaluated in large-scale benchmarking calculations, followed by the experimental validation of selected predictions against several cancer types. To shed light on the decision-making process of CancerOmicsNet and to better understand the role of each kinase in the model, we employed a customized saliency map with adjustable channel weights. The saliency map, functioning as an explainable AI tool, allows for the analysis of input contributions to the output of a trained deep-learning model and facilitates the identification of essential kinases involved in tumor progression. The comprehensive survey of biomedical literature for essential kinases selected by CancerOmicsNet demonstrated that it could help pinpoint potential druggable targets for further investigation in diverse cancer types
Graph-Based Computational Approach to The Study Of Cancer Pharmacotherapy
Recent advances in the analysis of omics data from various cancer cells pave the way for cancer therapy to predict effective drug responses and drug-target interactions based on cell-specific genetic features such as gene expression profiles, mutations, and copy number variations. Because the phenotype of a complex disease, such as cancer, is determined by several factors, it is difficult to predict the most effective treatment for a certain cell line based on just one sort of data, such as genetic traits or drug molecular features. The studies in this thesis propose a graph-based representation of heterogeneous biological data and apply this representation in graph-based deep learning approaches that use network data alongside other types of data to predict effective drug response while accounting for the complexities of cancer progression.
The thesis is divided into four sections. In the first section, we addressed the question of how to integrate multiple heterogeneous data to represent the system-level complexity as a graph in the study of anti-cancer pharmacotherapy. In the second section, we suggested the implementation of a graph-based artificial intelligence system to predict the response of diverse cancer cell lines to drugs, mostly kinase inhibitors. In the third section, we compared our proposed method with gene signature analysis and validated our results against biomedical literature and live-cell time course inhibition assays. In the concluding part, we suggested an artificial intelligence technique for predicting drug-target interactions based on graph-based data
CancerOmicsNet: a multi-omics network-based approach to anti-cancer drug profiling
Development of novel anti-cancer treatments requires not only a comprehensive knowledge of cancer processes and drug mechanisms of action, but also the ability to accurately predict the response of various cancer cell lines to therapeutics. Numerous computational methods have been developed to address this issue, including algorithms employing supervised machine learning. Nonetheless, high prediction accuracies reported for many of these techniques may result from a significant overlap among training, validation, and testing sets, making existing predictors inapplicable to new data. To address these issues, we developed CancerOmicsNet, a graph neural network with sophisticated attention propagation mechanisms to predict the therapeutic effects of kinase inhibitors across various tumors. Emphasizing the system-level complexity of cancer, CancerOmicsNet integrates multiple heterogeneous data, such as biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. The performance of CancerOmicsNet, properly cross-validated at the tissue level, is 0.83 in terms of the area under the receiver operating characteristic curve, which is notably higher than those measured for other approaches. CancerOmicsNet generalizes well to unseen data, i.e., it can predict therapeutic effects across a variety of cancer cell lines and inhibitors. CancerOmicsNet is freely available to the academic community at https://github.com/pulimeng/CancerOmicsNet
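Cross-validation at the tissue level, as mentioned above, keeps all cell lines from one tissue out of training so that the train and test sets do not overlap. A minimal sketch follows, assuming a leave-one-tissue-out scheme; the abstract does not specify the exact protocol, and the record fields (`cell_line`, `tissue`), the sample values, and the function name are hypothetical.

```python
from collections import defaultdict


def tissue_level_folds(samples):
    """Yield (held_out_tissue, train, test) tuples.

    Every cell line from the held-out tissue goes to the test set, so no
    tissue ever contributes to both training and testing in the same fold.
    """
    by_tissue = defaultdict(list)
    for sample in samples:
        by_tissue[sample["tissue"]].append(sample)

    for tissue in sorted(by_tissue):
        test = by_tissue[tissue]
        train = [s for s in samples if s["tissue"] != tissue]
        yield tissue, train, test


# Hypothetical cell-line records (illustrative values only).
samples = [
    {"cell_line": "A", "tissue": "breast"},
    {"cell_line": "B", "tissue": "breast"},
    {"cell_line": "C", "tissue": "lung"},
    {"cell_line": "D", "tissue": "skin"},
]

folds = list(tissue_level_folds(samples))
for tissue, train, test in folds:
    print(tissue, len(train), len(test))
```

Splitting by group rather than by individual sample is what prevents closely related cell lines from appearing on both sides of the split.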
Pocket2Drug: An Encoder-Decoder Deep Neural Network for the Target-Based Drug Design
Computational modeling is an essential component of modern drug discovery. One of its most important applications is to select promising drug candidates for pharmacologically relevant target proteins. Because of continuing advances in structural biology, putative binding sites for small organic molecules are being discovered in numerous proteins linked to various diseases. These valuable data offer new opportunities to build efficient computational models predicting binding molecules for target sites through the application of data mining and machine learning. In particular, deep neural networks are powerful techniques capable of learning from complex data in order to make informed drug binding predictions. In this communication, we describe Pocket2Drug, a deep graph neural network model to predict binding molecules for a given ligand binding site. This approach first learns the conditional probability distribution of small molecules from a large dataset of pocket structures with supervised training, followed by the sampling of drug candidates from the trained model. Comprehensive benchmarking simulations show that using Pocket2Drug significantly improves the chances of finding molecules binding to target pockets compared to traditional drug selection procedures. Specifically, known binders are generated for as many as 80.5% of targets present in the testing set, which consists of data dissimilar to those used to train the deep graph neural network model. Overall, Pocket2Drug is a promising computational approach to inform the discovery of novel biopharmaceuticals
GraphSite: Ligand Binding Site Classification with Deep Graph Learning
The binding of small organic molecules to protein targets is fundamental to a wide array of cellular functions. It is also routinely exploited to develop new therapeutic strategies against a variety of diseases. On that account, the ability to effectively detect and classify ligand binding sites in proteins is of paramount importance to modern structure-based drug discovery. These complex and non-trivial tasks require sophisticated algorithms from the field of artificial intelligence to achieve a high prediction accuracy. In this communication, we describe GraphSite, a deep learning-based method utilizing a graph representation of local protein structures and a state-of-the-art graph neural network to classify ligand binding sites. Using neural weighted message passing layers to effectively capture the structural, physicochemical, and evolutionary characteristics of binding pockets mitigates model overfitting and improves the classification accuracy. Indeed, comprehensive cross-validation benchmarks against a large dataset of binding pockets belonging to 14 diverse functional classes demonstrate that GraphSite yields a class-weighted F1-score of 81.7%, outperforming other approaches such as molecular docking and binding site matching. Further, it also generalizes well to unseen data with an F1-score of 70.7%, which is the expected performance in real-world applications. We also discuss new directions to improve and extend GraphSite in the future
An integrated network representation of multiple cancer-specific data for graph-based machine learning
Genomic profiles of cancer cells provide valuable information on genetic alterations in cancer. Several recent studies employed these data to predict the response of cancer cell lines to drug treatment. Nonetheless, due to the multifactorial phenotypes and intricate mechanisms of cancer, the accurate prediction of the effect of pharmacotherapy on a specific cell line based on the genetic information alone is problematic. Emphasizing the system-level complexity of cancer, we devised a procedure to integrate multiple heterogeneous data, including biological networks, genomics, inhibitor profiling, and gene-disease associations, into a unified graph structure. In order to construct compact, yet information-rich cancer-specific networks, we developed a novel graph reduction algorithm. Driven by not only the topological information, but also the biological knowledge, the graph reduction increases the feature-only entropy while preserving the valuable graph-feature information. Subsequent comparative benchmarking simulations employing a tissue-level cross-validation protocol demonstrate that the accuracy of a graph-based predictor of the drug efficacy is 0.68, which is notably higher than those measured for more traditional, matrix-based techniques on the same data. Overall, the non-Euclidean representation of the cancer-specific data improves the performance of machine learning to predict the response of cancer to pharmacotherapy. The generated data are freely available to the academic community at https://osf.io/dzx7b/
GraphDTI: A robust deep learning predictor of drug-target interactions from multiple heterogeneous data
Traditional techniques to identify macromolecular targets for drugs utilize solely the information on a query drug and a putative target. Nonetheless, the mechanisms of action of many drugs depend not only on their binding affinity toward a single protein, but also on the signal transduction through cascades of molecular interactions leading to certain phenotypes. Although using protein-protein interaction networks and drug-perturbed gene expression profiles can facilitate system-level investigations of drug-target interactions, utilizing such large and heterogeneous data poses notable challenges. To improve the state-of-the-art in drug target identification, we developed GraphDTI, a robust machine learning framework integrating the molecular-level information on drugs, proteins, and binding sites with the system-level information on gene expression and protein-protein interactions. In order to properly evaluate the performance of GraphDTI, we compiled a high-quality benchmarking dataset and devised a new cluster-based cross-validation protocol. Encouragingly, GraphDTI not only yields an AUC of 0.996 against the validation dataset, but it also generalizes well to unseen data with an AUC of 0.939, significantly outperforming other predictors. Finally, selected examples of identified drug-target interactions are validated against the biomedical literature. Numerous applications of GraphDTI include the investigation of drug polypharmacological effects, side effects through off-target binding, and repositioning opportunities
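The AUC values quoted above are areas under the ROC curve. As a minimal sketch of how such a number is obtained, the function below uses the rank-based definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented for illustration and are unrelated to the GraphDTI benchmark.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical interaction labels (1 = true drug-target pair) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(round(roc_auc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of examples, so a sorting-based implementation would be preferable for large benchmarks.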