
    Application of machine learning, molecular modelling and structural data mining against antiretroviral drug resistance in HIV-1

    Millions are affected by the Human Immunodeficiency Virus (HIV) worldwide, even though the death toll is on the decline. Antiretrovirals (ARVs), more specifically protease inhibitors, have shown tremendous success since their introduction into therapy in the mid-1990s by slowing down progression to the Acquired Immune Deficiency Syndrome (AIDS). However, Drug Resistance Mutations (DRMs) are constantly selected for due to viral adaptation, making drugs less effective over time. The current challenge is to manage the infection optimally with a limited set of drugs, with differing levels of associated toxicity, in the face of a virus that (1) exists as a quasispecies, (2) may transmit acquired DRMs to drug-naive individuals and (3) can manifest class-wide resistance due to similarities in drug design. The presence of latent reservoirs, unawareness of infection status, education and various socio-economic factors make the problem even more complex. Adequate timing and choice of drug prescription, together with treatment adherence, are very important, as drug toxicities, drug failure and sub-optimal treatment regimens leave room for further development of drug resistance. While CD4 cell count and the determination of viral load from patients in resource-limited settings are very helpful to track how well a patient’s immune system is able to keep the virus in check, they can be slow to show whether an ARV is effective. Phenosense assay kits answer this problem by using viruses engineered to contain the patient sequences and evaluating their growth in the presence of different ARVs, but this can be expensive and too involved for routine checks. As a cheaper and faster alternative, genotypic assays provide similar information from HIV pol sequences obtained from blood samples, inferring ARV efficacy on the basis of drug resistance mutation patterns. 
However, these are inherently complex, and the various methods of in silico prediction, such as Geno2pheno, REGA and Stanford HIVdb, do not always agree, even though this gap decreases as the list of resistance mutations is updated. A major gap in HIV treatment is that the information used for predicting drug resistance is mainly computed from data containing an overwhelming majority of B subtype HIV, when these only comprise about 12% of the worldwide HIV infections. In addition to growing evidence that drug resistance is subtype-related, it is intuitive to hypothesize that, as subtyping is a phylogenetic classification, the more divergent a subtype is from the strains used in training prediction models, the less their resistance profiles would correlate. For the aforementioned reasons, we used a multi-faceted approach to attack the virus in multiple ways. This research aimed to (1) improve resistance prediction methods by focusing solely on the available subtype, and (2) mine structural information pertaining to resistance in order to find any exploitable weak points and increase knowledge of the mechanistic processes of drug resistance in HIV protease. Finally, (3) we screened for protease inhibitors amongst a database of natural compounds [the South African natural compound database (SANCDB)] to find molecules or molecular properties usable to come up with improved inhibition against the drug target. In this work, structural information was mined using the Anisotropic Network Model, Dynamics Cross-Correlation, Perturbation Response Scanning, residue contact network analysis and the radius of gyration. These methods failed to give any resistance-associated patterns in terms of natural movement, internal correlated motions, residue perturbation response, relational behaviour and global compaction respectively. 
Applications of drug docking, homology modelling and energy minimization for generating features suitable for machine learning were not very promising, and rather suggest that the binding energies from Vina may not, by themselves, be very reliable quantitatively. These failures led to a refinement that resulted in a highly sensitive, statistically guided network construction and analysis, which in turn led to key findings on the early dynamics associated with resistance across all PI drugs. The latter experiment revealed a conserved lateral expansion motion occurring at the flap elbows, and an associated contraction that drives the base of the dimerization domain towards the catalytic site’s floor in the case of drug resistance. Interestingly, we found that despite the conserved movement, bond angles were degenerate. In parallel, 16 Artificial Neural Network models were optimised for HIV protease and reverse transcriptase inhibitors, with performances on par with Stanford HIVdb. Finally, we prioritised 9 compounds with potential protease inhibitory activity using virtual screening and molecular dynamics (MD), and additionally suggest a promising modification to one of the compounds. This yielded another molecule inhibiting both open and closed receptor target conformations equally well, whereby each of the compounds had been selected against an array of multi-drug-resistant receptor variants. While a main hurdle was a lack of non-B subtype data, our findings, especially from the statistically guided network analysis, may extrapolate to them to a certain extent, as the level of conservation was very high within subtype B despite all the present variations. This network construction method lays down a sensitive approach for analysing a pair of alternate phenotypes for which complex patterns prevail, given a sufficient number of experimental units. 
During the course of the research, a weighted contact mapping tool was developed to compare renin-angiotensinogen variants and packaged as part of the MD-TASK tool suite. Finally, the functionality, compatibility and performance of the MODE-TASK tool were evaluated and confirmed for both Python 2.7.x and Python 3.x, for the analysis of normal modes from single protein structures and essential modes from MD trajectories. These techniques and tools collectively add to the conventional means of MD analysis.
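
Of the structural measures named in this abstract, the radius of gyration (used there as a measure of global compaction) is the simplest to illustrate. The following is a minimal sketch only, not the thesis's implementation: the function name `radius_of_gyration` and the toy coordinates are assumptions made for illustration, and a real analysis would read atom positions from an MD trajectory.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: list of (x, y, z) tuples. masses: optional list of weights;
    equal masses are assumed when it is omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)

    # Centre of mass, one coordinate axis at a time.
    centre = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]

    # Mass-weighted mean squared distance from the centre of mass.
    mean_sq = sum(
        m * sum((c[axis] - centre[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    ) / total_mass

    return math.sqrt(mean_sq)


# Hypothetical toy input: two equal-mass points 2 units apart.
toy_coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
rg = radius_of_gyration(toy_coords)
```

A smaller value indicates a more compact structure; comparing the value across frames or variants is what makes it useful as a compaction measure.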

    iAVPs-ResBi: Identifying antiviral peptides by using deep residual network and bidirectional gated recurrent unit

    Human history is also the history of the fight against viral diseases. From the eradication of viruses to coexistence, advances in biomedicine have led to a more objective understanding of viruses and a corresponding increase in the tools and methods to combat them. More recently, antiviral peptides (AVPs) have been discovered, which due to their superior advantages, have achieved great impact as antiviral drugs. Therefore, it is very necessary to develop a prediction model to accurately identify AVPs. In this paper, we develop the iAVPs-ResBi model using k-spaced amino acid pairs (KSAAP), encoding based on grouped weight (EBGW), enhanced grouped amino acid composition (EGAAC) based on the N5C5 sequence, composition, transition and distribution (CTD) based on physicochemical properties for multi-feature extraction. Then we adopt bidirectional long short-term memory (BiLSTM) to fuse features for obtaining the most differentiated information from multiple original feature sets. Finally, the deep model is built by combining improved residual network and bidirectional gated recurrent unit (BiGRU) to perform classification. The results obtained are better than those of the existing methods, and the accuracies are 95.07, 98.07, 94.29 and 97.50% on the four datasets, which show that iAVPs-ResBi can be used as an effective tool for the identification of antiviral peptides. The datasets and codes are freely available at https://github.com/yunyunliang88/iAVPs-ResBi
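
To make the k-spaced amino acid pair (KSAAP) feature mentioned in this abstract concrete, here is a minimal sketch of one common way to compute it. This is not the authors' code: the function name `ksaap`, the normalisation by the number of valid pairs, and the handling of non-standard residues (skipped) are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ksaap(sequence, k):
    """Frequencies of residue pairs separated by exactly k positions.

    Returns a 400-element list, one entry per ordered pair of standard
    amino acids, in the order produced by itertools.product.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0

    # Pair residue i with residue i + k + 1 (k residues lie between them).
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1

    if total == 0:
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


# Hypothetical toy peptide, gap of one residue.
features = ksaap("ACAC", 1)
```

In practice the vectors for several values of k are usually concatenated into one feature vector per peptide before being passed to a model.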

    Accelerating drug target inhibitor discovery with a deep generative foundation model

    Inhibitor discovery for emerging drug-target proteins is challenging, especially when target structure or active molecules are unknown. Here, we experimentally validate the broad utility of a deep generative framework trained at scale on protein sequences, small molecules, and their mutual interactions, unbiased toward any specific target. We performed protein sequence-conditioned sampling on the generative foundation model to design small-molecule inhibitors for two dissimilar targets: the spike protein receptor-binding domain (RBD) and the main protease from SARS-CoV-2. Despite using only the target sequence information during model inference, micromolar-level inhibition was observed in vitro for two candidates out of four synthesized for each target. The most potent spike RBD inhibitor exhibited activity against several variants in live virus neutralization assays. These results establish that a single, broadly deployable generative foundation model for accelerated inhibitor discovery is effective and efficient, even in the absence of target structure or binder information.

    A Deep Learning Approaches for Modeling and Predicting of HIV Test Results Using EDHS Dataset

    At present, HIV/AIDS is consistently listed among the top causes of death. However, HIV is largely preventable, and infections can be avoided through strategies that improve early prediction. There is therefore a need for a predictive tool that can help domain experts with early prediction of the disease and hence recommend strategies to halt its progression. Using deep learning models, we investigated whether demographic and health survey datasets might be used to predict HIV test status. The contribution of this work is to improve the accuracy of a model for predicting an individual’s HIV test status. We employed deep learning models to predict HIV status using Ethiopian Demographic and Health Survey (EDHS) datasets. Furthermore, we found that predictive models based on these datasets may be used to forecast individuals’ HIV test status, which might help domain experts prioritize strategies and policies to contain the pandemic. The outcome of the study confirms that a DL model provides the best results with the most promising extracted features. The accuracy of all the DL models could be further enhanced by including a larger dataset for predicting the prognosis of the disease.

    Calibrated Simplex Mapping Classification

    We propose a novel supervised multi-class/single-label classifier that maps training data onto a linearly separable latent space with a simplex-like geometry. This approach allows us to transform the classification problem into a well-defined regression problem. For its solution we can choose suitable distance metrics in feature space and regression models predicting latent space coordinates. A benchmark on various artificial and real-world data sets is used to demonstrate the calibration qualities and prediction performance of our classifier. Comment: 24 pages, 8 figures, 7 tables
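
The general idea of turning classification into regression onto simplex vertices can be sketched as follows. This is only an illustration of that idea under stated assumptions, not the authors' construction: the abstract does not specify how the latent space is built or calibrated, so the use of standard-simplex (one-hot) vertices, the Euclidean distance, and the names `simplex_vertices` and `nearest_vertex` are all assumptions.

```python
import math


def simplex_vertices(n_classes):
    """Vertices of the standard simplex: one unit vector per class.

    Every pair of vertices is the same distance apart (sqrt(2)), so no
    class is geometrically favoured over another.
    """
    return [
        [1.0 if i == j else 0.0 for j in range(n_classes)]
        for i in range(n_classes)
    ]


def nearest_vertex(point, vertices):
    """Index of the class whose vertex is closest to a latent-space point."""
    distances = [math.dist(point, v) for v in vertices]
    return min(range(len(vertices)), key=distances.__getitem__)


# Hypothetical usage: a regression model (not shown) has predicted these
# latent coordinates for one sample in a 3-class problem.
vertices = simplex_vertices(3)
predicted_point = [0.1, 0.7, 0.2]
predicted_class = nearest_vertex(predicted_point, vertices)
```

Any regression model can supply the predicted coordinates; the decision rule then reduces to a nearest-vertex lookup in the latent space.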

    Prediction of pharmacological activities from chemical structures with graph convolutional neural networks

    Development of a technology for predicting the pharmacological effects of compounds, using pharmacological big data. Kyoto University press release. 2021-01-13. Many therapeutic drugs are compounds that can be represented by simple chemical structures, which contain important determinants of affinity at the site of action. Recently, graph convolutional neural network (GCN) models have exhibited excellent results in classifying the activity of such compounds. For models that make quantitative predictions of activity, more complex information has been utilized, such as the three-dimensional structures of compounds and the amino acid sequences of their respective target proteins. As another approach, we hypothesized that if sufficient experimental data were available and there were enough nodes in hidden layers, a simple compound representation would quantitatively predict activity with satisfactory accuracy. In this study, we report that GCN models constructed solely from the two-dimensional structural information of compounds demonstrated a high degree of activity predictability against 127 diverse targets from the ChEMBL database. Using the information entropy as a metric, we also show that the structural diversity had less effect on the prediction performance. Finally, we report that virtual screening using the constructed model identified a new serotonin transporter inhibitor with activity comparable to that of a marketed drug in vitro and exhibited antidepressant effects in behavioural studies.

    Advances in Antimicrobial Peptide Discovery via Machine Learning and Delivery via Nanotechnology

    Antimicrobial peptides (AMPs) have been investigated for their potential use as an alternative to antibiotics due to the increased demand for new antimicrobial agents. AMPs, widely found in nature and obtained from microorganisms, have a broad range of antimicrobial protection, allowing them to be applied in the treatment of infections caused by various pathogenic microorganisms. Since these peptides are primarily cationic, they prefer anionic bacterial membranes due to electrostatic interactions. However, the applications of AMPs are currently limited owing to their hemolytic activity, poor bioavailability, degradation from proteolytic enzymes, and high-cost production. To overcome these limitations, nanotechnology has been used to improve AMP bioavailability, permeation across barriers, and/or protection against degradation. In addition, machine learning has been investigated due to its time-saving and cost-effective algorithms to predict AMPs. There are numerous databases available to train machine learning models. In this review, we focus on nanotechnology approaches for AMP delivery and advances in AMP design via machine learning. The AMP sources, classification, structures, antimicrobial mechanisms, their role in diseases, peptide engineering technologies, currently available databases, and machine learning techniques used to predict AMPs with minimal toxicity are discussed in detail

    Molecular Targets of CNS Tumors

    Molecular Targets of CNS Tumors is a selected review of Central Nervous System (CNS) tumors, with particular emphasis on the signaling pathways of the most common CNS tumor types. Developing drugs that specifically attack cancer cells requires an understanding of the distinct characteristics of those cells. Additional detailed information is provided on selected signaling pathways in CNS tumors.

    Latent Representation and Sampling in Network: Application in Text Mining and Biology.

    In classical machine learning, hand-designed features are used for learning a mapping from raw data. However, human involvement in feature design makes the process expensive. Representation learning aims to learn abstract features directly from data without direct human involvement. Raw data can be of various forms. Network is one form of data that encodes relational structure in many real-world domains. Therefore, learning abstract features for network units is an important task. In this dissertation, we propose models for incorporating temporal information given as a collection of networks from subsequent time-stamps. The primary objective of our models is to learn a better abstract feature representation of nodes and edges in an evolving network. We show that the temporal information in the abstract feature improves the performance of link prediction task substantially. Besides applying to the network data, we also employ our models to incorporate extra-sentential information in the text domain for learning better representation of sentences. We build a context network of sentences to capture extra-sentential information. This information in abstract feature representation of sentences improves various text-mining tasks substantially over a set of baseline methods. A problem with the abstract features that we learn is that they lack interpretability. In real-life applications on network data, for some tasks, it is crucial to learn interpretable features in the form of graphical structures. For this we need to mine important graphical structures along with their frequency statistics from the input dataset. However, exact algorithms for these tasks are computationally expensive, so scalable algorithms are of urgent need. To overcome this challenge, we provide efficient sampling algorithms for mining higher-order structures from network(s). We show that our sampling-based algorithms are scalable. 
They are also superior to a set of baseline algorithms in terms of retrieving important graphical sub-structures, and collecting their frequency statistics. Finally, we show that we can use these frequent subgraph statistics and structures as features in various real-life applications. We show one application in biology and another in security. In both cases, we show that the structures and their statistics significantly improve the performance of knowledge discovery tasks in these domains

    In Silico Design and Selection of CD44 Antagonists: implementation of computational methodologies in drug discovery and design

    Drug discovery (DD) is a process that aims to identify drug candidates through a thorough evaluation of the biological activity of small molecules or biomolecules. Computational strategies (CS) are now necessary tools for speeding up DD. Chapter 1 describes the use of CS throughout the DD process, from the early stages of drug design to the use of artificial intelligence for the de novo design of therapeutic molecules. Chapter 2 describes an in-silico workflow for identifying potential high-affinity CD44 antagonists, ranging from structural analysis of the target to the analysis of ligand-protein interactions and molecular dynamics (MD). In Chapter 3, we tested the shape-guided algorithm on a dataset of macrocycles, identifying the characteristics that need to be improved for the development of new tools for macrocycle sampling and design. In Chapter 4, we describe a detailed reverse docking protocol for identifying potential 4-hydroxycoumarin (4-HC) targets. The strategy described in this chapter is easily transferable to other compounds and protein datasets for overcoming bottlenecks in molecular docking protocols, particularly reverse docking approaches. Finally, Chapter 5 shows how computational methods and experimental results can be used to repurpose compounds as potential COVID-19 treatments. According to our findings, the HCV drug boceprevir could be clinically tested or used as a lead molecule to develop compounds that target COVID-19 or other coronaviral infections. These chapters, in summary, demonstrate the importance, application, limitations, and future of computational methods in the state-of-the-art drug design process