8,071 research outputs found

TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

Author: Cang Zixuan
Wei Guo-Wei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 31/03/2017

Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts. Comment: 20 pages, 8 figures, 5 tables

arXiv.org e-Print Archive

Directory of Open Access Journals

Recommended from our members

A combined computational-experimental approach to define the structural origin of antibody recognition of sialyl-Tn, a tumor-associated carbohydrate antigen.

Author: Amon Ron
Chen Xi
Fleishman Sarel J
Glushka John N
Grant Oliver C
Leviatan Ben-Arye Shani
Makeneni Spandana
Marshanski Tal
Nivedha Anita K
Norn Christoffer
Padler-Karavani Vered
Woods Robert J
Yu Hai
Publication venue: eScholarship, University of California
Publication date: 01/07/2018

Anti-carbohydrate monoclonal antibodies (mAbs) hold great promise as cancer therapeutics and diagnostics. However, their specificity can be mixed, and detailed characterization is problematic, because antibody-glycan complexes are challenging to crystallize. Here, we developed a generalizable approach employing high-throughput techniques for characterizing the structure and specificity of such mAbs, and applied it to the mAb TKH2 developed against the tumor-associated carbohydrate antigen sialyl-Tn (STn). The mAb specificity was defined by apparent KD values determined by quantitative glycan microarray screening. Key residues in the antibody combining site were identified by site-directed mutagenesis, and the glycan-antigen contact surface was defined using saturation transfer difference NMR (STD-NMR). These features were then employed as metrics for selecting the optimal 3D-model of the antibody-glycan complex, out of thousands of plausible options generated by automated docking and molecular dynamics simulation. STn-specificity was further validated by computational screening of the selected antibody 3D-model against the human sialyl-Tn-glycome. This computational-experimental approach would allow rational design of potent antibodies targeting carbohydrates.

eScholarship - University of California

Detection of the TCDD binding-fingerprint within the Ah receptor ligand binding domain by structurally driven mutagenesis and functional analysis

Author: Bonati L
Denison MS
Pandini A
Song Y
Soshilov AA
Zhao J
Publication venue: 'American Chemical Society (ACS)'
Publication date: 30/06/2009

The aryl hydrocarbon receptor (AhR) is a ligand-dependent, basic helix-loop-helix Per-Arnt-Sim (PAS)-containing transcription factor that can bind and be activated by structurally diverse chemicals, including the toxic environmental contaminant 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). Our previous three-dimensional homology model of the mouse AhR (mAhR) PAS B ligand binding domain allowed identification of the binding site and its experimental validation. We have extended this analysis by conducting comparative structural modeling studies of the ligand binding domains of six additional high-affinity mammalian AhRs. These results, coupled with site-directed mutagenesis and AhR functional analysis, have allowed detection of the "TCDD binding-fingerprint" of conserved residues within the ligand binding cavity necessary for high-affinity TCDD binding and TCDD-dependent AhR transformation DNA binding. The essential role of selected residues was further evaluated using molecular docking simulations of TCDD with both wild-type and mutant mAhRs. Taken together, our results dramatically improve our understanding of the molecular determinants of TCDD binding and provide a basis for future studies directed toward rationalizing the observed species differences in AhR sensitivity to TCDD and understanding the mechanistic basis for the dramatic diversity in AhR ligand structure. © 2009 American Chemical Society

PubMed Central

King's Research Portal

Brunel University Research Archive

Serverification of Molecular Modeling Applications: the Rosetta Online Server that Includes Everyone (ROSIE)

Author: Bonneau Richard
Borgo Benjamin
Chou Fang-Chieh
Conchúir Shane Ó
Das Rhiju
Der Bryan S.
Drew Kevin
Gray Jeffrey J.
Havranek James J.
Kortemme Tanja
Kuhlman Brian
Kuroda Daisuke
Lyskov Sergey
Renfrew P. Douglas
Sripakdeevong Parin
Weitzner Brian D.
Xu Jianqing
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org

arXiv.org e-Print Archive

Directory of Open Access Journals

Digital Commons@Becker

PubMed Central

Carolina Digital Repository

Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

Author: Cang Zixuan
Mu Lin
Wei Guowei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 27/08/2017

This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination.

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

An Olfactory Receptor Pseudogene whose Function emerged in Humans

Author: Catherine Ronin
Chiquito J. Crasto
Gautam Bahl
Maryse Gremigni
Olivier Clot-Faybesse
Peter Lai
Valery Matarazzo
Publication venue
Publication date: 02/11/2007

Human olfactory receptor, hOR17-210, is identified as a pseudogene in the human genome. Experimental data have shown, however, that the gene product of cloned hOR17-210 cDNA was able to bind an odorant-binding protein and is narrowly tuned for excitation by cyclic ketones. Supported by experimental results, we used the bioinformatics methods of sequence analysis, computational protein modeling and docking to show that functionality in this receptor is retained due to sequence-structure features not previously observed in mammalian ORs. This receptor does not possess the first two transmembrane helical domains (of seven typically seen in GPCRs). It does, however, possess an additional TM that has not been observed in other human olfactory receptors. By incorporating these novel structural features, we created two putative models for this receptor. We also docked odor ligands that were experimentally shown to bind hOR17-210 model. We show how and why structural modifications of OR17-210 do not hinder this receptor's functionality. Our studies reveal that novel gene rearrangements that result in sequence and structural diversity have a bearing on OR and GPCR function and evolution.

Crossref

Nature Precedings

DEEP LEARNING METHODS FOR PREDICTION OF AND ESCAPE FROM PROTEIN RECOGNITION

Author: Dai Bowen
Publication venue: Dartmouth Digital Commons
Publication date: 02/03/2023

Protein interactions drive diverse processes essential to living organisms, and thus numerous biomedical applications center on understanding, predicting, and designing how proteins recognize their partners. While unfortunately the number of interactions of interest still vastly exceeds the capabilities of experimental determination methods, computational methods promise to fill the gap. My thesis pursues the development and application of computational methods for several protein interaction prediction and design tasks. First, to improve protein-glycan interaction specificity prediction, I developed GlyBERT, which learns biologically relevant glycan representations encapsulating the components most important for glycan recognition within their structures. GlyBERT encodes glycans with a branched biochemical language and employs an attention-based deep language model to embed the correlation between local and global structural contexts. This approach enables the development of predictive models from limited data, supporting applications such as lectin binding prediction. Second, to improve protein-protein interaction prediction, I developed a unified geometric deep neural network, ‘PInet’ (Protein Interface Network), which leverages the best properties of both data- and physics-driven methods, learning and utilizing models capturing both geometrical and physicochemical molecular surface complementarity. In addition to obtaining state-of-the-art performance in predicting protein-protein interactions, PInet can serve as the backbone for other protein-protein interaction modeling tasks such as binding affinity prediction. Finally, I turned from prediction to design, addressing two important tasks in the context of antibody-antigen recognition. The first problem is to redesign a given antigen to evade antibody recognition, e.g., to help biotherapeutics avoid pre-existing immunity or to focus vaccine responses on key portions of an antigen. The second problem is to design a panel of variants of a given antigen to use as “bait” in experimental identification of antibodies that recognize different parts of the antigen, e.g., to support classification of immune responses or to help select among different antibody candidates. I developed a geometry-based algorithm to generate variants to address these design problems, seeking to maximize utility subject to experimental constraints. During the design process, the algorithm accounts for and balances the effects of candidate mutations on antibody recognition and on antigen stability. In retrospective case studies, the algorithm demonstrated promising precision, recall, and robustness of finding good designs. This work represents the first algorithm to systematically design antigen variants for characterization and evasion of polyclonal antibody responses.

Dartmouth Digital Commons (Dartmouth College)

Protein-Ligand Scoring with Convolutional Neural Networks

Author: Hochuli Joshua
Idrobo Elisa
Koes David Ryan
Ragoza Matthew
Sunseri Jocelyn
Publication venue
Publication date: 08/12/2016

Computational approaches to drug discovery can reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables the use of deep machine learning techniques for protein-ligand scoring. We describe convolutional neural network (CNN) scoring functions that take as input a comprehensive 3D representation of a protein-ligand interaction. A CNN scoring function automatically learns the key features of protein-ligand interactions that correlate with binding. We train and optimize our CNN scoring functions to discriminate between correct and incorrect binding poses and known binders and non-binders. We find that our CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses both for pose prediction and virtual screening

arXiv.org e-Print Archive

FigShare

Structure-guided machine learning prediction of drug resistance mutations in Abelson 1 kinase.

Author: Ascher David B
Nguyen Thanh-Binh
Pat Megan
Pires Douglas E V
Portelli Stephanie
Rodrigues Carlos H M
Zhou Yunzhuo
Publication venue: Computational and structural biotechnology journal
Publication date: 01/01/2021

Funder: State Government of Victoria

Kinases play crucial roles in cellular signalling and biological processes with their dysregulation associated with diseases, including cancers. Kinase inhibitors, most notably those targeting ABeLson 1 (ABL1) kinase in chronic myeloid leukemia, have had a significant impact on cancer survival, yet emergence of resistance mutations can reduce their effectiveness, leading to therapeutic failure. Limited effort, however, has been devoted to developing tools to accurately identify ABL1 resistance mutations, as well as providing insights into their molecular mechanisms. Here we investigated the structural basis of ABL1 mutations modulating binding affinity of eight FDA-approved drugs. We found mutations impair affinity of type I and type II inhibitors differently and used this insight to develop a novel web-based diagnostic tool, SUSPECT-ABL, to pre-emptively predict resistance profiles and binding free-energy changes (ΔΔG) of all possible ABL1 mutations against inhibitors with different binding modes. Resistance mutations in ABL1 were successfully identified, achieving a Matthews correlation coefficient of up to 0.73 and the resulting change in ligand binding affinity with a Pearson's correlation of up to 0.77, with performances consistent across non-redundant blind tests. Through an in silico saturation mutagenesis, our tool has identified possibly emerging resistance mutations, which offers opportunities for in vivo experimental validation. We believe SUSPECT-ABL will be an important tool not just for improving precision medicine efforts, but for facilitating the development of next-generation inhibitors that are less prone to resistance. We have made our tool freely available at http://biosig.unimelb.edu.au/suspect_abl/

PubMed Central

Apollo (Cambridge)

Virtual screening for inhibitors of the human TSLP:TSLPR interaction

Author: AA Bogan
AC Braisted
AJ Souers
C Ghirelli
D Sun
DR Hall
G Neudert
G Pérez-Hernández
GM Gauvreau
HH Loeffler
J Corren
J Felix
J Yin
JM Wang
K Lindorff-Larsen
K Verstraete
L Deng
LS Park
M Guillot-Delost
M Noti
M Piazza Di
MJ Abraham
NS Redhu
O Trott
P Buchwald
P Chames
RL Rich
S Demehri
S Doerr
S Ulrich
SD Levin
SF Ziegler
T Clackson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

The pro-inflammatory cytokine thymic stromal lymphopoietin (TSLP) plays a pivotal role in the pathophysiology of various allergy disorders that are mediated by type 2 helper T cell (Th2) responses, such as asthma and atopic dermatitis. TSLP forms a ternary complex with the TSLP receptor (TSLPR) and the interleukin-7-receptor subunit alpha (IL-7Ra), thereby activating a signaling cascade that culminates in the release of pro-inflammatory mediators. In this study, we conducted an in silico characterization of the TSLP:TSLPR complex to investigate the druggability of this complex. Two commercially available fragment libraries were screened computationally for possible inhibitors and a selection of fragments was subsequently tested in vitro. The screening setup consisted of two orthogonal assays measuring TSLP binding to TSLPR: a BLI-based assay and a biochemical assay based on a TSLP:alkaline phosphatase fusion protein. Four fragments pertaining to diverse chemical classes were identified to reduce TSLP:TSLPR complex formation to less than 75% in millimolar concentrations. We have used unbiased molecular dynamics simulations to develop a Markov state model that characterized the binding pathway of the most interesting compound. This work provides a proof-of-principle for use of fragments in the inhibition of TSLP:TSLPR complexation.

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

Institutional Repository Universiteit Antwerpen