142 research outputs found

Protein-Ligand Scoring with Convolutional Neural Networks

Author: Hochuli Joshua
Idrobo Elisa
Koes David Ryan
Ragoza Matthew
Sunseri Jocelyn
Publication date: 08/12/2016

Computational approaches to drug discovery can reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables the use of deep machine learning techniques for protein-ligand scoring. We describe convolutional neural network (CNN) scoring functions that take as input a comprehensive 3D representation of a protein-ligand interaction. A CNN scoring function automatically learns the key features of protein-ligand interactions that correlate with binding. We train and optimize our CNN scoring functions to discriminate between correct and incorrect binding poses and between known binders and non-binders. We find that our CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses, both for pose prediction and for virtual screening.
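
The abstract does not say how the 3D representation is built. One common way to feed atoms to a 3D CNN is to spread each atom as a Gaussian density over a voxel grid, with one channel per atom type. The sketch below illustrates that idea only; the function name `voxelize`, the box size, the resolution, and the Gaussian width are all assumptions for illustration and are not taken from the paper.

```python
import math

def voxelize(atoms, channels, box=8.0, resolution=1.0, sigma=1.0):
    """Spread atoms onto a cubic grid centred on the origin.

    atoms    : list of (x, y, z, atom_type) tuples (hypothetical input format)
    channels : list of atom types, one grid channel per type
    Returns a nested list indexed as grid[channel][i][j][k].
    """
    n = int(round(box / resolution))
    half = box / 2.0
    # Voxel centre coordinates along one axis.
    centres = [-half + (i + 0.5) * resolution for i in range(n)]
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in channels]
    index = {atom_type: c for c, atom_type in enumerate(channels)}

    for x, y, z, atom_type in atoms:
        c = index.get(atom_type)
        if c is None:
            continue  # atom types without a channel are skipped
        for i, gx in enumerate(centres):
            for j, gy in enumerate(centres):
                for k, gz in enumerate(centres):
                    d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
                    # Gaussian density contribution (assumed functional form).
                    grid[c][i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid

# Usage: a single carbon atom sitting exactly on a voxel centre.
example = voxelize([(0.5, 0.5, 0.5, "C")], channels=["C", "O"], box=4.0)
```

A grid like this would then be passed to a 3D convolutional network as a multi-channel volume; a production pipeline would use an array library rather than nested lists.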

arXiv.org e-Print Archive

FigShare

Accelerating Inference in Molecular Diffusion Models with Latent Representations of Protein Structure

Author: Dunn Ian
Koes David Ryan
Publication date: 22/11/2023

Diffusion generative models have emerged as a powerful framework for addressing problems in structural biology and structure-based drug design. These models operate directly on 3D molecular structures. Due to the unfavorable scaling of graph neural networks (GNNs) with graph size as well as the relatively slow inference speeds inherent to diffusion models, many existing molecular diffusion models rely on coarse-grained representations of protein structure to make training and inference feasible. However, such coarse-grained representations discard essential information for modeling molecular interactions and impair the quality of generated structures. In this work, we present a novel GNN-based architecture for learning latent representations of molecular structure. When trained end-to-end with a diffusion model for de novo ligand design, our model achieves comparable performance to one with an all-atom protein representation while exhibiting a 3-fold reduction in inference time.

Comment: This paper appeared as a spotlight paper at the NeurIPS 2023 Generative AI and Biology Workshop.

arXiv.org e-Print Archive

Computing free energies with PyBrella

Author: Koes David
Yuan Chenhui
Publication venue: F1000 Research Ltd.
Publication date: 20/05/2014

Calculations of the rates of dissociation between small molecules and proteins have numerous applications, including assisting rapid discovery and testing of novel drugs. Free energy calculations consider the enthalpy and entropy of the full protein-ligand-water system and so have the potential to be more accurate than faster, single-point calculations. In this study, methods are explored to predict the binding affinity of various molecules to proteins by molecular dynamics and umbrella sampling. An attempt was made to determine the potential of mean force (PMF) for the molecule, which was compared to its known binding capability. Factors including simulation resources, amounts of sampling, force strength parameters, and correlation between predicted energy and actual rate constants were considered in order to evaluate the umbrella sampling methods. Limitations in the simulation environment, such as the scaling of the PMF for sampling, biases in the SMD trajectory, and variations between ligands, were also investigated in the hope of creating a more comprehensive approach for predicting the target-molecule interaction.
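
Umbrella sampling is usually set up as a series of windows along a reaction coordinate, each restrained by a harmonic bias whose force constant corresponds to the "force strength parameters" mentioned above. The sketch below shows that generic setup only. The helper names, units, and values are assumptions for illustration, and nothing here is taken from PyBrella itself.

```python
def harmonic_bias(xi, center, k):
    """Harmonic restraint energy 0.5 * k * (xi - center)^2 for one window.

    xi     : current value of the reaction coordinate (e.g. a distance)
    center : window centre along the same coordinate
    k      : force constant (units are an assumption of the caller)
    """
    return 0.5 * k * (xi - center) ** 2

def window_centers(start, stop, spacing):
    """Evenly spaced window centres from start to stop, inclusive."""
    n = int(round((stop - start) / spacing)) + 1
    return [start + i * spacing for i in range(n)]

# Usage: five windows along a hypothetical ligand-pocket distance,
# and the bias energy felt 0.5 units away from a window centre.
centers = window_centers(0.0, 2.0, 0.5)
energy = harmonic_bias(1.5, center=1.0, k=10.0)
```

Sampling in each window is then combined, for example with a reweighting method such as WHAM, to recover the unbiased PMF; that step is not shown here.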

D-Scholarship@Pitt

A pyrazolopyran derivative preferentially inhibits the activity of human cytosolic hydroxymethyltransferase and induces cell death in lung cancer cells

Author: Contestabile Roberto
Cutruzzolà Francesca
Fiascarelli Alessio
Gargano Maurizio
Giardina Giorgio
Koes David
Macone Alberto
Marani Marina
McDermott Lee
Paiardini Alessandro
Paone Alessio
Pontecorvi Valentino
Rinaldo Serena
Yang Tianyi
Publication venue: Impact Journals, LLC
Publication date: 01/01/2016

Serine hydroxymethyltransferase (SHMT) is a central enzyme in the metabolic reprogramming of cancer cells, providing activated one-carbon units in the serine-glycine one-carbon metabolism. Previous studies demonstrated that the cytoplasmic isoform of SHMT (SHMT1) plays a relevant role in lung cancer. SHMT1 is overexpressed in lung cancer patients and NSCLC cell lines. Moreover, SHMT1 is required to maintain DNA integrity. Its depletion in lung cancer cell lines causes cell cycle arrest and uracil accumulation and ultimately leads to apoptosis. We found that a pyrazolopyran compound, namely 2.12, preferentially inhibits SHMT1 compared to the mitochondrial counterpart SHMT2. Computational and crystallographic approaches suggest binding at the active site of SHMT1 and a competitive inhibition mechanism. A radioisotopic activity assay shows that inhibition of SHMT by 2.12 also occurs in living cells. Moreover, administration of 2.12 in A549 and H1299 lung cancer cell lines causes apoptosis at an LD50 of 34 μM, and rescue experiments underlined selectivity towards SHMT1. These data not only further highlight the relevance of the cytoplasmic isoform SHMT1 in lung cancer but, more importantly, demonstrate that, at least in vitro, it is possible to find selective inhibitors against one specific isoform of SHMT, a key target in metabolic reprogramming of many cancer types.

Archivio della ricerca- Università di Roma La Sapienza

Improvements to the APBS biomolecular solvation software suite

Author: Baker Nathan A.
Brandi Juan
Brookes David H.
Chen Jiahui
Chun Minju
Dolinsky Todd
Engel Dave
Felberg Lisa E.
Geng Weihua
Gohara David W.
Head-Gordon Teresa
Holst Michael J.
Jurrus Elizabeth
Koes David R.
Konecny Robert
Krasny Robert
Li Peter
Liles Karina
McCammon J. Andrew
Monson Kyle
Nielsen Jens Erik
Star Keith
Wei Guo Wei
Wilson Leighton
Publication venue: Wiley
Publication date: 21/08/2017

The Adaptive Poisson-Boltzmann Solver (APBS) software was developed to solve the equations of continuum electrostatics for large biomolecular assemblages and has had an impact on the study of a broad range of chemical, biological, and biomedical applications. APBS addresses three key technology challenges for understanding solvation and electrostatics in biomedical applications: accurate and efficient models for biomolecular solvation and electrostatics, robust and scalable software for applying those theories to biomolecular systems, and mechanisms for sharing and analyzing biomolecular electrostatics data in the scientific community. To address new research applications and advancing computational capabilities, we have continually updated APBS and its suite of accompanying software since its release in 2001. In this manuscript, we discuss the models and capabilities that have recently been implemented within the APBS software package, including a Poisson-Boltzmann analytical and a semi-analytical solver, an optimized boundary element solver, a geometry-based geometric flow solvation model, a graph-theory-based algorithm for determining pKa values, and an improved web-based visualization tool for viewing electrostatics.

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Deep Blue Documents at the University of Michigan

Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening

Author: Chen Lieyang
Cruz Anthony
Dickson Callum J.
Duca Jose S.
Hornak Viktor
Koes David R.
Kurtzman Tom
Ramsey Steven
Publication venue: CUNY Academic Works
Publication date: 20/08/2019

Recently, much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand x-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine-learning-based methodology development.
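
Enrichment in virtual screening is often summarized by an enrichment factor: the hit rate among the top-ranked fraction of compounds divided by the hit rate in the whole library. The abstract does not say which metric the authors used, so the sketch below is a generic illustration; the function name and the default cutoff are assumptions.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores   : predicted scores, higher meaning more likely to bind
    labels   : 1 for known actives, 0 for decoys (same order as scores)
    fraction : top fraction of the ranked list to inspect (assumed default)
    """
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, int(round(n * fraction)))
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (total_actives / n)

# Usage: ten compounds, two actives, both ranked in the top 20%.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef = enrichment_factor(scores, labels, fraction=0.2)
```

A high value on a benchmark shows only that actives rank above decoys in that benchmark. As the abstract argues, it cannot by itself tell whether a model learned binding physics or dataset bias.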

City University of New York