Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
This work introduces a number of algebraic topology approaches, such as
multicomponent persistent homology, multi-level persistent homology and
electrostatic persistence for the representation, characterization, and
description of small molecules and biomolecular complexes. Multicomponent
persistent homology retains critical chemical and biological information during
the topological simplification of biomolecular geometric complexity.
Multi-level persistent homology enables a tailored topological description of
inter- and/or intra-molecular interactions of interest. Electrostatic
persistence incorporates partial charge information into topological
invariants. These topological methods are paired with Wasserstein distance to
characterize similarities between molecules and are further integrated with a
variety of machine learning algorithms, including k-nearest neighbors, ensemble
of trees, and deep convolutional neural networks, to manifest their descriptive
and predictive powers for chemical and biological problems. Extensive numerical
experiments involving more than 4,000 protein-ligand complexes from the PDBBind
database and nearly 100,000 ligands and decoys in the DUD database are performed
to test respectively the scoring power and the virtual screening power of the
proposed topological approaches. It is demonstrated that the present approaches
outperform the modern machine learning based methods in protein-ligand binding
affinity predictions and ligand-decoy discrimination.
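
The pairing of topological descriptors with Wasserstein distance and k-nearest neighbors described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each molecule is summarized by a single small persistence diagram (a list of birth/death pairs), uses an L-infinity ground metric, and finds the optimal matching by brute force, which is only feasible for a handful of points. The function names, the toy diagrams, and the affinity labels are hypothetical. For realistic diagrams an assignment solver such as `scipy.optimize.linear_sum_assignment` would replace the permutation search.

```python
from itertools import permutations


def wasserstein_distance(diag_a, diag_b, p=1):
    """p-Wasserstein distance between two tiny persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other or to the diagonal (birth == death). Brute force: O(n!).
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m  # rows: points of A + m diagonal slots; cols: points of B + n diagonal slots

    def cost(i, j):
        if i < n and j < m:  # point of A matched to point of B
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2)) ** p
        if i < n:  # point of A matched to the diagonal
            b, d = diag_a[i]
            return ((d - b) / 2) ** p
        if j < m:  # point of B matched to the diagonal
            b, d = diag_b[j]
            return ((d - b) / 2) ** p
        return 0.0  # diagonal matched to diagonal

    best = min(
        sum(cost(i, perm[i]) for i in range(size))
        for perm in permutations(range(size))
    )
    return best ** (1 / p)


def knn_predict(query, training_set, k=1):
    """Average the labels of the k training diagrams closest to the query."""
    ranked = sorted(training_set, key=lambda item: wasserstein_distance(query, item[0]))
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical toy data: (persistence diagram, binding affinity label).
training = [
    ([(0.0, 1.0)], 5.0),
    ([(0.0, 4.0)], 9.0),
]
prediction = knn_predict([(0.0, 3.5)], training, k=1)
```

Here the query diagram is closer to the second training diagram, so the 1-nearest-neighbor prediction takes that diagram's label.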
Molecular docking: Shifting paradigms in drug discovery
Molecular docking is an established in silico structure-based method widely used in drug discovery. Docking enables the identification of novel compounds of therapeutic interest, predicting ligand-target interactions at a molecular level, or delineating structure-activity relationships (SAR), without knowing a priori the chemical structure of other target modulators. Although it was originally developed to help understand the mechanisms of molecular recognition between small and large molecules, the uses and applications of docking in drug discovery have changed considerably in recent years. In this review, we describe how molecular docking was first applied to assist in drug discovery tasks. Then, we illustrate newer and emerging uses and applications of docking, including prediction of adverse effects, polypharmacology, drug repurposing, and target fishing and profiling, and discuss future applications and the further potential of this technique when combined with emerging techniques such as artificial intelligence.
From Static to Dynamic Structures: Improving Binding Affinity Prediction with a Graph-Based Deep Learning Model
Accurate prediction of the protein-ligand binding affinities is an essential
challenge in structure-based drug design. Despite recent advances in
data-driven methods in affinity prediction, their accuracy is still limited,
partially because they only take advantage of static crystal structures while
the actual binding affinities are generally depicted by the thermodynamic
ensembles between proteins and ligands. One effective way to approximate such a
thermodynamic ensemble is to use molecular dynamics (MD) simulation. Here, we
curated an MD dataset containing 3,218 different protein-ligand complexes, and
further developed Dynaformer, which is a graph-based deep learning model.
Dynaformer was able to accurately predict the binding affinities by learning
the geometric characteristics of the protein-ligand interactions from the MD
trajectories. In silico experiments demonstrated that our model exhibits
state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset,
outperforming the methods hitherto reported. Moreover, we performed a virtual
screening on the heat shock protein 90 (HSP90) using Dynaformer that identified
20 candidates and further experimentally validated their binding affinities. We
demonstrated that our approach is more efficient, identifying 12 hit
compounds (two were in the submicromolar range), including several newly
discovered scaffolds. We anticipate this new synergy between large-scale MD
datasets and deep learning models will provide a new route toward accelerating
the early drug discovery process.
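
The "scoring power" and "ranking power" mentioned in this abstract are conventionally reported as correlation coefficients between predicted and experimental affinities: Pearson's r for scoring power and a rank correlation such as Spearman's rho for ranking power. The sketch below shows how the two quantities could be computed. The affinity values are made up for illustration, and the benchmark's per-target grouping for ranking power is omitted, so this is an assumption-laden simplification rather than the benchmark protocol.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical affinities (e.g. pKd) for four complexes.
experimental = [4.0, 5.5, 7.0, 8.5]
predicted = [4.2, 6.5, 6.0, 9.0]

scoring_power = pearson(experimental, predicted)   # linear agreement
ranking_power = spearman(experimental, predicted)  # agreement in ordering
```

In this toy case the model swaps the order of the two middle complexes, which lowers the rank correlation even though the predicted values remain close to the experimental ones.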
Empirical Scoring Functions for Structure-Based Virtual Screening: Applications, Critical Aspects, and Challenges
Structure-based virtual screening (VS) is a widely used approach that employs the knowledge of the three-dimensional structure of the target of interest in the design of new lead compounds from large-scale molecular docking experiments. Through the prediction of the binding mode and affinity of a small molecule within the binding site of the target of interest, it is possible to understand important properties related to the binding process. Empirical scoring functions are widely used for pose and affinity prediction. Although pose prediction is performed with satisfactory accuracy, the correct prediction of binding affinity is still a challenging task and crucial for the success of structure-based VS experiments. There are several efforts on distinct fronts to develop more sophisticated and accurate models for filtering and ranking large libraries of compounds. This paper will cover some recent successful applications and methodological advances, including strategies to explore the ligand entropy and solvent effects, training with sophisticated machine-learning techniques, and the use of quantum mechanics. Particular emphasis will be given to the discussion of critical aspects and further directions for the development of more accurate empirical scoring functions.
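
An empirical scoring function of the kind discussed in this abstract is typically a weighted sum of interaction terms, with weights fitted to experimental affinities. The sketch below evaluates such a function and uses it to rank a small library. The term names, weights, intercept, and ligand data are all hypothetical placeholders chosen for illustration; they do not correspond to any published scoring function.

```python
# Hypothetical weights; in practice these would be fitted by regression
# against experimentally measured binding affinities.
WEIGHTS = {
    "hydrogen_bonds": -0.5,        # each hydrogen bond favours binding
    "hydrophobic_contacts": -0.1,  # each hydrophobic contact favours binding
    "rotatable_bonds": 0.3,        # entropic penalty for ligand flexibility
}
INTERCEPT = -1.0


def empirical_score(terms, weights=WEIGHTS, intercept=INTERCEPT):
    """Weighted sum of interaction terms; lower (more negative) is better."""
    return intercept + sum(weights[name] * terms[name] for name in weights)


def rank_ligands(library):
    """Return ligand names ordered from best (lowest) to worst score."""
    return sorted(library, key=lambda name: empirical_score(library[name]))


# Hypothetical term counts for two docked ligands.
library = {
    "lig_a": {"hydrogen_bonds": 4, "hydrophobic_contacts": 20, "rotatable_bonds": 5},
    "lig_b": {"hydrogen_bonds": 1, "hydrophobic_contacts": 5, "rotatable_bonds": 2},
}
ranking = rank_ligands(library)
```

The rotatable-bond term stands in for the ligand entropy effects the abstract mentions; solvent effects would enter as further terms in the same sum.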