Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
This work introduces a number of algebraic topology approaches, such as
multicomponent persistent homology, multi-level persistent homology and
electrostatic persistence for the representation, characterization, and
description of small molecules and biomolecular complexes. Multicomponent
persistent homology retains critical chemical and biological information during
the topological simplification of biomolecular geometric complexity.
Multi-level persistent homology enables a tailored topological description of
inter- and/or intra-molecular interactions of interest. Electrostatic
persistence incorporates partial charge information into topological
invariants. These topological methods are paired with the Wasserstein distance to
characterize similarities between molecules and are further integrated with a
variety of machine learning algorithms, including k-nearest neighbors, ensembles
of trees, and deep convolutional neural networks, to demonstrate their descriptive
and predictive power for chemical and biological problems. Extensive numerical
experiments involving more than 4,000 protein-ligand complexes from the PDBBind
database and nearly 100,000 ligands and decoys in the DUD database are performed
to test, respectively, the scoring power and the virtual screening power of the
proposed topological approaches. The results demonstrate that the present approaches
outperform modern machine learning based methods in protein-ligand binding
affinity prediction and ligand-decoy discrimination.
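
As a rough illustration of the kind of topological invariant the abstract refers to, the sketch below computes the 0-dimensional persistence barcode (connected components) of a small point cloud under a Vietoris-Rips filtration. It is a generic textbook construction, not the paper's multicomponent, multi-level, or electrostatic variants; the function name `h0_barcode` and the sample coordinates are hypothetical.

```python
import itertools
import math


def h0_barcode(points):
    """Return the death values of the finite 0-dimensional bars.

    Every point is born as its own component at scale 0. A component dies
    at the length of the edge that first merges it into another component,
    which a union-find pass over the sorted edges detects.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge: one bar ends
            deaths.append(length)
    return deaths


# Hypothetical coordinates standing in for atom positions.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
bars = h0_barcode(points)
```

For the three collinear points above, the two finite bars end at the two shortest merging distances, and one component persists indefinitely. Barcodes of this kind are the objects that a distance such as the Wasserstein distance would compare.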
Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL) optimization framework
The simplicity and flexibility of meta-heuristic optimization algorithms have attracted considerable attention in the field of optimization. Different optimization methods, however, have algorithm-specific strengths and limitations, and selecting the best-performing algorithm for a specific problem is a tedious task. We introduce a new hybrid optimization framework, entitled Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL), which combines the strengths of different evolutionary algorithms (EAs) in a parallel computing scheme. SC-SAHEL explores the performance of different EAs, such as their capability to escape local attractions, speed, and convergence, during population evolution, because each EA is suited differently to different response surfaces. The SC-SAHEL algorithm is benchmarked on 29 conceptual test functions and a real-world hydropower reservoir model case study. Results show that the hybrid SC-SAHEL algorithm is rigorous and effective in finding the global optimum for a majority of test cases, and that it is computationally efficient compared with algorithms that use a single EA.
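
The general idea of shuffling a population into complexes and evolving each complex with a different search operator can be sketched as follows. This is a toy illustration only: the two operators, the greedy acceptance rule, and every name here (`hybrid_search`, `gaussian_step`, `shrink_step`, `sphere`) are assumptions for the example, and the sketch omits whatever adaptive allocation of EAs the actual SC-SAHEL framework performs.

```python
import random


def sphere(x):
    """Toy objective with its global minimum of 0 at the origin."""
    return sum(v * v for v in x)


def gaussian_step(x, rng):
    """Stand-in operator 1: small random perturbation."""
    return [v + rng.gauss(0.0, 0.3) for v in x]


def shrink_step(x, rng):
    """Stand-in operator 2: contract the point toward the origin."""
    return [v * rng.uniform(0.5, 1.0) for v in x]


def hybrid_search(objective, operators, dim=2, pop_size=12, rounds=30, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)
    ]
    initial_best = min(objective(x) for x in population)

    for _ in range(rounds):
        # Shuffle, then deal individuals into one complex per operator.
        rng.shuffle(population)
        complexes = [
            population[k::len(operators)] for k in range(len(operators))
        ]

        # Evolve each complex with its own operator; keep improvements only.
        population = []
        for operator, members in zip(operators, complexes):
            for x in members:
                candidate = operator(x, rng)
                if objective(candidate) < objective(x):
                    population.append(candidate)
                else:
                    population.append(x)

    final_best = min(objective(x) for x in population)
    return initial_best, final_best


initial_best, final_best = hybrid_search(sphere, [gaussian_step, shrink_step])
```

Because a candidate replaces its parent only when it improves the objective, the best value in the population can never get worse from one round to the next.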
Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches using metric-agnostic template nudging
Efficient multi-dimensional template placement is crucial in computationally
intensive matched-filtering searches for Gravitational Waves (GWs). Here, we
implement the Neighboring Cell Algorithm (NCA) to improve the detection volume
of an existing Compact Binary Coalescence (CBC) template bank. This algorithm
has already been successfully applied for a binary millisecond pulsar search in
data from the Fermi satellite. It repositions templates from over-dense regions
to under-dense regions and reduces the number of templates that would have been
required by a stochastic method to achieve the same detection volume. Our
method is readily generalizable to other CBC parameter spaces. Here we apply
this method to the aligned–single-spin neutron-star–black-hole binary
coalescence inspiral-merger-ringdown gravitational wave parameter space. We
show that the template nudging algorithm can attain the same
effectualness as the stochastic method with 12% fewer templates.
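
For context, the stochastic baseline mentioned above can be sketched in a few lines: random proposals are accepted into the bank only if they are not already covered by an existing template. The sketch assumes a flat Euclidean metric on the unit square as a stand-in for the waveform match, so `stochastic_bank` and its parameters are hypothetical; it does not implement the Neighboring Cell Algorithm or the nudging step itself.

```python
import math
import random


def stochastic_bank(radius, n_proposals, seed=0):
    """Build a toy template bank by rejection sampling.

    A proposal is kept only if it lies farther than `radius` from every
    template already in the bank (assumed flat metric, unit square).
    """
    rng = random.Random(seed)
    bank = []
    for _ in range(n_proposals):
        proposal = (rng.random(), rng.random())
        if all(math.dist(proposal, t) > radius for t in bank):
            bank.append(proposal)
    return bank


bank = stochastic_bank(radius=0.2, n_proposals=2000)
```

Placement of this kind tends to leave some regions denser than others, which is the imbalance that repositioning templates from over-dense to under-dense regions is meant to reduce.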