
    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, including multicomponent persistent homology, multi-level persistent homology, and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. The present approaches are shown to outperform modern machine-learning-based methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
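As a rough illustration of the kind of topological invariant the abstract refers to, the sketch below computes only the simplest case: the 0-dimensional persistence barcode of a point cloud under a Vietoris-Rips filtration, using a union-find structure. It is not the authors' multicomponent, multi-level, or electrostatic method; the function name `h0_barcode` and the toy coordinates are assumptions made purely for this example.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return the finite H0 death scales of a Vietoris-Rips filtration.

    Every point is born as its own connected component at scale 0.
    A component dies at the length of the edge that first merges it
    into another component; one component survives forever and is
    therefore not listed.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(length)    # one component dies at this scale
    return deaths


# Hypothetical toy "molecule": two tight atom pairs separated by a gap.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
deaths = h0_barcode(atoms)
print(deaths)
```

The two short bars record the bonded pairs merging, and the longer bar records the two pairs joining. Lists of bars like this are the objects that a distance such as the Wasserstein distance would then compare between molecules.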

    Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches using metric-agnostic template nudging

    Efficient multi-dimensional template placement is crucial in computationally intensive matched-filtering searches for Gravitational Waves (GWs). Here, we implement the Neighboring Cell Algorithm (NCA) to improve the detection volume of an existing Compact Binary Coalescence (CBC) template bank. This algorithm has already been successfully applied for a binary millisecond pulsar search in data from the Fermi satellite. It repositions templates from over-dense regions to under-dense regions and reduces the number of templates that would have been required by a stochastic method to achieve the same detection volume. Our method is readily generalizable to other CBC parameter spaces. Here we apply this method to the aligned–single-spin neutron-star–black-hole binary coalescence inspiral-merger-ringdown gravitational wave parameter space. We show that the template nudging algorithm can attain the equivalent effectualness of the stochastic method with 12% fewer templates.
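For readers unfamiliar with the stochastic baseline the abstract compares against, here is a minimal sketch of stochastic placement in a toy setting. It assumes a flat two-dimensional unit square with Euclidean distance standing in for the waveform-match criterion a real search would use, and it does not implement the NCA or template nudging. The names `stochastic_bank`, `n_proposals`, and `min_dist` are hypothetical.

```python
import math
import random


def stochastic_bank(n_proposals, min_dist, seed=0):
    """Build a toy template bank by random proposal and rejection.

    Each proposal is a random point in the unit square. It is accepted
    only if it lies at least `min_dist` from every template already in
    the bank, so accepted templates never crowd each other.
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    bank = []
    for _ in range(n_proposals):
        candidate = (rng.random(), rng.random())
        # Reject the candidate if an existing template already covers it.
        if all(math.dist(candidate, t) >= min_dist for t in bank):
            bank.append(candidate)
    return bank


bank = stochastic_bank(n_proposals=2000, min_dist=0.1)
print(len(bank), "templates accepted out of 2000 proposals")
```

Because acceptance depends on the order of random proposals, a bank built this way tends to have uneven spacing, which is the over-dense and under-dense structure that a repositioning step would aim to even out.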