9,287 research outputs found

    Enhancing the effectiveness of ligand-based virtual screening using data fusion

    Get PDF
    Data fusion is being increasingly used to combine the outputs of different types of sensor. This paper reviews the application of the approach to ligand-based virtual screening, where the sensors to be combined are functions that score molecules in a database on their likelihood of exhibiting some required biological activity. Much of the literature to date involves the combination of multiple similarity searches, although there is also increasing interest in the combination of multiple machine learning techniques. Both approaches are reviewed here, focusing on the extent to which fusion can improve the effectiveness of searching when compared with a single screening mechanism, and on the reasons that have been suggested for the observed performance enhancement

    Similarity-based virtual screening using 2D fingerprints

    Get PDF
    This paper summarises recent work at the University of Sheffield on virtual screening methods that use 2D fingerprint measures of structural similarity. A detailed comparison of a large number of similarity coefficients demonstrates that the well-known Tanimoto coefficient remains the method of choice for the computation of fingerprint-based similarity, despite possessing some inherent biases related to the sizes of the molecules that are being sought. Group fusion involves combining the results of similarity searches based on multiple reference structures and a single similarity measure. We demonstrate the effectiveness of this approach to screening, and also describe an approximate form of group fusion, turbo similarity searching, that can be used when just a single reference structure is available

    Ligand-based virtual screening using binary kernel discrimination

    Get PDF
    This paper discusses the use of a machine-learning technique called binary kernel discrimination (BKD) for virtual screening in drug- and pesticide-discovery programmes. BKD is compared with several other ligand-based tools for virtual screening in databases of 2D structures represented by fragment bit-strings, and is shown to provide an effective, and reasonably efficient, way of prioritising compounds for biological screening

    Evaluation of a Bayesian inference network for ligand-based virtual screening

    Get PDF
    Background Bayesian inference networks enable the computation of the probability that an event will occur. They have been used previously to rank textual documents in order of decreasing relevance to a user-defined query. Here, we modify the approach to enable a Bayesian inference network to be used for chemical similarity searching, where a database is ranked in order of decreasing probability of bioactivity. Results Bayesian inference networks were implemented using two different types of network and four different types of belief function. Experiments with the MDDR and WOMBAT databases show that a Bayesian inference network can be used to provide effective ligand-based screening, especially when the active molecules being sought have a high degree of structural homogeneity; in such cases, the network substantially out-performs a conventional, Tanimoto-based similarity searching system. However, the effectiveness of the network is much less when structurally heterogeneous sets of actives are being sought. Conclusion A Bayesian inference network provides an interesting alternative to existing tools for ligand-based virtual screening

    Evaluation of machine-learning methods for ligand-based virtual screening

    Get PDF
    Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    Full text link
    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and near 100,000 ligands and decoys in the DUD database are performed to test respectively the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination

    Virtual screening for inhibitors of the human TSLP:TSLPR interaction

    Get PDF
    The pro-inflammatory cytokine thymic stromal lymphopoietin (TSLP) plays a pivotal role in the pathophysiology of various allergy disorders that are mediated by type 2 helper T cell (Th2) responses, such as asthma and atopic dermatitis. TSLP forms a ternary complex with the TSLP receptor (TSLPR) and the interleukin-7-receptor subunit alpha (IL-7Ra), thereby activating a signaling cascade that culminates in the release of pro-inflammatory mediators. In this study, we conducted an in silico characterization of the TSLP: TSLPR complex to investigate the drugability of this complex. Two commercially available fragment libraries were screened computationally for possible inhibitors and a selection of fragments was subsequently tested in vitro. The screening setup consisted of two orthogonal assays measuring TSLP binding to TSLPR: a BLI-based assay and a biochemical assay based on a TSLP: alkaline phosphatase fusion protein. Four fragments pertaining to diverse chemical classes were identified to reduce TSLP: TSLPR complex formation to less than 75% in millimolar concentrations. We have used unbiased molecular dynamics simulations to develop a Markov state model that characterized the binding pathway of the most interesting compound. This work provides a proof-ofprinciple for use of fragments in the inhibition of TSLP: TSLPR complexation
    • …
    corecore