9 research outputs found

    Digitizing chemical discovery with a Bayesian explorer for interpreting reactivity data

    Interpreting the outcome of chemistry experiments consistently is slow and frequently introduces unwanted hidden bias. This difficulty limits the scale of collectable data and often leads to exclusion of negative results, which severely limits progress in the field. What is needed is a way to standardize the discovery process and accelerate the interpretation of high-dimensional data aided by the expert chemist’s intuition. We demonstrate a digital Oracle that interprets chemical reactivity using probability. By carrying out >500 reactions covering a large space and retaining both the positive and negative results, the Oracle was able to rediscover eight historically important reactions including the aldol condensation, Buchwald–Hartwig amination, Heck, Mannich, Sonogashira, Suzuki, Wittig, and Wittig–Horner reactions. This paradigm for decoding reactivity validates and formalizes the expert chemist’s experience and intuition, providing a quantitative criterion of discovery scalable to all available experimental data.

    Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization

    Undetected overfitting can occur when there are significant redundancies between training and validation data. We describe AVE, a new measure of training-validation redundancy for ligand-based classification problems that accounts for the similarity amongst inactive molecules as well as active. We investigated seven widely-used benchmarks for virtual screening and classification, and show that the amount of AVE bias strongly correlates with the performance of ligand-based predictive methods irrespective of the predicted property, chemical fingerprint, similarity measure, or previously-applied unbiasing techniques. Therefore, it may be that the previously-reported performance of most ligand-based methods can be explained by overfitting to benchmarks rather than good prospective accuracy.
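
The abstract above does not give the AVE formula, so the following is only a minimal sketch of a nearest-neighbour redundancy measure in the same spirit, not the paper's definition. It assumes fingerprints are represented as Python sets of "on" bits, uses 1 minus Tanimoto similarity as the distance, and averages over an arbitrary grid of distance thresholds. All function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def nn_fraction(valid, train, max_dist):
    """Fraction of validation molecules whose nearest training neighbour
    lies within max_dist, where distance = 1 - Tanimoto similarity."""
    hits = 0
    for v in valid:
        nearest = min(1.0 - tanimoto(v, t) for t in train)
        if nearest <= max_dist:
            hits += 1
    return hits / len(valid)


def mean_nn(valid, train, thresholds):
    """Average of nn_fraction over a grid of distance thresholds."""
    return sum(nn_fraction(valid, train, d) for d in thresholds) / len(thresholds)


def redundancy_bias(train_act, train_inact, valid_act, valid_inact,
                    thresholds=None):
    """Illustrative bias score (an assumption, not the published formula).

    Positive values mean validation actives sit closer to training actives
    than to training inactives, and validation inactives sit closer to
    training inactives than to training actives."""
    if thresholds is None:
        thresholds = [i / 10 for i in range(11)]  # assumed grid: 0.0 .. 1.0
    aa = mean_nn(valid_act, train_act, thresholds)
    ai = mean_nn(valid_act, train_inact, thresholds)
    ii = mean_nn(valid_inact, train_inact, thresholds)
    ia = mean_nn(valid_inact, train_act, thresholds)
    return (aa - ai) + (ii - ia)
```

A split whose validation molecules duplicate the training molecules of the same class yields a large positive score, while a split whose validation molecules are equally distant from both training classes scores zero.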

    Evaluation et application de méthodes de criblage in silico

    Since the introduction of virtual screening into the drug discovery process, the number of virtual screening methods has been increasing, and the available methods need to be evaluated. In this work, eight virtual screening methods were evaluated on the DUD database and showed adequate efficiency. The evaluation also revealed shortcomings of the DUD database, as the binding site conformation used in the DUD was not relevant for all the actives. Because current computational power makes it feasible to address this issue, classical docking runs were performed on several X-ray structures chosen to represent binding site flexibility. Another problem was also identified: the metrics used to evaluate the methods suffer from biases. New evaluation metrics have thus been proposed, e.g. BEDROC and RIE. An alternative method is also proposed here, using predictiveness curves based on compound activity probability. Finally, a virtual screening procedure was applied to TNFα, and a small-molecule inhibitor showing activity in vitro and in vivo in mice was identified. This demonstrates the value of virtual screening for the drug discovery process, although virtual screening methods still need to be improved.
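
The RIE and BEDROC metrics named in this abstract can be sketched as follows. This is a minimal illustration, assuming the standard exponential-weighting definitions and a weighting parameter `alpha=20` (a common choice, not a value stated in the abstract). BEDROC is computed here by rescaling RIE between its values for the worst and best possible rankings, rather than from the closed-form expression. Function names are hypothetical.

```python
import math


def rie(active_ranks, n_total, alpha=20.0):
    """Robust Initial Enhancement.

    active_ranks: 1-based ranks of the active compounds in the scored list.
    n_total: total number of compounds (actives plus decoys)."""
    n_act = len(active_ranks)
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks) / n_act
    # Expected value of the same average if actives were ranked at random.
    expected = sum(math.exp(-alpha * k / n_total)
                   for k in range(1, n_total + 1)) / n_total
    return observed / expected


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC: RIE rescaled to [0, 1] using the worst and best rankings."""
    n_act = len(active_ranks)
    best = rie(range(1, n_act + 1), n_total, alpha)
    worst = rie(range(n_total - n_act + 1, n_total + 1), n_total, alpha)
    return (rie(active_ranks, n_total, alpha) - worst) / (best - worst)
```

Larger `alpha` values concentrate the weight on the very top of the ranked list, so the same ranking can score quite differently depending on how "early" is defined.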

    Study of ligand-based virtual screening tools in computer-aided drug design

    Virtual screening is a central technique in drug discovery today. Millions of molecules can be tested in silico with the aim of selecting only the most promising for experimental testing. The topic of this thesis is ligand-based virtual screening tools, which take existing active molecules as the starting point for finding new drug candidates. One goal of this thesis was to build a model that gives the probability that two molecules are biologically similar as a function of one or more chemical similarity scores. Another important goal was to evaluate how well different ligand-based virtual screening tools are able to distinguish active molecules from inactives. One more criterion set for the virtual screening tools was their applicability in scaffold-hopping, i.e. finding new active chemotypes. In the first part of the work, a link was defined between the abstract chemical similarity score given by a screening tool and the probability that the two molecules are biologically similar. These results help to decide objectively which virtual screening hits to test experimentally. The work also resulted in a new type of data fusion method for use with two or more tools. In the second part, five ligand-based virtual screening tools were evaluated and their performance was found to be generally poor. Three reasons for this were proposed: false negatives in the benchmark sets, active molecules that do not share the binding mode, and activity cliffs. In the third part of the study, a novel visualization and quantification method is presented for evaluation of the scaffold-hopping ability of virtual screening tools.
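
The abstract does not say how the thesis links similarity scores to probabilities. One simple way to illustrate the idea is an empirical, binned estimate, sketched below under the assumption that scores lie in [0, 1] and that labelled molecule pairs are available. This is not the thesis's model, and the function names and bin count are hypothetical.

```python
def binned_probability(pairs, n_bins=5):
    """Estimate P(biologically similar | score bin) from labelled pairs.

    pairs: iterable of (similarity_score in [0, 1], is_biologically_similar).
    Returns one probability per bin, or None for bins with no data."""
    counts = [0] * n_bins
    positives = [0] * n_bins
    for score, similar in pairs:
        idx = min(int(score * n_bins), n_bins - 1)  # score 1.0 goes in last bin
        counts[idx] += 1
        if similar:
            positives[idx] += 1
    return [p / c if c else None for p, c in zip(positives, counts)]


def lookup(probabilities, score):
    """Return the estimated probability for a new similarity score."""
    n_bins = len(probabilities)
    return probabilities[min(int(score * n_bins), n_bins - 1)]
```

With such a table, a screening hit can be prioritised by its estimated probability of biological similarity instead of by the raw score, which is the kind of objective decision the abstract refers to.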

    Determination of in silico rules for predicting small molecule binding behavior to nucleic acids in vitro.

    The vast knowledge of nucleic acids is evolving and it is now known that DNA can adopt highly complex, heterogeneous structures. Among the most intriguing are the G-quadruplex structures, which are thought to play a pivotal role in cancer pathogenesis. Efforts to find new small molecules for these and other physiologically relevant nucleic acid structures have generally been limited to isolation from natural sources or rational synthesis of promising lead compounds. However, with the rapid growth in available computational power, virtual screening and computational approaches are quickly becoming a reality in academia and industry as an efficient and economical way to discover new lead compounds. These computational efforts have historically focused almost entirely on proteins as targets and have neglected DNA. We present research here showing that not only can software be utilized for targeting DNA, but that selectivity metrics can be developed to predict the binding mechanism of a small molecule to a DNA target. The programs Surflex and AutoDock were chosen for evaluation and were demonstrated to accurately reproduce the known crystal structures of several small molecules that bind by the most common nucleic acid interacting mechanisms of groove binding and intercalation. These programs were further used to rationalize known affinity and selectivity data of a 67-compound library for a library of nucleic acid structures including duplex, triplex and quadruplexes. Based upon the known binding behavior of these compounds, in silico metrics were developed to classify compounds as either groove binders or intercalators. These rules were subsequently used to identify new triplex and quadruplex binding small molecules by structure and ligand-based virtual screening approaches using a virtual library consisting of millions of commercially available small molecules. The binding behavior of the newly discovered triplex and quadruplex binding compounds was empirically validated using a number of spectroscopic, fluorescent and thermodynamic equilibrium techniques. In total, this research predicted the binding behavior of these test compounds in silico and subsequently validated these findings in vitro. This research presents a novel approach to discover lead compounds that target multiple nucleic acid morphologies.