3,091 research outputs found

    Predicting Skin Permeability by means of Computational Approaches : Reliability and Caveats in Pharmaceutical Studies

    Get PDF
    © 2019 American Chemical Society.The skin is the main barrier between the internal body environment and the external one. The characteristics of this barrier and its properties are able to modify and affect drug delivery and chemical toxicity parameters. Therefore, it is not surprising that permeability of many different compounds has been measured through several in vitro and in vivo techniques. Moreover, many different in silico approaches have been used to identify the correlation between the structure of the permeants and their permeability, to reproduce the skin behavior, and to predict the ability of specific chemicals to permeate this barrier. A significant number of issues, like interlaboratory variability, experimental conditions, data set building rationales, and skin site of origin and hydration, still prevent us from obtaining a definitive predictive skin permeability model. This review wants to show the main advances and the principal approaches in computational methods used to predict this property, to enlighten the main issues that have arisen, and to address the challenges to develop in future research.Peer reviewedFinal Accepted Versio

    Selection by Prediction with Conformal p-values

    Full text link
    Decision making or scientific discovery pipelines such as job hiring and drug discovery often involve multiple stages: before any resource-intensive step, there is often an initial screening that uses predictions from a machine learning model to shortlist a few candidates from a large pool. We study screening procedures that aim to select candidates whose unobserved outcomes exceed user-specified values. We develop a method that wraps around any prediction model to produce a subset of candidates while controlling the proportion of falsely selected units. Building upon the conformal inference framework, our method first constructs p-values that quantify the statistical evidence for large outcomes; it then determines the shortlist by comparing the p-values to a threshold introduced in the multiple testing literature. In many cases, the procedure selects candidates whose predictions are above a data-dependent threshold. Our theoretical guarantee holds under mild exchangeability conditions on the samples, generalizing existing results on multiple conformal p-values. We demonstrate the empirical performance of our method via simulations, and apply it to job hiring and drug discovery datasets.Comment: Journal of Machine Learning Researc

    Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning

    Get PDF
    Confidence predictors can deliver predictions with the associated confidence required for decision making and can play an important role in drug discovery and toxicity predictions. In this work we investigate a recently introduced version of conformal prediction, synergy conformal prediction, focusing on the predictive performance when applied to bioactivity data. We compare the performance to other variants of conformal predictors for multiple partitioned datasets and demonstrate the utility of synergy conformal predictors for federated learning where data cannot be pooled in one location. Our results show that synergy conformal predictors based on training data randomly sampled with replacement can compete with other conformal setups, while using completely separate training sets often results in worse performance. However, in a federated setup where no method has access to all the data, synergy conformal prediction is shown to give promising results. Based on our study, we conclude that synergy conformal predictors are a valuable addition to the conformal prediction toolbox

    Conformal Drug Property Prediction with Density Estimation under Covariate Shift

    Full text link
    In drug discovery, it is vital to confirm the predictions of pharmaceutical properties from computational models using costly wet-lab experiments. Hence, obtaining reliable uncertainty estimates is crucial for prioritizing drug molecules for subsequent experimental validation. Conformal Prediction (CP) is a promising tool for creating such prediction sets for molecular properties with a coverage guarantee. However, the exchangeability assumption of CP is often challenged with covariate shift in drug discovery tasks: Most datasets contain limited labeled data, which may not be representative of the vast chemical space from which molecules are drawn. To address this limitation, we propose a method called CoDrug that employs an energy-based model leveraging both training data and unlabelled data, and Kernel Density Estimation (KDE) to assess the densities of a molecule set. The estimated densities are then used to weigh the molecule samples while building prediction sets and rectifying for distribution shift. In extensive experiments involving realistic distribution drifts in various small-molecule drug discovery tasks, we demonstrate the ability of CoDrug to provide valid prediction sets and its utility in addressing the distribution shift arising from de novo drug design models. On average, using CoDrug can reduce the coverage gap by over 35% when compared to conformal prediction sets not adjusted for covariate shift.Comment: Accepted at NeurIPS 202

    Using Predicted Bioactivity Profiles to Improve Predictive Modeling

    Get PDF
    Predictive modeling is a cornerstone in early drug development. Using information for multiple domains or across prediction tasks has the potential to improve the performance of predictive modeling. However, aggregating data often leads to incomplete data matrices that might be limiting for modeling. In line with previous studies, we show that by generating predicted bioactivity profiles, and using these as additional features, prediction accuracy of biological endpoints can be improved. Using conformal prediction, a type of confidence predictor, we present a robust framework for the calculation of these profiles and the evaluation of their impact. We report on the outcomes from several approaches to generate the predicted profiles on 16 datasets in cytotoxicity and bioactivity and show that efficiency is improved the most when including the p-values from conformal prediction as bioactivity profiles
    • …
    corecore