
    KnowTox: pipeline and case study for confident prediction of potential toxic effects of compounds in early phases of development

    Risk assessment of newly synthesised chemicals is a prerequisite for regulatory approval. In this context, in silico methods have great potential to reduce time, cost, and ultimately animal testing, as they make use of the ever-growing amount of available toxicity data. Here, KnowTox is presented, a novel pipeline that combines three different in silico toxicology approaches for confident prediction of potentially toxic effects of query compounds: machine learning models for 88 endpoints, alerts for 919 toxic substructures, and computational support for read-across. It is based mainly on the ToxCast dataset, which after preprocessing yields a sparse matrix of 7912 compounds tested against 985 endpoints. When applying machine learning models, the applicability and reliability of predictions for new chemicals are of utmost importance. Therefore, the conformal prediction technique was deployed first; it comprises an additional calibration step and by definition creates internally valid predictors at a given significance level. Second, to further improve validity and information efficiency, two adaptations are suggested, exemplified on the androgen receptor antagonism endpoint. An absolute increase in validity of 23% on the in-house dataset of 534 compounds was achieved by introducing KNNRegressor normalisation. This gain in validity comes at the cost of efficiency, which in turn was improved by 20% for the initial ToxCast model by balancing the dataset during model training. Finally, the value of the developed pipeline for risk assessment is discussed using two in-house triazole molecules. Compared with a single toxicity prediction method, combining the outputs of different approaches can better guide toxicity testing and help de-select likely harmful development-candidate compounds early in the development process.
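    The conformal prediction step described above can be illustrated with a toy sketch. This is not the KnowTox implementation; it is a minimal inductive conformal classifier in plain NumPy, with a hypothetical kNN-based nonconformity score, invented function names, and made-up toy data standing in for compound descriptors and activity labels:

    ```python
    import numpy as np

    def knn_score(X_ref, y_ref, x, label, k=3):
        """Nonconformity score: mean distance to the k nearest reference points
        with the same label, divided by the mean distance to the k nearest
        with a different label. Small values mean x conforms to that label."""
        d = np.linalg.norm(X_ref - x, axis=1)
        same = np.sort(d[y_ref == label])[:k]
        other = np.sort(d[y_ref != label])[:k]
        return same.mean() / (other.mean() + 1e-12)

    def conformal_predict(X_prop, y_prop, X_cal, y_cal, x, labels, eps=0.2):
        """Inductive conformal classifier: the prediction set contains every
        label whose p-value (rank of the test score among the held-out
        calibration scores) exceeds the significance level eps. At eps = 0.2
        the set covers the true label with >= 80% probability; this coverage
        is the validity guarantee the abstract refers to."""
        cal_scores = np.array([knn_score(X_prop, y_prop, xc, yc)
                               for xc, yc in zip(X_cal, y_cal)])
        prediction_set = []
        for lab in labels:
            score = knn_score(X_prop, y_prop, x, lab)
            p_value = (np.sum(cal_scores >= score) + 1) / (len(cal_scores) + 1)
            if p_value > eps:
                prediction_set.append(lab)
        return prediction_set

    # Toy data: 'inactive' compounds near the origin, 'active' near (5, 5).
    X_prop = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
    y_prop = np.array([0, 0, 0, 1, 1, 1])
    X_cal = np.array([[0.5, 0.5], [1., 1.], [0., 0.5], [5.5, 5.5], [5., 4.5], [6., 6.]])
    y_cal = np.array([0, 0, 0, 1, 1, 1])

    print(conformal_predict(X_prop, y_prop, X_cal, y_cal,
                            np.array([0.2, 0.1]), labels=[0, 1]))  # → [0]
    ```

    An ambiguous query far from both clusters would yield an empty or two-label prediction set rather than a forced single call, which is what makes the approach useful for confidence-aware toxicity screening.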

    Generalized Workflow for Generating Highly Predictive in Silico Off‑Target Activity Models

    Chemical structure data and corresponding measured bioactivities of compounds are nowadays easily available from public and commercial databases. However, these databases contain heterogeneous data from different laboratories determined under different protocols and sometimes even erroneous entries. In this study, we evaluated the use of data from bioactivity databases for the generation of high-quality in silico models for off-target-mediated toxicity as a decision support in early drug discovery and crop-protection research. We chose human acetylcholinesterase (hAChE) inhibition as an exemplary endpoint for our case study. A standardized and thorough quality-management routine was established for input data consisting of more than 2,200 chemical entities from bioactivity databases. This procedure enables the development of predictive QSAR models based on heterogeneous in vitro data from multiple laboratories. An extended applicability domain approach was used, and regression results were refined by an error-estimation routine. Subsequent classification, augmented by special consideration of borderline candidates, achieved a correct classification rate of 96% in external validation. The standardized process described herein is implemented as a (semi)automated workflow and is thus easily transferable to other off-targets and assay readouts.
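    Two ingredients of the workflow above, an applicability domain check and borderline handling, can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: it stands in for a real QSAR model with a toy nearest-centroid classifier, uses a hypothetical distance-percentile applicability domain, and all names and data are invented:

    ```python
    import numpy as np

    def nn_distances(X):
        """Leave-one-out nearest-neighbour distance for every training compound."""
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        return d.min(axis=1)

    def predict_with_domain(X_train, y_train, x, margin=0.5):
        """Nearest-centroid classification restricted to an applicability domain:
        a query is in-domain only if its distance to the closest training
        compound does not exceed the 95th percentile of the training set's own
        nearest-neighbour distances. Queries whose two best class distances
        differ by less than `margin` are flagged as borderline instead of
        being forced into a class."""
        threshold = np.percentile(nn_distances(X_train), 95)
        if np.linalg.norm(X_train - x, axis=1).min() > threshold:
            return "out-of-domain"
        centroids = {c: X_train[y_train == c].mean(axis=0) for c in np.unique(y_train)}
        dists = sorted((np.linalg.norm(x - m), c) for c, m in centroids.items())
        (d1, best), (d2, _) = dists[0], dists[1]
        if d2 - d1 < margin:
            return "borderline"
        return f"class {best}"

    # Toy descriptors: inactives clustered near the origin, actives near (5, 5).
    X_train = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
                        [5., 5.], [5., 6.], [6., 5.], [6., 6.]])
    y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

    print(predict_with_domain(X_train, y_train, np.array([0.5, 0.5])))  # → class 0
    print(predict_with_domain(X_train, y_train, np.array([50., 50.])))  # → out-of-domain
    ```

    Refusing to predict outside the domain and surfacing borderline calls separately is what keeps external-validation accuracy meaningful: the reported figure applies only to compounds the model is allowed to classify.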