31 research outputs found

    Investigating the influence of data splitting on the predictive ability of QSAR/QSPR models

    This study investigated how the method of splitting data into a training set and a test set influences the external predictivity of quantitative structure-activity and/or structure-property relationship (QSAR/QSPR) models. Six good-quality models were collected from the literature, then redeveloped and validated using five alternative splitting algorithms: (i) a commonly used algorithm ('Z:1'), in which the compounds are sorted in ascending order of the response values (y) and every zth (e.g. third) compound is assigned to the test set; (ii-iv) three variations of the Kennard-Stone algorithm; and (v) the duplex algorithm. The external validation statistics reported for each model served as the basis for the final comparison. We demonstrated that splitting techniques that use the values of the molecular descriptors alone (X), or in combination with the model response (y), always lead to models with better external predictivity than models built with methodologies based on the y-values only. Moreover, we showed that the external validation coefficient (Q²EXT) is more sensitive to the splitting technique than the root mean square error of prediction (RMSEP). This difference becomes especially important when the test set is relatively small (5 to 10 compounds). For models trained and validated with a small number of compounds, it is strongly recommended that both statistics (Q²EXT and RMSEP) are taken into account when evaluating external predictivity.
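
The splitting schemes and the two external statistics named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Kennard-Stone routine is the classic version only (the abstract does not describe its three variations), treating the selected points as the training set is an assumption, and the external validation coefficient is computed with one common definition that uses the training-set mean of y, which the abstract does not specify.

```python
import math

def z_to_1_split(y, z=3):
    """'Z:1' split: sort compounds by response y, send every z-th to the test set."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    train, test = [], []
    for rank, idx in enumerate(order, start=1):
        (test if rank % z == 0 else train).append(idx)
    return train, test

def kennard_stone(X, n_select):
    """Classic Kennard-Stone selection in descriptor space (Euclidean distance).

    Seeds with the two most distant compounds, then repeatedly adds the
    compound whose nearest already-selected neighbour is farthest away.
    Returns (selected indices, remaining indices).
    """
    n = len(X)
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    i0, j0 = max(pairs, key=lambda p: dist[p[0]][p[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_select and remaining:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining

def rmsep(y_ext, y_pred):
    """Root mean square error of prediction over the external (test) set."""
    press = sum((o - p) ** 2 for o, p in zip(y_ext, y_pred))
    return math.sqrt(press / len(y_ext))

def q2_ext(y_ext, y_pred, y_train_mean):
    """External validation coefficient (assumed definition: 1 - PRESS / SS,
    with SS taken around the training-set mean of y)."""
    press = sum((o - p) ** 2 for o, p in zip(y_ext, y_pred))
    ss = sum((o - y_train_mean) ** 2 for o in y_ext)
    return 1.0 - press / ss

# Toy data, invented purely for illustration.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_idx, test_idx = z_to_1_split(y, z=3)

X = [(0.0,), (1.0,), (4.0,), (9.0,), (10.0,)]
ks_train, ks_test = kennard_stone(X, n_select=3)
```

Under this assumed definition, the denominator of the external validation coefficient depends on how the few test compounds are spread around the training mean, whereas RMSEP has no such term. That is one plausible reading of why the abstract reports the coefficient as the more split-sensitive statistic for small test sets.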

    Perspectives from the NanoSafety Modelling Cluster on the validation criteria for (Q)SAR models used in nanotechnology

    Nanotechnology and the production of nanomaterials have been expanding rapidly in recent years. Since many types of engineered nanoparticles are suspected to be toxic to living organisms and to have a negative impact on the environment, the design of new nanoparticles and their applications must be accompanied by a thorough exposure risk analysis. (Quantitative) Structure-Activity Relationship ([Q]SAR) modelling is a promising option among the methods available for risk assessment. These in silico models can be used to predict a variety of properties, including the toxicity of newly designed nanoparticles. However, (Q)SAR models must be appropriately validated to ensure the clarity, consistency and reliability of predictions. This paper is a joint initiative from recently completed European research projects focused on developing (Q)SAR methodology for nanomaterials. The aim was to interpret and expand the guidance for the well-known “OECD Principles for the Validation, for Regulatory Purposes, of (Q)SAR Models”, with reference to nano-(Q)SAR, and to present our opinions on the criteria to be fulfilled by models developed for nanoparticles.

    Computational nanotoxicology: challenges and perspectives


    A machine learning q-RASPR approach for efficient predictions of the specific surface area of perovskites

    In this study, the specific surface area of various perovskites was modeled using a novel quantitative read-across structure-property relationship (q-RASPR) approach, which combines Read-Across (RA) with quantitative structure-property relationship (QSPR) modelling. After optimization of the hyper-parameters, certain similarity-based error measures were obtained for each query compound. By combining some of these error-based measures with the previously selected features and the Read-Across prediction function, a number of machine learning models were developed using Partial Least Squares (PLS), ridge regression (RR), linear support vector regression (LSVR), and random forest (RF) regression. Based on external prediction quality and interpretability, the PLS model was selected as the best predictor, underscoring the previously reported results. The finally selected model should efficiently predict the specific surface areas of other perovskites for their use in photocatalysis. The new q-RASPR method also appears promising for the prediction of several other property endpoints of interest in materials science.
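
The read-across step described in this abstract can be illustrated with a small sketch. Everything specific here is an assumption: the abstract does not state the similarity function, the number of neighbours, or which error measures were used, so a Gaussian kernel, the k most similar training compounds, and a similarity-weighted standard deviation stand in for them, and the function names are hypothetical.

```python
import math

def gaussian_similarity(a, b, sigma=1.0):
    """Assumed similarity: Gaussian kernel on the Euclidean descriptor distance."""
    return math.exp(-math.dist(a, b) ** 2 / (2.0 * sigma ** 2))

def read_across(query, X_train, y_train, k=3, sigma=1.0):
    """Similarity-weighted prediction from the k most similar training compounds.

    Returns (prediction, weighted standard deviation of the neighbours'
    responses); the second value is a stand-in for a similarity-based
    error measure. Assumes at least one neighbour has non-zero similarity.
    """
    sims = [gaussian_similarity(query, x, sigma) for x in X_train]
    nearest = sorted(range(len(X_train)), key=lambda i: sims[i], reverse=True)[:k]
    weights = [sims[i] for i in nearest]
    total = sum(weights)
    pred = sum(w * y_train[i] for w, i in zip(weights, nearest)) / total
    var = sum(w * (y_train[i] - pred) ** 2 for w, i in zip(weights, nearest)) / total
    return pred, math.sqrt(var)

# Toy descriptors and responses, invented purely for illustration.
X_train = [(0.0,), (2.0,), (10.0,)]
y_train = [1.0, 3.0, 100.0]
ra_pred, ra_sd = read_across((1.0,), X_train, y_train, k=2)
```

In a q-RASPR-style workflow, values such as `ra_pred` and `ra_sd` would be appended to the selected descriptors as extra columns before fitting a regression model such as PLS; that final modelling step is not shown here.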

    Modeling adsorption of brominated, chlorinated and mixed bromo/chloro-dibenzo-p-dioxins on C60 fullerene using Nano-QSPR

    Many technological implementations in the field of nanotechnology have involved carbon nanomaterials, including fullerenes such as buckminsterfullerene, C60. The unprecedented properties of such organic nanomaterials (in particular their large surface area) have attracted extensive attention for their potential use as sorbents for organic pollutants. Sorption interactions can be very hazardous and useful at the same time. This work investigates how halogenation of dibenzo-p-dioxins by bromine and/or chlorine influences their ability to sorb on the C60 fullerene surface. Halogenated dibenzo-p-dioxins (PXDDs, where X = Br or Cl) are ever-present in the environment and are accidentally produced in many technological processes in quantities that are only approximately known. If all combinatorial Br and/or Cl substitution possibilities are present in the environment, the experimental characterization and investigation of sorbent effectiveness is extremely difficult. In this work, we have developed a quantitative structure–property relationship (QSPR) model (R² = 0.998) that predicts the adsorption energy [kcal/mol] for 1,701 PXDDs adsorbed on C60 (PXDD@C60). Based on the QSPR model reported herein, we concluded that the lowest-energy PXDD@C60 complexes are those that the World Health Organization (WHO) considers to be less dangerous with respect to the aryl hydrocarbon receptor (AhR) toxicity mechanism. Therefore, the effectiveness of fullerenes as sorbent agents may be overestimated, as sorption could be less effective for toxic congeners than previously believed.
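
The size of the combinatorial space mentioned in this abstract can be checked by brute force. The sketch below enumerates every assignment of H, Cl or Br to the eight substitutable ring positions of dibenzo-p-dioxin and merges assignments that are related by the symmetry of the planar skeleton. The function name is hypothetical, and the count includes the unsubstituted parent compound; the result coincides with the 1,701 structures quoted above, although whether the authors counted them this way is an assumption.

```python
from itertools import product

# Substitutable ring positions of dibenzo-p-dioxin (standard numbering).
POSITIONS = (1, 2, 3, 4, 6, 7, 8, 9)

# Position permutations induced by the symmetry of the planar skeleton.
SYMMETRIES = (
    {p: p for p in POSITIONS},                          # identity
    {1: 4, 2: 3, 3: 2, 4: 1, 6: 9, 7: 8, 8: 7, 9: 6},   # swaps the two oxygen sides
    {1: 9, 2: 8, 3: 7, 4: 6, 6: 4, 7: 3, 8: 2, 9: 1},   # swaps the two benzene rings
    {1: 6, 2: 7, 3: 8, 4: 9, 6: 1, 7: 2, 8: 3, 9: 4},   # both swaps combined
)

def count_congeners(substituents):
    """Count substitution patterns that are distinct under the skeleton symmetry."""
    seen = set()
    for pattern in product(substituents, repeat=len(POSITIONS)):
        by_pos = dict(zip(POSITIONS, pattern))
        # Canonical form: the smallest of the four symmetry-equivalent patterns.
        canonical = min(
            tuple(by_pos[sym[p]] for p in POSITIONS) for sym in SYMMETRIES
        )
        seen.add(canonical)
    return len(seen)

n_br_cl = count_congeners(("H", "Cl", "Br"))  # parent + all Br/Cl patterns
n_cl_only = count_congeners(("H", "Cl"))      # parent + chlorinated patterns
```

With chlorine alone the same routine gives 76 patterns, i.e. the parent compound plus the 75 chlorinated congeners usually quoted for this class, which serves as a sanity check on the symmetry table.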