53 research outputs found

    Investigating the influence of data splitting on the predictive ability of QSAR/QSPR models

    The study investigated how the method of splitting data into a training set and a test set influences the external predictivity of quantitative structure-activity and/or structure-property relationship (QSAR/QSPR) models. Six models of good quality were collected from the literature and then redeveloped and validated using five alternative splitting algorithms: (i) a commonly used algorithm ('Z:1'), in which the compounds are sorted in ascending order of their response values (y) and every zth (e.g. third) compound is selected for the test set; (ii-iv) three variations of the Kennard-Stone algorithm; and (v) the duplex algorithm. The external validation statistics reported for each model served as the basis for the final comparison. We demonstrated that splitting techniques that use the values of molecular descriptors alone (X) or in combination with the model response (y) always lead to models with better external predictivity than those designed with methodologies based on the y-values only. Moreover, we showed that the external validation coefficient (Q²EXT) is more sensitive to the splitting technique than the root mean square error of prediction (RMSEP). This difference becomes especially important when the test set is relatively small (5-10 compounds). For models trained/validated with a small number of compounds, it is strongly recommended that both statistics (Q²EXT and RMSEP) be taken into account when evaluating external predictivity.
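    As a rough illustration of the 'Z:1' split and the two external statistics named in this abstract, the following sketch may help. It is not the authors' code. The function names are hypothetical, the choice to start at the zth sorted compound is an assumption, and the external validation coefficient is computed in one common form (1 minus the test-set prediction error sum of squares divided by the test-set sum of squares around the training mean); the abstract does not say which variant was used.

```python
import numpy as np

def z_to_one_split(y, z=3):
    """Sort compounds by response y (ascending) and send every z-th one
    to the test set. Returns (train_indices, test_indices)."""
    order = np.argsort(y, kind="stable")
    test_mask = np.zeros(len(y), dtype=bool)
    test_mask[z - 1::z] = True  # sorted positions z, 2z, 3z, ... (assumed start)
    return order[~test_mask], order[test_mask]

def rmsep(y_obs, y_pred):
    """Root mean square error of prediction on the test set."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))

def q2_ext(y_obs, y_pred, y_train_mean):
    """External validation coefficient, assumed form:
    1 - sum((y - y_hat)^2) / sum((y - mean(y_train))^2) over the test set."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    press = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_train_mean) ** 2)
    return float(1.0 - press / ss_tot)

# Made-up response values, for illustration only.
y = np.array([5.0, 1.0, 4.0, 2.0, 3.0, 6.0])
train_idx, test_idx = z_to_one_split(y, z=3)
print("train:", train_idx, "test:", test_idx)
```

    With only a handful of test compounds, the denominator of the coefficient rests on very few terms, which is one plausible reason for it to react more strongly to the split than RMSEP does.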

    Novel approach for efficient predictions properties of large pool of nanomaterials based on limited set of species: nano-read-across

    Creating suitable chemical categories and developing read-across methods, supported by quantum mechanical calculations, can be an effective solution to key problems related to the current scarcity of data on the toxicity of various nanoparticles. This study has demonstrated that by applying a nano-read-across, the cytotoxicity of nano-sized metal oxides could be estimated with a level of accuracy similar to that provided by quantitative structure-activity relationship models for nanomaterials (nano-QSAR models). The method presented is a suitable computational tool for the preliminary hazard assessment of nanomaterials. It could also be used to identify nanomaterials that may have a negative impact on human health and the environment. Such approaches are especially necessary when there is a paucity of relevant and reliable data points to develop and validate nano-QSAR models.

    Comparing the CORAL and random forest approaches for modelling the in vitro cytotoxicity of silica nanomaterials

    Nanotechnology is one of the most important technological developments of the twenty-first century. In silico methods such as quantitative structure-activity relationships (QSARs) to predict toxicity promote the safe-by-design approach for the development of new materials, including nanomaterials. In this study, a set of cytotoxicity experimental data corresponding to 19 data points for silica nanomaterials was investigated to compare the widely employed CORAL and Random Forest approaches in terms of their usefulness for developing so-called “nano-QSAR” models. “External” leave-one-out cross-validation (LOO) analysis was performed to validate the two approaches. An analysis of variable importance measures and signed feature contributions for both algorithms was undertaken in order to interpret the models developed. CORAL showed a more pronounced difference between the average coefficient of determination (R²) for training and for LOO (0.83 and 0.65, respectively) than Random Forest did (0.87 and 0.78 without bootstrap sampling, 0.90 and 0.78 with bootstrap sampling), which may be due to overfitting. Among the nanomaterials’ physico-chemical properties, the aspect ratio and zeta potential were found to be the two most important variables for the Random Forest, and the average feature contributions calculated for the corresponding descriptors were consistent with the clear trends observed in the dataset: less negative zeta potential values and lower aspect ratio values were associated with higher cytotoxicity. In contrast, CORAL failed to capture these trends.
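    The comparison of training and leave-one-out R² described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: an ordinary least-squares model stands in for CORAL and Random Forest (neither is implemented here), the data are synthetic, and all function names are hypothetical. Only the sample size of 19 is taken from the abstract.

```python
import numpy as np

def r2(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - np.mean(y_obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def fit_linear(X, y):
    """Stand-in model: least squares with an intercept (an assumption)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def loo_predictions(X, y, fit, predict):
    """Refit the model n times, each time predicting the single
    compound that was left out of the fit."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = fit(X[keep], y[keep])
        preds[i] = predict(model, X[i:i + 1])[0]
    return preds

# Synthetic data: 19 points, two made-up descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(19, 2))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.3, size=19)

r2_train = r2(y, predict_linear(fit_linear(X, y), X))
r2_loo = r2(y, loo_predictions(X, y, fit_linear, predict_linear))
print(f"R2 training: {r2_train:.2f}, R2 LOO: {r2_loo:.2f}")
```

    A large gap between the two values is the kind of signal the abstract interprets as possible overfitting; any other model can be plugged in by passing a different `fit` and `predict` pair.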

    Scanning electron microscopy image representativeness: morphological data on nanoparticles.

    A sample of a nanomaterial contains a distribution of nanoparticles of various shapes and/or sizes. A scanning electron microscopy image of such a sample often captures only a fragment of the morphological variety present in the sample. In order to quantitatively analyse the sample using scanning electron microscope digital images, and, in particular, to derive numerical representations of the sample morphology, image content has to be assessed. In this work, we present a framework for extracting morphological information contained in scanning electron microscopy images using computer vision algorithms, and for converting them into numerical particle descriptors. We explore the concept of image representativeness and provide a set of protocols for selecting optimal scanning electron microscopy images as well as determining the smallest representative image set for each of the morphological features. We demonstrate the practical aspects of our methodology by investigating tricalcium phosphate, Ca₃(PO₄)₂, and calcium hydroxyphosphate, Ca₅(PO₄)₃(OH), both naturally occurring minerals with a wide range of biomedical applications.

    Comparing the CORAL and Random Forest approaches for modelling the in vitro cytotoxicity of silica nanomaterials.

    Nanotechnology is one of the most important technological developments of the 21st century. In silico methods to predict toxicity, such as quantitative structure-activity relationships (QSARs), promote the safe-by-design approach for the development of new materials, including nanomaterials. In this study, a set of cytotoxicity experimental data corresponding to 19 data points for silica nanomaterials was investigated to compare the widely employed CORAL and Random Forest approaches in terms of their usefulness for developing so-called 'nano-QSAR' models. 'External' leave-one-out cross-validation (LOO) analysis was performed to validate the two approaches. An analysis of variable importance measures and signed feature contributions for both algorithms was undertaken in order to interpret the models developed. CORAL showed a more pronounced difference between the average coefficient of determination (R²) for training and for LOO (0.83 and 0.65, respectively) than Random Forest did (0.87 and 0.78 without bootstrap sampling, 0.90 and 0.78 with bootstrap sampling), which may be due to overfitting. With regard to the physicochemical properties of the nanomaterials, the aspect ratio and zeta potential were found to be the two most important variables for Random Forest, and the average feature contributions calculated for the corresponding descriptors were consistent with the clear trends observed in the data set: less negative zeta potential values and lower aspect ratio values were associated with higher cytotoxicity. In contrast, CORAL failed to capture these trends.

    Dimensionality of Carbon Nanomaterials Determines the Binding and Dynamics of Amyloidogenic Peptides: Multiscale Theoretical Simulations

    Experimental studies have demonstrated that nanoparticles can affect the rate of protein self-assembly, possibly interfering with the development of protein misfolding diseases such as Alzheimer's, Parkinson's and prion disease, which are caused by aggregation and fibril formation of amyloid-prone proteins. We employ classical molecular dynamics simulations and large-scale density functional theory calculations to investigate the effects of nanomaterials on the structure, dynamics and binding of an amyloidogenic peptide, apoC-II(60-70). We show that the binding affinity of this peptide to carbonaceous nanomaterials such as C60, nanotubes and graphene decreases with increasing nanoparticle curvature. Strong binding is facilitated by the large contact area available for π-stacking between the aromatic residues of the peptide and the extended surfaces of graphene and the nanotube. The highly curved fullerene surface exhibits reduced efficiency for π-stacking but promotes increased peptide dynamics. We postulate that the increase in conformational dynamics of the amyloid peptide can be unfavorable for the formation of fibril-competent structures. In contrast, extended fibril-forming peptide conformations are promoted by the nanotube and graphene surfaces, which can provide a template for fibril growth.

    Grouping of nanomaterials to read-across hazard endpoints: from data collection to assessment of the grouping hypothesis by application of chemoinformatic techniques

    An increasing number of manufactured nanomaterials (NMs) are being used in industrial products and need to be registered under the REACH legislation. The hazard characterisation of all these forms is not only technically challenging but also resource- and time-demanding. The use of non-testing strategies such as read-across is deemed essential to ensure that all NMs are assessed in due time and at lower cost. Because read-across is based on the structural similarity of substances, NMs pose an additional difficulty, as in general their structure is not unequivocally defined. In such a scenario, identifying the physicochemical properties that affect the hazard potential of NMs is crucial for defining a grouping hypothesis and predicting the toxicological hazards of similar NMs. In order to promote the read-across of NMs, ECHA has recently published “Recommendations for nanomaterials applicable to the guidance on QSARs and Grouping”, but no practical examples were provided in the document. Due to the lack of publicly available data and the inherent difficulties of read-across for NMs, only a few examples of read-across of NMs can be found in the literature. This manuscript presents the first case study of the practical process of grouping and read-across of NMs following the workflow proposed by ECHA. The workflow was used, with slight modifications, to present the read-across case study. The Read-Across Assessment Framework (RAAF) was used to evaluate the uncertainties of a read-across within NMs. Chemoinformatic techniques were used to support the grouping hypothesis and identify key physicochemical properties. A dataset of 6 nanoforms of TiO2, with more than 100 physicochemical properties each, was collected. The in vitro comet assay result was selected as the endpoint for read-across because of data availability. A correlation between the presence of coating or large amounts of impurities and negative comet assay results was observed. The workflow proposed by ECHA for the read-across of NMs was applied successfully. Chemoinformatic techniques were shown to provide key evidence for the assessment of the grouping hypothesis and the definition of similar NMs. The RAAF was found to be applicable to NMs.