83 research outputs found

    Modeling of the Acute Toxicity of Benzene Derivatives by Complementary QSAR Methods

    A data set containing acute toxicity values (96-h LC50) of 69 substituted benzenes for fathead minnow (Pimephales promelas) was investigated with two Quantitative Structure-Activity Relationship (QSAR) models, one that does not use molecular descriptors and one that does. Recursive Neural Networks (RNN) derive a QSAR by direct treatment of the molecular structure, described through an appropriate graphical tool (variable-size labeled rooted ordered trees) by defining suitable representation rules. The input trees are encoded by an adaptive process able to learn, by tuning its free parameters, from a given set of structure-activity training examples. Owing to the use of a flexible encoding approach, the model is target invariant and does not need a priori definition of molecular descriptors. The results obtained in this study were analyzed together with those of a model based on molecular descriptors, i.e. a Multiple Linear Regression (MLR) model using CROatian MultiRegression selection of descriptors (CROMRsel). The comparison revealed interesting similarities that could lead to the development of a combined approach, exploiting the complementary characteristics of the two approaches.

    Classification of Hungarian medieval silver coins using x-ray fluorescent spectroscopy and multivariate data analysis

    A set of silver coins from the collection of Déri Museum Debrecen (Hungary) was examined by X-ray fluorescent elemental analysis with the aim of assigning the coins to different groups with the best possible precision based on the acquired chemical information, and of building models that arrange the coins according to their historical periods. Results: Principal component analysis, linear discriminant analysis, partial least squares discriminant analysis, classification and regression trees, and multivariate curve resolution with alternating least squares were applied to reveal dominant patterns in the data and classify the coins into several groups. We also identified those chemical components that are present in small percentages but are useful for the classification of the coins. With the coins divided into two groups according to adequate historical periods, we have obtained a correct classification (76-78%) based on the chemical compositions. Conclusions: X-ray fluorescent elemental analysis together with multivariate data analysis methods is suitable for grouping medieval coins according to historical periods. Keywords: X-ray fluorescence spectroscopy, Multivariate techniques, Coin, Silver, Middle Ages

    Apportionment and districting by Sum of Ranking Differences

    Sum of Ranking Differences is an innovative statistical method that ranks competing solutions based on a reference point. The latter might arise naturally, or can be aggregated from the data. We provide two case studies to feature both possibilities. Apportionment and districting are two critical issues that emerge in relation to democratic elections. Theoreticians invented clever heuristics to measure malapportionment and the compactness of the shape of the constituencies, yet there is no unique best method in either case. Using data from Norway and the US, we rank the standard methods both for the apportionment and for the districting problem. In the case of apportionment, we find that all the classical methods perform reasonably well, with subtle but significant differences. By a small margin the Leximin method emerges as a winner, but, somewhat unexpectedly, the non-regular Imperiali method ties for first place. In districting, the Lee-Sallee index and a novel parametric method, the so-called Moment Invariant, perform the best, although the latter is sensitive to the function’s chosen parameter.
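    The basic idea described above, ranking each competing solution against a reference ranking and summing the absolute rank differences, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the choice of the row mean as the aggregated reference are assumptions, and any normalization or validation step of the full method is omitted.

```python
import numpy as np


def rank_average_ties(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def sum_of_ranking_differences(data, reference=None):
    """Return one SRD value per column (competing solution).

    data: rows are objects, columns are the solutions to compare.
    reference: a natural reference column if one exists; otherwise the
    row mean is used as an aggregated reference (an assumption here).
    """
    data = np.asarray(data, dtype=float)
    if reference is None:
        reference = data.mean(axis=1)
    ref_ranks = rank_average_ties(reference)
    return np.array([
        np.abs(rank_average_ties(data[:, j]) - ref_ranks).sum()
        for j in range(data.shape[1])
    ])


# Hypothetical toy data: 4 objects scored by 3 competing methods.
toy = [[1, 1, 4],
       [2, 3, 3],
       [3, 2, 2],
       [4, 4, 1]]

# Case 1: a reference that "arises naturally" (known values).
print(sum_of_ranking_differences(toy, reference=[1, 2, 3, 4]))
# Case 2: a reference aggregated from the data (row mean).
print(sum_of_ranking_differences(toy))
```

    A smaller value means the solution's ranking is closer to the reference; a value of zero means the two rankings coincide.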

    Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters

    Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.

    Characterization of hybrid materials by means of inverse gas chromatography and chemometrics

    The surface properties of hybrid materials (potential carriers for sustained release of active agents) have been examined by inverse gas chromatography (IGC). A nonsteroidal anti-inflammatory agent, ibuprofen, was used as a model active compound. The following parameters have been used to characterize the interactions between the constituents of the hybrid material and the active agent: the dispersive component of the surface free energy, γ_S^D; the K_A and K_D parameters, describing the acidity and basicity, respectively; and the Flory-Huggins parameter χ'_23 (the magnitude of interactions). Principal component analysis (PCA) and the procedure based on sum of ranking differences (SRD) were applied for selection of hybrid materials and parameters for characterization of these materials. One loose cluster found by PCA grouping of hybrid materials is refined by SRD analysis: SRD grouping indicates three groups having somewhat dissimilar properties.

    Estimation of influential points in any data set from coefficient of determination and its leave-one-out cross-validated counterpart

    The coefficient of determination (R²) and its leave-one-out cross-validated analogue (denoted by Q² or R²_cv) are the most frequently published values to characterize the predictive performance of models. In this article we use R² and Q² in a reversed aspect to determine uncommon points, i.e. influential points in any data set. The term (1 - Q²)/(1 - R²) corresponds to the ratio of the predictive residual sum of squares and the residual sum of squares. The ratio correlates with the number of influential points in experimental and random data sets. We propose an (approximate) F test on the (1 - Q²)/(1 - R²) term to quickly pre-estimate the presence of influential points in training sets of models. The test is founded upon the routinely calculated Q² and R² values and warns model builders to verify the training set, to perform influence analysis, or even to change to robust modeling. © 2013 Springer Science+Business Media Dordrecht
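    The relation between the (1 - Q2)/(1 - R2) term and the ratio of the predictive residual sum of squares (PRESS) to the residual sum of squares (RSS) can be checked numerically. The sketch below assumes an ordinary least-squares model with an intercept and uses the standard leave-one-out shortcut e_i / (1 - h_ii), where h_ii is the leverage of point i. The function name and the toy data are hypothetical, and the F test itself is not implemented because the abstract does not give its degrees of freedom.

```python
import numpy as np


def r2_q2_ratio(X, y):
    """Return (R2, Q2, PRESS/RSS) for an OLS fit with intercept.

    Assumes no point has leverage exactly 1 (otherwise the
    leave-one-out residual is undefined).
    """
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    # Diagonal of the hat matrix H = A A^+ (the leverages h_ii).
    leverage = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    loo_residuals = residuals / (1.0 - leverage)
    rss = np.sum(residuals ** 2)
    press = np.sum(loo_residuals ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss, 1.0 - press / tss, press / rss


# Hypothetical toy data: a noisy line, then the same data with the
# last point shifted to mimic an influential point.
x = np.arange(10, dtype=float)
noise = np.array([0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, 0.05, -0.15, 0.1])
y = 1.0 + 2.0 * x + noise
y_shifted = y.copy()
y_shifted[-1] += 5.0

for label, target in [("clean", y), ("shifted", y_shifted)]:
    r2, q2, ratio = r2_q2_ratio(x, target)
    print(label, "R2 =", r2, "Q2 =", q2,
          "PRESS/RSS =", ratio, "(1-Q2)/(1-R2) =", (1 - q2) / (1 - r2))
```

    Because Q2 and R2 share the same total sum of squares in this definition, the two printed ratios agree up to rounding, and the ratio is never below 1 for such a fit.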