83 research outputs found
Modeling of the Acute Toxicity of Benzene Derivatives by Complementary QSAR Methods
A data set containing acute toxicity values (96-h LC50) of 69 substituted benzenes for
fathead minnow (Pimephales promelas) was investigated with two Quantitative Structure-
Activity Relationship (QSAR) models, one that uses molecular descriptors and one that does
not. Recursive Neural Networks (RNN) derive a QSAR by direct treatment of the
molecular structure, described through an appropriate graphical tool (variable-size labeled
rooted ordered trees) by defining suitable representation rules. The input trees are encoded by
an adaptive process able to learn, by tuning its free parameters, from a given set of structure-activity
training examples. Owing to the use of a flexible encoding approach, the model is
target invariant and does not need a priori definition of molecular descriptors. The results
obtained in this study were analyzed together with those of a model based on molecular
descriptors, i.e. a Multiple Linear Regression (MLR) model using CROatian MultiRegression
selection of descriptors (CROMRsel). The comparison revealed interesting similarities that
could lead to the development of a combined approach, exploiting the complementary
characteristics of the two approaches.
Classification of Hungarian medieval silver coins using x-ray fluorescent spectroscopy and multivariate data analysis
A set of silver coins from the collection of Déri Museum Debrecen (Hungary) was examined by X-ray
fluorescent elemental analysis with the aim of assigning the coins to different groups with the best possible precision
based on the acquired chemical information, and of building models that arrange the coins according to their
historical periods.
Results: Principal component analysis, linear discriminant analysis, partial least squares discriminant analysis,
classification and regression trees and multivariate curve resolution with alternating least squares were applied to
reveal dominant patterns in the data and classify the coins into several groups. We also identified those chemical
components that are present in small percentages but are useful for the classification of the coins. With the
coins divided into two groups according to the appropriate historical periods, we obtained a correct classification
rate of 76-78% based on the chemical compositions.
Conclusions: X-ray fluorescent elemental analysis together with multivariate data analysis methods is suitable to
group medieval coins according to historical periods.
Keywords: X-ray fluorescence spectroscopy, Multivariate techniques, Coin, Silver, Middle age
Apportionment and districting by Sum of Ranking Differences
Sum of Ranking Differences is an innovative statistical method that ranks competing solutions based on a reference point. The latter might arise naturally, or can be aggregated from
the data. We provide two case studies to feature both possibilities. Apportionment and districting are two critical issues that emerge in relation to democratic elections. Theoreticians
invented clever heuristics to measure malapportionment and the compactness of the shape
of the constituencies, yet there is no unique best method in either case. Using data from
Norway and the US we rank the standard methods both for the apportionment and for the
districting problem. In the case of apportionment, we find that all the classical methods perform
reasonably well, with subtle but significant differences. By a small margin the Leximin
method emerges as a winner, but—somewhat unexpectedly—the non-regular Imperiali
method ties for first place. In districting, the Lee-Sallee index and a novel parametric method,
the so-called Moment Invariant, perform best, although the latter is sensitive to the function’s chosen parameter.
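The ranking idea described in this abstract can be illustrated with a minimal sketch. It assumes the basic form of the procedure: each competing method and the reference are ranked over the same set of objects, and a method's score is the sum of absolute differences between its ranks and the reference ranks. The function names, the use of the row mean as the aggregated reference, and the toy numbers are assumptions made for illustration; they are not taken from the paper, and any normalization or validation step is omitted.

```python
from statistics import mean


def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def sum_of_ranking_differences(columns, reference=None):
    """columns maps a method name to its values over the same objects.

    If no reference is given, the row-wise mean over all methods is used
    (an assumed aggregation; other choices are possible).
    """
    names = list(columns)
    n_objects = len(columns[names[0]])
    if reference is None:
        reference = [mean(columns[name][i] for name in names) for i in range(n_objects)]
    ref_ranks = average_ranks(reference)
    return {
        name: sum(abs(r - ref) for r, ref in zip(average_ranks(columns[name]), ref_ranks))
        for name in names
    }


# Hypothetical toy data: three methods scored on four objects.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [1.1, 1.9, 3.2, 3.9],
    "C": [4.0, 3.0, 2.0, 1.0],
}
print(sum_of_ranking_differences(toy))
```

A smaller score means the method orders the objects more like the reference does; a score of zero means the two orderings are identical.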
Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters
Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
Generalized Pairwise Correlation and method comparison: Impact assessment for JAR attributes on overall liking
Quantitative determination and classification of energy drinks using near-infrared spectroscopy
Characterization of hybrid materials by means of inverse gas chromatography and chemometrics
The surface properties of hybrid materials (potential carriers for sustained release of active
agents) have been examined by inverse gas chromatography (IGC). A nonsteroidal anti-inflammatory
agent, ibuprofen, was used as a model active compound. The following
parameters have been used to characterize the interactions between the constituents of the
hybrid material and the active agent: the dispersive component of the surface free energy, $\gamma_S^D$;
the $K_A$ and $K_D$ parameters describing the acidity and basicity, respectively; and the Flory-Huggins
parameter $\chi'_{23}$ (the magnitude of interactions). Principal component analysis (PCA) and the
procedure based on sum of ranking differences (SRD) were applied for selection of hybrid
materials and parameters for characterization of these materials. One loose cluster found by
PCA grouping of hybrid materials is refined by SRD analysis: SRD grouping indicates three
groups having somewhat dissimilar properties.
Estimation of influential points in any data set from coefficient of determination and its leave-one-out cross-validated counterpart
The coefficient of determination (R²) and its leave-one-out cross-validated analogue (denoted by Q² or R²cv) are the most frequently published values used to characterize the predictive performance of models. In this article we use R² and Q² in a reversed role, to detect uncommon points, i.e. influential points, in any data set. The term (1 - Q²)/(1 - R²) corresponds to the ratio of the predictive residual sum of squares to the residual sum of squares. The ratio correlates with the number of influential points in experimental and random data sets. We propose an (approximate) F test on the (1 - Q²)/(1 - R²) term to quickly pre-estimate the presence of influential points in the training sets of models. The test is founded upon the routinely calculated Q² and R² values and warns model builders to verify the training set, to perform influence analysis, or even to change to robust modeling. © 2013 Springer Science+Business Media Dordrecht
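The ratio named in this abstract can be computed directly, as the following minimal sketch shows. It assumes a one-predictor ordinary least squares line and the common convention in which both R² and Q² are referenced to the total sum of squares around the overall mean, so that (1 - Q²)/(1 - R²) equals PRESS/RSS. The function names and the toy data are hypothetical. The F test itself is not implemented, because the abstract does not state its degrees of freedom.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_q2_ratio(x, y):
    """Return (R2, leave-one-out Q2, (1 - Q2) / (1 - R2)) for a straight-line model.

    Assumes the fit is not perfect (RSS > 0); otherwise the ratio is undefined.
    """
    slope, intercept = fit_line(x, y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

    # PRESS: refit without point i, then predict the left-out point.
    press = 0.0
    for i in range(len(x)):
        s, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + b)) ** 2

    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - rss / tss
    q2 = 1.0 - press / tss
    return r2, q2, (1.0 - q2) / (1.0 - r2)


# Hypothetical toy data: the second set ends with a high-leverage point off the trend.
clean = r2_q2_ratio([1, 2, 3, 4, 5, 6], [1.1, 1.9, 3.2, 3.9, 5.1, 6.0])
suspect = r2_q2_ratio([1, 2, 3, 4, 5, 20], [1.1, 1.9, 3.2, 3.9, 5.1, 10.0])
print("clean ratio:", clean[2])
print("suspect ratio:", suspect[2])
```

In this toy case the ratio is markedly larger for the data set containing the high-leverage point, which is the behaviour the abstract relies on when it uses the ratio as a warning sign.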
- …