An adapted version of the element-wise weighted total least squares method for applications in chemometrics
The Maximum Likelihood PCA (MLPCA) method has been devised in chemometrics as a generalization of the well-known PCA method in order to derive consistent estimators in the presence of errors with a known error distribution. For similar reasons, the Total Least Squares (TLS) method has been generalized in the fields of computational mathematics and engineering to maintain consistency of the parameter estimates in linear models with measurement errors of known distribution. In a previous paper [M. Schuermans, I. Markovsky, P.D. Wentzell, S. Van Huffel, On the equivalence between total least squares and maximum likelihood PCA, Anal. Chim. Acta, 544 (2005), 254–267], the tight equivalences between MLPCA and Element-wise Weighted TLS (EW-TLS) were explored. The purpose of this paper is to adapt the EW-TLS method in order to make it useful for problems in chemometrics. We present a computationally efficient algorithm and compare it with the standard EW-TLS algorithm and the MLPCA algorithm in terms of computation time and convergence behaviour on chemical data.
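As a rough illustration of why TLS differs from ordinary least squares when both variables carry measurement error, the following minimal sketch fits a straight line by plain, unweighted TLS (orthogonal regression) using its closed-form slope. It is not the EW-TLS or MLPCA algorithm from the paper: the element-wise weighting is omitted entirely, and the function names and sample data are assumptions made for illustration.

```python
import math


def _centered_sums(xs, ys):
    """Return means and centered sums of squares/products for paired data."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return x_mean, y_mean, sxx, syy, sxy


def fit_line_ols(xs, ys):
    """Ordinary least squares: assumes only y is noisy."""
    x_mean, y_mean, sxx, _, sxy = _centered_sums(xs, ys)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def fit_line_tls(xs, ys):
    """Unweighted TLS: minimizes orthogonal distances to the line.

    Assumes equal error variance in x and y (an assumption of this sketch).
    """
    x_mean, y_mean, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxy == 0:
        # Degenerate case (horizontal or vertical line) is not handled here.
        raise ValueError("sxy is zero; slope not handled in this sketch")
    diff = syy - sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return y_mean - slope * x_mean, slope


# Hypothetical data with noise in both coordinates.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.2, 2.8]
print("OLS (intercept, slope):", fit_line_ols(xs, ys))
print("TLS (intercept, slope):", fit_line_tls(xs, ys))
```

On noisy data the TLS slope is at least as steep as the OLS slope, because OLS attributes all error to y and is therefore biased toward zero when x is also noisy.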
Approximate Rank-Detecting Factorization of Low-Rank Tensors
We present an algorithm, AROFAC2, which detects the (CP-)rank of a degree 3
tensor and calculates its factorization into rank-one components. We provide
generative conditions for the algorithm to work and demonstrate on both
synthetic and real-world data that AROFAC2 can potentially outperform the
gold standard PARAFAC, over which it has the advantages that
it can intrinsically detect the true rank, avoids spurious components, and is
stable with respect to outliers and non-Gaussian noise.
Common and Distinct Components in Data Fusion
In many areas of science multiple sets of data are collected pertaining to
the same system. Examples are food products which are characterized by
different sets of variables, bio-processes which are on-line sampled with
different instruments, or biological systems of which different genomics
measurements are obtained. Data fusion is concerned with analyzing such sets of
data simultaneously to arrive at a global view of the system under study. One
of the upcoming areas of data fusion is exploring whether the data sets have
something in common or not. This gives insight into common and distinct
variation in each data set, thereby facilitating understanding the
relationships between the data sets. Unfortunately, research on methods to
distinguish common and distinct components is fragmented, both in terminology
and in methods: there is no common ground, which hampers comparing
methods and understanding their relative merits. This paper provides a unifying
framework for this subfield of data fusion by using rigorous arguments from
linear algebra. The most frequently used methods for distinguishing common and
distinct components are explained in this framework and some practical examples
are given of these methods in the areas of (medical) biology and food science.
Comment: 50 pages, 12 figures
Independent components in spectroscopic analysis of complex mixtures
We applied two methods of "blind" spectral decomposition (MILCA and SNICA) to
quantitative and qualitative analysis of UV absorption spectra of several
non-trivial mixture types. Both methods use the concept of statistical
independence and aim at the reconstruction of minimally dependent components
from a linear mixture. We examined mixtures of major ecotoxicants (aromatic and
polyaromatic hydrocarbons), amino acids and complex mixtures of vitamins in a
veterinary drug. Both MILCA and SNICA were able to recover concentrations and
individual spectra with minimal errors comparable with instrumental noise. In
most cases their performance was similar to or better than that of other
chemometric methods such as MCR-ALS, SIMPLISMA, RADICAL, JADE and FastICA.
These results suggest that the ICA methods used in this study are suitable for
real-life applications. Data used in this paper, along with simple MATLAB code
to reproduce the paper's figures, can be found at
http://www.klab.caltech.edu/~kraskov/MILCA/spectra
Comment: 22 pages, 4 tables, 6 figures
Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools
The feasibility of applying chemometric techniques together with multi-element analysis to classify grape seeds according to the vineyard soil of provenance was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least squares discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that the nonlinear methods, RF and SVM, perform best, with accuracy rates of up to 98% and 93%, respectively, and are therefore excellent tools for the classification of grapes.
Affiliation: Canizo, Brenda Vanina. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
Affiliation: Escudero, Leticia Belén. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
Affiliation: Pérez, María Belén. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
Affiliation: Pellerano, Roberto Gerardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Química Básica y Aplicada del Nordeste Argentino. Universidad Nacional del Nordeste. Facultad de Ciencias Exactas Naturales y Agrimensura. Instituto de Química Básica y Aplicada del Nordeste Argentino; Argentina
Affiliation: Wuilloud, Rodolfo German. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Química Básica y Aplicada del Nordeste Argentino. Universidad Nacional del Nordeste. Facultad de Ciencias Exactas Naturales y Agrimensura. Instituto de Química Básica y Aplicada del Nordeste Argentino; Argentina
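Of the supervised methods named in the grape seed abstract, k-NN is the simplest to sketch. The example below is a minimal illustration only: the two-element "concentration" vectors and the locality labels are invented, the function name is hypothetical, and none of it reproduces the study's data, preprocessing, or reported accuracies.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each Euclidean distance with its label, then sort by distance.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-element profiles for two invented localities.
train_points = [
    (1.0, 0.2), (1.1, 0.3), (0.9, 0.25),   # locality "A"
    (2.0, 1.1), (2.2, 0.9), (1.9, 1.0),    # locality "B"
]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (1.05, 0.22)))
print(knn_predict(train_points, train_labels, (2.1, 1.05)))
```

In practice the element concentrations would usually be scaled before distances are computed, since elements measured on very different scales would otherwise dominate the vote.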
New frontiers in thermal analysis: A TG/Chemometrics approach for postmortem interval estimation in vitreous humor
The coupling of thermogravimetric analysis (TG) with chemometrics is proposed as an innovative approach in thanatochemistry in order to develop a new analytical tool using thermal analysis for the characterization of vitreous humor. Vitreous samples were selected from medicolegal deaths which occurred in casualty and for which the death interval is known. Only hospital deaths with no metabolic disorders were included, and the precise time of death was certified by the treating physician. Samples were analyzed with a TG7 thermobalance, and principal component analysis was used to evaluate the results. The TG/Chemometrics outcomes show clearly distinct behavior according to the postmortem interval, leading to the conclusion that TG and chemometrics are capable of predicting the time since death using only a few microliters of vitreous, without any pretreatment and with an hour of analysis time.
Partial Least Squares: A Versatile Tool for the Analysis of High-Dimensional Genomic Data
Partial Least Squares (PLS) is a highly efficient statistical regression technique that is well suited for the analysis of high-dimensional genomic data. In this paper we review the theory and applications of PLS from both methodological and biological points of view. Focusing on microarray expression data, we provide a systematic comparison of the PLS approaches currently employed, and discuss problems as different as tumor classification, identification of relevant genes, survival analysis and modeling of gene networks.
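To make concrete why PLS remains usable when there are more variables than samples, here is a minimal sketch of single-response PLS (PLS1) with one latent component, written in plain Python. It is a textbook-style illustration and not any of the specific PLS variants compared in the review; the function names and the tiny data set are assumptions.

```python
import math


def pls1_one_component(X, y):
    """Fit PLS1 with a single latent component; return (intercept, coefs)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center predictors and response.
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X-space of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X and y have zero covariance; no component found")
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centered response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coefs = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coefs, x_means))
    return intercept, coefs


def pls1_predict(intercept, coefs, row):
    """Predict the response for one sample."""
    return intercept + sum(c * v for c, v in zip(coefs, row))


# Hypothetical data: 2 samples, 3 variables (more variables than samples).
X = [[1.0, 2.0, 3.0], [2.0, 0.0, 5.0]]
y = [1.0, 4.0]
intercept, coefs = pls1_one_component(X, y)
print([pls1_predict(intercept, coefs, row) for row in X])
```

Ordinary least squares has no unique solution for this 2-by-3 problem, whereas the PLS fit is well defined because it only regresses on a single score vector. Further components would be obtained by deflating X and repeating the same steps.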
