
    An adapted version of the element-wise weighted total least squares method for applications in chemometrics

    The Maximum Likelihood PCA (MLPCA) method has been devised in chemometrics as a generalization of the well-known PCA method in order to derive consistent estimators in the presence of errors with known error distribution. For similar reasons, the Total Least Squares (TLS) method has been generalized in the field of computational mathematics and engineering to maintain consistency of the parameter estimates in linear models with measurement errors of known distribution. In a previous paper [M. Schuermans, I. Markovsky, P.D. Wentzell, S. Van Huffel, On the equivalence between total least squares and maximum likelihood PCA, Anal. Chim. Acta, 544 (2005), 254–267], the tight equivalences between MLPCA and Element-wise Weighted TLS (EW-TLS) were explored. The purpose of this paper is to adapt the EW-TLS method to make it useful for problems in chemometrics. We present a computationally efficient algorithm and compare it with the standard EW-TLS algorithm and the MLPCA algorithm in terms of computation time and convergence behaviour on chemical data.

    Approximate Rank-Detecting Factorization of Low-Rank Tensors

    We present an algorithm, AROFAC2, which detects the (CP-)rank of a degree 3 tensor and calculates its factorization into rank-one components. We provide generative conditions for the algorithm to work and demonstrate on both synthetic and real-world data that AROFAC2 is a potentially better-performing alternative to the gold standard PARAFAC, over which it has the advantages that it can intrinsically detect the true rank, avoids spurious components, and is stable with respect to outliers and non-Gaussian noise.

    Common and Distinct Components in Data Fusion

    In many areas of science multiple sets of data are collected pertaining to the same system. Examples are food products which are characterized by different sets of variables, bio-processes which are sampled on-line with different instruments, or biological systems for which different genomics measurements are obtained. Data fusion is concerned with analyzing such sets of data simultaneously to arrive at a global view of the system under study. One of the upcoming areas of data fusion is exploring whether the data sets have something in common or not. This gives insight into common and distinct variation in each data set, thereby facilitating understanding of the relationships between the data sets. Unfortunately, research on methods to distinguish common and distinct components is fragmented, both in terminology and in methods: there is no common ground, which hampers comparing methods and understanding their relative merits. This paper provides a unifying framework for this subfield of data fusion by using rigorous arguments from linear algebra. The most frequently used methods for distinguishing common and distinct components are explained in this framework, and some practical examples of these methods are given in the areas of (medical) biology and food science.
    Comment: 50 pages, 12 figures

    Independent components in spectroscopic analysis of complex mixtures

    We applied two methods of "blind" spectral decomposition (MILCA and SNICA) to quantitative and qualitative analysis of UV absorption spectra of several non-trivial mixture types. Both methods use the concept of statistical independence and aim at the reconstruction of minimally dependent components from a linear mixture. We examined mixtures of major ecotoxicants (aromatic and polyaromatic hydrocarbons), amino acids and complex mixtures of vitamins in a veterinary drug. Both MILCA and SNICA were able to recover concentrations and individual spectra with minimal errors comparable with instrumental noise. In most cases their performance was similar to or better than that of other chemometric methods such as MCR-ALS, SIMPLISMA, RADICAL, JADE and FastICA. These results suggest that the ICA methods used in this study are suitable for real-life applications. The data used in this paper, along with simple MATLAB code to reproduce the paper's figures, can be found at http://www.klab.caltech.edu/~kraskov/MILCA/spectra
    Comment: 22 pages, 4 tables, 6 figures

    Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools

    The feasibility of applying chemometric techniques associated with multi-element analysis to classify grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least squares discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that the nonlinear methods, RF and SVM, perform best, with accuracy rates of up to 98% and 93%, respectively, and are therefore excellent tools for the classification of grapes.
    Affiliation: Canizo, Brenda Vanina. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
    Affiliation: Escudero, Leticia Belén. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
    Affiliation: Pérez, María Belén. Universidad Nacional de Cuyo. Facultad de Ciencias Exactas y Naturales. Laboratorio de Química Analítica para Investigación y Desarrollo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mendoza; Argentina
    Affiliation: Pellerano, Roberto Gerardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Química Básica y Aplicada del Nordeste Argentino. Universidad Nacional del Nordeste. Facultad de Ciencias Exactas Naturales y Agrimensura. Instituto de Química Básica y Aplicada del Nordeste Argentino; Argentina
    Affiliation: Wuilloud, Rodolfo German. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Química Básica y Aplicada del Nordeste Argentino. Universidad Nacional del Nordeste. Facultad de Ciencias Exactas Naturales y Agrimensura. Instituto de Química Básica y Aplicada del Nordeste Argentino; Argentina

    New frontiers in thermal analysis: A TG/Chemometrics approach for postmortem interval estimation in vitreous humor

    The coupling of thermogravimetric analysis (TG) with chemometrics is proposed as an innovative approach in thanatochemistry in order to develop a new analytical tool using thermal analysis for the characterization of vitreous humor. Vitreous samples were selected from the medicolegal deaths which occurred in casualty and where the death interval is known. Only hospital deaths with no metabolic disorders were taken, and the precise time of death was certified by the treating physician. Samples were analyzed with a TG7 thermobalance, and principal component analysis was used to evaluate the results. The TG/Chemometrics outcomes show clearly distinct behavior according to the postmortem interval, indicating that TG and chemometrics are capable of predicting the time since death using only a few microliters of vitreous, without any pretreatment and with an hour of analysis time.

    Partial Least Squares: A Versatile Tool for the Analysis of High-Dimensional Genomic Data

    Partial Least Squares (PLS) is a highly efficient statistical regression technique that is well suited for the analysis of high-dimensional genomic data. In this paper we review the theory and applications of PLS from both methodological and biological points of view. Focusing on microarray expression data, we provide a systematic comparison of the PLS approaches currently employed, and discuss problems as different as tumor classification, identification of relevant genes, survival analysis and modeling of gene networks.