7 research outputs found

    Multivariate Calibration Domain Adaptation with Unlabeled Data

    Multivariate calibration is about modeling the relationship between a substance's chemical profile and its spectrum (here, near-infrared) in order to predict the concentration of new samples with known spectra. However, these new samples are often measured under different conditions than the primary conditions; different instruments, instrument drift, and temperature all affect the measurement conditions. Domain adaptation (DA) methods force the model to ignore these differences in order to generate an accurate model for the new domain (secondary conditions). There are two fundamental DA processes that individual methods can be classified under. One augments a few samples from the secondary domain with chemical reference values (labels) to the primary data and the other augments only secondary spectra (unlabeled data). In this work, we compare two existing labeled DA methods and two existing unlabeled DA methods to two novel labeled methods and a novel unlabeled approach. Since DA methods require selection of hyperparameters, a model selection framework based on model diversity and prediction similarity (MDPS) is applied to the DA methods. Regardless of the DA method, the MDPS process is shown to select models more accurate than the first quartile of all models generated by the DA process in three near-infrared datasets.

    Harnessing Model Diversity and Prediction Similarity for Selecting Multivariate Calibration Tuning Parameters

    Spectral multivariate calibration offers a cost-effective mechanism to obtain sample analyte values of a substance (e.g. protein level). However, calibration requires varying one or more tuning parameters in order to identify the most accurate model. Model selection is particularly difficult for model updating, where spectral and reference information in both the original (primary) conditions and new (secondary) conditions are combined in order to better predict new spectra. Secondary situations can be new instruments, temperatures, or other conditions affecting the shape and magnitude of the spectra relative to the primary conditions and analyte values. This poster uses model diversity while maintaining similar analyte prediction values to choose a set of acceptable models. The model selection technique is tested across the calibration method partial least squares and four model updating methods: two require a small set of secondary samples with analyte values and two do not require the secondary analyte values (unlabeled data). Results are presented across a variety of datasets and conditions showing that the cosine of the angle between models, model vector 2-norms, and prediction differences are key to selecting models.
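The three quantities named in this abstract (cosine of the angle between models, model vector 2-norms, and prediction differences) can be computed as in the sketch below. This is only an illustration of the ingredients, not the authors' selection procedure: the function name `pairwise_model_measures`, the array layout, and the use of a root-mean-square prediction difference are all assumptions made here.

```python
import numpy as np

def pairwise_model_measures(B, X_new):
    """Compute diversity and similarity measures for a set of candidate models.

    B     : (n_models, n_wavelengths) array, one regression vector per row.
    X_new : (n_samples, n_wavelengths) array of spectra to predict.
    Both names and shapes are hypothetical choices for this sketch.
    """
    # 2-norm of each model (regression) vector
    norms = np.linalg.norm(B, axis=1)
    # Cosine of the angle between every pair of model vectors
    cosines = (B @ B.T) / np.outer(norms, norms)
    # Predictions of every model for every sample: (n_samples, n_models)
    preds = X_new @ B.T
    # Root-mean-square prediction difference between every pair of models
    diff = preds[:, :, None] - preds[:, None, :]
    pred_rms = np.sqrt(np.mean(diff ** 2, axis=0))
    return norms, cosines, pred_rms

# Toy example: three two-variable models and two spectra
B = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
X_new = np.array([[1.0, 1.0], [2.0, 0.0]])
norms, cosines, pred_rms = pairwise_model_measures(B, X_new)
```

In this toy case, models 0 and 2 point in the same direction (cosine of 1) but differ in norm, so they still predict different values. That is one reason a single measure may not be enough on its own.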

    Raman Spectroscopy and Fusion Classification to Identify Plastic Recyclables Targeting Microplastics

    Identification of plastic type for microplastic particles (size range of 0.001 mm – 5 mm) is vital to understand the sources and consequences of microplastics in the environment. Fourier-transform infrared and Raman spectroscopy are two dominating techniques used to identify microplastics. The most common method to identify microplastics with spectroscopic data is library searching, a process that utilizes search algorithms against digital databases containing spectra of various plastics. Presented in this study is a new method to utilize spectroscopic data called fusion classification. Fusion classification consists of merging multiple non-optimized classification methods (classifiers) to assign samples into categories (classes). The purpose of this study is to demonstrate the applicability of fusion classification to identify microplastics.
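One simple way to merge the outputs of several classifiers is a majority vote, sketched below. The abstract does not say which fusion rule the study uses, so the voting rule, the function name `fuse_votes`, and the tie handling are assumptions for illustration only.

```python
from collections import Counter

def fuse_votes(votes):
    """Fuse class labels from several classifiers for one sample.

    votes : list of labels, one per classifier (hypothetical input format).
    Returns the label with the most votes, or None when the top count is tied.
    """
    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None

# Three hypothetical classifiers label the same Raman spectrum
fused = fuse_votes(["PET", "PET", "HDPE"])
tied = fuse_votes(["PET", "HDPE"])
```

Returning `None` on a tie keeps an ambiguous sample visible instead of silently assigning it to a class.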

    Regularization Adaption Processes for Multivariate Calibration Maintenance

    In the field of chemometrics, an important issue in multivariate calibration is model updating. Model updating is the adaption process in which a model obtained for a given set of samples and measurement conditions (primary) is updated to predict the analyte in new samples and measurement conditions (secondary). The calibration method partial least squares is applied with two new updating approaches. In one approach, only one updated model is obtained to predict the analyte amount in both primary and secondary conditions. The other approach forms two updated models: one model is used to predict in primary conditions, and a second model, based on the first, is used to predict in secondary conditions. Both approaches are evaluated with near-infrared spectral datasets. Datasets include spectra of soil, corn, olive oil adulterated with sunflower, and pharmaceutical tablets. A fusion process and single merits are used to select models. Model selection methods are evaluated based on the prediction errors of the selected models.

    Fine Tuning Model Updating for Multivariate Calibration Maintenance

    In the field of chemometrics, an important issue in multivariate calibration is model updating. Model updating is the adaption process in which a model obtained for a given set of samples and measurement conditions (primary) is updated to predict the analyte in new samples and measurement conditions (secondary). Primary and secondary conditions can be different due to variations in the geographical situation, instrumentation, or environment. Model updating can be performed using labeled data sets containing samples with reference analyte values for both conditions. A common approach augments the larger primary labeled sample set with a small, weighted secondary labeled sample set. In this situation, only one updated model is obtained to predict the analyte amount in both primary and secondary conditions. The proposed new approach is similar to this common approach, but instead of one updated model, two models are formed simultaneously. One model is used to only predict samples from the primary conditions and the second model is based on this primary model but modified relative to the weighted augmented secondary samples. This second model is used to predict samples from the secondary conditions. Both model updating methods require multiple tuning parameters (penalties).
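The common single-model approach described above, in which weighted secondary samples are appended to the primary set, can be sketched as a stacked least-squares problem. The objective used here, ||X_p b - y_p||² + λ²||X_s b - y_s||² + η²||b||², is an assumed ridge-type formulation chosen for illustration; the abstract does not give the exact objective, and the names `augmented_ridge`, `lam`, and `eta` are hypothetical. The proposed two-model approach is not shown.

```python
import numpy as np

def augmented_ridge(X_p, y_p, X_s, y_s, lam, eta):
    """Single updated model from weighted sample augmentation (sketch).

    X_p, y_p : primary spectra and reference values.
    X_s, y_s : small set of secondary spectra and reference values.
    lam      : weight on the secondary samples (assumed tuning parameter).
    eta      : ridge penalty on the model vector (assumed tuning parameter).
    """
    n_vars = X_p.shape[1]
    # Stack primary rows, weighted secondary rows, and ridge penalty rows
    X_aug = np.vstack([X_p, lam * X_s, eta * np.eye(n_vars)])
    y_aug = np.concatenate([y_p, lam * y_s, np.zeros(n_vars)])
    # Ordinary least squares on the augmented system
    b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b

# Toy data generated from a known model vector
b_true = np.array([2.0, 3.0])
X_p = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_s = np.array([[1.0, 2.0]])
y_p, y_s = X_p @ b_true, X_s @ b_true

b_plain = augmented_ridge(X_p, y_p, X_s, y_s, lam=1.0, eta=0.0)
b_shrunk = augmented_ridge(X_p, y_p, X_s, y_s, lam=1.0, eta=10.0)
```

Larger `lam` pulls the model toward the secondary samples, and larger `eta` shrinks the model vector. Both would need to be tuned, which is the model selection problem the abstract refers to.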

    Fusion of Synchronous Fluorescence Spectra with Application to Argan Oil for Adulteration Analysis

    When synchronous fluorescence (SyF) spectroscopy is used for quantitative and qualitative analysis, selection of a useful wavelength interval between the excitation and emission wavelengths (Δλ) is needed. Presented is a fusion approach to combine Δλ intervals, thereby negating the selection process. This study uses the fusion of SyF spectra to detect adulteration of argan oil by corn oil and to quantify the corn oil content. The SyF spectra were acquired by varying the excitation wavelength in the region 300-800 nm using Δλ wavelength intervals from 10 to 100 nm in steps of 10 nm, producing 10 sets of SyF spectra. For quantitative analysis, two calibration approaches are evaluated with these 10 SyF spectral datasets: multivariate calibration by partial least squares (PLS), and a univariate calibration process in which the SyF spectra are summed over respective SyF spectral ranges, the area under the curve (AUC) method. For adulteration detection and quantitation of the corn oil, prediction errors decrease with fusion compared to individually using the 10 Δλ interval SyF spectral data sets. For this data set, the AUC method generally provides smaller prediction errors than PLS at individual Δλ intervals as well as with fusion of all 10 Δλ intervals.
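The AUC method described above reduces each spectrum to a single number by summing its intensities and then calibrates that number against concentration. A minimal sketch for one Δλ data set follows; the function names and the straight-line calibration are assumptions for illustration, and the fusion across Δλ intervals is not shown because the abstract does not say how it is done.

```python
def auc(spectrum):
    """Sum the intensities of one spectrum over its spectral range."""
    return sum(spectrum)

def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate_auc(spectra, concentrations):
    """Univariate calibration of summed intensity against concentration."""
    areas = [auc(s) for s in spectra]
    return fit_line(areas, concentrations)

def predict(model, spectrum):
    """Predict concentration for a new spectrum from a (slope, intercept) model."""
    slope, intercept = model
    return slope * auc(spectrum) + intercept

# Made-up calibration spectra and concentrations, for illustration only
spectra = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
concentrations = [10.0, 20.0, 30.0]
model = calibrate_auc(spectra, concentrations)
estimate = predict(model, [2.0, 3.0])
```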

    Fusion of Similarity Measures to Characterize Differences in Sample Matrix Effects

    Multivariate calibration applied to spectroscopic data is firmly rooted in the field of analytical chemistry. Over the past several decades, numerous methods have been developed to deduce a calibration model to predict new analyte values with sufficient accuracy and precision. These calibration models produce good results when calibration (primary) and new prediction (secondary) samples are measured under similar conditions. However, inherent sample matrix effects and measurement conditions for the secondary samples are often dissimilar to calibration samples resulting in inaccurate and imprecise predictions. To combat this issue, calibration maintenance by model updating can be used to manipulate the calibration model to adapt to the secondary conditions. Currently, evaluations of traditional and new calibration maintenance methods by researchers are performed without any consideration for the degree of difference between the primary and secondary data sets. Needed is a method that assesses the degree of difference between primary and secondary data sets for a robust evaluation of any model updating method. In order to solve this problem, multiple similarity measures are utilized in this presentation for a fusion consensus assessment of the degree of difference between the primary and secondary spectra assuming equal distributions of analyte values. Results will be shown for spectral data sets of varying similarity