
    An optimization approach coupling pre-processing with model regression for enhanced chemometrics

    Chemometric methods are broadly used in the chemical and biochemical sectors. Typically, a regression model is derived after data preprocessing, in a sequential manner. Yet preprocessing can significantly influence the regression model and, ultimately, its predictive ability. In this work, we investigate the coupling of preprocessing and model parameter estimation by incorporating them simultaneously in a single optimization step. Common model selection techniques rely almost exclusively on an accuracy metric, yet a quantitative metric for model robustness can prolong model up-time. Our approach is applied to optimize for both model accuracy and robustness, which requires the introduction of a novel mathematical definition of robustness. We test our method in a simulated setup and with industrial case studies from multivariate calibration. The results highlight the importance of both accuracy and robustness and illustrate the potential of the proposed optimization approach for automating the generation of efficient chemometric models.

    Probabilistic predictions for partial least squares using bootstrap

    Modeling the uncertainty in partial least squares (PLS) is difficult because of the nonlinear effect of the observed data on the latent space that the method finds. We present an approach, based on bootstrapping, that automatically accounts for these nonlinearities in the parameter uncertainty, allowing us to represent confidence intervals equally well for points lying close to or far away from the latent space. To show the opportunities of this approach, we develop applications in determining the Design Space for industrial processes and in modeling the uncertainty of spectroscopy data. Our results show the benefits of our method in accounting for uncertainty far from the latent space for the purposes of Design Space identification, and show that it matches the performance of well-established methods for spectroscopy data.
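
    The general idea of bootstrapping a PLS model can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-component PLS1 model, case resampling of the calibration rows, and a plain percentile interval on the prediction. The function names (fit_pls1, predict, bootstrap_interval) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def fit_pls1(X, y):
    """Fit a one-component PLS1 model; returns (x_mean, y_mean, coef)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        # Degenerate resample (no covariance): fall back to the mean of y.
        return x_mean, y_mean, [0.0] * p
    w = [v / norm for v in w]
    # Latent scores, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, [q * v for v in w]


def predict(model, x):
    """Predict y for a single sample x."""
    x_mean, y_mean, coef = model
    return y_mean + sum(c * (xi - m) for c, xi, m in zip(coef, x, x_mean))


def bootstrap_interval(X, y, x_new, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval for the prediction at x_new (assumed scheme)."""
    rng = random.Random(seed)
    n = len(X)
    preds = []
    for _ in range(n_boot):
        # Resample calibration rows with replacement and refit the model,
        # so the latent direction itself varies between replicates.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_pls1([X[i] for i in idx], [y[i] for i in idx])
        preds.append(predict(model, x_new))
    preds.sort()
    lower = preds[int((alpha / 2) * (n_boot - 1))]
    upper = preds[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Toy calibration data (made up for illustration only).
X = [[1.0, 2.0, 1.5], [2.0, 4.1, 3.0], [3.0, 5.9, 4.4],
     [4.0, 8.2, 6.1], [5.0, 9.8, 7.4]]
y = [1.1, 2.0, 2.9, 4.2, 5.0]
print(bootstrap_interval(X, y, [3.5, 7.0, 5.2]))
```

    Because the weight vector is re-estimated in every replicate, the spread of the predictions reflects how the data change the latent direction as well as the regression coefficient. The interval covers parameter uncertainty only and does not include observation noise.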