8 research outputs found

    Statistical strategies for avoiding false discoveries in metabolomics and related experiments

    A functional approach to variable selection in spectrometric problems

    In spectrometric problems, objects are characterized by high-resolution spectra that correspond to hundreds to thousands of variables. In this context, even fast variable selection methods incur a high computational load. However, spectra are generally smooth and can therefore be accurately approximated by splines. In this paper, we propose to use a B-spline expansion as a pre-processing step before variable selection, in which the original variables are replaced by the coefficients of the B-spline expansions. Using a simple leave-one-out procedure, the optimal number of B-spline coefficients can be found efficiently. As there are generally an order of magnitude fewer coefficients than original spectral variables, selecting optimal coefficients is faster than selecting variables. Moreover, a B-spline coefficient depends only on a limited range of original variables, which preserves the interpretability of the selected variables. We demonstrate the value of the proposed method on real-world data.
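The pre-processing step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bspline_coefficients`, the uniform knot placement, the cubic degree, and the rescaling of the wavelength axis to [0, 1] are all assumptions, and the number of coefficients is fixed by hand rather than chosen by the leave-one-out procedure the abstract mentions.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline


def bspline_coefficients(spectra, n_coef, degree=3):
    """Replace each spectrum (one per row) by its least-squares B-spline coefficients.

    Hypothetical helper: uniform knots and a [0, 1] abscissa are assumptions.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_points = spectra.shape[1]
    x = np.linspace(0.0, 1.0, n_points)

    # Clamped knot vector: boundary knots repeated degree + 1 times,
    # interior knots spaced uniformly so that len(knots) = n_coef + degree + 1.
    n_interior = n_coef - degree - 1
    interior = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])

    # Fit all spectra at once; y has shape (n_points, n_spectra).
    spline = make_lsq_spline(x, spectra.T, knots, k=degree)
    return spline.c.T  # shape (n_spectra, n_coef)


# Example: 5 smooth synthetic spectra with 500 variables each, reduced to 20 coefficients.
grid = np.linspace(0.0, 1.0, 500)
demo_spectra = np.array([np.sin(2 * np.pi * (i + 1) * grid) for i in range(5)])
demo_coefs = bspline_coefficients(demo_spectra, n_coef=20)
```

Because each B-spline basis function is nonzero only over a few adjacent knot intervals, each column of the returned matrix summarizes a contiguous band of the original spectrum, which is the locality property the abstract relies on for interpretability. Any variable selection method can then be run on the coefficient matrix instead of the raw spectra.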

    On-site variety discrimination of tomato plant using visible-near infrared reflectance spectroscopy

    The use of visible-near infrared (NIR) spectroscopy was explored as a tool to discriminate two new tomato plant varieties in China (Zheza205 and Zheza207). In this study, 82 top-canopy leaves of Zheza205 and 86 top-canopy leaves of Zheza207 were measured in visible-NIR reflectance mode. Discriminant models were developed using principal component analysis (PCA), discriminant analysis (DA), and discriminant partial least squares (DPLS) regression methods. After outlier detection, the samples were randomly split into two sets, one used as a calibration set (n=82) and the remaining samples as a validation set (n=82). When predicting the variety of the samples in the validation set, the classification accuracy of the DPLS model after optimizing the spectral pretreatment reached up to 93%. The DPLS model built on spectra pretreated with multiplicative scatter correction and Savitzky-Golay smoothing had the best calibration and prediction abilities (correlation coefficient of calibration (R_c)=0.920, root mean square error of calibration=0.196, and root mean square error of prediction=0.216). The results show that visible-NIR spectroscopy might be a suitable alternative tool to discriminate tomato plant varieties on-site.
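The two pretreatments named in this abstract, multiplicative scatter correction (MSC) followed by Savitzky-Golay smoothing, can be sketched as below. This is an illustrative sketch under stated assumptions: the function names `msc` and `pretreat` are hypothetical, the mean spectrum is used as the MSC reference, and the window length and polynomial order are placeholder values, since the abstract does not report the settings used.

```python
import numpy as np
from scipy.signal import savgol_filter


def msc(spectra, reference=None):
    """Multiplicative scatter correction of spectra stored one per row.

    Each spectrum is regressed on a reference (assumed: the mean spectrum),
    then the fitted offset is subtracted and the fitted slope divided out.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def pretreat(spectra, window_length=11, polyorder=2):
    """MSC followed by Savitzky-Golay smoothing (parameter values are assumptions)."""
    return savgol_filter(msc(spectra), window_length=window_length,
                         polyorder=polyorder, axis=1)


# Example: one underlying spectrum observed with different offsets and gains.
wavelengths = np.linspace(0.0, 1.0, 200)
base = np.exp(-((wavelengths - 0.5) ** 2) / 0.02)
raw = np.array([0.1 + 1.0 * base, 0.3 + 1.5 * base, -0.2 + 0.7 * base])
scatter_corrected = msc(raw)
smoothed = pretreat(raw)
```

In this synthetic example the rows differ only by an additive offset and a multiplicative gain, so MSC maps all of them onto the same curve. The pretreated matrix would then be passed to a classifier such as the DPLS model described in the abstract.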