Authentication of Sorrento walnuts by NIR spectroscopy coupled with different chemometric classification strategies
Walnuts have been widely investigated because of their chemical composition, which is particularly rich in unsaturated fatty acids, responsible for various benefits in the human body. Some of these fruits, depending on the harvesting area, are considered a high value-added food and therefore command a higher selling price. In Italy, walnuts are harvested throughout the national territory, but the fruits produced in the Sorrento area (Southern Italy) are commercially valuable for their peculiar organoleptic characteristics. The aim of the present study is to develop a non-destructive and shelf-life compatible method capable of discriminating common walnuts from those harvested in Sorrento, which are considered a high-quality product. Two-hundred-and-twenty-seven walnuts (105 from Sorrento and 132 grown in other areas) were analyzed by near-infrared spectroscopy (both whole and shelled), and classified by Partial Least Squares-Discriminant Analysis (PLS-DA). Finally, two multi-block approaches were exploited in order to combine the spectral information collected on the shell and on the kernel. One of these strategies provided the best results (98.3% correct classification rate in external validation, corresponding to 1 misclassified object out of 60). The present study suggests the proposed strategy is a suitable solution for the discrimination of Sorrento walnuts. © 2020 by the authors
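The quoted validation figure can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `correct_classification_rate` is hypothetical, and the only inputs taken from the abstract are the 60 external validation objects and the single misclassification.

```python
def correct_classification_rate(n_total: int, n_misclassified: int) -> float:
    """Return the percentage of correctly classified objects."""
    if n_total <= 0 or not 0 <= n_misclassified <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_misclassified <= n_total")
    return 100.0 * (n_total - n_misclassified) / n_total


# Figures quoted in the abstract: 60 validation objects, 1 misclassified.
rate = correct_classification_rate(60, 1)
print(f"{rate:.1f}%")  # prints "98.3%"
```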
Classification approaches for sorting maize (Zea mays subsp. mays) haploids using single-kernel near-infrared spectroscopy
Doubled haploids (DHs) are an important breeding tool for creating maize inbred lines. One bottleneck in the DH process is the manual separation of haploids from among the much larger pool of hybrid siblings in a haploid induction cross. Here, we demonstrate the ability of single-kernel near-infrared reflectance spectroscopy (skNIR) to identify haploid kernels. The skNIR is a high-throughput device that acquires an NIR spectrum to predict individual kernel traits. We collected skNIR data from haploid and hybrid kernels in 15 haploid induction crosses and found significant differences in multiple traits, such as percent oil, seed weight, or volume, within each cross. The two kernel classes were separated by their NIR profile using Partial Least Squares Linear Discriminant Analysis (PLS-LDA). A general classification model, in which all induction crosses were used in the discrimination model, and a specific model, in which only kernels within a specific induction cross were used, were compared. Specific models outperformed the general model and were able to enrich a haploid selection pool to above 50% haploids. Applications for the instrument are discussed.
Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction
It is difficult to find the optimal sparse solution of a manifold learning
based dimensionality reduction algorithm. The lasso or the elastic net
penalized manifold learning based dimensionality reduction is not directly a
lasso penalized least square problem and thus the least angle regression (LARS)
(Efron et al.), one of the most popular algorithms in sparse
learning, cannot be applied. Therefore, most current approaches take indirect
ways or have strict settings, which can be inconvenient for applications. In
this paper, we propose the manifold elastic net, or MEN for short. MEN
incorporates the merits of both the manifold learning based dimensionality
reduction and the sparse learning based dimensionality reduction. By using a
series of equivalent transformations, we show MEN is equivalent to the lasso
penalized least square problem and thus LARS is adopted to obtain the optimal
sparse solution of MEN. In particular, MEN has the following advantages for
subsequent classification: 1) the local geometry of samples is well preserved
for low dimensional data representation, 2) both the margin maximization and
the classification error minimization are considered for sparse projection
calculation, 3) the projection matrix of MEN improves the parsimony in
computation, 4) the elastic net penalty reduces the over-fitting problem, and
5) the projection matrix of MEN can be interpreted psychologically and
physiologically. Experimental evidence on face recognition over various popular
datasets suggests that MEN is superior to top level dimensionality reduction
algorithms. Comment: 33 pages, 12 figures
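As background for the claim that an elastic-net-penalized problem can be rewritten as a lasso problem, the block below shows the standard augmented-data identity for the ordinary (naive) elastic net. It is a sketch of the general idea only and is not the specific series of transformations used for MEN; the symbols $X$ (an $n \times p$ data matrix), $y$, $w$, $\lambda_1$ and $\lambda_2$ are assumptions introduced here for illustration.

```latex
\min_{w}\; \|y - Xw\|_2^2 + \lambda_2 \|w\|_2^2 + \lambda_1 \|w\|_1
\;=\;
\min_{w^*}\; \|y^* - X^* w^*\|_2^2 + \gamma \|w^*\|_1,
\qquad\text{where}
\]
\[
X^* = \frac{1}{\sqrt{1+\lambda_2}}
      \begin{pmatrix} X \\ \sqrt{\lambda_2}\, I_p \end{pmatrix},
\quad
y^* = \begin{pmatrix} y \\ 0_p \end{pmatrix},
\quad
\gamma = \frac{\lambda_1}{\sqrt{1+\lambda_2}},
\quad
w^* = \sqrt{1+\lambda_2}\; w .
```

Since $X^* w^*$ stacks $Xw$ on top of $\sqrt{\lambda_2}\,w$, the squared residual on the augmented data reproduces both the fit term and the ridge term, which is why a lasso solver such as LARS can be run on $(X^*, y^*)$.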
Multi-Label Dimensionality Reduction
Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing the irrelevant, redundant, and noisy information while considering the correlation among different labels in multi-label learning. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The regularization effect on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first approach is a direct least squares approach which allows the use of different regularization penalties, but is applicable only under a certain assumption; the second is a two-stage approach which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed for data that arrive sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully to Drosophila gene expression pattern image annotation. The experimental results on some benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms. Dissertation/Thesis, Ph.D. Computer Science 201
Design of a virtual sensor data array for the analysis of RDX, HMX and DMNB using metal-doped screen printed electrodes and chemometric analysis
The detection of explosive substances is a subject of high importance in several areas, including environmental health, de-mining efforts (land and sea), and security and defence against terrorist activity. The use of electrochemical methods for the detection of these substances has increased in recent years but is still largely restricted to the most common explosives. The electrochemical detection of explosive nitroamines and taggant substances in solution was achieved using a virtual sensor array of metal-doped screen printed electrodes and differential pulse voltammetry (DPV). The multiple sets of DPV data from the different electrode systems were integrated using multivariate analysis (PCA, NIPALS and LDA) and matched with known substances present in explosives. These combinations created a mathematical array which separated the explosives, even when the electrochemical information was buried in or mixed with the background noise. Two explosive substances, octogen (HMX, octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine) and cyclonite (RDX, hexahydro-1,3,5-trinitro-1,3,5-triazine), and a taggant agent, 2,3-dimethyl-2,3-dinitrobutane (DMNB), were subjected to electrochemical analysis in aqueous solutions using a solid carbon-based screen printed electrode modified with silver, gold and platinum.
Latent Fisher Discriminant Analysis
Linear Discriminant Analysis (LDA) is a well-known method for dimensionality
reduction and classification. Previous studies have also extended the
binary-class case into multi-classes. However, many applications, such as
object detection and keyframe extraction cannot provide consistent
instance-label pairs, while LDA requires labels on instance level for training.
Thus it cannot be directly applied to semi-supervised classification problems.
In this paper, we overcome this limitation and propose a latent variable Fisher
discriminant analysis model. We relax instance-level labeling to bag-level
labeling, a kind of semi-supervised setting (only video-level labels of event
type are required for semantic frame extraction), and incorporate a data-driven
prior over the latent variables. Hence, our method combines latent variable
inference and dimension reduction in a unified Bayesian framework. We test our
method on the MUSK and Corel data sets and obtain competitive results compared to
the baseline approach. We also demonstrate its capacity on the challenging
TRECVID MED11 dataset for semantic keyframe extraction and conduct a
human-factors ranking-based experimental evaluation, which clearly demonstrates
our proposed method consistently extracts more semantically meaningful
keyframes than challenging baselines. Comment: 12 pages
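For readers unfamiliar with the baseline that this work extends, the following is a minimal sketch of the classical, fully supervised two-class Fisher discriminant in two dimensions. It is not the latent-variable model proposed in the paper; the toy data and all function names are assumptions made for illustration, and instance-level labels are assumed to be available.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix: sum over points of (x - m)(x - m)^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in points:
        dx, dy = x - m[0], y - m[1]
        s[0][0] += dx * dx
        s[0][1] += dx * dy
        s[1][0] += dy * dx
        s[1][1] += dy * dy
    return s


def fisher_direction(class0, class1):
    """Return (w, threshold) with w = Sw^{-1} (m1 - m0)."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    d = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * d[0] + inv[0][1] * d[1],
         inv[1][0] * d[0] + inv[1][1] * d[1]]
    # Decision threshold: projection of the midpoint between class means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]


def project(w, point):
    """Project a 2-D point onto the discriminant direction."""
    return w[0] * point[0] + w[1] * point[1]


# Hypothetical toy data with instance-level labels.
class0 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class1 = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]
w, threshold = fisher_direction(class0, class1)
predictions = [int(project(w, p) > threshold) for p in class0 + class1]
print(predictions)
```

Every training point needs its own label here, which is exactly the requirement the abstract says is relaxed to bag-level labels.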
Neural Class-Specific Regression for face verification
Face verification is a problem approached in the literature mainly using
nonlinear class-specific subspace learning techniques. While it has been shown
that kernel-based Class-Specific Discriminant Analysis is able to provide
excellent performance in small- and medium-scale face verification problems,
its application in today's large-scale problems is difficult due to its
training space and computational requirements. In this paper, generalizing our
previous work on kernel-based class-specific discriminant analysis, we show
that class-specific subspace learning can be cast as a regression problem. This
allows us to derive linear, (reduced) kernel and neural network-based
class-specific discriminant analysis methods using efficient batch and/or
iterative training schemes, suited for large-scale learning problems. We test
the performance of these methods on two datasets describing medium- and
large-scale face verification problems. Comment: 9 pages, 4 figures
Highly Efficient Regression for Scalable Person Re-Identification
Existing person re-identification models scale poorly to the large
data required in real-world applications due to: (1) Complexity: They employ
complex models for optimal performance resulting in high computational cost for
training at a large scale; (2) Inadaptability: Once trained, they are
unsuitable for incremental update to incorporate any new data available. This
work proposes a truly scalable solution to re-id by addressing both problems.
Specifically, a Highly Efficient Regression (HER) model is formulated by
embedding Fisher's criterion into a ridge regression model for very fast
re-id model learning with scalable memory/storage usage. Importantly, this new
HER model supports faster-than-real-time incremental model updates, therefore
making real-time active learning feasible in re-id with human-in-the-loop.
Extensive experiments show that such a simple and fast model not only
notably outperforms state-of-the-art re-id methods, but is also more
scalable to large data, with additional benefits to active learning for reducing
human labelling effort in re-id deployment.
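To see why a ridge regression formulation lends itself to fast training and incremental updates, consider the generic closed-form solution below. This is a textbook sketch, not the HER formulation itself: it does not show how Fisher's criterion is embedded, and the symbols $X$ (feature matrix), $Y$ (target matrix), $\lambda$ (regularization weight), $A$ and $B$ are assumptions introduced for illustration.

```latex
W = (X^\top X + \lambda I)^{-1} X^\top Y,
\qquad
A = X^\top X, \quad B = X^\top Y .
\]
When a new batch $(X_{\text{new}}, Y_{\text{new}})$ arrives:
\[
A \leftarrow A + X_{\text{new}}^\top X_{\text{new}},
\qquad
B \leftarrow B + X_{\text{new}}^\top Y_{\text{new}},
\qquad
W = (A + \lambda I)^{-1} B .
```

Only the accumulated statistics $A$ and $B$ need to be stored, so the memory cost depends on the feature dimension and not on the number of samples seen so far.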
Chemometric study on the forensic discrimination of soil types using their infrared spectral characteristics
Soil has been used in criminal investigations for some time because of its prevalence and transferability. Usually it is the physical characteristics that are studied; the research carried out here instead aims to make use of the chemical profile of soil samples. The research presented in this work used sieved (2 mm) soil samples taken from the topsoil layer (about 10 cm), which were then analysed using mid-infrared spectroscopy. The spectra obtained were pre-treated and then input into two chemometric classification tools: nonlinear iterative partial least squares followed by linear discriminant analysis (NIPALS-LDA) and partial least squares discriminant analysis (PLS-DA). The models produced show that it is possible to discriminate between soil samples from different land use types, and that both approaches are comparable in performance. NIPALS-LDA performs much better than PLS-DA in classifying samples to location
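The NIPALS step named above can be illustrated with a short sketch that extracts the first principal component of a small data matrix. This is only the generic NIPALS iteration for PCA under assumed settings (column centering, a relative convergence tolerance, hypothetical toy data); the pre-treatment, the number of components, and the subsequent LDA step used in the study are not shown.

```python
import math


def nipals_first_component(X, max_iter=500, tol=1e-12):
    """Return (scores t, loadings p) of the first principal component."""
    n, m = len(X), len(X[0])
    # Column-center the data.
    means = [sum(row[j] for row in X) / n for j in range(m)]
    Xc = [[row[j] - means[j] for j in range(m)] for row in X]
    # Initialise the scores with the column of largest variance.
    col_ss = [sum(r[j] ** 2 for r in Xc) for j in range(m)]
    if max(col_ss) == 0.0:
        raise ValueError("data matrix has no variance")
    j0 = col_ss.index(max(col_ss))
    t = [r[j0] for r in Xc]
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # Loadings: regress each column of Xc on the current scores.
        p = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: project the rows of Xc onto the normalised loadings.
        t_new = [sum(Xc[i][j] * p[j] for j in range(m)) for i in range(n)]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol * sum(v * v for v in t):
            break
    return t, p


# Hypothetical toy data lying exactly on the line y = 2x.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, loadings = nipals_first_component(X)
print(loadings)
```

For this toy matrix the loading vector points along the direction (1, 2) after normalisation, as expected for points on a single line. In a NIPALS-LDA pipeline, scores such as `scores` would be the inputs passed on to the discriminant analysis.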