11,966 research outputs found
On-line PCA with Optimal Regrets
We carefully investigate the on-line version of PCA, where in each trial a
learning algorithm plays a k-dimensional subspace, and suffers the compression
loss on the next instance when projected into the chosen subspace. In this
setting, we analyze two popular on-line algorithms, Gradient Descent (GD) and
Exponentiated Gradient (EG). We show that both algorithms are essentially
optimal in the worst-case. This comes as a surprise, since EG is known to
perform sub-optimally when the instances are sparse. This different behavior of
EG for PCA is mainly related to the non-negativity of the loss in this case,
which makes the PCA setting qualitatively different from other settings studied
in the literature. Furthermore, we show that when considering regret bounds as
function of a loss budget, EG remains optimal and strictly outperforms GD.
Next, we study the extension of the PCA setting, in which the Nature is allowed
to play with dense instances, which are positive matrices with bounded largest
eigenvalue. Again we can show that EG is optimal and strictly better than GD in
this setting
Fast Selection of Spectral Variables with B-Spline Compression
The large number of spectral variables in most data sets encountered in
spectral chemometrics often renders the prediction of a dependent variable
uneasy. The number of variables hopefully can be reduced, by using either
projection techniques or selection methods; the latter allow for the
interpretation of the selected variables. Since the optimal approach of testing
all possible subsets of variables with the prediction model is intractable, an
incremental selection approach using a nonparametric statistics is a good
option, as it avoids the computationally intensive use of the model itself. It
has two drawbacks however: the number of groups of variables to test is still
huge, and colinearities can make the results unstable. To overcome these
limitations, this paper presents a method to select groups of spectral
variables. It consists in a forward-backward procedure applied to the
coefficients of a B-Spline representation of the spectra. The criterion used in
the forward-backward procedure is the mutual information, allowing to find
nonlinear dependencies between variables, on the contrary of the generally used
correlation. The spline representation is used to get interpretability of the
results, as groups of consecutive spectral variables will be selected. The
experiments conducted on NIR spectra from fescue grass and diesel fuels show
that the method provides clearly identified groups of selected variables,
making interpretation easy, while keeping a low computational load. The
prediction performances obtained using the selected coefficients are higher
than those obtained by the same method applied directly to the original
variables and similar to those obtained using traditional models, although
using significantly less spectral variables
Temporal Model Adaptation for Person Re-Identification
Person re-identification is an open and challenging problem in computer
vision. Majority of the efforts have been spent either to design the best
feature representation or to learn the optimal matching metric. Most approaches
have neglected the problem of adapting the selected features or the learned
model over time. To address such a problem, we propose a temporal model
adaptation scheme with human in the loop. We first introduce a
similarity-dissimilarity learning method which can be trained in an incremental
fashion by means of a stochastic alternating directions methods of multipliers
optimization procedure. Then, to achieve temporal adaptation with limited human
effort, we exploit a graph-based approach to present the user only the most
informative probe-gallery matches that should be used to update the model.
Results on three datasets have shown that our approach performs on par or even
better than state-of-the-art approaches while reducing the manual pairwise
labeling effort by about 80%
Stochastic Optimization of PCA with Capped MSG
We study PCA as a stochastic optimization problem and propose a novel
stochastic approximation algorithm which we refer to as "Matrix Stochastic
Gradient" (MSG), as well as a practical variant, Capped MSG. We study the
method both theoretically and empirically
Neural networks in geophysical applications
Neural networks are increasingly popular in geophysics.
Because they are universal approximators, these
tools can approximate any continuous function with an
arbitrary precision. Hence, they may yield important
contributions to finding solutions to a variety of geophysical applications.
However, knowledge of many methods and techniques
recently developed to increase the performance
and to facilitate the use of neural networks does not seem
to be widespread in the geophysical community. Therefore,
the power of these tools has not yet been explored to
their full extent. In this paper, techniques are described
for faster training, better overall performance, i.e., generalization,and the automatic estimation of network size
and architecture
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
- …