Online Matrix Completion Through Nuclear Norm Regularisation
It is the main goal of this paper to propose a novel method to perform matrix
completion on-line. Motivated by a wide variety of applications, ranging from
the design of recommender systems to sensor network localization through
seismic data reconstruction, we consider the matrix completion problem when
entries of the matrix of interest are observed gradually. Precisely, we place
ourselves in the situation where the predictive rule should be refined
incrementally, rather than recomputed from scratch each time the sample of
observed entries increases. The extension of existing matrix completion methods
to the sequential prediction context is indeed a major issue in the Big Data
era, and yet little addressed in the literature. The algorithm promoted in this
article builds upon the Soft Impute approach introduced in Mazumder et al.
(2010). The major novelty essentially arises from the use of a randomised
technique for both computing and updating the Singular Value Decomposition
(SVD) involved in the algorithm. Though of disarming simplicity, the method
proposed turns out to be very efficient, while requiring reduced computations.
Several numerical experiments based on real datasets illustrating its
performance are displayed, together with preliminary results giving it a
theoretical basis.
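
The abstract above does not say which randomised SVD technique is used. As a rough illustration only, the following sketch assumes a standard Gaussian range finder with a few power iterations; the function name `randomized_svd` and the parameters `oversample` and `n_iter` are hypothetical choices, not taken from the paper.

```python
import numpy as np

def randomized_svd(A, rank, oversample=5, n_iter=2, seed=0):
    """Approximate rank-`rank` SVD of A via a random range finder (illustrative)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, n)
    # Sample the column space of A with a Gaussian test matrix.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))
    # A few power iterations sharpen the captured subspace.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # Exact SVD of the small projected matrix, mapped back to the original space.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]
```

The expensive decomposition is applied only to the small matrix `Q.T @ A`, which is where the reduced computation would come from under these assumptions.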
Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization
We propose a new first-order optimisation algorithm to solve high-dimensional
non-smooth composite minimisation problems. Typical examples of such problems
have an objective that decomposes into a non-smooth empirical risk part and a
non-smooth regularisation penalty. The proposed algorithm, called Semi-Proximal
Mirror-Prox, leverages the Fenchel-type representation of one part of the
objective while handling the other part of the objective via linear
minimization over the domain. The algorithm stands in contrast with more
classical proximal gradient algorithms with smoothing, which require the
computation of proximal operators at each iteration and can therefore be
impractical for high-dimensional problems. We establish the theoretical
convergence rate of Semi-Proximal Mirror-Prox, which exhibits the optimal
complexity bounds, i.e. $O(1/\epsilon^2)$, for the number of calls to the linear
minimization oracle. We present promising experimental results showing the
interest of the approach in comparison to competing methods.
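
To make the notion of a linear minimization oracle concrete, here is a minimal sketch for one simple domain. The abstract does not name a domain, so the l1 ball and the function name `lmo_l1_ball` are assumptions chosen purely for illustration.

```python
def lmo_l1_ball(gradient, radius=1.0):
    """Return a minimiser of <gradient, s> over the l1 ball of the given radius."""
    # The minimum of a linear function over the l1 ball is attained at a vertex:
    # put all the mass on the coordinate with the largest absolute gradient.
    i = max(range(len(gradient)), key=lambda j: abs(gradient[j]))
    s = [0.0] * len(gradient)
    if gradient[i] != 0:
        s[i] = -radius if gradient[i] > 0 else radius
    return s
```

For this domain the oracle needs a single pass over the gradient and returns a one-sparse point.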
AUC Optimisation and Collaborative Filtering
In recommendation systems, one is interested in the ranking of the predicted
items as opposed to other losses such as the mean squared error. Although a
variety of ways to evaluate rankings exist in the literature, here we focus on
the Area Under the ROC Curve (AUC) as it is widely used and has a strong
theoretical underpinning. In practical recommendation, only items at the top of
the ranked list are presented to the users. With this in mind, we propose a
class of objective functions over matrix factorisations which primarily
represent a smooth surrogate for the real AUC, and in a special case we show
how to prioritise the top of the list. The objectives are differentiable and
optimised through a carefully designed stochastic gradient-descent-based
algorithm which scales linearly with the size of the data. In the special case
of square loss we show how to improve computational complexity by leveraging
previously computed measures. To understand theoretically the underlying matrix
factorisation approaches we study both the consistency of the loss functions
with respect to AUC, and generalisation using Rademacher theory. The resulting
generalisation analysis gives strong motivation for the optimisation under
study. Finally, we provide computational results on the efficacy of the
proposed method using synthetic and real data.
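
The idea of a smooth surrogate for the AUC can be sketched as follows. The abstract does not specify the surrogate, so the sigmoid relaxation and the names `empirical_auc`, `sigmoid_auc` and `beta` are assumptions made for illustration, not the objectives proposed in the paper.

```python
import math

def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            total += 1.0 if p > q else 0.5 if p == q else 0.0
    return total / (len(pos_scores) * len(neg_scores))

def sigmoid_auc(pos_scores, neg_scores, beta=1.0):
    """Differentiable surrogate: the indicator 1[p > q] becomes sigmoid(beta * (p - q))."""
    total = sum(1.0 / (1.0 + math.exp(-beta * (p - q)))
                for p in pos_scores for q in neg_scores)
    return total / (len(pos_scores) * len(neg_scores))
```

As `beta` grows, the surrogate approaches the empirical AUC whenever no two scores are tied, while smaller values give a smoother objective for gradient-based optimisation.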
Matrix completion by singular value thresholding: sharp bounds
We consider the matrix completion problem where the aim is to estimate a
large data matrix for which only a relatively small random subset of its
entries is observed. Iterative thresholding methods are quite popular
approaches to the matrix completion problem. In spite of their empirical
success, the theoretical guarantees of such iterative thresholding methods are
poorly understood. The goal of this paper is to provide strong theoretical
guarantees, similar to those obtained for nuclear-norm penalization methods and
one-step thresholding methods, for an iterative thresholding algorithm which is
a modification of the softImpute algorithm. An important consequence of our
result is the exact minimax optimal rates of convergence for the matrix
completion problem, which were known until now only up to a logarithmic factor.
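
For orientation, an iterative singular value thresholding scheme can be sketched as below. This is a plain Soft-Impute-style iteration, not the modified algorithm analysed in the paper; the function name `soft_impute` and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def soft_impute(X, mask, lam, n_iter=100):
    """Plain Soft-Impute-style iteration (illustrative).

    X    : array of values; entries where mask is False are ignored
    mask : boolean array, True where an entry is observed
    lam  : soft-thresholding level applied to the singular values
    """
    Z = np.zeros_like(X, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries and fill the rest with the current estimate.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrink every singular value by lam; those below lam become zero.
        s = np.maximum(s - lam, 0.0)
        Z = (U * s) @ Vt
    return Z
```

Larger values of `lam` zero out more singular values and so yield lower-rank estimates.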
Robust PCA for Anomaly Detection and Data Imputation in Seasonal Time Series
We propose a robust principal component analysis (RPCA) framework to recover
low-rank and sparse matrices from temporal observations. We develop an online
version of the batch temporal algorithm in order to process larger datasets or
streaming data. We empirically compare the proposed approaches with different
RPCA frameworks and show their effectiveness in practical situations.