Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions
We design simple screening tests to automatically discard data samples in empirical risk minimization without losing optimization guarantees. We derive loss functions that produce dual objectives with a sparse solution. We also show how to regularize convex losses to ensure such a dual sparsity-inducing property, and propose a general method to design screening tests for classification or regression based on ellipsoidal approximations of the optimal set. In addition to producing computational gains, our approach also allows us to compress a dataset into a subset of representative points.
On the Relationship between Conjugate Gradient and Optimal First-Order Methods for Convex Optimization
In a series of works initiated by Nemirovsky and Yudin, and later extended by Nesterov, first-order algorithms for unconstrained minimization with optimal theoretical complexity bounds have been proposed. On the other hand, conjugate gradient algorithms, among the most widely used first-order techniques, lack a finite complexity bound, and their performance can in fact be quite poor. This dissertation is partially about narrowing the gap between these two classes of algorithms, namely the traditional conjugate gradient methods and optimal first-order techniques. We derive conditions under which conjugate gradient methods attain the same complexity bound as Nemirovsky-Yudin's and Nesterov's methods. Moreover, we propose a conjugate gradient-type algorithm named CGSO, for Conjugate Gradient with Subspace Optimization, which achieves the optimal complexity bound at the price of a little extra computational cost.
We extend the theory of CGSO to convex problems with linear constraints. In particular, we focus on solving the $\ell_1$-regularized least squares problem, often referred to as the Basis Pursuit Denoising (BPDN) problem in the optimization community. BPDN arises in many practical fields including sparse signal recovery, machine learning, and statistics. Solving BPDN is fairly challenging because the size of the involved signals can be quite large; therefore first-order methods are of particular interest for these problems. We propose a quasi-Newton proximal method for solving BPDN. Our numerical results suggest that our technique is computationally effective, and can compete favourably with the other state-of-the-art solvers.
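For reference, the BPDN problem named above is commonly written as follows. The symbols $A$, $b$, $x$, $\lambda$, and $t$ are notation assumed here for illustration; they do not come from the abstract.

\[
\min_{x \in \mathbb{R}^n} \; \tfrac{1}{2}\,\|Ax - b\|_2^2 + \lambda\,\|x\|_1, \qquad \lambda > 0.
\]

Proximal methods for this problem typically rely on the proximal operator of the $\ell_1$ term, which is componentwise soft-thresholding with step size $t > 0$:

\[
\bigl[\operatorname{prox}_{t\lambda\|\cdot\|_1}(v)\bigr]_i = \operatorname{sign}(v_i)\,\max\bigl(|v_i| - t\lambda,\; 0\bigr).
\]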
Dynamic Screening: Accelerating First-Order Algorithms for the Lasso and Group-Lasso
Recent computational strategies based on screening tests have been proposed
to accelerate algorithms addressing penalized sparse regression problems such
as the Lasso. Such approaches build upon the idea that it is worth dedicating
some small computational effort to locate inactive atoms and remove them from
the dictionary in a preprocessing stage so that the regression algorithm
working with a smaller dictionary will then converge faster to the solution of
the initial problem. We believe that there is an even more efficient way to
screen the dictionary and obtain a greater acceleration: inside each iteration
of the regression algorithm, one may take advantage of the algorithm
computations to obtain a new screening test for free with increasing screening
effects along the iterations. The dictionary is henceforth dynamically screened
instead of being screened statically, once and for all, before the first
iteration. We formalize this dynamic screening principle in a general
algorithmic scheme and apply it by embedding, inside a number of first-order
algorithms, existing screening tests adapted to solve the Lasso and new screening
tests to solve the Group-Lasso. Computational gains are assessed in a large set
of experiments on synthetic data as well as real-world sounds and images. They
show both the screening efficiency and the gain in terms of running times.
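The dynamic screening principle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain ISTA as the first-order algorithm and substitutes the well-known Gap Safe sphere test for the screening tests studied in the paper. The function names, the problem sizes, and the choice of regularization level are all hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Componentwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(D, y, x, lam):
    """Lasso objective 0.5 * ||y - D x||^2 + lam * ||x||_1."""
    r = y - D @ x
    return 0.5 * (r @ r) + lam * np.abs(x).sum()


def ista_lasso(D, y, lam, screen=True, max_iter=5000, tol=1e-10):
    """ISTA for the Lasso with an optional screening test inside each iteration.

    Assumption: the test is the Gap Safe sphere test, swapped in here for
    illustration; it is not one of the tests from the paper.
    """
    p = D.shape[1]
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    col_norms = np.linalg.norm(D, axis=0)
    x = np.zeros(p)
    active = np.ones(p, dtype=bool)          # atoms not yet screened out
    for _ in range(max_iter):
        idx = np.flatnonzero(active)
        if idx.size == 0:
            break
        DA = D[:, idx]
        r = y - DA @ x[idx]                  # residual
        corr = DA.T @ r                      # needed by ISTA anyway, reused by the test
        scale = max(lam, np.abs(corr).max())
        theta = r / scale                    # dual feasible point
        primal = 0.5 * (r @ r) + lam * np.abs(x[idx]).sum()
        dual = 0.5 * (y @ y) - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
        gap = primal - dual
        if gap <= tol:
            break
        # ISTA update restricted to the atoms that are still active
        x[idx] = soft_threshold(x[idx] + step * corr, step * lam)
        if screen:
            # Atom j is provably inactive if |d_j^T theta| + radius * ||d_j|| < 1
            radius = np.sqrt(2.0 * gap) / lam
            keep = np.abs(corr) / scale + radius * col_norms[idx] >= 1.0
            active[idx[~keep]] = False
            x[idx[~keep]] = 0.0
    return x, active


# Small synthetic example (hypothetical sizes and values)
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 100))
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
x_true = np.zeros(100)
x_true[:5] = [3.0, -2.0, 1.5, 2.0, -1.0]
y = D @ x_true + 0.01 * rng.standard_normal(30)
lam = 0.5 * np.abs(D.T @ y).max()

x_dyn, active = ista_lasso(D, y, lam, screen=True)
x_ref, _ = ista_lasso(D, y, lam, screen=False)
print(active.sum(), "atoms kept out of", D.shape[1])
```

The correlations `corr` are computed by the ISTA step regardless, so the test adds only a few vector operations per iteration. As the duality gap shrinks, the safe sphere shrinks with it and more atoms are discarded, which is the increasing screening effect the abstract refers to.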
An Algorithmic Framework for Computing Validation Performance Bounds by Using Suboptimal Models
Practical model building processes are often time-consuming because many
different models must be trained and validated. In this paper, we introduce a
novel algorithm that can be used for computing the lower and the upper bounds
of model validation errors without actually training the model itself. A key
idea behind our algorithm is using side information available from a
suboptimal model. If a reasonably good suboptimal model is available, our
algorithm can compute lower and upper bounds of many useful quantities for
making inferences on the unknown target model. We demonstrate the advantage of
our algorithm in the context of model selection for regularized learning
problems.
Safe rules for the identification of zeros in the solutions of the SLOPE problem
In this paper we propose a methodology to accelerate the resolution of the
so-called "Sorted L-One Penalized Estimation" (SLOPE) problem. Our method
leverages the concept of "safe screening", well-studied in the literature for
group-separable sparsity-inducing norms, and aims at identifying the
zeros in the solution of SLOPE. More specifically, we introduce a family of
safe screening rules for this problem, where is the dimension of
the primal variable, and propose a tractable procedure to verify if one of
these tests is passed. Our procedure has a complexity where is a problem-dependent constant and is the number
of zeros identified by the tests. We assess the performance of our proposed
method on a numerical benchmark and emphasize that it leads to significant
computational savings in many setups. Comment: 24 pages, 3 figures
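As background for the entry above, the SLOPE penalty is the sorted L1 norm: the absolute values of the coefficients are sorted in nonincreasing order and paired with a nonincreasing sequence of nonnegative weights. The sketch below only evaluates this penalty; it does not implement the screening rules of the paper, and the function name `slope_penalty` is a hypothetical choice.

```python
import numpy as np


def slope_penalty(x, lam):
    """Sorted L1 norm: sum_i lam[i] * |x|_(i), with |x|_(1) >= |x|_(2) >= ...

    Assumes lam is nonincreasing and nonnegative, as in the usual SLOPE setup.
    """
    x = np.asarray(x, dtype=float)
    lam = np.asarray(lam, dtype=float)
    if x.shape != lam.shape:
        raise ValueError("x and lam must have the same shape")
    if np.any(lam < 0) or np.any(np.diff(lam) > 0):
        raise ValueError("lam must be nonnegative and nonincreasing")
    abs_sorted = np.sort(np.abs(x))[::-1]    # largest magnitude first
    return float(abs_sorted @ lam)


# The largest coefficient in magnitude receives the largest weight.
value = slope_penalty([1.0, -3.0, 2.0], [3.0, 2.0, 1.0])
print(value)
```

When all weights are equal, the penalty reduces to the ordinary L1 norm scaled by that weight, so the Lasso is a special case.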