Iterative Regularization for Learning with Convex Loss Functions
We consider the problem of supervised learning with convex loss functions and
propose a new form of iterative regularization based on the subgradient method.
Unlike other regularization approaches, in iterative regularization no
constraint or penalization is considered, and generalization is achieved by
(early) stopping an empirical iteration. We consider a nonparametric setting,
in the framework of reproducing kernel Hilbert spaces, and prove finite sample
bounds on the excess risk under general regularity conditions. Our study
provides a new class of efficient regularized learning algorithms and gives
insights on the interplay between statistics and optimization in machine
learning.
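The early-stopping idea above can be illustrated with a minimal sketch. It is not the authors' algorithm or analysis: the Gaussian kernel, the absolute loss, the step size `step / sqrt(t)`, the toy data, and the use of a held-out set to pick the stopping iteration are all assumptions made for illustration. The iteration itself has no penalty term and no constraint; the only regularization is the choice of when to stop.

```python
import math

def gaussian_kernel(a, b, width=0.5):
    # Gaussian (RBF) kernel on scalars; the width is an arbitrary choice.
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))

def predict(alpha, centers, x):
    # f(x) = sum_j alpha_j * K(x_j, x)
    return sum(a * gaussian_kernel(c, x) for a, c in zip(alpha, centers))

def mean_abs_error(alpha, centers, xs, ys):
    return sum(abs(predict(alpha, centers, x) - y) for x, y in zip(xs, ys)) / len(xs)

def early_stopped_subgradient(train_x, train_y, valid_x, valid_y,
                              max_steps=200, step=0.5):
    """Unpenalized kernel subgradient descent; returns the best iterate on the held-out set."""
    n = len(train_x)
    gram = [[gaussian_kernel(a, b) for b in train_x] for a in train_x]
    alpha = [0.0] * n
    best_err = mean_abs_error(alpha, train_x, valid_x, valid_y)
    best_t, best_alpha = 0, list(alpha)
    for t in range(1, max_steps + 1):
        fitted = [sum(gram[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        # A subgradient of |f(x_i) - y_i| with respect to f(x_i) is sign(f(x_i) - y_i).
        signs = [(f > y) - (f < y) for f, y in zip(fitted, train_y)]
        eta = step / math.sqrt(t)  # decaying step size (assumed schedule)
        alpha = [a - eta * s / n for a, s in zip(alpha, signs)]
        err = mean_abs_error(alpha, train_x, valid_x, valid_y)
        if err < best_err:
            best_err, best_t, best_alpha = err, t, list(alpha)
    return best_t, best_alpha, best_err

# Toy data (hypothetical): noisy samples of sin(3x), clean held-out targets.
train_x = [i / 10 for i in range(20)]
train_y = [math.sin(3 * x) + 0.2 * (-1) ** i for i, x in enumerate(train_x)]
valid_x = [0.05 + i / 10 for i in range(19)]
valid_y = [math.sin(3 * x) for x in valid_x]

stop_t, alpha, valid_err = early_stopped_subgradient(train_x, train_y, valid_x, valid_y)
baseline_err = mean_abs_error([0.0] * len(train_x), train_x, valid_x, valid_y)
print(stop_t, valid_err, baseline_err)
```

Here `stop_t` plays the role of the regularization parameter: running fewer iterations gives a smoother fit, running more fits the training noise more closely.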
Super-Linear Convergence of Dual Augmented-Lagrangian Algorithm for Sparsity Regularized Estimation
We analyze the convergence behaviour of a recently proposed algorithm for
regularized estimation called Dual Augmented Lagrangian (DAL). Our analysis is
based on a new interpretation of DAL as a proximal minimization algorithm. We
theoretically show under some conditions that DAL converges super-linearly in a
non-asymptotic and global sense. Due to a special modelling of sparse
estimation problems in the context of machine learning, the assumptions we make
are milder and more natural than those made in conventional analysis of
augmented Lagrangian algorithms. In addition, the new interpretation enables us
to generalize DAL to wide varieties of sparse estimation problems. We
experimentally confirm our analysis in a large scale $\ell_1$-regularized
logistic regression problem and extensively compare the efficiency of DAL
algorithm to previously proposed algorithms on both synthetic and benchmark
datasets. Comment: 51 pages, 9 figures
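The proximal minimization view mentioned above can be sketched on a toy problem. This is not DAL itself: DAL solves each inner minimization through a dual augmented Lagrangian, which is not shown here. The sketch assumes a separable objective, 0.5 * (w_i - b_i)^2 + lam * |w_i| per coordinate, so that each proximal step has a closed form via soft-thresholding. The names `b`, `lam`, `eta0`, and `growth` are hypothetical.

```python
def soft_threshold(v, tau):
    # Proximal operator of tau * |.| evaluated at v.
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def proximal_minimization(b, lam, steps=10, eta0=1.0, growth=2.0):
    """Proximal point iteration for sum_i 0.5*(w_i - b_i)^2 + lam*|w_i|.

    Each step solves
        w_next = argmin_w  f(w) + (1 / (2*eta)) * ||w - w_t||^2,
    which for this separable f reduces to a soft-threshold per coordinate.
    """
    w = [0.0] * len(b)
    eta = eta0
    history = [list(w)]
    for _ in range(steps):
        w = [soft_threshold((eta * bi + wi) / (eta + 1.0), eta * lam / (eta + 1.0))
             for bi, wi in zip(b, w)]
        history.append(list(w))
        eta *= growth  # increasing proximal parameter (assumed schedule)
    return w, history

b = [3.0, -0.5, 1.2]
lam = 1.0
w, history = proximal_minimization(b, lam)
target = [soft_threshold(bi, lam) for bi in b]  # exact minimizer of the toy problem
errors = [abs(h[0] - target[0]) for h in history]
print(w, target)
```

On this toy problem the error in a nonzero coordinate shrinks by a factor 1 / (1 + eta) per step, so letting `eta` grow makes the ratio of successive errors tend to zero. That is the super-linear behaviour in miniature; the paper's result concerns the general, non-separable setting.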
Fixed-point and coordinate descent algorithms for regularized kernel methods
In this paper, we study two general classes of optimization algorithms for
kernel methods with convex loss function and quadratic norm regularization, and
analyze their convergence. The first approach, based on fixed-point iterations,
is simple to implement and analyze, and can be easily parallelized. The second,
based on coordinate descent, exploits the structure of additively separable
loss functions to compute solutions of line searches in closed form. Instances
of these general classes of algorithms are already incorporated into
state-of-the-art machine learning software for large scale problems. We start from a
solution characterization of the regularized problem, obtained using
sub-differential calculus and resolvents of monotone operators, that holds for
general convex loss functions regardless of differentiability. The two
methodologies described in the paper can be regarded as instances of non-linear
Jacobi and Gauss-Seidel algorithms, and are both well-suited to solve large
scale problems.
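As a rough illustration of the Gauss-Seidel view, the sketch below covers only the squared-loss special case, where the regularized kernel problem reduces to the linear system (K + lam*I) c = y. This is an assumption made to keep the example short: the paper treats general convex losses, and the exact scaling of the objective here is also assumed. Each inner update minimizes the quadratic 0.5 * c'(K + lam*I)c - y'c exactly in one coordinate, which is a closed-form line search.

```python
import math

def gram_matrix(xs, width=1.0):
    # Gaussian kernel Gram matrix on scalar inputs (kernel choice is an assumption).
    return [[math.exp(-((a - b) ** 2) / (2.0 * width ** 2)) for b in xs] for a in xs]

def gauss_seidel_kernel_ridge(K, y, lam, sweeps=200):
    """Cyclic coordinate descent (Gauss-Seidel sweeps) for (K + lam*I) c = y."""
    n = len(y)
    c = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Exact minimization over c_i with the other coordinates held fixed.
            off_diagonal = sum(K[i][j] * c[j] for j in range(n) if j != i)
            c[i] = (y[i] - off_diagonal) / (K[i][i] + lam)
    return c

def max_residual(K, y, lam, c):
    # Largest absolute entry of (K + lam*I) c - y.
    n = len(y)
    return max(abs(sum(K[i][j] * c[j] for j in range(n)) + lam * c[i] - y[i])
               for i in range(n))

xs = [0.0, 1.0, 2.0, 3.0]
y = [1.0, -1.0, 2.0, 0.5]
lam = 1.0
K = gram_matrix(xs)
c = gauss_seidel_kernel_ridge(K, y, lam)
residual = max_residual(K, y, lam, c)
print(c, residual)
```

A Jacobi-style variant would compute all coordinate updates from the previous iterate before applying any of them. That makes it easy to parallelize, but it generally needs damping or a condition such as diagonal dominance to converge.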
Learning Output Kernels for Multi-Task Problems
Simultaneously solving multiple related learning tasks is beneficial under a
variety of circumstances, but the prior knowledge necessary to correctly model
task relationships is rarely available in practice. In this paper, we develop a
novel kernel-based multi-task learning technique that automatically reveals
structural inter-task relationships. Building over the framework of output
kernel learning (OKL), we introduce a method that jointly learns multiple
functions and a low-rank multi-task kernel by solving a non-convex
regularization problem. Optimization is carried out via a block coordinate
descent strategy, where each subproblem is solved using suitable conjugate
gradient (CG) type iterative methods for linear operator equations. The
effectiveness of the proposed approach is demonstrated on pharmacological and
collaborative filtering data.