11,965 research outputs found
Fast learning rate of multiple kernel learning: Trade-off between sparsity and smoothness
We investigate the learning rate of multiple kernel learning (MKL) with
$\ell_1$ and elastic-net regularizations. The elastic-net regularization is a
composition of an $\ell_1$-regularizer for inducing the sparsity and an
$\ell_2$-regularizer for controlling the smoothness. We focus on a sparse
setting where the total number of kernels is large, but the number of nonzero
components of the ground truth is relatively small, and show convergence rates
sharper than any previously shown for both $\ell_1$ and elastic-net
regularizations. Our analysis reveals some relations between the choice of a
regularization function and the performance. If the ground truth is smooth, we
show a faster convergence rate for the elastic-net regularization with fewer
conditions than for $\ell_1$-regularization; otherwise, a faster convergence
rate is shown for the $\ell_1$-regularization.
Comment: Published in at http://dx.doi.org/10.1214/13-AOS1095 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org). arXiv admin note: text overlap with
arXiv:1103.043
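As an illustrative sketch (our notation, not taken verbatim from the paper), the elastic-net MKL estimator combines an $\ell_1$-type and an $\ell_2$-type penalty over the per-kernel components $f_m \in \mathcal{H}_m$ of $f = \sum_{m=1}^{M} f_m$:
\[
\hat f = \operatorname*{arg\,min}_{f_1,\dots,f_M}\; \frac{1}{n}\sum_{i=1}^{n}\Bigl(y_i - \sum_{m=1}^{M} f_m(x_i)\Bigr)^{2}
 + \lambda_1 \sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}
 + \lambda_2 \sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^{2},
\]
where $\lambda_1$ induces sparsity across kernels and $\lambda_2$ controls smoothness; setting $\lambda_2 = 0$ recovers $\ell_1$-MKL. The exact loss and normalization used in the paper may differ.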
Fast Convergence Rate of Multiple Kernel Learning with Elastic-net Regularization
We investigate the learning rate of multiple kernel learning (MKL) with
elastic-net regularization, which consists of an $\ell_1$-regularizer for
inducing the sparsity and an $\ell_2$-regularizer for controlling the
smoothness. We focus on a sparse setting where the total number of kernels is
large but the number of non-zero components of the ground truth is relatively
small, and prove that elastic-net MKL achieves the minimax learning rate on the
$\ell_2$-mixed-norm ball. Our bound is sharper than the convergence rates shown
previously, and has the property that the smoother the truth is, the faster the
convergence rate is.
Comment: 21 pages, 0 figures
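For orientation, the $\ell_2$-mixed-norm ball mentioned above can be read (under the usual decomposition $f = \sum_{m=1}^{M} f_m$ with $f_m \in \mathcal{H}_m$) as a set of the form
\[
\Bigl\{ f = \sum_{m=1}^{M} f_m \;:\; \Bigl(\sum_{m=1}^{M}\|f_m\|_{\mathcal{H}_m}^{2}\Bigr)^{1/2} \le R \Bigr\};
\]
the radius $R$ and the precise normalization are illustrative assumptions, not details stated in the abstract.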
Does generalization performance of $\ell^q$ regularization learning depend on $q$? A negative example
$\ell^q$-regularization has been demonstrated to be an attractive technique in
machine learning and statistical modeling. It attempts to improve the
generalization (prediction) capability of a machine (model) by appropriately
shrinking its coefficients. The shape of an $\ell^q$ estimator differs with the
choice of the regularization order $q$. In particular, $q = 1$ leads to the
LASSO estimate, while $q = 2$ corresponds to smooth ridge regression. This
makes the order $q$ a potential tuning parameter in applications. To facilitate
the use of $\ell^q$-regularization, we seek a modeling strategy in which an
elaborate selection of $q$ is avoidable. In this spirit, we place our
investigation within a general framework of $\ell^q$-regularized kernel
learning under a sample-dependent hypothesis space (SDHS). For a designated
class of kernel functions, we show that all $\ell^q$ estimators for
$0 < q < \infty$ attain similar generalization error bounds. These bounds are
almost optimal in the sense that, up to a logarithmic factor, the upper and
lower bounds are asymptotically identical. This finding tentatively reveals
that, in some modeling contexts, the choice of $q$ might not have a strong
impact on the generalization capability. From this perspective, $q$ can be
specified arbitrarily, or chosen according to other, non-generalization
criteria such as smoothness, computational complexity, or sparsity.
Comment: 35 pages, 3 figures
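To make the setting concrete, $\ell^q$-regularized kernel learning in a sample-dependent hypothesis space is typically posed as coefficient regularization over the span of the kernel sections at the sample points; the following squared-loss form is an illustrative sketch, and the exact loss and scaling in the paper may differ:
\[
\hat f_q = \sum_{j=1}^{n}\hat\alpha_j K(\cdot, x_j),\qquad
\hat\alpha = \operatorname*{arg\,min}_{\alpha\in\mathbb{R}^{n}}\; \frac{1}{n}\sum_{i=1}^{n}\Bigl(y_i-\sum_{j=1}^{n}\alpha_j K(x_i,x_j)\Bigr)^{2} + \lambda \sum_{j=1}^{n}|\alpha_j|^{q}.
\]
Here $q = 1$ gives the LASSO-type estimator and $q = 2$ the ridge-type one discussed above.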
Inverse Density as an Inverse Problem: The Fredholm Equation Approach
In this paper we address the problem of estimating the ratio $q/p$, where $p$
is a density function and $q$ is another density, or, more generally, an
arbitrary function. Knowing or approximating this ratio is needed in various
problems of inference and integration, in particular, when one needs to average
a function with respect to one probability distribution, given a sample from
another. It is often referred to as {\it importance sampling} in statistical
inference and is also closely related to the problem of {\it covariate shift}
in transfer learning as well as to various MCMC methods. It may also be useful
for separating the underlying geometry of a space, say a manifold, from the
density function defined on it.
Our approach is based on reformulating the problem of estimating the ratio
$q/p$ as an inverse problem in terms of an integral operator
corresponding to a kernel, and thus reducing it to an integral equation, known
as the Fredholm problem of the first kind. This formulation, combined with the
techniques of regularization and kernel methods, leads to a principled
kernel-based framework for constructing algorithms and for analyzing them
theoretically.
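The reformulation rests on a simple identity (the operator notation below is ours, chosen to mirror the description above): writing $\mathcal{K}_p f(x) = \int k(x,y)\,f(y)\,p(y)\,dy$ for the integral operator associated with a kernel $k$ and the density $p$, the ratio $q/p$ solves the Fredholm equation of the first kind
\[
\mathcal{K}_p\!\left(\tfrac{q}{p}\right)(x) = \int k(x,y)\,\frac{q(y)}{p(y)}\,p(y)\,dy = \int k(x,y)\,q(y)\,dy,
\]
and both sides of this equation can be estimated from samples drawn from $p$ and $q$, respectively.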
The resulting family of algorithms (FIRE, for Fredholm Inverse Regularized
Estimator) is flexible, simple and easy to implement.
We provide detailed theoretical analysis including concentration bounds and
convergence rates for the Gaussian kernel in the case of densities defined on
$\mathbb{R}^d$, compact domains in $\mathbb{R}^d$ and smooth $d$-dimensional
sub-manifolds of the Euclidean space.
We also show experimental results including applications to classification
and semi-supervised learning within the covariate shift framework and
demonstrate some encouraging experimental comparisons. We also show how the
parameters of our algorithms can be chosen in a completely unsupervised manner.
Comment: Fixing a few typos in last version
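As a rough illustration of how a Fredholm-based ratio estimator can be implemented, the sketch below discretizes the Fredholm equation on the sample from $p$ and applies a Tikhonov-regularized linear solve. It is a simplified stand-in rather than the exact FIRE formulation: the Gaussian bandwidth, the regularization parameter, and the plain matrix-level regularization are assumptions made for illustration only.

import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)), returned as a matrix.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def fredholm_ratio_estimate(X_p, X_q, bandwidth=0.5, lam=1e-2):
    # Estimate q/p at the points X_p (sample from p), given X_q (sample from q).
    # Discretized Fredholm equation: (1/n) K_pp v ~ (1/m) K_pq 1,
    # solved with a Tikhonov-style regularizer lam * I.
    n, m = len(X_p), len(X_q)
    K_pp = gaussian_kernel(X_p, X_p, bandwidth)   # empirical version of the operator K_p
    K_pq = gaussian_kernel(X_p, X_q, bandwidth)   # empirical version of K_q applied to 1
    rhs = K_pq.sum(axis=1) / m
    A = K_pp / n + lam * np.eye(n)
    return np.linalg.solve(A, rhs)                # entry i approximates q(x_i) / p(x_i)

# Toy covariate-shift check: p = N(0, 1), q = N(1, 1); the true ratio grows with x.
rng = np.random.default_rng(0)
X_p = rng.normal(0.0, 1.0, size=(300, 1))
X_q = rng.normal(1.0, 1.0, size=(300, 1))
ratio = fredholm_ratio_estimate(X_p, X_q)
order = np.argsort(X_p[:, 0])
print("ratio near smallest x:", np.round(ratio[order[:3]], 3))
print("ratio near largest x: ", np.round(ratio[order[-3:]], 3))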