657 research outputs found
Statistical inference in compound functional models
We consider a general nonparametric regression model called the compound
model. It includes, as special cases, sparse additive regression and
nonparametric (or linear) regression with many covariates but possibly a small
number of relevant covariates. The compound model is characterized by three
main parameters: the structure parameter describing the "macroscopic" form of
the compound function, the "microscopic" sparsity parameter indicating the
maximal number of relevant covariates in each component and the usual
smoothness parameter corresponding to the complexity of the members of the
compound. We find the non-asymptotic minimax rate of convergence of estimators in
such a model as a function of these three parameters. We also show that this
rate can be attained in an adaptive way.
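For concreteness, a schematic of the compound structure consistent with this description; the notation (components $f_j$ indexed by covariate subsets $V_j$) is assumed here, not taken from the paper:

```latex
% Schematic compound function: a sum of smooth components, each
% depending on at most s of the d covariates (notation assumed).
\[
  f(x_1,\dots,x_d) \;=\; \sum_{j=1}^{M} f_j\bigl(x_{V_j}\bigr),
  \qquad V_j \subseteq \{1,\dots,d\},\ \ |V_j| \le s,
\]
% where the collection \{V_j\} plays the role of the "macroscopic"
% structure parameter, s is the "microscopic" sparsity parameter, and
% each f_j has the stated smoothness. Sparse additive regression is
% recovered when every |V_j| = 1.
```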
Fast learning rate of multiple kernel learning: Trade-off between sparsity and smoothness
We investigate the learning rate of multiple kernel learning (MKL) with
$\ell_1$ and elastic-net regularizations. The elastic-net regularization is a
composition of an $\ell_1$-regularizer for inducing the sparsity and an
$\ell_2$-regularizer for controlling the smoothness. We focus on a sparse
setting where the total number of kernels is large, but the number of nonzero
components of the ground truth is relatively small, and show sharper
convergence rates than any previously known for both $\ell_1$ and
elastic-net regularizations. Our analysis reveals some relations between the
choice of a regularization function and the performance. If the ground truth is
smooth, we show a faster convergence rate for the elastic-net regularization
under fewer conditions than for $\ell_1$-regularization; otherwise, a faster
convergence rate is shown for the $\ell_1$-regularization.
Comment: Published in at http://dx.doi.org/10.1214/13-AOS1095 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org). arXiv admin note: text overlap with
arXiv:1103.043
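To illustrate the sparsity/smoothness trade-off between the two regularizers, here is a minimal NumPy sketch of the proximal (shrinkage) step for an elastic-net penalty on one kernel block; the function, its signature, and the step-size handling are illustrative assumptions, not code or notation from the paper:

```python
import numpy as np

def elastic_net_block_prox(v, lam1, lam2, step=1.0):
    """Proximal step for an elastic-net penalty on one kernel block.

    Solves argmin_x 0.5*||x - v||^2 + step*(lam1*||x|| + lam2*||x||^2),
    where ||.|| is the Euclidean norm of the block's coefficient vector.
    The lam1 ($\\ell_1$-type) term can zero out the whole block (sparsity);
    the lam2 ($\\ell_2$-type) term shrinks it smoothly (smoothness).
    """
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    # Soft-threshold the block norm, then rescale for the quadratic term.
    t = max(norm - step * lam1, 0.0) / (1.0 + 2.0 * step * lam2)
    return (t / norm) * v

# A small block is zeroed out; a larger one is only shrunk.
print(elastic_net_block_prox(np.array([0.1, -0.1]), lam1=0.5, lam2=0.1))
print(elastic_net_block_prox(np.array([2.0, -1.0]), lam1=0.5, lam2=0.1))
```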
High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
The goal of supervised feature selection is to find a subset of input
features that are responsible for predicting output values. The least absolute
shrinkage and selection operator (Lasso) allows computationally efficient
feature selection based on linear dependency between input features and output
values. In this paper, we consider a feature-wise kernelized Lasso for
capturing non-linear input-output dependency. We first show that, with
particular choices of kernel functions, non-redundant features with strong
statistical dependence on output values can be found in terms of kernel-based
independence measures. We then show that the globally optimal solution can be
efficiently computed; this makes the approach scalable to high-dimensional
problems. The effectiveness of the proposed method is demonstrated through
feature selection experiments with thousands of features.
Comment: 18 pages
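A minimal sketch of the feature-wise kernelized Lasso idea as described above, assuming Gaussian kernels, a standard-deviation bandwidth heuristic, and a projected proximal-gradient solver for the resulting nonnegative Lasso; all of these implementation choices are assumptions, not details from the paper:

```python
import numpy as np

def centered_gram(x, sigma):
    """Centered Gaussian Gram matrix for one feature vector x of shape (n,)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H

def feature_wise_kernel_lasso(X, y, lam, n_iter=500):
    """Fit the output Gram matrix by a sparse nonnegative combination of
    feature-wise input Gram matrices; a larger alpha[j] indicates stronger
    statistical dependence of y on feature j."""
    n, d = X.shape
    Phi = np.stack(
        [centered_gram(X[:, j], np.std(X[:, j]) + 1e-12).ravel() for j in range(d)],
        axis=1,
    )  # (n*n, d) design matrix: one column per feature's centered Gram matrix
    ell = centered_gram(y, np.std(y) + 1e-12).ravel()
    alpha = np.zeros(d)
    L = np.linalg.norm(Phi, 2) ** 2  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ alpha - ell)
        alpha = np.maximum(alpha - (grad + lam) / L, 0.0)  # prox + projection
    return alpha  # nonzero entries mark selected features
```

The global-optimality claim in the abstract corresponds to this problem being convex, so the simple iteration above could be swapped for any convergent convex solver.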
Fast Convergence Rate of Multiple Kernel Learning with Elastic-net Regularization
We investigate the learning rate of multiple kernel learning (MKL) with
elastic-net regularization, which consists of an $\ell_1$-regularizer for
inducing the sparsity and an $\ell_2$-regularizer for controlling the
smoothness. We focus on a sparse setting where the total number of kernels is
large but the number of non-zero components of the ground truth is relatively
small, and prove that elastic-net MKL achieves the minimax learning rate on the
$\ell_2$-mixed-norm ball. Our bound is sharper than the convergence rates ever
shown before, and has the property that the smoother the truth is, the faster
the convergence rate is.
Comment: 21 pages, 0 figures
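In symbols, the regularizer described here combines the two terms block-wise over the M kernels; the notation below is assumed for illustration, not quoted from the paper:

```latex
% Elastic-net MKL penalty over M kernel blocks (notation assumed):
% the first sum induces block sparsity, the second controls smoothness.
\[
  \mathrm{pen}(f) \;=\; \lambda_1 \sum_{m=1}^{M} \|f_m\|_{\mathcal{H}_m}
  \;+\; \lambda_2 \sum_{m=1}^{M} \|f_m\|_{\mathcal{H}_m}^{2},
  \qquad f = \sum_{m=1}^{M} f_m,\ \ f_m \in \mathcal{H}_m,
\]
% where each \mathcal{H}_m is the RKHS of the m-th kernel.
```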