Does generalization performance of $l^q$ regularization learning depend on $q$? A negative example

Author: Fang Jian
Lin Shaobo
Xu Chen
Zeng Jinshan
Publication date: 24/07/2013

$l^q$-regularization has been demonstrated to be an attractive technique in machine learning and statistical modeling. It attempts to improve the generalization (prediction) capability of a machine (model) by appropriately shrinking its coefficients. The shape of an $l^q$ estimator varies with the choice of the regularization order $q$. In particular, $l^1$ leads to the LASSO estimate, while $l^2$ corresponds to the smooth ridge regression. This makes the order $q$ a potential tuning parameter in applications. To facilitate the use of $l^q$-regularization, we seek a modeling strategy in which an elaborate selection of $q$ is avoidable. In this spirit, we place our investigation within a general framework of $l^q$-regularized kernel learning under a sample-dependent hypothesis space (SDHS). For a designated class of kernel functions, we show that all $l^q$ estimators for $0 < q < \infty$ attain similar generalization error bounds. These estimated bounds are almost optimal in the sense that, up to a logarithmic factor, the upper and lower bounds are asymptotically identical. This finding tentatively reveals that, in some modeling contexts, the choice of $q$ might not have a strong impact on the generalization capability. From this perspective, $q$ can be arbitrarily specified, or specified merely by other non-generalization criteria such as smoothness, computational complexity, and sparsity.

Comment: 35 pages, 3 figures
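
As a rough illustration of the comparison this abstract describes, the sketch below fits $l^1$- and $l^2$-regularized kernel estimators of the form $f(x) = \sum_j a_j K(x_j, x)$, where the kernel centers are the sample points themselves (a sample-dependent hypothesis space). Everything concrete here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel and its width, the synthetic sine data, the value of `lam`, and the use of ISTA for the $q = 1$ case.

```python
import numpy as np

def gaussian_kernel(a, b, width=0.2):
    """Gaussian kernel matrix between 1-D point sets a and b (assumed kernel choice)."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * width ** 2))

def fit_l2(K, y, lam):
    """q = 2: minimize (1/m)||K a - y||^2 + lam * ||a||_2^2 in closed form."""
    m = len(y)
    return np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)

def fit_l1(K, y, lam, n_iter=5000):
    """q = 1: minimize (1/m)||K a - y||^2 + lam * ||a||_1 by ISTA (proximal gradient)."""
    m = len(y)
    # Step size = 1 / Lipschitz constant of the gradient of the squared loss.
    step = m / (2.0 * np.linalg.norm(K, 2) ** 2)
    a = np.zeros(m)
    for _ in range(n_iter):
        z = a - step * (2.0 / m) * (K.T @ (K @ a - y))            # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a

# Synthetic regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 60))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(60)

K = gaussian_kernel(x, x)            # kernel centers are the samples themselves
x_test = np.linspace(0.0, 1.0, 200)
K_test = gaussian_kernel(x_test, x)
truth = np.sin(2 * np.pi * x_test)

a_l2 = fit_l2(K, y, lam=1e-3)
a_l1 = fit_l1(K, y, lam=1e-3)

err_l2 = np.mean((K_test @ a_l2 - truth) ** 2)
err_l1 = np.mean((K_test @ a_l1 - truth) ** 2)
print("test MSE, q=2:", err_l2)
print("test MSE, q=1:", err_l1)
print("nonzero coefficients, q=1:", int(np.sum(a_l1 != 0)))
```

On a toy problem like this, the two test errors can be compared directly, while the number of nonzero coefficients shows the kind of non-generalization criterion (sparsity) that might still motivate a particular $q$. A single run of this kind does not, of course, stand in for the error bounds the paper derives.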

arXiv.org e-Print Archive

Model selection of polynomial kernel regression

Author: Lin Shaobo
Sun Xingping
Xu Zongben
Zeng Jinshan
Publication date: 07/03/2015

Polynomial kernel regression is one of the standard and state-of-the-art learning strategies. However, as is well known, the choices of the degree of the polynomial kernel and the regularization parameter are still open in the realm of model selection. The first aim of this paper is to develop a strategy to select these parameters. On the one hand, based on a worst-case learning rate analysis, we show that the regularization term in polynomial kernel regression is not necessary. In other words, the regularization parameter can decrease arbitrarily fast when the degree of the polynomial kernel is suitably tuned. On the other hand, taking the implementation of the algorithm into account, the regularization term is required. In summary, the only effect of the regularization term in polynomial kernel regression is to circumvent the ill-conditioning of the kernel matrix. Based on this, the second purpose of this paper is to propose a new model selection strategy and then design an efficient learning algorithm. Both theoretical and experimental analyses show that the new strategy outperforms the previous one. Theoretically, we prove that the new learning strategy is almost optimal if the regression function is smooth. Experimentally, it is shown that the new strategy can significantly reduce the computational burden without loss of generalization capability.

Comment: 29 pages, 4 figures
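
The claim that regularization serves mainly to counter the ill-conditioning of the kernel matrix can be illustrated with a small sketch. It is not the paper's algorithm or model selection strategy. It only shows that a polynomial kernel matrix of degree `degree` on more than `degree + 1` one-dimensional samples is rank-deficient, and that a very small ridge term makes the linear system solvable. The kernel form $(1 + xy)^d$, the data, the degree, and the value of `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def poly_kernel(a, b, degree):
    """Polynomial kernel (1 + a*b)^degree for 1-D inputs (assumed kernel form)."""
    return (1.0 + a[:, None] * b[None, :]) ** degree

# Hypothetical smooth regression problem on [-1, 1].
rng = np.random.default_rng(1)
m = 80
x = rng.uniform(-1.0, 1.0, m)
y = np.cos(3.0 * x) + 0.05 * rng.standard_normal(m)

degree = 6
K = poly_kernel(x, x, degree)

# K has rank at most degree + 1 = 7, far below m = 80, so it is numerically singular.
cond_raw = np.linalg.cond(K)

# A tiny ridge term restores a usable condition number.
lam = 1e-8
K_reg = K + lam * m * np.eye(m)
cond_reg = np.linalg.cond(K_reg)

# Kernel ridge regression coefficients with the tiny regularizer.
alpha = np.linalg.solve(K_reg, y)

x_test = np.linspace(-1.0, 1.0, 200)
pred = poly_kernel(x_test, x, degree) @ alpha
test_mse = np.mean((pred - np.cos(3.0 * x_test)) ** 2)

print("condition number without ridge term:", cond_raw)
print("condition number with ridge term:   ", cond_reg)
print("test MSE:", test_mse)
```

Here the regularizer is deliberately far too small to act as a meaningful smoothness penalty; in this sketch it is the polynomial degree that controls the capacity of the fit.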

arXiv.org e-Print Archive

Regularized Regression Problem in hyper-RKHS for Learning Kernels

Author: Huang Xiaolin
Liu Fanghui
Shi Lei
Suykens Johan A. K.
Yang Jie
Publication date: 06/11/2020

This paper generalizes the two-stage kernel learning framework, illustrates its utility for kernel learning and out-of-sample extensions, and proves asymptotic convergence results for the introduced kernel learning model. Algorithmically, we extend target alignment by hyper-kernels in the two-stage kernel learning framework. The associated kernel learning task is formulated as a regression problem in a hyper-reproducing kernel Hilbert space (hyper-RKHS), i.e., learning on the space of kernels itself. To solve this problem, we present two regression models with bivariate forms in this space: kernel ridge regression (KRR) and support vector regression (SVR) in the hyper-RKHS. This provides significant model flexibility for kernel learning, with outstanding performance in real-world applications. Specifically, our kernel learning framework is general: the learned underlying kernel can be positive definite or indefinite, which adapts to various requirements in kernel learning. Theoretically, we study the convergence behavior of these learning algorithms in the hyper-RKHS and derive the learning rates. Unlike the traditional approximation analysis in RKHS, our analyses need to consider the non-trivial independence of pairwise samples and the characterisation of hyper-RKHS. To the best of our knowledge, this is the first work in learning theory to study the approximation performance of regularized regression problems in hyper-RKHS.

Comment: 25 pages, 3 figures

arXiv.org e-Print Archive

Local polynomial regression for circular predictors

Author: Agnese Panzera
Charles C. Taylor
Marco Di Marzio
Publication venue: Elsevier BV
Publication date: 01/01/2009

We consider local smoothing of datasets where the design space is the $d$-dimensional ($d \geq 1$) torus and the response variable is real-valued. Our purpose is to extend least squares local polynomial fitting to this situation. We give both theoretical and empirical results.
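
For the simplest case, $d = 1$ (a single angular predictor), local linear fitting on the circle might be sketched as below. The abstract does not give the estimator's form, so the specific choices here are assumptions: von Mises-shaped weights with concentration `kappa`, and $\sin(\theta_i - \theta_0)$ as the periodic counterpart of the Euclidean difference in the local linear term. The data and the function name `local_linear_circular` are hypothetical.

```python
import numpy as np

def local_linear_circular(theta0, theta, y, kappa=20.0):
    """Local linear estimate of the regression function at angle theta0.

    Assumptions (not taken from the paper): von Mises-shaped weights, and
    sin(theta_i - theta0) as the periodic local linear term.
    """
    # Unnormalized von Mises weights; larger kappa means a narrower window.
    w = np.exp(kappa * (np.cos(theta - theta0) - 1.0))
    X = np.column_stack([np.ones_like(theta), np.sin(theta - theta0)])
    WX = X * w[:, None]
    # Weighted least squares: solve (X^T W X) beta = X^T W y.
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta[0]  # the intercept is the fitted value at theta0

def target(t):
    """Hypothetical periodic regression function used to generate data."""
    return 1.0 + np.cos(t) + 0.5 * np.sin(2.0 * t)

rng = np.random.default_rng(2)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
y = target(theta) + 0.1 * rng.standard_normal(200)

grid = np.linspace(0.0, 2.0 * np.pi, 100)
fit = np.array([local_linear_circular(t, theta, y) for t in grid])
mse = np.mean((fit - target(grid)) ** 2)
print("MSE on grid:", mse)

# The estimate is periodic by construction: angles 0 and 2*pi give the same value.
print(local_linear_circular(0.0, theta, y), local_linear_circular(2.0 * np.pi, theta, y))
```

Because both the weights and the design column depend on the angles only through sines and cosines of differences, the fit needs no special handling at the wrap-around point, which is the main practical difference from local linear regression on an interval.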

CiteSeerX

Crossref

White Rose Research Online