Asymptotic Normality of Support Vector Machine Variants and Other Regularized Kernel Methods
In nonparametric classification and regression problems, regularized kernel
methods, in particular support vector machines, attract much attention in
theoretical and in applied statistics. In an abstract sense, regularized kernel
methods (simply called SVMs here) can be seen as regularized M-estimators for a
parameter in a (typically infinite dimensional) reproducing kernel Hilbert
space. For smooth loss functions, it is shown that the difference between the
estimator, i.e. the empirical SVM, and the theoretical SVM is asymptotically
normal with rate n^{-1/2}. That is, the standardized difference converges
weakly to a Gaussian process in the reproducing kernel Hilbert space. As common
in real applications, the choice of the regularization parameter may depend on
the data. The proof is done by an application of the functional delta-method
and by showing that the SVM-functional is suitably Hadamard-differentiable.
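The concentration described above can be illustrated numerically. The sketch below (not from the paper; the Gaussian kernel, squared loss, and all parameter values are illustrative choices) fits a regularized kernel estimator with a smooth loss at growing sample sizes and measures its distance to a proxy for the theoretical (infinite-sample) SVM:

```python
import numpy as np

def kernel_estimator(x, y, lam, gamma=20.0):
    """Regularized kernel M-estimator with squared loss (a smooth loss):
    minimize (1/n) * sum_i (f(x_i) - y_i)^2 + lam * ||f||_H^2 over the
    Gaussian RKHS.  By the representer theorem the minimizer is
    f = sum_i alpha_i k(x_i, .) with alpha = (K + n*lam*I)^{-1} y."""
    n = len(x)
    K = np.exp(-gamma * (x[:, None] - x[None, :]) ** 2)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda t: np.exp(-gamma * (t[:, None] - x[None, :]) ** 2) @ alpha

rng = np.random.default_rng(0)
f_true = lambda t: np.sin(2 * np.pi * t)

def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    return x, f_true(x) + 0.1 * rng.normal(size=n)

lam, grid = 1e-3, np.linspace(0.0, 1.0, 200)
# Proxy for the theoretical SVM: the same estimator on a much larger sample.
f_ref = kernel_estimator(*sample(3200), lam)(grid)

dists = []
for n in (100, 400, 1600):
    f_n = kernel_estimator(*sample(n), lam)(grid)
    dists.append(np.max(np.abs(f_n - f_ref)))

print(dists)  # distance to the (approximate) theoretical SVM shrinks with n
```

Quadrupling the sample size should roughly halve the distance, in line with an n^{-1/2} rate; note that the regularization parameter is held fixed here, whereas the paper also allows it to depend on the data.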
IDENTIFICATION AND ESTIMATION OF NONPARAMETRIC STRUCTURAL MODELS
This paper concerns a new statistical approach to the instrumental variables (IV) method for nonparametric structural models with additive errors. A general identifying condition for the model is proposed, based on the richness of the space generated by marginal discretizations of joint density functions. For consistent estimation, we develop a statistical regularization theory to solve a random Fredholm integral equation of the first kind. A minimal set of conditions is given for the consistency of a general regularization method. Using an abstract smoothness condition, we derive some optimal bounds, given the accuracies of preliminary estimates, and show the convergence rates of various regularization methods, including the (ordinary/iterated/generalized) Tikhonov and Showalter methods. An application of the general regularization theory is discussed with a focus on a kernel smoothing method. We show an exact closed form, as well as the optimal convergence rate, of the kernel IV estimates of various regularization methods. The finite sample properties of the estimates are investigated via a small-scale Monte Carlo experiment.
Nonparametric structural models, IV estimation, statistical inverse problems
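The central numerical step, regularizing a discretized Fredholm integral equation of the first kind, can be sketched as follows. This is a generic illustration of ordinary Tikhonov regularization, not the paper's estimator; the kernel, quadrature, and noise level are all illustrative:

```python
import numpy as np

# Discretize a Fredholm equation of the first kind, (Kf)(t) = g(t),
# on a uniform grid: K[i, j] ~ k(t_i, s_j) * (quadrature weight).
m = 200
s = np.linspace(0.0, 1.0, m)
K = np.exp(-((s[:, None] - s[None, :]) ** 2) / 0.02) / m
f_true = np.sin(np.pi * s)
g = K @ f_true + 1e-4 * np.random.default_rng(1).normal(size=m)  # noisy data

def tikhonov(K, g, alpha):
    """Ordinary Tikhonov: f_alpha = argmin ||K f - g||^2 + alpha * ||f||^2,
    i.e. f_alpha = (K^T K + alpha I)^{-1} K^T g."""
    return np.linalg.solve(K.T @ K + alpha * np.eye(K.shape[1]), K.T @ g)

f_naive = np.linalg.solve(K, g)        # unregularized: noise is amplified
f_alpha = tikhonov(K, g, alpha=1e-5)   # regularized: stable

err_naive = np.max(np.abs(f_naive - f_true))
err_reg = np.max(np.abs(f_alpha - f_true))
print(err_naive, err_reg)
```

Because the problem is ill-posed, the unregularized solve amplifies even tiny noise catastrophically, while the Tikhonov solution stays close to the truth; the iterated and generalized Tikhonov and Showalter variants mentioned above replace the filter (sigma^2 + alpha)^{-1} with other filter functions of the singular values.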
Inverse Density as an Inverse Problem: The Fredholm Equation Approach
In this paper we address the problem of estimating the ratio q/p,
where p is a density function and q is another density, or, more generally,
an arbitrary function. Knowing or approximating this ratio is needed in various
problems of inference and integration, in particular, when one needs to average
a function with respect to one probability distribution, given a sample from
another. This is often referred to as importance sampling in statistical
inference and is also closely related to the problem of covariate shift
in transfer learning, as well as to various MCMC methods. It may also be useful
for separating the underlying geometry of a space, say a manifold, from the
density function defined on it.
Our approach is based on reformulating the problem of estimating q/p
as an inverse problem in terms of an integral operator
corresponding to a kernel, and thus reducing it to an integral equation, known
as the Fredholm problem of the first kind. This formulation, combined with the
techniques of regularization and kernel methods, leads to a principled
kernel-based framework for constructing algorithms and for analyzing them
theoretically.
The resulting family of algorithms (FIRE, for Fredholm Inverse Regularized
Estimator) is flexible, simple and easy to implement.
We provide detailed theoretical analysis including concentration bounds and
convergence rates for the Gaussian kernel in the case of densities defined on
R^d, compact domains in R^d and smooth d-dimensional sub-manifolds of
the Euclidean space.
We also show experimental results including applications to classification
and semi-supervised learning within the covariate shift framework and
demonstrate some encouraging experimental comparisons. We also show how the
parameters of our algorithms can be chosen in a completely unsupervised manner.
Comment: Fixing a few typos in the last version
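A minimal sketch of the Fredholm reformulation may help fix ideas. It discretizes the empirical integral equation at the sample points and applies a plain Tikhonov penalty; the FIRE algorithms in the paper regularize with RKHS norms, so the kernel width, penalty, and all parameter values below are illustrative simplifications:

```python
import numpy as np

rng = np.random.default_rng(3)
xp = rng.normal(0.0, 1.0, 500)    # sample from p = N(0, 1)
xq = rng.normal(0.5, 1.0, 500)    # sample from q = N(0.5, 1)

def k(a, b, s2=0.25):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * s2))

# Empirical Fredholm equation of the first kind at the p-sample points:
#   (1/n) sum_i k(x, x_i) f(x_i)  ≈  (1/m) sum_j k(x, z_j),
# where the left side estimates ∫ k(x,y) f(y) p(y) dy, the right side
# estimates ∫ k(x,y) q(y) dy, and the solution satisfies f ≈ q/p.
A = k(xp, xp) / len(xp)          # discretized operator acting on f
b = k(xp, xq).mean(axis=1)       # empirical right-hand side
lam = 1e-3
ratio = np.linalg.solve(A.T @ A + lam * np.eye(len(xp)), A.T @ b)

# The true ratio here is q(x)/p(x) = exp(0.5*x - 0.125); the estimate
# should at least reproduce its increasing trend.
print(ratio[xp > 0.5].mean(), ratio[xp < -0.5].mean())
```

The estimated ratio can then be used directly as importance weights, e.g. to reweight a p-sample so that averages approximate expectations under q in the covariate-shift setting.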
Model selection of polynomial kernel regression
Polynomial kernel regression is one of the standard and state-of-the-art
learning strategies. However, as is well known, the choices of the degree of
polynomial kernel and the regularization parameter are still open in the realm
of model selection. The first aim of this paper is to develop a strategy to
select these parameters. On the one hand, based on the worst-case learning rate
analysis, we show that the regularization term in polynomial kernel regression
is not necessary. In other words, the regularization parameter can decrease
arbitrarily fast when the degree of the polynomial kernel is suitably tuned. On
the other hand, taking into account the implementation of the algorithm, the
regularization term is required. In summary, the only effect of the regularization
term in polynomial kernel regression is to circumvent the ill-conditioning
of the kernel matrix. Based on this, the second purpose of this paper is to
propose a new model selection strategy, and then design an efficient learning
algorithm. Both theoretical and experimental analyses show that the new
strategy outperforms the previous one. Theoretically, we prove that the new
learning strategy is almost optimal if the regression function is smooth.
Experimentally, it is shown that the new strategy can significantly reduce the
computational burden without loss of generalization capability.
Comment: 29 pages, 4 figures