21,819 research outputs found
A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression
Many machine learning problems can be formulated as predicting labels for a
pair of objects. Problems of that kind are often referred to as pairwise
learning, dyadic prediction or network inference problems. During the last
decade kernel methods have played a dominant role in pairwise learning. They
still obtain a state-of-the-art predictive performance, but a theoretical
analysis of their behavior has been underexplored in the machine learning
literature.
In this work we review and unify existing kernel-based algorithms that are
commonly used in different pairwise learning settings, ranging from matrix
filtering to zero-shot learning. To this end, we focus on closed-form efficient
instantiations of Kronecker kernel ridge regression. We show that independent
task kernel ridge regression, two-step kernel ridge regression and a linear
matrix filter arise naturally as a special case of Kronecker kernel ridge
regression, implying that all these methods implicitly minimize a squared loss.
In addition, we analyze universality, consistency and spectral filtering
properties. Our theoretical results provide valuable insights in assessing the
advantages and limitations of existing pairwise learning methods.Comment: arXiv admin note: text overlap with arXiv:1606.0427
An Identity for Kernel Ridge Regression
This paper derives an identity connecting the square loss of ridge regression
in on-line mode with the loss of the retrospectively best regressor. Some
corollaries about the properties of the cumulative loss of on-line ridge
regression are also obtained.Comment: 35 pages; extended version of ALT 2010 paper (Proceedings of ALT
2010, LNCS 6331, Springer, 2010
Nonlinear Forecasting with Many Predictors using Kernel Ridge Regression
This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predictive regression model is based on a shrinkage estimator to avoid overfitting. We extend the kernel ridge regression methodology to enable its use for economic time-series forecasting, by including lags of the dependent variable or other individual variables as predictors, as is typically desired in macroeconomic and financial applications. Monte Carlo simulations as well as an empirical application to various key measures of real economic activity confirm that kernel ridge regression can produce more accurate forecasts than traditional linear methods for dealing with many predictors based on principal component regression
Spectral Norm of Random Kernel Matrices with Applications to Privacy
Kernel methods are an extremely popular set of techniques used for many
important machine learning and data analysis applications. In addition to
having good practical performances, these methods are supported by a
well-developed theory. Kernel methods use an implicit mapping of the input data
into a high dimensional feature space defined by a kernel function, i.e., a
function returning the inner product between the images of two data points in
the feature space. Central to any kernel method is the kernel matrix, which is
built by evaluating the kernel function on a given sample dataset.
In this paper, we initiate the study of non-asymptotic spectral theory of
random kernel matrices. These are n x n random matrices whose (i,j)th entry is
obtained by evaluating the kernel function on and , where
are a set of n independent random high-dimensional vectors. Our
main contribution is to obtain tight upper bounds on the spectral norm (largest
eigenvalue) of random kernel matrices constructed by commonly used kernel
functions based on polynomials and Gaussian radial basis.
As an application of these results, we provide lower bounds on the distortion
needed for releasing the coefficients of kernel ridge regression under
attribute privacy, a general privacy notion which captures a large class of
privacy definitions. Kernel ridge regression is standard method for performing
non-parametric regression that regularly outperforms traditional regression
approaches in various domains. Our privacy distortion lower bounds are the
first for any kernel technique, and our analysis assumes realistic scenarios
for the input, unlike all previous lower bounds for other release problems
which only hold under very restrictive input settings.Comment: 16 pages, 1 Figur
Inference in Nonlinear Differential Equations
Parameter inference in mechanistic models of coupled differential equations is a challenging problem. We propose a new method using kernel ridge regression in Reproducing Kernel Hilbert Spaces (RKHS). A three-step gradient matching algorithm is developed and applied to a realistic biochemical model
Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates
We establish optimal convergence rates for a decomposition-based scalable
approach to kernel ridge regression. The method is simple to describe: it
randomly partitions a dataset of size N into m subsets of equal size, computes
an independent kernel ridge regression estimator for each subset, then averages
the local solutions into a global predictor. This partitioning leads to a
substantial reduction in computation time versus the standard approach of
performing kernel ridge regression on all N samples. Our two main theorems
establish that despite the computational speed-up, statistical optimality is
retained: as long as m is not too large, the partition-based estimator achieves
the statistical minimax rate over all estimators using the set of N samples. As
concrete examples, our theory guarantees that the number of processors m may
grow nearly linearly for finite-rank kernels and Gaussian kernels and
polynomially in N for Sobolev spaces, which in turn allows for substantial
reductions in computational cost. We conclude with experiments on both
simulated data and a music-prediction task that complement our theoretical
results, exhibiting the computational and statistical benefits of our approach
- …