A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression

Author: Airola Antti
De Baets Bernard
Pahikkala Tapio
Stock Michiel
Waegeman Willem
Publication date: 01/01/2018

Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction or network inference problems. During the last decade kernel methods have played a dominant role in pairwise learning. They still obtain state-of-the-art predictive performance, but the theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify existing kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression and a linear matrix filter arise naturally as special cases of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency and spectral filtering properties. Our theoretical results provide valuable insights for assessing the advantages and limitations of existing pairwise learning methods. Comment: arXiv admin note: text overlap with arXiv:1606.0427
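To make the "closed-form efficient instantiation" concrete, here is a minimal sketch of Kronecker kernel ridge regression for a fully observed label matrix. It is one standard way to solve the system and is not taken from the paper. The function names, the RBF kernel, the toy data and the value of `lam` are all illustrative assumptions. With a kernel matrix K over the first objects and G over the second objects, the dual parameters A satisfy K A G + lam * A = Y, which two small eigendecompositions solve without forming the Kronecker product.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian RBF kernel between the rows of X and the rows of Z.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kronecker_krr_fit(K, G, Y, lam):
    # Solve K A G + lam * A = Y using eigendecompositions of K and G
    # (hypothetical helper; assumes K and G are symmetric PSD and Y is complete).
    sigma, U = np.linalg.eigh(K)
    s, V = np.linalg.eigh(G)
    C = (U.T @ Y @ V) / (np.outer(sigma, s) + lam)
    return U @ C @ V.T

# Toy data (assumption): 6 objects of the first type, 5 of the second.
rng = np.random.default_rng(0)
X_rows = rng.normal(size=(6, 3))
X_cols = rng.normal(size=(5, 2))
Y = rng.normal(size=(6, 5))
K = rbf_kernel(X_rows, X_rows)
G = rbf_kernel(X_cols, X_cols)
lam = 0.5

A = kronecker_krr_fit(K, G, Y, lam)

# Reference solution: explicit Kronecker system on the column-stacked labels.
vec_y = Y.flatten(order="F")
A_direct = np.linalg.solve(np.kron(G, K) + lam * np.eye(30), vec_y).reshape(6, 5, order="F")

# Fitted values for the training pairs.
Y_fit = K @ A @ G
```

The eigendecomposition route costs roughly O(n^3 + m^3) instead of O(n^3 m^3) for the explicit system, which is why the shortcut matters once the label matrix is large.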

arXiv.org e-Print Archive

Ghent University Academic Bibliography

An Identity for Kernel Ridge Regression

Author: Kalnishkan Yuri
Zhdanov Fedor
Publication date: 01/01/2011

This paper derives an identity connecting the square loss of ridge regression in on-line mode with the loss of the retrospectively best regressor. Some corollaries about the properties of the cumulative loss of on-line ridge regression are also obtained. Comment: 35 pages; extended version of ALT 2010 paper (Proceedings of ALT 2010, LNCS 6331, Springer, 2010)
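The abstract does not state the identity itself. As an illustration, the sketch below numerically checks one known identity of this kind for linear ridge regression; treating it as the identity the paper means is an assumption. The check is that the on-line squared errors, each divided by 1 + x_t' A_{t-1}^{-1} x_t, sum to the regularized loss of the best regressor in hindsight. The names, the toy data and the ridge parameter `a` are hypothetical.

```python
import numpy as np

def online_ridge_weighted_loss(X, y, a):
    # Run ridge regression on-line: predict y_t from data seen before step t,
    # then accumulate the squared error divided by 1 + x_t' A^{-1} x_t.
    d = X.shape[1]
    A = a * np.eye(d)   # a*I plus the sum of x_s x_s' over past steps
    b = np.zeros(d)     # sum of y_s * x_s over past steps
    total = 0.0
    for x_t, y_t in zip(X, y):
        A_inv_x = np.linalg.solve(A, x_t)
        prediction = b @ A_inv_x            # equals x_t' A^{-1} b since A is symmetric
        total += (prediction - y_t) ** 2 / (1.0 + x_t @ A_inv_x)
        A += np.outer(x_t, x_t)
        b += y_t * x_t
    return total

def best_retrospective_loss(X, y, a):
    # Regularized squared loss of the ridge solution fitted on all data at once.
    d = X.shape[1]
    theta = np.linalg.solve(X.T @ X + a * np.eye(d), X.T @ y)
    return np.sum((X @ theta - y) ** 2) + a * theta @ theta

# Toy data (assumption).
rng = np.random.default_rng(0)
T, d, a = 50, 3, 0.7
X = rng.normal(size=(T, d))
y = X @ rng.normal(size=d) + 0.3 * rng.normal(size=T)

lhs = online_ridge_weighted_loss(X, y, a)
rhs = best_retrospective_loss(X, y, a)
```

The two quantities agree up to floating-point error on this toy run. The kernelized version would replace the inner products with kernel evaluations and is not shown here.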

arXiv.org e-Print Archive

CiteSeerX

Nonlinear Forecasting with Many Predictors using Kernel Ridge Regression

Author: Dijk D.J.C. (Dick) van
Exterkate P. (Peter)
Groenen P.J.F. (Patrick)
Heij C. (Christiaan)
Publication venue: Exterkate, P. (Peter)
Publication date: 01/01/2011

This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predictive regression model is based on a shrinkage estimator to avoid overfitting. We extend the kernel ridge regression methodology to enable its use for economic time-series forecasting, by including lags of the dependent variable or other individual variables as predictors, as is typically desired in macroeconomic and financial applications. Monte Carlo simulations as well as an empirical application to various key measures of real economic activity confirm that kernel ridge regression can produce more accurate forecasts than traditional linear methods for dealing with many predictors based on principal component regression.
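The following sketch illustrates the general idea of a one-step-ahead kernel ridge forecast that uses lags of the dependent variable as extra predictors. It is not the authors' specification. The Gaussian kernel, the way lags are appended, the helper names, the toy series and the values of `lam` and `gamma` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    # Gaussian kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def make_lagged_design(y, X, n_lags):
    # Each row holds the predictors X[t] followed by y[t], y[t-1], ..., y[t-n_lags+1];
    # the matching target is y[t+1] (hypothetical layout).
    rows, targets = [], []
    for t in range(n_lags - 1, len(y) - 1):
        lags = y[t - n_lags + 1 : t + 1][::-1]
        rows.append(np.concatenate([X[t], lags]))
        targets.append(y[t + 1])
    return np.asarray(rows), np.asarray(targets)

def krr_fit(Z, targets, lam, gamma):
    # Dual coefficients of kernel ridge regression: (K + lam * I)^{-1} targets.
    K = gaussian_kernel(Z, Z, gamma)
    return np.linalg.solve(K + lam * np.eye(len(Z)), targets)

def krr_predict(Z_train, alpha, Z_new, gamma):
    return gaussian_kernel(Z_new, Z_train, gamma) @ alpha

# Toy series (assumption): 10 predictors, nonlinear dependence on the first one.
rng = np.random.default_rng(0)
T, p, n_lags = 120, 10, 2
X = rng.normal(size=(T, p))
y = np.tanh(X[:, 0]) + 0.1 * rng.normal(size=T)

Z, targets = make_lagged_design(y, X, n_lags)
lam, gamma = 1.0, 0.05
alpha = krr_fit(Z, targets, lam, gamma)

# Forecast the next, unobserved value from the latest predictors and lags.
z_last = np.concatenate([X[-1], y[-n_lags:][::-1]])[None, :]
forecast = krr_predict(Z, alpha, z_last, gamma)[0]
```

In practice `lam` and `gamma` would be chosen by some form of validation rather than fixed by hand as they are here.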

EUR Research Repository

Erasmus University Digital Repository

Spectral Norm of Random Kernel Matrices with Applications to Privacy

Author: Kasiviswanathan Shiva Prasad
Rudelson Mark
Publication date: 01/01/2015

Kernel methods are an extremely popular set of techniques used for many important machine learning and data analysis applications. In addition to having good practical performance, these methods are supported by a well-developed theory. Kernel methods use an implicit mapping of the input data into a high dimensional feature space defined by a kernel function, i.e., a function returning the inner product between the images of two data points in the feature space. Central to any kernel method is the kernel matrix, which is built by evaluating the kernel function on a given sample dataset. In this paper, we initiate the study of non-asymptotic spectral theory of random kernel matrices. These are n x n random matrices whose (i,j)th entry is obtained by evaluating the kernel function on x_i and x_j, where x_1,...,x_n are a set of n independent random high-dimensional vectors. Our main contribution is to obtain tight upper bounds on the spectral norm (largest eigenvalue) of random kernel matrices constructed by commonly used kernel functions based on polynomials and Gaussian radial basis. As an application of these results, we provide lower bounds on the distortion needed for releasing the coefficients of kernel ridge regression under attribute privacy, a general privacy notion which captures a large class of privacy definitions. Kernel ridge regression is a standard method for performing non-parametric regression that regularly outperforms traditional regression approaches in various domains. Our privacy distortion lower bounds are the first for any kernel technique, and our analysis assumes realistic scenarios for the input, unlike all previous lower bounds for other release problems which only hold under very restrictive input settings. Comment: 16 pages, 1 figure
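As a small illustration of the objects the abstract studies, the sketch below builds polynomial and Gaussian kernel matrices from random vectors and computes their spectral norms. It does not reproduce the paper's bounds. The sampling distribution, the kernel parameters and the function names are assumptions.

```python
import numpy as np

def polynomial_kernel_matrix(X, degree=2, c=1.0):
    # Entry (i, j) is (<x_i, x_j> + c) ** degree.
    return (X @ X.T + c) ** degree

def gaussian_kernel_matrix(X, sigma=1.0):
    # Entry (i, j) is exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def spectral_norm_psd(K):
    # For a symmetric positive semidefinite matrix the spectral norm is the
    # largest eigenvalue; eigvalsh returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(K)[-1]

# n independent random vectors in d dimensions (Gaussian sampling is an assumption).
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d)) / np.sqrt(d)

K_poly = polynomial_kernel_matrix(X)
K_rbf = gaussian_kernel_matrix(X)
norm_poly = spectral_norm_psd(K_poly)
norm_rbf = spectral_norm_psd(K_rbf)
```

For the Gaussian kernel the diagonal entries equal 1 and all entries lie in (0, 1], so the spectral norm always falls between 1 and n. The interesting question is where in that range it lands for random inputs.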

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Inference in Nonlinear Differential Equations

Author: Filippone Maurizio
Husmeier Dirk
Niu Mu
Rogers Simon
Publication date: 01/07/2015

Parameter inference in mechanistic models of coupled differential equations is a challenging problem. We propose a new method using kernel ridge regression in Reproducing Kernel Hilbert Spaces (RKHS). A three-step gradient matching algorithm is developed and applied to a realistic biochemical model.

Enlighten

Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates

Author: Duchi John C.
Wainwright Martin J.
Zhang Yuchen
Publication date: 29/04/2014

We establish optimal convergence rates for a decomposition-based scalable approach to kernel ridge regression. The method is simple to describe: it randomly partitions a dataset of size N into m subsets of equal size, computes an independent kernel ridge regression estimator for each subset, then averages the local solutions into a global predictor. This partitioning leads to a substantial reduction in computation time versus the standard approach of performing kernel ridge regression on all N samples. Our two main theorems establish that despite the computational speed-up, statistical optimality is retained: as long as m is not too large, the partition-based estimator achieves the statistical minimax rate over all estimators using the set of N samples. As concrete examples, our theory guarantees that the number of processors m may grow nearly linearly for finite-rank kernels and Gaussian kernels and polynomially in N for Sobolev spaces, which in turn allows for substantial reductions in computational cost. We conclude with experiments on both simulated data and a music-prediction task that complement our theoretical results, exhibiting the computational and statistical benefits of our approach.
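The partition, fit and average procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The Gaussian kernel, the regularization scaling `lam * n_i` on each subset, the helper names, the toy data and all parameter values are hypothetical choices.

```python
import numpy as np

def gaussian_kernel(A, B, gamma):
    # Gaussian kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_local_krr(X, y, lam, gamma):
    # Kernel ridge regression on one subset; the penalty is scaled by the
    # subset size (an assumed convention).
    n = len(X)
    K = gaussian_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return X, alpha

def dc_krr_fit(X, y, m, lam, gamma, rng):
    # Randomly partition the data into m subsets and fit each independently.
    perm = rng.permutation(len(X))
    return [fit_local_krr(X[idx], y[idx], lam, gamma) for idx in np.array_split(perm, m)]

def dc_krr_predict(models, X_new, gamma):
    # Average the predictions of the local estimators.
    preds = [gaussian_kernel(X_new, X_i, gamma) @ alpha_i for X_i, alpha_i in models]
    return np.mean(preds, axis=0)

# Toy regression problem (assumption): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
N, m, lam, gamma = 200, 4, 1e-3, 10.0
X = rng.uniform(0.0, 1.0, size=(N, 1))
y = np.sin(2.0 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=N)
X_test = np.linspace(0.0, 1.0, 50)[:, None]
f_test = np.sin(2.0 * np.pi * X_test[:, 0])

models = dc_krr_fit(X, y, m, lam, gamma, rng)
pred_dc = dc_krr_predict(models, X_test, gamma)
mse_dc = np.mean((pred_dc - f_test) ** 2)

# With m = 1 the procedure reduces to ordinary kernel ridge regression on all N samples.
pred_single = dc_krr_predict(dc_krr_fit(X, y, 1, lam, gamma, rng), X_test, gamma)
_, alpha_full = fit_local_krr(X, y, lam, gamma)
pred_full = gaussian_kernel(X_test, X, gamma) @ alpha_full
```

Each local solve works on an (N/m) x (N/m) system, so the cubic cost of the linear solve drops from O(N^3) to O(N^3 / m^2) in total, and the m fits can run in parallel.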

arXiv.org e-Print Archive

CiteSeerX