Search CORE

48,739 research outputs found

Online Learning of Noisy Data with Kernels

Author: Cesa-Bianchi Nicolò
Shalev-Shwartz Shai
Shamir Ohad
Publication venue
Publication date: 01/01/2010
Field of study

We study online learning when individual instances are corrupted by adversarially chosen random noise. We assume the noise distribution is unknown, and may change over time with no restriction other than having zero mean and bounded variance. Our technique relies on a family of unbiased estimators for non-linear functions, which may be of independent interest. We show that a variant of online gradient descent can learn functions in any dot-product (e.g., polynomial) or Gaussian kernel space with any analytic convex loss function. Our variant uses randomized estimates that need to query a random number of noisy copies of each instance, where with high probability this number is upper bounded by a constant. Allowing such multiple queries cannot be avoided: Indeed, we show that online learning is in general impossible when only one noisy copy of each instance can be accessed.Comment: This is a full version of the paper appearing in the 23rd International Conference on Learning Theory (COLT 2010

arXiv.org e-Print Archive

CiteSeerX

AIR Universita degli studi di Milano

Online Learning with Multiple Operator-valued Kernels

Author: Audiffren Julien
Kadri Hachem
Publication venue
Publication date: 15/10/2013
Field of study

We consider the problem of learning a vector-valued function f in an online learning setting. The function f is assumed to lie in a reproducing Hilbert space of operator-valued kernels. We describe two online algorithms for learning f while taking into account the output structure. A first contribution is an algorithm, ONORMA, that extends the standard kernel-based online learning algorithm NORMA from scalar-valued to operator-valued setting. We report a cumulative error bound that holds both for classification and regression. We then define a second algorithm, MONORMA, which addresses the limitation of pre-defining the output structure in ONORMA by learning sequentially a linear combination of operator-valued kernels. Our experiments show that the proposed algorithms achieve good performance results with low computational cost

arXiv.org e-Print Archive

HAL AMU

HAL Descartes

Analyzing sparse dictionaries for online learning with kernels

Author: Honeine Paul
Publication venue
Publication date: 21/09/2014
Field of study

Many signal processing and machine learning methods share essentially the same linear-in-the-parameter model, with as many parameters as available samples as in kernel-based machines. Sparse approximation is essential in many disciplines, with new challenges emerging in online learning with kernels. To this end, several sparsity measures have been proposed in the literature to quantify sparse dictionaries and constructing relevant ones, the most prolific ones being the distance, the approximation, the coherence and the Babel measures. In this paper, we analyze sparse dictionaries based on these measures. By conducting an eigenvalue analysis, we show that these sparsity measures share many properties, including the linear independence condition and inducing a well-posed optimization problem. Furthermore, we prove that there exists a quasi-isometry between the parameter (i.e., dual) space and the dictionary's induced feature space.Comment: 10 page

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

Sequential Gaussian Processes for Online Learning of Nonstationary Functions

Author: Dumitrascu Bianca
Engelhardt Barbara E.
Williamson Sinead A.
Zhang Michael Minyi
Publication venue
Publication date: 16/10/2019
Field of study

Many machine learning problems can be framed in the context of estimating functions, and often these are time-dependent functions that are estimated in real-time as observations arrive. Gaussian processes (GPs) are an attractive choice for modeling real-valued nonlinear functions due to their flexibility and uncertainty quantification. However, the typical GP regression model suffers from several drawbacks: i) Conventional GP inference scales

O(N^{3})

with respect to the number of observations; ii) updating a GP model sequentially is not trivial; and iii) covariance kernels often enforce stationarity constraints on the function, while GPs with non-stationary covariance kernels are often intractable to use in practice. To overcome these issues, we propose an online sequential Monte Carlo algorithm to fit mixtures of GPs that capture non-stationary behavior while allowing for fast, distributed inference. By formulating hyperparameter optimization as a multi-armed bandit problem, we accelerate mixing for real time inference. Our approach empirically improves performance over state-of-the-art methods for online GP estimation in the context of prediction for simulated non-stationary data and hospital time series data

arXiv.org e-Print Archive

Extension of Wirtinger's Calculus to Reproducing Kernel Hilbert Spaces and the Complex Kernel LMS

Author: Bouboulis Pantelis
Theodoridis Sergios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/11/2010
Field of study

Over the last decade, kernel methods for nonlinear processing have successfully been used in the machine learning community. The primary mathematical tool employed in these methods is the notion of the Reproducing Kernel Hilbert Space. However, so far, the emphasis has been on batch techniques. It is only recently, that online techniques have been considered in the context of adaptive signal processing tasks. Moreover, these efforts have only been focussed on real valued data sequences. To the best of our knowledge, no adaptive kernel-based strategy has been developed, so far, for complex valued signals. Furthermore, although the real reproducing kernels are used in an increasing number of machine learning problems, complex kernels have not, yet, been used, in spite of their potential interest in applications that deal with complex signals, with Communications being a typical example. In this paper, we present a general framework to attack the problem of adaptive filtering of complex signals, using either real reproducing kernels, taking advantage of a technique called \textit{complexification} of real RKHSs, or complex reproducing kernels, highlighting the use of the complex gaussian kernel. In order to derive gradients of operators that need to be defined on the associated complex RKHSs, we employ the powerful tool of Wirtinger's Calculus, which has recently attracted attention in the signal processing community. To this end, in this paper, the notion of Wirtinger's calculus is extended, for the first time, to include complex RKHSs and use it to derive several realizations of the Complex Kernel Least-Mean-Square (CKLMS) algorithm. Experiments verify that the CKLMS offers significant performance improvements over several linear and nonlinear algorithms, when dealing with nonlinearities.Comment: 15 pages (double column), preprint of article accepted in IEEE Trans. Sig. Pro

arXiv.org e-Print Archive

CiteSeerX

Crossref