Online Learning of Noisy Data with Kernels
We study online learning when individual instances are corrupted by
adversarially chosen random noise. We assume the noise distribution is unknown,
and may change over time with no restriction other than having zero mean and
bounded variance. Our technique relies on a family of unbiased estimators for
non-linear functions, which may be of independent interest. We show that a
variant of online gradient descent can learn functions in any dot-product
(e.g., polynomial) or Gaussian kernel space with any analytic convex loss
function. Our variant uses randomized estimates that need to query a random
number of noisy copies of each instance, where with high probability this
number is upper bounded by a constant. Allowing such multiple queries cannot be
avoided: Indeed, we show that online learning is in general impossible when
only one noisy copy of each instance can be accessed.
Comment: This is a full version of the paper appearing in the 23rd
International Conference on Learning Theory (COLT 2010).
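To illustrate why a single noisy copy is not enough for non-linear functions, here is a minimal sketch. It is not the paper's estimator family. It assumes additive zero-mean Gaussian noise and the simplest non-linear target, x^2, and every function name is hypothetical. Squaring one noisy copy is biased by the noise variance, while the product of two independent noisy copies is unbiased.

```python
import random


def noisy_copy(x, sigma, rng):
    """Return x plus zero-mean Gaussian noise (an assumed noise model)."""
    return x + rng.gauss(0.0, sigma)


def square_single_copy(x, sigma, rng):
    """Biased estimate of x^2: E[(x + n)^2] = x^2 + sigma^2."""
    c = noisy_copy(x, sigma, rng)
    return c * c


def square_two_copies(x, sigma, rng):
    """Unbiased estimate of x^2: E[(x + n1)(x + n2)] = x^2
    when n1 and n2 are independent and zero-mean."""
    return noisy_copy(x, sigma, rng) * noisy_copy(x, sigma, rng)


def average(estimator, x, sigma, trials, seed):
    """Monte Carlo mean of an estimator over independent trials."""
    rng = random.Random(seed)
    return sum(estimator(x, sigma, rng) for _ in range(trials)) / trials


x, sigma = 2.0, 1.0
single = average(square_single_copy, x, sigma, 200_000, seed=0)
double = average(square_two_copies, x, sigma, 200_000, seed=0)
print(f"true x^2         = {x * x:.3f}")
print(f"single-copy mean = {single:.3f}")  # close to x^2 + sigma^2
print(f"two-copy mean    = {double:.3f}")  # close to x^2
```

The same product trick extends to higher-degree monomials by multiplying more independent copies, which gives some intuition for why the number of queries per instance grows with the degree of the function being estimated.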
Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes
Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir-inspired methods: 1) the online echo-state Gaussian process (OESGP) and 2) its infinite variant, the online infinite echo-state Gaussian process (OIESGP). Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleaned from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods yield high accuracies relative to state-of-the-art methods and to standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair.
Online Real-time Learning of Dynamical Systems from Noisy Streaming Data
Recent advancements in sensing and communication facilitate obtaining
high-frequency real-time data from various physical systems like power
networks, climate systems, biological networks, etc. However, since the data
are recorded by physical sensors, it is natural that the obtained data are
corrupted by measurement noise. In this paper, we present a novel algorithm for
online real-time learning of dynamical systems from noisy time-series data,
which employs the Robust Koopman operator framework to mitigate the effect of
measurement noise. The proposed algorithm has three main advantages: a) it
allows for online real-time monitoring of a dynamical system; b) it obtains a
linear representation of the underlying dynamical system, thus enabling the
user to use linear systems theory for analysis and control of the system; c) it
is computationally fast and less intensive than the popular Extended Dynamic
Mode Decomposition (EDMD) algorithm. We illustrate the efficiency of the
proposed algorithm by applying it to identify the Van der Pol oscillator, the
IEEE 68-bus system, and a ring network of Van der Pol oscillators.
Empirical Gaussian priors for cross-lingual transfer learning
Sequence model learning algorithms typically maximize log-likelihood minus
the norm of the model (or minimize Hamming loss + norm). In cross-lingual
part-of-speech (POS) tagging, our target language training data consists of
sequences of sentences with word-by-word labels projected from translations in
languages for which we have labeled data, via word alignments. Our training
data is therefore very noisy, and if Rademacher complexity is high, learning
algorithms are prone to overfit. Norm-based regularization assumes a constant
width and zero mean prior. We instead propose to use the source language
models to estimate the parameters of a Gaussian prior for learning new POS
taggers. This leads to significantly better performance in multi-source
transfer set-ups. We also present a drop-out version that injects (empirical)
Gaussian noise during online learning. Finally, we note that using empirical
Gaussian priors leads to much lower Rademacher complexity, and is superior to
optimally weighted model interpolation.
Comment: Presented at NIPS 2015 Workshop on Transfer and Multi-Task Learning
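The contrast between a zero-mean, constant-width prior and an empirical Gaussian prior can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it assumes the prior is a diagonal Gaussian whose per-feature mean and variance are computed across source-model weight vectors, and all function names and parameters (`min_var`, `strength`, `lr`) are hypothetical.

```python
def empirical_gaussian_prior(source_weights, min_var=1e-3):
    """Per-feature mean and variance across source-model weight vectors.
    min_var is an assumed floor that keeps the penalty finite."""
    k = len(source_weights)
    dim = len(source_weights[0])
    mean = [sum(w[j] for w in source_weights) / k for j in range(dim)]
    var = [
        max(sum((w[j] - mean[j]) ** 2 for w in source_weights) / k, min_var)
        for j in range(dim)
    ]
    return mean, var


def prior_penalty_gradient(w, mean, var):
    """Gradient of the negative log prior sum_j (w_j - mean_j)^2 / (2 var_j).
    With mean = 0 and constant var this reduces to ordinary L2 regularization."""
    return [(wj - mj) / vj for wj, mj, vj in zip(w, mean, var)]


def regularized_step(w, loss_grad, mean, var, lr=0.1, strength=1.0):
    """One online gradient step on loss plus the Gaussian-prior penalty."""
    pen = prior_penalty_gradient(w, mean, var)
    return [wj - lr * (g + strength * p) for wj, g, p in zip(w, loss_grad, pen)]


# Two hypothetical source-language models with two features each.
source_weights = [[1.0, -1.0], [3.0, 1.0]]
mean, var = empirical_gaussian_prior(source_weights)

# With a zero loss gradient, the step pulls w toward the prior mean
# rather than toward the origin.
w_next = regularized_step([0.0, 0.0], [0.0, 0.0], mean, var)
print(mean, var, w_next)
```

Features on which the source models agree get a small variance and therefore a strong pull toward the shared value, while features on which they disagree are penalized only weakly.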