6,996 research outputs found
Spatio-temporal learning with the online finite and infinite echo-state Gaussian processes
Successful biological systems adapt to change. In this paper, we are principally concerned with adaptive systems that operate in environments where data arrives sequentially and is multivariate in nature, for example, sensory streams in robotic systems. We contribute two reservoir inspired methods: 1) the online echostate Gaussian process (OESGP) and 2) its infinite variant, the online infinite echostate Gaussian process (OIESGP) Both algorithms are iterative fixed-budget methods that learn from noisy time series. In particular, the OESGP combines the echo-state network with Bayesian online learning for Gaussian processes. Extending this to infinite reservoirs yields the OIESGP, which uses a novel recursive kernel with automatic relevance determination that enables spatial and temporal feature weighting. When fused with stochastic natural gradient descent, the kernel hyperparameters are iteratively adapted to better model the target system. Furthermore, insights into the underlying system can be gleamed from inspection of the resulting hyperparameters. Experiments on noisy benchmark problems (one-step prediction and system identification) demonstrate that our methods yield high accuracies relative to state-of-the-art methods, and standard kernels with sliding windows, particularly on problems with irrelevant dimensions. In addition, we describe two case studies in robotic learning-by-demonstration involving the Nao humanoid robot and the Assistive Robot Transport for Youngsters (ARTY) smart wheelchair
Distributed Adaptive Learning with Multiple Kernels in Diffusion Networks
We propose an adaptive scheme for distributed learning of nonlinear functions
by a network of nodes. The proposed algorithm consists of a local adaptation
stage utilizing multiple kernels with projections onto hyperslabs and a
diffusion stage to achieve consensus on the estimates over the whole network.
Multiple kernels are incorporated to enhance the approximation of functions
with several high and low frequency components common in practical scenarios.
We provide a thorough convergence analysis of the proposed scheme based on the
metric of the Cartesian product of multiple reproducing kernel Hilbert spaces.
To this end, we introduce a modified consensus matrix considering this specific
metric and prove its equivalence to the ordinary consensus matrix. Besides, the
use of hyperslabs enables a significant reduction of the computational demand
with only a minor loss in the performance. Numerical evaluations with synthetic
and real data are conducted showing the efficacy of the proposed algorithm
compared to the state of the art schemes.Comment: Double-column 15 pages, 10 figures, submitted to IEEE Trans. Signal
Processin
Sparse multinomial kernel discriminant analysis (sMKDA)
Dimensionality reduction via canonical variate analysis (CVA) is important for pattern recognition and has been extended variously to permit more flexibility, e.g. by "kernelizing" the formulation. This can lead to over-fitting, usually ameliorated by regularization. Here, a method for sparse, multinomial kernel discriminant analysis (sMKDA) is proposed, using a sparse basis to control complexity. It is based on the connection between CVA and least-squares, and uses forward selection via orthogonal least-squares to approximate a basis, generalizing a similar approach for binomial problems. Classification can be performed directly via minimum Mahalanobis distance in the canonical variates. sMKDA achieves state-of-the-art performance in terms of accuracy and sparseness on 11 benchmark datasets
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.Comment: 232 page
- …