An information-theoretic on-line update principle for perception-action coupling
Inspired by findings of sensorimotor coupling in humans and animals, there
has recently been a growing interest in the interaction between action and
perception in robotic systems [Bogh et al., 2016]. Here we consider perception
and action as two serial information channels with limited
information-processing capacity. We follow [Genewein et al., 2015] and
formulate a constrained optimization problem that maximizes utility under
limited information-processing capacity in the two channels. As a solution we
obtain an optimal perceptual channel and an optimal action channel that are
coupled such that perceptual information is optimized with respect to
downstream processing in the action module. The main novelty of this study is
that we propose an online optimization procedure to find bounded-optimal
perception and action channels in parameterized serial perception-action
systems. In particular, we implement the perceptual channel as a multi-layer
neural network and the action channel as a multinomial distribution. We
illustrate our method in a NAO robot simulator with a simplified cup lifting
task.
Comment: 8 pages, 2017 IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS)
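To make the constrained objective concrete, here is a minimal NumPy sketch of
discrete alternating updates in the style of Blahut-Arimoto for a toy serial
perception-action system (three percepts for four world states forces a lossy
perceptual channel). All quantities (the utility matrix, the trade-off
parameters beta1 and beta2) are illustrative assumptions; this is not the
paper's online procedure or its neural-network parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

n_w, n_x, n_a = 4, 3, 4            # world states, percepts, actions
p_w = np.full(n_w, 1.0 / n_w)      # uniform prior over world states
U = np.eye(n_w, n_a)               # utility: 1 if the action matches the state
beta1, beta2 = 5.0, 5.0            # capacity trade-offs of the two channels

# Random initialization of the two serial channels
p_x_given_w = rng.dirichlet(np.ones(n_x), size=n_w)   # shape (n_w, n_x)
p_a_given_x = rng.dirichlet(np.ones(n_a), size=n_x)   # shape (n_x, n_a)

for _ in range(200):
    p_x = p_w @ p_x_given_w                            # percept marginal
    p_a = p_x @ p_a_given_x                            # action marginal
    # Bayes posterior over world states given the percept, shape (n_x, n_w)
    p_w_given_x = (p_x_given_w * p_w[:, None]).T / p_x[:, None]

    # Action channel: soft-maximize expected utility, penalized by I(X;A)
    p_a_given_x = p_a[None, :] * np.exp(beta2 * (p_w_given_x @ U))
    p_a_given_x /= p_a_given_x.sum(axis=1, keepdims=True)

    # Downstream free energy F(w, x) = E[U] - KL(p(a|x) || p(a)) / beta2
    kl = np.sum(p_a_given_x * np.log(p_a_given_x / p_a[None, :]), axis=1)
    F = U @ p_a_given_x.T - kl[None, :] / beta2        # shape (n_w, n_x)

    # Perception channel: optimized w.r.t. downstream action processing
    p_x_given_w = p_x[None, :] * np.exp(beta1 * F)
    p_x_given_w /= p_x_given_w.sum(axis=1, keepdims=True)

expected_utility = np.einsum('w,wx,xa,wa->', p_w, p_x_given_w, p_a_given_x, U)
print(f"expected utility after convergence: {expected_utility:.3f}")
```

Raising beta1 and beta2 recovers near-deterministic channels; lowering them
yields the abstraction effects discussed in the paper, with percepts shared
across world states.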
A Tutorial on Sparse Gaussian Processes and Variational Inference
Gaussian processes (GPs) provide a framework for Bayesian inference that can
offer principled uncertainty estimates for a large range of problems. For
example, if we consider regression problems with Gaussian likelihoods, a GP
model enjoys a posterior in closed form. However, identifying the posterior GP
scales cubically with the number of training examples and requires storing all
examples in memory. In order to overcome these obstacles, sparse GPs have been
proposed that approximate the true posterior GP with pseudo-training examples.
Importantly, the number of pseudo-training examples is user-defined and enables
control over computational and memory complexity. In the general case, sparse
GPs do not enjoy closed-form solutions and one has to resort to approximate
inference. In this context, a convenient choice for approximate inference is
variational inference (VI), where the problem of Bayesian inference is cast as
an optimization problem -- namely, to maximize a lower bound of the log
marginal likelihood. This paves the way for a powerful and versatile framework,
where pseudo-training examples are treated as optimization arguments of the
approximate posterior that are jointly identified together with hyperparameters
of the generative model (i.e. prior and likelihood). The framework can
naturally handle a wide scope of supervised learning problems, ranging from
regression with heteroscedastic and non-Gaussian likelihoods to classification
problems with discrete labels, as well as multilabel problems. The purpose of
this tutorial is to make the basic matter accessible to readers without prior
knowledge of either GPs or VI. A proper exposition of the subject also opens
access to more recent advances (such as importance-weighted VI, as well as
interdomain, multioutput and deep GPs) that can serve as inspiration for new
research ideas.
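As a concrete reference point for the Gaussian-likelihood case, the sketch
below evaluates the collapsed variational lower bound for sparse GP regression
(the bound due to Titsias, 2009) in plain NumPy. The RBF kernel, the
hyperparameter values, and the pseudo-input locations are illustrative
assumptions, not prescriptions.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row-stacked inputs A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def collapsed_elbo(X, y, Z, noise=0.1, lengthscale=1.0, variance=1.0):
    """Collapsed variational bound for sparse GP regression:
    log N(y | 0, Q_nn + noise^2 I) - tr(K_nn - Q_nn) / (2 noise^2),
    with Q_nn = K_nm K_mm^{-1} K_mn and pseudo-inputs Z."""
    n, m = X.shape[0], Z.shape[0]
    Kmm = rbf(Z, Z, lengthscale, variance) + 1e-6 * np.eye(m)  # jitter
    Kmn = rbf(Z, X, lengthscale, variance)
    L = np.linalg.cholesky(Kmm)
    A = np.linalg.solve(L, Kmn)                 # so Q_nn = A.T @ A
    B = np.eye(m) + (A @ A.T) / noise**2
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / noise**2
    # Gaussian log density of y under N(0, Q_nn + noise^2 I), computed via
    # the matrix determinant and Woodbury identities (O(n m^2), not O(n^3))
    log_det = 2.0 * np.sum(np.log(np.diag(LB))) + n * np.log(noise**2)
    quad = y @ y / noise**2 - c @ c
    log_marg = -0.5 * (n * np.log(2.0 * np.pi) + log_det + quad)
    # Trace correction; diag(K_nn) = variance for the RBF kernel
    trace_term = -0.5 * (n * variance - np.sum(A**2)) / noise**2
    return log_marg + trace_term

# Toy usage: 200 noisy observations, 10 pseudo-inputs
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, (200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Z = np.linspace(-3.0, 3.0, 10)[:, None]
print("collapsed ELBO:", collapsed_elbo(X, y, Z))
```

In the general (non-conjugate) setting described in the tutorial, the bound is
instead optimized over an explicit variational distribution at the
pseudo-inputs, jointly with Z and the kernel and likelihood hyperparameters.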
Bounded Rational Decision-Making in Feedforward Neural Networks
Bounded rational decision-makers transform sensory input into motor output
under limited computational resources. Mathematically, such decision-makers can
be modeled as information-theoretic channels with limited transmission rate.
Here, we apply this formalism for the first time to multilayer feedforward
neural networks. We derive synaptic weight update rules for two scenarios,
where either each individual neuron or the network as a whole is treated as a
bounded rational decision-maker. In the update rules, bounded rationality
translates into information-theoretically motivated types of regularization in
weight space. In experiments on the MNIST benchmark classification task for
handwritten digits, we show that such information-theoretic regularization
successfully prevents overfitting across different architectures and attains
results that are competitive with other recent techniques like dropout,
DropConnect, and Bayes by Backprop, for both ordinary and convolutional neural
networks.
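The regularizers derived in the paper are specific to its formalism; as a loose
illustration of the general mechanism, the sketch below trains a tiny
feedforward network with a rate-like penalty that pulls each weight matrix
toward an adaptive default (a running average), assuming a Gaussian treatment
under which the information cost reduces to a squared distance. The
architecture, data, and the parameters beta and tau are illustrative
assumptions, not the paper's derived update rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class problem: two Gaussian blobs in 2-D
n = 200
X = np.vstack([rng.normal(-1, 1, (n // 2, 2)), rng.normal(+1, 1, (n // 2, 2))])
t = np.repeat([0, 1], n // 2)

def forward(W1, b1, W2, b2, X):
    h = np.tanh(X @ W1 + b1)                 # hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)[:, 0]))  # class-1 probability
    return h, p

# One-hidden-layer network
W1 = 0.1 * rng.standard_normal((2, 16)); b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((16, 1)); b2 = np.zeros(1)

beta = 100.0         # resource parameter: smaller beta = stronger rate penalty
lr, tau = 0.1, 0.01  # learning rate, default-tracking rate
prior = [W1.copy(), W2.copy()]   # adaptive "default" weights

for step in range(500):
    h, p = forward(W1, b1, W2, b2, X)
    err = (p - t)[:, None]                   # d(cross-entropy)/d(logits)
    # Task gradients plus the rate-like pull toward the adaptive default
    gW2 = h.T @ err / n + (W2 - prior[1]) / beta
    gb2 = err.mean(0)
    dh = err @ W2.T * (1 - h**2)             # backprop through tanh
    gW1 = X.T @ dh / n + (W1 - prior[0]) / beta
    gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    # The default tracks the weight marginal, analogous to adaptive priors
    # in bounded-rationality formulations
    prior[0] += tau * (W1 - prior[0]); prior[1] += tau * (W2 - prior[1])

_, p = forward(W1, b1, W2, b2, X)
print("train accuracy:", np.mean((p > 0.5) == t))
```

Shrinking beta caps how far the weights can move from the default, which is the
weight-space analogue of the limited transmission rate described above.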