Empowerment for Continuous Agent-Environment Systems
This paper develops generalizations of empowerment to continuous states.
Empowerment is a recently introduced information-theoretic quantity motivated
by hypotheses about the efficiency of the sensorimotor loop in biological
organisms, but also from considerations stemming from curiosity-driven
learning. Empowerment measures, for agent-environment systems with stochastic
transitions, how much influence an agent has on its environment, but only that
influence that can be sensed by the agent sensors. It is an
information-theoretic generalization of joint controllability (influence on
environment) and observability (measurement by sensors) of the environment by
the agent, where both controllability and observability are usually defined in
control theory via the dimensionality of the control/observation spaces. Earlier
work has shown that empowerment has various interesting and relevant
properties, e.g., it allows us to identify salient states using only the
dynamics, and it can act as intrinsic reward without requiring an external
reward. However, in this previous work empowerment was limited to the case of
small-scale and discrete domains and furthermore state transition probabilities
were assumed to be known. The goal of this paper is to extend empowerment to
the significantly more important and relevant case of continuous vector-valued
state spaces and initially unknown state transition probabilities. The
continuous state space is addressed by Monte-Carlo approximation; the unknown
transitions are addressed by model learning and prediction for which we apply
Gaussian process regression with iterated forecasting. In a number of
well-known continuous control tasks we examine the dynamics induced by
empowerment and include an application to exploration and online model
learning.
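In the discrete setting that the earlier work operates in, empowerment is the Shannon channel capacity of the channel from action sequences to subsequent sensor states, which can be computed with the Blahut-Arimoto iteration. The following is a minimal pure-Python sketch of that computation; the function name and the toy two-action channel are illustrative, not taken from the paper:

```python
import math

def channel_capacity(p_y_given_x, iters=200):
    """Blahut-Arimoto iteration for the capacity (in bits) of a discrete
    memoryless channel p(y|x).  Empowerment is this capacity when x ranges
    over action sequences and y over the resulting sensor states."""
    n_x = len(p_y_given_x)
    n_y = len(p_y_given_x[0])
    p_x = [1.0 / n_x] * n_x                       # start from a uniform input
    for _ in range(iters):
        # marginal over outputs under the current input distribution
        q_y = [sum(p_x[x] * p_y_given_x[x][y] for x in range(n_x))
               for y in range(n_y)]
        # multiplicative update: reweight each input by exp(KL divergence)
        w = []
        for x in range(n_x):
            d = sum(p_y_given_x[x][y] * math.log(p_y_given_x[x][y] / q_y[y])
                    for y in range(n_y) if p_y_given_x[x][y] > 0)
            w.append(p_x[x] * math.exp(d))
        z = sum(w)
        p_x = [wx / z for wx in w]
    # mutual information I(X;Y) at the converged input distribution
    q_y = [sum(p_x[x] * p_y_given_x[x][y] for x in range(n_x))
           for y in range(n_y)]
    return sum(p_x[x] * p_y_given_x[x][y] *
               math.log2(p_y_given_x[x][y] / q_y[y])
               for x in range(n_x) for y in range(n_y)
               if p_y_given_x[x][y] > 0)

# Noiseless binary channel: the agent fully determines what it later senses,
# so empowerment equals log2(2) = 1 bit.
print(round(channel_capacity([[1.0, 0.0], [0.0, 1.0]]), 6))  # -> 1.0
```

The Monte-Carlo approximation the paper introduces replaces the discrete sums above with samples drawn from the learned (Gaussian process) transition model.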
Benchmarking Deep Reinforcement Learning for Continuous Control
Recently, researchers have made significant progress combining the advances
in deep learning for learning feature representations with reinforcement
learning. Some notable examples include training agents to play Atari games
based on raw pixel data and to acquire advanced manipulation skills using raw
sensory inputs. However, it has been difficult to quantify progress in the
domain of continuous control due to the lack of a commonly adopted benchmark.
In this work, we present a benchmark suite of continuous control tasks,
including classic tasks like cart-pole swing-up, tasks with very high state and
action dimensionality such as 3D humanoid locomotion, tasks with partial
observations, and tasks with hierarchical structure. We report novel findings
based on the systematic evaluation of a range of implemented reinforcement
learning algorithms. Both the benchmark and reference implementations are
released at https://github.com/rllab/rllab in order to facilitate experimental
reproducibility and to encourage adoption by other researchers.
Comment: 14 pages, ICML 201
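Benchmark suites of this kind standardize on an episodic reset/step interface and report average return over rollouts of a fixed horizon. Below is a hypothetical sketch of that evaluation protocol with a stub environment; `ToyEnv` and `average_return` are invented names for illustration, not the rllab API:

```python
import random

class ToyEnv:
    """Stand-in environment exposing the reset/step episode interface that
    continuous-control benchmarks standardize on (illustrative stub only)."""
    def __init__(self, horizon=100):
        self.horizon = horizon
        self.t = 0
    def reset(self):
        self.t = 0
        return 0.0                       # initial observation
    def step(self, action):
        self.t += 1
        reward = 1.0                     # constant reward per step
        done = self.t >= self.horizon
        return 0.0, reward, done         # observation, reward, terminal flag

def average_return(env, policy, episodes=10):
    # standard evaluation protocol: average undiscounted return per episode
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, r, done = env.step(policy(obs))
            total += r
    return total / episodes

print(average_return(ToyEnv(), lambda obs: random.uniform(-1.0, 1.0)))  # -> 100.0
```

Holding this protocol (horizon, number of episodes, return definition) fixed across algorithms is what makes the systematic comparisons in the benchmark reproducible.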
HyperSNN: A new efficient and robust deep learning model for resource constrained control applications
In light of the increasing adoption of edge computing in areas such as
intelligent furniture, robotics, and smart homes, this paper introduces
HyperSNN, an innovative method for control tasks that uses spiking neural
networks (SNNs) in combination with hyperdimensional computing. HyperSNN
substitutes expensive 32-bit floating point multiplications with 8-bit integer
additions, resulting in reduced energy consumption while enhancing robustness
and potentially improving accuracy. Our model was tested on AI Gym benchmarks,
including Cartpole, Acrobot, MountainCar, and Lunar Lander. HyperSNN achieves
control accuracies that are on par with conventional machine learning methods
but with only 1.36% to 9.96% of the energy expenditure. Furthermore, our
experiments showed increased robustness when using HyperSNN. We believe that
HyperSNN is especially suitable for interactive, mobile, and wearable devices,
promoting energy-efficient and robust system design. Furthermore, it paves the
way for the practical implementation of complex algorithms like model
predictive control (MPC) in real-world industrial scenarios.
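The core arithmetic substitution can be illustrated in a few lines: when activations are binary spikes, each multiply-accumulate against 8-bit quantized weights reduces to selecting and summing integers, so no floating-point multiplications remain. This is a minimal sketch of that idea only; `quantize_int8`, `spiking_linear`, and the scale value are illustrative assumptions, not the paper's actual quantization scheme or architecture:

```python
def quantize_int8(weights, scale):
    # symmetric 8-bit quantization: float weight -> integer in [-127, 127]
    return [max(-127, min(127, round(w / scale))) for w in weights]

def spiking_linear(spikes, weight_rows_q):
    # With binary spike inputs, "multiplying" by a spike just selects (or
    # skips) a quantized weight, so each output is a sum of 8-bit integers.
    return [sum(row[i] for i, s in enumerate(spikes) if s == 1)
            for row in weight_rows_q]

# two output neurons, three binary spike inputs
w = [quantize_int8([0.5, -0.25, 0.75], scale=0.01),
     quantize_int8([-0.5, 0.25, 0.0], scale=0.01)]
print(spiking_linear([1, 0, 1], w))  # -> [125, -50]
```

Integer additions of this kind cost far less energy than 32-bit float multiplications on edge hardware, which is the efficiency argument the abstract makes.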