Adaptive Load Balancing: A Study in Multi-Agent Learning
We study the process of multi-agent reinforcement learning in the context of
load balancing in a distributed system, without use of either central
coordination or explicit communication. We first define a precise framework in
which to study adaptive load balancing, important features of which are its
stochastic nature and the purely local information available to individual
agents. Given this framework, we show illuminating results on the interplay
between basic adaptive behavior parameters and their effect on system
efficiency. We then investigate the properties of adaptive load balancing in
heterogeneous populations, and address the issue of exploration vs.
exploitation in that context. Finally, we show that naive use of communication
may not improve, and might even harm, system efficiency.
Comment: See http://www.jair.org/ for any accompanying file
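The setup the abstract describes, with agents adapting from purely local information and no communication, can be illustrated with a toy simulation. This is a minimal sketch, not the paper's exact model; the agent count, resource count, learning rate, and exploration rate are all invented for the example.

```python
import random

N_AGENTS, N_RESOURCES, ROUNDS = 20, 4, 500
LEARNING_RATE, EPSILON = 0.1, 0.05  # basic adaptive-behavior parameters

# Each agent's private running estimate of every resource's efficiency.
estimates = [[0.0] * N_RESOURCES for _ in range(N_AGENTS)]

random.seed(0)
for _ in range(ROUNDS):
    # Each agent picks a resource greedily, with occasional exploration;
    # there is no central coordinator and no message passing.
    choices = [
        random.randrange(N_RESOURCES) if random.random() < EPSILON
        else max(range(N_RESOURCES), key=est.__getitem__)
        for est in estimates
    ]
    load = [choices.count(r) for r in range(N_RESOURCES)]
    for agent, r in enumerate(choices):
        # Purely local feedback: the congestion this agent itself observed.
        reward = 1.0 / load[r]
        estimates[agent][r] += LEARNING_RATE * (reward - estimates[agent][r])

final_load = [choices.count(r) for r in range(N_RESOURCES)]
print(final_load)
```

In this toy version the local congestion signal tends to spread the agents across resources, which is the kind of emergent balancing behavior the paper studies.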
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
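The exploration/exploitation trade-off the survey discusses can be shown with the standard epsilon-greedy strategy on a toy multi-armed bandit. The bandit payoffs and all parameters below are invented for the illustration.

```python
import random

random.seed(1)
true_means = [0.2, 0.5, 0.8]      # hidden expected payoff of each action
estimates = [0.0] * len(true_means)
counts = [0] * len(true_means)
EPSILON = 0.1                     # fraction of steps spent exploring

for _ in range(5000):
    if random.random() < EPSILON:               # explore: try a random action
        a = random.randrange(len(true_means))
    else:                                       # exploit current knowledge
        a = max(range(len(true_means)), key=estimates.__getitem__)
    reward = random.gauss(true_means[a], 0.1)   # noisy reinforcement signal
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]  # incremental mean

best = max(range(len(true_means)), key=estimates.__getitem__)
print(best)  # the agent identifies the highest-mean action (index 2)
```

Without the exploration term the agent can lock onto the first action that ever paid off; the small epsilon keeps it sampling alternatives, which is exactly the tension the survey analyzes.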
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task
Programming (NTP), which bridges the idea of few-shot learning from
demonstration and neural program induction. NTP takes as input a task
specification (e.g., video demonstration of a task) and recursively decomposes
it into finer sub-task specifications. These specifications are fed to a
hierarchical neural program, where bottom-level programs are callable
subroutines that interact with the environment. We validate our method in three
robot manipulation tasks. NTP achieves strong generalization across sequential
tasks that exhibit hierarchical and compositional structures. The experimental
results show that NTP learns to generalize well towards unseen tasks with
increasing lengths, variable topologies, and changing objectives.
Comment: ICRA 201
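The recursive decomposition idea behind NTP can be sketched with plain recursion: a higher-level "program" splits a task specification into finer sub-specifications until it reaches bottom-level subroutines that act on the environment. This is only a structural analogy, not the neural model; the task hierarchy and action names are invented.

```python
def execute(spec, env_log):
    """Recursively decompose a task specification.

    Leaves (strings) stand in for bottom-level callable subroutines that
    interact with the environment; nested lists stand in for higher-level
    programs that decompose the task into sub-tasks.
    """
    if isinstance(spec, str):
        env_log.append(spec)          # primitive action on the environment
    else:
        for sub_spec in spec:         # decompose into finer sub-tasks
            execute(sub_spec, env_log)

# A toy block-stacking task expressed as a hierarchy of sub-tasks.
task = [["pick(blockA)", "place(blockA, table)"],
        [["pick(blockB)"], "place(blockB, blockA)"]]
log = []
execute(task, log)
print(log)
# ['pick(blockA)', 'place(blockA, table)', 'pick(blockB)', 'place(blockB, blockA)']
```

The point of the hierarchy is compositionality: the same bottom-level subroutines can be reused under different higher-level programs, which is what lets the approach generalize across task topologies.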
Learning to Reach Agreement in a Continuous Ultimatum Game
It is well-known that acting in an individually rational manner, according to
the principles of classical game theory, may lead to sub-optimal solutions in a
class of problems named social dilemmas. In contrast, humans generally do not
have much difficulty with social dilemmas, as they are able to balance personal
benefit and group benefit. As agents in multi-agent systems are regularly
confronted with social dilemmas, for instance in tasks such as resource
allocation, these agents may benefit from the inclusion of mechanisms thought
to facilitate human fairness. Although many such mechanisms have already
been implemented in a multi-agent systems context, their application is usually
limited to rather abstract social dilemmas with a discrete set of available
strategies (usually two). Given that many real-world examples of social
dilemmas are actually continuous in nature, we extend this previous work to
more general dilemmas, in which agents operate in a continuous strategy space.
The social dilemma under study here is the well-known Ultimatum Game, in which
an optimal solution is achieved if agents agree on a common strategy. We
investigate whether a scale-free interaction network facilitates agents to
reach agreement, especially in the presence of fixed-strategy agents that
represent a desired (e.g. human) outcome. Moreover, we study the influence of
rewiring in the interaction network. The agents are equipped with
continuous-action learning automata and play a large number of random pairwise
games in order to establish a common strategy. From our experiments, we may
conclude that results obtained in discrete-strategy games can be generalized to
continuous-strategy games to a certain extent: a scale-free interaction network
structure allows agents to achieve agreement on a common strategy, and rewiring
in the interaction network greatly enhances the agents' ability to reach
agreement. However, it also becomes clear that some alternative mechanisms,
such as reputation and volunteering, involve many subtleties and do not
have convincing beneficial effects in the continuous case.
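The setup above, where agents with continuous-action learners play random pairwise games and drift toward a common strategy, can be caricatured in a few lines. This is a deliberately simplified sketch, not the paper's learning-automata update or network model: agents are paired uniformly at random (no scale-free network), and the reinforcement rule and all parameters are invented.

```python
import random

random.seed(2)
N, ROUNDS, LR, SIGMA = 30, 20000, 0.05, 0.05
# Each agent's continuous strategy: its offer as proposer and its
# acceptance threshold as responder, a point in [0, 1].
strategy = [random.random() for _ in range(N)]

for _ in range(ROUNDS):
    i, j = random.sample(range(N), 2)                 # random pairwise game
    offer = random.gauss(strategy[i], SIGMA)          # explore around the mean
    offer = min(1.0, max(0.0, offer))
    if offer >= strategy[j]:                          # responder accepts
        # Both sides are rewarded: the proposer reinforces the successful
        # offer, and the responder shifts toward what it just accepted.
        strategy[i] += LR * (offer - strategy[i])
        strategy[j] += LR * (offer - strategy[j])

spread = max(strategy) - min(strategy)
print(round(spread, 3))  # spread of strategies after learning
```

In this toy version, agreement only pays when both strategies are compatible, so repeated play tends to pull the population toward a shared strategy; the paper studies how network structure, rewiring, and fixed-strategy agents shape that process.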