Deep Reinforcement Learning for Swarm Systems
Recently, deep reinforcement learning (RL) methods have been applied
successfully to multi-agent scenarios. Typically, these methods rely on a
concatenation of agent states to represent the information content required for
decentralized decision making. However, concatenation scales poorly to swarm
systems with a large number of homogeneous agents as it does not exploit the
fundamental properties inherent to these systems: (i) the agents in the swarm
are interchangeable and (ii) the exact number of agents in the swarm is
irrelevant. Therefore, we propose a new state representation for deep
multi-agent RL based on mean embeddings of distributions. We treat the agents
as samples of a distribution and use the empirical mean embedding as input for
a decentralized policy. We define different feature spaces of the mean
embedding using histograms, radial basis functions and a neural network learned
end-to-end. We evaluate the representation on two well known problems from the
swarm literature (rendezvous and pursuit evasion), in a globally and locally
observable setup. For the local setup we furthermore introduce simple
communication protocols. Of all approaches, the mean embedding representation
using neural network features enables the richest information exchange between
neighboring agents facilitating the development of more complex collective
strategies.
Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20)
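The mean-embedding representation described above is permutation-invariant and independent of the number of agents: each neighbor's state is mapped through a feature function and the features are averaged. A minimal sketch (assuming one-dimensional agent states and illustrative radial-basis-function features; the names `mean_embedding` and `rbf_features` are hypothetical, not from the paper):

```python
import numpy as np

def mean_embedding(neighbor_states, feature_map):
    """Permutation-invariant summary: average the feature map over all agents."""
    feats = np.array([feature_map(s) for s in neighbor_states])
    return feats.mean(axis=0)

# One possible feature space: Gaussian RBFs on fixed centers in [-1, 1].
centers = np.linspace(-1.0, 1.0, 5)

def rbf_features(state, bandwidth=0.5):
    return np.exp(-((state - centers) ** 2) / (2.0 * bandwidth ** 2))

# The embedding is unchanged under any reordering of the agents,
# and its dimension does not depend on how many agents there are.
emb = mean_embedding([0.2, -0.4, 0.9], rbf_features)
emb_permuted = mean_embedding([0.9, 0.2, -0.4], rbf_features)
assert np.allclose(emb, emb_permuted)
```

In the paper's end-to-end variant, the fixed `rbf_features` map would be replaced by a learned neural network, with the averaged embedding fed into the decentralized policy.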
Search and Pursuit-Evasion in Mobile Robotics: A Survey
This paper surveys recent results in pursuit-evasion
and autonomous search relevant to applications
in mobile robotics. We provide a taxonomy of search
problems that highlights the differences resulting from
varying assumptions on the searchers, targets, and the
environment. We then list a number of fundamental
results in the areas of pursuit-evasion and probabilistic
search, and we discuss field implementations on mobile
robotic systems. In addition, we highlight current open
problems in the area and explore avenues for future
work.
Bounded-From-Below Solutions of the Hamilton-Jacobi Equation for Optimal Control Problems with Exit Times: Vanishing Lagrangians, Eikonal Equations, and Shape-From-Shading
We study the Hamilton-Jacobi equation for undiscounted exit time control
problems with general nonnegative Lagrangians using the dynamic programming
approach. We prove theorems characterizing the value function as the unique
bounded-from-below viscosity solution of the Hamilton-Jacobi equation which is
null on the target. The result applies to problems with the property that all
trajectories satisfying a certain integral condition must stay in a bounded
set. We allow problems for which the Lagrangian is not uniformly bounded below
by positive constants, in which the hypotheses of the known uniqueness results
for Hamilton-Jacobi equations are not satisfied. We apply our theorems to
eikonal equations from geometric optics, shape-from-shading equations from
image processing, and variants of the Fuller Problem.
Comment: 29 pages, 0 figures, accepted for publication in NoDEA Nonlinear
Differential Equations and Applications on July 29, 200
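Concretely, the exit-time Hamilton-Jacobi equation characterized above can be sketched as follows (illustrative notation, not necessarily the paper's exact formulation):

```latex
% Exit-time Hamilton-Jacobi equation: value function v, dynamics f,
% nonnegative Lagrangian \ell, target set \mathcal{T}.
\[
  \sup_{a \in A}\bigl\{ -\nabla v(x)\cdot f(x,a) - \ell(x,a) \bigr\} = 0
  \quad \text{for } x \notin \mathcal{T},
  \qquad v \equiv 0 \ \text{on } \mathcal{T}.
\]
% Special case: with dynamics f(x,a) = a over |a| \le 1 and a
% state-dependent running cost \ell(x,a) = n(x), the supremum is
% attained at a = -\nabla v(x)/|\nabla v(x)|, giving the eikonal equation
\[
  |\nabla v(x)| = n(x).
\]
```

The theorems then single out the value function as the unique bounded-from-below viscosity solution of this equation that vanishes on the target, even when $\ell$ is not bounded below by a positive constant.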
Optimizing delegation between human and AI collaborative agents
In the context of humans operating with artificial or autonomous agents in a
hybrid team, it is essential to accurately identify when to authorize those
team members to perform actions. Given past examples where humans and
autonomous systems can either succeed or fail at tasks, we seek to train a
delegating manager agent to make delegation decisions with respect to these
potential performance deficiencies. Additionally, we cannot always expect the
various agents to operate within the same underlying model of the environment.
It is possible to encounter cases where the actions and transitions would vary
between agents. Therefore, our framework provides a manager model which learns
through observations of team performance without restricting agents to matching
dynamics. Our results show our manager learns to perform delegation decisions
with teams of agents operating under differing representations of the
environment, significantly outperforming alternative methods to manage the
team.
Comment: This work has been accepted to the 'Towards Hybrid Human-Machine
Learning and Decision Making (HLDM)' workshop at ECML PKDD 202