Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems
A key challenge in multi-robot and multi-agent systems is generating
solutions that are robust to other self-interested or even adversarial parties
who actively try to prevent the agents from achieving their goals. The
practicality of existing works addressing this challenge is limited to only
small-scale synchronous decision-making scenarios or a single agent planning
its best response against a single adversary with fixed, procedurally
characterized strategies. In contrast, this paper considers a more realistic
class of problems where a team of asynchronous agents with limited observation
and communication capabilities need to compete against multiple strategic
adversaries with changing strategies. This problem necessitates agents that can
coordinate to detect changes in adversary strategies and plan the best response
accordingly. Our approach first optimizes a set of stratagems that represent
these best responses. These optimized stratagems are then integrated into a
unified policy that can detect and respond when the adversaries change their
strategies. The near-optimality of the proposed framework is established
theoretically as well as demonstrated empirically in simulation and hardware.
Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions
This paper presents a data-driven approach for multi-robot coordination in
partially-observable domains based on Decentralized Partially Observable Markov
Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a
general framework for cooperative sequential decision making under uncertainty
and MAs allow temporally extended and asynchronous action execution. To date,
most methods assume the underlying Dec-POMDP model is known a priori or a full
simulator is available during planning time. Previous methods which aim to
address these issues suffer from local optimality and sensitivity to initial
conditions. Additionally, few hardware demonstrations involving a large team of
heterogeneous robots and with long planning horizons exist. This work addresses
these gaps by proposing an iterative sampling-based Expectation-Maximization
algorithm (iSEM) to learn policies using only trajectory data containing
observations, MAs, and rewards. Our experiments show the algorithm is able to
achieve better solution quality than the state-of-the-art learning-based
methods. We implement two variants of multi-robot Search and Rescue (SAR)
domains (with and without obstacles) on hardware to demonstrate the learned
policies can effectively control a team of distributed robots to cooperate in a
partially observable stochastic environment.
Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS 2017)
Effects of alarms on control of robot teams
Annunciator-driven supervisory control (ADSC) is a widely used technique for directing human attention to control systems otherwise beyond their capabilities. ADSC requires associating abnormal parameter values with alarms in such a way that operator attention can be directed toward the involved subsystems or conditions. This is hard to achieve in multirobot control because it is difficult to distinguish abnormal conditions for states of a robot team. For largely independent tasks such as foraging, however, self-reflection can serve as a basis for alerting the operator to abnormalities of individual robots. While the search for targets remains unalarmed, the resulting system approximates ADSC. The described experiment compares a control condition in which operators perform a multirobot urban search and rescue (USAR) task without alarms with ADSC (freely annunciated) and with a decision aid that limits operator workload by showing only the top alarm. No differences were found in area searched or victims found; however, operators in the freely annunciated condition were faster in detecting both the annunciated failures and victims entering their cameras' fields of view. Copyright 2011 by Human Factors and Ergonomics Society, Inc. All rights reserved.
Inferring Robot Task Plans from Human Team Meetings: A Generative Modeling Approach with Logic-Based Prior
We aim to reduce the burden of programming and deploying autonomous systems
to work in concert with people in time-critical domains, such as military field
operations and disaster response. Deployment plans for these operations are
frequently negotiated on the fly by teams of human planners. A human operator
then translates the agreed-upon plan into machine instructions for the robots.
We present an algorithm that reduces this translation burden by inferring the
final plan from a processed form of the human team's planning conversation. Our
approach combines probabilistic generative modeling with logical plan
validation used to compute a highly structured prior over possible plans. This
hybrid approach enables us to overcome the challenge of performing inference
over the large solution space with only a small amount of noisy data from the
team planning session. We validate the algorithm through human subject
experimentation and show we are able to infer a human team's final plan with
83% accuracy on average. We also describe a robot demonstration in which two
people plan and execute a first-response collaborative task with a PR2 robot.
To the best of our knowledge, this is the first work that integrates a logical
planning technique within a generative model to perform plan inference.
Comment: Appears in Proceedings of the Twenty-Seventh AAAI Conference on
Artificial Intelligence (AAAI-13)
- …