Near-Optimal Adversarial Policy Switching for Decentralized Asynchronous Multi-Agent Systems
A key challenge in multi-robot and multi-agent systems is generating
solutions that are robust to other self-interested or even adversarial parties
who actively try to prevent the agents from achieving their goals. The
practicality of existing works addressing this challenge is limited to only
small-scale synchronous decision-making scenarios or a single agent planning
its best response against a single adversary with fixed, procedurally
characterized strategies. In contrast, this paper considers a more realistic
class of problems where a team of asynchronous agents with limited observation
and communication capabilities need to compete against multiple strategic
adversaries with changing strategies. This problem necessitates agents that can
coordinate to detect changes in adversary strategies and plan the best response
accordingly. Our approach first optimizes a set of stratagems that represent
these best responses. These optimized stratagems are then integrated into a
unified policy that can detect and respond when the adversaries change their
strategies. The near-optimality of the proposed framework is established
theoretically as well as demonstrated empirically in simulation and hardware.
Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions
The focus of this paper is on solving multi-robot planning problems in
continuous spaces with partial observability. Decentralized partially
observable Markov decision processes (Dec-POMDPs) are general models for
multi-robot coordination problems, but representing and solving Dec-POMDPs is
often intractable for large problems. To allow for a high-level representation
that is natural for multi-robot problems and scalable to large discrete and
continuous problems, this paper extends the Dec-POMDP model to the
decentralized partially observable semi-Markov decision process (Dec-POSMDP).
The Dec-POSMDP formulation allows asynchronous decision-making by the robots,
which is crucial in multi-robot domains. We also present an algorithm for
solving this Dec-POSMDP which is much more scalable than previous methods since
it can incorporate closed-loop belief space macro-actions in planning. These
macro-actions are automatically constructed to produce robust solutions. The
proposed method's performance is evaluated on a complex multi-robot package
delivery problem under uncertainty, showing that our approach can naturally
represent multi-robot problems and provide high-quality solutions for
large-scale problems.
Symbiotic Navigation in Multi-Robot Systems with Remote Obstacle Knowledge Sharing
Large-scale operational areas often require multiple service robots for coverage and task parallelism. In such scenarios, each robot keeps its individual map of the environment and serves specific areas of the map at different times. We propose a knowledge sharing mechanism for multiple robots in which one robot can inform other robots about changes in the map, such as path blockages or new static obstacles, encountered at specific areas of the map. This symbiotic information sharing allows the robots to update remote areas of the map without having to explicitly navigate those areas, and to plan efficient paths. A node representation of paths is presented for seamless sharing of blocked path information. The transience of obstacles is modeled to track obstacles which might have been removed. A lazy information update scheme is presented in which only relevant information affecting the current task is updated for efficiency. The advantages of the proposed method for path planning are discussed against a traditional method with experimental results in both simulation and real environments.
Safe Policy Synthesis in Multi-Agent POMDPs via Discrete-Time Barrier Functions
A multi-agent partially observable Markov decision process (MPOMDP) is a
modeling paradigm used for high-level planning of heterogeneous autonomous
agents subject to uncertainty and partial observation. Despite their modeling
efficiency, MPOMDPs have not received significant attention in safety-critical
settings. In this paper, we use barrier functions to design policies for
MPOMDPs that ensure safety. Notably, our method does not rely on discretization
of the belief space, or finite memory. To this end, we formulate sufficient and
necessary conditions for the safety of a given set based on discrete-time
barrier functions (DTBFs) and we demonstrate that our formulation also allows
for Boolean compositions of DTBFs for representing more complicated safe sets.
We show that the proposed method can be implemented online by a sequence of
one-step greedy algorithms as a standalone safe controller or as a
safety-filter given a nominal planning policy. We illustrate the efficiency of
the proposed methodology based on DTBFs using a high-fidelity simulation of
heterogeneous robots. Comment: 8 pages and 4 figures
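To make the safety-filter idea in the abstract above more concrete, here is a minimal sketch of a one-step greedy filter. It assumes a commonly used form of the discrete-time barrier condition, h(x') - h(x) >= -alpha * h(x) with 0 < alpha <= 1, applied to a fully observed scalar state rather than to a belief over an MPOMDP. The function names, the toy dynamics, and the barrier function are all hypothetical and are not taken from the paper.

```python
def safety_filter(x, u_nominal, candidates, step, h, alpha=0.5):
    """One-step greedy safety filter (illustrative sketch only).

    Returns the candidate action closest to u_nominal whose successor state
    satisfies h(step(x, u)) - h(x) >= -alpha * h(x), or None if no candidate
    satisfies the condition.
    """
    safe = [u for u in candidates if h(step(x, u)) - h(x) >= -alpha * h(x)]
    if not safe:
        return None
    return min(safe, key=lambda u: abs(u - u_nominal))


# Hypothetical toy system: x_next = x + u, with safe set {x : x <= 10}.
def example_step(x, u):
    return x + u


def example_h(x):
    # Nonnegative exactly on the assumed safe set.
    return 10.0 - x


if __name__ == "__main__":
    # The nominal action 2.0 would approach the boundary too quickly,
    # so the filter picks the nearest candidate that meets the condition.
    print(safety_filter(8.0, 2.0, [-1.0, 0.0, 1.0, 2.0], example_step, example_h))
```

In this sketch the filter leaves the nominal action unchanged whenever it already satisfies the condition, which mirrors the "safety-filter given a nominal planning policy" usage mentioned in the abstract.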
Towards adaptive multi-robot systems: self-organization and self-adaptation
This publication is freely accessible with permission of the rights owner due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation), respectively. The development of complex system ensembles that operate in uncertain environments is a major challenge. The reason for this is that system designers are not able to fully specify the system during specification and development and before it is deployed. Natural swarm systems share similar characteristics, yet, being self-adaptive and able to self-organize, these systems show beneficial emergent behaviour. Similar concepts can be extremely helpful for artificial systems, especially when it comes to multi-robot scenarios, which require such solutions in order to be applicable to highly uncertain real-world applications. In this article, we present a comprehensive overview of state-of-the-art solutions in emergent systems, self-organization, self-adaptation, and robotics. We discuss these approaches in the light of a framework for multi-robot systems and identify similarities, differences, missing links, and open gaps that have to be addressed in order to make this framework possible.
Cooperative localization for mobile agents: a recursive decentralized algorithm based on Kalman filter decoupling
We consider a cooperative localization technique for mobile agents with
communication and computation capabilities. We start by providing an overview of
different decentralization strategies in the literature, with special focus on
how these algorithms maintain an account of intrinsic correlations between
the state estimates of team members. Then, we present a novel decentralized
cooperative localization algorithm that is a decentralized implementation of a
centralized Extended Kalman Filter for cooperative localization. In this
algorithm, instead of propagating cross-covariance terms, each agent propagates
new intermediate local variables that can be used in an update stage to create
the required propagated cross-covariance terms. Whenever there is a relative
measurement in the network, the algorithm declares the agent making this
measurement as the interim master. By acquiring information from the interim
landmark (the agent from which the relative measurement is taken), the interim master
can calculate and broadcast a set of intermediate variables which each robot
can then use to update its estimates to match that of a centralized Extended
Kalman Filter for cooperative localization. Once an update is done, no further
communication is needed until the next relative measurement.
Robust Environmental Mapping by Mobile Sensor Networks
Constructing a spatial map of environmental parameters is a crucial step in
preventing hazardous chemical leakages and forest fires, or in estimating
spatially distributed physical quantities such as terrain elevation. Although
prior methods can do such mapping tasks efficiently via dispatching a group of
autonomous agents, they are unable to ensure satisfactory convergence to the
underlying ground truth distribution in a decentralized manner when any of the
agents fail. Since the types of agents utilized to perform such mapping are
typically inexpensive and prone to failure, this results in poor overall
mapping performance in real-world applications, which can in certain cases
endanger human safety. This paper presents a Bayesian approach for robust
spatial mapping of environmental parameters by deploying a group of mobile
robots capable of ad-hoc communication equipped with short-range sensors in the
presence of hardware failures. Our approach first utilizes a variant of the
Voronoi diagram to partition the region to be mapped into disjoint regions that
are each associated with at least one robot. These robots are then deployed in
a decentralized manner to maximize the likelihood that at least one robot
detects every target in their associated region despite a non-zero probability
of failure. A suite of simulation results is presented to demonstrate the
effectiveness and robustness of the proposed method when compared to existing
techniques. Comment: accepted to ICRA 201
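As a rough illustration of the deployment objective described in the abstract above, the following minimal sketch computes the probability that at least one working robot detects a target. It assumes that robots fail independently with probability `p_fail` and that each working robot detects the target independently with its own detection probability. The function name, the independence assumptions, and the example numbers are hypothetical and are not taken from the paper.

```python
from math import prod


def prob_at_least_one_detection(detect_probs, p_fail):
    """Probability that at least one robot both survives and detects the target.

    Assumes (hypothetically) independent failures and independent detections.
    detect_probs holds each robot's detection probability given that it works.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be in [0, 1]")
    # A robot misses the target if it fails, or if it works but does not detect.
    miss_probs = [1.0 - (1.0 - p_fail) * p for p in detect_probs]
    return 1.0 - prod(miss_probs)


if __name__ == "__main__":
    # Two robots, each detecting with probability 0.9 when working,
    # and each failing with probability 0.2.
    print(prob_at_least_one_detection([0.9, 0.9], p_fail=0.2))
```

Under these assumptions, adding a second robot to a region raises the detection probability even when each robot is individually unreliable, which is the intuition behind associating every region with at least one robot.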
Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions
This paper presents a data-driven approach for multi-robot coordination in
partially-observable domains based on Decentralized Partially Observable Markov
Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a
general framework for cooperative sequential decision making under uncertainty
and MAs allow temporally extended and asynchronous action execution. To date,
most methods assume the underlying Dec-POMDP model is known a priori or a full
simulator is available during planning time. Previous methods which aim to
address these issues suffer from local optimality and sensitivity to initial
conditions. Additionally, few hardware demonstrations involving a large team of
heterogeneous robots and with long planning horizons exist. This work addresses
these gaps by proposing an iterative sampling-based Expectation-Maximization
algorithm (iSEM) to learn policies using only trajectory data containing
observations, MAs, and rewards. Our experiments show the algorithm is able to
achieve better solution quality than the state-of-the-art learning-based
methods. We implement two variants of multi-robot Search and Rescue (SAR)
domains (with and without obstacles) on hardware to demonstrate the learned
policies can effectively control a team of distributed robots to cooperate in a
partially observable stochastic environment. Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS 2017)