11,805 research outputs found
Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions
The focus of this paper is on solving multi-robot planning problems in
continuous spaces with partial observability. Decentralized partially
observable Markov decision processes (Dec-POMDPs) are general models for
multi-robot coordination problems, but representing and solving Dec-POMDPs is
often intractable for large problems. To allow for a high-level representation
that is natural for multi-robot problems and scalable to large discrete and
continuous problems, this paper extends the Dec-POMDP model to the
decentralized partially observable semi-Markov decision process (Dec-POSMDP).
The Dec-POSMDP formulation allows asynchronous decision-making by the robots,
which is crucial in multi-robot domains. We also present an algorithm for
solving this Dec-POSMDP which is much more scalable than previous methods since
it can incorporate closed-loop belief space macro-actions in planning. These
macro-actions are automatically constructed to produce robust solutions. The
proposed method's performance is evaluated on a complex multi-robot package
delivery problem under uncertainty, showing that our approach can naturally
represent multi-robot problems and provide high-quality solutions for
large-scale problems
Adoption of vehicular ad hoc networking protocols by networked robots
This paper focuses on the utilization of wireless networking in the robotics domain. Many researchers have already equipped their robots with wireless communication capabilities, stimulated by the observation that multi-robot systems tend to have several advantages over their single-robot counterparts. Typically, this integration of wireless communication is tackled in a quite pragmatic manner, only a few authors presented novel Robotic Ad Hoc Network (RANET) protocols that were designed specifically with robotic use cases in mind. This is in sharp contrast with the domain of vehicular ad hoc networks (VANET). This observation is the starting point of this paper. If the results of previous efforts focusing on VANET protocols could be reused in the RANET domain, this could lead to rapid progress in the field of networked robots. To investigate this possibility, this paper provides a thorough overview of the related work in the domain of robotic and vehicular ad hoc networks. Based on this information, an exhaustive list of requirements is defined for both types. It is concluded that the most significant difference lies in the fact that VANET protocols are oriented towards low throughput messaging, while RANET protocols have to support high throughput media streaming as well. Although not always with equal importance, all other defined requirements are valid for both protocols. This leads to the conclusion that cross-fertilization between them is an appealing approach for future RANET research. To support such developments, this paper concludes with the definition of an appropriate working plan
Planning for Decentralized Control of Multiple Robots Under Uncertainty
We describe a probabilistic framework for synthesizing control policies for
general multi-robot systems, given environment and sensor models and a cost
function. Decentralized, partially observable Markov decision processes
(Dec-POMDPs) are a general model of decision processes where a team of agents
must cooperate to optimize some objective (specified by a shared reward or cost
function) in the presence of uncertainty, but where communication limitations
mean that the agents cannot share their state, so execution must proceed in a
decentralized fashion. While Dec-POMDPs are typically intractable to solve for
real-world problems, recent research on the use of macro-actions in Dec-POMDPs
has significantly increased the size of problem that can be practically solved
as a Dec-POMDP. We describe this general model, and show how, in contrast to
most existing methods that are specialized to a particular problem class, it
can synthesize control policies that use whatever opportunities for
coordination are present in the problem, while balancing off uncertainty in
outcomes, sensor information, and information about other agents. We use three
variations on a warehouse task to show that a single planner of this type can
generate cooperative behavior using task allocation, direct communication, and
signaling, as appropriate
Deep Reinforcement Learning for Decentralized Multi-Robot Exploration With Macro Actions
Cooperative multi-robot teams need to be able to explore cluttered and
unstructured environments while dealing with communication dropouts that
prevent them from exchanging local information to maintain team coordination.
Therefore, robots need to consider high-level teammate intentions during action
selection. In this letter, we present the first Macro Action Decentralized
Exploration Network (MADE-Net) using multi-agent deep reinforcement learning
(DRL) to address the challenges of communication dropouts during multi-robot
exploration in unseen, unstructured, and cluttered environments. Simulated
robot team exploration experiments were conducted and compared against
classical and DRL methods where MADE-Net outperformed all benchmark methods in
terms of computation time, total travel distance, number of local interactions
between robots, and exploration rate across various degrees of communication
dropouts. A scalability study in 3D environments showed a decrease in
exploration time with MADE-Net with increasing team and environment sizes. The
experiments presented highlight the effectiveness and robustness of our method.Comment: 8 pages, 7 figure
Planning Algorithms for Multi-Robot Active Perception
A fundamental task of robotic systems is to use on-board sensors and perception algorithms to understand high-level semantic properties of an environment. These semantic properties may include a map of the environment, the presence of objects, or the parameters of a dynamic field. Observations are highly viewpoint dependent and, thus, the performance of perception algorithms can be improved by planning the motion of the robots to obtain high-value observations. This motivates the problem of active perception, where the goal is to plan the motion of robots to improve perception performance. This fundamental problem is central to many robotics applications, including environmental monitoring, planetary exploration, and precision agriculture. The core contribution of this thesis is a suite of planning algorithms for multi-robot active perception. These algorithms are designed to improve system-level performance on many fronts: online and anytime planning, addressing uncertainty, optimising over a long time horizon, decentralised coordination, robustness to unreliable communication, predicting plans of other agents, and exploiting characteristics of perception models. We first propose the decentralised Monte Carlo tree search algorithm as a generally-applicable, decentralised algorithm for multi-robot planning. We then present a self-organising map algorithm designed to find paths that maximally observe points of interest. Finally, we consider the problem of mission monitoring, where a team of robots monitor the progress of a robotic mission. A spatiotemporal optimal stopping algorithm is proposed and a generalisation for decentralised monitoring. Experimental results are presented for a range of scenarios, such as marine operations and object recognition. Our analytical and empirical results demonstrate theoretically-interesting and practically-relevant properties that support the use of the approaches in practice
Anytime Point-Based Approximations for Large POMDPs
The Partially Observable Markov Decision Process has long been recognized as
a rich framework for real-world planning and control problems, especially in
robotics. However exact solutions in this framework are typically
computationally intractable for all but the smallest problems. A well-known
technique for speeding up POMDP solving involves performing value backups at
specific belief points, rather than over the entire belief simplex. The
efficiency of this approach, however, depends greatly on the selection of
points. This paper presents a set of novel techniques for selecting informative
belief points which work well in practice. The point selection procedure is
combined with point-based value backups to form an effective anytime POMDP
algorithm called Point-Based Value Iteration (PBVI). The first aim of this
paper is to introduce this algorithm and present a theoretical analysis
justifying the choice of belief selection technique. The second aim of this
paper is to provide a thorough empirical comparison between PBVI and other
state-of-the-art POMDP methods, in particular the Perseus algorithm, in an
effort to highlight their similarities and differences. Evaluation is performed
using both standard POMDP domains and realistic robotic tasks
- …