7 research outputs found

    Formal Modelling for Multi-Robot Systems Under Uncertainty

    Get PDF
    Purpose of Review: To effectively synthesise and analyse multi-robot behaviour, we require formal task-level models which accurately capture multi-robot execution. In this paper, we review modelling formalisms for multi-robot systems under uncertainty, and discuss how they can be used for planning, reinforcement learning, model checking, and simulation.
    Recent Findings: Recent work has investigated models which more accurately capture multi-robot execution by considering different forms of uncertainty, such as temporal uncertainty and partial observability, and by modelling the effects of robot interactions on action execution. Other strands of work have presented approaches for reducing the size of multi-robot models to admit more efficient solution methods. This can be achieved by decoupling the robots under independence assumptions, or by reasoning over higher-level macro-actions.
    Summary: Existing multi-robot models demonstrate a trade-off between accurately capturing robot dependencies and uncertainty, and being small enough to tractably solve real-world problems. Therefore, future research should exploit realistic assumptions over multi-robot behaviour to develop smaller models which retain accurate representations of uncertainty and robot interactions, and exploit the structure of multi-robot problems, such as factored state spaces, to develop scalable solution methods.
    Comment: 23 pages, 0 figures, 2 tables. Current Robotics Reports (2023). This version of the article has been accepted for publication, after peer review (when applicable) but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://dx.doi.org/10.1007/s43154-023-00104-
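
    As a point of reference, not drawn from the paper itself: the task-level models surveyed here typically build on the Markov decision process, whose minimal joint multi-robot form can be written as

        M = \langle S, A, T, R \rangle, \quad S = S_1 \times \dots \times S_n, \quad A = A_1 \times \dots \times A_n,

    where T(s' \mid s, a) gives the probability of the joint next state and R the shared reward. Since |S| and |A| grow exponentially with the number of robots n, shrinking this joint model without losing accuracy is exactly the trade-off the review discusses.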

    Planning for Decentralized Control of Multiple Robots Under Uncertainty

    Full text link
    We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes where a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the presence of uncertainty, but where communication limitations mean that the agents cannot share their state, so execution must proceed in a decentralized fashion. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problems that can be practically solved as a Dec-POMDP. We describe this general model, and show how, in contrast to most existing methods that are specialized to a particular problem class, it can synthesize control policies that use whatever opportunities for coordination are present in the problem, while balancing uncertainty in outcomes, sensor information, and information about other agents. We use three variations on a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate.
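
    As background rather than a quotation from the paper, the standard Dec-POMDP definition is a tuple

        \langle I, S, \{A_i\}, T, R, \{\Omega_i\}, O, h \rangle,

    where I is the set of agents, S the state space, A_i agent i's actions, T(s' \mid s, \vec{a}) the joint transition model, R(s, \vec{a}) the shared reward, \Omega_i agent i's observations, O(\vec{o} \mid s', \vec{a}) the joint observation model, and h the horizon. Each agent must choose actions from its own action-observation history alone, which is what makes execution decentralized; macro-actions replace the primitive A_i with temporally extended behaviours, shrinking the effective planning horizon.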

    Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)

    Get PDF
    In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP setting (MMDP) such structure is not present. We propose a new optimal solver for transition-independent MMDPs, in which agents can only affect their own state but their reward depends on joint transitions. We represent these dependencies compactly in conditional return graphs (CRGs). Using CRGs, the value of a joint policy and the bounds on partially specified joint policies can be efficiently computed. We propose CoRe, a novel branch-and-bound policy search algorithm building on CRGs. CoRe typically requires less runtime than the available alternatives and finds solutions to problems that were previously unsolvable.
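
    To make the model class concrete (standard notation, not quoted from the abstract): transition independence means the joint transition model factorizes per agent,

        T(s' \mid s, a) = \prod_i T_i(s'_i \mid s_i, a_i), \quad s = (s_1, \dots, s_n),

    while the joint reward R(s, a) may still couple the agents. The CRGs compactly represent only these reward dependencies, so value computations and policy bounds stay cheap wherever the interactions are sparse, which is what the branch-and-bound search exploits.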

    Temporal Markov Decision Problems: Formalization and Resolution

    Get PDF
    This thesis addresses the question of planning under uncertainty within a time-dependent, changing environment. The original motivation for this work came from the problem of building an autonomous agent able to coordinate with its uncertain environment, this environment being composed of other agents communicating their intentions, or of non-controllable processes for which some discrete-event model is available. We investigate several approaches for modeling continuous time-dependency in the framework of Markov Decision Processes (MDPs), leading us to a definition of Temporal Markov Decision Problems. Our approach then focuses on two separate paradigms. First, we investigate time-dependent problems as implicit-event processes and describe them through the formalism of Time-dependent MDPs (TMDPs). We extend the existing results concerning optimality equations and present a new Value Iteration algorithm based on piecewise polynomial function representations in order to solve a more general class of TMDPs. This paves the way to a more general discussion on parametric actions in continuous-time MDPs with hybrid state and action spaces. Second, we investigate the option of separately modeling the concurrent contributions of exogenous events. This explicit-event modeling approach leads to the use of Generalized Semi-Markov Decision Processes (GSMDPs). We establish a link between the general framework of Discrete Event System Specification (DEVS) and the GSMDP formalism, allowing us to build sound discrete-event-compatible simulators. We then introduce a simulation-based Policy Iteration approach for explicit-event Temporal Markov Decision Problems. This algorithmic contribution brings together results from simulation theory, forward search in MDPs, and statistical learning theory. The implicit-event approach was tested on a specific version of the Mars rover planning problem and on a drone patrol mission planning problem, while the explicit-event approach was evaluated on a subway network control problem.
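
    As a rough illustration in simplified notation (not the thesis's own): a time-dependent MDP augments the Bellman backup with an explicit continuous time argument,

        V(s, t) = \max_a \; \mathbb{E}_{(s', t') \sim T(\cdot \mid s, a, t)} \left[ r(s, a, t) + V(s', t') \right],

    so for each discrete state the value function is a function of continuous time. Representing each V(s, \cdot) as a piecewise polynomial is what lets the Value Iteration backup above be computed in closed form over whole time intervals rather than at sampled time points.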

    An extended study on addressing defender teamwork while accounting for uncertainty in attacker-defender games using iterative Dec-MDPs

    Get PDF
    Multi-agent teamwork and defender-attacker security games are two areas that are currently receiving significant attention within multi-agent systems research. Unfortunately, despite the need for effective teamwork among multiple defenders, little has been done to harness teamwork research in security games. The problem that this paper seeks to solve is the coordination of decentralized defender agents in the presence of uncertainty while securing targets against an observing adversary. To address this problem, we offer the following novel contributions in this paper: (i) a new model of security games with defender teams that coordinate under uncertainty; (ii) a new algorithm based on column generation that utilizes Decentralized Markov Decision Processes (Dec-MDPs) to generate defender strategies that incorporate uncertainty; (iii) new techniques to handle global events (when one or more agents may leave the system) during defender execution; (iv) heuristics that help scale up in the number of targets and agents to handle real-world scenarios; (v) an exploration of the robustness of randomized pure strategies. The paper opens the door to a potentially new area combining computational game theory and multi-agent teamwork.
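
    As a generic sketch of the column-generation idea in illustrative notation (not the paper's): the master problem optimizes a defender mixed strategy x over the joint policies (columns) P generated so far, e.g. in a zero-sum form

        \max_{x, v} \; v \quad \text{s.t.} \quad \sum_P x_P \, U_d(P, \tau) \ge v \;\; \forall \tau, \qquad \sum_P x_P = 1, \quad x \ge 0,

    where U_d(P, \tau) is the defender team's expected utility when it executes joint Dec-MDP policy P and the adversary attacks target \tau. The slave problem then solves a Dec-MDP to produce a new joint policy that could improve the master's objective; alternating the two explores an exponentially large policy space without enumerating it.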

    GSMDPs for Multi-Robot Sequential Decision-Making

    No full text
    Markov Decision Processes (MDPs) provide an extensive theoretical background for problems of decision-making under uncertainty. In order to maintain computational tractability, however, real-world problems are typically discretized in states and actions as well as in time. Assuming synchronous state transitions and actions at fixed rates may result in models which are not strictly Markovian, or where agents are forced to idle between actions, losing their ability to react to sudden changes in the environment. In this work, we explore the application of Generalized Semi-Markov Decision Processes (GSMDPs) to a realistic multi-robot scenario. A case study will be presented in the domain of cooperative robotics, where real-time reactivity must be preserved, and synchronous discrete-time approaches are therefore sub-optimal. This case study is tested on a team of real robots, and also in realistic simulation. By allowing asynchronous events to be modeled over continuous time, the GSMDP approach is shown to provide greater solution quality than its discrete-time counterparts, while still being approximately solvable by existing methods.
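
    Not taken from the paper, but to make the asynchronous-event semantics concrete, here is a minimal Python sketch of one step of a generalized semi-Markov process; the enabled, transition, and sample_delay arguments are hypothetical placeholders to be supplied by a concrete model.

        def step(state, clocks, enabled, transition, sample_delay):
            """Advance the process to the next asynchronous event.

            clocks: dict mapping each enabled event to its remaining delay.
            enabled(state): set of events enabled in `state`.
            transition(state, event): next state after `event` fires.
            sample_delay(event): fresh delay drawn from the event's duration
                distribution (not necessarily exponential, which is why a
                fixed-rate discrete-time model is not strictly Markovian).
            """
            # The event with the smallest remaining clock fires first.
            event, delay = min(clocks.items(), key=lambda kv: kv[1])
            next_state = transition(state, event)
            new_clocks = {}
            for e in enabled(next_state):
                if e == event or e not in clocks:
                    # The fired event and newly enabled events get fresh clocks.
                    new_clocks[e] = sample_delay(e)
                else:
                    # Still-enabled events keep their partially elapsed clocks.
                    new_clocks[e] = clocks[e] - delay
            return next_state, new_clocks, delay

    In a GSMDP, some of these events correspond to the robots' own controllable actions; modelling them over continuous time in this event-driven way avoids forcing agents to idle between synchronous decision epochs.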