17 research outputs found

    Planning for Decentralized Control of Multiple Robots Under Uncertainty

    Full text link
    We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes where a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the presence of uncertainty, but where communication limitations mean that the agents cannot share their state, so execution must proceed in a decentralized fashion. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problem that can be practically solved as a Dec-POMDP. We describe this general model, and show how, in contrast to most existing methods that are specialized to a particular problem class, it can synthesize control policies that use whatever opportunities for coordination are present in the problem, while balancing off uncertainty in outcomes, sensor information, and information about other agents. We use three variations on a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate

    Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions

    Get PDF
    The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for multi-robot coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the decentralized partially observable semi-Markov decision process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making by the robots, which is crucial in multi-robot domains. We also present an algorithm for solving this Dec-POSMDP which is much more scalable than previous methods since it can incorporate closed-loop belief space macro-actions in planning. These macro-actions are automatically constructed to produce robust solutions. The proposed method's performance is evaluated on a complex multi-robot package delivery problem under uncertainty, showing that our approach can naturally represent multi-robot problems and provide high-quality solutions for large-scale problems

    Observation-Based Multi-Agent Planning with Communication

    Get PDF
    This research has been sponsored by SELEX ES. We thank Feng Wu for providing the source code of the MAOP-COMM planner.Publisher PD

    Decentralized control of Partially Observable Markov Decision Processes using belief space macro-actions

    Get PDF
    The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making by the robots, which is crucial in multi-robot domains. We also present an algorithm for solving this Dec-POSMDP which is much more scalable than previous methods since it can incorporate closed-loop belief space macro-actions in planning. These macro-actions are automatically constructed to produce robust solutions. The proposed method's performance is evaluated on a complex multi-robot package delivery problem under uncertainty, showing that our approach can naturally represent multi-robot problems and provide high-quality solutions for large-scale problems

    SMCL - Stochastic Model Checker for Learning in Games

    Get PDF
    A stochastic model checker is presented for analysing the performance of game-theoretic learning algorithms. The method enables the comparison of short-term behaviour of learning algorithms intended for practical use. The procedure of comparison is automated and it can be tuned for accuracy and speed. Users can choose from among various learning algorithms to select a suitable one for a given practical problem. The powerful performance of the method is enabled by a novel behaviour-similarity-relation, which compacts large state spaces into small ones. The stochastic model checking tool is tested on a set of examples classified into four categories to demonstrate the effectiveness of selecting suitable algorithms for distributed decision making

    Optimal and Approximate Q-value Functions for Decentralized POMDPs

    Get PDF
    Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem

    Strategies for Scaleable Communication and Coordination in Multi-Agent (UAV) Systems

    Get PDF
    A system is considered in which agents (UAVs) must cooperatively discover interest-points (i.e., burning trees, geographical features) evolving over a grid. The objective is to locate as many interest-points as possible in the shortest possible time frame. There are two main problems: a control problem, where agents must collectively determine the optimal action, and a communication problem, where agents must share their local states and infer a common global state. Both problems become intractable when the number of agents is large. This survey/concept paper curates a broad selection of work in the literature pointing to a possible solution; a unified control/communication architecture within the framework of reinforcement learning. Two components of this architecture are locally interactive structure in the state-space, and hierarchical multi-level clustering for system-wide communication. The former mitigates the complexity of the control problem and the latter adapts to fundamental throughput constraints in wireless networks. The challenges of applying reinforcement learning to multi-agent systems are discussed. The role of clustering is explored in multi-agent communication. Research directions are suggested to unify these components

    Real-time motion planning methods for autonomous on-road driving: state-of-the-art and future research directions

    Get PDF
    Currently autonomous or self-driving vehicles are at the heart of academia and industry research because of its multi-faceted advantages that includes improved safety, reduced congestion, lower emissions and greater mobility. Software is the key driving factor underpinning autonomy within which planning algorithms that are responsible for mission-critical decision making hold a significant position. While transporting passengers or goods from a given origin to a given destination, motion planning methods incorporate searching for a path to follow, avoiding obstacles and generating the best trajectory that ensures safety, comfort and efficiency. A range of different planning approaches have been proposed in the literature. The purpose of this paper is to review existing approaches and then compare and contrast different methods employed for the motion planning of autonomous on-road driving that consists of (1) finding a path, (2) searching for the safest manoeuvre and (3) determining the most feasible trajectory. Methods developed by researchers in each of these three levels exhibit varying levels of complexity and performance accuracy. This paper presents a critical evaluation of each of these methods, in terms of their advantages/disadvantages, inherent limitations, feasibility, optimality, handling of obstacles and testing operational environments. Based on a critical review of existing methods, research challenges to address current limitations are identified and future research directions are suggested so as to enhance the performance of planning algorithms at all three levels. Some promising areas of future focus have been identified as the use of vehicular communications (V2V and V2I) and the incorporation of transport engineering aspects in order to improve the look-ahead horizon of current sensing technologies that are essential for planning with the aim of reducing the total cost of driverless vehicles. This critical review on planning techniques presented in this paper, along with the associated discussions on their constraints and limitations, seek to assist researchers in accelerating development in the emerging field of autonomous vehicle research

    Real-time motion planning methods for autonomous on-road driving: State-of-the-art and future research directions

    Get PDF
    Open access articleCurrently autonomous or self-driving vehicles are at the heart of academia and industry research because of its multi-faceted advantages that includes improved safety, reduced congestion,lower emissions and greater mobility. Software is the key driving factor underpinning autonomy within which planning algorithms that are responsible for mission-critical decision making hold a significant position. While transporting passengers or goods from a given origin to a given destination, motion planning methods incorporate searching for a path to follow, avoiding obstacles and generating the best trajectory that ensures safety, comfort and efficiency. A range of different planning approaches have been proposed in the literature. The purpose of this paper is to review existing approaches and then compare and contrast different methods employed for the motion planning of autonomous on-road driving that consists of (1) finding a path, (2) searching for the safest manoeuvre and (3) determining the most feasible trajectory. Methods developed by researchers in each of these three levels exhibit varying levels of complexity and performance accuracy. This paper presents a critical evaluation of each of these methods, in terms of their advantages/disadvantages, inherent limitations, feasibility, optimality, handling of obstacles and testing operational environments. Based on a critical review of existing methods, research challenges to address current limitations are identified and future research directions are suggested so as to enhance the performance of planning algorithms at all three levels. Some promising areas of future focus have been identified as the use of vehicular communications (V2V and V2I) and the incorporation of transport engineering aspects in order to improve the look-ahead horizon of current sensing technologies that are essential for planning with the aim of reducing the total cost of driverless vehicles. This critical review on planning techniques presented in this paper, along with the associated discussions on their constraints and limitations, seek to assist researchers in accelerating development in the emerging field of autonomous vehicle research
    corecore