
    Planning for Decentralized Control of Multiple Robots Under Uncertainty

    We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes in which a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the presence of uncertainty, but where communication limitations mean that the agents cannot share their state, so execution must proceed in a decentralized fashion. While Dec-POMDPs are typically intractable to solve for real-world problems, recent research on the use of macro-actions in Dec-POMDPs has significantly increased the size of problems that can be solved in practice. We describe this general model and show how, in contrast to most existing methods that are specialized to a particular problem class, it can synthesize control policies that exploit whatever opportunities for coordination are present in the problem, while balancing uncertainty in outcomes, sensor information, and information about other agents. We use three variations on a warehouse task to show that a single planner of this type can generate cooperative behavior using task allocation, direct communication, and signaling, as appropriate.
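
    As a concrete illustration of the decentralized execution this abstract describes, the following minimal Python sketch rolls out a joint Dec-POMDP policy: each agent selects actions from its own observation history alone, while the shared reward is accumulated centrally for evaluation. All function and parameter names are illustrative assumptions, not the authors' implementation.

        def execute_dec_pomdp(policies, transition, observe, reward, state, horizon):
            """Roll out a joint policy once. `policies` maps each agent id to
            a function from that agent's own observation history to an action;
            `transition`, `observe`, and `reward` are assumed model callables."""
            histories = {i: [] for i in policies}   # each agent keeps only its own history
            total = 0.0
            for _ in range(horizon):
                # Decentralized action selection: no agent sees the true state
                # or the other agents' observations.
                joint_action = {i: policies[i](tuple(histories[i])) for i in policies}
                total += reward(state, joint_action)          # shared team reward
                state = transition(state, joint_action)       # stochastic in general
                for i in policies:
                    histories[i].append(observe(i, state, joint_action))
            return total

    A macro-action variant would replace the primitive actions above with option-style controllers that run until a local termination condition fires, which, per the abstract, is what enables the larger problem sizes.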

    Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

    Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions on the problem domain or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems and, for each sub-problem, making optimistic assumptions about the influence that will be exerted by the rest of the system. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning. (This is the long version of an IJCAI 2015 paper, with an extended abstract at AAMAS 2015.)
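
    The core influence-optimistic idea, bounding a very large problem by the sum of sub-problem values computed under best-case external influence, can be sketched as follows. This is an illustrative reading of the abstract, not the paper's algorithm; `local_value` and `influences` are assumed callables.

        def influence_optimistic_upper_bound(subproblems, local_value, influences):
            """subproblems: iterable of sub-problem descriptions.
            local_value(sub, infl): value of `sub` under a fixed external
            influence `infl`; influences(sub): candidate external influences."""
            bound = 0.0
            for sub in subproblems:
                # Optimistic assumption: the rest of the system exerts whichever
                # influence is most favorable to this sub-problem, so each term
                # (and hence the sum) can never underestimate the true optimum.
                bound += max(local_value(sub, infl) for infl in influences(sub))
            return bound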

    Decentralized Stochastic Planning with Anonymity in Interactions

    In this paper, we solve cooperative decentralized stochastic planning problems in which the interactions between agents (specified using transition and reward functions) depend on the number of agents involved in the interaction, not on the identities of the individual agents. A collision of robots in a narrow corridor and defender teams coordinating patrol activities to secure a target are examples of such anonymous interactions. Formally, we consider problems that are a subset of the well-known Decentralized MDP (Dec-MDP) model, where the anonymity in interactions is specified within the joint reward and transition functions. In this paper, we not only introduce a general model called D-SPAIT to capture anonymity in interactions, but also provide optimization-based optimal and locally optimal solutions for generalizable sub-categories of D-SPAIT. (Funding: National Research Foundation, Singapore; Singapore-MIT Alliance for Research and Technology Centre, Future Urban Mobility Program.)
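
    The defining feature of D-SPAIT, as described above, is that rewards and transitions depend only on counts of agents, never on identities. Below is a hedged Python sketch of such a count-based (anonymous) reward; the helper functions and their signatures are hypothetical, not the paper's formulation.

        from collections import Counter

        def anonymous_reward(joint_state_actions, base_reward, congestion_penalty):
            """joint_state_actions: list of (state, action) pairs, one per agent.
            The reward depends only on how many agents chose each pair."""
            counts = Counter(joint_state_actions)   # agent identities are discarded here
            total = 0.0
            for (s, a), n in counts.items():
                # The penalty grows with the count n, e.g. a collision cost
                # that rises with the number of robots sharing a narrow corridor.
                total += n * base_reward(s, a) - congestion_penalty(s, a, n)
            return total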

    Decision Support for Agent Populations in Uncertain and Congested Environments

    This research is motivated by large-scale problems in urban transportation and labor mobility where there is congestion for resources and uncertainty in movement. In such domains, even though the individual agents have no identity of their own and do not explicitly interact with other agents, they still affect other agents. While there has been much research on handling such implicit effects, it has primarily assumed deterministic movements of agents. We address the issue of decision support for individual agents that are identical and have involuntary movements in dynamic environments. For instance, in a taxi fleet serving a city, when a taxi is hired by a customer, its movements are uncontrolled and depend on (a) the customer's requirements; and (b) the locations of other taxis in the fleet. Towards addressing decision support in such problems, we make two key contributions: (a) a framework to represent the decision problem for selfish individuals in a dynamic population, where there is transitional uncertainty (involuntary movements); and (b) two techniques (Fictitious Play for Symmetric Agent Populations, FP-SAP, and Softmax-based Flow Update, SMFU) that converge to equilibrium solutions. We show that our techniques, apart from providing equilibrium strategies, outperform "driver" strategies with respect to the overall availability of taxis and the revenue obtained by the taxi drivers. We demonstrate this on a real-world data set with 8,000 taxis and 83 zones (representing the entire area of Singapore).
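
    Of the two techniques named above, SMFU lends itself to a compact sketch: the fraction of the fleet flowing to each zone is set proportional to a softmax of estimated zone values, and values are re-estimated against the resulting congestion until the flow stabilizes at an approximate equilibrium. This is an assumed form for illustration, not the paper's exact update; `value_given_flow` is a hypothetical callable.

        import math

        def softmax_flow_update(zones, value_given_flow, temperature=1.0,
                                iters=100, tol=1e-6):
            flow = {z: 1.0 / len(zones) for z in zones}   # uniform initial split
            for _ in range(iters):
                # Re-estimate each zone's value under the current congestion.
                values = {z: value_given_flow(z, flow) for z in zones}
                m = max(values.values())                  # stabilize the exponentials
                weights = {z: math.exp((values[z] - m) / temperature) for z in zones}
                total = sum(weights.values())
                new_flow = {z: w / total for z, w in weights.items()}
                if max(abs(new_flow[z] - flow[z]) for z in zones) < tol:
                    break                                  # flow stopped changing
                flow = new_flow
            return flow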

    Distributed model shaping for scaling to decentralized POMDPs with hundreds of agents

    The use of distributed POMDPs for cooperative teams has been severely limited by the incredibly large joint policy space that results from combining the policy spaces of the individual agents. However, much of the computational cost of exploring the entire joint policy space can be avoided by observing that in many domains the important interactions between agents occur in a relatively small set of scenarios, previously defined as coordination locales (CLs) [11]. Moreover, even when numerous interactions might occur, given a set of individual policies there are relatively few actual interactions. Exploiting this observation and building on an existing model shaping algorithm, this paper presents D-TREMOR, an algorithm in which cooperative agents iteratively generate individual policies, identify and communicate possible CLs, and shape their models to account for those interactions.
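
    The iterative shape-and-replan loop the abstract describes might look like the following outline. All helper functions (`plan_individually`, `detect_cls`, `shape_model`) are hypothetical stand-ins for D-TREMOR's actual components, and the convergence test is a simplification.

        def d_tremor_sketch(agents, plan_individually, detect_cls, shape_model,
                            max_rounds=50):
            # Each agent first plans against its own (unshaped) local model.
            policies = {a: plan_individually(a, a.model) for a in agents}
            for _ in range(max_rounds):
                # Each agent announces the coordination locales its current
                # policy could trigger; only these few interactions are exchanged.
                announced = {a: detect_cls(a, policies[a]) for a in agents}
                changed = False
                for a in agents:
                    incoming = [cl for b in agents if b is not a
                                for cl in announced[b]]
                    new_model = shape_model(a.model, incoming)  # fold interactions in
                    if new_model != a.model:
                        a.model = new_model
                        policies[a] = plan_individually(a, new_model)
                        changed = True
                if not changed:      # no agent's model changed: converged
                    break
            return policies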