
    Marvin: A Heuristic Search Planner with Online Macro-Action Learning

    This paper describes Marvin, a planner that competed in the Fourth International Planning Competition (IPC 4). Marvin uses action-sequence-memoisation techniques to generate macro-actions, which are then used during search for a solution plan. We provide an overview of its architecture and search behaviour, detailing the algorithms used. We also empirically demonstrate the effectiveness of its features in various planning domains; in particular, the performance effects of using macro-actions, the novel features of its search behaviour, and the native support for ADL and Derived Predicates.
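    The sketch below illustrates the general idea of action-sequence memoisation: remembering action sequences that proved useful during search (for example, for escaping heuristic plateaus) and offering them back to the planner as single macro-action steps. The class, the promotion threshold, and the action names are hypothetical and are not taken from Marvin's implementation.

```python
# Minimal sketch (not Marvin's actual code): memoise action sequences and
# promote frequently seen ones to macro-actions usable as single search steps.
from collections import Counter

class MacroLibrary:
    """Remembers action sequences and promotes frequent ones to macro-actions."""

    def __init__(self, promote_after=2):
        self.counts = Counter()
        self.promote_after = promote_after

    def record(self, action_sequence):
        # Memoise the sequence that led from one state to a heuristically better one.
        self.counts[tuple(action_sequence)] += 1

    def macros(self):
        # Sequences seen often enough are offered to the search as single steps.
        return [list(seq) for seq, n in self.counts.items() if n >= self.promote_after]

# Usage: after escaping a plateau via the (hypothetical) sequence below, record it;
# once promoted, the whole sequence can be applied as one macro-action.
lib = MacroLibrary()
lib.record(["unstack", "putdown", "pickup"])
lib.record(["unstack", "putdown", "pickup"])
print(lib.macros())  # [['unstack', 'putdown', 'pickup']]
```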

    Resource Allocation Among Agents with MDP-Induced Preferences

    Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute actions in stochastic environments, modeled as Markov decision processes (MDPs), such that the value of a resource bundle is defined as the expected value of the optimal MDP policy realizable given these resources. We present an algorithm that simultaneously solves the resource-allocation and the policy-optimization problems. This allows us to avoid explicitly representing utilities over exponentially many resource bundles, leading to drastic (often exponential) reductions in computational complexity. We then use this algorithm in the context of self-interested agents to design a combinatorial auction for allocating resources. We empirically demonstrate the effectiveness of our approach by showing that it can, in minutes, optimally solve problems for which a straightforward combinatorial resource-allocation technique would require the agents to enumerate up to 2^100 resource bundles and the auctioneer to solve an NP-complete problem with an input of that size.
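    As a rough illustration of MDP-induced preferences, the sketch below values a resource bundle as the expected value, from an assumed initial state, of the best policy that uses only the actions that bundle enables. The tiny MDP, the resource requirements, and the function names are invented for illustration; the paper's algorithm solves allocation and policy optimization jointly rather than enumerating and valuing bundles one by one like this.

```python
# Minimal sketch (assumptions throughout): value a resource bundle via value
# iteration restricted to actions whose required resources are in the bundle.

def bundle_value(bundle, states, actions, needs, P, R, gamma=0.9, iters=200):
    """Expected value of the best policy realizable with `bundle`, from state s0."""
    V = {s: 0.0 for s in states}
    enabled = [a for a in actions if needs[a] <= bundle]  # subset test on resource sets
    for _ in range(iters):
        for s in states:
            V[s] = max(
                (sum(p * (R[s, a] + gamma * V[s2]) for s2, p in P[s, a].items())
                 for a in enabled),
                default=0.0,  # no enabled actions: the bundle is worth nothing here
            )
    return V["s0"]

# Hypothetical two-state MDP: "work" needs a tool and earns reward, "rest" is free.
states = ["s0", "s1"]
actions = ["work", "rest"]
needs = {"work": {"tool"}, "rest": set()}
P = {("s0", "work"): {"s1": 1.0}, ("s0", "rest"): {"s0": 1.0},
     ("s1", "work"): {"s1": 1.0}, ("s1", "rest"): {"s0": 1.0}}
R = {("s0", "work"): 1.0, ("s0", "rest"): 0.0,
     ("s1", "work"): 2.0, ("s1", "rest"): 0.0}

print(bundle_value({"tool"}, states, actions, needs, P, R))  # bundle containing the tool
print(bundle_value(set(), states, actions, needs, P, R))     # empty bundle -> 0.0
```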

    Distributed Constraint Optimization: Privacy Guarantees and Stochastic Uncertainty

    Distributed Constraint Satisfaction (DisCSP) and Distributed Constraint Optimization (DCOP) are formal frameworks that can be used to model a variety of problems in which multiple decision-makers cooperate towards a common goal: from computing an equilibrium of a game, to vehicle routing problems, to combinatorial auctions. In this thesis, we independently address two important issues in such multi-agent problems: 1) how to provide strong guarantees on the protection of the participants' privacy, and 2) how to anticipate future, uncontrollable events.

    On the privacy front, our contributions depart from previous work in two ways. First, we consider not only constraint privacy (the agents' private costs) and decision privacy (keeping the complete solution secret), but also two other types of privacy that have been largely overlooked in the literature: agent privacy, which concerns protecting the identities of the participants, and topology privacy, which covers information about the agents' co-dependencies. Second, while previous work focused mainly on quantitatively measuring and reducing privacy loss, our algorithms provide stronger, qualitative guarantees on what information will remain secret. Our experiments show that it is possible to provide such privacy guarantees while still scaling to much larger problems than the previous state of the art.

    When it comes to reasoning under uncertainty, we propose an extension to the DCOP framework, called DCOP under Stochastic Uncertainty (StochDCOP), which includes uncontrollable random variables with known probability distributions that model uncertain future events. The problem becomes one of making "optimal" offline decisions, before the true values of the random variables can be observed. We consider three possible concepts of optimality: minimizing the expected cost, minimizing the worst-case cost, or maximizing the probability of a posteriori optimality. We propose a new family of StochDCOP algorithms, exploring the tradeoffs between solution quality, computational and message complexity, and privacy. In particular, we show how discovering and reasoning about co-dependencies on common random variables can yield higher-quality solutions.
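    As a minimal, non-distributed illustration of the three optimality concepts mentioned above, the sketch below scores a candidate offline assignment by its expected cost, its worst-case cost, and the probability that it is a posteriori optimal under a known distribution over one random variable. The cost function, the variables, and the distribution are all hypothetical; real StochDCOP algorithms compute such decisions distributedly among the agents.

```python
# Minimal sketch (assumed formulation): evaluate a candidate offline assignment
# under a known distribution over an uncontrollable random variable r.
import itertools

def evaluate(assignment, rand_dist, cost, all_assignments):
    # Expected cost over the distribution of r.
    exp_cost = sum(p * cost(assignment, r) for r, p in rand_dist.items())
    # Worst-case cost over the possible values of r.
    worst_cost = max(cost(assignment, r) for r in rand_dist)
    # Probability that this assignment would also be optimal in hindsight.
    p_posteriori_opt = sum(
        p for r, p in rand_dist.items()
        if cost(assignment, r) == min(cost(a, r) for a in all_assignments)
    )
    return exp_cost, worst_cost, p_posteriori_opt

# Two agents each choose 0 or 1; one binary random variable with P(r=1) = 0.3.
cost = lambda a, r: abs(a[0] - r) + abs(a[1] - r)   # hypothetical shared cost function
all_assignments = list(itertools.product([0, 1], repeat=2))
rand_dist = {0: 0.7, 1: 0.3}

for a in all_assignments:
    print(a, evaluate(a, rand_dist, cost, all_assignments))
```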