Marvin: A Heuristic Search Planner with Online Macro-Action Learning
This paper describes Marvin, a planner that competed in the Fourth
International Planning Competition (IPC 4). Marvin uses
action-sequence-memoisation techniques to generate macro-actions, which are
then used during search for a solution plan. We provide an overview of its
architecture and search behaviour, detailing the algorithms used. We also
empirically demonstrate the effectiveness of its features in various planning
domains; in particular, the effects on performance due to the use of
macro-actions, the novel features of its search behaviour, and the native
support of ADL and Derived Predicates.
Resource Allocation Among Agents with MDP-Induced Preferences
Allocating scarce resources among agents to maximize global utility is, in
general, computationally challenging. We focus on problems where resources
enable agents to execute actions in stochastic environments, modeled as Markov
decision processes (MDPs), such that the value of a resource bundle is defined
as the expected value of the optimal MDP policy realizable given these
resources. We present an algorithm that simultaneously solves the
resource-allocation and the policy-optimization problems. This allows us to
avoid explicitly representing utilities over exponentially many resource
bundles, leading to drastic (often exponential) reductions in computational
complexity. We then use this algorithm in the context of self-interested agents
to design a combinatorial auction for allocating resources. We empirically
demonstrate the effectiveness of our approach by showing that it can, in
minutes, optimally solve problems for which a straightforward combinatorial
resource-allocation technique would require the agents to enumerate up to 2^100
resource bundles and the auctioneer to solve an NP-complete problem with an
input of that size.
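The definition of a bundle's value used in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm, which avoids evaluating bundles one by one; it only shows what "the expected value of the optimal MDP policy realizable given these resources" means for a single bundle. The toy MDP, the `requires` mapping from actions to resources, the discount factor, and all names are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# transitions[state][action] = list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical toy MDP: "drive" needs a truck, "wait" needs nothing.
TRANSITIONS: Transitions = {
    "start": {
        "wait": [(1.0, "start", 0.0)],
        "drive": [(0.8, "done", 10.0), (0.2, "start", 0.0)],
    },
    "done": {"wait": [(1.0, "done", 0.0)]},
}
REQUIRES: Dict[str, FrozenSet[str]] = {
    "wait": frozenset(),
    "drive": frozenset({"truck"}),
}


def bundle_value(
    transitions: Transitions,
    requires: Dict[str, FrozenSet[str]],
    bundle: FrozenSet[str],
    start: str,
    gamma: float = 0.9,   # assumed discount factor
    sweeps: int = 200,    # fixed number of value-iteration sweeps
) -> float:
    """Value of `bundle`: optimal expected discounted reward from `start`
    when only actions whose required resources are in the bundle may be used."""
    allowed = {a for a, need in requires.items() if need <= bundle}
    values = {s: 0.0 for s in transitions}
    for _ in range(sweeps):
        updated = {}
        for state, actions in transitions.items():
            q_values = [
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for action, outcomes in actions.items()
                if action in allowed
            ]
            # A state with no usable action is treated as worth nothing.
            updated[state] = max(q_values, default=0.0)
        values = updated
    return values[start]
```

Calling `bundle_value(TRANSITIONS, REQUIRES, frozenset({"truck"}), "start")` gives a strictly higher value than the empty bundle, which is the sense in which resources induce preferences. Enumerating this quantity for every subset of 100 resources is exactly the 2^100 blow-up the abstract says the combined approach avoids.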
Distributed Constraint Optimization: Privacy Guarantees and Stochastic Uncertainty
Distributed Constraint Satisfaction (DisCSP) and Distributed Constraint Optimization (DCOP) are formal frameworks that can be used to model a variety of problems in which multiple decision-makers cooperate towards a common goal: from computing an equilibrium of a game, to vehicle routing problems, to combinatorial auctions. In this thesis, we independently address two important issues in such multi-agent problems: 1) how to provide strong guarantees on the protection of the privacy of the participants, and 2) how to anticipate future, uncontrollable events.
On the privacy front, our contributions depart from previous work in two ways. First, we consider not only constraint privacy (the agents' private costs) and decision privacy (keeping the complete solution secret), but also two other types of privacy that have been largely overlooked in the literature: agent privacy, which has to do with protecting the identities of the participants, and topology privacy, which covers information about the agents' co-dependencies. Second, while previous work focused mainly on quantitatively measuring and reducing privacy loss, our algorithms provide stronger, qualitative guarantees on what information will remain secret. Our experiments show that it is possible to provide such privacy guarantees, while still scaling to much larger problems than the previous state of the art.
When it comes to reasoning under uncertainty, we propose an extension to the DCOP framework, called DCOP under Stochastic Uncertainty (StochDCOP), which includes uncontrollable, random variables with known probability distributions that model uncertain, future events. The problem becomes one of making "optimal" offline decisions, before the true values of the random variables can be observed. We consider three possible concepts of optimality: minimizing the expected cost, minimizing the worst-case cost, or maximizing the probability of a-posteriori optimality. We propose a new family of StochDCOP algorithms, exploring the tradeoffs between solution quality, computational and message complexity, and privacy. In particular, we show how discovering and reasoning about co-dependencies on common random variables can yield higher-quality solutions.
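The three optimality concepts named in the abstract above can be contrasted on a toy example. This is a centralized sketch with one decision and one random variable, not a StochDCOP algorithm from the thesis; the cost table, the probabilities, and the function names are all hypothetical.

```python
from typing import Dict

Costs = Dict[str, Dict[str, float]]  # cost[decision][scenario]
Probs = Dict[str, float]             # probability of each scenario

# Hypothetical data: three candidate decisions, two possible scenarios.
COST: Costs = {
    "a": {"s1": 0.0, "s2": 10.0},
    "b": {"s1": 4.0, "s2": 4.0},
    "c": {"s1": 1.0, "s2": 6.0},
}
PROB: Probs = {"s1": 0.6, "s2": 0.4}


def min_expected_cost(cost: Costs, prob: Probs) -> str:
    """Decision with the lowest probability-weighted average cost."""
    return min(cost, key=lambda d: sum(p * cost[d][s] for s, p in prob.items()))


def min_worst_case_cost(cost: Costs, prob: Probs) -> str:
    """Decision whose most expensive scenario is cheapest."""
    return min(cost, key=lambda d: max(cost[d][s] for s in prob))


def max_prob_optimal(cost: Costs, prob: Probs) -> str:
    """Decision most likely to turn out optimal once the scenario is known."""
    best = {s: min(cost[d][s] for d in cost) for s in prob}
    return max(
        cost,
        key=lambda d: sum(p for s, p in prob.items() if cost[d][s] == best[s]),
    )
```

On this data the three criteria disagree: expected cost selects `c`, worst-case cost selects `b`, and probability of a-posteriori optimality selects `a`, which is why the choice of optimality concept matters before any algorithm is picked.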