
    A fast and tight heuristic for A* in road networks

    We study exact, efficient and practical algorithms for route planning in large road networks. Routing applications often require integrating the current traffic situation, planning ahead with traffic predictions for the future, respecting forbidden turns, and many other features depending on the exact application. While Dijkstra’s algorithm can be used to solve these problems, it is too slow for many applications. A* is a classical approach to accelerate Dijkstra’s algorithm. A* can support many extended scenarios without much additional implementation complexity. However, A*’s performance depends on the availability of a good heuristic that estimates distances. Computing tight distance estimates is a challenge on its own. On road networks, shortest paths can also be quickly computed using hierarchical speedup techniques. They achieve speed and exactness but sacrifice A*’s flexibility. Extending them to certain practical applications can be hard. In this paper, we present an algorithm to efficiently extract distance estimates for A* from Contraction Hierarchies (CH), a hierarchical technique. We call our heuristic CH-Potentials. Our approach allows decoupling the supported extensions from the hierarchical speedup technique. Additionally, we describe A* optimizations to accelerate the processing of low-degree nodes, which often occur in road networks.
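
    To make concrete how A* depends on its distance estimate, the following is a minimal sketch of A* with a pluggable heuristic. It is not the CH-Potentials algorithm from the abstract: the names `a_star`, `example_graph` and `example_heuristic` are hypothetical, the graph is a toy adjacency dictionary, and the heuristic is assumed to be admissible (it never overestimates the remaining distance). With a heuristic that always returns 0, the search behaves like Dijkstra's algorithm.

```python
import heapq


def a_star(graph, source, target, heuristic):
    """Return the shortest distance from source to target, or None if unreachable.

    graph: dict mapping node -> list of (neighbor, non-negative weight).
    heuristic: function node -> lower bound on the distance to target (assumed admissible).
    """
    dist = {source: 0}
    queue = [(heuristic(source), source)]
    while queue:
        key, node = heapq.heappop(queue)
        if node == target:
            return dist[node]
        # Skip stale queue entries left over from an earlier, worse label.
        if key > dist[node] + heuristic(node):
            continue
        for neighbor, weight in graph.get(node, []):
            candidate = dist[node] + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate + heuristic(neighbor), neighbor))
    return None


# Toy example (hypothetical data): integer weights keep the comparisons exact.
example_graph = {
    "a": [("b", 2), ("c", 5)],
    "b": [("c", 1), ("d", 4)],
    "c": [("d", 1)],
    "d": [],
}
# Lower bounds on the distance to "d"; a tighter heuristic settles fewer nodes.
example_heuristic = {"a": 3, "b": 2, "c": 1, "d": 0}.get
```

    The heuristic is passed in as a plain function, which mirrors the decoupling the abstract describes: the search loop does not need to know how the estimates were produced.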

    On Reinforcement Learning for Full-length Game of StarCraft

    StarCraft II poses a grand challenge for reinforcement learning. Its main difficulties include a huge state and action space and a long time horizon. In this paper, we investigate a hierarchical reinforcement learning approach for StarCraft II. The hierarchy involves two levels of abstraction. One is the macro-action automatically extracted from expert's trajectories, which reduces the action space by an order of magnitude yet remains effective. The other is a two-layer hierarchical architecture which is modular and easy to scale, enabling a curriculum transferring from simpler tasks to more complex tasks. The reinforcement training algorithm for this architecture is also investigated. On a 64x64 map and using restrictive units, we achieve a winning rate of more than 99% against the difficulty level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat model, we achieve a winning rate of over 93% for Protoss against the most difficult non-cheating built-in AI (level-7) of Terran, training within two days using a single machine with only 48 CPU cores and 8 K40 GPUs. It also shows strong generalization performance when tested against never-seen opponents, including the cheating levels of the built-in AI and all levels of the Zerg and Protoss built-in AI. We hope this study can shed some light on future research in large-scale reinforcement learning. Comment: Appeared in AAAI 201

    Constraint Propagation on GPU: A Case Study for the Bin Packing Constraint

    The Bin Packing Problem is one of the most important problems in discrete optimization, as it captures the requirements of many real-world problems. Because of its importance, it has been approached with the main theoretical and practical tools. Resolution approaches based on Linear Programming are the most effective, while Constraint Programming proves valuable when the Bin Packing Problem is a component of a larger problem. This work focuses on the Bin Packing constraint and explores how GPUs can be used to enhance its propagation algorithm. Two approaches are motivated and discussed, one based on knapsack reasoning and one using alternative lower bounds. The implementations are evaluated in comparison with state-of-the-art approaches on different benchmarks from the literature. The results indicate that the GPU-accelerated lower bounds offer a desirable alternative to tackle large instances.
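
    As an illustration of what a lower bound on the number of bins looks like, here is a small CPU-only sketch. It uses the classic continuous bound (total size divided by capacity, rounded up) and compares it with the first-fit decreasing heuristic as an upper bound. These are textbook techniques chosen for illustration; the abstract does not say which lower bounds the paper evaluates, and the function names are hypothetical.

```python
def continuous_lower_bound(sizes, capacity):
    """Classic lower bound: no packing can use fewer than ceil(sum / capacity) bins."""
    total = sum(sizes)
    return -(-total // capacity)  # integer ceiling division, avoids float rounding


def first_fit_decreasing(sizes, capacity):
    """Upper bound: number of bins used by the first-fit decreasing heuristic."""
    free_space = []  # remaining capacity of each open bin
    for size in sorted(sizes, reverse=True):
        for index, free in enumerate(free_space):
            if size <= free:
                free_space[index] = free - size
                break
        else:
            # No open bin can take the item, so open a new one.
            free_space.append(capacity - size)
    return len(free_space)
```

    When the two values coincide, the heuristic packing is provably optimal; a propagator can also use the lower bound to fail early when it exceeds the number of available bins.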

    Modularity in answer set programs

    Answer set programming (ASP) is an approach to rule-based constraint programming allowing flexible knowledge representation in a variety of application areas. The declarative nature of ASP is reflected in problem solving. First, a programmer writes down a logic program whose answer sets correspond to the solutions of the problem. The answer sets of the program are then computed using a special-purpose search engine, an ASP solver. The development of efficient ASP solvers has enabled the use of answer set programming in various application domains such as planning, product configuration, computer-aided verification, and bioinformatics. The topic of this thesis is modularity in answer set programming. While modern programming languages typically provide means to exploit modularity in a number of ways to govern the complexity of programs and their development process, relatively little attention has been paid to modularity in ASP. When designing a module architecture for ASP, it is essential to establish full compositionality of the semantics with respect to the module system. A balance is sought between introducing restrictions that guarantee the compositionality of the semantics and enforce a good programming style in ASP, and avoiding restrictions on the module hierarchy for the sake of flexibility of knowledge representation. To justify replacing one module with another, that is, to guarantee that changes made at the level of modules do not alter the semantics of the program as a whole, a notion of equivalence for modules is provided. In close connection with the development of the compositional module architecture, a transformation from verification of equivalence to search for answer sets is developed. The translation-based approach makes it unnecessary to develop a dedicated tool for the equivalence verification task by allowing the direct use of existing ASP solvers.
Translations and transformations between different problems, program classes, and formalisms are another central theme in the thesis. To guarantee efficiency and soundness of the translation-based approach, certain syntactical and semantical properties of transformations are desirable, in terms of translation time, solution correspondence between the original and the transformed problem, and locality/globality of a particular transformation. In certain cases a more refined notion of minimality than that inherent in ASP can make program encodings more intuitive. Lifschitz' parallel and prioritized circumscription offer a solution in which certain atoms are allowed to vary or to have fixed values while others are falsified as far as possible according to priority classes. In this thesis a linear and faithful transformation embedding parallel and prioritized circumscription into ASP is provided. This enhances the knowledge representation capabilities of answer set programming by allowing the use of existing ASP solvers for computing parallel and prioritized circumscription.

    Clause/Term Resolution and Learning in the Evaluation of Quantified Boolean Formulas

    Resolution is the rule of inference at the basis of most procedures for automated reasoning. In these procedures, the input formula is first translated into an equisatisfiable formula in conjunctive normal form (CNF) and then represented as a set of clauses. Deduction starts by inferring new clauses by resolution, and goes on until the empty clause is generated or satisfiability of the set of clauses is proven, e.g., because no new clauses can be generated. In this paper, we restrict our attention to the problem of evaluating Quantified Boolean Formulas (QBFs). In this setting, the deduction process outlined above is known to be sound and complete if given a formula in CNF and if a form of resolution, called Q-resolution, is used. We introduce Q-resolution on terms, to be used for formulas in disjunctive normal form. We show that the computation performed by most of the available procedures for QBFs (based on the Davis-Logemann-Loveland procedure (DLL) for propositional satisfiability) corresponds to a tree in which Q-resolution on terms and clauses alternate. This lays the theoretical basis for the introduction of learning, corresponding to recording Q-resolution formulas associated with the nodes of the tree. We discuss the problems related to the introduction of learning in DLL-based procedures, and present solutions extending state-of-the-art proposals coming from the literature on propositional satisfiability. Finally, we show that our DLL-based solver extended with learning performs significantly better on benchmarks used in the 2003 QBF solvers comparative evaluation.
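
    The basic resolution step mentioned at the start of this abstract can be sketched as follows. This is plain propositional resolution on clauses only; Q-resolution additionally handles universally quantified variables, and that part is deliberately not shown here. The encoding (clauses as frozensets of non-zero integers, with a negative number denoting a negated variable) and the function name `resolve` are assumptions made for this illustration.

```python
def resolve(clause_a, clause_b, variable):
    """Resolve two clauses on `variable`.

    clause_a must contain the positive literal `variable` and clause_b the
    negative literal `-variable`. Returns the resolvent as a frozenset, or
    None when the resolvent is a tautology (contains some x and -x).
    """
    if variable not in clause_a or -variable not in clause_b:
        raise ValueError("clauses do not clash on the given variable")
    resolvent = (clause_a - {variable}) | (clause_b - {-variable})
    if any(-literal in resolvent for literal in resolvent):
        return None  # tautological resolvents carry no information
    return frozenset(resolvent)
```

    For example, resolving (x1 or x2) with (not x1 or x3) on x1 yields (x2 or x3), and resolving the unit clauses (x1) and (not x1) yields the empty clause, which signals unsatisfiability.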

    Goal reasoning for autonomous agents using automated planning

    International Mention in the doctoral degree. Automated planning deals with the task of finding a sequence of actions, namely a plan, which achieves a goal from a given initial state. Most planning research assumes that goals are provided by an external user, and that agents just have to find a plan to achieve them. However, there exist many real-world domains where agents should not only reason about their actions but also about their goals, generating new ones or changing them according to the perceived environment. In this thesis we aim at broadening the goal reasoning capabilities of planning-based agents, both when acting in isolation and when operating in the same environment as other agents. In single-agent settings, we first explore a special type of planning task where we aim at discovering states that fulfill certain cost-based requirements with respect to a given set of goals. By computing these states, agents are able to solve interesting tasks such as finding escape plans that move agents into safe places, hiding their true goal from a potential observer, or anticipating dynamically arriving goals. We also show how learning the environment’s dynamics may help agents to solve some of these tasks. Experimental results show that these states can be quickly found in practice, making agents able to solve new planning tasks and helping them in solving some existing ones. In multi-agent settings, we study the automated generation of goals based on other agents’ behavior. We focus on competitive scenarios, where we are interested in computing counterplans that prevent opponents from achieving their goals. We frame these tasks as counterplanning, providing theoretical properties of the counterplans that solve them. We also show how agents can benefit from computing some of the states we propose in the single-agent setting to anticipate their opponent’s movements, thus increasing the odds of blocking them.
Experimental results show how counterplans can be found in different environments ranging from competitive planning domains to real-time strategy games. Doctoral Program in Computer Science and Technology, Universidad Carlos III de Madrid. Chair: Eva Onaindía de la Rivaherrera. Secretary: Ángel García Olaya. Member: Mark Robert

    A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

    We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is to learn policies over multiple competing objectives whose relative importance (preferences) is unknown to the agent. While this alleviates dependence on scalar reward design, the expected return of a policy can change significantly with varying preferences, making it challenging to learn a single model to produce optimal policies under different preference conditions. We propose a generalized version of the Bellman equation to learn a single parametric representation for optimal policies over the space of all possible preferences. After an initial learning phase, our agent can execute the optimal policy under any given preference, or automatically infer an underlying preference with very few samples. Experiments across four different domains demonstrate the effectiveness of our approach. Comment: Accepted in NeurIPS 201
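
    The role of a linear preference can be illustrated with a tiny sketch: each action has a vector of per-objective value estimates, and a preference vector turns that vector into a single score by a weighted sum. This only shows linear scalarization at action-selection time; it is not the generalized Bellman equation proposed in the paper, and the function name and the example numbers are hypothetical.

```python
def greedy_action(q_vectors, preference):
    """Pick the action whose value vector scores highest under a linear preference.

    q_vectors: dict mapping action -> list of per-objective value estimates.
    preference: list of non-negative weights, one per objective.
    """
    def scalarized(action):
        return sum(w * q for w, q in zip(preference, q_vectors[action]))

    return max(q_vectors, key=scalarized)


# Hypothetical two-objective estimates: (progress, negative risk).
example_q = {"fast": [10, -8], "safe": [4, -1]}
```

    With the preference `[1, 0]` only progress matters and `"fast"` is chosen; with `[1, 4]` risk dominates and `"safe"` is chosen. The same value vectors yield different greedy actions, which is why the expected return of a fixed policy can vary so much across preferences.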

    Learning Strong Substitutes Demand via Queries

    This paper addresses the computational challenges of learning strong substitutes demand when given access to a demand (or valuation) oracle. Strong substitutes demand generalises the well-studied gross substitutes demand to a multi-unit setting. Recent work by Baldwin and Klemperer shows that any such demand can be expressed in a natural way as a finite list of weighted bid vectors. A simplified version of this bidding language has been used by the Bank of England. Assuming access to a demand oracle, we provide an algorithm that computes the unique list of weighted bid vectors corresponding to a bidder's demand preferences. In the special case where their demand can be expressed using positive bids only, we have an efficient algorithm that learns this list in linear time. We also show super-polynomial lower bounds on the query complexity of computing the list of bids in the general case where bids may be positive and negative. Our algorithms constitute the first systematic approach for bidders to construct a bid list corresponding to non-trivial demand, allowing them to participate in 'product-mix' auctions.