118 research outputs found
A fast and tight heuristic for A* in road networks
We study exact, efficient and practical algorithms for route planning in large road networks. Routing applications often require integrating the current traffic situation, planning ahead with traffic predictions for the future, respecting forbidden turns, and many other features depending on the exact application. While Dijkstra’s algorithm can be used to solve these problems, it is too slow for many applications. A* is a classical approach to accelerate Dijkstra’s algorithm. A* can support many extended scenarios without much additional implementation complexity. However, A*’s performance depends on the availability of a good heuristic that estimates distances. Computing tight distance estimates is a challenge on its own. On road networks, shortest paths can also be quickly computed using hierarchical speed-up techniques. They achieve speed and exactness but sacrifice A*’s flexibility. Extending them to certain practical applications can be hard. In this paper, we present an algorithm to efficiently extract distance estimates for A* from Contraction Hierarchies (CH), a hierarchical technique. We call our heuristic CH-Potentials. Our approach allows decoupling the supported extensions from the hierarchical speed-up technique. Additionally, we describe A* optimizations to accelerate the processing of low-degree nodes, which often occur in road networks.
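As a rough illustration of the role a distance heuristic plays in A*, the following minimal sketch runs A* with a potential function on a toy graph. It does not implement CH-Potentials: the potentials here are simply exact distances obtained by a backward Dijkstra search on an uncongested copy of the graph, which is an assumption made for illustration. The graph, node names, and the helpers `dijkstra`, `reverse`, and `a_star` are all hypothetical.

```python
import heapq

INF = float("inf")

def dijkstra(graph, source):
    """Exact distances from source; graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, INF):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return dist

def reverse(graph):
    """Graph with every edge flipped, used to search backward from the target."""
    rev = {u: [] for u in graph}
    for u, edges in graph.items():
        for v, w in edges:
            rev.setdefault(v, []).append((u, w))
    return rev

def a_star(graph, source, target, potential):
    """A* search; potential(v) must be a consistent lower bound on dist(v, target)."""
    dist = {source: 0}
    queue = [(potential(source), source)]
    settled = set()
    while queue:
        _, u = heapq.heappop(queue)
        if u == target:
            return dist[u]
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph.get(u, []):
            nd = dist[u] + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(queue, (nd + potential(v), v))
    return INF

# Hypothetical toy network with uncongested travel times.
base_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 1), ("d", 5)],
    "c": [("d", 1)],
    "d": [],
}
to_target = dijkstra(reverse(base_graph), "d")
potential = lambda v: to_target.get(v, INF)

# Same network with one edge slowed down (for example by traffic).
congested = dict(base_graph)
congested["b"] = [("c", 10), ("d", 5)]

print(a_star(base_graph, "a", "d", potential))
print(a_star(congested, "a", "d", potential))
```

Because weights in `congested` only increase relative to `base_graph`, the potentials computed on the base graph remain valid lower bounds, so the query stays exact even though the heuristic was computed before the weights changed.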
On Reinforcement Learning for Full-length Game of StarCraft
StarCraft II poses a grand challenge for reinforcement learning. Its main
difficulties include huge state and action spaces and a long time horizon.
In this paper, we investigate a hierarchical reinforcement learning approach
for StarCraft II. The hierarchy involves two levels of abstraction. One is the
macro-action automatically extracted from experts' trajectories, which reduces
the action space by an order of magnitude yet remains effective. The other is a
two-layer hierarchical architecture which is modular and easy to scale,
enabling a curriculum transferring from simpler tasks to more complex tasks.
The reinforcement training algorithm for this architecture is also
investigated. On a 64x64 map and using a restricted set of units, we achieve a
winning rate of more than 99% against the difficulty level-1 built-in AI. Through
the curriculum transfer learning algorithm and a mixture of combat models, we can
achieve a winning rate of over 93% for Protoss against the most difficult
non-cheating built-in AI (level-7) of Terran, training within two days on a
single machine with only 48 CPU cores and 8 K40 GPUs. It also shows strong
generalization performance when tested against never-seen opponents, including
cheating-level built-in AI and all levels of Zerg and Protoss built-in AI. We
hope this study can shed some light on future research in large-scale
reinforcement learning.
Comment: Appeared in AAAI 201
Constraint Propagation on GPU: A Case Study for the Bin Packing Constraint
The Bin Packing Problem is one of the most important problems in discrete
optimization, as it captures the requirements of many real-world problems.
Because of its importance, it has been approached with the main theoretical and
practical tools. Resolution approaches based on Linear Programming are the most
effective, while Constraint Programming proves valuable when the Bin Packing
Problem is a component of a larger problem. This work focuses on the Bin
Packing constraint and explores how GPUs can be used to enhance its propagation
algorithm. Two approaches are motivated and discussed, one based on knapsack
reasoning and one using alternative lower bounds. The implementations are
evaluated in comparison with state-of-the-art approaches on different
benchmarks from the literature. The results indicate that the GPU-accelerated
lower bounds offer a desirable alternative for tackling large instances.
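The abstract does not say which lower bounds the propagator uses. As a hedged illustration of what a bin packing lower bound looks like, the sketch below computes two textbook bounds on the number of bins: the simple volume bound (often called L1) and the Martello-Toth L2 bound. Both are chosen here as assumptions for illustration; the code runs on the CPU, assumes integer item sizes no larger than the capacity, and the function names are hypothetical.

```python
def l1_bound(sizes, capacity):
    """Volume bound: total size divided by capacity, rounded up."""
    return -(-sum(sizes) // capacity)

def l2_bound(sizes, capacity):
    """Martello-Toth L2 bound: best bound over all thresholds alpha in [0, capacity/2]."""
    best = 0
    for alpha in range(capacity // 2 + 1):
        # Items too large to share a bin with any item of size >= alpha.
        large = [s for s in sizes if s > capacity - alpha]
        # Items larger than half a bin: each still needs its own bin.
        medium = [s for s in sizes if capacity - alpha >= s > capacity / 2]
        # Items of size >= alpha that fit only next to medium items or in new bins.
        small = [s for s in sizes if capacity / 2 >= s >= alpha]
        residual = len(medium) * capacity - sum(medium)
        overflow = sum(small) - residual
        extra = -(-overflow // capacity) if overflow > 0 else 0
        best = max(best, len(large) + len(medium) + extra)
    return best

sizes = [6, 6, 6]  # hypothetical instance
print(l1_bound(sizes, 10), l2_bound(sizes, 10))
```

On this instance the volume bound misses the fact that no two items fit together, while L2 detects it. Each threshold `alpha` in the loop is evaluated independently of the others.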
Modularity in answer set programs
Answer set programming (ASP) is an approach to rule-based constraint programming allowing flexible knowledge representation in a variety of application areas. The declarative nature of ASP is reflected in problem solving. First, a programmer writes down a logic program the answer sets of which correspond to the solutions of the problem. The answer sets of the program are then computed using a special purpose search engine, an ASP solver. The development of efficient ASP solvers has enabled the use of answer set programming in various application domains such as planning, product configuration, computer-aided verification, and bioinformatics.
The topic of this thesis is modularity in answer set programming. While modern programming languages typically provide means to exploit modularity in a number of ways to govern the complexity of programs and their development process, relatively little attention has been paid to modularity in ASP. When designing a module architecture for ASP, it is essential to establish full compositionality of the semantics with respect to the module system. A balance is sought between introducing restrictions that guarantee the compositionality of the semantics and enforce a good programming style in ASP, and avoiding restrictions on the module hierarchy for the sake of flexibility of knowledge representation.
To justify a replacement of a module with another, that is, to be able to guarantee that changes made on the level of modules do not alter the semantics of the program when seen as an entity, a notion of equivalence for modules is provided. In close connection with the development of the compositional module architecture, a transformation from verification of equivalence to search for answer sets is developed. The translation-based approach makes it unnecessary to develop a dedicated tool for the equivalence verification task by allowing the direct use of existing ASP solvers.
Translations and transformations between different problems, program classes, and formalisms are another central theme in the thesis. To guarantee efficiency and soundness of the translation-based approach, certain syntactical and semantical properties of transformations are desirable, in terms of translation time, solution correspondence between the original and the transformed problem, and locality/globality of a particular transformation.
In certain cases a more refined notion of minimality than that inherent in ASP can make program encodings more intuitive. Lifschitz' parallel and prioritized circumscription offer a solution in which certain atoms are allowed to vary or to have fixed values while others are falsified as far as possible according to priority classes. In this thesis a linear and faithful transformation embedding parallel and prioritized circumscription into ASP is provided. This enhances the knowledge representation capabilities of answer set programming by allowing the use of existing ASP solvers for computing parallel and prioritized circumscription.
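To make the notion of an answer set from the abstract above concrete, here is a minimal brute-force sketch for ground normal logic programs, using the standard Gelfond-Lifschitz reduct. It is not how real ASP solvers work and it does not cover the module system or the circumscription translation of the thesis. The rule encoding `(head, positive_body, negative_body)` and the function names are assumptions made for this example.

```python
from itertools import combinations

def least_model(positive_rules):
    """Least model of a negation-free program given as (head, body_set) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in positive_rules:
            if head not in model and body <= model:
                model.add(head)
                changed = True
    return model

def answer_sets(rules):
    """Enumerate answer sets by testing every candidate set of atoms."""
    atoms = sorted({h for h, _, _ in rules} | {a for _, pos, neg in rules for a in pos | neg})
    found = []
    for k in range(len(atoms) + 1):
        for candidate in combinations(atoms, k):
            s = set(candidate)
            # Reduct: drop rules blocked by s, then drop the negative bodies.
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & s)]
            if least_model(reduct) == s:
                found.append(s)
    return found

# Hypothetical program:  p :- not q.   q :- not p.
program = [("p", set(), {"q"}), ("q", set(), {"p"})]
print(answer_sets(program))
```

The two rules encode a choice between `p` and `q`, so the program has one answer set containing only `p` and one containing only `q`. The enumeration is exponential in the number of atoms and is meant only to show the definition at work.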
Clause/Term Resolution and Learning in the Evaluation of Quantified Boolean Formulas
Resolution is the rule of inference at the basis of most procedures for
automated reasoning. In these procedures, the input formula is first translated
into an equisatisfiable formula in conjunctive normal form (CNF) and then
represented as a set of clauses. Deduction starts by inferring new clauses by
resolution, and goes on until the empty clause is generated or satisfiability
of the set of clauses is proven, e.g., because no new clauses can be generated.
In this paper, we restrict our attention to the problem of evaluating
Quantified Boolean Formulas (QBFs). In this setting, the above outlined
deduction process is known to be sound and complete if given a formula in CNF
and if a form of resolution, called Q-resolution, is used. We introduce
Q-resolution on terms, to be used for formulas in disjunctive normal form. We
show that the computation performed by most of the available procedures for
QBFs (based on the Davis-Logemann-Loveland (DLL) procedure for propositional
satisfiability) corresponds to a tree in which Q-resolution on terms and
clauses alternate. This lays the theoretical basis for the introduction of
learning, corresponding to recording Q-resolution formulas associated with the
nodes of the tree. We discuss the problems related to the introduction of
learning in DLL-based procedures, and present solutions extending
state-of-the-art proposals coming from the literature on propositional
satisfiability. Finally, we show that our DLL-based solver extended with
learning performs significantly better on benchmarks used in the 2003 QBF
solvers comparative evaluation.
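For readers unfamiliar with Q-resolution on clauses, the following minimal sketch shows one resolution step followed by universal reduction for a prenex CNF formula. It covers only the clause side; term resolution and learning, which are the contributions described in the abstract, are not shown. The encoding is an assumption for illustration: literals are nonzero integers, and the hypothetical `prefix` dictionary maps each variable to a pair of quantifier kind (`"e"` or `"a"`) and prefix level, with outer quantifiers having smaller levels.

```python
def universal_reduce(clause, prefix):
    """Drop universal literals quantified after every existential literal in the clause."""
    exist_levels = [prefix[abs(l)][1] for l in clause if prefix[abs(l)][0] == "e"]
    deepest = max(exist_levels, default=0)
    return frozenset(
        l for l in clause
        if prefix[abs(l)][0] == "e" or prefix[abs(l)][1] < deepest
    )

def q_resolve(c1, c2, pivot, prefix):
    """Resolve c1 (containing pivot) with c2 (containing -pivot) on an existential pivot."""
    assert prefix[pivot][0] == "e" and pivot in c1 and -pivot in c2
    resolvent = (c1 - {pivot}) | (c2 - {-pivot})
    if any(-l in resolvent for l in resolvent):
        return None  # tautological resolvents are not allowed
    return universal_reduce(resolvent, prefix)

# Hypothetical prefix: exists x1, forall x2, exists x3.
prefix = {1: ("e", 1), 2: ("a", 2), 3: ("e", 3)}
c1 = frozenset({1, 2, 3})   # x1 or x2 or x3
c2 = frozenset({1, -3})     # x1 or not x3
print(q_resolve(c1, c2, 3, prefix))
```

Resolving on `x3` yields the clause `x1 or x2`; since the universal `x2` is quantified after the only remaining existential `x1`, universal reduction removes it and leaves the unit clause `x1`.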
Goal reasoning for autonomous agents using automated planning
International Mention in the doctoral degree.
Automated planning deals with the task of finding a sequence of actions, namely
a plan, which achieves a goal from a given initial state. Most planning research
assumes that goals are provided by an external user, and agents just have to find a
plan to achieve them. However, there exist many real world domains where
agents should not only reason about their actions but also about their goals,
generating new ones or changing them according to the perceived environment.
In this thesis we aim at broadening the goal reasoning capabilities of planning-based
agents, both when acting in isolation and when operating in the same
environment as other agents.
In single-agent settings, we first explore a special type of planning task
where we aim at discovering states that fulfill certain cost-based requirements
with respect to a given set of goals. By computing these states, agents are able
to solve interesting tasks such as finding escape plans that move agents into safe
places, hiding their true goal from a potential observer, or anticipating dynamically
arriving goals. We also show how learning the environment’s dynamics may help
agents to solve some of these tasks. Experimental results show that these states
can be quickly found in practice, making agents able to solve new planning
tasks and helping them in solving some existing ones.
In multi-agent settings, we study the automated generation of goals based on
other agents’ behavior. We focus on competitive scenarios, where we are interested
in computing counterplans that prevent opponents from achieving their
goals. We frame these tasks as counterplanning, providing theoretical properties
of the counterplans that solve them. We also show how agents can benefit
from computing some of the states we propose in the single-agent setting to
anticipate their opponent’s movements, thus increasing the odds of blocking
them. Experimental results show how counterplans can be found in different
environments ranging from competitive planning domains to real-time strategy
games.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid
President: Eva Onaindía de la Rivaherrera. Secretary: Ángel García Olaya. Member: Mark Robert
A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation
We introduce a new algorithm for multi-objective reinforcement learning
(MORL) with linear preferences, with the goal of enabling few-shot adaptation
to new tasks. In MORL, the aim is to learn policies over multiple competing
objectives whose relative importance (preferences) is unknown to the agent.
While this alleviates dependence on scalar reward design, the expected return
of a policy can change significantly with varying preferences, making it
challenging to learn a single model to produce optimal policies under different
preference conditions. We propose a generalized version of the Bellman equation
to learn a single parametric representation for optimal policies over the space
of all possible preferences. After an initial learning phase, our agent can
execute the optimal policy under any given preference, or automatically infer
an underlying preference with very few samples. Experiments across four
different domains demonstrate the effectiveness of our approach.
Comment: Accepted in NeurIPS 201
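As a small illustration of what "linear preferences" means in this setting, the sketch below picks a greedy action from vector-valued return estimates by weighting each objective with a preference vector. This is only the preference-conditioned action choice; it is not the generalized Bellman update or the preference inference proposed in the paper. The action names, the two objectives, and the numbers are hypothetical.

```python
def scalarize(values, preference):
    """Linear scalarization: weighted sum of per-objective values."""
    return sum(v * w for v, w in zip(values, preference))

def greedy_action(q_vectors, preference):
    """Action whose vector-valued estimate scores highest under the given preference."""
    return max(q_vectors, key=lambda a: scalarize(q_vectors[a], preference))

# Hypothetical estimates for one state: (progress, negative risk) per action.
q_vectors = {
    "fast": (10.0, -8.0),
    "safe": (4.0, -1.0),
}

print(greedy_action(q_vectors, (1.0, 0.0)))  # preference that only values progress
print(greedy_action(q_vectors, (0.2, 0.8)))  # preference that mostly values low risk
```

The same table of vector estimates yields different greedy actions under different preferences, which is why a single representation covering all preferences is attractive for adapting to a new task.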
Learning Strong Substitutes Demand via Queries
This paper addresses the computational challenges of learning strong
substitutes demand when given access to a demand (or valuation) oracle. Strong
substitutes demand generalises the well-studied gross substitutes demand to a
multi-unit setting. Recent work by Baldwin and Klemperer shows that any such
demand can be expressed in a natural way as a finite list of weighted bid
vectors. A simplified version of this bidding language has been used by the
Bank of England.
Assuming access to a demand oracle, we provide an algorithm that computes the
unique list of weighted bid vectors corresponding to a bidder's demand
preferences. In the special case where their demand can be expressed using
positive bids only, we have an efficient algorithm that learns this list in
linear time. We also show super-polynomial lower bounds on the query complexity
of computing the list of bids in the general case where bids may be positive
and negative. Our algorithms constitute the first systematic approach for
bidders to construct a bid list corresponding to non-trivial demand, allowing
them to participate in 'product-mix' auctions.
- …