Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality
Conflict-Based Search is one of the most popular methods for multi-agent path
finding. Though it is complete and optimal, it does not scale well. Recent
works have been proposed to accelerate it by introducing various heuristics.
However, whether these heuristics can apply to non-grid-based problem settings
while maintaining their effectiveness remains an open question. In this work,
we find that the answer is likely to be no. To this end, we propose a
learning-based component, i.e., the Graph Transformer, as a heuristic function
to accelerate the planning. The proposed method is provably complete and
bounded-suboptimal with any desired factor. We conduct extensive experiments on
two environments with dense graphs. Results show that the proposed Graph
Transformer can be trained in problem instances with relatively few agents and
generalizes well to a larger number of agents, while achieving better
performance than state-of-the-art methods. Comment: Accepted by ICRA 202
Human-Agent Decision-making: Combining Theory and Practice
Extensive work has been conducted both in game theory and logic to model
strategic interaction. An important question is whether we can use these
theories to design agents for interacting with people. On the one hand, they
provide a formal design specification for agent strategies. On the other hand,
people do not necessarily adhere to playing in accordance with these
strategies, and their behavior is affected by a multitude of social and
psychological factors. In this paper we will consider the question of whether
strategies implied by theories of strategic behavior can be used by automated
agents that interact proficiently with people. We will focus on automated
agents that we built that need to interact with people in two negotiation
settings: bargaining and deliberation. For bargaining we will study game-theory
based equilibrium agents and for argumentation we will discuss logic-based
argumentation theory. We will also consider security games and persuasion games
and will discuss the benefits of using equilibrium-based agents. Comment: In Proceedings TARK 2015, arXiv:1606.0729
Planning under time pressure
Heuristic search is a technique used pervasively in artificial intelligence and automated planning. Often an agent is given a task that it would like to solve as quickly as possible. It must allocate its time between planning the actions to achieve the task and actually executing them. We call this problem planning under time pressure. Most popular heuristic search algorithms are ill-suited for this setting, as they either search a lot to find short plans or search a little and find long plans. The thesis of this dissertation is: when under time pressure, an automated agent should explicitly attempt to minimize the sum of planning and execution times, not just one or just the other.
This dissertation makes four contributions. First, we present new algorithms that use modern multi-core CPUs to decrease planning time without increasing execution time. Second, we introduce a new model for predicting the performance of iterative-deepening search. The model is as accurate as previous offline techniques even when using less training data, and it can also be used online to reduce the overhead of iterative-deepening search, resulting in faster planning. Third, we show offline planning algorithms that directly attempt to minimize the sum of planning and execution times. Fourth, we consider algorithms that plan online in parallel with execution. Both offline and online algorithms account for a user-specified preference between search and execution, and can greatly outperform the standard utility-oblivious techniques. By addressing the problem of planning under time pressure, these contributions demonstrate that heuristic search is no longer restricted to optimizing solution cost, obviating the need to choose between slow search times and expensive solutions.
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Many real-world applications can be described as large-scale games of
imperfect information. To deal with these challenging domains, prior work has
focused on computing Nash equilibria in a handcrafted abstraction of the
domain. In this paper we introduce the first scalable end-to-end approach to
learning approximate Nash equilibria without prior domain knowledge. Our method
combines fictitious self-play with deep reinforcement learning. When applied to
Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium,
whereas common reinforcement learning methods diverged. In Limit Texas Holdem,
a poker game of real-world scale, NFSP learnt a strategy that approached the
performance of state-of-the-art, superhuman algorithms based on significant
domain expertise. Comment: updated version, incorporating conference feedback
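For readers unfamiliar with the fictitious-play idea that the abstract above builds on, here is a minimal sketch of classical fictitious play in a two-player zero-sum matrix game. It is not NFSP and uses no neural networks or reinforcement learning; the function name `fictitious_play`, the `rounds` parameter, and the matching-pennies payoff matrix are illustrative assumptions, not taken from the paper.

```python
def fictitious_play(payoff, rounds=10000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's gain (and the column player's loss).
    Returns the empirical action frequencies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Arbitrary initial actions (an assumption of this sketch).
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(rounds - 1):
        # Each player best-responds to the opponent's empirical action counts.
        row_values = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        best_row = max(range(n_rows), key=row_values.__getitem__)  # maximizes gain
        best_col = min(range(n_cols), key=col_values.__getitem__)  # minimizes loss
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    return ([c / rounds for c in row_counts],
            [c / rounds for c in col_counts])


# Matching pennies: the unique Nash equilibrium mixes both actions equally.
matching_pennies = [[1, -1], [-1, 1]]
row_freq, col_freq = fictitious_play(matching_pennies)
```

In this toy game the empirical frequencies drift toward the equal mix, which is the sense in which fictitious play "approaches" an equilibrium; the individual best responses themselves keep cycling.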
Heuristic search under time and cost bounds
Intelligence is difficult to formally define, but one of its hallmarks is the ability to find a solution to a novel problem. Therefore it makes good sense that heuristic search is a foundational topic in artificial intelligence. In this context search refers to the process of finding a solution to the problem by considering a large, possibly infinite, set of potential plans of action. Heuristic refers to a rule of thumb or a guiding, if not always accurate, principle. Heuristic search describes a family of techniques which consider members of the set of potential plans of action in turn, as determined by the heuristic, until a suitable solution to the problem is discovered.
This work is concerned primarily with suboptimal heuristic search algorithms. These algorithms are not inherently flawed, but they are suboptimal in the sense that the plans that they return may be more expensive than a least cost, or optimal, plan for the problem. While suboptimal heuristic search algorithms may not return least cost solutions to the problem, they are often far faster than their optimal counterparts, making them more attractive for many applications.
The thesis of this dissertation is that the performance of suboptimal search algorithms can be improved by taking advantage of information that, while widely available, has been overlooked. In particular, we will see how estimates of the length of a plan, estimates of plan cost that do not err on the side of caution, and measurements of the accuracy of our estimators can be used to improve the performance of suboptimal heuristic search algorithms.
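As a concrete illustration of the trade-off described above, the sketch below shows weighted A*, one well-known suboptimal heuristic search algorithm. It is not claimed to be one of the dissertation's algorithms. The names `weighted_astar` and `grid_neighbors`, the `weight` parameter, and the small grid instance are assumptions made for this example. With an admissible heuristic and `weight >= 1`, the returned plan costs at most `weight` times the optimal cost.

```python
import heapq


def weighted_astar(start, goal, neighbors, heuristic, weight=1.5):
    """Return (cost, path) from start to goal, or None if goal is unreachable.

    Nodes are expanded in order of f = g + weight * h.
    """
    counter = 0  # tie-breaker so the heap never compares nodes directly
    open_heap = [(weight * heuristic(start), counter, start)]
    best_g = {start: 0}
    parent = {start: None}
    while open_heap:
        _, _, node = heapq.heappop(open_heap)
        if node == goal:
            # Walk parent pointers back to the start to recover the plan.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return best_g[goal], path[::-1]
        for nxt, step_cost in neighbors(node):
            g = best_g[node] + step_cost
            if g < best_g.get(nxt, float("inf")):
                best_g[nxt] = g
                parent[nxt] = node
                counter += 1
                heapq.heappush(open_heap, (g + weight * heuristic(nxt), counter, nxt))
    return None


def grid_neighbors(walls, width, height):
    """Build a 4-connected, unit-cost successor function for a grid with walls."""
    def neighbors(cell):
        x, y = cell
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in walls:
                yield (nx, ny), 1
    return neighbors


# Hypothetical 5x5 instance: a wall at x = 2 forces a detour through y = 3.
walls = {(2, 0), (2, 1), (2, 2)}
start, goal = (0, 0), (4, 0)
succ = grid_neighbors(walls, 5, 5)
manhattan = lambda cell: abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

optimal_cost, optimal_path = weighted_astar(start, goal, succ, manhattan, weight=1.0)
greedy_cost, greedy_path = weighted_astar(start, goal, succ, manhattan, weight=2.0)
```

Setting `weight=1.0` recovers ordinary A*; larger weights typically expand fewer nodes in exchange for a looser bound on plan cost.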
- …