Trajectory Planning on Grids: Considering Speed Limit Constraints
Trajectory (path) planning is a well-known and thoroughly studied field
of automated planning. It is commonly used in computer games, robotics, and autonomous
agent simulations. Grids are often used for regular discretization of continuous
space. Many methods exist for trajectory (path) planning on grids; we
address the well-known A* algorithm and the state-of-the-art Theta* algorithm.
Theta*, as opposed to A*, provides ‘any-angle’ paths that look more realistic.
In this paper, we provide an extension of both these algorithms to enable
support for speed limit constraints. We experimentally evaluate and thoroughly discuss
how the extensions affect the planning process, showing that our approach is
reasonable and justified.
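For orientation, here is a minimal sketch of plain A* on a 4-connected grid in which each cell carries a speed limit and the cost of a move is its travel time. This is not the extension proposed in the paper, whose details the abstract does not give. The cost model (entering a cell takes `1 / speed` time units), the function name `astar_time`, and the `grid_speed` encoding are all assumptions made for illustration.

```python
import heapq

def astar_time(grid_speed, start, goal):
    """A* on a 4-connected grid, minimizing travel time.

    grid_speed[r][c] is the speed limit of a cell (0 means blocked).
    Assumption (not from the paper): entering a cell costs 1 / speed.
    Returns (time, path) or None if the goal is unreachable.
    """
    rows, cols = len(grid_speed), len(grid_speed[0])
    vmax = max(max(row) for row in grid_speed)

    def h(cell):
        # Manhattan distance at top speed never overestimates travel time.
        return (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])) / vmax

    best = {start: 0.0}      # cheapest known travel time to each cell
    parent = {start: None}   # back-pointers for path reconstruction
    frontier = [(h(start), start)]
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return best[goal], path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            speed = grid_speed[nxt[0]][nxt[1]]
            if speed <= 0:
                continue  # blocked cell
            g = best[cell] + 1.0 / speed
            if g < best.get(nxt, float("inf")):
                best[nxt] = g
                parent[nxt] = cell
                heapq.heappush(frontier, (g + h(nxt), nxt))
    return None
```

With `grid_speed = [[1, 0.25, 1], [1, 1, 1]]`, a call such as `astar_time(grid_speed, (0, 0), (0, 2))` prefers the longer detour through the second row, because crossing the slow middle cell of the first row takes more time than going around it.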
Path Planning Problems with Side Observations-When Colonels Play Hide-and-Seek
Resource allocation games such as the famous Colonel Blotto (CB) and
Hide-and-Seek (HS) games are often used to model a large variety of practical
problems, but only in their one-shot versions. Indeed, due to their extremely
large strategy space, it remains an open question how one can efficiently learn
in these games. In this work, we show that the online CB and HS games can be
cast as path planning problems with side-observations (SOPPP): at each stage, a
learner chooses a path on a directed acyclic graph and suffers the sum of
losses that are adversarially assigned to the corresponding edges; and she then
receives semi-bandit feedback with side-observations (i.e., she observes the
losses on the chosen edges plus some others). We propose a novel algorithm,
EXP3-OE, the first-of-its-kind with guaranteed efficient running time for SOPPP
without requiring any auxiliary oracle. We provide an expected-regret bound of
EXP3-OE in SOPPP matching the order of the best benchmark in the literature.
Moreover, we introduce additional assumptions on the observability model under
which we can further improve the regret bounds of EXP3-OE. We illustrate the
benefit of using EXP3-OE in SOPPP by applying it to the online CB and HS games.
Comment: Previously, this work appeared as arXiv:1911.09023 which was
mistakenly submitted as a new article (has been submitted to be withdrawn).
This is a preprint of the work published in Proceedings of the 34th AAAI
Conference on Artificial Intelligence (AAAI
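The interaction protocol described in this abstract (choose a path, suffer the sum of its edge losses, then observe the losses on the chosen edges plus some others) can be sketched as a single round. This shows only the feedback structure, not EXP3-OE itself. The function name `play_round` and the dictionary encoding of the side-observation model are hypothetical choices made for illustration.

```python
def play_round(path_edges, losses, observes):
    """Simulate one round of path selection with side-observations.

    path_edges: list of edges forming the path the learner chose
    losses:     dict mapping every edge to its loss this round
    observes:   dict mapping an edge to the set of extra edges whose
                losses are revealed when that edge is played
                (hypothetical encoding of the observability model)
    Returns (suffered_loss, feedback), where feedback maps each
    revealed edge to its loss.
    """
    # The learner suffers the sum of losses along the chosen path.
    suffered = sum(losses[e] for e in path_edges)

    # Semi-bandit feedback: every chosen edge is revealed...
    revealed = set(path_edges)
    # ...plus the side-observations those edges trigger.
    for e in path_edges:
        revealed |= observes.get(e, set())

    feedback = {e: losses[e] for e in revealed}
    return suffered, feedback
```

For example, with edges written as `(tail, head)` tuples, playing the path `[("s", "a"), ("a", "t")]` when `observes = {("s", "a"): {("s", "b")}}` reveals the loss on `("s", "b")` as well, even though that edge is not on the chosen path.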
Towards natural language understanding in text-based games
Text-based games are a very promising space for language-focused machine learning. They pose major machine learning challenges, such as long-term planning and memory, interpretation and generation of natural language, and unpredictability. One problem in natural language interpretation is how to train a machine learning model to understand a text-based game’s objective. This work treats the issue as a machine translation problem, where a detailed objective or list of instructions is given as input and the output is a predicted list of actions. This work also explores how a supervised learning system might learn long-term planning and memory through the example of an oracle that always knows the best path. In this exploration, the work shows that finding this best path is infeasible
- …