Learning in Real-Time Search: A Unifying Framework
Real-time search methods are suited for tasks in which the agent is
interacting with an initially unknown environment in real time. In such
simultaneous planning and learning problems, the agent has to select its
actions in a limited amount of time, while sensing only a local part of the
environment centered at the agent's current location. Real-time heuristic search
agents select actions using a limited lookahead search and evaluating the
frontier states with a heuristic function. Over repeated experiences, they
refine heuristic values of states to avoid infinite loops and to converge to
better solutions. The prevalence of such settings in autonomous software and
hardware agents has led to an explosion of real-time search algorithms over the
last two decades. Not only is a potential user confronted with a hodgepodge of
algorithms, but the user also faces the choice of control parameters they use. In
this paper we address both problems. The first contribution is an introduction
of a simple three-parameter framework (named LRTS) which extracts the core
ideas behind many existing algorithms. We then prove that LRTA*, epsilon-LRTA*,
SLA*, and gamma-Trap algorithms are special cases of our framework. Thus, they
are unified and extended with additional features. Second, we prove
completeness and convergence of any algorithm covered by the LRTS framework.
Third, we prove several upper bounds relating the control parameters and
solution quality. Finally, we analyze the influence of the three control
parameters empirically in the realistic scalable domains of real-time
navigation on initially unknown maps from a commercial role-playing game as
well as routing in ad hoc sensor networks.
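
The loop described in this abstract (limited lookahead, heuristic evaluation of the frontier, and refinement of heuristic values over repeated experiences) can be illustrated with a minimal sketch. It is a basic LRTA*-style agent with lookahead depth 1, not the three-parameter LRTS framework itself. The grid map, the names `GRID`, `neighbors`, and `run_trial`, the unit move costs, and the Manhattan-distance initial heuristic are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[int, int]

# Hypothetical 4x4 map: 'S' start, 'G' goal, '#' wall, '.' free cell.
GRID = ["S...",
        ".##.",
        ".#G.",
        "...."]

FREE = {(r, c) for r, row in enumerate(GRID)
        for c, ch in enumerate(row) if ch != "#"}
START = next(p for p in FREE if GRID[p[0]][p[1]] == "S")
GOAL = next(p for p in FREE if GRID[p[0]][p[1]] == "G")


def neighbors(cell: Cell) -> Iterable[Tuple[Cell, int]]:
    """Yield (successor, move cost) for the four-connected free neighbors."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (r + dr, c + dc)
        if nxt in FREE:
            yield nxt, 1  # assumed unit cost per move


def run_trial(h: Dict[Cell, int], max_steps: int = 10_000) -> Tuple[List[Cell], bool]:
    """Walk from START to GOAL once; return the path and whether h changed."""
    path, current, changed = [START], START, False
    for _ in range(max_steps):
        if current == GOAL:
            return path, changed
        # Lookahead of depth 1: score each successor by move cost + heuristic.
        best_f, best_next = min((cost + h[nxt], nxt)
                                for nxt, cost in neighbors(current))
        # Learning step: raise h(current) if the lookahead shows it was too low.
        if best_f > h[current]:
            h[current] = best_f
            changed = True
        current = best_next
        path.append(current)
    raise RuntimeError("step limit reached before the goal")


# Initial heuristic: Manhattan distance to the goal, ignoring walls.
h = {p: abs(p[0] - GOAL[0]) + abs(p[1] - GOAL[1]) for p in FREE}

# Repeat trials until one finishes without changing any heuristic value.
for trial in range(1_000):
    final_path, changed = run_trial(h)
    if not changed:
        break

print(len(final_path) - 1)  # 6 moves, the shortest route around the walls
```

On this map the Manhattan heuristic underestimates the detour around the walls, so early trials update heuristic values; once a trial completes with no update, the path it followed is a shortest one.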
Lookahead Pathologies for Single Agent Search
Admissible and consistent heuristic functions are usually preferred in single-agent heuristic search as they guarantee optimal solutions with complete search methods such as A* and IDA*. Larger problems, however, frequently make a complete search intractable due to space and/or time limitations. In particular, a path-planning agent in a real-time strategy game may need to take an action before its complete search has the time to finish. In such cases, incomplete search techniques (such as RTA*, SRTA*, RTDP, DTA*) can be used. Such algorithms conduct a limited-ply lookahead and then evaluate the states envisioned using a heuristic function. The action selected on the basis of such evaluations can be suboptimal due to the incompleteness of search and inaccuracies in the heuristic. It is usually believed that deeper lookahead increases the chances of taking the optimal action. In this paper, we demonstrate that this is not necessarily the case, even when admissible and consistent heuristic functions are used.
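
As a rough illustration of limited-ply lookahead, the sketch below selects an action by searching a fixed number of plies and evaluating the frontier states with a heuristic. The graph, the heuristic table, and the names `GRAPH`, `H`, `lookahead_value`, and `choose_action` are hypothetical. The example shows only the commonly assumed case in which a deeper lookahead corrects a shallow mistake; it does not reproduce the pathological cases the paper reports.

```python
from typing import Dict, List, Tuple

# Hypothetical graph: state -> list of (successor, edge cost).
GRAPH: Dict[str, List[Tuple[str, int]]] = {
    "S": [("A", 1), ("B", 1)],
    "A": [("G", 5)],
    "B": [("G", 1)],
    "G": [],
}
GOAL = "G"

# Hypothetical heuristic, admissible and consistent on this graph.
H: Dict[str, int] = {"S": 0, "A": 0, "B": 1, "G": 0}


def lookahead_value(state: str, plies: int) -> float:
    """Cheapest cost-so-far plus heuristic over frontier states `plies` deep."""
    if state == GOAL:
        return 0
    if plies == 0:
        return H[state]  # frontier state: fall back on the heuristic
    return min((cost + lookahead_value(nxt, plies - 1)
                for nxt, cost in GRAPH[state]),
               default=float("inf"))  # dead end: no successors


def choose_action(state: str, plies: int) -> str:
    """Return the successor that looks best under a lookahead of `plies` >= 1."""
    scored = [(cost + lookahead_value(nxt, plies - 1), nxt)
              for nxt, cost in GRAPH[state]]
    return min(scored)[1]


print(choose_action("S", 1))  # A: looks cheaper at one ply, true cost is 6
print(choose_action("S", 2))  # B: two plies reveal the cheaper route, cost 2
```

With one ply the agent trusts the low heuristic value of `A` and moves toward the expensive route; with two plies it sees the actual edge costs and picks `B`.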