
Performance Guarantees for Homomorphisms Beyond Markov Decision Processes

Author: Hutter Marcus
Majeed Sultan Javed
Publication date: 09/11/2018

Most real-world problems have huge state and/or action spaces. Therefore, a naive application of existing tabular solution methods is not tractable on such problems. Nonetheless, these solution methods are quite useful if an agent has access to a relatively small state-action space homomorphism of the true environment and near-optimal performance is guaranteed by the map. A plethora of research is focused on the case when the homomorphism is a Markovian representation of the underlying process. However, we show that near-optimal performance is sometimes guaranteed even if the homomorphism is non-Markovian. Moreover, we can aggregate significantly more states by lifting the Markovian requirement without compromising on performance. In this work, we extend the Extreme State Aggregation (ESA) framework to joint state-action aggregations. We also lift the policy uniformity condition for aggregation in ESA, which allows even coarser modeling of the true environment.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Solving large stochastic planning problems using multiple dynamic abstractions

Author: Steinkraus Kurt Alan, 1978-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 165-172).

One of the goals of AI is to produce a computer system that can plan and act intelligently in the real world. It is difficult to do so, in part because real-world domains are very large. Existing research generally deals with the large domain size using a static representation and exploiting a single type of domain structure. This leads either to an inability to complete planning on larger domains or to poor solution quality because pertinent information is discarded. This thesis creates a framework that encapsulates existing and new abstraction and approximation methods into modules and combines arbitrary modules into a hierarchy that allows for dynamic representation changes. The combination of different abstraction methods allows many qualitatively different types of structure in the domain to be exploited simultaneously. The ability to change the representation dynamically allows the framework to take advantage of how different domain subparts are relevant in different ways at different times. Since the current plan tracks the current representation, choosing to simplify (or omit) distant or improbable areas of the domain sacrifices little in the way of solution quality while making the planning problem considerably easier. The module hierarchy approach leads to greater abstraction that is tailored to the domain and therefore need not give up hope of creating reasonable solutions. While there are no optimality guarantees, experimental results show that suitable module choices gain computational tractability at little cost to behavioral optimality and allow the module hierarchy to solve larger and more interesting domains than previously possible.

DSpace@MIT

Workshop on Rich Representations for Reinforcement Learning: Held in conjunction with the 22nd International Conference on Machine Learning, August 7, 2005, Bonn, Germany

Author: Driessens Kurt
Fern Alan
van Otterlo Martijn
Publication venue: University of Bonn
Publication date: 01/01/2005

University of Twente Research Information

Adaptive Envelope MDPs for Relational Equivalence-based Planning

Author: Gardiol Natalia H.
Kaelbling Leslie Pack
Publication date: 29/07/2008

We describe a method to use structured representations of the environment's dynamics to constrain and speed up the planning process. Given a problem domain described in a probabilistic logical description language, we develop an anytime technique that incrementally improves on an initial, partial policy. This partial solution is found by first reducing the number of predicates needed to represent a relaxed version of the problem to a minimum, and then dynamically partitioning the action space into a set of equivalence classes with respect to this minimal representation. Our approach uses the envelope MDP framework, which creates a Markov decision process out of a subset of the full state space as determined by the initial partial solution. This strategy permits an agent to begin acting within a restricted part of the full state space and to expand its envelope judiciously as resources permit.

DSpace@MIT