
    Speeding up planning in Markov decision processes via automatically constructed abstractions

    In this paper, we consider planning in stochastic shortest path problems, a subclass of Markov decision processes (MDPs). We focus on medium-sized problems whose state space can be fully enumerated. This setting has numerous important applications, such as navigation and planning under uncertainty. We propose a new approach for constructing a multi-level hierarchy of progressively simpler abstractions of the original problem. Once computed, the hierarchy can be used to speed up planning by first finding a policy for the most abstract level and then recursively refining it into a solution to the original problem. The approach is fully automated and delivers a speed-up of two orders of magnitude over a state-of-the-art MDP solver on sample problems while returning near-optimal solutions.
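    The abstract does not spell out how the hierarchy is built or how policies are refined, so the sketch below only illustrates the coarse-to-fine idea in a hedged form: solve a small abstract MDP, then lift its value function through a state-aggregation map `phi` to warm-start planning on the original problem. The two-level structure, the warm-starting mechanism, and all names are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, V0=None, tol=1e-6):
    """Standard value iteration. P: (A, S, S) transition tensor, R: (A, S) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S) if V0 is None else V0.copy()
    while True:
        Q = R + gamma * (P @ V)            # (A, S) action values
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)  # values and greedy policy
        V = V_new

def solve_with_abstraction(P, R, phi, P_abs, R_abs, gamma=0.95):
    """phi[s] maps each ground state to its abstract state (an integer array).
    Solve the much smaller abstract MDP first, then lift its values to the
    ground states to warm-start value iteration on the original problem.
    (The paper refines policies recursively; warm-starting values is a
    simplified stand-in for that step.)"""
    V_abs, _ = value_iteration(P_abs, R_abs, gamma)
    V_init = V_abs[phi]                     # lift abstract values to ground states
    return value_iteration(P, R, gamma, V0=V_init)
```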

    Flow for Meta Control

    The psychological state of flow has been linked to optimal human performance. A key condition for flow to emerge is a match between a person's abilities and the complexity of the task. We propose a simple computational model of flow for Artificial Intelligence (AI) agents. The model factors the standard agent-environment state into a self-reflective set of the agent's abilities and a socially learned set describing the environment's complexity. Maximizing flow then serves as a meta-control signal for the agent. We show how to apply this meta-control policy to a broad class of AI control policies and illustrate our approach with a specific implementation. Results in a synthetic testbed are promising and open interesting directions for future work.
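    The model's actual equations are not given in the abstract; the sketch below is one plausible reading in which flow is scored by how closely the agent's estimated ability matches a task's estimated complexity, and the meta-controller picks the option with the highest score. The Gaussian form of `flow_score` and the function names are illustrative assumptions, not the paper's definitions.

```python
import math

def flow_score(ability: float, complexity: float, sigma: float = 1.0) -> float:
    """Flow peaks when ability matches task complexity (Gaussian kernel on the
    gap). This particular functional form is an assumption for illustration."""
    return math.exp(-((ability - complexity) ** 2) / (2 * sigma ** 2))

def choose_by_flow(ability: float, options: dict) -> str:
    """Meta-control: among candidate tasks/policies with estimated complexities,
    pick the one whose complexity best matches the agent's current ability."""
    return max(options, key=lambda name: flow_score(ability, options[name]))

# Example: an agent of ability 0.6 prefers the medium task over trivial or hard ones.
print(choose_by_flow(0.6, {"trivial": 0.1, "medium": 0.55, "hard": 0.95}))
```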

    Adaptive Envelope MDPs for Relational Equivalence-based Planning

    We describe a method to use structured representations of the environment's dynamics to constrain and speed up the planning process. Given a problem domain described in a probabilistic logical description language, we develop an anytime technique that incrementally improves on an initial, partial policy. This partial solution is found by first reducing the number of predicates needed to represent a relaxed version of the problem to a minimum, and then dynamically partitioning the action space into a set of equivalence classes with respect to this minimal representation. Our approach uses the envelope MDP framework, which creates a Markov decision process out of a subset of the full state space as determined by the initial partial solution. This strategy permits an agent to begin acting within a restricted part of the full state space and to expand its envelope judiciously as resources permit.
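    The envelope MDP framework named above plans over a subset of the state space, with everything outside the envelope collapsed into an absorbing state; the predicate-minimisation and action-equivalence steps are not shown here. A rough sketch under assumed data structures (transitions as a per-state dict of action to (successor, probability) lists); all names are placeholders:

```python
import numpy as np

OUT = -1  # index of the absorbing "outside the envelope" state (value stays 0)

def solve_envelope(states, transitions, reward, gamma=0.95, n_iter=500):
    """Value-iterate over only the envelope states; any transition leaving the
    envelope is redirected to the absorbing OUT state."""
    index = {s: i for i, s in enumerate(states)}
    V = np.zeros(len(states) + 1)           # last slot is OUT
    for _ in range(n_iter):
        for s in states:
            best = -np.inf
            for a, succs in transitions(s).items():   # {action: [(s', p), ...]}
                q = reward(s, a) + gamma * sum(
                    p * V[index.get(s2, OUT)] for s2, p in succs)
                best = max(best, q)
            V[index[s]] = best
    return V

def expand_envelope(states, transitions):
    """Anytime expansion: return the out-of-envelope successor reached with the
    highest probability mass; the caller adds it to the envelope and re-solves."""
    in_env = set(states)
    fringe = {}
    for s in states:
        for a, succs in transitions(s).items():
            for s2, p in succs:
                if s2 not in in_env:
                    fringe[s2] = fringe.get(s2, 0.0) + p
    return max(fringe, key=fringe.get) if fringe else None
```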

    Game State and Action Abstracting Monte Carlo Tree Search for General Strategy Game-Playing

    When implementing intelligent agents for strategy games, we observe that search-based methods struggle with the complexity of such games. To tackle this problem, we propose a new variant of Monte Carlo Tree Search which can incorporate action and game state abstractions. Focusing on the latter, we developed a game state encoding for turn-based strategy games that allows for flexible abstraction. Using an optimization procedure, we optimize the agent's action and game state abstractions to maximize its performance against a rule-based agent. Furthermore, we compare different combinations of abstractions and their impact on the agent's performance in the Kill the King game of the Stratega framework. Our results show that action abstractions improved the performance of our agent considerably. In contrast, game state abstractions have not shown much impact. While these results may be limited to the tested game, they are in line with previous research on abstractions of simple Markov Decision Processes. The higher complexity of strategy games may require more intricate methods, such as hierarchical or time-based abstractions, to further improve the agent's performance.
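    The paper's game state encoding is not described in the abstract, so the sketch below only shows the core mechanic assumed here: a UCT-style search whose tree nodes are keyed by an abstraction function `phi(state)`, so that concrete states mapping to the same abstract state share statistics. The `Game` interface, the deterministic single-player `step`, and all names are placeholders.

```python
import math
import random

class AbstractedMCTS:
    """UCT where tree nodes are keyed by phi(state): concrete states that map
    to the same abstract state share visit counts and value estimates.
    Single-player, deterministic step, for brevity."""

    def __init__(self, game, phi, c=1.4):
        self.game, self.phi, self.c = game, phi, c
        self.N, self.Q = {}, {}              # abstract node -> visits / total return

    def search(self, state, n_simulations=1000):
        for _ in range(n_simulations):
            self._simulate(state)
        # pick the action whose abstract successor has the best average return
        return max(self.game.actions(state),
                   key=lambda a: self._value(self.game.step(state, a)))

    def _value(self, state):
        k = self.phi(state)
        return self.Q.get(k, 0.0) / max(self.N.get(k, 1), 1)

    def _simulate(self, state):
        path = []
        while not self.game.is_terminal(state):
            k = self.phi(state)
            path.append(k)
            if k not in self.N:              # expand new abstract node, then roll out
                self.N[k], self.Q[k] = 0, 0.0
                state = self._rollout(state)
                break
            state = self.game.step(state, self._uct_action(state))
        ret = self.game.result(state)
        for k in path:                       # backpropagate through abstract nodes
            self.N[k] += 1
            self.Q[k] += ret

    def _uct_action(self, state):
        k = self.phi(state)
        def uct(a):
            k2 = self.phi(self.game.step(state, a))
            n = self.N.get(k2, 0)
            if n == 0:
                return float("inf")
            return self.Q[k2] / n + self.c * math.sqrt(math.log(self.N[k] + 1) / n)
        return max(self.game.actions(state), key=uct)

    def _rollout(self, state):
        while not self.game.is_terminal(state):
            state = self.game.step(state, random.choice(self.game.actions(state)))
        return state
```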

    Robot introspection through learned hidden Markov models

    In this paper we describe a machine learning approach for acquiring a model of a robot's behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high-quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain the learning process, the experimental setup and the structure of the resulting learned behavioural models. We then evaluate the extent to which explanations proposed by the learned models accord with a human observer's interpretation of the behaviour exhibited by the robot in its execution of the task.
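    A minimal sketch of the unsupervised HMM pipeline the abstract describes, using the hmmlearn library: fit a Gaussian HMM to logged sensor data, then take the Viterbi-decoded most probable state trajectory of a later run as the "explanation" of that execution. The feature dimensionality, number of hidden states, and the synthetic data below are placeholders, not the paper's setup.

```python
import numpy as np
from hmmlearn import hmm

# Placeholder: one training run of recorded sensor readings, shape (T, n_features).
# In the paper's setting these would be the robot's logged observations for a task.
observations = np.random.default_rng(0).normal(size=(500, 4))

# Fit a Gaussian HMM by unsupervised (Baum-Welch) training; the number of hidden
# states (here 6) would in practice be chosen by model selection.
model = hmm.GaussianHMM(n_components=6, covariance_type="diag", n_iter=50)
model.fit(observations)

# "Explanation" of a new execution: the most probable hidden state trajectory
# (Viterbi decoding) given the observations made during that run.
new_run = np.random.default_rng(1).normal(size=(120, 4))
state_trajectory = model.predict(new_run)
print(state_trajectory[:20])
```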