13 research outputs found

Bounded Optimal Exploration in MDP

Author: Kawaguchi Kenji
Publication date: 21/02/2016

Within the framework of probably approximately correct Markov decision processes (PAC-MDP), much theoretical work has focused on methods to attain near optimality after a relatively long period of learning and exploration. However, practical concerns require the attainment of satisfactory behavior within a short period of time. In this paper, we relax the PAC-MDP conditions to reconcile theoretically driven exploration methods and practical needs. We propose simple algorithms for discrete and continuous state spaces, and illustrate the benefits of our proposed relaxation via theoretical analyses and numerical examples. Our algorithms also maintain anytime error bounds and average loss bounds. Our approach accommodates both Bayesian and non-Bayesian methods.

Comment: In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI), 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

An efficient approach to model-based hierarchical reinforcement learning

Authors: LEONG Tze-Yun, LI Zhuoru, NARAYAN Akshay
Publication venue: AAAI Press
Publication date: 01/02/2017

National Research Foundation (NRF) Singapore under SMART and Future Mobility; Ministry of Education, Singapore under its Academic Research Funding Tier

Institutional Knowledge at Singapore Management University

Association for the Advancement of Artificial Intelligence: AAAI Publications

Research on Learning That Accounts for Uncertainty and Delay in Real Environments

Author: 斎藤淳哉
Publication date: 28/05/2012

Tohoku University 篠原

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Institutional Repositories DataBase (IRDB)