    Efficient concept formation in large state spaces

    General autonomous agents must be able to operate in previously unseen worlds with large state spaces. To operate successfully in such worlds, the agents must maintain their own models of the environment, based on concept sets several orders of magnitude smaller than the state space itself. For adaptive agents, those concept sets cannot be fixed, but must adapt continuously to new situations. This, in turn, requires mechanisms for forming and preserving the concepts that are critical to successful decision-making, while removing others. In this paper we compare four general algorithms for learning and decision-making: (i) standard Q-learning, (ii) deep Q-learning, (iii) single-agent local Q-learning, and (iv) single-agent local Q-learning with improved concept formation rules. In an experiment with a state space larger than 2^32, a single-agent local Q-learning agent with improved concept formation rules performed substantially better than a similar agent with less sophisticated concept formation rules, and slightly better than a deep Q-learning agent.
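    For orientation, below is a minimal sketch of standard tabular Q-learning, variant (i) above. The environment interface and hyperparameters are illustrative assumptions, not taken from the paper; the sketch also makes concrete why concept formation matters, since the table grows with every distinct state visited.

```python
# Minimal sketch of standard tabular Q-learning (variant (i) above).
# The env interface and hyperparameters are illustrative assumptions.
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Assumes a hypothetical env with reset() -> state,
    step(action) -> (next_state, reward, done), and an `actions` list."""
    q = defaultdict(float)  # (state, action) -> estimated value
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda a_: q[(s, a_)])
            s2, r, done = env.step(a)
            # one-step temporal-difference update toward r + gamma * max_a' Q(s', a')
            best_next = max(q[(s2, a_)] for a_ in env.actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    # The table acquires one entry per visited (state, action) pair, which
    # is exactly what becomes intractable in a state space larger than 2^32.
    return q
```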

    Automated state abstraction for options using the U-Tree algorithm

    Learning a complex task can be significantly facilitated by defining a hierarchy of subtasks. An agent can learn to choose between various temporally abstract actions, each solving an assigned subtask, to accomplish the overall task. In this paper, we study hierarchical learning using the framework of options. We argue that to take full advantage of hierarchical structure, one should perform option-specific state abstraction, and that if this is to scale to larger tasks, state abstraction should be automated. We adapt McCallum’s U-Tree algorithm to automatically build option-specific representations of the state feature space, and we illustrate the resulting algorithm using a simple hierarchical task. Results suggest that automated option-specific state abstraction is an attractive approach to making hierarchical learning systems more effective.
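    As a rough illustration of option-specific state abstraction, the sketch below models an option as an initiation condition, intra-option policy, and termination condition, plus a feature mask that projects the full state onto the features relevant to the option's subtask. The fixed mask is a hand-written simplification of the projection that the U-Tree adaptation would grow automatically; all names and the grid-world example are hypothetical.

```python
# Illustrative sketch of an option with option-specific state abstraction:
# the option sees only the state features relevant to its subtask.
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass
class Option:
    initiation: Callable[[tuple], bool]    # I: states where the option may start
    policy: Callable[[tuple], int]         # pi: intra-option policy
    termination: Callable[[tuple], float]  # beta: termination probability
    relevant_features: FrozenSet[int]      # feature indices this option depends on

    def abstract(self, state: tuple) -> tuple:
        # Project the full state onto the option's abstract state space;
        # U-Tree would instead learn this projection by splitting on
        # features whose values make a difference to predicted return.
        return tuple(state[i] for i in sorted(self.relevant_features))

# Toy example: a "go to door" option in a grid world whose value depends
# only on position (features 0 and 1), not on any object-related features.
go_to_door = Option(
    initiation=lambda s: True,
    policy=lambda s: 0 if s[0] > 3 else 1,
    termination=lambda s: 1.0 if (s[0], s[1]) == (3, 0) else 0.0,
    relevant_features=frozenset({0, 1}),
)
```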

    Behaviour design in microrobots: hierarchical reinforcement learning under resource constraints

    To verify models of collective animal behavior, robots can be made to implement the model and interact with real animals in a mixed society. This thesis describes the design of the behavioral hierarchy of a miniature robot that is able to interact with cockroaches and participate in their collective decision making. The robots are controlled by a hierarchical behavior-based controller in which more complex behaviors are built by combining simpler ones through fusion and arbitration mechanisms. Experiments in the mixed society confirm the similarity between its collective patterns and those of the purely animal society. Moreover, the robots are able to induce new collective patterns by modulating certain behavioral parameters.

    The difficulty of extracting the behavioral hierarchy manually, and the inability to revise it afterwards, leads us to use machine learning techniques to devise the composition hierarchy and its coordination automatically. We derive a Compact Q-Learning method for micro-robots with processing and memory constraints and use it to learn behavior coordination; behavior composition is still done manually. However, the curse of dimensionality makes such flat-learning techniques unsuitable: even though optimizing them can temporarily speed up learning and widen their range of applications, their scalability to real-world applications remains in question. We therefore apply hierarchical learning techniques to automate both the coordination and the composition of behaviors. In many situations, numerous features of the state space are irrelevant to what the robot is currently learning; abstracting these features and discovering the hierarchy among them helps the robot learn the behavioral hierarchy faster. We formalize the automatic state abstraction problem under different heuristics and derive three new splitting criteria that adapt decision-tree learning techniques to state abstraction. Simulation results in deterministic and non-deterministic environments show encouraging improvements in the required number of learning trials, the robot's performance, the size of the learned abstraction trees, and the computation time of the algorithms.

    Finally, learning in a group provides free sources of knowledge that, if communicated, can broaden the scales of learning both temporally and spatially. We present two approaches to combining the output or the structure of abstraction trees, whether the trees are stored in different RL robots in a multi-robot system or learned by the same robot using different methods. Simulation results in a non-deterministic football learning task provide strong evidence of improvement in convergence rate and policy performance, especially in heterogeneous cooperation.
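    As a sketch of how decision-tree learning can drive state abstraction, the snippet below splits a leaf on the feature whose values best separate the returns observed there, so that states needing different value estimates become distinguishable. This variance-reduction criterion is a generic stand-in, not one of the thesis's three splitting criteria, and all names are hypothetical.

```python
# Generic sketch of decision-tree state abstraction via variance reduction.
from statistics import pvariance

def split_gain(samples, feature):
    """samples: list of (state_tuple, observed_return) pairs collected at
    a leaf. Returns the variance reduction from splitting on `feature`."""
    returns = [g for _, g in samples]
    total = pvariance(returns) if len(returns) > 1 else 0.0
    groups = {}
    for s, g in samples:
        groups.setdefault(s[feature], []).append(g)
    # Weighted within-group variance that would remain after the split
    within = sum(
        len(g) / len(samples) * (pvariance(g) if len(g) > 1 else 0.0)
        for g in groups.values()
    )
    return total - within

def best_split(samples, n_features):
    # Pick the most informative feature; refuse to split when nothing helps.
    gain, feature = max((split_gain(samples, f), f) for f in range(n_features))
    return feature if gain > 0 else None
```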