Search CORE

6 research outputs found

Multi-task reinforcement learning: shaping and feature selection

Author: Snel M.
Whiteson S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

International Migration, Integration and Social Cohesion online publications

Finding minimal action sequences with a simple evaluation of actions

Author: Ashvin Shah
Kevin N. Gurney
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2014

Animals are able to discover the minimal number of actions that achieves an outcome (the minimal action sequence). In most accounts of this, actions are associated with a measure of behavior that is higher for actions that lead to the outcome with a shorter action sequence, and learning mechanisms find the actions associated with the highest measure. In this sense, previous accounts focus on more than the simple binary signal of “was the outcome achieved?”; they focus on “how well was the outcome achieved?” However, such mechanisms may not govern all types of behavioral development. In particular, in the process of action discovery (Redgrave and Gurney, 2006), actions are reinforced if they simply lead to a salient outcome because biological reinforcement signals occur too quickly to evaluate the consequences of an action beyond an indication of the outcome’s occurrence. Thus, action discovery mechanisms focus on the simple evaluation of “was the outcome achieved?” and not “how well was the outcome achieved?” Notwithstanding this impoverishment of information, can the process of action discovery find the minimal action sequence? We address this question by implementing computational mechanisms, referred to in this paper as no-cost learning rules, in which each action that leads to the outcome is associated with the same measure of behavior. No-cost rules focus on “was the outcome achieved?” and are consistent with action discovery. No-cost rules discover the minimal action sequence in simulated tasks and execute it for a substantial amount of time. Extensive training, however, results in extraneous actions, suggesting that a separate process (which has been proposed in action discovery) must attenuate learning if no-cost rules participate in behavioral development. We describe how no-cost rules develop behavior, what happens when attenuation is disrupted, and relate the new mechanisms to wider computational and biological context.

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

White Rose Research Online

An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

Author: Congeduti Elena
Loog Marco
Oliehoek Frans A.
Starre Rolf A. N.
Publication date: 15/11/2023

Many methods for model-based reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of an MDP while maintaining a bounded loss with respect to the original problem. Therefore, it may come as a surprise that no such guarantees are available when combining both techniques, i.e., where MBRL merely observes abstract states. Our theoretical analysis shows that abstraction can introduce a dependence between samples collected online (e.g., in the real world). That means that, without taking this dependence into account, results for MBRL do not directly extend to this setting. Our result shows that we can use concentration inequalities for martingales to overcome this problem. This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction. We illustrate this by combining R-MAX, a prototypical MBRL algorithm, with abstraction, thus producing the first performance guarantees for model-based 'RL from Abstracted Observations': model-based reinforcement learning with an abstract model.
Comment: 36 pages, 2 figures, published in Transactions on Machine Learning Research (TMLR) 202

arXiv.org e-Print Archive

Generalization strategies in reinforcement learning

Author: Snel M.
Publication date: 01/01/2018

International Migration, Integration and Social Cohesion online publications

A Sufficient Statistic for Influence in Structured Multiagent Environments

Author: Kaelbling Leslie P
Oliehoek Frans A
Witwicki Stefan
Publication date: 01/01/2021

© 2021 AI Access Foundation. All rights reserved. Making decisions in complex environments is a key challenge in artificial intelligence (AI). Situations involving multiple decision makers are particularly complex, leading to computational intractability of principled solution methods. A body of work in AI has tried to mitigate this problem by trying to distill interaction to its essence: how does the policy of one agent influence another agent? If we can find more compact representations of such influence, this can help us deal with the complexity, for instance by searching the space of influences rather than the space of policies. However, so far these notions of influence have been restricted in their applicability to special cases of interaction. In this paper we formalize influence-based abstraction (IBA), which facilitates the elimination of latent state factors without any loss in value, for a very general class of problems described as factored partially observable stochastic games (fPOSGs). On the one hand, this generalizes existing descriptions of influence, and thus can serve as the foundation for improvements in scalability and other insights in decision making in complex multiagent settings. On the other hand, since the presence of other agents can be seen as a generalization of single-agent settings, our formulation of IBA also provides a sufficient statistic for decision making under abstraction for a single agent. We also give a detailed discussion of the relations to such previous works, identifying new insights and interpretations of these approaches. In these ways, this paper deepens our understanding of abstraction in a wide range of sequential decision-making settings, providing the basis for new approaches and algorithms for a large class of problems.

University of Liverpool Repository

DSpace@MIT

Representation Discovery in Sequential Decision Making

Author: Sridhar Mahadevan
Publication date: 05/07/2010

Automatically constructing novel representations of tasks from analysis of state spaces is a longstanding fundamental challenge in AI. I review recent progress on this problem for sequential decision-making tasks modeled as Markov decision processes. Specifically, I discuss three classes of representation discovery problems: finding functional, state, and temporal abstractions. I describe solution techniques varying along several dimensions: diagonalization or dilation methods using approximate or exact transition models; reward-specific vs. reward-invariant methods; global vs. local representation construction methods; multiscale vs. flat discovery methods; and finally, orthogonal vs. redundant representation discovery methods. I conclude by describing a number of open problems for future work.

CiteSeerX

Association for the Advancement of Artificial Intelligence: AAAI Publications