Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word "reinforcement." The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
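Two of the survey's central issues, the exploration/exploitation trade-off and learning from delayed reinforcement, can be illustrated with a minimal tabular Q-learning sketch. The chain MDP, function name, and hyperparameters below are illustrative assumptions, not taken from the survey:

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain MDP: action 1 moves right,
    action 0 moves left; only reaching the last state pays reward 1."""
    random.seed(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # Exploration/exploitation trade-off: epsilon-greedy choice.
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                best = max(Q[s])
                a = random.choice([i for i in range(n_actions) if Q[s][i] == best])
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Temporal-difference update: the delayed terminal reward is
            # propagated backwards through the bootstrapped value estimate.
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q
```

After training, the right-moving action dominates in every non-terminal state, even though only the final transition is rewarded.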
Principles of Human Learning
What are the general principles that drive human learning in different situations? I argue that much of human learning can be understood with just three principles. These are generalization, adaptation, and simplicity. To verify this conjecture, I introduce a modeling framework based on the same principles. This framework combines the idea of meta-learning -- also known as learning-to-learn -- with the minimum description length principle. The models that result from this framework capture many aspects of human learning across different domains, including decision-making, associative learning, function learning, multi-task learning, and reinforcement learning. In the context of decision-making, they explain why different heuristic decision-making strategies emerge and how appropriate strategies are selected. The same models furthermore capture order effects found in associative learning, function learning and multi-task learning. In the reinforcement learning context, they capture individual differences between human exploration strategies and explain empirical data better than any other strategy under consideration. The proposed modeling framework -- together with its accompanying empirical evidence -- may therefore be viewed as a first step towards the identification of a minimal set of principles from which all human behavior derives.
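The abstract pairs meta-learning with the minimum description length (MDL) principle but gives no formulas. A toy two-part-code sketch, in which all names and the Bernoulli example are assumptions for illustration, conveys the core MDL trade-off between model cost and data fit:

```python
import math

def code_length_fair(bits):
    # Baseline model with nothing to learn: each bit costs exactly 1 bit.
    return float(len(bits))

def code_length_learned(bits):
    # Two-part code: first transmit the learned bias p (at a standard
    # 0.5 * log2(n) precision), then the data encoded under p.
    n, k = len(bits), sum(bits)
    model_cost = 0.5 * math.log2(n)
    p = min(max(k / n, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
    data_cost = -sum(math.log2(p if b else 1 - p) for b in bits)
    return model_cost + data_cost

# A strongly biased sequence is worth the cost of learning p,
# while for an unbiased one the simpler fixed model wins.
biased = [1] * 90 + [0] * 10
balanced = [1, 0] * 50
```

Under this criterion the simplicity principle appears directly: a more complex model is adopted only when it compresses the data by more than its own description cost.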
Reinforcement learning approaches to the analysis of the emergence of goal-directed behaviour
Over recent decades, theoretical neuroscience, helped by computational methods
such as Reinforcement Learning (RL), has provided detailed descriptions of the
psychology and neurobiology of decision-making. RL has provided many insights
into the mechanisms underlying decision-making processes from neuronal to behavioral
levels. In this work, we attempt to demonstrate the effectiveness of RL
methods in explaining behavior in a normative setting through three main case
studies.
Evidence from the literature shows that, apart from the commonly discussed cognitive
search process, which governs the solution procedure of a planning task, there
is an online perceptual process that directs action selection towards moves that
appear more ‘natural’ at a given configuration of a task. These two processes can
be partially dissociated through developmental studies, with perceptual processes
apparently more dominant in the planning of younger children, prior to the maturation
of executive functions required for the control of search. Therefore, we
present a formalization of planning processes to account for perceptual features of
the task, and relate it to human data.
Although young children are able to demonstrate their preferences by using
physical actions, infants are restricted because of their as-yet-undeveloped motor
skills. Eye-tracking methods have been employed to tackle this difficulty. Exploring
different model-free RL algorithms and their possible cognitive realizations in
decision making, in a second case study, we demonstrate behavioral signatures of
decision making processes in eye-movement data and provide a potential framework
for integrating eye-movement patterns with behavioral patterns.
Finally, in a third project we examine how uncertainty in choices might guide exploration
in 10-year-olds, using an abstract RL-based mathematical model. Throughout,
aspects of action selection are seen as emerging from the RL computational
framework. We thus conclude that computational descriptions of the developing
decision-making functions provide one plausible avenue by which to normatively characterize and define the functions that control action selection.
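The third project's question, how uncertainty in choices might guide exploration, is commonly formalized with optimism bonuses such as UCB1 in the bandit setting. The sketch below is a generic illustration of that idea, not the thesis's actual model; all names and parameters are assumed:

```python
import math
import random

def ucb1(means, horizon=2000, c=2.0, seed=0):
    """UCB1 on Bernoulli bandit arms: choose the arm with the highest
    mean estimate plus an uncertainty bonus that shrinks with sampling."""
    random.seed(seed)
    n = [0] * len(means)        # pull counts per arm
    value = [0.0] * len(means)  # running mean reward per arm
    for t in range(1, horizon + 1):
        if t <= len(means):
            a = t - 1           # initialization: pull every arm once
        else:
            a = max(range(len(means)),
                    key=lambda i: value[i] + math.sqrt(c * math.log(t) / n[i]))
        r = 1.0 if random.random() < means[a] else 0.0  # Bernoulli reward
        n[a] += 1
        value[a] += (r - value[a]) / n[a]  # incremental mean update
    return n
```

Rarely sampled arms keep a large bonus and are revisited, so exploration is driven by uncertainty rather than by undirected randomness; over time the pulls concentrate on the best arm.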
Planning with arithmetic and geometric attributes
Often agents have to learn to act in environments with a mathematical structure. We propose to exploit such structure by augmenting the environment with user-specified attributes equipped with the appropriate geometric and arithmetic structure, bringing substantial gains in sample complexity.
Reinforcement Learning
Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can handle technical complexities, human operators are left to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in the field.
Dark Control: The Default Mode Network as a Reinforcement Learning Agent
The default mode network (DMN) is believed to subserve the baseline mental activity in humans. Its higher energy consumption compared to other brain networks and its intimate coupling with conscious awareness both point to an unknown overarching function. Many research streams speak in favor of an evolutionarily adaptive role in envisioning experience to anticipate the future. In the present work, we propose a process model that tries to explain how the DMN may implement continuous evaluation and prediction of the environment to guide behavior. The main purpose of DMN activity, we argue, may be described by Markov Decision Processes that optimize action policies via value estimates obtained through vicarious trial and error. Our formal perspective on DMN function naturally accommodates as special cases previous interpretations based on (1) predictive coding, (2) semantic associations, and (3) a sentinel role. Moreover, this process model for the neural optimization of complex behavior in the DMN offers parsimonious explanations for recent experimental findings in animals and humans.
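The abstract casts DMN activity as a Markov decision process that optimizes action policies via value estimates; it specifies no algorithm, but a generic value-iteration sketch for a finite MDP shows the kind of computation involved. The function name and the toy two-state MDP are illustrative assumptions:

```python
def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Value iteration for a finite MDP.
    P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the
    immediate reward for taking action a in state s."""
    V = [0.0] * len(P)
    while True:
        # Bellman optimality backup for every state.
        V_new = [max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                     for a in range(len(P[s])))
                 for s in range(len(P))]
        if max(abs(v1 - v0) for v0, v1 in zip(V, V_new)) < tol:
            return V_new
        V = V_new

# Two-state example: in state 0, action 0 stays put (reward 0) and
# action 1 moves to the absorbing state 1 (reward 1).
P = [[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 1)]]]
R = [[0.0, 1.0], [0.0]]
V = value_iteration(P, R)  # V[0] converges to 1.0, V[1] to 0.0
```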