Search CORE

4,007 research outputs found

Learning Representations in Model-Free Hierarchical Reinforcement Learning

Author: Noelle David C.
Rafati Jacob
Publication venue
Publication date: 12/04/2019
Field of study

Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences (trajectories) of the agent. When combined with an intrinsic motivation learning mechanism, this method learns both subgoals and skills, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the first screen of the ATARI 2600 Montezuma's Revenge game

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Memory Structure and Cognitive Maps

Author: Aronowitz Sara
Robins Sarah K.
Stolk Arjen
Publication venue
Publication date
Field of study

A common way to understand memory structures in the cognitive sciences is as a cognitive map. Cognitive maps are representational systems organized by dimensions shared with physical space. The appeal to these maps begins literally: as an account of how spatial information is represented and used to inform spatial navigation. Invocations of cognitive maps, however, are often more ambitious; cognitive maps are meant to scale up and provide the basis for our more sophisticated memory capacities. The extension is not meant to be metaphorical, but the way in which these richer mental structures are supposed to remain map-like is rarely made explicit. Here we investigate this missing link, asking: how do cognitive maps represent non-spatial information? We begin with a survey of foundational work on spatial cognitive maps and then provide a comparative review of alternative, non-spatial representational structures. We then turn to several cutting-edge projects that are engaged in the task of scaling up cognitive maps so as to accommodate non-spatial information: first, on the spatial-isometric approach , encoding content that is non-spatial but in some sense isomorphic to spatial content; second, on the abstraction approach , encoding content that is an abstraction over first-order spatial information; and third, on the embedding approach , embedding non-spatial information within a spatial context, a prominent example being the Method-of-Loci. Putting these cases alongside one another reveals the variety of options available for building cognitive maps, and the distinctive limitations of each. We conclude by reflecting on where these results take us in terms of understanding the place of cognitive maps in memory

PhilPapers

DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

Author: Jaunet Theo
Vuillemot Romain
Wolf Christian
Publication venue
Publication date: 25/05/2020
Field of study

We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated when the agent moves in an environment and is not trivial to understand due to the number of dimensions, dependencies to past vectors, spatial/temporal correlations, and co-correlation between dimensions. It is often referred to as a black box as only inputs (images) and outputs (actions) are intelligible for humans. Using DRLViz, experts are assisted to interpret decisions using memory reduction interactions, and to investigate the role of parts of the memory when errors have been made (e.g. wrong direction). We report on DRLViz applied in the context of video games simulators (ViZDoom) for a navigation scenario with item gathering tasks. We also report on experts evaluation using DRLViz, and applicability of DRLViz to other scenarios and navigation problems beyond simulation games, as well as its contribution to black box models interpretability and explainability in the field of visual analytics

arXiv.org e-Print Archive

The spectro-contextual encoding and retrieval theory of episodic memory.

Author: Ekstrom Arne D
Watrous Andrew J
Publication venue: eScholarship, University of California
Publication date: 01/01/2014
Field of study

The spectral fingerprint hypothesis, which posits that different frequencies of oscillations underlie different cognitive operations, provides one account for how interactions between brain regions support perceptual and attentive processes (Siegel etal., 2012). Here, we explore and extend this idea to the domain of human episodic memory encoding and retrieval. Incorporating findings from the synaptic to cognitive levels of organization, we argue that spectrally precise cross-frequency coupling and phase-synchronization promote the formation of hippocampal-neocortical cell assemblies that form the basis for episodic memory. We suggest that both cell assembly firing patterns as well as the global pattern of brain oscillatory activity within hippocampal-neocortical networks represents the contents of a particular memory. Drawing upon the ideas of context reinstatement and multiple trace theory, we argue that memory retrieval is driven by internal and/or external factors which recreate these frequency-specific oscillatory patterns which occur during episodic encoding. These ideas are synthesized into a novel model of episodic memory (the spectro-contextual encoding and retrieval theory, or "SCERT") that provides several testable predictions for future research

Crossref

PubMed Central

Frontiers - Publisher Connector

eScholarship - University of California

Indexing, browsing and searching of digital video

Author: Abe
Avaro
Brown
Chang
Chang
Choi
Goodrum
Hauptmann
Hirschman
Jarina
Kavanagh
Kazman
Koegel Buford
Kravtchenko
Le Gall
Lee
Lienhart
Marchionini
Maybury
McTear
Myers
Myllymaki
Poynton
Puri
Rasmussen
Rorvig
Rowley
Smyth
Sparck Jones
Stein
Wactlar
Wallace
Witbrock
Publication venue: 'Wiley'
Publication date: 01/01/2003
Field of study

Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service