Learning Representations in Model-Free Hierarchical Reinforcement Learning
Common approaches to Reinforcement Learning (RL) are seriously challenged by
large-scale applications involving huge state spaces and sparse delayed reward
feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address
this scalability issue by learning action selection policies at multiple levels
of temporal abstraction. Abstraction can be achieved by identifying a relatively
small set of states that are likely to be useful as subgoals, in concert with
the learning of corresponding skill policies to achieve those subgoals. Many
approaches to subgoal discovery in HRL depend on the analysis of a model of the
environment, but the need to learn such a model introduces its own problems of
scale. Once subgoals are identified, skills may be learned through intrinsic
motivation, introducing an internal reward signal marking subgoal attainment.
In this paper, we present a novel model-free method for subgoal discovery using
incremental unsupervised learning over a small memory of the most recent
experiences (trajectories) of the agent. When combined with an intrinsic
motivation learning mechanism, this method learns both subgoals and skills,
based on experiences in the environment. Thus, we offer an original approach to
HRL that does not require the acquisition of a model of the environment,
suitable for large-scale applications. We demonstrate the efficiency of our
method on two RL problems with sparse delayed feedback: a variant of the rooms
environment and the first screen of the ATARI 2600 Montezuma's Revenge game.
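As a rough illustration of the two ingredients the abstract above names (incremental unsupervised learning over a small memory of recent experiences, and an intrinsic reward marking subgoal attainment), here is a minimal sketch. The abstract does not say which unsupervised algorithm is used; online k-means is assumed here purely for illustration, and the names `SubgoalDiscovery`, `intrinsic_reward`, and `radius` are hypothetical, not taken from the paper.

```python
from collections import deque
import math


class SubgoalDiscovery:
    """Candidate subgoals as centroids of recently visited states.

    Assumption: online k-means stands in for the paper's unspecified
    incremental unsupervised learner.
    """

    def __init__(self, k, memory_size):
        self.k = k
        self.memory = deque(maxlen=memory_size)  # keeps only the most recent states
        self.centroids = []  # candidate subgoals
        self.counts = []     # states assigned to each centroid so far

    def observe(self, state):
        """Store one visited state; the oldest is dropped when memory is full."""
        self.memory.append(tuple(float(x) for x in state))

    def nearest(self, state):
        """Index of the centroid closest to `state` (Euclidean distance)."""
        return min(range(len(self.centroids)),
                   key=lambda i: math.dist(self.centroids[i], state))

    def update(self):
        """One incremental k-means pass over the current memory."""
        for state in self.memory:
            if len(self.centroids) < self.k:
                # Seed centroids from the first k distinct states.
                if state not in self.centroids:
                    self.centroids.append(state)
                    self.counts.append(1)
                continue
            i = self.nearest(state)
            self.counts[i] += 1
            step = 1.0 / self.counts[i]  # running-mean step size
            self.centroids[i] = tuple(
                c + step * (s - c) for c, s in zip(self.centroids[i], state)
            )


def intrinsic_reward(next_state, subgoal, radius=0.5):
    """Internal reward: 1.0 when the agent lands within `radius` of the subgoal."""
    return 1.0 if math.dist(next_state, subgoal) <= radius else 0.0


# Usage: two groups of visited states yield two candidate subgoals.
finder = SubgoalDiscovery(k=2, memory_size=100)
for s in [(0, 0), (10, 10), (0, 1), (10, 11)]:
    finder.observe(s)
finder.update()
```

In this sketch no transition model is learned: only visited states are clustered, and a skill policy for a given centroid would be trained on `intrinsic_reward` rather than on the sparse environment reward.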
Memory Structure and Cognitive Maps
A common way to understand memory structures in the cognitive sciences is as a cognitive map.
Cognitive maps are representational systems organized by dimensions shared with physical space. The
appeal to these maps begins literally: as an account of how spatial information is represented and used
to inform spatial navigation. Invocations of cognitive maps, however, are often more ambitious;
cognitive maps are meant to scale up and provide the basis for our more sophisticated memory
capacities. The extension is not meant to be metaphorical, but the way in which these richer mental
structures are supposed to remain map-like is rarely made explicit. Here we investigate this missing
link, asking: how do cognitive maps represent non-spatial information? We begin with a survey of
foundational work on spatial cognitive maps and then provide a comparative review of alternative,
non-spatial representational structures. We then turn to several cutting-edge projects that are engaged
in the task of scaling up cognitive maps so as to accommodate non-spatial information: first, on the
spatial-isometric approach, encoding content that is non-spatial but in some sense isomorphic to
spatial content; second, on the abstraction approach, encoding content that is an abstraction over
first-order spatial information; and third, on the embedding approach, embedding non-spatial
information within a spatial context, a prominent example being the Method-of-Loci. Putting these
cases alongside one another reveals the variety of options available for building cognitive maps, and the
distinctive limitations of each. We conclude by reflecting on where these results take us in terms of
understanding the place of cognitive maps in memory.
DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning
We present DRLViz, a visual analytics interface to interpret the internal
memory of an agent (e.g. a robot) trained using deep reinforcement learning.
This memory is composed of large temporal vectors updated when the agent moves
in an environment and is not trivial to understand due to the number of
dimensions, dependencies on past vectors, spatial/temporal correlations, and
co-correlation between dimensions. It is often referred to as a black box as
only inputs (images) and outputs (actions) are intelligible to humans. DRLViz
helps experts interpret decisions using memory reduction
interactions and investigate the role of parts of the memory when errors
have been made (e.g. wrong direction). We report on DRLViz applied in the
context of video game simulators (ViZDoom) for a navigation scenario with item
gathering tasks. We also report on an expert evaluation of DRLViz, on the
applicability of DRLViz to other scenarios and navigation problems beyond
simulation games, and on its contribution to black-box model
interpretability and explainability in the field of visual analytics.
The spectro-contextual encoding and retrieval theory of episodic memory.
The spectral fingerprint hypothesis, which posits that different frequencies of oscillations underlie different cognitive operations, provides one account for how interactions between brain regions support perceptual and attentive processes (Siegel et al., 2012). Here, we explore and extend this idea to the domain of human episodic memory encoding and retrieval. Incorporating findings from the synaptic to cognitive levels of organization, we argue that spectrally precise cross-frequency coupling and phase-synchronization promote the formation of hippocampal-neocortical cell assemblies that form the basis for episodic memory. We suggest that both cell assembly firing patterns and the global pattern of brain oscillatory activity within hippocampal-neocortical networks represent the contents of a particular memory. Drawing upon the ideas of context reinstatement and multiple trace theory, we argue that memory retrieval is driven by internal and/or external factors that recreate the frequency-specific oscillatory patterns that occur during episodic encoding. These ideas are synthesized into a novel model of episodic memory (the spectro-contextual encoding and retrieval theory, or "SCERT") that provides several testable predictions for future research.
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver