4,007 research outputs found

    Learning Representations in Model-Free Hierarchical Reinforcement Learning

    Full text link
    Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences (trajectories) of the agent. When combined with an intrinsic motivation learning mechanism, this method learns both subgoals and skills, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the first screen of the ATARI 2600 Montezuma's Revenge game

    Memory Structure and Cognitive Maps

    Get PDF
    A common way to understand memory structures in the cognitive sciences is as a cognitive map​. Cognitive maps are representational systems organized by dimensions shared with physical space. The appeal to these maps begins literally: as an account of how spatial information is represented and used to inform spatial navigation. Invocations of cognitive maps, however, are often more ambitious; cognitive maps are meant to scale up and provide the basis for our more sophisticated memory capacities. The extension is not meant to be metaphorical, but the way in which these richer mental structures are supposed to remain map-like is rarely made explicit. Here we investigate this missing link, asking: how do cognitive maps represent non-spatial information?​ We begin with a survey of foundational work on spatial cognitive maps and then provide a comparative review of alternative, non-spatial representational structures. We then turn to several cutting-edge projects that are engaged in the task of scaling up cognitive maps so as to accommodate non-spatial information: first, on the spatial-isometric approach​ , encoding content that is non-spatial but in some sense isomorphic to spatial content; second, on the ​ abstraction approach​ , encoding content that is an abstraction over first-order spatial information; and third, on the ​ embedding approach​ , embedding non-spatial information within a spatial context, a prominent example being the Method-of-Loci. Putting these cases alongside one another reveals the variety of options available for building cognitive maps, and the distinctive limitations of each. We conclude by reflecting on where these results take us in terms of understanding the place of cognitive maps in memory

    DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning

    Full text link
    We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated when the agent moves in an environment and is not trivial to understand due to the number of dimensions, dependencies to past vectors, spatial/temporal correlations, and co-correlation between dimensions. It is often referred to as a black box as only inputs (images) and outputs (actions) are intelligible for humans. Using DRLViz, experts are assisted to interpret decisions using memory reduction interactions, and to investigate the role of parts of the memory when errors have been made (e.g. wrong direction). We report on DRLViz applied in the context of video games simulators (ViZDoom) for a navigation scenario with item gathering tasks. We also report on experts evaluation using DRLViz, and applicability of DRLViz to other scenarios and navigation problems beyond simulation games, as well as its contribution to black box models interpretability and explainability in the field of visual analytics

    The spectro-contextual encoding and retrieval theory of episodic memory.

    Get PDF
    The spectral fingerprint hypothesis, which posits that different frequencies of oscillations underlie different cognitive operations, provides one account for how interactions between brain regions support perceptual and attentive processes (Siegel etal., 2012). Here, we explore and extend this idea to the domain of human episodic memory encoding and retrieval. Incorporating findings from the synaptic to cognitive levels of organization, we argue that spectrally precise cross-frequency coupling and phase-synchronization promote the formation of hippocampal-neocortical cell assemblies that form the basis for episodic memory. We suggest that both cell assembly firing patterns as well as the global pattern of brain oscillatory activity within hippocampal-neocortical networks represents the contents of a particular memory. Drawing upon the ideas of context reinstatement and multiple trace theory, we argue that memory retrieval is driven by internal and/or external factors which recreate these frequency-specific oscillatory patterns which occur during episodic encoding. These ideas are synthesized into a novel model of episodic memory (the spectro-contextual encoding and retrieval theory, or "SCERT") that provides several testable predictions for future research

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver
    corecore