2 research outputs found

    Elements of Episodic Memory: Insights from Artificial Agents

    Many recent AI systems take inspiration from biological episodic memory. Here, we ask how these ‘episodic-inspired’ AI systems might inform our understanding of biological episodic memory. We discuss work showing that these systems implement some key features of episodic memory whilst differing from it in important respects, and that they appear to enjoy behavioural advantages in strategic decision-making, fast learning, navigation, exploration and acting over temporal distance. We propose that these systems could be used to evaluate competing theories of episodic memory’s operations and function. However, further work is needed to validate them as models of episodic memory and to isolate the contributions of their memory systems to their behaviour. More immediately, these systems can help direct episodic memory research by highlighting novel or neglected hypotheses as pursuit-worthy. In this vein, we argue that the evidence reviewed here highlights two pursuit-worthy hypotheses about episodic memory’s function: that it plays a role in planning that is independent of future-oriented simulation, and that it is adaptive in virtue of its contributions to fast learning in novel, sparse-reward environments.
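
    The abstract does not name a specific architecture, but one canonical ‘episodic-inspired’ system is episodic control, in which an agent stores state embeddings together with the returns achieved from them and acts by nearest-neighbour lookup over this store, which is one route to the fast learning the review mentions. The sketch below is illustrative only, written in the style of model-free episodic control; the class name `EpisodicMemory`, the k-nearest-neighbour value estimate, and all parameters are our assumptions, not the authors’ model.

```python
import numpy as np

class EpisodicMemory:
    """Per-action store mapping state embeddings to returns observed."""
    def __init__(self, n_actions, k=5):
        self.k = k  # number of neighbours used in value estimates
        self.keys = [[] for _ in range(n_actions)]    # state embeddings
        self.values = [[] for _ in range(n_actions)]  # returns achieved

    def write(self, key, action, ret):
        # Record the return achieved after taking `action` in state `key`.
        self.keys[action].append(np.asarray(key, dtype=float))
        self.values[action].append(float(ret))

    def estimate(self, key, action):
        # Value estimate: mean return over the k nearest stored states.
        if not self.keys[action]:
            return 0.0
        stored = np.stack(self.keys[action])
        dists = np.linalg.norm(stored - np.asarray(key, dtype=float), axis=1)
        nearest = np.argsort(dists)[: self.k]
        return float(np.mean(np.array(self.values[action])[nearest]))

def act(memory, key, n_actions, epsilon=0.05, rng=np.random.default_rng()):
    # Epsilon-greedy action over the episodic value estimates.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax([memory.estimate(key, a) for a in range(n_actions)]))
```

    Because values are read directly from stored episodes rather than learned by gradient descent, a single good trajectory can immediately shape behaviour, which is why such systems are often credited with fast learning in sparse-reward settings.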

    Double Prioritized State Recycled Experience Replay

    Experience replay enables online reinforcement learning agents to store and reuse previous experiences of interacting with the environment. In the original method, experiences are sampled and replayed uniformly at random. A later method, prioritized experience replay, assigns priorities to experiences so that those that appear more important are replayed more frequently. In this paper, we develop double-prioritized state-recycled (DPSR) experience replay, a method that prioritizes experiences at both the training stage and the storing stage, and that replaces experiences in memory via state recycling to make the most of experiences whose priorities are only temporarily low. We applied this method to Deep Q-Networks (DQN) and achieved state-of-the-art results, outperforming both the original method and prioritized experience replay on many Atari games.
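
    The abstract does not spell out the mechanics, so the following is a minimal sketch of what double prioritization might look like, assuming priorities come from TD errors as in standard prioritized experience replay: priorities guide both which transitions are sampled for training and which are evicted when the buffer is full (rather than FIFO replacement). The class name and the min-priority eviction rule are illustrative guesses, and the state-recycling step is omitted; see the paper for the actual mechanism.

```python
import numpy as np

class DoublePrioritizedBuffer:
    """Sketch: priorities drive both sampling (training stage) and
    eviction (storing stage), unlike FIFO replacement in standard PER."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha   # how strongly priority skews sampling
        self.data = []       # transitions: (s, a, r, s_next, done)
        self.priorities = []

    def add(self, transition, priority):
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            # Storing-stage prioritization: overwrite the lowest-priority
            # transition instead of the oldest one.
            idx = int(np.argmin(self.priorities))
            self.data[idx] = transition
            self.priorities[idx] = priority

    def sample(self, batch_size, rng=np.random.default_rng()):
        # Training-stage prioritization: sample with probability
        # proportional to priority**alpha.
        p = np.asarray(self.priorities) ** self.alpha
        p /= p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, new_priorities):
        # After a DQN update, refresh priorities from the new TD errors.
        for i, pr in zip(idx, new_priorities):
            self.priorities[i] = pr
```

    Evicting by priority rather than by age keeps transitions that still carry learning signal in memory longer; the paper’s state-recycling step then addresses the risk of prematurely discarding experiences whose priorities are only temporarily low.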