The Context Repetition Effect: Role of prediction in new memory formation.
3rd Place at Denman Undergraduate Research Forum. Many theories posit that the associative process at the core of episodic memory binds the content of an experience to the context in which we experience it. Here, context can be broadly defined as the mental representation capturing our recent experience. We recently discovered the context repetition effect (CRE), which shows that repeating a context once leads to greater memory performance for an item learned within that context even if the item does not occur again. To date, we have conducted three studies to test the CRE. Experiment 1 was a complete replication of the original experiment that first discovered the CRE, save that there were multiple repetitions of a context instead of just one. We found that the presentation of a context and item, followed by two repetitions of the context with a new item each time, resulted in a near-significant boost in subjects' memory for the original item and in their confidence in that memory. Experiment 2 replaced words with scenes and faces. Subjects associated male and female faces with indoor and outdoor scenes. Subjects showed trends towards reduced performance and no demonstration of the CRE. The performance results may lack power, possibly because faces are harder to encode than words. Experiment 3 replicated Experiment 2, save that there was an additional repetition. Results trended toward those found in Experiment 2. Academic Major: Psycholog
Predictive auxiliary objectives in deep RL mimic learning in the brain
The ability to predict upcoming events has been hypothesized to comprise a
key aspect of natural and machine cognition. This is supported by trends in
deep reinforcement learning (RL), where self-supervised auxiliary objectives
such as prediction are widely used to support representation learning and
improve task performance. Here, we study the effects predictive auxiliary
objectives have on representation learning across different modules of an RL
system and how these mimic representational changes observed in the brain. We
find that predictive objectives improve and stabilize learning particularly in
resource-limited architectures, and we identify settings where longer
predictive horizons better support representational transfer. Furthermore, we
find that representational changes in this RL system bear a striking
resemblance to changes in neural activity observed in the brain across various
experiments. Specifically, we draw a connection between the auxiliary
predictive model of the RL system and hippocampus, an area thought to learn a
predictive model to support memory-guided behavior. We also connect the encoder
network and the value learning network of the RL system to visual cortex and
striatum in the brain, respectively. This work demonstrates how representation
learning in deep RL systems can provide an interpretable framework for modeling
multi-region interactions in the brain. The deep RL perspective taken here also
suggests an additional role of the hippocampus in the brain -- that of an
auxiliary learning system that benefits representation learning in other
regions.
Successor Feature Sets: Generalizing Successor Representations Across Policies
Successor-style representations have many advantages for reinforcement
learning: for example, they can help an agent generalize from past experience
to new goals, and they have been proposed as explanations of behavioral and
neural data from human and animal learners. They also form a natural bridge
between model-based and model-free RL methods: like the former they make
predictions about future experiences, and like the latter they allow efficient
prediction of total discounted rewards. However, successor-style
representations are not optimized to generalize across policies: typically, we
maintain a limited-length list of policies, and share information among them by
representation learning or GPI. Successor-style representations also typically
make no provision for gathering information or reasoning about latent
variables. To address these limitations, we bring together ideas from
predictive state representations, belief space value iteration, successor
features, and convex analysis: we develop a new, general successor-style
representation, together with a Bellman equation that connects multiple sources
of information within this representation, including different latent states,
policies, and reward functions. The new representation is highly expressive:
for example, it lets us efficiently read off an optimal policy for a new reward
function, or a policy that imitates a new demonstration. For this paper, we
focus on exact computation of the new representation in small, known
environments, since even this restricted setting offers plenty of interesting
questions. Our implementation does not scale to large, unknown environments --
nor would we expect it to, since it generalizes POMDP value iteration, which is
difficult to scale. However, we believe that future work will allow us to
extend our ideas to approximate reasoning in large, unknown environments.
Predictive maps in rats and humans for spatial navigation
Much of our understanding of navigation comes from the study of individual species, often with specific tasks tailored to those species. Here, we provide a novel experimental and analytic framework integrating across humans, rats, and simulated reinforcement learning (RL) agents to interrogate the dynamics of behavior during spatial navigation. We developed a novel open-field navigation task ("Tartarus maze") requiring dynamic adaptation (shortcuts and detours) to frequently changing obstructions on the path to a hidden goal. Humans and rats were remarkably similar in their trajectories. Both species showed the greatest similarity to RL agents utilizing a "successor representation," which creates a predictive map. Humans also displayed trajectory features similar to model-based RL agents, which implemented an optimal tree-search planning procedure. Our results help refine models seeking to explain mammalian navigation in dynamic environments and highlight the utility of modeling the behavior of different species to uncover the shared mechanisms that support behavior.
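As a rough illustration of what a "successor representation" agent learns, the sketch below estimates a tabular predictive map with temporal-difference updates. It is not the model used in the study: the 5-state corridor, the random-walk policy, the function names, and the values of `gamma` and `alpha` are all assumptions made for this example.

```python
import random


def corridor_step(state, rng, n_states=5):
    """Hypothetical environment: random walk on a corridor with walls at both ends."""
    move = rng.choice((-1, 1))
    return min(max(state + move, 0), n_states - 1)


def td_successor_representation(n_states, step, gamma=0.9, alpha=0.1,
                                n_steps=20000, seed=0):
    """Learn M[s][j], the expected discounted number of future visits to j from s.

    TD update after each transition s -> s_next:
        M[s] <- M[s] + alpha * (onehot(s) + gamma * M[s_next] - M[s])
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    M = [[0.0] * n_states for _ in range(n_states)]
    s = 0
    for _ in range(n_steps):
        s_next = step(s, rng)
        for j in range(n_states):
            target = (1.0 if j == s else 0.0) + gamma * M[s_next][j]
            M[s][j] += alpha * (target - M[s][j])
        s = s_next
    return M


def state_values(M, reward):
    """Values follow from the map by a dot product: V(s) = sum_j M[s][j] * reward[j]."""
    return [sum(m * r for m, r in zip(row, reward)) for row in M]


M = td_successor_representation(5, corridor_step)
# Moving the goal only changes the reward vector; the learned map M is reused.
v_goal_right = state_values(M, [0.0, 0.0, 0.0, 0.0, 1.0])
v_goal_left = state_values(M, [1.0, 0.0, 0.0, 0.0, 0.0])
```

The map separates the environment's dynamics from the reward, so relocating the goal needs no relearning of `M`. A change in the transition structure, such as a new obstruction, does require updating the map itself.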
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Marginalized importance sampling (MIS), which measures the density ratio
between the state-action occupancy of a target policy and that of a sampling
distribution, is a promising approach for off-policy evaluation. However,
current state-of-the-art MIS methods rely on complex optimization tricks and
succeed mostly on simple toy problems. We bridge the gap between MIS and deep
reinforcement learning by observing that the density ratio can be computed from
the successor representation of the target policy. The successor representation
can be trained through deep reinforcement learning methodology and decouples
the reward optimization from the dynamics of the environment, making the
resulting algorithm stable and applicable to high-dimensional domains. We
evaluate the empirical performance of our approach on a variety of challenging
Atari and MuJoCo environments. Comment: ICML 202
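The observation that the density ratio can be computed from the successor representation can be checked in a small tabular setting. The sketch below is not the paper's deep RL algorithm: it uses states rather than state-action pairs, assumes the transition matrix under the target policy is known, and the 3-state chain, `mu0`, `d_sampling`, and `reward` are hypothetical values chosen for illustration.

```python
def successor_representation(P, gamma, iters=600):
    """Iterate M <- I + gamma * P M, which converges to (I - gamma * P)^-1."""
    n = len(P)
    eye = [[float(i == j) for j in range(n)] for i in range(n)]
    M = [row[:] for row in eye]
    for _ in range(iters):
        M = [[eye[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M


def density_ratio(M, mu0, d_sampling, gamma):
    """Return w(s) = d_target(s) / d_sampling(s) and the target occupancy.

    The normalized discounted occupancy of the target policy is read off the SR:
        d_target(s) = (1 - gamma) * sum_s0 mu0(s0) * M[s0][s]
    """
    n = len(mu0)
    d_target = [(1 - gamma) * sum(mu0[s0] * M[s0][s] for s0 in range(n))
                for s in range(n)]
    w = [dt / ds for dt, ds in zip(d_target, d_sampling)]
    return w, d_target


# Hypothetical problem: state transitions under the target policy.
P_target = [[0.1, 0.9, 0.0],
            [0.0, 0.1, 0.9],
            [0.9, 0.0, 0.1]]
mu0 = [1.0, 0.0, 0.0]         # initial-state distribution (assumed)
d_sampling = [0.5, 0.3, 0.2]  # state distribution of the off-policy data (assumed)
reward = [0.0, 0.0, 1.0]
gamma = 0.9

M = successor_representation(P_target, gamma)
w, d_target = density_ratio(M, mu0, d_sampling, gamma)

# Off-policy estimate: reweight rewards seen under the sampling distribution.
mis_estimate = sum(ds * wi * r for ds, wi, r in zip(d_sampling, w, reward))
# Reference: normalized value computed directly from the SR.
direct_value = (1 - gamma) * sum(
    mu0[s] * sum(M[s][j] * reward[j] for j in range(3)) for s in range(3))
```

Here the expectation over the sampling distribution is computed exactly. With a real dataset it would be replaced by an average over sampled transitions, and the successor representation would be learned rather than computed from a known `P_target`.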
Mice identify subgoal locations through an action-driven mapping process
Mammals form mental maps of their environments by exploring their surroundings. Here, we investigate which elements of exploration are important for this process. We studied mouse escape behavior, in which mice are known to memorize subgoal locations (obstacle edges) to execute efficient escape routes to shelter. To test the role of exploratory actions, we developed closed-loop neural-stimulation protocols for interrupting various actions while mice explored. We found that blocking running movements directed at obstacle edges prevented subgoal learning; however, blocking several control movements had no effect. Reinforcement learning simulations and analysis of spatial data show that artificial agents can match these results if they have a region-level spatial representation and explore with object-directed movements. We conclude that mice employ an action-driven process for integrating subgoals into a hierarchical cognitive map. These findings broaden our understanding of the cognitive toolkit that mammals use to acquire spatial knowledge.