Entraining and copying of temporal correlations in dissociated cultured neurons
Here we used multi-electrode array technology to examine the encoding of temporal information in dissociated hippocampal networks. We demonstrate that two connected populations of neurons can be trained to encode a defined time interval, and that this memory trace persists for several hours. We also investigate whether the spontaneous firing activity of a trained network can act as a template for copying the encoded time interval to a naive network. Such findings are of general significance for understanding fundamental principles of information storage and replication.
Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning
Learning Nash equilibrium (NE) in complex zero-sum games with multi-agent
reinforcement learning (MARL) can be extremely computationally expensive.
Curriculum learning is an effective way to accelerate learning, but an
under-explored dimension for generating a curriculum is the difficulty-to-learn
of the subgames -- games induced by starting from a specific state. In this
work, we present a novel subgame curriculum learning framework for zero-sum
games. It adopts an adaptive initial state distribution by resetting agents to
some previously visited states where they can quickly learn to improve
performance. Building upon this framework, we derive a subgame selection metric
that approximates the squared distance to NE values and further adopt a
particle-based state sampler for subgame generation. Integrating these
techniques leads to our new algorithm, Subgame Automatic Curriculum Learning
(SACL), which is a realization of the subgame curriculum learning framework.
SACL can be combined with any MARL algorithm such as MAPPO. Experiments in the
particle-world environment and Google Research Football environment show SACL
produces much stronger policies than baselines. In the challenging
hide-and-seek quadrant environment, SACL produces all four emergent stages and
uses only half the samples of MAPPO with self-play. The project website is at
https://sites.google.com/view/sacl-rl
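A minimal sketch of the adaptive initial-state idea described in this abstract: keep a buffer of previously visited states, weight each by a proxy for its squared distance to the NE value, and reset agents to states sampled in proportion to that weight. The class name, the value-gap proxy, and the reservoir-style buffer are our illustrative assumptions, not the authors' implementation.

```python
import random

class SubgameSampler:
    """Particle-style buffer of visited states for curriculum resets.

    Illustrative sketch only: `value_gap` stands in for an estimate of
    how far a subgame's value is from its NE value (an assumption here);
    squaring it mirrors the squared-distance selection metric.
    """

    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.states, self.weights = [], []
        self.rng = random.Random(seed)

    def add(self, state, value_gap):
        # Larger gap -> subgame is further from equilibrium -> more
        # learning signal expected from resetting there.
        w = value_gap ** 2
        if len(self.states) < self.capacity:
            self.states.append(state)
            self.weights.append(w)
        else:
            # Overwrite a random slot so the buffer tracks recent visits.
            i = self.rng.randrange(self.capacity)
            self.states[i], self.weights[i] = state, w

    def sample_reset_state(self):
        # Reset to a previously visited state, biased toward high-gap
        # subgames rather than always starting from the initial state.
        return self.rng.choices(self.states, weights=self.weights, k=1)[0]
```

In use, the training loop would call `add` on states encountered during self-play rollouts and `sample_reset_state` when initializing the next episode; any MARL algorithm (e.g. MAPPO) can sit on top unchanged.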
Conjugate Natural Selection: Fisher-Rao Natural Gradient Descent Optimally Approximates Evolutionary Dynamics and Continuous Bayesian Inference
Rather than refining individual candidate solutions for a general non-convex
optimization problem, we consider, by analogy to evolution, minimizing the
average loss for a parametric distribution over hypotheses. In this setting, we
prove that Fisher-Rao natural gradient descent (FR-NGD) optimally approximates
the continuous-time replicator equation (an essential model of evolutionary
dynamics) by minimizing the mean-squared error for the relative fitness of
competing hypotheses. We term this finding "conjugate natural selection" and
demonstrate its utility by numerically solving an example non-convex
optimization problem over a continuous strategy space. Next, by developing
known connections between discrete-time replicator dynamics and Bayes's rule,
we show that when absolute fitness corresponds to the negative KL-divergence of
a hypothesis's predictions from actual observations, FR-NGD provides the
optimal approximation of continuous Bayesian inference. We use this result to
demonstrate a novel method for estimating the parameters of stochastic
processes.

Comment: 13 pages, 3 figures
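The connection between replicator dynamics and Bayes's rule invoked above can be checked numerically. In the sketch below (our own toy setup, not the paper's code), absolute fitness is the log-likelihood of an observation under each hypothesis; Euler-integrating the continuous-time replicator equation dp_i/dt = p_i(f_i - Σ_j p_j f_j) from the prior to t = 1 recovers the Bayesian posterior p_i(0)exp(f_i)/Z.

```python
import math

# Three Gaussian hypotheses (unit variance) and one observation --
# purely illustrative values.
means = [-1.0, 0.0, 2.0]
prior = [0.5, 0.3, 0.2]
obs = 0.4

# Absolute fitness f_i = log-likelihood of the observation under hypothesis i.
f = [-0.5 * (obs - m) ** 2 - 0.5 * math.log(2 * math.pi) for m in means]

# Exact Bayesian posterior: prior reweighted by the likelihood.
lik = [p0 * math.exp(fi) for p0, fi in zip(prior, f)]
z = sum(lik)
posterior = [x / z for x in lik]

# Euler-integrate the replicator ODE from the prior to t = 1.
p = list(prior)
dt = 1e-5
for _ in range(int(1 / dt)):
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    p = [pi + dt * pi * (fi - mean_f) for pi, fi in zip(p, f)]

# p now agrees with the Bayesian posterior up to Euler discretization error.
print(max(abs(pi - qi) for pi, qi in zip(p, posterior)))
```

This is the discrete-to-continuous bridge the abstract relies on; FR-NGD's claim is that it is the optimal parametric approximation of this flow when the hypotheses are restricted to a parametric family.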