Closing the loop between neural network simulators and the OpenAI Gym
Since the enormous breakthroughs in machine learning over the last decade,
functional neural network models are of growing interest for many researchers
in the field of computational neuroscience. One major branch of research is
concerned with biologically plausible implementations of reinforcement
learning, with a variety of different models developed in recent years.
However, most studies in this area are conducted with custom simulation scripts
and manually implemented tasks. This makes it hard for other researchers to
reproduce and build upon previous work and nearly impossible to compare the
performance of different learning architectures. In this work, we present a
novel approach to solve this problem, connecting benchmark tools from the field
of machine learning and state-of-the-art neural network simulators from
computational neuroscience. This toolchain enables researchers in both fields
to make use of well-tested high-performance simulation software supporting
biologically plausible neuron, synapse and network models and allows them to
evaluate and compare their approach on the basis of standardized environments
of varying complexity. We demonstrate the functionality of the toolchain by
implementing a neuronal actor-critic architecture for reinforcement learning in
the NEST simulator and successfully training it on two different environments
from the OpenAI Gym.
Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network
Learning from interaction is the primary way biological agents come to know
the environment and themselves. Modern deep reinforcement learning (DRL)
explores a computational approach to learning from interaction and has
made significant progress in solving various tasks. However, DRL, powerful as
it is, still falls far short of biological agents in energy efficiency. Although the underlying
mechanisms are not fully understood, we believe that the integration of spiking
communication between neurons and biologically-plausible synaptic plasticity
plays a prominent role. Following this biological intuition, we optimize a
spiking policy network (SPN) by a genetic algorithm as an energy-efficient
alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects
and communicates through event-based spikes. Inspired by biological findings
that the brain forms memories by creating new synaptic connections and rewiring
them based on new experiences, we tune the synaptic connections
instead of the weights in the SPN to solve given tasks. Experimental results on several
robotic control tasks show that our method can achieve the performance level of
mainstream DRL methods and exhibit significantly higher energy efficiency.
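The idea of evolving which connections exist, rather than how strong they are, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the binary connection mask, the elitist selection scheme, and the names `mutate_connections` and `evolve` are all hypothetical, and the fitness function is a toy stand-in for an episode return.

```python
import random

def mutate_connections(mask, flip_prob, rng):
    """Flip each connection bit (1 = connected, 0 = absent) with probability flip_prob."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in mask]

def evolve(fitness, n_connections, pop_size=20, generations=30, flip_prob=0.1, seed=0):
    """Toy genetic algorithm over connection masks; synaptic weights are assumed fixed."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_connections)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism) and refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [mutate_connections(rng.choice(elite), flip_prob, rng)
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness)

# Hypothetical fitness: number of positions that agree with a target wiring.
target = [1, 0, 1, 1, 0, 0, 1, 0]
def matches(mask):
    return sum(int(a == b) for a, b in zip(mask, target))

best = evolve(matches, n_connections=len(target))
```

In a real setting the fitness call would run the spiking policy in the control task, which is where nearly all of the compute would go.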
Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning
With Deep Neural Networks (DNNs) as powerful function approximators,
Deep Reinforcement Learning (DRL) has performed well on robotic
control tasks. Compared to DNNs with vanilla artificial neurons, the
biologically plausible Spiking Neural Network (SNN) contains a diverse
population of spiking neurons, making it naturally well suited to representing
states with spatial and temporal information. Based on a hybrid
learning framework, where a spiking actor network infers actions from states and
a deep critic network evaluates the actor, we propose a Population-coding and
Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state
representation at two different scales: input coding and neuronal coding. For
input coding, we apply population coding with dynamic receptive fields to
directly encode each input state component. For neuronal coding, we propose
different types of dynamic-neurons (containing 1st-order and 2nd-order neuronal
dynamics) to describe much more complex neuronal dynamics. Finally, the PDSAN
is trained in conjunction with deep critic networks using the Twin Delayed Deep
Deterministic policy gradient algorithm (TD3-PDSAN). Extensive experimental
results show that our TD3-PDSAN model achieves better performance than
state-of-the-art models on four OpenAI Gym benchmark tasks. It is an important
attempt to improve RL with SNNs toward effective computation that satisfies
biological plausibility. Comment: 27 pages, 11 figures, accepted by Journal of Neural Networks
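As a rough illustration of the input-coding step, population coding maps each scalar state component onto the activations of several neurons with overlapping receptive fields. The sketch below assumes Gaussian receptive fields with fixed, evenly spaced centers; in the abstract the receptive fields are dynamic, so the fixed centers and width here are a simplification, and `population_encode` is a hypothetical name.

```python
import math

def population_encode(value, low, high, n_neurons, sigma=None):
    """Encode one scalar as n_neurons activations using Gaussian receptive fields."""
    if n_neurons < 2:
        raise ValueError("need at least two neurons")
    step = (high - low) / (n_neurons - 1)
    if sigma is None:
        sigma = step  # assumption: field width equals the spacing between centers
    centers = [low + i * step for i in range(n_neurons)]
    # Each neuron responds most strongly when the value sits at its center.
    return [math.exp(-0.5 * ((value - c) / sigma) ** 2) for c in centers]

# A state component of 0.0 in the range [-1, 1], encoded by five neurons.
activations = population_encode(0.0, -1.0, 1.0, 5)
```

The neuron centered on the input value responds maximally and its neighbors respond progressively less, so the population as a whole carries a graded code for the scalar.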
Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning
With the help of special neuromorphic hardware, spiking neural networks
(SNNs) are expected to realize artificial intelligence (AI) with less energy
consumption. Combining SNNs with deep reinforcement learning (DRL) provides a
promising energy-efficient approach to realistic control tasks. In this paper,
we focus on tasks where the agent needs to learn multi-dimensional
deterministic policies for control, which are very common in real scenarios.
Recently, the surrogate gradient method has been utilized for training
multi-layer SNNs, which allows SNNs to achieve comparable performance with the
corresponding deep networks in this task. Most existing spike-based RL methods
take the firing rate as the output of SNNs, and convert it to represent
continuous action space (i.e., the deterministic policy) through a
fully-connected (FC) layer. However, because the firing rate is a fractional
value, the FC layer requires floating-point matrix operations, which prevents the
whole SNN from being deployed directly on neuromorphic hardware. To develop a
fully spiking actor network without any floating-point matrix operations, we
draw inspiration from the non-spiking interneurons found in insects and employ
the membrane voltage of the non-spiking neurons to represent the action. Ahead of
the non-spiking neurons, multiple neuron populations are introduced to decode
the different action dimensions. Since each population is used to decode one
dimension of the action, we argue that the neurons in each population should be
connected in both the time domain and the space domain. Hence, intra-layer connections
are used in the output populations to enhance the representation capacity. Finally,
we propose a fully spiking actor network with intra-layer connections
(ILC-SAN). Comment: 13 pages, 6 figures
Exploring the Influence of Energy Constraints on Liquid State Machines
Biological organisms operate under severe energy constraints but are still the most powerful computational systems that we know of. In contrast, modern AI algorithms are generally implemented on power-hungry hardware resources such as GPUs, limiting their use at the edge. This work explores the application of biologically-inspired energy constraints to spiking neural networks to better understand their effects on network dynamics and learning and to gain insight into the creation of more energy-efficient AI. Energy constraints are modeled by abstracting the role of astrocytes in metabolizing glucose and regulating the activity-driven distribution of ATP molecules to “pools” of neurons and synapses.
First, energy constraints are applied to the fixed recurrent part (a.k.a. reservoir) of liquid state machines (LSMs), a type of recurrent spiking neural network, in order to analyze their effects on both the network’s computational performance and ability to learn. Energy constraints were observed to have a significant influence on the dynamics of the network based on metrics such as Lyapunov exponent and separation ratio. In several cases the energy constraints also led to an increase in the LSM’s classification accuracy (up to 6.17% improvement over baseline) when applied to two time series classification tasks: epileptic seizure detection and gait recognition. This improvement in classification accuracy was also typically correlated with the LSM separation metric (Pearson correlation coefficient of 0.9 for the seizure detection task). However, the increased classification accuracy was generally not observed in LSMs with sparse connectivity, highlighting the role of energy constraints in sparsifying the LSM’s spike activity, which could lead to real-world energy savings in hardware implementations.
In addition to the fixed LSM reservoir, the impact of energy constraints was also explored in the context of unsupervised learning with spike-timing dependent plasticity (STDP). It was observed that energy constraints can decrease the magnitude of synaptic weight updates by up to 72.4%, on average, depending on factors such as the energy cost of neuron spikes and the energy pool regeneration rate. Energy constraints under certain conditions were also seen to modify which input frequencies the synapses respond to, tending to attenuate or eliminate weight updates from high frequency inputs. The effects of neuronal energy constraints on STDP learning were also studied at the network level to determine their effects on classification task performance.
The final part of this work attempts to co-optimize an LSM’s energy consumption and performance through reinforcement learning. A proximal policy optimization (PPO) agent is introduced into the LSM reservoir to control the level of neuronal spiking by modifying individual energy constraint parameters. The agent is rewarded based on the separation of the reservoir and additionally rewarded for reducing reservoir energy consumption.
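To make the notion of an energy pool concrete, the following minimal sketch gates a neuron's spikes on the energy available in its pool. It is an assumption-laden toy, not the model used in the work: the pool capacity, spike cost, regeneration rate, and the function name `simulate_energy_gated_spikes` are all hypothetical, and the astrocyte dynamics are reduced to a constant refill per time step.

```python
def simulate_energy_gated_spikes(wants_to_spike, capacity=2.0, spike_cost=1.0, regen=0.25):
    """Return the spikes actually emitted when each spike must be paid for from a pool.

    wants_to_spike: sequence of 0/1 flags, one per time step, saying whether the
    neuron would fire if energy were unlimited.
    """
    energy = capacity
    emitted = []
    for want in wants_to_spike:
        # The pool regenerates a fixed amount each step, capped at its capacity.
        energy = min(capacity, energy + regen)
        if want and energy >= spike_cost:
            energy -= spike_cost  # firing depletes the pool
            emitted.append(1)
        else:
            emitted.append(0)     # spike suppressed (or not requested)
    return emitted

# A neuron driven to fire on every step is throttled by its pool.
spikes = simulate_energy_gated_spikes([1] * 8)
# spikes == [1, 1, 0, 0, 1, 0, 0, 0]
```

With these parameters, sustained drive is thinned out once the pool is drained, which is the kind of activity sparsification the abstract attributes to energy constraints.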
The Islands Project for Managing Populations in Genetic Training of Spiking Neural Networks
The TENNLab software framework enables researchers to explore spiking neuroprocessors, neuromorphic applications and how they are trained. The centerpiece of training in TENNLab has been a genetic algorithm called Evolutionary Optimization for Neuromorphic Systems (EONS). EONS optimizes a single population of spiking neural networks, and until now, methods for training with multiple populations have been ad hoc, typically consisting of shell scripts that execute multiple independent EONS jobs, whose results are then combined and analyzed in another ad hoc fashion. The Islands project seeks to manage and manipulate multiple EONS populations in a controlled way. With Islands, one may spawn off independent EONS populations, each of which is an “Island.” One may define the characteristics of a “stagnated” island, where further optimization is unlikely to improve the fitness of the population on the island. The Islands software then allows one to create new islands by combining stagnated islands, or to migrate populations from one island to others, all in an attempt to increase diversity among the populations and improve their fitness. This thesis describes the software structure of Islands, its interface, and the functionalities that it implements. We then perform a case study with three neuromorphic control applications that demonstrates the wide variety of features of Islands.
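The stagnation and combination steps described above might look roughly like the sketch below. None of this is the Islands or EONS interface: the stagnation criterion (too little improvement over a window of generations), the merge rule (keep the fittest members of two islands), and the names `is_stagnated` and `merge_islands` are assumptions made purely for illustration.

```python
def is_stagnated(best_fitness_history, window=5, min_gain=1e-3):
    """True if the best fitness improved by less than min_gain over the last `window` generations."""
    if len(best_fitness_history) <= window:
        return False  # not enough history to judge
    gain = best_fitness_history[-1] - best_fitness_history[-1 - window]
    return gain < min_gain

def merge_islands(island_a, island_b, size):
    """Build a new island from the fittest members of two (stagnated) islands.

    Each member is a (fitness, genome) pair; genomes are opaque here.
    """
    combined = sorted(island_a + island_b, key=lambda member: member[0], reverse=True)
    return combined[:size]

# Hypothetical usage: two islands whose best fitness has plateaued are combined.
history_a = [0.50, 0.62, 0.70, 0.70, 0.70, 0.70, 0.70, 0.70]
island_a = [(0.9, "a1"), (0.2, "a2")]
island_b = [(0.5, "b1"), (0.7, "b2")]
if is_stagnated(history_a):
    new_island = merge_islands(island_a, island_b, size=2)
```

Migration between islands could be sketched the same way, by copying a few top members of one island into another instead of forming a new one.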
Brain topology improved spiking neural network for efficient reinforcement learning of continuous control
Brain topology closely reflects the complex cognitive functions of the biological brain after millions of years of evolution. Learning from these biological topologies is a smarter and easier way to achieve brain-like intelligence with efficiency, robustness, and flexibility. Here we propose a brain topology-improved spiking neural network (BT-SNN) for efficient reinforcement learning. First, hundreds of biological topologies are generated and selected as subsets of the Allen mouse brain topology with the help of the Tanimoto hierarchical clustering algorithm, which has been widely used in analyzing key features of the brain connectome. Second, a few biological constraints, including but not limited to the proportion of node functions (e.g., sensation, memory, and motor types) and network sparsity, are used to filter out three key topology candidates. Third, the network topology is integrated with hybrid numerical solver-improved leaky integrate-and-fire neurons. Fourth, the algorithm is tuned with an evolutionary algorithm named adaptive random search, instead of backpropagation, to guide synaptic modifications without affecting the raw key features of the topology. Fifth, when tested on four animal-survival-like RL tasks (i.e., dynamic control in MuJoCo), the BT-SNN achieves higher scores than not only a counterpart SNN using random topology but also some classical ANNs (i.e., long short-term memory and multi-layer perceptron). This result indicates that incorporating biological topology and evolutionary learning rules holds considerable promise for future research.
Systematic AI Approach for AGI: Addressing Alignment, Energy, and AGI Grand Challenges
AI faces a trifecta of grand challenges: the Energy Wall, the Alignment
Problem and the Leap from Narrow AI to AGI. Contemporary AI solutions consume
unsustainable amounts of energy during model training and daily
operations. Making things worse, the amount of computation required to train
each new AI model has been doubling every 2 months since 2020, directly
translating to increases in energy consumption. The leap from AI to AGI requires
multiple functional subsystems operating in a balanced manner, which requires a
system architecture. However, the current approach to artificial intelligence
lacks system design, even though system characteristics play a key role in the
human brain, from the way it processes information to how it makes decisions.
Similarly, current alignment and AI ethics approaches largely ignore system
design, yet studies show that the brain's system architecture plays a critical
role in healthy moral decisions. In this paper, we argue that system design is
critically important in overcoming all three grand challenges. We posit that
system design is the missing piece in overcoming the grand challenges. We
present a Systematic AI Approach for AGI that utilizes system design principles
for AGI, while providing ways to overcome the energy wall and the alignment
challenges. Comment: International Journal on Semantic Computing (2024) Categories:
Artificial Intelligence; AI; Artificial General Intelligence; AGI; System
Design; System Architecture