133 research outputs found

    Closing the loop between neural network simulators and the OpenAI Gym

    Full text link
    Since the enormous breakthroughs in machine learning over the last decade, functional neural network models have been of growing interest to many researchers in the field of computational neuroscience. One major branch of research is concerned with biologically plausible implementations of reinforcement learning, and a variety of different models have been developed in recent years. However, most studies in this area are conducted with custom simulation scripts and manually implemented tasks. This makes it hard for other researchers to reproduce and build upon previous work and nearly impossible to compare the performance of different learning architectures. In this work, we present a novel approach to solving this problem by connecting benchmark tools from the field of machine learning with state-of-the-art neural network simulators from computational neuroscience. This toolchain enables researchers in both fields to make use of well-tested, high-performance simulation software supporting biologically plausible neuron, synapse, and network models, and allows them to evaluate and compare their approaches on the basis of standardized environments of varying complexity. We demonstrate the functionality of the toolchain by implementing a neuronal actor-critic architecture for reinforcement learning in the NEST simulator and successfully training it on two different environments from the OpenAI Gym.
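
    The abstract above describes a simulation loop in which a Gym environment and a NEST network advance in lockstep. The sketch below illustrates that pattern only; the Poisson-rate encoding, spike-count decoding, network sizes, and all parameters are assumptions made for illustration, not the paper's actual toolchain or interface.

        # Minimal Gym <-> NEST loop (illustrative only; classic Gym API assumed).
        import gym
        import nest
        import numpy as np

        env = gym.make("CartPole-v1")
        obs = env.reset()                                  # classic Gym: reset() returns obs only

        nest.ResetKernel()
        inputs = nest.Create("poisson_generator", obs.shape[0])
        actors = nest.Create("iaf_psc_alpha", 20)          # 10 neurons per action (assumed)
        rec = nest.Create("spike_recorder")                # "spike_detector" on NEST 2.x
        nest.Connect(inputs, actors, syn_spec={"weight": 120.0})
        nest.Connect(actors, rec)

        for step in range(200):
            # Encode the observation as Poisson firing rates (Hz); bounds are illustrative.
            rates = 50.0 + 200.0 * (np.clip(obs, -2.4, 2.4) + 2.4) / 4.8
            nest.SetStatus(inputs, [{"rate": float(r)} for r in rates])

            nest.SetStatus(rec, {"n_events": 0})           # clear spikes from the previous step
            nest.Simulate(50.0)                            # 50 ms of network time per env step

            # Decode: the action whose neuron population fired more spikes wins.
            senders = nest.GetStatus(rec, "events")[0]["senders"]
            counts = [np.isin(senders, pop.tolist()).sum() for pop in (actors[:10], actors[10:])]
            action = int(np.argmax(counts))

            obs, reward, done, info = env.step(action)
            if done:
                obs = env.reset()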

    Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network

    Full text link
    Learning from interaction is the primary way biological agents come to know their environment and themselves. Modern deep reinforcement learning (DRL) explores a computational approach to learning from interaction and has made significant progress in solving various tasks. However, DRL still falls far short of biological agents in energy efficiency. Although the underlying mechanisms are not fully understood, we believe that the integration of spiking communication between neurons and biologically plausible synaptic plasticity plays a prominent role. Following this biological intuition, we optimize a spiking policy network (SPN) with a genetic algorithm as an energy-efficient alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects and communicates through event-based spikes. Inspired by biological findings that the brain forms memories by growing new synaptic connections and rewiring them in light of new experiences, we tune the synaptic connections instead of the weights in the SPN to solve the given tasks. Experimental results on several robotic control tasks show that our method can reach the performance level of mainstream DRL methods while exhibiting significantly higher energy efficiency.
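
    As a concrete illustration of tuning connections rather than weights, the sketch below evolves a binary connection mask over a fixed random weight matrix with a simple genetic algorithm on a Gym task. The tiny LIF layer, the GA operators (truncation selection and bit-flip mutation only), and every constant are assumptions for illustration, not the paper's actual SPN or its genetic algorithm.

        # Evolve which connections exist (a binary mask); never touch the weights themselves.
        import gym
        import numpy as np

        env = gym.make("CartPole-v1")                   # classic Gym API assumed
        OBS, ACT, HID, T = 4, 2, 64, 16                 # obs dim, actions, hidden size, sim steps

        rng = np.random.default_rng(0)
        W_in = rng.normal(0.0, 1.0, (HID, OBS))         # fixed weights, never trained
        W_out = rng.normal(0.0, 1.0, (ACT, HID))

        def policy(mask, obs):
            """Pick an action from a tiny LIF layer whose input connections are gated by `mask`."""
            v, counts = np.zeros(HID), np.zeros(HID)
            drive = (W_in * mask.reshape(HID, OBS)) @ obs
            for _ in range(T):
                v = 0.9 * v + drive                     # leaky integration
                spikes = v > 1.0
                counts += spikes
                v[spikes] = 0.0                         # reset after spiking
            return int(np.argmax(W_out @ counts))       # rate-based readout

        def fitness(mask, episodes=3):
            total = 0.0
            for _ in range(episodes):
                obs, done = env.reset(), False
                while not done:
                    obs, r, done, _ = env.step(policy(mask, obs))
                    total += r
            return total / episodes

        pop = rng.integers(0, 2, (32, HID * OBS))       # population of binary masks
        for gen in range(50):
            scores = np.array([fitness(m) for m in pop])
            parents = pop[np.argsort(scores)[-8:]]      # truncation selection: keep the best 8
            children = parents[rng.integers(0, 8, 24)].copy()
            flips = rng.random(children.shape) < 0.02   # 2% bit-flip mutation
            children[flips] ^= 1
            pop = np.vstack([parents, children])
            print(f"gen {gen}: best return {scores.max():.1f}")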

    Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning

    Full text link
    With Deep Neural Networks (DNNs) as powerful function approximators, Deep Reinforcement Learning (DRL) has been demonstrated with excellent results on robotic control tasks. Compared to DNNs with vanilla artificial neurons, the biologically plausible Spiking Neural Network (SNN) contains a diverse population of spiking neurons, making it naturally suited to representing states with both spatial and temporal information. Based on a hybrid learning framework, where a spiking actor network infers actions from states and a deep critic network evaluates the actor, we propose a Population-coding and Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state representation at two different scales: input coding and neuronal coding. For input coding, we apply population coding with dynamic receptive fields to directly encode each input state component. For neuronal coding, we propose different types of dynamic neurons (with first- and second-order neuronal dynamics) to describe much more complex neuronal dynamics. Finally, the PDSAN is trained in conjunction with deep critic networks using the Twin Delayed Deep Deterministic policy gradient algorithm (TD3-PDSAN). Extensive experimental results show that our TD3-PDSAN model achieves better performance than state-of-the-art models on four OpenAI Gym benchmark tasks. It is an important attempt to improve RL with SNNs toward effective computation that satisfies biological plausibility. Comment: 27 pages, 11 figures, accepted by Journal of Neural Networks.
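
    The input-coding idea can be pictured with a short sketch: each scalar state component is encoded by a small population of neurons with Gaussian receptive fields whose outputs stochastically drive spikes. In the paper the field centers and widths are trained jointly with TD3 (hence "dynamic"); here they are fixed initial values, and all names and constants are assumptions rather than the PDSAN implementation.

        # Population coding of a state vector with Gaussian receptive fields (illustrative).
        import numpy as np

        POP = 10                                        # neurons per state dimension

        def population_encode(state, means, sigmas):
            """state: (D,) -> stimulation strengths (D * POP,) via Gaussian receptive fields."""
            act = np.exp(-0.5 * ((state[:, None] - means) / sigmas) ** 2)   # (D, POP)
            return act.reshape(-1)

        D = 4
        means = np.tile(np.linspace(-1.0, 1.0, POP), (D, 1))   # would be learnable in PDSAN
        sigmas = np.full((D, POP), 0.15)                       # would be learnable in PDSAN

        strengths = population_encode(np.array([0.1, -0.4, 0.8, 0.0]), means, sigmas)
        spikes = (np.random.rand(strengths.size) < strengths).astype(float)  # stochastic spiking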

    Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning

    Full text link
    With the help of specialized neuromorphic hardware, spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption. Combining SNNs with deep reinforcement learning (DRL) therefore provides a promising, energy-efficient approach to realistic control tasks. In this paper, we focus on tasks in which the agent needs to learn multi-dimensional deterministic control policies, which are very common in real scenarios. Recently, the surrogate gradient method has been used to train multi-layer SNNs, allowing them to achieve performance comparable to the corresponding deep networks on such tasks. Most existing spike-based RL methods take the firing rate as the output of the SNN and convert it into a continuous action space (i.e., the deterministic policy) through a fully connected (FC) layer. However, because the firing rate is a real-valued quantity, the FC layer requires floating-point matrix operations, which prevents the whole SNN from being deployed directly on neuromorphic hardware. To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects and use the membrane voltage of non-spiking neurons to represent the action. Before the non-spiking neurons, multiple neuron populations are introduced to decode the different action dimensions. Since each population decodes one action dimension, we argue that the neurons within a population should be connected in both the temporal and spatial domains. Hence, intra-layer connections are used in the output populations to enhance their representational capacity. Finally, we propose a fully spiking actor network with intra-layer connections (ILC-SAN). Comment: 13 pages, 6 figures.
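
    A rough sketch of the output stage described above: each action dimension is decoded by a small spiking population with intra-layer (lateral) connections, feeding one non-spiking leaky integrator whose membrane voltage is read out, squashed, and used as the action. Shapes, decay constants, and the tanh squashing are assumptions for illustration, not the ILC-SAN reference implementation.

        # Membrane-voltage readout from non-spiking neurons, one per action dimension.
        import numpy as np

        T, POP, ACT_DIM = 16, 10, 2
        rng = np.random.default_rng(0)
        W_lat = rng.normal(0.0, 0.1, (ACT_DIM, POP, POP))   # intra-layer connections per population
        W_dec = rng.normal(0.0, 0.3, (ACT_DIM, POP))        # spiking population -> non-spiking neuron

        def decode_action(input_current):
            """input_current: (T, ACT_DIM, POP) presynaptic drive to the output populations."""
            v = np.zeros((ACT_DIM, POP))                    # spiking population membranes
            u = np.zeros(ACT_DIM)                           # non-spiking neuron membranes
            spikes = np.zeros((ACT_DIM, POP))
            for t in range(T):
                lateral = np.einsum("aij,aj->ai", W_lat, spikes)   # intra-layer feedback
                v = 0.8 * v + input_current[t] + lateral
                spikes = (v > 1.0).astype(float)
                v = np.where(spikes > 0, 0.0, v)            # reset only the neurons that spiked
                u = 0.9 * u + np.einsum("ap,ap->a", W_dec, spikes)  # non-spiking: integrate, no reset
            return np.tanh(u)                               # final membrane voltage -> bounded action

        action = decode_action(rng.normal(0.0, 0.5, (T, ACT_DIM, POP)))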

    Exploring the Influence of Energy Constraints on Liquid State Machines

    Get PDF
    Biological organisms operate under severe energy constraints but are still the most powerful computational systems that we know of. In contrast, modern AI algorithms are generally implemented on power-hungry hardware resources such as GPUs, limiting their use at the edge. This work explores the application of biologically inspired energy constraints to spiking neural networks to better understand their effects on network dynamics and learning and to gain insight into the creation of more energy-efficient AI. Energy constraints are modeled by abstracting the role of astrocytes in metabolizing glucose and regulating the activity-driven distribution of ATP molecules to “pools” of neurons and synapses. First, energy constraints are applied to the fixed recurrent part (a.k.a. reservoir) of liquid state machines (LSMs)—a type of recurrent spiking neural network—in order to analyze their effects on both the network’s computational performance and its ability to learn. Energy constraints were observed to have a significant influence on the dynamics of the network based on metrics such as the Lyapunov exponent and separation ratio. In several cases the energy constraints also led to an increase in the LSM’s classification accuracy (up to 6.17% improvement over baseline) when applied to two time series classification tasks: epileptic seizure detection and gait recognition. This improvement in classification accuracy was also typically correlated with the LSM separation metric (Pearson correlation coefficient of 0.9 for the seizure detection task). However, the increased classification accuracy was generally not observed in LSMs with sparse connectivity, highlighting the role of energy constraints in sparsifying the LSM’s spike activity, which could lead to real-world energy savings in hardware implementations. In addition to the fixed LSM reservoir, the impact of energy constraints was also explored in the context of unsupervised learning with spike-timing-dependent plasticity (STDP). It was observed that energy constraints can decrease the magnitude of synaptic weight updates by up to 72.4% on average, depending on factors such as the energy cost of neuron spikes and the energy pool regeneration rate. Under certain conditions, energy constraints were also seen to modify which input frequencies the synapses respond to, tending to attenuate or eliminate weight updates from high-frequency inputs. The effects of neuronal energy constraints on STDP learning were also studied at the network level to determine their effects on classification task performance. The final part of this work attempts to co-optimize an LSM’s energy consumption and performance through reinforcement learning. A proximal policy optimization (PPO) agent is introduced into the LSM reservoir to control the level of neuronal spiking by allowing it to modify individual energy constraint parameters. The agent is rewarded based on the separation of the reservoir and additionally rewarded for reducing the reservoir’s energy consumption.
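
    The energy-constraint mechanism can be summarized with a small sketch: a shared, astrocyte-like energy pool pays a fixed cost per spike, regenerates at a constant rate, and suppresses spikes once it is depleted. The pool size, per-spike cost, regeneration rate, and the stand-in membrane dynamics below are illustrative assumptions, not the parameterization used in the thesis.

        # Activity-driven energy pool gating spikes in a group of neurons (illustrative).
        import numpy as np

        N, T = 100, 500
        POOL_MAX, SPIKE_COST, REGEN = 50.0, 1.0, 2.0
        rng = np.random.default_rng(1)

        v = np.zeros(N)
        pool = POOL_MAX
        fired = suppressed = 0
        for t in range(T):
            v = 0.95 * v + rng.normal(0.0, 0.4, N)      # stand-in for reservoir dynamics
            want_spike = np.flatnonzero(v > 1.0)
            budget = int(pool // SPIKE_COST)            # how many spikes the pool can pay for
            allowed = want_spike[:budget]               # spikes beyond the budget are dropped
            suppressed += len(want_spike) - len(allowed)
            fired += len(allowed)
            v[want_spike] = 0.0                         # reset all above-threshold neurons
            pool = min(POOL_MAX, pool - len(allowed) * SPIKE_COST + REGEN)

        print(f"fired {fired} spikes, suppressed {suppressed} under the energy constraint")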

    The Islands Project for Managing Populations in Genetic Training of Spiking Neural Networks

    Get PDF
    The TENNLab software framework enables researchers to explore spiking neuroprocessors, neuromorphic applications, and how they are trained. The centerpiece of training in TENNLab has been a genetic algorithm called Evolutionary Optimization for Neuromorphic Systems (EONS). EONS optimizes a single population of spiking neural networks, and heretofore methods for training with multiple populations have been ad hoc, typically consisting of shell scripts that execute multiple independent EONS jobs whose results are then combined and analyzed in an equally ad hoc fashion. The Islands project seeks to manage and manipulate multiple EONS populations in a controlled way. With Islands, one may spawn independent EONS populations, each of which is an “Island.” One may define characteristics of a “stagnated” island, where further optimization is unlikely to improve the fitness of the population on the island. The Islands software then allows one to create new islands by combining stagnated islands, or to migrate populations from one island to others, all in an attempt to increase diversity among the populations and improve their fitness. This thesis describes the software structure of Islands, its interface, and the functionalities it implements. We then perform a case study with three neuromorphic control applications that demonstrates the wide variety of features of Islands.
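
    The sketch below shows the generic island-model pattern the thesis builds around EONS: several populations evolve independently, an island is flagged as stagnated when its best fitness stops improving over a window, and an individual from the currently best island is migrated into the stagnated one to restore diversity. The GA step, fitness function, and stagnation rule are placeholders, not the TENNLab Islands or EONS API.

        # Generic island-model GA with stagnation detection and migration (placeholders throughout).
        import random

        def mutate(genome):
            return [g + random.gauss(0.0, 0.1) for g in genome]

        def fitness(genome):                            # placeholder objective (maximize)
            return -sum(g * g for g in genome)

        def evolve(pop):
            """Placeholder GA step: keep the best half, refill with mutated copies."""
            pop = sorted(pop, key=fitness, reverse=True)
            survivors = pop[: len(pop) // 2]
            children = [mutate(random.choice(survivors)) for _ in range(len(pop) - len(survivors))]
            return survivors + children

        STAGNATION_WINDOW = 10
        islands = [[[random.uniform(-1, 1) for _ in range(8)] for _ in range(20)] for _ in range(4)]
        history = [[] for _ in islands]

        for gen in range(200):
            for i in range(len(islands)):
                islands[i] = evolve(islands[i])
                history[i].append(fitness(islands[i][0]))      # best individual sits at index 0
            for i, h in enumerate(history):
                if len(h) > STAGNATION_WINDOW and h[-1] <= h[-STAGNATION_WINDOW]:
                    donor = max(range(len(islands)), key=lambda j: fitness(islands[j][0]))
                    if donor != i:                      # migrate the donor's best individual in
                        islands[i][-1] = list(islands[donor][0])
                    history[i] = []                     # restart the stagnation window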

    Brain topology improved spiking neural network for efficient reinforcement learning of continuous control

    Get PDF
    Brain topology closely reflects the complex cognitive functions of the biological brain, shaped by millions of years of evolution. Learning from these biological topologies is a smarter and easier way to achieve brain-like intelligence that is efficient, robust, and flexible. Here we propose a brain-topology-improved spiking neural network (BT-SNN) for efficient reinforcement learning. First, hundreds of biological topologies are generated and selected as subsets of the Allen mouse brain topology with the help of the Tanimoto hierarchical clustering algorithm, which has been widely used in analyzing key features of the brain connectome. Second, a few biological constraints, including but not limited to the proportion of node functions (e.g., sensory, memory, and motor types) and network sparsity, are used to select three key topology candidates. Third, the network topology is integrated with leaky integrate-and-fire neurons improved by a hybrid numerical solver. Fourth, the network is tuned with an evolutionary algorithm, adaptive random search, instead of backpropagation, guiding synaptic modifications without affecting the key raw features of the topology. Fifth, on four animal-survival-like RL tasks (i.e., dynamic control in MuJoCo), the BT-SNN achieves higher scores than not only a counterpart SNN using a random topology but also some classical ANNs (i.e., long short-term memory and the multi-layer perceptron). This result indicates that the research effort of incorporating biological topology and evolutionary learning rules has much in store for the future.
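
    The training loop can be illustrated briefly: the topology is a fixed binary mask (standing in for the connectome-derived candidate), and adaptive random search perturbs only the weights of existing connections, expanding the perturbation scale when a proposal improves the score and contracting it otherwise. The mask source, the return function, and all constants below are assumptions, not the BT-SNN code.

        # Adaptive random search over weights of a fixed, biologically derived topology (illustrative).
        import numpy as np

        rng = np.random.default_rng(0)
        N = 64
        mask = (rng.random((N, N)) < 0.1).astype(float)   # stand-in for the atlas-derived topology
        weights = rng.normal(0.0, 0.1, (N, N)) * mask

        def episode_return(w):
            """Placeholder for running the SNN policy on a MuJoCo task and returning its score."""
            return -np.sum((w - 0.05 * mask) ** 2)

        sigma, best = 0.1, episode_return(weights)
        for it in range(500):
            proposal = weights + sigma * rng.normal(0.0, 1.0, weights.shape) * mask  # keep the topology
            score = episode_return(proposal)
            if score > best:
                weights, best = proposal, score
                sigma = min(sigma * 1.1, 1.0)             # widen the search while improving
            else:
                sigma = max(sigma * 0.97, 1e-3)           # narrow it otherwise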

    Systematic AI Approach for AGI: Addressing Alignment, Energy, and AGI Grand Challenges

    Full text link
    AI faces a trifecta of grand challenges: the Energy Wall, the Alignment Problem, and the Leap from Narrow AI to AGI. Contemporary AI solutions consume unsustainable amounts of energy during model training and daily operations. Making things worse, the amount of computation required to train each new AI model has been doubling every 2 months since 2020, directly translating into increases in energy consumption. The leap from AI to AGI requires multiple functional subsystems operating in a balanced manner, which in turn requires a system architecture. However, the current approach to artificial intelligence lacks system design, even though system characteristics play a key role in the human brain, from the way it processes information to how it makes decisions. Similarly, current alignment and AI ethics approaches largely ignore system design, yet studies show that the brain's system architecture plays a critical role in healthy moral decisions. In this paper, we argue that system design is critically important in overcoming all three grand challenges and posit that it is the missing piece. We present a Systematic AI Approach for AGI that applies system design principles to AGI while providing ways to overcome the energy wall and alignment challenges. Comment: International Journal on Semantic Computing (2024). Categories: Artificial Intelligence; AI; Artificial General Intelligence; AGI; System Design; System Architecture