160 research outputs found
Scaling of a large-scale simulation of synchronous slow-wave and asynchronous awake-like activity of a cortical model with long-range interconnections
Cortical synapse organization supports a range of dynamic states on multiple
spatial and temporal scales, from synchronous slow wave activity (SWA),
characteristic of deep sleep or anesthesia, to fluctuating, asynchronous
activity during wakefulness (AW). Such dynamic diversity poses a challenge for
producing efficient large-scale simulations that embody realistic metaphors of
short- and long-range synaptic connectivity. In fact, during SWA and AW
different spatial extents of the cortical tissue are active in a given timespan
and at different firing rates, which implies a wide variety of loads of local
computation and communication. A balanced evaluation of simulation performance
and robustness should therefore include tests of a variety of cortical dynamic
states. Here, we demonstrate performance scaling of our proprietary Distributed
and Plastic Spiking Neural Networks (DPSNN) simulation engine in both SWA and
AW for bidimensional grids of neural populations, which reflects the modular
organization of the cortex. We explored networks up to 192x192 modules, each
composed of 1250 integrate-and-fire neurons with spike-frequency adaptation,
and exponentially decaying inter-modular synaptic connectivity with varying
spatial decay constant. For the largest networks the total number of synapses
was over 70 billion. The execution platform included up to 64 dual-socket
nodes, each socket mounting 8 Intel Xeon Haswell processor cores @ 2.40GHz
clock rates. Network initialization time, memory usage, and execution time
showed good scaling performances from 1 to 1024 processes, implemented using
the standard Message Passing Interface (MPI) protocol. We achieved simulation
speeds of between 2.3x10^9 and 4.1x10^9 synaptic events per second for both
cortical states in the explored range of inter-modular interconnections.
Comment: 22 pages, 9 figures, 4 tables
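The sizes quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the variable names are illustrative, and the synapses-per-neuron figure is a lower bound because the abstract says only "over 70 billion".

```python
# Back-of-the-envelope check of the sizes quoted in the abstract.
grid_side = 192             # largest grid: 192 x 192 modules
neurons_per_module = 1250   # integrate-and-fire neurons per module
min_total_synapses = 70e9   # "over 70 billion", so this is a lower bound

modules = grid_side * grid_side
neurons = modules * neurons_per_module
min_synapses_per_neuron = min_total_synapses / neurons

# Execution platform: 64 dual-socket nodes, 8 cores per socket.
nodes, sockets_per_node, cores_per_socket = 64, 2, 8
cores = nodes * sockets_per_node * cores_per_socket

print(f"modules: {modules}")
print(f"neurons: {neurons}")
print(f"synapses per neuron (at least): {min_synapses_per_neuron:.0f}")
print(f"hardware cores: {cores}")
```

The resulting core count equals the largest number of MPI processes reported, which is consistent with one process per core at the largest scale.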
Simulation of networks of spiking neurons: A review of tools and strategies
We review different aspects of the simulation of spiking neural networks. We
start by reviewing the different types of simulation strategies and algorithms
that are currently implemented. We next review the precision of those
simulation strategies, in particular in cases where plasticity depends on the
exact timing of the spikes. We overview different simulators and simulation
environments presently available (restricted to those freely available, open
source and documented). For each simulation tool, its advantages and pitfalls
are reviewed, with an aim to allow the reader to identify which simulator is
appropriate for a given task. Finally, we provide a series of benchmark
simulations of different types of networks of spiking neurons, including
Hodgkin-Huxley type, integrate-and-fire models, interacting with current-based
or conductance-based synapses, using clock-driven or event-driven integration
strategies. The same set of models are implemented on the different simulators,
and the codes are made available. The ultimate goal of this review is to
provide a resource to facilitate identifying the appropriate integration
strategy and simulation tool to use for a given modeling problem related to
spiking neural networks.
Comment: 49 pages, 24 figures, 1 table; review article, Journal of
Computational Neuroscience, in press (2007)
The Brain on Low Power Architectures - Efficient Simulation of Cortical Slow Waves and Asynchronous States
Efficient brain simulation is a scientific grand challenge, a
parallel/distributed coding challenge and a source of requirements and
suggestions for future computing architectures. Indeed, the human brain
includes about 10^15 synapses and 10^11 neurons activated at a mean rate of
several Hz. Full brain simulation poses Exascale challenges even if simulated
at the highest abstraction level. The WaveScalES experiment in the Human Brain
Project (HBP) has the goal of matching experimental measures and simulations of
slow waves during deep-sleep and anesthesia and the transition to other brain
states. The focus is the development of dedicated large-scale
parallel/distributed simulation technologies. The ExaNeSt project designs an
ARM-based, low-power HPC architecture scalable to millions of cores, developing
a dedicated scalable interconnect system, and SWA/AW simulations are included
among the driving benchmarks. At the junction of the two projects is the INFN
proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation
engine. DPSNN can be configured to stress either the networking or the
computation features available on the execution platforms. The simulation
stresses the networking component when the neural net, composed of a
relatively low number of neurons, each one projecting thousands of synapses,
is distributed over a large number of hardware cores. As the number of
neurons per core grows, computation becomes the dominant component for
short-range connections. This paper reports preliminary performance
results obtained on an ARM-based HPC prototype developed in the framework of
the ExaNeSt project. Furthermore, a comparison is given of instantaneous power,
total energy consumption, execution time and energetic cost per synaptic event
of SWA/AW DPSNN simulations when executed on either ARM- or Intel-based server
platforms.
Real-time cortical simulations: energy and interconnect scaling on distributed systems
We profile the impact of computation and inter-processor communication on the
energy consumption and on the scaling of cortical simulations approaching the
real-time regime on distributed computing platforms. Also, the speed and energy
consumption of processor architectures typical of standard HPC and embedded
platforms are compared. We demonstrate the importance of the design of
low-latency interconnect for speed and energy consumption. The cost of cortical
simulations is quantified using the Joule per synaptic event metric on both
architectures. Reaching efficient real-time performance in large-scale cortical
simulations is of increasing relevance both for future bio-inspired artificial
intelligence applications and for understanding the cognitive functions of the
brain, a scientific quest that will require embedding large-scale simulations into highly
complex virtual or real worlds. This work stands at the crossroads between the
WaveScalES experiment in the Human Brain Project (HBP), which includes the
objective of large scale thalamo-cortical simulations of brain states and their
transitions, and the ExaNeSt and EuroExa projects, which investigate the design
of an ARM-based, low-power High Performance Computing (HPC) architecture with a
dedicated interconnect scalable to millions of cores; simulation of deep sleep
Slow Wave Activity (SWA) and Asynchronous aWake (AW) regimes expressed by
thalamo-cortical models are among their benchmarks.
Comment: 8 pages, 8 figures, 4 tables, submitted after final publication on
PDP2019 proceedings, corrected final DOI. arXiv admin note: text overlap with
arXiv:1812.04974, arXiv:1804.0344
Gaussian and exponential lateral connectivity on distributed spiking neural network simulation
We measured the impact of long-range exponentially decaying intra-areal
lateral connectivity on the scaling and memory occupation of a distributed
spiking neural network simulator compared to that of short-range Gaussian
decays. While previous studies adopted short-range connectivity, recent
experimental neuroscience studies point to the role of longer-range
intra-areal connectivity, with implications for neural simulation platforms.
Two-dimensional grids of cortical columns composed of up to 11 M point-like
spiking neurons with spike-frequency adaptation were connected by up to 30 G
synapses using short- and long-range connectivity models. The MPI processes
composing the distributed simulator were run on up to 1024 hardware cores,
hosted on a 64-node server platform. The hardware platform was a cluster of
IBM NX360 M5 16-core compute nodes, each one containing two Intel Xeon Haswell
8-core E5-2630 v3 processors, with a clock of 2.40 GHz, interconnected through
an InfiniBand network, equipped with 4x QDR switches.
Comment: 9 pages, 9 figures, added reference to final peer reviewed version on
conference paper and DOI
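To illustrate why exponentially decaying connectivity reaches further than Gaussian connectivity, the sketch below compares two commonly used kernel shapes. The abstract does not give the functional forms or parameters, so the formulas, function names, and the unit decay constants here are assumptions chosen for illustration; distances are in arbitrary grid units and the kernels are unnormalized.

```python
import math


def exponential_kernel(distance, decay_length):
    """Relative connection weight decaying as exp(-d / lambda) (assumed form)."""
    return math.exp(-distance / decay_length)


def gaussian_kernel(distance, sigma):
    """Relative connection weight decaying as exp(-d^2 / (2 sigma^2)) (assumed form)."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))


# Compare the two kernels at increasing distances (hypothetical unit scales).
for d in range(6):
    exp_w = exponential_kernel(d, decay_length=1.0)
    gauss_w = gaussian_kernel(d, sigma=1.0)
    print(f"d={d}  exponential={exp_w:.6f}  gaussian={gauss_w:.6f}")
```

With equal scale parameters, the Gaussian weight falls off much faster at large distances, so an exponential kernel leaves many more long-range connections to store and communicate.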
Closing the loop between neural network simulators and the OpenAI Gym
Since the enormous breakthroughs in machine learning over the last decade,
functional neural network models are of growing interest for many researchers
in the field of computational neuroscience. One major branch of research is
concerned with biologically plausible implementations of reinforcement
learning, with a variety of different models developed over the recent years.
However, most studies in this area are conducted with custom simulation scripts
and manually implemented tasks. This makes it hard for other researchers to
reproduce and build upon previous work and nearly impossible to compare the
performance of different learning architectures. In this work, we present a
novel approach to solve this problem, connecting benchmark tools from the field
of machine learning and state-of-the-art neural network simulators from
computational neuroscience. This toolchain enables researchers in both fields
to make use of well-tested high-performance simulation software supporting
biologically plausible neuron, synapse and network models and allows them to
evaluate and compare their approach on the basis of standardized environments
of varying complexity. We demonstrate the functionality of the toolchain by
implementing a neuronal actor-critic architecture for reinforcement learning in
the NEST simulator and successfully training it on two different environments
from the OpenAI Gym.
Computational Modeling of Biological Neural Networks on GPUs: Strategies and Performance
Simulating biological neural networks is an important task for computational neuroscientists attempting to model and analyze brain activity and function. As these networks become larger and more complex, the computational power required grows significantly, often requiring the use of supercomputers or compute clusters. An emerging low-cost, highly accessible alternative to many of these resources is the Graphics Processing Unit (GPU), specialized massively-parallel graphics hardware that has seen increasing use as a general purpose computational accelerator, thanks largely to NVIDIA's CUDA programming interface. We evaluated the relative benefits and limitations of GPU-based tools for large-scale neural network simulation and analysis, first by developing an agent-inspired spiking neural network simulator, then by adapting a neural signal decoding algorithm. Under certain network configurations, the simulator was able to outperform an equivalent MPI-based parallel implementation run on a dedicated compute cluster, while the decoding algorithm implementation consistently outperformed its serial counterpart. Additionally, the GPU-based simulator was able to readily visualize network spiking activity in real-time due to the close integration with standard computer graphics APIs. The GPU was shown to provide significant performance benefits under certain circumstances while lagging behind in others. Given the complex nature of these research tasks, a hybrid strategy that combines GPU- and CPU-based approaches provides greater performance than either separately.
Benchmarking a many-core neuromorphic platform with an MPI-based DNA sequence matching algorithm
SpiNNaker is a neuromorphic globally asynchronous locally synchronous (GALS) multi-core architecture designed for simulating a spiking neural network (SNN) in real-time. Several studies have shown that neuromorphic platforms allow flexible and efficient simulations of SNN by exploiting the efficient communication infrastructure optimised for transmitting small packets across the many cores of the platform. However, the effectiveness of neuromorphic platforms in executing massively parallel general-purpose algorithms, while promising, is still to be explored. In this paper, we present a parallel DNA sequence matching algorithm implemented using the MPI programming paradigm and ported to the SpiNNaker platform. In our implementation, all cores available in the board are configured for executing in parallel an optimised version of the Boyer-Moore (BM) algorithm. Exploiting this application, we benchmarked the SpiNNaker platform in terms of scalability and synchronisation latency. Experimental results indicate that the SpiNNaker parallel architecture allows a linear performance increase with the number of used cores and shows better scalability compared to a general-purpose multi-core computing platform.