238 research outputs found
Enabling Massive Deep Neural Networks with the GraphBLAS
Deep Neural Networks (DNNs) have emerged as a core tool for machine learning.
The computations performed during DNN training and inference are dominated by
operations on the weight matrices describing the DNN. As DNNs incorporate more
stages and more nodes per stage, these weight matrices may be required to be
sparse because of memory limitations. The GraphBLAS.org math library standard
was developed to provide high performance manipulation of sparse weight
matrices and input/output vectors. For sufficiently sparse matrices, a sparse
matrix library requires significantly less memory than the corresponding dense
matrix implementation. This paper provides a brief description of the
mathematics underlying the GraphBLAS. In addition, the equations of a typical
DNN are rewritten in a form designed to use the GraphBLAS. An implementation of
the DNN is given using a preliminary GraphBLAS C library. The performance of
the GraphBLAS implementation is measured relative to a standard dense linear
algebra library implementation. For various sizes of DNN weight matrices, it is
shown that the GraphBLAS sparse implementation outperforms a BLAS dense
implementation as the weight matrix becomes sparser.Comment: 10 pages, 7 figures, to appear in the 2017 IEEE High Performance
Extreme Computing (HPEC) conferenc
Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems
Increase in complexity of neuronal network models escalated the efforts to make NEURON simulation environment efficient. The computational neuroscientists divided the equations into subnets amongst multiple processors for achieving better hardware performance. On parallel machines for neuronal networks, interprocessor spikes exchange consumes large section of overall simulation time. In NEURON for communication between processors Message Passing Interface (MPI) is used. MPI_Allgather collective is exercised for spikes exchange after each interval across distributed memory systems. The increase in number of processors though results in achieving concurrency and better performance but it inversely affects MPI_Allgather which increases communication time between processors. This necessitates improving communication methodology to decrease the spikes exchange time over distributed memory systems. This work has improved MPI_Allgather method using Remote Memory Access (RMA) by moving two-sided communication to one-sided communication, and use of recursive doubling mechanism facilitates achieving efficient communication between the processors in precise steps. This approach enhanced communication concurrency and has improved overall runtime making NEURON more efficient for simulation of large neuronal network models
Parallel and pseudorandom discrete event system specification vs. networks of spiking neurons: Formalization and preliminary implementation results
International audienceUsual Parallel Discrete Event System Specification (P-DEVS) allows specifying systems from modeling to simulation. However, the framework does not incorporate parallel and stochastic simulations. This work intends to extend P-DEVS to parallel simulations and pseudorandom number generators in the context of a spiking neural network. The discrete event specification presented here makes explicit and centralized the parallel computation of events as well as their routing, making further implementations more easy. It is then expected to dispose of a well defined mathematical and computational framework to deal with networks of spiking neurons
An efficient automated parameter tuning framework for spiking neural networks
As the desire for biologically realistic spiking neural networks (SNNs) increases, tuning the enormous number of open parameters in these models becomes a difficult challenge. SNNs have been used to successfully model complex neural circuits that explore various neural phenomena such as neural plasticity, vision systems, auditory systems, neural oscillations, and many other important topics of neural function. Additionally, SNNs are particularly well-adapted to run on neuromorphic hardware that will support biological brain-scale architectures. Although the inclusion of realistic plasticity equations, neural dynamics, and recurrent topologies has increased the descriptive power of SNNs, it has also made the task of tuning these biologically realistic SNNs difficult. To meet this challenge, we present an automated parameter tuning framework capable of tuning SNNs quickly and efficiently using evolutionary algorithms (EA) and inexpensive, readily accessible graphics processing units (GPUs). A sample SNN with 4104 neurons was tuned to give V1 simple cell-like tuning curve responses and produce self-organizing receptive fields (SORFs) when presented with a random sequence of counterphase sinusoidal grating stimuli. A performance analysis comparing the GPU-accelerated implementation to a single-threaded central processing unit (CPU) implementation was carried out and showed a speedup of 65× of the GPU implementation over the CPU implementation, or 0.35 h per generation for GPU vs. 23.5 h per generation for CPU. Additionally, the parameter value solutions found in the tuned SNN were studied and found to be stable and repeatable. The automated parameter tuning framework presented here will be of use to both the computational neuroscience and neuromorphic engineering communities, making the process of constructing and tuning large-scale SNNs much quicker and easier
Introducing numerical bounds to improve event-based neural network simulation
Although the spike-trains in neural networks are mainly constrained by the
neural dynamics itself, global temporal constraints (refractoriness, time
precision, propagation delays, ..) are also to be taken into account. These
constraints are revisited in this paper in order to use them in event-based
simulation paradigms.
We first review these constraints, and discuss their consequences at the
simulation level, showing how event-based simulation of time-constrained
networks can be simplified in this context: the underlying data-structures are
strongly simplified, while event-based and clock-based mechanisms can be easily
mixed. These ideas are applied to punctual conductance-based generalized
integrate-and-fire neural networks simulation, while spike-response model
simulations are also revisited within this framework.
As an outcome, a fast minimal complementary alternative with respect to
existing simulation event-based methods, with the possibility to simulate
interesting neuron models is implemented and experimented.Comment: submitte
GPUs outperform current HPC and neuromorphic solutions in terms of speed and energy when simulating a highly-connected cortical model
While neuromorphic systems may be the ultimate platform for deploying spiking neural networks (SNNs), their distributed nature and optimisation for specific types of models makes them unwieldy tools for developing them. Instead, SNN models tend to be developed and simulated on computers or clusters of computers with standard von Neumann CPU architectures. Over the last decade, as well as becoming a common fixture in many workstations, NVIDIA GPU accelerators have entered the High Performance Computing field and are now used in 50% of the Top 10 super computing sites worldwide. In this paper we use our GeNN code generator to re-implement two neo-cortex-inspired, circuit-scale, point neuron network models on GPU hardware. We verify the correctness of our GPU simulations against prior results obtained with NEST running on traditional HPC hardware and compare the performance with respect to speed and energy consumption against published data from CPU-based HPC and neuromorphic hardware. A full-scale model of a cortical column can be simulated at speeds approaching 0.5× real-time using a single NVIDIA Tesla V100 accelerator – faster than is currently possible using a CPU based cluster or the SpiNNaker neuromorphic system. In addition, we find that, across a range of GPU systems, the energy to solution as well as the energy per synaptic event of the microcircuit simulation is as much as 14× lower than either on SpiNNaker or in CPU-based simulations. Besides performance in terms of speed and energy consumption of the simulation, efficient initialisation of models is also a crucial concern, particularly in a research context where repeated runs and parameter-space exploration are required. Therefore, we also introduce in this paper some of the novel parallel initialisation methods implemented in the latest version of GeNN and demonstrate how they can enable further speed and energy advantages
Parallel computing for brain simulation
[Abstract] Background: The human brain is the most complex system in the known universe, it is therefore one of the greatest mysteries. It provides human beings with extraordinary abilities. However, until now it has not been understood yet how and why most of these abilities are produced.
Aims: For decades, researchers have been trying to make computers reproduce these abilities, focusing on both understanding the nervous system and, on processing data in a more efficient way than before. Their aim is to make computers process information similarly to the brain. Important technological developments and vast multidisciplinary projects have allowed creating the first simulation with a number of neurons similar to that of a human brain.
Conclusion: This paper presents an up-to-date review about the main research projects that are trying to simulate and/or emulate the human brain. They employ different types of computational models using parallel computing: digital models, analog models and hybrid models. This review includes the current applications of these works, as well as future trends. It is focused on various works that look for advanced progress in Neuroscience and still others which seek new discoveries in Computer Science (neuromorphic hardware, machine learning techniques). Their most outstanding characteristics are summarized and the latest advances and future plans are presented. In addition, this review points out the importance of considering not only neurons: Computational models of the brain should also include glial cells, given the proven importance of astrocytes in information processing.Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; GRC2014/049Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; R2014/039Instituto de Salud Carlos III; PI13/0028
- …