    Usage and Scaling of an Open-Source Spiking Multi-Area Model of Monkey Cortex

    We are entering an age of "big" computational neuroscience, in which neural network models are increasing in size and in the number of underlying data sets. Consolidating the zoo of models into large-scale models simultaneously consistent with a wide range of data is only possible through the effort of large teams, which can be spread across multiple research institutions. To ensure that computational neuroscientists can build on each other's work, it is important to make models publicly available as well-documented code. This chapter describes such an open-source model, which relates the connectivity structure of all vision-related cortical areas of the macaque monkey with their resting-state dynamics. We give a brief overview of how to use the executable model specification, which employs NEST as the simulation engine, and show its runtime scaling. The solutions found serve as an example for organizing the workflow of future models from the raw experimental data to the visualization of the results, expose the challenges, and give guidance for the construction of ICT infrastructure for neuroscience.

    Routing brain traffic through the von Neumann bottleneck: Efficient cache usage in spiking neural network simulation code on general purpose computers

    Simulation is a third pillar next to experiment and theory in the study of complex dynamic systems such as biological neural networks. Contemporary brain-scale networks correspond to directed graphs of a few million nodes, each with an in-degree and out-degree of several thousand edges, where nodes and edges correspond to the fundamental biological units, neurons and synapses, respectively. When considering a random graph, each node's edges are distributed across thousands of parallel processes. The activity in neuronal networks is also sparse. Each neuron occasionally transmits a brief signal, called a spike, via its outgoing synapses to the corresponding target neurons. This spatial and temporal sparsity represents an inherent bottleneck for simulations on conventional computers: Fundamentally irregular memory-access patterns cause poor cache utilization. Using an established neuronal network simulation code as a reference implementation, we investigate how common techniques to recover cache performance, such as software-induced prefetching and software pipelining, can benefit a real-world application. The algorithmic changes reduce simulation time by up to 50%. The study exemplifies that many-core systems tasked with an intrinsically parallel computational problem can overcome the von Neumann bottleneck of conventional computer architectures.
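
The spatial sparsity described in this abstract can be illustrated with a small calculation. The sketch below estimates how the outgoing synapses of a single neuron spread over the processes of a distributed simulation. It assumes, purely for illustration, that each target is placed on one of the processes independently and uniformly at random. The function name `targets_per_process` and the example numbers are hypothetical and are not taken from the study.

```python
def targets_per_process(out_degree, num_processes):
    """Spread of one neuron's outgoing synapses over processes.

    Assumption (not from the study): every target is assigned to a
    process independently and uniformly at random.
    Returns (expected targets per process,
             expected fraction of processes holding at least one target).
    """
    mean_targets = out_degree / num_processes
    # Probability that a given process receives none of the targets.
    p_empty = (1.0 - 1.0 / num_processes) ** out_degree
    return mean_targets, 1.0 - p_empty


if __name__ == "__main__":
    out_degree = 10_000  # hypothetical out-degree of "several thousand" edges
    for num_processes in (10, 1_000, 100_000):
        mean, reached = targets_per_process(out_degree, num_processes)
        print(f"{num_processes:>7} processes: "
              f"{mean:10.3f} targets per process, "
              f"{reached:.4f} of processes reached")
```

Under this assumption, the number of targets a spike finds on any one process shrinks as the number of processes grows, so each incoming spike triggers only a few scattered memory accesses. This is one way to picture the irregular access patterns the abstract refers to.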

    Routing Brain Traffic Through the Von Neumann Bottleneck: Parallel Sorting and Refactoring

    Generic simulation code for spiking neuronal networks spends the major part of the time in the phase where spikes have arrived at a compute node and need to be delivered to their target neurons. These spikes were emitted over the last interval between communication steps by source neurons distributed across many compute nodes and are inherently irregular and unsorted with respect to their targets. For finding those targets, the spikes need to be dispatched to a three-dimensional data structure with decisions on target thread and synapse type to be made on the way. With growing network size, a compute node receives spikes from an increasing number of different source neurons until in the limit each synapse on the compute node has a unique source. Here, we show analytically how this sparsity emerges over the practically relevant range of network sizes from a hundred thousand to a billion neurons. By profiling a production code, we investigate opportunities for algorithmic changes to avoid indirections and branching. Every thread hosts an equal share of the neurons on a compute node. In the original algorithm, all threads search through all spikes to pick out the relevant ones. With increasing network size, the fraction of hits remains invariant but the absolute number of rejections grows. Our new alternative algorithm equally divides the spikes among the threads and immediately sorts them in parallel according to target thread and synapse type. After this, every thread completes delivery solely of the section of spikes for its own neurons. Independent of the number of threads, all spikes are looked at only two times. The new algorithm halves the number of instructions in spike delivery, which leads to a reduction of simulation time of up to 40 %. Thus, spike delivery is a fully parallelizable process with a single synchronization point and thereby well suited for many-core systems.
Our analysis indicates that further progress requires a reduction of the latency that the instructions experience in accessing memory. The study provides the foundation for the exploration of methods of latency hiding like software pipelining and software-induced prefetching.
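
The difference between the two delivery strategies described in this abstract can be sketched in a few lines of Python. This is a sequential emulation under stated assumptions, not the production code: the spike record `(target_thread, synapse_type, target_neuron)`, the function names, and the `looks` counter are all hypothetical, and the threads are emulated by ordinary loops.

```python
import random
from collections import defaultdict

# Hypothetical spike record: (target_thread, synapse_type, target_neuron).


def deliver_by_search(spikes, num_threads):
    """Original scheme: every thread scans all spikes and keeps its own."""
    delivered = {t: [] for t in range(num_threads)}
    looks = 0
    for thread in range(num_threads):  # threads would run concurrently
        for spike in spikes:
            looks += 1
            if spike[0] == thread:
                delivered[thread].append(spike)
    return delivered, looks


def deliver_by_sorting(spikes, num_threads):
    """Alternative scheme: bin an equal share per thread, then deliver."""
    looks = 0
    chunk = -(-len(spikes) // num_threads)  # ceiling division
    buckets = defaultdict(list)
    # Pass 1: each emulated thread takes an equal share of the spikes and
    # bins them by (target thread, synapse type).
    for worker in range(num_threads):
        for spike in spikes[worker * chunk:(worker + 1) * chunk]:
            looks += 1
            buckets[(spike[0], spike[1])].append(spike)
    # A parallel implementation would synchronize once here.
    # Pass 2: visiting the keys in sorted order walks through one target
    # thread's section at a time, ordered by synapse type within it.
    delivered = {t: [] for t in range(num_threads)}
    for key in sorted(buckets):
        for spike in buckets[key]:
            looks += 1
            delivered[key[0]].append(spike)
    return delivered, looks


if __name__ == "__main__":
    rng = random.Random(0)
    threads = 8
    spikes = [(rng.randrange(threads), rng.randrange(3), rng.randrange(10_000))
              for _ in range(1_000)]
    _, looks_search = deliver_by_search(spikes, threads)
    _, looks_sort = deliver_by_sorting(spikes, threads)
    print("looks with search: ", looks_search)
    print("looks with sorting:", looks_sort)
```

In this emulation the search variant inspects every spike once per thread, while the sorting variant inspects every spike twice regardless of the thread count, which mirrors the claim in the abstract. A real multi-threaded version would need per-thread bins or another way to avoid concurrent writes to shared buckets.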

    Meeting the performance challenges of spiking network simulations on general purpose computers

    Today’s extremely scalable simulation technology for spiking neuronal networks enables the representation of models of more than a billion neurons and their connections using the entire K computer. However, the runtimes of the largest possible simulations carried out so far were too long to allow for observations of the network dynamics over long periods of time, and small- to medium-scale simulations also typically run far slower than real time. The performance challenges for spiking neuronal network simulators such as NEST on general purpose computers arise from the inherently sparse but broad connectivity between neurons and from the unpredictable neuronal spiking activity. In distributed simulations of spiking networks, this requires frequent communication of spike data and, on each compute node, routing of the incoming spikes to the local targets. This entails irregular memory access and hence constitutes a major performance bottleneck, which is a problem that I will address in my talk. I will present recent developments in simulation technology that aim at meeting such performance challenges.

    Multi-area spiking network model of human visual cortex

    Understanding and unraveling the wiring of the brain at the micro-, meso- and macroscale level and its influence on neuronal activity is a fundamental problem in neuroscience. While cortical network structure has been extensively studied at the level of local circuits and in terms of long-range connectivity, a spiking network model that integrates different scales, from single cells to global networks, has seldom been investigated. In this study we aim to simulate the vision-related areas of human cortex in a multi-area spiking network model. The model represents each area as a full-scale, 1 mm² microcircuit [1] with area-specific architecture and takes layer- and population-resolved network connectivity into account. The procedure is based on the previously developed workflow for a multi-scale spiking network model of macaque visual cortex [2] using the NEST simulator. In this work, we will build three models based on different experimental datasets. In a first model, which simulates the whole brain, we use neuron densities and cortical thicknesses from von Economo and Koskinas [3] and DTI data from the Human Connectome Project complemented by statistical predictions of the connectivity based on cytoarchitecture [4]. In a second model, we aim to use the Brainnetome parcellation and DTI data [5] and, as in the third model, limit ourselves to the vision-related areas of the cortex. The third model will use a detailed characterization of cortical cytoarchitecture on the level of areas and layers in the human visual cortex measured by Timo Dickscheid and coworkers at INM1 and connectivity predictions by the group of Claus Hilgetag based on these data along with inter-area distances.
We anticipate that reaching a plausible ground state of activity will require calibrating and stabilizing the model with mean-field theory [6]. In our study, we aim to elucidate how detailed connectivity of cortex shapes its dynamics on multiple scales and how prominent features of cortical activity, such as population-specific spike rates, levels of asynchrony and irregularity, resting-state functional connectivity and intrinsic time-scales, can be explained by population-level connectivity. Furthermore, this work will provide a platform for future studies of cortical functions.