11 research outputs found

    BrainFrame: From FPGA to heterogeneous acceleration of brain simulations

    Among the various methods in neuroscience for understanding brain function, in-silico simulations have been gaining popularity. Advances in neuroscience and engineering led to the creation of mathematical models of networks that do not simply mimic biological behaviour in an abstract fashion but emulate it in significant detail, even to the level of its biophysical properties. One such example is the Spiking Neural Network (SNN), which can model a variety of additional behavioural features, like encoding data and adapting according to a spike train's amplitude, frequency and precise pattern of arrival of spiking events on a neuron. As a result, SNNs have higher explanatory power than their predecessors, making brain simulations based on SNNs an attractive topic to explore. In-silico simulations of SNNs can benefit not only neuroscience research; breakthroughs can also potentially benefit medical, computing and A.I. research. SNNs, though, are computationally demanding workloads that traditional computing might not be able to cover. Thus, the use of High Performance Computing (HPC) platforms in this application domain becomes desirable. This dissertation explores the topic of HPC-based in-silico brain simulations. Initially, the effort focuses on custom hardware accelerators, specifically Field Programmable Gate Arrays (FPGAs), due to their potential to provide real-time performance alongside support for large-scale non-real-time experiments. The nature of FPGA-based accelerators provides specific benefits over other similar paradigms like Application Specific Integrated Circuit (ASIC) designs. Firstly, we explore the general characteristics of typical SNN model types to identify their computational requirements in relation to their explanatory strength. We also identify major design characteristics in model development that can directly affect a model's performance and behaviour when ported to an HPC platform.
Subsequently, a detailed literature review is made on FPGA-based SNN implementations. The HPC porting effort begins with the implementation of an extended Hodgkin-Huxley model of the inferior-olivary nucleus featuring advanced connectivity. The model is demanding and complex enough to act as a realistic benchmark for HPC implementations, while also being scientifically relevant in its own right. FPGA development shows promising performance results not only with custom designs but also with High-Level Synthesis (HLS) toolflows that significantly reduce development time. FPGAs have proven suitable for small-scale embedded-HPC uses as well. The various efforts, though, reveal a very specific weakness of FPGA development that has less to do with the silicon itself and more with its programming environment. The FPGA tools are largely inaccessible to non-experts, so any acceleration effort would require the engineer (and the FPGA development time) to be in the critical path of the research process. An important question to be answered is how the FPGA platform compares to other popular software-based HPC solutions such as GPU- and CPU-based platforms. A detailed comparison of the best FPGA implementation with GPU and manycore-CPU ports of the same benchmark is conducted. The comparison and evaluation show that, when it comes to real-time performance, FPGAs have a clear advantage. But for non-real-time, large-scale simulations, there is no single platform that can optimally support the complete range of experiments that could be conducted with the inferior-olive model. The comparison makes a clear case for BrainFrame, a platform that supports heterogeneous HPC substrates. This dissertation, thus, concludes with the proposal of the BrainFrame system. The proof-of-concept design supports standard and extended Hodgkin-Huxley models, such as the original inferior-olive model.
The system integrates a GPU-, CPU- and FPGA-based HPC back-end with a standard neuroscientific language front-end (PyNN). It can deliver best-in-class performance, alleviate some of the development hurdles and be far more user-friendly for the typical model developer. Additionally, the multi-node potential of the platform is being explored. BrainFrame provides both a powerful heterogeneous platform for acceleration and a front-end familiar to the neuroscientist.

    Exploring Complex Brain-Simulation Workloads on Multi-GPU Deployments

    GPU Implementation of Neural-Network Simulations Based on Adaptive-Exponential Models

    Detailed brain modeling has been presenting significant challenges to the world of high-performance computing (HPC), posing computational problems that can benefit from modern hardware-acceleration technologies. We explore the capacity of GPUs for simulating large-scale neuronal networks based on the Adaptive Exponential (AdEx) neuron model, which is widely used in the neuroscientific community. Our GPU-powered simulator acts as a benchmark to evaluate the strengths and limitations of modern GPUs, as well as to explore their scaling properties when simulating large neural networks. This work presents an optimized GPU implementation that outperforms a reference multicore implementation by 50x, while a dual-GPU configuration can deliver a speedup of 90x for networks of 20,000 fully interconnected AdEx neurons.
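For readers unfamiliar with the Adaptive Exponential model named in this abstract, the following is a minimal single-neuron sketch in plain Python. It is not the authors' GPU simulator: it uses forward-Euler integration, no synaptic coupling, and parameter values commonly quoted for the model in the literature, all of which are assumptions of this illustration. The function name `simulate_adex` and the numerical spike cutoff `V_PEAK` are hypothetical.

```python
import math

# Assumed parameters (commonly quoted AdEx defaults, not taken from the abstract).
C = 281.0        # membrane capacitance (pF)
G_L = 30.0       # leak conductance (nS)
E_L = -70.6      # leak reversal potential (mV)
V_T = -50.4      # exponential threshold (mV)
DELTA_T = 2.0    # slope factor (mV)
TAU_W = 144.0    # adaptation time constant (ms)
A = 4.0          # subthreshold adaptation (nS)
B = 80.5         # spike-triggered adaptation increment (pA)
V_RESET = -70.6  # reset potential (mV)
V_PEAK = 0.0     # numerical spike cutoff (mV), an assumption of this sketch


def simulate_adex(i_ext_pa, t_ms=200.0, dt_ms=0.05):
    """Integrate one AdEx neuron with forward Euler; return spike times in ms."""
    v, w = E_L, 0.0
    spikes = []
    steps = int(round(t_ms / dt_ms))
    for step in range(steps):
        # Exponential spike-initiation current (pA).
        exp_term = G_L * DELTA_T * math.exp((v - V_T) / DELTA_T)
        # Membrane equation: C dV/dt = -g_L (V - E_L) + exp_term - w + I.
        dv = (-G_L * (v - E_L) + exp_term - w + i_ext_pa) / C
        # Adaptation equation: tau_w dw/dt = a (V - E_L) - w.
        dw = (A * (v - E_L) - w) / TAU_W
        v += dt_ms * dv
        w += dt_ms * dw
        if v >= V_PEAK:
            # Spike: record the time, reset the voltage, bump the adaptation.
            spikes.append((step + 1) * dt_ms)
            v = V_RESET
            w += B
    return spikes
```

Calling `simulate_adex(1000.0)` drives the neuron with a constant 1000 pA current and returns the list of spike times. A network simulator would evaluate the same update for every neuron at each time step, which is the data-parallel structure that makes the workload a natural fit for GPUs.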

    A novel simulator for extended Hodgkin-Huxley neural networks

    Computational neuroscience aims to investigate and explain the behaviour and functions of neural structures through mathematical models. Due to the models' complexity, they can only be explored through computer simulation. Modern research in this field is increasingly adopting large networks of neurons and diverse, physiologically detailed neuron models based on the extended Hodgkin-Huxley (eHH) formalism. However, existing eHH simulators either support highly specific neuron models or provide low computational performance, making model exploration costly in time and effort. This work introduces a simulator for extended Hodgkin-Huxley neural networks on multiprocessing platforms. This simulator supports a broad range of neuron models while still providing high performance. Simulator performance is evaluated against varying neuron complexity parameters, network size and density, and thread-level parallelism. Results indicate that performance is within the range reported in existing literature for single-model eHH codes, and scales well for large CPU core counts. Ultimately, this application combines model flexibility with high performance, and can serve as a new tool in computational neuroscience.

    A Probabilistic Analysis of Resilient Reconfigurable Designs

    Reconfigurable hardware can be employed to tolerate permanent faults. Hardware components comprising a System-on-Chip can be partitioned into a handful of substitutable units interconnected with reconfigurable wires to allow isolation and replacement of faulty parts. This paper offers a probabilistic analysis of reconfigurable designs, estimating for different fault densities the average number of fault-free components that can be constructed as well as the probability of guaranteeing a particular availability of components. Considering the area overheads of reconfigurability, we evaluate the resilience of various reconfigurable designs with different granularities. Based on this analysis, we conduct a comprehensive design-space exploration to identify the granularity mixes that maximize the fault tolerance of a system. Our findings reveal that mixing fine-grain logic with a coarse-grain sparing approach tolerates up to 3x more permanent faults than component redundancy and 2x more than any other purely coarse-grain solution. Component redundancy is preferable at low fault densities, while coarse-grain and mixed-grain reconfigurability maximize availability at medium and high fault densities, respectively.
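The two quantities this abstract mentions (the average number of fault-free components and the probability of a given availability) can be illustrated for the simplest baseline only, plain component redundancy. The sketch below is not the paper's model: it assumes identical units, independent faults, and a Poisson fault distribution over area, and it ignores reconfiguration and its area overhead entirely. All function names are hypothetical.

```python
from math import comb, exp


def fault_free_probability(fault_density, area):
    """P(a unit of the given area has zero faults), assuming Poisson faults."""
    return exp(-fault_density * area)


def expected_fault_free(n_units, p_ok):
    """Average number of fault-free units among n independent identical units."""
    return n_units * p_ok


def availability(n_units, k_required, p_ok):
    """P(at least k_required of n_units are fault-free): a binomial tail sum."""
    return sum(
        comb(n_units, j) * p_ok**j * (1.0 - p_ok) ** (n_units - j)
        for j in range(k_required, n_units + 1)
    )
```

For example, `availability(4, 1, fault_free_probability(0.5, 1.0))` gives the chance that at least one of four redundant units survives at an assumed fault density of 0.5 faults per unit area. Sweeping `fault_density` in this way shows how availability degrades as faults become denser, which is the kind of curve a granularity comparison would be plotted against.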

    MCluster: A Software Framework for Portable Device-Based Volunteer Computing

    Recent market forecasts predict that the portable computing trend will spread vastly, as by 2020 there will be more than 3 billion LTE device users worldwide. Motivated by this fact, many companies and research institutes have already launched research projects that utilize portable devices, voluntarily provided by users, to perform the required computations. Many such projects employ Berkeley's BOINC middleware, since it can support a large variety of stationary and mobile devices. However, currently available BOINC high-level APIs either do not support portable devices or lack advanced processing capabilities (such as inter-node task dependencies) and/or ease of use. To resolve these issues, we propose the mCluster software framework for application execution powered by the BOINC middleware on portable devices. mCluster adopts a task-based programming model that requires simple, pragma-based annotations of the application software in order to dynamically resolve task dependencies. To evaluate our framework, we have mapped a scientific application from the neuroscience domain onto a small-scale network of portable devices. mCluster significantly reduces the programming effort and complexity required to efficiently map BOINC-powered applications with task dependencies onto portable devices compared to previous approaches.