7,958 research outputs found

    The Brain on Low Power Architectures - Efficient Simulation of Cortical Slow Waves and Asynchronous States

    Efficient brain simulation is a scientific grand challenge, a parallel/distributed coding challenge and a source of requirements and suggestions for future computing architectures. Indeed, the human brain includes about 10^15 synapses and 10^11 neurons activated at a mean rate of several Hz. Full brain simulation poses Exascale challenges even if simulated at the highest abstraction level. The WaveScalES experiment in the Human Brain Project (HBP) has the goal of matching experimental measures and simulations of slow waves during deep sleep and anesthesia and the transition to other brain states. The focus is the development of dedicated large-scale parallel/distributed simulation technologies. The ExaNeSt project designs an ARM-based, low-power HPC architecture scalable to millions of cores, developing a dedicated scalable interconnect system, and SWA/AW simulations are included among the driving benchmarks. At the junction of the two projects is the INFN proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation engine. DPSNN can be configured to stress either the networking or the computation features available on the execution platforms. The simulation stresses the networking component when the neural net, composed of a relatively small number of neurons, each one projecting thousands of synapses, is distributed over a large number of hardware cores. As the number of neurons per core grows, computation becomes the dominant component for short-range connections. This paper reports preliminary performance results obtained on an ARM-based HPC prototype developed in the framework of the ExaNeSt project. Furthermore, it compares the instantaneous power, total energy consumption, execution time and energetic cost per synaptic event of SWA/AW DPSNN simulations when executed on either ARM- or Intel-based server platforms.
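The "energetic cost per synaptic event" named in the abstract above can be read as total energy divided by the number of synaptic events processed. The following is a minimal sketch under that assumption; the function name and the input figures are hypothetical placeholders, not measurements reported by the paper.

```python
def energy_per_synaptic_event(mean_power_w, exec_time_s, synaptic_events):
    """Return the energy cost in joules per synaptic event.

    Assumes total energy = mean power * execution time, which holds
    only when mean_power_w is the time-averaged power over the run.
    """
    if synaptic_events <= 0:
        raise ValueError("synaptic_events must be positive")
    total_energy_j = mean_power_w * exec_time_s
    return total_energy_j / synaptic_events


# Hypothetical placeholder inputs, chosen only to illustrate the arithmetic.
cost = energy_per_synaptic_event(
    mean_power_w=50.0, exec_time_s=100.0, synaptic_events=2.5e9
)
print(f"{cost * 1e6:.2f} uJ per synaptic event")
```

Comparing two platforms on this metric, rather than on power or run time alone, accounts for a slower platform that draws less power.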

    Hardware and software optimization of Fourier transform infrared spectrometry on hybrid-FPGAs

    With the increasing complexity of today’s spacecraft, there is a concern that the on-board flight computer may be overburdened with various processing tasks. Currently available processors used by NASA are struggling to meet the requirements of scientific experiments [1, 2]. A new computational platform will soon be needed to contend with the increasing demands of future space missions. Recently developed hybrid field-programmable gate arrays (FPGA) offer the versatility of running diverse software applications on embedded processors while at the same time taking advantage of reconfigurable hardware resources, all on the same chip package. These tightly coupled HW/SW systems consume less power than general-purpose single-board computers (SBC) and promise breakthrough performance previously impossible with traditional processors and reconfigurable devices. This thesis takes an existing floating-point intensive data processing algorithm, used for on-board spacecraft Fourier transform infrared (FTIR) spectrometry, ports it to the embedded PowerPC 405 (PPC405) processor, and evaluates system performance after applying different hardware and software optimizations and architectural configurations of the hybrid-FPGA. The hardware optimizations include Xilinx’s floating-point unit (FPU) for efficient single-precision floating-point calculations and a dedicated single-precision dot-product co-processor assembled from basic floating-point operator cores. The software optimizations include utilizing a non-ANSI single-precision math library as well as IBM’s PowerPC performance libraries recompiled for double-precision arithmetic only. The outcome of this thesis is a fully functional, optimized FTIR spectrometry algorithm implemented on a hybrid-FPGA. The computational and power performance of this system is evaluated and compared to a general-purpose SBC currently used for spacecraft data processing. Suggestions for future work, including a dual-processor concept, are given.

    Energy-Efficiency Evaluation of FPGAs for Floating-Point Intensive Workloads

    In this work we describe a method to measure the computing performance and energy-efficiency to be expected of an FPGA device. The motivation for this work is the possible use of FPGAs as accelerators for floating-point intensive HPC workloads. In the past, FPGA devices were not considered an efficient option for floating-point intensive computations, but more recently, with the advent of dedicated DSP units and the increased amount of resources in each chip, interest in these devices has risen. Another obstacle to wide adoption of FPGAs in the HPC field has been the low-level hardware knowledge commonly required to program them using Hardware Description Languages (HDLs). This issue has also been recently mitigated by the introduction of higher-level programming frameworks that adopt so-called High Level Synthesis approaches, reducing development time and narrowing the gap between the skills required to program FPGAs and the skills commonly held by HPC software developers. In this work we apply the proposed method to estimate the maximum floating-point performance and energy-efficiency of the FPGA embedded in a Xilinx Zynq Ultrascale+ MPSoC hosted on a Trenz board.

    A versatile trigger and synchronization module with IEEE1588 capabilities and EPICS support.

    Event timing and synchronization are two key aspects to improve in the implementation of distributed data acquisition (dDAQ) systems such as the ones used in fusion experiments. The integration of dDAQ in control and measurement networks is also of great importance. This paper analyzes the applicability of the IEEE1588 and EPICS standards to solve these problems, and presents a hardware module implementation based on both of them that allows adding these functionalities to any DAQ. The IEEE1588 standard facilitates the integration of event timing and synchronization mechanisms in distributed data acquisition systems based on IEEE 802.3 (Ethernet). An optimal implementation of such a system requires network interface devices that include specific hardware resources devoted to the IEEE1588 functionalities. Unfortunately, this is not the approach followed in most of the large number of applications available nowadays: most solutions are based on software and use standard hardware network interfaces. This paper presents the development of a hardware module (GI2E) with IEEE1588 capabilities which includes USB, RS232, RS485 and CAN interfaces. This makes it possible to integrate any DAQ element that uses these interfaces into dDAQ systems in an efficient and simple way. The module has been developed with Motorola's ColdFire MCF5234 processor and National Semiconductor's PHY DP83640T, which makes it possible to implement the PTP protocol of IEEE1588 in hardware and therefore increases its performance over software-based implementations. To facilitate the integration of the dDAQ system in control and measurement networks, the module includes a basic Input/Output Controller (IOC) functionality of the Experimental Physics and Industrial Control System (EPICS) architecture. The paper discusses the implementation details of this module and presents its use in advanced dDAQ applications in the fusion community.
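For context on why hardware timestamping matters here, the IEEE1588 delay request-response mechanism derives the clock offset from four timestamps, so any jitter in capturing them feeds directly into the result. The following is a minimal sketch of that standard calculation, assuming a symmetric network path; the function name and the example timestamps are hypothetical, and this is not the GI2E module's implementation.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay.

    t1: Sync sent, read on the master clock
    t2: Sync received, read on the slave clock
    t3: Delay_Req sent, read on the slave clock
    t4: Delay_Req received, read on the master clock
    Assumes the path delay is the same in both directions.
    """
    master_to_slave = t2 - t1  # path delay + offset
    slave_to_master = t4 - t3  # path delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Hypothetical timestamps in nanoseconds: the slave clock runs 150 ns
# ahead of the master and the one-way path delay is 500 ns.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1650, t3=2000, t4=2350)
print(offset_ns, delay_ns)  # 150.0 500.0
```

A slave would subtract the estimated offset from its clock. If the path is asymmetric, half of the asymmetry appears as an offset error that this calculation cannot detect.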