7,261 research outputs found

    From FPGA to ASIC: A RISC-V processor experience

    Get PDF
    This work document a correct design flow using these tools in the Lagarto RISC- V Processor and the RTL design considerations that must be taken into account, to move from a design for FPGA to design for ASIC

    BrainFrame: A node-level heterogeneous accelerator platform for neuron simulations

    Full text link
    Objective: The advent of High-Performance Computing (HPC) in recent years has led to its increasing use in brain study through computational models. The scale and complexity of such models are constantly increasing, leading to challenging computational requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit for a single acceleration (or homogeneous) platform to effectively address the complete array of modeling requirements. Approach: In this paper we propose and build BrainFrame, a heterogeneous acceleration platform, incorporating three distinct acceleration technologies, a Dataflow Engine, a Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform. As a challenging proof of concept, we analyze the performance of BrainFrame on different instances of a state-of-the-art neuron model, modeling the Inferior- Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley representation. The model instances take into account not only the neuronal- network dimensions but also different network-connectivity circumstances that can drastically change application workload characteristics. Main results: The synthetic approach of three HPC technologies demonstrated that BrainFrame is better able to cope with the modeling diversity encountered. Our performance analysis shows clearly that the model directly affect performance and all three technologies are required to cope with all the model use cases.Comment: 16 pages, 18 figures, 5 table

    Minimalistic SDHC-SPI hardware reader module for boot loader applications

    Get PDF
    This paper introduces a low-footprint full hardware boot loading solution for FPGA-based Programmable Systems on Chip. The proposed module allows loading the system code and data from a standard SD card without having to re-program the whole embedded system. The hardware boot loader is processor independent and removes the need of a software boot loader and the related memory resources. The hardware overhead introduced is manageable, even in low-range FPGA chips, and negligible in mid- and high-range devices. The implementation of the SD card reader module is explained in detail and an example of a multi-boot loader is offered as well. The multi-boot loader is implemented and tested with the Xilinx's Picoblaze microcontroller

    PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations

    Get PDF
    We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter rely on the hardwired pipeline processor specialized to gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions but also other forms of interactions such as van der Waals force, hydrodynamical interactions in SPH calculation and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline fitted into a single FPGA chip, which operated at 16 MHz clock. Thus, for gravitational interaction, PROGRAPE-1 provided the speed of 0.96 Gflops-equivalent. PROGRAPE will prove to be useful for wide-range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.Comment: 20 pages with 9 figures; submitted to PAS

    Memory and information processing in neuromorphic systems

    Full text link
    A striking difference between brain-inspired neuromorphic processors and current von Neumann processors architectures is the way in which memory and processing is organized. As Information and Communication Technologies continue to address the need for increased computational power through the increase of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones and from purely digital systems to mixed analog/digital systems which implement more biological-like models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems.Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and system

    FPGA-Based Tracklet Approach to Level-1 Track Finding at CMS for the HL-LHC

    Full text link
    During the High Luminosity LHC, the CMS detector will need charged particle tracking at the hardware trigger level to maintain a manageable trigger rate and achieve its physics goals. The tracklet approach is a track-finding algorithm based on a road-search algorithm that has been implemented on commercially available FPGA technology. The tracklet algorithm has achieved high performance in track-finding and completes tracking within 3.4 ÎĽ\mus on a Xilinx Virtex-7 FPGA. An overview of the algorithm and its implementation on an FPGA is given, results are shown from a demonstrator test stand and system performance studies are presented.Comment: Submitted to proceedings of Connecting The Dots/Intelligent Trackers 2017, Orsay, Franc

    SpinLink: An interconnection system for the SpiNNaker biologically inspired multi-computer

    No full text
    SpiNNaker is a large-scale biologically-inspired multi-computer designed to model very heavily distributed problems, with the flagship application being the simulation of large neural networks. The project goal is to have one million processors included in a single machine, which consequently span many thousands of circuit boards. A computer of this scale imposes large communication requirements between these boards, and requires an extensible method of connecting to external equipment such as sensors, actuators and visualisation systems. This paper describes two systems that can address each of these problems.Firstly, SpinLink is a proposed method of connecting the SpiNNaker boards by using time-division multiplexing (TDM) to allow eight SpiNNaker links to run at maximum bandwidth between two boards. SpinLink will be deployed on Spartan-6 FPGAs and uses a locally generated clock that can be paused while the asynchronous links from SpiNNaker are sending data, thus ensuring a fast and glitch-free response. Secondly, SpiNNterceptor is a separate system, currently in the early stages of design, that will build upon SpinLink to address the important external I/O issues faced by SpiNNaker. Specifically, spare resources in the FPGAs will be used to implement the debugging and I/O interfacing features of SpiNNterceptor

    GRAPE-6: The massively-parallel special-purpose computer for astrophysical particle simulation

    Full text link
    In this paper, we describe the architecture and performance of the GRAPE-6 system, a massively-parallel special-purpose computer for astrophysical NN-body simulations. GRAPE-6 is the successor of GRAPE-4, which was completed in 1995 and achieved the theoretical peak speed of 1.08 Tflops. As was the case with GRAPE-4, the primary application of GRAPE-6 is simulation of collisional systems, though it can be used for collisionless systems. The main differences between GRAPE-4 and GRAPE-6 are (a) The processor chip of GRAPE-6 integrates 6 force-calculation pipelines, compared to one pipeline of GRAPE-4 (which needed 3 clock cycles to calculate one interaction), (b) the clock speed is increased from 32 to 90 MHz, and (c) the total number of processor chips is increased from 1728 to 2048. These improvements resulted in the peak speed of 64 Tflops. We also discuss the design of the successor of GRAPE-6.Comment: Accepted for publication in PASJ, scheduled to appear in Vol. 55, No.
    • …
    corecore