From FPGA to ASIC: A RISC-V processor experience
This work documents a correct design flow using these tools on the Lagarto RISC-V processor, and the RTL design considerations that must be taken into account to move from a design for FPGA to a design for ASIC.
BrainFrame: A node-level heterogeneous accelerator platform for neuron simulations
Objective: The advent of High-Performance Computing (HPC) in recent years has
led to its increasing use in brain study through computational models. The
scale and complexity of such models are constantly increasing, leading to
challenging computational requirements. Even though modern HPC platforms can
often deal with such challenges, the vast diversity of the modeling field does
not permit a single acceleration (or homogeneous) platform to effectively
address the complete array of modeling requirements. Approach: In this paper we
propose and build BrainFrame, a heterogeneous acceleration platform,
incorporating three distinct acceleration technologies: a Dataflow Engine, a
Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.
As a challenging proof of concept, we analyze the performance of BrainFrame on
different instances of a state-of-the-art neuron model, modeling the Inferior-
Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley
representation. The model instances take into account not only the neuronal-
network dimensions but also different network-connectivity circumstances that
can drastically change application workload characteristics. Main results:
Combining the three HPC technologies demonstrated that BrainFrame is better
able to cope with the modeling diversity encountered. Our performance analysis
clearly shows that the model directly affects performance and that all three
technologies are required to cope with the full range of model use cases.
Comment: 16 pages, 18 figures, 5 tables
Minimalistic SDHC-SPI hardware reader module for boot loader applications
This paper introduces a low-footprint full hardware boot loading solution for FPGA-based Programmable
Systems on Chip. The proposed module allows loading the system code and data from a standard SD card
without having to re-program the whole embedded system. The hardware boot loader is processor-independent
and removes the need for a software boot loader and the related memory resources. The hardware overhead
introduced is manageable, even in low-range FPGA chips, and negligible in mid- and high-range devices. The
implementation of the SD card reader module is explained in detail and an example of a multi-boot loader is
offered as well. The multi-boot loader is implemented and tested with Xilinx's PicoBlaze microcontroller.
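For context on what such a hardware reader must do, any SD-SPI host frames each command as a 6-byte packet whose last byte carries a 7-bit CRC. The sketch below (our illustration of the SD specification's command framing, not the paper's RTL) shows the CRC7 computation and the framing of the well-known CMD0:

```python
def crc7(data: bytes) -> int:
    """CRC-7/MMC (polynomial x^7 + x^3 + 1), as used in SD command framing."""
    crc = 0
    for byte in data:
        for _ in range(8):
            crc <<= 1
            if (byte ^ crc) & 0x80:   # incoming data bit XOR shifted-out bit
                crc ^= 0x09           # reduce by x^3 + 1 (x^7 term drops on shift)
            byte <<= 1
        crc &= 0xFF
    return crc & 0x7F

def sd_command(cmd: int, arg: int) -> bytes:
    """Frame an SD command: start bits + index, 32-bit argument, CRC7 + stop bit."""
    body = bytes([0x40 | cmd]) + arg.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 0x01])

# CMD0 (GO_IDLE_STATE) must carry the well-known CRC byte 0x95
assert sd_command(0, 0) == bytes([0x40, 0, 0, 0, 0, 0x95])
```

In SPI mode the CRC is only checked for CMD0 and CMD8 by default, but a hardware reader typically computes it combinationally for every command it issues.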
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable
multi-purpose computer for many-body simulations. The main difference between
PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field
Programmable Gate Array) chips as the processing elements, while the latter
rely on the hardwired pipeline processor specialized to gravitational
interactions. Since the logic implemented in FPGA chips can be reconfigured, we
can use PROGRAPE-1 to calculate not only gravitational interactions but also
other forms of interactions such as van der Waals force, hydrodynamical
interactions in SPH calculations, and so on. PROGRAPE-1 comprises two Altera
EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To
evaluate the programmability and performance of PROGRAPE-1, we implemented a
pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline
fitted into a single FPGA chip, which operated at a 16 MHz clock. Thus, for
gravitational interaction, PROGRAPE-1 provided a speed of 0.96
Gflops-equivalent. PROGRAPE will prove useful for a wide range of
particle-based simulations in which the calculation cost of interactions other
than gravity is high, such as the evaluation of SPH interactions.
Comment: 20 pages with 9 figures; submitted to PAS
Memory and information processing in neuromorphic systems
A striking difference between brain-inspired neuromorphic processors and
current von Neumann processor architectures is the way in which memory and
processing are organized. As Information and Communication Technologies continue
to address the need for increased computational power through the increase of
cores within a digital processor, neuromorphic engineers and scientists can
complement this need by building processor architectures where memory is
distributed with the processing. In this paper we present a survey of
brain-inspired processor architectures that support models of cortical networks
and deep neural networks. These architectures range from serial clocked
implementations of multi-neuron systems to massively parallel asynchronous ones
and from purely digital systems to mixed analog/digital systems which implement
more biological-like models of neurons and synapses together with a suite of
adaptation and learning mechanisms analogous to the ones found in biological
nervous systems. We describe the advantages of the different approaches being
pursued and present the challenges that need to be addressed for building
artificial neural processing systems that can display the richness of behaviors
seen in biological systems.
Comment: Submitted to Proceedings of the IEEE; a review of recently proposed
neuromorphic computing platforms and systems
FPGA-Based Tracklet Approach to Level-1 Track Finding at CMS for the HL-LHC
During the High Luminosity LHC, the CMS detector will need charged particle
tracking at the hardware trigger level to maintain a manageable trigger rate
and achieve its physics goals. The tracklet approach is a track-finding
algorithm based on a road-search algorithm that has been implemented on
commercially available FPGA technology. The tracklet algorithm has achieved
high performance in track-finding and completes tracking within 3.4 μs on a
Xilinx Virtex-7 FPGA. An overview of the algorithm and its implementation on an
FPGA is given, results are shown from a demonstrator test stand and system
performance studies are presented.
Comment: Submitted to proceedings of Connecting The Dots/Intelligent Trackers
2017, Orsay, France
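In outline, a tracklet seed is built from a pair of stubs in adjacent layers and then projected to the remaining layers, where matching stubs are collected within a search window. The toy below is a heavily simplified, linearized r-φ sketch of that idea (the names, the small-curvature approximation, and the window value are ours; the real firmware works in fixed point across parallel processing boards):

```python
def seed_tracklet(r1, phi1, r2, phi2):
    """Fit phi(r) = phi0 - k*r through two stubs (small-curvature approximation).
    k is proportional to the track curvature, i.e. charge over pT."""
    k = (phi1 - phi2) / (r2 - r1)
    phi0 = phi1 + k * r1
    return phi0, k

def project(phi0, k, r):
    """Project the tracklet to a detector layer at radius r."""
    return phi0 - k * r

def match(stubs, phi_expected, window=0.002):
    """Keep stubs whose phi lies inside the search window around the projection."""
    return [(r, phi) for (r, phi) in stubs if abs(phi - phi_expected) < window]

# A perfect toy track with k = 1e-3 rad/cm and phi0 = 0.5 rad
phi0_true, k_true = 0.5, 1e-3
stub = lambda r: (r, phi0_true - k_true * r)

phi0, k = seed_tracklet(*stub(30), *stub(60))           # seed from two stubs
hits = match([stub(100), (100, 0.45)], project(phi0, k, 100))
# only the genuine stub at r = 100 survives the window cut
```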
SpinLink: An interconnection system for the SpiNNaker biologically inspired multi-computer
SpiNNaker is a large-scale biologically-inspired multi-computer designed to model very heavily distributed problems, with the flagship application being the simulation of large neural networks. The project goal is to have one million processors included in a single machine, which consequently span many thousands of circuit boards. A computer of this scale imposes large communication requirements between these boards, and requires an extensible method of connecting to external equipment such as sensors, actuators and visualisation systems. This paper describes two systems that can address each of these problems. Firstly, SpinLink is a proposed method of connecting the SpiNNaker boards by using time-division multiplexing (TDM) to allow eight SpiNNaker links to run at maximum bandwidth between two boards. SpinLink will be deployed on Spartan-6 FPGAs and uses a locally generated clock that can be paused while the asynchronous links from SpiNNaker are sending data, thus ensuring a fast and glitch-free response. Secondly, SpiNNterceptor is a separate system, currently in the early stages of design, that will build upon SpinLink to address the important external I/O issues faced by SpiNNaker. Specifically, spare resources in the FPGAs will be used to implement the debugging and I/O interfacing features of SpiNNterceptor.
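The time-division multiplexing that SpinLink applies to the eight inter-board links can be pictured as a fixed round-robin slot schedule: each outgoing frame carries one word per link, and the receiver demultiplexes purely by slot position. A toy software model of that scheme (our simplification; the real design also handles the asynchronous-link handshake and the pausable clock):

```python
NUM_LINKS = 8  # eight SpiNNaker links share one inter-board connection

def tdm_mux(channels):
    """Interleave NUM_LINKS equal-length word streams into one frame-ordered stream."""
    assert len(channels) == NUM_LINKS
    frames = []
    for words in zip(*channels):   # one frame = one word from each link, in slot order
        frames.extend(words)
    return frames

def tdm_demux(stream):
    """Recover the per-link streams from their fixed slot positions."""
    return [stream[slot::NUM_LINKS] for slot in range(NUM_LINKS)]

# each link carries words tagged (link, sequence); the round trip is lossless
links = [[(link, seq) for seq in range(4)] for link in range(NUM_LINKS)]
assert tdm_demux(tdm_mux(links)) == links
```

Because the slot schedule is static, the demultiplexer needs no per-word addressing, which is what lets each link run at full bandwidth across the shared board-to-board connection.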
GRAPE-6: The massively-parallel special-purpose computer for astrophysical particle simulation
In this paper, we describe the architecture and performance of the GRAPE-6
system, a massively-parallel special-purpose computer for astrophysical
N-body simulations. GRAPE-6 is the successor of GRAPE-4, which was completed
in 1995 and achieved the theoretical peak speed of 1.08 Tflops. As was the case
with GRAPE-4, the primary application of GRAPE-6 is simulation of collisional
systems, though it can be used for collisionless systems. The main differences
between GRAPE-4 and GRAPE-6 are (a) The processor chip of GRAPE-6 integrates 6
force-calculation pipelines, compared to one pipeline of GRAPE-4 (which needed
3 clock cycles to calculate one interaction), (b) the clock speed is increased
from 32 to 90 MHz, and (c) the total number of processor chips is increased
from 1728 to 2048. These improvements resulted in the peak speed of 64 Tflops.
We also discuss the design of the successor of GRAPE-6.
Comment: Accepted for publication in PASJ, scheduled to appear in Vol. 55, No.
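The quoted peak follows from the chip count, pipelines per chip, and clock rate, together with the conventional GRAPE accounting of roughly 57 floating-point operations per force-plus-jerk interaction (the 57 is our assumed convention, not stated in the abstract):

```python
chips = 2048                 # total GRAPE-6 processor chips
pipelines_per_chip = 6       # force-calculation pipelines per chip
clock_hz = 90e6              # 90 MHz clock, one interaction per pipeline per cycle
flops_per_interaction = 57   # conventional count for force + jerk (assumption)

interactions_per_s = chips * pipelines_per_chip * clock_hz   # ≈ 1.1e12 /s
peak_flops = interactions_per_s * flops_per_interaction
print(peak_flops / 1e12)     # ≈ 63, consistent with the rounded 64 Tflops figure
```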