PGPG: An Automatic Generator of Pipeline Design for Programmable GRAPE Systems

Author: Fukushige Toshiyuki
Hamada Tsuyoshi
Makino Junichiro
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/03/2007

We have developed PGPG (Pipeline Generator for Programmable GRAPE), software that generates the low-level design of the pipeline processor and the communication software for FPGA-based computing engines (FBCEs). An FBCE typically consists of one or more FPGA (Field-Programmable Gate Array) chips and local memory. Here, the term "Field-Programmable" means that the logic implemented on the chip can be rewritten after the hardware is completed, so a single FBCE can be used to calculate various functions, for example as a pipeline processor for gravity, SPH interaction, or image processing. The main problem with FBCEs is that the user needs to develop the detailed hardware design for the processor to be implemented on the FPGA chips. In addition, she or he has to write the control logic for the processor, the communication and data conversion library on the host processor, and the application program that uses the developed processor. These tasks require detailed knowledge of hardware design, a hardware description language such as VHDL, the operating system, and the application, and the amount of human work is huge. A relatively simple design would require 1 person-year or more. The PGPG software generates all necessary design descriptions, except for the application software itself, from a high-level design description of the pipeline processor in the PGPG language. The PGPG language is a simple language, specialized to the description of pipeline processors. Thus, designing a pipeline processor in the PGPG language is much easier than the traditional design process. For real applications such as the pipeline for gravitational interaction, the pipeline processor generated by PGPG achieved performance similar to that of a hand-written design. In this paper we present a detailed description of PGPG version 1.0.
Comment: 24 pages, 6 figures, accepted PASJ 2005 July 2

arXiv.org e-Print Archive

Crossref

CERN Document Server

Accelerating statistical texture analysis with an FPGA-DSP hybrid architecture

Author: Cuenca-Asensi Sergio
Córcoles López Víctor
Ibarra Picó Francisco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2001

Nowadays, most image processing systems are implemented using either MMX-optimized software libraries or, when time constraints are tight, expensive high-performance DSP-based boards. In this paper we present a texture analysis co-processor concept that permits the efficient hardware implementation of statistical feature extraction, together with hardware-software codesign to achieve high-performance, low-cost solutions. We propose a hybrid architecture based on FPGA chips for massive data processing and a digital signal processor (DSP) for floating-point computations. In our preliminary trials with test images, we achieved sufficient performance improvements to handle a wide range of real-time applications.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Multi-task Implementation for Image Reconstruction of an AER Communication

Author: Civit Balcells Antón
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Luján Martínez Carlos Daniel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Address-Event-Representation (AER) is a communication protocol for transferring spikes between bio-inspired chips. Such systems may consist of a hierarchical structure with several chips that transmit spikes among themselves in real time while performing some processing. Several AER tools exist to help in developing and testing AER-based systems. These tools require a computer to process the event information, and they reach very high bandwidth at the AER communication level. We propose to use an embedded platform based on a multi-task operating system to allow both AER communication and AER processing without a laptop or a desktop computer. We have connected and programmed a Gumstix computer to process Address-Event information and measured its performance relative to the previous AER tool solutions. In this paper, we present and study the performance of a new approach to a frame-grabber AER tool based on a multi-task environment, composed of the Intel XScale processor governed by an embedded GNU/Linux system.
Funding: Ministerio de Ciencia e Innovación TEC2006-11730-C03-0
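As a rough illustration of what a frame-grabber style AER tool does, the sketch below accumulates events into a frame by counting how often each address fires inside a time window. The event layout `(timestamp, x, y)` and the function name `events_to_frame` are assumptions made for this example; the abstract does not describe the actual data format or the software running on the Gumstix board.

```python
def events_to_frame(events, width, height, t_start, t_end):
    """Build a frame by counting events per pixel address.

    `events` is an iterable of (timestamp, x, y) tuples. This layout is
    hypothetical: real AER hardware emits raw addresses that a tool must
    decode into coordinates first.
    """
    # One counter per pixel, initialised to zero.
    frame = [[0] * width for _ in range(height)]
    for timestamp, x, y in events:
        in_window = t_start <= timestamp < t_end
        in_bounds = 0 <= x < width and 0 <= y < height
        if in_window and in_bounds:
            frame[y][x] += 1  # more spikes at an address -> brighter pixel
    return frame


# Illustrative events only (not measured data).
sample_events = [(0, 0, 0), (1, 1, 0), (2, 1, 0), (5, 0, 1), (12, 1, 1)]
frame = events_to_frame(sample_events, width=2, height=2, t_start=0, t_end=10)
for row in frame:
    print(row)
```

Events outside the window (here the one at timestamp 12) are simply dropped, so the window length trades frame rate against the number of spikes collected per pixel.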

idUS. Depósito de Investigación Universidad de Sevilla

PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations

Author: Fukushige Toshiyuki
Hamada Tsuyoshi
Kawai Atsushi
Makino Junichiro
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/06/1999

We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter rely on a hardwired pipeline processor specialized for gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions but also other forms of interaction, such as the van der Waals force and hydrodynamical interactions in SPH calculations. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline fitted into a single FPGA chip, which operated at a 16 MHz clock. Thus, for gravitational interaction, PROGRAPE-1 provided a speed of 0.96 Gflops-equivalent. PROGRAPE will prove useful for a wide range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.
Comment: 20 pages with 9 figures; submitted to PAS
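The quoted speed can be cross-checked against the other figures in the abstract. The sketch below is a back-of-the-envelope calculation, not the authors' own accounting: it assumes that each pipeline completes one pairwise interaction per clock cycle and that both FPGA chips run one pipeline each at the same time. Neither assumption is stated explicitly in the abstract.

```python
# Figures quoted in the abstract.
NUM_FPGA_CHIPS = 2            # PROGRAPE-1 comprises two FPGA chips
PIPELINES_PER_CHIP = 1        # one pipeline fitted into a single chip
CLOCK_HZ = 16e6               # pipeline clock: 16 MHz
QUOTED_SPEED_FLOPS = 0.96e9   # 0.96 Gflops-equivalent


def interactions_per_second(chips, pipelines_per_chip, clock_hz):
    """Peak interaction rate.

    Assumption (not from the abstract): every pipeline finishes one
    pairwise interaction per clock cycle, and all chips run concurrently.
    """
    return chips * pipelines_per_chip * clock_hz


rate = interactions_per_second(NUM_FPGA_CHIPS, PIPELINES_PER_CHIP, CLOCK_HZ)

# Operation count per interaction implied by the quoted speed.
implied_ops_per_interaction = QUOTED_SPEED_FLOPS / rate

print(f"peak interaction rate: {rate:.3e} interactions/s")
print(f"implied operations per interaction: {implied_ops_per_interaction:.1f}")
```

Under these assumptions the quoted 0.96 Gflops-equivalent corresponds to counting 30 operations per pairwise interaction.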

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

GRAPE-6: The massively-parallel special-purpose computer for astrophysical particle simulation

Author: Fukushige Toshiyuki
Koga Masaki
Makino Junichiro
Namura Ken
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/10/2003

In this paper, we describe the architecture and performance of the GRAPE-6 system, a massively-parallel special-purpose computer for astrophysical N-body simulations. GRAPE-6 is the successor of GRAPE-4, which was completed in 1995 and achieved the theoretical peak speed of 1.08 Tflops. As was the case with GRAPE-4, the primary application of GRAPE-6 is simulation of collisional systems, though it can be used for collisionless systems. The main differences between GRAPE-4 and GRAPE-6 are (a) the processor chip of GRAPE-6 integrates 6 force-calculation pipelines, compared to one pipeline of GRAPE-4 (which needed 3 clock cycles to calculate one interaction), (b) the clock speed is increased from 32 to 90 MHz, and (c) the total number of processor chips is increased from 1728 to 2048. These improvements resulted in the peak speed of 64 Tflops. We also discuss the design of the successor of GRAPE-6.
Comment: Accepted for publication in PASJ, scheduled to appear in Vol. 55, No.
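The three differences listed in the abstract can be combined into a rough estimate of the speed-up. The sketch below uses only the figures quoted above, plus one assumption that the abstract does not state: that each GRAPE-6 pipeline completes one interaction per clock cycle. It is a consistency check, not the authors' performance model.

```python
def interaction_rate(chips, pipelines_per_chip, clock_hz, cycles_per_interaction):
    """Peak pairwise interactions per second for the whole machine."""
    return chips * pipelines_per_chip * clock_hz / cycles_per_interaction


# GRAPE-4: 1728 chips, one pipeline each, 32 MHz, 3 cycles per interaction.
grape4_rate = interaction_rate(1728, 1, 32e6, 3)

# GRAPE-6: 2048 chips, 6 pipelines each, 90 MHz.
# Assumption: one interaction per cycle per pipeline (not in the abstract).
grape6_rate = interaction_rate(2048, 6, 90e6, 1)

rate_ratio = grape6_rate / grape4_rate

# Ratio of the peak speeds quoted in the abstract (64 and 1.08 Tflops).
quoted_ratio = 64e12 / 1.08e12

print(f"GRAPE-4 rate: {grape4_rate:.4e} interactions/s")
print(f"GRAPE-6 rate: {grape6_rate:.4e} interactions/s")
print(f"interaction-rate ratio: {rate_ratio:.1f}")
print(f"quoted peak-speed ratio: {quoted_ratio:.1f}")
```

The interaction-rate ratio comes out at 60, close to the ratio of roughly 59 between the quoted peak speeds, so the three listed changes account for essentially the whole improvement under this assumption.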

arXiv.org e-Print Archive

Crossref

Performance evaluation of multi-core multi-cluster architecture

Author: Hamid Norhazlina
Walters Robert John
Wills Gary Brian
Publication date: 03/04/2014

A multi-core cluster is a cluster composed of a number of nodes, where each node has a number of processors and each processor has more than one core on a single chip. Cluster nodes are connected via an interconnection network. Multi-core processors are able to achieve higher performance without driving up power consumption and heat, which are the main concerns in a single-core processor. A general problem in the network arises from the fact that multiple messages can be in transit at the same time on the same network links. This paper investigates the communication latencies of a multi-core multi-cluster architecture using simulation experiments and measurements under various working conditions.

Southampton (e-Prints Soton)