A Bit Serial Approach to Massively Parallel Floating Point Operations on an FPGA

Author: Kshirsagar Parija
Mahabalagiri Anvith Katte
Marcy Duane
Schlereth Fred
Publication venue: SURFACE at Syracuse University
Publication date: 23/11/2010

In this paper we discuss the pros and cons of bit-serial arithmetic for performing mathematical operations for signal processing and scientific computations on an FPGA. We describe our formulation of the architecture for such massively parallel systems, the advantage being that it requires no parallel programming in the traditional sense. We describe a pseudo-floating-point bit-serial circuit which is less complex than full-precision floating point and show that it is suitable for many applications. We conclude with several application examples and show that a bit-serial implementation can be competitive with a high-speed parallel implementation.
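As a rough illustration of what "bit serial" means in this abstract, the sketch below models a generic bit-serial adder in software: one full adder plus a one-bit carry register, fed one bit of each operand per clock cycle, least significant bit first. It is not the paper's pseudo floating point circuit; the function name `bit_serial_add` and the unsigned, wrap-around behaviour are assumptions made for the example.

```python
def bit_serial_add(a: int, b: int, width: int) -> int:
    """Add two unsigned `width`-bit integers one bit per simulated clock cycle.

    Hypothetical software model (not taken from the paper): a single full
    adder and a one-bit carry register, processing bits LSB first. The
    result wraps around at `width` bits, as fixed-width hardware would.
    """
    carry = 0
    result = 0
    for cycle in range(width):
        a_bit = (a >> cycle) & 1
        b_bit = (b >> cycle) & 1
        # Full-adder equations for the current bit position.
        sum_bit = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
        result |= sum_bit << cycle
    return result


if __name__ == "__main__":
    print(bit_serial_add(5, 9, width=8))      # 14
    print(bit_serial_add(200, 100, width=8))  # 44, i.e. 300 mod 256
```

The adder needs `width` cycles per addition but only one full adder of logic, which is the area-for-time trade that lets many such units run side by side on one device.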

Syracuse University Research Facility and Collaborative Environment

Bit-Level Systolic Architecture for a Matrix-Matrix Multiplier

Author: Murty M.N.
Nayak S.S.
Padhy Binayak
Panda S.N.
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 28/08/2020

Highly efficient arithmetic operations are necessary to achieve the desired performance in many real-time systems and digital image processing applications. In all these applications, one of the important arithmetic operations frequently performed is multiply-and-accumulate with small computational time. In this paper, a 4-bit serial-parallel multiplier, which can perform both positive and negative multiplications, is presented. The Baugh-Wooley algorithm necessitates complementation of the last bit of each partial product except the last partial product, in which all but the last bit are complemented. In the proposed algorithm all bits of the last partial product are complemented. This modification results in a considerable reduction in hardware compared to the Baugh-Wooley multiplier. This multiplier can be used for implementation of discrete orthogonal transforms, which are used in many applications, including image and signal processing. This paper presents a 2D bit-level systolic architecture for a matrix-matrix multiplier. A comparison with similar structures has shown that the proposed structure performs better.
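The complementation rule the abstract attributes to Baugh-Wooley can be checked with a short software model. The sketch below implements the commonly taught modified Baugh-Wooley scheme for n-bit two's-complement operands, not the paper's proposed variant: partial-product bits that involve exactly one sign bit are complemented, and the constant 2^n + 2^(2n-1) is added. The function name `baugh_wooley_multiply` and the default width of 4 bits are assumptions made for the example.

```python
def baugh_wooley_multiply(a: int, b: int, n: int = 4) -> int:
    """Multiply two n-bit two's-complement integers with modified Baugh-Wooley.

    Illustrative model only (the standard textbook scheme, not the variant
    proposed in the paper). Returns the signed 2n-bit product.
    """
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operands must fit in n-bit two's complement")

    # Python's >> on negative ints is arithmetic, so this yields the
    # two's-complement bit pattern of each operand.
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]

    total = 0
    for j in range(n):          # one partial product per multiplier bit
        for i in range(n):
            bit = a_bits[i] & b_bits[j]
            # Complement the bits that involve exactly one sign bit:
            # the last bit of every row except the last, and all but
            # the last bit of the last row.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Correction constant that absorbs the complemented terms.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1          # keep 2n bits

    # Reinterpret the 2n-bit pattern as a signed value.
    if total >= 1 << (2 * n - 1):
        total -= 1 << (2 * n)
    return total


if __name__ == "__main__":
    print(baugh_wooley_multiply(-8, -8))  # 64
    print(baugh_wooley_multiply(-3, 5))   # -15
```

Because every partial-product bit is added with positive weight, the array needs only adders and no subtractors, which is the usual motivation for the scheme.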

Interscience Research Network

A study about FPGA-based digital filters

Author: Boemo Eduardo I.
Peiró Marcos M.
Sansaloni T.
Valls Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

J. Valls, M. M. Peiró, T. Sansaloni, and E. Boemo, "A study about FPGA-based digital filters", in IEEE Workshop on Signal Processing Systems, 1998, pp. 192-201.

A set of operators suitable for digit-serial FIR filtering is presented. The canonical and inverted forms are studied. For each of these structures, the symmetrical and anti-symmetrical special cases are also covered. All circuits have been implemented using an EPF10K50 Altera FPGA. The main results show that the canonical form has lower occupation and higher throughput. The 8-tap filter versions implemented can be applied in real-time processing at sample rates of up to 7 MHz using the bit-serial versions and up to 25 MHz with the bit-parallel one.
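The symmetrical case mentioned in this abstract is usually exploited by adding mirrored samples before multiplying, which halves the number of multiplications. The sketch below shows that idea for an even-length symmetric filter in plain software. It is not the paper's digit-serial operators; the function names, the zero initial state, and the reading of "canonical" as the direct form are all assumptions made for the example.

```python
def fir_direct(samples, coeffs):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n-k], with zero initial state."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_symmetric(samples, coeffs):
    """Same filter for symmetric coefficients (h[k] == h[N-1-k], N even).

    Mirrored samples are added first, so each output needs N/2
    multiplications instead of N. Hypothetical helper, for illustration.
    """
    taps = len(coeffs)
    if taps % 2 or any(coeffs[k] != coeffs[taps - 1 - k] for k in range(taps // 2)):
        raise ValueError("coefficients must be symmetric with an even tap count")

    def x(i):
        # Samples before the start of the input are treated as zero.
        return samples[i] if i >= 0 else 0

    out = []
    for n in range(len(samples)):
        acc = 0
        for k in range(taps // 2):
            acc += coeffs[k] * (x(n - k) + x(n - (taps - 1 - k)))
        out.append(acc)
    return out


if __name__ == "__main__":
    h = [1, 3, 5, 7, 7, 5, 3, 1]          # made-up 8-tap symmetric filter
    data = list(range(1, 21))
    print(fir_direct(data, h) == fir_symmetric(data, h))  # True
```

An anti-symmetrical filter would follow the same pattern with the pre-addition replaced by a subtraction.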

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

DFT algorithms for bit-serial GaAs array processor architectures

Author: Mcmillan Gary B.

Systems and Processes Engineering Corporation (SPEC) has developed an innovative array processor architecture for computing Fourier transforms and other commonly used signal processing algorithms. This architecture is designed to extract the highest possible array performance from state-of-the-art GaAs technology. SPEC's architectural design includes a high performance RISC processor implemented in GaAs, along with a Floating Point Coprocessor and a unique Array Communications Coprocessor, also implemented in GaAs technology. Together, these data processors represent the latest in technology, both from an architectural and implementation viewpoint. SPEC has examined numerous algorithms and parallel processing architectures to determine the optimum array processor architecture. SPEC has developed an array processor architecture with integral communications ability to provide maximum node connectivity. The Array Communications Coprocessor embeds communications operations directly in the core of the processor architecture. A Floating Point Coprocessor architecture has been defined that utilizes Bit-Serial arithmetic units, operating at very high frequency, to perform floating point operations. These Bit-Serial devices reduce the device integration level and complexity to a level compatible with state-of-the-art GaAs device technology

NASA Technical Reports Server

Evaluation of High Speed Hardware Multipliers - Fixed Point and Floating point

Author: Abbas Syed Haider
Ahmed Awais
Haider Hussnain
Siddique Muhammad Faheem
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/12/2013

There is a huge demand for high-speed arithmetic blocks, due to the increased performance of processing units. As system clock frequencies rise, the arithmetic blocks must keep pace with the growing demand for computational power. Area and speed are usually conflicting constraints, so improving speed mostly results in larger area. In our research we try to determine the best solution to this problem by comparing the results of different multipliers. Different sizes of two algorithms for high-speed hardware multipliers were studied and implemented, i.e., the parallel multiplier and the bit-serial multiplier. The workings of these two multipliers were compared by implementing each of them separately in VHDL. A number of high-speed adder designs are developed, and the algorithm and design of these adders are discussed. The results of this research will help in choosing the better option between serial and parallel multipliers, for both fixed-point and floating-point multiplication, for fabrication in different systems. As multipliers form one of the most important components of many systems, analysing different multipliers will help in framing a better system with better area and speed.

DOI: http://dx.doi.org/10.11591/ijece.v3i6.418

Institute of Advanced Engineering and Science

Radix-2^n serial–serial multipliers

Author: A. Aggoun
A. Ashur
A.F. Farwan
Chang
Hatley
M.K. Ibrahim
Nibouche
Smith
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2004

All serial–serial multiplication structures previously reported in the literature have been confined to bit serial–serial multipliers. An architecture for digit serial–serial multipliers is presented. A set of designs is derived from the radix-2^n design procedure, which was first reported by the authors for the design of bit-level pipelined digit serial–parallel structures. One significant aspect of the new designs is that they can be pipelined to the bit level and give the designer the flexibility to obtain the best trade-off between throughput rate and hardware cost by varying the digit size and the number of pipelining levels. Also, an area-efficient digit serial–serial multiplier is proposed which provides a 50% reduction in hardware without degrading the speed performance. This is achieved by exploiting the fact that some cells are idle for most of the multiplication operation. In the new design, the computations of these cells are remapped to other cells, which makes the idle cells redundant. The new designs have been implemented on the S40BG256 device from the SPARTAN family to prove functionality and assess performance.
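The digit-size trade-off described in this abstract can be illustrated with a small software model of radix-2^n multiplication. In the sketch below the multiplier is consumed one digit of `digit_bits` bits per cycle, least significant digit first, so a larger digit means fewer cycles at the cost of a wider per-cycle multiply. For simplicity the multiplicand is available in full, which makes this a serial–parallel model and not the serial–serial architecture of the paper; the function name and parameters are assumptions made for the example.

```python
def digit_serial_multiply(a: int, b: int, width: int, digit_bits: int):
    """Multiply unsigned `width`-bit a and b, consuming b in radix-2**digit_bits digits.

    Hypothetical software model for illustration. Returns a tuple
    (product, cycles), where `cycles` is the number of digits processed.
    """
    if width <= 0 or digit_bits <= 0:
        raise ValueError("width and digit_bits must be positive")
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must be unsigned width-bit values")

    mask = (1 << digit_bits) - 1
    cycles = -(-width // digit_bits)      # ceiling division
    product = 0
    for cycle in range(cycles):
        shift = cycle * digit_bits
        digit = (b >> shift) & mask       # next digit of b, LSD first
        product += (a * digit) << shift   # weighted partial product
    return product, cycles


if __name__ == "__main__":
    for d in (1, 2, 4, 8):
        print(d, digit_serial_multiply(13, 11, width=8, digit_bits=d))
```

With `digit_bits=1` the model degenerates to a bit-serial multiplier taking `width` cycles, and with `digit_bits=width` it behaves like a fully parallel one finishing in a single cycle.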

Crossref

De Montfort University Open Research Archive

Brunel University Research Archive

Bit-level pipelined digit-serial array processors

Author: Aggoun A
Ashur A
Ibrahim MK
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/1998

A new architecture for a high-performance digit-serial vector inner product (VIP) which can be pipelined to the bit level is introduced. The design of the digit-serial vector inner product is based on a new systematic design methodology using radix-2^n arithmetic. The proposed architecture allows a high level of bit-level pipelining to increase the throughput rate with minimum initial delay and minimum area. This will give designers greater flexibility in finding the best trade-off between hardware cost and throughput rate. It is shown that a sub-digit pipelined digit-serial structure can achieve a higher throughput rate with much less area consumption than an equivalent bit-parallel structure. A twin-pipe architecture to double the throughput rate of digit-serial multipliers, and consequently that of the digit-serial vector inner product, is also presented. The effects of the number of pipelining levels and the twin-pipe architecture on the throughput rate and hardware cost are discussed. A two's complement digit-serial architecture which can operate on both negative and positive numbers is also presented.

Crossref

Brunel University Research Archive

An AER handshake-less modular infrastructure PCB with x8 2.5Gbps LVDS serial links

Author: Iakymchuk T.
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Linares Barranco Bernabé
Rosado A.
Serrano Gotarredona María Teresa
Publication venue: IEEE Computer Society
Publication date: 01/01/2014

Nowadays spike-based brain processing emulation is taking off. Several EU and other worldwide projects are demonstrating this, like SpiNNaker, BrainScaleS, FACETS, or NeuroGrid. The larger the brain process emulation on silicon is, the higher the communication performance of the hosting platforms has to be. Often the bottleneck of these system implementations is not in the performance inside a chip or a board, but in the communication between boards. This paper describes a novel modular Address-Event-Representation (AER) FPGA-based (Spartan6) infrastructure PCB (the AER-Node board) with 2.5Gbps LVDS high-speed serial links over SATA cables that offers a peak performance of 32-bit 62.5Meps (Mega events per second) on board-to-board communications. The board allows backward compatibility with parallel AER devices, supporting up to x2 28-bit parallel data with asynchronous handshake. These boards also allow modular expansion functionality through several daughter boards. The paper is focused on describing in detail the LVDS serial interface and presenting its performance.

Funding: Ministerio de Ciencia e Innovación TEC2009-10639-C04-02/01; Ministerio de Economía y Competitividad TEC2012-37868-C04-02/01; Junta de Andalucía TIC-6091; Ministerio de Economía y Competitividad PRI-PIMCHI-2011-076
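The figures quoted in this abstract can be related with a little arithmetic. The sketch below computes the payload rate implied by 32-bit events at 62.5 Meps and compares it with the 2.5 Gbps line rate. The constant and function names are made up for the example. The resulting 80% ratio would be consistent with 8b/10b line coding, but the abstract does not state the coding scheme, so that reading is an assumption.

```python
LINE_RATE_BPS = 2_500_000_000      # 2.5 Gbps serial link, from the abstract
EVENT_BITS = 32                    # bits per AER event, from the abstract
PEAK_EVENTS_PER_S = 62_500_000     # 62.5 Meps peak, from the abstract


def payload_rate_bps(event_bits: int, events_per_s: int) -> int:
    """Payload bits per second carried by the event stream."""
    return event_bits * events_per_s


def link_efficiency(payload_bps: int, line_rate_bps: int) -> float:
    """Fraction of the raw line rate used by payload bits."""
    return payload_bps / line_rate_bps


if __name__ == "__main__":
    payload = payload_rate_bps(EVENT_BITS, PEAK_EVENTS_PER_S)
    print(payload)                                   # 2000000000
    print(link_efficiency(payload, LINE_RATE_BPS))   # 0.8
```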

idUS. Depósito de Investigación Universidad de Sevilla