Search CORE

17 research outputs found

High-Speed Function Approximation using a Minimax Quadratic Interpolator

Author: Bruguera Javier
Muller Jean-Michel
Oberman Stuart
Pineiro Jose-Alejandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005
Field of study

A table-based method for high-speed function approximation in single-precision floating-point format is presented in this paper. Our focus is the approximation of reciprocal, square root, square root reciprocal, exponentials, logarithms, trigonometric functions, powering (with a fixed exponent p), or special functions. The algorithm presented here combines table look-up, an enhanced minimax quadratic approximation, and an efficient evaluation of the second-degree polynomial (using a specialized squaring unit, redundant arithmetic, and multioperand addition). The execution times and area costs of an architecture implementing our method are estimated, showing the achievement of the fast execution times of linear approximation methods and the reduced area requirements of other second-degree interpolation algorithms. Moreover, the use of an enhanced minimax approximation which, through an iterative process, takes into account the effect of rounding the polynomial coefficients to a finite size allows for a further reduction in the size of the look-up tables to be used, making our method very suitable for the implementation of an elementary function generator in state-of-the-art DSPs or graphics processing units (GPUs)

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

EXPERIMENTAL STUDIES ON MULTI-OPERAND ADDERS

Author: Saha P.
Sonowal M.
Thabah S. D.
Publication venue: 'Exeley, Inc.'
Publication date
Field of study

Multipliers for Floating-Point Double Precision and Beyond on FPGAs

Author: Banescu Sebastian
de Dinechin Florent
Pasca Bogdan
Tudoran Radu
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/01/2010
Field of study

International audienceThe implementation of high-precision floating-point applications on reconfigurable hardware requires a variety of large multipliers: Standard multipliers are the core of floating-point multipliers; Truncated multipliers, trading resources for a well-controlled accuracy degradation, are useful building blocks in situations where a full multiplier is not needed. This work studies the automated generation of such multipliers using the embedded multipliers and adders present in DSP blocks of current FPGAs. The optimization of such multipliers is expressed as a tiling problem where a tile represents a hardware multiplier and super-tiles are the wiring of several hardware multipliers making efficient use of the DSP internal resources. This tiling technique is shown to adapt to full or truncated multipliers. It addresses arbitrary precisions including single, double but also in the quadruple precision introduced by the IEEE-754-2008 standard and currently unsupported by processor hardware. An open-source implementation is provided in the FloPoCo project

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Fast Multi Operand Decimal Adders using Digit Compressors with Decimal Carry Generation

Author: Dadda Luigi
Nannarelli Alberto
Publication venue: Technical University of Denmark, DTU Informatics, Building 321
Publication date: 01/01/2009
Field of study

An Alternative Carry-save Arithmetic for New Generation Field Programmable Gate Arrays

Author: Aktan Mustafa
Morgül Avni
Çini Uğur
Publication venue: 'The Scientific and Technological Research Council of Turkey'
Publication date: 01/01/2016
Field of study

In this work, a double carry-save addition operation is proposed, which is efficiently synthesized for 6-input LUT-based eld programmable gate arrays (FPGAs). The proposed arithmetic operation is based on redundant number representation and provides carry propagation-free addition. Using the proposed arithmetic operation, a compact and fast multiply and accumulate unit is designed. To our knowledge, the proposed design provides the fastest multiply-add operation for 6-input LUT-based FPGA systems. A nite impulse response lter implementation is given to show the performance of the proposed structure. The proposed implementation provides a dramatic performance increase, which is at least 2 times faster than conventional binary multiply-add implementations

Public key cryptosystems : theory, application and implementation

Author: McAuley Anthony Joseph
Publication venue
Publication date: 01/04/1985
Field of study

The determination of an individual's right to privacy is mainly a nontechnical matter, but the pragmatics of providing it is the central concern of the cryptographer. This thesis has sought answers to some of the outstanding issues in cryptography. In particular, some of the theoretical, application and implementation problems associated with a Public Key Cryptosystem (PKC).The Trapdoor Knapsack (TK) PKC is capable of fast throughput, but suffers from serious disadvantages. In chapter two a more general approach to the TK-PKC is described, showing how the public key size can be significantly reduced. To overcome the security limitations a new trapdoor was described in chapter three. It is based on transformations between the radix and residue number systems.Chapter four considers how cryptography can best be applied to multi-addressed packets of information. We show how security or communication network structure can be used to advantage, then proposing a new broadcast cryptosystem, which is more generally applicable.Copyright is traditionally used to protect the publisher from the pirate. Chapter five shows how to protect information when in easily copyable digital format.Chapter six describes the potential and pitfalls of VLSI, followed in chapter seven by a model for comparing the cost and performance of VLSI architectures. Chapter eight deals with novel architectures for all the basic arithmetic operations. These architectures provide a basic vocabulary of low complexity VLSI arithmetic structures for a wide range of applications.The design of a VLSI device, the Advanced Cipher Processor (ACP), to implement the RSA algorithm is described in chapter nine. It's heart is the modular exponential unit, which is a synthesis of the architectures in chapter eight. The ACP is capable of a throughput of 50 000 bits per second