Search CORE

104 research outputs found

High-Speed Function Approximation using a Minimax Quadratic Interpolator

Author: Bruguera Javier
Muller Jean-Michel
Oberman Stuart
Pineiro Jose-Alejandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005
Field of study

A table-based method for high-speed function approximation in single-precision floating-point format is presented in this paper. Our focus is the approximation of reciprocal, square root, square root reciprocal, exponentials, logarithms, trigonometric functions, powering (with a fixed exponent p), or special functions. The algorithm presented here combines table look-up, an enhanced minimax quadratic approximation, and an efficient evaluation of the second-degree polynomial (using a specialized squaring unit, redundant arithmetic, and multioperand addition). The execution times and area costs of an architecture implementing our method are estimated, showing the achievement of the fast execution times of linear approximation methods and the reduced area requirements of other second-degree interpolation algorithms. Moreover, the use of an enhanced minimax approximation which, through an iterative process, takes into account the effect of rounding the polynomial coefficients to a finite size allows for a further reduction in the size of the look-up tables to be used, making our method very suitable for the implementation of an elementary function generator in state-of-the-art DSPs or graphics processing units (GPUs)

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

An efficient hardware architecture for a neural network activation function generator

Author: B. Widrow
D. Zhang
D.J. Myers
G. Cauwenberghs
H. Amin
H. Faiedh
I. Koren
J. Holt
J. Pineiro
K. Basterretxea
M. Tommiska
S. Draghici
S. Haykins
S. Vassiliadis
U. Ruckert
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006
Field of study

This paper proposes an efficient hardware architecture for a function generator suitable for an artificial neural network (ANN). A spline-based approximation function is designed that provides a good trade-off between accuracy and silicon area, whilst also being inherently scalable and adaptable for numerous activation functions. This has been achieved by using a minimax polynomial and through optimal placement of the approximating polynomials based on the results of a genetic algorithm. The approximation error of the proposed method compares favourably to all related research in this field. Efficient hardware multiplication circuitry is used in the implementation, which reduces the area overhead and increases the throughput

Crossref

Irish Universities

DCU Online Research Access Service

Fast, area-efficient 32-bit LNS for computer arithmetic operations

Author: Che Ismail Rizalafande
Publication venue: Newcastle University
Publication date: 01/01/2012
Field of study

PhD ThesisThe logarithmic number system has been proposed as an alternative to floating-point. Multiplication, division and square-root operations are accomplished with fixedpoint arithmetic, but addition and subtraction are considerably more challenging. Recent work has demonstrated that these operations too can be done with similar speed and accuracy to their floating-point equivalents, but the necessary circuitry is complex. In particular, it is dominated by the need for large lookup tables for the storage of a non-linear function. This thesis describes the architectures required to implement a newly design approach for producing fast and area-efficient 32-bit LNS arithmetic unit. The designs are structured based on two different algorithms. At first, a new cotransformation procedure is introduced in the singularity region whilst performing subtractions in which the technique capable to generate less total storage than the cotransformation method in the previous LNS architecture. Secondly, improvement to an existing interpolation process is proposed, that also reduce the total tables to an extent that allows their easy synthesis in logic. Consequently, the total delays in the system can be significantly reduced. According to the comparison analysis with previous best LNS design and floating-point units, it is shown that the new LNS architecture capable to offer significantly better in speed while sustaining its accuracy within floating-point limit. In addition, its implementation is more economical than previous best LNS system and almost equivalent with existing floating-point arithmetic unit.University Malaysia Perlis: Ministry of Higher Education, Malaysia

Newcastle University eTheses

Optimized linear, quadratic and cubic interpolators for elementary function hardware implementations

Author: Sadeghian Masoud
Stine James E.
Walters E. George, III
Publication venue: 'MDPI AG'
Publication date: 01/04/2016
Field of study

This paper presents a method for designing linear, quadratic and cubic interpolators that compute elementary functions using truncated multipliers, squarers and cubers. Initial coefficient values are obtained using a Chebyshev series approximation. A direct search algorithm is then used to optimize the quantized coefficient values to meet a user-specified error constraint. The algorithm minimizes coefficient lengths to reduce lookup table requirements, maximizes the number of truncated columns to reduce the area, delay and power of the arithmetic units, and minimizes the maximum absolute error of the interpolator output. The method can be used to design interpolators to approximate any function to a user-specified accuracy, up to and beyond 53-bits of precision (e.g., IEEE double precision significand). Linear, quadratic and cubic interpolator designs that approximate reciprocal, square root, reciprocal square root and sine are presented and analyzed. Area, delay and power estimates are given for 16, 24 and 32-bit interpolators that compute the reciprocal function, targeting a 65 nm CMOS technology from IBM. Results indicate the proposed method uses smaller arithmetic units and has reduced lookup table sizes compared to previously proposed methods. The method can be used to optimize coefficients in other systems while accounting for coefficient quantization as well as truncation and rounding effects of multiple arithmetic units.Peer reviewedElectrical and Computer Engineerin

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

SHAREOK repository

The design and multiplier-less realization of software radio receivers with reduced system delay

Author: Chan SC
Yeung KS
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004
Field of study

This paper studies the design and multiplier-less realization of a new software radio receiver (SRR) with reduced system delay. It employs low-delay finite-impulse response (FIR) and digital allpass filters to effectively reduce the system delay of the multistage decimators in SRRs. The optimal least-square and minimax designs of these low-delay FIR and allpass-based filters are formulated as a semidefinite programming (SDP) problem, which allows zero magnitude constraint at ω = π to be incorporated readily as additional linear matrix inequalities (LMIs). By implementing the sampling rate converter (SRC) using a variable digital filter (VDF) immediately after the integer decimators, the needs for an expensive programmable FIR filter in the traditional SRR is avoided. A new method for the optimal minimax design of this VDF-based SRC using SDP is also proposed and compared with traditional weight least squares method. Other implementation issues including the multiplier-less and digital signal processor (DSP) realizations of the SRR and the generation of the clock signal in the SRC are also studied. Design results show that the system delay and implementation complexities (especially in terms of high-speed variable multipliers) of the proposed architecture are considerably reduced as compared with conventional approaches. © 2004 IEEE.published_or_final_versio

HKU Scholars Hub

Concepts for on-board satellite image registration, volume 1

Author: Aanstoos J. V.
Daluge D. R.
Ruedger W. H.
Publication venue
Publication date
Field of study

The NASA-NEEDS program goals present a requirement for on-board signal processing to achieve user-compatible, information-adaptive data acquisition. One very specific area of interest is the preprocessing required to register imaging sensor data which have been distorted by anomalies in subsatellite-point position and/or attitude control. The concepts and considerations involved in using state-of-the-art positioning systems such as the Global Positioning System (GPS) in concert with state-of-the-art attitude stabilization and/or determination systems to provide the required registration accuracy are discussed with emphasis on assessing the accuracy to which a given image picture element can be located and identified, determining those algorithms required to augment the registration procedure and evaluating the technology impact on performing these procedures on-board the satellite

NASA Technical Reports Server

On the Functional Test of Special Function Units in GPUs

Author: Guerrero-Balaguera Juan-David
Reorda Matteo Sonza
Rodriguez Condia Josie E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021
Field of study

The Graphics Processing Units (GPUs) usage has extended from graphic applications to others where their high computational power is exploited (e.g., to implement Artificial Intelligence algorithms). These complex applications usually need highly intensive computations based on floating-point transcendental functions. GPUs may efficiently compute these functions in hardware using ad hoc Special Function Units (SFUs). However, a permanent fault in such units could be very critical (e.g., in safety-critical automotive applications). Thus, test methodologies for SFUs are strictly required to achieve the target reliability and safety levels. In this work, we present a functional test method based on a Software-Based Self-Test (SBST) approach targeting the SFUs in GPUs. This method exploits different approaches to build a test program and applies several optimization strategies to exploit the GPU parallelism to speed up the test procedure and reduce the required memory. The effectiveness of this methodology was proven by resorting to an open-source GPU model (FlexGripPlus) compatible with NVIDIA GPUs. The experimental results show that the proposed technique achieves 90.75% of fault coverage and up to 94.26% of Testable Fault Coverage, reducing the required memory and test duration with respect to pseudorandom strategies proposed by other authors

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

A semi-definite programming (SDP) method for designing IIR sharp cut-off digital filters using frequency-response masking

Author: Chan SC
Chen HH
Ho KL
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004
Field of study

IEEE International Symposium on Circuits and Systems Proceedings, Vancouver, Canada, 23-26 May 2004This paper studies the design of frequency response masking (FRM) filters with infinite duration impulse response (IIR) model and masking sub-filters. They are useful in realizing sharp cutoff digital filters with low passband delays. The designs of the model and masking filters are carried out by means of semidefinite programming (SDP) and model order reduction. Design results show that low complexity FRM filters with low passband delay can be obtained.published_or_final_versio

HKU Scholars Hub

Nonuniform Fast Fourier Transforms Using Min-Max Interpolation

Author: Fessler Jeffrey A.
Sutton Bradley P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003
Field of study

The fast Fourier transform (FFT) is used widely in signal processing for efficient computation of the FT of finite-length signals over a set of uniformly spaced frequency locations. However, in many applications, one requires nonuniform sampling in the frequency domain, i.e., a nonuniform FT. Several papers have described fast approximations for the nonuniform FT based on interpolating an oversampled FFT. This paper presents an interpolation method for the nonuniform FT that is optimal in the min-max sense of minimizing the worst-case approximation error over all signals of unit norm. The proposed method easily generalizes to multidimensional signals. Numerical results show that the min-max approach provides substantially lower approximation errors than conventional interpolation methods. The min-max criterion is also useful for optimizing the parameters of interpolation kernels such as the Kaiser-Bessel function.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/85840/1/Fessler70.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

Recommended from our members

Reassessing the Paradigms of Statistical Model-Building

Author
Publication venue: Zürich : EMS Publ. House
Publication date: 01/01/2007
Field of study

Statistical model-building is the science of constructing models from data and from information about the data-generation process, with the aim of analysing those data and drawing inference from that analysis. Many statistical tasks are undertaken during this analysis; they include classification, forecasting, prediction and testing. Model-building has assumed substantial importance, as new technologies enable data on highly complex phenomena to be gathered in very large quantities. This creates a demand for more complex models, and requires the model-building process itself to be adaptive. The word “paradigm” refers to philosophies, frameworks and methodologies for developing and interpreting statistical models, in the context of data, and applying them for inference. In order to solve contemporary statistical problems it is often necessary to combine techniques from previously separate paradigms. The workshop addressed model-building paradigms that are at the frontiers of modern statistical research. It tried to create synergies, by delineating the connections and collisions among different paradigms. It also endeavoured to shape the future evolution of paradigms

Repositorium für Naturwissenschaften und Technik