Search CORE

257 research outputs found

A parallel VLSI architecture for a digital filter of arbitrary length using Fermat number transforms

Author: Reed I. S.
Shao H. M.
Truong T. K.
Yeh C. S.
Publication venue
Publication date
Field of study

A parallel architecture for computation of the linear convolution of two sequences of arbitrary lengths using the Fermat number transform (FNT) is described. In particular a pipeline structure is designed to compute a 128-point FNT. In this FNT, only additions and bit rotations are required. A standard barrel shifter circuit is modified so that it performs the required bit rotation operation. The overlap-save method is generalized for the FNT to compute a linear convolution of arbitrary length. A parallel architecture is developed to realize this type of overlap-save method using one FNT and several inverse FNTs of 128 points. The generalized overlap save method alleviates the usual dynamic range limitation in FNTs of long transform lengths. Its architecture is regular, simple, and expandable, and therefore naturally suitable for VLSI implementation

NASA Technical Reports Server

Towards a triple mode common operator FFT for Software Radio systems

Author: Al Ghouwayel Ali
El-Bazzal Zouhair
Haj-Ali Amin
Louët Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2012
Field of study

International audienceA scenario to design a Triple Mode FFT is addressed. Based on a Dual Mode FFT structure, we present a methodology to reach a triple mode FFT operator (TMFFT) able to operate over three different fields: complex number domain C, Galois Fields GF(Ft) and GF(2m). We propose a reconfigurable Triple mode Multiplier that constitutes the core of the Butterflybased FFT. A scalable and flexible unit for the polynomial reduction needed in the GF(2m) multiplication is also proposed. An FPGA implementation of the proposed multiplier is given and the measures show a gain of 18%in terms of performance-to-cost ratio compared to a "Velcro" approach where two self-contained operators are implemented separately

HAL-CentraleSupelec

HAL-Rennes 1

Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography

Author: baktir selcuk
Publication venue: Digital WPI
Publication date: 05/05/2008
Field of study

Efficient implementation of the number theoretic transform(NTT), also known as the discrete Fourier transform(DFT) over a finite field, has been studied actively for decades and found many applications in digital signal processing. In 1971 Schonhage and Strassen proposed an NTT based asymptotically fast multiplication method with the asymptotic complexity O(m log m log log m) for multiplication of

m

-bit integers or (m-1)st degree polynomials. Schonhage and Strassen\u27s algorithm was known to be the asymptotically fastest multiplication algorithm until Furer improved upon it in 2007. However, unfortunately, both algorithms bear significant overhead due to the conversions between the time and frequency domains which makes them impractical for small operands, e.g. less than 1000 bits in length as used in many applications. With this work we investigate for the first time the practical application of the NTT, which found applications in digital signal processing, to finite field multiplication with an emphasis on elliptic curve cryptography(ECC). We present efficient parameters for practical application of NTT based finite field multiplication to ECC which requires key and operand sizes as short as 160 bits in length. With this work, for the first time, the use of NTT based finite field arithmetic is proposed for ECC and shown to be efficient. We introduce an efficient algorithm, named DFT modular multiplication, for computing Montgomery products of polynomials in the frequency domain which facilitates efficient multiplication in GF(p^m). Our algorithm performs the entire modular multiplication, including modular reduction, in the frequency domain, and thus eliminates costly back and forth conversions between the frequency and time domains. We show that, especially in computationally constrained platforms, multiplication of finite field elements may be achieved more efficiently in the frequency domain than in the time domain for operand sizes relevant to ECC. This work presents the first hardware implementation of a frequency domain multiplier suitable for ECC and the first hardware implementation of ECC in the frequency domain. We introduce a novel area/time efficient ECC processor architecture which performs all finite field arithmetic operations in the frequency domain utilizing DFT modular multiplication over a class of Optimal Extension Fields(OEF). The proposed architecture achieves extension field modular multiplication in the frequency domain with only a linear number of base field GF(p) multiplications in addition to a quadratic number of simpler operations such as addition and bitwise rotation. With its low area and high speed, the proposed architecture is well suited for ECC in small device environments such as smart cards and wireless sensor networks nodes. Finally, we propose an adaptation of the Itoh-Tsujii algorithm to the frequency domain which can achieve efficient inversion in a class of OEFs relevant to ECC. This is the first time a frequency domain finite field inversion algorithm is proposed for ECC and we believe our algorithm will be well suited for efficient constrained hardware implementations of ECC in affine coordinates

DigitalCommons@WPI

A Reconfigurable Butterfly Architecture for Fourier and Fermat Transforms

Author: Al Ghouwayel Ali
Louët Yves
Palicot Jacques
Publication venue: HAL CCSD
Publication date: 01/01/2006
Field of study

International audienceReconfiguration is an essential part of Soft- Ware Radio (SWR) technology. Thanks to this technique, systems are designed for change in operating mode with the aim to carry out several types of computations. In this SWR context, the Fast Fourier Transform (FFT) operator was defined as a common operator for many classical telecommunications operations [1]. In this paper we propose a new architecture for this operator that makes it a device intended to perform two different transforms. The first one is the Fast Fourier Transform (FFT) used for the classical operations in the complex field. The second one is the Fermat Number Transform (FNT) in the Galois Field (GF) for channel coding and decoding

HAL-CentraleSupelec

HAL-Rennes 1

Orthogonal transforms and their application to image coding

Author: Nasrabadi N. M.
Nasrabadi N. M.
Publication venue: Department of Electrical Engineering, Imperial College London
Publication date: 01/01/1984
Field of study

Imperial Users onl

Spiral - Imperial College Digital Repository

Image processing using a two-dimensional digital convolution filter.

Author: Rathi Rajendra P.
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/1984
Field of study

Scholarship at UWindsor

Radix-2<sup>2</sup> Algorithm for the Odd New Mersenne Number Transform (ONMNT)

Author: Al-Aali Y
Boussakta S
Hamood MT
Publication venue: 'MDPI AG'
Publication date
Field of study

\ua9 2023 by the authors. This paper introduces a new derivation of the radix- (Formula presented.) fast algorithm for the forward odd new Mersenne number transform (ONMNT) and the inverse odd new Mersenne number transform (IONMNT). This involves introducing new equations and functions in finite fields, bringing particular challenges unlike those in other fields. The radix- (Formula presented.) algorithm combines the benefits of the reduced number of operations of the radix-4 algorithm and the simple butterfly structure of the radix-2 algorithm, making it suitable for various applications such as lightweight ciphers, authenticated encryption, hash functions, signal processing, and convolution calculations. The multidimensional linear index mapping technique is the conventional method used to derive the radix- (Formula presented.) algorithm. However, this method does not provide clear insights into the underlying structure and flexibility of the radix- (Formula presented.) approach. This paper addresses this limitation and proposes a derivation based on bit-unscrambling techniques, which reverse the ordering of the output sequence, resulting in efficient calculations with fewer operations. Butterfly and signal flow diagrams are also presented to illustrate the structure of the fast algorithm for both ONMNT and IONMNT. The proposed method should pave the way for efficient and flexible implementation of ONMNT and IONMNT in applications such as lightweight ciphers and signal processing. The algorithm has been implemented in C and is validated with an example

Newcastle University E-Prints

Design of microprocessor-based hardware for number theoretic transform implementation

Author: Shamim Anwar Ahmed
Publication venue
Publication date: 01/01/1983
Field of study

Number Theoretic Transforms (NTTs) are defined in a finite ring of integers Z (_M), where M is the modulus. All the arithmetic operations are carried out modulo M. NTTs are similar in structure to DFTs, hence fast FFT type algorithms may be used to compute NTTs efficiently. A major advantage of the NTT is that it can be used to compute error free convolutions, unlike the FFT it is not subject to round off and truncation errors. In 1976 Winograd proposed a set of short length DFT algorithms using a fewer number of multiplications and approximately the same number of additions as the Cooley-Tukey FFT algorithm. This saving is accomplished at the expense of increased algorithm complexity. These short length DFT algorithms may be combined to perform longer transforms. The Winograd Fourier Transform Algorithm (WFTA) was implemented on a TMS9900 microprocessor to compute NTTs. Since multiplication conducted modulo M is very time consuming a special purpose external hardware modular multiplier was designed, constructed and interfaced with the TMS9900 microprocessor. This external hardware modular multiplier allowed an improvement in the transform execution time. Computation time may further be reduced by employing several microprocessors. Taking advantage of the inherent parallelism of the WFTA, a dedicated parallel microprocessor system was designed and constructed to implement a 15-point WFTA in parallel. Benchmark programs were written to choose a suitable microprocessor for the parallel microprocessor system. A master or a host microprocessor is used to control the parallel microprocessor system and provides an interface to the outside world. An analogue to digital (A/D) and a digital to analogue (D/A) converter allows real time digital signal processing

Durham e-Theses

OpenGrey Repository