175 research outputs found

FIR Filter Implementation Based on the RNS with Diminished-1 Encoded Channel

Author: Dragana Uros Zivaljevic
Negovan Stamenković
Vidosav Stojanović
Publication venue: International Science and Engineering Society
Publication date: 01/01/2013

The use of reversible logic gates in the design of residue number systems

Author: Asadpour Ailin
Emrani Zarandi Azadeh Alsadat
Molahosseini Amir Sabbagh
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/04/2023

Reversible computing is an emerging technique to achieve ultra-low-power circuits. Reversible arithmetic circuits allow for achieving energy-efficient high-performance computational systems. Residue number systems (RNS) provide parallel and fault-tolerant additions and multiplications without carry propagation between residue digits. The parallelism and fault-tolerance features of RNS can be leveraged to achieve high-performance reversible computing. This paper proposed RNS full reversible circuits, including forward converters, modular adders and multipliers, and reverse converters used for a class of RNS moduli sets with the composite form {2^k, 2^p-1}. Modulo 2^n-1, 2^n, and 2^n+1 adders and multipliers were designed using reversible gates. Besides, reversible forward and reverse converters for the 3-moduli set {2^n-1, 2^(n+k), 2^n+1} have been designed. The proposed RNS-based reversible computing approach has been applied for consecutive multiplications with an improvement of above 15% in quantum cost after the twelfth iteration, and above 27% in quantum depth after the ninth iteration. The findings show that the use of the proposed RNS-based reversible computing in convolution results in a significant improvement in quantum depth in comparison to conventional methods based on weighted binary adders and multipliers.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Residue Number System Based Building Blocks for Applications in Digital Signal Processing

Author: Younes Dina
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2013

This doctoral thesis deals with designing residue number system based building blocks to enhance the performance of digital signal processing applications. The residue number system (RNS) is a non-weighted number system that provides carry-free, parallel, high-speed, secure and fault-tolerant arithmetic operations. These features make it very attractive for high-performance and fault-tolerant digital signal processing (DSP) applications. A typical RNS system consists of three main components. The first is the binary-to-residue converter, which computes the RNS equivalent of inputs represented in the binary number system. The second is a set of parallel residue arithmetic units that perform arithmetic operations on operands already represented in RNS. The last is the residue-to-binary converter, which converts the outputs back into their binary representation. The main aim of this thesis was to propose novel structures for the basic components of this system, to be used later as fundamental units in DSP applications. The thesis covers the improvement and design of novel structures for these components, and their simulation and verification via FPGA implementation. In addition to suggesting novel structures for basic RNS components, it presents a detailed study of different moduli sets that compares them and determines the most efficient one for different dynamic range requirements. One of the main outcomes of this thesis is the identification and verification of the main condition that should be met when choosing a moduli set in order to improve the timing performance of a DSP application. An RNS-based image processing application is also proposed. Its efficiency, in terms of timing performance and power consumption, is demonstrated by comparison with a binary-based one.
Finally, the main considerations that should be taken into account when choosing between the binary number system and RNS are discussed in detail.
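The three-stage RNS pipeline described in this abstract (binary-to-residue conversion, parallel residue arithmetic, residue-to-binary conversion) can be illustrated with a minimal software sketch. The function names, the choice of the moduli set {2^n-1, 2^n, 2^n+1} with n = 4, and the use of the Chinese Remainder Theorem for the reverse conversion are assumptions made for illustration only; they are not taken from the thesis, which targets hardware structures.

```python
from math import prod

# Hypothetical example moduli set {2^n - 1, 2^n, 2^n + 1} with n = 4.
# The moduli are pairwise coprime, so the dynamic range is their product.
N_BITS = 4
MODULI = (2**N_BITS - 1, 2**N_BITS, 2**N_BITS + 1)  # (15, 16, 17)


def to_rns(x, moduli=MODULI):
    """Forward converter: binary integer -> tuple of residues."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carry passes between channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication; each channel is independent."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Reverse converter using the Chinese Remainder Theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse (Python 3.8+).
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


if __name__ == "__main__":
    x, y = 25, 37
    print(to_rns(x), to_rns(y))
    print(from_rns(rns_mul(to_rns(x), to_rns(y))))  # equals x * y while below 15*16*17
```

Results are only recovered correctly while they stay below the dynamic range (here 15 × 16 × 17 = 4080), which is why the choice of moduli set matters for a given application.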

Digital library of Brno University of Technology

National Repository of Grey Literature

Number Systems for Deep Neural Network Architectures: A Survey

Author: Al-Qutayri Mahmoud
Alsuhli Ghada
Mohammad Baker
Sakellariou Vasileios
Saleh Hani
Stouraitis Thanos
Publication date: 11/07/2023

Deep neural networks (DNNs) have become an enabling component for a myriad of artificial intelligence applications. DNNs have shown sometimes superior performance, even compared to humans, in cases such as self-driving, health applications, etc. Because of their computational complexity, deploying DNNs in resource-constrained devices still faces many challenges related to computing complexity, energy efficiency, latency, and cost. To this end, several research directions are being pursued by both academia and industry to accelerate and efficiently implement DNNs. One important direction is determining the appropriate data representation for the massive amount of data involved in DNN processing. Using conventional number systems has been found to be sub-optimal for DNNs. Alternatively, a great body of research focuses on exploring suitable number systems. This article aims to provide a comprehensive survey and discussion about alternative number systems for more efficient representations of DNN data. Various number systems (conventional/unconventional) exploited for DNNs are discussed. The impact of these number systems on the performance and hardware design of DNNs is considered. In addition, this paper highlights the challenges associated with each number system and various solutions that are proposed for addressing them. The reader will be able to understand the importance of an efficient number system for DNN, learn about the widely used number systems for DNN, understand the trade-offs between various number systems, and consider various design aspects that affect the impact of number systems on DNN performance. In addition, the recent trends and related research opportunities will be highlighted. Comment: 28 pages

arXiv.org e-Print Archive

Efficient convolvers using the Polynomial Residue Number System technique

Author: Paruchuri Surendar
Publication venue: LSU Digital Commons
Publication date: 01/01/2002

The problem of computing linear convolution is a very important one because linear convolution lets us mechanize digital filtering. The linear convolution of two N-point sequences can be computed as the cyclic convolution of two 2N-point sequences: the original sequences, each padded with N zeros. The cyclic convolution of two N-point sequences requires multiplications and additions for its computation. A very efficient way of computing the cyclic convolution of two sequences is the Polynomial Residue Number System (PRNS) technique. Using this technique, the cyclic convolution of two N-point sequences can be computed using only N multiplications instead of N^2 multiplications. This is achieved through forward and inverse PRNS transformation mappings. These mappings rely on additions, subtractions and many scaling operations (multiplications by constants). The PRNS technique would lose much of its value if these many scaling operations were difficult to implement. In this thesis we show how to calculate the cyclic convolution of two sequences using the PRNS technique based on forward and inverse transformation mappings that rely on complement operations (negations), additions and rotation operations. These rotation operations do not require any computational hardware. Therefore the complicated hardware required for the scaling operations is replaced by rotators, which do not require any computational hardware.
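The relationship between linear and cyclic convolution stated in this abstract can be checked with a short sketch. It uses the direct N^2-multiplication form of cyclic convolution, not the PRNS mappings the thesis develops; the function names are hypothetical and both inputs are assumed to have the same length N.

```python
def cyclic_convolution(a, b):
    """Direct cyclic convolution of two equal-length sequences (N^2 multiplications)."""
    n = len(a)
    if len(b) != n:
        raise ValueError("sequences must have the same length")
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def linear_convolution(x, h):
    """Linear convolution of two N-point sequences via a 2N-point cyclic convolution."""
    n = len(x)
    if len(h) != n:
        raise ValueError("sequences must have the same length")
    # Pad each N-point sequence with N zeros so the cyclic wrap-around
    # only ever touches zero-valued samples.
    x_padded = list(x) + [0] * n
    h_padded = list(h) + [0] * n
    # The last of the 2N outputs is always zero; the result has 2N - 1 points.
    return cyclic_convolution(x_padded, h_padded)[: 2 * n - 1]


if __name__ == "__main__":
    print(linear_convolution([1, 2, 3], [4, 5, 6]))
```

The padding is what makes the two operations agree: without it, the tail of the linear result would wrap around and add into the first outputs.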

Louisiana State University

Montgomery and RNS for RSA Hardware Implementation

Author: Manochehri Kooroush
Pourmozafari Saadat
Sadeghian Babak
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012

There are many architectures for RSA hardware implementation which improve its performance. The two main methods for this purpose are Montgomery and RNS. These are fast methods for converting plaintext to ciphertext in the RSA algorithm with hardware implementation. RNS is faster than Montgomery but it uses more area. The goal of this paper is to compare these two methods in terms of speed and area used. For this purpose, the architecture with the better performance for each method is selected, and some modifications are made to enhance their performance. This comparison can be used to select the proper method for hardware implementation in both FPGA and ASIC design.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

A Reconfigurable Butterfly Architecture for Fourier and Fermat Transforms

Author: Al Ghouwayel Ali
Louët Yves
Palicot Jacques
Publication venue: HAL CCSD
Publication date: 01/01/2006

Reconfiguration is an essential part of Software Radio (SWR) technology. Thanks to this technique, systems are designed to change their operating mode in order to carry out several types of computations. In this SWR context, the Fast Fourier Transform (FFT) operator was defined as a common operator for many classical telecommunications operations [1]. In this paper we propose a new architecture for this operator that makes it a device intended to perform two different transforms. The first one is the Fast Fourier Transform (FFT) used for the classical operations in the complex field. The second one is the Fermat Number Transform (FNT) in the Galois Field (GF) for channel coding and decoding.

High-Throughput Hardware Architecture for the SWIFFT / SWIFFTX Hash Functions

Author: Guillaume Hanrot
Nicolas Brisebarre
Octavian Cret
Tamas Gyorfi
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 07/09/2012

Introduced in 1996 and greatly developed over the last few years, lattice-based cryptography offers a whole set of primitives with nice features, including provable security and asymptotic efficiency. Going from "asymptotic" to "real-world" efficiency seems important as the set of available primitives increases in size and functionality. In the present paper, we explore the improvements that can be obtained through the use of an FPGA architecture for implementing an ideal-lattice based cryptographic primitive. We chose to target two of the simplest, yet powerful and useful, lattice-based primitives, namely the SWIFFT and SWIFFTX primitives. Apart from being simple, those are also of central use for future primitives such as Lyubashevsky's lattice-based signatures. We present a high-throughput FPGA architecture for the SWIFFT and SWIFFTX primitives. One of the main features of this implementation is an efficient implementation of a variant of the Fast Fourier Transform of order 64 on Z_257. On a Virtex-5 LX110T FPGA, we are able to hash 0.6GB/s, which shows a ca. 16x speedup compared to SIMD implementations of the literature. We feel that this demonstrates the relevance of FPGA as a target architecture for the implementation of ideal-lattice based primitives.
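To make the "Fourier transform of order 64 on Z_257" concrete, the following is a minimal sketch of a length-64 number-theoretic transform modulo 257. It is a naive O(N^2) evaluation written for clarity, not the paper's FPGA architecture and not the exact transform variant used inside SWIFFT. The root of unity is an assumption derived here: 3 generates the multiplicative group modulo 257, so 3^(256/64) = 81 has order 64.

```python
P = 257                              # Fermat prime 2^8 + 1
N = 64                               # transform length
OMEGA = pow(3, (P - 1) // N, P)      # assumed primitive 64th root of unity (81)


def ntt(a, omega=OMEGA):
    """Naive length-64 transform over Z_257: A[i] = sum_j a[j] * omega^(i*j) mod P."""
    if len(a) != N:
        raise ValueError("input must have exactly 64 entries")
    return [sum(a[j] * pow(omega, i * j, P) for j in range(N)) % P
            for i in range(N)]


def intt(big_a):
    """Inverse transform: use omega^-1 and scale by N^-1 modulo P."""
    inv_n = pow(N, -1, P)            # modular inverse, Python 3.8+
    inv_omega = pow(OMEGA, -1, P)
    return [(inv_n * x) % P for x in ntt(big_a, inv_omega)]


if __name__ == "__main__":
    data = list(range(N))
    assert intt(ntt(data)) == data
    print(ntt(data)[:4])
```

Because every value fits in 9 bits and the modulus is 2^8 + 1, reductions modulo 257 are cheap in hardware, which is one reason this field is attractive for FPGA designs.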

CiteSeerX

Cryptology ePrint Archive

Digital signal processing application based on residue number system

Author: Rolko Maroš
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2011

This work deals with the residue number system and its applications in digital circuits. The first part covers the VHDL design of different adder types in the residue number system and their comparison with regular adders. The second part covers the VHDL implementation of an image processor that computes in the residue number system, together with its performance analysis. The text describes the design procedures and presents the results of the analyses.

Digital library of Brno University of Technology

National Repository of Grey Literature

A high-speed integrated circuit with applications to RSA Cryptography

Author: Onions Paul David
Publication venue: University of Plymouth
Publication date: 01/01/1995

The rapid growth in the use of computers and networks in government, commercial and private communications systems has led to an increasing need for these systems to be secure against unauthorised access and eavesdropping. To this end, modern computer security systems employ public-key ciphers, of which probably the most well known is the RSA cipher system, to provide both secrecy and authentication facilities. The basic RSA cryptographic operation is a modular exponentiation where the modulus and exponent are integers typically greater than 500 bits long. Therefore, obtaining reasonable encryption rates using the RSA cipher requires that it be implemented in hardware. This thesis presents the design of a high-performance VLSI device, called the WHiSpER chip, that can perform the modular exponentiations required by the RSA cryptosystem for moduli and exponents up to 506 bits long. The design has an expected throughput in excess of 64kbit/s, making it attractive for use both as a general RSA processor within the security function provider of a security system, and for direct use on moderate-speed public communication networks such as ISDN. The thesis investigates the low-level techniques used for implementing high-speed arithmetic hardware in general, and reviews the methods used by designers of existing modular multiplication/exponentiation circuits with respect to circuit speed and efficiency. A new modular multiplication algorithm, MMDDAMMM, based on Montgomery arithmetic, together with an efficient multiplier architecture, are proposed that remove the speed bottleneck of previous designs. Finally, the implementation of the new algorithm and architecture within the WHiSpER chip is detailed, along with a discussion of the application of the chip to ciphering and key generation.
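As background for the Montgomery-based modular exponentiation mentioned in this abstract, here is a minimal sketch of textbook Montgomery multiplication and a left-to-right square-and-multiply exponentiation built on it. This is not the thesis's MMDDAMMM algorithm or the WHiSpER architecture, whose details are not given in the abstract; the function names and the choice of R as the smallest power of two above the modulus are assumptions for illustration.

```python
def montgomery_params(n):
    """Return (r_bits, n_prime) for an odd modulus n, with R = 2**r_bits > n."""
    if n % 2 == 0:
        raise ValueError("Montgomery reduction needs an odd modulus")
    r_bits = n.bit_length()
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r          # n * n_prime == -1 (mod R)
    return r_bits, n_prime


def mont_mul(a, b, n, r_bits, n_prime):
    """Return a * b * R^-1 mod n; division by n is replaced by shifts and masks."""
    mask = (1 << r_bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask       # makes t + m*n divisible by R
    u = (t + m * n) >> r_bits
    return u - n if u >= n else u           # single conditional subtraction


def mont_pow(base, exponent, n):
    """Compute base**exponent mod n with Montgomery multiplications."""
    r_bits, n_prime = montgomery_params(n)
    r = 1 << r_bits
    r2 = (r * r) % n
    x = mont_mul(base % n, r2, n, r_bits, n_prime)   # base in Montgomery form
    acc = r % n                                      # 1 in Montgomery form
    for bit in bin(exponent)[2:]:                    # most significant bit first
        acc = mont_mul(acc, acc, n, r_bits, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, r_bits, n_prime)
    return mont_mul(acc, 1, n, r_bits, n_prime)      # convert back out


if __name__ == "__main__":
    # Toy RSA-style parameters, far smaller than the ~500-bit values in the text.
    print(mont_pow(65, 17, 3233) == pow(65, 17, 3233))
```

The appeal for hardware is that the reduction step needs only multiplications, additions, a mask and a shift, with one conditional subtraction at the end.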

Plymouth Electronic Archive and Research Library

OpenGrey Repository