205 research outputs found

    Efficient modular arithmetic units for low power cryptographic applications

    Get PDF
    The demand for high security in energy constrained devices such as mobiles and PDAs is growing rapidly. This leads to the need for efficient design of cryptographic algorithms which offer data integrity, authentication, non-repudiation and confidentiality of the encrypted data and communication channels. The public key cryptography is an ideal choice for data integrity, authentication and non-repudiation whereas the private key cryptography ensures the confidentiality of the data transmitted. The latter has an extremely high encryption speed but it has certain limitations which make it unsuitable for use in certain applications. Numerous public key cryptographic algorithms are available in the literature which comprise modular arithmetic modules such as modular addition, multiplication, inversion and exponentiation. Recently, numerous cryptographic algorithms have been proposed based on modular arithmetic which are scalable, do word based operations and efficient in various aspects. The modular arithmetic modules play a crucial role in the overall performance of the cryptographic processor. Hence, better results can be obtained by designing efficient arithmetic modules such as modular addition, multiplication, exponentiation and squaring. This thesis is organized into three papers, describes the efficient implementation of modular arithmetic units, application of these modules in International Data Encryption Algorithm (IDEA). Second paper describes the IDEA algorithm implementation using the existing techniques and using the proposed efficient modular units. The third paper describes the fault tolerant design of a modular unit which has online self-checking capability --Abstract, page iv

    Voting margin: A scheme for error-tolerant k nearest neighbors classifiers for machine learning

    Get PDF
    Machine learning (ML) techniques such as classifiers are used in many applications, some of which are related to safety or critical systems. In this case, correct processing is a strict requirement and thus ML algorithms (such as for classification) must be error tolerant. A naive approach to implement error tolerant classifiers is to resort to general protection techniques such as modular redundancy. However, modular redundancy incurs in large overheads in many metrics such as hardware utilization and power consumption that may not be acceptable in applications that run on embedded or battery powered systems. Another option is to exploit the algorithmic properties of the classifier to provide protection and error tolerance at a lower cost. This paper explores this approach for a widely used classifier, the k Nearest Neighbors ( k NNs), and proposes an efficient scheme to protect it against errors. The proposed technique is based on a time-based modular redundancy (TBMR) scheme. The proposed scheme exploits the intrinsic redundancy of k NNs to drastically reduce the number of re-computations needed to detect errors. This is achieved by noting that when voting among the k nearest neighbors has a large majority, an error in one of the voters cannot change the result, hence voting margin (VM). This observation has been refined and extended in the proposed VM scheme to also avoid re-computations in some cases in which the majority vote is tight. The VM scheme has been implemented and evaluated with publicly available data sets that cover a wide range of applications and settings. The results show that by exploiting the intrinsic redundancy of the classifier, the proposed scheme is able to reduce the cost compared to modular redundancy by more than 60 percent in all configurations evaluated.Pedro Reviriego and Josée Alberto Hernández would like to acknowledge the support of the TEXEO project TEC2016-80339-R funded by the Spanish Ministry of Economy and Competitivity and of the Madrid Community research project TAPIR-CM Grant no. P2018/TCS-4496

    Residue Number System Based Building Blocks for Applications in Digital Signal Processing

    Get PDF
    Předkládaná disertační práce se zabývá návrhem základních bloků v systému zbytkových tříd pro zvýšení výkonu aplikací určených pro digitální zpracování signálů (DSP). Systém zbytkových tříd (RNS) je neváhová číselná soustava, jež umožňuje provádět paralelizovatelné, vysokorychlostní, bezpečné a proti chybám odolné aritmetické operace, které jsou zpracovávány bez přenosu mezi řády. Tyto vlastnosti jej činí značně perspektivním pro použití v DSP aplikacích náročných na výpočetní výkon a odolných proti chybám. Typický RNS systém se skládá ze tří hlavních částí: převodníku z binárního kódu do RNS, který počítá ekvivalent vstupních binárních hodnot v systému zbytkových tříd, dále jsou to paralelně řazené RNS aritmetické jednotky, které provádějí aritmetické operace s operandy již převedenými do RNS. Poslední část pak tvoří převodník z RNS do binárního kódu, který převádí výsledek zpět do výchozího binárního kódu. Hlavním cílem této disertační práce bylo navrhnout nové struktury základních bloků výše zmiňovaného systému zbytkových tříd, které mohou být využity v aplikacích DSP. Tato disertační práce předkládá zlepšení a návrhy nových struktur komponent RNS, simulaci a také ověření jejich funkčnosti prostřednictvím implementace v obvodech FPGA. Kromě návrhů nové struktury základních komponentů RNS je prezentován také podrobný výzkum různých sad modulů, který je srovnává a determinuje nejefektivnější sadu pro různé dynamické rozsahy. Dalším z klíčových přínosů disertační práce je objevení a ověření podmínky určující výběr optimální sady modulů, která umožňuje zvýšit výkonnost aplikací DSP. Dále byla navržena aplikace pro zpracování obrazu využívající RNS, která má vůči klasické binární implementanci nižší spotřebu a vyšší maximální pracovní frekvenci. V závěru práce byla vyhodnocena hlavní kritéria při rozhodování, zda je vhodnější pro danou aplikaci využít binární číselnou soustavu nebo RNS.This doctoral thesis deals with designing residue number system based building blocks to enhance the performance of digital signal processing applications. The residue number system (RNS) is a non-weighted number system that provides carry-free, parallel, high speed, secure and fault tolerant arithmetic operations. These features make it very attractive to be used in high-performance and fault tolerant digital signal processing (DSP) applications. A typical RNS system consists of three main components; the first one is the binary to residue converter that computes the RNS equivalent of the inputs represented in the binary number system. The second component in this system is parallel residue arithmetic units that perform arithmetic operations on the operands already represented in RNS. The last component is the residue to binary converter, which converts the outputs back into their binary representation. The main aim of this thesis was to propose novel structures of the basic components of this system in order to be later used as fundamental units in DSP applications. This thesis encloses improving and designing novel structures of these components, simulating and verifying their efficiency via FPGA implementation. In addition to suggesting novel structures of basic RNS components, a detailed study on different moduli sets that compares and determines the most efficient one for different dynamic range requirements is also presented. One of the main outcomes of this thesis is concluding and verifying the main condition that should be met when choosing a moduli set, in order to improve the timing performance of a DSP application. An RNS-based image processing application is also proposed. Its efficiency, in terms of timing performance and power consumption, is proved via comparing it with a binary-based one. Finally, the main considerations that should be taken into account when choosing to use the binary number system or RNS are also discussed in details.

    NCL Implementation of Dual-Rail 2\u3csup\u3eS\u3c/sup\u3e Complement 8x8 Booth2 Multiplier using Static and Semi-Static Primitives

    Get PDF
    In this work, we use static and semi-static versions of NULL Convention Logic (NCL) primitives (i.e., threshold gates) to implement a dual-rail 8times8 2s complement multiplier using the Modified Booth2 algorithm for partial product generation and a Wallace tree for partial product summation. We establish the multiplier\u27s functionality utilizing VHDL-based simulations of the gate-level structural design. The design is then implemented at the transistor-level and layout-level using both static and semi-static threshold gates, for a 1.8V 0.18mum TSMC CMOS process; and these two implementations are compared in terms of area, power, and speed

    The 1992 4th NASA SERC Symposium on VLSI Design

    Get PDF
    Papers from the fourth annual NASA Symposium on VLSI Design, co-sponsored by the IEEE, are presented. Each year this symposium is organized by the NASA Space Engineering Research Center (SERC) at the University of Idaho and is held in conjunction with a quarterly meeting of the NASA Data System Technology Working Group (DSTWG). One task of the DSTWG is to develop new electronic technologies that will meet next generation electronic data system needs. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The NASA SERC is proud to offer, at its fourth symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories, the electronics industry, and universities. These speakers share insights into next generation advances that will serve as a basis for future VLSI design

    Approximation Opportunities in Edge Computing Hardware : A Systematic Literature Review

    Get PDF
    With the increasing popularity of the Internet of Things and massive Machine Type Communication technologies, the number of connected devices is rising. However, while enabling valuable effects to our lives, bandwidth and latency constraints challenge Cloud processing of their associated data amounts. A promising solution to these challenges is the combination of Edge and approximate computing techniques that allows for data processing nearer to the user. This paper aims to survey the potential benefits of these paradigms’ intersection. We provide a state-of-the-art review of circuit-level and architecture-level hardware techniques and popular applications. We also outline essential future research directions.publishedVersionPeer reviewe

    Low power predictable memory and processing architectures

    Get PDF
    Great demand in power optimized devices shows promising economic potential and draws lots of attention in industry and research area. Due to the continuously shrinking CMOS process, not only dynamic power but also static power has emerged as a big concern in power reduction. Other than power optimization, average-case power estimation is quite significant for power budget allocation but also challenging in terms of time and effort. In this thesis, we will introduce a methodology to support modular quantitative analysis in order to estimate average power of circuits, on the basis of two concepts named Random Bag Preserving and Linear Compositionality. It can shorten simulation time and sustain high accuracy, resulting in increasing the feasibility of power estimation of big systems. For power saving, firstly, we take advantages of the low power characteristic of adiabatic logic and asynchronous logic to achieve ultra-low dynamic and static power. We will propose two memory cells, which could run in adiabatic and non-adiabatic mode. About 90% dynamic power can be saved in adiabatic mode when compared to other up-to-date designs. About 90% leakage power is saved. Secondly, a novel logic, named Asynchronous Charge Sharing Logic (ACSL), will be introduced. The realization of completion detection is simplified considerably. Not just the power reduction improvement, ACSL brings another promising feature in average power estimation called data-independency where this characteristic would make power estimation effortless and be meaningful for modular quantitative average case analysis. Finally, a new asynchronous Arithmetic Logic Unit (ALU) with a ripple carry adder implemented using the logically reversible/bidirectional characteristic exhibiting ultra-low power dissipation with sub-threshold region operating point will be presented. The proposed adder is able to operate multi-functionally

    NASA Space Engineering Research Center Symposium on VLSI Design

    Get PDF
    The NASA Space Engineering Research Center (SERC) is proud to offer, at its second symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories and the electronics industry. These featured speakers share insights into next generation advances that will serve as a basis for future VLSI design. Questions of reliability in the space environment along with new directions in CAD and design are addressed by the featured speakers

    Survey of FPGA applications in the period 2000 – 2015 (Technical Report)

    Get PDF
    Romoth J, Porrmann M, Rückert U. Survey of FPGA applications in the period 2000 – 2015 (Technical Report).; 2017.Since their introduction, FPGAs can be seen in more and more different fields of applications. The key advantage is the combination of software-like flexibility with the performance otherwise common to hardware. Nevertheless, every application field introduces special requirements to the used computational architecture. This paper provides an overview of the different topics FPGAs have been used for in the last 15 years of research and why they have been chosen over other processing units like e.g. CPUs
    corecore