Search CORE

119 research outputs found

Efficient modular arithmetic units for low power cryptographic applications

Author: Modugu Rajashekhar Reddy
Publication venue: Scholars\u27 Mine
Publication date: 01/01/2010
Field of study

The demand for high security in energy constrained devices such as mobiles and PDAs is growing rapidly. This leads to the need for efficient design of cryptographic algorithms which offer data integrity, authentication, non-repudiation and confidentiality of the encrypted data and communication channels. The public key cryptography is an ideal choice for data integrity, authentication and non-repudiation whereas the private key cryptography ensures the confidentiality of the data transmitted. The latter has an extremely high encryption speed but it has certain limitations which make it unsuitable for use in certain applications. Numerous public key cryptographic algorithms are available in the literature which comprise modular arithmetic modules such as modular addition, multiplication, inversion and exponentiation. Recently, numerous cryptographic algorithms have been proposed based on modular arithmetic which are scalable, do word based operations and efficient in various aspects. The modular arithmetic modules play a crucial role in the overall performance of the cryptographic processor. Hence, better results can be obtained by designing efficient arithmetic modules such as modular addition, multiplication, exponentiation and squaring. This thesis is organized into three papers, describes the efficient implementation of modular arithmetic units, application of these modules in International Data Encryption Algorithm (IDEA). Second paper describes the IDEA algorithm implementation using the existing techniques and using the proposed efficient modular units. The third paper describes the fault tolerant design of a modular unit which has online self-checking capability --Abstract, page iv

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

CPU, GPU i FPGA implementacija MALD algoritma za otkrivanje nepravilnosti na površini keramičkih pločica

Author: Abramov A.
Backer M.
Blelloch G. E.
Brodtkorba A. R.
Costa C. E.
Crow F.
Fulkerson B.
Galiano V.
Gipp M.
Hocenksi Ž.
Hocenski V.
Hocenski Z.
Matić T.
Nimmagadda V. K.
Shum W. W.-K.
Tomov S.
Yuan Z.
Publication venue: 'KOREMA'
Publication date: 01/01/2014
Field of study

This paper addresses adjustments, implementation and performance comparison of the Moving Average with Local Difference (MALD) method for ceramic tile surface defects detection. Ceramic tile production process is completely autonomous, except the final stage where human eye is required for defects detection. Recent computational platform development and advances in machine vision provides us with several options for MALD algorithm implementation. In order to exploit the shortest execution time for ceramic tile production process, the MALD method is implemented on three different platforms: CPU, GPU and FPGA, and it is implemented on each platform in at least two ways. Implementations are done in MATLAB’s MEX/C++, C++, CUDA/C++, VHDL and Assembly programming languages. Execution times are measured and compared for different algorithms and their implementations on different computational platforms.U ovom radu razmatra se prilagodba, implementacija i usporedba performansi metode pomičnog usrednjavanja s lokalnom diferencijom (MALD) s primjenom u otkrivanju površinskih nedostataka na keramičkim pločicama. Proizvodna linija keramičkih pločica je autonomna sve do zadnje faze u kojoj je potreban ljudski vid kako bi se otkrili eventualni nedostaci na keramičkim pločicama. Nedavnim razvojem računalnih platformi i razvojem metoda računalnog vida omogućena je implementacija MALD metode na nekoliko načina. U nastojanju skraćenja vremena potrebnog za proizvodnju keramičkih pločica, MALD metoda je implementirana u trima različitim platformama: CPU (central processing unit), GPU (graphic processing unit) i FPGA (field programmable gate array), te s barem dva različita algoritma. Implementacija je izvršena sa MATLAB MEX/C++, C++, CUDA/C++, VHDL te Asembler programskim jezicima. Izmjerena vremena obrade su me.usobno uspore.ena za različite algoritme i njihove implementacije na različitim računalnim platformama

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Recommended from our members

Low-cost duplication for separable error detection in computer arithmetic

Author: Sullivan Michael Brendan
Publication venue
Publication date: 18/02/2016
Field of study

Low-cost arithmetic error detection will be necessary in the future to ensure correct and safe system operation. However, current error detection mechanisms for arithmetic either have high area and energy overheads or are complex and offer incomplete protection against errors. Full duplication is simple, strong, and separable, but often is prohibitively costly. Alternative techniques such as arithmetic error coding require lower hardware and energy overheads than full duplication, but they do so at the expense of high design effort and error coverage holes. The goal of this research is to mitigate the deficiencies of duplication and arithmetic error coding to form an error detection scheme that may be readily employed in future systems. The techniques described by this work use a general duplication technique that employs an alternate number system in the duplicate arithmetic unit. These novel dual modular redundancy organizations are referred to as low-cost duplication, and they provide compelling efficiency and coverage advantages over prior arithmetic error detection mechanisms.Electrical and Computer Engineerin

Texas ScholarWorks

Design of Soft Error Robust High Speed 64-bit Logarithmic Adder

Author: Shah Jaspal Singh
Publication venue: 'University of Waterloo'
Publication date: 01/01/2008
Field of study

Continuous scaling of the transistor size and reduction of the operating voltage have led to a significant performance improvement of integrated circuits. However, the vulnerability of the scaled circuits to transient data upsets or soft errors, which are caused by alpha particles and cosmic neutrons, has emerged as a major reliability concern. In this thesis, we have investigated the effects of soft errors in combinational circuits and proposed soft error detection techniques for high speed adders. In particular, we have proposed an area-efficient 64-bit soft error robust logarithmic adder (SRA). The adder employs the carry merge Sklansky adder architecture in which carries are generated every 4 bits. Since the particle-induced transient, which is often referred to as a single event transient (SET) typically lasts for 100~200 ps, the adder uses time redundancy by sampling the sum outputs twice. The sampling instances have been set at 110 ps apart. In contrast to the traditional time redundancy, which requires two clock cycles to generate a given output, the SRA generates an output in a single clock cycle. The sampled sum outputs are compared using a 64-bit XOR tree to detect any possible error. An energy efficient 4-input transmission gate based XOR logic is implemented to reduce the delay and the power in this case. The pseudo-static logic (PSL), which has the ability to recover from a particle induced transient, is used in the adder implementation. In comparison with the space redundant approach which requires hardware duplication for error detection, the SRA is 50% more area efficient. The proposed SRA is simulated for different operands with errors inserted at different nodes at the inputs, the carry merge tree, and the sum generation circuit. The simulation vectors are carefully chosen such that the SET is not masked by error masking mechanisms, which are inherently present in combinational circuits. Simulation results show that the proposed SRA is capable of detecting 77% of the errors. The undetected errors primarily result when the SET causes an even number of errors and when errors occur outside the sampling window

University of Waterloo's Institutional Repository

Recommended from our members

A Performance-Efficient and Practical Processor Error Recovery Framework

Author: Soman Jyothish
Publication venue: University of Cambridge
Publication date: 04/01/2019
Field of study

Continued reduction in the size of a transistor has affected the reliability of pro- cessors built using them. This is primarily due to factors such as inaccuracies while manufacturing, as well as non-ideal operating conditions, causing transistors to slow down consistently, eventually leading to permanent breakdown and erroneous operation of the processor. Permanent transistor breakdown, or faults, can occur at any point in time in the processor’s lifetime. Errors are the discrepancies in the output of faulty circuits. This dissertation shows that the components containing faults can continue operating if the errors caused by them are within certain bounds. Further, the lifetime of a processor can be increased by adding supportive structures that start working once the processor develops these hard errors. This dissertation has three major contributions, namely REPAIR, FaultSim and PreFix. REPAIR is a fault tolerant system with minimal changes to the processor design. It uses an external Instruction Re-execution Unit (IRU) to perform operations, which the faulty processor might have erroneously executed. Instructions that are found to use faulty hardware are then re-executed on the IRU. REPAIR shows that the performance overhead of such targeted re-execution is low for a limited number of faults. FaultSim is a fast fault-simulator capable of simulating large circuits at the transistor level. It is developed in this dissertation to understand the effect of faults on different circuits. It performs digital logic based simulations, trading off analogue accuracy with speed, while still being able to support most fault models. A 32-bit addition takes under 15 micro-seconds, while simulating more than 1500 transistors. It can also be integrated into an architectural simulator, which added a performance overhead of 10 to 26 percent to a simulation. The results obtained show that single faults cause an error in an adder in less than 10 percent of the inputs. PreFix brings together the fault models created using FaultSim and the design directions found using REPAIR. PreFix performs re-execution of instructions on a remote core, which pick up instructions to execute using a global instruction buffer. Error prediction and detection are used to reduce the number of re-executed instructions. PreFix has an area overhead of 3.5 percent in the setup used, and the performance overhead is within 5 percent of a fault-free case. This dissertation shows that faults in processors can be tolerated without explicitly switching off any component, and minimal redundancy is sufficient to achieve the same

Apollo (Cambridge)

The 1991 3rd NASA Symposium on VLSI Design

Author: Maki Gary K.
Publication venue
Publication date
Field of study

Papers from the symposium are presented from the following sessions: (1) featured presentations 1; (2) very large scale integration (VLSI) circuit design; (3) VLSI architecture 1; (4) featured presentations 2; (5) neural networks; (6) VLSI architectures 2; (7) featured presentations 3; (8) verification 1; (9) analog design; (10) verification 2; (11) design innovations 1; (12) asynchronous design; and (13) design innovations 2

NASA Technical Reports Server

A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits

Author: Chen Chuangtao
Han Jie
Qian Weikang
Wang Xuan
Wen Chenyi
Wu Ying
Xiao Weihua
Yin Xunzhao
Zhuo Cheng
Publication venue
Publication date: 29/06/2023
Field of study

Given the stringent requirements of energy efficiency for Internet-of-Things edge devices, approximate multipliers, as a basic component of many processors and accelerators, have been constantly proposed and studied for decades, especially in error-resilient applications. The computation error and energy efficiency largely depend on how and where the approximation is introduced into a design. Thus, this article aims to provide a comprehensive review of the approximation techniques in multiplier designs ranging from algorithms and architectures to circuits. We have implemented representative approximate multiplier designs in each category to understand the impact of the design techniques on accuracy and efficiency. The designs can then be effectively deployed in high-level applications, such as machine learning, to gain energy efficiency at the cost of slight accuracy loss.Comment: 38 pages, 37 figure

arXiv.org e-Print Archive

Design for pre-bond testability in 3D integrated circuits

Author: Lewis Dean Leon
Publication venue: Georgia Institute of Technology
Publication date: 17/08/2012
Field of study

In this dissertation we propose several DFT techniques specific to 3D stacked IC systems. The goal has explicitly been to create techniques that integrate easily with existing IC test systems. Specifically, this means utilizing scan- and wrapper-based techniques, two foundations of the digital IC test industry. First, we describe a general test architecture for 3D ICs. In this architecture, each tier of a 3D design is wrapped in test control logic that both manages tier test pre-bond and integrates the tier into the large test architecture post-bond. We describe a new kind of boundary scan to provide the necessary test control and observation of the partial circuits, and we propose a new design methodology for test hardcore that ensures both pre-bond functionality and post-bond optimality. We present the application of these techniques to the 3D-MAPS test vehicle, which has proven their effectiveness. Second, we extend these DFT techniques to circuit-partitioned designs. We find that boundary scan design is generally sufficient, but that some 3D designs require special DFT treatment. Most importantly, we demonstrate that the functional partitioning inherent in 3D design can potentially decrease the total test cost of verifying a circuit. Third, we present a new CAD algorithm for designing 3D test wrappers. This algorithm co-designs the pre-bond and post-bond wrappers to simultaneously minimize test time and routing cost. On average, our algorithm utilizes over 90% of the wires in both the pre-bond and post-bond wrappers. Finally, we look at the 3D vias themselves to develop a low-cost, high-volume pre-bond test methodology appropriate for production-level test. We describe the shorting probes methodology, wherein large test probes are used to contact multiple small 3D vias. This technique is an all-digital test method that integrates seamlessly into existing test flows. Our experimental results demonstrate two key facts: neither the large capacitance of the probe tips nor the process variation in the 3D vias and the probe tips significantly hinders the testability of the circuits. Taken together, this body of work defines a complete test methodology for testing 3D ICs pre-bond, eliminating one of the key hurdles to the commercialization of 3D technology.PhDCommittee Chair: Lee, Hsien-Hsin; Committee Member: Bakir, Muhannad; Committee Member: Lim, Sung Kyu; Committee Member: Vuduc, Richard; Committee Member: Yalamanchili, Sudhaka

Scholarly Materials And Research @ Georgia Tech

The 1992 4th NASA SERC Symposium on VLSI Design

Author: Whitaker Sterling R.
Publication venue
Publication date
Field of study

Papers from the fourth annual NASA Symposium on VLSI Design, co-sponsored by the IEEE, are presented. Each year this symposium is organized by the NASA Space Engineering Research Center (SERC) at the University of Idaho and is held in conjunction with a quarterly meeting of the NASA Data System Technology Working Group (DSTWG). One task of the DSTWG is to develop new electronic technologies that will meet next generation electronic data system needs. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The NASA SERC is proud to offer, at its fourth symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories, the electronics industry, and universities. These speakers share insights into next generation advances that will serve as a basis for future VLSI design

NASA Technical Reports Server