Search CORE

109 research outputs found

Indicating Asynchronous Array Multipliers

Author: Balasubramanian P
Maskell D L
Publication venue
Publication date: 01/01/2019
Field of study

Multiplication is an important arithmetic operation that is frequently encountered in microprocessing and digital signal processing applications, and multiplication is physically realized using a multiplier. This paper discusses the physical implementation of many indicating asynchronous array multipliers, which are inherently elastic and modular and are robust to timing, process and parametric variations. We consider the physical realization of many indicating asynchronous array multipliers using a 32/28nm CMOS technology. The weak-indication array multipliers comprise strong-indication or weak-indication full adders, and strong-indication 2-input AND functions to realize the partial products. The multipliers were synthesized in a semi-custom ASIC design style using standard library cells including a custom-designed 2-input C-element. 4x4 and 8x8 multiplication operations were considered for the physical implementations. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one (RTO) handshake protocols were utilized for data communication, and the delay-insensitive dual-rail code was used for data encoding. Among several weak-indication array multipliers, a weak-indication array multiplier utilizing a biased weak-indication full adder and the strong-indication 2-input AND function is found to have reduced cycle time and power-cycle time product with respect to RTZ and RTO handshaking for 4x4 and 8x8 multiplications. Further, the 4-phase RTO handshaking is found to be preferable to the 4-phase RTZ handshaking for achieving enhanced optimizations of the design metrics.Comment: arXiv admin note: text overlap with arXiv:1903.0943

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Asynchronous Early Output Dual-Bit Full Adders Based on Homogeneous and Heterogeneous Delay-Insensitive Data Encoding

Author: Balasubramanian P
Prasad K
Publication venue
Publication date: 25/04/2017
Field of study

This paper presents the designs of asynchronous early output dual-bit full adders without and with redundant logic (implicit) corresponding to homogeneous and heterogeneous delay-insensitive data encoding. For homogeneous delay-insensitive data encoding only dual-rail i.e. 1-of-2 code is used, and for heterogeneous delay-insensitive data encoding 1-of-2 and 1-of-4 codes are used. The 4-phase return-to-zero protocol is used for handshaking. To demonstrate the merits of the proposed dual-bit full adder designs, 32-bit ripple carry adders (RCAs) are constructed comprising dual-bit full adders. The proposed dual-bit full adders based 32-bit RCAs incorporating redundant logic feature reduced latency and area compared to their non-redundant counterparts with no accompanying power penalty. In comparison with the weakly indicating 32-bit RCA constructed using homogeneously encoded dual-bit full adders containing redundant logic, the early output 32-bit RCA comprising the proposed homogeneously encoded dual-bit full adders with redundant logic reports corresponding reductions in latency and area by 22.2% and 15.1% with no associated power penalty. On the other hand, the early output 32-bit RCA constructed using the proposed heterogeneously encoded dual-bit full adder which incorporates redundant logic reports respective decreases in latency and area than the weakly indicating 32-bit RCA that consists of heterogeneously encoded dual-bit full adders with redundant logic by 21.5% and 21.3% with nil power overhead. The simulation results obtained are based on a 32/28nm CMOS process technology

arXiv.org e-Print Archive

AUT Scholarly Commons

Графы сигнальных переходов для схем асинхронного тракта данных

Author: Alex Kushnerov
Sergey Bystrov
Александр Кушнеров
Сергей Быстров
Publication venue: Yaroslavl State University
Publication date: 14/06/2023
Field of study

The paper proposes a method for constructing signal transition graphs (STGs), which are directly mapped into asynchronous circuits for data processing. The advantage of the proposed method is that the resulting circuits are not only output-persistent, but also conformant to the environment. In other approaches, the environment is specified implicitly and/or inexactly and therefore they guarantee only output persistence. The conformation can be verified if both the circuit and its environment are specified by STGs. As an example, we consider a module realizing the function AND2. This module can either wait for both 1s or evaluate the function as soon as at least one 0 arrives. For each case, we draw up a separate STG (scenario) and map it into NCL gates. To provide such a mapping, we specify the behaviors of NCL gates by STG protocols. For data path, such an STG always contains alternative branches with the so-called garbage transitions at the gate inputs. The garbage transitions on a certain wire mean that the circuit is sensitive to the delay in this wire. Ignoring the garbage may lead to a violation of conformation or/and output persistence. For example, in the combinational part of the NCL circuits, the garbage appears on the inputs of NCL gates, and therefore these circuits are not delay insensitive.В статье предлагается метод построения графов сигнальных переходов (STG), которые напрямую отображаются в схемы асинхронной обработки данных. Преимуществом предлагаемого метода является то, что полученные схемы не только неизменны по выходу (output-persistent), но и конформны внешней среде. В других подходах среда задаётся неявно и/или неточно, и поэтому они гарантируют только неизменность по выходу. Конформность можно проверить, если как схема, так и её внешняя среда заданы STG. В качестве примера мы рассматриваем модуль, реализующий функцию 2И. Этот модуль может либо ожидать лог. 1 на обоих входах, либо вычислить функцию, как только придёт хотя бы один 0. Для каждого случая мы составляем отдельный STG (сценарий) и отображаем его в элементы NCL. Чтобы обеспечить такое отображение, мы задаём поведение NCL элементов STG протоколами . Для тракта данных такой STG всегда содержит альтернативные ветви с так называемыми мусорными переключениями на входах элементов. Мусорные переключения на определенном проводе означают, что схема чувствительна к задержке в этом проводе. Игнорирование мусора может привести к нарушению конформности и/или неизменности по выходу. Например, в комбинационной части NCL схем мусор появляется на входах NCL элементов, поэтому эти схемы чувствительны к задержкам

Modeling and Analysis of Information Systems / Моделирование и анализ информационных систем (МАИС)

Latency Optimized Asynchronous Early Output Ripple Carry Adder based on Delay-Insensitive Dual-Rail Data Encoding

Author: Balasubramanian P
Prasad K
Publication venue
Publication date: 13/06/2017
Field of study

Asynchronous circuits employing delay-insensitive codes for data representation i.e. encoding and following a 4-phase return-to-zero protocol for handshaking are generally robust. Depending upon whether a single delay-insensitive code or multiple delay-insensitive code(s) are used for data encoding, the encoding scheme is called homogeneous or heterogeneous delay-insensitive data encoding. This article proposes a new latency optimized early output asynchronous ripple carry adder (RCA) that utilizes single-bit asynchronous full adders (SAFAs) and dual-bit asynchronous full adders (DAFAs) which incorporate redundant logic and are based on the delay-insensitive dual-rail code i.e. homogeneous data encoding, and follow a 4-phase return-to-zero handshaking. Amongst various RCA, carry lookahead adder (CLA), and carry select adder (CSLA) designs, which are based on homogeneous or heterogeneous delay-insensitive data encodings which correspond to the weak-indication or the early output timing model, the proposed early output asynchronous RCA that incorporates SAFAs and DAFAs with redundant logic is found to result in reduced latency for a dual-operand addition operation. In particular, for a 32-bit asynchronous RCA, utilizing 15 stages of DAFAs and 2 stages of SAFAs leads to reduced latency. The theoretical worst-case latencies of the different asynchronous adders were calculated by taking into account the typical gate delays of a 32/28nm CMOS digital cell library, and a comparison is made with their practical worst-case latencies estimated. The theoretical and practical worst-case latencies show a close correlation....Comment: arXiv admin note: text overlap with arXiv:1704.0761

arXiv.org e-Print Archive

AUT Scholarly Commons

Approximate Early Output Asynchronous Adders Based on Dual-Rail Data Encoding and 4-Phase Return-to-Zero and Return-to-One Handshaking

Author: Balasubramanian P
Publication venue
Publication date: 01/01/2017
Field of study

Approximate computing is emerging as an alternative to accurate computing due to its potential for realizing digital circuits and systems with low power dissipation, less critical path delay, and less area occupancy for an acceptable trade-off in the accuracy of results. In the domain of computer arithmetic, several approximate adders and multipliers have been designed and their potential have been showcased versus accurate adders and multipliers for practical digital signal processing applications. Nevertheless, in the existing literature, almost all the approximate adders and multipliers reported correspond to the synchronous design method. In this work, we consider robust asynchronous i.e. quasi-delay-insensitive realizations of approximate adders by employing delay-insensitive codes for data representation and processing, and the 4-phase handshake protocols for data communication. The 4-phase handshake protocols used are the return-to-zero and the return-to-one protocols. Specifically, we consider the implementations of 32-bit approximate adders based on the return-to-zero and return-to-one handshake protocols by adopting the delay-insensitive dual-rail code for data encoding. We consider a range of approximations varying from 4-bits to 20-bits for the least significant positions of the accurate 32-bit asynchronous adder. The asynchronous adders correspond to early output (i.e. early reset) type, which are based on the well-known ripple carry adder architecture. The experimental results show that approximate asynchronous adders achieve reductions in the design metrics such as latency, cycle time, average power dissipation, and silicon area compared to the accurate asynchronous adders. Further, the reductions in the design metrics are greater for the return-to-one protocol compared to the return-to-zero protocol. The design metrics were estimated using a 32/28nm CMOS technology.Comment: arXiv admin note: text overlap with arXiv:1711.0233

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)