5,977 research outputs found
Area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders
This article presents two area/latency optimized gate level asynchronous full
adder designs which correspond to early output logic. The proposed full adders
are constructed using the delay-insensitive dual-rail code and adhere to the
four-phase return-to-zero handshaking. For an asynchronous ripple carry adder
(RCA) constructed using the proposed early output full adders, the
relative-timing assumption becomes necessary and the inherent advantages of the
relative-timed RCA are: (1) computation with valid inputs, i.e., forward
latency is data-dependent, and (2) computation with spacer inputs involves a
bare minimum constant reverse latency of just one full adder delay, thus
resulting in the optimal cycle time. With respect to different 32-bit RCA
implementations, and in comparison with the optimized strong-indication,
weak-indication, and early output full adder designs, one of the proposed early
output full adders achieves respective reductions in latency by 67.8, 12.3 and
6.1 %, while the other proposed early output full adder achieves corresponding
reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty.
Further, the proposed early output full adders based asynchronous RCAs enable
minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering
carry-propagation over the entire RCA width of 32-bits, and maximum reductions
in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical
carry chain length of 4 full adder stages, when compared to the least of the
cycle time estimates of various strong-indication, weak-indication, and early
output asynchronous RCAs of similar size. All the asynchronous full adders and
RCAs were realized using standard cells in a semi-custom design fashion based
on a 32/28 nm CMOS process technology
Indicating Asynchronous Array Multipliers
Multiplication is an important arithmetic operation that is frequently
encountered in microprocessing and digital signal processing applications, and
multiplication is physically realized using a multiplier. This paper discusses
the physical implementation of many indicating asynchronous array multipliers,
which are inherently elastic and modular and are robust to timing, process and
parametric variations. We consider the physical realization of many indicating
asynchronous array multipliers using a 32/28nm CMOS technology. The
weak-indication array multipliers comprise strong-indication or weak-indication
full adders, and strong-indication 2-input AND functions to realize the partial
products. The multipliers were synthesized in a semi-custom ASIC design style
using standard library cells including a custom-designed 2-input C-element. 4x4
and 8x8 multiplication operations were considered for the physical
implementations. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one
(RTO) handshake protocols were utilized for data communication, and the
delay-insensitive dual-rail code was used for data encoding. Among several
weak-indication array multipliers, a weak-indication array multiplier utilizing
a biased weak-indication full adder and the strong-indication 2-input AND
function is found to have reduced cycle time and power-cycle time product with
respect to RTZ and RTO handshaking for 4x4 and 8x8 multiplications. Further,
the 4-phase RTO handshaking is found to be preferable to the 4-phase RTZ
handshaking for achieving enhanced optimizations of the design metrics.Comment: arXiv admin note: text overlap with arXiv:1903.0943
Fast Quantum Modular Exponentiation
We present a detailed analysis of the impact on modular exponentiation of
architectural features and possible concurrent gate execution. Various
arithmetic algorithms are evaluated for execution time, potential concurrency,
and space tradeoffs. We find that, to exponentiate an n-bit number, for storage
space 100n (twenty times the minimum 5n), we can execute modular exponentiation
two hundred to seven hundred times faster than optimized versions of the basic
algorithms, depending on architecture, for n=128. Addition on a neighbor-only
architecture is limited to O(n) time when non-neighbor architectures can reach
O(log n), demonstrating that physical characteristics of a computing device
have an important impact on both real-world running time and asymptotic
behavior. Our results will help guide experimental implementations of quantum
algorithms and devices.Comment: to appear in PRA 71(5); RevTeX, 12 pages, 12 figures; v2 revision is
substantial, with new algorithmic variants, much shorter and clearer text,
and revised equation formattin
A technology based complexity model for reversible Cuccaro ripple-carry adder
Reversible logic provides an alternative to classical computing, that may overcome many of the power dissipation problems. The paper presents a simple complexity model, from the study of a cascade of Cuccaro adders processed in standard 0.35 micrometer CMOS technology
Hierarchical probabilistic macromodeling for QCA circuits
With the goal of building an hierarchical design methodology for quantum-dot cellular automata (QCA) circuits, we put forward a novel, theoretically sound, method for abstracting the behavior of circuit components in QCA circuit, such as majority logic, lines, wire-taps, cross-overs, inverters, and corners, using macromodels. Recognizing that the basic operation of QCA is probabilistic in nature, we propose probabilistic macromodels for standard QCA circuit elements based on conditional probability characterization, defined over the output states given the input states. Any circuit model is constructed by chaining together the individual logic element macromodels, forming a Bayesian network, defining a joint probability distribution over the whole circuit. We demonstrate three uses for these macromodel-based circuits. First, the probabilistic macromodels allow us to model the logical function of QCA circuits at an abstract level - the "circuit" level - above the current practice of layout level in a time and space efficient manner. We show that the circuit level model is orders of magnitude faster and requires less space than layout level models, making the design and testing of large QCA circuits efficient and relegating the costly full quantum-mechanical simulation of the temporal dynamics to a later stage in the design process. Second, the probabilistic macromodels abstract crucial device level characteristics such as polarization and low-energy error state configurations at the circuit level. We demonstrate how this macromodel-based circuit level representation can be used to infer the ground state probabilities, i.e., cell polarizations, a crucial QCA parameter. This allows us to study the thermal behavior of QCA circuits at a higher level of abstraction. Third, we demonstrate the use of these macromodels for error analysis. We show that low-energy state configurations of the macromodel circuit match those of the layout level, thus allowing us to isolate weak p- oints in circuits design at the circuit level itsel
- …