This article presents two area/latency optimized gate level asynchronous full
adder designs which correspond to early output logic. The proposed full adders
are constructed using the delay-insensitive dual-rail code and adhere to the
four-phase return-to-zero handshaking. For an asynchronous ripple carry adder
(RCA) constructed using the proposed early output full adders, the
relative-timing assumption becomes necessary and the inherent advantages of the
relative-timed RCA are: (1) computation with valid inputs, i.e., forward
latency is data-dependent, and (2) computation with spacer inputs involves a
bare minimum constant reverse latency of just one full adder delay, thus
resulting in the optimal cycle time. With respect to different 32-bit RCA
implementations, and in comparison with the optimized strong-indication,
weak-indication, and early output full adder designs, one of the proposed early
output full adders achieves respective reductions in latency by 67.8, 12.3 and
6.1 %, while the other proposed early output full adder achieves corresponding
reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty.
Further, the proposed early output full adders based asynchronous RCAs enable
minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering
carry-propagation over the entire RCA width of 32-bits, and maximum reductions
in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical
carry chain length of 4 full adder stages, when compared to the least of the
cycle time estimates of various strong-indication, weak-indication, and early
output asynchronous RCAs of similar size. All the asynchronous full adders and
RCAs were realized using standard cells in a semi-custom design fashion based
on a 32/28 nm CMOS process technology