40,136 research outputs found
A Logic Simplification Approach for Very Large Scale Crosstalk Circuit Designs
Crosstalk computing, involving engineered interference between nanoscale
metal lines, offers a fresh perspective to scaling through co-existence with
CMOS. Through capacitive manipulations and innovative circuit style, not only
primitive gates can be implemented, but custom logic cells such as an Adder,
Subtractor can be implemented with huge gains. Our simulations show over 5x
density and 2x power benefits over CMOS custom designs at 16nm [1]. This paper
introduces the Crosstalk circuit style and a key method for large-scale circuit
synthesis utilizing existing EDA tool flow. We propose to manipulate the CMOS
synthesis flow by adding two extra steps: conversion of the gate-level netlist
to Crosstalk implementation friendly netlist through logic simplification and
Crosstalk gate mapping, and the inclusion of custom cell libraries for
automated placement and layout. Our logic simplification approach first
converts Cadence generated structured netlist to Boolean expressions and then
uses the majority synthesis tool to obtain majority functions, which is further
used to simplify functions for Crosstalk friendly implementations. We compare
our approach of logic simplification to that of CMOS and majority logic-based
approaches. Crosstalk circuits share some similarities to majority synthesis
that are typically applied to Quantum Cellular Automata technology. However,
our investigation shows that by closely following Crosstalk's core circuit
styles, most benefits can be achieved. In the best case, our approach shows 36%
density improvements over majority synthesis for MCNC benchmark
Area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders
This article presents two area/latency optimized gate level asynchronous full
adder designs which correspond to early output logic. The proposed full adders
are constructed using the delay-insensitive dual-rail code and adhere to the
four-phase return-to-zero handshaking. For an asynchronous ripple carry adder
(RCA) constructed using the proposed early output full adders, the
relative-timing assumption becomes necessary and the inherent advantages of the
relative-timed RCA are: (1) computation with valid inputs, i.e., forward
latency is data-dependent, and (2) computation with spacer inputs involves a
bare minimum constant reverse latency of just one full adder delay, thus
resulting in the optimal cycle time. With respect to different 32-bit RCA
implementations, and in comparison with the optimized strong-indication,
weak-indication, and early output full adder designs, one of the proposed early
output full adders achieves respective reductions in latency by 67.8, 12.3 and
6.1 %, while the other proposed early output full adder achieves corresponding
reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty.
Further, the proposed early output full adders based asynchronous RCAs enable
minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering
carry-propagation over the entire RCA width of 32-bits, and maximum reductions
in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical
carry chain length of 4 full adder stages, when compared to the least of the
cycle time estimates of various strong-indication, weak-indication, and early
output asynchronous RCAs of similar size. All the asynchronous full adders and
RCAs were realized using standard cells in a semi-custom design fashion based
on a 32/28 nm CMOS process technology
Desynchronization: Synthesis of asynchronous circuits from synchronous specifications
Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur
- …