Indicating Asynchronous Array Multipliers
Multiplication is an important arithmetic operation that is frequently
encountered in microprocessing and digital signal processing applications, and
multiplication is physically realized using a multiplier. This paper discusses
the physical implementation of several indicating asynchronous array multipliers,
which are inherently elastic and modular and are robust to timing, process and
parametric variations. The multipliers are physically realized using a 32/28nm
CMOS technology. The
weak-indication array multipliers comprise strong-indication or weak-indication
full adders, and strong-indication 2-input AND functions to realize the partial
products. The multipliers were synthesized in a semi-custom ASIC design style
using standard library cells including a custom-designed 2-input C-element. 4x4
and 8x8 multiplication operations were considered for the physical
implementations. The 4-phase return-to-zero (RTZ) and the 4-phase return-to-one
(RTO) handshake protocols were utilized for data communication, and the
delay-insensitive dual-rail code was used for data encoding. Among several
weak-indication array multipliers, a weak-indication array multiplier utilizing
a biased weak-indication full adder and the strong-indication 2-input AND
function is found to have reduced cycle time and power-cycle time product with
respect to RTZ and RTO handshaking for 4x4 and 8x8 multiplications. Further,
the 4-phase RTO handshaking is found to be preferable to the 4-phase RTZ
handshaking for achieving enhanced optimizations of the design metrics.
Comment: arXiv admin note: text overlap with arXiv:1903.0943
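As an illustration of the dual-rail encoding and 4-phase RTZ handshaking named in the abstract above, here is a minimal behavioural sketch in Python. It is not the paper's gate-level design: the class `DualRailAnd`, the `(rail1, rail0)` tuple layout, and the choice of four C-elements feeding an OR are assumptions based on a common textbook construction of a strong-indication dual-rail AND.

```python
from typing import Optional, Tuple

DualRail = Tuple[int, int]          # assumed layout: (rail1, rail0)
SPACER_RTZ: DualRail = (0, 0)       # RTZ spacer: both rails low


def encode(bit: int) -> DualRail:
    """Encode one bit as a dual-rail codeword: 1 -> (1, 0), 0 -> (0, 1)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit else (0, 1)


def decode(word: DualRail) -> Optional[int]:
    """Return the bit for a valid codeword, or None for the RTZ spacer."""
    if word == SPACER_RTZ:
        return None
    if word == (1, 0):
        return 1
    if word == (0, 1):
        return 0
    raise ValueError("(1, 1) is not a legal dual-rail codeword under RTZ")


def c_element(a: int, b: int, prev: int) -> int:
    """2-input C-element: output follows the inputs when they agree, else holds."""
    return a if a == b else prev


class DualRailAnd:
    """Hypothetical strong-indication dual-rail AND built from four C-elements."""

    def __init__(self) -> None:
        self.state = [0, 0, 0, 0]   # C-elements for a1b1, a0b0, a0b1, a1b0

    def step(self, a: DualRail, b: DualRail) -> DualRail:
        a1, a0 = a
        b1, b0 = b
        pairs = [(a1, b1), (a0, b0), (a0, b1), (a1, b0)]
        self.state = [c_element(x, y, p) for (x, y), p in zip(pairs, self.state)]
        out1 = self.state[0]                                  # true only for 1 AND 1
        out0 = self.state[1] | self.state[2] | self.state[3]  # any input is 0
        return (out1, out0)
```

In this model the output becomes valid only after both inputs are valid, and returns to the spacer only after both inputs have returned to the spacer, which is the behaviour usually meant by strong indication.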
Latency Optimized Asynchronous Early Output Ripple Carry Adder based on Delay-Insensitive Dual-Rail Data Encoding
Asynchronous circuits employing delay-insensitive codes for data
representation, i.e. encoding, and following a 4-phase return-to-zero protocol
for handshaking are generally robust. Depending upon whether a single
delay-insensitive code or multiple delay-insensitive codes are used for data
encoding, the encoding scheme is called homogeneous or heterogeneous
delay-insensitive data encoding. This article proposes a new latency optimized
early output asynchronous ripple carry adder (RCA) that utilizes single-bit
asynchronous full adders (SAFAs) and dual-bit asynchronous full adders (DAFAs)
which incorporate redundant logic and are based on the delay-insensitive
dual-rail code, i.e. homogeneous data encoding, and follow a 4-phase
return-to-zero handshaking. Amongst various RCA, carry lookahead adder (CLA),
and carry select adder (CSLA) designs, which are based on homogeneous or
heterogeneous delay-insensitive data encodings which correspond to the
weak-indication or the early output timing model, the proposed early output
asynchronous RCA that incorporates SAFAs and DAFAs with redundant logic is
found to result in reduced latency for a dual-operand addition operation. In
particular, for a 32-bit asynchronous RCA, utilizing 15 stages of DAFAs and 2
stages of SAFAs leads to reduced latency. The theoretical worst-case latencies
of the different asynchronous adders were calculated by taking into account the
typical gate delays of a 32/28nm CMOS digital cell library, and are compared
with their estimated practical worst-case latencies. The theoretical and
practical worst-case latencies show a close correlation....
Comment: arXiv admin note: text overlap with arXiv:1704.0761
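The stage arithmetic in the abstract above (15 dual-bit stages plus 2 single-bit stages covering 32 bits) can be checked with a small functional model. This sketch captures only the bit-width partitioning and carry rippling; it says nothing about timing, early output behaviour, or redundant logic. The function name `ripple_add` and the placement of the single-bit stages at the most significant end are assumptions, since the abstract does not state the stage order.

```python
from typing import List, Tuple


def ripple_add(a: int, b: int, stage_widths: List[int],
               carry_in: int = 0) -> Tuple[int, int]:
    """Add a and b stage by stage, rippling the carry between stages.

    stage_widths lists the bit width of each adder stage from the least
    significant end. Returns (sum modulo 2**total_width, carry_out).
    """
    result = 0
    carry = carry_in
    shift = 0
    for width in stage_widths:
        mask = (1 << width) - 1
        # Each stage adds its slice of both operands plus the incoming carry.
        partial = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (partial & mask) << shift
        carry = partial >> width
        shift += width
    return result, carry


# Assumed ordering: 15 dual-bit stages first, then 2 single-bit stages.
STAGES_32 = [2] * 15 + [1] * 2
assert sum(STAGES_32) == 32   # 15 * 2 + 2 * 1 = 32 bits
```

For example, `ripple_add(0xFFFFFFFF, 1, STAGES_32)` wraps the 32-bit sum to zero and produces a carry-out of 1.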
Impact Ionization and Hot-Electron Injection Derived Consistently from Boltzmann Transport
We develop a quantitative model of the impact-ionization and hot-electron-injection processes in MOS devices from first principles. We begin by modeling hot-electron transport in the drain-to-channel depletion region using the spatially varying Boltzmann transport equation, and we analytically find a self-consistent distribution function in a two-step process. From the electron distribution function, we calculate the probabilities of impact ionization and hot-electron injection as functions of channel current, drain voltage, and floating-gate voltage. We compare our analytical model results to measurements in long-channel devices. The model simultaneously fits both the hot-electron-injection and impact-ionization data. These analytical results yield an energy-dependent impact-ionization collision rate that is consistent with numerically calculated collision rates reported in the literature.
Concepts for on-board satellite image registration. Volume 3: Impact of VLSI/VHSIC on satellite on-board signal processing
Anticipated major advances in integrated circuit technology in the near future are described, as well as their impact on satellite onboard signal processing systems. Dramatic improvements in chip density, speed, power consumption, and system reliability are expected from very large scale integration. The improvements expected from very large scale integration enable more intelligence to be placed on remote sensing platforms in space, meeting the goals of NASA's information adaptive system concept, a major component of the NASA End-to-End Data System program. A forecast of VLSI technological advances is presented, including a description of the Defense Department's very high speed integrated circuit program, a seven-year research and development effort.
A Multi-objective Perspective for Operator Scheduling using Fine-grained DVS Architecture
The stringent power budget of fine-grained power-managed digital integrated
circuits has driven chip designers to optimize power at the cost of area and
delay, which were the traditional cost criteria for circuit optimization. The
emerging scenario motivates us to revisit the classical operator scheduling
problem under the availability of DVFS-enabled functional units that can
trade off cycles against power. We study the design space defined by this
trade-off and present a branch-and-bound (B/B) algorithm to explore this state
space and report the Pareto-optimal front with respect to area and power. The
scheduling also aims at maximum resource sharing and is able to attain
sufficient area and power gains for complex benchmarks when timing constraints
are relaxed by a sufficient amount. Experimental results show that the algorithm
that operates without any user constraint (area/power) is able to solve the
problem for most available benchmarks, and the use of power budget or area
budget constraints leads to significant performance gain.
Comment: 18 pages, 6 figures, International journal of VLSI design &
Communication Systems (VLSICS
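To make the notion of a Pareto-optimal front over area and power concrete, the following Python sketch filters a set of candidate schedules down to the non-dominated ones. It is not the branch-and-bound algorithm from the abstract above: the function `pareto_front`, the brute-force dominance check, and the sample `(area, power)` values are all hypothetical and serve only to illustrate the concept.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # assumed layout: (area, power), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    A point q dominates p when q is no worse than p in both objectives
    and differs from p in at least one of them.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


# Hypothetical candidate schedules as (area, power) pairs.
candidates = [(10, 5.0), (8, 7.0), (12, 4.0), (10, 6.0), (9, 7.5)]
front = pareto_front(candidates)
```

Here `(10, 6.0)` is dominated by `(10, 5.0)` and `(9, 7.5)` by `(8, 7.0)`, so the front keeps the remaining three points. A branch-and-bound search would typically prune partial schedules that cannot reach this front rather than enumerate every candidate.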