15 research outputs found
High-Performance Ternary (4:2) Compressor Based on Capacitive Threshold Logic
This paper presents a ternary (4:2) compressor, which is an important component in multiplication. However, the structure differs from the binary counterpart since the ternary model does not require carry signals. The method of capacitive threshold logic (CTL) is used to achieve the output signals directly. Unlike a previously presented similar structure, the entire capacitor network is divided into two parts. This segregation results in higher reliability and robustness against unwanted process, voltage, and temperature (PVT) variations. Simulations are performed in HSPICE with 32 nm CNFET technology. Simulation results demonstrate about 94% higher performance in terms of power-delay product (PDP) for the new design over the previous one.
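The remark that the ternary model needs no carry signals can be checked with a small behavioural sketch. It assumes the compressor simply encodes the sum of four ternary digits as two ternary output digits; the function name `ternary_compress_4_2` is hypothetical, and the sketch says nothing about the capacitive threshold logic circuit itself.

```python
from itertools import product


def ternary_compress_4_2(a, b, c, d):
    """Behavioural model only (assumed): map four trits to a (high, low) trit pair."""
    for x in (a, b, c, d):
        if x not in (0, 1, 2):
            raise ValueError("inputs must be ternary digits 0, 1 or 2")
    total = a + b + c + d         # ranges from 0 to 8
    return total // 3, total % 3  # 8 = 2*3 + 2, so both outputs stay ternary


# Exhaustive check over all 3**4 = 81 input combinations.
for trits in product(range(3), repeat=4):
    high, low = ternary_compress_4_2(*trits)
    assert high in (0, 1, 2) and low in (0, 1, 2)
    assert 3 * high + low == sum(trits)
```

Because the largest possible sum, 8, is also the largest value two ternary digits can hold, no extra carry-in or carry-out is needed in this model, unlike the binary (4:2) compressor.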
Ultra-Fast, High-Performance 8x8 Approximate Multipliers by a New Multicolumn 3,3:2 Inexact Compressor and its Derivatives
The multiplier, which plays a key role in many different applications, is a
time-consuming, energy-intensive computation block. Approximate computing is a
practical design paradigm that attempts to improve hardware efficacy while
keeping computation quality satisfactory. A novel multicolumn 3,3:2 inexact
compressor is presented in this paper. It takes three partial products from two
adjacent columns each for rapid partial product reduction. The proposed inexact
compressor and its derivatives enable us to design a high-speed approximate
multiplier. Then, another ultra-fast, highly efficient approximate multiplier is
achieved utilizing a systematic truncation strategy. The proposed multipliers
accumulate partial products in only two stages, one fewer stage than other
approximate multipliers in the literature. Implementation results from Synopsys
Design Compiler at the 45 nm technology node demonstrate nearly 11.11% higher
speed for the second proposed design over the fastest existing approximate
multiplier. Furthermore, the new approximate multipliers are applied to the
image processing application of image sharpening, and their performance in this
application is highly satisfactory. It is shown in this paper that the error
pattern of an approximate multiplier, in addition to the mean error distance
and error rate, has a direct effect on the outcomes of the image processing
application.
Comment: 21 pages, 18 figures, 6 tables
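The two metrics named in this abstract, error rate and mean error distance, can be computed exhaustively for any 8x8 approximate multiplier. The sketch below uses a hypothetical stand-in, `truncated_mul`, which simply drops the partial-product bits in the `k` least-significant columns; it is not the paper's 3,3:2 compressor or its truncation strategy, and all names are assumptions made for illustration.

```python
def exact_mul(a, b):
    """Reference multiplier."""
    return a * b


def truncated_mul(a, b, k=4, width=8):
    """Hypothetical approximate multiplier: ignore partial-product bits
    that fall in the k least-significant columns."""
    total = 0
    for i in range(width):
        for j in range(width):
            if i + j >= k and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total


def error_metrics(approx, width=8):
    """Return (error rate, mean error distance) over all input pairs."""
    n = 1 << width
    wrong = 0
    distance_sum = 0
    for a in range(n):
        for b in range(n):
            error_distance = abs(a * b - approx(a, b))
            wrong += error_distance != 0
            distance_sum += error_distance
    pairs = n * n
    return wrong / pairs, distance_sum / pairs


if __name__ == "__main__":
    er, med = error_metrics(truncated_mul)
    print(f"error rate = {er:.4f}, mean error distance = {med:.2f}")
```

Two multipliers can share similar values of both metrics while distributing their errors differently across the input space, which is the error-pattern effect the abstract refers to.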
Design and implementation of an ASIP-based cryptography processor for AES, IDEA, and MD5
In this paper, a new 32-bit ASIP-based crypto processor for AES, IDEA, and MD5 is designed. The instruction set consists of both general-purpose and specific instructions for the above cryptographic algorithms. The proposed architecture has nine function units and two data buses. It also has two types of 32-bit instruction formats for executing Memory Reference (M.R.), Register Reference (R.R.), and Input/Output Reference (I/O R.) instructions. The maximum achieved frequency is 166.916 MHz. The encoded output results of the encryption process of a 128-bit input block are obtained after 122, 146, and 170 clock cycles for AES-128, AES-192, and AES-256, respectively. Moreover, it takes 95 clock cycles to encrypt or decrypt a 64-bit input block by using IDEA. Finally, the MD5 hash algorithm requires 469 clock cycles to generate the coded outputs for a block of 512 bits. The performance of the proposed processor is compared to some previous and state-of-the-art implementations in terms of speed, latency, throughput, and flexibility.
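As a rough illustration of how the cycle counts above relate to throughput, the sketch below divides block size by latency at the stated maximum frequency. It assumes one block is processed at a time with no overlap between blocks and that the processor runs at its maximum frequency; the throughput figures reported in the paper itself may be derived differently. The names `CASES` and `throughput_mbps` are hypothetical.

```python
F_MAX_HZ = 166.916e6  # maximum frequency stated in the abstract

# (block size in bits, clock cycles per block), taken from the abstract
CASES = {
    "AES-128": (128, 122),
    "AES-192": (128, 146),
    "AES-256": (128, 170),
    "IDEA": (64, 95),
    "MD5": (512, 469),
}


def throughput_mbps(block_bits, cycles, f_hz=F_MAX_HZ):
    """Estimated throughput in Mbit/s, assuming no overlap between blocks."""
    return block_bits * f_hz / cycles / 1e6


if __name__ == "__main__":
    for name, (bits, cycles) in CASES.items():
        print(f"{name}: {throughput_mbps(bits, cycles):.2f} Mbit/s")
```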
New Current-Mode Integrated Ternary Min/Max Circuits without Constant Independent Current Sources
Novel designs of current-mode ternary minimum (AND) and maximum (OR) are proposed in this paper based on Carbon NanoTube Field Effect Transistors (CNTFET). First, these ternary operators are designed separately. Then, they are combined in order to generate both outputs concurrently in an integrated design. This integration results in the elimination of common parts when both functions are required at the same time. The third proposed current-mode integrated circuit generates both ternary operators using only 30 transistors. The new designs are composed of three main parts: (1) the part which converts current to voltage; (2) threshold detectors; and (3) the parallel paths through which the output current flows. Unlike the previously presented structure, there is no need for any constant current source within the new designs. This elimination leads to less static power dissipation. The second proposed current-mode segregated ternary minimum operates 43% faster and consumes 40% less power in comparison with a previously presented structure.
Comprehensive Survey of Ternary Full Adders: Statistics, Corrections, and Assessments
The history of ternary adders goes back more than six decades. Since
then, a multitude of ternary full adders (TFAs) have been presented in the
literature. This paper surveys them to identify the design methodologies and
logic families in use and their prevalence. Although
the number of papers about this topic is high, almost none of the previously
presented TFAs are in their simplest form. A large number of transistors could
have been eliminated by considering a partial TFA instead of a complete one.
Moreover, they could have been simplified even further by assuming a partial
TFA where the voltage of the output carry is either 0V or VDD. This way, less
static power would be dissipated. This gives strong motivation to correct
and enhance the previous designs. Furthermore, earlier papers used differing
simulation setups that are not realistic enough. Therefore, the
simulation results reported in the previous papers are neither comparable nor
entirely valid. Among the 75 papers in which a new design of TFA has been
given, 11 papers are selected, simplified, and simulated in this paper.
Simulations are carried out in HSPICE with 32 nm CNFET technology while
considering a standard test-bed and a complete input pattern to reveal the
maximum cell delay. The simplified partial TFAs outperform their original
versions in delay, power, and transistor count.
Comment: 26 pages, 20 figures, 16 tables
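The arithmetic behind the partial-TFA simplification can be sketched as follows. The sketch assumes that "partial" means the input carry is restricted to 0 or 1; this reading and the names `ternary_full_adder` and `carry_values` are assumptions, not taken from the abstract.

```python
from itertools import product


def ternary_full_adder(a, b, cin):
    """Behavioural model: return (sum, carry) for ternary digits a, b and carry-in."""
    total = a + b + cin
    return total % 3, total // 3


def carry_values(cin_range):
    """Collect every output-carry value that can occur for the given carry-in range."""
    return {
        ternary_full_adder(a, b, cin)[1]
        for a, b, cin in product(range(3), range(3), cin_range)
    }


if __name__ == "__main__":
    print("complete TFA carries:", sorted(carry_values(range(3))))
    print("partial TFA carries: ", sorted(carry_values(range(2))))
```

Under this assumption the output carry of the partial adder takes only two values, so it could be represented with the two rail voltages rather than an intermediate level, which is consistent with the abstract's remark about an output carry at 0V or VDD.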