116 research outputs found

    A Modified Architecture for Radix-4 Booth Multiplier with Adaptive Hold Logic

    Get PDF
    High speed digital multipliers are most efficiently used in many applications such as Fourier transform, discrete cosine transforms, and digital filtering. The throughput of the multipliers is based on speed of the multiplier, and then the entire performance of the circuit depends on it. The pMOS transistor in negative bias cause negative bias temperature instability (NBTI), which increases the threshold voltage of the transistor and reduces the multiplier speed. Similarly, the nMOS transistor in positive bias cause positive bias temperature instability (PBTI).These effects reduce the transistor speed and the system may fail due to timing violations. So here a new multiplier was designed with novel adaptive hold logic (AHL) using Radix-4 Modified Booth Multiplier. By using Radix-4 Modified Booth Encoding (MBE), we can reduce the number of partial products by half. Modified booth multiplier helps to provide higher throughput with low power consumption. This can adjust the AHL circuit to reduce the performance degradation. The expected result will be reduce threshold voltage, increase throughput and speed and also reduce power. This modified multiplier design is coded by Verilog and simulated using Xilinx ISE 12.1 and implemented in Spartan 3E FPGA kit

    PERFORMANCE EVALUATION OF BOOTH AND WALLACE MULTIPLIER USING FIR FILTER

    Get PDF
    An area-and speed efficient multipliers is proposed in the thesis. the proposed booth and Wallace multipliers shows the tradeoff in the performance evaluation for the fir filter applications. For implementation of fir filter in this paper the adders introduced are carry save adder and carry skip adder. For evaluating the fir filter performance the tested combinations are booth carry save , booth carry skip , Wallace carry save , Wallace carry skip

    Designing Approximate Computing Circuits with Scalable and Systematic Data-Driven Techniques

    Get PDF
    Semiconductor feature size has been shrinking significantly in the past decades. This decreasing trend of feature size leads to faster processing speed as well as lower area and power consumption. Among these attributes, power consumption has emerged as the primary concern in the design of integrated circuits in recent years due to the rapid increasing demand of energy efficient Internet of Things (IoT) devices. As a result, low power design approaches for digital circuits have become of great attractive in the past few years. To this end, approximate computing in hardware design has emerged as a promising design technique. It provides design opportunities to improve timing and energy efficiency by relaxing computing quality. This technique is feasible because of the error-resiliency of many emerging resource-hungry computational applications such as multimedia processing and machine learning. Thus, it is reasonable to utilize this characteristic to trade an acceptable amount of computing quality for energy saving. In the literature, most prior works on approximate circuit design focus on using manual design strategies to redesign fundamental computational blocks such as adders and multipliers. However, the manual design techniques are not suitable for system level hardware due to much higher design complexity. In order to tackle this challenge, we focus on designing scalable, systematic and general design methodologies that are applicable on any circuits. In this paper, we present two novel approximate circuit design methods based on machine learning techniques. Both methods skip the complicated manual analysis steps and primarily look at the given input-error pattern to generate approximate circuits. Our first work presents a framework for designing compensation block, an essential component in many approximate circuits, based on feature selection. Our second work further extends and optimizes this framework and integrates data-driven consideration into the design. Several case studies on fixed-width multipliers and other approximate circuits are presented to demonstrate the effectiveness of the proposed design methods. The experimental results show that both of the proposed methods are able to automatically and efficiently design low-error approximate circuits

    Design of Energy-Efficient Approximate Arithmetic Circuits

    Get PDF
    Energy consumption has become one of the most critical design challenges in integrated circuit design. Arithmetic computing circuits, in particular array-based arithmetic computing circuits such as adders, multipliers, squarers, have been widely used. In many cases, array-based arithmetic computing circuits consume a significant amount of energy in a chip design. Hence, reduction of energy consumption of array-based arithmetic computing circuits is an important design consideration. To this end, designing low-power arithmetic circuits by intelligently trading off processing precision for energy saving in error-resilient applications such as DSP, machine learning and neuromorphic circuits provides a promising solution to the energy dissipation challenge of such systems. To solve the chip’s energy problem, especially for those applications with inherent error resilience, array-based approximate arithmetic computing (AAAC) circuits that produce errors while having improved energy efficiency have been proposed. Specifically, a number of approximate adders, multipliers and squarers have been presented in the literature. However, the chief limitation of these designs is their un-optimized processing accuracy, which is largely due to the current lack of systemic guidance for array-based AAAC circuit design pertaining to optimal tradeoffs between error, energy and area overhead. Therefore, in this research, our first contribution is to propose a general model for approximate array-based approximate arithmetic computing to guide the minimization of processing error. As part of this model, the Error Compensation Unit (ECU) is identified as a key building block for a wide range of AAAC circuits. We develop theoretical analysis geared towards addressing two critical design problems of the ECU, namely, determination of optimal error compensation values and identification of the optimal error compensation scheme. We demonstrate how this general AAAC model can be leveraged to derive practical design insights that may lead to optimal tradeoffs between accuracy, energy dissipation and area overhead. To further minimize energy consumption, delay and area of AAAC circuits, we perform ECU logic simplification by introducing don't cares. By applying the proposed model, we propose an approximate 16x16 fixed-width Booth multiplier that consumes 44.85% and 28.33% less energy and area compared with theoretically the most accurate fixed-width Booth multiplier when implemented using a 90nm CMOS standard cell library. Furthermore, it reduces average error, max error and mean square error by 11.11%, 28.11% and 25.00%, respectively, when compared with the best reported approximate Booth multiplier and outperforms the best reported approximate design significantly by 19.10% in terms of the energy-delay-mean square error product (EDE_(ms)). Using the same approach, significant energy consumption, area and error reduction is achieved for a squarer unit, with more than 20.00% EDE_(ms) reduction over existing fixed-width squarer designs. To further reduce error and cost by utilizing extra signatures and don't cares, we demonstrate a 16-bit fixed-width squarer that improves the energy-delay-max error (EDE_(max)) by 15.81%

    Design and Evaluation of Approximate Logarithmic Multipliers for Low Power Error-Tolerant Applications

    Get PDF
    In this work, the designs of both non-iterative and iterative approximate logarithmic multipliers (LMs) are studied to further reduce power consumption and improve performance. Non-iterative approximate LMs (ALMs) that use three inexact mantissa adders, are presented. The proposed iterative approximate logarithmic multipliers (IALMs) use a set-one adder in both mantissa adders during an iteration; they also use lower-part-or adders and approximate mirror adders for the final addition. Error analysis and simulation results are also provided; it is found that the proposed approximate LMs with an appropriate number of inexact bits achieve a higher accuracy and lower power consumption than conventional LMs using exact units. Compared with conventional LMs with exact units, the normalized mean error distance (NMED) of 16-bit approximate LMs is decreased by up to 18% and the power-delay product (PDP) has a reduction of up to 37%. The proposed approximate LMs are also compared with previous approximate multipliers; it is found that the proposed approximate LMs are best suitable for applications allowing larger errors, but requiring lower energy consumption and low power. Approximate Booth multipliers fit applications with less stringent power requirements, but also requiring smaller errors. Case studies for error-tolerant computing applications are provided

    A Study on Efficient Designs of Approximate Arithmetic Circuits

    Get PDF
    Approximate computing is a popular field where accuracy is traded with energy. It can benefit applications such as multimedia, mobile computing and machine learning which are inherently error resilient. Error introduced in these applications to a certain degree is beyond human perception. This flexibility can be exploited to design area, delay and power efficient architectures. However, care must be taken on how approximation compromises the correctness of results. This research work aims to provide approximate hardware architectures with error metrics and design metrics analyzed and their effects in image processing applications. Firstly, we study and propose unsigned array multipliers based on probability statistics and with approximate 4-2 compressors, full adders and half adders. This work deals with a new design approach for approximation of multipliers. The partial products of the multiplier are altered to introduce varying probability terms. Logic complexity of approximation is varied for the accumulation of altered partial products based on their probability. The proposed approximation is utilized in two variants of 16-bit multipliers. Synthesis results reveal that two proposed multipliers achieve power savings of 72% and 38% respectively compared to an exact multiplier. They have better precision when compared to existing approximate multipliers. Mean relative error distance (MRED) figures are as low as 7.6% and 0.02% for the proposed approximate multipliers, which are better than the previous state-of-the-art works. Performance of the proposed multipliers is evaluated with geometric mean filtering application, where one of the proposed models achieves the highest peak signal to noise ratio (PSNR). Second, approximation is proposed for signed Booth multiplication. Approximation is introduced in partial product generation and partial product accumulation circuits. In this work, three multipliers (ABM-M1, ABM-M2, and ABM-M3) are proposed in which the modified Booth algorithm is approximated. In all three designs, approximate Booth partial product generators are designed with different variations of approximation. The approximations are performed by reducing the logic complexity of the Booth partial product generator, and the accumulation of partial products is slightly modified to improve circuit performance. Compared to the exact Booth multiplier, ABM-M1 achieves up to 15% reduction in power consumption with an MRED value of 7.9 Ă— 10-4. ABM-M2 has power savings of up to 60% with an MRED of 1.1 Ă— 10-1. ABM-M3 has power savings of up to 50% with an MRED of 3.4 Ă— 10-3. Compared to existing approximate Booth multipliers, the proposed multipliers ABM-M1 and ABM-M3 achieve up to a 41% reduction in power consumption while exhibiting very similar error metrics. Image multiplication and matrix multiplication are used as case studies to illustrate the high performance of the proposed approximate multipliers. Third, distributed arithmetic based sum of products units approximation is analyzed. Sum of products units are key elements in many digital signal processing applications. Three approximate sum of products models which are based on distributed arithmetic are proposed. They are designed for different levels of accuracy. First model of approximate sum of products achieves an improvement up to 64% on area and 70% on power, when compared to conventional unit. Other two models provide an improvement of 32% and 48% on area and 54% and 58% on power, respectively, with a reduced error rate compared to the first model. Third model achieves MRED and normalized mean error distance (NMED) as low as 0.05% and 0.009%. Performance of approximate units is evaluated with a noisy image smoothing application, where the proposed models are capable of achieving higher PSNR than existing state of the art techniques. Fourth, approximation is applied in division architecture. Two approximation models are proposed for restoring divider. In the first design, approximation is performed at circuit level, where approximate divider cells are utilized in place of exact ones by simplifying the logic equations. In the second model, restoring divider is analyzed strategically and number of restoring divider cells are reduced by finding the portions of divisor and dividend with significant information. An approximation factor pp is used in both designs. In model 1, the design with p=8 has a 58% reduction in both area and power consumption compared to exact design, with a Q-MRED of 1.909 Ă— 10-2 and Q-NMED of 0.449 Ă— 10-2. The second model with an approximation factor p=4 has 54% area savings and 62% power savings compared to exact design. The proposed models are found to have better error metrics compared to existing designs, with better performance at similar error values. A change detection image processing application is used for real time assessment of proposed and existing approximate dividers and one of the models achieves a PSNR of 54.27 dB

    A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits

    Full text link
    Given the stringent requirements of energy efficiency for Internet-of-Things edge devices, approximate multipliers, as a basic component of many processors and accelerators, have been constantly proposed and studied for decades, especially in error-resilient applications. The computation error and energy efficiency largely depend on how and where the approximation is introduced into a design. Thus, this article aims to provide a comprehensive review of the approximation techniques in multiplier designs ranging from algorithms and architectures to circuits. We have implemented representative approximate multiplier designs in each category to understand the impact of the design techniques on accuracy and efficiency. The designs can then be effectively deployed in high-level applications, such as machine learning, to gain energy efficiency at the cost of slight accuracy loss.Comment: 38 pages, 37 figure

    Lossy Polynomial Datapath Synthesis

    No full text
    The design of the compute elements of hardware, its datapath, plays a crucial role in determining the speed, area and power consumption of a device. The building blocks of datapath are polynomial in nature. Research into the implementation of adders and multipliers has a long history and developments in this area will continue. Despite such efficient building block implementations, correctly determining the necessary precision of each building block within a design is a challenge. It is typical that standard or uniform precisions are chosen, such as the IEEE floating point precisions. The hardware quality of the datapath is inextricably linked to the precisions of which it is composed. There is, however, another essential element that determines hardware quality, namely that of the accuracy of the components. If one were to implement each of the official IEEE rounding modes, significant differences in hardware quality would be found. But in the same fashion that standard precisions may be unnecessarily chosen, it is typical that components may be constructed to return one of these correctly rounded results, where in fact such accuracy is far from necessary. Unfortunately if a lesser accuracy is permissible then the techniques that exist to reduce hardware implementation cost by exploiting such freedom invariably produce an error with extremely difficult to determine properties. This thesis addresses the problem of how to construct hardware to efficiently implement fixed and floating-point polynomials while exploiting a global error freedom. This is a form of lossy synthesis. The fixed-point contributions include resource minimisation when implementing mutually exclusive polynomials, the construction of minimal lossy components with guaranteed worst case error and a technique for efficient composition of such components. Contributions are also made to how a floating-point polynomial can be implemented with guaranteed relative error.Open Acces

    An Efficient Implementation of Wallace Tree Multiplier

    Get PDF
    In Very large Scale Integration (VLSI) technology, power consumption and speed are the two important constraints for determining the efficiency of the architecture. This paper aspires at declining this parameters of the Wallace tree multiplier with the efficient use of modified Booth encoding and compressors.The proposed architecture is employed in Verilog HDL and it is simulated in Cadence NC Sim and synthesized using Encounter RTL Compiler in 180nm Taiwan Semiconductor Manufacturing Company(TSMC) slow library .The proposed architecture is found to be 42.2% faster than the conventional Wallace tree architecture and the power consumption lowered by 45% as compared to the conventional Wallace Tree
    • …
    corecore