450 research outputs found

    Design, Verification, Test and In-Field Implications of Approximate Computing Systems

    Get PDF
    Today, the concept of approximation in computing is becoming more and more a “hot topic” to investigate how computing systems can be more energy efficient, faster, and less complex. Intuitively, instead of performing exact computations and, consequently, requiring a high amount of resources, Approximate Computing aims at selectively relaxing the specifications, trading accuracy off for efficiency. While Approximate Computing gives several promises when looking at systems’ performance, energy efficiency and complexity, it poses significant challenges regarding the design, the verification, the test and the in-field reliability of Approximate Computing systems. This tutorial paper covers these aspects leveraging the experience of the authors in the field to present state-of-the-art solutions to apply during the different development phases of an Approximate Computing system

    X-Rel: Energy-Efficient and Low-Overhead Approximate Reliability Framework for Error-Tolerant Applications Deployed in Critical Systems

    Full text link
    Triple Modular Redundancy (TMR) is one of the most common techniques in fault-tolerant systems, in which the output is determined by a majority voter. However, the design diversity of replicated modules and/or soft errors that are more likely to happen in the nanoscale era may affect the majority voting scheme. Besides, the significant overheads of the TMR scheme may limit its usage in energy consumption and area-constrained critical systems. However, for most inherently error-resilient applications such as image processing and vision deployed in critical systems (like autonomous vehicles and robotics), achieving a given level of reliability has more priority than precise results. Therefore, these applications can benefit from the approximate computing paradigm to achieve higher energy efficiency and a lower area. This paper proposes an energy-efficient approximate reliability (X-Rel) framework to overcome the aforementioned challenges of the TMR systems and get the full potential of approximate computing without sacrificing the desired reliability constraint and output quality. The X-Rel framework relies on relaxing the precision of the voter based on a systematical error bounding method that leverages user-defined quality and reliability constraints. Afterward, the size of the achieved voter is used to approximate the TMR modules such that the overall area and energy consumption are minimized. The effectiveness of employing the proposed X-Rel technique in a TMR structure, for different quality constraints as well as with various reliability bounds are evaluated in a 15-nm FinFET technology. The results of the X-Rel voter show delay, area, and energy consumption reductions of up to 86%, 87%, and 98%, respectively, when compared to those of the state-of-the-art approximate TMR voters.Comment: This paper has been published in IEEE Transactions on Very Large Scale Integration (VLSI) System

    Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques

    Full text link
    The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus, typical computing paradigms in embedded systems and data centers are stressed to meet the worldwide demand for high performance. Concurrently, the landscape of the semiconductor field in the last 15 years has constituted power as a first-class design concern. As a result, the community of computing systems is forced to find alternative design approaches to facilitate high-performance and/or power-efficient computing. Among the examined solutions, Approximate Computing has attracted an ever-increasing interest, with research works applying approximations across the entire traditional computing stack, i.e., at software, hardware, and architectural levels. Over the last decade, there is a plethora of approximation techniques in software (programs, frameworks, compilers, runtimes, languages), hardware (circuits, accelerators), and architectures (processors, memories). The current article is Part I of our comprehensive survey on Approximate Computing, and it reviews its motivation, terminology and principles, as well it classifies and presents the technical details of the state-of-the-art software and hardware approximation techniques.Comment: Under Review at ACM Computing Survey

    A Study on Efficient Designs of Approximate Arithmetic Circuits

    Get PDF
    Approximate computing is a popular field where accuracy is traded with energy. It can benefit applications such as multimedia, mobile computing and machine learning which are inherently error resilient. Error introduced in these applications to a certain degree is beyond human perception. This flexibility can be exploited to design area, delay and power efficient architectures. However, care must be taken on how approximation compromises the correctness of results. This research work aims to provide approximate hardware architectures with error metrics and design metrics analyzed and their effects in image processing applications. Firstly, we study and propose unsigned array multipliers based on probability statistics and with approximate 4-2 compressors, full adders and half adders. This work deals with a new design approach for approximation of multipliers. The partial products of the multiplier are altered to introduce varying probability terms. Logic complexity of approximation is varied for the accumulation of altered partial products based on their probability. The proposed approximation is utilized in two variants of 16-bit multipliers. Synthesis results reveal that two proposed multipliers achieve power savings of 72% and 38% respectively compared to an exact multiplier. They have better precision when compared to existing approximate multipliers. Mean relative error distance (MRED) figures are as low as 7.6% and 0.02% for the proposed approximate multipliers, which are better than the previous state-of-the-art works. Performance of the proposed multipliers is evaluated with geometric mean filtering application, where one of the proposed models achieves the highest peak signal to noise ratio (PSNR). Second, approximation is proposed for signed Booth multiplication. Approximation is introduced in partial product generation and partial product accumulation circuits. In this work, three multipliers (ABM-M1, ABM-M2, and ABM-M3) are proposed in which the modified Booth algorithm is approximated. In all three designs, approximate Booth partial product generators are designed with different variations of approximation. The approximations are performed by reducing the logic complexity of the Booth partial product generator, and the accumulation of partial products is slightly modified to improve circuit performance. Compared to the exact Booth multiplier, ABM-M1 achieves up to 15% reduction in power consumption with an MRED value of 7.9 Ă— 10-4. ABM-M2 has power savings of up to 60% with an MRED of 1.1 Ă— 10-1. ABM-M3 has power savings of up to 50% with an MRED of 3.4 Ă— 10-3. Compared to existing approximate Booth multipliers, the proposed multipliers ABM-M1 and ABM-M3 achieve up to a 41% reduction in power consumption while exhibiting very similar error metrics. Image multiplication and matrix multiplication are used as case studies to illustrate the high performance of the proposed approximate multipliers. Third, distributed arithmetic based sum of products units approximation is analyzed. Sum of products units are key elements in many digital signal processing applications. Three approximate sum of products models which are based on distributed arithmetic are proposed. They are designed for different levels of accuracy. First model of approximate sum of products achieves an improvement up to 64% on area and 70% on power, when compared to conventional unit. Other two models provide an improvement of 32% and 48% on area and 54% and 58% on power, respectively, with a reduced error rate compared to the first model. Third model achieves MRED and normalized mean error distance (NMED) as low as 0.05% and 0.009%. Performance of approximate units is evaluated with a noisy image smoothing application, where the proposed models are capable of achieving higher PSNR than existing state of the art techniques. Fourth, approximation is applied in division architecture. Two approximation models are proposed for restoring divider. In the first design, approximation is performed at circuit level, where approximate divider cells are utilized in place of exact ones by simplifying the logic equations. In the second model, restoring divider is analyzed strategically and number of restoring divider cells are reduced by finding the portions of divisor and dividend with significant information. An approximation factor pp is used in both designs. In model 1, the design with p=8 has a 58% reduction in both area and power consumption compared to exact design, with a Q-MRED of 1.909 Ă— 10-2 and Q-NMED of 0.449 Ă— 10-2. The second model with an approximation factor p=4 has 54% area savings and 62% power savings compared to exact design. The proposed models are found to have better error metrics compared to existing designs, with better performance at similar error values. A change detection image processing application is used for real time assessment of proposed and existing approximate dividers and one of the models achieves a PSNR of 54.27 dB

    An Approximate Carry Estimating Simultaneous Adder with Rectification

    Full text link
    Approximate computing has in recent times found significant applications towards lowering power, area, and time requirements for arithmetic operations. Several works done in recent years have furthered approximate computing along these directions. In this work, we propose a new approximate adder that employs a carry prediction method. This allows parallel propagation of the carry allowing faster calculations. In addition to the basic adder design, we also propose a rectification logic which would enable higher accuracy for larger computations. Experimental results show that our adder produces results 91.2% faster than the conventional ripple-carry adder. In terms of accuracy, the addition of rectification logic to the basic design produces results that are more accurate than state-of-the-art adders like SARA and BCSA by 74%.Comment: To appear at the 30th ACM Great Lakes Symposium on VLS

    Two examples of approximate arithmetic to reduce hardware complexity and power consumption

    Get PDF
    © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.As the end of Moore's Law approaches, electronic system designers must find ways to keep up with the ever increasing computational demands of the modern era. Some computationally intensive applications, such as multimedia processing, computer vision and artificial intelligence, present a unique feature that makes them especially suitable for hardware-level optimizations: their inherent robustness to noise and errors. This allows circuit designers to relax the constraint that arithmetic operations, such as multiplications and additions, must be completely accurate. Instead, approximations can be used in the arithmetic units, enabling system-level reductions in hardware area and power consumption, as well as improvements in performance, while hardly affecting the output of the final application. In this work, we explore two approximate arithmetic techniques. First, we consider approximations at the circuit design level by implementing several approximate multiplier units and evaluating their accuracy when used in executing YOLOv3, a state-of-the-art camera-based object detection deep neural network. Second, we apply the technique of overscaling to induce approximations in adder circuits by aggressively undervoltaging and overclocking them, and we compare the behavior of exact and approximate adders under these conditions. We find that, on one hand, some approximate multipliers are able to execute the YOLO network with almost no effect on the results, and on the other, approximate adder circuits are much more resilient to overscaling techniques than exact adders.This work was partially supported by Spanish MCIN/AEI/10.13039/501100011033, Project PID2019-103869RB-C33.Peer ReviewedPostprint (author's final draft
    • …
    corecore