Abstract-Low power dissipation is an imperative requirement in the design of efficient Digital Signal Processing (DSP) systems, which are employed in many multimedia applications such as image and video processing. The Finite Impulse Response (FIR) filter is indispensable in the design of several such DSP applications. The output of these applications, either an image or a video, need only be nearly accurate for human perception. This tolerance to a loss of output quality can be exploited to design an energy-efficient system by using approximate computation. Moreover, the efficiency of a system can be improved multi-fold by using reversible logic, which aids the design of ultra-low-power systems. In this paper, we propose an approximate adder using a reversible Toffoli gate and employ it in designing NEON (Near-accurate and Efficient FIR filter for ultra low-power applicatiONs). Simulations carried out using Cadence design tools at the 45nm technology node show that the FIR filter designed using the proposed adder gives significantly better results than designs using the adders in the literature. Experimental results using ISCAS benchmarks and comparison with previous methods demonstrate the effectiveness of the proposed method. In addition to producing fewer garbage outputs, the FIR filter designed using the proposed adder yields a power reduction of 74%, a delay reduction of 64% and a Power-Delay Product reduction of 90.1%.
I. INTRODUCTION
Commonly used multimedia applications involving image or video processing have Digital Signal Processing blocks as their backbone. Finite Impulse Response (FIR) filters are widely used in these applications. The resultant output, usually an image or a video, need only be nearly accurate for human perception. This relaxation of the requirement for strictly accurate outputs permits approximate computation, a flexibility that can be utilized in designing low-power systems.
Earlier works targeting the design of low-power systems using approximate computation include Algorithmic Noise Tolerance (ANT) [1] [2] [3] [4], which focuses on limiting errors by using the concept of Voltage Overscaling. A low-power approximate adder is proposed in [5], which functions by separating the inputs into accurate and approximate parts. Probabilistic Computing has been proposed as an alternative in [6] [7] [8]; it produces inexact outputs in order to reduce power consumption and hardware complexity. Various inaccurate adders have been proposed in [9] [10], which embrace error to achieve reduced power dissipation.
An imprecise specification has been proposed in [11] which focuses on improving RTL specification by adding new semantic elements. Woo [12] has proposed an analog adder using approximate computation.
Of late, reversible logic has emerged as a paradigm in designing energy-efficient systems. Landauer [13] showed that, for irreversible logic computations, every bit of information lost results in the generation of kT ln 2 joules of heat energy, where k is the Boltzmann constant and T is the absolute temperature. Bennett [14] showed that this heat dissipation would not occur in reversible computation. Several reversible gates have been proposed, such as the Fredkin Gate [15], the Toffoli Gate [16] and the Feynman Gate [17] [18]. Low-power adder circuits are designed using a new reversible gate proposed in [19]. Using reversible gates, several low-power adders and multipliers are designed in [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30]. An FIR filter model is designed using reversible logic gates in [33]. In [34], an ultra low-power FIR filter is proposed using a subthreshold design approach. One of the important concerns in designing a reversible logic system is to use as few reversible logic gates as possible and to produce as few garbage outputs as possible.
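The Landauer bound quoted above can be evaluated numerically to give a sense of scale. The sketch below is illustrative only: the function name and the choice of 300 K as room temperature are assumptions, not values taken from this paper. At 300 K the bound works out to roughly 2.87e-21 J per erased bit.

```python
import math

# Exact SI value of the Boltzmann constant, in joules per kelvin.
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float, bits_erased: int = 1) -> float:
    """Minimum heat (J) dissipated when `bits_erased` bits are irreversibly erased.

    Implements E = n * k * T * ln(2). The function name is hypothetical.
    """
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


# Assumed room temperature of 300 K (an illustrative choice).
energy_300k = landauer_limit_joules(300.0)
print(f"{energy_300k:.3e} J per bit at 300 K")
```

The bound is a thermodynamic floor; practical CMOS circuits dissipate far more energy per operation than this figure.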
FIR filters use several addition operations. To design an energy-efficient low-power FIR filter, it is imperative to design low-power adder cells. Our approach is described as follows: First, an approximate full adder cell is designed using reversible logic. This adder cell is later utilized in the design of an FIR filter (NEON).
Despite the emergence of several approximation algorithms, little work has focused on approximating FIR filters. Previous work [33] presents the design of an FIR filter using reversible Fredkin gates. However, this design produces many garbage outputs and consumes considerable power, and thereby hinders the design of an efficient Digital Signal Processing system. To the best of our knowledge, this work presents the first approximate reversible adder which not only produces minimal garbage outputs but also yields significant reductions in terms of power, delay and Power-Delay Product.
The remainder of the paper is organized as follows. Section II outlines previous work related to the emergence of Approximation Algorithms and Reversible Logic. In Section III, an approximate full adder using reversible logic gates is presented. In Section IV, we propose NEON. In Section V, we compare NEON with FIR filters designed using existing adders in terms of power, delay and Power-Delay Product (PDP). Section VI concludes the paper.
II. PREVIOUS WORK
Approximate Computing has emerged as a paradigm-shifting technique in the design of ultra-low-power systems whose applications can tolerate error. Of the techniques proposed previously, we make use of the Logic Complexity Reduction technique, which minimizes the Boolean expressions [9] [10] [11], resulting in reduced hardware overhead and thereby reduced CPU time.
A. Reversible Logic
Reversible Logic enables low-power computation. Irreversible logic computations result in the dissipation of kT ln 2 joules of heat energy for every bit of information lost. In reversible systems, this heat dissipation does not occur, as the system's input vectors can be retrieved from its output vectors. There is a one-to-one correspondence between input and output assignments [13] [14] in reversible systems. Hence, an input vector (I) can be uniquely obtained from the output vector (O) in a reversible system. The block diagram of a reversible system is shown in Fig. 1. A k × k reversible gate can be represented as:
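A common way of writing this mapping in the reversible-logic literature is given below. The vector names $I_v$ and $O_v$ are assumed notation, since the original expression is not reproduced in the text; the double arrow denotes the one-to-one correspondence described above.

```latex
I_v = (I_1, I_2, \ldots, I_k), \qquad
O_v = (O_1, O_2, \ldots, O_k), \qquad
I_v \leftrightarrow O_v
```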
We propose the design of approximate Full-Adder cell using a reversible Toffoli gate.
B. Toffoli Gate
The Toffoli Gate is a 3 × 3 reversible gate with input and output equations as follows [16]:
The block diagram of the Toffoli Gate is shown in Fig. 2.
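As a reminder of the standard definition, a Toffoli (controlled-controlled-NOT) gate passes its two control inputs through unchanged and XORs their AND into the third line: P = A, Q = B, R = AB ⊕ C. The port names A, B, C, P, Q, R are assumed here and may differ from those used in the figures. The short sketch below enumerates the truth table and checks the two properties that make the gate reversible.

```python
from itertools import product
from typing import Tuple


def toffoli(a: int, b: int, c: int) -> Tuple[int, int, int]:
    """Standard Toffoli (CCNOT) gate: P = A, Q = B, R = (A AND B) XOR C."""
    return a, b, (a & b) ^ c


# Enumerate all 8 input vectors and their corresponding output vectors.
inputs = list(product((0, 1), repeat=3))
outputs = [toffoli(*bits) for bits in inputs]

# Reversibility: the 8 input vectors map onto 8 distinct output vectors.
is_bijective = len(set(outputs)) == len(inputs)

# The gate is its own inverse: applying it twice restores the input.
is_self_inverse = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)

for bits, out in zip(inputs, outputs):
    print(bits, "->", out)
```

Only the rows with A = B = 1 differ between input and output; there the third line is inverted.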
The above equation is implemented using a standard Toffoli gate to design the Proposed Adder which is presented in Fig. 3 . The Gate level structure of the Proposed Adder is shown in Fig. 4 . To evaluate the efficacy of the proposed design, we compare the Proposed Adder with reversible full adders and with approximate adders in the literature. These comparison results are presented in Table II . It can be observed that the proposed adder yields better results in terms of power and number of garbage outputs compared to adders existing in the literature.
IV. DESIGN OF NEON USING THE PROPOSED APPROXIMATE ADDER
In the previous section, the design of the Proposed Approximate Adder was presented and its efficiency was compared with that of existing designs in the literature. In this section, the design of NEON, an FIR filter of the kind widely used in multimedia applications, is presented using the Proposed Approximate Adder.
A. Finite Impulse Response (FIR) Filter
FIR filters are extensively used in many Digital Signal Processing systems intended for multimedia applications. The schematic of a typical FIR filter is presented in this section. An FIR filter produces a finite impulse response. For a causal discrete-time FIR filter of order 'N', each value of the output sequence is a weighted sum of the N + 1 most recent input values. The mathematical representation of a causal discrete-time FIR filter is presented below.
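In its standard textbook form, the weighted sum described above is y[n] = b_0 x[n] + b_1 x[n-1] + ... + b_N x[n-N], where the coefficient names b_i are assumed notation. The following is a minimal software sketch of that direct-form computation using exact arithmetic. It is not the NEON hardware design; the function name and the example coefficients are hypothetical. In hardware, the accumulation step is where an adder cell would be placed.

```python
from typing import List, Sequence


def fir_filter(x: Sequence[float], b: Sequence[float]) -> List[float]:
    """Direct-form FIR: y[n] = sum_{i=0}^{N} b[i] * x[n - i], with x[m] = 0 for m < 0."""
    order = len(b) - 1  # filter order N (number of taps minus one)
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i in range(order + 1):
            if n - i >= 0:  # causal filter: samples before the start count as zero
                acc += b[i] * x[n - i]  # exact addition (no approximation modeled)
        y.append(acc)
    return y


# Hypothetical 2nd-order coefficients, chosen only for illustration.
coeffs = [0.25, 0.5, 0.25]

# Feeding a unit impulse returns the coefficients themselves, followed by
# zeros, which is why the impulse response is finite.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
impulse_response = fir_filter(impulse, coeffs)
print(impulse_response)
```

An order-N filter performs N additions per output sample, so any per-adder saving in power or delay is multiplied across the filter.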
V. EXPERIMENTAL RESULTS
To demonstrate the efficacy of the proposed adder, we have compared its performance with the adders existing in the literature [9] [10] [30]. The simulations are carried out using the ISCAS Benchmark Suite [35]. The results are tabulated in Table III.
A. Performance Evaluation of NEON
In this section, we compare the efficiency of 8, 12 and 16 order NEON with the FIR filters using existing adders [9] [30] [31]. We have embedded each of these adders in the FIR filter (Fig. 5) and implemented the designs in Verilog HDL for 8, 12 and 16 bit widths of the input numbers. The designs were synthesized using Cadence design tools at the 45nm technology node. The performance of NEON in terms of Power, Delay and Power-Delay Product (PDP) is considered with respect to the bit width of the input numbers.
1) Analysis of FIR Filter in terms of Power Dissipation:
The performance of NEON in terms of power dissipation for orders 8, 12 and 16 is tabulated in Table IV. As the proposed adder produces only one garbage output, the power dissipation of NEON is much lower than that of the FIR filters using existing adders in the literature [9] [30] [31]. Fig. 6 depicts the average power dissipation of 8, 12 and 16 order NEON against the FIR filters using existing adders [9] [30] [31].
2) Analysis of FIR Filter in terms of Delay:
Table V presents the performance of NEON in terms of delay savings for FIR filters of order 8, 12 and 16. As the design of the Proposed Adder utilizes only a single reversible gate, a lower delay is obtained for NEON than for the FIR filters using existing adders in the literature [9] [30] [31]. The delay measurements of 8, 12 and 16 order NEON against the FIR filters designed using existing adders [9] [30] [31] are presented in Fig. 7.
3) Analysis of FIR Filter in terms of Power-Delay Product (PDP):
In Table VI, the utility of NEON in terms of Power-Delay Product (PDP) savings is shown. The PDP, the product of power and delay, is a measure of the energy efficiency of a system. The savings in power and delay achieved by NEON therefore reduce its Power-Delay Product as well. Table VI shows that the PDP of NEON is lower than that of the FIR filters using the adders in [9] [30] [31]. PDP measurements of 8, 12 and 16 order NEON against the FIR filters using adders in the literature [9] [30] [31] are shown in Fig. 8.
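Because the PDP is a product, relative savings in power and delay compound. The sketch below illustrates this with hypothetical baseline values that are not taken from Tables IV to VI; only the 74% and 64% reduction figures come from the abstract, and the function names are assumptions. For a single design point with those two reductions, the PDP reduction would be 1 - (0.26 × 0.36), about 90.6%. This is near the 90.1% quoted in the abstract, and the two need not match exactly when the quoted percentages are aggregated over several designs.

```python
def power_delay_product(power_watts: float, delay_seconds: float) -> float:
    """PDP in joules: the energy associated with one switching event or operation."""
    return power_watts * delay_seconds


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative saving of `improved` over `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical baseline figures, NOT values from the paper's tables.
baseline_power = 100e-6   # 100 microwatts
baseline_delay = 2e-9     # 2 nanoseconds

# Apply the reductions quoted in the abstract (74% power, 64% delay).
improved_power = baseline_power * (1 - 0.74)
improved_delay = baseline_delay * (1 - 0.64)

pdp_reduction = percent_reduction(
    power_delay_product(baseline_power, baseline_delay),
    power_delay_product(improved_power, improved_delay),
)
print(f"PDP reduction: {pdp_reduction:.2f}%")
```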
VI. CONCLUSIONS
In this paper, we have proposed an Approximate Reversible Adder and analyzed its performance in FIR filters of various orders. Our approach is aimed at achieving an ultra-low-power FIR filter design with minimal garbage outputs. We have compared the performance of the Proposed Adder in terms of power, delay and PDP with reversible adders and approximate adders in the literature. Using the Proposed Adder in FIR filters of order 8, 12 and 16 yielded significant reductions in power, delay and PDP in comparison with FIR filters designed using adders in the literature.
VII. ACKNOWLEDGEMENT
This research project was carried out at C-ACRL, Vardhaman College of Engineering. The authors would like to thank the management and faculty for their constant support throughout.
