Abstract-Near threshold computing has recently emerged as a promising approach to achieving minimum energy consumption. In this paper, the Dynamic Threshold MOS (DTMOS) technique is assessed in the context of a 10T full subtractor circuit designed to operate in the near threshold region. The performance parameters (energy, power, area, delay, and energy-delay product (EDP)) were computed and compared with those of the conventional CMOS (C-CMOS) full subtractor. The simulations were performed using Cadence 90 nm technology at an Ultra Low Voltage (ULV) of 0.3 V. The results show that the proposed 10T full subtractor circuit with the DTMOS scheme achieves more than 18% savings in delay, 26% savings in energy consumption, and 39% savings in EDP in comparison with the conventional CMOS configuration and other hybrid counterparts.
I. INTRODUCTION
As technology scales down to the nanometer regime and the demand grows for portable battery-operated devices such as laptops, calculators, mobile phones, wristwatches, and IoT devices, the energy consumption of digital circuits is becoming a major issue [1]. Supply voltage (V_dd) scaling serves as an effective knob for reducing energy consumption because the switching energy falls quadratically with V_dd. Figure 1 shows the values of V_dd across recent technology nodes. It can be observed that the scaling of V_dd has stalled at about 1 V. However, scaling V_dd further degrades circuit performance. In [3] it was shown that near threshold computing yields energy savings on the order of 10× with only about a 10× degradation in performance, as illustrated in figure 2.
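The quadratic dependence of switching energy on supply voltage can be illustrated with a short calculation. The following is a minimal sketch, assuming the simple model E = C · V_dd² and a hypothetical 1 fF load capacitance (the capacitance value is an assumption for illustration and cancels out of the ratio). It compares a nominal 1 V supply with the 0.3 V operating point used in this paper.

```python
def switching_energy(c_load: float, vdd: float) -> float:
    """Dynamic switching energy per transition, modelled as E = C * Vdd^2 (joules)."""
    return c_load * vdd ** 2


# Hypothetical load capacitance (1 fF); it cancels out of the ratio below.
C_LOAD = 1e-15

e_nominal = switching_energy(C_LOAD, 1.0)  # supply voltage of 1 V
e_ntc = switching_energy(C_LOAD, 0.3)      # near threshold supply of 0.3 V

# Ratio of the two energies: (1.0 / 0.3)^2
reduction = e_nominal / e_ntc
print(f"Switching energy reduction: {reduction:.1f}x")
```

Under this model the switching energy alone falls by roughly an order of magnitude, which is in line with the energy savings quoted from [3]; leakage energy, which grows in importance at low voltage, is not modelled here.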
To achieve maximum computational energy efficiency with acceptable performance, one solution is to operate digital circuits in the near threshold region, that is, with the supply voltage V_dd scaled close to the threshold voltage V_T [2, 3]. This is known as Near Threshold Computing (NTC).
Arithmetic operations play a crucial role in most signal and image processing applications [12] [13]. One of the fundamental arithmetic units is the 1-bit full subtractor. Evaluating the performance of the 1-bit full subtractor is therefore necessary for estimating the overall system performance [14] [15]. The basic logic diagram of the 1-bit full subtractor is shown in figure 3, and table 1 gives its truth table.
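For reference, the standard Boolean expressions of a 1-bit full subtractor computing A − B − B_IN are given below. The symbols D (difference) and B_OUT (output borrow) are names chosen here for illustration and may differ from the labels used in figure 3.

```latex
\begin{aligned}
D &= A \oplus B \oplus B_{IN} \\
B_{OUT} &= \overline{A}\,B + \overline{(A \oplus B)}\,B_{IN}
\end{aligned}
```

The second expression shows that the input borrow propagates to the output only when A and B are equal, while a borrow is generated unconditionally when A = 0 and B = 1.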
Many full subtractor designs employing different logic approaches have been proposed in the literature. C-CMOS [4], Pass Transistor Logic (PTL) [5], and Transmission Gate (TG) [6] [7] are the most conventional logic styles. Each has its own advantages and drawbacks with respect to one of the performance parameters: power, delay, and area. In [8], we proposed a full subtractor circuit designed with only two XOR gates and one multiplexer for area- and energy-efficient applications. These designs perform well in super threshold operation (V_dd > V_T), but the situation changes when V_dd is scaled down to the near threshold voltage: circuit performance degrades because delay increases exponentially as the supply voltage is scaled [1]. In addition, because of their reduced output swing, logic styles such as complementary pass-transistor logic (CPL) may fail functionally for some input test cases.
The same is true of our design in [8]. C-CMOS logic is the most suitable design style for subthreshold operation, as it provides full output swing and is more robust against process, voltage, and temperature (PVT) variations than the other logic styles [1] [2]. However, it requires a large number of transistors, which leads to high power consumption and long borrow propagation paths in full subtractor circuits.
In this paper, a popular body bias scheme known as DTMOS [9] is combined with our area- and energy-efficient full subtractor circuit to overcome the abovementioned issues in the near threshold region of operation. The remainder of the paper is organized as follows. The proposed full subtractor design methodology is described in section II, followed by the results and a comparative study in section III. Conclusions are drawn in section IV.
II. PROPOSED DESIGN
The circuit and the logic diagram of the proposed 1-bit full subtractor are shown in figures 4 and 5, respectively. The proposed design employs static CMOS logic [4] and Pass Transistor Logic (PTL) [5] together with the Dynamic Threshold MOS (DTMOS) scheme [9]. Its basic logic structure is similar to that of our earlier design in [8], which uses static CMOS and PTL logic. Two identical XOR gates generate the 'difference' output, and a multiplexer generates the 'borrow' output. Since the difference block consumes more power, XOR gates with a minimum transistor count are used to reduce power consumption. The propagation delay of the 'borrow' output is greatly reduced, as only one multiplexer is used to propagate the input borrow (B_IN).
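The two-XOR, one-multiplexer structure described above can be checked at the logic level with a short behavioural model. The following is a sketch only: the multiplexer wiring (select driven by A XOR B, data inputs B_IN and B) is an assumption that realizes the standard borrow function and may differ from the exact connections in figures 4 and 5. All function names are hypothetical.

```python
from itertools import product


def xor2(x: int, y: int) -> int:
    """Two-input XOR gate."""
    return x ^ y


def mux2(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: passes d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0


def full_subtractor(a: int, b: int, b_in: int) -> tuple[int, int]:
    """Behavioural model of a two-XOR, one-mux full subtractor (assumed wiring)."""
    h = xor2(a, b)            # first XOR: A xor B
    diff = xor2(h, b_in)      # second XOR: difference output
    b_out = mux2(h, b_in, b)  # A == B passes B_IN; A != B passes B
    return diff, b_out


def check_all() -> bool:
    """Compare the model with integer arithmetic for all eight input patterns."""
    for a, b, b_in in product((0, 1), repeat=3):
        diff, b_out = full_subtractor(a, b, b_in)
        # A - B - B_IN equals diff minus a borrow of weight 2.
        assert a - b - b_in == diff - 2 * b_out
    return True


print("A B Bin | D Bout")
for a, b, b_in in product((0, 1), repeat=3):
    diff, b_out = full_subtractor(a, b, b_in)
    print(f"{a} {b} {b_in}   | {diff} {b_out}")
```

In this model the input borrow reaches the output through the multiplexer alone, which mirrors the short borrow propagation path claimed for the design.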
As the supply voltage scales down, circuit performance degrades due to the exponential increase in delay. If supply voltage scaling continues into the near threshold or subthreshold regime, this degradation may result in high energy consumption [1] [2]. Hence, the DTMOS scheme is employed in this design to improve energy efficiency in the near threshold region.
We employ the DTMOS scheme since it is the most area- and energy-efficient technique at ULV. Body biasing consists of connecting the transistor body terminal to a bias network in the circuit instead of to V_dd or ground. The body bias can be supplied from an external (off-chip) or an internal (on-chip) source [10].
In the DTMOS scheme, the transistor body terminal is tied to the gate input so that the threshold voltage (V_T) of the device varies dynamically with the gate voltage.
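The basic equation that models the impact of body bias on the threshold voltage is given as [8] [9]:

$$V_T = V_{T0} + \gamma\left(\sqrt{\left|2\phi_B + V_{SB}\right|} - \sqrt{\left|2\phi_B\right|}\right) \tag{1}$$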
Here γ is the body effect coefficient, ϕ_B is the flat band voltage, V_SB is the source-to-body bias voltage, and V_T0 is the threshold voltage with zero substrate (body) bias. From equation (1), it follows that the DTMOS scheme raises V_T when V_G = V_B = 0 and lowers V_T when V_G = V_B = 1. The higher V_T reduces the leakage current, which is needed when the transistor is in the OFF state, and the lower V_T improves the switching speed, which is needed when the transistor is in the ON state.
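The dynamic threshold behaviour can be illustrated numerically with the standard body-effect relation V_T = V_T0 + γ(√(2ϕ_B + V_SB) − √(2ϕ_B)). The following is a minimal sketch for an NMOS device with a grounded source, so that tying the body to the gate gives V_SB = −V_G. The parameter values (V_T0, γ, ϕ_B) are hypothetical and are not taken from the 90 nm process used in this paper.

```python
import math


def threshold_voltage(v_t0: float, gamma: float, phi_b: float, v_sb: float) -> float:
    """Standard body-effect model of the threshold voltage (volts)."""
    radicand = 2.0 * phi_b + v_sb
    if radicand < 0:
        # The model is only meaningful while the radicand stays non-negative.
        raise ValueError("forward body bias exceeds the range of this model")
    return v_t0 + gamma * (math.sqrt(radicand) - math.sqrt(2.0 * phi_b))


# Hypothetical device parameters, chosen for illustration only.
V_T0 = 0.30   # zero-bias threshold voltage (V)
GAMMA = 0.40  # body effect coefficient (V^0.5)
PHI_B = 0.40  # potential term named phi_B in the text (V)
V_DD = 0.30   # near threshold supply voltage (V)

# OFF state: gate and body at 0 V, so V_SB = 0 and V_T stays at V_T0.
vt_off = threshold_voltage(V_T0, GAMMA, PHI_B, v_sb=0.0)

# ON state: gate and body at V_DD, so the body is forward biased (V_SB = -V_DD).
vt_on = threshold_voltage(V_T0, GAMMA, PHI_B, v_sb=-V_DD)

print(f"V_T (OFF) = {vt_off:.3f} V")
print(f"V_T (ON)  = {vt_on:.3f} V")
```

With these assumed values the ON-state threshold is lower than the OFF-state threshold, which is the qualitative behaviour the DTMOS scheme relies on: low leakage when the device is off and higher drive when it is on.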
III. SIMULATION RESULTS
The proposed full subtractor circuit was simulated, and the results were compared with conventional CMOS (C-CMOS) and Dynamic Threshold CMOS (DT-CMOS) full subtractor designs. All simulations were performed using Cadence 90 nm technology with a supply voltage of 300 mV and an operating frequency of 20 kHz.
From the comparisons, it can be seen that the proposed design has lower energy consumption and EDP than the other designs. This is due to the use of the DTMOS technique in combination with CMOS and PTL logic in the proposed design. Because of its steep subthreshold swing, the DTMOS transistor has higher carrier mobility than the standard MOSFET, which results in greater delay savings.
The comparisons also show that the DT-CMOS design achieves delay savings of more than 38% and 46% over the C-CMOS and 10T designs, respectively.
Although the transistor count of the 10T design is much lower than that of the DT-CMOS and C-CMOS designs, the 10T design still exhibits a longer delay. The 10T design may perform better in the strong inversion region of operation, but it lags at ULV because of its poor driving capability caused by V_T drops, which may lead to functional failure of the circuit.
However, the proposed design does not achieve significant power savings over the C-CMOS and DT-CMOS designs, and it consumes 37% more power than our previous design (10T). This is due to the continuous charging and discharging of the body capacitances. Nevertheless, the proposed design achieves more than 26% and 39% savings in energy and EDP, respectively, over the C-CMOS, DT-CMOS, and 10T designs, owing to its comparatively higher delay savings.
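The link between delay savings and the energy and EDP figures follows from the usual definitions, energy = power × delay and EDP = energy × delay. The following is a minimal sketch of these metrics; the power and delay numbers are hypothetical round values chosen for illustration and are not the measured results of this paper.

```python
def energy(power_w: float, delay_s: float) -> float:
    """Energy per operation (joules) as average power times propagation delay."""
    return power_w * delay_s


def edp(power_w: float, delay_s: float) -> float:
    """Energy-delay product (joule-seconds)."""
    return energy(power_w, delay_s) * delay_s


def savings_percent(baseline: float, proposed: float) -> float:
    """Relative savings of the proposed value with respect to the baseline."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical values: equal power, 20% shorter delay in the proposed design.
P_BASE, T_BASE = 10e-9, 100e-9  # 10 nW, 100 ns
P_PROP, T_PROP = 10e-9, 80e-9   # 10 nW, 80 ns

energy_savings = savings_percent(energy(P_BASE, T_BASE), energy(P_PROP, T_PROP))
edp_savings = savings_percent(edp(P_BASE, T_BASE), edp(P_PROP, T_PROP))

print(f"Energy savings: {energy_savings:.1f}%")
print(f"EDP savings:    {edp_savings:.1f}%")
```

With equal power and a 20% delay reduction, energy improves by 20% and EDP by 36%, since EDP scales with the square of the delay. This is why a design with little or no power advantage can still show sizeable energy and EDP savings when its delay is shorter.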
The layout of the proposed full subtractor design is shown in figure 7. The timing waveform obtained from the post-layout functional simulation of the proposed circuit is shown in figure 8. With a layout area of 25.36 µm², the proposed design achieves more than 18% savings in delay, 26% savings in energy consumption, and 39% savings in EDP in comparison with the conventional CMOS configuration and other counterparts.
Monte Carlo (MC) simulations were performed on 500 samples to study the effect of global and local process variations on the energy consumption of the proposed design, as shown in figure 9.
The simulations result in 100% yield with a mean (µ) of 57.5623 aJ and a standard deviation (σ) of 117.053 aJ. A Normal Quantile (NQ) plot was also obtained from the simulations, and it appears reasonable with respect to the expected distribution across all the samples.
IV. CONCLUSION
In this paper, a new 10T full subtractor circuit using the Dynamic Threshold CMOS (DT-CMOS) scheme was designed for Near Threshold Computing (NTC) to achieve minimum energy consumption. The circuits were simulated using Cadence 90 nm technology with a supply voltage of 0.3 V. The results show that the proposed design outperforms the other designs (CMOS, DT-CMOS, 10T) by achieving more than 18% delay savings, 26% energy savings, and 39% EDP savings. Hence, the proposed full subtractor circuit can serve as an alternative to many Ultra Low Voltage (ULV) subtractor circuits designed for energy-efficient arithmetic applications.
