Abstract: Dual threshold voltage (V th) design is a common method for reducing leakage power in above-threshold circuits. This research shows that it is also effective in reducing the energy per cycle of sub-threshold circuits. We first study the single-V th design theoretically and by simulation, and find that the energy per cycle is independent of threshold voltage. However, in a dual-V th design, the energy per cycle depends on both threshold voltage and supply voltage. We propose a framework to reduce energy per cycle below what is possible with a single V th. Given a nominal value for V th, we determine an optimal supply voltage V dd and an optimal higher V th. Application to a 32-bit ripple carry adder shows an energy saving of 29% over the single-V th lowest energy.
I. INTRODUCTION
Sub-threshold operation is often referred to as weak inversion operation, where the sub-threshold current I sub is the main source of current. It can be summarized as follows [1]:

$$I_{sub} = I_0 \exp\left(\frac{V_{gs} - V_{th}}{n V_t}\right)\left[1 - \exp\left(\frac{-V_{ds}}{V_t}\right)\right] \tag{1}$$

where

$$I_0 = \mu C_{ox} \frac{W}{L} (n - 1) V_t^2 \tag{2}$$

µ is effective mobility, C ox is oxide capacitance, W is transistor width, L is transistor length, V t is thermal voltage, V gs is gate-source voltage, V ds is drain-source voltage, V th is threshold voltage and n is sub-threshold slope.
Sub-threshold circuits are expected to receive increasing attention in the coming years since the minimum energy CMOS operation occurs in the sub-threshold region [2]. In other words, the optimal supply voltage (V ddopt) at which minimum energy is achieved is typically below V th. As the supply voltage scales down into the sub-threshold region, circuit delay increases exponentially because of the exponential relation between V gs and I sub, which significantly increases the fraction of leakage energy. Dynamic energy, on the other hand, decreases more slowly, i.e., quadratically, as the supply voltage scales down. The minimum energy point is reached when dynamic energy equals leakage energy.
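The trade-off described above can be illustrated with a short numerical sweep. This is only a sketch under assumed conditions: dynamic energy is modeled as C eff · V dd², leakage energy as a weight times V dd² · exp(−V dd /(n V t)), and the constants `c_eff`, `leak_weight`, `n` and `v_t` are illustrative values chosen so that the minimum falls near 0.3V. None of them are taken from the adder studied in this paper.

```python
import math

def energy_per_cycle(vdd, c_eff=1.0, leak_weight=800.0, n=1.5, v_t=0.026):
    """Return (dynamic, leakage) energy per cycle in arbitrary units.

    Assumed model: dynamic energy falls quadratically with vdd, while the
    leakage term grows relative to it as vdd drops, because the cycle time
    stretches exponentially (the exp(-vdd / (n * v_t)) factor).
    """
    dynamic = c_eff * vdd ** 2
    leakage = leak_weight * vdd ** 2 * math.exp(-vdd / (n * v_t))
    return dynamic, leakage

def minimum_energy_point(v_lo=0.12, v_hi=0.6, steps=481):
    """Sweep vdd on a uniform grid and return (vdd, total energy) at the minimum."""
    best = None
    for k in range(steps):
        vdd = v_lo + (v_hi - v_lo) * k / (steps - 1)
        total = sum(energy_per_cycle(vdd))
        if best is None or total < best[1]:
            best = (vdd, total)
    return best

if __name__ == "__main__":
    vdd_opt, e_min = minimum_energy_point()
    print(f"V_ddopt ~ {vdd_opt:.3f} V, minimum energy ~ {e_min:.4f} (arbitrary units)")
```

Changing `leak_weight` moves the minimum: a circuit with a longer critical path or lower switching activity has a larger leakage weight and therefore a higher optimal supply voltage in this model.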
II. SINGLE-V th SUB-THRESHOLD V dd DESIGN
We establish that energy is independent of V th for a single-V th, sub-threshold V dd design. However, it is feasible to make the circuit faster without increasing the energy per cycle (EPC) by decreasing V th. An analytical expression for EPC can be written as,
where C eff is the average switched capacitance per clock cycle in the circuit, C g is the gate capacitance of a characteristic inverter, l is the length of the critical path in terms of characteristic inverters and T is the clock period. In addition, if we assume that V ds > 3V t, so that 1 − exp(−V ds /V t) ≈ 1, then we arrive at the following expression for energy,
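One way to see why V th drops out is sketched below. The sketch assumes that the leakage current is the weak-inversion current at V gs = 0, that the on-current is the same current at V gs = V dd, and that the delay of a characteristic inverter is C g V dd / I on with any fitting constants omitted. These modeling choices are assumptions and may differ in detail from the expressions used in the paper; the last line plays the role of the simplified energy expression.

$$
\begin{aligned}
E &= C_{eff} V_{dd}^2 + I_{leak} V_{dd} T, \qquad T = l \, \frac{C_g V_{dd}}{I_{on}} \\
I_{leak} &\approx I_0 \exp\left(\frac{-V_{th}}{n V_t}\right), \qquad
I_{on} \approx I_0 \exp\left(\frac{V_{dd} - V_{th}}{n V_t}\right) \\
E &\approx C_{eff} V_{dd}^2 + l \, C_g V_{dd}^2 \exp\left(\frac{-V_{dd}}{n V_t}\right)
\end{aligned}
$$

Both currents carry the same factor exp(−V th /(n V t)), so it cancels in the ratio I leak / I on. A lower V th therefore shortens T without changing the energy per cycle.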
From (4) we see that the V th factor cancels out of the energy expression, which means V th has no effect on EPC. We verified this by simulating a 32-bit ripple carry adder (RCA) in HSPICE. We used PTM 32nm technology [3], which offers two models: a low performance (LP) model with high V th and a high performance (HP) model with low V th. Table I lists the V th values calculated by HSPICE [4] at nominal V dd = 0.9V. The EPC of the two single-V th circuits as a function of V dd, computed by HSPICE [4] with random input vectors, is shown in Figure 1. The red curve is for low V th and the blue curve is for high V th. The EPC of the two designs remains practically the same over the sub-threshold supply voltage range V dd = 0.12V to V dd = 0.4V. As V dd scales down, EPC decreases, reaching a minimum at the same V ddopt just above 300mV. When V dd decreases further, EPC increases as leakage energy dominates. Logic operation breaks down earlier, at about V dd = 200mV, for high V th. The low V th design continues to work at lower V dd.
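The independence of EPC from V th can also be checked numerically with a first-order model. The sketch below is an illustration, not a substitute for the HSPICE runs: the current prefactor `i0`, the slope factor `N`, the capacitances, the path length and the two threshold values (0.3V and 0.5V) are all hypothetical, and the inverter delay is assumed to be C g V dd / I on.

```python
import math

V_T = 0.026  # thermal voltage near room temperature, in volts
N = 1.5      # assumed sub-threshold slope factor

def i_sub(v_gs, v_ds, v_th, i0=1e-6):
    """Weak-inversion drain current; i0 is a hypothetical prefactor in amperes."""
    return i0 * math.exp((v_gs - v_th) / (N * V_T)) * (1.0 - math.exp(-v_ds / V_T))

def epc_and_period(vdd, v_th, c_eff=1e-15, c_g=1e-15, path_len=800):
    """Return (energy per cycle, clock period) for a single-V_th design.

    Assumptions: the clock period equals path_len inverter delays, each
    c_g * vdd / i_on, and leakage flows for the whole period.
    """
    i_on = i_sub(vdd, vdd, v_th)    # conducting transistor, V_gs = V_dd
    i_leak = i_sub(0.0, vdd, v_th)  # off transistor, V_gs = 0
    period = path_len * c_g * vdd / i_on
    energy = c_eff * vdd ** 2 + i_leak * vdd * period
    return energy, period

if __name__ == "__main__":
    for v_th in (0.3, 0.5):  # hypothetical low and high thresholds
        energy, period = epc_and_period(0.25, v_th)
        print(f"V_th = {v_th} V: EPC = {energy:.3e} J, T = {period:.3e} s")
```

In this model the two energies agree to floating-point precision while the clock period grows exponentially with V th, which mirrors the observation that the low V th design is faster at the same EPC. The simulated curves agree only "practically" because real devices include effects the model ignores.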
III. DUAL-V th SUB-THRESHOLD V dd DESIGN
978-1-4799-1361-9/13/$31.00 ©2013 IEEE
In this section, we demonstrate the effectiveness of the dual-V th technique as a method to reduce EPC for sub-threshold circuits. Our dual-V th design keeps circuit speed the same as the single-V th design with low V th, while reducing leakage power by assigning high V th to appropriate gates. We use a gate-level slack-based algorithm [5], [6] to generate the dual-V th design, consisting of the following steps:
1) Assign low V th to all gates.
2) Run static timing analysis (STA) to calculate the slack [5] for each gate and the circuit delay (T) for every V dd and high V th condition.
3) Using gate slacks, assign gates to high V th such that the critical path delay does not degrade.
4) Estimate EPC, V ddopt and an optimal high V th level that gives the lowest EPC. To estimate EPC, we sum up the energy of all gates, i = 1, ..., n, as shown below,
where C eff and P leak are obtained from HSPICE [4] for basic logic gates under varying V dd, V th and fan-out conditions.
5) Simulate the dual-V th design [4] and compare EPC, V ddopt and optimal high V th.
Figure 2 shows how EPC is lowered via optimized dual-V th design. The supply voltage ranges from 0.12V to 0.6V, and we apply reverse body bias voltages from 0.1V to 0.8V to the example circuit. For any given V dd, EPC decreases as the bias voltage increases until it reaches a lower bound. Then it starts to increase slowly, finally reaching the same value as the single-V th design. The lowest minimum energy occurs when the bias voltage equals 0.3V. The minimum EPC in Figure 2 [...] [4] simulation. The average error between the estimation and simulation is 6.99%. The error may result from simplifications made in the framework. For example, when calculating the output capacitance of the driving gate in HSPICE [4], we assume that fan-out gates are always low V th gates. That is, when a gate drives high V th gates, the difference in output capacitance is considered negligible.
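Steps 1 to 4 of the procedure can be sketched on a toy netlist as follows. This is a simple greedy variant written for illustration and is not the algorithm of [5], [6]: it handles a single V dd and a single high V th level, re-runs STA after every tentative swap, and estimates EPC as the sum over gates of C eff,i V dd² + P leak,i T, which is an assumed form of the per-gate energy sum. The `Gate` fields, the delay and leakage numbers and the four-gate circuit are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gate:
    name: str
    fanin: list                 # names of driving gates; empty if fed by primary inputs
    d_low: float = 1.0          # delay with low V_th (arbitrary units)
    d_high: float = 1.5         # delay with high V_th
    c_eff: float = 1.0          # switched capacitance per cycle
    p_leak_low: float = 0.1     # leakage power with low V_th
    p_leak_high: float = 0.01   # leakage power with high V_th
    high_vth: bool = False

    @property
    def delay(self):
        return self.d_high if self.high_vth else self.d_low

    @property
    def p_leak(self):
        return self.p_leak_high if self.high_vth else self.p_leak_low

def sta(gates):
    """Static timing analysis on a topologically ordered gate list.

    Returns (circuit delay, {gate name: slack}).
    """
    arrival = {}
    for g in gates:
        arrival[g.name] = g.delay + max((arrival[f] for f in g.fanin), default=0.0)
    t = max(arrival.values())
    required = {g.name: t for g in gates}
    for g in reversed(gates):
        for f in g.fanin:
            required[f] = min(required[f], required[g.name] - g.delay)
    return t, {name: required[name] - arrival[name] for name in arrival}

def assign_dual_vth(gates):
    """Steps 1-3: start all-low, then move gates to high V_th if timing holds."""
    for g in gates:
        g.high_vth = False
    t_target, slack = sta(gates)
    for g in sorted(gates, key=lambda x: slack[x.name], reverse=True):
        g.high_vth = True
        t_new, _ = sta(gates)
        if t_new > t_target + 1e-12:  # swap would slow the circuit: undo it
            g.high_vth = False
    return t_target

def estimate_epc(gates, vdd, period):
    """Step 4 (assumed form): sum of dynamic and leakage energy over all gates."""
    return sum(g.c_eff * vdd ** 2 + g.p_leak * period for g in gates)

def build_example():
    # a -> b -> c is the critical path; d is a non-critical side input of c.
    return [Gate("a", []), Gate("b", ["a"]), Gate("d", []), Gate("c", ["b", "d"])]

if __name__ == "__main__":
    gates = build_example()
    epc_single = estimate_epc(gates, 0.3, sta(gates)[0])
    period = assign_dual_vth(gates)
    epc_dual = estimate_epc(gates, 0.3, period)
    print([g.name for g in gates if g.high_vth], epc_single, epc_dual)
```

In this toy circuit only gate d has enough slack to absorb the slower high V th device, so it is the only gate swapped, and the estimated EPC drops because its leakage term shrinks while the clock period is unchanged. Finding V ddopt and the best high V th level would require repeating the procedure over a grid of operating points with gate data characterized at each point.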
