This paper explores the functionality, performance, and energy efficiency of an 80,000-transistor, 0.35 µm, back-bias tunable, near-zero Vth, 32 x 32-bit multiplier operating at 100°C.
INTRODUCTION
An increasing number of applications, especially portable ones, are becoming limited by power rather than performance. Reducing the supply voltage [1, 2] or the signal swing [3] has been shown to be effective in reducing power. However, to maintain performance with a lower supply voltage, the transistor threshold voltage (Vth) also needs to be lowered [4], resulting in increased leakage power.
We have demonstrated [5] that circuits fabricated in near-zero threshold CMOS technology, combined with variable threshold CMOS (VTCMOS) techniques [6, 7, 8, 9] and operating at room temperature, can achieve significant energy savings compared to equivalent circuits fabricated in standard CMOS technology. It was also shown that tunable near-zero threshold CMOS technology offers performance advantages over standard CMOS. Unlike in standard CMOS, where leakage power is usually a negligible portion of the total power, we have shown that the leakage power should be a significant portion of the total power if one is to achieve minimum energy per operation.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ISLPED'01, August 6-7, 2001, Huntington Beach, California, USA.
Copyright 2001 ACM 1-58113-371-5/01/0008 ... $5.00.
Figure 1: Chip micrograph
This paper explores the effects of elevated temperature on the functionality, speed and energy efficiency of circuits fabricated in near-zero threshold CMOS. It demonstrates that even at increased ambient temperature, back gate bias can be used to optimize the ratio of leakage and switching power and therefore achieve minimum energy per operation.
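As a reading aid, the balance between leakage and switching power can be written with the standard first-order CMOS power model. This is a textbook sketch under stated assumptions, not the model used for the measurements reported here: the activity factor \alpha, switched capacitance C, clock frequency f, prefactor I_0, subthreshold swing S and the symbol E_op are all introduced for illustration, and the low-swing circuits on this chip will deviate from the simple V_dd^2 dependence.

```latex
\begin{align}
P_{\text{total}}  &= P_{\text{switch}} + P_{\text{leak}}, \\
P_{\text{switch}} &\approx \alpha\, C\, V_{dd}^{2}\, f, \\
P_{\text{leak}}   &\approx V_{dd}\, I_{\text{off}},
\qquad I_{\text{off}} \approx I_{0}\, 10^{-V_{th}/S}, \\
E_{\text{op}}     &= \frac{P_{\text{total}}}{f}.
\end{align}
```

In this picture, a stronger reverse back bias raises V_th and reduces I_off exponentially, but a higher V_dd is then needed to hold the same f, which raises the switching term. The minimum of E_op therefore lies at a bias point where leakage is a non-negligible fraction of the total.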
CIRCUITS AND TECHNOLOGY
A 32 x 32-bit signed integer multiplier [10] designed using dynamic, low-swing differential circuit techniques has been used as a test vehicle to measure the effects of elevated ambient temperature on functionality, speed and energy efficiency of circuits fabricated in near-zero threshold CMOS technology. The multiplier uses 4-bit Booth encoding and tree reduction by a 4-2 adder. It was fabricated in a 0.35 µm near-zero threshold process. It occupies a 3.1 x 3.1 mm area and contains about 80,000 transistors (Figure 1). The multiplier was tested at ambient temperatures of 28±2°C and 100±2°C, measured while the chip was powered off. When the chip was operating, we observed up to an additional 6°C in- [...]
[...] It demonstrates a wide range of threshold tunability, which was used to balance static and dynamic power over frequency, circuit activity, effective logic depth and temperature. Degradation of drive current and subthreshold slope is observed at elevated temperature.
RESULTS
Figure 3 shows a Shmoo plot of the multiplier at 28°C. At 28°C the zone of correct operation includes both filled and unfilled circles. For supply voltages of 0.8 V and above, the multiplier is fully functional regardless of the values of the back-bias voltages. At lower supply voltages, a balance between the two biases is required to maintain the minimum on-to-off current ratios that ensure proper circuit operation [5]. In addition, since the NMOS transistors have a lower built-in (Vpwell=0 V) threshold voltage than the PMOS transistors, the zone of valid operation at low supply voltages is offset with respect to the diagonal. The multiplier was found to be functional [...] CMOS multiplier at 100°C.
At 100°C the zone of correct operation is indicated by filled circles only. In general, this behavior is similar to the behavior shown at 28°C. As the supply voltage is decreased, a balance between the two biases has to be maintained to obtain correct operation. However, while at Vdd=0.2 V and 28°C we found the zone of valid operation to span about 0.8 V in both (Vdd-Vnwell) and Vpwell, at 100°C we found only a single valid combination of (Vdd-Vnwell) and Vpwell which produced correct operation. Since increased temperature lowers the threshold voltage and therefore increases Ioff, only at that particular location in the bias space are we able to achieve the minimum required values of Ion,NMOS/Ioff,PMOS and Ion,PMOS/Ioff,NMOS.
Figure 4 shows multiplier performance vs. supply voltage for both temperatures. The range of points at each value of Vdd corresponds to various combinations of (Vdd-Vnwell) and Vpwell ranging from (0,0) to (-4,-4). At 28°C the multiplier runs at 188 MHz at Vdd=2.0 V, 136 MHz at Vdd=1.2 V and 40 MHz at Vdd=0.4 V. At 100°C the multiplier runs at 162 MHz at Vdd=2.0 V, 115 MHz at Vdd=1.2 V and 37 MHz at Vdd=0.4 V. The performance degradation is about 15 percent at the higher supply voltages due to decreased carrier mobility. At lower voltages the performance degradation is less than 10 percent. The smaller speed penalty at lower voltages can be attributed to improvements in performance resulting from temperature-induced threshold lowering.
In low threshold CMOS, the key to achieving minimum power at the required performance is to choose the optimum ratio of leakage power to total power. For example, for the multiplier to run at 40 MHz at 28°C, one can choose a supply voltage ranging from 0.37 to 0.6 V. Although a lower supply voltage may seem advantageous, it requires very low thresholds, which in turn make leakage power too large. Figure 5 shows the supply and well voltages which result in minimum total power at the given frequency for both ambient temperatures. We observe that to run at any given frequency at 100°C we need to apply higher supply and well voltages than to run at 28°C at that same frequency. The optimum supply voltage is on average 23 percent higher at 100°C than at 28°C. The majority of that increase is due to the performance degradation discussed earlier. The optimum |Vdd-Vnwell| voltage at 100°C is on average 1.10 V higher than at 28°C, while the optimum |Vpwell| voltage is 1.25 V higher at 100°C than at 28°C.
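The speed penalties quoted for Figure 4 can be checked directly from the reported operating frequencies. The short sketch below only re-derives the percentages from the numbers given in the text; the dictionary and function names are illustrative and not part of the original work.

```python
# Operating frequencies in MHz quoted in the text for Figure 4,
# keyed by supply voltage Vdd in volts.
FREQ_MHZ_28C = {2.0: 188.0, 1.2: 136.0, 0.4: 40.0}
FREQ_MHZ_100C = {2.0: 162.0, 1.2: 115.0, 0.4: 37.0}


def degradation_percent(f_cool: float, f_hot: float) -> float:
    """Relative loss of clock frequency, in percent, going from cool to hot."""
    return 100.0 * (f_cool - f_hot) / f_cool


def degradation_table(cool: dict, hot: dict) -> dict:
    """Map each supply voltage to its frequency degradation in percent."""
    return {vdd: degradation_percent(cool[vdd], hot[vdd]) for vdd in cool}


table = degradation_table(FREQ_MHZ_28C, FREQ_MHZ_100C)
for vdd in sorted(table, reverse=True):
    print(f"Vdd = {vdd:.1f} V: {table[vdd]:.1f} % slower at 100 C")
# Vdd = 2.0 V: 13.8 % slower at 100 C
# Vdd = 1.2 V: 15.4 % slower at 100 C
# Vdd = 0.4 V: 7.5 % slower at 100 C
```

The two higher supply voltages give roughly 14 to 15 percent, and the 0.4 V point gives 7.5 percent, consistent with the "about 15 percent" and "less than 10 percent" statements.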
As frequency increases over about 100 MHz, we observe a gradual increase in the absolute values of the optimum well voltages, indicating a need for higher thresholds. This increase compensates for decreased thresholds resulting from drain-induced barrier lowering, which is more pronounced at higher supply voltages. At the highest achievable frequencies for each ambient temperature we observe sharp decreases in the absolute values of the well voltages, indicating a need for lower thresholds. At the highest frequencies it would be more efficient to run at higher supply and threshold voltages. However, since we limited our supply voltage to 2 V, the only way to achieve the highest frequencies is to run at maximum supply voltage and lowest thresholds.
Figure 6 gives the optimum ratio of leakage power to total power vs. frequency for both temperatures. Both curves show very similar tendencies, with the optimum ratio decreasing with frequency. This variation is caused by different rates of change of leakage and active power as a function of the available ranges of supply and threshold voltage needed to achieve a desired frequency. While operating at 100°C, the chip tolerates a leakage ratio about 3 to 5 percent higher than at 28°C. Although one could apply more back bias and decrease the leakage power, the penalty in increased switching power would be larger. Abrupt increases in the optimum ratio at the highest frequencies for both temperatures can be attributed to the limited supply voltage range, as explained in the previous paragraph.
Figure 7 shows minimum achievable energy per operation vs. frequency for both temperatures (left y-axis). It also gives the ratio of these two energies vs. frequency (right y-axis). Over a wide frequency range, the same performance at 100°C requires about 1.5 times the power at 28°C.
(Energy x Time) is often considered the metric of choice for low-power applications [11]. In Figure 8, it is plotted vs. supply voltage for the two temperatures. The optimum (Energy x Time) point for the multiplier operating at 28°C occurs at Vdd=0.36±0.01 V, Vdd-Vnwell=-0.8±0.2 V and Vpwell=-2±0.2 V, and is 1.6 times smaller than the lowest (Energy x Time) value attainable at 100°C at Vdd=0.50±0.01 V, Vdd-Vnwell=-2±0.2 V and Vpwell=-3±0.2 V.
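The energy per operation and (Energy x Time) figures of merit used in this section can be related to power and frequency with a few lines of arithmetic. The sketch below assumes one multiplication per clock cycle and takes Time to be one clock period; both are assumptions made here for illustration. The power values are hypothetical placeholders, not measurements from the chip.

```python
def energy_per_operation(power_w: float, frequency_hz: float) -> float:
    """Energy per operation in joules, assuming one operation per clock cycle."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return power_w / frequency_hz


def energy_time_product(power_w: float, frequency_hz: float) -> float:
    """(Energy x Time) in joule-seconds, with Time taken as one clock period."""
    return energy_per_operation(power_w, frequency_hz) / frequency_hz


# Hypothetical operating points (NOT measured values): same frequency,
# with the hot chip drawing 1.5 times the power of the cool chip.
frequency_hz = 40e6
power_cool_w = 1.0e-3
power_hot_w = 1.5 * power_cool_w

energy_ratio = (energy_per_operation(power_hot_w, frequency_hz)
                / energy_per_operation(power_cool_w, frequency_hz))
et_ratio = (energy_time_product(power_hot_w, frequency_hz)
            / energy_time_product(power_cool_w, frequency_hz))

print(f"energy ratio at equal frequency: {energy_ratio:.2f}")
print(f"(Energy x Time) ratio at equal frequency: {et_ratio:.2f}")
```

At equal frequency both ratios reduce to the power ratio, which is why a 1.5 times power penalty corresponds to a 1.5 times energy penalty per operation. The 1.6 times difference in the optimum (Energy x Time) values compares two different operating points, so it does not follow from this identity alone.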
CONCLUSIONS
We have demonstrated that the tuning techniques we previously applied to near-zero threshold CMOS at room temperature also work well at 100°C. We can still minimize energy at a target operating frequency by adjusting back bias. To achieve the same performance across a wide frequency range, the back bias needs to increase about 1.2 V, and the supply voltage needs to increase about 20 percent, increasing the power dissipation by about 1.5 times going from 28°C to 100°C.
