# A Four-Transistor Level Converter for Dual-Voltage Low-Power Design

Karthik Naishathrala Jayaraman\* and Vishwani Agrawal

Department of Electrical and Computer Engineering, Auburn University, 200 Broun Hall, Auburn, Alabama 36830, USA

(Received: 1 July 2014; Accepted: 8 October 2014)

Power dissipation in digital circuits has become a primary concern in electronic design. With increasing usage of portable devices, there are severe restrictions being placed on the size, weight and power of batteries. In this work, we propose a design of a dual $V_{\rm th}$ feedback type four-transistor level converter (DVF4) with reduced delay and power overheads. The use of DVF4 enhances the effectiveness of a dual-voltage low-power design. The level converter can be used in a circuit with multiple supply voltages, where low supply gates may feed into high supply gates, resulting in lower power and higher speed than with previously published level converters. The proposed level converter is based on a feedback circuit and employs a multi-$V_{\rm th}$ technique. To demonstrate the advantages, we compare the proposed level converter with a previously published level converter for various supply voltages and observe 17.44% to 53% power savings and around 50% delay reduction over the best 32 nm CMOS design available in the literature. The impact of process variations is also examined. When used with dual VDD designs, the new level converter renders up to 61% more energy savings for benchmark circuits than when level converters are not allowed. Furthermore, a level converter flip-flop combination performs better than an existing level converting flip-flop. A single-threshold alternative of the new level converter still remains effective, though over a reduced voltage range.

Keywords: Dual-VDD Circuit, DVF4, Level Converter, Level Converting Flip-Flop.

### 1. INTRODUCTION

Technology scaling makes Gordon Moore's prediction<sup>1,2</sup> possible by integrating more transistors on a chip. This increased number of transistors reduces the cost per transistor and improves the performance of devices. With high system clock frequency and increased number of transistors, the increased power consumption of devices becomes an important design constraint, primarily for portable devices. Circuits consuming more power require batteries to be charged more frequently. It has therefore become important to optimize circuits not only for delay and area, but also for power. Greater emphasis on finding new and more effective power reduction techniques can be expected in the future. Power reduction techniques at various levels of abstraction have been used in the modern digital world. Popular techniques include multiple supply voltages, multiple threshold voltages, clock gating, and power saving architectural features.

Voltage scaling reduces power consumption quadratically, hence it is superior to reducing the frequency of operation or scaling the threshold voltage. Because lowering of supply voltage (VDD) reduces the speed of the circuit, a dual or multi-VDD design would not lower the VDD on critical paths.<sup>3,4</sup> Thus, there are two or more voltage signal levels in the circuit, which might give rise to undesirable power consumption and delays when low voltage gates feed into high voltage gates. An under-driven signal will cause high rise and fall transition time, and this will increase the static DC current and reduce noise margins. A slow transition time means the signal spends a longer time close to the threshold voltage ($V_{th}$), causing the short circuit current to be high. This situation can be corrected by level conversion at the output of low voltage gates driving high voltage gates. Also, there are high voltage gates driving low voltage gates. However, in this case level conversion is not required because the higher input signal voltage will allow the gate with low supply voltage to function properly.
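The quadratic dependence can be illustrated with the textbook first-order model of switching power, $P = \alpha C V_{DD}^2 f$. This model and the operating point below are assumptions made for illustration; neither is taken from this paper. The sketch compares scaling the supply voltage with scaling the clock frequency by the same factor.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order switching power: alpha * C * VDD^2 * f, in watts."""
    return alpha * c_load * vdd ** 2 * freq

# Hypothetical operating point, chosen only for illustration.
base = dynamic_power(alpha=0.1, c_load=1e-15, vdd=1.0, freq=1e9)
scaled_v = dynamic_power(alpha=0.1, c_load=1e-15, vdd=0.7, freq=1e9)
scaled_f = dynamic_power(alpha=0.1, c_load=1e-15, vdd=1.0, freq=0.7e9)

print(f"VDD scaled to 70%:       {scaled_v / base:.2f} of baseline power")
print(f"frequency scaled to 70%: {scaled_f / base:.2f} of baseline power")
```

Under this model, a 30% reduction in VDD roughly halves the switching power, while the same reduction in frequency lowers it only linearly.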

The supply voltage assignment of gates is extremely important because the value of lower VDD and the number of gates supplied by that low VDD determine the reduction in power consumption. Two algorithms have been proposed for the assignment of VDD to gates: (1) Clustered Voltage Scaling (CVS)<sup>5–7</sup> and (2) Extended Clustered Voltage Scaling (ECVS).<sup>8</sup> In CVS, the cells driven by each power supply are grouped together such that low voltage gates do not feed into high voltage gates. In ECVS, this restriction is removed but a level converter (LC) must be inserted whenever a low voltage gate feeds into a high voltage gate. In general, ECVS can achieve more power saving than CVS, and that benefit grows as the power and delay overheads of the LC are decreased by improving its design.

<sup>\*</sup>Author to whom correspondence should be addressed. Email: kjn0005@tigermail.auburn.edu

Multiple thresholds and transistor sizing can be combined with voltage scaling to get more power savings. When using multiple supply voltages in a circuit we might need to convert the voltage level from one value to another by using level converters. There have been many level converters described in the literature. A level converter typically has either high power consumption or high delay. The aim is to design a level converter that saves on power consumption without incurring more delay. For a given higher voltage level, a good level converter should maintain this characteristic as the lower voltage level drops (preferably close to the threshold voltage), thus providing the most power saving in a dual voltage design. We can use a slack based algorithm for dual voltage assignment, allow for the level converter overhead, and still obtain energy savings.

#### 2. LEVEL CONVERTERS

Level converters are circuits that convert one level of voltage into another. The use of level converters comes into effect when there are two or more voltage signals used in the circuit. In case of dual supply voltages, the level converter is used to convert from a low voltage to a high voltage. Level converters are of two types:

# 2.1. Feedback Based Level Converters

Feedback based level converters, as the name suggests, depend on some form of feedback circuitry. When a low-swing signal directly drives a gate that is connected to a higher supply voltage, the pull-up network of the receiver cannot be fully turned off. The receiver therefore produces static DC current. In order to suppress this DC current, feedback-based level converters isolate the pull-up network from the low-swing input signal. The traditional level converters, however, suffer from high short-circuit power and high propagation delay due to the typically slow response of the feedback circuitry. Furthermore, the pull-down network in these circuits is driven by low voltage swing signals while the pull-up network is driven by full-swing signals. At very low input voltages, the widths of the transistors that are directly driven by the low-swing signals need to be significantly increased in order to balance the strength of the pull-up and the pull-down networks. This causes further degradation in the speed and the power efficiency of conventional level converters.
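The pull-up problem can be sketched with an idealized threshold model: a PMOS whose source is at the high supply is cut off only while its source-gate voltage stays below $|V_{th}|$. The function name and the hard on/off threshold are assumptions for illustration. The model ignores subthreshold conduction, which in a real device grows well before the threshold is crossed.

```python
def pullup_is_off(vdd_high, v_in_high, vtp_abs):
    """Idealized check: the receiver PMOS (source at vdd_high) is cut off
    only if its source-gate voltage, vdd_high - v_in_high, is below |Vtp|."""
    return (vdd_high - v_in_high) < vtp_abs

VDD_HIGH = 1.0   # receiver supply (V)
VTP_ABS = 0.455  # magnitude of the nominal 32 nm PMOS threshold quoted in Section 3.2 (V)

for vddl in (1.0, 0.8, 0.6, 0.4):
    state = "off" if pullup_is_off(VDD_HIGH, vddl, VTP_ABS) else "conducting"
    print(f"VDDL = {vddl:.1f} V -> receiver pull-up {state}")
```

Even in this optimistic model the pull-up conducts once the low supply falls far enough below the high supply, which is the static DC current that a level converter is meant to suppress.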

#### 2.2. Multi-Threshold Level Converters

Unlike the level converters that depend on feedback circuits, multi-threshold level converters rely on multi- $V_{\rm th}$  technique in order to eliminate static DC current. This enables the level converter to save on power and delay.

# 2.3. Standard Level Converter

A frequently used design, which we will call the *standard level converter*, is shown in Figure 1.<sup>21,22</sup> The circuit operates as follows. When the input IN is at VDD1 (low signal level), transistor TN1 is turned ON and hence node OUT1 settles to a low value. This turns on transistor TP2. The low output from the inverter turns off transistor TN2 and hence node OUT2 gets a high value of VDD2. This turns off transistor TP1. Thus, a logic 1 input at a lower voltage level VDD1 is converted into a high output of VDD2. An input of logic 0 (0 V) makes the output of the inverter VDD1, which is fed to the gate input of transistor TN2. This pulls the node OUT2 to GND, turning TP1 on and setting OUT1 to high (VDD2). This high turns OFF TP2, so that node OUT2 remains at 0 V.<sup>23</sup>

Although the use of this level converter is frequently cited, its performance deteriorates as the low signal level becomes closer to the threshold voltage and the inverter slows down, basically increasing the delay of the level converter. Besides, power consumption increases. Consider a rising transition at IN. TN1 and TP2 turn ON quickly while TN2 and TP1 are slow to turn OFF, causing increased short circuit power.

# 2.4. Multi- $V_{th}$ Level Converter

A multi-$V_{\rm th}$ level converter<sup>24</sup> is shown in Figure 2. The static DC current is eliminated by employing a multi-$V_{\rm th}$ technique. In general, delay depends on the threshold voltage: the higher the threshold, the higher the delay. This level converter has fewer transistors when compared with the standard level converter. Its power consumption can be lower than that of the standard level converter. The working of the level converter is as follows. Transistor M1 is always ON as its gate is fixed at VDDL. When there is a logic 0 or 0 volt at IN, transistor M1 acts as a pass-transistor sending a perfect '0' to OUT. The '0' at the input sets the output of the inverter high (VDDL) making sure that the transistor M2 is OFF. When there is a logic 1 or VDDL at IN, the output of the inverter is '0', which turns M2 ON pulling OUT to VDD. The reverse current from VDD to IN is reduced by the high threshold voltage channel of M2. At lower voltages the size of the transistor is not increased, but the threshold voltage of M2 is increased to get proper VDD and '0' levels at OUT.

Fig. 1. A standard VDD1 (low) to VDD2 (high) level converter.<sup>21,22</sup>

**Fig. 2.** Multi-$V_{th}$ level converter, best available in the literature.<sup>24</sup> Bold line denotes high threshold voltage.

In terms of power, this level converter is only marginally better than the standard level converter. Its delay for low signal levels is smaller, however, primarily because an input to output path is provided by the pass transistor. On the other hand, the slowing down of its inverter can cause higher conduction from VDD to IN.

# 2.5. DVF4: A Dual $V_{\rm th}$ Feedback Type 4-Transistor Level Converter

In this section we describe the newly proposed DVF4, a dual $V_{\rm th}$ feedback type 4-transistor level converter. It utilizes feedback circuitry and employs a multi-$V_{\rm th}$ technique to suppress the static DC current and to enable the circuit to be faster than the previous level converters. The DVF4 circuit is shown in Figure 3.

# 2.5.1. Design of Level Converter

The DVF4 level converter has a feedback from the output to the gate input of M4 transistor. M4 transistor is needed to restore the logic to '0' if the previous input state was VDDL. M3 transistor has high threshold voltage to reduce the static DC current. M2 is an always ON pass transistor with its gate tied to VDDL. It isolates the two supply signals. The gate voltage for M2 is kept below VDDL + $V_{\text{th}-M1}$ to make sure that the two supply signals are isolated. As in the multi-$V_{th}$ design, the pass transistor makes the falling signal transition faster at the output.

Fig. 3. DVF4: Dual $V_{\rm th}$ feedback type 4-Transistor level converter proposed in this paper. Bold line indicates high threshold voltage.

### 2.5.2. Working of Level Converter

When there is a VDDL at IN, M1 is turned ON and the node x is discharged to '0'. The '0' on x turns M3 (PMOS) transistor ON, pulling OUT to proper logic level VDD. The gate of M4 is now at VDD, turning the transistor M4 OFF and holding the gate of M3 at a stable logic '0'. For a rising transition from VDDL logic to VDD logic DVF4 requires only 3 transistors (M1, M2, M3). When there is a logic '0' at IN, it is transmitted to the output by the pass transistor M2. We use an NMOS pass transistor because it transmits a perfect '0'. M2 is always ON as its gate is fixed at VDDL. Since M1 is OFF, the voltage on node x can be unknown. There can be a random voltage value at node x if the previous logic state of IN was VDDL. Due to this random voltage M3 may not be fully OFF. To make sure that M3 is OFF, we have a keeper transistor M4 that is supplied by VDD and whose gate is connected to OUT. So, when there is a '0' at the output, M4 is turned on, and the value at node x is VDD. This VDD at node x keeps M3 in the OFF state, thereby providing a proper logic '0' at OUT. For any particular transition, either falling or rising, only 3 transistors are needed for the proper functioning of the circuit. For a 1-to-0 transition, M2, M3, M4 transistors are needed, and for a 0-to-1 transition M1, M3, M4 transistors are needed. The effect of process variations on threshold voltage of M3 transistor is discussed in Section 3.1.1.
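The steady-state behaviour described above can be summarized in a switch-level sketch. This is a logic abstraction only: it contains no voltages, timing or leakage, and it assumes M4 is a PMOS device since it turns on when OUT is '0'. The function name and the symbolic node values are introduced here for illustration.

```python
def dvf4_steady_state(in_is_high):
    """Settled node values of DVF4 for a logic-level input.

    Node values are symbolic: 'VDD' (high supply) or '0'.
    """
    if in_is_high:
        # IN = VDDL turns M1 on, which discharges node x.
        x = "0"
        # x = '0' turns the PMOS M3 on, pulling OUT up to VDD.
        out = "VDD"
        # OUT = VDD on the gate of keeper M4 turns it off.
        m4_on = False
    else:
        # IN = '0' is passed to OUT by the always-on NMOS M2; M1 is off.
        out = "0"
        # OUT = '0' turns keeper M4 on, which charges node x to VDD.
        m4_on = True
        x = "VDD"
    m3_on = (x == "0")
    return {"x": x, "OUT": out, "M3_on": m3_on, "M4_on": m4_on}

for level in (True, False):
    print("IN high" if level else "IN low ", dvf4_steady_state(level))
```

In both states exactly one of M3 and M4 is on, so node x is never left floating once the output has settled.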

# 2.5.3. A Single V<sub>th</sub> Version

Having dual $V_{\rm th}$ adds to design and process complexity and practical implementations typically restrict the number of threshold voltages to two. We also redesigned DVF4 to function with single threshold voltage. For the single $V_{\rm th}$ design the width of transistor M3 (shown with high threshold in Figure 3) is reduced to decrease the leakage. This reduction causes delay to increase, so, the sizes of M1 and M2 are increased to compensate for the delay increase. We use an optimizer program to achieve minimum power consumption and minimum delay. The design is identical to DVF4 of Figure 3 except that dual thresholds are changed to single threshold voltage with larger NMOS transistors.

# 3. POWER CONSUMPTION AND DELAY CHARACTERISTICS

We compare the speed and the power consumption of the DVF4 level converter shown in Figure 3 with the previous best multi-$V_{\rm th}$ level converter (Fig. 2) and the standard level converter (Fig. 1). Two overheads of level converters, propagation delay and power consumption, are important factors in determining the optimum supply voltage in multi-VDD designs. The range in which the voltage levels are selected depends on the technology used; in this paper voltage levels from 0.4 V to 1.0 V are considered. The simulations are carried out for voltage levels in this range. The inputs are 100 random vectors applied with a period of critical delay to determine the circuit's average activity.

The standard level converter<sup>21,22</sup> and the multi-$V_{th}$ level converter<sup>24</sup> are redesigned for simulation in 32 nm CMOS technology and are compared with the proposed DVF4 level converter. A simulation setup for the evaluation of level converters is shown in Figure 4. The sizes of driver A and load B are four times the size of a standard inverter. The propagation delay, taken as the average of rise and fall delays, is measured from the input of A to the output of B. The average power consumption and delay are measured through Hspice<sup>26</sup> simulation for the whole circuit including the driver and load inverters. Initially, the base setup power and delay are measured with driver A feeding directly into load B. Then a level converter is inserted and the measurements are repeated. By subtracting the measured values for the base setup, we get the power and delay for the level converter.
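The subtraction step can be written out as a short sketch. The measurement values below are hypothetical placeholders, not results from the paper, and the dictionary keys are names chosen here for illustration.

```python
def lc_overhead(with_lc, base):
    """Level converter overhead: measurement with the LC inserted
    minus the baseline measurement of driver A feeding load B directly."""
    return {key: with_lc[key] - base[key] for key in ("power_uW", "delay_ps")}

# Hypothetical measurements for the whole setup (driver + load, with and without LC).
base = {"power_uW": 1.20, "delay_ps": 12.0}
with_lc = {"power_uW": 1.37, "delay_ps": 15.3}

overhead = lc_overhead(with_lc, base)
print(f"LC power overhead: {overhead['power_uW']:.3f} uW")
print(f"LC delay overhead: {overhead['delay_ps']:.2f} ps")
```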

Transistors in the level converter are optimized for both power consumption and propagation delay using an optimizer program written in Perl.<sup>27</sup> Results of Hspice<sup>26</sup> simulation using the 32 nm predictive technology model (PTM)<sup>28</sup> are tabulated in Table I for various VDDL and $V_H=1.0$ V. We also simulated various other level converters described in the literature using the same simulation setup. The multi-$V_{\rm th}$ level converter is found to be the best among the previously available designs and hence it is used for comparison with DVF4. For clarity, only standard level converter and multi-$V_{\rm th}$ level converter are shown in Table I.

Fig. 4. Simulation setup for evaluating level converters.

# 3.1. Dual $V_{\rm th}$ Simulation

Table I shows that when the input voltage (VDDL) is lowered in the range {1.0–0.4 V}, the average power of DVF4 decreases proportionately and it also performs better in terms of delay against the multi-$V_{th}$ level converter<sup>24</sup> (best available level converter in the literature). We optimize the level converter for the threshold voltage of M3 and the widths of the M3 and M4 transistors using an optimizer program written in Perl. The threshold voltage and the width are increased gradually for the said voltage range; in each iteration the Perl program invokes Hspice to run the simulation, and the average power and delay values are recorded. By using a simple sorting algorithm for minimum power-delay product (PDP), the corresponding widths and threshold voltage are found for each input voltage. Through this optimization of DVF4, we get a threshold voltage range from -0.58 V to -0.72 V for the M3 PMOS transistor, the high voltage (VDD) being 1.0 V for 32 nm CMOS technology. The optimized width of M3 is determined as 0.120 $\mu$m for most VDDL and 0.106 $\mu$m for VDDL = 0.7 V.
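The sweep-and-sort procedure can be sketched as an exhaustive grid search. The paper's optimizer is a Perl program that invokes Hspice; the Python sketch below swaps in a made-up analytic function, `toy_simulate`, in place of the circuit simulation, so its numbers carry no physical meaning. All names and candidate values are assumptions for illustration.

```python
import itertools

def toy_simulate(vth_abs, width_um):
    """Stand-in for a circuit simulation; returns (power_uW, delay_ps).

    The formulas are invented: power falls and delay rises as |Vth| grows,
    and width affects the two in opposite directions.
    """
    power = 0.05 + 0.8 * width_um / vth_abs
    delay = 2.0 + 1.5 * vth_abs / width_um
    return power, delay

def sweep_min_pdp(vth_values, width_values, simulate=toy_simulate):
    """Evaluate every (|Vth|, width) pair and return the one with lowest PDP."""
    results = []
    for vth_abs, width_um in itertools.product(vth_values, width_values):
        power, delay = simulate(vth_abs, width_um)
        # uW * ps gives the PDP in units of 1e-18 W*s.
        results.append((power * delay, vth_abs, width_um))
    return min(results)  # tuples compare by PDP first

vths = [0.58, 0.60, 0.62, 0.71, 0.715]     # candidate |Vth| of M3 (V)
widths = [0.100, 0.106, 0.120, 0.144]      # candidate widths of M3 (um)
pdp, best_vth, best_width = sweep_min_pdp(vths, widths)
print(f"min PDP {pdp:.3f} at |Vth| = {best_vth} V, width = {best_width} um")
```

In practice the `simulate` argument would wrap a call to the circuit simulator and parse its measured power and delay; that wrapper is omitted here.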

The short-circuit current or the reverse current from VDD to input through M2 is reduced by the combination of the high threshold voltage and the reduced width of M3. As a result, we have better power savings as compared to the alternative designs. Before starting the design we set the power consumption of a standard inverter as a target. For the 32 nm CMOS technology we used, the widths of n and p devices were 0.08 $\mu$m and 0.144 $\mu$m, respectively. The power consumption and delay of the simulation setup for the level converters described above, compared to those of one standard inverter, are shown in Figures 5 and 6, respectively. One standard inverter is simulated for 100 random vectors, and the power and delay values are recorded for the input voltage range stated before. From Figure 5, we observe that the power consumption of the DVF4 setup is significantly better than that of the existing multi-$V_{th}$ level converter and nearly approaches the power consumption of one standard inverter. From Figure 6, the delay of the DVF4 setup is better than that of the multi-$V_{\rm th}$ level converter until the input voltage reaches 0.4 V, where there is no delay saving. A possible reason is that the delay increases as the threshold voltage is increased to suppress the reverse current from the high voltage supply to the low input signal. Neither level converter reaches the delay target of one standard inverter, but the delay of DVF4 is closer to the target than that of the multi-$V_{\rm th}$ design. Further reduction of delay could be a problem for future research.
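The savings columns of Table I are consistent with a plain relative-reduction formula. The paper does not state the formula explicitly, so the function below is an assumption, checked here against the tabulated entries for VDDL = 1.0 V.

```python
def saving_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference

# Table I entries for VDDL = 1.0 V: (power in uW, delay in ps).
multi_vth = (0.225, 6.62)
dvf4 = (0.170, 3.30)

power_saving = saving_percent(multi_vth[0], dvf4[0])
delay_saving = saving_percent(multi_vth[1], dvf4[1])
print(f"power saving {power_saving:.2f} %, delay saving {delay_saving:.2f} %")
```

The recomputed values agree with the tabulated 24.3% and 50.1% to within a few tenths of a percent, a difference that is plausibly due to rounding of the tabulated power and delay.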

| Input voltage $V_L$ = VDDL (V) | Standard LC<sup>21,22</sup> (Fig. 1) power (µW) | Standard LC<sup>21,22</sup> (Fig. 1) delay (ps) | Multi-$V_{\rm th}$ LC<sup>24</sup> (Fig. 2) power (µW) | Multi-$V_{\rm th}$ LC<sup>24</sup> (Fig. 2) delay (ps) | DVF4 (this work) (Fig. 3) power (µW) | DVF4 (this work) (Fig. 3) delay (ps) | DVF4 power saving over multi-$V_{\rm th}$ (%) | DVF4 delay saving over multi-$V_{\rm th}$ (%) |
|-----|-------|-------|-------|-------|-------|--------|-------|-------|
| 1.0 | 0.457 | 17.46 | 0.225 | 6.62  | 0.170 | 3.30   | 24.3  | 50.1  |
| 0.9 | 0.430 | 19.15 | 0.162 | 6.73  | 0.133 | 5.225  | 17.44 | 22.4  |
| 0.8 | 0.409 | 24.2  | 0.171 | 10.51 | 0.135 | 6.655  | 20.94 | 36.72 |
| 0.7 | 0.452 | 34.6  | 0.332 | 13.91 | 0.187 | 10.75  | 43.52 | 22.77 |
| 0.6 | 0.508 | 55.35 | 0.403 | 16.39 | 0.229 | 13.11  | 43.08 | 20.03 |
| 0.5 | 2.247 | 126   | 1.89  | 17.95 | 0.78  | 16.33  | 53.4  | 9.03  |
| 0.4 | 4.95  | 695.4 | 4.73  | 22.85 | 2.31  | 22.85  | 51.3  | 0     |

Table I. Average power and delay of DVF4 versus multi-$V_{th}$ level converter and standard level converter (LC), VDD = $V_H$ = 1.0 V.

### 3.1.1. Effect of Process Variation in Threshold Voltage

The impact of process variation on level converters is a critical issue. A level converter may fail to convert from one voltage level to the other due to the process variation in threshold voltage of the device. Besides, the effect of threshold voltage variation on short circuit current is important as it might affect the delay and the power consumption. It has been predicted that the effect of process variation on delay and performance may be the single biggest impediment to device scaling.<sup>29</sup> A recent work<sup>30</sup> examines threshold variation with respect to technology scaling. It is shown that the threshold voltage variation can be modeled as a normal distribution.<sup>31</sup> From available data,<sup>30</sup> we can infer that the threshold voltage variation for 32 nm technology can be around 15%.

Fig. 5. Average power ($\mu$W) as a function of VDDL. VDD = 1 V.

Results of our study are given in Table II. The threshold voltage of M3 transistor is varied by $\pm 10\%$, $\pm 20\%$ and $\pm 30\%$ from the nominal value for several input voltages (VDDL) in the range 0.4 V to 1.0 V, keeping the higher voltage level at 1.0 V. The circuit was first simulated under nominal threshold voltage ($V_{\rm NT}$) conditions for each input voltage and the corresponding power delay product (PDP) was determined. Then, the threshold voltage of M3 was varied as stated and the corresponding PDP values were obtained. Percentage changes in PDP are listed in Table II.
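The nominal PDP entries of Table II can be cross-checked against the DVF4 power and delay of Table I, since a power in µW multiplied by a delay in ps gives a PDP in units of $10^{-18}$ W-s. The sketch below performs that check for VDDL = 1.0 V and shows how a percentage change would be computed. The varied PDP value is hypothetical, and the sign convention of the changes listed in Table II is not stated in the paper.

```python
def pdp_e18(power_uW, delay_ps):
    """Power-delay product in units of 1e-18 W*s (1 uW * 1 ps = 1e-18 W*s)."""
    return power_uW * delay_ps

def percent_change(nominal, varied):
    """Signed change of `varied` relative to `nominal`, in percent."""
    return 100.0 * (varied - nominal) / nominal

# DVF4 at VDDL = 1.0 V, power and delay taken from Table I.
nominal = pdp_e18(0.170, 3.30)
print(f"nominal PDP = {nominal:.3f} x 1e-18 W*s")

# Hypothetical PDP after a threshold shift, for illustration only.
varied = 0.600
print(f"change = {percent_change(nominal, varied):.2f} %")
```

The computed nominal PDP matches the 0.561 listed for VDDL = 1.0 V in Table II.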

From Table II, the impact of threshold voltage variation on the PDP of the proposed level converter (DVF4) ranges from as low as 2.1% of the nominal PDP to a maximum of 68.9%, with one exception. That exception occurs when the threshold variation is +30% and the input voltage is 0.4 V, where the PDP variation is 161.26% or 1.6 times the nominal PDP. If we assume the maximum threshold voltage variation to be around 15%,<sup>30</sup> then it would be reasonable to conclude that the proposed level converter will retain its usefulness. Besides, the effect of process variation on power and delay can be easily estimated as shown here and accounted for in the design of dual-voltage circuits.

Fig. 6. Delay (ps) as a function of VDDL. VDD = 1 V.

| Input voltage $V_L$ = VDDL (V) | Nominal threshold voltage $V_{\rm NT}$ (V) | Nominal PDP ($\times 10^{-18}$ W-s) | PDP change (%) at $1.1\,V_{\rm NT}$ | PDP change (%) at $1.2\,V_{\rm NT}$ | PDP change (%) at $1.3\,V_{\rm NT}$ | PDP change (%) at $0.9\,V_{\rm NT}$ | PDP change (%) at $0.8\,V_{\rm NT}$ | PDP change (%) at $0.7\,V_{\rm NT}$ |
|-----|-------|--------|-------|-------|--------|-------|------|-------|
| 1.0 | 0.715 | 0.561  | 6.54  | 14.2  | 31.25  | 3.12  | 2.9  | 5.8   |
| 0.9 | 0.710 | 0.694  | 9.8   | 25.67 | 54.12  | 9.6   | 11.4 | 11.68 |
| 0.8 | 0.710 | 0.898  | 14.35 | 35.6  | 66.89  | 6.5   | 15.8 | 17.1  |
| 0.7 | 0.60  | 2.010  | 11.32 | 29.86 | 68.9   | 4.23  | 7.2  | 9.1   |
| 0.6 | 0.62  | 3.0021 | 7.5   | 28.68 | 53.24  | 7.45  | 5.1  | 6.4   |
| 0.5 | 0.60  | 12.737 | 5.7   | 31.25 | 45.26  | 12.3  | 2.1  | 8.6   |
| 0.4 | 0.58  | 52.78  | 6.05  | 51.1  | 161.26 | 24.3  | 32.8 | 31.26 |

Table II. Changes in power-delay product (PDP) of DVF4 (Fig. 3) due to process variation in threshold voltage of M3. VDD = 1.0 V.

# 3.2. Single- $V_{th}$ Version of DVF4

We redesigned the level converter for proper functionality using a single threshold voltage. Dual $V_{\rm th}$ is a technique for reducing the leakage of a circuit. However, leakage can also be reduced by transistor sizing. When the size of a PMOS transistor is reduced, the resistance of the device increases, thereby reducing the leakage current, but the delay of the device also increases. We employ this concept in designing the single-$V_{\rm th}$ level converter. The simulation results are shown in Table III.

Transistors of DVF4 were re-sized. Transistor sizing affects the threshold voltage, but we resize the transistors to keep the variation at a minimum. This way, we can keep the number of threshold voltages to a single value for NMOS and PMOS transistors. The threshold voltages of NMOS and PMOS devices are kept at their nominal values for 32 nm technology, 0.508 V and -0.455 V, respectively. The rise and fall delay times are made almost equal, thereby reducing the short-circuit current. Figure 7 shows the power delay product (PDP) as a function of transistor width as VDDL is varied from 1.0 V down to 0.4 V, keeping the higher voltage as 1.0 V. In this single threshold voltage alternative, the threshold voltage range is {-0.418 V, -0.450 V} for PMOS and {0.430 V, 0.490 V} for NMOS in the VDDL range {1.0 V, 0.4 V}. As shown in Table III the single-$V_{th}$ alternative of DVF4 still performs better than the multi-$V_{th}$ level converter.<sup>24</sup> But, as the supply voltage is scaled down to 0.4 V the delay of the level converter becomes higher because the size of PMOS transistor M3 is scaled down for reduced leakage causing the delay to proportionately increase. To compensate for the increased delay, we increase the size of M1, allowing more current to pass through, so the fall delay is reduced. We also increase the size of M2; the increased current helps improve the performance.

**Table III.** Power and delay of single-$V_{\rm th}$ alternative of DVF4 and comparison with multi-$V_{\rm th}$ level converter,<sup>24</sup> VDD = $V_H=1.0$ V.

| Input                           | Single-V <sub>th</sub> DVF4 (Sec |            | Reduction over multi- $V_{\rm th}$ LC <sup>24</sup> (Table I, columns 4 and 5) |           |  |  |
|---------------------------------|----------------------------------|------------|--------------------------------------------------------------------------------|-----------|--|--|
| voltage $V_L = \text{VDDL (V)}$ | Power (µW)                       | Delay (ps) | Power (%)                                                                      | Delay (%) |  |  |
| 1.0                             | 0.191                            | 4.51       | 15.11                                                                          | 33.17     |  |  |
| 0.9                             | 0.149                            | 5.51       | 8                                                                              | 17.7      |  |  |
| 0.8                             | 0.140                            | 9.2        | 18.2                                                                           | 12.7      |  |  |
| 0.7                             | 0.235                            | 11.95      | 29                                                                             | 14.2      |  |  |
| 0.6                             | 0.279                            | 15.2       | 35.2                                                                           | 7.6       |  |  |
| 0.5                             | 0.815                            | 17.59      | 56.1                                                                           | 2         |  |  |
| 0.4                             | 3.73                             | 22.85      | 21                                                                             | 0         |  |  |

When the input voltage (VDDL) is in the subthreshold region ($\sim$0.4 V) for 32 nm technology,<sup>28</sup> the leakage is very high. To reduce the leakage, we decrease the size of M3, which induces a high delay penalty. The single threshold alternative performs effectively in a slightly reduced voltage range {1.0, 0.5 V}.

# 4. SLACK-BASED DUAL-VOLTAGE DESIGN USING LEVEL CONVERTERS

We examine the energy-saving potential of various level converters when used in dual-voltage circuits. We will use 32 nm CMOS PTM<sup>28</sup> for experiments. In each evaluation an original circuit has a supply voltage VDD = 1 V. Its critical path delay and clock period are determined. Then we design a low-power version by determining a lower voltage VDDL and assigning a subset of gates to this voltage such that the critical path delay remains unchanged and power is minimized.

The assignment of gates to VDDL is done by a method called extended clustered voltage scaling (ECVS).<sup>8</sup> ECVS does not impose the condition that only high voltage gates may feed other high voltage gates and allows any low voltage gate to feed high voltage gates through an asynchronous level converter (ALC) to shift the logic level from low to high. As a result more gates are assigned to lower voltage and hence higher energy saving can be expected. However, the level converters inserted at interfaces between outputs of low-voltage gates and inputs of high-voltage gates contribute to delay and energy consumption that needs to be taken into account while calculating the final energy saving.
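The interface rule can be sketched as a scan over the netlist for edges on which a low-voltage gate drives a high-voltage gate. The netlist, gate names and voltage labels below are hypothetical, and the sketch only locates the interfaces; it does not perform the voltage assignment itself.

```python
def level_converter_edges(fanout, voltage):
    """Return (driver, receiver) pairs where a low-voltage gate ('L')
    feeds a high-voltage gate ('H'); each pair needs a level converter."""
    return [(gate, sink)
            for gate, sinks in fanout.items()
            for sink in sinks
            if voltage[gate] == "L" and voltage[sink] == "H"]

# Hypothetical four-gate netlist: gate -> list of gates it drives.
fanout = {"g1": ["g3"], "g2": ["g3", "g4"], "g3": ["g4"], "g4": []}
voltage = {"g1": "L", "g2": "H", "g3": "L", "g4": "H"}

edges = level_converter_edges(fanout, voltage)
print("level converters needed at:", edges)
# CVS forbids such interfaces between gates, so an assignment is CVS-legal only if none exist.
print("legal under CVS:", not edges)
```

For this assignment only the edge from `g3` to `g4` needs a converter, so the assignment is acceptable under ECVS but not under CVS.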

We use slack based algorithms<sup>32–36</sup> for determining VDDL and its assignment to gates. The *slack* of a gate is determined by subtracting the delay of the longest path through that gate from the critical path delay of the circuit.
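The slack definition can be illustrated on a small directed acyclic netlist with fixed gate delays. This is a sketch of the definition only, not of the cited algorithms;<sup>32–36</sup> the netlist, the delay values and the function names are hypothetical.

```python
from collections import deque

def topological_order(fanout):
    """Kahn's algorithm: order gates so every driver precedes its receivers."""
    indegree = {g: 0 for g in fanout}
    for sinks in fanout.values():
        for s in sinks:
            indegree[s] += 1
    queue = deque(g for g, d in indegree.items() if d == 0)
    order = []
    while queue:
        g = queue.popleft()
        order.append(g)
        for s in fanout[g]:
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)
    return order

def gate_slacks(fanout, delay):
    """Slack of a gate = critical path delay - longest path delay through it."""
    order = topological_order(fanout)
    fanin = {g: [] for g in fanout}
    for g, sinks in fanout.items():
        for s in sinks:
            fanin[s].append(g)
    arrive = {}     # longest delay from any input up to and including the gate
    for g in order:
        arrive[g] = delay[g] + max((arrive[p] for p in fanin[g]), default=0)
    to_output = {}  # longest delay from the gate (inclusive) to any output
    for g in reversed(order):
        to_output[g] = delay[g] + max((to_output[s] for s in fanout[g]), default=0)
    critical = max(arrive.values())
    # The gate's own delay is counted in both terms, so subtract it once.
    return {g: critical - (arrive[g] + to_output[g] - delay[g]) for g in order}

# Hypothetical netlist and gate delays (arbitrary time units).
fanout = {"a": ["c"], "b": ["c", "d"], "c": ["e"], "d": ["e"], "e": []}
delay = {"a": 2, "b": 1, "c": 3, "d": 1, "e": 2}
slacks = gate_slacks(fanout, delay)
print(slacks)
```

Gates with zero slack lie on a critical path, while gates with positive slack, such as `d` here, are candidates for the lower supply voltage.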



Fig. 7. Change in power-delay product (PDP) of single- $V_{th}$  alternative of DVF4 as a function of width of M3. Graphs correspond to VDDL = 1.0 V, 0.9 V, 0.8 V, 0.7 V, 0.6 V, 0.5 V and 0.4 V.

| Circuit | $V_L$ = VDDL (V), Algorithm 2<sup>34</sup> | Total number of gates | Gates in low voltage, Algorithm 4<sup>34</sup> | Number of DVF4 LCs used, Algorithm 4<sup>34</sup> | Energy of single-VDD design, VDD = 1.0 V (fJ) | Energy of dual-VDD design with DVF4 (fJ) | Energy saving over single-VDD design, with DVF4 (%) | Energy saving over single-VDD design, without LC (%) |
|---------|------|------|------|-----|-------|-------|-------|-------|
| C432    | 0.80 | 154  | 73   | 44  | 161.3 | 145.4 | 9.85  | 3.66  |
| C499    | 0.91 | 493  | 247  | 101 | 463   | 396.9 | 20.1  | 7.8   |
| C880    | 0.58 | 360  | 203  | 78  | 277.6 | 106.1 | 61.77 | 58.29 |
| C1355   | 0.92 | 469  | 101  | 119 | 455.2 | 421.2 | 7.4   | 4.86  |
| C1908   | 0.77 | 584  | 380  | 138 | 496.5 | 352.1 | 29.08 | 23.81 |
| C3540   | 0.61 | 1270 | 881  | 232 | 1843  | 1437  | 22.02 | 12.23 |
| C6288   | 0.73 | 2407 | 1183 | 98  | 1932  | 1855  | 3.98  | 3.26  |

**Table IV.** Optimal lower supply voltage ( $V_L = \text{VDDL}$ ) and energy saving for ISCAS'85 benchmark circuits through dual-voltage design by Algorithms 2 and  $4^{34}$  using DVF4.  $V_H = \text{VDD} = 1.0 \text{ V}$ .

Lowering the supply voltage of a gate increases its delay and hence reduces its slack. This also reduces the slack of other gates whose longest paths include this low-voltage gate. Recent work<sup>34</sup> gives four slack based algorithms to facilitate dual-voltage design of a given circuit with specified netlist, nominal supply voltage and technology data:

- Algorithm 1: Determines the critical path delay and slacks for all gates. This algorithm is used within the other three algorithms.
- Algorithm 2: Finds an optimal lower supply voltage for dual voltage low power design.
- Algorithm 3: Assigns two given voltages to gates using a topological constraint that a low voltage gate must not feed a high voltage gate, leading to a dual voltage low power design without level converters.
- Algorithm 4: For given power and delay characteristics of a level converter, assigns two given voltages to gates, leading to a dual voltage low power design with level converters.
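The difference between the last two algorithms lies in how low-to-high voltage boundaries are treated. The Python sketch below only illustrates that boundary condition; it is not the published Algorithm 3 or 4, and the function name `level_converter_sites` and the data layout are hypothetical. An empty result means the assignment satisfies the topological constraint of a design without level converters; otherwise each listed edge is a place where a level converter would be needed.

```python
def level_converter_sites(voltage, fanout, vdd_high):
    """List (driver, load) edges where a low-voltage gate feeds a high-voltage gate.

    voltage  : dict mapping gate name -> assigned supply voltage
    fanout   : dict mapping gate name -> list of gates it drives
    vdd_high : the higher of the two supply voltages
    """
    sites = []
    for driver, loads in fanout.items():
        if voltage[driver] < vdd_high:
            for load in loads:
                if voltage[load] == vdd_high:
                    sites.append((driver, load))
    return sites


# Hypothetical three-gate example: g1 (low) drives g2 (high) and g3 (low).
assignment = {"g1": 0.7, "g2": 1.0, "g3": 0.7}
connections = {"g1": ["g2", "g3"], "g2": [], "g3": []}
print(level_converter_sites(assignment, connections, vdd_high=1.0))
```

Here only the edge from `g1` to `g2` crosses from the low to the high supply, so a single level converter site is reported.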

The present motivation is to show energy savings when level converters are used. To this end, the delay and power overhead of the proposed DVF4 level converter is applied at each junction between low voltage and high voltage gates. Assignment of dual voltages to gates of ISCAS'85 benchmark circuits is done by using Algorithm 4. These circuits are synthesized using a small set of 90 nm predictive technology<sup>28</sup> standard cells consisting of an inverter (INV), a two-input NAND gate (NAND2), a three-input NAND gate (NAND3), and a two-input NOR gate (NOR2). Each circuit is simulated using a logic simulator with 100 randomly generated input vectors to determine the circuit's signal activity. Gate capacitances, obtained from the circuit database, allow estimation of energy consumption.
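The activity-based energy estimate can be sketched as follows. This is a simplified illustration, not the simulator used in the experiments: it assumes zero-delay logic evaluation (no glitches), a netlist listed in topological order, and an energy of $\frac{1}{2} C V^2$ per output transition. All function names, the netlist format and the capacitance values are hypothetical.

```python
import random

# Logic functions of the four cell types named in the text.
GATE_FUNCS = {
    "INV":   lambda a: 1 - a[0],
    "NAND2": lambda a: 1 - (a[0] & a[1]),
    "NAND3": lambda a: 1 - (a[0] & a[1] & a[2]),
    "NOR2":  lambda a: 1 - (a[0] | a[1]),
}


def simulate(netlist, inputs, vector):
    """Evaluate a netlist given in topological order; return all signal values."""
    values = dict(zip(inputs, vector))
    for out, kind, ins in netlist:
        values[out] = GATE_FUNCS[kind]([values[i] for i in ins])
    return values


def toggle_counts(netlist, inputs, n_vectors=100, seed=0):
    """Count output transitions of every gate over random input vectors."""
    rng = random.Random(seed)
    counts = {out: 0 for out, _, _ in netlist}
    prev = None
    for _ in range(n_vectors):
        vec = [rng.randint(0, 1) for _ in inputs]
        cur = simulate(netlist, inputs, vec)
        if prev is not None:
            for out in counts:
                counts[out] += cur[out] != prev[out]
        prev = cur
    return counts


def dynamic_energy(counts, cap, vdd):
    """Sum 0.5 * C * V^2 over all counted transitions (assumed energy model)."""
    return sum(0.5 * cap[g] * vdd[g] ** 2 * n for g, n in counts.items())


# Hypothetical two-gate netlist: n1 = NAND2(a, b), n2 = INV(n1).
netlist = [("n1", "NAND2", ["a", "b"]), ("n2", "INV", ["n1"])]
counts = toggle_counts(netlist, ["a", "b"])
energy = dynamic_energy(counts, cap={"n1": 1.0, "n2": 1.0}, vdd={"n1": 1.0, "n2": 0.8})
print(counts, energy)
```

Because the supply voltage is supplied per gate, the same routine can evaluate a single-VDD design and a dual-VDD assignment of the same netlist.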

Table IV shows the energy savings for various benchmark circuits when DVF4 level converters are used and when no level converter is used in dual-VDD designs. For every circuit, the design with DVF4 saves more energy than the dual- $V_{\rm DD}$  design without level converters; the savings range from 3.98% to 61.77% with DVF4, compared with 3.26% to 58.29% without level converters.
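As a worked check of how the saving percentages follow from the tabulated energies, take C880. The symbols $E_{\rm single}$ and $E_{\rm dual}$ are notation introduced here for the single-VDD energy and the dual-VDD energy with DVF4, respectively:

$$
\text{Saving} = \frac{E_{\rm single} - E_{\rm dual}}{E_{\rm single}} \times 100\% = \frac{277.6 - 106.1}{277.6} \times 100\% \approx 61.8\%,
$$

which agrees with the 61.77% listed for C880 in Table IV to within rounding.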

### 4.1. Inverter Tree Experiment

To measure the effectiveness of DVF4 in saving power in comparison to the best available level converter, i.e., the multi- $V_{\rm th}$  level converter,<sup>24</sup> we use the experimental structure shown in Figure 8. The circuit has four chains of inverters, supplied by the nominal supply voltage, VDD. A long chain of 20 inverters determines the critical path delay. Four other shorter chains have 10 inverters each. All chains feed into a NAND gate whose output signal level is required to be VDD. The input vectors are 100 random vectors with a period equal to the critical delay. Again, we use 32 nm CMOS PTM<sup>28</sup> for the simulations.

Fig. 8. An experimental circuit to measure average power for comparison of level converters.

Fig. 9. An experimental circuit to measure average power when using DVF4.

Fig. 10. An experimental circuit to measure average power when using a multi- $V_{th}$  level converter.

Table V. Inverter chain example comparison of level converters (LC): multi-V<sub>th</sub> previous best and DVF4 proposed in this work, VDD = 1.0 V.

| Single voltage, VDD = 1.0 V: power | Single voltage: delay (ps) | With DVF4 (proposed LC): VDDL, $V_L$ (V) | With DVF4: power | With DVF4: delay (ps) | With DVF4: power reduction (%) | With multi- $V_{\rm th}$ (previous best LC): VDDL, $V_L$ (V) | With multi- $V_{\rm th}$: power | With multi- $V_{\rm th}$: delay (ps) | With multi- $V_{\rm th}$: power reduction (%) |
|-------|-------|-----|-------|-------|------|-----|------|-------|-------|
| 92.64 | 132.1 | 0.7 | 48.62 | 132.1 | 47.5 | 0.8 | 71.9 | 132.1 | 22.38 |

In dual voltage design, the shorter chain is supplied by both VDD and the optimum VDDL as determined using Algorithm 2,<sup>34,36</sup> which takes into account the power consumption and the delay of the level converter in determining the optimum input voltage. Algorithm 4<sup>34,36</sup> assigns the low and high voltages to gates, inserting level converters at the outputs of the low voltage gates feeding into the NAND gate. Thus, all inverters on the long chain are assigned to VDD. Each short chain has one level converter just before the NAND and several VDDL inverters such that the total delay just equals that of the long chain. Dual voltage designs with DVF4 and multi- $V_{\rm th}$  LC<sup>24</sup> are shown in Figures 9 and 10, respectively. Note that when several cascaded inverters have the same voltage, only the voltage of the first inverter is shown in the figure.
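The delay balancing of a short chain described above can be sketched in Python under strongly simplifying assumptions: every inverter at a given supply has the same fixed delay, the low-voltage inverters sit together just before a single level converter, and loading effects are ignored. The function name and all delay values are hypothetical and are not taken from the simulations reported here.

```python
def max_low_voltage_inverters(n_short, budget, d_high, d_low, d_lc):
    """Largest number k of low-voltage inverters in a short chain such that
    (n_short - k) high-voltage inverters, k low-voltage inverters and one
    level converter together fit within the critical-delay budget.
    Returns 0 if even one low-voltage inverter plus a converter does not fit.
    """
    best = 0
    for k in range(1, n_short + 1):
        total = (n_short - k) * d_high + k * d_low + d_lc
        if total <= budget:
            best = k
    return best


# Hypothetical delays: the long chain of 20 inverters sets the budget.
d_high, d_low, d_lc = 5, 12, 20
budget = 20 * d_high
print(max_low_voltage_inverters(10, budget, d_high, d_low, d_lc))  # prints 4
```

A level converter with a smaller delay `d_lc` leaves more of the budget for slow low-voltage inverters, or permits a lower VDDL for the same count, which is the mechanism by which a faster converter translates into power savings.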

The average power and the delay are recorded and compared with those of the single VDD circuit in Table V. The optimum VDDL for DVF4 design is 0.7 V and we have 47.5% power savings against single VDD design with the critical delay being the constraint (obtained from the long chain). Similarly, VDDL for multi- $V_{\rm th}$  design is 0.8 V for the same delay. This increase in VDDL was responsible for the lower power savings of 22.38%.

Fig. 11. Level converting flip-flop (LCFF)<sup>38</sup> used in comparative study.

Fig. 12. Latch<sup>22</sup> used in conjunction with DVF4 for replacing the LCFF of Figure 11.

Fig. 13. Experimental circuit for comparing level converting flip-flop with the combination, VDD = 1.0 V.

Fig. 14. Experimental circuit for evaluating level converting latch (LCFF) using dual VDD, VDD = 1.0 V.

### 4.2. DVF4 and Flip-Flop Combination Experiment

Level converters are sometimes combined with flip-flops, which then become key elements at the voltage boundary. Several level converting flip-flop (LCFF) structures have been investigated in the literature.<sup>37–39</sup> We perform an experiment to see how DVF4 in combination with a flip-flop performs against the existing level converting flip-flops. We simulated several LCFFs<sup>37,38</sup> described in the literature with 100 random vectors and selected the best LCFF based on the minimum power-delay product (PDP). This LCFF is shown in Figure 11. We selected a latch described for the Intel Itanium 2 processor<sup>22</sup> for combining with the DVF4 level converter to make an LCFF replacement. This latch is shown in Figure 12. We use a latch instead of a flip-flop because the level converting flip-flop design to which we compare is a pulsed latch level converter.<sup>38</sup>

The setup for this experiment is shown in Figure 13. It has one long chain and several short chains, each feeding a separate output latch. Suppose we have a circuit with 10 latches, 5 primary inputs (PIs) and 5 primary outputs (POs). Each PI feeds into a separate latch and each PO is driven by a separate latch. There are five inverter chains between inputs and outputs. One chain has 20 inverters and the others have 10 each. This reference circuit is simulated with single VDD; the critical delay and the average power are measured for use as the reference in comparing latches. The input vectors are 100 random vectors with a period equal to the critical delay.

All latches have the same clock period, determined by the longest chain in the entire circuit. The optimum input voltages for the two cases are determined by Algorithm 2<sup>34,36</sup> as described in the previous section. The simulation is carried out in 32 nm CMOS PTM<sup>28</sup> using dual supplies, VDD and VDDL. The simulation structure is shown in Figure 14. In each short chain, the numbers of gates assigned to VDD and VDDL are 4 and 6, respectively. The numbers of VDDL and VDD gates are determined by Algorithm 4,<sup>34,36</sup> and the critical delay is obtained from the reference circuit (when single VDD was used).



Fig. 15. Experimental circuit for evaluating DVF4-latch combination using dual VDD, VDD = 1.0 V.

Table VI. Inverter chain example comparison of level converting flip-flops: previous best LCFF<sup>38</sup> and DVF4 combined with a latch,<sup>22</sup> VDD = 1.0 V.

| Single voltage, VDD = 1.0 V: power (µW) | Single voltage: delay (ps) | With DVF4-latch combination: VDDL, $V_L$ (V) | With DVF4-latch combination: power (µW) | With DVF4-latch combination: delay (ps) | With DVF4-latch combination: power reduction (%) | With best LCFF: VDDL, $V_L$ (V) | With best LCFF: power (µW) | With best LCFF: delay (ps) | With best LCFF: power reduction (%) |
|-------|-------|------|-------|-------|------|-----|-------|-------|-------|
| 212.4 | 146.6 | 0.72 | 101.4 | 146.6 | 52.2 | 0.8 | 158.3 | 146.6 | 25.47 |

The simulation is repeated for the DVF4-latch combination as shown in Figure 15. The average power and the delay are tabulated in Table VI, which provides a comparison against the single VDD reference and the level converting flip-flop setups. The optimum low voltage (VDDL) is determined by Algorithm 2<sup>34,36</sup> as discussed before. In each short chain, the number of VDDL gates is 8 and that of VDD gates is 2, as determined by Algorithm 4.<sup>34,36</sup> From Table VI, the DVF4-latch combination reduces power by 52.2% relative to the single VDD reference circuit, whereas the best existing LCFF reduces power by 25.47% relative to the same reference circuit.

# 5. CONCLUSION

In this paper, DVF4, a new level converter based on dual- $V_{\rm th}$  and feedback techniques, is proposed and compared to the best available level converter. Optimized for minimum power consumption and delay in 32 nm CMOS technology, the proposed level converter (DVF4) offers power savings of up to 53% and delay savings of up to 50%. When used in dual-VDD designs of benchmark circuits, DVF4 yields energy savings of up to 61% over the single-VDD design. A single- $V_{\rm th}$  alternative of the design is also effective and offers power savings of 56.1% and delay savings of 33.17% over a reduced voltage range. The advantage of DVF4 in dual voltage low power design is demonstrated using various test structures and circuit examples.

# References

1. G. E. Moore, Lithography and the future of Moore's Law. Proc. SPIE 2437, 2 (1995). Reprinted in IEEE SSCS Newsletter, September (2006), pp. 37–42.
2. A. A. Chien and V. Karamcheti, Moore's Law: The first ending and a new beginning. Computer 46, 48 (2013).
3. M. Keating, D. Flynn, R. Aitken, A. Gibbons, and K. Shi, Low Power Methodology Manual: For System-on-Chip Design, Springer, New York, USA (2007).
4. V. Kursun and E. G. Friedman, Multi-Voltage CMOS Circuit Design, Wiley, England, UK (2006).
5. K. Usami and M. Horowitz, Clustered voltage scaling technique for low-power design, *Proc. International Symposium on Low Power Electronics and Design*, New York, NY, USA (1995), pp. 3–8.
6. M. Donno, L. Macchiarulo, A. Macii, E. Macii, and M. Poncino, Enhanced clustered voltage scaling for low power, *Proc. 12th ACM Great Lakes Symposium on VLSI*, New York, NY, USA (2002), pp. 18–23.
7. S. Kulkarni, A. Srivastava, D. Sylvester, and D. Blaauw, Power Optimization Using Multiple Supply Voltages, Springer, New York, USA (2007).
8. K. Usami, M. Igarashi, F. Minami, T. Ishikawa, M. Kanazawa, M. Ichida, and K. Nogami, Automated low-power technique exploiting multiple supply voltages applied to a media processor. *IEEE Journal of Solid-State Circuits* 33, 463 (1998).
9. S. H. Kulkarni, A. N. Srivastava, and D. Sylvester, A new algorithm for improved VDD assignment in low power dual VDD systems, *Proc. International Symp. Low Power Design*, Newport Beach, CA, USA (2004), pp. 200–205.
10. S. H. Kulkarni and D. Sylvester, Fast and energy efficient asynchronous level converters for multi-VDD design [CMOS ICs], *Proc. IEEE International System on Chip Conf.* (2003), pp. 169–172.
11. K. H. Koo, J. H. Seo, M. L. Ko, and J. W. Kim, A new level up shifter for high speed and wide range interface in ultra deep sub micron, *Proc. IEEE International Symp. Circuits and Systems*, Japan (2005), Vol. 2, pp. 1063–1065.
12. S. Wooters, B. Calhoun, and T. Blalock, An energy efficient subthreshold level converter in 130 nm CMOS. *IEEE Trans. Circuits and Systems II: Express Briefs* 57, 290 (2010).
13. C. Tran, H. Kawaguchi, and T. Sakurai, Low power high speed level shifter design for block level dynamic voltage scaling environment, *Proc. International Conf. Integrated Circuit Design and Technology*, Austin, Texas, USA (2005), pp. 229–232.
14. M. Kumar, S. K. Arya, and S. Pandey, Level shifter design for low power applications. *International Journal of Computer Science and Information Technology* 229 (2010).
15. S. K. Han, K. Park, B. S. Kong, and Y. H. Jun, High speed low power bootstrapped level converter for dual supply systems, *Proc. IEEE Asia Pacific Conference on Circuits and Systems*, Kuala Lumpur, Malaysia (2010), pp. 871–874.
16. B. Sathiyabama and S. Malarkkan, Low power adders for MAC unit using dual supply voltage in DSP processor, *Proc. International Conf. Solid-State and Integrated Circuit*, Singapore (2012).
17. S. K. Manohar, V. K. Somasundar, R. Venkatasubramanian, and P. T. Balsara, Bidirectional single-supply level shifter with wide voltage range for efficient power management, *Proc. 25th International Conference on VLSI Design*, Hyderabad, India (2012), pp. 125–130.
18. S. Ali, S. Manohar, and S. Balasubramanian, Method and apparatus of a level shifter circuit with duty-cycle correction. US Patent 7,352,228 B2 (2008).
19. S. Ali, S. Balasubramanian, and S. Manohar, Method and apparatus of a level shifter circuit having a structure to reduce fall and rise path delay. US Patent 7,511,552 (2009).
20. S. Ali and S. Manohar, Single supply level shifter circuit for multi-voltage designs, capable of up/down shifting. US Patent 7,750,717 B2 (2010).
21. J. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, Pearson Education, New Jersey, USA (2003).
22. N. H. E. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Addison-Wesley, Boston, Massachusetts, USA (2004).
23. S. Masaki, A. Yamamoto, F. Seki, F. Asami, K. Ohno, M. Imai, and S. Udo, Level Converter for CMOS 3 V to from 5 V. United States Patent 5,680,064, issued Oct. 21 (1997).
24. S. A. Tawfik and V. Kursun, Multi-V<sub>th</sub> level conversion circuits for multi-VDD systems, *Proc. IEEE International Symposium on Circuits and Systems*, New Orleans, LA (2007), pp. 1397–1400.
25. K. N. Jayaraman, DVF4: A Dual V<sub>th</sub> Feedback Based 4-Transistor Level Converter. Master's thesis, Auburn University, Auburn, Alabama (2013).
26. HSPICE Reference Manual: Commands and Control Options, Version D-2010.03-SP1, 2010. http://www.synopsys.com/Tools/Verification/AMSVerification/CircuitSimulation/HSPICE/Pages/default.aspx, accessed on October 14, 2011.
27. T. Christiansen, B. D. Foy, L. Wall, and J. Orwant, Programming Perl, 4th ed., O'Reilly Media, Inc., Sebastopol, CA (2012).
28. Predictive Technology Model, Arizona State University.
29. S. Borkar, Design perspectives on 22 nm CMOS and beyond, *Proc. 46th ACM/IEEE Design Automation Conference*, San Francisco, CA (2009), pp. 93–94.
30. S. Sindia, F. F. Dai, V. D. Agrawal, and V. Singh, Impact of process variations on computers used for image processing, *Proc. International Symp. Circuits and Systems*, Seoul, South Korea (2012), pp. 1444–1447.
31. X. Yuan, T. Shimizu, U. Mahalingam, J. S. Brown, K. Z. Habib, D. G. Tekleab, T. C. Su, S. Satadru, C. M. Olsen, H. W. Lee, L. H. Pan, T. B. Hook, J. P. Han, J. E. Park, M. H. Na, and K. Rim, Transistor mismatch properties in deep-submicrometer CMOS technologies. *IEEE Transactions on Electron Devices* 58, 335 (2011).
32. K. Kim, Ultra Low Power CMOS Design. Ph.D. thesis, Auburn University, Auburn, Alabama (2011).
33. K. Kim and V. D. Agrawal, Dual voltage design for minimum energy using gate slack, *IEEE Conference on Industrial Technology (ICIT)*, Auburn, Alabama, USA (2011), pp. 419–424.
34. M. Allani, Polynomial-Time Algorithms for Designing Dual-Voltage Energy Efficient Circuits. Master's thesis, Auburn University, Auburn, Alabama (2011).
35. M. Allani and V. D. Agrawal, An efficient algorithm for dual-voltage design without need for level conversion, *44th Southeastern Symposium on System Theory (SSST)*, Jacksonville, FL (2012), pp. 51–56.
36. M. Allani and V. D. Agrawal, Energy-efficient dual-voltage design using topological constraints. *J. Low Power Electronics* 9, 275 (2013).
37. R. Bai and D. Sylvester, Analysis and design of level-converting flip-flops for dual-V<sub>dd</sub>/V<sub>th</sub> integrated circuits, *Proc. International Symposium on System-on-Chip* (2003), pp. 151–154.
38. F. Ishihara, F. Sheikh, and B. Nikolic, Level conversion for dual-supply systems. *IEEE Transactions on Very Large Scale Integration Systems* 12, 185 (2004).
39. J. Rabaey, Low Power Design Essentials, Springer, New York, USA (2009).

#### Karthik N. Jayaraman

Karthik N. Jayaraman received his B.E. degree in electrical engineering from Anna University, Chennai, India, in 2006, and his M.S. degree in electrical engineering from Auburn University, Auburn, Alabama, in December 2013. In 2010–2011 he was with Cognizant Technology Solutions as a Programmer Analyst. His current research interests include circuit design and approaches for low-power VLSI design.

#### Vishwani D. Agrawal

Vishwani D. Agrawal is James J. Danaher Professor of Electrical and Computer Engineering at Auburn University, Alabama, USA. He has over forty years of industry and university experience, working at Bell Labs, Murray Hill, NJ, USA; Rutgers University, New Brunswick, NJ, USA; TRW, Redondo Beach, CA, USA; IIT, Delhi, India; EG&G, Albuquerque, NM, USA; and ATI, Champaign, IL, USA. His areas of expertise include VLSI testing, low-power design, and microwave antennas. He obtained his B.E. degree from the Indian Institute of Technology Roorkee, Roorkee, India, in 1964; M.E. degree from the Indian Institute of Science, Bangalore, India, in 1966; and Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign, in 1971. He has published over 350 papers, has coauthored five books and holds thirteen United States patents. He is the Editor-in-Chief of the Journal of Electronic Testing: Theory and Applications, and a past Editor-in-Chief (1985-87) of the IEEE Design and Test of Computers magazine. He was the Keynote Speaker at the 25th International Conference on VLSI Design, Hyderabad, India, 2012, Keynote Speaker at the Ninth Asian Test Symposium, Taipei, Taiwan, 2000, and an invited Plenary Speaker at the 1998 International Test Conference, Washington, D.C., USA. He served on the Board of Governors (1989-90) of the IEEE Computer Society, and in 1994 chaired the Fellow Selection Committee of that Society. He has received eight Best Paper Awards and several other awards including two Lifetime Achievement Awards, and the 2014 James Monzel Award from the IEEE North Atlantic Test Workshop. Agrawal is a Fellow of the ACM, IEEE and IETE-India. He has served on the Advisory Boards of the ECE Departments of the University of Illinois at Urbana-Champaign, New Jersey Institute of Technology, and the City College of the City University of New York. See his website: http://www.eng.auburn.edu/~vagrawal.