In this paper, a power reduction scheme based on Cluster Voltage Scaling (CVS) for the gate-level design of VLSI circuits is presented. To increase the power reduction efficiency of previous CVS techniques, a new low-power level shifter is utilized in the circuit. In addition, the concept of transistor ordering is used to further reduce the power consumption. The technique shows an average improvement of 7% compared to previous CVS circuits. The impact of CVS and its modified version on the reduction of short-circuit and leakage power is also discussed.
I. INTRODUCTION
With the importance of battery life and reliability in portable products, the low-power design of CMOS VLSI circuits has attracted much attention in recent years, and numerous research efforts have addressed various power reduction techniques [1]. These techniques are based on reducing the capacitance, the switching activity, the frequency, or the supply voltage of the circuit.
Reducing the supply voltage, also called Voltage Scaling (VS), has been deemed the most promising approach to power reduction [2]. Since lowering the voltage increases the delay of the circuit, some techniques have been proposed to deal with the performance degradation resulting from the voltage reduction. Parallelism and pipelining are two well-known architectural techniques for overcoming this problem at the Register-Transfer Level (RTL) [1]. In gate-level designs, however, these techniques cannot be applied and, hence, voltage scaling is performed only on the gates off the critical path of the circuit [3]. Using two different supply voltages for the gates leads to large DC leakage currents, which occur when a low-voltage gate directly drives a high-voltage one. To avoid this problem, a Level Shifter (LS) is used at the interface between a low-voltage gate and a high-voltage gate. Since the LS circuit consumes power and has a considerable delay, minimizing the number of level shifters is important in the voltage scaling technique. Considering this fact, a few techniques have been proposed to deal with voltage scaling at the gate level [3]-[6]. The most popular of them is Cluster Voltage Scaling (CVS) [5], in which level shifters are used only in front of the primary outputs. Fig. 1 shows the idea behind this technique. Although CVS uses the least possible number of level shifters for voltage scaling, the power and delay overhead of the level shifters limits the efficiency of this technique. Additionally, most reported research efforts on CVS have studied only the impact on the dynamic power, and few studies have been conducted to survey the efficiency of this technique for short-circuit and leakage power reduction.
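The clustering idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the greedy visiting order, the longest-path timing check, and all names (`fanin`, `d_high`, `d_low`, `d_ls`, `t_req`) are assumptions made for illustration. A gate is moved to the low supply only if every gate it drives is already on the low supply, so that a low-voltage gate never drives a high-voltage one, and only if the circuit still meets its required time once the level-shifter delay at low-voltage primary outputs is included.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_delay(fanin, outputs, low, d_high, d_low, d_ls):
    """Longest-path delay; a low-voltage primary output also pays d_ls."""
    arrival = {}
    for g in TopologicalSorter(fanin).static_order():
        start = max((arrival[p] for p in fanin[g]), default=0.0)
        arrival[g] = start + (d_low[g] if g in low else d_high[g])
    return max(arrival[o] + (d_ls if o in low else 0.0) for o in outputs)


def cluster_voltage_scaling(fanin, outputs, d_high, d_low, d_ls, t_req):
    """Greedy sketch: grow the low-voltage cluster backward from the outputs."""
    fanout = {g: [] for g in fanin}
    for g, preds in fanin.items():
        for p in preds:
            fanout[p].append(g)
    low = set()
    order = list(TopologicalSorter(fanin).static_order())
    for g in reversed(order):  # every fanout of g is visited before g
        if all(s in low for s in fanout[g]):  # never let VDDL drive VDDH
            trial = low | {g}
            if critical_delay(fanin, outputs, trial, d_high, d_low, d_ls) <= t_req:
                low = trial
    return low


# Hypothetical five-gate netlist: g1 -> g2 -> g3 is critical, g4 -> g5 is not.
fanin = {"g1": [], "g2": ["g1"], "g3": ["g2"], "g4": [], "g5": ["g4"]}
outputs = ["g3", "g5"]
d_high = {g: 1.0 for g in fanin}  # gate delay at the high supply (arbitrary units)
d_low = {g: 1.5 for g in fanin}   # gate delay at the low supply
tight = cluster_voltage_scaling(fanin, outputs, d_high, d_low, 0.5, 3.0)
relaxed = cluster_voltage_scaling(fanin, outputs, d_high, d_low, 0.5, 3.5)
```

With the tight timing constraint only `g5` can be moved to the low supply. Relaxing the constraint also admits `g4`, which shows how any delay saved elsewhere translates into more low-voltage gates.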
In this paper, a low-power, low-delay level shifter is incorporated in the CVS technique to reduce the power consumption. In addition, the transistor ordering concept is utilized in the design. A study of the impact of CVS on the reduction of the short-circuit and leakage power components is also presented. The paper is organized as follows. In Section II, we introduce the gate-level power model used in this work. Section III describes our modifications to CVS, including the low-power level shifter and transistor ordering, while Section IV presents the simulation results and discussion.
II. GATE-LEVEL MODELING OF POWER DISSIPATION
For a comprehensive study of the efficiency of CVS for power reduction, we need models that describe the power dissipation of the gates. The power dissipation of a CMOS gate consists of three major components: dynamic power, short-circuit power and leakage power. The power consumption of a CMOS gate can therefore be expressed by
$$P = P_{Dyn} + P_{SC} + P_{Leak} \qquad (1)$$
where $P_{Leak}$, $P_{SC}$ and $P_{Dyn}$ are the leakage, short-circuit and dynamic power, respectively, and $P$ is the total power dissipation of the gate.
A. Dynamic Power Dissipation
Charging and discharging the load and parasitic capacitances of a gate creates the dynamic power dissipation. With good accuracy, this component of power can be expressed by
$$P_{Dyn} = \alpha f (C_L + C_{int}) V_{DD}^2 \qquad (2)$$
where $\alpha$ is the switching activity, $f$ is the clock frequency, $C_L$ is the fanout capacitance, $C_{int}$ is the parasitic capacitance of the gate output node and $V_{DD}$ is the supply voltage.
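The quadratic dependence on the supply voltage is what makes voltage scaling attractive. The following Python sketch evaluates the dynamic power under the usual form $\alpha f (C_L + C_{int}) V_{DD}^2$, which is assumed here. The activity, frequency and capacitance values are hypothetical; the two supply voltages are the 3.3 V and 2.0 V used later in this paper.

```python
def dynamic_power(alpha, freq_hz, c_load, c_int, vdd):
    """Dynamic power of one gate: alpha * f * (C_L + C_int) * V_DD^2."""
    return alpha * freq_hz * (c_load + c_int) * vdd ** 2


# Hypothetical gate: 20% activity, 100 MHz clock, 20 fF fanout load, 5 fF parasitic.
p_high = dynamic_power(0.2, 100e6, 20e-15, 5e-15, 3.3)
p_low = dynamic_power(0.2, 100e6, 20e-15, 5e-15, 2.0)
ratio = p_low / p_high  # equals (2.0 / 3.3) ** 2, independent of the other values
```

The ratio depends only on the two supply voltages, so under this model a gate moved from 3.3 V to 2.0 V dissipates roughly 63% less dynamic power, whatever its activity and load.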
B. Short Circuit Power Dissipation
When the transition time of the input signal of a gate is not short enough, the short-circuit power dissipation has a considerable effect on the total power dissipation and cannot be neglected [8]. Many good approaches have been proposed to model this component of power (see, e.g., [8]). However, the complexity of these models makes them poorly suited to a gate-level process. In this work, we use a simple approach for evaluating the short-circuit power dissipation that was proposed in [9].
The short-circuit power dissipation of a CMOS inverter can be expressed as a function $P_{SC} = f(W_n, W_p, T_r, C_L)$, where $W_n$ and $W_p$ are the widths of the NMOS and PMOS transistors, $T_r$ is the 10-90% input transition time and $C_L$ is the load capacitance. In [9] it was shown that, with acceptable accuracy, the short-circuit power dissipation of an inverter can be described by the fitted expressions (3) and (4) for $P_{SC,f}$ and $P_{SC,r}$, where $P_{SC,f}$ is the short-circuit power dissipation when the output of the inverter changes from $V_{DD}$ to 0 and $P_{SC,r}$ is the short-circuit power dissipation when its output goes from 0 to $V_{DD}$. Furthermore, $K_r$, $K_f$, $\alpha_{1r}$, $\alpha_{1f}$, $\alpha_{2r}$, $\alpha_{2f}$, $\alpha_{3r}$, $\alpha_{3f}$, $\alpha_{4r}$ and $\alpha_{4f}$ are parameters that depend on the technology and the library. To obtain the short-circuit power dissipation of complex gates (such as NAND and NOR), these gates can be converted to equivalent inverters [9].
C. Leakage Power Dissipation
In current technologies, the leakage power consumption is small, but in future technologies this component may have a great impact on the total power dissipation and cannot be neglected. In order to survey the impact of any power optimization technique on state-of-the-art VLSI circuits, one needs to consider the contribution of this component to the power consumption.
It should be noted that the leakage power of a CMOS gate depends not only on the gate size, but also on the values of the input signals. For example, in a 3-input NAND gate, when all input signals are 0, the leakage power is at least one order of magnitude less than in the other cases [11]. To express the dependency on the input signal pattern, the leakage power of a complex gate when the logical values of its $n$ inputs are $b_1, b_2, \dots, b_n$, respectively, can be denoted by $P_{Leak}(b_1, b_2, \dots, b_n)$ [11]. The average leakage power of this gate can be modeled by [11]
$$\bar{P}_{Leak} = \sum_{b_1, \dots, b_n} \mathrm{Prob}(b_1, b_2, \dots, b_n) \, P_{Leak}(b_1, b_2, \dots, b_n) \qquad (5)$$
where $\mathrm{Prob}(b_1, b_2, \dots, b_n)$ is the probability that the logical values of the input signals are $b_1, b_2, \dots, b_n$, respectively. With an assumption of temporal independence for the input signals, one can write
$$\mathrm{Prob}(b_1, b_2, \dots, b_n) = \prod_{i=1}^{n} \mathrm{Prob}(b_i)$$
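As an illustration of this pattern-weighted average, the Python sketch below sums the per-pattern leakage over all input combinations, weighting each pattern by its probability. It assumes that the inputs are mutually independent, so that the pattern probability is the product of the per-input probabilities. The function name and the leakage values of the 2-input gate are hypothetical; in this work the per-pattern values come from simulating the library gates.

```python
from itertools import product


def average_leakage(leak_by_pattern, p_one):
    """Pattern-weighted average leakage of one gate.

    leak_by_pattern maps an input tuple such as (0, 1) to its leakage power;
    p_one[i] is the probability that input i is at logic 1 (inputs assumed
    mutually independent).
    """
    total = 0.0
    for bits in product((0, 1), repeat=len(p_one)):
        prob = 1.0
        for b, p in zip(bits, p_one):
            prob *= p if b else 1.0 - p
        total += prob * leak_by_pattern[bits]
    return total


# Hypothetical 2-input gate (arbitrary units): the all-zero pattern leaks least.
leak = {(0, 0): 1.0, (0, 1): 10.0, (1, 0): 10.0, (1, 1): 10.0}
avg = average_leakage(leak, [0.5, 0.5])
```

With equiprobable inputs, each pattern has probability 0.25, so the average is 0.25 x 1.0 + 0.75 x 10.0 = 7.75 in the same arbitrary units.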
III. MODIFICATION TO CVS

A. Level Shifter
As mentioned in the introduction, in the CVS technique a level shifter must be inserted at the primary outputs to prevent static current in the circuit. Fig. 2(a) shows the traditional level shifter, called Dual Cascode Voltage Switch (DCVS), which has been used in [2, 3, 5, 6]. Simulation results show that the power consumption of this level shifter is about four to five times that of an inverter, and that its delay is about four times that of an inverter. Thus, a level shifter with lower power and less delay can improve the efficiency of the CVS technique. In [12], a new level shifter has been proposed which has less power and delay at the cost of adding two transistors to the previous structure. This structure, called Symmetrical Dual Cascode Voltage Switch (SDCVS), is shown in Fig. 2(b).
To compare this level shifter with the traditional one, we have simulated both of them in SPICE. The widths of the transistors and inverters with the same name in the two structures have been chosen to be equal. Figs. 3 and 4 compare the power and delay of the two structures, respectively. As can be seen, the delay and power of the SDCVS structure are considerably lower than those of DCVS. It is therefore expected that the use of this level shifter can improve the efficiency of CVS.
B. Transistor Ordering
At the physical level, it is known that placing the critical-path transistors closer to the output of the gate can increase the speed of the gate [7] (see Fig. 5). Our experiments showed that with this technique, called Transistor Ordering, the critical-path delay of the circuit could be reduced by up to 15%. Using this technique, one can reduce the delay of the circuit in CVS and thereby gain timing slack for voltage scaling.
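The effect of transistor ordering can be illustrated with a first-order Elmore-delay estimate for the series NMOS stack of a NAND gate. This is a textbook RC approximation, not the model used in our experiments, and the function name and component values are hypothetical. If the last-arriving input drives the k-th transistor counted from the output, the output node and the k-1 internal nodes above that transistor are still charged when it turns on and must all be discharged through the stack. The nodes below it have already been discharged.

```python
def stack_discharge_delay(n, k, r_on, c_out, c_int):
    """Elmore delay of an n-transistor series stack.

    k is the position of the last-arriving input, counted from the output
    (k = 1 is adjacent to the output, k = n is adjacent to ground).
    Every transistor has on-resistance r_on, every internal node has
    capacitance c_int, and the output node has capacitance c_out.
    """
    delay = c_out * n * r_on             # output node sees the full stack
    for j in range(1, k):                # charged internal nodes above transistor k
        delay += c_int * (n - j) * r_on  # node j sees n - j transistors to ground
    return delay


# Hypothetical 3-input NAND pull-down stack, normalized units.
near_output = stack_discharge_delay(3, 1, 1.0, 4.0, 1.0)
near_ground = stack_discharge_delay(3, 3, 1.0, 4.0, 1.0)
```

In this normalized example the delay is 12 when the critical input is adjacent to the output and 15 when it is adjacent to ground, which is the ordering effect exploited above.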
IV. SIMULATION RESULTS AND DISCUSSION
For our voltage scaling, we have chosen supply voltages of 3.3 V and 2.0 V. For modeling the short-circuit power, we have used HSPICE simulations to obtain the short-circuit power dissipation of an inverter with both the high- and low-voltage supplies for a variety of parameters ($W_n$, $W_p$, $C_L$, $T_r$). Then, using the Simulated Annealing technique [10] for curve fitting, we have obtained the parameters of (3) and (4) for both $V_{DDH}$ and $V_{DDL}$. In all cases, the least-squares error of the fitting was less than 10%. The short-circuit power dissipation of the complex gates in the library was obtained by converting the gates into their equivalent inverters. The leakage power component for each input pattern was estimated by simulating the gates of the library for all combinations of the input signals.
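The fitting step can be sketched in Python as follows. The power-law model is a hypothetical stand-in chosen only because it has one scale factor and four exponents; it is not the form of (3) and (4). The synthetic samples, the cooling schedule and the step size are likewise assumptions, whereas in this work the samples come from HSPICE simulations.

```python
import math
import random
from itertools import product


def model(params, wn, wp, tr, cl):
    """Hypothetical power-law stand-in for a short-circuit power model."""
    k, a1, a2, a3, a4 = params
    return k * wn ** a1 * wp ** a2 * tr ** a3 * cl ** a4


def squared_relative_error(params, samples):
    return sum(((model(params, *x) - y) / y) ** 2 for x, y in samples)


def anneal(samples, start, steps=5000, t0=1.0, seed=0):
    """Simulated annealing with a linear cooling schedule; returns the best fit seen."""
    rng = random.Random(seed)
    current, cur_err = list(start), squared_relative_error(start, samples)
    best, best_err = list(current), cur_err
    for i in range(steps):
        temp = t0 * (1.0 - i / steps) + 1e-9
        cand = [p + rng.gauss(0.0, 0.05) for p in current]
        cand_err = squared_relative_error(cand, samples)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_err < cur_err or rng.random() < math.exp((cur_err - cand_err) / temp):
            current, cur_err = cand, cand_err
            if cur_err < best_err:
                best, best_err = list(current), cur_err
    return best, best_err


# Synthetic samples on a normalized grid, generated from made-up parameters.
true_params = (0.5, 0.8, 0.6, 1.2, -0.3)
samples = [(x, model(true_params, *x)) for x in product((1.0, 2.0), repeat=4)]
start = [1.0, 1.0, 1.0, 1.0, 0.0]
start_err = squared_relative_error(start, samples)
best, best_err = anneal(samples, start)
```

Because the routine keeps the best parameter set it has visited, the returned error can never exceed that of the starting guess. How close it gets to the generating parameters depends on the schedule and the number of steps.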
We have implemented the CVS technique in C, on top of the SIS [4] environment. In addition, the Transistor Ordering feature has been added to the implementation of CVS. In these experiments, SDCVS has been used as the level shifter. Twenty MCNC benchmark circuits are used as the test bed. Each of them was optimized by "script.rugged", provided in the SIS package, and then mapped to a minimum-delay circuit of the technology library. A 0.35µm CMOS technology library, enriched by adding low-voltage gates, has been used for our simulations.
Table 1 compares the traditional CVS technique, with the DCVS level shifter and without transistor ordering, against the Modified CVS (MCVS), with the SDCVS level shifter and with transistor ordering. As this table shows, the average improvement of MCVS relative to CVS is more than 7%. Table 2 gives more details of our simulations. In this table, we show the contributions of the short-circuit (SC) and dynamic (Dyn.) power consumption to the power dissipation of each circuit. It is seen that CVS and its modified version reduce both the dynamic and short-circuit power components. Table 3 shows the total number of gates in the original circuit, the number of level shifters used in CVS and MCVS, and the ratio of $V_{DDL}$ gates in each technique.
The impact of CVS on the leakage power is tabulated in Table 4. As can be seen, in many cases the use of CVS increases the leakage power. This originates from the fact that the leakage power of a level shifter (whether DCVS or SDCVS) is considerably higher than the leakage power of the other gates in the library. So, in future technologies, where the leakage power has a great impact on the total power dissipation, CVS must be incorporated in the circuit with care.
