AN APPROACH FOR LOW LEAKAGE POWER BY POWER GATING STACK TECHNIQUE by Srikanth, Jarpula
Jarpula Srikanth* et al. 
  (IJITR) INTERNATIONAL JOURNAL OF INNOVATIVE TECHNOLOGY AND RESEARCH 
  Volume No.4, Issue No.6, October – November 2016, 4579-4582.  
2320 –5547 @ 2013-2016 http://www.ijitr.com All rights Reserved.  Page | 4579 
An Approach For Low Leakage Power By 
Power Gating Stack Technique 
JARPULA SRIKANTH 
Assistant Professor 
Mallareddy Institute Of Technology 
Hyderabad 
Abstract— Clock gating (CG) and power gating (PG), the two most widely used techniques for reducing 
dynamic power and leakage power respectively, are expected to be integrated together effectively. 
Normally, the implementation of CG leads to some redundant operations, which provides the opportunity 
to apply PG. In this brief, we propose an activity-driven fine-grained CG and PG integration. For 
the implementation of XOR-based CG, we introduce an optimized bus-specific-clock-gating (OBSC) 
scheme to improve traditional gating. It selects only a subset of flip-flops (FFs) to be gated, 
and the problem of gated FF selection is reduced from exponential complexity to linear. The 
combinational logic blocks that depend completely on the outputs of gated FFs then perform redundant 
operations. They can be power gated, and the clock enable signal generated by OBSC is used as the sleep 
signal. A minimum average idle time concept is proposed to determine whether the insertion of PG will 
lead to energy reduction. The simulation results show that OBSC reduces dynamic power by 25.07% 
and that PG saves 50.19% of active leakage power. 
Key words— Clock Gating; Low Power; Power Gating; 
I. INTRODUCTION 
Low power has emerged as a principal theme in 
today’s electronics industry. The need for low power 
has caused a major paradigm shift in which power 
dissipation has become as important a consideration 
as performance and area. Two components 
determine the power consumption of a CMOS 
circuit. Static power includes sub-threshold leakage, 
drain junction leakage, and gate leakage due to 
tunneling; among these, sub-threshold leakage is the 
most prominent. Dynamic power includes 
charging and discharging power and short-circuit 
power. As the technology feature size scales down, 
the supply voltage and threshold voltage also scale 
down. Sub-threshold leakage power increases 
exponentially as the threshold voltage decreases. 
Furthermore, short-channel effects reduce the 
threshold voltage even further. 
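The exponential dependence on threshold voltage can be illustrated with a rough first-order model in which sub-threshold current scales as exp(-Vth / (n·kT/q)). The sketch below is only an illustration under that assumption; the slope factor n = 1.5, the 300 K temperature, and the function names are hypothetical choices, not values taken from this brief.

```python
import math

BOLTZMANN = 1.380649e-23            # Boltzmann constant, J/K
ELECTRON_CHARGE = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_kelvin: float = 300.0) -> float:
    """Thermal voltage kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def leakage_ratio(delta_vth: float, n: float = 1.5,
                  temp_kelvin: float = 300.0) -> float:
    """Factor by which sub-threshold current grows when Vth drops by delta_vth volts.

    Assumes I_sub ~ exp(-Vth / (n * kT/q)); n is an assumed slope factor.
    """
    return math.exp(delta_vth / (n * thermal_voltage(temp_kelvin)))


# Relative leakage increase for an assumed 100 mV reduction in Vth.
ratio_100mv = leakage_ratio(0.100)
print(f"Leakage grows roughly {ratio_100mv:.1f}x for a 100 mV drop in Vth")
```

Under these assumed parameters, a 100 mV drop in threshold voltage raises sub-threshold leakage by roughly an order of magnitude, which is why leakage becomes a first-class concern as devices scale.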
Power consumption consists of dynamic power and 
leakage power, and leakage power can be divided 
into standby leakage and active leakage. Clock 
gating (CG) is the most common and widely used 
technique to reduce dynamic power, and power 
gating (PG) is the dominant technique to reduce 
standby leakage power. As active leakage power 
becomes more and more important, it also requires 
care. PG applied to minimize active leakage power 
in operation mode is referred to as run time power 
gating (RTPG) in this brief. CG is a technique used 
to gate the unnecessary clock toggles of a register. 
During the clock gated period, some components 
perform redundant operations, and RTPG puts 
these components to sleep. Several researchers 
have focused on the integration of CG and RTPG. 
All of their designs are based on clock gated designs 
generated after synthesis, and they evaluate the 
feasibility of RTPG according to the signal activity 
of the design. However, it is possible that a design 
cannot be clock gated during synthesis. 
In this brief, we propose an activity-driven 
fine-grained CG and RTPG integration, which can 
reduce dynamic power and active leakage power 
simultaneously. An activity-driven optimized bus 
specific CG (OBSC hereafter) is used to maximize 
dynamic power reduction at the RT level before 
synthesis. It selects only a subset of flip-flops (FFs) 
to be gated, and the problem of gated FF 
selection is reduced from exponential complexity 
to linear. After the OBSC is applied to the design, 
the components performing redundant operations 
during the clock gated period are determined by 
forward traversing the circuit from the gated FF 
outputs. These components will be power gated 
using the clock enable signal generated by OBSC 
only if the implementation of RTPG can reduce 
active leakage power. The feasibility analysis of 
RTPG is based on our proposed minimum average 
idle time concept. 
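One plausible reading of the minimum average idle time idea is a break-even check: a sleep/wake cycle costs some overhead energy, so gating only pays off when the cells stay idle long enough for the saved leakage to exceed that overhead. The sketch below illustrates that reading only; this brief does not give the exact definition, and the function names, parameters, and numbers are hypothetical.

```python
def break_even_idle_time(overhead_energy_j: float, leakage_saved_w: float) -> float:
    """Idle time in seconds above which one sleep/wake cycle saves net energy."""
    if leakage_saved_w <= 0:
        raise ValueError("leakage_saved_w must be positive")
    return overhead_energy_j / leakage_saved_w


def power_gating_pays_off(idle_periods_s, overhead_energy_j, leakage_saved_w) -> bool:
    """True if the average observed idle period exceeds the break-even time."""
    if not idle_periods_s:
        return False
    average_idle = sum(idle_periods_s) / len(idle_periods_s)
    return average_idle > break_even_idle_time(overhead_energy_j, leakage_saved_w)


# Assumed numbers: 2 pJ overhead per sleep/wake cycle, 1 uW of leakage saved while asleep.
long_idle = power_gating_pays_off([1e-6, 5e-6, 3e-6], 2e-12, 1e-6)
short_idle = power_gating_pays_off([1e-6, 1e-6], 2e-12, 1e-6)
print(long_idle, short_idle)
```

In this toy case the break-even time is about 2 microseconds, so the first activity profile (3 microseconds average idle) would justify inserting PG and the second (1 microsecond) would not.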
The rest of this brief is organized as follows. Section 
II gives an introduction to CG and PG basics. The 
proposed activity-driven OBSC is presented in 
Section III. Section IV explains the details on how 
to implement PG after OBSC. Experimental results 
are given in Section V, and this brief is concluded in 
Section VI. 
  
II. BACKGROUND 
A. CG Basics 
As the operating speed of a chip increases, its 
dynamic power consumption increases 
dramatically. CG is a technique used to gate the 
unnecessary clock toggles of registers. It controls 
the power dissipated by the clock network and 
thereby reduces dynamic power dissipation. In a 
synchronous circuit, the clock network can be 
responsible for up to 40% of the power dissipation. 
Clock gating reduces unwanted switching in parts 
of the clock network by disabling the clock signal, 
and it saves power at the cost of adding more logic 
to the clock network. When the clock is not 
switching, the switching (dynamic) power 
consumption goes to zero and only leakage current 
remains. Clock gating shuts off the clock while the 
system holds its current state, so that dynamic 
power consumption is reduced. 
Fig. 1(a) is a typical non-CG circuit and Fig. 1(b) is 
its traditional XOR-based CG circuitry [we call it 
bus-specific-clock-gating (BSC) hereafter]. The BSC 
circuit compares the FF inputs and outputs, and gates 
the clock when they are equal. BSC can be used as a 
final CG option to reduce dynamic power when no 
CG can be applied during synthesis. However, BSC 
is far from optimal in terms of dynamic power 
minimization, and the partial BSC (PBSC hereafter) 
circuit may consume much less power. More details 
are given in Section III. 
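The XOR-based comparison can be sketched behaviorally: each FF input bit is XORed with the stored output bit, the results are ORed into one enable, and the clock reaches the register only when the enable is 1. This is a simplified software model under that assumption, not the gate-level circuit of Fig. 1(b); the function names are hypothetical.

```python
def bsc_clock_enable(d_bits, q_bits):
    """Return 1 if any input bit differs from the stored bit, else 0."""
    if len(d_bits) != len(q_bits):
        raise ValueError("input and output buses must have the same width")
    enable = 0
    for d, q in zip(d_bits, q_bits):
        enable |= d ^ q   # XOR detects a per-bit change, OR merges them
    return enable


def simulate_bsc(input_words, initial_state):
    """Apply a sequence of input words; clock the register only when enabled."""
    state = list(initial_state)
    clocked_cycles = 0
    for word in input_words:
        if bsc_clock_enable(word, state):
            state = list(word)      # clock passes: register captures the input
            clocked_cycles += 1
        # otherwise the clock is gated and the register holds its value
    return state, clocked_cycles


words = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
final_state, clocked = simulate_bsc(words, [0, 0, 0, 0])
print(final_state, clocked)  # [0, 1, 0, 0] 2
```

In this example only two of the four cycles need a clock edge, which is the saving that gating on equal input and output exploits.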
B. PG Basics 
In this brief, we use the most basic PG structure (a 
single footer) to reduce the leakage power, as shown 
in Fig. 2. The sleep signal that controls the footer in 
traditional PG is provided by an independent power 
management block. In this brief, we do not consider 
this type of sleep signal. Instead, the sleep signal of 
RTPG we focus on is generated by CG in operation 
mode. It is used to turn off the components that are 
executing redundant operations in operation mode. 
III. ACTIVITY-DRIVEN OPTIMIZED BUS-SPECIFIC CG 
A. Preliminary 
1) Combinational Logic Power Model:  If the 
logic is part of a synchronous digital system 
controlled by a global clock, the average dynamic 
power dissipated by the gate, $P_{avg}^{comb}$, can be 
expressed as 

$$P_{avg}^{comb} = \frac{1}{2} \cdot \frac{V_{dd}^{2}}{T_{cyc}} \cdot C \cdot TR$$

where $V_{dd}$ is the supply voltage, $T_{cyc}$ is the global 
clock period, $TR$ is the toggle rate of the gate output, 
and $C$ is the gate output capacitance. Among these 
four parameters, only $V_{dd}$ and $T_{cyc}$ can be determined 
in advance from the technology and design 
information, and they can be treated as constants in 
the estimation process. $TR$ depends on both the 
logic function being performed and the statistical 
properties of the primary inputs. When the output of 
a combinational logic toggles every clock cycle, its 
$TR$ is 1, and the power dissipated by this 
combinational logic is defined as the unit power 
$P_{unit}^{comb}$. As a result 

$$P_{avg}^{comb} = P_{unit}^{comb} \cdot TR$$

where $P_{unit}^{comb}$ is a function of $C$, and it can be 
determined once we have the circuit structure. 
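As a small numerical illustration of scaling the unit power by the toggle rate, the sketch below assumes the standard switching-power form for the unit power, (1/2)·C·Vdd²/Tcyc. That expression, the function names, and the example values (10 fF, 1.0 V, 1 ns) are assumptions made for illustration.

```python
def unit_power(c_farads: float, vdd_volts: float, t_cyc_seconds: float) -> float:
    """Power dissipated when the output toggles every cycle (toggle rate = 1).

    Assumes the standard form 0.5 * C * Vdd^2 / Tcyc.
    """
    return 0.5 * c_farads * vdd_volts ** 2 / t_cyc_seconds


def average_power(c_farads: float, vdd_volts: float,
                  t_cyc_seconds: float, toggle_rate: float) -> float:
    """Average dynamic power: unit power scaled by the output toggle rate."""
    if toggle_rate < 0:
        raise ValueError("toggle_rate must be non-negative")
    return unit_power(c_farads, vdd_volts, t_cyc_seconds) * toggle_rate


# Assumed example: 10 fF load, 1.0 V supply, 1 ns clock period.
p_unit = unit_power(10e-15, 1.0, 1e-9)
p_avg = average_power(10e-15, 1.0, 1e-9, toggle_rate=0.2)
print(f"unit power = {p_unit * 1e6:.2f} uW, average power = {p_avg * 1e6:.2f} uW")
```

Because Vdd and Tcyc are fixed by the technology and design, the only quantity that has to be estimated from signal statistics is the toggle rate.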
 
Fig. 1. (a) Non-CG circuit. (b) BSC circuit. (c) 
PBSC circuit. 
 
Fig. 2. PG scheme. 
2) Sequential Logic Power Model: The sequential 
logic is normally a latch or a D FF. The power of 
sequential logic cannot be estimated with the 
above technique. We propose a new way to 
measure the power of sequential logic in this 
brief. For a D FF/latch, its operation in each 
clock cycle can be classified into four categories. 
1) OP_I: both clock and input data toggle; 
2) OP_II: only clock toggles; 
3) OP_III: only input data toggles; 
4) OP_IV: neither clock nor input data toggles.  
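The four categories can be tallied from a per-cycle activity trace, as in the sketch below. The trace format (one pair of booleans per clock cycle) and the function names are hypothetical; the brief does not say how the activity data is collected.

```python
from collections import Counter


def classify_cycle(clock_toggles: bool, data_toggles: bool) -> str:
    """Map one cycle's clock/data activity to OP_I .. OP_IV."""
    if clock_toggles and data_toggles:
        return "OP_I"
    if clock_toggles:
        return "OP_II"
    if data_toggles:
        return "OP_III"
    return "OP_IV"


def operation_histogram(trace):
    """Count how often each operation category occurs in a trace.

    trace: iterable of (clock_toggles, data_toggles) pairs, one per cycle.
    """
    return Counter(classify_cycle(clk, data) for clk, data in trace)


example_trace = [(True, True), (True, False), (True, False),
                 (False, True), (False, False)]
histogram = operation_histogram(example_trace)
print(dict(histogram))  # {'OP_I': 1, 'OP_II': 2, 'OP_III': 1, 'OP_IV': 1}
```

Cycles in category OP_II, where the clock toggles but the data does not, are the ones in which clock toggles are unnecessary and gating can remove them.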
IV. INTEGRATION OF BSCG AND PG 
Either a footer power switch is inserted between the 
actual ground and the virtual ground of the power 
gated cells, or a header switch is inserted between 
the power supply and the virtual power supply of the 
power gated cells, as shown in Fig. 4. The enable 
signal generated by BSCG is used as the sleep signal 
for the PG cells. PG cells are totally dependent on gated FF 
outputs. Holders are placed between the power 
gated cells and the non-power gated cells so that 
the non-power gated cells can function properly. 
The integration of BSCG and PG can be explained in 
detail with an example synchronous circuit in which 
four out of five FFs are clock gated after the BSCG 
technique is applied. The cells drawn with dashed 
lines depend completely on the stable outputs of the 
gated FFs, so they are inactive and can be power 
gated into sleep. However, one input of the XOR 
gate H is the output of the un-gated FF1, which may 
not be stable during the clock gated period, so H 
cannot be power gated. To avoid floating signals, 
holder logic is placed at the output of a power gated 
cell if that output connects to non-power gated cells 
or primary outputs. 
 
Fig. 4. Forward traversing example. 
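The forward traversal can be sketched as a fixed-point search over a netlist: a combinational cell is marked as power gated once every one of its inputs comes from a gated FF or from a cell already marked. The netlist below is a hypothetical example that only loosely echoes the circuit described in the text (five FFs, FF1 un-gated, XOR gate H fed by FF1). The data structure and function names are assumptions, and only combinational cells are modeled.

```python
def find_power_gated_cells(fanin, gated_ffs):
    """Return cells whose inputs all trace back to gated FF outputs.

    fanin: dict mapping each combinational cell to the list of its input sources.
    gated_ffs: set of clock-gated FF names.
    """
    gated = set()
    changed = True
    while changed:                      # repeat until no new cell can be added
        changed = False
        for cell, sources in fanin.items():
            if cell in gated or not sources:
                continue
            if all(s in gated_ffs or s in gated for s in sources):
                gated.add(cell)
                changed = True
    return gated


def find_holder_outputs(fanin, gated_cells, primary_outputs):
    """Power gated cells whose outputs feed non-gated cells or primary outputs."""
    holders = set()
    for cell, sources in fanin.items():
        if cell not in gated_cells:
            holders.update(s for s in sources if s in gated_cells)
    holders.update(po for po in primary_outputs if po in gated_cells)
    return holders


# Hypothetical netlist: FF2..FF5 are clock gated, FF1 is not.
netlist = {"A": ["FF2", "FF3"], "B": ["FF4", "FF5"],
           "C": ["A", "B"], "H": ["FF1", "C"]}
gated_cells = find_power_gated_cells(netlist, {"FF2", "FF3", "FF4", "FF5"})
holder_cells = find_holder_outputs(netlist, gated_cells, primary_outputs=["H"])
print(sorted(gated_cells), sorted(holder_cells))  # ['A', 'B', 'C'] ['C']
```

In this example H stays powered because one of its inputs comes from the un-gated FF1, and a holder is needed at the output of C because C drives H.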
V. RESULTS 
Waveforms of the integration of CG with RTPG: 
 
Fig. 5. Waveforms of the integration of CG with 
RTPG. 
The simulation results show that, when the clock is 
applied to the flip-flop, the input data is transferred 
to the circuit and the output is compared using the 
XOR gate. Data is transferred only when the clock 
is applied.  
CG with stacking technique: 
 
Fig. 6. Waveforms of CG with the stacking 
technique.  
The above simulation in the Tanner tool shows that 
the clock is applied only to the required flip-flops, 
that the data is compared using XOR gates, and that 
the output is as stated above. 
VI. CONCLUSION 
In this brief, we have proposed a fine-grained CG 
and RTPG integration based on signal activities. We 
started with an activity-driven fine-grained OBSC 
technique that selects only a subset of FFs to gate. 
The clock enable signal generated in the OBSC 
circuit can be used as the sleep signal in RTPG. The 
power gated cells can be determined by forward 
traversing from the gated FF outputs. With this 
technique, we reduced the active leakage by 56.19%. 
Using the power gating stacking technique, leakage 
power was reduced further, by 76.17%. 
AUTHOR’S PROFILE 
JARPULA SRIKANTH hails from Warangal 
(Dist.) and was born on 25th June 1990. He 
received his B.Tech in Electronics 
and Communication Engineering 
from BITS-Warangal (JNTU-HYD) 
and his M.Tech in Embedded Systems & VLSI 
Design from Mallareddy Institute Of Technology 
And Sciences, Hyderabad, Telangana, India. His 
research interests include Physical Design (RTL to 
GDSII), Digital VLSI Design, and Low Power 
Memory Design. He has attended one national 
conference and has worked on Physical Design 
(RTL to GDSII) using Tanner EDA tools. He is 
presently working as an Assistant Professor at 
Mallareddy Institute Of Technology, Hyderabad, 
and has 1 1/2 years of teaching experience in 
VLSI-related areas. 
 
 
