A Modified Signal Feed-Through Pulsed Flip-Flop for Low Power Applications by Hassanzadeh, Alireza & Panahifar, Ehsan
INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2017, VOL. 63,  NO. 3, PP. 241-246 
Manuscript received December 31, 2016; revised June, 2017. DOI: 10.1515/eletel-2017-0032 
 
 
Abstract—In this paper a modified signal feed-through pulsed 
flip-flop has been presented for low power applications. Signal 
feed-through flip-flop uses a pass transistor to feed input data 
directly to the output. The feed-through transistor and feedback 
signals have been modified to reduce delay, static power and 
dynamic power. HSPICE simulation shows a 22% reduction in 
leakage power and an 8% reduction in dynamic power. Delay is 
reduced by 14% using TSMC 90 nm technology parameters. The 
proposed pulsed flip-flop has the lowest power-delay product 
(PDP) among the pulsed flip-flops discussed. 
 




FLIP-FLOPS are basic building blocks of memories and 
shift registers. Pipelining, which relies on flip-flops, is widely 
used in recent digital system design. It is estimated that more 
than 50% of the total circuit power is consumed in the clock 
and storage elements of an integrated circuit [1]. Therefore, 
the power dissipation of flip-flops is very important. 
Conventional master-slave flip-flops do not satisfy the timing 
requirements of high-performance systems. Many dynamic 
schemes such as the hybrid latch flip-flop (HLFF) [2], the 
semi-dynamic flip-flop (SDFF) [3] and the sense amplifier-based 
flip-flop (SAFF) [4] [5] have been reported to improve speed. 
The major disadvantage of these flip-flops is their power 
consumption, which increases because of their dynamic 
operation, so they are not suitable for low power applications. 
 Pulsed flip-flops are a good replacement for traditional 
flip-flops in high speed circuits [2] [6] [7] [8]. Besides the 
speed advantage, their circuit simplicity also helps lower the 
power consumption of the clock tree system. A P-FF circuit is 
simpler because it uses only one latch, as opposed to the two 
latches of the conventional master-slave configuration. With 
only one latch, power dissipation in the clock circuit is 
reduced too. A pulsed flip-flop (P-FF) includes a pulse 
generator and a latch to store data. If the pulse is 
sufficiently narrow, the latch acts as an edge-triggered 
flip-flop. P-FFs provide time borrowing across clock cycle 
boundaries and feature a zero or even negative setup time. The 
pulse generating circuit should be insensitive to process 
variations [7]. P-FFs are divided into implicit and explicit 
categories, based on the pulse generation method [9]. In an 
implicit P-FF, the pulse generating circuit is built into the 
latch circuit and no explicit pulse is generated. In explicit 
P-FFs, the pulse generation and latch circuits are separate. 
Implicit P-FFs consume less power than explicit P-FFs, but 
suffer from a slow discharge path. The situation becomes worse 
when low-power techniques such as conditional capture [10], 
conditional precharge [11], conditional discharge [12] or 
conditional data mapping [13] are used. Pulse generation logic 
transistors are often large to ensure that the generated pulses 
are sufficiently wide to trigger the data latch. Explicit P-FF 
designs face a similar pulse width issue. The problem is 
further complicated by large capacitive loads when one pulse 
generator is shared among several latches [14]. Some implicit 
P-FFs such as Implicit Data-Close to Output (ip-DCO) [9], 
Single-ended Conditional Capturing Energy Recovery 
(SCCER) [6] and the Conditional Pulse-Enhancement Scheme 
[14] have been reported in recent years. 

Authors are with ECE Dept., Shahid Beheshti University, Tehran, Iran (e-mail: 
ehsan.panahy@gmail.com, a_hassanzadeh@sbu.ac.ir). 

Explicit P-FFs are faster than implicit ones but consume more 
power, because of their separate pulse and latch circuits. If 
the pulse generating circuit is shared among many P-FFs, power 
dissipation and circuit complexity are effectively reduced [9] 
[10] [15]. In this paper a modified P-FF based on the signal 
feed-through structure [15] is presented that reduces delay 
and power dissipation.   
This paper is arranged as follows. Section II reviews the 
advantages and disadvantages of conventional explicit P-FFs 
and then introduces the modified P-FF based on the signal 
feed-through scheme. Section III presents simulation results 
and compares the performance of pulse-triggered flip-flops. 
Concluding remarks are given at the end. 
II.  MODIFIED P-FF BASED ON DIRECT SIGNAL FEED-THROUGH  
In this section traditional explicit P-FFs and signal feed-
through P-FFs are reviewed. The principle of operation of each 
flip-flop is explained as follows. 
 
A- Pulse Generation 
For a better understanding of pulse generation, some pulse 
generators for P-FFs are discussed in this section. 
Figure 1-(a) shows a simple pulse generator called a clock 
chopper [16]. This structure consists of a chain of inverters 
that produces an inverted, delayed version of the clock (CLK 
signal). The original signal and the delayed signal are applied 
to a NAND gate. The pulse width is adjusted by setting the 
inverter delay and the transistor sizes. Naffziger’s pulsed 
latch, which was used on the Itanium 2 processor, is shown in 
Figure 1-(b) [8]. It uses a weak inverter to produce a narrow 
pulse with a nominal width of about one-sixth of the clock 
cycle (125 ps for a 1.2 GHz clock). Figure 1-(c) shows another 
pulse generator used on an NEC RISC processor [17]. The 
generator has a built-in dynamic transmission-gate latch to 
prevent the enable signal from glitching during the pulse [18]. 
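
The clock chopper principle can be illustrated with a small logic-level sketch. This is not a circuit simulation: the 10 ps sampling step, the function names and the ideal (zero-delay) NAND gate are assumptions made only for illustration, while the 500 MHz clock and 120 ps pulse width are the values used later in the simulation section. The sketch shows that the width of the generated pulse equals the delay of the inverter chain.

```python
def square_clock(period, cycles):
    """Sampled 50% duty-cycle clock: low for the first half of each period."""
    half = period // 2
    return [1 if (t % period) >= half else 0 for t in range(period * cycles)]


def clock_chopper(clk, chain_delay):
    """NAND of the clock with an inverted copy delayed by chain_delay samples."""
    out = []
    for t, c in enumerate(clk):
        # Before t = chain_delay the clock is assumed to have been low.
        delayed = clk[t - chain_delay] if t >= chain_delay else 0
        inverted_delayed = 1 - delayed
        out.append(1 - (c & inverted_delayed))  # NAND output, low during the pulse
    return out


def low_pulse_widths(signal):
    """Lengths, in samples, of each run of consecutive 0s."""
    widths, run = [], 0
    for s in signal:
        if s == 0:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


STEP_PS = 10                                   # assumed sampling step
clk = square_clock(period=200, cycles=3)       # 2000 ps period = 500 MHz
pulse_n = clock_chopper(clk, chain_delay=12)   # 12 samples = 120 ps chain delay
widths_ps = [w * STEP_PS for w in low_pulse_widths(pulse_n)]
print(widths_ps)  # [120, 120, 120]
```

In this idealized model one active-low pulse is produced per rising clock edge, and changing `chain_delay` changes the pulse width directly. In a real circuit the width also depends on gate delays, loading and process variation.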
 
Fig. 1. Pulse generator circuits: (a) clock chopper [16], (b) Naffziger [8], (c) NEC RISC processor [17] 
 
The clock chopper pulse generator is used for all P-FF 
simulations in this paper because of its simple structure. Static 

Fig. 2. Conventional P-FF designs: (a) ep-DCO [9], (b) CDFF [12], (c) Static CDFF [19], (d) MHLFF [20] 
 
B- Traditional Explicit P-FF 
Figure 2-(a) shows an explicit P-FF named Explicit Data-Close 
to Output (ep-DCO) [9], which is used as a reference for later 
comparison. This P-FF uses NAND gates and inverters to generate 
the pulse and a True-Single-Phase-Clock (TSPC) latch. Inverters 
I3 and I4 latch the data, and inverters I1 and I2 hold the 
voltage level of the internal (middle) node. The clock pulse is 
generated using three inverters. Node X is discharged on every 
rising edge of the clock, even when the output is already “1”. 
This produces large switching power dissipation, which is a 
disadvantage of this design. Many methods have been used to 
eliminate this problem, such as the conditional discharge 
flip-flop (CDFF) [10]. Figure 2-(b) shows a CDFF circuit [12]. 
An NMOS transistor MN3, controlled by the Q_fdbk signal, 
guarantees that no discharge happens when the output is “1”. In 
addition, the charge keeper for node X consists of a pull-up 
PMOS transistor and a simplified inverter. Figure 2-(c) shows a 
Static Conditional Discharge Flip-Flop (SCDFF), which also uses 
conditional discharge [19]. Unlike the CDFF, it uses a static 
latch structure, and its D-to-Q delay is longer than that of 
the CDFF. Both designs suffer from a long discharge path 
through MN1-MN3, so a strong pull-down circuit is required to 
reduce the delay. The Modified Hybrid Latch Flip-Flop (MHLFF) 
in Figure 2-(d), a combined latch and flip-flop circuit, also 
uses a static latch [20]. In [20] the charge keeper at node X 
is omitted. A weak pull-up transistor MP1, controlled by the 
voltage at Q, holds the level of node X when Q is “0”. This 
circuit has two problems. First, since node X is not 
pre-discharged, a long delay is expected for the “0” to “1” 
transition. This delay is increased further because the clock 
level is reduced by a Vth drop. Second, 

Fig. 3. Pulse-triggered flip-flop design based on a signal feed-through 
scheme 
C- P-FF based on input signal feed-through 
Figure 3 shows a P-FF with the signal feed-through structure 
[15]. This circuit uses a static latch and the conditional 
discharge method. MP1, a weak pull-up PMOS transistor, holds 
the voltage level at node X. A pass transistor MNx, controlled 
by the clock pulse, drives the output node directly from the 
input data; this is called signal feed-through. Besides MNx, 
MP2 provides a second path from the input to the output Q. The 
pull-down network of the latch is omitted and MNx provides the 
discharge path. Therefore, MNx plays the main role in both the 
“0” to “1” and the “1” to “0” transitions. Compared to ep-DCO, 
CDFF and SCDFF, this P-FF has a shorter and more balanced 
delay. However, this circuit has higher leakage power. 
The signal feed-through circuit operates as follows. If a 
clock pulse arrives when the input has not changed, Q remains 
the same: the input data and Q_fdbk are complements, so the 
pull-down path from X is off. When the input changes from “0” 
to “1”, node X is discharged, and MP2 turns ON and charges Q to 
the Vdd supply. This is the worst case for the flip-flop, 
because the discharge path conducts only during the short 
pulse. In this case signal feed-through connects the input to 
node Q through the pass transistor MNx and reduces the delay. 
When a “1” to “0” input transition happens, MNx is turned ON by 
the clock pulse and Q is discharged through MNx and the input 
stage.  
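
The three cases described above can be summarized in a logic-level sketch. This is only a behavioral abstraction written for illustration, not the transistor-level circuit of [15]: the function name `sft_pff_step` is hypothetical, Q_fdbk is assumed to be the exact complement of Q, and analog effects such as the threshold drop across MNx and keeper contention are ignored.

```python
def sft_pff_step(q, d, pulse):
    """One evaluation of a logic-level signal feed-through P-FF model.

    Returns (next_q, x), where x is the logic level of internal node X.
    """
    q_fdbk = 1 - q          # assumed: feedback node is the complement of Q
    x = 1                   # the keeper holds node X high by default
    if pulse:
        if d == 1 and q_fdbk == 1:
            x = 0           # conditional discharge: only on a 0 -> 1 input change
        # Pass transistor MNx feeds D through to Q. When x == 0, MP2 also
        # pulls Q up, which agrees with d == 1.
        q = d
    return q, x


q = 0
trace = []
for d in [0, 1, 1, 0, 0]:
    q, x = sft_pff_step(q, d, pulse=1)
    trace.append((q, x))
print(trace)  # [(0, 1), (1, 0), (1, 1), (0, 1), (0, 1)]
```

In this trace node X is discharged only once, on the single “0” to “1” change of the input, which is the point of the conditional discharge method. Without a pulse the model simply holds its state.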
 
D- The proposed signal feed-through P-FF 
To improve the performance of the signal feed-through P-FF, 
its operation has been characterized. Figure 4 shows the 
percentage of leakage power dissipated by each transistor in 
the signal feed-through P-FF circuit.   

 
As can be seen from Figure 4, the PMOS transistors have the 
higher share of leakage power. Therefore, circuit leakage power 
can be reduced by reducing the leakage in MP1. 
The gate of MP1 is connected to ground, so MP1 both charges 
node X and keeps the charge there. Since MP1 is always ON, it 
consumes static power even in the idle state. MP1 is also a 
weak pull-up and causes a long delay in the circuit. 
To improve the performance of the circuit, MP1 is split into 
two transistors, each of which performs one of the two 
functions. 
The gate of transistor MP1-a is connected to Q_fdbk, and this 
transistor charges node X when Q is “1”. When the output is 
“0”, Q_fdbk is “1” and MP1-a is OFF. This reduces delay because 
a shorter path is created between the output and node X. 
A second transistor MP1-b, with its gate connected to the Data 
signal, is added to keep the charge at node X. When Data is 
“0”, node X is pulled up by this transistor. Therefore, the 
always-ON PMOS transistor is eliminated, together with its 
static power dissipation.  
Instead of two inverters at the output node to keep the output 
charge, a PMOS transistor MPk and an inverter are used, which 
reduces the output delay.  
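
The effect of splitting MP1 can be checked by enumerating the gate conditions given above. The sketch below is an illustrative truth table only, assuming that Q_fdbk is the logical complement of Q and that a PMOS transistor conducts when its gate is low; the function names are hypothetical.

```python
def pullup_on_original(data, q):
    """Original circuit: MP1 has its gate tied to ground, so it always conducts."""
    return True


def pullup_on_proposed(data, q):
    """Proposed circuit: node X is pulled up through MP1-a or MP1-b."""
    q_fdbk = 1 - q            # assumed complement of the output
    mp1a_on = (q_fdbk == 0)   # gate driven by Q_fdbk, conducts when Q is 1
    mp1b_on = (data == 0)     # gate driven by Data, conducts when Data is 0
    return mp1a_on or mp1b_on


off_states = [
    (data, q)
    for data in (0, 1)
    for q in (0, 1)
    if not pullup_on_proposed(data, q)
]
print(off_states)  # [(1, 0)]
```

Under these assumptions the pull-up on node X is disabled only when Data is “1” and Q is “0”, which is exactly the case in which node X has to be discharged. In every other state one of the two transistors keeps node X charged, so no always-ON device is needed.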
III. SIMULATION RESULTS       
The proposed circuit is evaluated using HSPICE with TSMC 90 nm 
CMOS technology parameters. For a fair comparison, all circuits 
shown in Figures 2 and 3 have also been simulated. The pulse 
generation circuit transistors are sized to produce a 120 ps 
pulse. A 20 fF load capacitor for the P-FF and a 3 fF capacitor 
at the clock buffer output are used [14]. The input data and 
clock pulse are buffered too. The clock frequency is 500 MHz 
and a 1 V power supply is used for simulation. Power is 
analyzed for six different data switching activities. 
A. Power Consumption Analysis 
Table I summarizes some of the specifications of the P-FFs. 
As shown in Table I, the proposed circuit has the lowest 
dynamic power consumption among the compared circuits. Table II 
shows the leakage power dissipation of all circuits. The 
proposed circuit has lower leakage power than the other 
circuits, because MP1-a and MP1-b are not always ON, so the 
leakage current of these transistors is reduced.  

Table I  
FF Design                            Proposed  Feed-through [15]  MHLFF   SCDFF   CDFF   ep-DCO
Number of transistors                24        24                 19      31      30     28
Setup time (ps)                      -83       -82                1.5     -112    -75    -73
Minimum D-to-Q (ps)                  92        107                116.26  138.02  121    118
Average power, 100% activity (µW)    24.38     26.09              26.21   35.16   29.82  30.47
Average power, 50% activity (µW)     20.02     21.78              23.61   26.33   23.87  24.49
Average power, 25% activity (µW)     15.3      16.56              18.16   18.4    17.93  20.79
Average power, 12.5% activity (µW)   13.31     14.43              15.94   15.28   15.36  19.99
Average power, 0% all-1 (µW)         11.16     12.19              13.67   12.36   12.37  20.28
Average power, 0% all-0 (µW)         12.03     11.16              12.52   11.46   11.97  12.073
Optimal PDP, 25% activity (fJ)       1.41      1.67               2.11    2.45    1.97   2.18

Table II  
FF Design              Proposed  Feed-through [15]  SCDFF  CDFF   ep-DCO
(CLK, Data) = (0, 0)   47.07     53.09              58.97  53.53  51.48
(CLK, Data) = (0, 1)   48.30     62.60              52.02  51.51  57.94
(CLK, Data) = (1, 0)   42.80     54.32              65.31  59.56  59.87
(CLK, Data) = (1, 1)   44.03     63.83              74.66  67.96  66.34

Fig. 6. PDP performance: (a) different data switching activities, (b) different process corners at 50% data switching activity. 
 
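
As a rough cross-check of the reported improvements, the relative reductions can be recomputed from the table values. The sketch below rests on several assumptions that are not stated in the paper: the first two value columns of Tables I and II are taken to be the proposed design and the signal feed-through P-FF [15], the dynamic power comparison uses the 50% activity row, and the leakage comparison uses the average over the four (CLK, Data) states.

```python
def reduction_pct(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


def mean(values):
    return sum(values) / len(values)


# Assumed column mapping: baseline = signal feed-through P-FF [15].
baseline = {
    "delay_ps": 107.0,                        # minimum D-to-Q, Table I
    "power_50_uW": 21.78,                     # average power at 50% activity
    "leakage": [53.09, 62.60, 54.32, 63.83],  # Table II, four (CLK, Data) states
}
proposed = {
    "delay_ps": 92.0,
    "power_50_uW": 20.02,
    "leakage": [47.07, 48.30, 42.80, 44.03],
}

delay_red = reduction_pct(baseline["delay_ps"], proposed["delay_ps"])
power_red = reduction_pct(baseline["power_50_uW"], proposed["power_50_uW"])
leak_red = reduction_pct(mean(baseline["leakage"]), mean(proposed["leakage"]))

print(round(delay_red), round(power_red), round(leak_red))  # 14 8 22
```

Under these assumptions the results round to 14% for delay, 8% for dynamic power and 22% for leakage power, which agrees with the figures quoted in the abstract.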
B. Timing Analysis 
As shown in Table I, the minimum D-to-Q delay of the proposed 
circuit is the lowest among the compared designs. The 
power-delay product (PDP) is also improved and is the lowest 
among the P-FFs. Figure 6-(a) shows the PDP for different data 
activities. To evaluate the effect of process variation on the 
PDP of the proposed circuit, simulations have been performed 
for the corner cases (SS = 0.8 V/125°C, TT = 1 V/25°C, 
FF = 1.2 V/-40°C, SF = 1 V/25°C, FS = 1 V/25°C) and the results 
are shown in Figure 6-(b). The setup time has been adjusted for 
the best PDP results. When NMOS transistors are faster than 
PMOS transistors, the proposed P-FF has the lowest delay among 
the compared designs, and when PMOS transistors are faster, it 
has the highest delay. The reason is that charging of node X 
depends on the “0” to “1” transition of the Q_fdbk node, which 
in turn depends on the discharge time of node X through the 
NMOS transistors. As a result, when the NMOS transistors are 
faster than the PMOS transistors, the cycle completes faster. 
The proposed circuit outperforms the other designs in every 
corner except SF. For TT or typical case, that PMOS transistors are 

 
V. CONCLUSION 
In this paper a modified signal feed-through P-FF has been 
proposed. The key modification is to replace the always-ON 
PMOS transistor with two transistors that separate the 
charging and charge-keeping functions. Because of this 
modification, the static and dynamic power of the P-FF are 
reduced by 22 and 8 percent, respectively. The PDP of the 
proposed flip-flop is the lowest among the compared designs. 
Since large numbers of flip-flops are used in recent digital 
ICs, the proposed modification can significantly improve the 
power dissipation of ICs. 
REFERENCES 
[1] H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) 
for 63% power reduction," IEEE J. Solid-State Circuits, vol. 33, pp. 807–
811, May 1998. 
[2] H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, 
"Flow-through latch and edge-triggered flip-flop hybrid elements," in 
IEEE Tech. Dig. ISSCC, pp. 138–139, 1996. 
[3] F. Klass, "Semi-dynamic and dynamic flip-flops with embedded logic," in 
Symp. on VLSI Circuits, Dig. of Tech. Papers, pp. 108–109, June 1998. 
[4] B. Nikolic et al., "Sense amplifier-based flip-flop," in Int. Solid-State 
Circuits Conf., Dig. of Tech. Papers, pp. 282–283, Feb. 1999. 
[5] M. Matsui et al., "A 200 MHz 13 mm² 2-D DCT macrocell using sense 
amplifying pipeline flip-flop scheme," IEEE J. Solid-State Circuits, vol. 
29, pp. 1482–1490, 1994. 
[6] A. G. M. Strollo, D. De Caro, E. Napoli, and N. Petra, "A novel high 
speed sense-amplifier-based flip-flop," IEEE Trans. Very Large Scale 
Integr. (VLSI) Syst., vol. 13, pp. 1266–1274, Nov. 2005. 
[7] F. Klass, C. Amir, A. Das, K. Aingaran, C. Truong, R. Wang, A. Mehta, 
R. Heald, and G. Yee, "A new family of semi-dynamic and dynamic flip 
flops with embedded logic for high-performance processors," IEEE J. 
Solid-State Circuits, vol. 34, pp. 712–716, May 1999. 
[8] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. 
Sullivan, and T. Grutkowski, "The implementation of the Itanium 2 
microprocessor," IEEE J. Solid-State Circuits, vol. 37, pp. 1448–1460, 
Nov. 2002. 
 
[9] J. Tschanz, S. Narendra, Z. Chen, S. Borkar, M. Sachdev, and V. De, 
"Comparative delay and energy of single edge-triggered and dual edge 
triggered pulsed flip-flops for high-performance microprocessors," in 
Proc. ISLPED, pp. 207–212, 2001. 
[10] B. Kong, S. Kim, and Y. Jun, "Conditional-capture flip-flop for 
statistical power reduction," IEEE J. Solid-State Circuits, vol. 36, 
pp. 1263–1271, Aug. 2001. 
[11] N. Nedovic, M. Aleksic, and V. G. Oklobdzija, "Conditional precharge 
techniques for power-efficient dual-edge clocking," in Proc. Int. 
Symp. Low-Power Electron. Design, Monterey, pp. 56–59, 2002. 
[12] P. Zhao, T. Darwish, and M. Bayoumi, "High-performance and low 
power conditional discharge flip-flop," IEEE Trans. Very Large Scale 
Integr. (VLSI) Syst., vol. 12, pp. 477–484, May 2004. 
[13] C. K. Teh, M. Hamada, T. Fujita, H. Hara, N. Ikumi, and Y. Oowaki, 
"Conditional data mapping flip-flops for low-power and high-
performance systems," IEEE Trans. Very Large Scale Integr. (VLSI) 
Syst., vol. 14, pp. 1379–1383, Dec. 2006. 
[14] Y.-T. Hwang, J.-F. Lin, and M.-H. Sheu, "Low power pulse triggered 
flip-flop design with conditional pulse enhancement scheme," IEEE 
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, pp. 361–366, Feb. 
2012. 
[15] J.-F. Lin, "Low-power pulse-triggered flip-flop design based on a 
signal feed-through," IEEE Trans. Very Large Scale Integr. (VLSI) 
Syst., vol. 22, pp. 181–185, 2014. 
[16] D. Harris, Skew-Tolerant Circuit Design. San Francisco, CA: Morgan 
Kaufmann, 2001. 
[17] S. Kozu et al., "A 100 MHz 0.4W RISC processor with 200 MHz 
multiply-adder, using pulse-register technique," in Proc. IEEE Int. 
Solid-State Circuits Conf., pp. 140–141, 1996. 
[18] N. H. E. Weste and D. M. Harris, CMOS VLSI Design: A Circuits 
and Systems Perspective, 4th ed. Pearson Education, 2011. 
[19] M.-W. Phyu, W.-L. Goh, and K.-S. Yeo, "A low-power static dual edge 
triggered flip-flop using an output-controlled discharge configuration," in 
Proc. IEEE Int. Symp. Circuits Syst., pp. 2429–2432, May 2005. 
[20] S. H. Rasouli, A. Khademzadeh, A. Afzali-Kusha, and M. Nourani, "Low 
power single- and double-edge-triggered flip-flops for high speed 
applications," IEE Proc. Circuits Devices Syst., vol. 152, pp. 118–122, 
Apr. 2005. 
 
 
