#### **Chapman University**

### **Chapman University Digital Commons**

**Engineering Faculty Articles and Research** 

Fowler School of Engineering

3-9-2023

# Low-Power Redundant-Transition-Free TSPC Dual-Edge-Triggering Flip-Flop Using Single-Transistor-Clocked Buffer

Zisong Wang

Peiyi Zhao

Tom Springer

Congyi Zhu

Jaccob Mau




#### Comments

This is a pre-copy-editing, author-produced PDF of an article accepted for publication in *IEEE Transactions* on *Very Large Scale Integration (VLSI) Systems*, volume 31, issue 5, in 2023. https://doi.org/10.1109/TVLSI.2023.3251286

#### Copyright

© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

#### **Authors**

Zisong Wang, Peiyi Zhao, Tom Springer, Congyi Zhu, Jaccob Mau, Andrew Wells, Yinshui Xia, and Lingli Wang

# Low Power Redundant-Transition-Free TSPC Dual-Edge-Triggering Flip-Flop using Single-Transistor-Clocked Buffer

Zisong Wang, *Graduate Student Member, IEEE*, Peiyi Zhao, Tom Springer, Congyi Zhu, Jaccob Mau, Andrew Wells, Yinshui Xia, *Member, IEEE*, and Lingli Wang, *Member, IEEE* 

Abstract—In the modern GPU/AI era, the flip-flop (FF) has become one of the most power-hungry blocks in processors. To address this issue, a novel single-phase-clock dual-edge-triggering (DET) FF using a single-transistor-clocked (STC) buffer is proposed. The STC buffer uses a single clocked transistor in the data sampling path, which completely removes the clock redundant transitions and internal redundant transitions that exist in other DET designs. In post-layout simulations in 22 nm FD-SOI CMOS at 10% switching activity, the proposed STC-DET consumes 14% and 9.5% less power than the prior state-of-the-art low-power DET at 0.4 V and 0.8 V, respectively. It also achieves the lowest power-delay product (PDP) among the DETs.

Index Terms—dynamic power, dual edge triggering, flip-flop

#### I. INTRODUCTION

POWER consumption has become one of the top concerns for CMOS digital designers, especially given the demands of modern GPU/AI processors. The computing power used in AI training has doubled every 3.4 months [1].

Inside a modern processor, the clocking system can consume more than 50% of the total power [2]. Hence, its power optimization has been viewed as one of the main keys for addressing the power dissipation issue mentioned above.

Flip-flops (FFs) and clock distribution networks are the two main building blocks of a processor's clocking system. Conventional single-phase-clock FFs use only one clock edge per clock period to process input data, which wastes power because the other clock edge goes unused. Dual-edge-triggering (DET) FFs process data on both clock edges, so the clock frequency can be halved to reduce power consumption while maintaining the same throughput.
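The throughput argument can be made concrete with a small sketch. It assumes the first-order model $P = C V_{DD}^2 f$ for a clock net that toggles every cycle; the 1 fF capacitance, the 2 GS/s data rate, and the function names are illustrative assumptions, not values from this paper.

```python
def required_clock_hz(samples_per_second: float, edges_used_per_cycle: int) -> float:
    """Clock frequency needed to capture the given number of samples per second."""
    if edges_used_per_cycle not in (1, 2):
        raise ValueError("a flip-flop samples on one or two clock edges per cycle")
    return samples_per_second / edges_used_per_cycle


def clock_dynamic_power_w(c_clk_farads: float, vdd_volts: float, f_clk_hz: float) -> float:
    """First-order dynamic power of a clock net: P = C * Vdd^2 * f."""
    return c_clk_farads * vdd_volts ** 2 * f_clk_hz


throughput = 2e9   # assumed data rate: 2 G samples per second
c_clk = 1e-15      # hypothetical clock-net capacitance (1 fF)
vdd = 0.8          # supply voltage in volts

f_set = required_clock_hz(throughput, 1)   # single-edge FF
f_det = required_clock_hz(throughput, 2)   # dual-edge FF
p_set = clock_dynamic_power_w(c_clk, vdd, f_set)
p_det = clock_dynamic_power_w(c_clk, vdd, f_det)

print(f"single-edge clock: {f_set / 1e9:.1f} GHz, dual-edge clock: {f_det / 1e9:.1f} GHz")
print(f"clock-net power ratio (DET / SET): {p_det / p_set:.2f}")
```

Under these assumptions the clock-net power halves with the frequency. The saving in a real design also depends on the clock load that each flip-flop topology presents, which this sketch holds constant.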

In this brief, a new DET FF topology using a true single-phase clock (TSPC) is proposed to further lower power consumption. The paper's main contributions are as follows:

Manuscript received XXX. This work was supported by grants to Chapman University from Dale E. and Sarah Ann Fowler, Microchip/Microsemi Inc, and Broadcom Inc.

- Z. Wang is with the University of California, Irvine, CA 92697, USA.
- P. Zhao, T. Springer, J. Mau, and A. Wells are with the Fowler School of Engineering, Chapman University, Orange, CA 92866, USA (Email: zhao@chapman.edu).
- C. Zhu is with Nanjing University, Nanjing, China.
- Y. Xia is with the School of Information Science and Engineering, Ningbo University, Ningbo, China.
- L. Wang is with Fudan University, Shanghai, China.



Fig. 1. Floating node C element DET, FN C-DET [3]

- The proposed DET FF is the first of its kind that completely eliminates both the clock and the internal redundant switching.
- Single-transistor-clocked (STC) buffers with zero redundant transition are proposed for our FF design.

The rest of the paper is organized as follows: Section II briefly reviews state-of-the-art DET FFs. Section III proposes our low-power TSPC DET using a single-transistor-clocked buffer. Section IV presents the simulation verification, and Section V concludes the paper.

#### II. SURVEY OF STATE-OF-THE-ART DET FFS

Various low-power latches, flip-flops, and methodologies have been proposed [4]–[11].

In DET FFs, one source of wasted power is redundant switching of the clocked transistors: when the input data remains unchanged, some transistors in the circuit still switch because of the circuit topology.

#### A. Redundant-Transition in Two-Phase-Clock DET

Conventional DET FFs, for example, the static floating-node C-element FN\_C DET [3] (Fig. 1), the fully static FS DET [12] (Fig. 2), TSPC DET [13], sense-amplifier DET [14], S-DET [15], CBS DEFF [16], DE epDSFF [4], DEr FF [17], and NCDDR [18], use two-phase clocking, in which two cascaded inverters generate the two clock phases (CKB and CKI on the left side of Fig. 3). However, when the input data does not change, the cascaded inverters keep switching, which dissipates clock redundant-transition power [13]. As an example,



Fig. 2. Fully-static TSPC DET, FS-TSPC DET [12]



Fig. 3. Summary of redundant transitions in DET.

this redundant-transition behavior in FN\_C-DET is illustrated at the bottom right of Fig. 1. Furthermore, in FN\_C-DET there is contention, a type of short circuit, between the outer C-elements (N1, N4, P1, P4) and the inner C-elements (N2, N3, N4, P2, P3, P4) at node B during transitions. A similar contention can also be observed at node A.

#### B. Redundant-Transition in Single-Phase-Clock DET

To reduce the aforementioned redundant clock transition power in two-phase-clock DETs, a few single-phase-clock DET FFs have been proposed that avoid the cascaded clocked inverters.

One such approach is the fully static TSPC DET (FS TSPC) [12], shown in Fig. 2, which is an elegant design. There is no *explicit* clocked inverter in FS TSPC, as only one clock phase is used. However, when the input does not change, there is an *implicit* internal redundant transition. As shown in Fig. 2, if D stays at 1, DP stays at 0, and the NOR structure in the middle of the figure then behaves as an inverter with one clocked PMOS and one clocked NMOS. The continual switching of these two clocked transistors constitutes the implicit redundant transition. Similarly, if DN stays at 1, the NAND structure behaves as an inverter that also switches constantly, causing the same implicit redundant transition. Another static true-single-phase-clock DET, TSPC DET [13], suffers from implicit redundant transitions through a similar mechanism.
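The implicit redundant transition can be illustrated with a gate-level abstraction. The sketch below assumes that the NOR structure combines DP with the clock, as the description of Fig. 2 suggests; it is not a transistor-level model of FS TSPC, and the helper names are hypothetical.

```python
def nor(a: int, b: int) -> int:
    """Two-input NOR gate."""
    return 0 if (a or b) else 1


def count_toggles(values: list) -> int:
    """Number of value changes between consecutive samples."""
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)


clk = [0, 1] * 8        # eight clock cycles, sampled once per clock phase
dp = [0] * len(clk)     # D is held at 1, so DP stays at 0 (assumed gate input)

# With DP = 0 the NOR output is simply the inverted clock.
nor_out = [nor(d, c) for d, c in zip(dp, clk)]

data_toggles = count_toggles(dp)            # the data never changes
internal_toggles = count_toggles(nor_out)   # the internal node follows the clock

print(f"data toggles: {data_toggles}, internal node toggles: {internal_toggles}")
```

The data input never toggles, yet the internal node switches on every clock edge. This is the switching that a single clocked transistor in the sampling path is meant to avoid.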

#### C. Summary of Redundant Transitions in DET

Fig. 3 categorizes redundant transitions (RTs) in DET FFs into two types: the explicit clock RT of two-phase-clock designs such as FN\_C DET (left of Fig. 3), and the implicit internal RT of single-phase-clock designs such as FS TSPC and TSPC DET (right of Fig. 3).



Fig. 4. Proposed TSPC single transistor clocked DET, STC-DET



Fig. 5. Operation of the proposed STC-DET: (a) Top FF, and (b) Bottom FF using equivalent simplified logic circuit diagram

As the data switching activity rate is close to 10% in CMOS static logic circuits [6], the data remains unchanged about 90% of the time, and any clock-driven internal switching during that time is redundant. These redundant transitions therefore consume considerable power in contemporary data processing, which motivates the following design.
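The 90% figure follows directly from the activity rate, and a short Monte Carlo sketch shows the same result. It assumes that the data toggles independently at each sampling edge with probability equal to the switching activity, which simplifies real data statistics; the function name and seed are illustrative.

```python
import random


def redundant_edge_fraction(num_edges: int, activity: float, seed: int = 0) -> float:
    """Fraction of sampling edges at which the data did not change.

    Assumes the data toggles independently at each edge with probability `activity`.
    """
    rng = random.Random(seed)
    redundant = sum(1 for _ in range(num_edges) if rng.random() >= activity)
    return redundant / num_edges


frac = redundant_edge_fraction(100_000, 0.10)
print(f"edges with unchanged data: {frac:.1%}")
```

At 10% activity roughly nine out of ten sampling edges see unchanged data, so any transistor that switches on every edge does so redundantly most of the time.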

#### III. PROPOSED LOW-POWER TSPC DET FF USING SINGLE-TRANSISTOR-CLOCKED BUFFER

To eliminate the aforementioned redundant transitions between two clocked transistors (one PMOS and one NMOS) in dual-edge flip-flops, a low-power TSPC DET FF is proposed with a novel redundant-transition-free single-transistor-clocked (STC) buffer topology.

As shown in Fig. 4, STC-DET adopts a master-slave latch topology. It consists of two flip-flops, one at the top and one at the bottom. The top FF samples data at the clock positive edge, and the bottom FF samples data at the clock negative edge.


The detailed operations of STC-DET flip-flop are as follows:

#### A. Operation of the top FF in STC-DET

When CLK = 0, in the data sampling path of the top FF, the clocked PMOS P3 in the top master latch turns on and node X becomes D'' (see the top left of Fig. 4). Since D'' is logically equal to D, transistors (P2, N2) act as a virtual inverter (a simplified logic diagram is shown in Fig. 5(a)), and the input passes to MID in the top master latch (see the arrow in the top left of the above figure). Meanwhile, in the top slave latch in Fig. 4, the clocked NMOS N4 is off because CLK = 0, so node Y is not pulled to 0 and PMOS P8 stays off. QT is then connected to neither VDD nor GND through the top FF, meaning that the QT output of the top FF is floating (see the top left of Fig. 5(a)<sup>1</sup>).

In Fig. 4, transistors (N1, N2, P1, P2, P3) form a negative-triggered single-transistor-clocked buffer (STCB), in which only one clocked transistor, P3, is used in the signal sampling path. The redundant transition that occurs between one clocked PMOS and one clocked NMOS in FN\_C DET and FS-TSPC (Fig. 3) does not exist in STC-DET, nor is there any contention. Besides P3, the top master latch has one more clocked transistor, the NMOS N3 (see the top left of Fig. 4), but it is used in the keeper rather than in the data sampling path. All four clocked transistors on the data sampling path (P3, N4, N5, and P6) are each marked with an arrow in Fig. 4.

Transistors (N4, N7, N8, P7, P8) form another, *positive*-triggered STCB in the top FF.

When CLK = 1, the clocked PMOS P3 in the top master latch in Fig. 4 is off, so the paths through P1, and subsequently N2, are off. As a result, the logic state of MID is held by the keeper (N3, N15, P14, P15). When the logic state of X is 0, it is held by the pull-down keeper (N14, N3). Meanwhile, in the top slave latch, the clocked NMOS N4 turns on, so Y becomes MID'', which is logically equal to MID. Transistors (N8, P8) therefore act as a virtual inverter, and the value that MID held right before the clock rising edge passes to QT (see the arrow in the top right of Fig. 5(a)<sup>1</sup>). Hence the top FF is activated at the clock positive edge.

#### B. Operation of the bottom FF in STC-DET

When CLK = 0, the clocked NMOS N5 in the bottom master latch turns off (see the bottom left of Fig. 4). Consequently, the paths through N9 and P10 are off, and the logic state of $MID_n$ is held by the keeper (N16, N17, P5, P17); if the logic state of $X_n$ is 1, it is held by the pull-up keeper (P16, P5). Meanwhile, in the bottom slave latch (bottom right of Fig. 4), the clocked PMOS P6 (at the top of the figure) turns on when CLK = 0, so $Y_n$ becomes $MID_n''$, which is logically equal to $MID_n$. P12 and N12 thus act as a virtual inverter, and the value that $MID_n$ held right before the clock falling edge passes to QT (see the arrow in the left half of Fig. 5(b)<sup>1</sup>). As a result, the bottom FF is activated on the

<sup>1</sup>Note that Fig. 5 does not show any keeper. The left half of Fig. 5(a) shows only N4, N8, and P8 to highlight the case in which the top slave latch is opaque, and the right half of Fig. 5(a) shows only the last signal stage (N2, P2, P3) to highlight that the top master latch is opaque. The same applies to Fig. 5(b).



Fig. 6. Layout of the proposed STC-DET

clock negative edge. No redundant transition occurs if D keeps the same value, since an unchanged D causes no switching.

Transistors (N5, N9, N10, P9, P10) and (P6, N11, N12, P11, P12) form the other two STCBs in the bottom FF.

When CLK = 1, the clocked NMOS N5 in the bottom master latch turns on (see the bottom left of Fig. 4), and $X_n$ becomes D'', which is logically equal to D. P10 and N10 thus act as a virtual inverter, and input D passes to $MID_n$ in the bottom master latch (see the arrow in the right half of Fig. 5(b)). Meanwhile, the clocked PMOS P6 in the bottom slave latch turns off, so the paths through P11, and subsequently N12, are also off. Note that QT is also driven by the top FF, which is active when CLK = 1 as discussed above, so QT is not floating. Again, no redundant transition occurs if D keeps the same value, since an unchanged D does not affect QT.

Since the top and bottom slave latches are activated by the positive and negative clock edges, respectively, STC-DET samples the input at both clock edges. Moreover, because the two slave latches are activated by different clock edges, one of them is always transparent while the other is opaque, so their outputs can be tied together at QT without contention. This always-present connection to supply or ground through the transparent latch also means that QT is never floating. If necessary, STC-DET can easily be made scannable to support design for test (DFT) by adding an enable signal and a scan input to the master latches on the left side. In addition, by adding keepers at the end of the top or the bottom FF, that FF can be turned into a single-edge flip-flop.
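The latch-level behavior described in this section can be summarized in a small behavioral model. This is only a logic-level sketch: it ignores the internal signal inversions, the keepers, and all timing, and the class and attribute names are hypothetical rather than taken from Fig. 4.

```python
class StcDetModel:
    """Logic-level sketch of a dual-edge FF built from two master-slave FFs.

    Top FF: master transparent when CLK = 0, slave transparent when CLK = 1.
    Bottom FF: master transparent when CLK = 1, slave transparent when CLK = 0.
    """

    def __init__(self) -> None:
        self.mid_top = 0   # state held by the top master latch
        self.mid_bot = 0   # state held by the bottom master latch
        self.qt = 0        # shared output node

    def step(self, clk: int, d: int) -> int:
        if clk == 0:
            self.mid_top = d         # top master tracks D
            self.qt = self.mid_bot   # bottom slave drives QT with the held value
        else:
            self.mid_bot = d         # bottom master tracks D
            self.qt = self.mid_top   # top slave drives QT with the held value
        return self.qt


def simulate(samples: list) -> list:
    """Apply a sequence of (clk, d) pairs and collect QT after each step."""
    ff = StcDetModel()
    return [ff.step(clk, d) for clk, d in samples]


# D rises while CLK is low and is captured at the rising edge;
# D falls while CLK is high and is captured at the falling edge.
trace = simulate([(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)])
print(trace)
```

In this model exactly one slave latch drives QT at any clock level, which mirrors the argument that the two slave outputs can share the node without contention.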

#### IV. SIMULATION VERIFICATION AND COMPARISON

In this section, the performance of the proposed STC-DET is compared with that of FN\_C-DET, FS TSPC, S-DET, TSPC DET, and TGFF (a widely used single-edge FF). Fig. 6 shows the layout of STC-DET in Cadence Virtuoso<sup>®</sup>. In this design, the standard transistor widths are 600 nm for PMOS and 300 nm for NMOS in a 22 nm FD-SOI CMOS technology.

The post-layout simulation setup is as follows. All the DETs are designed with the same PMOS and NMOS sizing described above. The clock runs at 1 GHz for all the FFs<sup>2</sup>. To obtain more realistic results, the FF inputs (clock and data) are driven by input buffers, and the output is loaded with four standard-sized inverters.

Fig.7 shows the power consumption<sup>3</sup> at supply voltages varying from 0.4 V to 0.8 V with 10% switching activity. As

<sup>2</sup>A 2 GHz exception is made for TGFF, as it is a single-edge FF.

<sup>3</sup>The total power consumption includes the power consumption of the flip-flops as well as the drivers for both clock and data.

TABLE I
COMPARISON WITH STATE-OF-THE-ART FF DESIGNS

| Design | Transistor/Clock count | Area (µm²) | CQ (ps) | Setup (ps) | Hold (ps) | Power @ 0.8 V, 5% (µW) | Power @ 0.8 V, 10% (µW) | Power @ 0.8 V, 15% (µW) | Power @ 0.8 V, 20% (µW) | Power @ 0.8 V, 50% (µW) | Power @ 0.4 V, 10% (µW) | PDP_cq @ 0.8 V, 10% (fJ) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TGFF [17] | 24/14 | 7.4 | 46 | 39.9 | 23 | 10.81 | 11.1 | 11.38 | 11.65 | 13.55 | 2.751 | 5.11 |
| TSPC_DET [13] | 38/14 | 11.3 | 34.49 | 29.3 | 23.5 | 7.08 | 8.122 | 7.439 | 7.564 | 9.044 | 1.784 | 2.80 |
| S_DET [15] | 26/16 | 7.9 | 44.04 | 57.1 | 17.9 | 6.987 | 7.206 | 7.525 | 7.775 | 9.764 | 1.73 | 3.17 |
| FS_TSPC [12] | 36/10 | 10.8 | 29.96 | 21.3 | 25.05 | 7.08 | 6.88 | 7.439 | 7.564 | 9.044 | 1.57 | 2.06 |
| FN_C DET [3] | 34/8 | 9.8 | 74.6 | 27.4 | 31 | 4.998 | 5.39 | 5.956 | 6.34 | 9.516 | 1.29 | 4.02 |
| Proposed STC | 42/8 |  |  |  |  | 4.028 | 4.88 | 5.612 | 6.368 |  |  |  |

Fig. 7. Power consumption comparison for different voltages at 1 GHz with 10% switching activity.



Fig. 8. Simulated timing diagram of proposed STC-DET.

can be seen, the STC-DET has the lowest power dissipation, with a power reduction of 14% at 0.4 V and 9.5% at 0.8 V compared with FN\_C-DET, which has the second-lowest power consumption among the DETs in this paper. This is due to three advantages of STC-DET: 1) the complete removal of redundant switching activity, 2) the absence of contention, and 3) its use of only 8 clocked transistors, the fewest among the state-of-the-art designs compared. The FN\_C DET suffers from clock redundant transitions and contention [12]. The FS TSPC has internal redundant transitions and 2 more clocked transistors than STC-DET. The remaining designs show inferior power dissipation because of the redundant transitions of their clock inverters (TGFF, S\_DET) or their large number of clocked transistors (TSPC\_DET, 14 clocked transistors).
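The quoted savings can be cross-checked against the Table I power figures at 0.8 V and 10% switching activity. The sketch below takes 4.88 as the STC-DET figure, 5.39 for FN\_C-DET, and 11.1 for the single-edge TGFF; the helper name is hypothetical, and the result is independent of the power unit.

```python
def percent_saving(reference_power: float, proposed_power: float) -> float:
    """Relative power saving of the proposed design, in percent."""
    return 100.0 * (reference_power - proposed_power) / reference_power


# Table I, 0.8 V, 10% switching activity (same unit for all entries).
stc_08v = 4.88
fn_c_08v = 5.39
tgff_08v = 11.1

saving_vs_fn_c = percent_saving(fn_c_08v, stc_08v)
saving_vs_tgff = percent_saving(tgff_08v, stc_08v)

print(f"saving vs FN_C-DET: {saving_vs_fn_c:.1f}%")
print(f"saving vs TGFF:     {saving_vs_tgff:.1f}%")
```

Both results agree with the savings stated in the text to within rounding.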

Fig.8 shows the simulated timing diagram of the STC-DET when D  $0\rightarrow 1$  is captured by the clock positive edge and D  $1\rightarrow 0$  is captured by the clock negative edge.

Table I summarizes the comparison of the state-of-the-art DETs. The table shows the power consumption at different



11.66

Fig. 9. Delay at different supply voltages at 1 GHz.



Fig. 10. Power consumption at different process corners (T=27 °C, Supply=0.8 V for TT, SF, and FS; for FF, 0.88 V at T=-40 °C; and for SS, 0.72 V at 120 °C).

switching activities from 5% to 50% at 0.8 V, as well as at 10% at 0.4 V. In modern processors, the average switching activity is in the range of $5\% \sim 15\%$ [6]. STC has the lowest power among all the DETs in this range. FN\_C-DET catches up when the activity rate is above 20%. This means that the lower the switching activity, the more power is saved by adopting the proposed STC-DET topology. This is because, at lower switching activity, the data stays unchanged for a larger percentage of cycles, which increases redundant-transition power in all other DETs. The proposed design shows the best power-delay product (PDP) thanks to its low power and contention-free nature. Note that the area of the STC is large because of the added transistors, but these transistors remove the redundant transitions and achieve a significant power saving. For example, it consumes 56% less power than the widely used TGFF.

Setup and hold times of an FF are important for meeting system timing constraints and performance metrics. In this design, the setup time is determined by the propagation delay from D to $MID_n$, whereas the hold time is determined by how quickly QT settles to its final value after the clock transition. The worst-case hold-time scenario occurs when D falls too close to the clock's rising edge, which causes MID' to fall; Y in the top slave latch is then pulled up, turning P8 off before QT is fully charged to VDD.

Fig. 9 shows the delay at different supply voltages. The proposed design shows moderate delay among the DETs above 0.6 V, while at 0.4 V its delay, like that of some other DETs, increases sharply because the circuits operate near the threshold voltage. In IoT applications, speed is not the critical concern; power consumption is the most important metric for devices working at near-threshold voltage. Table I shows that the proposed design exhibits the lowest power consumption at 0.4 V among all the DETs.

Fig. 10 shows the simulation results across the following process corners at 10% switching activity: FF, SF, FS, and SS. The proposed STC-DET consumes less power than all other DETs in every corner.

To verify the robustness of the proposed design at low voltage, 1000-point Monte Carlo simulations were performed at 0.4 V under 10% data switching activity with a 1 GHz clock. The results are summarized in Table II. The proposed FF has the lowest mean power (1.12 µW, versus 1.34 µW for FN\_C, the next lowest) with an acceptable standard deviation.
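One way to read the Monte Carlo spread is through the coefficient of variation (standard deviation divided by mean). This metric is not reported in the paper; the sketch below simply derives it from the power rows of Table II, and the dictionary and function names are illustrative.

```python
# (mean, standard deviation) of power in µW, copied from Table II.
table_ii_power = {
    "TGFF": (2.62, 0.0058),
    "TSPC DET": (1.81, 0.0024),
    "S_DET": (1.76, 0.008),
    "FS TSPC": (1.59, 0.026),
    "FN_C": (1.34, 0.0072),
    "Proposed STC": (1.12, 0.0084),
}


def coefficient_of_variation(mean: float, std: float) -> float:
    """Relative spread: standard deviation divided by mean."""
    return std / mean


cv_power = {
    name: coefficient_of_variation(mean, std)
    for name, (mean, std) in table_ii_power.items()
}
lowest_mean = min(table_ii_power, key=lambda name: table_ii_power[name][0])

for name, cv in cv_power.items():
    print(f"{name:>12}: mean = {table_ii_power[name][0]:.2f} uW, sigma/mu = {cv:.2%}")
print(f"lowest mean power: {lowest_mean}")
```

Every design in Table II has a relative power spread below 2%, so the ranking by mean power is not sensitive to the simulated variation.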

In summary, the proposed STC-DET shows the lowest power dissipation in the average switching activity range and hence serves as the best candidate for low power DET FF in the modern processor's context.

#### V. CONCLUSION

To completely eliminate redundant transitions in dual-edge-triggered flip-flops, a novel low-power redundant-transition-free dual-edge-triggered flip-flop, STC-DET, is proposed; it is named for its single-transistor-clocked (STC) buffers. Each of the two kinds of STC buffer in the topology (positive-triggered and negative-triggered) has only *one* clocked transistor in the data sampling path, which completely removes the clock redundant transitions and internal redundant transitions that exist between two clocked transistors in other DET designs. Furthermore, there is no contention in the proposed STC-DET. In terms of power consumption, STC-DET dissipates 14% and 9.5% less power than the prior state-of-the-art FN\_C-DET at 10% switching activity at 0.4 V and 0.8 V, respectively. STC-DET also consumes the least power among all DET designs in all process corners and across supply voltages from 0.4 V to 0.8 V for switching activities below 20%. Regarding PDP (CQ), the proposed design outperforms FN\_C-DET by 53.4% and 51.0% at 10% switching activity at 0.4 V and 0.8 V, respectively. In summary, the proposed STC-DET achieves the lowest power consumption and PDP in the average switching activity range among the state-of-the-art DET FFs.

TABLE II
COMPARISON OF MONTE-CARLO SIMULATION RESULTS FOR FF DESIGNS

| Delay(ps) /<br>Power(µW) | TGFF   | TSPC<br>DET | S_DET | FS<br>TSPC | FN_C   | Proposed<br>STC |
|--------------------------|--------|-------------|-------|------------|--------|-----------------|
| $\mu_{delay}$            | 1500   | 805         | 505   | 340        | 1120   | 915.0           |
| $\sigma_{delay}$         | 493    | 142         | 69.2  | 33.9       | 209    | 95.4            |
| $\mu_{power}$            | 2.62   | 1.81        | 1.76  | 1.59       | 1.34   | 1.12            |
| $\sigma_{power}$         | 0.0058 | 0.0024      | 0.008 | 0.026      | 0.0072 | 0.0084          |

#### VI. ACKNOWLEDGMENT

Dr. Zhao is grateful for the insightful discussions with Prof. Yoonmyung Lee and Mr. You Heng. The authors would also like to thank Dr. Michael Fahy of Chapman University for support with CAD tools, and the CPSC465 Integrated Circuit Design class at Chapman University for helpful discussions.

#### REFERENCES

- [1] "AI and compute," 2018. [Online]. Available: https://openai.com/blog/ai-and-compute
- [2] T. Singh *et al.*, "Zen: An energy-efficient high-performance x86 core," *IEEE J. Solid-State Circuits*, vol. 53, no. 1, pp. 102–114, 2018.
- [3] S. Lapshev and S. M. R. Hasan, "New low glitch and low power DET flip-flops using multiple C-elements," *IEEE Trans. Circuits Syst. I*, vol. 63, no. 10, pp. 1673–1681, 2016.
- [4] J. Tschanz *et al.*, "Comparative delay and energy of single edge-triggered and dual edge-triggered pulsed flip-flops for high-performance microprocessors," in *Proc. ISLPED*, 2001, pp. 147–152.
- [5] P. Zhao, T. Darwish, and M. Bayoumi, "High-performance and low-power conditional discharge flip-flop," *IEEE Trans. VLSI Syst.*, vol. 12, no. 5, pp. 477–484, 2004.
- [6] N. Kawai *et al.*, "A fully static topologically-compressed 21-transistor flip-flop with 75% power saving," *IEEE J. Solid-State Circuits*, vol. 49, no. 11, pp. 2526–2533, 2014.
- [7] Y. Cai *et al.*, "Ultra-low power 18-transistor fully static contention-free single-phase clocked flip-flop in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 54, no. 2, pp. 550–559, 2019.
- [8] M. Saint-Laurent, B. Mohammad, and P. Bassett, "A 65-nm pulsed latch with a single clocked transistor," in *Proc. ISLPED*, 2007, pp. 347–350.
- [9] A. Karimi, A. Rezai, and M. M. Hajhashemkhani, "A novel design for ultra-low power pulse-triggered D-flip-flop with optimized leakage power," *Integration*, vol. 60, pp. 160–166, 2018.
- [10] A. Karimi, A. Rezai, and M. M. Hajhashemkhani, "Ultra-low power pulse-triggered CNTFET-based flip-flop," *IEEE Trans. Nanotechnol.*, vol. 18, pp. 756–761, 2019.
- [11] P. Zhao *et al.*, "Low-power clocked-pseudo-NMOS flip-flop for level conversion in dual supply systems," *IEEE Trans. VLSI Syst.*, vol. 17, no. 9, pp. 1196–1202, 2009.
- [12] Y. Lee, G. Shin, and Y. Lee, "A fully static true-single-phase-clocked dual-edge-triggered flip-flop for near-threshold voltage operation in IoT applications," *IEEE Access*, vol. 8, pp. 40232–40245, 2020.
- [13] A. Bonetti, A. Teman, and A. Burg, "An overlap-contention free true-single-phase clock dual-edge-triggered flip-flop," in *Proc. IEEE Int. Symp. Circuits Syst.*, 2015, pp. 1850–1853.
- [14] M. W. Phyu *et al.*, "Power-efficient explicit-pulsed dual-edge triggered sense-amplifier flip-flops," *IEEE Trans. VLSI Syst.*, vol. 19, no. 1, pp. 1–9, 2011.
- [15] R. Hossain, L. Wronski, and A. Albicki, "Low power design using double edge triggered flip-flops," *IEEE Trans. VLSI Syst.*, vol. 2, no. 2, pp. 261–265, 1994.
- [16] P. Zhao *et al.*, "Low-power clock branch sharing double-edge triggered flip-flop," *IEEE Trans. VLSI Syst.*, vol. 15, no. 3, pp. 338–345, 2007.
- [17] A. Gago, R. Escano, and J. Hidalgo, "Reduced implementation of D-type DET flip-flops," *IEEE J. Solid-State Circuits*, vol. 28, no. 3, pp. 400–402, 1993.
- [18] S. V. Devarapalli, P. Zarkesh-Ha, and S. C. Suddarth, "A robust and low power dual data rate (DDR) flip-flop using C-elements," in *Proc. Int'l Symp. Qual. Electron. Design (ISQED)*, 2010, pp. 147–150.