Multi-threshold voltage CMOS (MTCMOS) is an effective technique for suppressing the leakage currents in idle circuits. When the conventional MTCMOS technique is directly applied to a sequential circuit, however, the stored data is lost during the low-leakage sleep mode. Significant energy and timing penalties are then incurred to restore the pre-sleep system state at the end of the sleep mode. Two new master-slave MTCMOS memory flip-flops are presented in this paper for providing a low-complexity and low-leakage data retention sleep mode. A small high threshold voltage static memory cell is integrated into an MTCMOS flip-flop to preserve the stored data while drastically reducing the leakage power consumption of idle sequential circuits. The already existing sleep signal of the MTCMOS circuitry is also used for controlling the data retention and restoration operations, thereby eliminating the need for any extra control signals. The memory flip-flops provide a significantly simplified sleep control/data transfer mechanism and reduce the circuit area by up to 37.21% as compared to the previously published MTCMOS flip-flops. Furthermore, the leakage power consumption with the presented techniques is reduced by up to 97.71% as compared to the previously published techniques in a UMC 80 nm CMOS technology.
INTRODUCTION
Supply and threshold voltages are reduced with the scaling of CMOS technology. Lowering the threshold voltage exponentially increases the subthreshold leakage current produced by a transistor. More than 40% of the total active mode energy dissipation can be due to the leakage currents produced by idle transistors in modern high performance systems-on-chip. 1 2 Leakage currents are expected to dominate the total energy consumption as increasing numbers of transistors are crammed onto integrated circuits in each new technology generation. Furthermore, leakage is the only source of energy consumption in an idle circuit. Battery-dependent portable devices such as smart phones and laptop computers tend to have long standby modes. Reducing the leakage energy consumption during these long idle periods is crucial for a longer battery lifetime in portable applications.
* Author to whom correspondence should be addressed. Email: jiaohl@ust.hk
Multi-threshold voltage CMOS (MTCMOS) is a commonly used low leakage circuit technique. 1-10 13-19 22 The MTCMOS technique suppresses the leakage currents by disconnecting the idle low threshold voltage (low-V th) logic gates from the power supply and/or the ground line via cut-off high threshold voltage (high-V th) sleep transistors. The MTCMOS technique is, therefore, also known as "power gating." The power gating technique can be applied as either gated-ground or gated-V DD, as shown in Figure 1. The leakage current produced by an MTCMOS circuit is significantly reduced by switching off the high-V th sleep transistors in the standby mode, as illustrated in Figure 1. [1] [2] [3] MTCMOS circuits are effective for reducing the leakage power consumption in the sleep mode. If the MTCMOS technique is directly applied to a sequential circuit (a flip-flop or a latch), however, the state of the circuit is lost during the sleep mode.
The retrieval of the previously stored data for post-sleep system state restoration costs significant energy and timing overheads when the sequential MTCMOS circuits are reactivated. A low leakage sleep mode with data retention capability is, therefore, critical for achieving truly energy efficient sequential MTCMOS circuits.
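The exponential dependence of subthreshold leakage on threshold voltage noted above can be illustrated with the textbook first-order model, in which the subthreshold current is proportional to exp(-V th / (n v T)). The sketch below is an illustration only: the slope factor n = 1.5 and the thermal voltage of about 25.9 mV are assumed values, not parameters reported in this paper, and the model ignores drain-induced barrier lowering, gate leakage, and junction leakage.

```python
import math

def subthreshold_leakage_ratio(vth_low_mv, vth_high_mv, n=1.5, thermal_mv=25.9):
    """Ratio I_sub(low Vth) / I_sub(high Vth) in a first-order model.

    Assumes I_sub is proportional to exp(-Vth / (n * vT)). The slope factor n
    and the thermal voltage vT are illustrative assumptions, not values
    reported in the paper.
    """
    return math.exp((vth_high_mv - vth_low_mv) / (n * thermal_mv))

# Hypothetical example: two devices whose thresholds differ by 100 mV.
ratio = subthreshold_leakage_ratio(200.0, 300.0)
print(f"leakage ratio for a 100 mV threshold gap: {ratio:.1f}x")
```

Under these assumptions, every additional 100 mV of threshold voltage cuts the subthreshold current by roughly an order of magnitude, which is why a cut-off high-V th sleep transistor is effective at limiting the leakage of the low-V th logic it gates.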
The previously published MTCMOS flip-flop (FF) techniques with data retention capabilities can be divided into two groups depending on the implementation of the sleep transistors. The first group utilizes a localized sleep switch circuit structure with high-V th NMOS and PMOS sleep transistors inserted into both the master and the slave latches. [3] [4] Several high-V th devices serving as sleep switches are locally distributed into each individual flip-flop, thereby causing a large circuit area and a high active-to-sleep mode transition energy overhead. The second group of MTCMOS flip-flops utilizes different forms of high-V th data retention circuitry for preserving the data. 5 6 Centralized sleep switches are employed, thereby reducing the overall area overhead as compared to the first group of flip-flops. This second group of flip-flops published in Refs. [5] and [6], however, requires excessively complex control signals for storing and retrieving the circuit states when entering and leaving the sleep mode. New sequential MTCMOS circuits with smaller area, lower energy overhead, and simpler control circuitry are therefore highly desirable.
In this paper, two new MTCMOS memory flip-flops are presented for providing a low leakage data retention sleep mode with smaller area and significantly simplified control circuitry as compared to the previously published techniques. A small high-V th static memory cell is combined with the slave latch of the memory flip-flops. The already existing sleep signal of the MTCMOS circuit technique is also used for controlling the data retention and recovery operations. No extra control signals are required for implementing a data preserving sleep mode with the memory flip-flops. The MTCMOS memory flip-flops reduce the sleep mode leakage power consumption by up to 99.05% as compared to a standard single low-V th clock-gated flip-flop in a UMC 80 nm CMOS technology. Furthermore, the leakage power consumption of the memory flip-flops is reduced by up to 97.71% as compared to the previously published MTCMOS flip-flops.
The paper is organized as follows. Different sequential MTCMOS circuit techniques are described in Section 2. Post-layout simulation results are presented in Section 3 to characterize and compare the sequential MTCMOS circuit techniques. Finally, conclusions are offered in Section 4.
SEQUENTIAL MTCMOS CIRCUITS
MTCMOS circuit techniques aimed at lowering the leakage power consumption of sequential circuits are presented in this section. Previously published sequential MTCMOS circuits are discussed in Section 2.1. The MTCMOS memory flip-flops with simpler control circuitry, smaller area, enhanced data stability, and more energy efficient mode transition capability are described in Section 2.2.
Previously Published Sequential MTCMOS Circuits
The node voltages are lost when a standard MTCMOS circuit (as shown in Fig. 1 ) enters the sleep mode. Although the loss of circuit state can be acceptable (and inevitable) in some applications, preserving the active mode data is highly desirable in flip-flops and latches to be able to restore a system to a pre-sleep state at the end of an idle period. A system with state retention registers can quickly resume the operations after a low leakage sleep mode. This would permit more frequent and opportunistic transitions between the sleep and active modes of operation, thereby providing more effective (finer-time-grain) leakage reduction.
A conventional MTCMOS flip-flop (Mutoh-FF) that is capable of preserving data is shown in Figure 2. 3 4 All of the devices along the critical path of the Mutoh-FF have low-V th for maintaining a Clock-to-Q speed similar to that of a standard single low-V th FF. Several sleep transistors are distributed within the Mutoh-FF. 3 Both NMOS and PMOS high-V th devices are employed in the master and slave latches in order to eliminate the sneak leakage current paths in the sleep mode. 4 The Mutoh-FF therefore has a high circuit area overhead as compared to a [...]
An alternative MTCMOS flip-flop (Balloon-FF) for providing a high speed and low leakage data preserving sleep mode is presented in Refs. [5] and [6]. A high-V th data retention cell (Balloon) is attached to the slave latch of the Balloon-FF, as shown in Figure 3. All of the devices on [...] 6 The complex operations required by the Balloon-FF for data transfer in and out of the data retention balloon during the sleep/active mode transitions are illustrated in Figure 4. The Balloon-FF has a high energy overhead due to the complex data storage and recovery operations required for mode transitions.
Data Retention MTCMOS Memory Flip-Flops
Two high speed MTCMOS flip-flops 15 providing a low leakage data retention sleep mode are presented in this section. The first flip-flop is composed of gated-ground MTCMOS master and slave stages, with a low leakage high-V th data retention memory cell attached to the slave stage as shown in Figure 5 . The technique (MEMORY-FF) utilizes two high-V th pass transistors (N 1 and N 2 in Fig. 5 ) for accessing the data retention cell (DRC). The DRC is very similar to the standard six-transistor SRAM cell used in memory caches. One centralized NMOS sleep switch is shared by the low-V th gates in the master and slave stages of the MEMORY-FF. The circuit area overhead is thereby significantly reduced as compared to the Mutoh-FF. All of the devices along the critical path of the MEMORY-FF have low-V th . The Clock-to-Q speed of the MEMORY-FF is therefore similar to a standard single low-V th FF. The already existing sleep signal employed for ground gating is also used for controlling the data retention and restoration operations with the MEMORY-FF. No extra control signals are required for the operation of the MEMORY-FF, thereby significantly reducing the control complexity of implementing a low leakage data retention sleep mode as compared to the Balloon-FF.
The signal waveforms representing the operation of the MEMORY-FF are shown in Figure 5. In the active mode, the sleep signal is maintained high. The sleep transistors (N 1, N 2, and N 3) are turned on. The inverter (I fb) and the transmission gate (TG cell) inside the DRC form the active mode feedback path of the static slave latch. The circuit operates similarly to a standard positive edge triggered master-slave FF composed of static latches. The DRC maintains the states of Node 3 and Q whenever the clock is low (the slave stage is opaque) in the active mode. Whenever the clock transitions high, the feedback path within the DRC is cut off. The most recent data sampled by the master stage is thereby transferred to the slave stage and the DRC through the slave transmission gate (TG s) with the positive edges of the clock.
When the circuit is idle, the clock and the sleep signal transition low. The low-V th gates in the master and slave stages are disconnected from the real ground distribution network by cutting off N 3 . The access transistors N 1 and N 2 are also cut-off, thereby disconnecting the DRC from the FF during the sleep mode. TG cell within the DRC is turned on since the clock is gated low. The most recent data that was sampled by the DRC is thereby maintained throughout the sleep mode. Note that the high-V th crosscoupled inverters within the DRC are always active by uninterrupted connections to the power supply and ground distribution networks. These inverters are sized small and are composed of high-V th transistors since these devices are not on the critical delay path of the FF. The sleep mode power consumption of the MEMORY-FF is thereby significantly reduced while maintaining the pre-sleep circuit state.
At the end of the sleep mode, the sleep signal transitions high before the clock is enabled. Memory-Node 1 and Memory-Node 2 are connected to Node 3 and Q through N 1 and N 2 , respectively. Depending on the pre-sleep data stored in the DRC, either Node 3 (Memory-Node 1 = "0") or Q (Memory-Node 2 = "0") is discharged. After the data recovery is complete, the clock is enabled. The entire FF is thereby reactivated after the successful restoration of the pre-sleep state to the slave latch.
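The active, sleep, and wake-up sequence described above can be summarized with a small behavioral model. This is a cycle-level sketch under simplifying assumptions, not a circuit simulation: the class and method names are introduced here for illustration, node voltages are reduced to logic values, and the state of the power-gated low-V th nodes during sleep is modeled as unknown (`None`).

```python
class MemoryFFModel:
    """Cycle-level behavioral sketch of the MEMORY-FF (illustrative only)."""

    def __init__(self):
        self.master = 0   # value held by the master latch
        self.q = 0        # slave output Q
        self.drc = 0      # value held by the data retention cell (DRC)
        self.sleep = 1    # sleep signal: 1 = active mode, 0 = sleep mode
        self.clk = 0

    def set_clock(self, clk, d):
        """Apply a clock level with data input d while in the active mode."""
        if self.sleep == 0:
            return  # the clock is gated low during sleep; inputs are ignored
        if self.clk == 0 and clk == 1:
            # Positive edge: the slave and the DRC take the master's data.
            self.q = self.master
            self.drc = self.master
        if clk == 0:
            # The master latch is transparent while the clock is low.
            self.master = d
        self.clk = clk

    def enter_sleep(self):
        """Clock and sleep signal go low; the power-gated nodes lose state."""
        self.clk = 0
        self.sleep = 0
        self.master = None  # unknown: low-Vth gates are cut off from ground
        self.q = None       # unknown: the DRC is isolated by N1 and N2

    def wake_up(self):
        """Sleep signal goes high before the clock is enabled."""
        self.sleep = 1
        self.q = self.drc   # the DRC restores the slave through N1 and N2


ff = MemoryFFModel()
ff.set_clock(0, 1)  # master samples D = 1 while the clock is low
ff.set_clock(1, 1)  # rising edge: Q and the DRC capture 1
ff.enter_sleep()
ff.wake_up()
print(ff.q)         # the pre-sleep value is restored
```

Note that the model needs no store or restore command of its own: the DRC is updated on every positive clock edge, and the only mode control is the sleep signal, which mirrors the control simplification claimed for the MEMORY-FF.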
Provided that Memory-Node 1 stores a "1" during the sleep mode, the Node 3 voltage does not rise all the way up to V DD due to the V th drop across N 1 during the data recovery. I s is, therefore, weakened during the first clock cycle of the active mode following a wake-up event. The V th drop at Node 3, however, does not impose a malfunction risk since the parallel feed-forward inverter (I fr) within the DRC also drives the output load, thereby supporting the state of the FF. The temporary weakening of I s due to the V th drop can be eliminated by replacing N 1 with a transmission gate (TG pass), as shown in Figure 6. This alternative MTCMOS memory flip-flop (MEMORY-TG-FF) is also characterized in the following sections. The operation of the MEMORY-TG-FF is similar to that of the MEMORY-FF. An additional control signal SLEEP is, however, required for the operation of the MEMORY-TG-FF, as shown in Figure 6.
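The V th drop discussed above follows from the first-order behavior of an NMOS pass transistor, which stops conducting strongly once its gate-to-source voltage falls to the threshold voltage. As a rough estimate only, assuming the gate of N 1 is driven to V DD and using the zero-bias high-V th NMOS value (320 mV) and the supply voltage (1 V) quoted in the simulation setup of this paper, the recovered high level at Node 3 is bounded as follows. The body effect raises the threshold of N 1 above its zero-bias value, so the actual level is expected to be somewhat lower; subthreshold conduction is neglected.

```latex
V_{\mathrm{Node\,3}} \approx V_{DD} - V_{th,N_1} \le V_{DD} - V_{th0,\mathrm{NMOS}}^{\mathrm{high}} = 1\,\mathrm{V} - 0.32\,\mathrm{V} = 0.68\,\mathrm{V}
```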
SIMULATION RESULTS
The UMC 80 nm multi-threshold voltage CMOS technology 11 (high-V th0_NMOS = 320 mV, low-V th0_NMOS = 72 mV, high-V th0_PMOS = −273 mV, low-V th0_PMOS = −56 mV, and V DD = 1 V) is used in this paper for the characterization of leakage power consumption, active power consumption, clock power, data stability, and area overheads with the different sequential MTCMOS techniques. Flip-flops and 32-bit shift registers are designed based on the following techniques: the standard single low-V th FF, the conventional Mutoh-FF (Fig. 2), the Balloon-FF (Fig. 3), and the two data retention memory-cell FFs explored in this paper (MEMORY-FF: Fig. 5 and MEMORY-TG-FF: Fig. 6). All the data presented in this section are produced by post-layout simulation.
The design criterion used in this paper for the sizing of transistors is to achieve similar propagation delays (within 5%) with each flip-flop and shift register. Furthermore, the low-V th circuitry of each FF needs to be sized carefully to achieve similar data output rise and fall times as well as similar high-to-low and low-to-high propagation delays. Delay overheads of different MTCMOS techniques when the sleep transistors are bypassed (with the virtual power and ground lines connected directly to the real power and ground lines, respectively) are listed in Table I. As listed in Table I, the delay overhead (the sum of the setup time and the Clock-to-Q propagation delay) cannot be brought below the acceptable level (taken to be +5% as compared to the standard single low-V th FF) by only increasing the sizes of the sleep transistors in MTCMOS circuits. The low-V th circuitry of each MTCMOS FF needs to be sized larger (in addition to appropriate sleep transistor sizing) as compared to the standard single low-V th FF to meet the timing requirement. The sizes of different MTCMOS FFs to satisfy the timing criteria are shown in Figures 2, 3, 5, and 6. The sizes of sleep transistors used with different MTCMOS FFs and MTCMOS shift registers are listed in Table II. The mutually exclusive switching patterns 9 (the data in the adjacent flip-flops of the shift registers never switch in the same direction) are exploited to further reduce the sizes of the sleep transistors in the Balloon, MEMORY, and MEMORY-TG shift registers. In contrast, the sleep transistors of different FFs in the Mutoh shift register cannot be shared. Localized and distributed sleep transistors are required in order to eliminate the sneak leakage current paths in the Mutoh-FF. 4 Different tapered buffer chains are employed to provide similar signal rise and fall times to the sleep transistors and the clock distribution network with each technique.
Section 3 is organized as follows. The successful data storage and recovery operations with the sequential MTCMOS memory circuits are verified in Section 3.1. The total active power and clock power consumed by the FFs are presented in Section 3.2. The leakage power consumed by the FFs is compared in Section 3.3. The data stabilities of the MTCMOS FFs in the sleep mode are evaluated in Section 3.4. The areas of the FFs are compared in Section 3.5. A comprehensive design metric is proposed to compare the overall electrical quality of MTCMOS FFs in Section 3.6. The leakage power consumption, data stability, and propagation delays of different FFs under supply voltage and process parameter variations are characterized in Section 3.7. The leakage power consumption, total active power consumption, clock power consumption, and area of 32-bit shift registers designed with different MTCMOS flip-flops are discussed in Section 3.8.
Data Storage and Recovery with the MTCMOS Memory Flip-Flops
The data storage and recovery operations with the MTCMOS memory flip-flops are verified in this section. The waveforms representing the operations of storing a "0" and a "1" with the MEMORY-FF circuit in the active mode are shown in Figure 7. Note that there is no V th drop observed at the internal nodes of the DRC since the full voltage swing is provided by the cross-coupled inverter pair I fr and I fb. With the MEMORY-FF, the new data is not only transferred to Q but also stored in the DRC with each positive clock edge. Unlike the Balloon-FF, no additional data transfer operations are required for storing the data into the DRC before entering the sleep mode. When the MEMORY-FF is idle, the data that was last sampled by the DRC is maintained throughout the sleep mode. At the end of the sleep mode, the sleep signal transitions high before the clock is enabled. Memory-Node 1 and Memory-Node 2 are connected to Node 3 and Q, respectively, through the pass transistors. TG cell is active. The process of recovering the data from the DRC to the slave stage of the MEMORY-FF is similar to a read operation from an SRAM cell to the bitlines in a memory array. In an SRAM cell, the stored data is disturbed due to the voltage division between the cross-coupled inverters and the access transistors during a read operation. The pull-down devices are therefore sized wider than the access transistors in order to avoid the disturbance of data during a read access. 12 27-32 Similar to a conventional SRAM cell, the pull-down NMOS transistors of the cross-coupled inverters in the DRC are also sized wider than the pass transistors for robust data recovery at the end of the sleep mode.
The waveforms representing the recovery of a "0" and a "1" at the end of the sleep mode with the MEMORY-FF are shown in Figures 8 and 9, respectively. Provided that Memory-Node 1 and Memory-Node 2 store "1" and "0", respectively, Node 3 cannot be fully restored to V DD during the sleep-to-active mode transition due to the V th drop across the access transistor N 1, as shown in Figure 8. The V th drop at Node 3 of the MEMORY-FF, however, does not pose a significant problem since the parallel feedforward inverter I fr within the DRC also supports the state of Q and drives the output load. The MEMORY-TG-FF [...] Table III. The MEMORY-TG-FF increases the active power by up to 43.49%, 13.58%, 6.17%, and 1.67% as compared to the standard single low-V th FF, Balloon-FF, Mutoh-FF, and MEMORY-FF, respectively. The Mutoh-FF has an additional forward path (Inv 1 in Fig. 2) in the master latch, thereby increasing the parasitic capacitances at Node 1 and Node 2 in Figure 2. Furthermore, the sizes of the transistors on the critical path of the Mutoh-FF are larger as compared with the Balloon-FF. The Mutoh-FF therefore consumes higher active power as compared to the Balloon-FF.
The MEMORY-TG-FF consumes the lowest clock power among the MTCMOS FFs evaluated in this paper. As listed in Table III, [...] up [...] 45.63%, and 40.36% as compared with the MEMORY-TG-FF, MEMORY-FF, and Balloon-FF, respectively.
Leakage Power Consumption of Different Flip-Flops
The leakage power consumption of different flip-flops is compared in this section. The majority of MTCMOS circuits evaluated in this paper utilize the gated-ground technique. The virtual ground and the internal nodes of a gated-ground MTCMOS circuit have high steady-state voltages (∼V DD) in the low leakage data retention sleep mode. A high data input (D = V DD) is therefore assumed for the leakage power measurements. The sleep mode leakage power consumptions of individual MTCMOS flip-flops are measured by post-layout simulation for two different scenarios as listed in Table IV. A "1" and a "0" are assumed to be retained by the flip-flops with the first (pre-sleep-Q: 1) and the second (pre-sleep-Q: 0) scenarios, respectively. The percent leakage power reduction provided by different MTCMOS circuit techniques in the sleep mode as compared to the Mutoh-FF is shown in Figure 12.
As listed in Table IV, the MEMORY-FF consumes the lowest leakage power primarily by employing the smallest centralized sleep transistor among the FFs evaluated in this paper. The leakage power consumption is reduced by up to 99.05% as compared to the standard single low-V th FF. Furthermore, the MEMORY-FF reduces the leakage power consumption by up to 97.71% and 52.22% as compared with the Mutoh-FF and Balloon-FF, respectively. In contrast, the Mutoh-FF consumes the highest leakage power among the MTCMOS FFs evaluated in this paper due to the distributed bulky sleep transistors and a sneak leakage current path, as illustrated in Figure 13. As listed in Table IV, the Mutoh-FF increases the leakage power consumption by up to 43.58×, 28.62×, and 27.55× as compared to the MEMORY-FF, Balloon-FF, and MEMORY-TG-FF, respectively. However, the Mutoh-FF still manages to reduce the leakage power consumption by 43.59% to 95.96% and 42.32% to 96.98% at 25 °C and 110 °C, respectively, as compared to the standard single low-V th FF, depending on the data retained.
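The percentage reductions and the "×" increase factors quoted above are two views of the same power ratio: if one circuit consumes a fraction (1 − r) of another circuit's power, the second consumes 1 / (1 − r) times the power of the first. The helper functions below are a minimal sketch of that conversion; their names are introduced here for illustration. Small mismatches against the quoted figures are expected from rounding and because each "up to" value may come from a different temperature or stored-data scenario.

```python
def reduction_to_factor(reduction_percent):
    """Convert 'A is lower than B by r%' into 'B is k times A'."""
    return 1.0 / (1.0 - reduction_percent / 100.0)

def factor_to_reduction(factor):
    """Convert 'B is k times A' into 'A is lower than B by r%'."""
    return (1.0 - 1.0 / factor) * 100.0

# Cross-check of the MEMORY-FF versus Mutoh-FF leakage figures quoted above.
print(round(reduction_to_factor(97.71), 2))
print(round(factor_to_reduction(43.58), 2))
```

A 97.71% reduction corresponds to a factor of about 43.7×, which agrees with the quoted 43.58× to within rounding of the percentage.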
When retaining a "0" in the sleep mode, the leakage power consumption of the Mutoh-FF significantly increases by 10.25× and 13.85× as compared to the condition in which a "1" is stored in the sleep mode at 25 °C and 110 °C, respectively. This significant increase in the leakage power consumption with the variation of the stored data is mainly due to a sneak leakage current path, as illustrated in Figure 13. The clock is gated high in the sleep mode with the Mutoh-FF. The low-V th transmission gate TG m is cut off while the high-V th transmission gate TG m-fb is turned on for data retention during the sleep mode. A significant amount of leakage current flows from the high data input terminal, through the cut-off low-V th TG m, the active TG m-fb, and the active NMOS transistor N fb to the ground, as shown in Figure 13. In order to eliminate this sneak leakage current path, an extra inverter with high-V th NMOS and PMOS sleep transistors can be inserted before the data input, as shown with solution 1 in Figure 13. 4 This extra inverter added to the Mutoh-FF significantly degrades the Clock-to-Q speed and further increases the circuit area and the mode transition energy overhead. An alternative solution is to employ a high-V th transmission gate to replace the low-V th TG m, as illustrated with solution 2 in Figure 13. 6 However, the Clock-to-Q speed is significantly degraded with this solution too since a high-V th device is placed along the critical path.
Hold Static Noise Margin of MTCMOS Flip-Flops
The hold static noise margin (SNM) is the metric used to characterize the data stability of flip-flops in the low-leakage data retention sleep mode. 20 The hold static noise margins of different MTCMOS FFs measured by post-layout simulation are listed in Table V [...] 22 The MEMORY-FF and MEMORY-TG-FF thereby slightly enhance the hold SNM (by up to 4.21%) as compared with the Mutoh-FF and Balloon-FF.
Area Comparison of Flip-Flops
[...] Figure 16. The MEMORY-FF has the lowest area overhead due to the smallest centralized sleep transistor and the simplest control circuitry among the MTCMOS flip-flops. The MEMORY-FF reduces the area by 37.21%, 19.09%, and 7.44% as compared to the Mutoh-FF, Balloon-FF, and MEMORY-TG-FF, respectively. In contrast, the Mutoh-FF has the highest area overhead due to the distributed bulky sleep transistors. The Mutoh-FF increases the area by 144.64%, 59.26%, 47.40%, and 28.85% as compared to the standard single low-V th FF, MEMORY-FF, MEMORY-TG-FF, and Balloon-FF, respectively.
Jiao and Kursun

Low-Leakage and Compact Registers with Easy-Sleep Mode
Quality Metric
As discussed in the previous sections, different MTCMOS flip-flops rank differently for various design metrics. A comprehensive design metric is used in this section to evaluate the overall electrical quality of different MTCMOS flip-flops. The total active power consumption, clock power consumption, leakage power consumption, hold static noise margin, and area of different MTCMOS flip-flops are assumed to have equal importance in the evaluation of overall electrical quality. The Quality Metric is
\[ \text{Quality Metric} = \frac{\text{Hold\_Static\_Noise\_Margin}}{\text{Leakage\_Power} \times \text{Total\_Active\_Power} \times \text{Clock\_Power} \times \text{Area}} \]
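The Quality Metric defined above rewards a high hold SNM and penalizes each of the four cost terms equally, so halving any single cost doubles the metric. The sketch below computes the metric and a quality ratio relative to a reference design. The function names are introduced here for illustration, and the numbers are placeholders chosen to make the arithmetic visible; they are not taken from the tables of this paper.

```python
def quality_metric(hold_snm, leakage_power, total_active_power, clock_power, area):
    """Hold SNM divided by the product of the four cost terms (higher is better)."""
    costs = (leakage_power, total_active_power, clock_power, area)
    if hold_snm <= 0 or any(c <= 0 for c in costs):
        raise ValueError("all inputs must be positive")
    return hold_snm / (leakage_power * total_active_power * clock_power * area)


def normalized_quality(design, reference):
    """Quality of `design` relative to `reference`; both are dicts of the five terms."""
    return quality_metric(**design) / quality_metric(**reference)


# Placeholder numbers for illustration only; NOT taken from the paper's tables.
reference_ff = dict(hold_snm=1.0, leakage_power=1.0, total_active_power=1.0,
                    clock_power=1.0, area=1.0)
candidate_ff = dict(hold_snm=1.0, leakage_power=0.5, total_active_power=1.0,
                    clock_power=1.0, area=0.8)
print(normalized_quality(candidate_ff, reference_ff))
```

Because the metric is a pure product, the units of the individual terms cancel when two designs are compared as a ratio, which is why normalizing to a reference design is a convenient way to report it.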
The Mutoh-FF displays the lowest overall electrical quality primarily due to the significantly higher leakage power 
Influence of Power Supply and Process Variations
The fluctuations of supply voltage and process parameters alter the electrical characteristics of CMOS circuits. The variations in the leakage power consumption, data stability, and propagation delays of sequential circuits due to supply voltage fluctuations and process variations in gate length (L gate), gate oxide thickness (t ox), and threshold voltage (V th0) are evaluated in this section. The variation of supply voltage is assumed to be ±10% of the nominal value. 33 34 L gate, t ox, and V th0 are assumed to have Gaussian statistical distributions. L gate and t ox are assumed to have three sigma (3σ) variations of 12% and 4%, respectively. [23] [24] [25] [26] The 3σ variation of V th0 is assumed to be 3% of the supply voltage. 23 A total of 1000 Monte Carlo simulations are run to evaluate the leakage power, hold SNM, and propagation delay distributions (with respect to L gate, t ox, and V th0) of different FFs. The ranges of the hold SNM variations of the Mutoh-FF, Balloon-FF, MEMORY-FF, and MEMORY-TG-FF are −7.31% to 6.55%, −7.27% to 6.51%, −7.59% to 6.79%, and −7.59% to 6.79%, respectively. Furthermore, the ranges of the propagation delay variations of the standard single low-V th FF, Mutoh-FF, Balloon-FF, MEMORY-FF, and MEMORY-TG-FF are −8.17% to 11.42%, −8.98% to 13.21%, −8.82% to 12.50%, −8.09% to 12.12%, and −7.99% to 11.96%, respectively.
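The sampling step of the Monte Carlo analysis described above can be sketched as follows. The sketch only draws the parameter sets; the circuit simulation that turns each set into leakage, SNM, and delay values is outside its scope. It assumes that the three parameters are sampled independently and that one set is drawn per run, neither of which is stated in the text, and the function and field names are introduced here for illustration.

```python
import random
import statistics

def sample_parameters(n_runs=1000, vdd=1.0, seed=0):
    """Draw Gaussian process-parameter samples for a Monte Carlo analysis.

    Standard deviations follow the 3-sigma figures in the text:
    12% of nominal for L_gate, 4% of nominal for t_ox, and 3% of V_DD
    for V_th0. Independence between parameters is an assumption.
    """
    rng = random.Random(seed)
    sigma_lgate = 0.12 / 3.0        # relative to the nominal gate length
    sigma_tox = 0.04 / 3.0          # relative to the nominal oxide thickness
    sigma_vth0 = 0.03 * vdd / 3.0   # absolute threshold shift in volts
    samples = []
    for _ in range(n_runs):
        samples.append({
            "lgate_scale": rng.gauss(1.0, sigma_lgate),
            "tox_scale": rng.gauss(1.0, sigma_tox),
            "vth0_shift_v": rng.gauss(0.0, sigma_vth0),
        })
    return samples

runs = sample_parameters()
lgate = [s["lgate_scale"] for s in runs]
print(len(runs), statistics.mean(lgate), statistics.stdev(lgate))
```

With V DD = 1 V, the 3σ threshold voltage variation of 3% of the supply corresponds to a standard deviation of 10 mV in this sketch.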
The statistical data for the leakage power consumption of flip-flops are listed in Table IX and illustrated in Figures 17 and 18 . The MEMORY-FF achieves the lowest average leakage power consumption among the different FFs evaluated in this paper. The MEMORY-FF and MEMORY-TG-FF reduce the average leakage power consumption by up to 99.03% and 98.51%, respectively, as compared with the standard single low-V th FF. Alternatively, the Mutoh-FF has the highest average leakage power consumption among the MTCMOS flip-flops as listed in Table IX . The average leakage power consumed by the Mutoh-FF is up to 42.91×, 28.21×, and 27.17× higher as compared to the MEMORY-FF, Balloon-FF, and MEMORY-TG-FF, respectively. When retaining a "0" in the sleep mode, the average leakage power of the Mutoh-FF significantly increases by 13.66× as compared to the condition in which a "1" is stored in the sleep mode. This significant increase in the leakage power consumption with the variation of the stored data is primarily due to the sneak leakage current path inside the Mutoh-FF as described in Section 3.3.
The statistical data for the hold SNMs of different flip-flops are listed in Table X and illustrated in Figure 19. Due to the increased V th of the data storage elements (I fr and I fb in Figs. 5 and 6), the average hold SNMs of the MEMORY-FF and MEMORY-TG-FF are slightly enhanced as compared to the Mutoh-FF and Balloon-FF as listed in Table X. The statistical data for the propagation delays of different flip-flops are listed in Table XI and illustrated in Figure 20. The average propagation delays with different MTCMOS flip-flops are similar to each other as listed in Table XI due to the careful sizing of the sleep transistors and the low-V th segments of the flip-flops. The Balloon-FF experiences the lowest standard deviation in propagation delay among the MTCMOS flip-flops evaluated in this paper. The standard deviation of the propagation delay of the Balloon-FF is 9.52%, 7.52%, and 4.13% smaller as compared to the MEMORY-TG-FF, MEMORY-FF, and Mutoh-FF, respectively.
Sleep Transistor Sharing in Circuits with Mutually Exclusive Output Switching Patterns: Shift Register Case Study
The opportunities for area and power reduction in MTCMOS circuits by sharing the sleep transistors among different logic blocks are explored in this section. In a sequential circuit, the sleep transistors can be shared among multiple flip-flops provided that the outputs of individual flip-flops do not simultaneously switch in the same direction. For example, the data in the adjacent flip-flops of a shift register never switch in the same direction. The mutually exclusive switching patterns in a shift register can be exploited [...] Table XII while the total active power consumption, clock power consumption, and area of the 32-bit shift registers are listed in Table XIII. The active power and area of different shift registers are compared in Figure 21. The MEMORY shift register achieves the lowest leakage power consumption. The MEMORY shift register reduces the leakage power consumption by up to 97.38%, 67.72%, 28.57%, and 24.69% as compared to the standard single low-V th, Mutoh, Balloon, and MEMORY-TG shift registers, respectively. In contrast, the Mutoh shift register consumes the highest leakage power among the MTCMOS shift registers primarily due to the distributed bulky sleep transistors and the huge buffer chains driving the sleep transistors. The Mutoh shift register increases the leakage power by up to 3.10×, 2.59×, and 2.38× as compared to the MEMORY, MEMORY-TG, and Balloon shift registers, respectively. All of the MTCMOS shift registers evaluated in this paper increase the active power consumption due to the larger transistors as compared to the standard single low-V th shift register. The Mutoh shift register consumes the highest active power among the shift registers evaluated in this paper primarily due to the significant power consumed by the clock buffer chain. 
The Mutoh shift register increases the active power consumption by up to 85.45%, 20.95%, 13.75%, and 10.47% as compared with the standard single low-V th , Balloon, MEMORY, and MEMORY-TG shift registers, respectively.
Similarly, all the MTCMOS shift registers evaluated in this paper increase the clock power due to the larger transistors as compared to the standard single low-V th shift register. The MEMORY shift register consumes the lowest clock power among the MTCMOS shift registers evaluated in this paper. The MEMORY shift register reduces the clock power by 24.53%, 4%, and 0.83% as compared to the Mutoh, MEMORY-TG, and Balloon shift registers, respectively. Since the clock distribution network of the Mutoh shift register is the largest, the clock buffer chain of the Mutoh shift register has to be upsized to achieve similar clock rise and fall times as compared to the other shift registers evaluated in this paper, thereby increasing the clock power consumption significantly. The Mutoh shift register increases the clock power by 32.50%, 31.41%, and 27.20% as compared to the MEMORY, Balloon, and MEMORY-TG shift registers, respectively, thereby consuming the highest clock power among the MTCMOS shift registers.
All of the MTCMOS shift registers evaluated in this paper increase the area due to the enlarged low-V th circuit block (to achieve similar propagation speed) and the additional sleep transistors as compared with the standard single low-V th shift register. The Mutoh shift register has the highest area overhead among the MTCMOS shift registers evaluated in this paper due to the distributed bulky header and footer sleep transistors. The Mutoh shift register increases the area by 3.42×, 1.87×, 1.68×, and 1.48× as compared to the standard single low-V th, MEMORY, MEMORY-TG, and Balloon shift registers, respectively. In contrast, the MEMORY shift register achieves the lowest area overhead among the different MTCMOS shift registers. The MEMORY shift register reduces the area overhead by 46.43%, 20.97%, and 9.84% as compared to the Mutoh, Balloon, and MEMORY-TG shift registers, respectively.
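The mutually exclusive switching property that justifies sleep transistor sharing in this section can be checked exhaustively on a logic-level model. The sketch below models an ideal serial-in shift register, in which each stage takes the previous value of its predecessor on a clock edge, and searches every state and serial input for an adjacent pair of outputs that rise together or fall together. The function names and the 6-bit width are choices made here for illustration.

```python
from itertools import product

def shift(state, serial_in):
    """One clock edge: each stage takes the previous value of its predecessor."""
    return (serial_in,) + state[:-1]

def adjacent_same_direction_switch(n_bits=6):
    """Return True if any adjacent pair of stages ever rises or falls together."""
    for state in product((0, 1), repeat=n_bits):
        for serial_in in (0, 1):
            nxt = shift(state, serial_in)
            for i in range(n_bits - 1):
                d_i = nxt[i] - state[i]               # +1 rise, -1 fall, 0 hold
                d_next = nxt[i + 1] - state[i + 1]
                if d_i != 0 and d_i == d_next:
                    return True
    return False

print(adjacent_same_direction_switch())  # prints False
```

The result follows from the structure of the register: if stage i rises, its previous value was 0, so stage i+1 receives a 0 and cannot rise on the same edge, and the symmetric argument holds for falling transitions.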
CONCLUSIONS
New MTCMOS memory flip-flops are presented in this paper for a low leakage data retention sleep mode with significantly simplified data transfer capability, smaller circuit area, enhanced data stability, lower mode transition energy, and no additional control complexity as compared to the previously published MTCMOS flip-flops. A small high-V th memory element is combined with the slave latch of the memory flip-flops. One centralized NMOS sleep transistor is employed to disconnect the low-V th gates in the master and slave stages from the real ground distribution network in the sleep mode. The already existing sleep signal of the MTCMOS circuitry is also used for controlling the data transfer operations, thereby eliminating the need for any extra control signals with the memory flip-flops. Sleep and wake-up mode transitions are facilitated by the simplified data storage and recovery operations with the memory flip-flops. The design and operation complexity are significantly reduced as compared to the previously published sequential MTCMOS techniques. The superiority of the MEMORY-FF is quantitatively verified based on a comprehensive electrical Quality Metric that considers various equally important design tradeoffs. A 32-bit shift register designed with the MTCMOS memory flip-flop reduces the clock power consumption, leakage power consumption, and area by up to 24.53%, 67.72%, and 46.43%, respectively, as compared to the previously published MTCMOS registers in a UMC 80 nm CMOS technology. The significant leakage savings and robust operation of the MTCMOS memory flip-flops are also verified under supply voltage and process parameter variations.
