This paper studies power efficient pulse triggered flip flops that adopt a pulse control scheme (PCS) named conditional pulse enhancement. The conditional pulse enhancement scheme consists of a simple pass transistor 'AND' gate and a pull up 'pMOS' transistor. This arrangement reduces circuit complexity and removes the pulse generation control logic from the critical path, which facilitates a faster discharge operation and conditionally improves the discharge speed. The effect of the conditional pulse enhancement scheme on the power and performance of conventional flip flops such as ep-DCO, ep-CDFF, and ip-DCO is analyzed. The performance analysis was carried out in a 180 nm CMOS technology. The simulation results reveal that implicit flip flops with the conditional pulse enhancement scheme outperform the conventional flip flops in terms of power and timing characteristics.
INTRODUCTION
Modern digital designs contain many flip flop (FF) rich modules such as register files and shift registers. One of the most power consuming parts of a very large scale integration (VLSI) system is the clock and its interconnection network. It is estimated that the clock system and the logic part consume almost the same power in various chips, and that the clock system accounts for 20-45% of the total chip power. The clock system consumes this much power because of its large transition probability, whereas the transition probability of ordinary logic is only about one-third on average [1]. As a result, reducing the power consumption of the FFs will have a large impact on the total power consumption of a system. From a performance perspective, the delay and latency of the FF consume a large portion of the cycle time, especially at high operating frequencies.
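The link between transition probability and power can be made explicit with the standard first-order expression for CMOS dynamic power. The symbols used here (switching activity α, load capacitance C_L, supply voltage V_DD, clock frequency f_clk) are introduced only for illustration and do not appear in the paper. The ratio assumes, purely for the sake of comparison, a clock node and a logic node with equal capacitance.

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f_{\mathrm{clk}},
\qquad
\frac{P_{\mathrm{clock\ node}}}{P_{\mathrm{logic\ node}}}
  \approx \frac{\alpha_{\mathrm{clk}}}{\alpha_{\mathrm{logic}}}
  = \frac{1}{1/3} = 3
\quad \text{(equal } C_L,\ V_{DD},\ f_{\mathrm{clk}} \text{)}
```

Under these assumptions a clock node, which switches in every cycle, dissipates roughly three times the dynamic power of an average logic node.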
The choice and design of the FF have a profound effect on both reducing the power dissipation and improving the overall performance of a system. For high speed operation, pulsed latches are more suitable than conventional master-slave FFs [2]. A pulsed latch can be built from a conventional latch driven by a clock pulse. It consists of a pulse generator, which generates the pulses, and a latch structure, which stores the data. Besides the speed advantage, the circuit complexity and size are reduced because of the single latch structure. These features tend to lower the power consumption. Pulsed latches also feature "zero" or "negative" setup time, and they allow time borrowing across cycle boundaries. Their limitation is that the pulses must have a minimum width in order to capture the correct data. The pulse generation scheme therefore requires pulse width control strategies to withstand process variations. Partovi's [3] and Naffziger's [4] pulsed latches were the first of their kind and were used in earlier microprocessors. Klass et al. [5] proposed a semi-dynamic FF, a hybrid structure of pulsed latch and flip flop.
Depending on the pulse generation method, pulsed FFs (P-FFs) can be classified as implicit or explicit [6]. In an implicit P-FF, the pulse generator is built into the latch circuit, so no explicit pulse generator is needed. In an explicit P-FF, the latch and the pulse generator are designed separately. Since no separate pulse generator is used, implicit P-FFs are generally more power-economical. However, they have inferior timing characteristics because of the longer discharging paths in the circuit. The power consumption and complexity of explicit P-FFs can be reduced by sharing the pulse generator among a group of FFs.
This paper is divided into 5 sections. Section 2 discusses conventional pulsed flip flop designs. Section 3 deals with the conditional pulse enhancement scheme and how it affects the working of conventional flip flops. Pre-layout simulation results are shown and analyzed in Section 4. Section 5 concludes the paper.
CONVENTIONAL PULSED FLIP FLOPS

EXPLICIT DATA CLOSE TO OUTPUT (EP-DCO) FLIP FLOP
Fig.1. Explicit-DCO [7]
Fig.1 shows the explicit data close to output (ep-DCO) P-FF, a classic explicit P-FF design [7]. It has a semidynamic True-Single-Phase-Clock (TSPC) structured latch. This FF is called semidynamic because it combines a dynamic input stage with static operation. The pulse generator used in this P-FF is based on 'NAND' logic. Initially, when the pulse is 'LOW', the transistor "MP1" is turned on and the node "X" is charged. When the pulse becomes 'HIGH', the charge on node "X" either remains 'HIGH' or is discharged, depending on the data input. When data is '1' and the pulse is 'HIGH', node "X" is discharged. The 'HIGH' pulse also turns on "MN3", which makes the inverter formed by "MP2" and "MN4" effective. The logic '0' at node "X" is then the input to this inverter, which produces the data value '1' at the output node "Q". Similarly, when data is '0' and the pulse is 'HIGH', node "X" stays charged and "Q" remains '0'. The inverters "I1" and "I2", and "I3" and "I4", are used to hold the internal node and latch the data. The pulse width is determined by the delay of the three inverters. One of the limitations of this circuit is its large switching power: if data remains 'HIGH', node "X" is discharged on every clock pulse.
IMPLICIT DATA CLOSE TO OUTPUT (IP-DCO) FLIP FLOP
Fig.2. Implicit-DCO [7]
Fig.2 shows the circuit diagram of the ip-DCO. The ip-DCO works in the same way as the ep-DCO. The only difference is that the pulse generator is incorporated into the latch. Even though its power consumption is lower than that of the ep-DCO, the timing characteristics are degraded because node "X" has to discharge through three transistors.
CONDITIONAL DISCHARGE FLIP-FLOP (CDFF)
Fig.3. ep-CDFF [8]
In the conditional discharge scheme [8], an 'nMOS' transistor controlled by "Qb" is placed in the discharge path of the first stage. During an input transition from 'LOW' to 'HIGH', the output "Q" changes to 'HIGH' and "Qb" to 'LOW'. This transition at the output switches 'OFF' the discharge path of the first stage and prevents it from discharging. As long as the input holds the value 'HIGH', further redundant evaluations in succeeding cycles are prevented. This method is preferred when the switching activity probability is high. Fig.3 shows the schematic of the explicit conditional discharge FF.
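The benefit of conditional discharge over the plain data-close-to-output behaviour described in this section can be illustrated with a small behavioural sketch. It is a logic-level abstraction only, not the authors' simulation: the function name, the initial output state, and the idea of counting discharges of node "X" per captured data value are assumptions made for illustration.

```python
def count_discharges(data, conditional):
    """Count how often internal node X is discharged for a stream of D values.

    conditional=False models a DCO-style first stage: X discharges on every
    pulse while D is 1. conditional=True models conditional discharge: the
    discharge path is also gated by Qb, so X discharges only when D is 1 and
    the output is still 0.
    """
    q = 0                # assumed initial output state (hypothetical)
    discharges = 0
    for d in data:
        qb = 1 - q
        if conditional:
            fires = (d == 1) and (qb == 1)
        else:
            fires = (d == 1)
        if fires:
            discharges += 1
        q = d            # the latch captures D during the pulse
    return discharges


# A data stream that stays HIGH for several cycles.
stream = [1, 1, 1, 1, 0, 1, 1]
print(count_discharges(stream, conditional=False))
print(count_discharges(stream, conditional=True))
```

In this abstraction the conditional scheme discharges "X" only on a 'LOW' to 'HIGH' change of the stored value, which is why the redundant evaluations for long runs of '1' disappear.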
CONDITIONAL PULSE ENHANCEMENT SCHEME
The pulse generation scheme plays an important role in the performance as well as the power consumption of flip flops. In this work, a conditional pulse enhancement scheme using pass transistor based 'AND' logic is applied to conventional FFs and the resulting performance is analysed. The pulse enhancement scheme was adopted in both explicit and implicit flip flops. First, two conventional explicit flip flops, ep-DCO and ep-CDFF, were selected and simulated. Their pulse generation schemes were then replaced with the conditional pulse enhancement scheme [9]. The modified flip flops were simulated to determine the number of transistors, the "D" to "Q" delay, and the total power dissipation, and the results were compared with those of the conventional explicit FFs. A similar procedure was repeated for the implicit ip-DCO.
Explicit FFs can be made power efficient only by sharing a single pulse generator among multiple FFs. The main limitation of pass transistor logic is that it cannot pass a full logic '1', so a pull up 'pMOS' transistor has to be introduced in every flip flop. However, sharing a single pulse generator among many flip flops is practically difficult, and the number of transistors in the discharging path is the same in the conventional and the modified FFs. Even though the power is reduced, the delay, and hence the performance, remains the same. The PCS is therefore more suitable for implicit FFs than for explicit FFs, and the rest of this work concentrates on implicit FFs.
The proposed conditionally pulse enhanced ip-DCO, shown in Fig.4, contains two measures to overcome the limitations of existing P-FF designs. It reduces the number of 'nMOS' transistors in the discharging path, and it uses a technique to enhance the pull down strength when the input data is '1'. The upper part of the latch is similar to the one employed in the ip-DCO design [7]. In Fig.2, transistor "N2" is part of the transistor stack in the discharging path. Here "N2" is removed from the discharging path. Transistor "N6", together with an additional transistor "N7", forms a two-input pass transistor logic based 'AND' gate to control the discharge of transistor "N1" [10]. The output node "Z" is at logic 'LOW' most of the time because the inputs to the 'AND' logic are the clock and its complement. The 'AND' logic gives a 'HIGH' only during the transition edge of the clock. At the rising edge of the clock, when both transistors "N6" and "N7" are turned 'ON', a weak logic 'HIGH' is passed to node "Z", which turns 'ON' transistor "N3". This happens only for a short time determined by the delay of inverter "I1". The reduced voltage swing results in reduced switching power at node "Z". The discharging path also contains fewer stacked transistors. As a result, node "X" takes less time to discharge, i.e. the delay is reduced.
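The pulse generation principle described here, an 'AND' of the clock with its delayed complement, can be sketched at the logic level as follows. This is a minimal illustration under stated assumptions, not the transistor-level circuit: the function name, the sampled clock waveform, and the inverter delay expressed in samples are all hypothetical, and the weak 'HIGH' level of node "Z" is ignored.

```python
def pulse_train(clk, delay):
    """Return Z = CLK AND (delayed complement of CLK) for a sampled clock.

    clk   : list of 0/1 samples of the clock
    delay : inverter delay in samples (hypothetical unit)
    """
    z = []
    for t, c in enumerate(clk):
        # Delayed clock sample; before any history exists, assume CLK was low.
        past = clk[t - delay] if t >= delay else 0
        clkb_delayed = 1 - past      # output of the delaying inverter
        z.append(c & clkb_delayed)   # pass-transistor 'AND' seen as pure logic
    return z


clk = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
print(pulse_train(clk, delay=2))
```

In this model "Z" is high only for `delay` samples after each rising clock edge and stays low otherwise, including around falling edges, which mirrors the statement that the pulse width is set by the inverter delay.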
In this circuit, when the input data is '1', node "X" has to discharge through two transistors. This is the longest discharging path in this design. Discharging under this condition is enhanced by the addition of transistor "P3". Transistor "P3" is normally turned 'OFF'. It turns 'ON' only when node "X" has discharged to |VTP| below VDD. It then pulls node "Z" up to a strong logic 'HIGH' (from VDD-VTH to VDD), which enhances the pull down strength of transistor "N3".
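A first-order numerical sketch of the two 'HIGH' levels of node "Z" is given below. It assumes that "P3" has its gate on node "X" and its source at VDD, uses the 1.8 V supply of the simulations, and takes hypothetical threshold voltages of 0.45 V for both device types. Body effect and transient behaviour are ignored, so the numbers are illustrative only.

```python
def z_high_level(v_x, vdd=1.8, vth_n=0.45, vtp_abs=0.45):
    """First-order level of node Z while the 'AND' pulse is active.

    vth_n and vtp_abs are hypothetical threshold values, not figures from
    the paper.
    """
    p3_on = v_x < vdd - vtp_abs      # pMOS P3 conducts once X is |VTP| below VDD
    if p3_on:
        return vdd                   # P3 restores Z to a full logic HIGH
    return vdd - vth_n               # nMOS pass transistors alone give a weak HIGH


print(z_high_level(v_x=1.8))   # X still charged: data '0' case
print(z_high_level(v_x=1.0))   # X discharging: data '1' case
```

The enhancement is conditional in the sense that the full-swing pulse appears only when node "X" is actually being discharged, that is, when a '1' is being captured.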
The width and height of the generated discharging pulse are enhanced by transistor "P3". A pulse with sufficient width for correct data capture is thus generated, and the bulky delay inverter is no longer necessary. The smaller transistors in the discharge path also reduce the leakage power.
EXPERIMENTAL RESULTS
The performance of the pulse enhanced P-FF design is evaluated against existing designs through pre-layout simulations using the Eldo simulator. The compared designs include two explicit type P-FF designs, ep-DCO and ep-CDFF. The implicit type P-FF design used is ip-DCO.
Fig.5. Simulation setup model
The target technology is the TSMC 180-nm CMOS process. The transistors of the pulse generator logic are sized for a design specification of 120 ps in pulse width. The operating condition used in the simulations is 500 MHz at 1.8 V. To analyse the power consumption, a data pattern with 100% transition probability is applied at a temperature of 27 °C. The simulation setup model is shown in Fig.5.
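To put the pulse width specification in perspective, it can be compared with the clock period implied by the 500 MHz operating frequency. The symbols T_clk and t_pw are introduced here only for this calculation.

```latex
T_{\mathrm{clk}} = \frac{1}{f_{\mathrm{clk}}} = \frac{1}{500\ \mathrm{MHz}} = 2\ \mathrm{ns},
\qquad
\frac{t_{\mathrm{pw}}}{T_{\mathrm{clk}}} = \frac{120\ \mathrm{ps}}{2000\ \mathrm{ps}} = 0.06
```

The latch is therefore transparent for about 6% of each clock cycle under these operating conditions.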
Table 1 gives the readings for the conventional explicit FFs and the explicit FFs with PCS. The table shows that the power delay product of the conditionally pulse enhanced explicit FFs is not much lower than that of the conventional explicit FFs. There is not much improvement in the "D" to "Q" delay either, because the number of transistors in the discharging path is not reduced in these FFs. Also, sharing the pass transistor pulse generator among many flip flops is practically difficult in the case of explicit pulsed FFs. The conditional pulse enhancement scheme is therefore not suitable for explicit flip flops.
Fig.6 shows the simulation waveforms of the ip-DCO and the proposed ip-DCO design. The pulses at node "Z" are generated on every rising edge of the clock, i.e. only when both inputs to the 'AND' logic are 'HIGH'. To capture input data '1', the pulse is pulled up to a strong 'HIGH' by the additional transistor "P3". Compared with the pulses generated for capturing data '0', these pulses are enhanced in height and width.
From Table 2 and Fig.6, it is clear that adopting the conditional pulse enhancement scheme in the ip-DCO FF effectively reduces the power delay product. Owing to the shorter discharging path and the conditional pulse enhancement scheme, the power consumption of the proposed design is the lowest. The shorter discharging path also reduces the delay of the circuit. The conditional pulse enhancement scheme can therefore be applied to other conventional implicit FFs to obtain better performance in terms of power and speed.
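The power delay product used as the figure of merit in this comparison is simply the product of average power and "D" to "Q" delay. The helper below is a minimal sketch of that calculation and of the relative saving between two designs. The function names and the example numbers are hypothetical placeholders and are not taken from Table 1 or Table 2.

```python
def pdp_fj(power_uw, delay_ps):
    """Power delay product in femtojoules from power in uW and delay in ps.

    1 uW * 1 ps = 1e-18 J = 1e-3 fJ.
    """
    return power_uw * delay_ps / 1000.0


def relative_saving(reference, proposed):
    """Fractional reduction of a metric relative to a reference design."""
    return (reference - proposed) / reference


# Placeholder values for illustration only, not measured results.
conventional = pdp_fj(power_uw=20.0, delay_ps=150.0)
modified = pdp_fj(power_uw=16.0, delay_ps=125.0)
print(conventional, modified, relative_saving(conventional, modified))
```

Because the product combines both quantities, a design that lowers power without shortening the delay improves this metric less than one that improves both.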
CONCLUSION
Reducing the power of the clocking elements can reduce the total power consumption of a system. The choice of flip flop design has a large impact on the power consumption as well as the performance of the system. For high speed operation, pulsed latches are more suitable than conventional master-slave flip flops. In this work, a performance analysis of power efficient pulsed flip flops was carried out by adopting a pulse enhancement scheme. The conditional pulse enhancement scheme consists of a simple two-transistor pass transistor 'AND' gate design. It shortens the discharging path by reducing the number of transistors, and it supports conditional enhancement of the discharging pulse. The pre-layout simulation results show that the pulse control scheme is well suited to implicit flip flops and that the proposed design outperforms the other designs in performance indices such as power, "D" to "Q" delay, and power delay product (PDP). The conditional pulse enhancement scheme can also be adopted in other types of implicit pulsed FFs.
