I. INTRODUCTION
HIGH-SPEED frequency dividers are critical in a variety of applications, from frequency synthesis in wireless communications to broadband optical fiber communication systems. These applications require high speed, low power, high sensitivity, and monolithic integration.
To date, the highest operating frequencies have been achieved with bipolar and III-V technologies [1], [2], though their power consumption is high. Compared with bipolar and III-V dividers, complementary metal-oxide-semiconductor (CMOS) dividers usually operate at lower frequencies. To increase the operating frequency at a given power consumption, several techniques are used, such as injection locking [3], dynamic circuits [4], and improved Miller dividers [5]. Compared with these, a static divider has a much wider operating range, with moderate operating frequency and power consumption. CMOS static frequency dividers operating around 20 GHz have recently been reported [6]-[9], but their power consumption is too high (usually larger than 25 mW for 25-GHz operation). In this letter, a power-efficient 32:1 CMOS static frequency divider, obtained by optimizing the transistor sizes, is presented. The power consumption of the first 2:1 stage is less than 15% of that of other bulk CMOS static frequency dividers at the same frequency. The tradeoff between speed and power consumption is discussed in detail.
II. CIRCUIT DESIGN
Fig. 1(a) shows the block diagram of a 2:1 current mode logic (CML) (also known as source-coupled logic) static frequency divider [11]. The divider is based on the classical master-slave D-type flip-flop in which the inverted slave outputs are connected to the master inputs. The differential nature reduces the switching noise and provides a sufficient noise margin. A separate buffer is usually used to drive 50-Ω loads. The divider inputs (CK and CKB) are also terminated with 50-Ω resistors to control the amplitude of the input signals. As shown in Fig. 1(b), each master-slave flip-flop is implemented using CML. The master or slave consists of an evaluate stage (M1,3,4) and a latch stage (M2,5,6). The current sources in conventional CML latches are omitted [6] for low-voltage operation. This causes the total current flowing through the evaluate and latch stages to fluctuate in time, which may potentially generate larger switching noise. However, at high frequencies, there is a large overlap during which both the evaluate and latch stages are turned on, which makes the supply current relatively constant. Therefore, the switching noise is limited, which is also verified by simulation.
When CK and CKB are equal to the common-mode value and there is no input clock signal, both the master and slave latches are semitransparent, allowing signals to propagate through both latches. This makes the circuit work as a ring oscillator. If the delay from the gate to the drain of M3 is τ, then the oscillation period is equal to 4τ. Thus, the circuit oscillates at 1/(4τ), and the signal at the drain of M3 lags the signal at the gate of M3 by 90°. In the small-signal model, the propagation delay, τ, is proportional to the RC constant at the output node. However, the voltage swing in this circuit can be large, and the oscillator becomes nonlinear. This makes RC only an approximate estimate, and the large-signal characteristics also need to be studied carefully to estimate the real oscillation frequency.
Usually, a higher self-oscillation frequency leads to a higher operating frequency of the divider. Meanwhile, the oscillation frequency strongly depends on the transistor sizes. Fig. 2(a) shows the simulated oscillation frequency as a function of the width of the latch transistors (M5,6) for varying widths of the p-channel metal-oxide-semiconductor (PMOS) loads (M7,8). In the simulation, the widths of M3,4 are fixed at 5 μm and M7,8 are fixed at 8 μm. As can be seen, smaller load transistors lead to a lower oscillation frequency, because the resistance R increases with smaller loads. Though the capacitance C also decreases a little, it decreases more slowly than R increases. Furthermore, with given load transistors, wider latch transistors lead to a lower frequency. From the simulation, the output voltage swing increases as the latch transistor size increases, because of the larger negative resistance from the cross-coupled transistors. Meanwhile, the maximum charge/discharge current is also limited by M1. As a result, the current increases more slowly than the voltage swing, which in turn results in a larger τ and a lower oscillation frequency.
Additionally, when the widths of the PMOS loads are less than 1.8 μm and those of the latch transistors are less than 1 μm, the circuit stops oscillating because the PMOS transistors are too small to pull up sufficiently fast.
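The self-oscillation relations described above can be written compactly. This is only a restatement of the small-signal picture, with the symbols τ (gate-to-drain delay of M3), R and C (effective resistance and capacitance at the output node) assumed here as names. The factor of four follows from the usual ring-oscillator argument: the signal must traverse the two-latch loop twice, once for each polarity, before the waveform repeats.

```latex
\begin{align}
T_{\mathrm{osc}} &= 2 \cdot 2\tau = 4\tau, \\
f_{\mathrm{osc}} &= \frac{1}{T_{\mathrm{osc}}} = \frac{1}{4\tau}, \\
\Delta\phi &= 360^{\circ} \cdot \frac{\tau}{T_{\mathrm{osc}}} = 90^{\circ}, \\
\tau &\propto RC \quad \text{(small-signal approximation only)}.
\end{align}
```

Because the swing is large, the last relation indicates only the trend (a larger R or C lowers the oscillation frequency) and is not a quantitative predictor.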
To lower power consumption, the PMOS loads and latch transistors should be small, while avoiding the region where the circuit fails to oscillate. Sufficient voltage swing is also required to drive the following stage. In the final design, the widths of the drive transistors (M3,4), PMOS loads (M7,8), and latch transistors (M5,6) are chosen as 5 μm, 2.6 μm, and 1.6 μm, respectively. There is greater flexibility in sizing the input transistors (M1,2). They should be large enough that the voltage drop across them is not too high, yet small enough that their gate capacitance, and hence the power consumed in driving their gates, remains low. The widths of M1,2 are chosen to be 8 μm. For the following four stages, the frequency is lower; thus, smaller transistors are used, and the power consumption is much lower than that of the first stage.
Further, parasitic extraction from the layout of the divider shows that the interconnect capacitance does not change much with different transistor sizes. Therefore, as the sizes of all the transistors are scaled up, the impact of the interconnect parasitic capacitances becomes less important and the self-oscillation frequency increases. This, however, also increases the power consumption. Fig. 2(b) shows the power consumption and the maximum and minimum operating frequencies as a function of the drive transistor width. In this simulation, for both the master and slave stages, the widths of M1,2, M5,6, and M7,8 are approximately 1.6 times, one-third (1.6/5), and one-half (2.6/5) of the width of M3,4, respectively. As expected, the power consumption increases almost linearly with the transistor sizes; however, the operating frequency levels off when the drive transistor is wider than 5 μm. This shows that the choice of 5 μm for M3,4 is almost optimal.
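The proportional scaling used for the sweep in Fig. 2(b) can be illustrated with a short sketch. It assumes that every width is scaled by exactly the same factor as the drive transistor, starting from the final first-stage sizes quoted in the text. The text gives the ratios only approximately, and the names `REF_WIDTHS_UM` and `scaled_widths` are hypothetical, introduced here for illustration.

```python
# Final first-stage widths quoted in the text, in micrometres.
REF_WIDTHS_UM = {"M1,2": 8.0, "M3,4": 5.0, "M5,6": 1.6, "M7,8": 2.6}


def scaled_widths(drive_width_um):
    """Scale every transistor width in proportion to the M3,4 drive width.

    Assumption: exact proportional scaling from the reference design; the
    text states the ratios only approximately.
    """
    if drive_width_um <= 0:
        raise ValueError("drive width must be positive")
    scale = drive_width_um / REF_WIDTHS_UM["M3,4"]
    return {name: width * scale for name, width in REF_WIDTHS_UM.items()}


if __name__ == "__main__":
    # Print the width set for a few illustrative drive widths.
    for w in (2.5, 5.0, 10.0):
        row = ", ".join(f"{k} = {v:.2f} um" for k, v in scaled_widths(w).items())
        print(f"drive {w:.1f} um -> {row}")
```

With a drive width of 5 μm the function returns the reference design unchanged; other values give the width sets that such a sweep would step through under this assumption.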
III. EXPERIMENTAL RESULTS
To make the measurements easier and more realistic, a 32:1 circuit consisting of five 2:1 divider stages is implemented. The circuit is fabricated in a 130-nm CMOS logic process with eight-layer copper metallization. The die micrograph of the circuit is shown in Fig. 3. The chip size is 0.38 mm × 0.53 mm, which is mainly determined by the pad frame, while the active area is only about 20 μm × 80 μm.
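As a simple illustration of why a 32:1 chain eases measurement, the sketch below lists the frequency at the output of each cascaded 2:1 stage. The function name `stage_output_frequencies` is hypothetical, and the 26-GHz input is used only as an example value, taken from the maximum operating frequency reported for the 1.5-V supply.

```python
def stage_output_frequencies(f_in_hz, num_stages=5):
    """Return the output frequency of each stage in a cascade of 2:1 dividers."""
    if num_stages < 1:
        raise ValueError("at least one stage is required")
    freqs = []
    f = float(f_in_hz)
    for _ in range(num_stages):
        f /= 2.0  # each static 2:1 stage halves the frequency
        freqs.append(f)
    return freqs


if __name__ == "__main__":
    # Example input: 26 GHz, divided down through five stages.
    for i, f in enumerate(stage_output_frequencies(26e9), start=1):
        print(f"stage {i}: {f / 1e9:.4f} GHz")
```

For a 26-GHz input, the fifth stage delivers 26 GHz / 32 = 812.5 MHz, which is far easier to buffer and observe than the input itself.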
The divider starts to work at a supply voltage of 0.53 V, with a 4.2-GHz maximum operating frequency and only 56 μW power consumption in the first 2:1 stage. This is only 12 μW higher than the divider architecture specially designed for low-voltage and low-power operation [12]. Fig. 4 shows the input sensitivity measured at three different supply voltages of 0.7, 1.2, and 1.5 V. The maximum operating frequencies are 10, 22.5, and 26 GHz, respectively, and the power consumption of the first 2:1 stage is 228 μW, 1.86 mW, and 3.88 mW, respectively. The power consumption of the whole 32:1 circuit including buffers (with a high-impedance output load) is 551 μW, 4.68 mW, and 8.97 mW, respectively. With a 50-Ω output load, the power consumption is about one-third higher due to the larger current in the buffers. As can be seen, the first stage consumes about 45% of the total power. The output waveform is measured with an Agilent Infiniium 86100B oscilloscope. Fig. 5 shows the output waveform with a 26-GHz input signal. Since the buffers work at low frequency, the output is close to a square wave.
Table I summarizes the power consumption and the maximum operating frequency for several previously reported 2:1 CMOS static frequency dividers around 20 GHz. The 3.88-mW power consumption at 26 GHz is much less than those of all the bulk CMOS dividers [7]-[9] and is close to that of the silicon-on-insulator (SOI) CMOS frequency divider [10].
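One way to look at the speed-power tradeoff in these measurements is the first-stage energy per input cycle, P/f_max. This figure of merit is not used in the letter; it is introduced here only as an illustration. The sketch assumes first-stage powers of 228 μW, 1.86 mW, and 3.88 mW at the three supply voltages, and the names `MEASURED` and `energy_per_input_cycle_fj` are hypothetical.

```python
# (supply voltage in V, maximum operating frequency in Hz, first-stage power in W)
# Values as reported for the three measured supply voltages; the first
# power value is assumed to be in microwatts.
MEASURED = [
    (0.7, 10e9, 228e-6),
    (1.2, 22.5e9, 1.86e-3),
    (1.5, 26e9, 3.88e-3),
]


def energy_per_input_cycle_fj(power_w, freq_hz):
    """Return power divided by frequency, expressed in femtojoules per cycle."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return power_w / freq_hz * 1e15


if __name__ == "__main__":
    for vdd, f_max, p_first in MEASURED:
        e_fj = energy_per_input_cycle_fj(p_first, f_max)
        print(f"VDD = {vdd:.1f} V: {f_max / 1e9:.1f} GHz, {e_fj:.1f} fJ per cycle")
```

Under these assumptions the energy per cycle rises with supply voltage, so the extra speed gained at 1.5 V is bought at a more than proportional increase in power.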
IV. CONCLUSION
By optimizing the transistor sizes in the D flip-flops, a power-efficient, high-sensitivity 32:1 static frequency divider in a 130-nm CMOS process is demonstrated. The first 2:1 stage can work up to 26 GHz with only 3.88 mW power consumption at a 1.5-V supply. This is the most power-efficient bulk CMOS static frequency divider operating above 20 GHz.
