Index Terms-Controllable delay element (CDE), frequency divider, process compensation.
I. INTRODUCTION

FREQUENCY dividers are critical functional blocks in very large-scale integration (VLSI) circuits designed for various wireless communication applications. They are employed extensively in phase-locked loop (PLL)-based frequency synthesizers, which account for a significant portion of the system power budget. Given mobile devices' inherent need for battery-powered operation, the design of high-speed, low-power frequency dividers is essential to reduce the overall power dissipation and maximize battery life.
Many CMOS-based divider architectures have been reported in the literature. LC resonator-based injection-locked frequency dividers (LC-ILFDs) [1] utilize an oscillator whose center frequency is locked to a harmonic of the incoming signal frequency. While extremely power efficient, these dividers require on-chip inductors that occupy a large chip area. Furthermore, the locking range of LC dividers is typically narrow when compared with other frequency dividers. Other ILFDs [2] achieve low-power operation, but are relatively slow and can be sensitive to process and temperature variations.
Flip-flop-based frequency dividers consist of two latches in a master-slave configuration. Current mode logic (CML) flip-flop-based frequency dividers described in [3] and [4] generally achieve a high speed of operation. However, CML latches require constant biasing and therefore consume power even when there are no clock inputs. A quasi-latch-based dynamic divider proposed in [5] achieves good performance, but it is impractical because proper operation cannot be guaranteed across all process and temperature corners. In this brief, we propose a dynamic frequency divider that is suitable for use in the 5-GHz UNII band while realizing a small area and low power consumption. In addition, we illustrate how novel compensation circuitry can be utilized to ensure proper divide-by-two operation over a wide variation in transistor performance.
1549-7747/$25.00 © 2007 IEEE
II. FREQUENCY DIVIDER DESIGN
Fig. 1(b) shows the circuit topology of the proposed divider. The divider core inside the dashed box consists of three controllable delay elements (CDEs) and a CMOS transmission gate (TG). When compared with a conventional static divider, shown in Fig. 1(a), the proposed divider topology has one CMOS TG removed from the signal path. This topology leads to higher speed through a reduction in the critical-path delay. Furthermore, the reduction in clock load results in faster clock transition times and further decreases the delay through the TG [9].
Despite providing speed advantages, the proposed divider operates like a ring oscillator with operating frequencies that are extremely sensitive to variations in the speed of the transistors. Simulations with BSIM3v3 models indicate that up to 80% variation in the maximum input frequency can be expected across all process and temperature corners, causing improper operation. In order to improve the divider's tolerance to uncertainties in the manufacturing process and operating environment, CDEs have been incorporated into the divider design and absorbed into the inverter chain. These delay elements replace the conventional CMOS inverter stages utilized in [5] and allow for controllable tuning of the divider's operating frequency.
A. Divider Core Operation
The proposed frequency divider core in Fig. 1(b) operates in a dynamic manner in which the output frequency is modulated by CLK, which controls the ON-OFF state of the TG. On the rising edge of CLK, the signal at node "a" passes through TG1 and reaches node "b" after a delay of $t_{TG}$. The signal at node "b" then propagates through three CDEs and returns to node "a" in an inverted form after a delay of $3t_{CDE}$, where $t_{CDE}$ is the delay through a single CDE stage. This inverted signal then propagates to node "b" at the beginning of the next clock cycle, and the inverting action is repeated. Effectively, the proposed circuit oscillates at half the frequency of the input clock and thereby achieves divide-by-two operation.
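The toggling behavior described above can be illustrated with an idealized behavioral model. The Python sketch below ignores all analog delays and simply assumes that the timing conditions for proper operation are met; the function name and the choice of node "b" as the observed output are illustrative assumptions, not part of the original design description.

```python
def simulate_divider(n_clock_cycles, initial_a=0):
    """Idealized divide-by-two model: one loop iteration per rising CLK edge.

    Assumes the TG passes node "a" to node "b" on each rising edge and the
    three inverting CDEs return the inverted value to node "a" before the
    next rising edge (analog delays are not modeled).
    """
    a = initial_a
    node_b_trace = []
    for _ in range(n_clock_cycles):
        b = a            # rising edge: TG1 passes node "a" to node "b"
        a = 1 - b        # odd number of inverting stages: "a" becomes NOT "b"
        node_b_trace.append(b)
    return node_b_trace


if __name__ == "__main__":
    # Node "b" changes once per input clock cycle, so its period spans
    # two clock cycles, i.e. half the input frequency.
    print(simulate_divider(6))
```

Because the value at node "b" alternates on every input clock cycle, one full output period takes two input periods, which is the divide-by-two action.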
The above discussion yields two conditions for proper operation. First, the path delay through the signal chain must be greater than half the clock period to ensure that no race condition can occur. This yields the maximum allowable input clock period for the given TG and CDE delays, $t_{TG}$ and $t_{CDE}$:

$$T_{CLK,\max} = 2\,(t_{TG} + 3t_{CDE}). \tag{1}$$

Further, the path delay of the divider limits the minimum input clock period such that self-oscillation in the ring oscillator can be maintained:

$$T_{CLK,\min} = t_{TG} + 3t_{CDE}. \tag{2}$$

The delays through the circuit can be determined by the equivalent circuit shown in Fig. 2. This circuit is similar to the one in Fig. 1(b), but has the feedback loop broken at node "a" to simulate the total propagation delay around the loop. The dummy load is a TG in the off state, and it has the same dimensions as TG1 to reproduce the capacitive loading seen by the final CDE stage in Fig. 1(b). Finally, using (1) and (2), and noting that $f_{CLK} = 1/T_{CLK}$, the tuning range of the frequency divider can be obtained as

$$\Delta f = \frac{1}{T_{CLK,\min}} - \frac{1}{T_{CLK,\max}} = \frac{1}{2\,(t_{TG} + 3t_{CDE})}. \tag{3}$$
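As a rough numerical illustration of the two timing conditions, the following Python sketch computes the input-frequency window from a TG delay and a per-stage CDE delay. It assumes a first-order reading in which the loop delay (one TG plus three CDE stages) must exceed half the clock period and must not exceed a full clock period. The function name and the delay values are hypothetical and are not taken from the simulations in this brief.

```python
def divider_frequency_window(t_tg, t_cde, n_stages=3):
    """Return (f_min, f_max, tuning_range) in Hz for a first-order loop model.

    t_tg:  TG delay in seconds (hypothetical input)
    t_cde: delay of a single CDE stage in seconds (hypothetical input)
    """
    if t_tg <= 0 or t_cde <= 0:
        raise ValueError("delays must be positive")
    t_loop = t_tg + n_stages * t_cde
    f_max = 1.0 / t_loop            # loop delay must fit within one clock period
    f_min = 1.0 / (2.0 * t_loop)    # loop delay must exceed half a clock period
    return f_min, f_max, f_max - f_min


if __name__ == "__main__":
    # Illustrative delays only: 10 ps for the TG and 30 ps per CDE stage.
    f_min, f_max, span = divider_frequency_window(t_tg=10e-12, t_cde=30e-12)
    print(f"f_min = {f_min / 1e9:.2f} GHz")
    print(f"f_max = {f_max / 1e9:.2f} GHz")
    print(f"range = {span / 1e9:.2f} GHz")
```

Under this model the tuning range always equals the minimum operating frequency, so the usable window spans a factor of two in frequency for any given set of delays.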
B. CDE
There are various popular techniques for designing CDEs, such as shunt-capacitor and current-starved techniques [6]. As shown in Fig. 2, each of the variable delay stages is implemented as a current-starved CMOS inverter since no additional capacitances are created along the signal path. As can be seen in this figure, the charging and discharging currents of the inverter's output capacitance depend on the gate voltages of M3, M6, and M9. This allows the use of a continuous voltage to control the delay. Without loss of generality, the ensuing discussion concentrates on the behavior of the first delay stage; the same design methodology can be applied to the other delay elements.
Sizing of the inverter and control transistors involves tradeoffs between power consumption, speed, and delay controllability. When the W/L ratio of the control transistor M3 is smaller than that of M2, M3 operates mainly in the saturation region, and the propagation delay of the current-starved delay stage is given by the expression (4) from [6]. From (4), it can be seen that a small W/L ratio results in the inverter's charging and discharging currents being mainly determined by the control voltage, increasing the controllable range of the delay. On the downside, the propagation delay is higher and the speed of the overall divider is decreased. If a higher frequency of operation is required while maintaining good controllability, larger W/L ratios must be applied to the inverter as well as the control transistor. This leads to additional capacitive loading in the signal path and increases the power consumption.
In contrast, a larger W/L ratio results in the current being mainly controlled by M1 and M2. This yields a lower degree of controllability while achieving a higher speed. To illustrate this speed-controllability tradeoff, a three-stage delay element is simulated with a varying control voltage for a smaller and a larger control-transistor W/L ratio, respectively. As can be seen in Fig. 3, the delay for the smaller ratio is higher, but exhibits wider tunability with respect to the control voltage. In the proposed divider, power and speed performance have been given priority over delay controllability. Therefore, large W/L ratios have been adopted for the control transistors to minimize the intrinsic delay. Furthermore, the width of the inverter transistors must be minimized to reduce power dissipation. Based on these requirements, the following steps can be considered as general guidelines for sizing the delay elements in the proposed frequency divider.
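The speed-controllability tradeoff described above can be illustrated with a toy model. The Python sketch below is not the expression from [6]. It assumes a square-law saturation current for the control transistor, a fixed drive current for the inverter transistor, and a series combination of the two; all function names and parameter values are hypothetical.

```python
def stage_delay(v_ctrl, wl_ctrl, i_inv=200e-6, c_load=5e-15, v_dd=2.0,
                k_prime=100e-6, v_th=0.5):
    """First-order delay (s) of one current-starved stage in a toy model.

    v_ctrl:  gate voltage of the control transistor (V)
    wl_ctrl: W/L ratio of the control transistor
    All default parameter values are illustrative assumptions.
    """
    if v_ctrl <= v_th:
        raise ValueError("control transistor is off in this model")
    # Square-law saturation current of the control transistor.
    i_ctrl = 0.5 * k_prime * wl_ctrl * (v_ctrl - v_th) ** 2
    # Series-limited current: dominated by the weaker of the two devices.
    i_eff = 1.0 / (1.0 / i_inv + 1.0 / i_ctrl)
    # Time to move the load capacitance by V_DD / 2 at constant current.
    return c_load * (v_dd / 2.0) / i_eff


def tuning_ratio(wl_ctrl, v_lo=1.0, v_hi=2.0):
    """Ratio of slowest to fastest delay over the control-voltage range."""
    return stage_delay(v_lo, wl_ctrl) / stage_delay(v_hi, wl_ctrl)


if __name__ == "__main__":
    for wl in (2.0, 20.0):
        fastest = stage_delay(2.0, wl)
        print(f"W/L = {wl:4.1f}: fastest delay = {fastest * 1e12:.1f} ps, "
              f"tuning ratio = {tuning_ratio(wl):.2f}")
```

In this model the smaller control-transistor W/L gives a longer minimum delay but a larger ratio between the slowest and fastest settings, which mirrors the qualitative trend described for Fig. 3.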
1) Start with minimum-dimension transistors and find the worst-case delay from node "a" back to node "a".
2) Increase the width of the control transistors until the improvement in this delay is negligible.
3) Increase the inverter widths such that (1) is satisfied.
4) Find the best-case delay from node "a" back to node "a". If it satisfies (2), the design is complete. Otherwise, reduce the width of the control transistors and go back to step 3.
Following these steps, the transistor sizes for the CDEs have been chosen as shown in Fig. 2. The widths of the CDEs' pMOS transistors M1, M4, and M7 are made smaller than those of their nMOS counterparts M2, M5, and M8 to reduce the capacitive load of the gate. While this disrupts the symmetrical switching behavior of the inverting CDE, the overall propagation delay and power consumption are improved.
III. PROCESS-COMPENSATION ARCHITECTURE
The process-compensation architecture sets the control voltage of the CDEs and thereby adjusts the delays through the signal path to counteract the effect of process variations. The topology of the proposed compensation circuit working with a PLL is shown in Fig. 4(a). When the system first powers up, the calibration block compares the stable reference frequency with the output frequency of the proposed divider. Depending on the result of the comparison, the control voltage is increased, decreased, or kept constant to adjust the speed of the proposed divider. This calibration process repeats on the next clock cycle and continues until the control voltage reaches a stable value. The corresponding digital control register value is then stored in memory to set the control voltage during normal operation. At this point, the compensation process is complete, allowing the reference divider and the calibration block to be powered down. This process-compensation submodule is only operational during one-time factory calibration and has no influence on the power consumption of the proposed divider during normal operation. It should be noted that the reference divider chain should employ robust, high-speed static dividers such as the CML-based dividers in [3]. These dividers occupy minimal area when compared with LC-based dividers, and their high power usage can be tolerated since they are not powered during normal operation.
The calibration subcircuit has been implemented using digital logic and a digital-to-analog converter (DAC), as shown in Fig. 4(b). During system initialization, the register contains an initial digital code which is translated to the control voltage. The calibration process starts with the phase frequency detector (PFD) and the "Sub" unit, which adjusts the digital code depending on the relative phase of the reference and divider output signals. The digital code is then checked for overflow, updated using the "Add/Sub" module, and stored back into the register. The control voltage is then updated after the finite delay of the DAC. When the compensation process is complete, the reference divider is shut down, and only the register and the DAC need to be powered to maintain the calibrated control voltage.
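The calibration loop described above can be sketched in software as a simple up/down search on the register code. The Python sketch below is only a behavioral illustration under stated assumptions: an ideal linear DAC, a 4-bit code, and a hypothetical `compare` callback standing in for the PFD decision. None of the function names, the 2-V DAC reference, or the target voltage come from the original circuit.

```python
def dac(code, v_ref=2.0, bits=4):
    """Ideal linear DAC (assumption): map an unsigned code to a voltage."""
    return v_ref * code / (2 ** bits - 1)


def example_compare(v_ctrl, v_target=1.3, deadband=0.07):
    """Hypothetical stand-in for the PFD decision.

    Returns +1 to raise the control voltage, -1 to lower it, 0 to hold.
    """
    if v_ctrl < v_target - deadband:
        return 1
    if v_ctrl > v_target + deadband:
        return -1
    return 0


def calibrate(compare, code=8, bits=4, max_cycles=64):
    """Step the register code until it no longer changes, then return it."""
    top = 2 ** bits - 1
    for _ in range(max_cycles):
        step = compare(dac(code, bits=bits))
        # Overflow check: keep the code inside the DAC's input range.
        new_code = min(max(code + step, 0), top)
        if new_code == code:
            break  # stable value reached; this code would be stored
        code = new_code
    return code


if __name__ == "__main__":
    final_code = calibrate(example_compare)
    print(final_code, round(dac(final_code), 3))
```

The deadband in `example_compare` is what lets the loop settle; without it, a target lying between two DAC levels would make the code toggle between neighbors until `max_cycles` is reached.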
The complexity of the calibration subcircuit is determined in large part by the speed and resolution requirements of the DAC. The minimum speed requirement of the DAC is set by the frequency of the comparison process. Therefore, a 20-MHz 4-bit DAC implementation was sufficient for the proposed compensation architecture. An alternative to the DAC implementation is to replace the CDEs shown in Fig. 2 with the digitally controlled delay elements proposed in [6]. Digital outputs from the register can then directly control the operation of the frequency divider without employing a DAC.
IV. SIMULATION RESULTS
A. Core Divider Performance
The frequency divider was designed using Peregrine's 0.25-µm UTSi SOS-CMOS technology. The divider occupies an area of 35 µm × 25 µm, and post-layout simulation of the fully extracted divider was carried out using Cadence SpectreRF to verify the operation of the proposed divider. Furthermore, a conventional divider shown in Fig. 1(a) and a CML-based divider have also been designed for comparison purposes. Core divider performance results in this section are obtained at 15 °C in the typical process corner using a sinusoidal input clock with an amplitude set to the supply voltage. In addition, the control voltage of the CDEs is tied to the supply voltage to simulate the fastest speed achievable.
The divider's maximum and minimum operating frequencies are plotted against the supply voltage in Fig. 5. The maximum operating frequency of 10.1 GHz is obtained at a supply voltage of 2 V. With the same supply voltage, the tuning range is 5.2 GHz and is in good agreement with the values predicted by (3). When the supply voltage decreases to 1.2 V, the divider realizes a lower power consumption while maintaining a moderate operating frequency in the 3.5- to 7-GHz range. Fig. 6 compares the proposed divider's figure of merit (FoM), which is defined as the maximum operating frequency per unit of power consumption, with that of a conventional divider, a CML divider, and other reported CMOS dividers [1], [2], [5], [7], [8], [10]. From the graph, it is clear that the FoM of this work shows an inverse relationship with the supply voltage. With a supply voltage of 0.6 V, the divider operates at 4.3 GHz while consuming 33 µW, yielding a power efficiency FoM of 143. The FoM gradually decreases to 71 with a supply voltage of 0.8 V, and reaches a minimum of 9 when the maximum supply voltage of 2 V is used.
Over the same range of operating voltage, the proposed divider is 1.4 to 2 times faster than the conventional latch divider while dissipating a similar amount of power. Meanwhile, the proposed divider can operate at frequencies comparable to those of CML-based dividers at a fraction of the power consumption. Therefore, the proposed divider has a higher level of power efficiency than both topologies designed in the same technology. When operating at 6.8 GHz, the proposed divider exhibits a power saving of greater than 2 mW over the CML divider. This translates to a 5% power reduction in a complete low-power receiver front-end designed previously. With the exception of [8], which is designed in an advanced 0.12-µm SOI-CMOS process, the proposed divider shows a higher FoM when compared with other reported CMOS dividers. The enhancement in power efficiency is particularly significant in the medium frequency range from 4 to 7 GHz. Thus, this divider is an attractive candidate for use as a power-efficient first stage of a PLL's divider network working in the 5-GHz UNII band.
B. Compensated Divider Performance
In order to verify the robustness of the proposed divider design over temperature and process corners, we designed and simulated a 5-GHz frequency synthesizer and its accompanying compensation circuitry as shown in Fig. 4(a). The proposed frequency divider was utilized as the first stage of the PLL's divider chain, while a CML divider was used as the first stage of the reference divider chain. Lower-frequency dividers contained in both divider chains employ the conventional static divider shown in Fig. 1(a). The base frequency of the VCO was designed to be 4.6 GHz with a tuning range of 600 MHz, while the loop filter bandwidth was set to approximately 1 MHz.
The calibration process of the divider's control voltage is shown in Fig. 8 for each of the process corners. As expected, the control voltage reaches its highest value in the SS process corner and decreases as the transistors' speed increases. Using the final calibrated control voltages, the proposed divider's maximum and minimum input frequencies are simulated and shown in Fig. 9. For correct operation, the functional range of the divider operating in all corners must cover the complete frequency spectrum of the PLL. As shown in this plot, the worst-case scenario occurs at 75 °C, where the maximum frequency in the slow process corner and the minimum frequency in the fast process corner are 5.3 and 4.5 GHz, respectively. Since the tuning range of the PLL is from 4.6 to 5.2 GHz, the minimum operating margin is 100 MHz. This shows that the compensated divider can indeed operate under all operating conditions. Finally, the tuning transients of the VCO in different process corners are shown in Fig. 7(a). The tuning voltage approaches a stable level of approximately 900 mV in all process corners. This coincides with the divider chain's output locking to the reference crystal oscillator signal and producing the desired 5-GHz VCO output, as shown in Fig. 7(b) and (c).
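The operating-margin figure quoted above for the worst-case temperature can be reproduced with a short calculation. The Python sketch below uses the frequencies stated in the text, expressed in MHz to keep the arithmetic exact; the function name is an illustrative choice.

```python
def operating_margin_mhz(div_min, div_max, pll_min, pll_max):
    """Smallest gap (MHz) between the PLL range and the divider's range.

    A negative result means the divider's functional range does not
    fully cover the PLL tuning range.
    """
    low_side = pll_min - div_min    # headroom below the PLL's lowest frequency
    high_side = div_max - pll_max   # headroom above the PLL's highest frequency
    return min(low_side, high_side)


if __name__ == "__main__":
    # Worst case from the text: the slow corner limits the maximum (5.3 GHz)
    # and the fast corner limits the minimum (4.5 GHz); PLL spans 4.6-5.2 GHz.
    margin = operating_margin_mhz(div_min=4500, div_max=5300,
                                  pll_min=4600, pll_max=5200)
    print(margin)  # 100
```

Both the low-side and high-side gaps come out at 100 MHz for these values, so the margin is symmetric in this worst case.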
V. CONCLUSION
In this brief, we propose a novel frequency divider that is robust and highly power efficient. The proposed architecture utilizes a configurable CMOS ring oscillator for frequency division in the 2- to 10.1-GHz range. With a supply voltage of 0.6 V, the divider operates at 4.3 GHz while consuming 33 µW of power. When combined with a simple and effective compensation submodule, the proposed divider can attain process- and temperature-insensitive operation in a 5-GHz UNII band frequency synthesizer.
