The main features of the two circuits are summarised in Table 1. The offset error is divided by 70, the linear gain error by 20 and the total harmonic distortion is reduced by 13dB.
High-speed divide-by-4/5 counter for a dual-modulus prescaler
Ching-Yuan Yang, Guang-Kaai Dehng and Shen-Iuan Liu
Indexing terms: Dividing circuits, CMOS integrated circuits
A new high-speed divide-by-4/5 counter is developed. Based on this divide-by-4/5 counter, a 3V 2M -1.1 GHz dual-modulus divide-by-128/129 prescaler fabricated with 0.6µm CMOS technology is presented. Its maximum operating frequency of 1.11GHz with power consumption of 19.2mW has been measured at a 3V supply voltage. In addition, for a power supply of 1.5V, the circuit consumed 2.67mW at a maximum input frequency of 520MHz.
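As a rough cross-check of the two measured operating points, the energy drawn per input clock period is simply the power divided by the input frequency. The sketch below assumes the maximum frequency at the 3V supply reads as 1.11GHz; the function name is illustrative and not part of the Letter.

```python
def energy_per_cycle_pj(power_mw, freq_mhz):
    """Energy drawn per input clock period in picojoules (E = P / f)."""
    power_w = power_mw * 1e-3
    freq_hz = freq_mhz * 1e6
    return power_w / freq_hz * 1e12


# Operating points quoted in the abstract (1110 MHz is an assumed reading).
e_high_supply = energy_per_cycle_pj(19.2, 1110.0)
e_low_supply = energy_per_cycle_pj(2.67, 520.0)

print(f"3V point:        {e_high_supply:.1f} pJ per input cycle")
print(f"low-supply point: {e_low_supply:.2f} pJ per input cycle")
```

On these figures the prescaler draws roughly 17pJ per input cycle at the higher supply and roughly 5pJ at the lower one.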
Introduction: In the world of modern communication, a frequency synthesiser with a high-frequency prescaler is an important building block. New techniques offering higher integration density, lower power consumption and high-speed capability are being developed to achieve a high-speed CMOS prescaler. Some circuits, using advanced processes and/or special circuit techniques, have been proposed to realise high-speed dual-modulus prescalers [1]. Among them, the true single phase circuit technique [2] has resulted in many complex CMOS circuits operating at clock frequencies of several hundred MHz [3], and some CMOS circuits operating at > 1GHz [4, 5] in the last few years. In this Letter, a new architecture of a dual-modulus prescaler is presented and fabricated with 0.6µm CMOS technology. Experimental results are given to demonstrate its performance.
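The dual-modulus behaviour can be illustrated with a purely behavioural sketch. It follows the structure given in the circuit description (a clock preprocessor that withholds one input pulse when MC is '1' and both TFF outputs are '1', then two TFFs, then a divide-by-32 stage) but does not model the HT register or domino gate themselves. The class and function names are hypothetical. The rule that divide-by-5 is selected during only one of the 32 first-stage cycles is an assumption based on the usual dual-modulus arrangement, not a detail taken from this Letter.

```python
class Divider45:
    """Behavioural divide-by-4/5 counter: pulse-swallowing stage plus two TFFs."""

    def __init__(self):
        self.count = 0          # combined state of the two TFFs (0..3)
        self.swallowed = False  # True once the extra pulse has been withheld

    def clock(self, mc):
        """Advance one input period; return True when an output cycle completes."""
        if mc and self.count == 3 and not self.swallowed:
            # Both TFF outputs are '1': the preprocessor holds its precharge
            # state, so this input pulse does not reach the TFFs.
            self.swallowed = True
            return False
        self.swallowed = False
        self.count = (self.count + 1) % 4
        return self.count == 0


def input_periods_per_output(mc, cycles=3):
    """Input periods consumed by each of the first `cycles` output cycles."""
    divider, periods, n = Divider45(), [], 0
    while len(periods) < cycles:
        n += 1
        if divider.clock(mc):
            periods.append(n)
            n = 0
    return periods


def prescaler_ratio(select_129, stage2=32):
    """Input periods per output period of divide-by-4/5 followed by divide-by-32."""
    divider, n_in, n_stage1 = Divider45(), 0, 0
    while n_stage1 < stage2:
        n_in += 1
        # Assumption: divide-by-5 is used for the last first-stage cycle only.
        mc = select_129 and n_stage1 == stage2 - 1
        if divider.clock(mc):
            n_stage1 += 1
    return n_in


print(input_periods_per_output(False), input_periods_per_output(True))
print(prescaler_ratio(False), prescaler_ratio(True))
```

Under these assumptions the two moduli follow from 32 x 4 = 128 and 31 x 4 + 5 = 129.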
Circuit description: Most divide-by-128/129 dual-modulus prescalers [4, 5] consist of a synchronous divide-by-4/5 counter as the first (high-frequency) stage, followed by a chain of toggle flip-flops (TFFs), which forms an asynchronous divide-by-32 counter as the second (low-frequency) stage. The operating speed of prescalers is mainly limited by that of the divide-by-4/5 counter. Unlike the conventional architecture of the divide-by-4/5 counter, our approach is to preprocess the clock signal and to cascade the divide-by-two stages as shown in Fig. 1. The proposed circuit comprises a clock preprocessor (CP) and two TFFs. The clock preprocessor consists of a 'half transparent' (HT) register [6] at the front and a domino CMOS logic gate [7] at the rear. The HT register in its register mode (with a '0' input) is extremely fast; only about one inverter delay is required. In its transparent mode (with a '1' input), the inverse data directly returns to the input of the precharged stage (becoming '0') so that the output signal is delayed by one period of the input signal. If MC is set to '0', then MCx is always '1', and the domino gate is used as the buffer stage of the two-stage inverter and directly transports the signal to the next stage (TFF). The state in the HT register is not affected since its input CKx is the inverse of clock signal 'in'. In this situation, the output frequency equals f_in/4, where f_in is the frequency of the input signal 'in'. If MC is set to '1', the NAND gate forces MCx to be '1' or '0', depending on the nodes OUT2 and OUT; divide-by-5 operation can then be obtained. When the control signal MCx is '1', the CP acts just as a buffer and the output frequency equals one-fourth of the input frequency. However, when MCx is '0' (i.e. the outputs of these two TFFs are '1'), the N-logic block in the domino gate is forced to turn off. This causes the domino gate to hold the precharge state (i.e. C f i is '1'), although the signal 'in' is changed to '1'. Observe that node 'A' is changed to '0' through the N-C²MOS stage of the HT register. In the next half period of 'in' (which becomes '0'), node 'B' in the P-C²MOS stage of the HT register is precharged to '1'. At this time (node 'B' becoming '1'), it will not discharge the
In the divide-by-4/5 counter, the frequency limit of the architecture is set by the CP stage and the first TFF; they must operate at full speed. To date, the fastest standard CMOS DFFs are the dynamic circuits in [6 -
have a significant effect on VLSI circuit speed. This is shown by comparing circuit simulations with measurements of the critical path delay of a self-resetting SRAM. It is shown that including the measured high frequency noise in the circuit simulation leads to very accurate prediction of circuit speed.
The effect of power supply voltage on the speed of digital CMOS circuits is well understood. Simulations of circuits carried out for the purpose of predicting speed must, of course, take the power supply values into account. It is also understood that the switching activity of the circuit results in power supply noise on-chip; that is, the levels are different from the external values, due to resistive and inductive losses in the package and on-chip wiring. While it is possible, in principle, to model the effect of switching noise, in practice the problem is largely intractable. Instead, significant effort is spent in designing packaging to reduce the noise. Although package design cannot eliminate noise, the aim is to reduce it to such a level that the circuit designer can assume an average value of the power supply with a negligible amount of noise.
With good device models, and careful package design to minimise switching noise, it can be expected that simulations of circuit performance should accurately predict the circuit speed. However, it is shown in this Letter that short-term power supply noise can cause significant departures from predictions which assume a constant, steady-state value of the power supply. If the on-chip voltages vary on a time scale comparable to the circuit cycle time, the switching speed will be affected. Although circuit designers have long been aware of the importance of power supply changes, there have been no previous reports of measurements of power supply and device performance on the same short time scale. Because of the difficulty of simulating power supplies, measurement after circuit fabrication can greatly assist the designers in assessing the accuracy of their device models, and explaining performance discrepancies.
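To give a feel for why a short-lived supply droop matters, the sketch below uses the widely used alpha-power-law approximation for CMOS gate delay, t_d proportional to V_DD / (V_DD - V_T)^alpha. This model is not the one used in this Letter, and the supply voltage, threshold voltage and exponent are hypothetical values chosen only for illustration.

```python
def relative_delay(vdd, vt=0.6, alpha=1.5):
    """Gate delay up to a constant factor, from the alpha-power-law approximation."""
    if vdd <= vt:
        raise ValueError("supply must exceed the threshold voltage")
    return vdd / (vdd - vt) ** alpha


def slowdown(v_nominal, droop, vt=0.6, alpha=1.5):
    """Delay with the supply reduced by `droop`, relative to the nominal delay."""
    return relative_delay(v_nominal - droop, vt, alpha) / relative_delay(v_nominal, vt, alpha)


# Hypothetical example: a 2.5 V nominal supply with droops of 0 to 0.3 V.
for droop in (0.0, 0.1, 0.2, 0.3):
    print(f"droop {droop:.1f} V -> delay x{slowdown(2.5, droop):.3f}")
```

If the droop lasts for only part of a cycle, a simulation that assumes the average supply value sees none of this slowdown, which is the kind of discrepancy the measurements are intended to expose.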
The effect of rapid power supply variations is illustrated here for the case of a 500MHz self-resetting SRAM [1], for which the switching noise is potentially large, due to the large simultaneous switching activity in the circuit. The SRAM was built in a CMOS technology which was very well characterised [2], and for which highly accurate transistor models were developed. Furthermore, distinct parameters of the model were extracted for each chip of the wafers processed. By using the same die for measurement and simulation, an accurate simulation of the SRAM speed was expected.
Measurements of the SRAM speed were made by measuring waveforms of the internal nodes with a high-bandwidth electron-beam prober [3]. To minimise power supply noise, the duty cycle of the chip was very low during the measurements. Examples of some of the measured waveforms are shown in Fig. 1, in which a series of signals along the critical path of a 'read' of the SRAM,
