Abstract-A mm-wave digital transmitter based on a 60 GHz all-digital phase-locked loop (ADPLL) with wideband frequency modulation (FM) for FMCW radar applications is proposed. The fractional-N ADPLL employs a high-resolution 60 GHz digitally-controlled oscillator (DCO) and is capable of multi-rate two-point FM. It achieves a measured rms jitter of 590.2 fs, while the loop settles within 3 µs. The measured reference spur is only -74 dBc, the fractional spurs are below -62 dBc, with no other significant spurs. A closed-loop DCO gain linearization scheme realizes a GHz-level triangular chirp across multiple DCO tuning banks with a measured frequency error (i.e., nonlinearity) in the FMCW ramp of only 117 kHz rms for a 62 GHz carrier with 1.22 GHz bandwidth. The synthesizer is transformer-coupled to a 3-stage neutralized power amplifier (PA) that delivers +5 dBm to a 50 Ω load. Implemented in 65 nm CMOS, the transmitter prototype (including PA) consumes 89 mW from a 1.2 V supply.
I. INTRODUCTION

MILLIMETER-WAVE, frequency-modulated continuous wave (FMCW) radars are utilized in automotive, security and presence detection applications when high range resolution is required [1], [2]. Recent research efforts are focused on realizing a CMOS radar IC for various low-cost, high-volume applications based on triangular modulation [3]-[6]. In such an FMCW radar system, the achievable range and velocity resolutions depend on the transmitter bandwidth (BW) and the period ($T_{mod}$) of the linear frequency sweep, referred to as the linear chirp (see Fig. 1). For short-range detection, a modulation BW up to several gigahertz is required to obtain range resolution better than 10 cm ($\Delta R = c/(2\,\mathrm{BW})$, where $c$ is the speed of light) [2], [7]. A fast chirp is desired to keep the received baseband beat frequency outside the flicker noise region of active devices [2]. By contrast, a slow chirp (e.g., $T_{mod}$ of 10 ms) is required for high-resolution velocity detection, $\Delta v$, in a long-range scenario such as 77/79 GHz automotive radar ($\Delta v \propto c/(f_c\,T_{mod})$, where $f_c$ is the center operating frequency) [2]. Any nonlinearity in the frequency ramp results in an error when measuring the range, as the transmit signal is also used to detect the signal received from the target. Therefore, the frequency synthesizer for an FMCW radar must provide a carrier with high purity and, moreover, should generate wideband, ultra-linear modulation of the output frequency with a programmable modulation BW and period.
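The dependence of range resolution and beat frequency on the chirp parameters can be illustrated with a short numerical sketch. It uses the standard textbook FMCW relations (range resolution c/(2·BW), beat frequency equal to the chirp slope times the round-trip delay); the function names and the example numbers are illustrative assumptions, not values taken from the prototype.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def range_resolution(bw_hz):
    """Textbook FMCW range resolution: c / (2 * BW)."""
    return C_LIGHT / (2.0 * bw_hz)


def beat_frequency(slope_hz_per_s, target_range_m):
    """Beat frequency of a stationary target: chirp slope times round-trip delay."""
    round_trip_delay_s = 2.0 * target_range_m / C_LIGHT
    return slope_hz_per_s * round_trip_delay_s


if __name__ == "__main__":
    # Assumed example values, chosen only for illustration.
    print(f"range resolution for 1.5 GHz BW: {range_resolution(1.5e9) * 100:.1f} cm")
    print(f"beat frequency, 1000 GHz/s slope, 10 m target: "
          f"{beat_frequency(1e12, 10.0) / 1e3:.1f} kHz")
```

A steeper slope raises the beat frequency for a given target distance, which is why a fast chirp helps move the baseband signal away from the flicker-noise region.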
A charge-pump based phase-locked loop (PLL) is typically employed to linearize the voltage-controlled oscillator (VCO) tuning curve using negative feedback. It has proven to be more effective in controlling the chirp linearity than open-loop techniques [3]-[5]. Two methods of applying frequency modulation (FM) are shown in Fig. 2. FM is applied in Fig. 2(a) via modulation of a multi-modulus divider of a fractional-N PLL [5], while Fig. 2(b) employs an integer-N PLL with a direct digital frequency synthesizer (DDFS) as a reference to control the modulation ramp [4]. Both examples ([4] and [5]) achieve only 500 MHz modulation range for a short modulation period (on the order of 0.5 ms) on a 77 GHz carrier. A longer modulation period is limited by device flicker noise and leakage current via capacitors used in the analog blocks. The root-mean-square (rms) frequency error of the FMCW ramp (i.e., a measure of sweep linearity) is on the order of 500 kHz, while the synthesizer consumes more than 70 mW. In addition, the charge-pump PLL cannot be fully integrated on silicon at low cost due to the bulky analog loop filter, and it does not scale easily to future CMOS technology nodes.
Alternatively, the mixed-mode FMCW synthesizer of Fig. 2(c) [6] incorporates digital circuits in the majority of the PLL blocks to provide modulation reconfigurability (modulation slope, period, and range). However, the digital-to-analog converter (DAC), integrator, and the VCO still operate in the analog domain due to unavailability (in the past) of a high-performance digitally-controlled oscillator (DCO) at mm-wave frequencies. Moreover, building an analog integrator, which tracks the ever-changing rate of the zero-order-hold (ZOH) circuit at its input with high precision, is difficult. The fastest chirp achievable via the indirect VCO modulation (as in Fig. 2(a)-(c)) is limited by the PLL's closed-loop bandwidth. For triangular modulation with a 100-µs period (i.e., a 10 kHz fundamental), the PLL bandwidth should be larger than 1 MHz in order to pass through the loop filter the first 100 harmonics that are necessary to achieve 0.01% sweep linearity [8]. Thus, the loop bandwidth needed to accommodate the modulation could be much higher than that desired for an optimum phase noise of the PLL synthesizer (e.g., 200 kHz).
0018-9200 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
To overcome these challenges, we propose a digitally-intensive transmit modulator architecture based on a multi-rate all-digital PLL (ADPLL) operating at mm-wave frequencies. This architecture features extensive reconfigurability and wideband modulation capability. Our 60 GHz ADPLL-based FMCW transmitter prototype is implemented in high-volume 65 nm bulk CMOS [9], [10]. As the first reported mm-wave fractional-N ADPLL, it employs a fine-resolution 60 GHz DCO using distributed switched-metal capacitors as tuning elements. The 60 GHz ADPLL achieves in-band phase noise of -75 dBc/Hz while consuming 48 mW from a 1.2 V supply. The measured reference spur level (below -70 dBc) is much lower than in existing analog PLLs. A novel digital mapping algorithm further improves tuning step mismatch in the fine-tuning bank for modulation without utilizing additional dummy cells, which add unwanted parasitic capacitance and ultimately limit the synthesizer's maximum output frequency. The ADPLL performs autonomous gain calibration for the time-to-digital converter (TDC) and closed-loop DCO gain linearization in order to output a GHz-range triangular chirp with high sweep linearity. The measured FMCW slope is programmable between 300 and 5,000 GHz/s. The measured frequency rms error is only 117 kHz for a 1.22 GHz range chirp on a 62 GHz carrier, corresponding to a range resolution of 10 cm in a 60 GHz short-range radar.
This paper is organized as follows. Section II describes the multi-rate ADPLL-based frequency modulator architecture. Section III elaborates the design of the high-performance 60 GHz ADPLL, including key building blocks. The multi-bank DCO gain linearization is discussed in Section IV. Section V focuses on the multi-rate ADPLL operation. The 60 GHz prototype and its experimental results are discussed in Section VI, followed by conclusions in Section VII.
II. MULTI-RATE ADPLL-BASED FREQUENCY MODULATOR ARCHITECTURE
Fig. 2(d) shows a simplified block diagram of the 60 GHz ADPLL-based FMCW transmitter. Frequency modulation capability is incorporated directly into the ADPLL without the need for an up-conversion mixer. The ADPLL has a natural wideband FM capability [11], which can be realized as a two-point modulation scheme that has been demonstrated in numerous prototypes at low-gigahertz frequencies [12]-[15]. One data path directly modulates a DCO, while the other path compensates the frequency reference and prevents the modulating data from affecting the phase error. The former path has a high-pass characteristic, while the latter low-pass filters the signal. When both paths are combined (without any delay difference), an all-pass transfer function is realized. The maximum triangular modulation frequency is not limited by the PLL closed-loop bandwidth, and therefore an ultra-fast linear ramp can be synthesized. The slowly varying wander in the DCO frequency is corrected by negative feedback around the DCO. The all-digital loop consists of a TDC to estimate the variable phase, a frequency command word (FCW) accumulator to calculate the reference phase, an arithmetic subtractor to calculate the phase error, and a digital loop filter to control the ADPLL bandwidth and feedback transfer function characteristics. For the two-point FM, the DCO gain should be linear enough to meet the required FMCW rms frequency error, since the system does not rely on the feedback loop to linearize the frequency ramp. This is ensured by the DCO gain calibration and linearization algorithms described in Section IV.
For the two-point modulation scheme to work properly, the modulation data must be normalized accurately to the DCO gain ($K_{DCO}$) in the direct modulation path (i.e., $f_R/K_{DCO}$ in Fig. 2(d), where $f_R$ is the reference clock frequency). If the normalization is exact, the modulating transfer function is flat from dc to $f_R/2$ in the $z$-domain, and has only a sinc-type response in the $s$-domain caused by the zero-order hold at the DCO interface. Otherwise, the modulating transfer function is either somewhat low-pass or high-pass. Moreover, there are potential delay differences between the two paths due to layout routing in the IC, which can be observed from the post-layout simulation results and compensated for during the design stage.
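The effect of the DCO gain normalization on the two-point modulation response can be illustrated with a minimal sketch. It assumes, purely for illustration, a first-order loop so that the compensation path sees a low-pass response 1/(1 + s) and the direct path sees the complementary high-pass response s/(1 + s), with s normalized to the loop bandwidth; `gain_ratio` is a hypothetical parameter standing for the ratio of the actual DCO gain to its estimate. The loop in this work is a higher-order type-II design, so the numbers are only qualitative.

```python
def two_point_response(f_hz, loop_bw_hz, gain_ratio=1.0):
    """Combined modulation transfer function of an assumed first-order loop.

    gain_ratio = 1 models exact DCO gain normalization; other values model
    an over- or under-estimated DCO gain in the direct modulation path.
    """
    s = 1j * f_hz / loop_bw_hz          # normalized complex frequency
    h_low_pass = 1.0 / (1.0 + s)        # reference (compensation) path
    h_high_pass = s / (1.0 + s)         # direct DCO modulation path
    return h_low_pass + gain_ratio * h_high_pass


if __name__ == "__main__":
    loop_bw = 200e3  # assumed loop bandwidth in Hz
    for ratio in (0.9, 1.0, 1.1):
        mags = [abs(two_point_response(f, loop_bw, ratio)) for f in (1e3, 2e5, 1e8)]
        print(ratio, [round(m, 3) for m in mags])
```

With an exact normalization the magnitude stays at unity at all frequencies; a 10% gain error leaves the low-frequency response untouched but tilts the response above the loop bandwidth toward 0.9 or 1.1, which is the low-pass or high-pass behavior described above.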
The DCO is a highly-linear replacement of the VCO in Fig. 2(a)-(c). Fine frequency resolution is achieved by ΣΔ-dithering of unit-weighted variable capacitors using the high-speed down-divided DCO clock, as shown in Fig. 3. The 60 GHz ADPLL presented in this paper enables this all-digital synthesis for mm-wave FMCW radar applications and harnesses the power of digital signal processing to improve chirp linearity. In order to synthesize a linear chirp of several gigahertz in range, multiple DCO tuning banks of various tuning step sizes (i.e., different $K_{DCO}$) are employed. A closed-loop DCO gain linearization algorithm (proposed in Section IV) compensates for the process, voltage and temperature (PVT) variations of $K_{DCO}$, and the calibration data are stored in an SRAM look-up table. Upon modulation, a predistorted signal is applied in the data path of the DCO to obtain higher sweep linearity across a gigahertz modulation range.
The modulation data samples in this ADPLL operate at a high rate (CKM clock in Fig. 3) obtained by a low integer division of the variable DCO clock (CKV), which is independent of the phase detection rate at the reference frequency ($f_R$). Being digital, the modulating paths have clock-cycle precision in time. Thus, choosing a high rate for CKM reduces the instantaneous frequency error with respect to an ideal ramp in the synthesized FMCW signal, thereby improving the chirp linearity. The details of multi-rate operation are discussed in Section V.
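The benefit of a high CKM rate can be estimated with a small calculation. The sketch below assumes an otherwise ideal DCO whose frequency is updated once per CKM cycle and held in between, so the only error with respect to an ideal ramp is the staircase quantization; the function names are hypothetical, and the rms expression is the usual result for a uniformly distributed sawtooth error.

```python
import math


def step_per_ckm(slope_hz_per_s, f_ckm_hz):
    """Frequency increment applied in each CKM cycle for a given ramp slope."""
    return slope_hz_per_s / f_ckm_hz


def staircase_rms_error(slope_hz_per_s, f_ckm_hz):
    """RMS deviation (about its mean) of a held staircase from the ideal ramp."""
    return step_per_ckm(slope_hz_per_s, f_ckm_hz) / math.sqrt(12.0)


if __name__ == "__main__":
    slope = 5000e9  # 5,000 GHz/s, the steepest slope quoted in the text
    for f_ckm in (56e6, 450e6):  # approximate CKV/1024 and CKV/128 rates
        print(f"CKM = {f_ckm / 1e6:.0f} MHz: "
              f"step = {step_per_ckm(slope, f_ckm) / 1e3:.1f} kHz, "
              f"rms = {staircase_rms_error(slope, f_ckm) / 1e3:.1f} kHz")
```

Under these assumptions, raising the CKM rate by a factor of eight shrinks the staircase error by the same factor; the measured ramp error of the real transmitter also contains DCO nonlinearity and noise, which this sketch ignores.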
III. 60 GHz ADPLL DESIGN
Fig. 3 shows the block diagram of the 60 GHz FMCW transmitter implemented in this work, emphasizing the ADPLL [9]. The DCO, being at the heart of the ADPLL, oscillates directly in the 60 GHz band. It contains three tuning banks to provide 7 GHz tuning range and 1 MHz raw frequency resolution. An extra ΣΔ dithering bank operating at 1 GHz improves the frequency resolution to 400 Hz. The divide-by-32 prescaler output CKV/32 at 2 GHz oversamples the external frequency reference (FREF_in, ranging from 10 MHz to 100 MHz) to generate CKR as a synchronous system clock. CKR is used to accumulate the reference phase, as well as to sample the variable DCO phase to obtain the phase error at each discrete-time index of the FREF/CKR clock. In order to avoid metastability in the FREF retiming, FREF_in is oversampled by both the rising and falling edges of CKV/32 simultaneously, and an edge-selection signal derived from the TDC delay chain chooses the path farthest away from the metastable region. A simplified glitch removal circuit compares the absolute value of the phase-error jump with a half-integer threshold to correct potential misalignment between the integer part of the variable phase and the fractional phase error coming from the TDC. The phase error is fed to a reconfigurable, type-II 4th-order IIR loop filter (LF). The LF employs a gearshift technique to minimize the settling time by switching dynamically during frequency acquisition, while keeping the phase noise (PN) as low as possible in the steady state. A 3-bit slope control in the FREF slicer reduces the reference spur amplitude at the cost of a slight increase in the in-band PN. The built-in DCO gain ($K_{DCO}$) and TDC gain ($K_{TDC}$) calibrations are performed automatically to ensure a wideband FM response. Six 8-kbit SRAMs and other digital arithmetic blocks are also integrated on-chip to enable system debugging, and these memories are used to supply look-up tables when applying wideband FM.
A. Wideband High-Resolution 60 GHz DCO
The DCO is an essential building block of the 60 GHz ADPLL. To achieve a 10% tuning range and raw resolution of 1 MHz at 60 GHz, switched-metal capacitors distributed across a transformer-coupled resonator are employed in order to avoid using MOS varactors, which suffer from poor quality factor at 60 GHz [16]. Shorting metal shield strip pairs beneath a transmission line (TL) connected to the transformer primary winding increases its capacitance per unit length, thereby decreasing the oscillation frequency. The schematic of the 60 GHz DCO with its load is illustrated in Fig. 4(a). An NMOS cross-coupled pair sustains the oscillation. DCO tuning is segmented into coarse (CB), mid-coarse (MB), and fine (FB) tuning banks, each with a linear characteristic. Tuning bank MB bridges the gap in step size between the CB and FB banks. The CB and MB are integrated with the TL as configurable floating metal shields to form a compact, digitally-controlled frequency tuning scheme. A smaller tuning step is attained by placing metal strips on (lower) metal M6 compared to the coarse-tuning strips on metal M7. The fine-tuning bank (FB) is placed at the transformer secondary winding with a weak mutual coupling factor of 0.28 to attenuate its frequency tuning sensitivity by a factor greater than 10. The measured DCO oscillation range is from 56.4 to 63.4 GHz with coarse-tuning of 367 MHz/bit, mid-coarse-tuning of 35 MHz/bit, and fine-tuning of 1.64 MHz/bit.
To optimize the ADPLL operation in both continuous-wave (CW) and FM modes, the FB is split into two parts as shown in Fig. 4(b). One part, at the center of the TL, is dedicated to FM, and the other part, located above and below it, is used to correct DCO frequency wander in the loop at low rates. In this way, only the $K_{DCO}$ of the FM part needs to be calibrated accurately (e.g., 5% mismatch) and applied to the multiplier in the direct modulation path of Fig. 3. The loop-tracking part can tolerate more tuning-step mismatch (e.g., 15%) and only a rough approximation of the DCO transfer function (e.g., 20%) is required to establish an acceptable range for the closed-loop ADPLL bandwidth. The loop bandwidth affects mainly the settling time and noise rejection of the PLL, so a 5-25% variation would have minimal effect on the system performance [11]. Therefore, a $K_{DCO}$ of 1 MHz/bit is used in the closed-loop multiplier and the DCO gain ratios between different banks are scaled by powers of two to simplify multiplication to a right-bit-shift. Consequently, the system complexity and ADPLL loop delay are reduced, which improves the phase margin in wide bandwidth operation.
Device matching in the FM part of the FB is critical for distortion-free modulation. To improve device matching, dummy cells are normally added to the capacitor tuning bank in DCOs reported in the literature [17]. However, the need to minimize all parasitics does not permit the addition of dummy cells in a 60 GHz design. Moreover, the metal shield strips beneath the TL must be controlled monotonically to ensure an unambiguous $K_{DCO}$ for each bit. The unavoidable magnetic coupling between adjacent strips makes $K_{DCO}$ dependent on nearby states, which increases the tuning nonlinearity in the FB. To overcome these problems, a decoding scheme for the FB is proposed in Fig. 5. By default, half of the switches in each of the two parts are turned ON via re-centering of the fine-tuning bank after locking. The switches in the lower half of the loop-tracking part are ON ('1') and those in its upper half are OFF ('0'), both acting as dummies for the FM part in the center (state 0). When a small upward frequency drift appears in the loop, the fine-tuning bank advances to the next state (see Fig. 5) in response to the positive phase error. To track even more upward frequency drift, it then advances one state further. The switches are turned ON in the sequence shown in Fig. 5, and sufficient "virtual" dummy switches are attained for the FM part unless the frequency drift is so large that all of the switches in the loop-tracking part are ON, which should never happen in normal operation. A similar scenario for downward frequency drift is also shown in Fig. 5. Consequently, the $K_{DCO}$ of the FM part achieves less than 5% nonlinearity, and less than 0.1% variation in $K_{DCO}$ is measured relative to the expected nonlinearity with respect to the DCO center frequency, even without extra dummy cells.
To further improve the DCO's frequency resolution, a (programmable) 1st/2nd-order ΣΔ dithering at CKV/64 (about 1 GHz) is synthesized on chip using a digital standard cell library to reduce the DCO quantization noise to below -140 dBc/Hz at 1 MHz offset. This is significantly lower than the intrinsic DCO phase noise of -92 dBc/Hz at 1 MHz offset. For 1st-order ΣΔ dithering, the total synthesizer PN after dithering flattens off at -145 dBc/Hz from a frequency offset of 20 MHz. Employing the 2nd-order ΣΔ dithering shapes the noise to a higher frequency offset (i.e., 200 MHz), which is far away from the frequency band of interest. Increasing the dithering rate can further reduce the out-of-band noise floor introduced by dithering (e.g., -160 dBc/Hz for a 4 GHz dithering rate), but the ΣΔ modulator would then have to be custom designed to operate at such a high rate.
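How dithering trades switching rate for frequency resolution can be seen from a behavioral model of a first-order modulator. The sketch below is a generic accumulator-with-carry implementation, not the synthesized on-chip block; the 12-bit fractional word width is an assumption chosen only because 1.64 MHz divided by 2^12 is close to the 400 Hz resolution quoted earlier, and the source does not state the actual word width.

```python
def sigma_delta_first_order(frac_word, width_bits, n_cycles):
    """Generic first-order modulator: accumulate the fractional word, emit the carry."""
    modulus = 1 << width_bits
    acc = 0
    bits = []
    for _ in range(n_cycles):
        acc += frac_word
        if acc >= modulus:      # accumulator overflow -> switch the unit cell ON
            acc -= modulus
            bits.append(1)
        else:
            bits.append(0)
    return bits


def average_frequency_offset(bits, unit_step_hz):
    """Time-averaged frequency shift produced by toggling one unit cell."""
    return unit_step_hz * sum(bits) / len(bits)


if __name__ == "__main__":
    width = 12                   # assumed fractional word width
    bits = sigma_delta_first_order(1, width, 1 << width)
    print(f"smallest average step: {average_frequency_offset(bits, 1.64e6):.1f} Hz")
```

Over one full accumulator period the number of ON cycles equals the fractional word exactly, so the average frequency lands on a grid much finer than the raw unit step, while the quantization noise is pushed to offsets near the dithering rate.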
The 60 GHz DCO drives the divider chain directly and is transformer-coupled via the FB structure to a 3-stage neutralized PA. Thus, capacitive loading by the PA on the 60 GHz DCO core is attenuated by a factor of 10 via the weakly-coupled transformer. Extensive EM simulations of the complete DCO tank together with interconnections to the divider and PA were performed with the EMX simulator [18] in order to verify the correct DCO tuning characteristic.
B. Divider Chain Design
The block diagram of the divide-by-32 chain is shown in Fig. 6(a). It consists of a 60 GHz LC-based injection-locking frequency divider (ILFD), a high-speed current-mode-logic (CML) divider with a maximum toggle frequency of 34 GHz, a divide-by-4 CML stage, and a rail-to-rail CMOS divider. Detailed schematics for the first two divider stages are plotted in Fig. 6(b) and (c). These stages consume the most power in the entire chain due to the operation in the 60 and 30 GHz bands, respectively. In the ILFD, a single-ended 60 GHz signal is injected directly into an LC resonator via an injection transistor (12 µm / 60 nm). A dummy cell is added to balance the capacitive loading of the ILFD on the differential outputs of the DCO. A simulated locking range of 9 GHz is realized by employing a 350 pH tank inductor and minimizing the total parasitic capacitance in the tank [19]. The ILFD consumes 4.5 mA from a 1.2 V supply and delivers 500 mV single-ended peak voltage swing to the following CML divider. Switched-capacitor tuning of the ILFD free-running frequency (2 bits) extends the locking range (51-69 GHz), which is sufficient to cover the PVT variations. Each sub-band has a 9 GHz locking range, which is 30% larger than the DCO tuning range. The 2-bit band control is used to adjust the self-resonant frequency of the ILFD in case there is undesired frequency skew between the operating range of the ILFD and the DCO due to process variations. Once the correct sub-band is identified in the open-loop test (e.g., a factory calibration), the ILFD is programmed to that band and no other calibrations are needed for the ADPLL in normal operation.
The high-speed CML latch employed in the 30 GHz prescaler is shown in Fig. 6(c). The CML topology is employed since it can operate with a small input voltage swing (i.e., single-ended, peak-to-peak voltage approximately equal to the NMOS threshold voltage, 0.5 V) [20]. Thus, the ILFD output is large enough to drive the CML latch directly. Unlike the latch used in the following divide-by-4 stage, the tail current source is removed to increase the dc voltage drop on the resistive load (provided by a PMOS operating in the triode region) and the overdrive voltage of the cross-coupled transistors. The tracking and the latch stages are optimized separately for correct operation at-speed. The cross-coupled pair (6 µm wide FETs) provides the minimum required gain to hold the state, while contributing only a small parasitic capacitance at the output nodes. The simulated maximum divider operating frequency in the worst case (i.e., slow-slow process corner) is 34 GHz while consuming 8 mA from a 1.2 V supply. A PMOS transistor is used instead of an inductor load (as in [21], [22]) in order to place the latch close to the ILFD to form a compact layout. The last divide-by-2 stage in the prescaler (operating at 4 GHz) employs a CMOS divider with square-wave internal signals to achieve a much lower noise floor than the CML divider (e.g., -165 dBc/Hz). The entire divider chain, including interstage buffers, consumes 23 mA in total and operates properly together with the DCO in all the process corners.
C. Phase Detection and Glitch Removal
The 60 GHz ADPLL operates in a digitally-synchronous fixed-point phase domain. The reference phase is obtained by accumulating the FCW at each rising edge of the retimed frequency reference. For the variable phase signal (see Figs. 3 and 7), its integer part is determined by counting the number of rising clock transitions of the variable clock (CKV/32), while the fractional part is obtained by a TDC based on a pseudo-differential delay chain [23]. Samples of the variable phase signal are subtracted from the reference phase in a synchronous arithmetic phase detector to determine the phase error. As described in [24], the sampling moments of the accumulator value might not be the same as those of the TDC. The FREF clock provides triggering moments which sample both the counter and TDC outputs. These different sampling places could have a timing misalignment, indicated in Fig. 7(a), and thus cause glitches in the phase error when the counter and TDC outputs are combined (see Fig. 7(b)). Instead of correcting these glitches as in [24], a simple glitch removal method via digital processing is proposed, as shown in Fig. 7(c). The digital logic first detects a glitch by comparing the current phase error to that in the previous clock cycle. If the difference is larger than a threshold (e.g., 0.5), the input is assumed to contain a glitch. The phase error is then frozen for this clock cycle by simply disregarding the current phase error to obtain a glitch-free output (see Fig. 7(b)). In addition, the same logic can be re-used as a lock indicator, or to generate a clock quality monitoring signal by setting a different comparator threshold for debugging or monitoring purposes.
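The glitch removal rule described above can be sketched behaviorally as follows. This is a software model under one stated assumption: each new phase-error sample is compared with the last accepted (glitch-free) output rather than with the raw previous input, so that the sample following a single-cycle glitch is not itself flagged. The function name and the example sequence are hypothetical.

```python
def remove_glitches(phase_errors, threshold=0.5):
    """Hold the previous phase error whenever the new sample jumps by more than threshold."""
    cleaned = []
    last_good = None
    for sample in phase_errors:
        if last_good is not None and abs(sample - last_good) > threshold:
            # Jump larger than the threshold: treat as a counter/TDC
            # misalignment glitch and freeze the output for this cycle.
            cleaned.append(last_good)
        else:
            cleaned.append(sample)
            last_good = sample
    return cleaned


if __name__ == "__main__":
    # Phase error in units of CKV/32 cycles; the third sample carries a glitch.
    raw = [0.10, 0.12, 1.11, 0.13, 0.15]
    print(remove_glitches(raw))
```

Because a genuine large phase step would also be held by this rule, such a filter is only appropriate once the loop is close to lock, which is consistent with reusing the same comparison as a lock indicator.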
D. Frequency Reference Slicer With Output Slope Control
The reference slicer generates an on-chip frequency reference (FREF) clock for the PLL from an off-chip crystal oscillator. The thermal and flicker noise of the slicer contributes to the in-band PN of the PLL, and thus must be kept well below the contribution due to the TDC quantization noise in order not to be the dominant noise factor. The schematic of the slicer is shown in Fig. 8. The external 200 mV-pk reference signal is AC-coupled and amplified to a rail-to-rail swing by an inverter amplifier. The transistor channel length is sized at 1 µm to suppress flicker noise in both inverter transistors. A small-sized inverter replica is configured as a unity-gain buffer to provide the gate bias voltage for the main amplifier. The output of the main amplifier drives four independent parallel paths, each consisting of a digital inverter cascaded with a transmission gate. The overall driving capability at the slicer output is binary controlled (3 bits) by the transmission gates in each path, which controls the rising/falling slope of FREF. By slowing down the FREF transition time from 100 ps to 800 ps, higher-order harmonics generated during the transitions are attenuated and the reference spur is attenuated by at least 2 dB, as verified via PLL output spectrum measurements. At the same time, no degradation of the PLL in-band phase noise was detected, since it is still dominated by the TDC quantization noise. To achieve a low phase noise of -135 dBc/Hz at 10 kHz offset, any on-chip interference at the slicer input is minimized by placing the slicer close to its bondpad. The rail-to-rail swing of the slicer output is then fed to the TDC via an on-chip microstrip line of 100 µm length, which further slows down the rising edge of FREF as it is distributed on-chip.
E. Output Power Amplifier
The synthesizer output is transformer-coupled to the 3-stage power amplifier (PA) of Fig. 9. The cascode first stage minimizes loading on the DCO tank with good reverse isolation. The 2nd and the 3rd common-source stages use neutralization [25] to achieve higher gain and greater stability. An on-chip output balun provides a single-ended output to drive a 50 Ω load (i.e., antenna or test instrument input for characterization). The first stage of the PA is co-designed with the DCO in order to account for the load it places on the transformer for the fine-tuning bank. The self-resonant frequency of the FB including the PA load is set above 100 GHz to ensure that linearity of the frequency tuning step in the FB remains better than 5%.
IV. DCO GAIN CALIBRATION AND LINEARIZATION
To achieve a wide tuning range and good phase noise performance, switched-capacitor tuning of the DCO using multiple tuning banks is employed. Traversing multiple banks, each with distinct tuning characteristics, is unavoidable for the multi-GHz modulation required by the FMCW radar application. Wideband two-point modulation relies on an accurate DCO gain ($K_{DCO}$). Therefore, the DCO gain calibration and linearization techniques presented in this section are essential for generation of a linear frequency ramp.
A. Frequency Tuning Nonlinearity in a Multi-Bank DCO
As discussed in Section III, the 60 GHz DCO has a multi-GHz tuning range and 1 MHz raw resolution or step size. Ideally, a single tuning bank with constant $K_{DCO}$ across the modulation range would be desired. However, the DCO tuning must be segmented into CB, MB, and FB banks of progressively finer resolution (i.e., each with different $K_{DCO}$) to realize fine resolution and a wide tuning range simultaneously. Consequently, the wideband triangular modulation traverses through all three tuning banks, as shown in Fig. 10. Integer bits in the FM part of the FB are matched to within 5% by employing the "virtual" dummy bank configuration illustrated earlier in Fig. 5. Tuning step mismatches of 15% in MB and CB are much larger than in FB because no dummy cells are employed for coarse-tuning. Moreover, the $K_{DCO}$ in MB and FB varies with the CB tuning word due to the ultra-wide 7 GHz coarse-tuning range. When the capacitance increases by $\Delta C$, the oscillation frequency ($f$) will decrease by $\Delta f$, or approximately $\Delta f \approx f\,\Delta C/(2C)$ (i.e., $\Delta f \propto f^{3}\,\Delta C$ for a fixed tank inductance). Therefore, $K_{DCO}$ will vary with $f$ even for the same $\Delta C$, which is the case for a modulation frequency range up to a few gigahertz.
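The dependence of the tuning step on the oscillation frequency can be checked numerically with an ideal LC tank, for which the frequency is 1/(2π√(LC)). The sketch below is a first-order illustration only: the 100 pH inductance and the 1 aF capacitance step are assumed values, and the real resonator is a distributed, transformer-coupled structure.

```python
import math


def lc_frequency(l_h, c_f):
    """Resonance frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))


def capacitance_for(l_h, f_hz):
    """Tank capacitance that resonates with l_h at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * l_h)


def step_exact(l_h, f_hz, dc_f):
    """Frequency drop when dc_f is added to a tank tuned to f_hz."""
    c = capacitance_for(l_h, f_hz)
    return f_hz - lc_frequency(l_h, c + dc_f)


def step_approx(l_h, f_hz, dc_f):
    """Small-signal step f*dC/(2C), rewritten as 2*pi^2*L*f^3*dC."""
    return 2.0 * math.pi ** 2 * l_h * f_hz ** 3 * dc_f


if __name__ == "__main__":
    l_tank, dc = 100e-12, 1e-18   # assumed inductance and unit capacitance step
    low, high = 56.4e9, 63.4e9    # measured DCO range edges from the text
    ratio = step_exact(l_tank, high, dc) / step_exact(l_tank, low, dc)
    print(f"tuning-step ratio between band edges: {ratio:.3f}")
```

For this idealized tank the same capacitance step produces a noticeably larger frequency step at the top of the band than at the bottom, following the cube of the frequency ratio, which illustrates why a single gain value cannot serve the whole coarse-tuning range.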
In addition, the mismatches between the fractional bits (for dithering) and the integer bits in the fine-tuning bank are large because the three fractional bits in FB are located at the edge of the tuning bank (see Fig. 4(b) ). These mismatches result in a frequency error in the ramp generation and can degrade the ramp linearity. Therefore, nonlinearities in the DCO tuning curve must be calibrated and compensated for in real-time in order to implement the wideband triangular modulation shown in Fig. 10 .
B. Multi-Bank DCO Gain Linearization for Wideband FM
Published DCO gain calibration and linearization techniques for low-GHz ADPLLs include: 1) digital normalization based on a background algorithm that measures the phase error present in the loop [17], 2) adaptive gain compensation by a sign-LMS loop [14], [26], 3) DCO FB mismatch characterization via an open-loop method [27], and 4) signal predistortion in the direct modulation path by a polynomial fitting to correct for the nonlinearity in FB [13]. The first two methods provide background estimation of the DCO gain or the gain ratio between various banks, but are difficult to apply here since each bit in CB and MB has a different $K_{DCO}$, and the $K_{DCO}$ in FB also varies with CB settings as explained above. The mm-wave wideband DCO requires over 16 $K_{DCO}$ values to be calibrated in the background, which calls for an adaptation algorithm that is complicated to implement and may not converge to a stable solution. Consequently, a new closed-loop DCO gain calibration and linearization technique is proposed in this paper, with details shown in Fig. 11. For a triangular modulation of a given slope (set by the modulation range and the period of the triangular modulation), the output frequency change within each modulation clock (CKM) cycle is the slope divided by the modulation sampling rate. In the proposed calibration scheme, accurate DCO tuning words (OTW) are determined only in the vicinity of the bank-switchover points (see Fig. 11), instead of finding and storing accurate OTWs for each frequency along the chirp trajectory. This reduces the size of the look-up table by three orders of magnitude. To ensure monotonic tuning against PVT, the total tuning range is set to 1.75 times the frequency step size in MB. Thus, bank switchover may be performed at any frequency located in the overlap region shown in Fig. 11. The mid-point of the overlap region is chosen for a robust switchover.
When the upper and lower boundaries of the FB tuning word are determined for two adjacent switchover points, the average tuning step at each CKM cycle can be calculated as the difference between the two boundary tuning words divided by the total number of CKM cycles required to traverse the frequency span between those switchover points, as indicated in Fig. 11. The $K_{DCO}$ of the FM part of the FB is linear enough (less than 5% mismatch, see the measured result in Fig. 18) so that the average modulation step can be employed. Thus, only three variables need to be saved in SRAM for each switchover index. Note that two sets of the DCO tuning words are saved for each switchover point to implement hitless modulation. The entire closed-loop DCO gain calibration takes 4 seconds; it is performed only at power-up and repeated whenever necessary.
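The bookkeeping behind the average tuning step can be sketched as follows. This is a simplified reading of the scheme, not the on-chip implementation: the function names, the boundary tuning words, and the chosen slope are hypothetical, and the sketch covers only one segment between two switchover points.

```python
def cycles_for_span(span_hz, slope_hz_per_s, f_ckm_hz):
    """Number of CKM cycles needed to sweep span_hz at the given ramp slope."""
    return round(span_hz * f_ckm_hz / slope_hz_per_s)


def average_step(otw_lower, otw_upper, n_cycles):
    """Average fine-bank tuning-word increment per CKM cycle within one segment."""
    return (otw_upper - otw_lower) / n_cycles


def segment_tuning_words(otw_lower, otw_upper, n_cycles):
    """Fine-bank tuning words applied on successive CKM cycles of one segment."""
    step = average_step(otw_lower, otw_upper, n_cycles)
    return [otw_lower + i * step for i in range(n_cycles)]


if __name__ == "__main__":
    # Assumed example: one 35 MHz mid-coarse step swept at 1000 GHz/s with a
    # 450 MHz CKM clock, between hypothetical calibrated fine-bank boundaries.
    n = cycles_for_span(35e6, 1e12, 450e6)
    words = segment_tuning_words(10.0, 31.5, n)
    print(n, words[0], words[-1])
```

Storing only the two boundary words and the cycle count per switchover, instead of one tuning word per CKM cycle, is what keeps the look-up table small; the fractional part of each generated word would be realized by the dithering of the fine bank.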
The above calibration data are very stable over temperature and supply voltage variations. The switchover points only depend on the ratio of the two tuning banks, which is determined by the switched-capacitor ratio and is insensitive to both temperature and supply voltage variations. The linearity of the FB tuning steps is independent of temperature and supply voltage. The average $K_{DCO}$ of the FB varies by 0.4% when the temperature changes from 0 to 80 °C, and by 0.2% when the DCO supply voltage changes from 1.1 V to 1.3 V. Thus, the FMCW chirp linearity degradation with temperature and supply voltage variations (for a specific set of calibration data) is much less than the frequency error due to the calibration accuracy itself (e.g., the $K_{DCO}$ of the FM part of the FB has a 5% nonlinearity). During FM, the feedback loop must maintain the DCO in lock with the desired channel frequency without changing MB and CB. The DCO oscillation frequency varies by 84 MHz when the temperature changes from 0 to 80 °C and by 32 MHz when the supply voltage changes from 1.1 V to 1.3 V. Thus, the entire tuning range of the FB (130 MHz) is sufficient to cover the above temperature and supply voltage variations. After splitting the FB into two parts, only the part reserved for the loop is used to track the DCO frequency drift. Thus, once the DCO frequency change induced by temperature and supply voltage exceeds the tuning range of that part (70 MHz in this test chip, which can be enlarged), a new calibration is needed.
In addition to the above multi-bank linearization algorithm, the DCO gain calibration based on the digital normalization process described in [17] is also implemented on-chip to obtain the $K_{DCO}$ of the FB only, which is used when a frequency-shift-keying (FSK) test signal with a maximum frequency deviation within the range of that bank is applied to the ADPLL for debugging purposes.
C. Mismatch Calibration of Fine-Tuning Bank's Fractional and Integer Bits
The mismatch of the fractional tuning bit in the FB (1st-order ΣΔ) with respect to the average tuning step of the integer bits is characterized in an open-loop manner by a forced ON/OFF toggling of the fractional bit, as suggested in [27]. Since small capacitance fluctuations in the DCO tank result in proportional frequency fluctuations, changes in capacitance resulting from ON/OFF switching of the dithering bit are evaluated by subtracting frequency measurements performed at each of the two states. This open-loop configuration is used since each toggling procedure addresses a specific fine-tuning bit (i.e., tuning capacitor), which could not be easily controlled through the normal modulation capability of the ADPLL. The frequency measurements are based on a counter within the ADPLL, and multiple readings of the counter are averaged to reduce the quantization error in a single measurement of frequency deviation, especially in the presence of the DCO phase noise. The same method is used to toggle each thermometer bit in CB, MB, and FB open-loop, in order to characterize the DCO tuning curve and evaluate the mismatch in each tuning bank.
V. MULTI-RATE ADPLL FOR FMCW RADAR
Fig. 12 elaborates on the multi-rate two-point FM in the 60 GHz FMCW transmitter. The direct modulation path operates at a high clock rate (CKM), which is a down-divided DCO clock, to obtain high sweep linearity. The modulated DCO output frequency advances by one step in each CKM cycle, and this step scales with the ramp slope and the CKM period. Thus, CKM is configurable from CKV/128 (about 450 MHz) to CKV/1024 (about 56 MHz) to minimize power consumption according to the required modulation ramp slope. The compensation path is applied to the frequency reference and operates at the retimed reference clock (CKR) rate. During FM, CKV varies linearly with time and so does CKM. The two functional parts of the ADPLL-based frequency modulator, which are the phase error calculator and the data modulator, have their own separate clock domains: FREF and CKV, respectively. Since their frequency relationship is a time-varying fractional number, their interfaces normally require sampling rate converters. However, this is not necessary in this architecture because the system clock CKR is always synchronized to the modulator clock CKM via re-sampling of FREF by CKV/128. The part of the fine-tuning bank used for data modulation is physically separated from the part used for phase error corrections. The coarse-tuning banks CB and MB, on the other hand, are controlled by the direct data path during modulation through a multiplexer (see Fig. 12). Therefore, no sampling rate conversion is required for the DCO tuning word.
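The CKM rates quoted for the direct modulation path follow directly from dividing the DCO clock. The sketch below assumes a CKV of 57.6 GHz, chosen only because it reproduces the quoted 450 MHz and 56 MHz figures; the helper name `ckm_rate_hz` is hypothetical.

```python
def ckm_rate_hz(f_ckv_hz, divider):
    """Modulation clock rate for a given CKV divider (128 to 1024 per the text)."""
    if not 128 <= divider <= 1024:
        raise ValueError("divider outside the CKV/128 to CKV/1024 range")
    return f_ckv_hz / divider


F_CKV_HZ = 57.6e9  # assumed DCO frequency, within the 56.4 to 63.4 GHz locking range

fastest_ckm = ckm_rate_hz(F_CKV_HZ, 128)   # 450 MHz
slowest_ckm = ckm_rate_hz(F_CKV_HZ, 1024)  # 56.25 MHz
```

Because CKM is derived from CKV, its rate moves with the carrier during a chirp, which is why the quoted figures are approximate.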
During modulation, a state-machine controls the access to SRAMs and reads out the proper data before the bank switchover. Operation at high speed is simplified to just an accumulation of the and a comparator to generate the bank switchover events. Meanwhile, a frequency step equal to 32 (32 is the division ratio in the feedback loop) is added as compensation to the frequency reference at every CKR to obtain the wideband FM output. As explained earlier, the mismatch of the dithering bit in FB is obtained via an open-loop calibration, and compensated for in the direct path using the logic highlighted by the dotted line in Fig. 12 , as suggested in [28] . The compensation mechanism is based on a digital gain correction factor that is applied to the 10-bit fractional tuning word before it is fed to the fractional tuning unit, where it is converted into the appropriate dithering signal. The correction factor can be represented as a sum of one and the normalized mismatch error ( ). Accordingly, it is implemented using a reduced-size multiplier followed by an adder. The magnitude of the error has been limited to 8 bits, allowing for a dynamic range of mismatch errors of up to 25% and a theoretical resolution of 0.1%, which is more than sufficient. In addition, the fractional unit in (see Fig. 4(b) ) is sized in this design so that the compensating factor is always a fraction, thereby avoiding a potential overflow.
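The multiplier-plus-adder gain correction described above can be sketched in integer arithmetic. Two assumptions are made here and are not confirmed by the text: the 8-bit error magnitude has an LSB weight of 2^-10 (which gives a 0.098% resolution and a 24.9% maximum, matching the quoted 0.1% and 25%), and the error is always subtracted so that the correction factor stays below one. The function name `correct_fractional_word` is hypothetical.

```python
FRAC_BITS = 10  # width of the fractional tuning word (from the text)
ERR_BITS = 8    # width of the stored mismatch-error magnitude (from the text)


def correct_fractional_word(frac_word, err_mag):
    """Scale the fractional word by (1 - err_mag / 2**FRAC_BITS).

    Assumption: the error is subtracted, so the result never exceeds the
    input word and cannot overflow the 10-bit range.
    """
    if not 0 <= frac_word < (1 << FRAC_BITS):
        raise ValueError("fractional word must fit in 10 bits")
    if not 0 <= err_mag < (1 << ERR_BITS):
        raise ValueError("error magnitude must fit in 8 bits")
    product = (frac_word * err_mag) >> FRAC_BITS  # reduced-size multiplier
    return frac_word - product                    # adder (here a subtraction)


resolution_percent = 100.0 / (1 << FRAC_BITS)                         # about 0.098
max_error_percent = 100.0 * ((1 << ERR_BITS) - 1) / (1 << FRAC_BITS)  # about 24.9
```

For example, a mid-scale word of 512 with an error code of 51 (about 5%) is reduced to 487.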
VI. EXPERIMENTAL RESULTS
The 60 GHz ADPLL-based transmitter is fabricated in TSMC 65 nm LP CMOS with 1 poly and 7 metal layers. The die micrograph is shown in Fig. 13. The 2.2 mm² total die area includes bondpads, PA, SRAMs and other digital circuitry for debugging. The ADPLL core occupies 0.5 mm² and consumes 40 mA: 11 mA for the DCO, 23 mA in the frequency prescaler, and 6 mA for the TDC and the digital part, while the PA dissipates 34 mA, all from a single 1.2 V supply. The ADPLL chip is wire-bonded to a printed circuit board (PCB) providing dc, digital control and low RF connectivity, while the single-ended 60 GHz PA output is measured via on-die probing, as shown in Fig. 14. After de-embedding cable and probe losses, +5 dBm into 50 Ω is observed at the PA output with a gain flatness of 1 dB across the measured locking range of 7 GHz. A 100 MHz off-chip crystal oscillator provides the frequency reference to the PLL, and is also used to synchronize the spectrum analyzer in order to obtain accurate measurements of the output spectrum (e.g., spur levels). An external harmonic mixer is used at the front-end of the spectrum analyzer to measure the 60 GHz ADPLL output. The noise produced by this mixer dominates the noise floor of the test setup. In addition, a 2 GHz test output (after divide-by-32, CKV/32) is also accessed via the PCB, providing a convenient way to characterize the 60 GHz ADPLL without on-die probing. Most of the measurement results presented in this section are obtained from the divider output and are related to the 60 GHz synthesizer output by the division ratio (i.e., 32, or 30.1 dB).
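As a quick consistency check, the current breakdown above can be summed and converted to power. The dictionary keys are illustrative labels only.

```python
SUPPLY_V = 1.2  # single supply voltage (from the text)

# Current consumption per block in mA (from the text).
currents_ma = {
    "dco": 11.0,
    "prescaler": 23.0,
    "tdc_and_digital": 6.0,
    "pa": 34.0,
}

adpll_core_ma = sum(v for k, v in currents_ma.items() if k != "pa")
total_ma = sum(currents_ma.values())
total_power_mw = total_ma * SUPPLY_V
```

The ADPLL core adds up to 40 mA and the complete transmitter to 74 mA, i.e., about 88.8 mW, in line with the 89 mW total.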
A. ADPLL in Continuous-Wave (CW) Mode Operation
The measured spectrum at the mm-wave TX output and the close-in spectrum of the carrier are plotted in Fig. 15(a) and (b), respectively, when the ADPLL is locked at 60.099878 GHz. A very low reference spur level of -74 dBc is observed, with no other significant spurs detectable at the ADPLL RF output for a frequency span of up to 2 GHz. When the phase error glitch removal logic is disabled, the skirt level of the PLL output spectrum increases by 2 dB. The measured worst-case reference spur is -72.4 dBc across the 7 GHz locking range from 56.4 to 63.4 GHz. Any out-of-band fractional spurs would be filtered out heavily by the type-II, 4th-order IIR loop filter. The in-band fractional spurs are not detected since they are buried within the phase noise spectrum (see Fig. 15(b)) and masked by the noise of the external harmonic mixer. However, they can be observed at the divide-by-32 test output and are below -62 dBc, which is a very low level [29]. The in-band fractional spurs, despite being insignificant, matter only for near-integer channels since the PLL nominal loop bandwidth is 300 kHz. They do not pose any problem for the FMCW synthesizer since the RF output is always modulated, and therefore stays at a particular frequency only for a short time (e.g., several CKR cycles).
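Why fractional spurs matter only for near-integer channels can be seen from a small calculation. The sketch assumes the usual fractional-N behavior in which the dominant fractional spur sits at an offset equal to the distance of the carrier from the nearest integer multiple of FREF; the function name is hypothetical.

```python
F_REF_HZ = 100e6    # reference frequency (from the text)
LOOP_BW_HZ = 300e3  # nominal loop bandwidth (from the text)


def fractional_spur_offset_hz(f_out_hz, f_ref_hz=F_REF_HZ):
    """Offset of the carrier from the nearest integer multiple of FREF."""
    ratio = f_out_hz / f_ref_hz
    return abs(ratio - round(ratio)) * f_ref_hz


offset = fractional_spur_offset_hz(60.099878e9)  # channel used in Fig. 15
is_in_band = offset < LOOP_BW_HZ
```

For the 60.099878 GHz channel the offset is about 122 kHz, i.e., inside the 300 kHz loop bandwidth, so this is a near-integer channel in which the loop filter gives no attenuation of the spur.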
The close-in spectrum of the 60 GHz carrier (see Fig. 15(b)) indicates that the PN at 1 MHz frequency offset is -88.5 dBc/Hz (-41.58 - 10 log10[50 kHz]). To simplify the measurements, the PN of the ADPLL is measured at the divide-by-32 (CKV/32) test output (see Fig. 16 for the PN measured at various loop bandwidths), and thus 30.1 dB (i.e., 20 log10[32]) should be added to refer the PN to the mm-wave (i.e., 60 GHz) output. For a nominal loop bandwidth of 300 kHz, the measured PN at the divide-by-32 output is -118 dBc/Hz at 1 MHz offset, which agrees well with the PN obtained at the PA output shown in Fig. 15(b). Compared to the free-running DCO PN of -92 dBc/Hz at 1 MHz offset, the synthesizer PN degrades by 3.5 dB under this (nominal) type-II loop filter configuration. The integrated PN (10 kHz to 10 MHz) at the divide-by-32 output is -45.9 dBc (see Fig. 16) for the loop bandwidth of 300 kHz. This corresponds to an integrated PN of -15.8 dBc (-45.9 + 30.1) at the 61.87 GHz output and an rms jitter of 590.2 fs. The measured integrated PN at the PLL output varies by 1 dB across the 7 GHz locking range for the same loop bandwidth.
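The phase-noise bookkeeping in this paragraph can be reproduced with three small helpers. The jitter conversion assumes that the integrated PN is a single-sideband figure, so the rms phase error is sqrt(2 * 10^(L/10)); the function names are hypothetical.

```python
import math


def normalize_to_1hz(level_dbc, rbw_hz):
    """Convert a level read in a given resolution bandwidth to dBc/Hz."""
    return level_dbc - 10.0 * math.log10(rbw_hz)


def refer_to_output(pn_db, division_ratio):
    """Refer PN measured after a divider back to the undivided output."""
    return pn_db + 20.0 * math.log10(division_ratio)


def rms_jitter_s(integrated_pn_dbc, carrier_hz):
    """RMS jitter from single-sideband integrated PN (assumed convention)."""
    phase_rms_rad = math.sqrt(2.0 * 10.0 ** (integrated_pn_dbc / 10.0))
    return phase_rms_rad / (2.0 * math.pi * carrier_hz)


pn_1mhz = normalize_to_1hz(-41.58, 50e3)          # about -88.6 dBc/Hz
pn_referred = refer_to_output(-118.0, 32)         # about -87.9 dBc/Hz
integrated_60g = refer_to_output(-45.9, 32)       # about -15.8 dBc
jitter_fs = rms_jitter_s(integrated_60g, 61.87e9) * 1e15
```

With the values quoted in the text, the jitter evaluates to roughly 590 fs, consistent with the reported 590.2 fs.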
The PN of the FREF clock (after slicing) measured via a dedicated test output is -124 and -134 dBc/Hz at 1 kHz and 10 kHz offsets, respectively. The measured slicer PN degrades by only 1 dB at the slowest rise-time setting (800 ps). The TDC is self-calibrating during normal operation of the ADPLL for PVT inverter delay variations. The average time resolution of the TDC is 12.2 ps at a 1.2 V supply, which corresponds to a theoretical in-band PN of -78 dBc/Hz for a 60 GHz carrier when only the TDC quantization noise is considered. As seen from Fig. 16, the measured in-band PN is -77 dBc/Hz in the 60 GHz band for a loop bandwidth of 1.5 MHz, and thus is dominated by the TDC quantization noise. The measured PN at the nominal loop bandwidth (see Fig. 16) is sufficient for the targeted short-range FMCW radar applications [4]-[6]. To further reduce the integrated PN in order to meet more stringent PN requirements for other 60 GHz applications (e.g., IEEE 802.11ad with 16QAM modulation), a TDC with finer resolution (e.g., 4 ps) can be used to lower the in-band PN. Consequently, the in-band PN will be reduced by 9.7 dB and the loop bandwidth can also be widened to further suppress the PN of the DCO and reduce the integrated PN at the synthesizer output. A resolution of 4 ps can be easily obtained in 65 nm CMOS by employing well-known high-resolution TDC techniques, e.g., the vernier delay line [30], [31], [13], a two-step TDC combining coarse and fine stages [32], [33], [15], a gated ring oscillator [34], or an interpolation-based TDC [35].
To achieve fast locking and excellent PN after settling simultaneously, the loop bandwidth is dynamically controlled via a gearshift technique [17]. During frequency acquisition, the loop operates in type-I, with a wide bandwidth of 1.5 MHz. It is then switched hitlessly to the type-II, 4th-order IIR filter with the 300 kHz bandwidth only when it enters tracking mode. The measured lock-in time is less than 3 µs for a frequency step of 512 MHz, as demonstrated in Fig. 17.
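The TDC-limited in-band phase noise quoted above can be estimated with the expression commonly used for ADPLLs, L = 10 log10[ (2π)²/12 · (Δt/T_V)² / f_R ], where Δt is the TDC resolution, T_V the carrier period and f_R the reference frequency. Using this particular expression is an assumption here, since the original equation is not available in the text, and the function name is hypothetical.

```python
import math


def tdc_inband_pn_dbc_hz(tdc_res_s, carrier_hz, f_ref_hz):
    """In-band PN from uniform TDC quantization noise (assumed standard formula)."""
    ratio = tdc_res_s * carrier_hz  # TDC resolution relative to the carrier period
    noise = (2.0 * math.pi) ** 2 / 12.0 * ratio ** 2 / f_ref_hz
    return 10.0 * math.log10(noise)


pn_12ps = tdc_inband_pn_dbc_hz(12.2e-12, 60e9, 100e6)  # about -77.5 dBc/Hz
pn_4ps = tdc_inband_pn_dbc_hz(4e-12, 60e9, 100e6)
improvement_db = pn_12ps - pn_4ps                      # about 9.7 dB
```

Under these assumptions the 12.2 ps resolution gives about -77.5 dBc/Hz, close to the quoted theoretical and measured values, and moving to 4 ps improves the in-band PN by 20 log10(12.2/4), i.e., about 9.7 dB.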
The digital-to-frequency conversion linearity of the fine-tuning bank is measured by toggling each thermometer bit ON/OFF to obtain the individual tuning step, as described in Section IV-C. The measured tuning step of each bit has less than 5% mismatch (see Fig. 18 for four different IC samples), which is linear enough for the FMCW generation.
As the first-ever reported digital synthesizer at 60 GHz, the performance of this ADPLL is compared in Table I to leading 60 GHz analog PLLs [36]-[38], as well as an ADPLL operating at 40 GHz [39]. It is the only 60 GHz CMOS PLL capable of fractional-N synthesis. It exhibits excellent in-band and out-of-band PN performance, fast locking, and an ultra-low reference spur when compared to the prior art. Moreover, it is also capable of wideband FM.
It should be noted that the analog PLL in [38] designed for high-data-rate communication systems achieved lower in-band PN. As discussed earlier, the TDC resolution can be easily reduced to 4 ps to improve the integrated PN if desired. Moreover, a higher reference frequency (e.g., 135 MHz in [38] ) can also be used to further suppress the TDC and reference noise. Compared to the feedback prescaler in [38] , the divide-by-32 chain in this implementation consumes much more power to provide the 9 GHz continuous (without band switching) operating range, which is not required in [38] since only four channel frequencies were generated (integer-PLL). The divider chain is overdesigned for robustness and can be optimized for lower power.
B. ADPLL in Two-Point Frequency Modulation
Two-point modulation at the FREF rate employing only the DCO fine-tuning bank is demonstrated by FSK modulating the 60 GHz carrier at a rate of 50 kHz with a maximum frequency deviation of 40 MHz. The DCO gain of the fine-tuning bank and the TDC gain are calibrated automatically via digital averaging techniques [12], [23]. The calibrated DCO gain is then applied to the gain normalization multiplier in the direct modulation path (see Fig. 2(d)). The demodulated signal is measured using a Rohde & Schwarz FSUP signal source analyzer with FM demodulation firmware, and the waveform measured at the CKV/32 output is shown in Fig. 19(a). The sharp transition edges in the step response (which require many harmonics) confirm the wideband FM capability and demonstrate the effectiveness of the built-in calibration. The multiplier is then perturbed intentionally from the self-calibrated optimum to demonstrate the effect of an incorrect DCO gain estimate. It can be seen from Fig. 19(b) that a larger multiplier factor introduces strong overshoot in the rising/falling edges of the demodulated signal, while an overdamped waveform is observed for a smaller multiplier in Fig. 19(c).
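The overshoot and overdamped behavior seen with a perturbed multiplier can be reproduced qualitatively with a deliberately simplified model. It assumes a first-order loop (not the type-II, 4th-order filter of the actual design), in which the direct path sees a high-pass response and the reference path the complementary low-pass response; `gain_ratio` is the applied multiplier divided by its ideal value, and all names are hypothetical.

```python
import math


def step_response(gain_ratio, loop_tau_s, times_s):
    """Normalized frequency step response of two-point modulation.

    Direct path: gain_ratio * exp(-t/tau)  (high-pass through the loop)
    Reference path: 1 - exp(-t/tau)        (low-pass through the loop)
    """
    return [1.0 + (gain_ratio - 1.0) * math.exp(-t / loop_tau_s) for t in times_s]


TAU_S = 1.0 / (2.0 * math.pi * 300e3)     # time constant for a 300 kHz bandwidth
times = [i * 0.1e-6 for i in range(50)]   # 0 to 4.9 us

ideal = step_response(1.0, TAU_S, times)      # flat: the two paths cancel exactly
too_large = step_response(1.2, TAU_S, times)  # overshoot, then decay to 1
too_small = step_response(0.8, TAU_S, times)  # slow approach to 1 from below
```

When the gain estimate is exact, the two paths sum to an all-pass response and the step is reproduced with sharp edges; any gain error leaves a residual that the loop removes only at its own bandwidth.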
To generate an FMCW chirp with high sweep linearity, a multi-rate two-point modulation is adopted in which the modulation paths operate at a higher rate than FREF (i.e., programmable from CKV/128 to CKV/1024). All three tuning banks (CB, MB and FB) are traversed when the modulation range is wider than 300 MHz. The closed-loop, multi-bank DCO gain linearization technique presented in Section IV is performed automatically prior to modulation, and the resulting look-up table is stored in 24 kbit SRAMs, all completed within 4 seconds. Both slow and fast modulation slopes are used to characterize the chirp linearity by programming various values for the modulation range (BW) and period. Fig. 20(a) shows the FMCW chirp spectrum measured at the PA output for a 1.22 GHz modulation range centered at 62.1 GHz. The modulation period is 8.2 ms, forming a slow, triangular chirp. CKM is configured at CKV/1024 to reduce the power consumption of the digital part by 20% compared to operating it at CKV/128. The instantaneous output frequency of the FMCW synthesizer and the frequency error compared to an ideal chirp are also plotted in Fig. 20(b). The rms frequency error is only 117 kHz. Fig. 21(a) shows the result when the modulation slope is 4 times faster than the case in Fig. 20(b). The rms error in the frequency chirp is 148 kHz (not including the turn-around points). For an ultra-fast chirp (1 GHz change in 210 µs, plotted in Fig. 21(b)), the frequency error degrades to 384 kHz.
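The slow-chirp numbers above translate into a per-cycle frequency step as sketched below. The calculation assumes that the 8.2 ms period covers both the up and the down ramp of the triangle, and that CKV sits at the 62.1 GHz chirp center; both are assumptions, and the function names are hypothetical.

```python
def triangular_chirp_slope_hz_per_s(bandwidth_hz, period_s):
    """Ramp slope, assuming one period holds an up ramp and a down ramp."""
    return bandwidth_hz / (period_s / 2.0)


def step_per_ckm_cycle_hz(slope_hz_per_s, f_ckm_hz):
    """Frequency increment applied to the DCO in each CKM cycle."""
    return slope_hz_per_s / f_ckm_hz


BW_HZ = 1.22e9           # modulation range (from the text)
PERIOD_S = 8.2e-3        # modulation period (from the text)
F_CKM_HZ = 62.1e9 / 1024  # CKV/1024 at the chirp center (assumed)

slope = triangular_chirp_slope_hz_per_s(BW_HZ, PERIOD_S)
step_hz = step_per_ckm_cycle_hz(slope, F_CKM_HZ)  # about 4.9 kHz
relative_rms_error = 117e3 / BW_HZ                # about 9.6e-5
```

Under these assumptions each CKM cycle moves the carrier by roughly 4.9 kHz, and the 117 kHz rms error corresponds to about 0.01% of the swept bandwidth.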
The performance of the 60 GHz all-digital FMCW synthesizer is summarized in Table II . Compared to state-of-the-art FMCW generators, the all-digital architecture reported in this paper achieves wider modulation range for varying modulation slopes, and better phase noise with lower power consumption.
VII. CONCLUSION
A 60 GHz fractional-N all-digital PLL (ADPLL)-based transmitter capable of multi-rate, two-point frequency modulation (FM) is implemented in 65 nm bulk CMOS. Compared to analog-intensive approaches, the all-digital architecture features extensive reconfigurability and allows auto-calibration. The mm-wave DCO exploits distributed switched-metal capacitors for frequency tuning and achieves a wide tuning range (measured at 10%) with fine frequency resolution (1 MHz). To execute ultra-linear FM, the ADPLL can calibrate and linearize the multi-bank DCO tuning curve to less than 10 kHz of frequency granularity within 4 seconds. The high-speed modulation clock is programmable to optimize power consumption for various chirp slopes. The measured rms frequency error (ramp nonlinearity) of the FMCW signal is only 117 kHz for a 62 GHz carrier with 1.22 GHz bandwidth. The ADPLL prototype driving a 3-stage transformer-coupled, neutralized power amplifier delivers +5 dBm into a 50 Ω load, while consuming 89 mW from a single 1.2 V supply. As the first 60 GHz ADPLL ever reported, it achieves an rms jitter of 590.2 fs, ultra-fast settling (3 µs), and very low reference spur levels (-74 dBc) with no other significant spurs observed. It also demonstrates ultra-linear FMCW chirp generation, which makes it an excellent choice for FMCW synthesis and an attractive architecture for mm-wave frequency synthesis in general.
