Abstract-This paper presents a millimeter-wave (mm-wave) phase-locked loop (PLL) with an output frequency centered at 54.65 GHz. It demonstrates a mode-switching architecture that considerably improves the lock time by seamlessly switching between a low-noise mode and a fast-locking mode that is used only during settling. The improvement counteracts the increased lock time caused by cycle slips, which result from using a high reference frequency of 2280 MHz, several hundred times the loop bandwidth. Such a reference frequency alleviates the noise requirements on the PLL and is readily available in 5G systems from the radio frequency PLL. The mm-wave PLL is implemented in a low-power 28-nm fully depleted silicon-on-insulator CMOS process, and its active area is just 0.19 mm². The PLL also features a novel double injection-locked divide-by-3 circuit and a charge-pump mismatch compensation scheme, resulting in state-of-the-art power consumption and jitter performance in the low-noise mode. In this mode, the in-band phase noise is between −93 and −96 dBc/Hz across the tuning range, and the integrated jitter is between 176 and 212 fs. The total power consumption of the mm-wave PLL is only 10.1 mW, resulting in a best-case PLL figure-of-merit (FOM) of −245 dB. The lock time in low-noise mode is up to 12 µs, which is improved to 3 µs by switching to the fast-locking mode, at the temporary expense of a power consumption increase to 15.1 mW, an integrated jitter increase to between 245 and 433 fs, and an FOM increase to between −235 and −240 dB.

Index Terms-Charge pump (CP), CMOS, divide-by-3, fast lock time, 5G, frequency synthesizer, ILFD, injection-locked divider, local oscillator (LO), low phase noise, low power, millimeter-wave (mm-wave), phase-locked loop (PLL), 60 GHz.
Designing PLLs for millimeter-wave (mm-wave) communication standards is challenging in many respects. Their phase noise limits the highest achievable modulation order [1], and they also need to settle quickly and operate at low power in battery-operated devices [2]-[4]. In particular, a fast settling time is required for communication standards that support very high data rates, to avoid losing large amounts of data during frequency locking. In frequency-modulated continuous-wave radar applications, a fast-settling PLL is required to be able to receive baseband signals at higher frequencies and, hence, reduce the impact of the high flicker-noise corner in short-channel technologies [3], [4].
A frequency synthesizer operating at about 60 GHz can be used for direct conversion transceivers in the unlicensed 60-GHz band, where, for instance, the WiGig/IEEE 802.11ad standard [2], future 5G standards [5], and high-precision radars [3] will reside. Such a frequency synthesizer, followed by a divide-by-2 circuit, can also be used to generate quadrature signals for 28-GHz front ends intended for emerging 5G applications. It can also be used as the local oscillator (LO) in dual-frequency conversion transceivers, by using the 60-GHz PLL output signal together with the divided 30- or 20-GHz signals present in the PLL, depending on whether the first division step after the voltage-controlled oscillator (VCO) divides by 2 or 3. Higher frequencies in the E-band can then be targeted. Clearly, high-performance and low-cost 60-GHz PLLs have many potential applications.
The anticipated 5G wireless communication systems will support mm-wave links together with lower frequency cellular ones. This means that an mm-wave PLL operating with a high reference frequency ($f_{REF}$) may have another PLL as its input, instead of a crystal oscillator. This has been the case in recent works and prior art, such as [6]-[8]. Similarly, a radio frequency (RF) $f_{REF}$ can be provided by direct digital synthesis, which has been demonstrated in several state-of-the-art submillimeter frequency imaging radar systems [9]-[11]. Increasing $f_{REF}$ is beneficial for the PLL noise performance, because when the $f_{REF}$ noise is added to the PLL in-band noise, it is first multiplied by the PLL division ratio squared. If the loop bandwidth is kept low, a PLL with high $f_{REF}$ may also decrease the current consumption in the charge pump (CP). However, if not addressed properly, this approach will inevitably lead to problems with prolonged settling time. An additional advantage of using the PLL for cellular bands as an input to the mm-wave PLL in 5G systems is that the RF PLL can provide the fine-grain resolution required for channel selection and/or modulation and, thus, simplify the design of the mm-wave PLL. The architecture of such a system is shown in Fig. 1.

Fig. 1. Example of an mm-wave frequency generation architecture for 5G applications that utilizes the presence of an RF PLL.
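The benefit of a high $f_{REF}$ stated above can be illustrated with a short calculation. The following is only a sketch: the 40-MHz value stands in for a hypothetical crystal reference and does not come from this paper, while the 54-GHz output and 2.25-GHz reference are taken from the measurements reported later.

```python
import math

def reference_noise_gain_db(division_ratio: float) -> float:
    """Gain applied to reference phase noise inside the loop bandwidth.

    The reference noise power is multiplied by N^2, i.e. 20*log10(N) in dB.
    """
    return 20.0 * math.log10(division_ratio)

F_OUT_HZ = 54e9  # output frequency used in the measurements

# 40 MHz is a hypothetical crystal-like reference, used only for comparison.
for f_ref_hz in (40e6, 2.25e9):
    n = F_OUT_HZ / f_ref_hz
    gain_db = reference_noise_gain_db(n)
    print(f"f_REF = {f_ref_hz / 1e6:7.1f} MHz, N = {n:7.1f}, "
          f"reference noise gain = {gain_db:5.1f} dB")
```

The comparison ignores the fact that the two reference sources would have different phase noise of their own; it only shows how the multiplication factor shrinks as the division ratio is reduced.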
In this paper, a new PLL architecture is presented. Tuned to 55 GHz, the PLL is aimed at 5G applications, but it also has the more general goal of demonstrating an architecture that balances low noise, low power consumption, and fast settling time, under the restriction that $f_{REF}$ is several hundred times the PLL bandwidth. The demonstrated PLL also features a novel double injection-locked divide-by-3 circuit that achieves a wide lock range at low current consumption. Furthermore, the CP includes a novel current mismatch mitigation technique based on negative feedback. To support rail-to-rail output signals and, hence, a wider VCO frequency range, the CP also features an improved operational amplifier that allows operation over a large common-mode range. The presented PLL shows competitive performance at a power consumption of just 10.1 mW, a value which to our knowledge is the lowest presented at such a high output frequency. The PLL architecture is first introduced in Section II. A detailed description of the circuit design is then presented in Section III. The measurement results together with a comparison to the state of the art are then presented in Section IV. Finally, this paper is concluded in Section V.
II. PLL ARCHITECTURE
A conventional type-II PLL is shown in Fig. 2(a). Increasing $f_{REF}$ is an effective technique to achieve lower phase noise [6]-[8]. Traditionally, using a higher $f_{REF}$ means that the loop bandwidth can be higher, which, in turn, leads to an improved lock time. However, when the reference frequency becomes very high, the loop bandwidth is no longer limited by the reference frequency. Instead, if minimum jitter is targeted, the loop bandwidth should be chosen as the frequency where the VCO phase noise and the extrapolated in-band phase noise at the PLL output intersect. Generally, for state-of-the-art mm-wave PLLs, a typical bandwidth for optimum noise performance is on the order of a few megahertz. If $f_{REF}$ is then chosen to be about 2 GHz, the ratio of $f_{REF}$ to loop bandwidth becomes very high. To still get the desired bandwidth when using such a high $f_{REF}$, either the loop filter (LF) capacitances must be increased, or the CP current ($I_{CP}$) must be reduced. Increasing the size of the LF leads to a large area for the capacitors, while reducing $I_{CP}$ is an attractive way of reducing the overall power consumption. However, this will also limit the available output current from the CP that charges the filter capacitances, and the PLL lock time will be severely degraded due to so-called cycle slips, originating from nonlinear effects in the transient when the PLL is out of lock-in range [12]. An example of this is shown in Fig. 3, where the simulated VCO control voltage settling behavior for two PLLs with the same bandwidth is presented. One uses an $f_{REF}$ that is eight times higher, and to keep the loop bandwidth unchanged, $I_{CP}$ is then reduced eight times. In the case of a high $f_{REF}$-to-bandwidth ratio, it can be seen that the cycle slips prevent the PLL from approaching the correct frequency exponentially and that the settling time is significantly prolonged.
It is, thus, clear that aggressively increasing $f_{REF}$ to obtain better noise performance affects the settling time and that low phase noise and fast settling time are contradictory in this case. To solve this conflict of requirements, we propose a PLL architecture that can achieve fast settling by disregarding the noise performance during PLL settling, and then seamlessly shift to a low-noise mode at steady state, thereby achieving both fast settling and low noise. The proposed PLL architecture, which features two such optimized modes of operation, is shown in Fig. 2. The fast-locking mode is enabled when setting $SW_{fast}$ to logic 1. Both multiplexers then forward the signals with an extra division by $N_2$, so that the PFD compares $f_{REF}/N_2$ with $f_{VCO}/(N_1 N_2)$. At the same time, the amplitude of the current pulses fed to the LF from the CP becomes $N_2 I_{CP}$. This mode has an increased maximum current that can charge the capacitors in the LF, which yields faster settling. The low-noise mode is enabled when setting $SW_{fast}$ to logic 0. Both multiplexers then forward $f_{REF}$ and $f_{VCO}/N_1$ without the extra division. The amplitude of the CP current pulses is at the same time reduced to $I_{CP}$. The reduced CP current increases the settling time, but the total power consumption is reduced and the decrease in total division ratio improves the in-band phase noise of the PLL. In this paper, $N_2$ is chosen to be 8, large enough to demonstrate the efficiency of the mode-switching architecture.
It is important to note that for either mode setting, the small-signal PLL characteristics remain the same. One CP is switched on or off, but a seamless transition between the modes is possible because the steady-state value of $V_{ctrl}$ is the same in both modes and no LF reconfiguration is needed. The signal $SW_{fast}$ can be easily generated when the current operating frequency is to be changed. As will be demonstrated by measurements in Section IV, the mode transition is indeed seamless and does not generate any sudden transients. Therefore, the change back to the low-noise mode can be performed simply after a predefined delay, without any need for calibration. As the measurements in Section IV suggest, a safe choice for the delay is around 2.5 μs.
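The invariance of the small-signal behavior can be checked with a few lines of arithmetic. The sketch below assumes the usual charge-pump PLL result that the loop gain is proportional to $I_{CP}/N$ with an unchanged loop filter and VCO gain; the class and function names are illustrative only, and the numbers ($N_1 = 24$, $N_2 = 8$, $I_{CP}$ = 300 μA) are taken from the settings reported in Section IV.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Mode:
    name: str
    n_extra: int       # extra division N_2 on both PFD inputs (1 = bypassed)
    i_cp_amps: float   # amplitude of the charge-pump current pulses

def loop_gain_factor(mode: Mode, n1: int) -> float:
    """Mode-dependent part of the loop gain: I_CP / N_total.

    Assumption: loop filter and VCO gain are identical in both modes,
    so only this ratio can change the small-signal response.
    """
    return mode.i_cp_amps / (n1 * mode.n_extra)

N1, N2, I_CP = 24, 8, 300e-6

low_noise = Mode("low-noise", n_extra=1, i_cp_amps=I_CP)
fast_lock = Mode("fast-locking", n_extra=N2, i_cp_amps=N2 * I_CP)

for mode in (low_noise, fast_lock):
    print(f"{mode.name:12s} I_CP/N = {loop_gain_factor(mode, N1):.3e} A, "
          f"peak charging current = {mode.i_cp_amps * 1e6:.0f} uA")

same_gain = math.isclose(loop_gain_factor(low_noise, N1),
                         loop_gain_factor(fast_lock, N1))
print("same small-signal loop gain:", same_gain)
```

The ratio $I_{CP}/N$ is identical in both modes, while the peak current available for charging the LF is $N_2$ times larger in fast-locking mode, which is the property that shortens the cycle-slipping part of the transient.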
An alternative to the fast-settling PLL mode would be to use a fast digital oscillator calibration, in both cases followed by linear settling. In [13], such a technique is used, with a successive approximation register. The frequency divider of the PLL is then set to the target value, and the divided frequency with different digital oscillator control words is measured using a counter. The time of the counting is proportional to the required precision. For instance, using 7-bit precision to provide the margin against cycle slips in the low-noise mode requires counting about $2^7 = 128$ cycles. Since the clocks are not synchronized, an error of up to one cycle can occur [13]. This means that the measurement interval must be doubled. With a 2-GHz clock, each measurement will then require 128 ns. With a 7-bit control word, seven such measurements are needed, i.e., one per bit. The total calibration time will then be about 0.9 μs. As can be seen in [13], the linear settling following the calibration will then have a significant transient and require additional time, compared to the proposed technique that is close to transient-free. In general, it seems the two techniques can provide similar settling times. However, implementing the digital frequency tuning in the oscillator to support the calibration would result in increased complexity and parasitics in the sensitive mm-wave oscillator. The overall circuit simplicity and robustness of operation is a key advantage of the proposed technique.
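The calibration-time estimate above follows from simple arithmetic, reproduced here as a minimal sketch. The function name and the `async_margin` parameter are illustrative; the factor of two models the doubled measurement interval described in the text.

```python
def sar_calibration_time_s(bits: int, clock_hz: float, async_margin: int = 2) -> float:
    """Estimated time for a counter-based SAR frequency calibration.

    Each measurement counts async_margin * 2**bits clock cycles, and one
    measurement is needed per bit of the control word.
    """
    cycles_per_measurement = async_margin * 2 ** bits
    time_per_measurement_s = cycles_per_measurement / clock_hz
    return bits * time_per_measurement_s

t_total = sar_calibration_time_s(bits=7, clock_hz=2e9)
print(f"per measurement: {t_total / 7 * 1e9:.0f} ns, total: {t_total * 1e6:.2f} us")
```

With 7 bits and a 2-GHz clock this gives 128 ns per measurement and roughly 0.9 μs in total, matching the figures quoted above.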
III. PLL IMPLEMENTATION AND CIRCUIT DETAILS

A. VCO and Divider Chain
The PLL was implemented in a low-power, 28-nm CMOS silicon-on-insulator (SOI) process. The VCO and divider chain are shown in Fig. 4. The phase-noise performance of the cross-coupled pair VCO is improved by using the tail current filtering technique [14], which is effective at 60 GHz as well [15]. The VCO output is fed differentially to an injection-locked frequency divide-by-3 circuit (ILFD), as well as to two single-ended buffers, for measurement purposes. Simulations show that the full differential VCO voltage swing of 2.7 V peak to peak can be preserved, even if the VCO output signal is also required to drive additional, mainly capacitive loads, such as divide-by-two ILFD circuits or a mixer. The differential output of the ILFD is fed directly to a differential latch-based divide-by-2 circuit, followed by true-single-phase-clock (TSPC) dividers and a multiplexer [16]. The first stage of the TSPC divider divides the frequency of the input by 4 or 6, depending on whether the Sel control signal is set to logic 0 or 1, respectively. This signal is used to perform step response measurements. Depending on whether the PLL is to operate in low-noise mode or fast-settling mode, the control signal $SW_{fast}$ is set to logic 0 or 1, respectively. The sinusoidal reference signal is buffered and converted to a square wave on chip, and fed to the PFD through a multiplexer in a configuration similar to that in the divider chain path.
For mm-wave PLLs, it is preferable to use a division ratio higher than 2 in the first feedback divider stage after the VCO, if it can be achieved without increasing the power budget. However, at such frequencies, as explored in [17] and [18], special architectures suited for mm-wave operation have to be employed. One such architecture that has lately gained increased attention due to its attractive properties, such as low power consumption and small area, is the mm-wave dynamic current-mode logic divider. The operation at mm-wave frequencies is possible because the memory elements in the latches are the parasitic capacitances of the active devices. Unfortunately, this kind of divider is sensitive to process, voltage, and temperature (PVT) variations. Injection locking is a more attractive technique in terms of robustness, since LC tanks are then used to tune the divided frequency output, and ILFDs with higher division ratios such as three have been demonstrated [19]-[21]. However, current consumption and locking range remain major concerns when increasing the division ratio, due to less effective current injection. The ILFD presented in this paper addresses these issues.
Differential ILFD divide-by-two operation is typically based on injecting a current signal at a frequency close to the second harmonic of the ILFD LC resonance ($f_0$), at the source of the current commutation MOS pair, as shown in Fig. 5(a) [22]. The current signal is then injected by the tail transistor $M_1$ that acts as a transconductance. Note that $M_2$ and $M_3$ act as a single balanced mixer with a maximum conversion gain of $G = 2/\pi$, downconverting the injected signal to a frequency close to $f_0$ at the LC tank. A more general observation is that injection locking division is achieved by injecting a harmonic signal to the circuit that by some mechanism results in a current signal at the drains of $M_2$ and $M_3$, with a frequency close to $f_0$ and a magnitude sufficiently large to pull the oscillator to that frequency. An upper bound of the frequency locking range for the ILFD, as shown in Fig. 5(a), is expressed by [22], [23]
where $Q$ is the quality factor of the divider's LC tank, $I_{inj}$ is the magnitude of the injected current, and $I_{osc}$ is the magnitude of the free-running oscillator current at $f_0$. In (1), $G = 2/\pi$ is used assuming $M_2$ and $M_3$ switch on/off abruptly [23]. Injecting a voltage $V_{inj}$ at the gate of $M_1$ makes the frequency locking range $\omega_{range}$ become
It is important to note that $\omega_{range}$ is referred to the output frequency of the divider. Clearly, a way to improve the lock range without compromising the performance, such as by reducing $Q$, is to maximize $I_{inj}$ (i.e., to increase the injection efficiency).
To make the ILFD divide by 3, the current can be injected directly to the output using differential back-to-back MOS devices, as shown in Fig. 5(b) [21], [24]. In this paper, a novel double injection-locked divide-by-3 circuit is proposed [see Fig. 5(c)]. The proposed divider uses double injection to achieve increased injection efficiency, enabling a wide tuning range at a reduced current consumption. The triode-multiplier constituted by devices $M_4$ and $M_5$ multiplies the injected voltage ($V_{inj}$) at frequency $f \approx 3 f_0$ with that of the divider output, tuned to $f_0$. It was shown in [25] that the output voltage of the triode-multiplier is comprised of even-order harmonics. In this case, the multiplier is excited with rail-to-rail signals from the VCO, and therefore, it acts as an efficient voltage-mode single-balanced mixer with a differential input and a single-ended output. The output of the multiplier is simply the even-order intermodulation terms, which, in the locked state, are at frequencies close to $2 f_0$, $4 f_0$, and higher even-order harmonics. These tones are, in turn, fed to the tail current device, which acts as a transconductance that injects current mainly at $\approx 2 f_0$ to the source of the active pair ($M_2$ and $M_3$). This results in an additional injection mechanism similar to that of the divide-by-2 ILFD shown in Fig. 5(a). Higher order harmonics are not significant, as they are suppressed by the circuit. The triode-multiplier also injects a current signal at the first harmonic, $I_{inj,direct}$, directly to the output. Hence, the lock range of the divider is proportional to the sum of the lock ranges of the two dividers shown in Fig. 5(a) and (b). Including the transfer of the passive mixer, $V_{inj} = V_{out} \cdot 2/\pi$, in (2) and adding the direct injection gives
Equation (3) presents an upper bound to the lock range of the double injection-locked divider, indicating that the lock range can be increased considerably compared to the direct-injection-only circuit, without any current penalty. The current budget can also be reduced significantly since, with the second injection path established, the first injection devices [$M_4$ and $M_5$ in Fig. 5(c)] can be small, increasing the LC resonator impedance. Further efficiency enhancement is also achieved due to the push-push regime in which the tail current source $M_1$ is operated. The designed divider covers the frequency range of the VCO with a margin, to account for PVT variations. Simulations of the divider sensitivity with single injection and with the proposed double injection are shown in Fig. 6. At the same power consumption, with an input power of 5 dBm and at a fixed varactor voltage of $V_{DD}/2$, the double injection increases the locking range from 2.5% to 8.5%, i.e., by a factor of 3.4. All possible spurs created by the ILFD are harmonics of the divided signal and, since the output is differential, the odd-order harmonics dominate. Directly after the ILFD are the inverting buffers that aim to make the signal even more square-wave shaped and, thus, increase the odd harmonics even further. The spurs of the ILFD are thus of little concern in this case. The simulated phase noise of the ILFD at 10-MHz offset from the carrier is below −141 dBc/Hz, with a noise of −146 dBc/Hz close to its self-resonance frequency.
The ILFD is followed by a divide-by-2 circuit as shown in Fig. 5(a) . The latch proposed in [26] , shown in Fig. 7 , can be used directly after the ILFD, thanks to the high-speed 28-nm technology, while the last divider stages are implemented in TSPC logic.
B. PFD-CP
The three-state PFD used in this paper is TSPC-based and produces UP/DOWN pulses, which are converted to differential signals by inverters. Transmission gates are used to match the inverter delay, thereby mitigating imbalance between the differential signals.
The CP schematic is shown in Fig. 8, using a differential architecture ($M_1$-$M_6$) for high speed and reduced charge sharing. The UP/DOWN currents are matched to a first order through a 1:1 current mirror ($M_1$, $M_6$, $M_{11}$-$M_{13}$). However, due to the reduced feature size, the channel length modulation causes a mismatch between the UP and DOWN currents at high and low values of the output voltage $V_{ctrl}$. This results in different UP and DOWN pulse widths at steady state, where the pulse with less current must become wider to compensate for the current difference and produce zero net charge to the LF. This issue results in increased spur levels and noise contribution from the CP, as described in the following.
In [28] and [29], it was shown that high-frequency noise folding due to CP gain mismatch can result in a large increase in PLL in-band phase noise, especially in fractional-N PLLs with strong ΔΣ-modulation noise. This effect is made even more pronounced by charge injection, which further increases the gain mismatch in the crossover region, resulting in more noise folding. To reduce the charge injection, an operational amplifier OP1 is, therefore, used to make the voltage of the dummy node in the differential CP track $V_{ctrl}$ [30]. Furthermore, with increased $f_{REF}$, the contribution of the CP to the PLL phase noise becomes more significant. This in-band contribution is [31], [32]
where $k$ is Boltzmann's constant, $T$ is the absolute temperature, $\gamma$ is the MOS gamma factor, $g_m$ is the transconductance of the devices in the 1:1 current mirror, $n_T$ is the number of MOS devices used to copy current to the CP, $f_c$ is the $1/f$ noise corner of the MOS devices, and $T_p$ is the current pulsewidth in steady state (in this paper $T_p \approx 20$ ps).
This relation indicates that the noise contribution of the CP remains constant if $I_{CP}$ is scaled down by the same factor as the PLL division ratio $N$. However, $T_p$ is dependent on the speed of the latches in the PFD and, hence, does not scale with $f_{REF}$.
As the reference period $T_{REF}$ is reduced, the CP phase noise contribution will, therefore, increase, both the thermal noise and even more so the $1/f$ noise. A mismatch in UP/DOWN currents will result in wider pulses, as the minimum pulsewidth is set by the PFD reset delay. The pulses will then contain more charge and contribute more phase noise. There are, thus, two mechanisms that cause increased phase noise due to CP mismatch: high-frequency noise folding and LF noise injection. This motivates a technique to counteract the mismatch. To reduce the mismatch between UP and DOWN currents, compensation of the channel length modulation effect is required [33]. The schematic of the CP is shown in Fig. 8, where an additional dc current branch ($M_7$-$M_{10}$) has been introduced, with 1:1 replicas of ($M_1$, $M_3$, $M_5$, $M_6$). A negative feedback loop, using amplifier OP2 with a push-pull output stage (including $M_{15}$), controls the UP current so that the dc current branch outputs $V_{ctrl}$. Since the nMOS and pMOS currents are equal in the dc branch, and it has the same output voltage as the CP and replicated devices, the UP and DOWN currents must also be equal.
The operational amplifiers used in the CP need to have high gain as well as the capability to handle rail-to-rail common-mode signal levels. This is crucial as the CP output voltage range limits the PLL frequency range. The schematic of a conventional amplifier proposed in [27] is shown in Fig. 9(a). All the devices are biased in weak inversion for low current consumption, keeping in mind that only low-frequency signals are processed. The input differential pair limits the minimum common-mode input voltage, and an improved version is shown in Fig. 9(b), where the input stage is a differential pair in parallel with a differential cross-coupled pMOS source follower. As the common-mode input voltage drops below a threshold voltage, the common-source gain is drastically reduced. However, inserting the pMOS source follower in parallel, which turns on under these conditions, preserves some of the gain. Since the differential pair common-source amplifiers are loaded with diode-connected devices, the voltage gain from the gate to drain is not high, making the gain of the common source and the source follower more similar. This technique has also been used in [25] to improve the linearity of operational transconductance amplifiers. Simulation of the voltage gain with and without the proposed technique is shown in Fig. 10. As can be seen in the figure, using a pMOS source follower helps to provide gain even at 0-V common-mode input. Traditionally, the amplifier shown in Fig. 9(a) has been used without phase compensation; however, in this case, the high gain of the dc loop required compensation, realized by the resistor-capacitor link at the output. The 20-pF capacitor creates a dominant pole, and the resistor creates a high-frequency zero, advancing the phase.
The CP was simulated when driven by two in-phase signals with a frequency of 2 GHz, to resemble the PLL locked state with DIV and REF matched in phase and frequency. The CP output was forced to a fixed voltage and the output current was observed. The net output current then represents the mismatch between UP and DOWN currents. The simulated mismatch versus CP output voltage with and without the proposed CP mismatch correction loop is shown in Fig. 11. As can be seen, the current mismatch is reduced to less than 0.1% in the range from 0.2 to 0.8 V. The excess mismatch close to the supply (1 V) and ground voltages is due to the $V_{dsat}$ drop required over the tail current devices $M_1$ and $M_6$ shown in Fig. 8.
IV. MEASUREMENT RESULTS
Before designing the complete PLL, a stand-alone ILFD was designed and fabricated. For comparison, it shares a chip with an identical ILFD, but without the double injection path. Both the ILFDs and the subsequent PLL were fabricated in the STMicroelectronics 28-nm ultrathin body and buried oxide fully depleted SOI CMOS process with ten metal layers of interconnect, and with the metal-insulator-metal capacitance option. The active area of each divider is 0.05 mm² and the total area of the PLL chip is 0.9 × 0.9 mm², of which the active area is just 0.17 mm². The ILFD die microphotograph is shown in Fig. 12, and that of the PLL chip overlaid with a layout image is shown in Fig. 13. Both chips were mounted on FR-4 printed circuit boards, to which all needed supply, bias, and signal pads were wire-bonded. Only the mm-wave signals were probed, using Infinity microprobes from Cascade Microtech. The PLL measurement setup is depicted in Fig. 14. The input reference frequency was generated by an Agilent E4438C signal generator with low-noise option UNJ. For the phase noise measurements, an FSWP phase noise analyzer from Rohde & Schwarz with harmonic mixers for the 50-75-GHz band was used. The FSWP also provided the setup with low-noise supply voltages. An FSU50 spectrum analyzer with a harmonic mixer, also from Rohde & Schwarz, was used for the output spectrum measurements. The loop control voltage measurements used a 4-GHz 20-GS/s Rohde & Schwarz RTO 1044 digital oscilloscope. Pulses for the mode switch control were generated by a WW2572A 250-MS/s waveform generator from Tabor Electronics. To avoid introducing noise on the sensitive loop control voltage node, the oscilloscope was disconnected during phase noise measurements. Measurement results from the ILFD are shown in Fig. 15, and the two versions are compared in Table I.
When utilizing the full range of the varactor, the measured locking range is increased from 13% for the single injection circuit to 17.5% for the double injection circuit. Even though these ILFD measurements use an input signal of less than 0 dBm due to measurement setup limitations, which is below the expected VCO output level in the PLL, both ILFD versions demonstrate a wide locking range, covering the VCO frequencies. However, using the single injection ILFD in the PLL would allow for almost no drift in tuning due to PVT variations between the VCO and the ILFD. Since they will be tuned together by the same control voltage, the key concern is how sensitive the mm-wave parts of the PLL are to unforeseen tuning mismatch, and the robustness is dependent on the locking range of the ILFD for a fixed varactor control voltage. At a fixed varactor voltage of $V_{DD}/2$, the measurements show that the double injection technique increases the lock range by more than a factor of 2, from 2.4% to 5.2%.
When measuring the PLL, deviations from the simulated performance were found. One such difference was that the out-of-band noise was worse than anticipated from measurements of earlier stand-alone VCOs in the same CMOS process [15], possibly due to the decision to use a single-ended output buffer for the VCO signal, or due to noise coupling to the loop control voltage node. This was addressed by increasing the PLL bandwidth. The in-band noise was also higher than expected from simulations, which was alleviated by increasing the CP current and complementing the internal LF with additional capacitance, connected externally. The external components chosen were a capacitor twice the size of the internal one, in series to ground with a resistor half the size of the internal one. If implemented on chip, the extra LF would add approximately 0.02 mm² of active area, making the new total active area 0.19 mm².
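The effect of the external branch can be illustrated with a small numerical check. This is a sketch under assumptions not stated in the text: the internal LF is taken to have a series RC branch from the control node to ground, the external branch is taken to connect to the same node in parallel, and any ripple capacitor or bond-wire parasitics are ignored. The component values are hypothetical placeholders.

```python
import math

def series_rc_admittance(r_ohm: float, c_farad: float, f_hz: float) -> complex:
    """Admittance of a resistor in series with a capacitor at frequency f."""
    s = 2j * math.pi * f_hz
    return s * c_farad / (1 + s * r_ohm * c_farad)

# Hypothetical internal loop-filter values, for illustration only.
R_INT, C_INT = 2e3, 20e-12

def max_relative_error(freqs_hz):
    """Compare (R, C) || (R/2, 2C) against a single (R/3, 3C) branch."""
    worst = 0.0
    for f in freqs_hz:
        y_parallel = (series_rc_admittance(R_INT, C_INT, f)
                      + series_rc_admittance(R_INT / 2, 2 * C_INT, f))
        y_single = series_rc_admittance(R_INT / 3, 3 * C_INT, f)
        worst = max(worst, abs(y_parallel - y_single) / abs(y_single))
    return worst

freqs = [10 ** e for e in range(4, 10)]  # 10 kHz to 1 GHz
print("max relative error:", max_relative_error(freqs))
```

Because both branches have the same time constant $RC$, their parallel combination behaves like one branch with three times the capacitance and one third of the resistance, so under these assumptions the stabilizing zero stays at the same frequency.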
All PLL performance measurements were taken using the same settings applied to a single chip at room temperature. Subsets of the full measurement set were also taken on three additional chips, to investigate variations between the samples. The supply voltages for the VCO, ILFD, 20-GHz latch, and the output buffer were set to 0.8 V, and the supply voltages for the low-frequency dividers, PFD, CP, and CP amplifiers were set to 1.06 V. The regular and large CP currents were set to 300 μA and 8 × 300 = 2400 μA, respectively. The total power consumption of the PLL was then measured to be 15.2 mW in fast-locking mode, and 10.1 mW in low-noise mode. The measured power consumption of each contributor is visualized in Fig. 16. It is apparent that the reduced CP current results in a significant power consumption reduction when the PLL operates in low-noise mode. However, the full effect cannot be clearly seen in the figure since the CP shares its supply with the PFD, which in low-noise mode operates at a frequency eight times higher, hence consuming more power. Since the PLL will be in low-noise mode nearly all the time, the power in low-noise mode is presented as the PLL power consumption.
The PLL frequency range, i.e., the range where the PLL is able to acquire lock, was measured to be 52.87-56.81 GHz. However, at the edges of this range, the reduced CP and VCO gain reduce the PLL bandwidth, which along with increased CP current mismatch results in degraded phase noise performance. The useful frequency range of the PLL is therefore where the phase noise performance is relatively uniform. This was measured to be between 53.2 and 56.1 GHz, a 5.3% range, in which the jitter deviated by less than 1.6 dB from the minimum value of 176 fs (see Fig. 17). Although the measured tuning range, which is 1.45 GHz when divided by two, is wide enough to cover the upcoming 5G band in the USA, i.e., 0.85 GHz, located between 27.5 and 28.35 GHz [5], it still requires some slight tuning. To ensure that the intended bands are covered even in the presence of PVT variations, the PLL frequency range could be increased. An increased PLL output frequency range would also have the added benefit of making the PLL useful in more applications. The output frequency range of the presented PLL is mainly defined by the VCO tuning range. The most straightforward way to increase this is to increase the VCO varactor size or to include switched capacitor arrays in the VCO tank, resulting in a tradeoff between phase noise and tuning range.
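The range figures quoted above can be reproduced with a short calculation. The sketch below uses only numbers from the text; the helper names are illustrative.

```python
def fractional_range(f_lo: float, f_hi: float) -> float:
    """Tuning range relative to the center frequency."""
    return (f_hi - f_lo) / (0.5 * (f_lo + f_hi))

def covers(rng, band) -> bool:
    """True if rng fully contains band (both given as (low, high))."""
    return rng[0] <= band[0] and rng[1] >= band[1]

USEFUL_GHZ = (53.2, 56.1)      # useful PLL output range
BAND_5G_GHZ = (27.5, 28.35)    # 5G band quoted in the text

divided = tuple(f / 2 for f in USEFUL_GHZ)

print(f"useful range: {fractional_range(*USEFUL_GHZ) * 100:.1f} %")
print(f"after divide-by-2: {divided[0]:.2f}-{divided[1]:.2f} GHz "
      f"({divided[1] - divided[0]:.2f} GHz wide)")
print(f"5G band width: {BAND_5G_GHZ[1] - BAND_5G_GHZ[0]:.2f} GHz")
print("band fully covered as measured:", covers(divided, BAND_5G_GHZ))
```

With these numbers the divided range is wider than the band but does not reach its upper edge, which is one possible reading of the remark that some slight tuning is still required.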
The output spectrum at a PLL output frequency of 54 GHz was measured (see Fig. 18). The PLL was in low-noise mode and the total division ratio of the PLL was set to 24. To access the differential VCO signal outside the chip, it was fed to two single-ended on-chip buffers, one of which was accessible by on-chip probing. The buffers were sized down to produce a signal of lower power, to reduce coupling to sensitive parts of the circuit, as well as supply ripple due to high-frequency current through bond wires. Because of this, the measured off-chip output signal power was about −27 dBm. Note that this is just for measurement purposes and that the PLL is intended to deliver its output signal to a transceiver on the same chip. Since the measurement setup lacked external amplifiers and the external mixer used has high signal loss, the measured spectrum has a high noise floor. As can be seen in the PLL output spectrum in Fig. 18, the external harmonic mixer also makes the noise floor itself undulate, and creates spurs. The reference spurs, at a distance of 2.25 GHz from the carrier, are measured to be below −61 dBc.
Fig. 18. Measured PLL output spectrum in low-noise mode, with a division ratio of 24. The mixing products and undulating noise floor are due to the harmonic mixer of the spectrum analyzer.
The PLL phase noise measurements in Fig. 17 show the phase noise and rms jitter across the PLL frequency range, in low-noise and fast-locking modes. The total division ratio was set to 24, by setting the Sel signal shown in Fig. 4 to a logic 0. The rms jitter was calculated from the measurements by integrating the phase noise from 1-kHz to 30-MHz offset frequency. The best measured phase-noise performance was at an output frequency of 54 GHz, where the rms jitter was 176 fs and the phase noise at 1- and 10-MHz offset was −95.7 and −103.5 dBc/Hz, respectively. The measured in-band phase noise at 1-MHz offset stayed below −93 dBc/Hz in all low-noise mode measurements. Since the PLL will be in low-noise mode when used as a clock source, the measured phase noise of this mode is presented as the overall PLL performance.
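The conversion from a phase-noise curve to rms jitter can be sketched as follows. This is a minimal illustration of the standard relation, not the measurement script used here: it assumes single-sideband phase noise L(f) in dBc/Hz with symmetric sidebands, and the flat −100 dBc/Hz profile is a placeholder, not measured data. The function name `rms_jitter_s` is hypothetical.

```python
import math

def rms_jitter_s(offsets_hz, pn_dbc_hz, f_carrier_hz):
    """Integrate SSB phase noise over offset frequency and return rms jitter in seconds.

    Uses the trapezoidal rule on the linear-scale phase noise; real measured
    curves need dense (typically log-spaced) sample points for accuracy.
    """
    linear = [10 ** (p / 10) for p in pn_dbc_hz]
    area = 0.0
    for i in range(len(offsets_hz) - 1):
        df = offsets_hz[i + 1] - offsets_hz[i]
        area += 0.5 * (linear[i] + linear[i + 1]) * df
    # Factor 2 accounts for both sidebands (assumed symmetric).
    phase_rms_rad = math.sqrt(2 * area)
    return phase_rms_rad / (2 * math.pi * f_carrier_hz)

# Placeholder profile: flat -100 dBc/Hz between the integration limits in the text.
offsets_hz = [1e3, 30e6]
pn_dbc_hz = [-100.0, -100.0]
jitter_s = rms_jitter_s(offsets_hz, pn_dbc_hz, 54e9)
print(f"rms jitter = {jitter_s * 1e15:.1f} fs")
```

Because the result scales inversely with the carrier frequency, a given integrated phase error corresponds to a smaller time jitter at 54 GHz than at a lower carrier frequency.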
The measured noise includes a negligible noise contribution from the frequency reference signal (see Fig. 19). If a CMOS state-of-the-art, low-jitter 2.2-GHz PLL, such as in [40] and [41], is used as the input reference frequency generator, the integrated jitter is estimated to rise from 176 fs to between approximately 200 and 230 fs.
Separate phase noise measurements were conducted with the current source matching in the CP disabled. They indicate that the technique does help reduce the phase noise, especially at lower offset frequencies, as expected. The impact on the phase noise in low-noise mode is shown in Fig. 20.
To estimate the settling time and to verify the concept of the mode-switching PLL, the loop control voltage was measured during settling with a digital oscilloscope. To introduce a frequency step in the PLL, a pulsed input signal was applied to the Sel signal that controls the division ratio of the divide-by-6-or-4 divider in the feedback path (see Fig. 4), while the reference frequency was kept constant. This means that the PLL goes from an unlocked state, in which the division ratio times the reference frequency does not fall inside the VCO tuning range, to a locked state. When Sel = 0, the total division ratio is equal to 24, and if the loop is then in an unlocked state, the targeted frequency is below the VCO tuning range and the control voltage is at its minimum. When Sel = 1, on the other hand, the total division ratio is equal to 36, and if the loop is then in an unlocked state, the targeted frequency is above the VCO tuning range and the control voltage is instead at its maximum. Another pulsed signal was used to control the PLL mode switch. Fig. 21 shows the settling behavior with the mode switch activated at different time delays. When both pulsed signals switch at the same time, the settling behavior is that of the low-noise mode. When the mode switch signal is delayed, the settling instead starts in fast-locking mode, followed by a switch to low-noise mode. The estimated settling time in low-noise mode is about three times longer than the estimated settling time when fast-locking mode is used during the first part of the settling. The estimated maximum settling time using fast-locking mode, 3 μs, was taken as the PLL settling time. The figure also shows that the switching between modes has minimal impact on the output frequency and that the best time to switch from fast-locking to low-noise mode is when the frequency is close to stable.
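The locked and unlocked states used for the frequency step can be illustrated with a short sketch. It assumes that the measured lock range of 52.87-56.81 GHz stands in for the VCO tuning range; the function names and the 1.5-GHz reference in the second example are hypothetical and chosen only for illustration.

```python
# Assumption: the measured PLL lock range stands in for the VCO tuning range.
LOCK_RANGE_GHZ = (52.87, 56.81)

def total_division_ratio(sel):
    """Total division ratio as stated in the text: Sel = 0 gives 24, Sel = 1 gives 36."""
    return 24 if sel == 0 else 36

def loop_state(sel, f_ref_ghz):
    """Return the targeted output frequency and the resulting loop state."""
    target_ghz = total_division_ratio(sel) * f_ref_ghz
    low, high = LOCK_RANGE_GHZ
    if target_ghz < low:
        return target_ghz, "unlocked, control voltage at minimum"
    if target_ghz > high:
        return target_ghz, "unlocked, control voltage at maximum"
    return target_ghz, "locked"

# With a 2.25-GHz reference, Sel = 0 locks and Sel = 1 targets a frequency above the range.
for sel in (0, 1):
    print(sel, loop_state(sel, 2.25))

# With a hypothetical 1.5-GHz reference, Sel = 1 locks and Sel = 0 targets a frequency below the range.
for sel in (0, 1):
    print(sel, loop_state(sel, 1.5))
```

Toggling Sel at a fixed reference frequency therefore moves the loop between a railed control voltage and lock, which gives a large, repeatable step for the settling measurement.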
However, the measurements also show that, since the switch can safely be made at any time and any time spent in fast-locking mode improves the settling time significantly, there is no need for complicated algorithms and feedback to control the mode switch mechanism.
The performance of the PLL is summarized in Table II. The PLL figure-of-merit (FOM PLL) is commonly used for wireless communication PLLs, and it is based on the theory in [39], where the phase noise figure-of-merit for VCOs is extended to an entire PLL. Compared to state-of-the-art published mm-wave PLLs, the mode-switching PLL reported in this paper achieves comparable settling time, area, and phase-noise performance, but at much lower power consumption, which results in a state-of-the-art FOM PLL of −245 dB for this frequency range. Although the output frequency range is sufficient for the main intended use of the PLL and for demonstrating the architecture, an increased range, attainable with small changes in the VCO, would make it useful for a wider range of applications.
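The reported figure-of-merit values can be reproduced with a short calculation. The sketch below assumes the widely used jitter-power definition, FOM = 10 log10[(σ/1 s)² · (P/1 mW)], which we take to be the one derived in [39]; the function name is illustrative, and the jitter and power values are those reported in this paper.

```python
import math

def fom_pll_db(jitter_s, power_w):
    """Jitter-power PLL figure-of-merit in dB (assumed definition, see lead-in)."""
    return 10 * math.log10((jitter_s / 1.0) ** 2 * (power_w / 1e-3))

# Low-noise mode: best-case jitter of 176 fs at 10.1 mW.
fom_low_noise = fom_pll_db(176e-15, 10.1e-3)

# Fast-locking mode: jitter between 245 and 433 fs at 15.1 mW.
fom_fast_best = fom_pll_db(245e-15, 15.1e-3)
fom_fast_worst = fom_pll_db(433e-15, 15.1e-3)

print(f"low-noise mode: {fom_low_noise:.1f} dB")
print(f"fast-locking mode: {fom_fast_worst:.1f} to {fom_fast_best:.1f} dB")
```

Under this assumed definition, the results agree with the reported −245 dB in low-noise mode and the range of −235 to −240 dB in fast-locking mode.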
V. CONCLUSION
A novel PLL for mm-wave frequency wireless transceivers is presented that mitigates the problem of cycle slips during settling by switching between two modes of operation with the same small-signal but different large-signal properties. Two key building blocks of the PLL include novel circuit techniques. The first is a double injection divide-by-3 circuit that increases the frequency lock range, allowing the power consumption of the mm-wave divider to be robustly scaled down to less than 0.8 mW. The second is the CP, which has a replica-based feedback loop to diminish the current mismatch due to channel length modulation, thereby reducing low-frequency PLL phase noise. Measurements show a PLL lock time of about 3 μs when the fast-locking mode is used during the first part of settling, while subsequent operation in low-noise mode achieves a record-low power consumption of 10 mW and a state-of-the-art FOM PLL of −245 dB for PLLs in the 60-GHz range.
