clock period. With each succeeding generation of microprocessors, the clock frequency increases, decreasing the clock period and thus the skew requirement in absolute time. This is in contrast to chip area and total delay through the clock distribution network, which are both increasing [1] . Hence, techniques are required to equalize the increasingly large delays of each distributed clock signal to even greater accuracies or lower clock skews. To address this need, advanced interconnect systems capable of distributing high-frequency signals with short propagation delays while dissipating minimal power must be investigated and developed.
The improved radio-frequency (RF) capability and projected increase in die size for CMOS circuits have led to the concept of wireless interconnect systems [3]-[6]. The wireless interconnect system consists of integrated receivers and transmitters with on-chip antennas which communicate across a single chip or between multiple chips at the speed of light via electromagnetic waves. Wireless interconnects can be used for both data and clock signals. However, for wireless data, a modulation scheme is required, while for a wireless clock, only a single tone is required. Therefore, wireless clock distribution is a natural first step for evaluating the potential of wireless interconnects in general as well as for developing the key components of a wireless interconnect system.
A conceptual illustration of an intra-chip wireless interconnect system for clock distribution is shown in Fig. 1(a). A signal is generated on-chip at 8 times the local clock frequency and applied to an integrated transmitting antenna which is located at one part of the integrated circuit (IC). Clock receivers distributed throughout the IC detect the transmitted signal using integrated antennas, and then amplify and synchronously divide it down to the local clock frequency. These local clock signals are then buffered and distributed to adjacent circuitry. Fig. 1(b) shows an illustration of an inter-chip wireless clock distribution system. Here, the transmitter is located off-chip, utilizing an external antenna, potentially with a reflector. Integrated circuits located on either a board or a multichip module each have integrated receivers which detect the transmitted global clock signal and generate synchronized local clock signals. Note that the inter-chip system with an off-chip transmitter results in equalized phases and amplitudes of the received global clock signals, greatly reducing clock skew caused by phase or amplitude mismatch.
Wireless interconnects can provide multiple benefits. As with optical interconnects, signals propagate at the speed of light in wireless interconnects. However, optical components are not needed, and the system should therefore be easier to integrate into CMOS ICs. Wireless interconnects should also provide an additional means for global communications, freeing up conventional wires for other uses. Also, using wireless interconnects in a clock distribution system should reduce the latency in the clock tree, which should help reduce clock skew and should eliminate the frequency dispersion problem that may ultimately limit the maximum clock frequency [7].
This paper demonstrates 15-GHz on-chip wireless interconnects consisting of a clock transmitter and clock receivers, integrated with antennas [6]. The circuits are implemented in a 0.18-µm CMOS technology with six layers of copper interconnects and a substrate resistivity of 15-25 Ω-cm. The paper is organized as follows. Sections II and III present the circuitry used in the clock transmitter and receiver, respectively. Section IV presents the wireless interconnect results, including on-chip antennas, a clock transmitter with integrated antenna, and a clock receiver with integrated antenna. Section V presents conclusions, assessing the potential of intra-chip wireless interconnects and the potential of 0.18-µm CMOS for RF applications above 10 GHz.
II. TRANSMITTER CIRCUITRY
A simplified block diagram for the clock transmitter is shown in Fig. 2(a). The 15-GHz signal is generated using a differential voltage-controlled oscillator (VCO). The VCO is required to have low phase noise to decrease the clock jitter. The signal from the VCO is then amplified by a two-stage output amplifier and delivered to the transmitting antenna. In the final clock distribution system implementation, the VCO will be phase locked to an external reference. However, to ease the implementation requirements for this chip, the transmitter was operated open loop, with the frequency of the VCO controlled directly by its dc input.
A. Voltage-Controlled Oscillator
A schematic of the VCO and output amplifier is shown in Fig. 3. Cross-coupled transistors form a positive feedback loop, providing negative resistance to the LC tanks. PMOS transistors are used exclusively in this VCO design because of their reduced 1/f noise and lower hot-carrier noise, resulting in improved phase noise [8]-[10]. The inductors for the LC tank are implemented in metal layers 5 and 6, above a patterned polysilicon ground shield. The measured inductor quality factor (Q) is 10 at 15 GHz. The varactor is implemented using an accumulation-mode MOS capacitor, whose measured Q is approximately 47 at 15 GHz.
A test structure consisting of the VCO core plus a single-stage 50-Ω output buffer was connected to a spectrum analyzer, yielding the output single-sideband phase-noise plot and spectrum shown in Fig. 4. At the measured bias point, the center frequency is 14.92 GHz, with an output power level of −21.5 dBm. The VCO core consumes 7.2 mW from a 1.5-V supply. The phase noise at 1-MHz and 3-MHz offsets is −105 and −113 dBc/Hz, respectively. To compare this VCO with previously published results, the phase noise can be scaled to the 5-GHz regime using Leeson's formula [11]. For a given inductor Q, the phase noise at a fixed offset scales with the square of the carrier frequency, so the 15-GHz VCO corresponds to a 5-GHz VCO achieving a phase noise of −114.5 dBc/Hz at a 1-MHz offset. This result is 2.5 dB worse than that achieved in [9].
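The frequency scaling used in this comparison can be checked with a short calculation. The sketch below assumes the simplified Leeson relation in which, for a fixed tank Q and a fixed offset, phase noise scales as the square of the carrier frequency; the function name `scale_phase_noise` is illustrative and not from the paper, and the measured value is read as −105 dBc/Hz.

```python
import math

def scale_phase_noise(l_dbc_hz, f_from_hz, f_to_hz):
    """Scale phase noise at a fixed offset from one carrier to another.

    Assumes the simplified Leeson dependence L ~ (f0 / (2 * Q * df))**2,
    with the tank Q and the offset df held constant.
    """
    return l_dbc_hz - 20.0 * math.log10(f_from_hz / f_to_hz)

# Measured 15-GHz VCO phase noise at a 1-MHz offset, referred to 5 GHz.
scaled = scale_phase_noise(-105.0, 15e9, 5e9)
print(f"{scaled:.1f} dBc/Hz at a 1-MHz offset")
```

Dividing the carrier frequency by three changes the phase noise by 20·log10(3), or about 9.5 dB, which accounts for the difference between the measured and scaled figures.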
B. Output Amplifier
The output amplifier consists of two stages of inductively loaded common-source amplifiers. Since the transistor noise in the amplifier does not significantly degrade the VCO phase-noise performance, nMOS transistors are used. The first class-A stage serves as a preamplifier for the final output amplifier stage. The second stage acts as a pseudo class-E amplifier without a bandpass filter (traditionally used to select the fundamental), and is tuned together with the antenna impedance. As the amplifier switches, the fundamental is transmitted. Higher order harmonics are greatly attenuated due to the impedance mismatch between the antenna and the amplifier at these frequencies. The transistor widths of stage 1 are only two-thirds of those of the transistors in the VCO core, which avoids significantly loading the VCO output. At the tuned frequency, the single-sided output has an amplitude approximately equal to the supply voltage, with a dc offset of the same value; hence, the output power can be controlled by varying the supply voltage. The simulated power efficiency is 60%. At 1.3 V, measurements show that the amplifier delivers 13.2 dBm to the antenna at 15 GHz, while the peak gain occurs at 18 GHz.
III. RECEIVER CIRCUITRY
A block diagram for the clock receiver is shown in Fig. 2(b). The received signal is amplified using a low-noise amplifier (LNA) and divided down to the local clock frequency, and then the signal is buffered to provide the local clock signal. The amplifier is tuned to the clock transmission frequency to reduce interference and noise. Since the microprocessor is extremely noisy at the local clock frequency and its harmonics, transmitting the global clock at a frequency higher than the local clock frequency provides increased noise immunity for the system [12]. Also, operating at a higher frequency decreases the required antenna size. The receiver is implemented in a fully differential architecture, which rejects common-mode noise (such as substrate noise), obviates the need for a balanced-to-unbalanced conversion at the input, and provides dual-phase signals to the frequency divider.
A. Low-Noise Amplifier
A schematic of a fully differential LNA [13], [14] with source-follower buffers [5] is shown in Fig. 5. The LNA is input matched to the projected antenna impedance of 125 − j55 Ω at 15 GHz. Due to size constraints, the antenna is not designed to be resonant; hence, its reactance is nonzero, and additional series inductance in the LNA is used to complete the match. A single-stage cascode amplifier topology with gate, source, and drain inductors is used for each half circuit. The use of inductive degeneration allows the input power and noise matching conditions to nearly coincide. Following the LNA are source-follower buffers, used to provide a dc level shift and to isolate the LNA from the divider. When driving a capacitive load, a source follower has a tunable negative input conductance. By adjusting the current through the buffer, the total conductance at the output of the LNA can be decreased, increasing the overall circuit gain. However, the total conductance must remain positive; otherwise, oscillation can occur.
The measured (on-wafer) , , and noise figure for the differential LNA and buffers V are 21 dB, 8 dB, and 8 dB, respectively, at the resonant frequency of 14.4 GHz, as shown in Fig. 6. The circuit consumes 28 mW from a 1.5-V supply. The LNA inductor Q values are 15 at 15 GHz. By increasing to 0.9 V, the gain can be increased to 25 dB; however, becomes positive and the circuit is unstable, demonstrating the effect of negative conductance on gain and stability. Finally, the and for an LNA test circuit output matched differentially to 100 Ω using a capacitive transformer are 8 and 15 dB, respectively, also shown in Fig. 6.
B. Injection-Locked Frequency Divider (ILFD)
An 8:1 (divide-by-eight) injection-locked frequency divider is implemented by cascading three 2:1 dividers [5], [9], [15], as shown in Fig. 7(a). A schematic of the 2:1 divider is shown in Fig. 7(b), consisting of two source-coupled-logic (SCL) D-latches in a master-slave configuration. Fig. 8(a) shows the maximum input frequency and power consumption (for the 8:1 divider core) versus supply voltage for a 64:1 divider, consisting of an 8:1 SCL divider and an 8:1 true-single-phase clocked divider [16]. For a 1.5-V supply, the maximum input frequency is 15.8 GHz and the power consumption is 4.5 mW, while for a 2.1-V supply, the maximum input frequency is 20.4 GHz and the power consumption is 12.2 mW.
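The 8:1 ratio obtained from three cascaded 2:1 stages can be illustrated with a purely behavioral model. The sketch below treats each stage as an ideal edge-triggered toggle operating on a sampled square wave; it does not model the SCL latches or injection locking, and the function names are illustrative assumptions rather than anything defined in the paper.

```python
def divide_by_two(samples):
    """Ideal 2:1 divider: toggle the output on each rising input edge."""
    out, state, prev = [], 0, 0
    for s in samples:
        if prev == 0 and s == 1:
            state ^= 1
        out.append(state)
        prev = s
    return out

def rising_edges(samples):
    """Count 0 -> 1 transitions, taking the level before the first sample as 0."""
    prev, count = 0, 0
    for s in samples:
        if prev == 0 and s == 1:
            count += 1
        prev = s
    return count

def divide_chain(samples, stages=3):
    """Cascade identical 2:1 stages, as in the 8:1 divider."""
    for _ in range(stages):
        samples = divide_by_two(samples)
    return samples

clock_in = [0, 1] * 16               # 16 cycles of the global clock
clock_out = divide_chain(clock_in)   # three cascaded 2:1 stages
f_in_ghz = 15.0
f_out_ghz = f_in_ghz * rising_edges(clock_out) / rising_edges(clock_in)
print(f_out_ghz)
```

Sixteen input cycles produce two output cycles, so a 15-GHz global clock maps to a 1.875-GHz local clock.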
Injection locking is the process of synchronizing a free-running oscillator to an input signal [17]-[19]. Before injection locking can be understood, the divider's free-running oscillation must first be described. The 2:1 SCL divider topology will self-oscillate when the differential clock inputs are close to their common-mode value. This turns on both D-latches simultaneously, and the circuit becomes a two-stage ring oscillator (equivalent to a regenerative oscillator). The period of self-oscillation for the outputs of the 2:1 divider is equal to $4\tau_{pd}$, where $\tau_{pd}$ is the propagation delay for a single D-latch. During design, this delay can be controlled by adjusting the transistor dimensions, as well as by adjusting the common-mode voltage on the clock lines. During testing, it can be controlled by adjusting the divider's supply voltage. Due to the differential structure of the latch, the common source nodes of the latch differential pairs oscillate at twice the natural frequency, or at a period of $2\tau_{pd}$. This is the input-referred self-oscillation frequency. The gates of the clock transistors are therefore an ideal place to inject signals close to this frequency and lock this "double-frequency" oscillator. Accordingly, the outputs of the 2:1 ILFD are injection locked at half the input frequency.
A benefit of injection locking is the high input sensitivity of the circuit. In other words, an ILFD exhibits conversion gain. This property is well suited to wireless interconnects, since the receiver's minimum detectable signal (MDS) can be improved. Fig. 8(a) shows the input-referred self-oscillation frequency versus supply voltage, while Fig. 8(b) shows the measured divider input sensitivity. The required input level dips at the self-oscillation frequency, meaning that the divider provides conversion gain and selectivity. To achieve the maximum operating frequency, a very large input signal swing is required, while smaller input swings can be used to injection lock the divider close to its input-referred self-oscillation frequency. For the wireless interconnect application, the input power level to the divider is large enough that the divider can be locked over approximately 2 GHz. Thus, the system does not rely on the peak conversion gain of the divider for system operation.
IV. WIRELESS INTERCONNECTS
The layout of the 0.18-µm test chip [6] is shown in Fig. 9. The chip area is 7 × 6 mm². Multiple antenna test structures, LNAs, frequency dividers, clock receivers, and clock transmitters have been included. The locations of the relevant zigzag antennas have been noted. Clock transmitters (Tx1, Tx2) and receivers (Rx1-Rx4) have been labeled as well. Located in between the antennas are test structures which should interfere with the clock transmission and reception. These test structures contain multiple metal interconnects, vias, substrate connections, passivation openings, and metal-fill patterns (not shown) for metal layers 1-6. Therefore, the density of structures between the antennas is high.
A. On-Chip Antennas
For on-chip antennas, the antenna size is limited by the size of the chip. Therefore, maximizing the antenna's radiation while limiting its physical size requires operating at higher frequencies (e.g., 15 GHz), corresponding to shorter wavelengths. The dipole antenna length has been limited to 2 mm, which is only a fraction of a wavelength at 15 GHz in both silicon dioxide and the silicon substrate. As mentioned earlier, since the antenna is not operating at resonance (i.e., its length is not a half wavelength), the antenna impedance has a reactive component. Therefore, this reactance has to be conjugately matched with the impedances of the transmitter and receiver.
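The electrical length of the 2-mm dipole can be estimated from the wavelength in each medium. The sketch below is only an estimate under stated assumptions: the relative permittivities (3.9 for silicon dioxide and 11.7 for silicon) are typical textbook values rather than figures taken from this paper, and the wave is treated as propagating in a single homogeneous medium.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def wavelength_mm(f_hz, eps_r):
    """Wavelength in a lossless dielectric with relative permittivity eps_r."""
    return C0 / (f_hz * math.sqrt(eps_r)) * 1e3

def electrical_length(length_mm, f_hz, eps_r):
    """Physical length expressed as a fraction of the in-medium wavelength."""
    return length_mm / wavelength_mm(f_hz, eps_r)

# Assumed permittivities (not from the paper): SiO2 ~ 3.9, Si ~ 11.7.
len_sio2 = electrical_length(2.0, 15e9, 3.9)
len_si = electrical_length(2.0, 15e9, 11.7)
print(f"2 mm at 15 GHz: {len_sio2:.2f} wavelengths in SiO2, "
      f"{len_si:.2f} wavelengths in Si")
```

Under these assumptions the dipole is roughly 0.2 wavelengths long in silicon dioxide and 0.34 wavelengths long in silicon, which is consistent with an antenna operating below its half-wavelength resonance.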
The implemented antennas are 2-mm-long zigzag dipole antennas, labeled in Fig. 9. The zigzag antennas, illustrated in Fig. 10, have a 10-µm trace width, an 80-µm arm element length, and a 30° bend angle. These values were based on the best results currently available from antenna design experiments [20], [21]. The antennas are implemented in metal 6 at various locations throughout the chip, giving a range of spacings between antenna pairs. The separation from metal 6 to the 15-25 Ω-cm substrate is 7.2 µm. Although the lateral field components partially cancel each other in a zigzag structure, the longitudinal components reinforce each other. Indeed, compared to a linear dipole antenna having the same axial length, a zigzag dipole antenna will have higher gain [20].
The gain between a transmitting/receiving antenna pair can be determined using the Friis transmission equation [22], which relates the power received to the power transmitted between two antennas as follows:

$$G = \frac{P_r}{P_t} = e_t e_r \left(1-|\Gamma_t|^2\right)\left(1-|\Gamma_r|^2\right) D_t D_r \left(\frac{\lambda}{4\pi R}\right)^2 \qquad (1)$$

where $e$ is an efficiency representing loss in the conductors and dielectrics, $\Gamma$ is the reflection coefficient at the antenna terminals, $D$ is the directivity, $\lambda$ and $R$ are the wavelength and separation distance, respectively, and the subscripts $t$ and $r$ denote the transmitting and receiving antennas. When characterizing an antenna pair using S-parameters, this Friis transmission equation is equal to $|S_{21}|^2$ [20], which is equal to the transducer gain $G_T$ when the antennas are measured with 50-Ω transmission lines and termination impedances [23]. To characterize the antenna performance under matched conditions, a transmission gain $G_a$ is defined as

$$G_a = \frac{|S_{21}|^2}{\left(1-|S_{11}|^2\right)\left(1-|S_{22}|^2\right)} \qquad (2)$$

Referring to (1), $G_a$ is equal to the quantity $e_t e_r D_t D_r \left(\lambda/4\pi R\right)^2$, and is equal to the power available at the output divided by the power delivered to the input, where both antennas are conjugately matched. Note that in actual system implementations, the antennas would only be matched over a limited frequency range; thus, $G_a$ would only be equal to the actual gain over those frequencies.
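The mismatch correction that separates the transmission gain from the raw 50-Ω measurement can be sketched numerically. The example assumes the transmission gain takes the usual mismatch-corrected form |S21|² / ((1 − |S11|²)(1 − |S22|²)); the S-parameter values are hypothetical and chosen only for illustration, and the function name is not from the paper.

```python
import math

def transmission_gain_db(s21, s11, s22):
    """Mismatch-corrected gain of an antenna pair from measured S-parameters.

    Removes the reflection loss at both ports so that the result reflects
    only antenna efficiency, directivity, and path loss.
    """
    ga = abs(s21) ** 2 / ((1 - abs(s11) ** 2) * (1 - abs(s22) ** 2))
    return 10 * math.log10(ga)

# Hypothetical measurement: -56 dB insertion gain, |S11| = |S22| = 0.5.
s21 = 10 ** (-56 / 20)
ga_db = transmission_gain_db(s21, 0.5, 0.5j)
print(f"Transmission gain: {ga_db:.1f} dB")
```

With a reflection magnitude of 0.5 at each port, each antenna reflects a quarter of the incident power, so the corrected gain is about 2.5 dB higher than the raw |S21|² in this illustrative case.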
A test setup for antenna characterization utilizing a network analyzer has been developed [20]. This setup converts the unbalanced signals from the network analyzer to the balanced signals used to excite the antennas. Semi-rigid cables are used to increase measurement reliability. The dies are mounted on a glass slide, which is then placed on a 1-cm-thick insulator with a relative permittivity of 2. Fig. 11(a) shows the measured zigzag-zigzag transmission gain versus frequency for 6.7- and 3.2-mm separations. The gain increases with increasing frequency and decreasing separation. At 15 GHz, the transmission gain is −53 and −45 dB for the 6.7- and 3.2-mm separations, respectively. Also shown is the transducer gain between two sets of pads separated by 5.6 mm, which is about 20 dB below the transmission gain at 15 GHz. For this measurement, the pad-to-pad gain is close to the instrument noise floor. The phase delay between the voltages at the receiving and transmitting antennas for the 6.7-mm separation is shown in Fig. 11(b). The phase delay decreases linearly with frequency, indicating wave propagation rather than some type of lumped-element RC coupling [20]. This fact, along with the low pad-to-pad gain, shows that the signal is propagating from one antenna to the other, and that these waves are launched much more efficiently from the antennas than from the pads alone.
Fig. 12(a) shows the transmission gain versus frequency for three pairs of zigzag antennas, each separated by 6.7 mm. These results demonstrate the effect of interference structures on antenna gain. As can be seen from Fig. 9, each pair has different types and densities of metal and active structures located in between. The transmission gain changes by 5-10 dB, depending on the antenna pair and frequency.
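The argument that a linear phase response indicates wave propagation rests on the relation between phase slope and delay: a pure propagation delay τ produces a phase of −2πfτ. The sketch below fits a straight line to phase-versus-frequency samples and recovers the delay. The data are synthetic and the 60-ps delay is a hypothetical value, not a measurement from the paper.

```python
import math

def delay_from_phase(freqs_hz, phases_rad):
    """Least-squares slope of unwrapped phase vs. frequency, as a delay.

    For a pure delay tau, phase = -2*pi*f*tau, so tau = -slope / (2*pi).
    """
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(phases_rad) / n
    num = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freqs_hz, phases_rad))
    den = sum((f - mean_f) ** 2 for f in freqs_hz)
    return -(num / den) / (2 * math.pi)

# Synthetic example: hypothetical 60-ps delay sampled from 10 to 18 GHz.
tau_assumed = 60e-12
freqs = [10e9 + i * 1e9 for i in range(9)]
phases = [-2 * math.pi * f * tau_assumed for f in freqs]
tau_fit = delay_from_phase(freqs, phases)
print(f"Recovered delay: {tau_fit * 1e12:.1f} ps")
```

A lumped RC coupling path would instead show a phase that saturates with frequency, so a poor straight-line fit would be one indication that the coupling is not propagation dominated.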
Also, the antennas have slightly different impedances (not shown), since the reflection coefficient is a function of the metal structures reflecting the transmitted signal back into the antenna. This shows that the structures located around the antenna affect the antenna performance [24]. Fig. 12(b) shows the measured antenna impedance. The zigzag impedance is 100 Ω with a capacitive reactance. The mismatch loss between the receiving zigzag antenna and the LNA input (which was matched to 100 Ω) is approximately 0.3 dB at 15 GHz. Therefore, virtually all of the power from the receiving antenna is transferred to the LNA.
B. Clock Transmitter With Integrated Antenna
A die photograph of the clock transmitter with a zigzag antenna is shown in Fig. 13 [6]. The size of the transmitter, including the "unused" portions on either side of the circuit, is 0.64 × 2 mm², while the active area (excluding pads) is 0.4 × 0.29 mm². The layout is symmetric left-to-right. Multiple substrate connections are included throughout the circuit.
Referring back to Fig. 2, the clock transmitter consists of a VCO connected to an amplifier, which then drives the transmitting antenna. Operation of the clock transmitter is demonstrated by on-chip generation of a global clock signal and reception of this signal using receiving antennas. Fig. 14(a) shows a block diagram of the measurement setup used to test the clock transmitter. The dc control and supply voltages were supplied to a transmitter driving a zigzag dipole antenna. The output spectrum was then obtained by probing receiving antennas located at 3- and 5.6-mm separations from the transmitting antenna. Fig. 14(b) and (c) show the resultant output spectra measured at 3- and 5.6-mm distances, respectively, having peak received power levels of −60 and −69 dBm at 15 GHz. The transmitter consumes 48 mW of power. This is the first demonstration of an on-chip clock transmitter with an integrated antenna.
C. Clock Receiver With Integrated Antenna
A die photograph of the 0.18-µm clock receiver with an integrated zigzag dipole antenna is shown in Fig. 15 [5], [6]. The size of the receiver, including the antenna, is 0.66 × 2 mm², while the active area is 0.37 × 0.58 mm². The receiver consists of a zigzag dipole antenna, a differential LNA with source-follower buffers, an 8:1 frequency divider, and output buffers. To decrease the minimum detectable signal of the receiver, the peak LNA/buffer gain should coincide with the input-referred self-oscillation frequency of the frequency divider. To achieve this, the supply voltage of the frequency divider was increased from 1.5 to 2.1 V, increasing this frequency from 9.2 to 14.2 GHz. Operation of the clock receiver is demonstrated by transmission of a global clock signal across the chip, detection of this signal, and generation of a local clock signal by a single clock receiver.
The measurement setup shown in Fig. 16(a) was used to test the clock receiver. The input signal is generated off-chip and externally amplified (note that the input signal plotted in Fig. 16(b) is taken before the external amplifier). The input signal is then transformed into a balanced signal and injected into the on-chip transmitting antenna, with an available power level of 20.3 dBm. The output of the receiver located across the chip is then probed and examined using an oscilloscope. Because the output drives the 50-Ω load of the oscilloscope, the output swing is reduced.
Two clock receivers (Rx1 and Rx3 from Fig. 9) were tested, having antenna separations of 3.2 and 5.6 mm, respectively. Fig. 16(b) shows plots of the input voltage to the transmitting antenna and the output voltage of the wireless clock receiver for a 5.6-mm transmission distance (Rx3). This demonstrates operation of the clock receiver with integrated antenna. The input global clock frequency is 15 GHz, and the output local clock frequency is 1.875 GHz (15 GHz divided by 8). Sensitivity measurements show that the MDS of the receiver is −40 dBm. The total power consumption for the receiver is 40 mW. From the measured zigzag-zigzag antenna-pair transmission gain of −53 dB, and accounting for mismatch loss in the system, the power delivered to the input of the receiver is inferred to be −34.3 dBm. As expected, for the 3.2-mm separation, the input power level to the transmitting antenna can be 10 dB lower, since the antenna transmission gain is 10 dB higher.
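The link budget behind these figures can be laid out explicitly. The sketch below reads the transmission gain as −53 dB, the delivered power as −34.3 dBm, and the MDS as −40 dBm (negative values, as the physics requires), and uses the 20.3-dBm available power quoted for the measurement setup. The split of the remaining loss into a single "implied mismatch loss" term is an inference made here for illustration, not a figure stated in the paper.

```python
def link_budget(p_avail_dbm, gain_db, p_delivered_dbm, mds_dbm):
    """Return the ideal received power, implied extra loss, and margin."""
    p_ideal = p_avail_dbm + gain_db          # matched-antenna prediction
    implied_loss = p_ideal - p_delivered_dbm  # loss not explained by the gain
    margin = p_delivered_dbm - mds_dbm        # headroom above sensitivity
    return p_ideal, implied_loss, margin

p_ideal, implied_loss, margin = link_budget(
    p_avail_dbm=20.3,       # available power at the transmitting antenna
    gain_db=-53.0,          # measured antenna-pair transmission gain
    p_delivered_dbm=-34.3,  # inferred power at the receiver input
    mds_dbm=-40.0,          # receiver minimum detectable signal
)
print(f"Ideal received power: {p_ideal:.1f} dBm")
print(f"Implied mismatch loss: {implied_loss:.1f} dB")
print(f"Margin above MDS: {margin:.1f} dB")
```

Under these readings the receiver input sits a few decibels above its sensitivity, which is consistent with the observed locked output at the 5.6-mm distance.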
Previously, using a 0.25-µm CMOS technology, a wireless receiver interconnect with 2-mm linear dipole antennas was demonstrated across 3.3 mm at 7.4 GHz [5]. The 0.25-µm receiver had an MDS of −43 dBm, while the available power at the transmitting antenna was 21 dBm. Thus, in advancing the technology one generation, the interconnection distance and operating frequency have both been approximately doubled. Using a more advanced CMOS technology allowed the operating frequency to increase, leading to improved antenna performance, which in turn leads to increased interconnection distance.
D. Simultaneous Transmitter and Receiver Operation
By comparing the results from both the transmitter and receiver circuits with integrated antennas, it can be seen that simultaneous operation of these two circuits (i.e., an integrated transmitter communicating across the chip to an integrated receiver) is currently not possible. The MDS of the receiver is −40 dBm, while the power received from the transmitter is −60 and −69 dBm for half-chip and full-chip transmission, respectively. Thus, there is a 20- and 29-dB deficit in "system gain" for achieving a fully integrated half- and full-chip wireless clock distribution system, respectively. The primary reason for this deficit is a reduction in the thicknesses of metals 5 and 6. The thicknesses of these metal layers were half of those expected, reducing the inductor Q values by approximately half. According to simulations, this 50% Q degradation reduces the system gain by approximately 20 dB, as follows. The voltage gain of an inductively loaded amplifier is proportional to the inductor Q. Hence, a 50% reduction in Q will decrease the gain of the LNA by 6 dB. Likewise, the gain of the class-A preamplifier in the output amplifier is also decreased by 6 dB. Simulations show that the gain of the pseudo class-E stage in the output amplifier is reduced by 1 dB, since the inductive load is shunted by the antenna impedance. Finally, the VCO bias current is adjusted to maximize the voltage swing across the tank; thus, the bias current and, hence, the power consumption are increased as Q decreases. Increasing the VCO power consumption required decreasing a bias voltage, which also decreased the dc bias point of the output amplifier. Simulations show that this decreased the gain by another 7.5 dB. Together, these effects yield a 20.5-dB degradation in system gain. Therefore, with the planned metal 5 and 6 thicknesses, a fully integrated half-chip wireless interconnection should have been possible. Despite the reduced metal thickness, reasonable circuit performance is obtained, demonstrating the benefits of copper metallization over aluminum.
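The deficit and degradation figures in this discussion follow from simple decibel arithmetic, sketched below. The received powers and MDS are read as negative dBm values, and the variable names are illustrative. The individual contributions are the simulated values quoted in the text; only the conversion of a halved Q into decibels is computed here.

```python
import math

mds_dbm = -40.0
received_dbm = {"half-chip": -60.0, "full-chip": -69.0}

# System-gain deficit: how far the received power falls below the MDS.
deficit_db = {path: mds_dbm - p for path, p in received_dbm.items()}

# Voltage gain proportional to Q: halving Q costs 20*log10(2) dB.
q_halving_loss_db = 20 * math.log10(2)

# Contributions quoted in the text (simulated), in dB.
contributions_db = {
    "LNA gain": 6.0,
    "class-A preamplifier gain": 6.0,
    "pseudo class-E stage gain": 1.0,
    "output-amplifier bias shift": 7.5,
}
total_degradation_db = sum(contributions_db.values())

print(deficit_db)
print(f"Halving Q: {q_halving_loss_db:.2f} dB per Q-limited stage")
print(f"Total degradation: {total_degradation_db:.1f} dB")
```

The total degradation slightly exceeds the half-chip deficit, which is the basis for the statement that half-chip operation should have been possible with the planned metal thicknesses, while the full-chip deficit would remain.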
V. CONCLUSION
In this paper, both 15-GHz RF CMOS circuits and wireless interconnects have been presented. Individual RF circuit blocks operating near 15 GHz have been demonstrated, including a 14.4-GHz LNA with 21 dB of gain, a frequency divider with maximum input frequencies of 15.8 GHz at a 1.5-V supply and 20.4 GHz at a 2.1-V supply, a 15-GHz VCO with −113 dBc/Hz phase noise at a 3-MHz offset, and an output amplifier capable of delivering 13 dBm of power at 15 GHz. These circuits are commonly found in conventional transceiver architectures; thus, these results show that 0.18-µm CMOS technology can potentially be used for RF applications operating above 10 GHz.
In addition to the 15-GHz RF circuit results, a 15-GHz wireless interconnect system has been demonstrated. Measured antenna characteristics for both chips show that antenna gain increases with frequency as the antenna becomes electrically longer. Also, the presence of interference structures between the antennas can alter the gain by 5-10 dB. Additionally, the phase versus frequency response for the antennas is linear, indicating wave propagation, while the gain between two sets of pads is at least 20 dB less than that for the antennas. These two facts together show that the signal coupling is due to wave propagation, and that these waves are launched much more efficiently from the antennas than from the pads alone. A wireless transmitter with an integrated antenna generated and transmitted a 15-GHz global clock signal 5.6 mm across the test chip, and this signal was detected using receiving antennas. A wireless clock receiver with an integrated antenna detected a 15-GHz global clock signal supplied externally to an on-chip transmitting antenna located 5.6 mm away from the receiver, and generated a 1.875-GHz local clock signal. These results demonstrate wireless interconnection at 15 GHz over a 5.6-mm distance. This is the first known demonstration of an on-chip clock transmitter with an antenna and the second demonstration of a clock receiver. Finally, this result has approximately doubled the distance and frequency of the wireless interconnection presented in [5] by advancing the technology one generation. In summary, this paper has shown that it is possible to use integrated antennas and CMOS circuits to send signals from one side of a chip to another. Such a wireless system can enable transmission of high-frequency signals with little to no dispersion over large distances at the speed of light using conventional CMOS technology.
