Abstract: A pulsed ultrawideband (UWB) transceiver with four-element receiver beamforming is described. Multiband signaling with frequency hopping is employed to efficiently utilize spectrum from 6 to 8.5 GHz to achieve a high data rate of 1 Gbps. A pulse-based signaling approach with low implementation complexity is utilized. Frequency-hopping reduces the overhead for multiband implementation by sharing multiple circuit blocks in the transmitter and receiver paths. The sinusoids for pulse-shaping and single-sideband mixing are synthesized using multiphase LOs to avoid high-speed DACs. The receiver channelizes the spectrum with cascaded mixers for which the LO signals are derived from a single PLL. A single-input multi-output (SIMO) beamforming architecture is employed to increase the sensitivity of the receiver. Beamforming is performed at baseband to avoid the requirement for a wideband delay circuit. The prototype is implemented in 65 nm CMOS and meets the ECC mask. The measured power dissipation of the transmitter is 221 mW, while that of the receiver is 211 mW. A receiver sensitivity of −69 dBm is achieved and 1 Gbps communication over 2 m is demonstrated.
I. INTRODUCTION
THE DEMAND for high data rate communication in wireless personal area networks (WPAN) is ever increasing, driven mostly by data transfer and display data streaming. Two major wireless technologies that have been proposed to address this requirement are IEEE 802.11ac [1] and standards at 60 GHz [2]-[6]. These technologies pursue high channel capacity using high-order modulation, such as QAM256, or very wide bandwidth, e.g., the 7 GHz bandwidth available around 60 GHz, as dictated by the Shannon-Hartley theorem.
A large bandwidth of several GHz is also available at a much lower frequency band than the millimeter-wave bands, as part of the ultrawideband (UWB) system [7]. Depending on the regulations of various regions as shown in Table I, the system allows for the use of at least 2.5 GHz, as long as the transmitted signal has a spectral density under −41.3 dBm/MHz [8], [9]. For a short range, UWB can offer a good tradeoff between power consumption and data rate, especially for data/video transfer applications from a mobile device to a display that usually require a range of a few meters. Compared to millimeter-wave band operation, communication at lower frequency also relaxes practical challenges arising from testing and packaging, and sensitivity to device parasitics. Relative to systems that exploit very high-order modulation to enhance data rate, the transmitter and PA complexity are significantly relaxed.
A large number of UWB implementations reported in recent years have employed impulse radio (IR) or OFDM-based approaches [10]. IR-UWB architectures have been demonstrated with transmission pulses employing PPM [11], OOK [12], BPSK [13], PSM [14], or a combination of these schemes [15]. A key challenge in IR-UWB systems that use broadband pulses arises from the difficulty in shaping the pulse spectrum to utilize the regulation mask efficiently. OFDM architectures utilize the given spectrum well, but require greater digital complexity.
In this paper, a multiband IR for Gbps communication [16] over a range of 2 m is described. The radio system is designed to address the following aspects.
1) Flexibility: The transmitter should be capable of adjusting the power spectral density (PSD) to satisfy various regulations.
2) Low implementation complexity: A low-complexity implementation is considered to ensure power efficiency. Low power consumption at the transmitter is especially critical for applications relating to media transfer or display-streaming from portable devices.
3) Spectrum utilization: The allowed spectrum should be maximally utilized, while providing robust high data rate communication.
A system that takes the above aspects into consideration is described below. Carrier-based signaling is preferred over carrier-less signaling since it is easier to control PSD [17], [18]. The ECC band spanning 6-8.5 GHz is chosen as the target frequency range, because it is the strictest regulation in terms of the available bandwidth. Expanding the system over a wider bandwidth can thus provide a higher data rate. In order to enhance data rate in a given spectrum, a beamforming approach is employed in the receiver. A prototype to verify the system is implemented in a 65 nm CMOS process and demonstrates 1 Gbps data rate over 2 m using QPSK modulation. The transmitter includes all the required RF and baseband circuits. The receiver includes the frontends for each of the signal paths and eight internal ADCs to digitize the incoming data streams. The baseband data are demodulated in an FPGA board.
Section II describes the proposed multiband IR-UWB architecture. Sections III and IV describe the circuit implementation of the transmitter and the receiver, respectively. Experimental results are presented in Section V.
II. MULTIBAND IR ARCHITECTURE
A. Multiband Signaling
Fig. 1 shows the proposed multiband IR-signaling scheme and its simulated PSD, along with the ECC mask. The target frequency band from 6 to 8.5 GHz is divided into eight sub-bands that are separated by 250 MHz. In the frequency domain, a sub-band contains the spectrum of a carrier-based pulse train that has a pulse width of 8 ns. A wideband pulse is formed by combining four pulses that have different center frequencies. QPSK modulation delivers 8 bits of data (2 bits per sub-band for four simultaneously transmitted sub-bands) at 125 MHz symbol rate to achieve 1 Gbps data rate.
This scheme satisfies the aforementioned considerations of flexibility, simplicity, spectral efficiency, and easy implementation of beamforming in the RX. The approach is flexible since different regulation masks can be accommodated by adjusting the LO frequencies and the pulse width. Pulse-based signaling leads to a relatively simple implementation, especially in the digital domain. (While OOK is even simpler to implement, it is not adopted due to its inherent 3 dB SNR penalty.) The baseband in fact consists of a limited number of gates for tasks such as synchronization and is significantly simpler than that used in schemes such as OFDM, which requires an FFT/IFFT engine. Further, as each sub-band is only 250 MHz wide, pulse-shaping and control of the output PSD is easier compared to achieving a rectangular PSD with a single-frequency wideband pulse. Finally, RX beamforming can be implemented at baseband after downconversion, which avoids a potentially power-hungry delay-and-sum circuit [19]. This is the case because the phase rotation within a sub-band can be assumed to be a constant, since the bandwidth-to-carrier frequency ratio is small, with a maximum value of 1/26.
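The headline numbers of the signaling scheme can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only values quoted in the text; the constant and function names are illustrative and are not taken from the design.

```python
# Sub-band plan and headline data rate, using only numbers quoted in the text.
# All names below are illustrative assumptions.
SUBBAND_CENTERS_GHZ = [6.5 + 0.25 * k for k in range(8)]  # 6.5 ... 8.25 GHz
SUBBAND_BW_GHZ = 0.25          # sub-band width
SYMBOL_RATE_HZ = 125e6         # one symbol every 8 ns
BITS_PER_QPSK_SYMBOL = 2
CONCURRENT_SUBBANDS = 4        # four sub-bands are transmitted simultaneously


def data_rate_bps():
    """Aggregate bit rate: concurrent sub-bands x bits per symbol x symbol rate."""
    return CONCURRENT_SUBBANDS * BITS_PER_QPSK_SYMBOL * SYMBOL_RATE_HZ


def max_fractional_bandwidth():
    """Largest bandwidth-to-carrier ratio over all sub-bands (lowest carrier)."""
    return max(SUBBAND_BW_GHZ / fc for fc in SUBBAND_CENTERS_GHZ)


if __name__ == "__main__":
    print(data_rate_bps())              # aggregate rate in bit/s
    print(max_fractional_bandwidth())   # ratio used for the constant-phase argument
```

The largest ratio occurs at the 6.5 GHz sub-band, which is where the figure of 1/26 comes from.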
B. Frequency Hopping
Frequency-hopping is employed since it reduces the number of signal paths, increases the channel separation, and also separates multipath components in time. The eight sub-bands are divided into two sets with respective center frequencies of [6.5, 7, 7.5, 8] GHz and [6.75, 7.25, 7.75, 8.25] GHz. The number of concurrent pulses is halved to four by interleaving the two sets of pulses every symbol period of 8 ns, which results in fewer signal paths in the TX and RX. Using this approach, the duty cycle of the signal in a sub-band becomes 50%. In order to maintain the same output power, the instantaneous power can be doubled, similar to the case of WiMedia [10] or IR-UWB as employed in [20]. The interleaving scheme also relaxes the receiver baseband channelization, since the adjacent sub-band signals lie at 500 MHz offset, instead of 250 MHz. Furthermore, multipath effects can be mitigated by placing time intervals between symbols at the same frequency [17].
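The interleaving described above can be sketched as a simple slot schedule. This is an illustrative model only: which set occupies the even slots is an arbitrary choice made here, and the helper names are hypothetical.

```python
import math

# Two interleaved sub-band sets from the text (GHz). Which set occupies the
# even slots is an arbitrary assumption made for this sketch.
SET_A_GHZ = (6.5, 7.0, 7.5, 8.0)
SET_B_GHZ = (6.75, 7.25, 7.75, 8.25)


def active_subbands(slot):
    """Sub-bands transmitted during the given 8 ns slot."""
    return SET_A_GHZ if slot % 2 == 0 else SET_B_GHZ


def duty_cycle(freq_ghz, n_slots=16):
    """Fraction of slots in which a given sub-band is active."""
    on = sum(1 for s in range(n_slots) if freq_ghz in active_subbands(s))
    return on / n_slots


def min_spacing_mhz(slot):
    """Smallest spacing between concurrently active sub-bands in a slot."""
    f = sorted(active_subbands(slot))
    return min((b - a) * 1000.0 for a, b in zip(f, f[1:]))


def peak_power_boost_db(duty):
    """Instantaneous power increase that keeps the average power unchanged."""
    return 10.0 * math.log10(1.0 / duty)


if __name__ == "__main__":
    print(duty_cycle(7.0), min_spacing_mhz(0), peak_power_boost_db(0.5))
```

In this model each sub-band is on half of the time, concurrent sub-bands are 500 MHz apart, and doubling the instantaneous power corresponds to a boost of about 3 dB.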
III. TRANSMITTER IMPLEMENTATION
The transmitter consists of four identical modules. Each module includes an independent PLL to provide a carrier frequency for one sub-band. Single-sideband mixing is employed in each module to generate a second sub-band that is separated by 250 MHz relative to the PLL output frequency. A total of eight carrier frequencies are thus generated. Both the sub-bands generated in a module employ a merged signal path and share the LO generation circuit, a phase modulation block, and an output stage as shown in Fig. 2. In each module, phase modulation is performed by choosing one phase out of four quadrature LO phases in response to the input data. Pulse-shaping is applied within the output stage. The operation of each of these blocks and the signal flow are described below.
A. LO Phase Generation
The four phases required for QPSK modulation are implemented using a quadrature VCO (Fig. 3) that is locked within a PLL. The quadrature VCO is implemented using two LC oscillators that are coupled through parallel NMOS transistors whose sizes are determined based on a tradeoff between the phase noise and the phase accuracy [21]. A 5-bit digital control word is used for frequency tuning by selectively turning on the capacitor bank in the oscillator tank. The capacitor bank consists of metal-oxide-metal capacitors (MOMCAP) in series with NMOS switches. The source and drain nodes of the switching NMOS devices are biased via weak inverters so that the NMOS resistance in the "on" state and the parasitic junction capacitance in the "off" state are both minimized, which reduces the loss in the tank and increases the tuning range [22].
A given module generates LOs at $f_{LO}$ and $f_{LO}$ + 250 MHz to alternately transmit pulses in two sub-bands. Single-sideband-based LO synthesis is employed to avoid the requirement for two PLLs in each module, which would result in a large area, since there are four such modules. Using a single PLL and changing its division ratio for generating the two LOs is not an option, due to the requirement for settling time. The use of wideband IF similar to [23] offers agile band-switching at the expense of area and power. Here, single-sideband mixing with a 250 MHz sinusoidal LO is used and fast band-switching is possible simply by multiplexing between PLL outputs and SSB mixer outputs. Multiplexing the input of the SSB mixer [24] is not employed, because the 250 MHz input node is designed to have a limited bandwidth to minimize harmonics as explained in Section III-B.
The MUX in Fig. 4 selects between the outputs of the PLL and the SSB mixer while providing gain. Spurious coupling between the MUX inputs is prevented by completely separating the two paths. When "SEL" is in a "HIGH" state, the path through the second differential pair is disabled by pulling its gate voltage to GND, and vice versa. Also, the smaller input capacitance compared to the selector in [24] is helpful for maximizing the bandwidth along the LO path. The same MUXs are used to select the LO phase to implement QPSK modulation. In order to not affect the output PSD, the MUX inputs for phase selection and band selection are changed between pulses when the output amplitude is near zero.
It is critical to minimize the harmonics of the sinusoidal waveform that is applied to the SSB mixer, since the unwanted outputs at $f_{LO}$ + n × 250 MHz appear in the adjacent channels and can violate the regulation mask if they are not adequately suppressed. A tuned L-C load at the output, e.g., [24], is not used, to save area and the overhead of the tuning circuit. Instead, harmonics of 250 MHz are suppressed at the input port, as explained in the next section. The harmonics of $f_{LO}$ are always at or above 13 GHz and, hence, are not critical as they fall outside of the signal bandwidth.
B. Sinusoidal Synthesis
A current DAC can be used to generate the I/Q sinusoids, e.g., [25] . In our application, this would imply a high clock rate of 4 Gsps to avoid aliasing, which is power hungry. Furthermore, since eight mixers require identical in-phase and quadrature 250 MHz LOs, distributing the sinusoids with adequate linearity poses a design challenge.
It has been shown previously that a properly weighted sum of ideal square waves with a phase difference of 45° can completely cancel the third and fifth harmonics of the fundamental frequency in a harmonic rejection mixer [26]. Similarly, the weighted sum of waveforms with a phase difference of 22.5° suppresses the seventh and ninth harmonics. Higher harmonics from the 11th onward, which lie at or above 2.75 GHz, are attenuated by the gm-C low-pass filter at the input port of the SSB mixer.
The required multiphase clocks are derived from the 8 GHz PLL by digital dividers and logic gates, and are shared among the four modules. The quadrature phase of the 250 MHz LO is also easily generated with the same set of multiphase clocks used for the in-phase signals.
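The harmonic cancellation can be checked with the Fourier series of an ideal square wave. The sketch below is an idealized model, not the implemented circuit: it assumes 50% duty-cycle square waves and cosine-valued weights (1 : √2 : 1 for the 45° case), and the function names are hypothetical. The actual weights and clock phases of the design are not given in the text.

```python
import math


def sine_weighted_phases(step_deg):
    """Phases (degrees) and ideal cosine weights for square waves spaced by step_deg.

    Assumption: weights equal cos(phase); the waveform at +/-90 degrees has
    zero weight and is omitted.
    """
    n = int(round(90.0 / step_deg))
    phases = [m * step_deg for m in range(-(n - 1), n)]
    weights = [math.cos(math.radians(p)) for p in phases]
    return phases, weights


def harmonic_amplitude(k, phases_deg, weights):
    """Amplitude of the k-th harmonic of a weighted sum of ideal square waves.

    A unit square wave shifted by phi contributes (4 / (pi k)) exp(-j k phi)
    for odd k and nothing for even k.
    """
    if k % 2 == 0:
        return 0.0
    total = 0j
    for p, w in zip(phases_deg, weights):
        ang = k * math.radians(p)
        total += w * complex(math.cos(ang), -math.sin(ang))
    return abs(total) * 4.0 / (math.pi * k)


if __name__ == "__main__":
    for step in (45.0, 22.5):
        ph, wt = sine_weighted_phases(step)
        print(step, [round(harmonic_amplitude(k, ph, wt), 6) for k in (1, 3, 5, 7, 9)])
```

In this idealized model the 45° set nulls the third and fifth harmonics while the seventh remains, and the 22.5° set also nulls the seventh and ninth. Real circuits reach finite rejection because of weight and phase mismatch, which is one reason the additional low-pass filtering is useful.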
C. Output Stage
The pulse shape in the time domain determines the PSD of each sub-band and, hence, the PSD of the entire output spectrum. A rectangular pulse shape leads to large sidelobes that are undesirable. Better side-lobe performance is achieved through the use of triangular pulse-shaping, which is relatively easy to generate using a current source and a capacitor [18] . However, this approach requires a linear transconductor to convert a triangular waveform into a current and, hence, can pose a design challenge. A sinusoidal pulse shaper is implemented in this work. It operates directly in the current domain and provides better side-lobe suppression than either rectangular or triangular pulse-envelopes.
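The side-lobe comparison between pulse envelopes can be illustrated numerically. The sketch below is not the measured or simulated PSD of the design: it assumes that the sinusoidal template is a raised cosine, 0.5(1 − cos ω_p t), spanning one 8 ns pulse, samples each envelope at 64 points, and reports the peak side-lobe level of its spectrum relative to the main lobe. All names are hypothetical.

```python
import cmath
import math


def peak_sidelobe_db(window, oversample=32):
    """Peak side-lobe level (dB relative to the main-lobe peak) of a sampled envelope."""
    n = len(window)
    mags = []
    for k in range(n * oversample // 2 + 1):
        f = k / (n * oversample)  # normalized frequency, 0 ... 0.5 cycles/sample
        acc = sum(w * cmath.exp(-2j * math.pi * f * i) for i, w in enumerate(window))
        mags.append(abs(acc))
    # Walk down the main lobe until the magnitude starts to rise again.
    i = 0
    while i + 1 < len(mags) and mags[i + 1] <= mags[i]:
        i += 1
    return 20.0 * math.log10(max(mags[i:]) / mags[0])


N = 64  # samples per 8 ns pulse (arbitrary resolution for this sketch)
RECT = [1.0] * N
TRIANGLE = [1.0 - abs(2.0 * (i + 0.5) / N - 1.0) for i in range(N)]
# Assumed sinusoidal template: raised cosine over one pulse duration.
RAISED_COS = [0.5 * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / N)) for i in range(N)]

if __name__ == "__main__":
    for name, w in (("rect", RECT), ("triangle", TRIANGLE), ("raised cosine", RAISED_COS)):
        print(name, round(peak_sidelobe_db(w), 1))
```

Under these assumptions the ordering of the side-lobe levels follows the claim in the text: rectangular is highest, triangular is lower, and the raised-cosine envelope is lowest.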
The sinusoidal pulse-template is generated using the same approach employed for sinusoidal synthesis (Section III-B), except that the fundamental frequency is 125 MHz, which corresponds to a pulse duration of 8 ns. The output stage including the upconversion mixer and the output network is shown in Fig. 5(c). The right branch removes the pulse-template frequency term of $\omega_p$ in the differential current outputs, prior to an external balun. This helps to suppress the harmonics of $\omega_p$ that can arise from circuit and balun imperfections. The current through one of the differential outputs is given by
The output current of four elements is summed after cascode devices that employ thick-gate MOSFETs. This protects the transistors in the output stage by limiting the voltage below the nominal voltage of 1.2 V and also provides isolation between the modules.
IV. RECEIVER IMPLEMENTATION
Since the transmitted signal has a low power level to conform to the mask, and the path loss at the frequency of interest is large, the receiver must exhibit high sensitivity to achieve a sufficiently long communication range. The relationship between the transmitted power and the available power at the receiver is given by Friis' transmission equation
$$P_r = P_t G_t G_r \left(\frac{\lambda}{4\pi R}\right)^2 \qquad (2)$$
where $P_r$ and $P_t$ are the received and transmitted powers, respectively, $G_t$ and $G_r$ are the antenna gains, $\lambda$ is the wavelength, and $R$ is the distance between the transmitter and the receiver. Assuming 0 dB antenna gains, the free-space path loss is approximately 57 dB at 2 m for a signal of 8.25 GHz center frequency. Even if the transmitter could occupy the entire spectrum from 6 to 8.5 GHz with the maximum PSD of −41.3 dBm/MHz as specified in the ECC regulation, the available power at the receiver is about −64 dBm at a distance of 2 m.
The receiver sensitivity is the lowest power level for which the SNR at the receiver output is sufficient to ensure the acceptable minimum bit-error rate performance, and is given by
$$\text{Sensitivity} = -174\ \text{dBm/Hz} + 10 \log B + NF + SNR \qquad (3)$$
where −174 dBm/Hz represents the thermal noise density at the receiver antenna at 290 K, $B$ is the signal bandwidth in Hz, and $NF$ is the noise figure of the receiver. The required SNR for QPSK with a BER of $10^{-3}$ is nearly 10 dB. Assuming a receiver noise figure of 7 dB, the receiver sensitivity is calculated to be −64 dBm for the signal bandwidth of 2 GHz. Assuming an omnidirectional antenna, the received power at a distance of 2 m in a 2 GHz bandwidth will be slightly in excess of this level. This is the case since the path loss is not a constant at 57 dB, as assumed above, and in fact decreases as the carrier frequencies decrease below 8.25 GHz. However, taking into consideration antenna and board losses at the transmitter and receiver, which are of the order of 5 dB, it can be observed that the power level at the input of the receiver is lower than the desired sensitivity. It is possible to lower the receiver noise figure to improve the sensitivity. However, this translates into greater power dissipation, and ultimately the approach is limited in its effectiveness, since in practice lowering the receiver NF significantly below this level is impracticable.
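The link-budget figures quoted above can be reproduced with a short calculation. This is a minimal sketch under the same assumptions as the text (free space, 0 dB antenna gains, 290 K noise floor); the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def fspl_db(freq_hz, dist_m):
    """Free-space path loss from Friis' equation with 0 dB antenna gains."""
    return 20.0 * math.log10(4.0 * math.pi * dist_m * freq_hz / C)


def total_tx_power_dbm(psd_dbm_per_mhz, bw_mhz):
    """Total power of a flat PSD integrated over the given bandwidth."""
    return psd_dbm_per_mhz + 10.0 * math.log10(bw_mhz)


def received_power_dbm(psd_dbm_per_mhz, bw_mhz, freq_hz, dist_m):
    """Available power at the receiver for a flat-PSD transmitter."""
    return total_tx_power_dbm(psd_dbm_per_mhz, bw_mhz) - fspl_db(freq_hz, dist_m)


def sensitivity_dbm(bw_hz, nf_db, snr_db):
    """Sensitivity = -174 dBm/Hz + 10 log10(B) + NF + SNR."""
    return -174.0 + 10.0 * math.log10(bw_hz) + nf_db + snr_db


if __name__ == "__main__":
    print(round(fspl_db(8.25e9, 2.0), 1))                          # path loss, dB
    print(round(received_power_dbm(-41.3, 2500, 8.25e9, 2.0), 1))  # dBm
    print(round(sensitivity_dbm(2e9, 7.0, 10.0), 1))               # dBm
```

The received power and the required sensitivity come out nearly equal, which is the motivation for adding array gain in the receiver.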
In order to enhance the effective receiver sensitivity, we utilize a beamforming approach. Employing multiple antennas at a receiver configures the communication channel as a single-input multi-output (SIMO) system and provides several benefits at the expense of power and complexity. Multi-input multi-output (MIMO) or multi-input single-output (MISO) implementations are not considered in this research, as the benefits from these configurations are smaller than in other wireless standards due to the transmit-power regulations. The following sections describe the SIMO receiver architecture and its circuit implementation.
A. Channelization
The SIMO receiver (Fig. 6) has four identical paths operating in parallel, each connected to its own antenna. In each path, the received signal band spanning 6-8.5 GHz is downconverted to baseband into sub-bands that are spaced apart by 500 MHz. The receiver path uses an architecture similar to [27], wherein all LOs are derived from dividing the main LO, to avoid the generation of all the required center frequencies. This simplifies the receiver architecture, and hence layout, especially when multiple paths are needed for beamforming. By downconverting all sub-bands to baseband prior to digitization, the ADC requirements are also relaxed.
As mentioned earlier, the receiver consists of four identical paths for beamforming. All the baseband outputs of each path are aligned at the output and summed in the current domain. This signal is applied to a variable-gain amplifier (VGA) that uses input resistors to convert the current signal to voltage. The signals in the sub-bands are downconverted through one or more stages of mixers depending on their center frequency. The first mixer employs an LO of 8 or 8.25 GHz to downconvert the incident signal. A PLL generates 8 GHz LO, and the subsequent single-sideband mixing provides 8.25 GHz as in the transmitter. The selection of 8 or 8.25 GHz is decided in the synchronization block in the FPGA.
After the VGAs, integrate-and-dump filters (IDFs) convert the VGA outputs to dc levels that are captured by ADCs at the end of each symbol period. An IDF functions as a low-pass filter with a sinc transfer function that has nulls at the center frequencies of the adjacent channels. Eight 6 bit ADCs are employed for digitizing the two (I/Q) baseband signals that are obtained from each of the four sub-bands. The ADC outputs are transmitted to the FPGA by six LVDS interfaces running at 1 Gbps per pair, for a total of 6 Gbps.
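The nulls of the integrate-and-dump response and the ADC interface rate can be checked numerically. The sketch assumes an ideal rectangular integration window equal to the full 8 ns symbol period, which the text does not state explicitly; the function names are illustrative.

```python
import math

T_INT_S = 8e-9  # assumed integration window: one full symbol period


def idf_gain(f_offset_hz, t_int_s=T_INT_S):
    """Magnitude response |sinc(f T)| of an ideal integrate-and-dump filter."""
    x = math.pi * f_offset_hz * t_int_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def adc_throughput_bps(n_adcs=8, bits=6, sample_rate_hz=125e6):
    """Aggregate ADC output rate sent to the FPGA."""
    return n_adcs * bits * sample_rate_hz


if __name__ == "__main__":
    print(idf_gain(0.0), idf_gain(500e6))  # in-band gain, adjacent sub-band gain
    print(adc_throughput_bps())            # total interface rate in bit/s
```

With this window the nulls fall at multiples of 125 MHz, so a concurrently received sub-band at a 500 MHz offset lands on a null, and the ADC output rate matches the total interface rate quoted in the text.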
B. Low-Noise Amplifier
The low-noise amplifier (LNA) [Fig. 7(a)] employs a common-gate common-source (CGCS) configuration that offers the advantages of wideband input matching and single-ended-to-differential conversion. Also, it decouples the matching requirement from the noise figure performance.
An off-chip inductor is used to provide a dc path for the common-gate (CG) branch. A common-source (CS) branch provides an inverted output and additional gain. The use of an inductive connection to ground also provides ESD protection to the input node once the IC is mounted on the board. Assuming a matched condition of $1/g_{m1} = R_S$ and ignoring the small-signal output resistance of the transistors, the parasitic capacitance, and the biasing components, the gain from $V_{RFin}$ to $V_{RFout}$ can be shown to be
where $g_{m2}$ is the transconductance of $M_2$ and $Z_L$ is the load impedance. Then, the noise factor $F$ is given by
where γ is the excess noise factor in the drain thermal noise. When the transconductance of the CG branch is equal to that of the CS branch, CG-induced noise is canceled [28]. However, the overall noise figure becomes better with more gain in the CS branch by reducing the other noise terms. The gain ratio is set to be 2 as a tradeoff between the noise figure and the power consumption. Instead of a resistive load, on-chip inductors are employed to enhance the output bandwidth and reduce load noise. As the signal of interest is broadband, area-efficient, low-Q inductors are sufficient for this purpose.
C. Downconversion Mixers
Fig. 7(b) shows the modified active mixer that is employed in the frontend. The mixer uses two resistors between the source nodes of the switching differential pairs and the supply voltage. Adding the dc current paths helps to relax headroom and allows for a larger bias current [29] at the expense of noise from the added components. The mixers are buffered using source followers to drive the subsequent passive mixers and the amplifiers.
The second-stage mixer is a single-sideband passive mixer with a 25% duty cycle LO of 1 GHz. The LOs for the mixer are generated by dividing the 8 GHz LO followed by "AND" operations. This mixer exhibits good linearity while providing isolation between the I- and Q-paths [30]. After the second stage, the signal from the sub-band of 6.5/6.75 GHz is transferred to −0.5 GHz, while the signal from 7.5/7.75 GHz is transferred to +0.5 GHz. The third stage of the mixers frequency-translates these signals to dc using a 25% duty cycle active mixer structure. Two identical mixers with different connections to implement +0.5 and −0.5 GHz mixing are used, and one of the designs is shown in Fig. 8. An active mixer is utilized in this stage to provide isolation between the +0.5 and −0.5 GHz paths. The required quadrature phases are also generated in the same manner as in the 1 GHz mixers.
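The cascaded frequency plan can be traced sub-band by sub-band. The sketch below is one reading of the description, with hypothetical names: the first LO is 8 GHz for the [6.5, 7, 7.5, 8] GHz set and 8.25 GHz for the other set, the second stage shifts by +1 GHz, and the third stage shifts by ±0.5 GHz. A sub-band stops as soon as it reaches dc.

```python
def downconversion_path(f_rf_ghz):
    """Signed frequencies (GHz) after each mixer stage for one sub-band.

    Assumed plan (a reading of the text, not a confirmed schematic):
      stage 1: subtract the first LO (8 or 8.25 GHz, depending on the set)
      stage 2: single-sideband shift by +1 GHz
      stage 3: shift by -0.5 or +0.5 GHz to reach dc
    """
    index = round((f_rf_ghz - 6.5) / 0.25)          # 0 ... 7 across the band
    first_lo = 8.0 if index % 2 == 0 else 8.25      # set selection
    stages = [f_rf_ghz - first_lo]
    if stages[-1] != 0.0:
        stages.append(stages[-1] + 1.0)
    if stages[-1] != 0.0:
        stages.append(stages[-1] - (0.5 if stages[-1] > 0 else -0.5))
    return stages


if __name__ == "__main__":
    for f in (6.5, 6.75, 7.0, 7.25, 7.5, 7.75, 8.0, 8.25):
        print(f, downconversion_path(f))
```

In this reading the highest sub-band of each set needs one stage, the 7/7.25 GHz sub-bands need two, and the remaining sub-bands need three, consistent with the signals at −0.5 and +0.5 GHz after the second stage.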
D. Vector Modulator
Phase-rotation of the I and Q baseband signals for optimal combination of the SIMO signals can be performed in several ways, examples of which include Cartesian interpolation [31] or a switched-capacitor circuit to implement a sinusoidal approximation with a rational equation [32]. Considering that the baseband bandwidth is only 125 MHz wide and there are 16 vector modulators, a simple source-degenerated amplifier-based approach is adopted as shown in Fig. 9. The weighted sum of $(V_{I+} - V_{I-})$ and $(V_{Q+} - V_{Q-})$, where the summation is performed in the current domain, flows at the output. Using small-signal analysis, the differential current output $I_O$ is given by
$$I_O = G_I (V_{I+} - V_{I-}) + G_Q (V_{Q+} - V_{Q-})$$
where $G_I = \frac{g_m}{1 + g_m R_I/2}$ and $G_Q = \frac{g_m}{1 + g_m R_Q/2}$. A four-bit digital control word $\theta_{CTW}$ changes the connection of 11 parallel resistors to adjust the phase θ with a step of 6°. The values of the resistors are fine-tuned in simulation to match their setting to the corresponding value of sin θ. It is noted that the parallel resistors can be shared by separating the angle range into four regions to provide the values of $R_I$ and $R_Q$, because these resistors represent cos θ and sin θ and thus cannot have the same value, except for 45°. The relationship sin θ = cos(90° − θ) is used to simplify the implementation of the decoding logic from $\theta_{CTW}$. The value of θ should cover the entire 360° range, and switching the input polarity supports all combinations of (cos θ, sin θ), (−cos θ, sin θ), (−cos θ, −sin θ), and (cos θ, −sin θ) to span the four quadrants. With four sub-bands in each of the four receiver paths, there are 16 vector modulators that are individually adjusted after monitoring each baseband signal, as explained in Section IV-F.
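The quadrant folding and phase quantization can be sketched as follows. The mapping of the four-bit word onto 0° to 90° in 6° steps is an assumption made for illustration, as are the function names; the resistor decoding of the actual circuit is not reproduced here.

```python
import math

STEP_DEG = 6    # phase step quoted in the text
MAX_CODE = 15   # four-bit control word (assumed to span 0 ... 90 degrees)


def phase_setting(theta_deg):
    """Fold an angle into the first quadrant and quantize it.

    Returns (code, (sign_cos, sign_sin)); the signs model the input polarity
    switching that selects the quadrant.
    """
    theta = theta_deg % 360.0
    if theta <= 90.0:
        folded, signs = theta, (1, 1)
    elif theta <= 180.0:
        folded, signs = 180.0 - theta, (-1, 1)
    elif theta <= 270.0:
        folded, signs = theta - 180.0, (-1, -1)
    else:
        folded, signs = 360.0 - theta, (1, -1)
    code = min(MAX_CODE, int(round(folded / STEP_DEG)))
    return code, signs


def weights(code, signs):
    """Ideal (cos, sin) weights realized by a code and polarity setting."""
    phi = math.radians(code * STEP_DEG)
    return signs[0] * math.cos(phi), signs[1] * math.sin(phi)


def realized_phase_deg(theta_deg):
    """Phase actually produced for a requested angle, in degrees (0 ... 360)."""
    w_i, w_q = weights(*phase_setting(theta_deg))
    return math.degrees(math.atan2(w_q, w_i)) % 360.0


if __name__ == "__main__":
    for t in (0, 10, 100, 200, 357):
        print(t, phase_setting(t), round(realized_phase_deg(t), 1))
```

With ideal weights the quantization error stays within half a step, i.e., 3°. The measured step size of a real modulator also depends on how closely the resistor values track sin θ.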
E. VGAs and ADCs
The baseband VGAs each consist of four stages of amplifiers with a total gain range from 5 to 42 dB and a gain step-size of 3 dB. Each VGA has a dc correction circuit to remove low-frequency signals from the previous stages, since the transmitted signal is dc-balanced. The VGA output is integrated by a transconductor and a capacitor in an integrate-and-dump block. The output of this block is digitized by a 6-bit flash ADC at 125 MHz. Eight such ADCs are integrated onto the transceiver IC.
F. Synchronization and Beamforming
Since the center frequencies of the received sub-bands alternate every 8 ns, the integration window and the frequency-hopping need to be aligned to the timing of frequency change of the incoming signal. Fast synchronization is achieved by reconfiguring the connections to VGAs and varying the integration window timing individually for each baseband block. Fig. 10 shows the timing diagram of the integrate-and-reset circuit during a synchronization period when the LO input of the first mixers is fixed to 8 GHz. For simplicity, the diagram ignores the 250 MHz baseband signal that is downconverted from the 8.25 GHz sub-band. Initially, $CK_{S\{0,1,2,3\}}$ (Fig. 11) are spaced by one-fourth of the symbol period, i.e., 2 ns, which provides four output amplitudes $A_{\{0,1,2,3\},0}$, where $A_{m,k}$ represents the amplitude of the signal in the kth integration interval on the mth antenna. After selecting the maximum of these levels, the receiver compares the levels of samples at adjacent times to decide the direction in which the integration window needs to be moved. For example, if $A_{0,0}$ is picked initially, $A_{3,1}$ and $A_{1,2}$ around $A_{0,2}$ are compared and the integration window for each element $CK_{S\{0,1,2,3\}}$ is moved toward the timing of the lesser of the two by steps of 125 ps, while maintaining a relative timing of 2 ns, until $CK_{S0}$ is at the optimal timing where the signal levels $A_{3,k-1}$ and $A_{1,k}$ are observed to be equal. After the synchronization is completed, all $CK_{S\{0,1,2,3\}}$ are aligned in time to the optimal timing determined previously. For this condition, all integrate-and-dump outputs will nominally have the same output level. The receiver can then continuously track the symbol position by observing the rotation of the baseband I/Q signals. The output clocks of the timing generator are derived from the main LO of 8 GHz using dividers and MUXs. The actual integrator block is implemented as two integrators, so that one is in its integration period while the other is in reset.
The outputs of the four receiver paths need to be optimally combined to generate a beam pattern that maximizes the SNR. In order to do so, the phase of each baseband signal in the individual paths must be rotated such that the signals are aligned. The noise in the four paths is uncorrelated. Once synchronization is achieved, the integration window timing is fixed and the beamforming angle is calculated from the phases of the four baseband signals, $\Phi = \tan^{-1}(Q/I)$, that are available concurrently, by using the same connections as for synchronization. The process of adjusting beamforming coefficients is repeated until all vector modulators are properly set. The receiver is then reconfigured to the normal operating mode, where a VGA is dedicated to receive the summed input of a corresponding sub-band.
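The phase alignment and combining step can be sketched with complex baseband samples. This is an idealized illustration with hypothetical names: it assumes equal signal amplitudes in all paths, exact phase rotation, and uncorrelated noise of equal power, so the SNR gain is the ideal array gain rather than a measured value.

```python
import cmath
import math


def alignment_rotations(samples):
    """Rotation angle for each path, from the measured phase atan2(Q, I)."""
    return [-math.atan2(z.imag, z.real) for z in samples]


def combine(samples, rotations):
    """Rotate each path by its angle and sum the results coherently."""
    return sum(z * cmath.exp(1j * r) for z, r in zip(samples, rotations))


def ideal_snr_gain_db(n_paths):
    """Signal power grows as N^2 and uncorrelated noise power as N."""
    return 10.0 * math.log10(n_paths)


if __name__ == "__main__":
    # Four unit-amplitude baseband samples with arbitrary example phases (radians).
    rx = [cmath.rect(1.0, a) for a in (0.3, 1.2, -2.0, 2.9)]
    y = combine(rx, alignment_rotations(rx))
    print(abs(y), ideal_snr_gain_db(4))
```

For four paths the ideal gain is about 6 dB. A phase quantization of a few degrees in each vector modulator reduces the coherent sum only slightly.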
V. EXPERIMENTAL RESULTS
The prototype is implemented in a commercial 65 nm CMOS process (Fig. 12). The transmitter and the receiver occupy 0.7 and 1 mm², respectively. The area calculation excludes the FPGA interface and the pad area. The breakdown of the power consumption is shown in Fig. 13. This calculation excludes the eight ADCs that dissipate 32 mW in total. For the transmitter, most of the power dissipation is related to LO generation and distribution. This is because the transmitter requires four PLLs, unlike the receiver, which employs only one. We can thus expect that the architecture will benefit from more advanced CMOS technologies. Fig. 14 shows the measurements of the transmitter with a high-speed oscilloscope. The pulse shape is close to a sinusoid for which all harmonics are at least 27 dB lower than the fundamental at 125 MHz. The output PSD meets the targeted ECC mask. Fig. 14(c), in which only the 8/8.25 GHz sub-bands are turned ON, indicates that the signal at 7.75 GHz (image of 8.25 GHz) is around 25 dB smaller than the 8.25 GHz signal, and other unwanted spurs from the PLL are significantly smaller than the LO strength at other sub-bands. The phase noise of the PLL is about −105 dBc/Hz at 1 MHz offset as shown in Fig. 15(a). The QVCO output phase is measured to be accurate within 5°, and Fig. 15(b) is the representation of QPSK modulation in accumulated waveforms. The coupling among the four QVCOs in the TX is measured to be negligible in the test setup, where each PLL is turned OFF one by one to observe the effect on the others.
A test chip that contains only the LNA and the first mixer is used to measure the RF front-end performance separately. S11 is below −8 dB over the frequency range of 6-8.5 GHz, and the combined gain is 25.5-29.5 dB when measured at a fixed IF of 10 MHz (Fig. 16). The overall noise figure of the LNA and first mixer is 5-7.5 dB. A low-pass filter embedded in the VGA and an IDF result in at least 26 dB isolation between sub-bands in a test setup, where the RX receives only one sub-band at a time.
The measurements of the vector modulators are shown in Fig. 17. The amplitude variation is within ±0.5 dB across the entire range, and the phase resolution is 5°-7° as expected. Beam patterns are also measured when the receiver is set for the incident angles of 0° and 30°, and Fig. 18 shows the plot for the 6.5 GHz sub-band. Commercial UWB antennas are mounted on the test board, and the spacing between them is designed to be a half-wavelength at 8 GHz. Because this spacing is a smaller fraction of the wavelength at lower frequencies, the beamwidth of the lower sub-bands is wider than that of the higher sub-bands.
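An idealized beam pattern for this array geometry can be computed from the array factor. The sketch below assumes four isotropic elements in a uniform line with the half-wavelength spacing at 8 GHz quoted above, and ideal per-sub-band phase steering; antenna patterns, coupling, and board effects are ignored, and the names are illustrative.

```python
import cmath
import math

C = 299_792_458.0            # speed of light, m/s
SPACING_M = 0.5 * C / 8e9    # half-wavelength at 8 GHz


def array_factor_db(freq_hz, steer_deg, look_deg, n_elem=4, spacing_m=SPACING_M):
    """Normalized array factor (dB) of a uniform linear array with phase steering."""
    k = 2.0 * math.pi * freq_hz / C
    psi = k * spacing_m * (math.sin(math.radians(look_deg)) - math.sin(math.radians(steer_deg)))
    af = abs(sum(cmath.exp(1j * m * psi) for m in range(n_elem))) / n_elem
    return 20.0 * math.log10(max(af, 1e-12))  # floor avoids log of zero at nulls


def peak_direction_deg(freq_hz, steer_deg):
    """Look angle (integer degrees) at which the pattern is largest."""
    return max(range(-90, 91), key=lambda a: array_factor_db(freq_hz, steer_deg, a))


if __name__ == "__main__":
    for steer in (0, 30):
        print(steer, peak_direction_deg(6.5e9, steer),
              round(array_factor_db(6.5e9, steer, steer + 20), 1))
```

The pattern peaks at the steering angle for both 0° and 30° settings in the 6.5 GHz sub-band, and the element spacing is short enough at this frequency that no grating lobe appears in the visible range.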
The test setup to measure the receiver sensitivity includes the cable connection with discrete attenuators and power dividers. BER is measured after the demodulation in an FPGA that receives 48 bit wide digitized data at the rate of 125 MHz from eight 6 bit ADCs. Fig. 19 shows the received constellation at the highest sub-band of 8.25 GHz, which includes the impact of the phase mismatch due to SSB mixing at both the TX and RX ends under hopping conditions. The sensitivity is about −69 dBm at a BER (uncoded) of $10^{-3}$. As the lower frequency sub-bands experience smaller path loss, their sensitivity numbers are better by 1-2 dB. Successful communication over 2 m range is demonstrated in the setup with commercial antennas that support a bandwidth from 3 to 10 GHz on an FR4 test board in a line-of-sight environment. A longer range will be possible with further optimization of board/antenna design and the use of low-loss PCB material. Also, narrowband interference from the UNII band can be mitigated using a custom antenna that has a notch characteristic in the interfering band, or a front-end notch filter, as the target frequency spectrum does not overlap with the ISM band [33]. Comparison with other UWB implementations is shown in Table II.
VI. CONCLUSION
A multiband IR architecture is proposed for short-range, high data-rate applications as a low-complexity and low-power solution. The architecture is flexible and can be adapted to meet various regulations. The design employs on-chip sinusoid synthesis for achieving frequency-hopping and pulse-shaping. Four-element beamforming, combined with a frequency-hopping architecture, is demonstrated. The prototype demonstrates 1 Gbps data rate over 2 m. All required RF and analog physical-layer circuit functions are integrated, from LNA to the digital interface. The proposed scheme can be extended to address QAM16 modulation for even higher data rates.
