Abstract-This paper presents a four-element phased-array receiver which achieves 38% fractional bandwidth around 55 GHz. Baseband phase-shifting is employed to eliminate wideband phase-shifters, power dividers, and quadrature splitters operating at millimeter-wave frequencies. Antenna weighting and combining are accomplished using a highly digital Cartesian phase-shifter and current summation, respectively. Transformer-coupling techniques are introduced in the LNA to simultaneously achieve wide bandwidth, reduced noise figure 
proving the link budget and ensuring robustness in channels with a weak or blocked line-of-sight path. SiGe [6]-[13] and CMOS [14]-[17] phased-array transceivers with up to 32 elements have been reported for WLAN/WPAN, automotive radar, and imaging. However, since these transceivers target specific applications, their limited bandwidth makes them unsuitable for future multiband scenarios which may require operation in widely separated bands. Current transceivers also consume large die area and power in the front-end, LO distribution, and phase-shifting circuits.
This paper presents a compact, supply-voltage scalable phased-array receiver that achieves 38% fractional bandwidth around 55 GHz. A baseband phase-shifting architecture [14] is chosen to eliminate ultra-wideband phase-shifting circuits in the RF or the LO paths. The signal received at each antenna is amplified using a compact, ultra-wideband low-noise amplifier (LNA), and downconverted by "near-passive" current-commutating I/Q mixers. After complex weighting using digitally controlled Cartesian phase-shifters, the signals are combined using current summation. All circuits employ either a single transistor between the supply rails or folded topologies to enable operation from a scalable supply voltage. In contrast to typical mm-wave front-ends, which are designed to satisfy worst-case channel conditions, the proposed receiver can not only operate from the nominal $V_{DD}$ to achieve high performance in poor channels, but can also operate at a lower $V_{DD}$ (and therefore at reduced power consumption) with moderate performance in better channel conditions. This can lead to large energy savings in a practical phased array incorporating a large number of elements.
The LNA employs transformer coupling techniques to simultaneously achieve neutralization, ultra-wide bandwidth and gain boosting. These techniques allow a more compact design compared with a conventional topology that uses separate (uncoupled) inductors. By careful placement of the terminals of the transformer windings, parasitic inductances in the current return paths are minimized. The built-in neutralization eliminates the need for a cascode transistor in the input common-gate stage, thereby resulting in lower noise.
Power- and area-efficient LO distribution to the phased-array elements is challenging due to the wide target bandwidth. In this work, I/Q LO buffers with current-splitting multiconductor transmission lines (MTL) and distributed loads are used to implement a compact, wideband LO network.
This paper is organized as follows. Section II motivates the choice of the baseband-shifting architecture. Section III builds upon [18] and describes the evolution, analysis, and design of the LNA. Generalized "$\pi$" models for multiwinding transformers are introduced to provide design insights. Sections IV and V describe the mixer/phase-shifter and the LO distribution network, respectively. Section VI presents measurement results, and Section VII concludes the paper.
II. ARCHITECTURAL CONSIDERATIONS
In RF phase-shifting architectures [6], [7], [12], [13], [17], [19], antenna weights are inserted at the input frequency, usually after low-noise amplification. Weighting is implemented using passive phase shifters [6], [7], vector modulators with passive [19] or active quadrature splitters [12], or hybrid architectures that combine active weighting with quadrature downconversion [20]. RF phase-shifting architectures suffer from several shortcomings which inhibit their use for the wide bandwidth under consideration. Passive phase shifters and combiners suffer from limited bandwidth and high insertion loss, the mitigation of which necessitates wideband amplifiers. The bandwidth limitation can be overcome using multisection networks, but with attendant penalties in die area and loss. Also, many passive phase shifters allow phase-only weight control. The use of cascaded RF amplifiers to overcome phase-shifter losses can degrade linearity, which is often identified as a major advantage of RF phase-shifting arrays. Vector modulators that use passive quadrature splitters share these limitations and require calibration [19] and wideband post-amplification. The variable-gain amplifiers in vector modulators must be designed to cover the entire RF bandwidth, which poses additional challenges. Nevertheless, RF phase-shifting architectures enjoy the advantage of simple LO distribution.
LO phase-shifting architectures [8], [15], [19], [22] implement antenna weighting by independently phase shifting the LO to each element. The phase-shifting circuits are usually implemented locally within each element [8]. The Cartesian phase shifter is a common choice since noise and linearity requirements of the weighting amplifiers are reduced [8]. However, the LO phase shifters must be designed to cover the entire bandwidth, which can be problematic for wide bandwidth. In [22], local phase-shifts are incorporated at a sub-harmonic frequency of the LO, followed by local frequency multiplication, to avoid distribution of high frequencies. Injection-locked multipliers can avoid spurs generated by conventional frequency multipliers (e.g., [23]), but suffer from lock-range limitations and require wideband local buffers between the multiplier and the mixer LO ports.
The baseband-shifting architecture (Fig. 1) defers antenna weighting to baseband, after amplification and downconversion. Although the LNA must cover the wide target bandwidth (similar to RF and LO phase-shifting), it is sufficient to design the baseband weighting circuitry to cover the channel bandwidth, rather than the RF bandwidth. Since the former is much smaller in a communication application, this simplifies its design and facilitates a digital-friendly implementation. In the CMOS baseband-shifting phased-array in [14], a single LO signal at the input frequency (for direct conversion) is distributed to each element, where it is locally split into quadrature/differential phases. Since the cascaded quadrature/differential power splitting networks limit bandwidth, and since they provide high inter-element isolation only over a limited bandwidth (e.g., 25 dB isolation for 23% fractional BW [6]), the LO distribution network in the proposed phased-array receiver avoids these approaches in favor of the approach described in Section V.
III. LOW-NOISE AMPLIFIER

A. Topology
Each phased-array element employs a two-stage LNA with a common-gate (CG) first stage and a common-source (CS) second stage. A bandwidth-enhanced series-shunt network [24] is used as the load of the CG stage [Fig. 2(a)] in order to achieve a wideband gain characteristic. A cascode device, conventionally employed for unilateralization, is avoided to facilitate supply voltage scalability. With appropriate neutralization (described in the next subsection), reverse signal flow in the CG stage may be neglected. Neglecting resistances, the transfer function of the stage can be written as (1) where is the effective transconductance and is the transimpedance. In this design, is chosen to be much larger than , and therefore, has two complex-conjugate pole pairs at and , with . As shown by the solid black line in Fig. 3, the frequency response has two peaks at these frequencies, while the frequency span of the peaks is determined by the finite inductor $Q$'s.
The CS second stage also avoids a cascode device to allow supply voltage scalability. The frequency response of the CS stage is designed to have a peak at the center of the desired frequency band, and thus compensates the gain droop between the peaks in the frequency response of the CG stage. This results in a relatively flat frequency response, as illustrated for the designed LNA in Fig. 3 . The overall frequency response of each receive element is further flattened by tailoring the frequency response of the mixer and the wideband LO buffer. In both stages, the elimination of the cascode device helps to improve the noise figure. However, this choice can cause large reverse signal flow and result in poor input matching and potential instability. Transformer coupling techniques are introduced to neutralize each stage, thereby mitigating reverse signal flow, as described next.
B. Transformer-Based Neutralization in the CG Stage
An un-neutralized CG stage without a cascode device suffers from increased reverse signal flow, due to: 1) the finite output resistance of the transistor; 2) drain-source wiring capacitance; and 3) parasitic capacitances and resistances through the gate and the body networks of the transistor, as shown in Fig. 4(a). The reverse signal flow can be modeled by a parallel combination of a resistance and a capacitance which constitute the reverse admittance . Noting that the input admittance is , it is evident that a large reverse signal flow can cause the input admittance to deviate significantly from its unilateral value of . In particular, the input admittance, and hence the quality of the input match, becomes dependent on the load network comprising and . Also, since the voltage gain of the stage is , the aforementioned dependency is exacerbated when the stage is designed for a high voltage gain, as is typically the case.
To understand the proposed neutralization technique, first consider a CG stage with a narrowband load [Fig. 4(b)]. Here, the drain and the source inductors are combined into a transformer with a carefully chosen coupling coefficient. A convenient way to understand how this structure can achieve neutralization is to replace the transformer with a "$\pi$" model (see Appendix A), as shown in Fig. 4(c). Examination of (9) in Appendix A shows that the coupling coefficient can be chosen such that the capacitive portion of resonates with the inductance of the "$\pi$" model: (2) Note that the resulting neutralization is partial since the resistive portion of remains unaffected. However, since the impedance of dominates over at millimeter-wave frequencies, this technique substantially reduces reverse signal flow, and results in a well-controlled input match. Also, by thus reducing the equivalent $Q$ of the parallel network between the source and the drain (2-3 in this design), serves to widen the bandwidth over which neutralization is achieved.
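The neutralization condition described above can be illustrated numerically. The sketch below uses the standard two-port conversion of a pair of coupled inductors sharing a common AC-ground terminal into a pi network; it is not necessarily identical to (9) in Appendix A, which is not reproduced here. The inductances, reverse capacitance, and frequency are hypothetical values chosen only for illustration, and the dot polarity is assumed to be such that the mutual inductance is positive.

```python
import math

def pi_model(l1, l2, k):
    """Pi equivalent of two coupled inductors with a common terminal.

    Returns (shunt_1, series, shunt_2) inductances, obtained by inverting
    the inductance matrix [[l1, m], [m, l2]] (standard two-port result).
    """
    m = k * math.sqrt(l1 * l2)      # mutual inductance, assumed positive
    det = l1 * l2 - m * m           # determinant of the inductance matrix
    return det / (l2 - m), det / m, det / (l1 - m)

def neutralizing_k(l1, l2, c_rev, freq_hz):
    """Coupling coefficient whose pi-model series inductance resonates
    with a reverse capacitance c_rev at freq_hz."""
    l_req = 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * c_rev)
    a = l_req / math.sqrt(l1 * l2)
    # Series inductance is sqrt(l1*l2)*(1 - k^2)/k; setting it equal to
    # l_req gives k^2 + a*k - 1 = 0, whose positive root is taken.
    return (-a + math.sqrt(a * a + 4.0)) / 2.0

# Hypothetical element values (not taken from the paper).
L_S, L_D = 100e-12, 100e-12   # source and drain inductors, in henries
C_REV = 10e-15                # reverse (drain-source) capacitance, in farads
F0 = 55e9                     # neutralization frequency, in hertz

k = neutralizing_k(L_S, L_D, C_REV, F0)
shunt_s, series, shunt_d = pi_model(L_S, L_D, k)
print(f"k = {k:.3f}, series L = {series * 1e12:.0f} pH, "
      f"shunt L = {shunt_s * 1e12:.0f} pH and {shunt_d * 1e12:.0f} pH")
```

With these assumed values the required coupling is weak, so the shunt branches stay close to the original self-inductances while the large series branch resonates out the reverse capacitance.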
The wideband series-shunt load can be elegantly combined with the aforementioned neutralization technique using a multiwinding transformer, as shown in the simplified schematic of Fig. 2(b). To understand the evolution of this topology, first consider introducing magnetic coupling between the inductors and only, with the polarity indicated by the dot points in Fig. 2(b). Replacing the transformer with a "$\pi$" model (see Appendix A) results in the small-signal equivalent circuit shown in Fig. 5(a). Similar to the narrowband case discussed above, the coupling coefficient can be adjusted such that resonates with , thereby neutralizing it. The inductances and in the "$\pi$" model are related to the self-inductances and by (9) in Appendix A; since the required value of is small, it is evident that and . Also, since for small , the inductors that appear in parallel with and may be neglected, thereby resulting in further simplification.
However, reverse signal flow can still occur through the inductor in Fig. 5 (a). To address this issue, magnetic coupling is introduced between and with the polarity indicated by the dot points in Fig. 2 
C. $g_m$-Boosting in the CG Stage
A fourth winding that couples with is introduced, leading to the equivalent circuit of Fig. 5(c). A $g_m$-boosting [25] factor , is obtained. Using the expressions in Appendix A, the $g_m$-boosting factor can be written and approximated as follows for weak coupling between other pairs of windings:
Equation (3) shows that the $g_m$-boosting factor increases with coupling coefficient and the turns ratio between and ; note that the approximation in (3) holds true despite the fact that (3) holds strictly for low . Also, (3) holds only for frequencies . Therefore, the self-inductance , which influences the most, must not exceed a value that is dictated by the capacitance .
D. Input Impedance and Noise Figure of the CG Stage
The input impedance of the CG stage is calculated from the small-signal circuit of must be set equal to the source conductance , while must be made to vanish. The noise figure is calculated using the small-signal circuit of Fig. 6, which employs the Pospieszalski noise model [26] and the generalized "$\pi$" transformer model (5) where is the source conductance. The other constituent terms are defined as follows: (6) represents the parallel admittance of , and . In (5), the second and third terms represent contributions of the drain current noise and the gate noise, respectively. From (5), it can be seen that the second term decreases as is increased since the numerator is proportional to and the denominator is roughly proportional to . Similarly, it can be seen that the third term also decreases as is increased (since ). Thus, can be reduced by increasing . However, input matching [i.e., ] can be preserved by changing (i.e., through coupling) despite change in . Compared with a conventional CG topology, this additional degree of freedom can be exploited to reduce the noise figure while achieving a conjugate match.
Equations (5) and (6) are compared against simulation in Fig. 7 , which shows good agreement for the first stage. Fig. 7 also plots the noise figure of the entire LNA with local RC extraction for the transistors and full EM extraction for the transformers and the interconnect. As expected, some degradation is observed in the bandwidth and the noise figure.
E. CS Stage Design
The large reverse signal flow through the gate-drain capacitance can degrade the input impedance and stability of the cascode-less CS second stage [27] . This is exacerbated by the high voltage gain of this stage, as described in Section III-B. The neutralization technique proposed in [27] , [28] is employed here to reduce reverse signal flow (Fig. 8) .
F. Transformer Design
The design of the four-winding transformer in the CG stage is simplified since only three coupling coefficients are non-negligible . Metal stack details, coupling factors, and a 3-D diagram of the transformer are shown in Fig. 9. The transformer is designed as follows.
1) , which couples to the other inductors, is centrally placed and is drawn primarily in layer .
2) is drawn in layer directly above to obtain high . traces are added to in shunt with its traces. This introduces horizontal ( to ) coupling in addition to the vertical ( to ) coupling, and allows adjustment of .
3) has two turns and is off-centered with respect to to set . is made negligible by increasing the distance between and compared to the distance between and .
4) , drawn in , is well-separated from to and from to set . Its axial distance from is adjusted to set the required value of .
5) Finally, all coils are rotated so that their terminals are in close proximity to their connection points to other circuit components. This allows a priori reduction in parasitic inductances.
IV. MIXER AND PHASE-SHIFTER
The mixer consists of a single-ended CS stage whose output is converted to differential by a balun (Fig. 10). The balun provides a wideband frequency response and decouples the bias current of the stage from that of the LO switching devices, which in turn enables low-$V_{DD}$ operation, ensures better linearity and reduces noise contribution (especially ) from the switches. The balun is connected to double-balanced I/Q mixer quads comprising NMOS transistors whose gate-source voltage is biased to be approximately equal to the threshold voltage. This choice improves the conversion gain of the mixer since it is driven by sinusoidal signals with relatively low swing (180 mV amplitude).
The above design choices are partially motivated by two important benefits that they confer on the design of the wideband LO distribution network. First, the leakage inductance of the balun provides inductive degeneration which causes the input impedance looking into the LO switches to have a real part. As described in Section V-B, this real part decreases the $Q$ of the load of the LO buffers, and reduces the I/Q mismatch and crosstalk in the MTL used in the LO distribution network. Second, as described in Section V-D, the double-balanced mixer reduces inter-element leakage significantly and eliminates the need to employ large, lossy power dividers and quadrature splitters in the LO distribution network.
The downconverted signals are weighted using Cartesian phase shifters (Fig. 10), each comprising four digitally-programmable cells, as shown in Fig. 11(a). Each baseband signal is weighted by a separately programmable complex weight . Each cell [Fig. 11(a)] consists of 15 unit cells arranged in a 4-bit binary-weighted array. An additional bit is used to switch the sign of the programmable transconductance. A folded-cascode topology is used to enable supply-voltage scalable operation. When the unit cell is off, its output common-mode voltage is maintained constant for all settings by using auxiliary differential pairs whose inputs are connected to the input common-mode voltage, as shown in Fig. 11(b). The "sign" and "sign_b" inputs provide signed weights.
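The Cartesian weighting described above can be sketched in software. In the minimal model below, each of the I and Q weight components is a signed integer code from -15 to +15 (4-bit magnitude plus a sign bit). The normalization by 15 and the mapping of the four cells onto the four products of a complex multiplication are assumptions made for illustration; the function names are hypothetical.

```python
import cmath
import math

MAX_CODE = 15  # 4-bit magnitude; the extra sign bit gives codes -15..+15

def all_weights():
    """Every complex weight reachable with signed I and Q codes."""
    codes = range(-MAX_CODE, MAX_CODE + 1)
    return [complex(a / MAX_CODE, b / MAX_CODE) for a in codes for b in codes]

def quantize_weight(target):
    """Nearest (i_code, q_code) pair for a target weight with |Re|, |Im| <= 1."""
    def to_code(x):
        return max(-MAX_CODE, min(MAX_CODE, round(x * MAX_CODE)))
    return to_code(target.real), to_code(target.imag)

def apply_weight(i_in, q_in, i_code, q_code):
    """Complex multiply (i_in + j*q_in) * (a + j*b) using four real products,
    one per programmable cell in this assumed mapping."""
    a, b = i_code / MAX_CODE, q_code / MAX_CODE
    return a * i_in - b * q_in, b * i_in + a * q_in

# Example: a unit-magnitude weight at 45 degrees.
i_code, q_code = quantize_weight(cmath.rect(1.0, math.pi / 4))
print(len(set(all_weights())), (i_code, q_code))
```

The count of distinct weights in this model is 31 x 31, which agrees with the 961 constellation points reported in Section VI.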
V. LO DISTRIBUTION
A. LO Buffer Topology
The LO distribution network is based on a distributed-load LO buffer, whose simplified schematic is shown in Fig. 12(a). The buffer consists of a transconductor driving a transmission line terminated in a parallel LC load which includes the input impedance of the mixers. Assume that and are the propagation velocity and length of the transmission line, so that the frequency at which the line length equals a quarter wavelength is given by . Also assume that the resonant frequency of the LC load is equal to . The first two resonant frequencies of the transimpedance occur at where (7) In (7), and are the characteristic impedances of the LC load and the transmission line, respectively. Equation (7) indicates that the resonance frequencies split such that is lower (higher) than the minimum (maximum) of . As shown in Fig. 12(b), the $Q$ of the LC tank can be set to a sufficiently low value to obtain a flat, wideband frequency response. Note that a lower $Q$ results in lower gain at the band edges, thus necessitating greater power consumption in the transconductor. Conceptually, four such single-ended buffers (or two differential buffers) can be used to distribute the four LO phases through G-S-G transmission lines with sufficient separation (and, optionally, shielding) to minimize coupling between signal traces. However, in order to reduce die area, the present design uses a compact MTL to distribute differential quadrature LO signals. In this case, additional issues arise due to I/Q mismatch and crosstalk in the MTL; analysis and design techniques to mitigate these issues are described next.
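The resonance splitting of the distributed-load buffer can be explored with a small numerical sketch. It does not reproduce (7); instead it solves the exact pole condition of an idealized model: an ideal current source driving a lossless line terminated in a lossless parallel LC tank. For that model, the transimpedance is $1/(Y_L\cos\theta + jY_0\sin\theta)$, so resonances occur where the denominator vanishes. Frequencies are normalized to the quarter-wave frequency of the line, and all function and parameter names are hypothetical.

```python
import math

def _bisect(func, lo, hi, iters=200):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def split_resonances(z_lc_over_z0, f_lc_over_f_qw=1.0):
    """First two resonances, normalized to the quarter-wave frequency.

    z_lc_over_z0:   tank characteristic impedance sqrt(L/C) divided by Z0.
    f_lc_over_f_qw: tank resonant frequency divided by the quarter-wave
                    frequency (must lie strictly between 0 and 2 here).
    """
    if not 0.0 < f_lc_over_f_qw < 2.0:
        raise ValueError("f_lc_over_f_qw must be between 0 and 2")
    r = 1.0 / z_lc_over_z0  # Z0 / Z_LC

    def pole_condition(x):
        theta = 0.5 * math.pi * x  # electrical length of the line
        tank = f_lc_over_f_qw / x - x / f_lc_over_f_qw
        return math.sin(theta) - r * tank * math.cos(theta)

    lower = _bisect(pole_condition, 1e-6, min(1.0, f_lc_over_f_qw))
    upper = _bisect(pole_condition, max(1.0, f_lc_over_f_qw), 2.0)
    return lower, upper

for ratio in (0.5, 1.0, 2.0):
    lo, hi = split_resonances(ratio)
    print(f"Z_LC/Z0 = {ratio}: resonances at {lo:.3f} and {hi:.3f} x f_quarter_wave")
```

In this idealized model the two resonances always straddle both the quarter-wave frequency and the tank frequency, and raising the tank impedance relative to the line impedance pushes them further apart.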
B. I/Q Mismatch and Crosstalk in an MTL-Based LO Buffer
Consider a simplified equivalent circuit of the LO distribution network, shown in Fig. 13. In the following discussion, the MTL is of the mctline1 type described in Section V-C. The characteristic impedance matrix , propagation constant matrix and mode decomposition matrices of the MTL [29], [30] are extracted by post-processing the output from an electromagnetic field solver [31]. While these matrices are frequency-dependent, the values at 60 GHz are shown in Table I. Using MTL theory [29], the output voltage can be expressed as a $4 \times 1$ column vector comprising the four phases (8) where is the MTL length, is the identity matrix, , and . The buffer is driven by differential quadrature excitation (i.e., ). The matrix represents the Norton equivalent output impedance of the transconductor and represents the load matrix. The output voltages are calculated from (8) using values extracted from simulation for the constituent variables. Observations, tradeoffs, and design guidelines are summarized here.
1) The transimpedance magnitudes of the MTL are similar to Fig. 12(b) and are therefore not plotted. This suggests that the requisite bandwidth can be achieved with an identical choice of parameters as in Fig. 12(b). However, this results in regions of large, rapidly changing I/Q phase and amplitude mismatches (called "spikes") at frequencies just higher than (and just lower than) the lower (and upper) resonant frequencies [Fig. 14(b)]. This is due to the combined effect of mismatches in the different propagation modes of the MTL (see Table I) and the large changes in the phase-shifts around the resonant frequencies. If left unmitigated, such spikes within the band of interest can consume part of the dynamic range of the baseband phase-shifter and can lead to distortion similar in effect to beam-squinting [32]. Therefore, it is important to reduce the spikes and design the proposed LO network such that the spikes are driven outside the RF bandwidth; this is especially challenging considering the wide target bandwidth.
2) The aforementioned propagation mode mismatches are due to geometric (length) mismatches and asymmetries, and crosstalk between the signal traces of the MTL. Geometric mismatches are difficult to eliminate completely even with careful layout, especially at bends and transitions between metal layers. Additional geometric mismatches are incurred due to random variations in trace geometry, spacing and fill patterns. On the other hand, crosstalk between signal traces occurs due to capacitive and inductive coupling between the signal traces. Capacitive coupling can be effectively reduced by using shielded (conductor-backed) transmission lines. However, inductive coupling is only partially reduced by shielding, and decreases relatively slowly with increasing separation between the traces. This can severely affect the overall compactness. Note that this discussion considers imperfections in I/Q amplitude and phase due to the MTL alone; in practice, additional mismatches will be incurred due to source-side mismatches in the LO driver and load-side mismatches in the LC loads and the mixer. In the remainder of the discussion, the mismatches are assumed to be deterministic.
3) I/Q mismatch spikes can be reduced by reducing the quality factor of the LC termination, as illustrated in Figs. 12(b) and 14(a). In this design, this is accomplished by using a low-$Q$ miniature inductor, and inductive source degeneration (from the leakage inductance of the baluns, see Fig. 10) in the mixer's LO switches. However, since reducing $Q$ results in an increase in power consumption, it was set at an optimum level in this design.
4) In addition to decreasing $Q$, it is necessary to split the resonant frequencies further apart in order to locate the mismatch spikes outside the band of interest. In other words, the bandwidth must exceed that required for a simple LO buffer which is not constrained by I/Q matching considerations. From (7), this can be accomplished by: (i) increasing the line length to decrease ; (ii) increasing ; and (iii) increasing (by increasing for a given ). Note that the latter two knobs are inter-related since the capacitance in the LC load depends on the input impedance of the mixer.
Fig. 14(b) illustrates the calculated I/Q phase mismatch for the mctline1 MTL geometry optimized using the above considerations. For comparison, Fig. 14(b) also illustrates the phase mismatch obtained with a mismatch-free MTL, and a mismatch and crosstalk-free MTL.
The proposed (unshielded) mctline1 is compared in Table II against different unshielded and shielded MTL configurations. Examination of the extracted inductance and capacitance parameters quantifies the previously drawn qualitative conclusions on inductive and capacitive coupling. In order to compare the isolation between various ports of each MTL, , , , and are listed in Table II; examination of these parameters reveals that increasing spacing is indeed the most effective way to reduce coupling, albeit at the cost of compactness. The extracted parameters for each line are then used in the model described by (8), and the resulting I/Q mismatches are plotted in Fig. 15.
Extension to Larger Arrays: The proposed LO distribution approach can be scaled to higher numbers of elements. For example, for 16 elements, two-level cascaded buffering can be used to overcome losses, with 1:4 splitting at each level. An H-tree MTL network can be absorbed into the buffers similar to the proposed approach.
C. Design
The LO distribution subsystem [Fig. 16(a)] is designed using the above guidelines. It comprises a pair of transformer-coupled cascode stages (Fig. 16(a), inset), which are input-matched using inductive source degeneration. The neutralization capacitors are critical to ensuring the stability of the buffer. A single transistor between the supply rails, and inter-stage transformer coupling, facilitate operation from a scalable supply voltage. The differential outputs of the stages are routed to the phased-array elements through a network of coplanar four-phase MTLs. The MTL network, shown in Fig. 16(a), comprises three parts: a 218 µm long mctline1, a 124 µm long 5:10 splitter sptline (here, five results from four signals plus one ground return conductor) and two 110 µm long mctline2 MTLs. The geometries of mctline1 and mctline2 are shown in Fig. 16(b); the geometry of sptline follows that of mctline1 before the 90° bend and that of mctline2 after the bend. Each mctline2 MTL is terminated in an LC load (EM extracted values are 147 pH, 28 fF and 350 Ω). Each LC termination is shared between two adjacent array elements, without a second splitter [Fig. 16(a)]. This choice, together with the use of MTLs, allows a tight pitch and results in a compact implementation. The resulting inter-element length mismatch is only 35 µm, which results in an inter-element phase mismatch of less than 4° over the entire bandwidth. Fig. 17 shows the transistor-level simulated magnitude and phase mismatch responses of the LO distribution network.
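As a quick sanity check on the quoted termination values, the following sketch computes the resonant frequency, characteristic impedance, and quality factor of the LC load. It assumes that the three EM-extracted values form a simple parallel RLC network and that the third value (350) is a parallel resistance in ohms; the effect of the mixer input impedance and the line itself is ignored, so the numbers are only indicative.

```python
import math

# Values quoted in the text; the parallel-RLC interpretation is an assumption.
L_LOAD = 147e-12   # henries
C_LOAD = 28e-15    # farads
R_LOAD = 350.0     # ohms, assumed to be the equivalent parallel resistance

f0 = 1.0 / (2.0 * math.pi * math.sqrt(L_LOAD * C_LOAD))  # tank resonance
z_lc = math.sqrt(L_LOAD / C_LOAD)                        # characteristic impedance
q_tank = R_LOAD / z_lc                                   # Q of a parallel RLC tank

print(f"f0 = {f0 / 1e9:.1f} GHz, Z_LC = {z_lc:.1f} ohm, Q = {q_tank:.2f}")
```

Under these assumptions the tank resonates near 78 GHz with a quality factor a little below 5, which is in line with the deliberately low-Q termination discussed in Section V-B.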
D. Inter-Element Coupling Considerations
This subsection analyzes inter-element coupling through the proposed LO distribution network and shows that sufficient isolation can be achieved with the proposed approach, despite the avoidance of dividers or splitters. Fig. 18 shows the coupling mechanisms of an input signal to one element leaking through to the output of another element. The coupling path is decomposed into three gains: which represents leakage from the mixer's RF port into the LO network, which represents the isolation between ports 1 and 2 of the LO network, and which represents the conversion gain from the LO input to the IF output. Due to the double-balanced mixer, and are non-zero, but small, due to balun and device mismatches. For the current-splitting network used here, is ideally unity since no isolation is present. The overall inter-element coupling from the input of element 1 to the output of element 2 is the product of the above gains. With only balun mismatch, 31 dB and 29 dB were estimated by simulation.
1 dB ( 6 dB) between adjacent (nonadjacent) elements was estimated due to resistive losses. Monte Carlo simulations including device mismatches indicate worst case 50 dB over the bandwidth of interest.
VI. EXPERIMENTAL RESULTS
The phased-array receiver (Fig. 19) is fabricated in 45 nm SOI CMOS which features 200/200 GHz for an NMOS contacted from the topmost metal (the for an NMOS contacted from M1 is significantly higher). The active area is 0.225 mm²/element excluding pads, but including the LO buffer and distribution network. The chip is bonded to an FR4 board. DC biases are provided through wirebonds and RF signals are accessed by wafer probing. Figs. 20 and 21 show the measured conversion gain and noise figure of a single element. Under 1.1 V operation, the array elements achieve a maximum gain of 26.2 dB over a 3 dB bandwidth from 45 to 66 GHz (37.8% fractional BW). From a 0.6 V supply, a gain of 20.2 dB is achieved from 48 to 67 GHz. Each element achieves a minimum NF of 5.5 dB from a 1.1 V supply, and a modest increase in NF to 7.7 dB is measured at 0.6 V. Post-layout simulations with electromagnetic field-solver models for inductors, transformers, and major interconnect are also plotted in Figs. 20 and 21. The measured gain exhibits a stronger low-frequency peak than high-frequency peak (opposite to the simulation), which caused the minimum-NF frequencies to shift towards the lower end of the band. The differences are attributed to modelling inaccuracies (e.g., automated fill for lower level metals) and process variation. The presented chip represents our first silicon in this technology, and therefore was not guided by simulation-measurement correlation of the transistors and passive devices from prior tapeouts. Also, the LOs were generated using an N5247A PNA-X network analyzer, which has poor phase noise (-90 dBc/Hz at 1 MHz offset); this is partially responsible for the degradation in the measured NF compared to simulation. For all measurements, 0 dBm of LO power was delivered at the pads. The array achieves excellent inter-element matching (1.5 dB gain difference) over the entire 20 GHz bandwidth, as shown in Fig. 22.
The inter-channel phase difference was measured to be less than 5° rms at mid-band over all digitally programmed gain and phase settings of the phase-shifters (however, this measurement was not made over the entire RF bandwidth). The receiver achieves a worst case input match of 9 dB over the band of interest (Fig. 23). The measured 1 dB input gain compression point is shown in Fig. 24 across the band of interest.
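The fractional-bandwidth figures quoted above follow directly from the measured band edges. The short sketch below assumes the common definition of fractional bandwidth as the 3 dB bandwidth divided by the arithmetic-mean center frequency; the function name is hypothetical.

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Bandwidth divided by the arithmetic-mean center frequency."""
    center = 0.5 * (f_low_ghz + f_high_ghz)
    return (f_high_ghz - f_low_ghz) / center

# Band edges reported for the two supply settings.
for supply_v, f_low, f_high in ((1.1, 45.0, 66.0), (0.6, 48.0, 67.0)):
    fbw = 100.0 * fractional_bandwidth(f_low, f_high)
    print(f"{supply_v} V: {f_high - f_low:.0f} GHz bandwidth, {fbw:.1f}% fractional")
```

With this definition the 45 to 66 GHz band gives the 37.8% figure stated in the text, centered at 55.5 GHz.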
I/Q gain and phase errors were measured by sweeping the LO across the target frequency range, while keeping the IF and the phase-shifter settings fixed. The results, plotted in Fig. 25 , demonstrate that the I and Q paths in the LO distribution network remain well matched across the desired frequency range.
The Cartesian phase shifter is characterized next by programming weight codes through a test register. The I and Q gain components are almost linear relative to the input code [Fig. 26(a)]. Fig. 26(b) shows 961 possible weights in the complex plane. In order to steer nulls in the array pattern, both the amplitude and phase of the complex weights must be programmable, in contrast to phase-only control [14], [33] which confers beam-steering capability but no control over notch locations. The amplitude and phase errors of the constellation points in Fig. 26(b) include the effects of I/Q mismatches in the LO distribution network, the mixer cores and mismatches between the cells in the phase shifter. Measured rms gain and rms phase errors over all digitally programmed gain and phase settings are plotted in Fig. 27 across the entire frequency range; the uncorrected rms gain error remains below 1.5 dB, while the uncorrected phase error remains relatively flat over frequency, increasing to a worst case of about 5° (rms) at 45 GHz. Note that the residual I/Q mismatches can be compensated using the coarse-grained digital and/or the fine-grained analog control of the phase shifters. Fig. 28(a) shows the measured gain variation of the phase shifter using analog (continuous) control; this figure also shows that the receiver achieves 1.2 GHz IF bandwidth. This bandwidth is limited by the wirebond and PCB interface to the IF outputs, but is sufficient for current 60 GHz standards. The array pattern with two elements is measured using a waveguide phase shifter [Fig. 28(b)]. A null depth of better than 36 dB is achieved using digital-only weight programming (analog fine-control was not used).
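To illustrate why amplitude-and-phase weight control allows nulls to be placed, the sketch below forms a null with an idealized two-element array using weights quantized to signed 4-bit-plus-sign I and Q codes. It assumes isotropic, perfectly matched elements at half-wavelength spacing and error-free weights apart from quantization; none of this models the measurement setup, and the function names are hypothetical.

```python
import cmath
import math

MAX_CODE = 15  # signed I and Q codes span -15..+15

def quantize(w):
    """Snap a complex weight to the nearest point of the code grid."""
    re = max(-MAX_CODE, min(MAX_CODE, round(w.real * MAX_CODE)))
    im = max(-MAX_CODE, min(MAX_CODE, round(w.imag * MAX_CODE)))
    return complex(re, im) / MAX_CODE

def pattern_db(w1, w2, theta_deg, spacing_wavelengths=0.5):
    """Two-element array response in dB at angle theta from broadside."""
    phi = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return 20.0 * math.log10(max(abs(w1 + w2 * cmath.exp(1j * phi)), 1e-12))

def null_depth_db(null_deg):
    """Depth of the quantized null relative to the pattern peak (negative dB)."""
    phi0 = math.pi * math.sin(math.radians(null_deg))  # half-wavelength spacing
    w1 = quantize(complex(1.0, 0.0))
    w2 = quantize(-cmath.exp(-1j * phi0))  # cancels element 1 at null_deg
    peak = max(pattern_db(w1, w2, t / 10.0) for t in range(-900, 901))
    return pattern_db(w1, w2, null_deg) - peak

for angle in (-40, -10, 20, 35):
    print(f"null at {angle:+d} deg: {null_depth_db(angle):.1f} dB below peak")
```

In this idealized model the null depth is limited only by weight quantization; in hardware, gain and phase mismatches between elements would also contribute.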
Inter-element coupling was characterized next [Fig. 29(a)] and is quantified by the rejection ratio, defined as the ratio of the conversion gain of element 1 to the leakage conversion gain, which was defined in Section V-D as the conversion gain from the input to element 1 to the output from element 2. Note that this ratio is not affected by the LNA or baseband gain and can easily be measured at the pads. To measure the conversion gain of element 1, the RF signal is applied to the input of element 1, and the conversion gain is measured with element 1's phase shifter turned on. Then, element 1's phase shifter is turned off, element 2's phase shifter is turned on, and the conversion gain is measured again in order to characterize the leakage conversion gain. A worst-case rejection ratio of 26 dB (45 dB) is measured between adjacent (non-adjacent) elements. This worse-than-expected rejection is due to coupling through the substrate and common bias networks. Still, this level of rejection is superior to the 21 dB rejection reported in [12]. To characterize the efficacy of the transformer-based neutralization schemes employed in the LNA, LO-to-RF isolation has been measured. As shown in Fig. 29(b), LO-to-RF isolation better than 45 dB has been measured over the entire RF bandwidth. Table III summarizes the gain, linearity, and power consumption of the various subblocks. Table IV benchmarks the phased array against recent CMOS implementations.
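The two-step measurement above reduces to a difference of two conversion gains in decibels. The short sketch below shows this arithmetic and why any gain common to both measurements (for example, LNA or baseband gain) cancels. The function name and the numerical values are hypothetical and chosen only for illustration; they are not measured data.

```python
def rejection_ratio_db(direct_gain_db, leakage_gain_db):
    """Rejection ratio in dB: direct conversion gain minus leakage conversion gain."""
    return direct_gain_db - leakage_gain_db


# Illustrative (not measured) values for the two steps of the procedure.
direct_db = 20.0    # element 1 phase shifter on, element 2 off
leakage_db = -6.0   # element 1 phase shifter off, element 2 on
ratio_db = rejection_ratio_db(direct_db, leakage_db)

# A gain shared by both paths shifts both measurements equally and cancels.
common_db = 7.5
shifted_db = rejection_ratio_db(direct_db + common_db, leakage_db + common_db)
print(ratio_db, shifted_db)
```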
VII. CONCLUSION
This paper presents a compact four-element phased-array receiver which achieves 38% fractional bandwidth around 55 GHz. Baseband phase shifting is chosen to avoid designing ultra-wideband mm-wave phase shifters and power and quadrature splitters, and it enables a highly digital Cartesian phase shifter with simple current combining. Transformer-coupling techniques are introduced in the LNA to simultaneously achieve wide bandwidth, reduced noise figure, gain boosting, and built-in neutralization, which eliminates a cascode device, enables operation from a low, scalable supply voltage, and results in a compact layout. The LO network is simplified by the proposed scheme, in which the four-phase LO distribution lines are absorbed into the buffer circuit and terminated by distributed LC loads near each phased-array element. A 45 nm SOI CMOS prototype of the phased array achieves 26.2 dB (20.2 dB) element gain over 21 GHz (19 GHz) 3 dB bandwidth with 5.5 dB (7.7 dB) minimum NF and consumes 30 mW (14 mW) per element from a 1.1 V (0.6 V) supply.
APPENDIX A "π" MODELS FOR MULTIWINDING TRANSFORMERS
Generalized "π" models for multiwinding transformers with isolated terminals, which have not been presented previously to the authors' best knowledge, provide useful insights into the proposed LNA. Fig. 30(a) illustrates the "π" model for a two-winding transformer whose ports share a common ground. When the two windings are isolated, the simple "π" model must be modified to ensure that the currents entering the input terminals are equal to the currents leaving the reference terminals at each port (1-1′, 2-2′ in Fig. 30(b)). The elements of the resulting generalized "π" model [Fig. 30(b)] are given in (9). Notice that the generalized model reduces to the simple model when the ports share the same reference. The generalized "π" model can be extended to multiwinding transformers, as shown in Fig. 31(a). The model parameters for three windings are given in (10). The expressions in (10), which involve the mutual inductances between the windings, may be simplified for the case of weak coupling. For example, when all three coupling coefficients are low, the approximations in (11) hold. Two important conclusions can be drawn from the approximations in (11).
1) Each pairwise coupling may be considered in isolation. The inductors appearing in place of the self-inductances are approximately independent of the coupling factors. Each extra inductance depends only on the coupling factor it represents (e.g., the extra inductance between windings 1 and 2 is a function only of the coupling factor between those two windings). Thus, the effects of the coupling factors can be evaluated sequentially, which simplifies analysis and yields design insights.
2) By inductive reasoning, the above observation can be extended to an N-winding transformer with N(N-1)/2 coupling factors. Inductors appearing in place of the self-inductances remain unchanged under weak coupling, and each additional inductor is again set by its own pairwise coupling factor.
The generalized "π" model can be simplified when two or more ports share the same reference terminals. This is shown in Fig. 31(b) and has been applied in Fig. 5(b) in order to gain intuition into the operation of the transformer-based LNA.
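The two conclusions above can be checked numerically for the simplified case in which all windings share one reference terminal. The sketch below is a minimal illustration under that assumption: it uses the standard construction of a π-type network from the inverse of the inductance matrix, not the generalized expressions for isolated terminals. The inductance and coupling values, the first-order estimate sqrt(L_i L_j)/k_ij for the bridging inductors, and all function names are assumptions introduced for illustration.

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def inductance_matrix(self_l, k):
    """L[i][j] = k[i][j] * sqrt(L_i * L_j), with k[i][i] = 1."""
    n = len(self_l)
    return [[k[i][j] * math.sqrt(self_l[i] * self_l[j]) for j in range(n)]
            for i in range(n)]


def pi_model(self_l, k):
    """Shunt and bridging inductances of the common-reference pi network.

    With gamma = inverse(L), the inductor from node i to the reference is
    1 / sum_j gamma[i][j], and the inductor between nodes i and j is
    -1 / gamma[i][j].
    """
    gamma = invert(inductance_matrix(self_l, k))
    n = len(self_l)
    shunt = [1.0 / sum(gamma[i]) for i in range(n)]
    bridge = {(i, j): -1.0 / gamma[i][j]
              for i in range(n) for j in range(i + 1, n)}
    return shunt, bridge


# Hypothetical three-winding example (inductances in pH, weak coupling).
SELF_L = [100.0, 120.0, 80.0]
K = [[1.0, 0.02, 0.03],
     [0.02, 1.0, 0.025],
     [0.03, 0.025, 1.0]]

shunt, bridge = pi_model(SELF_L, K)
for i, value in enumerate(shunt):
    print(f"shunt {i + 1}: {value:.1f} pH (self-inductance {SELF_L[i]:.1f} pH)")
for (i, j), value in bridge.items():
    estimate = math.sqrt(SELF_L[i] * SELF_L[j]) / K[i][j]
    print(f"bridge {i + 1}-{j + 1}: {value:.0f} pH (pairwise estimate {estimate:.0f} pH)")
```

For these values the shunt inductors stay within a few percent of the self-inductances, and each bridging inductor stays within a few percent of the estimate obtained from its own coupling factor alone. The agreement degrades when one coupling factor is much smaller than the product of the other two divided by it, which marks the limit of treating each pairwise coupling in isolation.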
