Abstract-At low-GHz frequencies, analog time-delay cells realized by LC delay lines or transmission lines are impractical in CMOS due to their large size. As an alternative, delays can be approximated by all-pass filters exploiting transconductors and capacitors ($g_m$-C filters). This paper presents an easily cascadable, compact $g_m$-C all-pass filter cell for 1-2.5 GHz. Compared to previous $g_m$-RC and $g_m$-C filter cells, it achieves at least 5x larger frequency range for the same relative delay variation, while keeping gain variation within 1 dB. This paper derives design equations for the transfer function and several non-idealities. Circuit techniques to improve phase linearity and reduce delay variation over frequency are also proposed. A 160 nm CMOS chip with a maximum delay of 550 ps is demonstrated with monotonic delay steps of 13 ps (41 steps) and an RMS delay variation error of less than 10 ps over more than an octave in frequency (1-2.5 GHz). The delay per area is at least 50x more than for earlier chips. The all-pass cells are used to realize a four-element timed-array receiver IC. Measurement results of the beam pattern demonstrate the wideband operation capability of the $g_m$-RC time delay cell and timed-array IC architecture.
I. INTRODUCTION
TIME DELAY circuits have broad applications in communication systems, e.g., for FIR and IIR filters [1], equalizers [2], and wide-band beam forming [3], [4]. This paper deals with the latter application, where a "timed array" is targeted instead of the more commonly used phased array. In a timed array, true time delays are used instead of the narrowband phase shifter approximation. In this way beam squinting can be minimized [4], [5]. In beam-forming receivers, the variable delay cells compensate the relative delay of signals of the antenna channels. The transfer function of an ideal delay cell is $H(j\omega) = e^{-j\omega\tau}$ (Fig. 1). Its gain is 1 and its phase $\varphi(\omega) = -\omega\tau$ is linear versus frequency. The delay at frequency $\omega$ is $\tau = -\varphi(\omega)/\omega$, ideally independent of $\omega$ (linear phase). Note that we consider true time delay here, not group delay, which is defined as $\tau_g = -d\varphi/d\omega$. Achieving constant true time delay is tougher, as it requires not only a constant group delay independent of frequency but also a constant ratio between $\varphi$ and $\omega$ independent of frequency [6]. There are different IC-compatible circuits to approximate a time delay, e.g., transmission lines [7], [8], LC delay lines [9], switched capacitor delay circuits [10] and $g_m$-RC or $g_m$-C all-pass filters [10]. However, at low-GHz frequencies, transmission lines and LC delay lines in CMOS are impractical due to the low quality factor of coils, the loss of the transmission lines and their large sizes. Switched capacitor time-delay circuits, on the other hand, are not fast enough for low-GHz applications. One of the few remaining options is to exploit an all-pass filter approximation of a delay, e.g., a first-order all-pass filter: $H(s) = \frac{1 - s\tau/2}{1 + s\tau/2}$ (1). The transfer function of this all-pass filter is plotted in Fig. 2. At low frequencies it approximates the ideal delay cell, but at higher frequencies the delay is reduced and delay variations occur. This delay variation is quantified via the $f_{\varphi=0}$ criterion [6], which is the crossing point of the frequency axis and the tangent to the phase curve at the operating frequency $f_0$ (Fig. 2). The delay variation in a band $\Delta f$ around $f_0$ is approximately given by (2).
The first-order all-pass transfer function can be realized both with $g_m$-RC filters [2], [11] (see Fig. 3) and the $g_m$-C filter presented in this paper and in [12]. In [13], a benchmarking method has been proposed to compare delay cells. This method is re-used here to show that the $g_m$-C delay cell of [12] has better performance than other published designs. Moreover, the feasibility of a compact broadband beam-forming IC with $g_m$-C delay cells is demonstrated. Apart from bandwidth, other important properties of the delay cell are: 1) cascadability; 2) compactness; 3) wide delay tuning range; 4) high delay tuning resolution and precision; 5) gain controllability; 6) noise figure; and 7) linearity. In [12] it has been shown that the $g_m$-RC all-pass filters of [2] and [11] (shown in Fig. 3) do not work adequately up to several GHz in UMC 180 nm CMOS because of their high parasitic capacitances. Besides, they need DC blocking capacitors or source-follower buffer circuits to realize cascadability, which limits the bandwidth and/or results in high current consumption. It will be shown that the $g_m$-C topology of [12] has better performance: 1) low delay variation over a 5x wider frequency band compared to other reported $g_m$-RC delay circuits, while maintaining similar noise and nonlinearity performance; 2) compactness compared to LC or transmission lines; 3) high resolution of delay and gain tuning; 4) direct cascadability. Compared to [12], this paper adds circuit analysis and circuit optimization techniques, e.g., for phase linearization and bandwidth extension.

0018-9200 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
The structure of this paper is as follows. In Section II, the all-pass filter as an approximation of a delay cell is reviewed and the all-pass filter of [12] is explained. Section III assesses non-idealities of the delay cell. Section IV discusses improvements of its characteristics. Section V establishes a relation between requirements of the beam forming system and the delay cell. Section VI presents the sub-circuits of the timed-array IC and Section VII presents chip implementation and measurement results, while Section VIII draws conclusions.
II. FIRST-ORDER ALL-PASS DELAY CELL
The transfer function of the first-order all-pass filter of (1) can be rewritten as a combination of a low-pass part with DC-gain of two and a unity-gain part [14] as (3) It is realizable without floating capacitors and hence with good bandwidth potential, because the low-pass part can be implemented by a capacitor to ground and the unity gain part does not require capacitors. Fig. 4 shows the block level and -RC implementation of this all-pass filter. Aiming for direct cascadability, the -C topology of Fig. 5 [12] with equal input and output DC voltages was proposed. Transistors and realize the low-pass signal path with a DC-gain of 2. Transistors and realize the inverting unity gain path. Using PMOS transistors in the "slow low-pass path" and faster NMOS transistor in the unity gain path, the useful frequency range of the delay cell is maximized. Also, current re-use for NMOS and PMOS transistors reduces power consumption. The DC input voltage results in equal drain currents in and , and for . Therefore, . Modelling by its small-signal and C as the total capacitance, the transfer function and its low frequency delay can be written as:
The low-frequency delay is made controllable by splitting C in switchable binary weighted capacitors. Fig. 6 shows the bias circuit of the first delay cell of a delay line. It is the only cell with an AC-coupling capacitor to the input RF signal, . As each signal path has this, no difference in gain and delay results. The DC voltage of is equal to . is made more than 10 times larger than the source impedance of , for insignificant attenuation.
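As a rough numerical illustration of the all-pass approximation discussed in this section, the sketch below assumes the textbook first-order all-pass written as a low-pass part with DC gain 2 minus a unity-gain part, $H(s) = 2/(1 + s\tau/2) - 1$. It compares the true time delay ($-\varphi/\omega$) with the group delay ($-d\varphi/d\omega$). The 100 ps delay value and all function names are illustrative assumptions, not values or code from this design.

```python
import cmath
import math

def allpass(f_hz, tau_s):
    """Low-pass part with DC gain 2 combined with an inverting unity-gain part."""
    s = 1j * 2 * math.pi * f_hz
    return 2 / (1 + s * tau_s / 2) - 1

def true_time_delay(f_hz, tau_s):
    """Phase delay -phi/omega (first-order phase stays within -pi..0)."""
    phi = cmath.phase(allpass(f_hz, tau_s))
    return -phi / (2 * math.pi * f_hz)

def group_delay(f_hz, tau_s, df=1e3):
    """Numerical -dphi/domega using a central difference."""
    p_hi = cmath.phase(allpass(f_hz + df, tau_s))
    p_lo = cmath.phase(allpass(f_hz - df, tau_s))
    return -(p_hi - p_lo) / (2 * math.pi * 2 * df)

TAU = 100e-12  # assumed low-frequency delay (hypothetical value)

for f in (0.1e9, 1e9, 2.5e9):
    h = allpass(f, TAU)
    print(f"{f / 1e9:4.1f} GHz  |H|={abs(h):.3f}  "
          f"delay={true_time_delay(f, TAU) * 1e12:6.2f} ps  "
          f"group delay={group_delay(f, TAU) * 1e12:6.2f} ps")
```

The magnitude stays at one while both delays drop with frequency, the group delay faster than the true time delay, which is why the two quantities have to be distinguished when judging a delay cell.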
III. THE NON-IDEALITIES OF THE DELAY CELL
As the aim is to cascade cells, the non-idealities of a delay cell will now be analyzed with a capacitive load equal to the input capacitance of the next delay cell. In the analysis, the effect of the gate-drain capacitance will be neglected, as the voltage gain is low (unity-gain all-pass behavior), while the right half-plane zero it introduces is in the range of 50 GHz for the transistors used. This is well beyond the targeted low-GHz operating frequency, and it is therefore neglected for the sake of simplicity. This assumption was validated by checking hand calculations against simulation results.
A. Finite Output Impedance and Parasitic Capacitances
Considering finite output impedances of the transistors and the parasitic capacitors which affect the pole/zero frequency and DC gain, the transfer function of the filter becomes: (6) where and are the transconductance and output conductance of and in saturation, and those of and and of . The parasitic capacitances and are absorbed in C. Also absorbs the parasitic capacitors plus the input capacitance of the next delay cell . The transfer function (6) deviates from the ideal one (4) in two aspects: 1) the DC-gain is less than unity, and 2) an extra high frequency pole causes both an extra phase shift and a high frequency gain roll-off. If the following conditions are satisfied:
then the transfer function can be rewritten as (8) Using the analysis in [13] , the maximum usable pole frequency is defined as the frequency where the gain roll-off with respect to DC is less than , resulting in (9) Fig. 7 . The phase at the gate of , and the output of the delay cell in a delay line.
For frequencies larger than , the roll-off is more than . Substituting and (unity current gain) in (9) results in (10) To estimate and compare it with the delay cells reported in [2] and [11] (benchmarked in [13] ), the same technology (UMC 180 nm CMOS), same dB and same GHz for the NMOS transistors have been used. The choice 1 dB is only for comparison to [13] , and it will be reduced in Section IV where several delay cells will be cascaded. Substituting the values in (10), the result is GHz, which is a 4x improvement compared to other circuits in [13] . Fig. 7 shows the simulation results of the delay cell. Reading as the frequency where 45 phase shift occurs w.r.t. DC, we find GHz and a gain roll-off of dB at which is due to the parasitic capacitor effects at the output of the cell. Also the DC gain is not 0 dB due to the finite output impedances of transistors. In Section IV, the DC-gain will be calibrated to 0 dB and the useful frequency range is increased further to 5x (up to 2.5 GHz) that of other reported -RC all-pass delay cells.
The operating bandwidth is limited to , to keep the gain roll-off less than . Within the operating bandwidth, the value of the true time delay and group delay mainly depends on but may also be affected by the 3 dB gain-roll-off frequency due to the parasitic pole at the output. Because is much larger than , a linear phase approximation can be made. This causes both a constant time delay shift and group delay shift equal to
. Equation (11) shows expressions for both the total true time delay and group delay of the delay cell:
In both equations, the second term is much smaller than the first term.
B. Nonlinearity
The nonlinear V-to-I conversions in and can be compensated by the I-to-V function of , which are inverse functions. Also, the mirror and with gain 2 ideally renders an inverse function nonlinearity compensation. However, reactive harmonic distortion [15] occurs at frequencies well below the pole frequency. The I-V conversion by becomes more linear for higher frequencies, as the (linear) capacitor starts to dominate the I-V conversion instead of the square-root I-V function due to diode connected transistor . As the V-I conversion of remains nonlinear (quadratic for long transistors), the overall function is nonlinear. Due to the phase shifts caused by capacitor C, the nonlinearity compensation between and is degraded. Therefore, the nonlinearity of the filters cell increases by increasing the frequency.
C. Thermal Noise
The input-referred thermal noise of the delay cell can be written as (12), where $\gamma$ is the noise excess factor of a MOSFET. As (12) shows, the input-referred noise is frequency-dependent. In a delay line of n cascaded delay cells with unity gain, the total input-referred noise power is n times the noise of each individual delay cell. Therefore, in systems with a variable number of cascaded delay cells, the total noise figure will be delay-dependent.
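The scaling of noise with the number of cascaded cells can be made concrete with a small calculation. This is a minimal sketch that assumes identical unity-gain cells with uncorrelated noise, as stated above; the function name is hypothetical.

```python
import math

def cascade_noise_increase_db(n_cells):
    """Rise in total input-referred noise power for n identical unity-gain cells."""
    if n_cells < 1:
        raise ValueError("need at least one cell")
    # Noise powers of uncorrelated sources add, so n cells give n times the power.
    return 10 * math.log10(n_cells)

for n in (1, 3, 6):
    print(n, "cells:", round(cascade_noise_increase_db(n), 2), "dB")
```

Selecting a longer delay therefore costs noise: going from one to six cascaded cells raises the delay-line noise power by almost 8 dB under these assumptions.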
D. PVT Sensitivity
Process, voltage and temperature (PVT) variations may affect the gain and the delay of each delay cell. Due to mismatch and the finite output impedance of the transistors, the DC gain of the delay cell is not exactly one. In cascaded cells, these errors add up. A tuning mechanism for the DC gain is addressed in Section IV. Moreover, just as for any $g_m$-C filter, there will be spread in the filter time constant, and hence in the delay, due to PVT. Using master-slave techniques [16], these variations can largely be cancelled, e.g., by using replicas of the delay cell in an oscillator loop and tuning its frequency to a well-known reference frequency.
IV. DELAY CELL ENHANCEMENTS
We will now describe some techniques to reduce true time delay variation (make the delay more constant over the frequency band), extend bandwidth, and (fine-) tune the delay and gain.
A. Phase Linearization (Small Delay Variations)
It is shown below that adding a resistor R between the gate and drain of (Fig. 8 ) improves the linearity of the phase transfer function in a limited frequency band. This can be considered as "inductive peaking" that is often used in wideband amplifiers for equalization of the gain [17] . Here, its purpose and optimization targets phase linearity and low , to minimize delay variation. The conductance of the linearizedphase circuit inside the dashed rectangle in Fig. 8 is (13) Its value for very low and very high frequencies is and , respectively. As is shown conceptually in Fig. 9 , the phase transfer function of the linearized delay cell shows a smaller value of compared to two other phase transfer functions and . Hence, not only is the variation in group delay reduced, but also the variation in true time delay (low ). This happens for a certain optimum value of . For low frequencies, the phase curve is similar to that of an all-pass cell with its pole/zero at (curve ), while for high frequencies it follows the phase curve of a cell with pole/zero at (curve ). For intermediate frequencies the phase curve is an interpolation between the two lines and . A proper value of found through parametric simulations results in a curve with a minimum amount of the delay variations in a band around , i.e., a minimum value of the criterion (see (2)) [5] , [6] . Note that closer proximity of to zero corresponds to less delay variation versus frequency. Fig. 10 shows simulated phase curves with as a parameter. The process technology used is now 160 nm CMOS, and Table I lists the circuit parameters. is evaluated at operation frequency of 1.75 GHz (in the middle of the band 1-2.5 GHz).
The criterion improves from 0.52 GHz to 0.06 GHz at the optimum value of R (in the kΩ range). In terms of delay variation over 1-2.5 GHz, using (2), this corresponds to a delay variation reduction from 9.8% to 1.4%. The phase linearization resistor increases the noise figure of the delay cell by about 1.7 dB.
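The tangent-crossing criterion used above can be evaluated numerically for any phase curve. The sketch below is an illustration under stated assumptions: it uses an ideal delay and a plain first-order all-pass with phase $-2\arctan(\pi f \tau)$ rather than the linearized cell, the 100 ps delay is hypothetical, and the sign convention of the crossing may differ from the one used in [6].

```python
import math

def tangent_crossing(phase_fn, f0_hz, df=1e6):
    """Frequency where the tangent to phase_fn at f0 reaches zero phase."""
    slope = (phase_fn(f0_hz + df) - phase_fn(f0_hz - df)) / (2 * df)
    return f0_hz - phase_fn(f0_hz) / slope

def ideal_delay_phase(f_hz, tau_s=100e-12):
    """Linear phase of an ideal delay (tau_s is an assumed value)."""
    return -2 * math.pi * f_hz * tau_s

def allpass_phase(f_hz, tau_s=100e-12):
    """Phase of an unmodified first-order all-pass with low-frequency delay tau_s."""
    return -2 * math.atan(math.pi * f_hz * tau_s)

F0 = 1.75e9  # mid-band operating frequency used in the text

print("ideal delay crossing [GHz]:", tangent_crossing(ideal_delay_phase, F0) / 1e9)
print("all-pass crossing    [GHz]:", tangent_crossing(allpass_phase, F0) / 1e9)
```

For the ideal delay the tangent passes through the origin, so the crossing is at zero; the further the crossing moves away from zero, the more the true time delay varies around the operating frequency.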
B. Bandwidth Extension
The load capacitor plus the parasitic capacitors at the output of the delay cell cause an unwanted pole and, consequently, gain roll-off plus an extra amount of delay. In a cascade of identical delay cells, the total load plus parasitic capacitance at the output node is . Therefore, the parasitic pole at output is:
. An active inductive peaking technique [18] is used for bandwidth extension by adding resistor to convert the diode-connected transistor to an "active inductor" (Fig. 11) . The impedance of the active inductor (the part inside the dotted box) is (14) Choosing results in . Therefore, the pole at the output becomes which means 50% bandwidth extension. Fig. 12 shows the gain curve with as a parameter. The transistor parameters are the same as in Table I . Theoretically, 50% bandwidth extension happens at
; however, because of extra parasitic capacitance due to and and also finite output impedances of , and , simulation shows a 33% bandwidth extension. The DC gain drop of 2 dB is caused by the finite output impedance of the transistors. The bandwidth extension resistor increases the noise figure of the delay cell by about 0.6 dB. In the following subsections, a technique is introduced to compensate the DC gain drop.
C. Binary Tuning of the Delay
Referring to (5), delay can be fine-tuned by varying C, which is designed as a 3 bit switchable binary weighted capacitor bank (see Fig. 13 ). Because all capacitors of the bank are AC-grounded, referred to , they are easily switchable with PMOS transistors. 
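The effect of the capacitor bank on the delay can be sketched as follows, assuming that the low-frequency delay of (5) takes the form $\tau = 2C/g_m$ that follows from a first-order all-pass with its pole at $g_m/C$. The transconductance, fixed capacitance and unit capacitance below are hypothetical values chosen only for illustration.

```python
def low_freq_delay(code, gm=10e-3, c_fixed=100e-15, c_lsb=10e-15, bits=3):
    """Low-frequency delay for a binary-weighted capacitor bank setting.

    All default component values are assumptions, not taken from the design.
    """
    if not 0 <= code < 2 ** bits:
        raise ValueError("code out of range for the capacitor bank")
    c_total = c_fixed + code * c_lsb  # binary weights sum to code * c_lsb
    return 2 * c_total / gm

for code in range(8):
    print(f"code {code:03b}: {low_freq_delay(code) * 1e12:.1f} ps")
```

Because the delay is linear in C, a binary-weighted bank gives uniform delay steps, here 2 ps per LSB on top of a 20 ps base delay.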
D. Gain Adjustment
The practically achieved DC-gain of the filter is less than one because of the finite output impedance of the transistors (refer to (6)). In a delay line gain errors add up and there may be a need to calibrate the gains to unity. For this purpose, a switchable structure consisting of and and has been added (Fig. 14) . and work in parallel with and to increase their effective width by an amount equal to W, so that the DC gain is multiplied by . Transistor sinks the excess DC current at the output point to keep the DC output bias voltage unchanged.
is re-used from the biasing circuit (refer to Fig. 6) .
A set of binary weighted switchable gain tuning stages makes the tuning more precise (Fig. 15) . Three bits have been used for the DC-gain tuning in a gain-range of 3 dB ( dB). Table II shows a comparison between the simulated results of this work and the simulated results of other reported -RC delay cells (refer to Fig. 3) . The technology used in every case is UMC 180 nm to compare to [13] . The of NMOS transistors for all circuits are the same to maintain equal GHz for fair comparison. As the table shows, the pole frequency of this work is much higher (more than ) than other works. The NSNR [15] (defined as SNR/P@IM3 1%) criterion of the cells at 0.1 GHz bandwidth was used to compare dynamic range. The NSNR of this work is 1 dB better than [11] and 6 dB less than [2] , partly due to , but mainly due to the number of noise contributing devices of the new delay cell [12] . Clearly, the strongest point of this work is its frequency range, which is much better than for other circuits in the same technology.
V. BEAM-FORMING SYSTEM DESIGN
In this section, the timed-array system specifications are related to the delay cell requirements. The formulas of this section are used in Section VII to find the specifications of the sub-blocks in timed-array antenna systems. Suppose we aim at N antenna elements, a frequency band from to , a maximum steering angle w.r.t. the bore-sight and b bits of spatial steering resolution, while the required noise figure is less than . No grating lobes should exist and dB null depth is targeted. From these specifications, system design parameters are extractable using [3] and [4] , like the distance between antenna elements, maximum required delay, number of delay steps, and the noise figure of each channel.
To avoid grating lobes, the distance $d$ between antenna elements must be less than half the wavelength at the maximum operating frequency $f_{max}$: $d < \lambda_{min}/2 = c/(2 f_{max})$ (15), where $c$ is the speed of the waves in space. The noise figure for N antennas improves by 10log(N) [dB] w.r.t. the noise figure for a single antenna channel. The maximum required delay per channel depends on: 1) the number of antenna elements (N), 2) the distance between antenna elements, and 3) the maximum steering angle. It is given by (16) [4]. The minimum delay step depends on: 1) the distance between antenna elements, 2) the maximum steering angle, and 3) the spatial resolution in bits (b), as expressed by (17). The null depths are ideally infinite, but gain mismatch will decrease the null depths. Timed-array system simulations show that, for a four-antenna array and null depths less than 40 dB, less than 0.06 dB gain difference between the channels is required.
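The system-level relations of this section can be tried out numerically. In the sketch below, the spacing limit and the 10log(N) noise improvement follow the text directly, while the maximum-delay function uses a commonly used uniform linear array expression, $(N-1)\,d\,\sin\theta_{max}/c$, which is an assumption and may differ in detail from (16). All function names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def max_element_spacing(f_max_hz):
    """Half the wavelength at the maximum operating frequency."""
    return C0 / (2 * f_max_hz)

def max_delay(n_elements, spacing_m, theta_max_deg):
    """Delay between first and last element at the maximum steering angle (assumed form)."""
    return (n_elements - 1) * spacing_m * math.sin(math.radians(theta_max_deg)) / C0

def array_nf_improvement_db(n_elements):
    """Noise figure improvement of N channels relative to a single channel."""
    return 10 * math.log10(n_elements)

d = max_element_spacing(2.5e9)
print(f"spacing limit : {d * 100:.2f} cm")
print(f"maximum delay : {max_delay(4, d, 60) * 1e12:.1f} ps")
print(f"NF improvement: {array_nf_improvement_db(4):.2f} dB")
```

For four elements, a 2.5 GHz upper band edge and 60 degrees of steering, this gives a spacing of about 6 cm and a maximum delay of roughly 520 ps, which is in the same range as the 550 ps delay line realized on the chip.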
VI. FOUR-CHANNEL WIDE-BAND BEAM-FORMING IC
The designed delay cell is the basic building block of the time-delay-based timed-array antenna IC. The IC has four antenna channels (Fig. 16) [12] . Each channel applies adjustable delay and gain on the input signal. As shown in Fig. 16 , the adjustable delay is a combination of "Fine " cells cascaded with "Coarse " delay cells. The "Fine " is realized by a cascade of three delay cells with small delay steps (refer to Fig. 13) . Each "Coarse " is a cell with large delay steps (refer to Fig. 15 ). In Section VII, it will be shown that 550 ps total delay has been realized in a 5 bit delay resolution via "Fine " and six cascaded "Coarse " cells. The last "Coarse " cell acts as a termination. An LNA/BALUN precedes the delay chain to reduce the noise figure. The output signals of the lines are added to each other to complete the beam-forming function. Then the signal is down-converted to IF via a mixer and external LO. The total noise of the chip depends on the beam direction because the amount of the delay at each channel (number of cascaded coarse cells) changes with the beam direction. The maximum noise figure occurs when the beam directs toward the maximum steering angle . In this case, the average delay of the channels is at maximum.
Analysis based on a Taylor series expansion shows that distortion has only minor impact on the phase of the fundamental frequency. Therefore, the position of the null and its depth do not change much. However, strong signals may also generate higher harmonics with different phases than the fundamental signal in each antenna channel. After summation, the amplitude of the harmonics can add up and cause high-frequency interference even if a signal comes from a null direction. Whether this is a problem depends on specific requirements and boundary conditions which are outside the scope of this paper. Next, the functionality and circuit structure of the sub-blocks are explained.
A. LNA/BALUN
The LNA/BALUN has four main functions: 1) antenna impedance matching, 2) low noise amplification, 3) single-to-differential conversion (BALUN), and 4) gain tuning. The singleto-differential conversion makes the signals less sensitive to interference from other channels. It exploits a noise-cancelling common-gate (CG)-common-source (CS) structure (Fig. 17 ) [19] . The DC blocking capacitors and are the only series capacitors in each channel. Due to a design error, their parasitic capacitance to the substrate is the main cause of the bandwidth limitation. and are DC fed to the "Fine " block. The AC gain in CG, CS stages of the LNA/BALUN can be trimmed by controlling bias voltages and to provide gain equalization and calibration. A 4 bit DAC is used to cover 1 dB gain variation with 0.06 dB as LSB step. This small range hardly degrades (it remains less than dB) but provides the required gain resolution to provide dB null depths.
B. Fine Delay Control
The Fine block realizes small delay steps. It consists of three cascaded delay-adjustable cells (Fig. 13) that cover one coarse delay step with extra margin for PVT, to prevent "missing bits". The gain of the fine blocks is the same for all antenna channels; therefore, it does not affect the beam pattern, and consequently the fine blocks do not require gain calibration.
C. Coarse Delay Control
The Coarse delay line consists of six cascaded delay cells, each with a fixed delay and an adjustable amount of gain. At each (voltage) output of a coarse delay cell, there is a V-I converter (see Fig. 16 ) which can be activated or not. This acts as a selectable tap to effectively change the length (and the delay amount) of the delay line. The gain adjustability of the stages is to calibrate the gain of the coarse delay line to unity, independent of the number of cascaded blocks.
D. Selectable V-I Converters
The selectable V-I converters fulfill two tasks: they select the desired output of the delay chain and they convert the signal from voltage to current. The input capacitance of the V-I converter has an effect on the delay of the channel, but because this delay shift is equal for all channels, it does not affect the beam pattern. However, this input capacitance does limit the bandwidth. Because the signals are converted to current, the summation function required for beam forming can be implemented by simply connecting all outputs together.
E. LO, Mixer and Output Buffer
An external differential LO is used to down-convert the beam-formed signal to IF. The circuit and its outputs are differential and an active Gilbert-cell mixer is used with load resistors. The output voltage is buffered via source followers to match the output impedance to 50 Ω.
VII. CHIP IMPLEMENTATION AND MEASUREMENTS
To demonstrate wideband beam-forming, a four-channel beam-forming chip was designed in 160 nm CMOS, covering more than one octave of bandwidth from 1 GHz to 2.5 GHz. The beam can be steered from 60 to 60 related to bore-sight in 4 bits resolution. To avoid grating lobe conditions, the required inter-element antenna distance is cm (refer to (15) ). The maximum required delay in each channel is found from (16) and is ps. The delay step size is derived from the 4 bit beam-steering resolution (refer to (17)):
ps. The to ratio shows that for 4 bit steering angle resolution, 5 bit delay resolution is needed per channel. The targeted average noise figure of the channels when the beam steers towards is 8 dB at the mid of the frequency range ( GHz), i.e., the noise figure of each channel at 255 ps delay must be 8 dB. The 255 ps delay consists of three cascaded coarse cells besides fine . A single-ended-to-differential voltage gain of 13 dB for each channel and 3.5 dB noise figure for the LNA/BALUN theoretically results in 8.9 dB noise figure for every individual delay cell. Keeping the overdrive voltage of the transistors similar to Table I results in 3.6 mA current for each individual delay cell. In this test chip, the simple bias circuit of Fig. 6 consisting of a diode connected N-and PMOS was used. Reduction of the supply directly decreases the current and consequently affects the , noise and time-delay of the delay cell. To stabilize performance, either a voltage regulator will be needed, or a bias current source should be used to bias M6 and M7 in Fig. 6 . Fig. 18 shows the chip photograph. For each channel, the delay versus frequency over the whole frequency band and for all settings was measured. An effective number of bits for the delay setting equal to was found (Fig. 19) . The delays are approximately constant within 10 ps variation in the operating frequency band of 1-2.5 GHz. The flatness of the delay curves in Fig. 19 is an immediate result of applying the technique in Figs. 9 and 10 to linearize the phase versus frequency and demonstrates that the optimization approach is highly effective. Substituting the maximum delay variations (10 ps) and the maximum amount of delay ( ps) in (2), we find GHz at the mid of the frequency band ( GHz) which is close to the circuit simulations shown in Fig. 10 . Fig. 20 shows the gain versus frequency variations for all delay and gain settings (without the effect of the LNA/BALUN gain trimming). 
For all delay settings, the gain varies less than 1.8 dB at each individual frequency point in the 1 GHz to 2.5 GHz band. For each delay setting (a fixed delay), the gain variation versus frequency from 1 GHz to 2.5 GHz remains less than 2.8 dB (or 1.4 dB with respect to its average). The gain adjustability in the BALUN provides another opportunity to trim the gain at individual frequency points with 0.03 dB resolution. With the help of BALUN gain trimming, the gain variation across all delay settings is 0.03 dB over the 1-2.5 GHz band (non-simultaneous). This gain equality resolution results in deep nulls in the beam pattern, which will be shown later (Fig. 22). The gain, noise figure and input matching versus frequency of a single receiver channel set at 255 ps delay are shown in Fig. 21. The 255 ps is the average delay of the four channels when the beam steers towards its maximum angle, which results in the maximum noise figure for the timed array (worst-case scenario). For other steering angles, the average delay of the channels is less and therefore the noise figure is better. The measured gain and noise figure are in agreement with the simulations within 0.9 dB.

Fig. 21. Gain, input matching and noise figure when delay of the channel is 255 ps.
The measured beam pattern was compared with a simulated ideal time-delay-based beam-forming system as shown in Fig. 22. For the frequency range from 2 to 2.5 GHz, the 3 dB beam width varies from 63° to 51° and the null-to-null distance from 37° to 29°. Also, a new null appears at 38° in the pattern, both in measurement and simulation.
The method used for beam pattern measurement is as follows. Four RF signals representing the antenna signals are generated via four external RF generators. Experiments were done at 2 GHz and 2.5 GHz, while an external 3 GHz signal was applied to the LO input. The beam-formed signal is down-converted to 500 MHz to 1 GHz. Going through all delay settings, the beam patterns for 2 GHz and 2.5 GHz are synthesized. Comparing to the simulated beam pattern, it can be seen (Fig. 22 ) that spatial directions for the beam and nulls in the measured pattern closely follow the ideal pattern. The null depths of the beam pattern were limited to 24 dB which was caused by the cross talk between the off-chip transmission lines of the antenna channels.
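An ideal time-delay-based reference pattern of the kind used for this comparison can be sketched in a few lines. The example below is an assumption-laden illustration, not the simulation used for Fig. 22: it models four isotropic elements on a uniform line with an assumed 6 cm spacing and ideal, frequency-independent delays, and all names are hypothetical.

```python
import cmath
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def array_gain_db(theta_deg, steer_deg, f_hz, n=4, d=0.06):
    """Normalized gain of an ideal n-element timed array (d in metres, assumed)."""
    total = 0j
    for k in range(n):
        arrival = k * d * math.sin(math.radians(theta_deg)) / C0
        applied = k * d * math.sin(math.radians(steer_deg)) / C0
        # The residual delay per element appears as a frequency-dependent phase.
        total += cmath.exp(-2j * math.pi * f_hz * (arrival - applied))
    mag = abs(total) / n
    return 20 * math.log10(max(mag, 1e-12))

def peak_angle(steer_deg, f_hz):
    """Angle of maximum gain on a 0.5 degree grid from -90 to 90 degrees."""
    angles = [a / 2 for a in range(-180, 181)]
    return max(angles, key=lambda a: array_gain_db(a, steer_deg, f_hz))

for f in (2e9, 2.5e9):
    print(f"{f / 1e9} GHz: beam peak at {peak_angle(30, f)} degrees")
```

With true time delays the beam peak stays at the steering angle for both test frequencies, whereas the beam width and null positions still scale with frequency, consistent with the behavior described for Fig. 22.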
Table III [24] compares several reported delay circuits implemented via different technologies and topologies. Compared to other methods, the proposed circuit provides the lowest delay variation versus frequency (1.8% over more than an octave of bandwidth). Also, it is the most compact delay circuit, providing between 1 and 2 orders of magnitude more delay per area. Therefore, this circuit is one of the best candidates for low-GHz RF-band applications requiring large amounts of delay. Table IV shows a comparison between our $g_m$-RC timed-array chip and two other reported time-delay-based chips designed for beam-forming, which exploit LC delay lines and transmission lines. The compactness of the delay cells allows us to implement the chip in a much smaller area at comparable power consumption. The reported noise figure of this circuit is higher, but it is for the worst-case scenario (maximum steering angle: 60°). Steering to smaller angles (referred to the bore-sight) requires less delay and produces less noise. The reported amplitude versus frequency variations (1.4 dB) are instantaneous for the 1-2.5 GHz frequency band at each delay setting. The gain-trimming property of the LNA plus gain calibration of the delay cells provide 0.03 dB resolution for individual frequencies.
VIII. CONCLUSION
This paper has presented a compact all-pass $g_m$-C cell that was compared to other reported $g_m$-RC delay cells, showing a 5x higher operating frequency range. A chip implementation of the delay cell in 160 nm CMOS results in a flat gain up to 2.5 GHz for the delay cell, with the help of a bandwidth-extension technique. The delay cells are directly cascadable to realize a delay line without AC-coupling or buffering. This avoids parasitic capacitances to ground from DC blocking capacitors, which limit the frequency range or require extra current consumption. The circuit exploits current re-use with a slow PMOS low-pass path in parallel to a fast NMOS unity-gain path to maximize the useful frequency range. Bandwidth and phase linearity are further enhanced by adding carefully dimensioned resistors to the diode-connected transistors. Gain fine control in the delay cells allows for precise gain calibration, independent of delay. Using simulation, a direct comparison of the new delay cell with existing $g_m$-C and $g_m$-RC delay cells has been made in terms of frequency range, dynamic range and power consumption. The SNR/P at 1% IM3 of the designed delay cell is 1 dB better than [11] and 6 dB worse than [2], while the frequency range is at least 5x larger (compared to [2] and [11]). To validate performance, a four-antenna beam-forming receiver chip with a maximum steering angle of 60° was designed in 160 nm CMOS technology with a total delay per channel of 550 ps in an area of 0.15 mm². Compared to other chips with LC delay lines and transmission lines, this is about 2 orders of magnitude more delay per area. The 550 ps delay is digitally controllable in 13 ps steps. Delay variation over a bandwidth from 1 to 2.5 GHz is less than 10 ps, which is only 1.8% of the realized delay. In other words, the selectable delays are monotonic, with low RMS error in the frequency band, and are therefore easy to use in calibration schemes.
The delay/size of the circuit, which is 7857 ps/mm², is at least 50x more than that of other delay circuits reported in the literature, which makes it well suited for low-GHz operation requiring large amounts of delay. An effective delay resolution of 4.7 bits is demonstrated, which corresponds to an effective spatial beam steering resolution of 3.5 bits for a full-scale steering range of ±60°, i.e., 10.6° spatial angle resolution. The average noise figure of each antenna channel in the worst-case scenario (when the average delay in the four channels is maximum, i.e., the beam direction is at 60°) is 10 dB.
