Compact cascadable gm-C all-pass true time delay cell with reduced delay variation over frequency by Garakoui, Seyed Kasra et al.
Compact Cascadable gm-C All-Pass True Time Delay Cell with Reduced Delay Variation over Frequency
 
Seyed Kasra Garakoui, Eric A. M. Klumperink, Senior member, IEEE, Bram Nauta, Fellow,  IEEE, 
Frank E. van Vliet, Senior member, IEEE 
The authors are with the University of Twente, CTIT Institute, IC Design Group, P.O. Box 217,
7500 AE Enschede, The Netherlands (e-mail: kasra.garakoui@teledynedalsa.com; Tel: +31 61 4989772).
Abstract—At low GHz frequencies, analog time-delay cells realized by LC delay lines or 
transmission lines are impractical in CMOS due to their large size. As an alternative, delays can
be approximated by all-pass filters exploiting transconductors and capacitors (gm-C filters). This
paper presents an easily cascadable, compact gm-C all-pass filter cell for 1-2.5 GHz. Compared to
previous gm-RC and gm-C filter cells, it achieves at least 5x larger frequency range for the same
relative delay variation, while keeping gain variation within 1dB. This paper derives design
equations for the transfer function and several non-idealities. Circuit techniques to improve
phase linearity and reduce delay variation over frequency are also proposed. A 160nm CMOS
chip with a maximum delay of 550psec is demonstrated with monotonic delay steps of 13 psec
(41 steps) and an RMS delay variation error of less than 10psec over more than an octave in
frequency (1-2.5GHz). The delay per area is at least 50x more than for earlier chips. The
all-pass cells are used to realize a four-element timed array receiver IC. Measurement results of
the beam pattern demonstrate the wideband operation capability of the gm-RC time delay cell and
the timed array IC architecture.
 
Index Terms—Time delay, True Time delay, All-pass filter, Phase shifter, CMOS, Timed 
array receiver, Phased Array Receiver, Beam forming, Beam squinting, Equalizer, Delay 
Compensation. 
I. INTRODUCTION
TIME delay circuits have broad applications in communication systems, e.g. for FIR and IIR
filters [1], equalizers [2], and wideband beam forming [3], [4]. This paper deals with the latter
application, where a "timed array" is targeted instead of the more commonly used phased array.
In a timed array, true time delays are used instead of the narrowband phase shifter
approximation. In this way beam squinting can be minimized [4], [5]. In beam forming receivers the
variable delay cells compensate the relative delay of the signals of the antenna channels. The
transfer function of an ideal delay cell is H(s)=e^(-sτ) (Fig. 1). Its gain is 1 and its phase is
linear versus frequency. The delay τ at frequency f0 is τ(f0)=-φ(f0)/(2πf0), ideally independent
of f0 (linear phase). Note that we consider true time delay here, not group delay, which is
defined as -∂φ/∂ω. Achieving constant true time delay is tougher, as it requires not only a
constant group delay independent of frequency but also a constant ratio between -φ and ω
independent of frequency [6].
There are different IC compatible circuits to approximate a time delay, e.g. transmission lines
[7], [8], LC delay lines [9], switched capacitor delay circuits [10] and gm-RC or gm-C all-pass
filters [10]. However, at low GHz frequencies, transmission lines and LC delay lines in CMOS
are impractical due to the low quality factor of coils, the loss of the transmission lines and
their large sizes. Switched capacitor time-delay circuits, on the other hand, are not fast enough
for low GHz applications. One of the few remaining options is to exploit an all-pass filter
approximation of a delay, e.g. a 1st order all-pass filter:
$$H_{ap\_1}(s) = \frac{1 - s(\tau/2)}{1 + s(\tau/2)} \qquad (1)$$
The transfer function of this all-pass filter is plotted in Fig.. At low frequencies it approximates
the ideal delay cell, but at higher frequencies the delay is reduced and delay variations occur.
This delay variation is quantified via the criterion fφ=0 [6], which is the crossing point of the
frequency axis and the tangent to the phase curve at operating frequency f0 (Fig.). The delay
variation in a band ±∆f around f0 is approximately:
$$\frac{\tau(f_0 \pm \Delta f) - \tau(f_0)}{\tau(f_0)} \approx \frac{f_{\varphi=0}/f_0}{1 - f_{\varphi=0}/f_0} \cdot \frac{\pm\Delta f}{f_0} \qquad (2)$$
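As a rough numerical illustration of (1) and (2), the sketch below evaluates the phase of the 1st order all-pass filter, finds the tangent intercept fφ=0 at f0, and compares the delay variation predicted by (2) with the exact true time delay. The values of τ, f0 and ∆f are assumed example numbers, not taken from the design, and the function names are hypothetical.

```python
import math

def allpass_phase(f, tau):
    """Phase (rad) of the 1st order all-pass of (1): -2*atan(pi*f*tau)."""
    return -2.0 * math.atan(math.pi * f * tau)

def true_time_delay(f, tau):
    """True time delay -phi(f)/(2*pi*f)."""
    return -allpass_phase(f, tau) / (2.0 * math.pi * f)

def f_phi0(f0, tau):
    """Crossing of the frequency axis and the tangent to the phase curve at f0."""
    x = math.pi * f0 * tau
    slope = -2.0 * math.pi * tau / (1.0 + x * x)  # d(phi)/df at f0
    return f0 - allpass_phase(f0, tau) / slope

def delay_variation_eq2(f0, df, fz):
    """Relative delay variation predicted by (2) for a frequency offset df."""
    r = fz / f0
    return r / (1.0 - r) * df / f0

# Assumed example values (illustrative only)
tau, f0, df = 100e-12, 1.75e9, 0.1e9
fz = f_phi0(f0, tau)
approx = delay_variation_eq2(f0, df, fz)
exact = (true_time_delay(f0 + df, tau) - true_time_delay(f0, tau)) / true_time_delay(f0, tau)
print(f"f_phi0 = {fz / 1e9:.3f} GHz, eq. (2): {approx:.3%}, exact: {exact:.3%}")
```

For small ∆f the two numbers should agree closely, since (2) is a first-order (tangent) approximation.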
The 1st order all-pass transfer function can be realized both with gm-RC filters [2], [11] (see
Fig.3) and with the gm-C filter presented in this paper and in [12]. In [13] a benchmarking method
has been proposed to compare delay cells based on fφ=0. This method is re-used here to show that
the gm-C delay cell of [12] has better performance than other published designs. Moreover, the
feasibility of a compact broadband beamforming IC with gm-C delay cells is demonstrated. Apart
from bandwidth, other important properties of the delay cell are: 1) Cascadability; 2) Compactness;
3) Wide delay tuning range; 4) High delay tuning resolution and precision; 5) Gain controllability;
6) Noise figure and 7) Linearity. In [12] it has been shown that the gm-RC all-pass filters
of [2] and [11] (shown in Fig.3) do not work adequately up to several GHz in UMC 180nm
CMOS because of their high parasitic capacitances. Besides, they need DC blocking capacitors or
source-follower buffer circuits to realize cascadability, which limits the bandwidth and/or results
in high current consumption. It will be shown that the gm-C topology of [12] has better
performance: 1) Low delay variation over a 5x wider frequency band compared to other reported
gm-RC delay circuits, while maintaining similar noise and nonlinearity performance; 2) Compactness
compared to LC or transmission lines; 3) High resolution of delay and gain tuning; 4) Direct
cascadability. Compared to [12], this paper adds circuit analysis and circuit optimization
techniques, e.g. for phase linearization and bandwidth extension. The structure of this paper is as
follows: in section II, the 1st order all-pass filter as an approximation of a delay cell is
reviewed and the all-pass filter of [12] is explained. Section III assesses non-idealities of the
delay cell, while section IV discusses improvements of its characteristics. Section V establishes a
relation between requirements on the beam forming system and requirements on the delay cell.
Section VI presents the sub-circuits of the timed array IC and section VII presents the chip
implementation and measurement results, followed by the conclusions.
II. 1ST ORDER ALL-PASS DELAY CELL 
 
The transfer function of the 1st order all-pass filter of (1) can be re-written as a combination
of a low-pass part with DC-gain of two and a unity-gain part [14] as:
$$H_{ap\_1}(s) = \frac{1 - s(\tau/2)}{1 + s(\tau/2)} = \frac{2}{1 + s(\tau/2)} - 1 \qquad (3)$$
It is realizable without floating capacitors and hence with good bandwidth potential, because the
low-pass part can be implemented by a capacitor to ground and the unity gain part does not
require capacitors. Fig.4 shows the block level and gm-RC implementation of this all-pass filter.
Aiming for direct cascadability, the gm-C topology of Fig.5 [12] with equal input and output DC
voltages was proposed. Transistors M1, M4, M5 and M3 realize the low-pass signal path with a
DC-gain of 2. Transistors M2 and M3 realize the inverting unity gain path. By using PMOS
transistors in the “slow low-pass path” and faster NMOS transistors in the unity gain path, the
useful frequency range of the delay cell is maximized. Also, current re-use for NMOS and PMOS
transistors reduces power consumption. The DC input voltage Vin,DC results in equal drain currents
IDC in M1, M2, M3 and M4, and 2IDC for M5. Therefore, Vout,DC=Vin,DC. Modelling M4 by its
small-signal transconductance gm4 and taking C as the total capacitance, the transfer function and
its low frequency delay can be written as:
$$H(s) = \frac{v_{out}(s)}{v_{in}(s)} = \frac{1 - \frac{sC}{g_{m4}}}{1 + \frac{sC}{g_{m4}}} \qquad (4)$$
$$\tau_{LF} \approx 2C/g_{m4} \qquad (5)$$
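To make (4) and (5) concrete, the following sketch evaluates the all-pass transfer function numerically and checks that its magnitude is unity and that its true time delay approaches 2C/gm4 at low frequency. The values chosen for C and gm4 are assumptions for illustration only and are not the values used on the chip.

```python
import cmath
import math

def h_allpass(f, c, gm4):
    """Transfer function (4) evaluated at s = j*2*pi*f."""
    s = 2j * math.pi * f
    return (1 - s * c / gm4) / (1 + s * c / gm4)

def true_time_delay(f, c, gm4):
    """True time delay -phase(H)/(2*pi*f)."""
    return -cmath.phase(h_allpass(f, c, gm4)) / (2 * math.pi * f)

# Assumed example values (hypothetical, not from the paper)
C = 100e-15    # total capacitance in farad
GM4 = 5e-3     # transconductance of M4 in siemens

tau_lf = 2 * C / GM4  # low-frequency delay from (5)
for f in (10e6, 1e9, 2.5e9):
    gain = abs(h_allpass(f, C, GM4))
    delay = true_time_delay(f, C, GM4)
    print(f"f = {f / 1e9:5.2f} GHz  |H| = {gain:.6f}  delay = {delay * 1e12:.2f} ps")
print(f"2C/gm4 = {tau_lf * 1e12:.2f} ps")
```

The delay printed at higher frequencies drops below 2C/gm4, which is the delay reduction of the 1st order approximation discussed in the introduction.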
The low-frequency delay is made controllable by splitting C into switchable binary weighted
capacitors. Fig.6 shows the bias circuit of the first delay cell of a delay line. It is the only
cell with an AC-coupling capacitor to the input RF signal, Vin,RF. As each signal path has this
capacitor, no difference in gain and delay results between the paths. The DC voltage of Vout is
equal to VDC,bias. RB is made more than 10 times larger than the source impedance of Vin,RF, so
that the attenuation is insignificant.
III. THE NON-IDEALITIES OF THE DELAY CELL 
As the aim is to cascade cells, the non-idealities of a delay cell will now be analyzed with a
capacitive load equal to the input capacitance of the next delay cell: Cgs,M1+Cgs,M2. In the
analysis the effect of Cgd will be neglected, as the voltage gain is low (unity gain all-pass
behavior), while the right half-plane zero introduced by Cgd at gm/Cgd is in the range of 50GHz
for the transistors used. This is well beyond the targeted low-GHz operating frequency, and
therefore, for the sake of simplicity, its effect is neglected. This assumption was validated by
checking hand calculations against simulation results.
A. Finite output impedance and parasitic capacitances 
Considering the finite output impedances of the transistors and the parasitic capacitors, which
affect the pole/zero frequency and DC gain, the transfer function of the filter becomes:
$$H(s) = \frac{1}{1 + \frac{2}{g_{mn}}(g_{dsn}+g_{dsp})} \cdot \frac{1 - \frac{1}{g_{mp}}(g_{dsn}+g_{dsp})}{1 + \frac{1}{g_{mp}}(g_{dsn}+g_{dsp})} \cdot \frac{1 - \frac{sC}{g_{mp}-(g_{dsn}+g_{dsp})}}{1 + \frac{sC}{g_{mp}-(g_{dsn}+g_{dsp})}} \cdot \frac{1}{1 + \frac{sC_L}{g_{mn}+2(g_{dsn}+g_{dsp})}} \qquad (6)$$
where gmn and gdsn are the transconductance and output conductance of M1, M2 and M3 in
saturation, gmp and gdsp are those of M4, and 2gmp and 2gdsp are those of M5. The parasitic
capacitances Cgs,M4, Cgs,M5, Cdb,M4 and Cdb,M3 are absorbed in C. Similarly, CL absorbs the
parasitic capacitors Cgs,M3, Cdb,M5, Cdb,M2, Cdb,M3 plus the input capacitance of the next delay
cell (Cgs,M1+Cgs,M2). The transfer function (6) deviates from the ideal one (4) in two aspects:
1) the DC-gain is less than unity; and 2) an extra high frequency pole causes both an extra phase
shift and a high frequency gain roll-off. If the following conditions are satisfied:
gmn ≫ 2(gdsn + gdsp)                                     (7a)                         
gmp ≫ 2(gdsn + gdsp)                                      (7b)  
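then the transfer function can be rewritten as:
$$H(s) = \frac{1 - \frac{2}{g_{mp}}(g_{dsn}+g_{dsp})}{1 + \frac{2}{g_{mn}}(g_{dsn}+g_{dsp})} \cdot \frac{1 - \frac{sC}{g_{mp}}}{1 + \frac{sC}{g_{mp}}} \cdot \frac{1}{1 + \frac{sC_L}{g_{mn}}} \qquad (8)$$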
Using the analysis in [13], the maximum usable pole frequency fp is defined as the frequency
up to which the gain roll-off with respect to DC is less than ∆Hp, resulting in:
$$f_p \le \frac{g_{mn}}{2\pi C_L} \sqrt{\frac{1}{(1-\Delta H_p)^2} - 1} \qquad (9)$$
For frequencies larger than fp, the roll-off is more than ∆Hp. Substituting
CL≈Cgs,M1+Cgs,M2+Cgs,M3=3Cgs,M1 and ft,M1≈gmn/(2πCgs,M1) (unity current gain frequency) in (9)
results in:
$$f_p \le \frac{f_{t,M1}}{3} \sqrt{\frac{1}{(1-\Delta H_p)^2} - 1} \qquad (10)$$
To estimate fp and compare it with the delay cells reported in [2] and [11] (benchmarked in
[13]), the same technology (UMC 180nm CMOS), the same ∆Hp=1dB and the same ft,M1=12.4GHz for
the NMOS transistors have been used. The choice ∆Hp=1dB is only for comparison to [13], and
it will be reduced in section IV where several delay cells will be cascaded. Substituting the
values in (10), the result is fp≈2 GHz, which is a 4x improvement compared to other circuits in
[13]. Fig.7 shows the simulation results of the delay cell. Reading fp as the frequency where 45°
phase shift occurs w.r.t. DC, we find fp≈1.7GHz and a gain roll-off of 1.5dB at fp, which is due
to the parasitic capacitor effects at the output of the cell. Also, the DC gain is not 0dB due to
the finite output impedances of the transistors. In section IV the DC-gain will be calibrated to
0dB and the useful frequency range is increased further to 5x (up to 2.5GHz) that of other
reported gm-RC all-pass delay cells.
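The substitution into (10) can be checked with a few lines of code. The sketch below assumes that ∆Hp=1dB is interpreted as a linear gain ratio of 10^(-1/20) for the term (1-∆Hp); the function name is hypothetical.

```python
import math

def max_usable_pole_frequency(ft, delta_hp_db):
    """Evaluate (10): fp <= (ft/3) * sqrt(1/(1 - dHp)^2 - 1).

    Assumption: (1 - dHp) is the linear gain ratio corresponding to a
    roll-off of delta_hp_db decibels.
    """
    remaining_gain = 10 ** (-delta_hp_db / 20)
    return ft / 3 * math.sqrt(1 / remaining_gain**2 - 1)

fp = max_usable_pole_frequency(ft=12.4e9, delta_hp_db=1.0)
print(f"fp <= {fp / 1e9:.2f} GHz")
```

Under this interpretation the bound evaluates to roughly 2.1 GHz, in line with the fp≈2 GHz quoted in the text.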
The operating bandwidth is limited to fp, to keep the gain roll-off less than ∆Hp. Within the
operating bandwidth, the value of the true time delay and group delay mainly depends on fp, but
may also be affected by the -3dB gain roll-off frequency f-3dB due to the parasitic pole at the
output. Because f-3dB (≈ft,M1/3) is much larger than fp, a linear phase approximation can be made.
This causes both a constant time delay shift and group delay shift equal to 1/(2πf-3dB). Eqn. 11
below shows expressions for both the total true time delay (τ) and group delay (τg) of the delay
cell:
$$\tau \approx \frac{2\tan^{-1}(f/f_p)}{2\pi f} + \frac{1}{2\pi f_{-3dB}} \qquad (11\text{-}a)$$
$$\tau_g \approx \frac{f_p}{\pi(f^2 + f_p^2)} + \frac{1}{2\pi f_{-3dB}} \qquad (11\text{-}b)$$
In both equations the second term is much smaller than the first term.
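The difference between true time delay and group delay in (11-a) and (11-b) can be illustrated numerically. In the sketch below, fp and f-3dB are assumed example values (f-3dB is set to ft,M1/3 with the ft,M1=12.4GHz mentioned above; fp=1GHz is purely illustrative).

```python
import math

def true_time_delay(f, fp, f3db):
    """Equation (11-a)."""
    return 2 * math.atan(f / fp) / (2 * math.pi * f) + 1 / (2 * math.pi * f3db)

def group_delay(f, fp, f3db):
    """Equation (11-b)."""
    return fp / (math.pi * (f**2 + fp**2)) + 1 / (2 * math.pi * f3db)

FP = 1e9            # assumed all-pass pole/zero frequency (illustrative)
F3DB = 12.4e9 / 3   # assumed parasitic -3dB frequency, about ft/3

for f in (1e6, 0.5e9, 1e9):
    t = true_time_delay(f, FP, F3DB)
    g = group_delay(f, FP, F3DB)
    print(f"f = {f / 1e9:6.3f} GHz  tau = {t * 1e12:6.1f} ps  tau_g = {g * 1e12:6.1f} ps")
```

At low frequency both delays converge to 1/(πfp)+1/(2πf-3dB); towards fp the group delay falls off faster than the true time delay, which is why the two quantities have to be distinguished.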
B. Nonlinearity 
The nonlinear V-to-I conversions in M1 and M2 can be compensated by the I-to-V function of
M3, which is the inverse function. Also, the mirror M4 and M5 with gain 2 ideally renders an
inverse function nonlinearity compensation. However, reactive harmonic distortion [15] occurs at
frequencies well below the pole frequency. The I-V conversion by M4 becomes more linear for
higher frequencies, as the (linear) capacitor starts to dominate the I-V conversion instead of the
square-root I-V function due to the diode connected transistor M4. As the V-I conversion of M5
remains non-linear (quadratic for long transistors), the overall function is nonlinear. Due to the
phase shifts caused by capacitor C, the non-linearity compensation between M1, M2 and M3 is
degraded. Therefore, the nonlinearity of the filter cell increases with frequency.
C. Thermal Noise  
The input referred thermal noise of the delay cell can be written as:
$$\overline{v_{in}^2} = 8kT\gamma \left(\frac{g_{mn}+g_{mp}}{g_{mn}^2}\right) \left[\frac{3g_{mp}^2 + (C\omega)^2}{g_{mp}^2 + (C\omega)^2}\right] = 8kT\gamma \left(\frac{g_{mn}+g_{mp}}{g_{mn}^2}\right) \left[\frac{3\left(\frac{f_p}{f}\right)^2 + 1}{\left(\frac{f_p}{f}\right)^2 + 1}\right] \qquad (12)$$
where γ is the noise excess factor of a MOSFET. As (12) shows, the input referred noise is
frequency-dependent. In a delay line of n cascaded delay cells with unity gain, the total input
referred noise power is n times the noise of each individual delay cell. Therefore, in systems with
variable numbers of cascaded delay cells, the total noise figure will be delay dependent.
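The frequency dependence of (12) and the scaling with the number of cascaded cells can be sketched as follows. The transconductances, the temperature and the value γ=2/3 are assumptions for illustration (the latter is the usual long-channel value, which the paper does not specify).

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def input_noise_psd(f, gmn, gmp, fp, gamma=2 / 3, temp=300.0):
    """Input referred thermal noise density of one cell, equation (12), in V^2/Hz."""
    ratio_sq = (fp / f) ** 2
    shape = (3 * ratio_sq + 1) / (ratio_sq + 1)  # 3 for f << fp, 1 for f >> fp
    return 8 * K_B * temp * gamma * (gmn + gmp) / gmn**2 * shape

def cascade_noise_psd(n, f, gmn, gmp, fp):
    """Total input referred noise density of n unity-gain cells in cascade."""
    return n * input_noise_psd(f, gmn, gmp, fp)

# Assumed example values (hypothetical, not from the paper)
GMN, GMP, FP = 10e-3, 5e-3, 2e9

low = input_noise_psd(1e3, GMN, GMP, FP)
high = input_noise_psd(1e15, GMN, GMP, FP)
print(f"noise ratio low/high frequency: {low / high:.3f}")
print(f"6 cells vs 1 cell at 1.75 GHz: "
      f"{cascade_noise_psd(6, 1.75e9, GMN, GMP, FP) / input_noise_psd(1.75e9, GMN, GMP, FP):.1f}x")
```

The bracketed factor of (12) moves from 3 well below fp to 1 well above it, so the noise density of a cell differs by up to a factor of three across frequency.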
D. PVT sensitivity 
Process, Voltage and Temperature (PVT) variations may affect the gain and the amount of delay
of each delay cell. Due to mismatch and the finite output impedance of the transistors, the DC
gain of the delay cell is not exactly one. In cascaded cells these errors add up. A tuning
mechanism for the DC gain is addressed in section IV. Moreover, just as for any gm-C filter, there
will be spread in the filter time-constant and hence in the delay due to PVT. Using master-slave
techniques [16], these variations can largely be cancelled, e.g. by using replicas of the delay
cell in an oscillator loop and tuning its frequency to a well-known reference frequency.
IV. DELAY CELL ENHANCEMENTS 
 
We will now describe some techniques to reduce true time delay variation (make the delay more 
constant over the frequency band), extend bandwidth and (fine-) tune the delay and gain. 
A. Phase linearization (small delay variations)  
It is shown below that adding a resistor R between gate and drain of M4 (Fig.8) improves the
linearity of the phase transfer function in a limited frequency band. This can be considered as
"inductive peaking", which is often used in wideband amplifiers for equalization of the gain [17].
Here, its purpose and optimization target phase linearity and low fφ=0, to minimize delay
variation. The conductance gm4,Lin of the linearized-phase circuit inside the dashed rectangle in
Fig.8 is:
$$g_{m4,Lin}(s) = g_{m4} \cdot \frac{\frac{sC_{gs,M4}}{g_{m4}} + 1}{sRC_{gs,M4} + 1} \qquad (13)$$
Its value for very low and very high frequencies is gm,LF≈gm4 and gm,HF≈1/R respectively. As is
shown conceptually in Fig.9, the phase transfer function of the linearized delay cell φLin shows a
smaller magnitude of fφ=0 compared to two other phase transfer functions φ1 and φ2. Hence not only
the variation in group delay is reduced but also the variation in true time delay (low fφ=0). This
happens for a certain optimum value of R. For low frequencies the phase curve is similar to that
of an all-pass cell with its pole/zero at gm4/(2πC) (curve φ1), while for high frequencies it
follows the phase curve of a cell with pole/zero at 1/(2πRC) (curve φ2). For intermediate
frequencies the phase curve is an interpolation between the two curves φ1 and φ2. A proper value
of R, found through parametric simulations, results in a curve (φLin) with a minimum amount of
delay variation in a band ∆f around f0, i.e. a minimum value of the criterion fφ=0 (see eqn. (2))
[6], [5]. Note that closer proximity of fφ=0 to zero corresponds to less delay variation vs.
frequency. Fig.10 shows simulated phase curves with R as parameter. The process technology used is
now 160nm CMOS and Table 1 lists the circuit parameters. fφ=0 is evaluated at an operation
frequency of 1.75GHz (in the middle of the band 1-2.5GHz). fφ=0 improves from -0.52GHz for R=0Ω
to -0.06GHz for R=1.5kΩ (optimum). In terms of delay variation over 1-2.5GHz, using (2), a
delay variation reduction from 9.8% for R=0Ω to 1.4% for R=1.5kΩ is found. The phase
linearization resistor increases the noise figure of the delay cell by about 1.7dB.
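The quoted delay variation figures follow directly from (2) with f0=1.75GHz and ∆f=0.75GHz (half the 1-2.5GHz band). A minimal check, with a hypothetical function name:

```python
def delay_variation(f_phi0, f0, df):
    """Relative delay variation from (2) for a frequency offset df around f0."""
    r = f_phi0 / f0
    return r / (1 - r) * df / f0

F0 = 1.75e9   # mid-band operating frequency
DF = 0.75e9   # half of the 1-2.5GHz band

for label, fz in (("R = 0", -0.52e9), ("R = 1.5 kOhm", -0.06e9)):
    variation = abs(delay_variation(fz, F0, DF))
    print(f"{label}: {variation:.1%}")
```

This reproduces the 9.8% and 1.4% stated above.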
B. Bandwidth extension 
The load capacitor plus the parasitic capacitors at the output of the delay cell (CL+Cgs3) cause
an unwanted pole and consequently gain roll-off plus an extra amount of delay. In a cascade of
identical delay cells, the total load plus parasitic capacitance at the output node is 3Cgs,M1.
Therefore, the parasitic pole at the output is fp,out≈gm3/(6πCgs,M1). An active inductive peaking
technique [18] is used for bandwidth extension by adding resistor RBWE to convert the diode
connected transistor M3 to an “active inductor” (Fig.11). The impedance of the active inductor
(the part inside the dotted box) is:
$$Z_{A-ind}(s) = \frac{1}{g_{m3}} \cdot \frac{sR_{BWE}C_{gs,M3} + 1}{s\frac{C_{gs,M3}}{g_{m3}} + 1} \qquad (14)$$
Choosing RBWE=1/gm3 results in ZA-ind=1/gm3. Therefore, the pole at the output becomes
fBWE=gm3/(4πCgs,M1), which means 50% bandwidth extension. Fig.12 shows the gain curve with
RBWE as a parameter. The transistor parameters are the same as in Table 1. Theoretically, 50%
bandwidth extension happens at RBWE (=1/gm,M3)=298Ω. However, because of extra parasitic
capacitance due to Cdb,M2, Cdb,M5, Cdb,M3, Cgd,M2, Cgd,M3 and Cgd,M5 and also the finite output
impedances of M3, M2 and M5, simulation shows a 33% bandwidth extension. The DC gain drop of
2dB is caused by the finite output impedance of the transistors. The bandwidth extension resistor
(RBWE) increases the noise figure of the delay cell by about 0.6dB. In the following subsections a
technique is introduced to compensate the DC gain drop.
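The effect of choosing RBWE=1/gm3 in (14) can be verified numerically: the impedance becomes frequency independent, so Cgs,M3 no longer loads the output node. In the sketch below gm3 is taken as 1/298Ω from the text, while the value of Cgs and the assumption Cgs,M3=Cgs,M1 are illustrative.

```python
import math

def z_active_inductor(f, gm3, r_bwe, cgs_m3):
    """Impedance of the active inductor, equation (14), at s = j*2*pi*f."""
    s = 2j * math.pi * f
    return (1 / gm3) * (s * r_bwe * cgs_m3 + 1) / (s * cgs_m3 / gm3 + 1)

GM3 = 1 / 298.0   # siemens, so that 1/gm3 = 298 ohm as quoted in the text
CGS = 50e-15      # farad, assumed example value with Cgs,M3 = Cgs,M1

# With R_BWE = 1/gm3 the impedance is flat and equal to 1/gm3.
for f in (0.1e9, 1e9, 5e9):
    z = z_active_inductor(f, GM3, 1 / GM3, CGS)
    print(f"f = {f / 1e9:4.1f} GHz  |Z| = {abs(z):.1f} ohm")

# Output pole without and with bandwidth extension (ideal case).
f_out = GM3 / (6 * math.pi * CGS)   # load 3*Cgs,M1
f_bwe = GM3 / (4 * math.pi * CGS)   # load 2*Cgs,M1
print(f"bandwidth extension: {f_bwe / f_out - 1:.0%}")
```

The ideal 50% figure ignores the additional drain and gate-drain parasitics listed above, which is why simulation reports a smaller extension.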
C. Binary tuning of the delay  
Referring to equation (5), delay can be fine-tuned by varying C, which is designed as a 3 bit 
switchable binary weighted capacitor bank (see Fig.13). Because all capacitors of the bank are 
AC-grounded, referred to Vsupply, they are easily switchable with PMOS transistors. 
D. Gain adjustment 
The practically achieved DC-gain of the filter is less than one because of the finite output
impedance of the transistors (refer to equation (6)). In a delay line gain errors add up and there
may be a need to calibrate the gains to unity. For this purpose, a switchable structure consisting
of M2,E, M3,E and M5,E has been added (Fig.14). M2,E and M5,E work in parallel with M2 and M5
to increase their effective width by an amount equal to αW, so that the DC gain is multiplied by
1+α. Transistor M3,E sinks the excess DC current at the output point to keep the DC output bias
voltage unchanged. VDC,bias is re-used from the biasing circuit (refer to Fig.6).
A set of binary weighted switchable gain tuning stages makes the tuning more precise (Fig 15). 3
bits have been used for the DC-gain tuning in a gain-range of 3dB (LSB≈0.4dB). Table 2 shows
a comparison between the simulated results of this work and the simulated results of other
reported gm-RC delay cells (refer to Fig.3). The technology used in every case is UMC 180nm to
compare to [13]. The VGS of the NMOS transistors is the same for all circuits to maintain equal
ft,n=12.4GHz for fair comparison. As the table shows, the pole frequency of this work is much
higher (more than 5x) than that of other works. The NSNR [15] (defined as SNR/P@IM3=1%) criterion
of the cells at 0.1GHz bandwidth was used to compare dynamic range. The NSNR of this work
is 1 dB better than [11] and 6dB less than [2], partly due to IIP3, but mainly due to the number of
noise contributing devices of the new delay cell [12]. Clearly, the strongest point of this work
is its frequency range, which is much better than for other circuits in the same technology.
V. BEAM FORMING SYSTEM DESIGN 
 
In this section the timed array system specifications are related to the delay cell requirements. 
The formulas of this section are used in section VII to find the specifications of the sub-blocks in 
timed array antenna systems. Suppose we aim at N antenna elements, a frequency band from fmin
to fmax, a maximum steering angle θmax w.r.t. the bore-sight and b bits of spatial steering
resolution, while the required noise figure is less than NFmax. No grating lobes should exist and
<-40dB null depth is targeted. From these specifications, system design parameters are extractable
using [3] and [4], like the distance between antenna elements, maximum required delay, number of
delay steps, and the noise figure of each channel.
To avoid grating lobes, the distance between antenna elements (d) must be less than half the
wavelength at the maximum operating frequency (fmax):
$$d \le \frac{\lambda_{f_{max}}}{2} \qquad (15)$$
The noise figure for N antennas improves with 10log(N) [dB] w.r.t. the noise figure for a single
antenna channel. The maximum required delay per channel (τmax) depends on: 1) the number of
antenna elements (N), 2) the distance between antenna elements (d), 3) the maximum steering
angle (θmax). It can be expressed as [4] (c is the speed of the waves in space):
$$\tau_{max} = (N - 1)\,\frac{d \sin(\theta_{max})}{c} \qquad (16)$$
The minimum delay step (τmin) depends on: 1) the distance between antenna elements (d), 2) the
maximum steering angle (θmax) and 3) the spatial resolution in bits (b):
$$\tau_{min} = \frac{d \cos(\theta_{max})}{c} \cdot \frac{\theta_{max}}{2^{b-1}} \qquad (17)$$
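As a worked example of (15) to (17), the sketch below uses the specifications of the chip described later in the paper (N=4, fmax=2.5GHz, θmax=60°, b=4). Two points are assumptions: the exponent in (17) is read as 2^(b-1), which is the reading that reproduces the 13psec step quoted in the paper, and θmax enters the second factor of (17) in radians.

```python
import math

C0 = 299_792_458.0  # speed of light in m/s

def element_spacing(f_max):
    """Largest spacing without grating lobes, equation (15)."""
    return C0 / f_max / 2

def max_delay(n, d, theta_max):
    """Maximum required delay per channel, equation (16)."""
    return (n - 1) * d * math.sin(theta_max) / C0

def min_delay_step(d, theta_max, bits):
    """Minimum delay step, equation (17), assuming a 2**(bits - 1) denominator."""
    return d * math.cos(theta_max) / C0 * theta_max / 2 ** (bits - 1)

N, F_MAX, THETA_MAX, BITS = 4, 2.5e9, math.radians(60), 4

d = element_spacing(F_MAX)
tau_max = max_delay(N, d, THETA_MAX)
tau_min = min_delay_step(d, THETA_MAX, BITS)
nf_gain_db = 10 * math.log10(N)  # noise figure improvement of the array

print(f"d = {d * 100:.1f} cm")
print(f"tau_max = {tau_max * 1e12:.0f} ps, tau_min = {tau_min * 1e12:.1f} ps")
print(f"array noise figure improvement = {nf_gain_db:.1f} dB")
```

With d equal to exactly half a wavelength, the maximum delay comes out at roughly 520ps, slightly above the 510psec quoted in section VII; the small difference presumably stems from rounding of d.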
The null depths are ideally equal to -∞, but gain mismatch will decrease the null depths. Timed
array system simulations show that for a 4-antenna array and null depths below -40dB, less
than 0.06dB gain difference between the channels is required.
VI. 4 CHANNEL WIDE BAND BEAM FORMING IC 
 
The designed delay cell is the basic building block of the time delay based timed array antenna
IC. The IC has four antenna channels (Fig 16) [12]. Each channel applies adjustable delay and
gain to the input signal. As shown in Fig 16, the adjustable delay is a combination of “Fine ∆τ”
cells cascaded with “Coarse ∆τ” delay cells. The “Fine ∆τ” is realized by a cascade of three
delay cells with small delay steps (refer to Fig.13). Each “Coarse ∆τ” is a cell with large delay
steps (refer to Fig 15). In section VII it will be shown that 550ps total delay has been realized
in a 5 bit delay resolution via “Fine ∆τ” and 6 cascaded “Coarse ∆τ” cells. The last “Coarse ∆τ”
cell acts as a termination. An LNA/BALUN precedes the delay chain to reduce the noise figure.
The output signals of the lines are added to each other to complete the beam forming function.
Then the signal is down-converted to IF via a mixer and an external LO. The total noise of the chip
depends on the beam direction, because the amount of delay in each channel (the number of
cascaded Coarse ∆τ cells) changes with the beam direction. The maximum noise figure occurs when
the beam is directed toward the maximum steering angle θmax. In this case the average delay of the
channels is at its maximum.
Analysis based on a Taylor series expansion shows that distortion has only minor impact on the
phase of the fundamental frequency. Therefore the position of the null and its depth do not
change much. However, strong signals may also generate higher harmonics with different phases
than the fundamental signal in each antenna channel. After summation, the amplitudes of the
harmonics can add up and cause high frequency interference even if a signal comes from a null
direction. Whether this is a problem depends on specific requirements and boundary conditions
which are outside the scope of this paper. Below, the functionality and circuit structure of the
sub-blocks are explained.
A. LNA/BALUN 
The LNA/BALUN has four main functions: 1) antenna impedance matching, 2) low noise
amplification, 3) single to differential conversion (BALUN) and 4) gain tuning. The single to
differential conversion makes the signals less sensitive to interference from other channels. It
exploits a noise cancelling common gate (CG)-common source (CS) structure (Fig.17) [19]. The DC
blocking capacitors Ccg, Ccs and Ccsg are the only series capacitors in each channel. Due to a
design error, their parasitic capacitance to the substrate is the main cause of the bandwidth
limitation. Vout,cg and Vout,cs are DC fed to the “Fine ∆τ” block. The AC gain in the CG and CS
stages of the LNA/BALUN can be trimmed by controlling bias voltages Vb1 and Vb2 to provide gain
equalization and calibration. A 4 bit DAC is used to cover 1dB gain variation with 0.06dB as LSB
step. This small range hardly degrades S11 (it remains less than -10dB) but provides the gain
resolution required for <-40dB null depths.
B. Fine delay control 
The “Fine ∆τ” block realizes small delay steps. It consists of 3 cascaded delay adjustable cells
(Fig.13), to cover one coarse delay step with extra margin for PVT, to prevent “missing bits”.
The gain of the Fine ∆τ blocks is the same for all antenna channels. Therefore, it does not affect
the beam pattern and consequently these blocks do not require gain calibration.
 
C. Coarse delay control 
The Coarse ∆τ delay line consists of six cascaded delay cells, each with a fixed delay and an
adjustable amount of gain. At each (voltage) output of a coarse delay cell there is a V to I
converter (see Fig 16) which can be activated or not. This acts as a selectable tap to effectively
change the length (and the delay amount) of the delay line. The gain adjustability of the stages
serves to calibrate the gain of the coarse delay line to unity, independent of the number of
cascaded blocks.
D. Selectable V to I converters 
The selectable V to I converters fulfill two tasks: they select the desired output of the delay
chain and they convert the signal from voltage to current. The input capacitance of the V to I
converter has an effect on the delay of the channel, but because this delay shift is equal for all
channels it does not affect the beam pattern. However, it limits the bandwidth. Because the
signals are converted to current, the summation function required for beam forming can be
implemented by simply connecting all outputs together.
E. LO, Mixer and output buffer 
An external differential LO is used to down-convert the beam formed signal to IF. The circuit
and its outputs are differential and an active Gilbert cell mixer is used with load resistors. The
output voltage is buffered via source followers to match the output impedance to 50Ω.
VII. CHIP IMPLEMENTATION AND MEASUREMENTS 
To demonstrate wideband beamforming, a 4-channel beam forming chip was designed in 160nm
CMOS, covering more than one octave of bandwidth from 1GHz to 2.5 GHz. The beam can be
steered from -60° to 60° relative to bore-sight in 4 bits resolution. To avoid grating lobe
conditions, the required inter-element antenna distance is 0.5λ2.5GHz≈6cm (refer to (15)). The
maximum required delay in each channel is found from (16) and is τmax=510psec. The delay step size
is derived from the 4 bit beam steering resolution (refer to (17)): τmin≈13psec. The τmax to τmin
ratio shows that for 4 bit steering angle resolution 5 bit delay-resolution is needed per channel.
The targeted average noise figure of the channels when the beam steers towards θmax=60° is
8dB at the mid of the frequency range (f0=1.75GHz), i.e. the noise figure of each channel at
255psec delay must be 8dB. The 255psec delay consists of 3 cascaded Coarse ∆τ cells besides
Fine ∆τ. A single ended to differential voltage gain of 13 dB for each channel and 3.5dB noise
figure for the LNA/BALUN theoretically results in 8.9 dB noise figure for every individual
delay cell. Keeping the overdrive voltage of the transistors similar to Table 1 results in 3.6mA
current for each individual delay cell. In this test chip the simple bias circuit of Fig.6,
consisting of a diode connected NMOS and PMOS, was used. Reduction of the supply directly decreases
the current and consequently affects the gm, noise and time-delay of the delay cell. To stabilize
performance either a voltage regulator will be needed, or a bias current source should be used to
bias M6 and M7 in Fig.6.
Fig 18 shows the chip photograph. For each channel, the delay vs. frequency over
the whole frequency band and for all settings was measured. An effective number of bits for the
delay setting equal to ENOB=4.7 (NOB=5) was found (Fig.19). The delays are approximately
constant within <10psec variation in the operating frequency band of 1-2.5GHz. The flatness of
the delay curves in Fig.19 is an immediate result of applying the technique in Fig.9 and
Fig.10 to linearize the phase vs. frequency and demonstrates that the optimization approach is
highly effective. Substituting the maximum delay variation (10psec) and the maximum amount
of delay (τ(f0)=550psec) in (2), we find fφ=0≈0.06GHz at the mid of the frequency band
(f0=1.75GHz), which is close to the circuit simulations shown in Fig.10.
Fig 20 shows the gain vs. frequency variations for all delay and gain settings (without the effect
of the LNA/BALUN gain trimming). For all delay settings the gain varies less than 1.8dB at
each individual frequency point in the 1GHz to 2.5GHz band. For each delay setting (a fixed
delay) the gain vs. frequency variation from 1GHz to 2.5GHz remains less than 2.8dB (or
±1.4dB with respect to its average). The gain adjustability in the BALUN provides another
opportunity to trim the gain at the individual frequency points with 0.03dB resolution. The gain
variation over all delay settings with the help of BALUN gain trimming is within 0.03dB over
the 1 to 2.5 GHz band (non-simultaneous). This gain equality resolution results in deep nulls
in the beam pattern, which will be shown later (Fig 22). The gain, noise figure and input
matching (S11) vs. frequency of a single receiver channel set at 255psec delay are shown in Fig.21.
The 255psec is the average delay of the 4 channels when the beam steers towards its maximum angle
(θmax), which results in the maximum noise figure for the timed array (worst case scenario). For
other steering angles the average delay of the channels is less and therefore the noise figure is
better. The measured gain and noise figure are in agreement with the simulations within 0.9dB.
The measured beam pattern was compared with a simulated ideal time delay based beam forming 
system as shown in Fig 22. For the frequency range from 2 to 2.5GHz, the -3dB beam width varies 
from 63° to 51° and the null to null distance from 37° to 29°. Also, a new null appears at -38° 
in the pattern, both in measurement and simulation. 
The method used for beam pattern measurement is as follows: 4 RF signals representing the 
antenna signals are generated by 4 external RF generators. Experiments were done at 2GHz and 
2.5 GHz, while an external 3GHz signal is applied to the LO input. The beam formed signal is 
down-converted to 500MHz to 1GHz. Going through all delay settings, the beam patterns for 
2GHz and 2.5 GHz are synthesized. Comparing to the simulated beam pattern, it can be seen 
(Fig 22) that the spatial directions of the beam and nulls in the measured pattern closely follow 
the ideal pattern. The null depth of the beam pattern is limited to -24dB, which is caused by 
cross talk between the off-chip transmission lines of the antenna channels. 
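As an illustration of the kind of ideal reference pattern used in this comparison, the following 
Python sketch computes the array factor of a 4 element timed array whose channel delays are 
rounded to the 13 psec step. The element spacing (6 cm, half a wavelength at 2.5GHz), the 30° 
steering angle and all function names are assumptions made for this example; they are not taken 
from the measurement setup. 

```python
import numpy as np

C = 3.0e8            # speed of light (m/s)
N_ELEMENTS = 4       # four antenna channels, as in the measured chip
DELAY_STEP = 13e-12  # delay step quoted in the text (s)
SPACING = 0.06       # ASSUMPTION: element pitch in metres (half a wavelength at 2.5 GHz)


def quantized_delays(steer_deg):
    """Per-channel delays that steer to steer_deg (>= 0), rounded to the 13 ps grid."""
    n = np.arange(N_ELEMENTS)
    ideal = n * SPACING * np.sin(np.radians(steer_deg)) / C
    return np.round(ideal / DELAY_STEP) * DELAY_STEP


def beam_pattern_db(freq_hz, delays, angles_deg):
    """Normalized array factor in dB for plane waves arriving from angles_deg."""
    n = np.arange(N_ELEMENTS)[:, None]
    # Arrival-time advance of element n for a plane wave from each angle.
    arrival = n * SPACING * np.sin(np.radians(angles_deg))[None, :] / C
    phase = 2 * np.pi * freq_hz * (arrival - delays[:, None])
    af = np.abs(np.exp(1j * phase).sum(axis=0)) / N_ELEMENTS
    return 20 * np.log10(np.maximum(af, 1e-6))


angles = np.arange(-90.0, 90.5, 0.5)
delays = quantized_delays(30.0)  # ASSUMPTION: example steering angle of 30 degrees
peaks = {}
for f in (2.0e9, 2.5e9):
    pattern = beam_pattern_db(f, delays, angles)
    peaks[f] = angles[np.argmax(pattern)]
    print(f"{f / 1e9:.1f} GHz: main beam at {peaks[f]:.1f} deg")
```

Because the beam is steered with time delays rather than phase shifts, the main beam in this 
sketch stays at nearly the same angle for both frequencies; only the beam width and null 
positions change with frequency. 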
Table 3 [24] compares several reported delay circuits implemented in different technologies 
and topologies. Compared to other methods, the proposed circuit provides the lowest delay 
variation vs. frequency (1.8% over more than an octave of bandwidth). It is also the most 
compact delay circuit, providing between 1 and 2 orders of magnitude more delay per area. 
Therefore this circuit is one of the best candidates for low GHz RF band applications requiring 
large amounts of delay. Table 4 shows a comparison between our gm-RC timed array chip and 
two other reported time delay based chips designed for beamforming, which exploit LC delay 
lines and transmission lines. The compactness of the delay cells allows us to implement the chip 
in a much smaller area at comparable power consumption. The reported noise figure of this 
circuit is higher, but it is for the worst case scenario (maximum steering angle: θ=60°). 
Steering to smaller angles (referred to bore-sight) requires less delay and produces less noise. 
The reported amplitude vs. frequency variation (1.4dB) is instantaneous for the 1-2.5GHz 
frequency band at each delay setting. The gain trimming property of the LNA plus gain 
calibration of the delay cells provide 0.03 dB resolution for individual frequencies. 
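The delay per area figures of Table 3 can be rechecked with a few lines of Python. The sketch 
below is only an illustrative calculation using the maximum delay and size entries of the table 
(the entry with unknown size is left out); the variable and function names are chosen for this 
example. 

```python
# (maximum delay in ps, size in mm^2) as listed in Table 3.
# The circuit of [25] is omitted because its size is not reported.
prior_work = {
    "[21]": (64, 0.82),
    "[22]": (16, 0.35),
    "[23]": (54, 1.44),
    "[9]": (225, 1.5),
    "[24]": (25, 0.23),
    "[26], first column": (12, 0.1),
    "[26], second column": (12.5, 0.1),
}
this_work = (550, 0.07)


def delay_per_area(entry):
    """Return delay density in ps/mm^2 for a (delay_ps, area_mm2) pair."""
    delay_ps, area_mm2 = entry
    return delay_ps / area_mm2


best_prior = max(delay_per_area(e) for e in prior_work.values())
ours = delay_per_area(this_work)
improvement = ours / best_prior
# 10 ps worst-case delay variation relative to the 550 ps maximum delay.
relative_variation = 10 / 550

print(f"this work:  {ours:.0f} ps/mm^2")
print(f"best prior: {best_prior:.0f} ps/mm^2")
print(f"improvement: {improvement:.1f}x")
print(f"relative delay variation: {relative_variation:.1%}")
```

With these table entries the ratio to the best earlier circuit comes out slightly above 50, 
which is consistent with the "at least 50x" statement. 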
VIII. CONCLUSIONS 
This paper presented a compact all-pass gm-C cell that was compared to other reported gm-RC 
delay cells, showing a 5x larger operating frequency range. A chip implementation of the delay 
cell in 160nm CMOS results in a flat gain up to 2.5GHz for the delay cell, with the help of a 
bandwidth extension technique. The delay cells are directly cascadable to realize a delay line 
without AC-coupling or buffering. This avoids parasitic capacitances to ground from DC 
blocking capacitors, which limit frequency range or require extra current consumption. The 
circuit exploits current re-use with a slow PMOS low-pass path in parallel to a fast NMOS unity 
gain path to maximize the useful frequency range. Bandwidth and phase linearity are further 
enhanced by adding carefully dimensioned resistors to the diode-connected transistors. Gain fine 
control in the delay cells allows for precise gain calibration, independent of delay. Using 
simulation, a direct comparison of the new delay cell with existing gm-C and gm-RC delay cells 
has been made in terms of frequency range, dynamic range and power consumption. The SNR/P at 
1% IM3 of the designed delay cell is 1dB better than [11] and 6dB worse than [2], while the 
frequency range is at least 5x larger (compared to [2] and [11]). To validate performance, a 4 antenna 
beamforming receiver chip with a maximum steering angle θmax of 60° was designed in 160nm 
CMOS technology with a total delay per channel of 550 psec in an area of 0.15 mm². Compared 
to other chips with LC delay lines and transmission lines, this is about 2 orders of magnitude 
more delay per area. The 550 psec delay is digitally controllable in 13psec steps. Delay 
variation over a bandwidth from 1 to 2.5GHz is less than 10 psec, which is only 1.8% of the 
realized delay. In other words, the selectable delays are monotonic, with low RMS error in the 
frequency band, and are therefore easy to use in calibration schemes. The delay/size of the 
circuit, which is 7857 ps/mm², is at least 50x more than that of other delay circuits reported 
in literature, which makes it well suited for low GHz operation that needs large amounts of 
delay. An effective delay resolution of 4.7 bits is demonstrated, which corresponds to an 
effective spatial beam steering resolution of 3.5 bits for a full scale steering range of ±60°, 
i.e. 10.6° spatial angle resolution. The average noise figure of each antenna channel in the 
worst case scenario (when the average delay in the 4 channels is maximum, i.e. the beam 
direction is at 60°) is 10dB. 
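The quoted spatial angle resolution follows from simple arithmetic, sketched here in Python 
under the assumption that the full scale steering range spans 120° (from -60° to +60°). Reading 
2^ENOB as a number of effective delay levels is likewise an illustrative assumption. 

```python
steering_range_deg = 120.0  # ASSUMPTION: full-scale range from -60 to +60 degrees
spatial_bits = 3.5          # effective beam steering resolution quoted in the text
delay_enob = 4.7            # effective number of bits of the delay setting

# Angle step obtained when the steering range is divided into 2^bits levels.
angle_resolution_deg = steering_range_deg / 2 ** spatial_bits
# ASSUMPTION: ENOB interpreted as log2 of the number of distinguishable delay levels.
effective_delay_levels = 2 ** delay_enob

print(f"angle resolution: {angle_resolution_deg:.1f} deg")
print(f"effective delay levels: {effective_delay_levels:.0f}")
```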
 
REFERENCES 
 
[1]  A. V. Oppenheim, R. W. Schafer and J. Buck, "Filter Design Techniques," in Discrete-Time 
Signal Processing, 2nd ed., New Jersey, Prentice-Hall, 1999, ch. 7, sec. 1, pp. 440-449. 
[2]  J. Buckwalter and A. Hajimiri, "An active analog delay and the delay reference loop," IEEE 
RFIC Symp., 2004, pp. 17-20.  
[3]  H. J. Visser, "Antennas," in Array and Phased Array Antenna Basics, West Sussex, 
England, Wiley, 2006, ch. 2, sec. 5, pp. 76-80. 
[4]  R. J. Mailloux, "Phased Arrays in Radar and Communication Systems," in Phased Array 
Antenna Handbook, 2nd ed., Norwood, Artech house, 2005, ch.1, sec. 3, pp. 44-60. 
[5]  S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. van Vliet, "Phased-array antenna 
beam squinting related to frequency dependency of delay circuits," EuRad Conf., 2011, pp. 
416-419.  
[6]  S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. van Vliet, "Time delay circuits: A 
quality criterion for delay variations versus frequency," IEEE ISCAS Proc., 2010, pp. 4281-
4284.  
[7]  F. Ellinger, "Passive Devices and Networks," in Radio Frequency Integrated Circuits and 
Technologies, Berlin, Germany, Springer, 2007, ch. 6, sec. 1, pp. 195-198. 
[8]  F. E. van Vliet, M. van Wanum, A. W. Roodnat and M. Alfredson, "Fully-integrated 
wideband TTD core chip with serial control," Gallium Arsenide applications symp., Munich, 
pp. 89-92, 2003.  
[9]  T. Chu, J. Roderick and H. Hashemi, "An Integrated Ultra-Wideband Timed Array Receiver 
in 0.13 μm CMOS Using a Path-Sharing True Time Delay Architecture," IEEE J. Solid-State 
Circuits, vol. 42, no. 12, pp. 2834-2850, Dec. 2007.  
[10]  B. Razavi, Design of Analog CMOS Integrated Circuits, New York: McGraw-Hill, 2001, 
ch. 12, sec. 2, pp. 410-423. 
[11]  P. Horowitz and W. Hill, "Unity-gain phase splitter," in The Art of Electronics, 2nd ed., New 
York, Cambridge University Press, 1999, ch. 2, sec. 8, pp. 77-78. 
[12]  S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. van Vliet, "A 1-to-2.5GHz 
phased-array IC based on gm-RC all-pass time-delay cells," IEEE ISSCC Dig. Tech. Papers, 
2012, pp. 80-82.  
[13]  S. K. Garakoui, E. A. M. Klumperink, B. Nauta and F. E. van Vliet, "Frequency Limitations 
of First-Order gm - RC All-Pass Delay Circuits," IEEE T. on Circuits and Systems II, vol. 
60, no. 9, pp. 572-576, Aug. 2013.  
[14]  K. Bult and H. Wallinga, "A CMOS analog continuous-time delay line with adaptive delay-
time control," IEEE J. Solid-State Circuits, vol. 23, no. 3, pp. 759- 766, June 1988.  
[15]  K. Bult, "Harmonic Performance," in Analog CMOS Square-Law Circuits, PhD Thesis, 
Enschede, University of Twente, 1988, ch. 6, sec. 4, pp. 93-98. 
[16]  B. Nauta, "Tuning," in Analog CMOS Filters for Very High Frequencies, Enschede, The 
Netherlands, Springer, 1992, ch. 5, sec. 2, pp. 139-141. 
[17]  B. Razavi, "Prospects of CMOS technology for high-speed optical communication circuits," 
IEEE J. Solid-State Circuits, vol. 37, no. 9, pp. 1135-1145, Sep. 2002.  
[18]  T. H. Lee, "Noise," in The Design of CMOS Radio-Frequency Integrated Circuits, 2nd ed., 
Cambridge University Press, 2003, ch. 11, pp. 361-362. 
[19]  S. C. Blaakmeer, E. A. M. Klumperink, D. M. W. Leenaerts and B. Nauta, "The BLIXER, a 
Wideband Balun-LNA-I/Q-Mixer topology," IEEE J. Solid-State Circuits, vol. 43, no. 12, pp. 
2706-2715, Dec. 2008.  
[20]  E. A. M. Klumperink and B. Nauta, "Systematic comparison of HF CMOS 
transconductors," IEEE T. on Circuits and Systems II, vol. 50, no. 10, pp. 728- 741, Oct. 
2003.  
[21]  J. Roderick, H. Krishnaswamy, K. Newton and H. Hashemi, "Silicon-Based Ultra-
Wideband Beam-Forming," IEEE J. Solid-State Circuits, vol. 41, no. 8, pp. 1726- 1739, 
Aug. 2006.  
[22]  H. Veenstra, M. Notten, D. Zhao and J. R. Long, "A 3-channel true-time delay transmitter 
for 60GHz radar-beamforming applications," ESSCIRC Proc., 2011, pp. 143-146.  
[23]  T. Chu and H. Hashemi, "A true time-delay-based bandpass multi-beam array at mm-waves 
supporting instantaneously wide bandwidths," IEEE ISSCC Dig. Tech. Papers, 2010, pp. 
38-39.  
[24]  A. C. Ulusoy, B. Schleicher and H. Schumacher, "A Tunable Differential All-Pass Filter for 
UWB True Time Delay and Phase Shift Applications," IEEE Microwave and Wireless 
Components Letters, vol. 21, no. 9, pp. 462-464, 2011.  
[25]  E. Adabi and A. M. Niknejad, "Broadband variable passive delay elements based on an 
inductance multiplication technique," IEEE RFIC Symposium, 2008, pp. 445- 448.  
[26]  Q. Ma, R. Mahmoudi and D. M. W. Leenaerts, "A 12ps true-time-delay phase shifter with 
6.6% delay variation at 20-40GHz," IEEE RFIC Symposium, 2013, pp. 61-64.  
 
Captions:  
Fig. 1. The gain and phase transfer function of an ideal time delay cell 
Fig. 2. 1st order all-pass filter with extrapolation point fφ=0 vs. ideal delay (linear scales) 
Fig. 3. Two known gm-RC delay circuits: a) of [11] and b) of [2] 
Fig. 4. The block view and architectural view of the 1st order all-pass filter implementable 
with no floating caps 
Fig. 5. The proposed 1st order gm-C all-pass filter in [12] 
Fig. 6. The biasing circuitry of the first filter cell in a cascade line 
Fig. 7. The phase at the gate of M5, and the output of the delay cell in a delay line 
Fig. 8. The phase linearization technique 
Fig. 9. Low frequency linearization technique of the phase transfer curve of the filter 
(conceptually depicted) 
Fig. 10. Simulation results of the phase linearization technique for the parameters shown in 
Table 1 
Fig. 11. Bandwidth extension of the filter 
Fig. 12. Simulation of bandwidth extension technique (RWBE as a parameter) 
Fig. 13. The delay cell with 3 bit delay selection 
Fig. 14. Adding gain tuner to the delay cell 
Fig. 15. Adding 3-bit gain calibration to the filter 
Fig. 16. The timed array IC: block level [12] 
Fig. 17. LNA/BALUN 
Fig. 18. The chip photograph 
Fig. 19. Delay versus frequency for all delay settings 
Fig. 20. Gain of the delay line (fine tune and coarse tune) vs. f for all delay settings 
Fig. 21. Gain, input matching and noise figure when delay of the channel is 255ps 
Fig. 22. The measured beam pattern compared to a simulated ideal beam pattern 
 
 
Table 1. The transistor parameters of the simulated delay cell 
 
Table 2. Comparison between simulated delay cells 
 
Table 3. Benchmarking and comparison of different reported delay cells. 
 
Table 4. Comparison between this work and two other time delay based timed array systems 
 
  W/L(m/m) Vth(V) ID(A) VGS(V) VDS(V) 
M1 13.76/0.25 0.465 488 0.714 0.714 
M2 13.76/0.25 0.465 488 0.714 0.714 
M3 13.76/0.25 0.465 488 0.714 0.714 
M4 10.64/0.25 0.449 488 1.086 1.086 
M5 21.28/0.25 0.449 976 1.086 1.086 
 
Table 1. The transistor parameters of the simulated delay cell 
 
 
 
                                          Fig.3a [11]   Fig.3b [2]   This work 
VGS,nmos (V)                              0.714         0.714        0.714 
VGS,pmos (V)                              -             -            1.086 
Vth,nmos (V)                              0.427         0.427        0.427 
Vth,pmos (V)                              -             -            0.449 
IDC (mA)                                  0.2           3.4          0.6 
Pole fp (GHz)                             0.36          0.48         2.63 
Input referred noise @fp (nV/sqrt(Hz))    12.8          2.4          6.2 
IIP3 (dBm @ 50 Ω) @fp                     3             5            3 
VIM3=1% (mV)                              48            84           45 
SNR @ IM3=1% in BW=0.1GHz                 52            71           57 
Normalized SNR/P (1%IM3, 1Hz, 1mW [20])   138           145          139 
 
Table 2. Comparison between simulated delay cells 
 
 
 
                      [21]      [22]      [23]      [9]       [24]      [25]      [26]      [26]      This work 
Technology            0.18 μm   0.13 μm   0.13 μm   0.13 μm   0.8 μm    0.09 μm   0.25 μm   0.25 μm   0.14 μm 
                      SiGe      SiGe      SiGe      CMOS      SiGe      CMOS      SiGe      SiGe      CMOS 
Supply voltage (V)    2.5       2.5       2.5       1.5       2.5       N/A       2.7       1.5       1.5 
Gain variation (dB)   1         N/A       N/A       3         0.7       3         0.9       1.4       1.4 
Max. delay (ps)       64        16        54        225       25        26        12        12.5      550 
Frequency (GHz)       1-15      55-65     31-41     1-15      3-10      0-8       20-40     20-40     1-2.5 
                                                                                  25-35     25-35 
Delay variation       16        N/A       N/A       14        40        10        6.4       6.6%      1.8% 
                                                                                  3%        3.2% 
PDC/channel (mW)      87.5      -         104       78        38.5      -         146       33        90 
Resolution (ps)       4         1.2       18        15        Cont.     13        Cont.     Cont.     13 
Size (mm²)            0.82      0.35      1.44      1.5       0.23      N/A       0.1       0.1       0.07 
Delay/Size (ps/mm²)   78        45        37.5      150       109       N/A       120       125       7857 
Table 3. Benchmarking and comparison of different reported delay cells. 
 
                            This work                                  Chu, ISSCC 2007 [9]   Van Vliet, GAAS 2003 [8] 
Technology                  CMOS, 140nm                                CMOS, 130nm           PHEMT, 250nm 
Technique                   Gm-C                                       LC delay              Transmission line 
Features per antenna channel 
Gain                        12-15dB (f-dependent)                      10dB                  6-9dB 
Noise figure                8-10dB                                     2.9-4.8dB             4.3dB 
IIP3                        -13 to -20dBm (min to max delay length)    Not mentioned         Not mentioned 
-1dB compression point      -21 to -28dBm (min to max delay length)    Not mentioned         5dBm 
Amplitude variation vs. f   1.4dB                                      1dB                   2.5dB 
Delay resolution            14psec                                     15psec                2.5psec 
Delay variation vs. f       <10psec                                    <40psec               <20psec 
Maximum delay               550psec                                    225psec               150psec 
Current consumption         50mA                                       40mA                  Not mentioned 
Complete 4 channel beamformer 
Beam direction resolution   3.5Bit                                     3.5Bit                6Bit 
Bandwidth                   1-2.5GHz                                   18GHz                 3-16GHz 
Power consumption           250mA@1.8V                                 370mA@1.5V            Not mentioned 
Die area                    1mm²                                       10mm²                 10mm² 
 
Table 4. Comparison between this work and two other time delay based timed array systems 