The Large Analog Bandwidth Recorder And Digitizer with Ordered Readout
  (LABRADOR) ASIC by Varner, G. S. et al.
arXiv:physics/0509023v2 [physics.ins-det] 15 Jun 2007
The Large Analog Bandwidth Recorder and Digitizer with Ordered
Readout (LABRADOR) ASIC
G.S. Varner a,∗, L.L. Ruckman a, J.W. Nam b, R.J. Nichol c, J. Cao d, P.W. Gorham a and
M. Wilcox e
aDepartment of Physics and Astronomy, University of Hawaii, 2505 Correa Road, Honolulu HI 96822, USA
bDepartment of Physics and Astronomy, University of California at Irvine, Frederick Reines Hall, Irvine CA 92697, USA
cDepartment of Physics and Astronomy, University College London, London WC1E 6BT, UK
dwas at the Univ. of Hawaii, now with NeuroPace Inc., 1375 Shorebird Way, Mountain View CA 94043, USA
ewas at the Univ. of Hawaii, now with Oceanit Laboratories Inc., 9565 Kaumualii Hwy., Waimea HI 96769, USA
Abstract
Three generations of full-custom analog integrated circuits have been developed for low-power, high-speed
sampling of Radio-Frequency (RF) transients at rates in excess of the Nyquist minimum. These 0.25µm CMOS devices
are denoted the Large Analog Bandwidth Recorder and Digitizer with Ordered Readout (LABRADOR) ASICs; the
final generation consists of 9 channels, each 260 samples deep. Continuous sampling is provided with common stop
capability. Input analog bandwidth is approximately 1GHz and sampling speeds are adjustable from 0.02 to 3.7GSa/s.
Completely parallel internal conversion supports 12-bit digitization and readout of all 2340 cells in under 50µs.
1. Introduction
Observation of the early universe through neu-
trino messengers of the highest possible energies
requires a detector of enormous instrumented vol-
ume. One promising means to observe such a large,
radio-transparent target is viewing the Antarc-
tic ice shelf via high altitude balloon [1]. Such a
balloon-borne detector needs hundreds of high-
speed sampling channels (multi-event buffering),
operating over a frequency band from 200-1200
MHz [2]. Since all power must come from solar
panels, and heat dissipation is a major problem,
commercial flash ADCs were precluded.
∗ Corresponding author. Tel./fax: +001 808-956-2987.
Email address: varner@phys.hawaii.edu (G.S. Varner).
For at least two decades a number of Switched
Capacitor Array (SCA) devices have been re-
ported in the high energy physics literature, for
example [3,4,5], and many with sampling speeds
high enough for greater than Nyquist sampling
of a GHz analog bandwidth signal. These GSa/s
devices have been used for low and high energy
neutrino detection [6], particle physics [7,8] and
gamma-ray astronomy [9]. However, despite such
high sampling speeds, all of these devices have
analog bandwidth cutoffs which limit their use at
UHF frequencies and above.
We present here the results of three generations
of a high analog bandwidth ASIC designed to meet
these instrumentation needs.
Preprint submitted to Elsevier Science 8 August 2018
2. Architecture
A number of different CMOS SCA architectures
have been discussed in the literature. An excellent
summary of the storage circuit details and perfor-
mance may be found in Ref. [10]. As will be seen
below, in order to couple in high analog bandwidth
it is necessary to limit the parasitic and storage
capacitance of the SCA array. Thus a compact,
minimal storage array was considered and initial
prototyping looked promising [11]. This choice of
a compact storage matrix was guided by, and is
synergistic with, very similar storage architectures
being explored for Monolithic Active Pixel Sensors
(MAPS) [12] for charged particle tracking.
2.1. Theory of Operation
Employment of SCA techniques in CMOS processes
has been effective in the areas of basic signal
processing, continuous filter design, and programmable
capacitor arrays, used for Digital-to-Analog (DAC)
and Analog-to-Digital (ADC) conversion. As elements
of a basic programmable filter, a simple inline
capacitor between two switches may be used to form
a frequency-controlled resistor, with resistance R
given by [13]:
$$R = \frac{1}{f_c C} \tag{1}$$
for a given capacitor C, being switched at fre-
quency fc [Hz]. Almost arbitrarily complex filters,
composed of these variable R and C configura-
tions, can be formed and expressed in terms of
poles and zeros in a transfer function, the mathe-
matics of which is conveniently described via the
z-Transform [14], a staple of modern signal pro-
cessing. As an example, from these simple build-
ing blocks, first order filters can be constructed as
represented by the transfer function:
$$H(z) = K \frac{b_0 + b_1 z^{-1}}{1 + a_1 z^{-1}} \tag{2}$$
where $z^{-1} = e^{-i\omega T}$, $T = 1/f_c$, $-1 < a_1 < 1$, and
K is an overall normalization constant. Through
choice of constants, one can form a Low-pass filter
($b_0 - b_1 = 0$):
$$H(z) = K \frac{1 + z^{-1}}{1 + a_1 z^{-1}} \tag{3}$$
or a High-pass filter ($b_0 = -b_1$):
$$H(z) = K \frac{1 - z^{-1}}{1 + a_1 z^{-1}} \tag{4}$$
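As a quick numerical check of Eqs. (2)-(4), the sketch below evaluates the first-order transfer function at DC (ωT = 0) and at the Nyquist frequency (ωT = π). The function name and the pole value a1 = -0.5 are illustrative assumptions, not values taken from this paper.

```python
import cmath
import math

def first_order_response(b0, b1, a1, omega_T, K=1.0):
    """Evaluate H(z) = K (b0 + b1 z^-1) / (1 + a1 z^-1) at z^-1 = exp(-i*omega*T)."""
    z_inv = cmath.exp(-1j * omega_T)
    return K * (b0 + b1 * z_inv) / (1 + a1 * z_inv)

a1 = -0.5  # illustrative pole coefficient (assumption); must satisfy -1 < a1 < 1

# Low-pass choice (b0 = b1): finite gain at DC, a zero at the Nyquist frequency.
lp_dc = abs(first_order_response(1, 1, a1, 0.0))
lp_nyq = abs(first_order_response(1, 1, a1, math.pi))

# High-pass choice (b0 = -b1): a zero at DC, finite gain at the Nyquist frequency.
hp_dc = abs(first_order_response(1, -1, a1, 0.0))
hp_nyq = abs(first_order_response(1, -1, a1, math.pi))
```

With these coefficients the low-pass form passes DC and nulls the Nyquist frequency, while the high-pass form does the opposite, as the zero locations in Eqs. (3) and (4) imply.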
Since first-order filters only have one real pole, they
cannot directly realize band-pass or notch filters.
More flexible and universal are filters of second
order and beyond. Second-order SC filters are often
called biquad circuits and may be expressed as
$$H(z) = K \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{5}$$
which is analogous to the continuous-time case, where
the transfer function may be represented by
$$H(s) = K_0 \frac{s^2 + d_1 s + d_0}{s^2 + c_1 s + c_0} \tag{6}$$
As long as the sampling frequency is much
higher than the signals of interest, the approximation
$z^{-1} \simeq 1 - i\omega T$ may be used, and from this
point standard pole-zero analysis applies.
Beyond simple synthesis of rather complex filters
using standard tools, the true power of this
technique lies in pairing such SCA processing with
operational amplifiers on an integrated circuit to
achieve powerful sampling and signal manipulation
capabilities. For instance, in analogy with an R-2R
ladder topology, a multiplying DAC may be expressed
using an array of switches and capacitors
with the simple transfer function [15]
$$H(z) = z^{-1/2} \sum_{i=1}^{n} 2^{-i} b_i \tag{7}$$
which, ignoring the half-period delay indicated by the
$z^{-1/2}$, can synthesize an output voltage $v_{out}$ based
upon a reference voltage $v_{ref}$ via the expression
$$v_{out} = v_{ref} \sum_{i=1}^{n} \frac{b_i}{2^i} \tag{8}$$
with the $b_i$ being the binary-coded digital signal,
precisely as expected for a DAC. ADC topologies
are now myriad and the focus of this paper is on
a specific type: the transient waveform recorder.
In some ways this makes use of the simplest SCA
structure of them all, the Sample-and-Hold (S/H)
circuit. The great power of the papers referenced
above derives from the ever increasing speed and
compactness of deep submicron CMOS processes.
While an idealized waveform recorder is simply
an array of S/H circuits, parasitic capacitances re-
quire consideration of parasitic circuits like those
referenced above. For the specific application at
hand, parasitic inductances and capacitances are
critical to storing analog waveforms with frequency
content in the Giga-Hertz range.
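The binary-weighted sum of Eq. (8) can be illustrated with a few lines of Python. This is only a sketch of the ideal expression; the function name and the example codes are assumptions made for illustration, and no capacitor mismatch or switch non-ideality is modeled.

```python
def dac_output(bits, v_ref):
    """Ideal multiplying-DAC output v_out = v_ref * sum(b_i / 2**i), i = 1..n.

    bits[0] is b_1 (the most significant bit); each entry must be 0 or 1.
    """
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must contain only 0 or 1")
    return v_ref * sum(b / 2**i for i, b in enumerate(bits, start=1))

v_half = dac_output([1, 0, 0, 0], v_ref=1.0)    # only b_1 set: half of v_ref
v_code_101 = dac_output([1, 0, 1], v_ref=2.0)   # 2.0 * (1/2 + 1/8)
v_full_scale = dac_output([1] * 12, v_ref=1.0)  # all 12 bits set: v_ref * (1 - 2**-12)
```

The full-scale code falls one least count short of the reference voltage, which is the usual property of a binary-weighted DAC.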
2.2. Bandwidth Limitations
In order for an SCA storage device to be useful,
it must have a decent number of storage cells. Load
capacitance increases as a function of the number
of switches connected to the incoming signal line,
as well as the resistance-shielded storage capaci-
tances when the switches are closed. For a properly
coupled 50Ω stripline into an ASIC, for a purely
capacitive storage array, the 3dB roll-off is given as
$$f_{3dB} = \frac{1}{2\pi Z_0 C}. \tag{9}$$
Therefore, to obtain a 3dB bandwidth of 1.2GHz,
the pure capacitance must be limited to approximately
2.65pF. This value is already smaller than
that of the high-ESD protection diodes (∼10pF)
provided in the standard design library used, and
therefore the input protection must be modified.
A more accurate assessment of the input coupling
performance requires a refined description of the
input circuit model and will be discussed in much
more architecture-specific detail below. In summary,
to realize 1GHz of analog bandwidth with
good coupling, we use the following design principles:
(i) 50Ω stripline everywhere
(ii) minimize input protection capacitance
(iii) minimize switch drain, storage capacitance
Based upon these considerations, efforts have
been made to maintain a 50Ω coupling across the
sampling array inside the ASIC. As a trade-off be-
tween storage depth and parasitic drain capaci-
tance, 256 samples per input was chosen. Finally,
the size of the storage capacitance was studied.
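The capacitance budget quoted above follows directly from Eq. (9). The short sketch below reproduces the 2.65pF limit and, under the same purely capacitive assumption, shows the bandwidth that a 10pF protection diode alone would allow. The function names are illustrative.

```python
import math

def f3db_hz(z0_ohm, c_farad):
    """3 dB roll-off of a line of impedance Z0 loaded by a pure capacitance C, Eq. (9)."""
    return 1.0 / (2.0 * math.pi * z0_ohm * c_farad)

def max_capacitance_farad(z0_ohm, f3db):
    """Largest pure capacitance compatible with a target 3 dB bandwidth (Eq. (9) inverted)."""
    return 1.0 / (2.0 * math.pi * z0_ohm * f3db)

c_max = max_capacitance_farad(50.0, 1.2e9)   # about 2.65 pF for 1.2 GHz on 50 ohm
f_with_esd = f3db_hz(50.0, 10e-12)           # about 318 MHz with 10 pF of diode capacitance
```

This makes the motivation for modifying the input protection explicit: the standard diodes by themselves would limit the bandwidth to roughly a quarter of the target.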
2.3. Storage Limitations
Limits are imposed on the minimum size possible
for the storage capacitor. Since for a S/H circuit
there is no means to perform a Correlated Double
Sampling, an ambiguity in the actual stored value
is given in terms of electron counting statistics by
the usual expression
$$v_{rms} = \sqrt{\frac{kT}{C}} \tag{10}$$
where k is Boltzmann's constant and temperature T
is in Kelvin. Matching the 12 bits of dynamic range
of the sample storage conversion to a Wilkinson
ramp voltage range of about 1V, a slope of approximately
0.25mV/least count is realized. At this level
of sensitivity, the impact of the choice of storage
capacitance value is seen in Fig. 1. At the upper
left of this figure is a schematic representation of
the basic storage cell for the first two generations of
ASIC, which utilized a transimpedance storage
architecture. A reference value of 78fF is shown, for
which the noise roughly matches the least count of
the ADC.
[Figure 1: "Impact of Storage Cap size". Vrms [mV] (0 to 2.5) versus storage capacitance [fF] (0 to 200), with resolution levels for 9, 10, 11 and 12 bits marked for a 1V usable input range, and a reference point at Cstore = 78fF annotated with v_rms = sqrt(kT/Cstore).]
Fig. 1. Noise limited sampling resolution as a function of
storage capacitance value.
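The comparison between Eq. (10) and the ADC least count can be checked numerically. In the sketch below the temperature of 300K is an assumption (the text does not state the value used), and the function names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def ktc_noise_volts(c_farad, temperature_k=300.0):
    """RMS sampling noise sqrt(kT/C) of Eq. (10); 300 K is an assumed temperature."""
    return math.sqrt(K_BOLTZMANN * temperature_k / c_farad)

def lsb_volts(full_scale_v, n_bits):
    """Least count for an ideal n-bit conversion over the given full-scale range."""
    return full_scale_v / 2**n_bits

v_noise = ktc_noise_volts(78e-15)  # about 0.23 mV for the 78 fF reference value
lsb_12 = lsb_volts(1.0, 12)        # about 0.244 mV for 12 bits over 1 V
```

Under these assumptions the kT/C noise of a 78fF capacitor sits just below one 12-bit least count, consistent with the reference point described in the text.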
In the last generation a different readout is em-
ployed, although the same basic NMOS transistor
gate capacitance storage is used. This constraint
on minimum size is subsequently considered in the
choice of storage capacitance. However reducing
the storage capacitance too much makes switch
charge injection and leakage current effects more
prominent.
2.4. Architectural Details
Three generations of LABRADOR architecture
ASIC have been designed, fabricated and tested.
Their key features are summarized in Table 1. All
have been fabricated in the TSMC 0.25µm CMOS
(LO) process and have been packaged in a 100-
pin plastic TQFP package. Economics and package
performance simulations [11] drove this decision.
BGA packages were considered and may be used
in the future to reduce the contribution due to lead
inductance, however all test results are shown for
this same 16.6 x 16.6 mm plastic package.
Table 1
Summary of three LABRADOR generations, where for
brevity they will be referred to by a shortened designation;
e.g. LABRADOR1 = LAB1.

Item                     LAB1    LAB2    LAB3
# of RF inputs           8       8       9
Samples/input            256     256     260
Total samples            2048    2048    2340
# of ADCs                128     128     2340
ADC Conversion cycles    16      16      1
Readout latency [µs]     2200    2200    ≤ 50
Analog MUX out [µs]      25.6    25.6    N/A
DC GND ref.              no      yes     no
Analog out               yes     yes     no
50Ω term.                end     end     input
In contrast to the first two generations of
LABRADOR ASIC, the third generation is a
purely digital output device, moves the termination
to the array input, and adopts a
massive array of Wilkinson ADCs (one per pixel).
These differences and lessons learned will be
highlighted below. The architecture of the first two ASICs
is illustrated schematically in Fig. 2. Examining
Table 1, the primary difference between LAB1
and LAB2 was the attempt to provide a means to
internally bias the RF inputs. This circuit did not
work well due to high resistance noise coupling.
LAB1 results are similar, though better in all
cases. Eight RF input channels are each sampled
by an array of 256 SCA storage cells. Sampling
occurs continually until a trigger signal is generated.
Fig. 2. Block diagram of the LAB1/LAB2 architecture.
Samples are stored for the 8 RF inputs in an array of 256
storage cells. Writing is controlled by a write pointer that
continuously cycles across the array and stored values are
held upon receipt of trigger signal. Stored values are then
addressed (gain adjusted) and either stored for conversion
in an array of 128 Wilkinson ADCs or multiplexed off-chip
for external conversion.
At this point the analog samples are held and not
overwritten. These stored values are then selected
and a transimpedance relay of the stored charge
is made, which is either stored into input samples
of an array of 128 channels of Wilkinson ADC or
analog multiplexed and transferred off-chip for
external ADC conversion. A die photograph of the
approximately 10mm² LAB1 device is shown in
Fig. 3.
Fig. 3. A die photograph of the LABRADOR1 ASIC.
[Figure 4: schematic of the LAB1 input signal chain (bond wire, pad, Metal 5 to Metal 4 transition, on-chip line with Z ∼ 13Ω, fixed 28Ω terminator) together with a resistance estimate table for the RF and reference inputs. Listed sheet resistances are 0.07Ω/sq for Metal 4 and 0.03Ω/sq for Metal 5. The estimated totals per feed are 10.5Ω (RF input) and 11.5Ω (reference input); with the 28Ω terminator the grand total is 50.0Ω.]
Fig. 4. Schematic representation and resistance breakdown of the LAB1 signal chain. Effects due to both resistive drop
across the sampling array, as well as low impedance of the on-chip stripline, were observed in testing.
Layout of the LAB1 ASIC quite directly follows
the arrangement of the functional blocks in the
schematic diagram. While efforts were made to
optimize the coupling of the input signal based on ear-
lier efforts with the STRAW [11] architecture, the
choice of LAB1 input structure represented a com-
promise, as shown in Fig. 4. Signals are straight in-
put shots on the left and terminated in a 28Ω resis-
tor at the right. This choice is a trade-off between
widening the signal trace, which would lower the
microstrip impedance even below the Z0 = 13Ω
shown, or having even larger resistive losses across
the array. These resistive losses made for a vex-
ing amplitude-dependence across the array. To ad-
dress this issue in LAB3, a 50Ω termination resis-
tor is placed directly at the input to the detector.
The termination resistor was removed from the ar-
ray end. Since offset biasing could be performed
directly at this input termination, the resistance
of the signal line was unimportant and the on-chip
stripline could be made exactly Z = 50Ω. Any
reflection at the end of the array would be back-
terminated, though this stub is short. At the maximum
signal frequency of 1.2GHz, for a stripline 2mm
long (about 10ps at $v_{prop} \simeq \frac{2}{3}c$), the phase
introduced by this stub is about
$$\frac{2 \cdot 10\,\mathrm{ps}}{(1.2\,\mathrm{GHz})^{-1}} \cdot 360^\circ \simeq 8.6^\circ \tag{11}$$
which is acceptable, though for operation at higher
frequencies, such effects may be non-negligible. In
all cases the input protection diodes have been
completely removed. Current discharge is provided
through a 20kΩ pull-down resistor to ground and
voltage clamping is provided by external back-to-
back RF diodes.
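The stub phase estimate of Eq. (11) generalizes easily to other lengths and frequencies. The sketch below assumes, as in the text, a propagation velocity of two thirds of the speed of light; the function name is illustrative.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

def stub_phase_degrees(length_m, freq_hz, velocity_factor=2.0 / 3.0):
    """Round-trip phase of an unterminated stub, following the form of Eq. (11)."""
    one_way_delay = length_m / (velocity_factor * C_LIGHT)
    round_trip = 2.0 * one_way_delay
    return 360.0 * round_trip * freq_hz

phase_1p2ghz = stub_phase_degrees(2e-3, 1.2e9)  # about 8.6 degrees, as in Eq. (11)
phase_2p4ghz = stub_phase_degrees(2e-3, 2.4e9)  # doubles at twice the frequency
```

Because the phase scales linearly with frequency, the same 2mm stub would contribute roughly 17 degrees at 2.4GHz, which illustrates why the effect may become non-negligible at higher frequencies.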
Other lessons gleaned from the first two
LABRADOR generations included observing that
while having analog samples available for exter-
nal digitization has merits, non-linearities in the
transimpedance response and temperature depen-
dence were major issues. As space was available
to permit completely parallel conversion of all 9
channels by 260 samples, in-situ conversion was
adopted, as illustrated in Fig. 5. Including four
extra “tail” samples avoids a sampling record gap
during the interval in which the write pointer is re-
turning to the beginning of the sampling window.
[Figure 5: "LABRADOR(3) architecture" block diagram showing two SCA banks (5 rows x 260 columns with 5 RF inputs, and 4 rows x 260 columns with 4 RF inputs), a timing control block, and Wilkinson ADC conversion with a 12-bit output.]
Fig. 5. Block diagram of the LAB3 architecture. In contrast to the LAB1/LAB2, the stored analog signal is never transferred.
Instead, direct Wilkinson conversion is done within each storage cell.
Details of the required timing and offset calibrations
are discussed below. In order to accommodate
the additional samples, as well as provide
space for a Wilkinson comparator and 12-bit latch
in each pixel storage cell, the die had to increase
slightly to approximately 3.2 by 2.8 mm. Metal
fill rules required covering the interesting parts of
the die, making LAB3 far less photogenic than
LAB1/LAB2, and thus a die photograph is not included.
A 9th channel was added to allow insertion of a common
reference clock into the data stream for each
LABRADOR. This was found to be useful for improving
the temporal alignment of waveforms recorded
by different chips.
All three generations use the same write pointer
structure. This is a classical voltage-controlled
inverter chain, with an odd number of stages such
that a ripple continuously propagates. An XOR circuit
and a look-ahead signal are used to open each
storage gate for the time it takes to transition from
the look-ahead to current locations (4-6 samples).
Despite best efforts at balancing the threshold
voltage and NMOS versus PMOS L:W ratios, some
amount of propagation variation is expected when
the ripple edge across the array is transitioning
low-to-high versus high-to-low, as shown below.
The ramping voltage for Wilkinson conversion
is generated by using a current source and either
an internal or external reference capacitor. In all
testing shown below, an external 200pF capacitor
is used. An external (68kΩ) bias resistor sets the
drive strength of the current source to approx-
imately 20µA. A common Gray-code counter is
provided on chip and broadcast to all SCA cells.
When the ramp threshold is crossed in a particu-
lar cell, the current count value is latched. Upon
completion of ramping, all 2340 12-bit values are
available for random-access readout.
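The Wilkinson conversion described above can be sketched in a few lines. The ramp slope follows from the quoted 20µA current and 200pF capacitor; the counter clock frequency is not given in the text, so the 100MHz value below is purely an assumption for illustration, as are the function names. The ramp is idealized as linear and starting at 0V, with no comparator offset.

```python
def binary_to_gray(n):
    """Reflected binary (Gray) code: successive counts differ in exactly one bit."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def ramp_slope_v_per_s(i_source_a, c_ramp_f):
    """Slope dV/dt = I/C of an ideal current source charging a capacitor."""
    return i_source_a / c_ramp_f

def wilkinson_latch(v_cell, slope_v_per_s, counter_hz, n_bits=12):
    """Gray-coded count latched when an ideal ramp starting at 0 V crosses v_cell."""
    volts_per_count = slope_v_per_s / counter_hz
    count = min(int(v_cell / volts_per_count), 2**n_bits - 1)
    return binary_to_gray(count)

slope = ramp_slope_v_per_s(20e-6, 200e-12)  # values quoted in the text: 0.1 V/us
COUNTER_HZ = 100e6                           # ASSUMPTION: not specified in the text
latched = wilkinson_latch(0.3004, slope, COUNTER_HZ)
decoded = gray_to_binary(latched)            # count recovered after readout
```

A Gray-code counter is a natural choice for a broadcast count because only one bit changes per step, so a latch that fires during a transition is wrong by at most one count. With the quoted current and capacitor, a ramp of about 1V takes roughly 10µs.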
2.5. Design Evolution
In summary, the biggest changes in going from
the LAB1/LAB2 architecture to the LAB3 are
(i) direct termination at array input
(ii) Wilkinson conversion in each storage cell (no
analog signal transfer)
(iii) addition of a 9th (clock reference) channel
and by these choices good performance results
have been obtained, as documented below.
3. Test Results
A variety of tests have been carried out to evaluate
the performance of the LABRADOR series of
waveform recorders. These measurements attempt
to verify the degree to which the performance targets
have been met, as well as characterize the system
in preparation for UHF radio transient detection.
Because of its superior measured noise, bandwidth,
linearity and digitization performance,
results are shown for the LAB3 ASIC.
3.1. Sampling Speed
The sampling speed dependence on an ad-
justable control voltage (ROVDD) is plotted in
Fig. 6. Stable sampling speeds ranged from 0.02
to almost 4 GSa/s (limited by operation beyond
the 2.5V nominal VDD rail voltage).
[Figure 6: sampling frequency [GHz] (0 to 3.5) versus frequency adjustment voltage ROVDD [V] (1 to 3), showing the average, the -cycle and +cycle data, and the SPICE simulation.]
Fig. 6. Sampling rate as a function of control voltage. Both
data and SPICE simulation are plotted, where a difference
is observed between rising or falling edges of the ripple
oscillator as described in the text.
The SPICE simulation was fairly conservative
and should be considered a lower-limit, pessimized
for a worst-case spread in actual CMOS fabrica-
tion parameter values. While the sampling rate is
defined as the cycle average of the so-called Rip-
ple Carry Out (RCO), which is a copy of the write
pointer monitored external to LAB3, the propaga-
tion speed of the high-to-low and low-to-high are
seen to be different. At a nominal 2.6GSa/s this
corresponds to about a 2% effect and is readily cal-
ibrated out by latching the RCO bit state at the
time a trigger is recorded, as will be discussed later.
3.2. Input Coupling
Pulsing the input to the LAB3 chip with a fast
risetime signal, a reflection R = +6.8% is observed.
Solving the usual expression
$$\frac{Z - Z_0}{Z + Z_0} = R \tag{12}$$
an impedance value of Z = 57Ω is determined.
This is consistent with the measured 59Ω DC re-
sistance of the fabricated device, which appears to
be about 20% higher than specified, though within
spreads observed for silicide block in recent similar
runs.
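Inverting Eq. (12) for Z gives the impedance quoted above. The sketch below assumes a reference impedance Z0 of 50Ω; the function names are illustrative.

```python
def impedance_from_reflection(reflection, z0_ohm=50.0):
    """Solve R = (Z - Z0) / (Z + Z0) for Z, giving Z = Z0 (1 + R) / (1 - R)."""
    if not -1.0 < reflection < 1.0:
        raise ValueError("reflection coefficient must lie strictly between -1 and 1")
    return z0_ohm * (1.0 + reflection) / (1.0 - reflection)

def reflection_from_impedance(z_ohm, z0_ohm=50.0):
    """Forward form of Eq. (12)."""
    return (z_ohm - z0_ohm) / (z_ohm + z0_ohm)

z_from_pulse = impedance_from_reflection(0.068)  # about 57 ohm from the +6.8% reflection
r_from_dc = reflection_from_impedance(59.0)      # reflection implied by the 59 ohm DC value
```

The 59Ω DC resistance would correspond to a reflection of about 8%, close to the measured 6.8%, which is the sense in which the two measurements are consistent.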
Because the signal of interest is an RF signal,
a standard DC linearity scan performance is less
important than evaluation with a realistic impul-
sive signal. Therefore, to determine the input cou-
pling, linearity and cross-talk performance, an RF
impulse was used as shown in Fig. 7. Most of the
signal power of interest is in the steep high-to-low
transition.
[Figure 7: "Cross-talk and Gain Reference pulse". Amplitude [mV] (-1500 to 2500) versus time [ns] (2 to 12) for the reference pulse.]
Fig. 7. Time-domain signal of the RF pulse used to evaluate
input coupling, linearity and crosstalk, as recorded with a
3GHz bandwidth oscilloscope.
A 3GHz analog bandwidth oscilloscope was used
to record this reference signal. However the sig-
nal from the pulse generator itself was not flat
in the frequency domain. Moreover this reference
pulse has been bandwidth limited between 200-
1200MHz, to match the frequency range of the
ANITA instrument signal chain, in which these
measurements have been performed.
Because determination of the analog bandwidth
of the LAB3 device requires removing the intrin-
sic frequency of the RF pulse itself, its FFT has
been measured and is displayed as the blue curve
in Fig. 8. In red in this upper plot is the recorded
LAB3 response.
[Figure 8: "Bandwidth Determination". Top: signal power [dBm] versus frequency for the RF pulse and the LAB3 response. Bottom: "FFT difference", the 4 LAB3 split loss [dB], with the -3dB level marked.]
Fig. 8. Determination of the LAB3 ASIC analog band-
width in a test board configuration with a four-way split of
the RF signal. In top (blue) RF reference pulse and (red)
LAB3 FFT. At bottom is the difference, where for perfect
coupling a -6dB loss would be expected.
Taking the difference of these two curves, the
analog response versus frequency is determined
and shown in the bottom plot of Fig. 8. At the
left edges of the curves the impact of the 200MHz
high-pass band definition filters is seen. Of note
is peaking of the signal in the 300-400MHz range,
an effect seen in earlier testing. Taking the -3 dB
point as the line shown, the roll-off frequency is
just over 900 MHz, though signal power is still
available out to 1200MHz. Four LAB3 are being
tested in parallel and thus an ideal loss would be
-6dB, indicating some amount of loss in the RF
signal chain and coupling into the chip. Earlier
tests on a dedicated, single LAB3 board, without
band definition filters (e.g. 1200MHz low pass)
indicated somewhat better higher frequency re-
sponse and some of this loss may be due to com-
ponents on the ANITA flight digitizer (SURF[2])
board used for evaluation. Therefore this curve
may be considered a conservative lower bound on
the analog bandwidth.
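The bandwidth determination described above amounts to subtracting two spectra in dB and locating the point 3dB below the passband level. The sketch below illustrates that procedure; the arrays are synthetic numbers chosen for the example (not values read from Fig. 8), and the function names are illustrative.

```python
import math

def ideal_split_loss_db(n_ways):
    """Power loss per output of an ideal lossless n-way splitter, in dB."""
    return -10.0 * math.log10(n_ways)

def response_db(device_dbm, reference_dbm):
    """Device response as the difference of two power spectra expressed in dB."""
    return [d - r for d, r in zip(device_dbm, reference_dbm)]

def rolloff_frequency(freqs, resp_db, passband_db, drop_db=3.0):
    """First frequency where the response falls drop_db below passband_db.

    Uses linear interpolation between neighbouring points; returns None if
    the response never crosses the threshold.
    """
    threshold = passband_db - drop_db
    for i in range(1, len(freqs)):
        lo, hi = resp_db[i - 1], resp_db[i]
        if lo >= threshold > hi:
            frac = (lo - threshold) / (lo - hi)
            return freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
    return None

split_loss = ideal_split_loss_db(4)  # about -6.02 dB for a four-way split

# Synthetic spectra for illustration only (MHz, dBm); not measured data.
freqs_mhz = [400.0, 600.0, 800.0, 1000.0, 1200.0]
reference = [-30.0, -31.0, -32.0, -33.0, -34.0]
device = [-37.0, -38.0, -39.5, -42.0, -45.0]
resp = response_db(device, reference)
f_rolloff = rolloff_frequency(freqs_mhz, resp, passband_db=-7.0)
```

For the synthetic arrays the response drops from -7dB to -10dB at 1100MHz. Applying the same procedure to real spectra requires a choice of passband reference level, which here is passed in explicitly.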
We note that the peaking observed is also
present in the case of Gaussian noise, though the
peak of the distribution is a function of the input
biasing network. This is likely due to resonant L-C
response in the input front end and seems coupled
to the cross-talk observed below.
3.3. Linearity
A determination of the linearity of the digitizing
system has been made by varying the RF signal
amplitude as displayed in Fig. 9.
[Figure 9: "Linearity scan". Peak-to-peak voltage [mV] (logarithmic axis, 1 to 10000) versus attenuation used [dB] (0 to 60), with an exponential fit y = 6808.7 e^{-0.1229x}, R² = 0.9983.]
Fig. 9. Linearity determined by attenuating an RF pulse
as described in the text.
Since power attenuators are used, the response
is characterized in dB and a linear fit is observed on
a logarithmic plot. Good linearity is seen with just
a hint of saturation at large signal amplitudes and
some non-linearity at small signal amplitude due
to the coaddition of board-level noise. Any non-
linearity observed is likely due to non-linearities
in the ramp generation circuit or comparator bias
setting. Over a span of 40dB in dynamic range,
the LAB3 output tracks input to within statistical
measurement errors.
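Why an exponential fit is the expected form can be seen from the definition of attenuation in dB: a voltage scales as 10^(-x/20) for x dB of attenuation, i.e. as exp(-x ln10/20). The sketch below compares that ideal exponent with the fitted exponent printed in Fig. 9. The function names are illustrative, and no interpretation of the difference is implied beyond the arithmetic.

```python
import math

def ideal_voltage_exponent_per_db():
    """Exponent k in V = V0 * exp(-k * x) for an ideal attenuator of x dB."""
    return math.log(10.0) / 20.0

def expected_vpp_mv(attenuation_db, v0_mv):
    """Ideal peak-to-peak voltage after attenuation_db of attenuation."""
    return v0_mv * 10.0 ** (-attenuation_db / 20.0)

FIT_EXPONENT = 0.1229  # exponent of the fit shown in Fig. 9

ideal_k = ideal_voltage_exponent_per_db()                     # about 0.1151 per dB
fit_slope_db_per_db = FIT_EXPONENT * 20.0 / math.log(10.0)    # about 1.07 dB per dB
v_after_20db = expected_vpp_mv(20.0, 1000.0)                  # 20 dB is a factor of 10 in voltage
```

The fitted exponent corresponds to roughly 1.07dB of output change per dB of nominal attenuation. The fit spans the full scan, including the regions where the text notes saturation and noise contributions.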
3.4. Crosstalk
By inserting the signal successively into the
LAB3 channels, a cross-talk correlation plot was
constructed as shown in Fig. 10.
[Figure 10: eight panels, "Cross Talk -- Signal Channel 1" through "Cross Talk -- Signal Channel 8", each showing the ratio Cross Talk / Signal (peak-to-peak, logarithmic axis from 10^-3 to 10) versus channel (0 to 10).]
Fig. 10. Measured crosstalk for each channel as a function
of channel into which signal is injected. For signal in
self-channel, the amplitude is unity. Note that these values are
overestimated, as described in the text.
These values shown are determined by searching
for a peak around the time of the input signal. Due
to noise, statistically a few percent peak is mea-
sured even for the case of no cross-talk. Therefore
the values shown are overestimated. For RF appli-
cations, even a 10% voltage crosstalk is only 1% in
power.
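The voltage-to-power statement above is a one-line conversion, sketched below under the assumption that both channels see the same impedance. The function names are illustrative.

```python
import math

def voltage_ratio_to_power_ratio(v_ratio):
    """Power ratio for a given voltage ratio across equal impedances."""
    return v_ratio ** 2

def voltage_ratio_to_db(v_ratio):
    """Voltage ratio expressed in dB (20 log10)."""
    return 20.0 * math.log10(v_ratio)

power_fraction = voltage_ratio_to_power_ratio(0.10)  # 10% in voltage is 1% in power
crosstalk_db = voltage_ratio_to_db(0.10)             # equivalently -20 dB
```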
Nevertheless, for other applications it is important
to understand the source of this effect. A hint
to the origin of this crosstalk may be seen in Fig. 11.
Similar temporal and frequency dependence to
the cross-talk can be reproduced in SPICE simu-
lations, though the solutions are not unique. That
is, the amplitude and phase information can be
mimicked by tuning the voltage source output in-
ductance of the pedestal network or with respect
to bond-wire inductance stray coupling. Based on
these results, a channel-dependent phase-lag to the
cross-talk was predicted and subsequently verified
qualitatively, as shown in Fig. 12.
[Figure 11: schematic of the input bias circuit for two inputs (Input 1 and Input 2). Component labels include 5nH (package), 2pF and 50 (pad), 20k, Vref, AGND and the distributed array capacitance.]
Fig. 11. Schematic representation of the input bias circuit.
[Figure 12: eight panels, "Cross Talk -- Signal Channel 1" through "Cross Talk -- Signal Channel 8", each showing Cross Talk Time - Signal Time (ns) (-10 to 10) versus channel (0 to 10).]
Fig. 12. Phase lag of the measured crosstalk. For channels
separated by the timing control section of the chip both
the amplitude and phase are less well constrained.
In addition, a small component of direct radiative
coupling between the on-chip striplines cannot
be ruled out, though this was difficult to model
(metallic heat sinks would need to be taken properly
into account in the 3D EM simulations). All results
indicate that better packaging (lower inductance)
and stripline shielding would help improve the
observed effects.
4. Required Calibrations and Stability
In order to obtain the test results shown, a num-
ber of calibrations are needed. In the process of
applying these, much improved resolution is ob-
tained. Temperature dependence and timing pre-
cision limits are considered.
4.1. Gain and Pedestal Calibration
For the measurements shown, the gain has
been adjusted to approximately 1mV/least count.
A comprehensive pedestal histogram of all SCA
storage channels (excluding channel 9) on the 36
LAB3 flown on ANITA is summarized in Fig. 13.
[Figure 13: "Pedestal Distribution" histogram of pedestal (ADC counts, 1000 to 1400). Entries 74592, mean 1204.710, RMS 49.609.]
Fig. 13. A summary of the pedestal values (in mV) for all
SCAs of 36 production LAB3 tested.
Channel 9 is excluded since it has a different voltage
offset value due to the clock input biasing. The
spread seen is a combination of 36 pedestal voltage
differences, SCA-SCA variations, and Wilkinson
ramp slope and starting voltage offsets. In addition,
one LAB3 (values clustered around 1100)
had an anomalously low gain. Overall the RMS of
this distribution is just over 4%.
4.2. Timing Calibrations
In order to obtain the best possible timing reso-
lution, a number of calibrations, due to the method
in which the sampling is implemented, must be
considered. As mentioned earlier, the write pointer
is monitored using a copy of the signal called RCO.
Since the sampling is done in so-called Common
Stop mode, it is continuous until a trigger condi-
tion is formed. Thus all samples have already been
recorded by the time a trigger is acted upon. In or-
der for sampling to be continuous it is necessary for
the write pointer to wrap around from the end of
the array to the beginning, as illustrated in Fig. 14.
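[Figure 14: diagram of the write pointer chain between ROVDD and ROGND, with sampling on write pointer passage through cells labeled 1, 2, 3, 256, 257, 258, 259 and 260, and the RCO output.]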
Fig. 14. Write pointer wrap around. While the write pointer
returns to position 0 of the array, additional tail samples
are taken in order to avoid a gap in the sampling record.
Four additional “tail” samples are provided to
permit samples to be recorded during the time in
which the write pointer is returning to the begin-
ning of the array. Even though the physical dis-
tance is only 20-30ps at the speed of light, the
need to go through an additional inverting stage
(to form ring oscillator) and the capacitance asso-
ciated with the long signal line back to the begin-
ning of the array limit the speed of write pointer
return.
Also mentioned earlier, the write pointer speed
of propagation across the array is a function of
the transition direction. Likewise the delay time of
write pointer return is also RCO phase dependent.
The most general case of these calibration constants
is illustrated in Fig. 15. From the measured
RCO frequency ($f_{RCO}$), the sampling frequency is
determined as
$$f_{sampling} = 2 \times 256 \times f_{RCO}. \tag{13}$$
[Figure 15: timing diagram over one RCO period T, showing SCA cells 0, 1, 2, ..., 255, 258, 259 sampled in RCO phase 0 and phase 1, with per-sample time steps ∆t0 and ∆t1 and wrap-around delays ε0 and ε1.]
Fig. 15. Definition of the most general LAB3 sample timing relationships and constants. Determination of their values is
described in the text.
Expressing $f_{RCO}$ in terms of its period T, half
period $T_0$ corresponds to RCO phase 0 and half
period $T_1$ corresponds to RCO phase 1, or
$$f_{sampling} = 512 \times (T_0 + T_1)^{-1} \tag{14}$$
in which case the time step of an individual sample
is expressed as
$$\Delta t = \frac{T}{512}. \tag{15}$$
In general, as mentioned, the half periods $T_0$ and
$T_1$ are not half the period T:
$$T_0 \neq T_1 \neq \frac{T}{2} \tag{16}$$
which means that the average individual time
steps in phase 0 (∆t0) are different from those in
phase 1 (∆t1). Likewise the delay time of the write
point propagation for RCO0→ 1 (ǫ1) and for RCO
1 → 0 are in general different and related to the
difference between average ∆t0 and ∆t1. Finally,
due to transistor threshold dispersion, the actual
widths of each of the time bins (∆t0..2590,1 ) can be
slightly different.
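As an illustration of Eqs. (13)-(15), the following minimal Python sketch
evaluates the sampling frequency and the mean time step from an RCO
measurement. The numerical inputs (a 5 MHz RCO and half periods of 98 ns
and 102 ns) and the function names are hypothetical, chosen only for
illustration; they are not measured LAB3 values.

```python
# Eq. (13): one full RCO period spans 2 x 256 samples.
SAMPLES_PER_HALF_PERIOD = 256


def sampling_frequency_hz(f_rco_hz):
    """Eq. (13): f_sampling = 2 * 256 * f_RCO."""
    return 2 * SAMPLES_PER_HALF_PERIOD * f_rco_hz


def mean_time_step_s(t0_s, t1_s):
    """Eqs. (14)-(15): mean step = (T0 + T1) / 512 = T / 512."""
    return (t0_s + t1_s) / (2 * SAMPLES_PER_HALF_PERIOD)


# Hypothetical example values (assumptions, not data from the paper).
f_rco = 5.0e6                  # RCO frequency in Hz
t0, t1 = 98.0e-9, 102.0e-9     # unequal half periods, T = T0 + T1 = 200 ns

print(f"f_sampling = {sampling_frequency_hz(f_rco) / 1e9:.2f} GSa/s")
print(f"mean step  = {mean_time_step_s(t0, t1) * 1e12:.1f} ps")
```

Only the sum of the half periods enters the mean step; the per-phase and
per-bin differences described above require the calibration that follows.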
Using a known periodic input signal, it is possible to generate
calibration values for all of these parameters. An example of
determination of the relative average $\Delta t_0$ and $\Delta t_1$ is
shown in Fig. 16.
Fig. 16 plot contents (horizontal axis: Period (samples)):
Square Wave Period -- Phase 0: Entries 920, Mean 13.675, RMS 0.198;
fit χ²/ndf = 50.396/41, Constant 58.250 ± 2.486, Mean 13.681 ± 0.006,
Sigma 0.179 ± 0.005.
Square Wave Period -- Phase 1: Entries 903, Mean 13.456, RMS 0.216;
fit χ²/ndf = 38.623/45, Constant 56.175 ± 2.340, Mean 13.469 ± 0.006,
Sigma 0.185 ± 0.005.
Fig. 16. Measurement of the write pointer propagation
(sampling speed) difference for the RCO = 0 (top) and
RCO = 1 (bottom) phases for a 200 MHz reference clock.
In each case the variable parameter is tuned until the spread or offset
in the determined period is minimized. Because the period is well
determined, the procedure is very efficient and requires a relatively
small amount of calibration data.
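As a rough illustration, the fitted mean periods in Fig. 16 can be
converted into average time steps for each RCO phase. The sketch below
assumes that the mean step is simply the 200 MHz reference period divided
by the fitted number of samples per period; this reading of the figure
and the function name are assumptions, not a description of the actual
calibration code.

```python
REF_CLOCK_HZ = 200e6  # reference clock quoted in the Fig. 16 caption


def mean_step_ps(period_in_samples, ref_clock_hz=REF_CLOCK_HZ):
    """Average sample spacing implied by a reference period given in samples."""
    ref_period_ps = 1e12 / ref_clock_hz
    return ref_period_ps / period_in_samples


# Fitted means (in samples) from the two panels of Fig. 16.
dt0_ps = mean_step_ps(13.681)  # RCO phase 0
dt1_ps = mean_step_ps(13.469)  # RCO phase 1

print(f"dt0 = {dt0_ps:.1f} ps, dt1 = {dt1_ps:.1f} ps")
print(f"relative difference = {100 * (dt1_ps - dt0_ps) / dt0_ps:.2f} %")
```

Under this assumption the two phases come out at roughly 365 ps and
371 ps per sample, a difference of about 1.6%.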
Similarly, the write pointer wrap around delays, $\epsilon_0$ and
$\epsilon_1$, may be determined by constraining the measured period to be
consistent across the write pointer wrap around. An example is shown in
Fig. 17.
Fig. 17 plot contents (horizontal axis: Period (ns)):
Wrap Offset -- Phase 0 to 1: Entries 1213, Mean 1.418, RMS 0.136;
fit χ²/ndf = 37.350/16, Constant 178.953 ± 6.277, Mean 1.416 ± 0.004,
Sigma 0.131 ± 0.003.
Wrap Offset -- Phase 1 to 0: Entries 1168, Mean 1.099, RMS 0.143;
fit χ²/ndf = 21.807/15, Constant 159.651 ± 5.882, Mean 1.104 ± 0.004,
Sigma 0.144 ± 0.003.
Fig. 17. Extraction of the wrap timing offsets ($\epsilon_0$ and
$\epsilon_1$) for a given LAB3.
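One way to picture the wrap offset extraction is sketched below. It
assumes that, for events in which two consecutive reference clock
crossings straddle the wrap around, the interval computed with the wrap
delay set to zero falls short of the known clock period by the wrap
delay. The function name, the input layout and the synthetic numbers are
all hypothetical; the sketch is not the procedure used to produce Fig. 17.

```python
import numpy as np


def wrap_offset_ns(t_last_ns, t_first_ns, ref_period_ns):
    """Estimate a wrap delay from crossings that straddle the wrap around.

    t_last_ns:  last reference-clock crossing before the wrap (per event)
    t_first_ns: next crossing after the wrap, computed with zero wrap delay
    Returns the mean offset and its event-to-event spread.
    """
    apparent = np.asarray(t_first_ns) - np.asarray(t_last_ns)
    offsets = ref_period_ns - apparent  # missing time attributed to the wrap
    return offsets.mean(), offsets.std(ddof=1)


# Synthetic check with made-up numbers (not ANITA data).
rng = np.random.default_rng(0)
true_eps, ref_period = 1.4, 5.0  # ns; 5 ns corresponds to a 200 MHz clock
t_last = rng.uniform(90.0, 95.0, size=1000)
t_first = t_last + ref_period - true_eps + rng.normal(0.0, 0.13, size=1000)

eps_hat, spread = wrap_offset_ns(t_last, t_first, ref_period)
print(f"estimated wrap delay: {eps_hat:.3f} ns (spread {spread:.3f} ns)")
```

In practice the events would be split by RCO phase so that the two delays
are estimated separately.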
To a certain extent these calibration steps must be bootstrapped. For
example, correctly minimizing the error on these $\epsilon$ parameters
requires that the average time steps in each of the RCO phases be
correctly determined. A subtlety here is that the RCO phase is recorded
at the time a trigger signal (hold) is issued. Because the RCO latching
in the data is not completely synchronous, there is in general a delay
between the measured value of RCO and its actual value. This ambiguity is
resolved by assigning to the measured RCO a phase delay that depends upon
the address at which the hold was issued, the so-called “HitBus” value.
The value of this delay is tuned in the data until the width of the
measured period is again minimized.
Finally, using a high frequency clock it is possible to constrain the
average half period and assign its average value to the
$\Delta t_{0,1}^{0..259}$ bin in which the positive/negative lobe peaks.
Using this prescription the distribution histogrammed in Fig. 18 is
obtained.
Fig. 18 plot contents: Data: Bin-by-bin Calibration Constants;
horizontal axis: Bin width in Calibration File [ps]; annotation:
~34 ps RMS.
Fig. 18. Summary distribution of the calibrated individual
time bin widths for all SCAs in 36 LAB3 ASICs.
4.3. Time Resolution Limitations
Applying these timing corrections to the data leads to an improvement in
the time resolution of signals in the data. The precise improvement
depends upon the signal distance within the window (cumulative error) and
the method for correlating signal shapes to extract a timing feature for
comparison.
To understand the intrinsic performance limits and the significance of
the bin-by-bin correction, a simple Monte Carlo (MC) study was performed
to determine the extent to which the technique used to extract the
observed timings would lead to the observed distribution. A completely
random scatter (uniform distribution) of 15% was introduced to the
nominal 386 ps bin width, a 600 MHz sine wave MC sample was then
synthesized, and the algorithm was applied. A value of 15% was determined
empirically to provide a good representation of the observations in data.
Due to irreducible errors in the specific implementation of this
zero-crossing technique, application of these constants improves the
timing resolution to about 28 ps, as shown in Fig. 19. This has improved
the resolution by about 20 ps in quadrature, though perhaps there is
still room for improvement. Since two edges are used to determine this
time interval, the single edge measurement is about 28 ps/$\sqrt{2}$, or
about 20 ps, and is probably a limit with the current LAB3.
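A toy version of such a study can be sketched as follows. It assumes that
the 15% scatter means bin widths drawn uniformly within ±15% of the
nominal 386 ps, that the timing feature is the interval between rising
zero crossings found by linear interpolation, and that a perfect
calibration knows the true bin widths. These choices, the function names
and the event counts are assumptions made for illustration; the sketch is
not the authors' Monte Carlo and is not expected to reproduce the quoted
28 ps.

```python
import numpy as np


def rising_crossing_periods(times_s, volts):
    """Intervals between successive rising zero crossings (linear interpolation)."""
    idx = np.where((volts[:-1] < 0) & (volts[1:] >= 0))[0]
    frac = -volts[idx] / (volts[idx + 1] - volts[idx])
    t_cross = times_s[idx] + frac * (times_s[idx + 1] - times_s[idx])
    return np.diff(t_cross)


def period_spread_ps(n_events=200, n_cells=256, nominal_ps=386.0,
                     scatter=0.15, f_hz=600e6, seed=1):
    """Return (uncalibrated, calibrated) RMS spread of the extracted period."""
    rng = np.random.default_rng(seed)
    # Fixed per-chip bin widths: nominal value with uniform +/-15% scatter.
    widths_ps = nominal_ps * (1.0 + rng.uniform(-scatter, scatter, n_cells))
    true_t = np.concatenate(([0.0], np.cumsum(widths_ps[:-1]))) * 1e-12
    nominal_t = np.arange(n_cells) * nominal_ps * 1e-12
    raw, cal = [], []
    for _ in range(n_events):
        phase = rng.uniform(0.0, 2.0 * np.pi)  # random trigger phase
        volts = np.sin(2.0 * np.pi * f_hz * true_t + phase)
        raw.append(rising_crossing_periods(nominal_t, volts))  # no calibration
        cal.append(rising_crossing_periods(true_t, volts))     # ideal calibration
    return np.concatenate(raw).std() * 1e12, np.concatenate(cal).std() * 1e12


raw_ps, cal_ps = period_spread_ps()
print(f"period spread: {raw_ps:.1f} ps uncalibrated, {cal_ps:.1f} ps calibrated")
```

Even with ideal bin widths a residual spread remains, because linear
interpolation of a sine wave sampled only about four times per period is
itself imperfect; this is the kind of irreducible error referred to above.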
Fig. 19 plot contents: horizontal axis: Difference from True Bin Width
[ns]; annotation: $\sqrt{34^2 - 28^2} \approx 19.3$ ps.
Fig. 19. Monte Carlo.
These determined parameters appear to be stable in time and to depend
only upon thermal effects, which are described next.
4.4. Temperature Dependence
The voltage controlled oscillator (VCO) for the write pointer is
fundamentally temperature dependent. During operation of LAB1 an external
delay locking loop circuit was used to adjust ROVDD to compensate.
However, this circuit suffered from large phase noise as well as a nasty
habit of locking onto a frequency subharmonic at power-on. Therefore,
with the addition of a dedicated timing channel (needed to precisely
align multiple LAB3 waveforms offline), ROVDD was fixed and timebase
correction is implemented by fitting to the period of the reference
clock.
The temperature dependence of the sampled frequency is shown in Fig. 20.
Good agreement is seen with SPICE simulations of the temperature
dependence, once an operating reference point is set. This fine tuning is
needed to correct for the overly pessimistic parasitic capacitance
estimate used earlier in simulating the ripple oscillator frequency.
Using the reference clock signal on channel 9, this temperature
dependence of the VCO is corrected in the offline analysis. A fit to this
dependence gives a change of approximately 55 ps/°C over the 30 ns period
of the 33 MHz reference clock, or about 0.2%/°C.
Fig. 20 plot contents: horizontal axis: Temperature (°C); vertical axis:
Clock Period (ns); legend: ANITA Data, Data Fit Line, SPICE.
Fit: χ²/ndf = 1124/159, p0 = 31.65 ± 0.00, p1 = -0.05437 ± 0.00005.
Fig. 20. Measured and SPICE simulated temperature dependence
of the LAB3 sampling period.
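The fit parameters listed with Fig. 20 can be used to check the quoted
slope. The sketch below assumes that p0 and p1 are the intercept (in ns)
and slope (in ns/°C) of a straight-line fit to the apparent reference
clock period versus temperature. The rescaling function, its name and the
30 °C reference temperature are hypothetical illustrations of one
possible timebase correction, not the correction used in the analysis.

```python
P0_NS = 31.65            # fit intercept listed with Fig. 20 (assumed ns)
P1_NS_PER_C = -0.05437   # fit slope listed with Fig. 20 (assumed ns per deg C)


def apparent_clock_period_ns(temp_c):
    """Straight-line fit to the apparent reference-clock period."""
    return P0_NS + P1_NS_PER_C * temp_c


def timebase_scale(temp_c, ref_temp_c=30.0):
    """Factor that rescales sample times at temp_c to match ref_temp_c."""
    return apparent_clock_period_ns(ref_temp_c) / apparent_clock_period_ns(temp_c)


frac_per_c = abs(P1_NS_PER_C) / apparent_clock_period_ns(30.0)
print(f"slope: {abs(P1_NS_PER_C) * 1e3:.0f} ps/degC "
      f"({100 * frac_per_c:.2f} %/degC at 30 degC)")
print(f"scale factor at 20 degC: {timebase_scale(20.0):.4f}")
```

The slope corresponds to about 54 ps/°C, or about 0.18%/°C at 30 °C,
consistent with the rounded values quoted in the text.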
In contrast, the pedestals are a very weak function of temperature.
Fig. 21 displays the difference in pedestal values taken after an ambient
temperature change of approximately 17 °C.
Fig. 21 plot contents: Pedestal Stability; horizontal axis:
(Warm - Cold) in mV; logarithmic vertical scale; Entries 7.423961e+07,
Mean 1.808, RMS 2.759.
Fig. 21. Difference in pedestal between dedicated pedestal
runs taken 30 hours apart, at a difference in ambient
temperature of 17 degrees Celsius.
Taking this difference, an estimate of the pedestal temperature
dependence is
$$ {\rm PED}_{\rm avg} = +0.052 \cdot \frac{\rm ADC\ counts}{^{\circ}{\rm C}}. \qquad (17) $$
For reference, and to illustrate the typical chip-level noise, an example
noise run is shown in Fig. 22. Representative noise values are about
1.3 mV RMS, though there is some non-Gaussian behavior in the combined
distribution of 2.2 million samples from 9 separate RF channels.
Fig. 22 plot contents: All channels, 1000 events; horizontal axis:
Sample Measurement [mV]; Entries 2234721, Mean 0.0261, RMS 1.264;
fit χ²/ndf = 1.717e+05/14, Constant 6.752e+05 ± 563,
Mean 0.2576 ± 0.0009, Sigma 1.16 ± 0.00.
Fig. 22. Sample LAB3 1k event noise run, with all 9 channels
combined into a single distribution.
4.5. Interleaved Sampling
Sampling of UHF RF sine wave signals right at the Nyquist rate visually
appears undersampled to most observers. This is due to expectations from
seeing smooth curves generated by the 20+ GHz offset interleaved sampling
of a repetitive waveform provided by most digital signal oscilloscopes.
By providing precise external delays it is possible to enhance the
sampling speed and provide oversampling with the LAB3 chip. Interleaving
of 8 inputs, each running at 2.5 GSa/s, has been done to provide
single-shot recording of a 400 MHz sine wave signal at 20 GSa/s, as shown
in Fig. 23. Each color represents the samples recorded by a single
channel. While there is some scatter due to the delays not being
perfectly tuned, this indicates that there is still more performance to
be gained by increasing the analog bandwidth yet further and implementing
such interleaving. For low power and very high sampling rate
applications, where signals may not be repetitive, this technique may be
useful. This and other improvements are considered next.
Fig. 23 plot contents: 20 GSa/s interleaved on 8 input channels, 400 MHz
sine; horizontal axis: Time [ns]; vertical axis: Amplitude [mV].
Fig. 23. Example of 20 GSa/s interleaved, single-shot waveform
recording of a 400 MHz sine wave signal on 8 LAB3
input channels, each plotted with a different color.
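The bookkeeping behind such interleaving can be sketched as follows,
assuming ideal delays: channel k is taken to be offset by exactly k times
50 ps (one period of the 20 GSa/s effective rate) relative to channel 0.
The function name and the synthetic 400 MHz input are illustrative
assumptions; real data would use the tuned delay of each channel rather
than ideal values.

```python
import numpy as np

N_CHANNELS = 8
F_CHANNEL = 2.5e9                     # per-channel sampling rate, Sa/s
F_EFFECTIVE = N_CHANNELS * F_CHANNEL  # 20 GSa/s effective rate
STEP = 1.0 / F_EFFECTIVE              # 50 ps ideal delay increment


def interleave(channel_samples):
    """Merge records of shape (channels, samples) into one time-ordered record.

    Sample j of channel k sits at time (j * N_CHANNELS + k) * STEP, so a
    transpose followed by flattening puts the samples in time order.
    """
    return np.asarray(channel_samples).T.reshape(-1)


# Synthetic 400 MHz sine seen by 8 ideally delayed channels.
n_per_channel = 25                       # 25 samples at 2.5 GSa/s = 10 ns
j = np.arange(n_per_channel)
k = np.arange(N_CHANNELS)[:, None]
t_kj = j / F_CHANNEL + k * STEP          # sample times, shape (8, 25)
records = np.sin(2 * np.pi * 400e6 * t_kj)

merged = interleave(records)
t_merged = np.arange(merged.size) * STEP
print(merged.size, "samples spanning", t_merged[-1] * 1e9, "ns")
```

With imperfectly tuned delays the merged time axis is no longer uniform,
which is the origin of the scatter noted for Fig. 23.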
5. Future Directions
Beyond increasing analog bandwidth, to fully exploit the enhanced
sampling speed of deep submicron processes, there is a desire to increase
sampling depth. This is being explored in a follow-on device designated
the Buffered LABRADOR (BLAB) ASIC. While the sampling speed increases
below 0.25 µm, loss of dynamic range due to reduced rail voltages and
increased leakage current may preclude going to smaller feature sizes.
6. Applications
During December 2006 to January 2007, 36 LAB3 ASICs flew successfully at
120,000 feet for 35 days around the Antarctic continent. Some of the test
results shown above are from this data set. During this same period, test
deployments for an in-ice radio detector, using this device, were made at
the South Pole in conjunction with the IceCube array [16]. Recently,
these ASICs were evaluated in a colliding beam environment for an upgrade
of the Belle Time-Of-Flight readout [17], and a variant for operation at
a Super B-factory [18] is being developed for high timing precision,
single photon recording [19]. For these future applications, a deeper
sampling depth is highly desirable and such a device is currently being
prototyped.
7. Summary
A Switched Capacitor Array (SCA) device has been developed in a 0.25 µm
CMOS process with a 3 dB analog bandwidth of almost a gigahertz, capable
of being sampled at many GSa/s, or well above the Nyquist minimum.
Sampling is performed at low power and the entire array of 9 channels by
260 samples can be digitized to 12 bits of resolution and read out within
50 µs. With calibration, excellent time and sample voltage resolution
have been obtained over a large range of temperature and sampling speeds.
8. Acknowledgements
This work was supported by the National Aeronautics and Space
Administration (ROSS Program), by Department of Energy (HEP Division)
University of Hawaii base program support, and by the Advanced Detector
Research program.
References
[1] P.W. Gorham et al. (ANITA Collaboration),
Submitted to Phys. Rev. Lett., hep-ex/0611008,
SLAC-PUB-12286.
[2] G.S. Varner et al. (ANITA Collaboration), Detection
of Ultra High Energy Neutrinos via Coherent Radio
Emission, Proceedings of International Symposium on
Detector Development for Particle, Astroparticle and
Synchrotron Radiation Experiments (SNIC 2006) pp
0046, SLAC-PUB-11872.
[3] S. Kleinfelder, IEEE Trans. Nucl. Sci. 35 (1988) 151;
IEEE Trans. Nucl. Sci. 37 (1990) 1230.
[4] K.L. Lee et al., IEEE Trans. Nucl. Sci. 38 (1991) 344.
[5] G.M. Haller and B.A. Wooley, IEEE J. Solid State
Circuits 29 (1994) 500; IEEE Trans. Nucl. Sci. 41
(1994) 1203.
[6] S. Kleinfelder, IEEE Trans. Nucl. Sci. 50 (2003) 955.
[7] C. Brönnimann, R. Horisberger and R. Schnyder, Nucl.
Instr. Meth. A420 (1999) 264.
[8] S. Ritt, Nucl. Instr. Meth. A518 (2004) 470.
[9] E. Delagnes et al., Nucl. Instr. Meth. A567 (2006) 21.
[10] S. Panebianco et al., Nucl. Instr. Meth. A434 (1999)
424.
[11] G.S. Varner et al., Monolithic Multi-channel GSa/s
Transient Waveform Recorder for Measuring Radio
Emissions from High Energy Particle Cascades, Proc.
SPIE Int. Soc. Opt. Eng. 4858 (2003) 31.
[12] G. Varner et al., Nucl. Instr. Meth. A541 (2005) 166;
Nucl. Instr. Meth. A565 (2006) 126.
[13] P. Allen and E. Sanchez-Sinencio, Switched Capacitor
Circuits, Van Nostrand Reinhold, 1984.
[14] R. Unbehauen and A. Cichocki, MOS Switched-
Capacitor and Continuous-Time Integrated Circuits
and Systems, Springer-Verlag, 1989.
[15] R. Gregorian and G. Temes, Analog MOS Integrated
Circuits for Signal Processing, John Wiley & Sons,
Inc., 1986.
[16] H. Landsman, A. Karle et al. (IceCube
Collaboration) and G. Varner and L. Ruckman, to be
published in the International Cosmic Ray Conference 2007
proceedings.
[17] J. Rorie and G. Varner, Signal Timing and Readout
(STaR) pipelined upgrade for the Belle TOF System, to
appear in Proceedings of the 2007 IEEE Nuclear Science
Symposium.
[18] S. Hashimoto (ed.) et al., KEK-Report-2004-4 (2004).
[19] L. Ruckman and G. Varner, Photodetector Readout
Monolithic with Precision Timing, to appear in
Proceedings of the 2007 IEEE Nuclear Science
Symposium.
