The LAB4D is a new application-specific integrated circuit (ASIC) of the Large Analog Bandwidth Recorder and Digitizer with Ordered Readout (LABRADOR) family, for use in direct wideband radio frequency digitization such as is used in ultrahigh energy neutrino and cosmic ray astrophysics. The LAB4D is a single-channel switched-capacitor array (SCA) 12-bit sampler with integrated analog-to-digital converters (ADCs), developed in the TSMC 0.25 µm process.
Introduction
There has been increasing interest in the development of CMOS switched capacitor array (SCA) samplers due to their low cost and high performance. These devices have been thoroughly documented in the high energy physics literature [1] [2] [3] [4] [5] [6]. Many have achieved digitization speeds high enough for greater-than-Nyquist sampling of a GHz analog bandwidth signal. In addition to their use in precision photon timing [7] [8], these ≥GSa/s devices are being used in experiments designed to detect neutrinos [9] [10] [11] and gamma-rays [12]. SCA samplers are cost-effective alternatives to analog-to-digital converters (ADCs) owing to their excellent timing and high-resolution amplitude recording [13] [14].
The analog nature of SCA samplers also limits their widespread acceptance, requiring additional resources and time-consuming post-hoc calibrations in order to achieve high performance. To simplify the integration and usage of these digitizers by minimizing, or sometimes eliminating, the need for on-line or off-line software calibrations, the LAB4 ASIC design was developed with the goal of demonstrating that the ASIC itself can trim the inherently non-uniform timebase generated by the CMOS voltage-controlled delay line (VCDL), using simple internal digital-to-analog converters (DACs). A further simplification involved moving all previously external voltage biases and current references to internal DACs, significantly reducing the support electronics needed for digitization.
Another limitation of SCA samplers is the long readout time. While data can be acquired at GSa/s rates, readout is typically significantly slower (tens of MSa/s). Even at modest trigger rates (∼kHz), the high dead time incurred with each readout can be linked to a lack of sufficient derandomization, an issue that the LABRADOR family of ASICs targeted. For example, a 1024-sample readout at 10 MSa/s would leave the digitizer unavailable for 102.4 µs, which, at a 1 kHz trigger rate, would result in roughly 10% dead time. The LAB4 series integrates improvements stemming from the buffered LABRADOR (BLAB) series of ASICs [15] to achieve simultaneous sampling and readout, acting as a 4-event derandomizer by dividing the total number of samples into multiple windows that can be written to or read out separately. This allows a significant reduction in dead time. Assuming a random trigger probability, the LAB4D would experience only 0.1% dead time under the previous example.
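The single-buffer arithmetic in the example above can be reproduced with a short sketch. The function names are illustrative, and the dead-time estimate is simply the product of trigger rate and busy time, which is a first-order approximation that holds when the product is small. The sketch does not model the 4-event derandomizer and makes no attempt to reproduce the 0.1% figure.

```python
def readout_time_s(n_samples: int, readout_rate_sa_per_s: float) -> float:
    """Time the digitizer is busy reading out one event."""
    return n_samples / readout_rate_sa_per_s


def simple_dead_time_fraction(trigger_rate_hz: float, busy_time_s: float) -> float:
    """First-order dead-time fraction: trigger rate times busy time per trigger."""
    return trigger_rate_hz * busy_time_s


# Numbers taken from the example in the text.
busy = readout_time_s(1024, 10e6)             # 1024 samples read at 10 MSa/s
frac = simple_dead_time_fraction(1e3, busy)   # 1 kHz trigger rate

print(f"busy time: {busy * 1e6:.1f} us, dead time: {frac * 100:.1f}%")
```

The computed fraction, 0.1024, matches the approximately 10% quoted for an unbuffered digitizer.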
The LAB4D, the fourth revision of the LAB4 design, was fabricated in the TSMC 0.25 µm CMOS (LO) process and packaged in a 48-pin quad-flat no-leads (QFN) package to reduce parasitic bondwire inductance. A die image of the LAB4D ASIC is shown in Figure 1. The design and performance of the LAB4D are discussed in the following sections.
Architecture
A variety of CMOS switched capacitor array architectures similar to the LAB4D have been discussed in the literature [16]. A compact, minimal storage array was used in order to limit the parasitic and storage capacitance of the SCA [17]. The decision to pursue a compact storage matrix architecture was symbiotic with similar designs being explored for Monolithic Active Pixel Sensors (MAPS) for charged particles [18] [19].
The LAB4D architecture is a descendant of the LABRADOR-3 (LAB3) design [20], developed for the Antarctic Impulsive Transient Antenna (ANITA), an ultra-high energy neutrino and cosmic ray balloon-borne observatory. The ANITA experiment requires ∼100 sampling channels over 200-1200 MHz [21].
Commercial flash ADCs were impractical due to cost and power limitations.
The LAB3 ASIC has been successfully deployed on four ANITA long-duration balloon flights and the LAB4D is scheduled to be deployed on future missions.
The features of the LAB4D and LAB3 are summarized in Table 1. All of the LABRADOR ASICs have been fabricated in the TSMC 0.25 µm CMOS (LO) process, and all previous generations were packaged in a 64-pin plastic thin quad flat package (TQFP). The LAB4D is the first to use a 48-pin QFN package, chosen to reduce lead inductance and improve compactness.
The LABRADOR ASIC ADCs feature a usable signal voltage range of 0-2.5 V, and the LAB4D's ADC has an effective readout rate of >100 kSa/s. The LAB4Ds were designed as single channel RF digitizers in order to remove bond-
Sampling Array
As shown in Figure 2 [...] The alternation between which sampling array and intermediate storage array the sampling cell writes to resembles a ping-pong ball constantly changing its side of the table; hence the origin of the terminology. The two-stage transfer that is shown extends the settling time for the secondary array to a full sampling cycle. The result is an improved decoupling of the primary array from the sampling array, which is necessary because the main storage array occupies most of the physical space of the ASIC (see Figure 1) and the samples being stored must travel long routes across the chip to reach the designated storage cell.
Timing Generation
As shown in Figure 4, timing for the 128 primary sample-and-hold cells in [...] edge. It need only be set to a reasonable value, via DAC, such that the 50% duty cycle of the incoming SSTin clock is not made excessively asymmetric.
Implementation
The LAB4D performance was characterized using a 12-channel digitizer designed for the ANITA experiment, the Sampling Unit for Radio Frequencies, version 5 (SURFv5). The SURFv5 is a CompactPCI-compatible 6U printed circuit board (PCB), with a single Xilinx Artix-7 field-programmable gate array (FPGA) interfacing with all LAB4Ds and providing a CPU interface
Results
The performance of the LAB4D ASIC as implemented on the SURFv5 was characterized to determine its suitability for RF digitization. Specifically, the noise level, linearity, working range, sample-to-sample timing variation, and analog bandwidth were measured. Next, the stability of these measurements with respect to operating temperature was characterized.
Temperature variations were investigated using three LAB4Ds, which were coupled using thermal paste to a copper heat sink fitted with a Peltier heater/cooler, which had a separate heat sink on its opposite side, and with a thermometer for monitoring the LAB4D temperature. Current through the Peltier device was varied to control the LAB4D temperature.
Noise
Each individual storage cell in the main storage array develops a slightly different DC offset due to non-uniformity in the fabrication process. These offsets, called "pedestals", must be measured and removed to recover the input signal. Subtraction of the individual pedestal can easily be done at the point when each sample is read out, since each sample's subtraction is independent.
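The per-cell subtraction described above can be sketched as follows. This is a minimal illustration, not the SURFv5 firmware or software: the function names and the ADC count values are invented, and it assumes pedestals are estimated by averaging each storage cell over several events recorded with no input signal.

```python
from statistics import mean


def measure_pedestals(baseline_events):
    """Average each storage cell over many no-signal events (one list per event)."""
    n_cells = len(baseline_events[0])
    return [mean(event[cell] for event in baseline_events) for cell in range(n_cells)]


def subtract_pedestals(raw_samples, pedestals):
    """Subtract each cell's own pedestal; every sample is corrected independently."""
    return [sample - ped for sample, ped in zip(raw_samples, pedestals)]


# Invented ADC counts for a toy array of four storage cells.
baseline = [
    [2048, 2051, 2046, 2050],
    [2050, 2049, 2048, 2052],
]
pedestals = measure_pedestals(baseline)
corrected = subtract_pedestals([2059, 2045, 2047, 2061], pedestals)
print(pedestals, corrected)
```

Because each correction touches only one sample and one stored constant, the same operation can be applied sample by sample as data are read out.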
The measured pedestal values for all LAB4D ASICs on a single SURFv5 are shown in Figure 6. The intrinsic pedestal variation results in a relatively minor
Linearity
After pedestal subtraction, the linearity of the digitization of the LAB4D was [...] Gain dispersion within the linear region was measured to be approximately 2% over all cells.
Sample-to-sample timebase variation
The DLL is first optimized by sampling a 235 MHz sine wave. The individual trim DACs are set to a common approximate delay, which assigns a portion of the delay for each element to be controlled by the trim delay, and the remainder to be controlled by the DLL. Because of internal routing in the delay line, even and odd samples are set to different initial values. Then, a sine wave is fit to a single window of data and VtrimT is adjusted so that the fit frequency matches the input frequency. An example of this optimization can be seen in Figure 8. At this point, the average sampling speed of the LAB4D is 128 times the external clock, and the sample-to-sample timing variations can then be tuned.
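The measurement side of this optimization can be illustrated with a small sketch: a sine is fit to one 128-sample window assuming the nominal sample spacing, and the fitted frequency is compared with the known input frequency. Several things here are assumptions rather than the authors' method: the fit is a simple grid scan with a linear least-squares solve at each trial frequency, the nominal spacing uses the 3.2 GSa/s figure quoted in the Summary, the mis-tuned rate of 3.3 GSa/s is invented, and the loop that adjusts VtrimT is not modeled.

```python
import math


def sine_fit_residual(samples, freq_hz, dt_s):
    """Sum of squared residuals of the best a*sin + b*cos fit at one trial frequency."""
    phases = [2 * math.pi * freq_hz * i * dt_s for i in range(len(samples))]
    s = [math.sin(p) for p in phases]
    c = [math.cos(p) for p in phases]
    # 2x2 normal equations for the linear amplitudes a and b.
    ss = sum(x * x for x in s)
    cc = sum(x * x for x in c)
    sc = sum(x * y for x, y in zip(s, c))
    ys = sum(y * x for y, x in zip(samples, s))
    yc = sum(y * x for y, x in zip(samples, c))
    det = ss * cc - sc * sc
    a = (ys * cc - yc * sc) / det
    b = (yc * ss - ys * sc) / det
    return sum((y - a * si - b * ci) ** 2 for y, si, ci in zip(samples, s, c))


def fit_frequency(samples, dt_nominal_s, trial_freqs_hz):
    """Trial frequency with the smallest residual, assuming nominal sample spacing."""
    return min(trial_freqs_hz, key=lambda f: sine_fit_residual(samples, f, dt_nominal_s))


def simulate_window(f_in_hz, true_rate_sa_s, n=128, phase=0.3):
    """Pedestal-subtracted sine as one 128-sample window would record it (toy model)."""
    return [math.sin(2 * math.pi * f_in_hz * i / true_rate_sa_s + phase) for i in range(n)]


F_IN = 235e6                # input sine frequency from the text
DT_NOMINAL = 1 / 3.2e9      # nominal spacing, from the 3.2 GSa/s quoted in the Summary
trials = [200e6 + 0.5e6 * k for k in range(141)]   # 200-270 MHz in 0.5 MHz steps

tuned_fit = fit_frequency(simulate_window(F_IN, 3.2e9), DT_NOMINAL, trials)
fast_fit = fit_frequency(simulate_window(F_IN, 3.3e9), DT_NOMINAL, trials)  # invented mis-tuned rate
print(tuned_fit / 1e6, fast_fit / 1e6)
```

When the delay line runs faster than nominal, the fitted frequency falls below the input frequency; adjusting the common delay until the two agree corresponds to the tuned case.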
The individual sample timing is then trimmed using the individual trim DACs. Because the DLL acts to keep the overall delay of the VCDL the same, adjustments in any single sample trim DAC will result in the corresponding adjustment of the timing of that sample and all other samples. That is, slowing down a single sample by 1% using the trim DACs will result in the remaining samples speeding up by (1/127)% to compensate. We therefore use an iterative minimization procedure to determine all trim DAC values simultaneously. This is done by measuring the fraction of observed samples where the observed value (relative to the subsequent sample) crosses the DC pedestal ("zero-crossing fraction"), either positive-going or negative-going, for 8000 separate waveforms. These 8000 waveforms make up a single iteration in the minimization procedures we discuss throughout this text. This fraction is a simple measure of the width of the time sample, and should be constant if the sample timing is regular. For samples that have a zero-crossing fraction greater than the average, the trim DAC value is decreased, speeding up that sample. Likewise, samples that have a lower-than-average zero-crossing fraction are slowed down by increasing the trim DAC value. Global structure in the delay line is compensated for by appropriate initial estimates of the trim DACs.
This procedure is repeated multiple times, progressively reducing the variation in the timebase. Within approximately 10 iterations (80,000 waveforms), the RMS variation in the sample-to-sample timing is reduced below 5 ps. The iterative procedure eventually (after about 30-40 iterations) reaches RMS timing variations of approximately 3 ps. The improvement in timing over iteration count, as well as the sample-to-sample timing, can be seen in Figure 9.
The slower rate of improvement below 5 ps is due to the limited statistics of each iteration, rather than an intrinsic limitation of the timebase tuning procedure.
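One iteration of the zero-crossing procedure described above can be sketched as follows. The sketch assumes pedestal-subtracted waveforms, so that the DC pedestal sits at zero, and it moves each trim DAC by a fixed step of one count per iteration. The step size and the function names are assumptions, since the text does not specify how far each DAC is moved.

```python
def zero_crossing_fractions(waveforms):
    """Fraction of waveforms in which sample i and sample i+1 straddle zero."""
    n = len(waveforms[0])
    counts = [0] * (n - 1)
    for wf in waveforms:
        for i in range(n - 1):
            if (wf[i] < 0) != (wf[i + 1] < 0):   # sign change in either direction
                counts[i] += 1
    return [count / len(waveforms) for count in counts]


def update_trim_dacs(trim_dacs, fractions, step=1):
    """Lower the DAC (faster) for wide samples, raise it (slower) for narrow ones."""
    average = sum(fractions) / len(fractions)
    updated = []
    for dac, fraction in zip(trim_dacs, fractions):
        if fraction > average:
            updated.append(dac - step)    # interval too wide: speed this sample up
        elif fraction < average:
            updated.append(dac + step)    # interval too narrow: slow this sample down
        else:
            updated.append(dac)
    return updated


# Toy data: four short waveforms instead of the 8000 per iteration used in the text.
waveforms = [
    [1, -1, -2, 3],
    [1, 2, -1, -3],
    [-1, 1, 2, 3],
    [1, -2, 1, 2],
]
fractions = zero_crossing_fractions(waveforms)
new_dacs = update_trim_dacs([100, 100, 100], fractions)
print(fractions, new_dacs)
```

Because the DLL holds the total delay fixed, each update shifts all other samples slightly, which is why the procedure is iterated rather than solved in one pass.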
The last intrawindow delay (t_127 − t_126) contains an unknown extra delay, currently under investigation, which makes this sample's timing a noticeable outlier from the other samples. This should be further adjustable by [...]

Finally, the stability of the timebase with respect to temperature variation was also investigated. First, the DLL was tuned at a temperature of 30 °C and a 210 MHz sine wave was digitized by the LAB4D. Data were then taken between 10 °C and 60 °C in 10 °C intervals, with the DLL enabled and again with the DLL function disabled, in order to investigate the reduction in temperature sensitivity due to the DLL. The sampling rate was manually set by fixing the voltage which controls the common delay in the VCDL via an internal DAC.
The sampling frequency of the LAB4D was determined again using the zero- 
Analog bandwidth
The LAB4D analog bandwidth was measured using the impulse response of the SURFv5. The impulse response was used to avoid effects from sampling persistence [22], which would be present when using a frequency-swept continuous-wave (CW) source. The SURFv5, as was previously mentioned, has an RF input chain consisting of a Mini-Circuits HFCV-145+ high-pass filter and LFCN-1200+ low-pass filter, as well as two TCD-13-4X couplers used to couple off a copy of the input signal and to couple in a low-frequency calibration tone. The bandpass of the input chain is primarily determined by the high-pass and low-pass filters, which act to block DC and as an antialiasing filter for the digitizer. A high-frequency impulse generated by a Tektronix AWG5104 was digitized by both the SURFv5 and a Tektronix MSO 5204B oscilloscope with a 2 GHz input bandwidth under identical conditions. Waveforms from the SURFv5 were correlated and averaged to produce the upsampled waveform shown in Figure 11; the impulse viewed by the oscilloscope is shown in Figure 12 for comparison.
To extract the small-signal analog bandwidth, the Fourier transforms of the impulse responses of both the SURFv5 and the oscilloscope were taken using a ±10 ns Hann window around the peak to eliminate effects from reflections due to imperfect input matching at the coaxial cable connections. The SURFv5 input bandpass was then obtained from a test board using a network analyzer and applied to the oscilloscope response. Finally, the LAB4D de-embedded frequency response was obtained by subtracting the measured SURFv5 response from the modified oscilloscope response. The oscilloscope and the de-embedded LAB4D response are shown in Figure 13.
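The windowing step can be illustrated with a minimal sketch that applies a Hann window centred on the impulse peak and computes the magnitude spectrum in dB. A direct DFT is used only to keep the example self-contained. The function names are illustrative, and `half_width` stands for 10 ns divided by the sample spacing of the upsampled waveform, a spacing not stated in the text. The de-embedding step is not modeled.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def windowed_spectrum_db(waveform, peak_index, half_width):
    """Magnitude spectrum (dB) of a Hann-windowed segment centred on the peak.

    Requires peak_index - half_width >= 0 and peak_index + half_width < len(waveform).
    """
    segment = waveform[peak_index - half_width: peak_index + half_width + 1]
    windowed = [x * w for x, w in zip(segment, hann(len(segment)))]
    n = len(windowed)
    spectrum = []
    for k in range(n // 2 + 1):   # non-negative frequency bins only
        bin_sum = sum(x * cmath.exp(-2j * math.pi * k * j / n)
                      for j, x in enumerate(windowed))
        spectrum.append(20 * math.log10(max(abs(bin_sum), 1e-12)))
    return spectrum


# Toy check: an ideal unit impulse at the window centre has a flat spectrum.
waveform = [0.0] * 50
waveform[25] = 1.0
spectrum_db = windowed_spectrum_db(waveform, peak_index=25, half_width=10)
print(len(spectrum_db), max(abs(v) for v in spectrum_db))
```

Samples far from the peak, where reflections would arrive, are weighted toward zero by the window and so contribute little to the spectrum.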
The measured -3 dB point of the LAB4D is approximately 1.3 GHz, with a frequency response flat to within 0.5 dB observed up to 1.1 GHz. This represents a 44% increase over the LAB3, which had a measured -3 dB point of 900 MHz. The measurement is conservative, as uncertainty in correlating the corresponding waveforms from the SURFv5 when averaging them will reduce the high-frequency response. The low-frequency response is determined purely by the high-pass filter on the SURFv5; the LAB4D has no intrinsic low-frequency limitation, and the frequency response below 200 MHz is expected to be flat.
Summary
A switched capacitor array device developed in the TSMC 0.25 µm CMOS (LO) process, which utilizes a unique "ping-pong" intermediate storage array architecture, has been designed, fabricated, and characterized. This ASIC, the LAB4D, has a -3 dB upper analog bandwidth limit of greater than 1.3 GHz, a sampling frequency of 3.2 GSa/s (well above the Nyquist minimum for the 200-1200 MHz ANITA band), a sample window length of 320 ns, and the unique ability to tune the timebase sampling offsets, reliably reducing the RMS timing variation between sample cells to less than 5 ps.
Acknowledgments
This work is the result of the continued support from NASA, the National Science Foundation, and the US Dept. of Energy, High Energy Physics Division.
