A number of particle astrophysics initiatives to exploit radio emission from high energy particle cascades require high-frequency sampling of antenna array signals. Nyquist-limited sampling of GHz frequency radio signals for an antenna array may be accomplished by commercially available test units. However, these technologies are incompatible with the size, power and cost constraints of long-duration balloon or satellite flight. Taking advantage of low trigger rates for such arrays, high resolution digitization may be performed a posteriori, at much lower speed and power, on waveforms stored in analog storage cells. This paper presents the design and performance simulation of a multichannel CMOS VLSI ASIC named STRAW (Self-Triggered Recorder for Analog Waveforms), optimized for low duty-cycle, high sampling frequency operation.
1. INTRODUCTION
In the history of astronomy surprises have occurred in every new regime explored, and frequently the most important results have not been those that were predicted. An example of contemporary interest is the window now opening onto the observation of Ultra High Energy (UHE) cosmic ray events [1]. Observation of such events, in excess of several times 10^19 eV, has spawned a great deal of experimental and theoretical interest. Notably, no clear cutoff in particle energies has been observed to date by a variety of experiments [2], in contrast to expectations for the removal of such events through interaction with the cosmic microwave background [3], as may be seen in Figure 1. Placing constraints on the cornucopia of theoretical models [4], which have arisen to explain the energy spectrum at the highest energies, has led to the development of vast detector arrays [5]. At some point the practical limits of extending these proven air shower detection techniques will be exhausted. This may be seen in the overall cosmic ray energy spectrum shown in Figure 2. Such low event rates at the upper end of the energy spectrum require significant detection volumes. One promising technique for improving the effective detection volume is the measurement of coherent radio pulses produced in a shower medium, as predicted by G.A. Askaryan [6]. Recent beam test measurements [7] in both sand and salt have demonstrated the viability of this technique. Figure 3 depicts the beam test configuration used at SLAC to measure the Askaryan-effect RF pulses for simulated large electromagnetic showers, as produced by high energy cosmic showers in dense material. Figure 4 shows the expected shower profile with measured data points overlaid, indicating excellent agreement with expectation. The inset diagram in Figure 4 shows the time-domain response of an antenna to an Askaryan pulse.
Figure 5 shows the temporal response of a pair of cross-polarized antennas, clearly indicating near-total linear polarization of the recorded RF pulse. Note that the depolarization after about 2 ns is due to reflections inside the test box. This polarization may be exploited to distinguish Askaryan-effect RF pulses from other background sources, since thermal noise is incoherent and anthropogenic noise is largely circularly polarized. Another key feature of this measurement technique is the ability to determine the shower energy with a single-point measurement, a huge advantage over calorimetric methods. Figure 6 shows the scaling of the total shower energy from the SLAC experiment.
The left-hand side of Figure 6 demonstrates the functional dependence of the measured electric field strength from the shower. On the right, the frequency dependence of the electric field is plotted. Fitting to an empirical formula [8] for the field strength scaling:
where R is the distance to the source, is the radio frequency, and is the decoherence frequency (2500 MHz for silica sand and scales roughly as the radiation length). A_0 and have empirically determined values [7] and W_T gives the shower energy as a product of the total number of electrons N_e: W_T = N_e W_e, where is the thickness of the radiator in radiation lengths and W_e the electron energy. Finally, K and account for material and aperture corrections, respectively. Armed with such a scaling law and needing only to determine the distance to the shower, a good estimate of the shower energy may be obtained. Returning to the right-hand side of Figure 6, it may be clearly seen that significant signal strength exists at GHz frequencies. Taking these requirements into consideration, as well as practical limitations on atmospheric propagation, antenna response and low-noise amplifiers, the requirements for a data recording system may be summarized:
- >= 1 GHz analog input bandwidth
- multi-GSa/s sampling rate
- minimum phase distortion for clean polarization determination
- maximum dynamic range (>= 9 bits)
- internal Analog to Digital Conversion (ADC)
- short recording period (100-200 ns if optimally matched)
- self-triggering with fine threshold adjustment
- deadtimeless operation strongly desired
The next section compares these requirements to capabilities of published and available methods. Based upon the shortcomings of what is already available, Section 3 defines the operating specifications and simulated performance for a CMOS chip capable of meeting the requirements.
2. HIGH SPEED TRANSIENT DIGITIZERS
It has long been the dream of experimentalists to utilize the flexible triggering, high analog input bandwidth and high sampling speed of modern digital sampling oscilloscopes as a Data AcQuisition (DAQ) system. For a very limited number of channels and with large readout latencies, this is an option. The high cost per channel, limited dynamic range within gain scale and unneeded agility in time base and gain are not well matched to large channel count, fixed parameter, long-term installations. Likewise, a more traditional solution of using fast ADCs followed by some type of post sample processing has limitations. High analog input bandwidth and high sampling speed ADCs suffer from low precision, high cost, high power consumption and the need for high-speed digital signal processing downstream.
Thus, the dream may be recast as desiring to have "an oscilloscope on a chip". A number of analog pipeline and waveform digitizing systems have been developed for High Energy Physics (HEP) systems in which detector signals are captured as "snap shots", that is, sampled at high frequency for a limited period of time. These signals may be stored in a fast analog memory and retrieved at a lower rate and digitized with a slower ADC before a new waveform is acquired [9] . Coming full circle, this scheme has in the past formed the core of the LeCroy high-speed oscilloscopes [10] .
A survey of the literature indicates that previous successful analog pipeline/Switched Capacitor Array(SCA) implementations have been generally optimized for two different detector readout modes. The first, well-suited to a colliding beam environment, is based on sampling at a given fixed rate. Keeping an analog record of the signal before and after the event of interest allows for pile-up rejection and even signal extraction from the tail of a preceding signal, essential for a high rate environment. A second implementation is intended for intermittent, triggered applications. As such it has been successfully used for reading out Photo-Multiplier Tubes (PMTs), for instance from large channel count optical Cherenkov detector arrays.
Upon closer inspection the various analog storage architectures are conceptually very similar and are distinguishable only in terms of their operating mode (common start versus common stop), means of storing and retrieving a stored signal on a sampling capacitor and the means of converting this information into a digital value. An excellent comparative summary of these distinctions for various SCAs is given in Ref. [11]. Based on this and augmented by recent publications and private communication, Figure 7 provides a summary of the sampling speed and analog bandwidth for various designs. Because of space limitations, the reader is referred to the references for further details on the specifics for each design. In most cases, the analog input bandwidth has not been specified and so an estimate has been made based upon the input capacitance and technology or through communication with the authors. Figure 7 in addition compares the performance for a few commercially available ADCs. This is somewhat misleading in that the channel density for these devices is low, the power high and the resolution only marginal. Even so, the SCA-based schemes still compare favorably.
[Figure 7: High Speed Digitizer Comparison (sampling speed versus analog bandwidth). Devices plotted: ZEUS [12], RD2 [13], Kleinfelder [14], Haller [15], ADeLine1 [11], DSC/DRS [16], AD9410 [17], CLC5957 [18], ADS5102 [20], MAX1449 [21]. Region label: "Desired Max. Operating Region".]
Apart from possibly the DRS chip, none of these devices are obvious matches to the performance requirements stated in the previous section. Implemented in a deep sub-micron process, the DRS design is potentially suitable, though the horizontal error bar indicates that there is still uncertainty in the analog bandwidth as it has yet to be measured [17]. Also, it should be mentioned that the author is aware of a revised version of the Kleinfelder design [22], which is potentially suitable. At the moment, however, none of the proven designs has the required analog input bandwidth, sampling rate and self-triggering features needed for the next generation of RF detector readout.
3. THE STRAW2 CHIP
Based upon the desired performance outlined above, a chip design has been pursued which attempts to extend the successful SCA architectures mentioned above to higher sampling frequencies and analog input bandwidths. In addition, as it is envisioned that operation independent of an external trigger will be the normal operating mode, self-triggering capability is essential. While exploring what is possible via SPICE simulation for a deep sub-micron (0.25 µm) CMOS process, the concept of a Self-Triggered Recorder for Analog Waveforms (STRAW) took shape. In the initial version, STRAW1, 32 RF input channels were considered as well as rather simplistic triggering circuitry. Further study indicated that better input impedance matching and more elaborate triggering were required, a redesign radical enough to warrant incrementing the design number. Table 1 lists the design parameters for the STRAW2 design. Essential to the design is the performance of the analog input bandwidth and triggering. These will be addressed in the following subsections. It might be argued that 12 bits of ADC are not really needed, as the dynamic range is rather modest. However, since the signals to be extracted are largely buried in the noise, a number of bits dedicated to sampling the noise baseline with high precision are considered important. Since the ADC is of Successive Approximation (SAR) type, only fractional additional deadtime is incurred by extending to 12 bits. Another item of note in Table 1 is the digitization deadtime. As will be seen below, provision will also be made to use an external ADC. In this case the deadtime could be reduced to a few milliseconds, but this still represents a significant deadtime once a trigger is accepted. Rather than focus on a fractional improvement by parallelizing the readout and digitization, another strategy has been pursued.
No matter how rapid the acquisition deadtime, there is still a finite chance of missing events of interest. To address this issue, it is expected that multiple STRAW2 chips will be used in parallel. Each chip generates a trigger signal and this information is collected externally by a trigger sequencer, which subsequently issues a hold to one chip per trigger. Given the low rate for real events, and the desire to catch possible "double-bang" events, two or three deep buffering is expected to be adequate. One other item of note in the Table is the provision for monitoring scalers. These are intended to monitor the noise trigger rate for the low-level threshold comparators. In order to operate right at the noise threshold, multiplicity logic is used on the random noise triggers to distinguish a possible event signal from pure noise. In order to "ride" the noise level, feedback on the individual channel trigger singles rates is a necessary input for adjusting the trigger threshold. More on the triggering details will be provided in subsection 3.3 below.
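The multi-chip buffering scheme described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name `HoldSequencer`, the first-free-chip assignment policy and the 5 ms deadtime used in the example are hypothetical and are not taken from the STRAW2 design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HoldSequencer:
    """Toy external sequencer: each accepted trigger holds exactly one chip."""
    n_chips: int                                   # buffer depth (chips in parallel)
    busy_until: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # All chips start out free (hypothetical initial condition).
        self.busy_until = [0.0] * self.n_chips

    def on_trigger(self, t: float, deadtime: float) -> Optional[int]:
        """Return the index of the chip that is held, or None if all are busy."""
        for chip, free_at in enumerate(self.busy_until):
            if t >= free_at:
                # Chip is free: hold it for the readout/digitization deadtime.
                self.busy_until[chip] = t + deadtime
                return chip
        return None  # event lost: every chip is still being read out


# Example: a "double-bang" pair 1 us apart, with an assumed 5 ms deadtime.
seq = HoldSequencer(n_chips=2)
first = seq.on_trigger(0.0, 5e-3)     # held by chip 0
second = seq.on_trigger(1e-6, 5e-3)   # held by chip 1
third = seq.on_trigger(2e-6, 5e-3)    # None: both chips busy
```

With two chips the second pulse of a closely spaced pair is still recorded, while a third trigger within the deadtime is lost, which is the trade-off behind the "two or three deep" estimate.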
A block diagram of STRAW2 functionality is shown in Figure 8, with each of the key functional blocks indicated. As seen in Figure 8, 16 RF signals are brought into the array of 32 rows by 256 columns. Every other SCA channel is used to log analog trigger information on a per-channel basis. The purpose of this is twofold: diagnostics of the recorded transient signals that lead to the generation of a trigger, and monitoring of comparator status. The ADC, analog multiplexing and trigger control elements in this block diagram are straightforward. The next three subsections focus on design elements crucial to the success of the design. Specifically, the first subsection, on the storage of the analog waveforms, addresses the sampling frequency, linearity and gain. A second subsection analyzes the critical issue of analog input bandwidth. Finally, in the last subsection, a preliminary study of triggering on high-speed bipolar signals is presented.
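As a rough consistency check between the array geometry and the recording requirements, the record length can be related to the sampling rate. The sketch below assumes that the 256 columns correspond to 256 sequential sampling cells per row; this mapping is an assumption made for illustration, and the function names are hypothetical.

```python
def record_window_ns(n_cells: int, sample_rate_gsa: float) -> float:
    """Length of the stored waveform in ns for a given sampling rate in GSa/s."""
    return n_cells / sample_rate_gsa


def required_rate_gsa(n_cells: int, window_ns: float) -> float:
    """Sampling rate in GSa/s needed to fill n_cells in exactly window_ns."""
    return n_cells / window_ns


N_CELLS = 256  # assumed: one sampling cell per array column

window_at_2gsa = record_window_ns(N_CELLS, 2.0)    # 128.0 ns
rate_for_100ns = required_rate_gsa(N_CELLS, 100.0)
rate_for_200ns = required_rate_gsa(N_CELLS, 200.0)
```

Under this assumption, a 100-200 ns window corresponds to roughly 1.3 to 2.6 GSa/s, in line with the multi-GSa/s requirement stated in the introduction.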
Analog Waveform Recording
The design and performance expectations for the SCA storage elements are similar to those for other designs targeted for a deep sub-micron (0.25 µm) process [16, 22]. Delay timing is implemented with an inverter chain whose propagation delay time is adjustable by means of either voltage rail or bias current adjustment. As in previous designs, the input voltage is not buffered, but directly switched onto the storage capacitive element. However, as may be seen in Figure 10, the actual readout mechanism is different. Specifically, during readout neither the charge nor the stored voltage is directly transferred. Instead, the conductance through the storage FET is sensed. This technique is reminiscent of that employed for the readout of an active pixel sensor [23]. Not transferring the stored quantity directly avoids issues of readout cross-talk and parasitic bus capacitances. In addition, each storage element can be made quite compact, allowing for a dense array of storage cells. In the STRAW2 design the storage FET is sensed across a variable pull-up resistor. By changing the resistor, the effective voltage gain may be changed. Figure 11 illustrates the expected output swing versus stored waveform input voltage with a 40 kΩ pull-up resistor. For small signal amplitudes, the output response is rather linear. However, to obtain maximal linearity, it is expected that a dedicated calibration would need to be performed. One disadvantage of this method is that the limited dynamic range of these inputs could become even more modest. Based upon the storage capacitance, an idealized estimate can be made for the rms noise of each sample. An estimate for the working voltage of this configuration is about 1 V, which corresponds to approximately 12 bits of dynamic range. The effective gain of the readout may then be considered to be the amount by which the least significant bits of the ADC will be buried in the noise.
That is, a gain of 8 corresponds to 3 bits of ADC noise sensitivity. In summary, simulation and simple calculations indicate that the analog storage performance should suffice.
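The noise and gain arithmetic above can be sketched numerically. This example assumes that the idealized per-sample noise is the thermal kT/C noise of the storage capacitor, which is the standard estimate for a switched capacitor but is not confirmed by the text. The 70 fF capacitance and 300 K temperature are hypothetical values chosen only for illustration; the 1 V working range and the gain of 8 are the figures quoted above.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(c_farads: float, temp_k: float = 300.0) -> float:
    """Idealized rms sampling noise sqrt(kT/C) of a switched capacitor."""
    return math.sqrt(K_BOLTZMANN * temp_k / c_farads)


def dynamic_range_bits(full_scale_v: float, noise_vrms: float) -> float:
    """Dynamic range expressed in bits: log2(full scale / rms noise)."""
    return math.log2(full_scale_v / noise_vrms)


def buried_bits(gain: float) -> float:
    """Number of ADC least significant bits that fall below the amplified noise."""
    return math.log2(gain)


C_STORE = 70e-15   # hypothetical storage capacitance (not from the paper)
V_RANGE = 1.0      # working voltage quoted in the text

noise = ktc_noise_vrms(C_STORE)            # roughly 0.24 mV rms
bits = dynamic_range_bits(V_RANGE, noise)  # close to 12 bits
lsb_in_noise = buried_bits(8.0)            # 3.0 bits for a gain of 8
```

With these assumed values the 1 V range spans about 12 bits above the kT/C floor, and a readout gain of 8 places three ADC bits within the noise, matching the statement above.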
Analog Input Bandwidth
A relative lack of examples of mixed-signal components optimized for high analog input bandwidth operation is a concern. Package manufacturers have invested a lot of time studying degrading factors in their chip carriers. For high frequency but narrow bandwidth operation, choices can be made in the input matching that are not possible when considering operation over a wide range of input bandwidths. Layout of the CMOS chip has been carefully done to ensure a 50 Ω impedance-controlled stripline internally, with a terminator at the end of the transmission line. A simple estimate for the analog input frequency roll-off can be made based upon the input impedance and capacitance. Table 2 lists an estimate of the sources of capacitance for an RF channel of the STRAW2 chip.
$f_{3\,\mathrm{dB}} = \dfrac{1}{2\pi R C}$ (GHz)
However, this estimate ignores the importance of bonding wire inductance and impedance mismatch. Therefore an effort has been made to perform a full 3-D Electromagnetic simulation of the input, using LC [24] .
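The simple single-pole roll-off estimate referred to above can be evaluated as follows. The capacitance values are placeholders, since the Table 2 entries are not reproduced here; only the 50 Ω impedance comes from the text, and the function name is hypothetical.

```python
import math


def f3db_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB frequency of a single-pole RC low-pass: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


R_IN = 50.0  # ohms, stripline impedance quoted in the text

# Placeholder total input capacitances (not the Table 2 values).
for c_pf in (0.5, 1.0, 2.0):
    f_ghz = f3db_hz(R_IN, c_pf * 1e-12) / 1e9
    print(f"C = {c_pf:.1f} pF -> f_3dB = {f_ghz:.2f} GHz")
```

For a 50 Ω source, a total capacitance of order 1 pF puts the roll-off in the GHz range, which is why the individual capacitance contributions matter. As noted above, this lumped estimate ignores bonding wire inductance and impedance mismatch.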
3-D EM Modeling
For convenience, it is preferable to package the STRAW2 die in a 100-pin plastic Thin Quad Flat Pack (TQFP) package. While it is acknowledged that flip-chip bonding to a Ball Grid Array (BGA) or Chip Scale (CS) package would have benefits, no suitable standard package has been found. Development of a custom package is an expensive proposition. Therefore, if the TQFP package can be shown to perform adequately, it will be used. In Figure 12 is shown the geometry used to simulate an input including printed circuit board trace, package and CMOS die.
LC is a potentially very powerful tool as it applies a Finite Difference Time Domain (FDTD) methodology to the challenging task of full 3-D EM field solving.
However, because of constraints on using simple geometries in LC, trapezoidal structures have had to be emulated with a series of rectangular structures. The main elements modeled in the simulation are the printed circuit board, with 50 Ω stripline impedance, the plastic TQFP package, including bonding wires, and the die, with emulation of the ~2 mm long impedance-matched stripline. Limitations on the number of grid points and computation time required simplifications to the number of simulated input channels as well as the extent of the printed circuit board. A preliminary S-parameter simulation result for the input response is shown in Figure 13. These initial results look promising. At 1 GHz the VSWR is about 1.8 and rises to about 1.9 at 2 GHz. As a limit on the number of lattice points was employed in the simulation, it is possible that important effects have been neglected. Therefore more detailed simulations will continue to be pursued. Taken at face value, this result indicates little performance loss up to about 2 GHz. This is well matched to the fact that the antennae planned for either ANITA [25] or a possible NaCl detector [26] array are envisioned to have modest bandwidth above about 1.5 GHz.
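To put the quoted VSWR values in terms of signal loss, the standard transmission-line relations between VSWR, reflection coefficient and mismatch loss can be applied. This is a sketch using textbook formulas only; it is not output from the LC simulation, and the function names are hypothetical.

```python
import math


def reflection_coefficient(vswr: float) -> float:
    """Magnitude of the voltage reflection coefficient: (VSWR-1)/(VSWR+1)."""
    return (vswr - 1.0) / (vswr + 1.0)


def reflected_power_fraction(vswr: float) -> float:
    """Fraction of incident power reflected at the mismatch: |Gamma|^2."""
    return reflection_coefficient(vswr) ** 2


def mismatch_loss_db(vswr: float) -> float:
    """Loss in transmitted power due to the mismatch, in dB."""
    return -10.0 * math.log10(1.0 - reflected_power_fraction(vswr))


# VSWR values quoted in the text at 1 GHz and 2 GHz.
for f_ghz, vswr in ((1.0, 1.8), (2.0, 1.9)):
    print(f"{f_ghz:.0f} GHz: VSWR {vswr} -> "
          f"{100 * reflected_power_fraction(vswr):.1f}% reflected, "
          f"{mismatch_loss_db(vswr):.2f} dB mismatch loss")
```

A VSWR of 1.8 to 1.9 corresponds to roughly 8 to 10 percent reflected power, or about 0.4 dB of mismatch loss, consistent with the statement that little performance is lost up to about 2 GHz.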
Triggering on RF Signals
Triggering on pulses with broad bandwidth content up to and including 1 GHz is an interesting problem. In addition to requiring a comparator with a high gain-bandwidth product, the bipolar nature of the pulse must be taken into account. In fact, depending upon orientation and line-of-sight to the shower charge separation, the largest amplitude (primary) received pulse can be predominantly positive or negative. This is somewhat visible in the raw antenna pulse seen in the top waveform trace in Figure 15. A straightforward way around this problem is to use dedicated positive and negative amplitude discriminators. As mentioned above, for small signals, a multiplicity requirement is placed upon the RF channels being monitored to help resolve true signals from noise. In addition, it is desirable to issue a trigger when a large signal is seen in any one channel, independent of the status of the other channels. This is useful for the case when the other channels may correspond to antennae that are pointing in a different direction and no signal correlation should be expected. Figure 14 shows a simplified schematic of a single-channel trigger. The positive and negative comparator thresholds are set with a DAC for each channel individually. Figure 15 shows a SPICE simulation of trigger performance for an attenuated version of a waveform recorded in the SLAC beam test, indicating functionality.
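The trigger decision described in this subsection can be summarized as a small piece of logic. This is a behavioral sketch only: it operates on one amplitude value per channel, ignores comparator timing and coincidence windows, and all names, thresholds and the multiplicity value are hypothetical rather than STRAW2 parameters.

```python
from typing import Sequence


def channel_hit(sample: float, pos_thr: float, neg_thr: float) -> bool:
    """Dual-polarity discriminator: fires on either sign of a bipolar pulse."""
    return sample > pos_thr or sample < neg_thr


def trigger(samples: Sequence[float],
            pos_thr: Sequence[float],
            neg_thr: Sequence[float],
            high_thr: float,
            multiplicity: int) -> bool:
    """Trigger on a multiplicity of low-threshold hits OR one large signal."""
    # Per-channel thresholds, as set individually by a DAC for each channel.
    low_hits = sum(channel_hit(s, p, n)
                   for s, p, n in zip(samples, pos_thr, neg_thr))
    # A single large pulse of either polarity triggers on its own.
    large_signal = any(abs(s) > high_thr for s in samples)
    return large_signal or low_hits >= multiplicity


# Example with four channels; amplitudes in arbitrary units (assumed values).
POS = [10.0] * 4
NEG = [-10.0] * 4

coincident = trigger([12.0, -15.0, 11.0, 2.0], POS, NEG, 100.0, 3)   # True
noise_only = trigger([12.0, -15.0, 2.0, 2.0], POS, NEG, 100.0, 3)    # False
single_big = trigger([0.0, 0.0, -150.0, 0.0], POS, NEG, 100.0, 3)    # True
```

The separate high threshold covers the case noted above, in which only one antenna points toward the source and no correlation with the other channels is expected.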
4. SUMMARY AND SUBMISSION PLANS
Evolution in the trigger parameters as a result of discussion of the triggering requirements has thus far delayed submission of the design twice. Indeed, it was the strong conviction of the author that having a chip fabricated and test results in advance of presentation of the design at this conference was essential. However, this simply was not possible. A preliminary floorplan diagram showing the total chip layout is provided in Figure 16 . Pending a design review in mid-August, the current plan is to submit the STRAW2 design to MOSIS for fabrication at the beginning of September. Packaged parts for testing should be available approximately 3 months thereafter. 
