I. INTRODUCTION
Many modern data acquisition systems require the recording of analog signals as a function of time over a wide dynamic range. Most commonly, the analog information is digitized at the required acquisition rate using an analog-to-digital converter (ADC). However, in a number of applications analog waveforms need only be captured as snapshots; continuous digitization is not necessary. Examples of such applications include pulse echo phenomena (RADAR, LIDAR, ultrasonics, and non-destructive material and medical testing), pulse shape recording (high energy physics experiments, accelerator diagnostics), and laboratory instrumentation (oscilloscopes, transient digitizers). In such cases an input waveform can be sampled at a high rate for a limited period of time, and the samples stored in an analog memory. The analog samples are then retrieved at a lower rate and digitized with a slow ADC before a new waveform is acquired. Many channels may be multiplexed onto one converter when readout speed and latency are not crucial.
Advantages of using an analog memory include low overall power dissipation and cost, high density, and potentially superior dynamic range at high sampling rates.
Two main technologies are available for the realization of an integrated analog memory: charge-coupled devices (CCD's) and switched-capacitor circuits. Integrated circuits based on switched-capacitor techniques are inherently capable of higher accuracy and sampling rates than CCD devices. Furthermore, CCD's require elaborate clocking circuitry that generally dissipates considerable power.
Strong cost and performance incentives especially encourage the use of analog switched-capacitor memories in high energy physics experiments [1]. Fast analog waveform capture for thousands of channels must be provided with a minimum of power dissipation. The design challenge is to produce a uniform and linear response in a large number of memory cells at a level of performance comparable to the high accuracy inherent in the technology. Principal performance issues are cell-to-cell offset and gain variations within a memory channel, which are governed by the circuit architecture and its sensitivity to the matching properties of its constituent transistors and capacitors [2]-[5]. In high-precision applications, the lowest achievable cell non-uniformities may not be acceptable and therefore must be eliminated by correcting the data. In large systems, it is essential that the number of correction constants and the computational effort be minimized.
Early analog memory circuits based on a sample-and-hold topology contain a sampling switch, a storage capacitor, and a readout buffer in each memory cell [6]-[9]. In order to meet the need for lower power and higher density, architectures based on switched-capacitor circuits were … Cell pedestals are therefore a function of the input signal and may require individual offset and gain corrections. In addition, a serious drawback of these implementations in high-speed applications is the dependence of the sampling transistor turn-off time on the signal level [7]. In circuits based on traditional charge redistribution switched-capacitor techniques [14], sampling-switch charge injection can be made independent of the signal level, but the cell gain is a linear function of the size of the storage capacitor. Cell-to-cell gain matching of better than 0.5 % across an entire channel is therefore difficult to achieve, and both offset and gain corrections are commonly needed for each cell. The sampling speed of published analog memory circuits is presently limited to less than 150 MHz.
This work presents a circuit that enables sampling rates as high as 700 MHz while sustaining a dynamic range of more than 12 bits. In addition, cell pedestals are independent of the signal amplitude, cell gains are insensitive to component sizes, and the sampling time is independent of the input level. This allows a straightforward improvement in performance by means of a simple offset subtraction. Cell-to-cell gain corrections are not needed.
A specific application for the proposed memory is high energy physics accelerators and colliders, where bunches of particles are transported at close to the speed of light inside beam pipes several miles long [15]-[17]. In order to control the operation of the particle beam with sufficient accuracy, its transverse position must be measured at as many as a thousand locations with a precision of better than 1 µm across a range of 5 mm. The complexity and cost of such a measurement system can be significantly reduced through use of high-speed, high dynamic range analog memories.
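As a rough check on why a high dynamic range is needed here, the position requirement can be converted into an equivalent number of bits. The sketch below takes the stated precision as 1 µm over a 5 mm range; the function name is illustrative and not part of the original work.

```python
import math

def required_bits(full_range: float, resolution: float) -> float:
    """Bits needed to resolve `resolution` across `full_range` (same units)."""
    return math.log2(full_range / resolution)

# Assumed figures: 5 mm measurement range, 1 um position precision.
levels = 5e-3 / 1e-6
bits = required_bits(5e-3, 1e-6)
print(f"{levels:.0f} resolvable levels, about {bits:.1f} bits")
```

Under these assumptions the requirement is a little over 12 bits, which is in line with the dynamic range targeted for the memory.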
The proposed analog memory circuit is described in detail in Section II of this paper. In Section III, the design of the on-chip write and read control circuitry is explained. Experimental results characterizing the performance of the memory are presented in Section IV.
II. ANALOG MEMORY DESCRIPTION
Shown in Fig. 1 is a block diagram of an analog waveform recorder with m memory channels.
The analog waveforms applied at the m inputs are sampled and stored in the main analog memory core. The write and read addresses for the core are generated in the write and read control blocks, respectively. Since all memory channels are written and read simultaneously, the addresses are common to all channels. In applications where the readout time permits the serial readout of the channels, the m outputs can be read out on a single common output line by including an on-chip analog multiplexer.
A simplified schematic of one channel of the proposed analog memory, comprising n memory cells, is shown in Fig. 2. Each memory cell consists of a large write ( … The operation of the circuit can be described by dividing the data acquisition process into write and read cycles. In the write phase, analog signals applied at the channel input, Vin, are sampled and stored in the memory cells at a high rate. The stored analog information is subsequently read out serially at the channel output, Vo, at a lower speed.
During the write phase, switch Min is turned on, connecting the signal Vin to the input bus, while switch Mout and the read switches Mr1 through Mrn are all off, isolating the input bus from the read bus. Switch Mrst is on to keep the read bus at a defined potential, VB, during the entire write phase. An analog signal applied at the circuit's input is sampled onto the cell capacitors Ci by sequentially turning transistors Mw1 through Mwn on and off as illustrated in Fig. 3(a). Samples of the input waveform at n discrete times are thereby stored in the memory channel.
The voltage ΔVci, stored across capacitor Ci in memory cell i, 1 ≤ i ≤ n, after sampling is …
After the write phase has been completed and the input waveform is stored in the analog memory, the read cycle is initiated. During readout, the switch Min is turned off while Mout and Mrst are turned on, forcing the input and read bus to VB. Mrst is then turned off and the voltage stored in the first cell is read out by turning transistor Mr1 on as illustrated in Fig. 3(b). After the output has settled, the signal may be digitized with an external low-speed, low-power A/D converter. Following digitization, Mrst is again turned on and Mr1 is turned off, which forces the input bus back to VB in preparation for the readout of the next cell. This cycle is repeated for all cells. It is essential that the input bus always be forced back to VB before a new cell is read out; otherwise, charge sharing and parasitic capacitance effects will seriously degrade the performance of the memory. By turning the cell read switches off after the reset switch is turned on, the potential across the capacitors is initialized to a defined state for the next write phase. The minimum readout time depends on the number of cells to be read out and the performance of the amplifier.
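The switch sequencing described above can be summarized in a small behavioral sketch. It only enumerates the order of switch states; it does not model voltages or timing, and it ignores any overlap between adjacent write pulses. The switch names (Min, Mout, Mrst, Mw1..Mwn, Mr1..Mrn) follow the schematic of Fig. 2, while the function names and step labels are hypothetical.

```python
from typing import Dict, Iterator, Tuple

State = Dict[str, bool]  # switch name -> True when the switch is closed

def write_phase(n: int) -> Iterator[Tuple[str, State]]:
    """Yield the switch states while n cells are written one after another."""
    # Input connected, output isolated, read bus held at VB by Mrst.
    base = {"Min": True, "Mout": False, "Mrst": True}
    for i in range(1, n + 1):
        yield f"sample cell {i}", {**base, f"Mw{i}": True}
        yield f"hold cell {i}", {**base, f"Mw{i}": False}

def read_phase(n: int) -> Iterator[Tuple[str, State]]:
    """Yield the switch states while n cells are read out serially."""
    base = {"Min": False, "Mout": True}
    # Both buses are forced to VB before the first cell is connected.
    yield "initial reset", {**base, "Mrst": True}
    for i in range(1, n + 1):
        # Reset released, read switch closed: the cell voltage is sensed.
        yield f"read cell {i}", {**base, "Mrst": False, f"Mr{i}": True}
        # Reset is asserted BEFORE the read switch opens, so the capacitor
        # is left in a defined state for the next write phase.
        yield f"reset after cell {i}", {**base, "Mrst": True, f"Mr{i}": True}
        yield f"release cell {i}", {**base, "Mrst": True, f"Mr{i}": False}

for label, state in list(write_phase(2)) + list(read_phase(2)):
    closed = sorted(name for name, on in state.items() if on)
    print(f"{label:22s} closed: {', '.join(closed)}")
```

Listing the steps this way makes the two ordering constraints explicit: the buses are reset before every cell is connected, and each read switch opens only after the reset switch has closed.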
Once the write switch is turned off, the cell capacitor nodes connected to the cell transistors are left in a high-impedance state for the remainder of the write phase and the entire read phase. The charge at these nodes is thus conserved and, with the input and output bus forced to VB between the readout of adjacent memory cells, only three parasitic capacitances influence the dc transfer function of a memory cell. One is the capacitance CPi associated with the cell sampling capacitor terminal that is connected to the write switch Mwi. CPi comprises the parasitic capacitance of the sampling capacitor together with the drain-substrate capacitance of the read switch and the drain-substrate and gate overlap capacitances of the write switch. The second parasitic capacitance is the gate overlap capacitance of the read switch, Cri, and the third parasitic to be considered is the capacitance, CPP, between the input bus and the read bus. CPP consists of the capacitance between the inverting input and output of the amplifier (a fraction of the gate-drain capacitance of the amplifier input transistor) together with capacitances associated with interconnections on the chip.
In the proposed memory the voltage across the cell capacitor, rather than the charge stored on that capacitor, is sensed during readout. When memory cell i is selected for readout, the voltage at the output of the amplifier, Voi, can be described as a function of the input voltage, Vin, in the form
The gain factor Ai is given by … (VH - VB - VT) - ΔVpwi …
where Wri and Lri are the width and the length of the read transistor.
Because, as indicated by (4)-(6), both Ai and Vow are independent of the input voltage, Vin, it follows that the output voltage of the analog memory channel, Vo, is a linear function of Vin.
For applications where a high input bandwidth is required, the write transistor must be made large because the cell bandwidth, B, is determined by the size of the sampling capacitor and the resistance of the write transistor:
The bandwidth of a memory cell and the size of the error voltage ΔVpwi are therefore correlated.
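For orientation, a standard first-order RC estimate of the cell bandwidth is given below. It is a textbook approximation and not necessarily the expression used by the authors; the symbol R_{on,i}, denoting the on-resistance of the write transistor of cell i, is introduced here for illustration only.

```latex
B \approx \frac{1}{2\pi R_{\mathrm{on},i}\, C_i}
```

With the input time constant of less than 0.5 ns quoted in Section IV, this estimate corresponds to a bandwidth above roughly 318 MHz. A wider write transistor lowers the on-resistance and raises B, but it also carries more channel charge, which is one way to see why bandwidth and switch error voltage are linked.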
In order to simplify the calibration and correction procedure, the uniformity of the sampling cell transfer characteristics must be considered. In the architecture presented here, memory cell … as is the case in [6]-[13]. Note that in the analyses presented in some of these references the error voltages from the sampling switches are neglected, which is not a valid assumption for a high input-bandwidth analog memory.
The influence of the size of the sampling cell capacitance on the memory cell gain can be derived from (4), and the gain variation across a channel as a function of capacitor mismatch is
The parasitic capacitance CPP is small compared to practical values for the cell sampling capacitance Ci. As a result, it is expected that the gain will be insensitive to the capacitance mismatch and thus uniform across a memory channel.
Since the cell capacitor nodes connected to the cell transistors remain floating after the write switch is turned off, care must be taken to ensure that no leakage occurs at those nodes, for all possible ac and dc input signals, during the entire write and read phases. To avoid subthreshold leakage, the maximum input voltage swing, ΔVin, in the write phase is limited to … (10)
For the same reason, the maximum voltage swing ΔVoi at the output of the amplifier must be less than VB - VL during the read phase. The corresponding limit for the input voltage swing during the write phase is then
From (10) it is apparent that the maximum voltage swing is limited by the value of the reference input voltage, VC, which must be chosen such that the sampling switch impedances are small enough to achieve the desired bandwidth, as given by (7). The bias level VB is set to avoid leakage during the read phase and to ensure that Vo does not exceed the amplifier output voltage range.
A common input switch, Min, could be used in the circuit of Fig. 2 because of the relatively small number of memory cells required in each channel for the intended application. This input switch must be large enough to achieve the desired bandwidth in the presence of the parasitic capacitance of the input bus and the combined capacitance of the addressed memory cells.
Finally, it should be noted that in the design presented herein, the turn-off time of the sampling switches Mwi is independent of the signal level, thus eliminating a timing error that would otherwise be present for high-frequency input signals.
III. CONTROL CIRCUIT DESCRIPTION
Traditionally, shift-registers have been used for write address control in analog memory circuits. At sampling rates above 100 MHz this approach is difficult to implement, and in the present design a starved inverter delay chain, illustrated in Fig. 4, is used instead. Such inverter chains have been employed previously in digital applications [22]. Each delay element in the chain consists of five MOS transistors, as indicated by the shaded box in Fig. 4. A write pulse applied at input Ain propagates through the delay elements, thereby producing the write address signals Φw1 through Φwn for the analog memory core. The delay of the write pulse through the chain is set by control voltage Vctr, which thus determines the write sample frequency. The minimum width of the write pulse Ain (minimum acquisition time of the memory cell) is constrained by the accuracy with which the analog signal is to be acquired and the input time constant of the sampling cell.
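The relationship between write pulse width, accuracy, and input time constant can be illustrated with a single-pole settling estimate. This is a minimal sketch, assuming the sampling cell behaves as a first-order RC network and that the worst case is a full-scale step; the 0.5 ns time constant is the upper bound quoted in Section IV, and the function name is hypothetical.

```python
import math

def min_acquisition_time(tau: float, bits: int) -> float:
    """Time for a single-pole RC response to settle to within 2**-bits
    of a full-scale step: exp(-t / tau) = 2**-bits."""
    return tau * bits * math.log(2)

tau = 0.5e-9  # s, upper bound on the input time constant (Section IV)
for bits in (8, 10, 12):
    t = min_acquisition_time(tau, bits)
    print(f"{bits:2d} bits: {t * 1e9:.2f} ns")
```

Under these assumptions, 12-bit settling takes about 4.2 ns, which is several times the 1.42 ns sampling interval reported later. In this simplified picture the write pulse would therefore span several delay elements, so that a few cells track the input at the same time.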
In order to ensure a delay, and thus sampling frequency, that is independent of variations in the fabrication process, a servo feedback circuit, also shown in Fig. 4, is used. The leading edge of a reference input signal Aref is compared to the trailing edge of the last write sample clock Φwn.
When the delay is less than the intended value, logic gate U1 turns transistor Mt on, which in turn … The speed with which Vctr, and thus the sampling frequency, can be adjusted is governed by the magnitudes of currents I1 and I2 and the size of capacitor C, which is selected to avoid perturbations from leakage currents of switches S1 and S2 between acquisition cycles. The net leakage current is given by the sum of the currents flowing through the four reverse-biased source/drain pn-junctions of S1 and S2 in Fig. 4. Switch S1 is included so that the voltage across C is modified only while the delays are being compared during the write phase. Switch S2 is added to ensure that Vctr remains constant during the write phase. S2 is turned on during the read phase in order to update Vctr in preparation for the next write cycle. The start-up time of the circuit is determined by the sizes of I1, I2, and C.
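The way such a servo loop pulls the chain delay toward the reference can be illustrated with a simple behavioral model. Everything numerical here is an assumption made for the sketch: the delay law of the starved inverter, the threshold voltage, the pump current, and the capacitor value are hypothetical, and a single current magnitude stands in for the two currents of the actual circuit. The polarity (higher control voltage gives shorter delay) is likewise assumed.

```python
def element_delay(v_ctr: float, k: float = 1.704e-9, v_th: float = 0.8) -> float:
    """Hypothetical starved-inverter model: delay falls as v_ctr rises."""
    return k / (v_ctr - v_th)

def servo(v_start: float, target: float, n: int = 32,
          i_pump: float = 100e-6, c_hold: float = 10e-12,
          cycles: int = 200) -> float:
    """Return the control voltage after `cycles` write cycles."""
    v = v_start
    for _ in range(cycles):
        total = n * element_delay(v)
        error = target - total  # positive when the chain is too fast
        # Charge-pump update: the current flows for a time equal to the
        # timing error, so the voltage step is I * error / C. A chain that
        # is too fast gets a lower control voltage, i.e. a longer delay.
        v -= i_pump * error / c_hold
    return v

target = 32 * 1.42e-9  # s, total delay for 1.42 ns per element
v_lock = servo(3.0, target)
print(f"locked control voltage: {v_lock:.3f} V")
print(f"element delay at lock:  {element_delay(v_lock) * 1e9:.3f} ns")
```

In this model a larger current or a smaller capacitor speeds up locking, at the cost of making the control voltage more sensitive to leakage and to each individual comparison.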
The readout of a memory channel is implemented with an on-chip two-phase shift-register together with the logic used to generate the read control signals Φr1 through Φrn, as illustrated in Fig. 5. The serial input signal Φin is shifted through a dynamic register by non-overlapping clocks Φsr1 and Φsr2. The enable signal Φen is used to disable the read addresses while the device is reset between readout of two successive memory cells, as shown in Fig. 3.
IV. EXPERIMENTAL RESULTS
The analog memory circuit has been fabricated in a 2-µm CMOS technology with poly-poly capacitors. Fig. 6 is a die photo of the prototype. Two channels with 32 memory cells in each were … This type of amplifier also provides sufficient gain, speed, and noise performance.
The performance of the analog memory was evaluated by driving the input with high-quality pulse, dc, and sinusoidal signal sources, digitizing the data read out from the memory with a commercial 16-b ADC, and transferring the acquired data to a workstation for processing.
The response of one channel to a 2-V input voltage step with a 3-ns rise time is shown in Fig. 7(a) and illustrates the operation of the device with VC and VB set to 2.5 V. The output signal alternates between the output levels of the 32 cells and the amplifier reset level VB as illustrated in Fig. 3. The delay feedback control signal Aref was adjusted to establish a sampling rate of approximately 700 MHz (1.42 ns between the turn-off of adjacent sampling transistors). The readout time for each cell was set at 11 µs, which is the conversion time of the 16-b ADC used in the test setup.
The settling time of the analog memory output to 0.1% is 1 µs. In Fig. 7(b) the output pulse is plotted as a function of input time, and the results agree with the input pulse monitored on an oscilloscope with respect to rise and fall time, pulse width, and the signal undershoot. The input time constant of the memory, defined as the product of the on-resistance of the write transistor and the capacitance Ci, was designed to be less than 0.5 ns for VC = 2.5 V.
The nonlinearity of the experimental circuit was measured by applying 38 equally spaced input voltages and fitting the output levels to a straight line using the least-squares method. Fig. 8(a) shows the output of a typical cell plotted as a function of input voltage over a range of 3 V, and in Fig. 8(b) the deviations are shown for the chosen input voltage range of 2.5 V. The maximum deviation is 0.7 mV, or 0.03 % of full scale.
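The linearity analysis described above can be sketched as follows. The data in this example are synthetic and are not the measured values: 38 equally spaced inputs over 2.5 V, an ideal gain of 0.9967, and an assumed quadratic bow of 1 mV peak. The function names are illustrative.

```python
from typing import List, Tuple

def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def max_nonlinearity(x: List[float], y: List[float],
                     full_scale: float) -> Tuple[float, float]:
    """Largest deviation from the fitted line, in volts and in % of full scale."""
    slope, intercept = fit_line(x, y)
    worst = max(abs(yi - (slope * xi + intercept)) for xi, yi in zip(x, y))
    return worst, 100.0 * worst / full_scale

# Synthetic test data (assumption): small parabolic bow on a linear response.
full_scale = 2.5
x = [full_scale * k / 37 for k in range(38)]
y = [0.9967 * xi + 1e-3 * 4 * (xi / full_scale) * (1 - xi / full_scale)
     for xi in x]

worst_v, worst_pct = max_nonlinearity(x, y, full_scale)
print(f"max deviation: {worst_v * 1e3:.2f} mV ({worst_pct:.3f} % of full scale)")
```

Applied to the reported figures, the same normalization gives 0.7 mV / 2.5 V, or about 0.03 % of full scale.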
The dynamic range is commonly defined as the maximum recordable signal divided by the baseline noise, which determines the smallest detectable signal. The noise voltage for the analog memory can be expressed as
where C is the capacitance at the inverting input node of the amplifier during readout, and veq is the input-referred noise voltage of the amplifier. The baseline noise of the analog memory was determined by recording the device response to repeated measurements with a constant input and calculating the mean square error. An RMS of 0.3 mV was obtained from sets of 100 repeated measurements, independent of the input signal level. The dynamic range of the device is therefore better than 8,000:1, or 13 bits.
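The dynamic range figure follows directly from the definition given above. The short calculation below assumes the 2.5 V input range chosen for the linearity measurement as the maximum recordable signal; the function name is illustrative.

```python
import math

def dynamic_range(full_scale: float, noise_rms: float):
    """Return the signal-to-noise ratio and its equivalent in bits."""
    ratio = full_scale / noise_rms
    return ratio, math.log2(ratio)

ratio, bits = dynamic_range(2.5, 0.3e-3)  # 2.5 V range, 0.3 mV RMS noise
print(f"{ratio:.0f}:1, {bits:.2f} bits")
```

With these inputs the ratio is a little above 8,000:1, or just over 13 bits.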
The cell-to-cell pedestal variations result in an RMS deviation of 1.8 mV across a channel. This is expected for the selected switch and capacitor sizes and the fabrication process used. In order to investigate whether the cell pedestals depend on the input signal, the responses to various dc input levels were recorded, and the response to one dc reference level was subtracted from these measurements. The differences for all 32 cells are plotted in Fig. 9 as a function of the input voltage.
In order to plot the data on the same scale, the nominal input level has been subtracted. Each data point represents the mean value from five measurements so that variations in the result due to baseline noise can be neglected. The RMS cell response variation after cell pedestal subtraction across the entire input signal range is only 0.3 mV, demonstrating that the sampling switch charge injection is independent of the dc input signal level and can be reduced to the level of the thermal noise by a simple subtraction.
The average gain, ΔVo/ΔVin, of a memory channel at low frequencies was measured to be 0.9967, with an RMS gain variation across the channel of 0.0001, as indicated by Fig. 10. Calibration of the channel therefore requires only a simple cell pedestal subtraction in order to achieve a precision of better than 12 bits for dc signals. The measured absolute gain agrees well with estimates based on (4).
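A quick way to see why no gain correction is needed is to compare the error caused by the gain spread with the size of a 12-bit step. The sketch assumes a 2.5 V full-scale input; the function names are hypothetical.

```python
def gain_spread_error(full_scale: float, gain_rms: float) -> float:
    """RMS output error at full-scale input caused by cell-to-cell gain spread."""
    return full_scale * gain_rms

def lsb(full_scale: float, bits: int) -> float:
    """Size of one least-significant bit for the given resolution."""
    return full_scale / 2 ** bits

err = gain_spread_error(2.5, 0.0001)
step = lsb(2.5, 12)
print(f"gain-spread error: {err * 1e3:.2f} mV, 12-bit LSB: {step * 1e3:.2f} mV")
```

Under this assumption the gain spread contributes about 0.25 mV at full scale, which is below the 12-bit step of about 0.61 mV and comparable to the measured baseline noise.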
The ac performance of the circuit has been quantitatively evaluated by applying free running sine waves of various amplitudes and frequencies at the analog input. Since the phase of the input signal was not synchronized to the sampling process, the results also provide a measure of the ac … primarily to two sections of the circuit, the delay chain and the feedback control. The total delay of the inverter chain is regulated by the feedback control circuit. The peak-to-peak timing jitter measured at the end of the 32-stage delay chain is less than 1 ns, which translates into 31 ps per sampling interval or delay element. This jitter corresponds to a sampling frequency error of 2 % at the 700 MHz rate and will decrease linearly with an increase in the number of delay elements. Additional error is introduced by cell-to-cell sampling time, or delay, variations. These variations have been estimated by fitting the sine wave with the individual delays of the elements as a parameter (with the same element delays for each of the 20 measurement sets). The best fit yielded an RMS value of 25 ps for the element delay variations across a channel. This timing error is due to delay element mismatch and is independent of the input signal level. This error can be corrected for if required. The approximate start-up time of the delay chain was measured to be less than ten cycles at a trigger rate of 120 Hz. For the intended application start-up effects are of no concern since the circuit is exercised long before input waveforms need to be acquired.
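The timing figures quoted above can be checked with a few lines of arithmetic, and a timing error can be translated into an amplitude error for a sine input. The 10 MHz, 1 V test signal in the last step is an assumed example and is not a measurement condition from this work; the function names are illustrative.

```python
import math

def jitter_per_element(total_pp: float, n: int) -> float:
    """Share of the end-of-chain jitter attributed to each delay element."""
    return total_pp / n

def relative_interval_error(jitter: float, interval: float) -> float:
    """Jitter expressed as a fraction of the sampling interval."""
    return jitter / interval

def voltage_error(freq_hz: float, amplitude_v: float, dt_s: float) -> float:
    """Largest error when a sine is sampled dt_s away from the ideal instant
    (maximum slope 2*pi*f*A times dt, valid for dt much shorter than 1/f)."""
    return 2 * math.pi * freq_hz * amplitude_v * dt_s

per_element = jitter_per_element(1e-9, 32)            # s
rel = relative_interval_error(per_element, 1.42e-9)   # fraction of interval
dv = voltage_error(10e6, 1.0, 25e-12)                 # assumed 10 MHz, 1 V sine

print(f"jitter per element: {per_element * 1e12:.2f} ps")
print(f"relative interval error: {rel * 100:.1f} %")
print(f"error from 25 ps delay mismatch: {dv * 1e3:.2f} mV")
```

The first two results reproduce the 31 ps and roughly 2 % figures. The last line suggests that, for faster or larger input signals, delay mismatch can exceed the baseline noise, which is where correcting the element delays becomes worthwhile.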
The performance of the analog memory is summarized in Table 1.
V. CONCLUSION
In analog waveform sampling applications switched-capacitor memories can provide superior performance with respect to cost, space, dynamic range, sampling rate, and power dissipation when compared to flash A/D converters and CCD devices. Present analog memory circuits are generally limited to sampling frequencies of 150 MHz. This paper has described a memory architecture that enables sampling rates from 200 MHz to 700 MHz by utilizing starved inverter delay elements with on-chip delay feedback compensation. In the proposed circuit, memory cell pedestals are independent of the input signal amplitude and can be eliminated by analog or digital subtraction. This is an especially important attribute in applications where digitization and subtraction are to be included on the same chip as the memory.
The proposed analog memory is a viable alternative to real-time analog-to-digital converters in applications where continuous acquisition is not required. The power dissipation of the device is orders of magnitude below that typical of commercial monolithic converters, which are presently limited to a dynamic range of 8 bits for rates exceeding 100 MHz.
ACKNOWLEDGEMENT
The authors wish to thank Dr. Dietrich Freytag for numerous helpful discussions.
Fig. 1. Block diagram of an analog waveform recorder with m memory channels.
Fig. 2. Simplified schematic of the analog memory circuit.
Fig. 3. Timing diagram for the (a) write and (b) read phase.
Fig. 4. Write control circuit with starved inverter delay elements and feedback compensation circuit.
Fig. 5. Read control circuitry.
Fig. 6. Prototype die photo.
Fig. 7. Response of one channel to a 2-V input voltage step with a 3-ns rise time sampled at 700 MHz. Pulse is plotted on an (a) read and (b) write time scale.
