Abstract--An ASIC wafer test system has been developed to provide comprehensive production screening of the ATLAS Semiconductor Tracker front-end chip (ABCD3T). The ABCD3T [1] features a 128-channel analog front-end, a digital pipeline, and communication circuitry, clocked at 40 MHz, which is the bunch crossing frequency at the LHC (Large Hadron Collider). The tester measures values and tolerance ranges of all critical IC parameters, including DC parameters, electronic noise, time resolution, clock levels and clock timing. The tester is controlled by an FPGA (ORCA3T) programmed to issue the input commands to the IC and to interpret the output data. This allows the high-speed wafer-level IC testing necessary to meet the production schedule. To characterize signal amplitudes and phase margins, the tester utilizes pin-driver, delay, and DAC chips, which control the amplitudes and delays of signals sent to the IC under test. Output signals from the IC under test go through window comparator chips to measure their levels. A probe card has been designed specifically to reduce pick-up noise that can affect the measurements. The system can operate at frequencies up to 100 MHz to study the speed limits of the digital circuitry before and after radiation damage. Testing requirements and design solutions are presented.
I. INTRODUCTION
The ATLAS Detector at the Large Hadron Collider (LHC), due to start data taking in 2006, is currently entering the production phase for all the detector subsystems. The Semiconductor Tracker (SCT) is one of the key subsystems, featuring an unprecedented tracking volume instrumented with silicon strip modules, with about 6 million channels to read out. The project requires about 50,000 front-end chips, which are mounted directly on the detectors in about 4,000 detector modules. Thorough IC testing is required to achieve acceptable module yields and avoid excessive effort for reworking of modules. Furthermore, the detector is subject to significant radiation damage. Since access is very difficult, one must ensure high reliability and sufficient performance margin to maintain adequate performance of the overall system. This requires an IC test system that goes beyond mere functional tests and also measures parameter spreads and performance margins in a multi-channel chip that combines both low-noise analog and fast digital circuitry.
Contact information is the following. Telephone: 1-510-486-5451; fax: 1-510-486-5101; e-mail: vafadeyev@lbl.gov
We have designed and built an ASIC wafer test system to provide production wafer screening of the ATLAS SCT front-end chip (ABCD3T). The features of the ABCD3T are introduced in Section II. The overview of the test system is given in Section III. The tests are described in Section IV.
II. THE ABCD3T CHIP
Here we give only a short overview of the main features of the chip. A more detailed description can be found elsewhere [1]-[4].
The readout chip combines 128 parallel channels of low-noise preamplification, pulse shaping and threshold discrimination with digital circuitry that performs on-chip data sparsification, data compression and readout, so that only struck channels are read out. The chip also includes on-chip charge calibration circuitry and on-chip DACs to control the discrimination thresholds. Time resolution of the chip matches the LHC bunch crossing time of 25 ns. This chip differs significantly from ICs available in the mass market in that it combines a large number of parallel analog channels with fast digital circuitry, so specialized test procedures are required.
Because of the moderate spatial resolution requirements, the IC utilizes a binary readout scheme where the signals from the silicon detector are amplified and then compared to a threshold. Only the result of this comparison, based on a hit or no-hit logic, enters the input register of the digital part. The binary readout has the advantages of reducing the power consumption, the cost and the volume of transmitted data. The data are latched into the input register in either edge or level sensing mode. All channels in a chip have a common threshold. The channel-to-channel matching, although acceptable for the non-irradiated chips, is expected to deteriorate significantly during the lifetime of the experiment [4]. To ameliorate the effect, an internal 4-bit trim DAC controlling the comparator offset is included in each channel. After the input register, the data enter a 132-cell deep digital pipeline. A trigger command copies the latest data from the pipeline into a readout buffer, which can hold the data from the last 8 triggers. The data compression logic examines the buffer and optionally sparsifies the data according to the time structure of the hits.
Each half of a module in the detector contains six ABCD3T chips, one of which is a "master" communicating with the outside world, and the rest are "slaves", transmitting the data to the neighbor until they reach the "master". Therefore, the digital part of each chip includes the data and token exchange circuitry. Most of the communication signals are duplicated to mitigate the effect of a circuit or a chip failure in the high radiation environment of the LHC.
We communicate with the chip by LVDS signals. CMOS inputs set the chip address and mode of operation. For test purposes, the DAC outputs are available on probe pads.
The performance of the analog part of the chip for each channel, such as gain, offset, noise and trim DAC linearity and ranges, is tested with the internal calibration circuitry. The data are read out through the digital circuit and analyzed later on as described in Section IV.A.
The digital functionality is tested by injecting input signals and comparing the output with the expected pattern. The chip design allows one to inject a channel mask instead of the real data to the input register or the digital pipeline, therefore isolating the preceding parts of the chip, including analog, from the test.
In addition, the power consumption and I/O signal properties are determined to qualify a working chip.
III. SYSTEM OVERVIEW

A. Components
The ASIC wafer test system, which consists of several custom-designed PC boards and control software, has been conceived and built to meet all of the ABCD3T testing requirements. The system architecture is presented in Fig. 1. A photograph of a test system in use is shown in Fig. 2.
The system is controlled by a GUI written in Visual C++, which runs on a PC under Microsoft Windows (W95/NT/2000). The GUI is based on the code for the previous version of the test system [5]. The PC communicates with the VME crate using the NI-VXI interface from National Instruments.
The crate houses a custom-designed VME board equipped with an ORCA3T FPGA of 186,000 gates. The FPGA interprets the VME commands, communicates with the ASIC, and controls most critical parameters in the system. The FPGA operates at 40 MHz and is synthesized from VHDL code. The data between the VME board and the probe card are transmitted as differential signals. Two intermediate boards, named pin driver and connector boards, have been designed and built to transfer and control the signals between the VME crate and the probe station. The connector board contains high-current, high-speed AD53040 pin drivers with variable output signal level to test ABCD3T inputs, and window comparators discriminating on the upper and lower ABCD3T output signal levels. The pin drivers have 5 mV linearity error and rise and fall times of 0.8 ns in our signal range. The pin driver board includes signal delay chips, DACs and an 8-channel, 12-bit AD7890 ADC with 1 mV measurement accuracy to probe the internal ABCD3T parameters: power consumption and feedback control of voltages supplied to the chip and ambient temperature of the pin driver board.
For all signals, the pin-driver and window comparator levels and signal delays are controlled by the FPGA through the DACs.
A probe card has been designed specifically to reduce the pick-up noise that can affect the analog measurements. Digital and analog signals are separated on different planes. An analog ground layer as well as a split digital/analog ground layer are used and low frequency filters are applied to the differential lines that control the analog part of the chip. All the decoupling capacitors are located as close as possible to the probe pins.
B. Design Considerations
To satisfy the schedule for the construction of the ATLAS SCT, the wafer screening production rate must exceed two wafers per day per testing site. In practical terms, that translates to the testing time of about 1.5 minutes or less per chip. The system design was heavily influenced by this speed requirement.
The systems used to test the front-end electronics for the silicon trackers of other experiments, such as CDF [6] and CMS [7], typically carry the chip raw output data over to a PC for later analysis. Sometimes data sparsification is used to relieve the burden on the data buses. The previous version of the system built in this way turned out to be too slow for our project, for two reasons. The first one is the necessity to read out large amounts of information to characterize the performance of the chip with binary readout and the special feature of the trim DACs. The other is the increased scale of the project, compared to the previous generation of experiments, combined with short production time and the low yield of the DMILL process.
The key advantage of this system is delegation of the data processing capabilities to the tester hardware. The FPGA chip on the VME board performs the low-level data processing, offering reduced information for transfer to the PC, as described in the following paragraphs. This avoids the data transfer bottleneck. Additional system capabilities, rarely used in testing electronics for high energy physics experiments, include the I/O signal tests and high frequency operation well beyond the operating frequency.
For the analog tests, the FPGA was programmed to issue a sequence of ABCD3T input commands, which can be looped over a given number of times. A typical sequence is the calibration pulse command followed by the trigger. The spacing between the commands is controlled by the user. Furthermore, a dedicated datastream-decoding algorithm was conceived to interpret the data from the chip and to extract the numbers of the channels having hits. A per-channel hit count is kept in a memory chip on the VME board. The number of hits per channel, together with the decoding algorithm error codes, is the only information from the test to be read out. This scheme greatly reduces the amount of information transferred from the VME board to the PC.
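The hit-counting scheme described above can be illustrated in software. The following is a minimal Python sketch, not the FPGA firmware: the function name `accumulate_hits` and the representation of a decoded event as a list of hit channel numbers are assumptions made for illustration only.

```python
from typing import Iterable, List

N_CHANNELS = 128  # channels per ABCD3T chip


def accumulate_hits(events: Iterable[Iterable[int]],
                    n_channels: int = N_CHANNELS) -> List[int]:
    """Return per-channel hit counts summed over all decoded events.

    Each event is assumed to be an iterable of hit channel numbers,
    i.e. the output of a (hypothetical) datastream decoder.
    """
    counts = [0] * n_channels
    for event in events:
        for channel in event:
            if not 0 <= channel < n_channels:
                raise ValueError(f"channel {channel} out of range")
            counts[channel] += 1
    return counts


# Three triggers: channel 5 fires every time, channel 7 fires once.
counts = accumulate_hits([[5], [5, 7], [5]])
print(counts[5], counts[7])
```

Only the 128 counters need to be transferred after the loop, regardless of how many triggers were sent, which is the source of the data reduction.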
The functions of the digital circuitry of the IC are verified using test vectors, which define the sequence of the IC's control line values for consecutive clock cycles. The IC's output signals are transmitted to the FPGA and compared with the expected data corresponding to that particular test vector. Only the binary result of the comparison is read out. Both the stimulus test vector and the expected chip response are stored in memory chips on the VME board, which makes it possible to run the test vector multiple times without reloading. This allows repetitive testing to detect soft errors. The expected chip response was obtained with Verilog simulation of the ABCD3T chip design. The FPGA algorithms do not interfere with the test vector content, which allowed for rapid test vector development without firmware changes.
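The compare-and-repeat idea behind the test vectors can be sketched as follows. This is an illustrative Python model under stated assumptions, not the tester's implementation: `count_failed_runs` is a hypothetical name, and the device under test is represented by an arbitrary callable that maps a stimulus sequence to an output sequence.

```python
from typing import Callable, Sequence


def count_failed_runs(chip: Callable[[Sequence[int]], Sequence[int]],
                      stimulus: Sequence[int],
                      expected: Sequence[int],
                      repeats: int) -> int:
    """Apply the same stimulus `repeats` times; return the number of failed runs.

    Only a pass/fail flag per run is kept, mirroring the binary comparison
    result that is read out. A nonzero count on a chip that usually passes
    would point to intermittent (soft) errors.
    """
    failures = 0
    for _ in range(repeats):
        if list(chip(stimulus)) != list(expected):
            failures += 1
    return failures


def inverter(bits: Sequence[int]) -> Sequence[int]:
    """Toy stand-in for a chip: inverts every input bit."""
    return [1 - b for b in bits]


failures = count_failed_runs(inverter, [0, 1, 1, 0], [1, 0, 0, 1], repeats=100)
print(failures)
```

Because the stimulus and the expected response are plain data, new test vectors can be added in this model without touching the comparison logic, which parallels the separation between vector content and firmware noted above.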
It has been shown that radiation damage slows some digital circuits [4]. To evaluate ABCD3T performance after irradiation, the test vectors are run at frequencies higher than the nominal 40 MHz. This capability is achieved by buffering the signals in IDT72V2113 FIFOs, which are able to perform the read/write operations at different frequencies. The FIFOs can hold up to 262,000 clock cycles' worth of data. The data exchange between the FPGA and the FIFOs is done at 40 MHz, and the data flow between the FIFOs and the probe card can be done at a different frequency, in the range between 40 and 100 MHz. The higher frequency is obtained using a phase locked loop circuit on the VME board. The frequency is controlled by the FPGA.
The FPGA program follows the rules of modular design, with separate blocks for the analog test capabilities, test vectors, frequency setup, DAC setup and ADC readout. As a result, the debugging of the system was greatly simplified, and some simultaneous operations were allowed, such as measuring the power consumption while sending triggers to the chip.
Consideration was given to the system reliability issues. The following critical parameters are recorded:
1) The digital and analog voltages on the ABCD3T.
2) The positioning of the probe card relative to the wafer during the test, as read back from the probe station after the initial wafer alignment.
3) The temperature of the connector board, which is the hottest place in the system, due to the presence of the high-current pin driver chips.
The software organization follows a staged design with clearly separated functions:
1) The acquisition software compresses the data and stores them in data files. There is no data analysis at this step, so the online computer is relieved of time-consuming calculations.
2) At the second stage, the data are analyzed using a powerful cluster of Linux computers at CERN. The parameters characterizing a chip are extracted. The chips are classified according to their performance. The ROOT [6] framework is extensively used at this stage. The data processing time is dominated by the S-curve fitting (Section IV.A), and is comparable to the testing time when done on a PC. The cluster is much more convenient to use than a single computer, due to the necessity to reprocess the data after bug fixes and software updates.
3) The chip parameters are stored in the SCT Oracle Database at the University of Geneva for later retrieval by module builders. The database is also used to keep track of the material. Initially, the wafers are received at CERN and, after initial inspection, sent to a testing site. After the test, the wafers are diced at a local company, and good chips are sent to one of the module building sites. The location of wafers and chips is registered with the database by a Java application upload of an ASCII form each time they are sent or received. This includes the physical location of the chips in a gelpack when sent to a module building site and the placement on a module.
Since there is no data analysis done by the online software, it is possible to waste testing time on low quality chips. We have developed two special procedures to speed up the screening in such cases. The first procedure tests the communication with the ABCD3T configuration register. All other tests are skipped if the communication is not successful. The second test scans the digital pipeline of the chip. The remaining tests can be skipped if the pipeline does not pass the test. We currently do not select on the results of this test, but reserve this possibility for the scenario of receiving wafers with continually low yield.
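The two early-exit procedures amount to a simple control flow, sketched below in Python. The function and key names (`screen_chip`, `config_register`, `pipeline_scan`) are hypothetical, and each test is modeled as a callable returning pass or fail. The pipeline cut is off by default, reflecting the statement that this selection is currently not applied.

```python
from typing import Callable, Dict

Test = Callable[[], bool]


def screen_chip(config_test: Test,
                pipeline_test: Test,
                remaining_tests: Dict[str, Test],
                skip_on_pipeline_failure: bool = False) -> Dict[str, bool]:
    """Run the screening sequence and return the result of every test run."""
    results = {"config_register": config_test()}
    if not results["config_register"]:
        return results  # no communication: skip everything else

    results["pipeline_scan"] = pipeline_test()
    if skip_on_pipeline_failure and not results["pipeline_scan"]:
        return results  # optional cut, reserved for low-yield wafers

    for name, test in remaining_tests.items():
        results[name] = test()
    return results


# A chip that does not answer on its configuration register is dropped at once.
dead_chip = screen_chip(lambda: False, lambda: True, {"analog": lambda: True})
# A chip with a bad pipeline is still fully tested while the cut is disabled.
weak_chip = screen_chip(lambda: True, lambda: False, {"analog": lambda: True})
print(dead_chip, weak_chip)
```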
Finally, a statistical study was done to tune the repetition parameters in order to minimize the wafer screening time while maintaining the necessary precision [7] .
IV. TESTING PROCEDURES
We test the performance of the analog and digital parts of the chip, their power consumption, several internal DACs, and the relative phases and levels of the I/O signals.
A. Analog Tests
The basic method for the characterization of the analog circuitry of the ABCD3T consists of scanning the threshold with a fixed calibration charge. This measures the integral of the (ideally) Gaussian resolution function. A typical "S-curve" from this measurement is shown in Fig. 3, fitted to a complementary error function. The 50% point indicates the threshold value, which corresponds to the injected charge after amplification, and the width characterizes the noise.
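To make the relation between the S-curve, the 50% point and the noise concrete, the following Python sketch models the ideal occupancy as a complementary error function and recovers both parameters from a simulated scan. It deliberately swaps the fit for a simpler estimate based on the mean and r.m.s. of the differentiated curve, so it should be read as an illustration rather than the production analysis. All numbers and names are assumptions.

```python
import math
from typing import Sequence, Tuple


def s_curve(threshold: float, mu: float, sigma: float) -> float:
    """Ideal hit occupancy at a given threshold (complementary error function)."""
    return 0.5 * math.erfc((threshold - mu) / (math.sqrt(2.0) * sigma))


def estimate_s_curve(thresholds: Sequence[float],
                     occupancy: Sequence[float]) -> Tuple[float, float]:
    """Estimate (50% point, noise sigma) from the differentiated S-curve.

    The drop in occupancy between neighbouring thresholds approximates the
    Gaussian resolution function; its mean and r.m.s. give the two parameters.
    This is less robust than a fit when the data are statistically noisy.
    """
    mids = [0.5 * (a + b) for a, b in zip(thresholds, thresholds[1:])]
    drops = [a - b for a, b in zip(occupancy, occupancy[1:])]
    total = sum(drops)
    mu = sum(m * d for m, d in zip(mids, drops)) / total
    var = sum(d * (m - mu) ** 2 for m, d in zip(mids, drops)) / total
    return mu, math.sqrt(var)


# Simulated noiseless scan in arbitrary threshold units (step of 1).
thresholds = [float(t) for t in range(60, 141)]
occupancy = [s_curve(t, mu=100.0, sigma=5.0) for t in thresholds]
mu_est, sigma_est = estimate_s_curve(thresholds, occupancy)
print(mu_est, sigma_est)
```

The estimated width comes out slightly larger than the true value because of the finite threshold step. With real data, where each point carries binomial fluctuations from a finite number of triggers, a fit to the model function is the more reliable choice.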
A total of 13 histograms is taken for each channel of each chip (over 400,000 histograms per wafer) to extract the gain, offset, noise and the trim DAC performance. For each bin of the histogram, 200 triggers are sent to the chip under test.
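As a cross-check of the quoted histogram count, using the 128 channels per chip and the 256 chips per wafer stated in Section VI:

```latex
N_{\text{hist}} = 13 \times 128 \times 256 = 425\,984
```

This is consistent with the figure of over 400,000 histograms per wafer.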
To measure the gain and offset, we do the scan at four different calibration input charges. From the 50% points the response curve is plotted and fitted to a straight line. The gain is taken as the slope of the fit (Figs. 4 and 5).
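The gain extraction reduces to a straight-line fit of the 50% points against the injected charge. A minimal Python sketch follows. The unweighted least-squares formula is a standard one, while the charge and threshold values are made-up numbers chosen only to exercise the code, not measured ABCD3T data. Taking the offset as the intercept of the fit is also an assumption here.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical scan: four injected charges (fC) and their 50% points (mV).
charges_fc = [1.5, 2.0, 3.0, 4.0]
vt50_mv = [115.0, 140.0, 190.0, 240.0]
gain, offset = fit_line(charges_fc, vt50_mv)
print(gain, offset)  # slope in mV/fC, intercept in mV
```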
To characterize the trim DACs, we do the scan of DAC bits for all channels and a fixed calibration pulse. A measure of linearity is obtained in a procedure analogous to the gain extraction. To verify the trim DAC ranges, we loop over the four range settings.
With the test parameters chosen for production testing, the accuracy of noise measurement is dominated by the 9% statistical uncertainty. The precision of the gain and trim DAC linearity measurements is dominated by 5% error on the calibration pulse amplitude. Their statistical uncertainty is less than 1%.
Fig. 3. Example of a threshold scan for a single channel with fixed injected charge. This is a basic method to study the performance of a binary chip. The midpoint of the transition gives the value of the threshold equal to the injected pulse after amplification. The width of the transition measures the electronic noise. The histogram is fitted to a complementary error function.
B. Digital Tests
The test vectors have been developed to characterize the performance of different parts of the digital circuit. The following set is used in the test:
• Configuration register R/W test.
A value is written to the ABCD3T configuration register, then the chip is set to present the content of the register on the output, and the values are compared. All bits of the ABCD3T configuration register are scanned.
• Addressing, beam counter and the error code test.
The chip possesses an addressing scheme to distinguish different chips on a module. The counter of the LHC beam crossings is embedded in the output data to separate the information from different events. The bits of both counters are scanned. In addition, abnormal conditions are created and the chip response is verified.
• Data compression logic test.
Functionality of different criteria for data sparsification is verified. Hits are placed in different time slices relative to the trigger.
• Dynamic digital pipeline test.
The digital data are supplied to the input register of the digital circuit according to a channel mask. The time structure of the pipeline is scanned.
• Static digital pipeline test.
The digital data are supplied directly to the pipeline, avoiding the input register.
• Redundant command line test.
The dynamic pipeline test is performed using the redundant input command and clock lines.
• Data and token (by)passing circuitry test.
The inter-chip communication functions are tested.
The test vectors are run at a set of frequencies from the nominal 40 MHz up to 90 MHz to ensure sufficient performance margin for radiation damage. A scan over the digital voltage within 5% of the nominal 4.0 V is also performed. The lower range of the scan is extended to monitor the quality of the probe card contact with the chip pads. Contact problems are revealed by a nonmonotonic dependence of the test vector efficiency on the voltage.
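The contact-quality check can be expressed as a monotonicity test on the efficiency measured along the voltage scan. The Python sketch below assumes that, with good contact, the efficiency does not decrease as the digital voltage is raised. The function name, the tolerance parameter and the example values are all hypothetical.

```python
from typing import Sequence


def contact_suspect(efficiencies: Sequence[float], tolerance: float = 0.0) -> bool:
    """Return True if the efficiency ever drops as the voltage is raised.

    `efficiencies` must be ordered by increasing digital voltage. A drop larger
    than `tolerance` between neighbouring points marks the scan as nonmonotonic.
    """
    return any(later < earlier - tolerance
               for earlier, later in zip(efficiencies, efficiencies[1:]))


good_scan = [0.0, 0.2, 0.9, 1.0, 1.0]  # efficiency rises and saturates
bad_scan = [0.0, 0.6, 0.3, 1.0, 1.0]   # dip in the middle of the scan
print(contact_suspect(good_scan), contact_suspect(bad_scan))
```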
C. Power Consumption
We determine the power consumed by the chip by measuring the currents, separately for the digital and analog parts. We calibrated each test system to within 1% in this measurement. To simulate the conditions of the experiment, a 100 kHz trigger rate is set while a random 3% of channels have hits. The measurement is done separately for the "master" and "slave" modes. Chips with power consumption outside a 30% margin around the wafer mean value are rejected.
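The rejection rule on power consumption can be written in a few lines. In this Python sketch the wafer mean is computed once over all measured chips, including any outliers. Whether the actual analysis iterates or excludes failed chips from the mean is not stated, so that choice is an assumption, as are the example values.

```python
from typing import List, Sequence


def reject_by_power(power: Sequence[float], margin: float = 0.30) -> List[bool]:
    """Flag chips whose power deviates from the wafer mean by more than `margin`.

    Returns one boolean per chip: True means the chip is rejected.
    """
    mean = sum(power) / len(power)
    return [abs(p - mean) > margin * mean for p in power]


# Hypothetical per-chip power values in arbitrary units; the last one is high.
wafer_power = [400.0, 410.0, 390.0, 405.0, 700.0]
rejected = reject_by_power(wafer_power)
print(rejected)
```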
D. Internal DACs
We verify the performance of comparator threshold voltage, input transistor bias current and shaper current DACs by doing a full scan of the respective DAC bits and measuring the voltage values with the test system ADC. The linearity of the DACs is extracted and compared with specifications. The error on each measurement is 1%. It stems from the uncertainty in the values of resistors in the amplifiers used to match the signal ranges with the ADC inputs.
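One common way to quantify DAC linearity from a full code scan is the largest deviation from the straight line joining the first and last measured points. The Python sketch below uses that end-point definition as an assumption. The text does not specify which linearity measure is applied, and the voltages are invented for illustration.

```python
from typing import Sequence


def max_endpoint_deviation(voltages: Sequence[float]) -> float:
    """Largest deviation of a DAC scan from its end-point straight line.

    `voltages[code]` is the measured output for each consecutive DAC code.
    The result is in the same units as the input.
    """
    n = len(voltages)
    if n < 2:
        raise ValueError("need at least two DAC codes")
    step = (voltages[-1] - voltages[0]) / (n - 1)
    return max(abs(v - (voltages[0] + step * code))
               for code, v in enumerate(voltages))


# Hypothetical 4-bit scan: ideal 10 mV steps, then one code displaced by 3 mV.
ideal = [10.0 * code for code in range(16)]
bent = list(ideal)
bent[8] += 3.0
print(max_endpoint_deviation(ideal), max_endpoint_deviation(bent))
```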
E. I/O Signals Tests
We test the properties of ABCD3T input or output signals by using test vectors, stimulating the calibration signal and varying the test conditions.
We test the phases of the input signals, relative to the input clock, by scanning their delays. Setup and hold time requirements for inputs are verified in this way.
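A delay scan of this kind yields a pass/fail result at each delay setting, and the timing margin is the extent of the passing region. The Python sketch below finds the longest contiguous run of passing settings. The function name and the example scan are hypothetical, and the conversion from window edges to setup and hold times depends on system details not modeled here.

```python
from typing import Optional, Sequence, Tuple


def passing_window(delays_ns: Sequence[float],
                   passed: Sequence[bool]) -> Optional[Tuple[float, float]]:
    """Return (first, last) delay of the longest contiguous passing run.

    Returns None if no delay setting passes.
    """
    best = None        # (first_index, last_index) of the longest run so far
    run_start = None   # index where the current passing run began
    for i, ok in enumerate(list(passed) + [False]):  # sentinel closes a final run
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if best is None or (i - run_start) > (best[1] - best[0] + 1):
                best = (run_start, i - 1)
            run_start = None
    if best is None:
        return None
    return delays_ns[best[0]], delays_ns[best[1]]


# Hypothetical scan in 1 ns steps: passes from 2 to 6 ns, plus a stray pass at 8 ns.
delays = [float(d) for d in range(10)]
results = [False, False, True, True, True, True, True, False, True, False]
window = passing_window(delays, results)
print(window)
```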
We test the phases of the output signals, relative to the input clock, by scanning the delay for the data receiving register latching clock and using the measurements of the signal propagation times in the system.
We vary the swing of the input differential signals to measure the minimal working value.
We vary the thresholds of the window comparators, placed after differential operational amplifiers, to measure the swing of the output differential signals.
Finally, we vary the duty cycles of the input clocks from 40 to 60% to ascertain the stability of the chip performance against the clock shape variation.
The specifications of the I/O properties do not reject a sizable fraction of the chips. However, it is important to measure them to monitor fabrication quality, and to select better-matched chips when building a module. The latter motivation stems from the fact that the chips on a module need to communicate with each other, and from the expectation that the output signal delays may exceed the span of one clock period after the total radiation dose. An example of the distribution of an output signal phase for a wafer is shown in Fig. 6. One can see that even chips from the same wafer exhibit sizable timing variation.
V. ACCEPTANCE CRITERIA
We only accept chips that pass the digital tests for frequencies of 40 and 50 MHz at a digital voltage between 3.8 and 4.2 V. All chip channels must respond to the calibration pulses. We further select on the following criteria:
• the channel noise and the average chip noise
• the average chip gain
• the trim DAC linearity
• the average trim DAC ranges for a chip
• the channel offset
• the threshold DAC linearity
• the bias and shaper current DAC linearities
• the chip power consumption
Chips meeting the requirements with all channels functional are selected for module construction. We also select a sample of chips with one bad channel, which are otherwise functional, to use for various studies and system development. The size of this sample is similar to that of the chips accepted for the experiment. 
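The selection logic described in this section can be summarized as a three-way classification. The Python sketch below is a simplified illustration: the function name, the category labels and the reduction of all parameter cuts to a single boolean are assumptions, and the actual cut values are not reproduced here.

```python
def classify_chip(digital_ok: bool, parameters_ok: bool, n_bad_channels: int) -> str:
    """Classify a chip as 'module', 'study' or 'reject'.

    digital_ok:      digital tests passed at the required frequencies and voltages
    parameters_ok:   all analog, DAC and power criteria are met
    n_bad_channels:  number of channels failing the channel-level requirements
    """
    if not (digital_ok and parameters_ok):
        return "reject"
    if n_bad_channels == 0:
        return "module"   # used for module construction
    if n_bad_channels == 1:
        return "study"    # kept for studies and system development
    return "reject"


print(classify_chip(True, True, 0), classify_chip(True, True, 1))
```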
VI. CONCLUDING REMARKS
To shoulder the workload, three test systems have been deployed at the following ATLAS Semiconductor Tracker institutions: University of California at Santa Cruz, CERN and Rutherford Appleton Laboratory. The differences between systems are restricted to the different probe station models used at each location. The performance of the testers has been verified using a common set of reference wafers. The yield was found to agree within 1% for all test sites.
The production of wafers with final version of the ABCD3T chips has started, and the first batch has been screened.
The test time was optimized. The time to screen one wafer containing 256 ABCD3T chips is about 5 hours. The tests requiring the most time are characterization of the analog performance of the chip, digital tests and I/O tests. They take 53, 29 and 10% of the time, respectively. During the tests, most of the time is spent on communication with the chip.
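The quoted figures can be turned into a per-chip time and a per-category breakdown with simple arithmetic, shown here as a short Python calculation. The inputs are the numbers given above. The variable names are arbitrary, and the result is only as precise as the "about 5 hours" estimate.

```python
WAFER_TIME_H = 5.0      # approximate screening time per wafer, hours
CHIPS_PER_WAFER = 256
SHARES = {"analog": 0.53, "digital": 0.29, "io": 0.10}  # fractions of total time

# Average testing time per chip, in seconds.
seconds_per_chip = WAFER_TIME_H * 3600.0 / CHIPS_PER_WAFER

# Time spent on each test category per wafer, in minutes.
minutes_per_test = {name: share * WAFER_TIME_H * 60.0
                    for name, share in SHARES.items()}

print(round(seconds_per_chip, 1), minutes_per_test)
```

The result of roughly 70 s per chip lies within the target of about 1.5 minutes per chip given in Section III.B.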
VII. ACKNOWLEDGMENT
We would like to thank W. Dabrowski from FPNT UMM, Cracow, Poland for invaluable discussions and suggestions especially in the area of the IC studies. We thank E. Spencer from UCSC and R. Matson from RAL for help in deploying the system at the testing sites, and their continuous feedback.
