A front-end readout system with a custom backplane and custom circuit modules has been developed for the RICH subsystem of the PHENIX experiment. The design specifications and test results of the backplane and the modules are presented in this paper. In the module design, flexibility for modification is maximized through the use of Complex Programmable Logic Devices. In the backplane design, a source synchronous bus architecture is adopted for the data and control buses.
I. INTRODUCTION
The RICH subsystem of the PHENIX experiment at the RHIC accelerator of Brookhaven National Laboratory is the primary device for electron identification; it has 5120 photomultiplier tubes (PMTs) to measure Cherenkov photons [1]. Both charge and timing information from each PMT are recorded. A dynamic range of up to 10 photoelectrons is required for the charge information, which is digitized with 10-bit resolution. A timing resolution of better than 200 ps is required to reduce background hits caused by electrons originating elsewhere than the interaction vertex. The peak interaction rate is about 14 kHz in Au+Au collisions and will reach about 10 MHz in p+p collisions at the luminosity of the future upgrade. A system combining the Level-1 (LVL-1) trigger with an analog memory unit (AMU) has been designed to achieve deadtime-less data acquisition under these conditions. The RICH readout system is a newly designed system, customized for the RICH subsystem to maximize readout speed and implementation density.
LVL-1 trigger generation, data collection and buffering, and formatting and sending data to the data collection module (DCM). To achieve an event rate of 25 kHz, the front-end readout system must complete the data transfer within 40 µs. The RICH LVL-1 trigger is one of the six local LVL-1 triggers, which must be generated at every bunch crossing. In the case of the RICH detector, the LVL-1 trigger signal is formed as an 80-bit serial data packet digitized from the summation of 20 photomultiplier tube signals. Sixty-four channels of photomultiplier tube signals are digitized in a single AMU/ADC Module, where analog memories continuously hold an integrated charge signal and a time-to-amplitude converted (TAC) signal at each bunch crossing. Once a global LVL-1 signal is received, the stored voltages corresponding to the trigger are converted to digital data and buffered in the Readout Modules. The acquired data are also formed into a serial data packet and sent to the DCM.
The Controller Module performs all of the functions associated with front-end electronics control, data collection, command interpretation/execution, and communication. In the normal operation mode, the control and timing signals are received from the MTM. In the non-operation mode, the slow control, i.e., initialization, reconfiguration of programmable devices, parameter setting, etc., is performed by the ARCNET via a copper twisted-pair line.
A symmetric bus architecture is adopted for the system bus, which consists of the Common Bus for control signals and the Separate Bus for data transfer, on both sides of the central Controller Module. These buses are operated in the so-called 'source synchronous timing mode' instead of the 'common-clock timing mode' used in usual synchronous systems. In contrast, precisely synchronized timing signals such as the AMU-write or TAC-stop signals are provided independently to the modules from the front panel using twisted-pair cables whose total delay is equalized by adjusting the cable length. These timing signals are formed on the Clock Distributor daughter Board (CDB).
The initialization after reset or power-up is performed through the K2, either over a serial line to the backplane or over a dedicated line to a chip. The K2 is an I/O extension for the COM20051 controller on GAB. The reconfiguration of the SRAM-type Complex Programmable Logic Devices (CPLDs) used for the DSU and the ALM is also managed by the COM20051 through the K2. Reconfiguration data are transmitted from the online control system via the ARCNET. The EEPROM-type CPLDs used for the Clocker, the K2 and the BTS are not reconfigured via the ARCNET but are configured before installation.
III. FRONT-END MODULES AND CIRCUITS

A. AMU/ADC Module
The operational codes (MODE-BITS), the global LVL-1 trigger signal (LVL1-Accept), and the readout enable signals (ENDAT) are passed to the DSU via the Clocker. The DSU decodes the MODE-BITS and schedules the data collection. According to the MODE-BITS information, the DSU controls the BTS, the ALM, and all other modules. The DSU also generates header information, such as an event number and an AMU cell address, when the LVL1-Accept is received. All exception handling is also managed by the DSU.
The LVL1-Accept and ENDAT are transmitted from the DSU to the ALM and the Readout Module only during the operation mode. Once the LVL1-Accept is received, the DSU asserts an A-D conversion signal (ADC-Start) to the AMU/ADC Modules after the ALM outputs the AMU address on the bus. After the conversion, the DSU asserts a burst transfer request (Transfer-Go) if no burst transfer transaction is in progress. The BTS handles all burst transfer transactions in response to the Transfer-Go signal from the DSU. On completion of a burst transfer transaction, the BTS returns the Transfer-Rdy signal, which indicates that the next transfer can be accepted. The BTS returns the Transfer-Err signal to the DSU if the burst transfer has finished with an error. The mechanism of the burst transfer is explained in the following section.
The ALM manages the AMU cell addresses during write and read operations. The write operation is repeated in synchronization with the beam clock (BC), while the read operation is started only when the LVL1-Accept signal is received. The ALM keeps the addresses available for writing and also keeps the addresses waiting for A-D conversion. Figure 5 shows a block diagram of the ALM. The ALM consists of two FIFOs and a shift register. AMU cell addresses for the write operation are kept in the available address FIFO. The shift register is used to delay the addresses by the LVL-1 latency; the AMU cell addresses are shifted with the beam clock. After this delay, an AMU cell address emerges at the exit of the shift register whether or not the LVL1-Accept signal is received. The AMU cell address is moved to the accepted address FIFO when the LVL1-Accept signal is received, and is returned to the available address FIFO for reuse when the LVL1-Accept signal is not received. The ALM is controlled by the DSU. In the START mode the ALM circulates cell addresses and waits for the LVL1-Accept signal.
The A-D conversion is started by the DSU when the LVL1-Accept signal is received, at which point the AMU-Read-Add is output to the bus. After the A-D conversion, the ALM pushes the used address back into the available address FIFO when the Push-Address signal is asserted by the DSU. In the STOP mode, an initialization of the available address FIFO (FIFO-Init) must be performed after a reset or power-up to load the AMU cell addresses 0 through 63. The DSU and the ALM are implemented on ALTERA FLEX 10K20 and 10K10 devices [5], respectively. Since the FLEX 10K has Embedded Array Blocks of RAM, the width and depth of the FIFOs and of the latency shift register can be reconfigured from a remote computer via the ARCNET.
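The address circulation described above can be illustrated with a minimal software sketch. It is only a behavioral model written for this text, not the CPLD logic: the class name, method names, and the latency value used in the usage lines are hypothetical, and one write per beam clock is assumed.

```python
from collections import deque


class AddressListManager:
    """Toy behavioral model of the ALM: two FIFOs plus a latency shift register.

    All names here are illustrative; the real ALM is CPLD logic.
    """

    def __init__(self, n_cells=64, latency=4):
        # FIFO-Init: load cell addresses 0 .. n_cells-1 into the available FIFO.
        self.available = deque(range(n_cells))
        # Addresses selected by LVL1-Accept and waiting for A-D conversion.
        self.accepted = deque()
        # Shift register that delays each address by `latency` beam clocks
        # (the latency value is an assumption, not taken from the paper).
        self.pipeline = deque([None] * latency)

    def beam_clock(self, lvl1_accept=False):
        """Advance one beam clock and return the cell address written."""
        write_addr = self.available.popleft()
        self.pipeline.append(write_addr)
        exiting = self.pipeline.popleft()
        if exiting is not None:
            if lvl1_accept:
                self.accepted.append(exiting)   # keep the cell for readout
            else:
                self.available.append(exiting)  # recycle the cell
        return write_addr

    def read_address(self):
        """Address to put on the bus for A-D conversion (AMU-Read-Add)."""
        return self.accepted[0]

    def push_address(self):
        """After conversion, return the used address to the available FIFO."""
        self.available.append(self.accepted.popleft())


alm = AddressListManager(n_cells=64, latency=4)
for _ in range(4):
    alm.beam_clock()                # cells 0..3 are written, none exits yet
alm.beam_clock(lvl1_accept=True)    # cell 0 leaves the delay line and is kept
print(alm.read_address())
alm.push_address()                  # cell 0 becomes available again
```

In this model the number of addresses is conserved across the two FIFOs and the delay line, which mirrors the reuse scheme described in the text; the latency must stay below the number of cells so that the available FIFO never runs empty.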
C. Readout Module
The Readout Module collects data from the AMU/ADC Modules and sends the formatted data to the DCM. Two Readout Modules are connected to the left and right 2 × 32-bit Separate Buses. Each module has two identical circuits, in its upper and lower parts, corresponding to the pairs of AMU/ADC chips for charge and TAC measurement. Header information and charge data are transferred to the upper and lower Charge FIFOs in parallel. The header information and TAC data are likewise transferred to the upper and lower TAC FIFOs. The data packet is built by the Readout Controller in the 
D. LVL-1 Trigger Module
The LVL-1 Trigger Module generates the local LVL-1 trigger signal and sends it to the Global LVL-1 Trigger Module via the G-LINK. In the RICH subsystem, the current-sum of 20 PMT signals is used as the trigger to find a ring image of the Cherenkov radiation. The current-sum of four PMT signals is output from each Integrator/TAC chip to the backplane. Furthermore, the current-sums from five AMU/ADC Modules are summed again, simply by connecting them to the same line, and the result is sent to the LVL-1 Trigger Module. Sixteen sum signals, i.e., 320 channels of PMT signals, are handled by a single LVL-1 Trigger Module. Reference current signals from the Integrator/TAC chip are also provided to the LVL-1 Trigger Module. The range of the current signal is from 2.5 mA to 7.5 mA, corresponding to 0 to 70 photoelectrons, and the reference current is 7.5 mA. The currents are converted to voltages by pull-up resistors and are fed into an AD8002 operational amplifier, which converts the differential voltage with unity gain in order to match the input voltage range (1.6 to 3.6 V) of the AD876 Flash Analog-to-Digital Converter (FADC). The AD876 is a 10-bit CMOS FADC with an on-chip sample-and-hold amplifier [6].
IV. BUS ARCHITECTURE AND PERFORMANCE
A. Source Synchronous Bus Architecture
In recent years, the source synchronous bus has been actively discussed for high-clock-rate board designs and for VME bus extensions. In the usual common-clock timing mode, the clock is generated elsewhere in the system and is used both to launch data out of the driver and to latch it into the receiver. The maximum operating speed can be estimated by
$$\mathrm{Period} = T_{\mathrm{driver}} + T_{\mathrm{interconnect}} + T_{\mathrm{setup}} + T_{\mathrm{skew}} \qquad (1)$$
where $T_{\mathrm{driver}}$ is the driver's output valid delay, $T_{\mathrm{interconnect}}$ is the interconnect delay, $T_{\mathrm{setup}}$ is the receiver's input setup time, and $T_{\mathrm{skew}}$ is the skew between the clock at the driver and at the receiver. $T_{\mathrm{interconnect}}$ and $T_{\mathrm{skew}}$ depend strongly not only on the signal trace through the backplane but also on the capacitive load of the inserted modules.
In the source synchronous system design, the timing differences of the clock and the signal compensate each other because both are generated on the Controller Module (see Figure 1), and the speed of the bus is now given by $\mathrm{Period} = T_{\mathrm{driver}} + T_{\mathrm{interconnect}} + T_{\mathrm{setup}} +$ 
Since the acquired data and the header information are transferred from the AMU/ADC Modules or from the Controller Module, a timing mismatch between the clock and the signal would arise if the clock always came from a single module. To avoid this in our source synchronous system, the clock is generated on the same module that sources the data.
The bus speed is estimated with equation (1) applied inside the module itself. In other words, if each module can operate within the clock rate given by equation (1), the system is expected to operate correctly. Hence the system can be operated at the lowest clock rate among all of the modules. Note, however, that the actual speed depends on the noise and on the uncertainty in the delays of the parts.
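As a rough illustration of this estimate, the sketch below evaluates equation (1) per module and takes the slowest module as the limit for the whole bus. The function names and all delay values are hypothetical placeholders chosen for the example; they are not measurements from this system.

```python
def min_period_ns(t_driver, t_interconnect, t_setup, t_skew):
    """Equation (1): minimum clock period as the sum of the four delays (ns)."""
    return t_driver + t_interconnect + t_setup + t_skew


def system_max_clock_mhz(module_delays):
    """The bus can only run at the lowest clock rate supported by any module."""
    worst_period_ns = max(min_period_ns(*d) for d in module_delays.values())
    return 1e3 / worst_period_ns


# Hypothetical delays in ns: (driver, interconnect, setup, skew).
modules = {
    "AMU/ADC": (8.0, 4.0, 3.0, 1.0),
    "Controller": (10.0, 5.0, 4.0, 1.0),
    "Readout": (9.0, 4.0, 3.0, 1.0),
}

print(system_max_clock_mhz(modules))
```

With these placeholder numbers the Controller entry has the longest period and therefore sets the limit; noise and part-to-part delay spread would reduce the usable rate further.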
C. Burst Transfer Mechanism
Four channels of digitized data (16 bits per channel) are transferred to a Readout Module in parallel via the Separate Bus during one 4BC cycle. Thirty-two channels of 64-bit data are sequentially transferred from an AMU/ADC Module to the Readout Module during one burst transaction cycle (32 clocks plus overhead), indicated by a Burst-Frame signal that is driven by the Burst Transfer Sequencer (BTS). The burst transfer transaction is performed on both sides of the module simultaneously using the same Burst-Add and Burst-Frame signals. 800 Mbit/s. An analog-to-digital conversion completes within 5 µs after the LVL1-Accept signal is received, operating in 10-bit mode with a 100 MHz (200 MHz internal) conversion clock. Since the integrated charge is calculated from the difference between the signal sample (post-event) and the baseline sample (pre-event), two sets of data are transferred in one event cycle. For this reason, it takes another 5 µs plus overhead to convert the post-event sample. However, that conversion can be started as soon as the pre-event data are latched in the output register of the AMU/ADC chip. As soon as the post-event header is generated by the DSU, the BTS starts the burst transfer transaction. Burst transfers of the header and of the event data are completed in around 1 µs and within 4.3 µs plus overhead, respectively. The data transfer to the DCM starts immediately if the ENDAT signal is enabled. The data packet is built in the order header (9 words), pre-charge (320 words), post-charge (320 words), post-TAC (320 words) and trailer (10 words), and takes less than 20 µs per event.
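The packet layout and the burst payload quoted above can be tallied with a short sketch. The section sizes and the figures of 4 channels, 16 bits and 32 clocks are taken from the text; the function and variable names are illustrative only, and no assumption is made about the word width on the link to the DCM.

```python
# Packet sections in transmission order, with their lengths in words.
PACKET_LAYOUT = [
    ("header", 9),
    ("pre-charge", 320),
    ("post-charge", 320),
    ("post-TAC", 320),
    ("trailer", 10),
]


def packet_words(layout=PACKET_LAYOUT):
    """Total number of words in one event packet."""
    return sum(length for _, length in layout)


def section_offsets(layout=PACKET_LAYOUT):
    """Word offset at which each section starts inside the packet."""
    offsets = {}
    position = 0
    for name, length in layout:
        offsets[name] = position
        position += length
    return offsets


def burst_payload_bits(channels_per_clock=4, bits_per_channel=16, clocks=32):
    """Bits moved on one side of the Separate Bus in one burst transaction."""
    return channels_per_clock * bits_per_channel * clocks


print(packet_words())
print(section_offsets())
print(burst_payload_bits())
```

The offsets are useful mainly as a check when unpacking a packet: each data section has the same length, so a mismatch in the total word count points to a framing problem rather than to a bad data value.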
D. Data Transmission Speed
successively sent out. The transfer times for the pre-charge, post-charge and post-TAC data are 5 µs each. Thus, the total processing time for each event is expected to be less than 30 µs (20 µs in the future option using 2 G-LINK ports). With the pipeline processing, the transfer time is expected to be
requirement for the board design is greatly reduced even for high clock rate operation. The transfer speed of the backplane has reached 640 Mbytes/s with a 128-bit data bus.
The prototype boards have been tested and the final board design is in progress. The performance of the backplane has been tested, and the digitized data were successfully transferred at a 40 MHz clock without errors.
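The quoted figures can be cross-checked with simple arithmetic. The sketch below assumes one transfer per clock across the full bus width, which is an assumption made here for the check; the function names are illustrative.

```python
def bus_throughput_mbytes_per_s(bus_width_bits, clock_mhz):
    """Peak throughput, assuming one full-width transfer per clock."""
    return bus_width_bits / 8 * clock_mhz


def event_period_us(event_rate_khz):
    """Time available per event at a given event rate."""
    return 1e3 / event_rate_khz


# 128-bit data bus at the tested 40 MHz clock.
print(bus_throughput_mbytes_per_s(128, 40))

# 25 kHz event rate, to compare with the expected processing time per event.
print(event_period_us(25))
```

Under this assumption a 128-bit bus at 40 MHz gives the 640 Mbytes/s stated for the backplane, and a 25 kHz event rate leaves 40 µs per event, which is longer than the expected processing time of less than 30 µs.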
The total processing time for each event is expected to be less than 30 µs (20 µs in the future option). This result indicates that the performance satisfies the requirements of the PHENIX experiment.
