Abstract-Located between the on-detector front-end electronics and the global data acquisition system (DAQ), the off-detector electronics of the CMS electromagnetic calorimeter (ECAL) is involved in both the detector readout and the trigger system. Working at 40 MHz, the trigger part must, within 10 clock cycles, receive and deserialise the data from the front-end electronics, encode the trigger primitives using a non-linear scale, ensure time alignment between channels using a histogramming technique and send the trigger primitives to the regional trigger. In addition, it must classify the trigger towers into three classes of interest and send this classification to the readout part. The readout part must select the zero suppression level to apply depending on the regions of interest determined from the trigger tower classification, deserialise the front-end data coming from high-speed (800 Mbit/s) serial links, check their integrity, apply zero suppression, build the event and send it to the DAQ, monitor the buffer occupancy and send back pressure to the trigger system when required, and provide data spying and monitoring facilities for the local DAQ. The system, and especially the data link speed, the latency constraints and the bit error rate requirements, has been validated on prototypes. Part of the system is about to go into production.
I. INTRODUCTION
THE Compact Muon Solenoid (CMS) experiment, at the Large Hadron Collider (LHC) under construction at CERN, will be equipped with a high-resolution electromagnetic calorimeter (ECAL) made of 75,848 crystals. The readout and trigger electronics of the ECAL is divided into two parts: the on-detector electronics, located inside the CMS detector, and the off-detector electronics, located outside the detector in the underground service cavern. The on-detector electronics is read out through about 9000 high-speed (800 Mbit/s) serial links by the off-detector electronics. These links and the corresponding optoelectronics must withstand the stringent radiation conditions of the detector area.
The information from the electromagnetic calorimeter is used by the level one trigger (LV1) together with the hadronic calorimeter and the muon subdetector information.
For the LV1 decision, the ECAL is read with a coarse granularity: in the case of the barrel, 25 times coarser than the nominal one. For this purpose, the ECAL is divided into 56 × 72 (η, ϕ)-regions, 34 × 72 in the barrel and 11 × 72 in each endcap, called trigger towers. In the barrel, a trigger tower corresponds to a 5 × 5 matrix of crystals. In the endcap, the number of crystals per trigger tower differs from one trigger tower to another. The information used for the LV1 decision is called trigger primitives. A trigger primitive consists of the evaluated transverse energy deposited in a trigger tower and of a single bit qualifying the energy deposit expansion along η.
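The tower counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the text; the variable names are illustrative, and the endcap crystal count is not quoted in the text but merely inferred by subtraction, assuming every barrel trigger tower holds exactly 5 × 5 crystals.

```python
# Cross-check of the trigger tower bookkeeping quoted in the text.
N_PHI = 72                        # trigger towers around phi
N_ETA_BARREL = 34                 # trigger tower rows in the barrel
N_ETA_ENDCAP = 11                 # trigger tower rows in each endcap
CRYSTALS_PER_BARREL_TOWER = 5 * 5
TOTAL_CRYSTALS = 75848

barrel_towers = N_ETA_BARREL * N_PHI
endcap_towers = 2 * N_ETA_ENDCAP * N_PHI
total_towers = barrel_towers + endcap_towers

barrel_crystals = barrel_towers * CRYSTALS_PER_BARREL_TOWER
# Inferred, not quoted in the text: what is left for the two endcaps.
endcap_crystals = TOTAL_CRYSTALS - barrel_crystals

print(total_towers, barrel_crystals, endcap_crystals)
```

The total comes out as 4032 towers, i.e. 56 × 72, with 61,200 crystals in the barrel, which leaves 14,648 crystals for the two endcaps under the stated assumption.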
The size of the whole ECAL data for a single event is too large to be sent as is to the CMS DAQ system. A data volume reduction by a factor of ∼ 20 is required on the ECAL part of an event. This reduction is performed by the combination of zero suppression and an algorithm, called "selective readout", which selects the regions of interest of the calorimeter. The selective readout algorithm, described in [1], selects the areas of the ECAL which must be read with a minimal zero suppression threshold. The rest of the ECAL is read with a high zero suppression threshold or read only with the coarse granularity of the trigger primitives. In order to select the regions of interest, the trigger towers are classified into three classes of interest depending on the deposited transverse energy. The energy deposited in each trigger tower is compared to two thresholds; trigger towers with an energy above the higher threshold are classified as high interest trigger towers, those with an energy between the two thresholds as medium interest ones and those with an energy below the lower threshold as low interest trigger towers. If a trigger tower belongs to the high interest class, then the crystals of this trigger tower and of its neighbour trigger towers, which corresponds for the barrel to 225 crystals, are read with a minimal zero suppression. If a trigger tower belongs to the medium interest class, then the crystals of this trigger tower, which corresponds for the barrel to 25 crystals, are read with a minimal zero suppression. If a trigger tower belongs to the low interest class and is not the neighbour of a high interest trigger tower, then it is not read with the fine granularity or, optionally, it is read but with a severe zero suppression. This algorithm is illustrated in Fig. 1. In any case the full calorimeter is read with the coarse granularity of a trigger tower and this information (in principle, the trigger primitives themselves) is sent to the CMS DAQ and is always available offline, for instance for missing transverse energy calculation.
Fig. 1. Selective readout algorithm. See the text for the description of the algorithm. The figure illustrates the case of one trigger tower with a high transverse energy deposit (above the higher threshold HT, in red): the crystals of the 3 × 3 trigger tower matrix around this trigger tower are read with a minimal zero suppression threshold. The crystals of the trigger tower with a medium transverse energy deposit (between the higher and lower thresholds, HT and LT, in orange) are also read with a minimal zero suppression threshold. Legend: Et(tow) > HT; HT > Et(tow) > LT; LT > Et(tow).
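The classification and neighbour rule described above can be sketched on a regular (η, ϕ) grid of trigger towers. This is a minimal illustration, not the SRP firmware: the function and class names are hypothetical, the grid is assumed periodic in ϕ and clipped at the η edges, and an energy exactly equal to a threshold is arbitrarily assigned to the lower class.

```python
from enum import IntEnum


class Interest(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def classify(et, lt, ht):
    """Compare a tower transverse energy to the low (lt) and high (ht) thresholds."""
    if et > ht:
        return Interest.HIGH
    if et > lt:
        return Interest.MEDIUM
    return Interest.LOW


def readout_mask(et_map, lt, ht):
    """Return (classes, mask); mask[i][j] is True where the tower crystals
    are read with a minimal zero suppression threshold."""
    n_eta, n_phi = len(et_map), len(et_map[0])
    classes = [[classify(et, lt, ht) for et in row] for row in et_map]
    mask = [[False] * n_phi for _ in range(n_eta)]
    for i in range(n_eta):
        for j in range(n_phi):
            if classes[i][j] == Interest.MEDIUM:
                mask[i][j] = True
            elif classes[i][j] == Interest.HIGH:
                # 3 x 3 tower matrix: wrap around in phi, clip in eta (assumption).
                for di in (-1, 0, 1):
                    if 0 <= i + di < n_eta:
                        for dj in (-1, 0, 1):
                            mask[i + di][(j + dj) % n_phi] = True
    return classes, mask
```

For a single high interest tower away from the η edges the mask selects 9 towers, which for the barrel is the 225 crystals quoted in the text.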
The general ECAL readout architecture is represented in Fig. 2. One Very Front-End card (VFE) reads five crystals and five of these cards are plugged into one Front-End card (FE). The clock is distributed to the front-ends through token rings controlled by the "Clock and Control System" cards (CCS). The FE cards compute trigger primitives and send them at 40 MHz to the Trigger Concentrator Cards (TCC) through dedicated serial links. The TCCs finalise the trigger primitive calculation (for the endcap only), compress the trigger primitives using a non-linear scale and, after synchronisation and serialisation done by the Synchronisation and Link mezzanine Board (SLB), send them to the level 1 trigger system.
The trigger is distributed to the front-end electronics, to the TCCs and to the Data Concentrator Cards (DCC) by the CCS cards. On a level one accept, the FE cards send the data to the DCCs where they are delayed in input pipelines. During the latency introduced by these pipelines, the selective readout processor (SRP) receives trigger tower classification flags from the TCCs, determines the calorimeter regions of interest according to the selective readout algorithm previously described, produces the selective readout flags indicating the zero suppression level to apply for each readout unit (readout partition of 25 crystals) and sends them to the DCCs. The DCCs perform the actual zero suppression and send the resulting reduced data to the global DAQ. In addition the DCCs provide access to the data via the VME bus. This access is used by the local DAQ and the laser-based ECAL calibration system.
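How a selective readout flag could steer the zero suppression of one readout unit is sketched below. The flag names, the threshold values and the use of a single amplitude per crystal are all assumptions made for illustration; the text only states that the flag selects the zero suppression level.

```python
from enum import Enum


class SRFlag(Enum):
    FULL = "minimal zero suppression"
    SEVERE = "high zero suppression threshold"
    SUPPRESS = "not read at crystal granularity"


# Hypothetical thresholds, in arbitrary amplitude units.
THRESHOLDS = {SRFlag.FULL: 1, SRFlag.SEVERE: 8}


def zero_suppress(amplitudes, flag):
    """Keep the crystals of one readout unit whose amplitude reaches the
    threshold selected by the selective readout flag."""
    if flag is SRFlag.SUPPRESS:
        return {}
    threshold = THRESHOLDS[flag]
    return {ch: a for ch, a in enumerate(amplitudes) if a >= threshold}
```

With the amplitudes `[0, 3, 12, 0, 9]`, the `FULL` flag keeps channels 1, 2 and 4, the `SEVERE` flag keeps channels 2 and 4, and `SUPPRESS` keeps none.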
The architecture of the off-detector electronics described above comprises five main parts: the CCS cards, the TCC cards, the SLBs, the SRP and the DCC boards.
II. CCS CARD
The CCS cards have three main responsibilities:
• slow controls of the front-end electronics: electronics configuration and status report
• distribution of fast timing signals: e.g. clock, trigger
• fan-in of the trigger throttle signals
The slow control commands come from the VME bus. The front-end is accessed through mezzanine PCI cards called mFEC (mezzanine Front-End Controller [5]), plugged on the CCS card. An interface between the VME and the local bus is implemented in an FPGA (Field Programmable Gate Array) chip, the Xilinx Spartan IIE. This VME interface FPGA, which is the master of the local bus, also handles the interrupts coming from the mFEC and translates them into VME interrupts.
The mFEC accesses the front-end electronics through a control ring: each FE board has a Communication and Control Unit (CCU) implemented in an ASIC (Application Specific Integrated Circuit). The CCUs together with the mFEC are connected in a ring as illustrated in Fig. 3. The protocol used to control the FE boards is similar to the IBM token ring protocol. The links between the CCUs, which are "on the detector", are electrical. The links to the CCS, which is "off-detector", are optical.
The fast timing signals arrive from the experiment trigger system via the Timing and Trigger Control (TTC [9]) link. These signals are decoded by the LHC-standard TTCrx ASIC [4], channel B providing the fast control commands. The following commands are used by the ECAL front-end electronics:
• "BC0" ("Bunch Crossing zero"), marks the beginning of an LHC orbit
• "resync", resets all pointers; all the data are lost and the internal event ID is set to 0
• "power-up reset"
• "monitoring mode enabling", switches the ADC to a secondary mode
These commands are sent to the front-end electronics together with the clock: they are encoded by missing clock edges. The command translation and encoding is implemented in a Xilinx Spartan IIE FPGA, denoted "Trigger FPGA" on the CCS block diagram represented in Fig. 4. Therefore a single signal carrying the clock and the fast control commands goes from the trigger controller to the mFECs (8 on a CCS board). This fast timing signal is then distributed to the FE boards of the control ring as illustrated in Fig. 3. On each front-end board an ASIC, called TPLL (Tracker Phase Locked Loop chip), extracts the commands and the clock from the fast timing signal with the help of a PLL. The TPLL also contains a phase shifter in order to synchronise the clocks arriving at each FE board. A PLL with a crystal, called QPLL [7], is used to reduce the clock jitter.
One CCS card has 8 mFECs, each mFEC controlling one control ring of 8 to 9 FE boards. 54 CCSs are required in total for the controls of the ECAL front-end electronics.
The trigger controller has another functionality: it merges the "Timing Throttle System" (TTS) signals from the TCC and DCC cards of the ECAL off-detector system, and sends the resulting TTS signal to the TTS link. The TTS signal is a feedback from the front-end electronics (or its emulation) for controlling the trigger rate.
Finally the CCS cards fan out the signal from the TTC link to the DCC and TCC cards. TTC and TTS signals transmitted between the ECAL off-detector cards travel on the TTS/TCC bus located on a backplane in the VME crate. Fast control commands can be inserted into the TTC signals sent to the TCC and DCC via the Trigger FPGA.
A CCS card is already used to control the front-end electronics in test beams.
III. TCC
The TCC cards are responsible for sending the trigger primitives to the trigger system. A trigger primitive consists of:
• the measurement of the transverse energy deposited in a trigger tower coupled with the bunch crossing assignment
• a bit, called "fine grain bit", qualifying the expansion along η of the energy deposit in the trigger tower.
To evaluate the energy deposited in a trigger tower, the signals from each crystal (corresponding to a readout channel) of the trigger tower must be summed. In order to minimise the number of links between the detector and the TCC cards, this sum is done (only partially for the endcaps) in the electronics on the detector, more precisely on the FE boards.
First of all, sums over five crystals are computed: samples are added one-to-one. In the barrel the sums are done on strips of five crystals along ϕ as represented in Fig. 5. Then, each strip signal passes through an FIR (finite impulse response) filter with 5 (optionally 6) taps and a peak-finder is used to extract the maximum value. This maximum value is proportional to the energy deposited in the strip and is the value sent to the trigger system as part of the trigger primitive.
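The chain of strip sum, FIR filter and peak finder can be sketched as follows. The function names are hypothetical, and the peak criterion (a sample larger than its left neighbour and not smaller than its right one) is an assumption; the text does not give the filter weights or the exact peak-finding rule, so any weights passed in are purely illustrative.

```python
def strip_sum(crystal_samples):
    """Add the samples of the crystals of one strip, sample by sample."""
    return [sum(samples) for samples in zip(*crystal_samples)]


def fir_filter(samples, weights):
    """y[k] = sum_i weights[i] * samples[k - i], computed only where the
    filter window fully overlaps the input."""
    n_taps = len(weights)
    return [
        sum(w * samples[k - i] for i, w in enumerate(weights))
        for k in range(n_taps - 1, len(samples))
    ]


def find_peaks(filtered):
    """Return (index, value) for each local maximum of the filtered signal."""
    return [
        (k, filtered[k])
        for k in range(1, len(filtered) - 1)
        if filtered[k - 1] < filtered[k] >= filtered[k + 1]
    ]
```

As a usage example, summing two pulses `[0, 0, 1, 3, 5, 3, 1, 0, 0, 0]` and `[0, 0, 0, 2, 4, 3, 1, 0, 0, 0]` with three silent crystals and filtering with the illustrative 5-tap weights `[1, 2, 3, 2, 1]` yields a single peak, whose value is the quantity that would be taken as proportional to the strip energy.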
The fine grain veto is calculated in the following way. All the possible sums of two consecutive strips of a trigger tower are computed and the one with the maximum value is considered. The ratio of this maximum to the sum of the energy of the five strips is compared to a threshold, typically 90%: the fine grain bit is active if the threshold is passed.
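The fine grain calculation just described can be written in a few lines. This is a sketch under assumptions: the function name is hypothetical, "passed" is read as "strictly above the threshold", and a tower with no energy is arbitrarily given an inactive bit. The comparison is written without a division, as a hardware implementation might prefer.

```python
def fine_grain_bit(strip_energies, threshold=0.9):
    """Return True when the largest sum of two consecutive strips exceeds
    `threshold` times the total energy of the strips of the tower."""
    total = sum(strip_energies)
    if total <= 0:
        return False  # assumption: no energy, bit inactive
    best_pair = max(a + b for a, b in zip(strip_energies, strip_energies[1:]))
    # Equivalent to best_pair / total > threshold, without dividing.
    return best_pair > threshold * total
```

For example, the strip energies `[0, 1, 40, 50, 2]` concentrate 90 out of 93 units in two adjacent strips and give an active bit, whereas a uniform deposit `[20, 20, 20, 20, 20]` does not.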
For the barrel, all of the above operations are done in the FE boards. Each front-end board serialises the computed trigger primitive and sends it to a TCC via one optical link. The TCC must first deserialise the data coming from the front-end boards. The energy sums are then compressed into an 8-bit word using a non-linear scale. The result of this compression, together with the fine grain veto bit, goes to the Synchronisation and Link Board (SLB). All of these operations must be done within a latency of 7 clock cycles. The encoded trigger primitives are stored during the LV1 latency; in case of an LV1 accept they will be sent to the DCC in order to be stored together with the data.
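To illustrate what compressing an energy sum into an 8-bit word with a non-linear scale can look like, the sketch below uses a piecewise-linear scale implemented as a lookup table. Everything specific here is hypothetical: the 10-bit input width, the segment boundaries and the step sizes are invented for the example and are not the scale used by the TCC.

```python
INPUT_MAX = 1023  # assumption: 10-bit linear energy sum


def compress(energy):
    """Map a linear energy sum to an 8-bit code with a finer step at low
    energy and a coarser step at high energy (hypothetical scale)."""
    if energy < 0:
        raise ValueError("energy must be non-negative")
    energy = min(energy, INPUT_MAX)        # saturate
    if energy < 128:
        return energy                      # step 1  -> codes 0..127
    if energy < 384:
        return 128 + (energy - 128) // 4   # step 4  -> codes 128..191
    return 192 + (energy - 384) // 10      # step 10 -> codes 192..255


# In hardware such a scale would typically be a precomputed lookup table.
LUT = [compress(e) for e in range(INPUT_MAX + 1)]
```

The design point of such a scale is that the 256 available codes are spent where resolution matters most, while the code stays monotonic in the energy.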
The complete trigger primitive calculation (except the compression part) can be done entirely in the FE boards for the barrel because a trigger primitive contains information related to a single trigger tower and a trigger tower is read by a single FE board. This condition is not fulfilled for the endcap. Indeed, in the endcap, the crystals are organised in an x-y grid. As for the barrel, an FE board reads an x-y matrix of 5 × 5 crystals. On the other hand, the trigger towers must be organised according to an (η, ϕ)-partitioning: a trigger tower should cover a region ∆η × ∆ϕ of 0.138 × 0.0873. The crystals are grouped to form trigger towers approaching this ∆η × ∆ϕ region. This mapping is done in such a way that the crystals of a readout unit are assigned to trigger towers by multiples of 5. Thanks to this "multiple of 5" constraint, the five-crystal partial sums can still be calculated on the front-end boards. The FIR filter and the peak finder of the FE board operate on the partial sums. The final sums are computed in the TCC cards. In this way a compromise is reached between the number of optical links and the trigger efficiency, which requires an (η, ϕ) trigger tower partitioning.
Two types of TCC are required: one for the barrel, one for the endcap. The barrel TCCs have 72 inputs (68 are used), the endcap ones have 48 inputs. The endcaps have five FE board-TCC optical links (one for each partial sum) per FE board instead of one for the barrel. 36 TCCs are required to cover the barrel and 72 TCCs for the endcaps.
The TCCs are also in charge of the classification of the trigger towers into the classes of interest described in the introduction. These trigger tower flags are coded on two bits. The two bits plus one reserved bit are sent to the Selective Readout Processor.
The latency and the bit error rate of the data links were validated on a prototype implemented with one third of the nominal number of inputs. Direct measurements on a long-term run have shown that the bit error rate at the output of the Agilent deserialiser is less than 3 · 10⁻¹⁵. Measurements from the eye diagram and jitter of the signal at the output of the Agilent deserialiser have shown that the bit error rate is much lower than 10⁻¹⁵. The measured jitter of this signal is low, 20 ps, and is within the requirements. Finally the latency was measured between one input of the TCC and the output of the Agilent deserialiser: a value of 3.13 clock cycles was obtained. It is expected that the latency introduced by the FPGA part will be within two clock cycles and therefore the total latency should be below 6 clock cycles, the specified maximum value being 7 clock cycles.
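The latency figures quoted in the text can be put side by side in a short calculation. The sketch uses only numbers from the text (40 MHz clock, 3.13 measured cycles, two expected FPGA cycles, 7-cycle TCC budget, 3-cycle SLB budget); the variable names are illustrative, and the FPGA contribution is an expectation rather than a measurement.

```python
CLOCK_MHZ = 40.0
CYCLE_NS = 1000.0 / CLOCK_MHZ           # duration of one clock cycle, in ns

measured_to_deserialiser = 3.13         # clock cycles, measured on the prototype
fpga_expected_max = 2.0                 # clock cycles, expected upper bound
tcc_budget = 7                          # clock cycles, specified maximum
slb_budget = 3                          # clock cycles, allocated to the SLB

tcc_estimate = measured_to_deserialiser + fpga_expected_max
tcc_margin = tcc_budget - tcc_estimate
trigger_path_budget = tcc_budget + slb_budget

print(f"TCC estimate: {tcc_estimate:.2f} cycles ({tcc_estimate * CYCLE_NS:.1f} ns)")
print(f"Margin to the TCC budget: {tcc_margin:.2f} cycles")
print(f"TCC + SLB budget: {trigger_path_budget} cycles ({trigger_path_budget * CYCLE_NS:.0f} ns)")
```

The TCC and SLB budgets add up to the 10 clock cycles quoted in the abstract, that is 250 ns at 40 MHz.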
IV. SLB
Nine SLB mezzanine cards are plugged onto a TCC-68 card. The SLB cards [3] are responsible for synchronising the trigger primitives with respect to the bunch crossing time and sending them to the regional calorimeter trigger system. Histograms are used to rebuild the bunch structure of the LHC orbit. The bunch crossing time can be deduced from this structure and then used as a reference to synchronise data. A budget of three clock cycles is allocated for the SLB operation.
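The histogram idea can be illustrated with a toy model: channel activity is accumulated per bunch crossing over many orbits, and the cyclic shift that best aligns the histogram with the known filling pattern gives the channel offset. This is only a sketch of the principle; the function names, the 12-slot toy orbit and the matching criterion are assumptions and do not describe the actual SLB implementation.

```python
def accumulate_histogram(orbits, orbit_len):
    """Count, for each bunch crossing slot, how often a channel saw activity."""
    hist = [0] * orbit_len
    for hits in orbits:
        for bx in hits:
            hist[bx % orbit_len] += 1
    return hist


def find_offset(hist, pattern):
    """Return the cyclic shift that maximises the overlap between the
    histogram and the expected filling pattern (1 = filled bunch)."""
    n = len(pattern)

    def overlap(shift):
        return sum(hist[(i + shift) % n] for i in range(n) if pattern[i])

    return max(range(n), key=overlap)


# Toy orbit of 12 slots with an asymmetric filling pattern (hypothetical).
PATTERN = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
DELAY = 5  # unknown channel delay to be recovered
hits_one_orbit = [(i + DELAY) % len(PATTERN) for i, filled in enumerate(PATTERN) if filled]
hist = accumulate_histogram([hits_one_orbit] * 100, len(PATTERN))
offset = find_offset(hist, PATTERN)
```

The gaps in the filling pattern are what make the alignment unambiguous: a pattern without gaps would overlap equally well at every shift.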
V. SRP
The SRP [1] must determine the calorimeter regions of interest to read with a minimal zero suppression threshold. The SRP time budget allocated to perform this operation is 6.4 µs. The SRP operates asynchronously at the 100 kHz LV1 accept rate. To perform the selective readout algorithm, the ECAL is partitioned in 12 regions. Each region is covered by one "Algorithm Board" (AB). The algorithm requires that each AB exchanges tower data with the 8 AB boards covering the adjacent regions: this makes 39 AB bidirectional interconnections. A commercial passive optical cross-connect is used for these interconnections. In addition to these AB-AB connections, there are 108 TCC to SRP and 54 SRP to DCC unidirectional connections. The AB board is built around the high integration FPGA Xilinx Virtex2Pro 2vp70. This FPGA contains 20 RocketIO multigigabit transceivers (MGT) operating at up to 3.125 Gbit/s transmission rate. Up to 12 of these MGTs are used for the unidirectional communications with the TCC and DCC boards and up to 8 are used for the communication with the adjacent ABs. The connection of the parallel optical links from TCCs, DCCs and the AB cross-connect is made with SNAP12 multisource agreement pluggable modules.
As for the TCC, the bit error rate has been measured from a long-term run and from jitter and eye opening. The direct measurements on a long-term run show that the bit error rate is less than 10⁻¹⁵, which is better than the requirements. The latency has been measured on a simplified AB model. The simplified AB exchanges trigger tower data with only two ABs and serves only 1/6th of the assigned barrel area. The measured value, 2 µs, is well below the 6.4 µs time budget. Because of the intrinsic parallelism of the AB firmware, the overall SRP latency is not expected to increase significantly.
VI. DCC
The DCC [2], a VME 9U board, must deserialise the data coming from the 68 high-speed links (800 Mbit/s) of the FE boards. After a data integrity check, it applies the zero suppression, in order to reduce the data volume by a factor ∼ 20. The zero suppression threshold is chosen according to the area of the calorimeter that is read and the flags received from the SRP. The DCC formats the ECAL event fragments from the data received from the FE boards, the trigger primitives received from the TCC, and the selective readout flags received from the SRP. It serialises the event fragments and sends them to the global DAQ via an S-link 64 interface [8] at an average rate of 200 MByte/s with a maximum designed link rate of 528 MByte/s. The DCC monitors its data buffers and sends feedback to the trigger: a warning message will lower the trigger rate; an "almost full" signal will inhibit the trigger, empty events being sent in order to keep the synchronisation; a "full" signal will result in a resynchronisation of the DAQ (the readout buffers will be reset).
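The buffer feedback described above amounts to a small state decision based on the buffer occupancy. In the sketch below the three non-idle states and their effects follow the text, while the function name, the "ready" state and the occupancy thresholds are hypothetical values chosen for illustration.

```python
# Effect of each feedback state on the trigger, as described in the text.
ACTIONS = {
    "ready": "no action",
    "warning": "lower the trigger rate",
    "almost full": "inhibit the trigger, send empty events",
    "full": "resynchronise the DAQ, reset the readout buffers",
}


def buffer_feedback(occupancy, warn=0.50, almost_full=0.80, full=0.95):
    """Map a buffer occupancy (fraction between 0 and 1) to a feedback state.
    The threshold values are hypothetical."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be between 0 and 1")
    if occupancy >= full:
        return "full"
    if occupancy >= almost_full:
        return "almost full"
    if occupancy >= warn:
        return "warning"
    return "ready"
```

For instance, `ACTIONS[buffer_feedback(0.85)]` returns the "almost full" action, in which the trigger is inhibited while empty events keep the readout synchronised.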
Events can also be read by the local DAQ and the laser monitoring system from the VME bus. In total 54 DCC cards are required for the readout of ECAL.
The final DCC prototype is already used for the test beams.
VII. CONCLUSIONS
The design of the off-detector electronics of the electromagnetic calorimeter presented in this paper is able to provide the trigger system with the information required for the level one accept decision within the short time budget and to reduce the event size by the required factor ∼ 20. Most of the components of the ECAL off-detector electronics have been tested and validated; in particular, the BER and the jitter were carefully measured. The DCC and the CCS boards are already used for the ECAL test beams.
