Abstract-The timing and fast control (TFC) system is responsible for controlling and distributing timing, trigger and synchronous commands to the LHCb front-end (FE) electronics. It is different from the equivalent systems of the other LHC experiments in that it has to support two levels of high-rate triggers. Furthermore, the TFC mastership of a configurable ensemble of FE electronics is centralized in one module: the Readout Supervisor. A pool of optional Readout Supervisors allows mastering of all or separate combinations of subsystems in parallel by remote programming of a patch panel in the distribution network. The speed requirements and the multifunctionality of the Readout Supervisor necessitate optimal technological solutions. At the same time the logic must be modifiable to support extensions or changes in the running modes. A first prototype has been built using field-programmable gate arrays (FPGAs) for the entire logic and it has been tested successfully. This paper gives an overview of the system architecture and describes in more detail the functions and the implementation of the Readout Supervisor.
I. INTRODUCTION
THE LHCb experiment will be installed at the LHC accelerator at CERN. It is a single-arm spectrometer composed of ten subdetectors, designed to study the large number of b-hadrons produced at the LHC. The bunch crossing rate of the LHC is about 40 MHz and the pp interaction rate at the LHCb interaction point is about 10 MHz, which gives LHCb a rate of about 300 kHz of b-hadrons in the detector. Fig. 1 shows a simplified picture of the entire LHCb readout system architecture. For a complete description see [1] and [2]. The readout system features two levels of high-rate triggers: a Level 0 (L0) trigger that brings down the interaction rate of 10 MHz to an event accept rate of at most 1.1 MHz, and a Level 1 (L1) trigger with an accept rate of at most 40 kHz. The L0 trigger processing is carried out in dedicated hardware modules whereas the L1 trigger processing takes place in the CPU farm. The architecture of the front-end (FE) electronics reflects this two-level structure in that it consists of an L0 part and an L1 part. The L0 FE electronics samples the signals from the detector at a rate of 40 MHz and stores them during the fixed L0 trigger processing time (4 μs). The event data are subsequently derandomized before being handed over to the L1 FE electronics. The L1 FE has two channels to the event building network, one of which is used to transmit event data to the L1 trigger processing and the other for the complete readout after the L1 trigger decision. Thus, upon receiving event data from the L0 FE, the L1 FE electronics sends a part of the data over the event building network for the L1 trigger processing and buffers the complete event data during the L1 trigger latency (58 ms). Upon receiving a positive decision the L1 FE derandomizes the events, zero-suppresses the data, and finally sends the complete event data to the CPU farm for the high level trigger (HLT) processing.
The timing and fast control (TFC) system [2] is responsible for driving the readout of the LHCb detector by synchronously distributing timing, trigger and control information to the FE electronics. Although the distribution network of the TFC system is based on the CERN optical TTC system [3] developed in the CERN RD12 project and used by all LHC experiments, several components are specific to the LHCb experiment. Fig. 2 shows the logical layout of the TFC architecture. In order to support a fully partitionable system, that is, the possibility of running one subsystem or any ensemble of subsystems autonomously in a special running mode independently of all the others, the TFC mastership has been centralized in one module: the Readout Supervisor (RS, "ODIN") [4]. The architecture contains a pool of Readout Supervisors, one of which is used for global data acquisition and receives the two levels of physics trigger decisions. For separate local runs of subsystems a programmable patch panel, the TFC Switch, allows subsystems to be associated with the other Readout Supervisors. These may thus be configured to sustain completely different timing, triggering, and control, and can also be connected to local trigger sources. The TFC Switch distributes in parallel the information from the Readout Supervisors to the FE electronics of the different subsystems.
II. TFC ARCHITECTURE
The information transmitted by the Readout Supervisors to the FE electronics via the TFC Switch and the TTC distribution network consists specifically of the following:
1) the LHC reference clock at 40 MHz as received from the LHC timing generators, which is the master clock of all the electronics;
2) two levels of high-rate trigger decisions (L0 and L1);
3) commands resetting event related counters in the FE electronics used to identify the accepted events and to check synchronization;
4) commands resetting the FE electronics in order to prepare it for data taking or to recover from an error condition;
5) calibration commands activating specific calibration systems in the FE electronics or in the subdetectors;
6) IP/Ethernet addresses assigning the CPU farm destinations to the L1 trigger data and the full event readout.
If the physics trigger rate gets abnormally high or data congestion occurs in the event building network, there is a potential risk of overflow in the buffers of the FE electronics. In order to prevent this, the Readout Supervisor controls the trigger rates according to the status of the buffers. Whereas the status of the fast buffers can only be known by emulating them centrally in the Readout Supervisor, slower buffers are monitored locally. For the locally monitored buffers, imminent overflows are signalled via a dedicated throttle network. The Throttle Switch feeds back the buffer overflow warning signals from the slower buffers in the readout to the appropriate Readout Supervisor. Fig. 3 shows a detailed view of the TFC architecture. The Throttle ORs function as concentrators of the buffer overflow warning signals and make a logical OR of the signals within the same subsystem. The TTC modules in Fig. 3 are all standard components of the CERN TTC system.
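The throttle path described above can be sketched in a few lines of Python. This is only an illustration of the logical behaviour (an OR per subsystem, then routing to the Readout Supervisor that masters that subsystem); the function names, the subsystem labels and the RS labels are hypothetical and do not come from the hardware design.

```python
def throttle_or(warnings):
    """Throttle OR: the output is asserted if any input buffer warns."""
    return any(warnings)


def throttle_switch(subsystem_warnings, subsystem_to_rs):
    """Route each subsystem's ORed warning to the Readout Supervisor that
    masters it. An RS is throttled if any of its subsystems warns."""
    rs_throttle = {rs: False for rs in subsystem_to_rs.values()}
    for subsystem, warnings in subsystem_warnings.items():
        rs = subsystem_to_rs[subsystem]
        rs_throttle[rs] = rs_throttle[rs] or throttle_or(warnings)
    return rs_throttle


# Hypothetical partitioning: two subsystems in a global run on RS0,
# one subsystem in a local run on RS1.
subsystem_to_rs = {"subsystem_a": "RS0", "subsystem_b": "RS0", "subsystem_c": "RS1"}
subsystem_warnings = {
    "subsystem_a": [False, False],
    "subsystem_b": [False, True],   # one buffer is close to overflow
    "subsystem_c": [False],
}
rs_throttle = throttle_switch(subsystem_warnings, subsystem_to_rs)
```

In this example only RS0 sees a throttle request, so the local run mastered by RS1 is unaffected, which is the point of routing the throttle signals per partition.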
III. TTC DISTRIBUTION SYSTEM
The TTC distribution network is based on fiber optics carrying two communication channels: a low-latency accept/reject signal (channel A) and framed and formatted broadcasts including Hamming code (channel B). Two types of broadcasts are available: 16 bit frames which have 8 bits of user information, so called short broadcasts, and 42 bit frames which have 16 bits of user information (8 bit data/8 bit address), so called long broadcasts. The short broadcasts and the long broadcasts take 400 and 1050 ns, respectively, to transmit.
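The quoted transmission times are consistent with channel B carrying one bit per bunch clock period (25 ns at 40 MHz). That bit period is an assumption inferred from the numbers above rather than a statement from the TTC specification. A minimal check:

```python
# Assumption: channel B carries one bit per 25 ns bunch-clock period.
BIT_PERIOD_NS = 25

SHORT_FRAME_BITS = 16   # 8 bits of user information
LONG_FRAME_BITS = 42    # 16 bits of user information (8 data + 8 address)


def frame_time_ns(frame_bits: int) -> int:
    """Time needed to serialize one broadcast frame, in nanoseconds."""
    return frame_bits * BIT_PERIOD_NS


short_ns = frame_time_ns(SHORT_FRAME_BITS)
long_ns = frame_time_ns(LONG_FRAME_BITS)

# Upper bound on the rate of back-to-back short broadcasts, in Hz.
max_short_rate_hz = 1e9 / short_ns
```

Under this assumption, back-to-back short broadcasts could be sent at up to 2.5 MHz, which leaves headroom above the maximum L0 accept rate of 1.1 MHz.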
The TTC system has been found to suit the LHCb application well. Channel A is used to transmit the L0 trigger decision at 40 MHz. Channel B is used to transmit the L1 trigger decisions, the synchronous commands listed above and the farm destinations. Fig. 4 shows the encoding of the short broadcasts.
The two channels are time-division multiplexed (TDM) and biphase mark encoded before being converted to an optical signal. The biphase signal also allows transmitting the clock with low jitter. The encoding is done in the Readout Supervisor and the electrical to optical conversion is done by the TTCtx [3] modules (Fig. 3), which have 14 high-power transmitters. The optical fan-out TTCoc allows distributing the signal to 32 destinations, which means that one TTCtx can drive up to 448 destinations. After receiving the signal using a PIN diode, the TTCrx ASIC reconstructs the 40 MHz clock and decodes the encoded signal into the user information.
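The encoding step can be illustrated with a short Python sketch. Biphase mark coding places a level transition at the start of every bit cell and an additional mid-cell transition for a 1, which is why the receiver can always recover the clock. The A-then-B interleaving order and the function names are assumptions made for illustration; the sketch does not model the actual TTC line rate or framing.

```python
def tdm_interleave(channel_a, channel_b):
    """Time-division multiplex two bit streams as A0, B0, A1, B1, ...
    (the ordering within each pair is an assumption)."""
    out = []
    for a, b in zip(channel_a, channel_b):
        out.extend((a, b))
    return out


def biphase_mark_encode(bits, start_level=0):
    """Biphase mark code: toggle the line at the start of every bit cell,
    and toggle again mid-cell only when the bit is 1.
    Returns two half-cell line levels per input bit."""
    level = start_level
    halves = []
    for bit in bits:
        level ^= 1              # mandatory transition at the cell boundary
        halves.append(level)
        if bit:
            level ^= 1          # extra mid-cell transition encodes a 1
        halves.append(level)
    return halves


line = biphase_mark_encode(tdm_interleave([1, 0, 1], [0, 0, 1]))

# Fan-out arithmetic from the text: 14 transmitters, 32 outputs per coupler.
TTCTX_TRANSMITTERS = 14
TTCOC_OUTPUTS = 32
MAX_DESTINATIONS = TTCTX_TRANSMITTERS * TTCOC_OUTPUTS
```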
IV. READOUT SUPERVISOR "ODIN"
The Readout Supervisor is a complex board responsible for a multitude of functions. Fig. 5 shows the principal functional blocks. Below is a summary of the most important functions. A complete description can be found in [5].
The TTC encoder circuit incorporated in each Readout Supervisor receives directly the LHC clock and the LHC orbit signal via a TTC machine interface (TTCmi). The clock is distributed on the board in a star fashion and is transmitted to all synchronous destinations via the TTC system.
The Readout Supervisor receives the L0 trigger decision from the central L0 trigger Decision Unit (L0DU), or from an optional local trigger unit, together with the Bunch Crossing ID. In order to adjust the global latency of the entire L0 trigger path to a total of 160 cycles (4 μs), the Readout Supervisor has a pipeline of programmable length at the input of the L0 trigger. Provided no other changes are made to the system, the depth of the pipeline is set once and for all during commissioning, at the first timing alignment. The Bunch Crossing ID received from the L0DU is compared to the expected value from an internal counter in order to verify that the L0DU is synchronized. For each L0 trigger accept, the source of the trigger (3-bit encoded) together with a 2-bit Bunch Crossing ID, a 12-bit L0 Event ID (number of L0 triggers accepted), and a "force bit" are stored in a FIFO. The force bit indicates that the trigger has been forced and that consequently the L1 trigger decision should be made positive, irrespective of the L1 physics trigger. The information in the FIFO is read out at the arrival of the corresponding L1 trigger.
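The latency adjustment and the per-accept record can be sketched as follows. This is a behavioural model only: the class and function names, the choice of the least significant bits for the 2-bit Bunch Crossing ID, and the upstream latency of 150 cycles in the example are assumptions, not values from the design.

```python
from collections import deque
from dataclasses import dataclass

L0_LATENCY_CYCLES = 160   # 4 us at 40 MHz (25 ns per cycle), from the text


def pipeline_depth(upstream_latency_cycles: int) -> int:
    """Pipeline depth needed so that the total L0 latency is 160 cycles."""
    depth = L0_LATENCY_CYCLES - upstream_latency_cycles
    if depth < 0:
        raise ValueError("upstream latency exceeds the fixed L0 latency")
    return depth


class DelayPipeline:
    """Fixed-depth delay line: a value clocked in emerges `depth` cycles later."""

    def __init__(self, depth: int, fill=0):
        self._stages = deque([fill] * depth)

    def clock(self, value):
        self._stages.append(value)
        return self._stages.popleft()


@dataclass(frozen=True)
class L0Record:
    source: int        # 3-bit trigger source
    bxid: int          # 2-bit Bunch Crossing ID (assumed: least significant bits)
    l0_event_id: int   # 12-bit count of accepted L0 triggers
    force: bool        # forced trigger: L1 decision must be positive


def make_record(source: int, bxid: int, accepted_count: int, force: bool) -> L0Record:
    """Truncate each field to the width quoted in the text."""
    return L0Record(source & 0x7, bxid & 0x3, accepted_count & 0xFFF, bool(force))


depth = pipeline_depth(upstream_latency_cycles=150)   # hypothetical upstream latency
pipe = DelayPipeline(depth)
outputs = [pipe.clock(i) for i in range(1, 13)]

l0_fifo = deque()
l0_fifo.append(make_record(source=0b010, bxid=3563, accepted_count=4096, force=True))
```

The record is pushed at L0 accept time and popped when the matching L1 decision arrives, so the FIFO depth in hardware has to cover the number of L0 accepts that can be outstanding during the L1 latency.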
The RS receives the L1 trigger decision together with a 2-bit Bunch Crossing ID and a 12-bit L0 Event ID (see footnote 1). If the force bit is set, the decision is converted to positive. The 3-bit trigger type and two bits of the L0 Event ID are subsequently transmitted as a short broadcast according to the format in Fig. 4.
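A minimal sketch of this step is given below. It returns the two broadcast fields separately rather than packing them, because the bit layout is defined by Fig. 4 and is not reproduced here. Taking the two least significant bits of the L0 Event ID is an assumption, and the function names are hypothetical.

```python
def final_l1_decision(physics_accept: bool, force_bit: bool) -> bool:
    """A forced trigger is always accepted, whatever the physics decision."""
    return physics_accept or force_bit


def l1_broadcast_fields(trigger_type: int, l0_event_id: int):
    """Fields carried by the L1 short broadcast: the 3-bit trigger type and
    two bits of the L0 Event ID (assumed here to be the two lowest bits)."""
    return trigger_type & 0x7, l0_event_id & 0x3


decision = final_l1_decision(physics_accept=False, force_bit=True)
fields = l1_broadcast_fields(trigger_type=0b101, l0_event_id=0xABE)
```

Sending a few bits of the Event ID with every decision gives the FE electronics a cheap way to check that it is still aligned with the Readout Supervisor.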
The Readout Supervisor controls the trigger rates according to the status of the buffers in the system in order to prevent overflows. Due to the distance and the high trigger rate, the L0 FE buffer occupancy cannot be controlled in a direct way. However, as the buffer activity is completely deterministic, the RS has a state machine to emulate the occupancy. This is also the case for the L1 FE buffers. In case an overflow is imminent the RS throttles the trigger, which in reality is achieved by converting trigger accepts into rejects. The slower buffers and the event-building components feed back throttle signals via the dedicated throttle network to the RS (Fig. 3). Data congestion at the level of the HLT farm is signaled via the Experiment Control System (ECS) to the onboard ECS interface, which can also throttle the triggers. For monitoring and debugging, the RS has history buffers that log all changes on the throttle lines.
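The occupancy emulation can be illustrated with a small behavioural model. It assumes a buffer with a fixed depth that is drained one event at a time, each event taking a fixed number of clock cycles to read out. The class name and the parameter values in the example are hypothetical; the real buffer depths and readout times are not given in this section.

```python
class BufferEmulator:
    """Emulates the occupancy of a deterministic FE buffer and converts
    trigger accepts into rejects when the buffer is full."""

    def __init__(self, depth: int, readout_cycles: int):
        self.depth = depth
        self.readout_cycles = readout_cycles
        self.occupancy = 0
        self.throttled = 0       # number of accepts converted to rejects
        self._countdown = 0      # cycles left for the event being read out

    def clock(self, accept_requested: bool) -> bool:
        """Advance one bunch crossing; return the decision actually issued."""
        # Drain: the event at the head of the buffer is being read out.
        if self.occupancy > 0:
            self._countdown -= 1
            if self._countdown == 0:
                self.occupancy -= 1
                if self.occupancy > 0:
                    self._countdown = self.readout_cycles
        # Fill: accept only if there is room for one more event.
        if not accept_requested:
            return False
        if self.occupancy >= self.depth:
            self.throttled += 1
            return False
        self.occupancy += 1
        if self.occupancy == 1:
            self._countdown = self.readout_cycles
        return True


emulator = BufferEmulator(depth=2, readout_cycles=4)   # hypothetical parameters
requests = [True, True, True, False, False, True]
issued = [emulator.clock(r) for r in requests]
```

In this run the third accept arrives while both buffer slots are occupied and is converted to a reject; the sixth is accepted because one event has been read out in the meantime.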
The RS provides several means for autotriggering. It incorporates two independent uniform pseudorandom generators that produce L0 and L1 triggers, respectively, according to a Poisson distribution. The RS also has a unit running several state machines synchronized to the LHC orbit signal for periodic triggering of a single or a specified number of consecutive bunch crossings (timing alignment), triggering at a programmable time after sending a command to fire a calibration pulse, triggering at a given time on command via the ECS interface, etc. The source of the trigger is encoded in the 3-bit L1 trigger qualifier. The RS also has the task of transmitting various reset commands. For this purpose it has a unit running several state machines, also synchronized to the orbit signal, for transmitting Bunch Counter Resets, Event Counter Resets, L0 FE electronics resets, FE electronics resets, L1 Event ID resets, etc. The RS can be programmed to send the commands regularly or solely on command via the ECS interface.
Footnote 1: An alternative solution exists in which the L1 physics triggers are received via Gigabit Ethernet, in which case more information will be added. This possibility is already incorporated in the final prototype of the Readout Supervisor, which is currently ready for production.
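One common way to obtain Poisson-distributed triggers from a uniform pseudorandom generator is to accept each bunch crossing independently with a fixed probability. The sketch below uses that technique with Python's `random` module standing in for the hardware generator; it is an assumption about the approach, not a description of the RS firmware, and the function name and the 1 MHz example rate are hypothetical.

```python
import random

BUNCH_CLOCK_HZ = 40e6


def random_triggers(n_crossings: int, mean_rate_hz: float, rng: random.Random):
    """One accept/reject decision per bunch crossing. Each crossing is
    accepted with probability mean_rate / bunch clock, so the number of
    accepts in any time window is approximately Poisson distributed."""
    p_accept = mean_rate_hz / BUNCH_CLOCK_HZ
    return [rng.random() < p_accept for _ in range(n_crossings)]


# Two independent generators, one per trigger level, with separate seeds.
l0_rng = random.Random(1)
l1_rng = random.Random(2)

n_crossings = 400_000                       # 10 ms of beam time
l0_triggers = random_triggers(n_crossings, 1.0e6, l0_rng)
n_l0 = sum(l0_triggers)                     # expected value: 10 000
```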
The RS transmits the IP/Ethernet destination for the L1 event data and for the complete readout as long broadcasts.
The transmission of the various broadcasts is handled according to a priority scheme. The Bunch Counter and the Event Counter Reset have highest priority. Any clashing broadcast is postponed until the first broadcast is ready (L1 trigger broadcast, IP/Ethernet destination) or until the next LHC orbit (reset, calibration pulse, and all miscellaneous commands).
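A priority scheme of this kind can be modelled with a priority queue. In the sketch below only the top priority of the counter resets comes from the text; the relative order of the other broadcast classes, the class names and the API are assumptions, and the postponement of commands to the next LHC orbit is not modelled.

```python
import heapq
import itertools

# Lower number = higher priority. Only "counter_reset" being highest is
# taken from the text; the order of the remaining classes is assumed.
PRIORITY = {
    "counter_reset": 0,
    "l1_trigger": 1,
    "ip_destination": 2,
    "command": 3,
}


class BroadcastArbiter:
    """Selects the next channel B broadcast among the pending requests."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()   # keeps FIFO order within a class

    def request(self, kind: str, payload=None):
        entry = (PRIORITY[kind], next(self._sequence), kind, payload)
        heapq.heappush(self._heap, entry)

    def next_broadcast(self):
        """Return (kind, payload) of the winning request, or None if idle."""
        if not self._heap:
            return None
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload


arbiter = BroadcastArbiter()
arbiter.request("ip_destination", "farm node A")
arbiter.request("l1_trigger", "accept 1")
arbiter.request("counter_reset", "bunch counter")
arbiter.request("l1_trigger", "accept 2")
order = [arbiter.next_broadcast() for _ in range(4)]
```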
The RS keeps a large set of counters that record its performance and the performance of the experiment (dead-time etc.). In order to get a consistent picture of the status of the system, all counters are sampled simultaneously in temporary buffers waiting to be read out via the onboard ECS interface.
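The simultaneous sampling can be pictured as a latch between the live counters and the values visible to the control system. The following sketch shows the idea; the class, method and counter names are hypothetical.

```python
class CounterBank:
    """Live counters plus a snapshot that is latched in a single step."""

    def __init__(self, names):
        self._live = {name: 0 for name in names}
        self._snapshot = dict(self._live)

    def increment(self, name: str, amount: int = 1):
        self._live[name] += amount

    def sample(self):
        """Copy all live counters at once into the temporary buffers."""
        self._snapshot = dict(self._live)

    def read(self, name: str) -> int:
        """Slow readout: always returns values from the same sample."""
        return self._snapshot[name]


bank = CounterBank(["l0_accepts", "l0_throttled"])
bank.increment("l0_accepts", 100)
bank.increment("l0_throttled", 3)
bank.sample()
bank.increment("l0_accepts", 50)        # arrives while the readout is ongoing
readout = (bank.read("l0_accepts"), bank.read("l0_throttled"))
```

Because both values come from the same sample, ratios such as the dead-time fraction stay consistent even though the counters are read one after the other.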
The RS also incorporates a series of buffers analogous to a normal FE chain to record local event information and provide the DAQ system with the data on an event-by-event basis. The "RS data block" contains the "true" bunch crossing ID and the Event Number, and is merged with the other event data fragments during the event building.
V. IMPLEMENTATION OF THE READOUT SUPERVISOR
The Readout Supervisor has been designed with emphasis on versatility in order to support many different types of running mode, and modifiability for functions to be added and changed easily. For these reasons all the logic has been implemented using FPGAs.
The interface to the Experiment Control System is based on a commercial Credit Card PC with Ethernet from Digital Logic AG, Switzerland [5]. All the board logic on the RS is programmed, configured, controlled, and monitored via the Credit Card PC. All functionality is set up and activated via parameters that can be written at any time via a multiplexed 32-bit PLX local bus.
A first prototype of the Readout Supervisor has been built (Fig. 6) in a minimal version to perform feasibility tests of the most important and critical functionalities. The logic of the minimal version was implemented in ten FPGAs from the Altera FLEX 10KE family and, for the most speed-critical functions, from the MAX7000B family. The minimal version had fewer state machines for auto-triggering and the FE part was left out completely.
The second and final prototype of the Readout Supervisor will incorporate the complete functionality in four larger FPGAs from the Altera APEX20KE family. This reduces the routing on the PCB and leaves more room for modifications in the functionality without having to modify the hardware. It will also contain the full FE handling. The design of this second, final version has been completed and is ready for production.
VI. CONCLUSION
The LHCb Timing and Fast Control (TFC) system and the use of the TTC system are well established. The Readout Supervisor incorporates all mastership in a single module and provides considerable flexibility and versatility. Partitioning is well integrated through the TFC Switch and the Throttle Switches.
A full test bench of the TFC system exists including a final prototype of the TFC Switch and a first prototype of the Readout Supervisor, which has been implemented entirely using FPGAs. The tests have been successful and a second complete version of the Readout Supervisor has been designed and is now ready for production.
