The Level 1 Decision and Data Flow/Control subsystems of the CLEO-III Trigger produce and distribute a trigger decision every 42 ns based on input from calorimetry and tracking subsystems. This paper describes the free-running pipelined trigger decision logic that correlates axial and stereo tracking information, and combines time-aligned calorimetry information onto a common backplane. Programmable trigger decision boards monitor this backplane, and can be configured as desired to respond to a wide variety of trigger conditions. The resulting trigger decision is regulated by a throttling mechanism that allows the data acquisition system to modulate the trigger rate to maximize throughput without buffer overrun. A central signal distribution mechanism delivers the trigger decision and system clock to the front-end electronics.
I. INTRODUCTION
The CLEO-II experiment began accumulating data at the Cornell Electron Storage Ring (CESR) in 1989. Serving as a world-class facility for the study of heavy quark physics, both CLEO and CESR have undergone substantial upgrades in recent years resulting in improved performance. This paper (the third of three [1, 2] ) addresses the decisions and gating aspects of the trigger for the most recently completed CLEO detector upgrade, CLEO-III.
A detailed discussion of the design parameters for the trigger and data-acquisition systems appears in the CLEO-III Detector Proposal [3] , and again in the 1994 CLEO-III Detector Status Report [4] . Previous descriptions of the trigger system, written before the design was finalized, appear in the references [5, 6, 7] . We present a brief overview of the entire CLEO-III trigger, followed by more detailed descriptions of the level 1 decision, data flow control, and gating components of the trigger.
II. SYSTEM OVERVIEW
A schematic view of the CLEO-III trigger system is shown in Figure 1 . Data from the calorimeter and drift chamber are received and processed in separate VME crates by the appropriate circuit boards to yield basic trigger primitives such as the track count and topology in the drift chamber, as well as the shower count and topology in the electromagnetic calorimeter. The information from both systems is correlated by global trigger circuitry which generates an L1Pass strobe every time a valid trigger condition is satisfied. The L1Pass signals are conditionally passed by the data flow control circuitry to the gating and calibration modules for distribution to the data acquisition system. In addition, luminosity information is obtained from the digital calorimetry subsystem and provided to the CESR accelerator via the global trigger. Not shown is the conventional VME CPU which directs the QVME interfaces.
Details on the analog and digital calorimetry trigger can be found in the first companion paper [1] , and details of the axial and stereo tracking trigger can be found in the second companion paper [2] , located elsewhere in these proceedings.
III. DECISION
The Level 1 trigger decision is implemented as a collection of 9Ux400mm VME modules which obtain diverse trigger information from the calorimetry and tracking subsystems. As with most of the other CLEO data acquisition subsystems, configuration and supervision are provided by a commercial MVME2304 PowerPC module, which plays the dual role of crate controller (CTL) and data mover (DM). A timing interface module (TIM) is also present for clock and trigger distribution, as well as busy signaling. These modules are not specific to the trigger, and as such are not documented here. There are five unique module types in the Level 1 trigger decision subsystem. Communication between the modules is effected on a dedicated 3U high P5P6 backplane.
A. Calorimetry: CCGL
The crystal calorimetry global logic (CCGL) module accepts the accumulated projections and tile counts from the barrel and endcap SURF boards. The CCGL is primarily a cable connection point for conveying information onto the P5P6 backplane for use by the Level 1 trigger boards. Details of the SURF, and the tile processors (TPRO) which produce the information, can be found in the first companion paper [1] .
B. Axial Track Processing: AXPR
The wire-by-wire track-finding information from the axial tracking (AXTR) modules is processed by the axial processor (AXPR) to produce a preliminary track count that is made available to the L1 decision boards, and topological information that is fed to the tracking correlator. The AXPR also examines the time-evolution of the track count in each event to determine the event interaction time. Details of this can be found in the second companion paper [2] .
C. Tracking Correlation: TRCR
The topological output of the AXPR is combined with results from the stereo tracking boards (STTR) in the tracking correlator (TRCR) to provide more refined track counts and event topology information. This information includes a separate count of low and high momentum tracks as well as topological projections to both the inner and outer part of the detector, and is placed on the P5P6 backplane. Details of this can be found in the second companion paper [2] .
D. Level 1 Trigger: L1TR
The Level 1 Trigger (L1TR) module is the basic building block of the trigger decision. There may be one to eight L1TR modules, independently programmed for specific trigger conditions. Each module has equal access to the CCGL, AXPR and TRCR signals on the P5P6 backplane, as well as "external" conditionals that are passed to the P5P6 backplane via the LUMI board. All L1TR boards are identical in hardware, differing only in the way that their field programmable gate arrays (FPGAs) are configured. All L1TR boards see the same input information on the P5P6 backplane. The Trigger Logic Unit (TLU) allows the user to define 48 independent trigger "lines", each a (potentially complex) combinatoric function of the 179 P5P6 inputs. These lines are implemented using two layers of Altera APF8820 "FLEX" style in-system programmable (ISP) FPGA devices. The configuration PROMs for these FPGAs are easily reprogrammed "in system" via a custom VME interface on each board. Of these 48 lines, any 24 can be selected for further consideration by simple register manipulation. This final 48→24 "routing" involves no reprogramming and is done automatically with a few VME transactions at the start of every data collection run. The trigger pulses emerging on the 24 TLU output lines are one clock tick (42 ns) in duration. If several trigger conditions are satisfied by a single event, the trigger pulses on the respective trigger lines emerging from the TLU will typically be coincident.
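The 48→24 routing step can be pictured as a simple index selection. The following Python sketch is a behavioural illustration only, not the register-level implementation; the function name `route_lines` and the representation of the selection as a list of 24 indices are assumptions made for the example.

```python
def route_lines(tlu_lines, selection):
    """Pick 24 of the 48 TLU lines.

    tlu_lines: 48 booleans, the state of each trigger line on this tick.
    selection: 24 indices; selection[i] is the TLU line routed to output i
               (hypothetical encoding of the routing registers).
    """
    if len(tlu_lines) != 48:
        raise ValueError("expected 48 TLU lines")
    if len(selection) != 24 or any(not 0 <= s < 48 for s in selection):
        raise ValueError("selection must hold 24 indices in [0, 48)")
    return [tlu_lines[s] for s in selection]
```

Because only the selection changes, a new routing needs no FPGA reprogramming, which matches the start-of-run procedure described above.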
Each of the 24 trigger lines from the TLU is routed through a 24 bit prescaler. These devices are implemented in Altera 9400 series "MAX" ISP-FPGAs, and act as gates that pass every Nth trigger pulse, where the N for each trigger line is independently adjustable to be any integer between 1 and 2^24. For most trigger lines N is set to 1.
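The prescaler behaviour can be sketched in a few lines of Python. This is a behavioural model, not the FPGA logic; the class name `Prescaler`, its `clock` method, and the helper `passed` are hypothetical, and the choice to pass the last pulse of each group of N (rather than the first) is an assumption.

```python
class Prescaler:
    """Behavioural model of a 24-bit prescaler: passes every Nth input pulse."""

    MAX_N = 2 ** 24  # largest prescale factor quoted in the text

    def __init__(self, n=1):
        if not 1 <= n <= self.MAX_N:
            raise ValueError("prescale factor must be in [1, 2**24]")
        self.n = n
        self.count = 0  # pulses seen since the last one that was passed

    def clock(self, pulse):
        """Advance one 42 ns tick; return True if the pulse is passed on."""
        if not pulse:
            return False
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


def passed(n, pulses):
    """Number of pulses that survive a prescale factor n."""
    p = Prescaler(n)
    return sum(p.clock(x) for x in pulses)
```

With N = 1 every pulse is passed, which is the setting used for most trigger lines.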
A single APF8820 chip is used to implement the "OR/Bunch" block shown in Figure 3 . The primary function of this is to simply "OR" the 24 prescaled trigger lines to produce the L1Pass signal that is sent to the DFC via the Luminosity module. The secondary, more silicon intensive role of this chip is to keep track of the time of the trigger relative to the phase of the accelerator timing system. This map of trigger time versus accelerator phase can be used to suppress erroneous triggers that are not associated with beam crossings. An example is triggers from cosmic rays. This capability of the "OR/Bunch" block has not been used.
The 24 prescaled trigger lines are also sent to a group of 24 scalers, whose job is simply to count the number of times each trigger line fired. The dynamic range of each scaler is 40 bits. Like all registers on the board, scaler information can be accessed via the local VME interface.
E. Luminosity: LUMI
As a central connection point, the Luminosity (LUMI) module monitors the open-collector L1Pass signal produced by the L1TR module(s), and forwards the signal to the flow control and gating subsystem for distribution to the data acquisition system. It also distributes the accelerator phase information, which comes from the flow control system, to the L1TR boards.
The LUMI board is a multipurpose module that performs several service tasks, including those listed above, for the L1TR boards. Additionally, it collects and counts the online luminosity, for which it is named. The online luminosity is determined from events produced by a well understood and common physics process known as Bhabha scattering. These events are characterized by two high energy back-to-back electromagnetic clusters. The list of high energy clusters in each of the crystal calorimeter endcaps is delivered directly to the LUMI module from the endcap SURF board. The LUMI applies the back-to-back criterion and scales the singles (for potential background subtraction) and back-to-back events.
These data, for a single point in time, can be collected through the VME interface while the system is running.
The LUMI also has connectors to accept "external" trigger or inhibit signals from sources such as pulse generators or random sources. These external signals are distributed across the P5P6 backplane to the L1TR modules.
IV. FLOW CONTROL AND GATING
The flow control and gating logic is implemented as 6Ux160mm VME boards. The smaller form factor was chosen to simplify the design, reduce cost, and provide better control of the timing jitter. As with most of the other CLEO data acquisition subsystems, configuration and supervision are provided by an MVME2304 PowerPC module, which plays the dual role of crate controller (CTL) and data mover (DM). Unlike other subsystems, there is no timing interface (TIM); the equivalent functions are inherent in the flow control system itself.
There are two unique module types in the flow control and gating subsystem.
A. Data Flow Control: DFC
The data flow control (DFC) module accepts L1Pass signals from the L1TR module(s) by way of the LUMI, and Busy (and Error) signals from the various data acquisition subsystems by way of the TIM and gating/calibration modules (described below). If the data acquisition system is available to accept event data, the L1Pass is asserted as L1Accept and is distributed by the gating/calibration modules to the TIM interfaces throughout the CLEO data acquisition system. L1Accept indicates that the current event is to be read out, assembled by the event builder, and processed by the Level 3 farm.
If any element of the data acquisition system is not available to accept event data, it can assert a Busy signal which the DFC will respect. No L1Accept signals are transmitted if any subsystem is asserting Busy or Error. The Error signal is similar to the Busy signal in that it can be asserted by any subsystem and is respected by the DFC. The intent, however, is for subsystems to assert Error in response to conditions where they expect to remain Busy until further action is taken. Error generates an interrupt to the DM/CTL module in the flow control/gating subsystem, which in turn triggers error recovery. In principle the DM/CTL could monitor Busy and initiate error recovery on time out, but the latency to initiating the error recovery would be substantially greater.
In order to enforce a maximum trigger rate of 1 kHz, the DFC generates its own local Busy signal. The duration of this self-Busy is programmable over the range of 84 ns to 2.7 ms.
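The throttling rule described above (an L1Pass becomes an L1Accept only if no Busy, Error, or self-Busy is active) can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the DFC firmware: the class name `FlowControl`, the tick-by-tick `clock` interface, and the exact tick on which the self-Busy starts and ends are all hypothetical.

```python
class FlowControl:
    """Behavioural model: L1Pass becomes L1Accept only when nothing is busy."""

    def __init__(self, self_busy_ticks):
        self.self_busy_ticks = self_busy_ticks  # programmable self-Busy length
        self.self_busy_left = 0                 # ticks of self-Busy remaining
        self.total_l1 = 0                       # L1Pass signals seen
        self.event_num = 0                      # L1Accept signals issued

    def clock(self, l1pass, busy=False, error=False):
        """Advance one 42 ns tick; return True if an L1Accept is issued."""
        blocked = busy or error or self.self_busy_left > 0
        if self.self_busy_left > 0:
            self.self_busy_left -= 1
        if not l1pass:
            return False
        self.total_l1 += 1
        if blocked:
            return False  # L1Pass rejected; no L1Accept is transmitted
        self.event_num += 1
        self.self_busy_left = self.self_busy_ticks  # start the self-Busy
        return True
```

In this model, `total_l1 - event_num` is the number of rejected L1Pass signals. For scale, a 1 kHz ceiling corresponds to a self-Busy of roughly 1 ms / 42 ns ≈ 23,800 ticks.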
The central logic of the DFC is provided by 4 Altera field programmable gate arrays. One (VME_LO) is specifically allocated to the VME interface. The second (VME_HI_A) provides the primary DFC function of L1Pass/L1Accept determination and bookkeeping. The remaining two FPGAs (VME_HI_BC) are identical, and provide Busy and Error bookkeeping respectively. For layout reasons, the VME_LO FPGA communicates with the lower half of the VME data bus (bits 15..0), while the HI FPGAs utilize the upper half of the bus (bits 31..16).
The central clock reference for the DFC is provided by the CESR accelerator timing system. This 42 ns clock is used as the basis for all synchronous actions. A clock selection circuit is present to allow the CESR clock, or its complement (21 ns shifted) to be used. An on-board crystal is also available, as well as a back-up TTL clock input; all are VME configuration register selectable. In addition to being used by the DFC, copies of the selected clock, with various delays, are distributed as true and complemented TTL, and differential PECL across unused lines of the VME backplane. Throughout the DFC design, multiple/redundant solutions are provided to maximize flexibility.
Central to the DFC function is a series of gated counters. Using the 42 ns clock, the DFC counts the number of ticks since initialization (CESR_TIME), the number of ticks that Busy has been asserted (TOTAL_BUSY), and the number of ticks that Error has been asserted (TOTAL_ERROR). The Busy and Error counters are 31 bits wide, with a sticky overflow bit; the CESR_TIME counter is a full 32 bits wide. In addition to the TOTAL registers for Busy and Error, a pair of 15 bit (plus sticky overflow) counters are provided for the CURRENT_BUSY and CURRENT_ERROR, corresponding to the assertion of these signals since the last L1Accept. To assist in system performance monitoring, a pair of primitive peak value registers, MAX_BUSY and MAX_ERROR, are also present. The bits of these 15 bit registers are set whenever the corresponding bit in the CURRENT register is set, and cleared only on VME command. As such, the MAX registers provide a logarithmic bar-graph of the peak value of the Busy and Error signals.
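The relationship between the TOTAL, CURRENT, and MAX registers can be made concrete with a short behavioural sketch. The class names `GatedCounter` and `BusyMonitor` are hypothetical, and the ordering of the clear and the count on the tick of an L1Accept is an assumption; only the widths and the bitwise-OR behaviour of MAX are taken from the description above.

```python
class GatedCounter:
    """Counts clock ticks while a signal is asserted, with a sticky overflow bit."""

    def __init__(self, width):
        self.mask = (1 << width) - 1
        self.value = 0
        self.overflow = False  # sticky: stays set until explicitly cleared

    def clock(self, asserted):
        if asserted:
            if self.value == self.mask:
                self.overflow = True  # counter is about to wrap
            self.value = (self.value + 1) & self.mask

    def clear(self):
        self.value = 0
        self.overflow = False


class BusyMonitor:
    """TOTAL, CURRENT and MAX bookkeeping for one signal (Busy or Error)."""

    def __init__(self):
        self.total = GatedCounter(31)
        self.current = GatedCounter(15)
        self.max_bits = 0  # bitwise OR of every CURRENT value seen

    def clock(self, asserted, l1accept=False):
        if l1accept:
            self.current.clear()  # CURRENT restarts at each L1Accept
        self.total.clock(asserted)
        self.current.clock(asserted)
        self.max_bits |= self.current.value
```

Since MAX accumulates set bits rather than storing a true maximum, its highest set bit brackets the peak to within a factor of two: `max_bits.bit_length() == 3` means the longest CURRENT count reached somewhere between 4 and 7 ticks.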
Two additional counters are present, but are effectively clocked by the L1Pass and L1Accept signals themselves. TOTAL_L1 counts the number of L1Pass signals, and EVENT_NUM counts the number of L1Accept signals. The difference between these two is the number of L1Pass signals that were rejected because one or more data acquisition subsystems were busy or in error. Both registers are 32 bits.
Most of the above registers correspond to system performance monitoring or control. The DFC does contribute one element to the actual event data stream: event time. Using the lower 3 bits of the EVENT_NUM as an address, 8 registers are set aside to hold copies of the CESR_TIME corresponding to the L1Accepts. The DM/CTL receives an interrupt for each L1Accept, reads the corresponding event time(s), and advances the READ_PTR register. The 3 bit READ_PTR is a simple overflow control mechanism. If the DM/CTL does not continue to advance this pointer, to keep it ahead of the EVENT_NUM, the DFC will regard this shortcoming as equivalent to a DM/CTL "Busy" and will suspend L1Accept signals until the condition is resolved.
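The event-time registers behave like an eight-slot ring buffer with the DFC as writer and the DM/CTL as reader. The sketch below is a behavioural model under stated assumptions: the class name `EventTimeBuffer` is hypothetical, and for clarity it tracks a full read count in software (exposing only its low 3 bits as READ_PTR), whereas the hardware comparison of the 3 bit pointer against EVENT_NUM is not specified here.

```python
class EventTimeBuffer:
    """Eight-slot ring of CESR_TIME stamps addressed by the low 3 bits of EVENT_NUM."""

    SLOTS = 8

    def __init__(self):
        self.times = [0] * self.SLOTS
        self.event_num = 0   # count of L1Accepts (write side)
        self.read_count = 0  # events read so far; READ_PTR is its low 3 bits

    @property
    def read_ptr(self):
        return self.read_count & 0b111

    def full(self):
        """True when another L1Accept would overwrite an unread time stamp."""
        return self.event_num - self.read_count >= self.SLOTS

    def l1accept(self, cesr_time):
        """Record the time stamp; return False (treated as Busy) if full."""
        if self.full():
            return False
        self.times[self.event_num & 0b111] = cesr_time
        self.event_num += 1
        return True

    def read(self):
        """DM/CTL side: fetch the oldest unread time stamp and advance READ_PTR."""
        if self.read_count == self.event_num:
            raise IndexError("no unread event times")
        t = self.times[self.read_ptr]
        self.read_count += 1
        return t
```

If the reader stalls, the ninth L1Accept is refused, which corresponds to the DFC suspending L1Accept signals until the DM/CTL catches up.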
The DFC communicates with the gating/calibration modules using unused bused lines of a standard VME64x backplane. Both TTL and differential PECL are used for each signal, to maximize flexibility and allow for minimum jitter configuration. L1Accept, Busy and Error have been described above. In addition, a synchronization signal, Synch, is provided by the DFC. It is asserted once for every 256 L1Accept signals. If any part of the data acquisition system sees an L1Accept, and based on internal event counting, expects but does not see Synch, that condition is defined as an Error. Similarly, the presence of Synch at any other L1Accept is equally an Error. This mechanism has a relatively long latency for catching random L1Accept noise (excess or loss), but is extremely low cost. Two other signals, L2Data and L2Strobe, are available from the DFC, but have not been defined at this time. Their application is for a potential future Level 2 trigger to accept/veto data collection.
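The Synch consistency check performed by each data acquisition subsystem can be sketched as follows. This is an illustrative model, not a description of any subsystem's firmware: the class name `SynchChecker` is hypothetical, and the assumption that Synch accompanies the 256th, 512th, ... L1Accept (rather than some other fixed offset) is made only for the example.

```python
SYNCH_PERIOD = 256  # one Synch per 256 L1Accepts, as stated in the text


class SynchChecker:
    """Receiver-side check that Synch arrives on exactly every 256th L1Accept."""

    def __init__(self):
        self.count = 0      # L1Accepts seen, modulo SYNCH_PERIOD
        self.error = False  # latched on the first mismatch

    def l1accept(self, synch):
        """Process one L1Accept; return True if the Synch flag was as expected."""
        self.count = (self.count + 1) % SYNCH_PERIOD
        expected = self.count == 0
        if synch != expected:
            self.error = True
        return synch == expected
```

A receiver that misses a single L1Accept is not flagged immediately; the mismatch only appears when the next Synch arrives one count early, which illustrates the long detection latency noted above.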
B. Gating and Calibration: GCAL
The gating/calibration (GCAL) boards provide trigger distribution and data acquisition subsystem status gathering. While there is only one DFC module, there may be up to 18 GCAL boards in the immediate flow control/gating VME subrack, and additional modules can be provided in an extension subrack using the differential PECL signaling options from the DFC.
The central logic of the GCAL is provided by 3 Altera field programmable gate arrays. One is specifically allocated to the VME interface. The remaining two FPGAs are identical, and provide the basic GCAL function for each of two channels.
Each GCAL services two TIM modules, and hence two data acquisition subracks. While the VME interface and backplane signal conditioning are shared, the two halves of the GCAL (each servicing one TIM) are completely symmetric. In the following discussion, the GCAL function will be described in terms of only one TIM port, even though in fact there are two.
The GCAL receives a variety of clock signals from the DFC across unused lines on the VME backplane, both as TTL and as two phase shifted copies of differential PECL. This diversity was incorporated in the design to provide alternative signaling solutions, in an attempt to find one with minimum jitter.
Clock selection circuitry on the GCAL is available to choose between the various TTL and PECL clock versions, in both true and complemented form. As such, the 42 ns CESR clock can be tuned with a 21 ns precision. This allows fine adjustment between the DFC and the GCALs (and their downstream TIM modules), to accommodate differing setup and hold times for differing modules if needed.
L1Accept is also conveyed from the DFC to the GCALs via both TTL and differential PECL (jumper selectable), and time synchronized against the selected 42 ns clock. L1Accept can be delayed by an integral number of 42 ns ticks, from 42 ns to 1.4 ms. The width of the L1Accept sent to the TIM module is programmable from 42 ns to 10 µs. Since the L1Accept is used by some data acquisition subsystems as a timing signal, the delayed/width-controlled result is resynchronized against the 42 ns clock prior to distribution. All of the critical timing circuitry is implemented in PECL for minimum jitter. A calibration signal (CAL) is treated in the same manner: TTL/PECL provided by the DFC, synchronized, delayed, resynchronized, and transmitted to the TIM.
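Because the delay and width are set in whole clock ticks, a requested time must be converted to a tick count before it is written to the GCAL. The helpers below sketch that arithmetic; the function names are hypothetical, and the rounding rule (nearest tick, minimum of one) is an assumption rather than documented register behaviour.

```python
TICK_NS = 42  # CESR clock period quoted in the text


def ns_to_ticks(duration_ns):
    """Round a duration to the nearest whole number of 42 ns ticks (minimum 1)."""
    return max(1, round(duration_ns / TICK_NS))


def ticks_to_ns(ticks):
    """Duration in nanoseconds of a whole number of 42 ns ticks."""
    return ticks * TICK_NS
```

For example, a 10 µs width corresponds to `ns_to_ticks(10_000)`, about 238 ticks. The quoted 1.4 ms maximum delay is numerically close to 2^15 ticks (`ticks_to_ns(2**15)` is about 1.38 ms), although the actual register width is not stated here.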
An additional synchronization signal, Synch, is provided once for every 256 L1Accept signals. All timing associated with Synch is provided by the DFC.
The outgoing L1Accept, Synch, CAL, and 42 ns clock signals are provided to the TIM modules as low voltage differential (LVDS) signals. The corresponding replies from the TIM modules, Busy and Error, are received as LVDS. Two additional outgoing LVDS signals, L2Data and L2Strobe, are available from the DFC to the TIM via the GCALs. These are not in use at this time.
The GCAL maintains bookkeeping registers, much as the DFC, for current, max (logarithmic bar-chart), and total Busy and Error for each TIM. These are purely for performance monitoring and problem diagnostics. They allow finer granularity as to which TIMs are contributing the most Busy time, or which are asserting Error.
The GCAL does not contribute to the data event stream.
V. STATUS
The Level 1 trigger decision and flow control/gating subsystems were built, then installed at CLEO-III during 1999-2000. As of October 2000, all module types are in place and fully operational. Two L1TRs are in place at this time; additional modules will be added as the triggering needs of the experiment grow more complex.
VI. SUMMARY
The trigger decision and gating elements of the CLEO-III trigger have been presented. The modular and reprogrammable nature of the trigger decision logic allows significantly diverse trigger conditions to be specified, and the flow control/gating logic distributes the event collection commands while respecting the availability of the data acquisition system.
