*University of Toronto, Toronto, Ontario
Section 1: Introduction
We will describe in this paper the design of a powerful new high level trigger processing system that is flexible enough to be used in a wide variety of high energy physics applications. The system is based on a newly evolving technique of using large fast access memories as table look-ups in place of combinational logic. It uses high speed emitter coupled logic (ECL) and memories and is capable of making trigger decisions requiring complicated calculations in a few microseconds. It is totally modular in concept, representing an extension in power and flexibility of NIM fast logic philosophy rather than a fast, reduced-capability version of a small computer. This system has no computer-like program counters, clocks, instruction registers, etc., which unnecessarily slow down fast trigger decision logic. Programming the processor is done by cabling modules together at the front panels, as in NIM fast logic, and by loading control information into modules via CAMAC. The modules operate in an unclocked, non-handshake mode. Far more powerful than fast logic, these modules operate on multi-bit words, finding tracks, calibrating pulse heights and computing complicated functions. Single module operations typically take 50 nanoseconds. Modular RAMs (Random Access Memories) allow table look-up of arbitrary functions. They are also used to control program flow, for example aborting processing when certain conditions have been met. These RAMs may be loaded and checked through CAMAC. Simple logic functions (ANDs and ORs) are handled with General Logic Modules.
All internal data is transmitted using differential ECL drivers and receivers over single and multiconductor twisted pair cables. Hard-wire fanouts are not permitted. However, many modules provide buffered outputs of input data, allowing fanout via daisy-chaining of modules.
Although the system operates in a high speed unclocked mode, system algorithms may include hardware processing of nested loops, conditional branching and subroutines, operations normally associated with serial computer-type processing. Unclocked operation means that each module only waits for data specifically required at its inputs; thus processing is completed in the shortest possible time. Operation is controlled by a system of Ready lines. When all required input Readys to a given module are set, the module starts its operation. When a module has completed its operation, the module sets its output Ready line. In almost all applications, intermodule timing need not be a concern of the user. The difficulties of timing between two asynchronous loops, with one loop feeding data to the other, are handled automatically by the Stack module functioning as a FIFO or a randomly readable buffer.
The user need only interconnect modules to set up the desired algorithm and load memories with appropriate functions and tables.
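The Ready-line discipline described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design: the `Module` class, the `run_until_quiet` function and the line names are hypothetical, and each module is simply given the typical 50 ns operation time quoted in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Module:
    name: str                      # also used as the name of the module's output line
    inputs: Tuple[str, ...]        # lines whose Readys must all be set (at least one)
    operation: Callable[..., int]  # function applied to the input words
    delay_ns: int = 50             # assumed typical single-module operation time


def run_until_quiet(modules: List[Module], sources: Dict[str, int]):
    """Fire each module once, as soon as all of its input Readys are set."""
    values = dict(sources)                   # data word on each line
    ready_at = {line: 0 for line in sources}  # time (ns) at which each Ready was set
    waiting = list(modules)
    fired = True
    while waiting and fired:
        fired = False
        for module in list(waiting):
            if all(line in ready_at for line in module.inputs):
                args = [values[line] for line in module.inputs]
                values[module.name] = module.operation(*args)
                # The output Ready is set one module delay after the last input Ready.
                latest_input = max(ready_at[line] for line in module.inputs)
                ready_at[module.name] = latest_input + module.delay_ns
                waiting.remove(module)
                fired = True
    return values, ready_at


# The listing order of the modules does not matter; only the cabling does.
modules = [
    Module("doubled", ("total",), lambda total: 2 * total),
    Module("total", ("a", "b"), lambda a, b: a + b),
]
values, ready_at = run_until_quiet(modules, {"a": 3, "b": 4})
print(values["doubled"], ready_at["doubled"])
```

In this model no module is told when to run; completion time is set only by the longest chain of Ready dependencies, which is the point of unclocked operation.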
To permit system testing, nearly every intermodule data path can be monitored or controlled by CAMAC read and write commands. The system is packaged using CAMAC hardware and a modified CAMAC crate with an ECL backplane and a CAMAC TTL to ECL translator in slot 23.
As seen from the host computer the system responds and transmits according to CAMAC standards.
A total of eighteen different modules have been designed. These include ten general purpose modules of varying complexity such as Memory Look-Up, Data Stack, General Logic, Fan Out and level conversion modules. There are four test modules, three limited purpose modules associated with track finding, and one special purpose module relevant only to the initial experimental application of the system. Many of these modules will be described in the following sections.
Further details and full documentation on the modules, hardware, etc., may be found in ECL/CAMAC Trigger Processor System Documentation, Fermilab Technical Memo TM-821.
Multi-Level Triggers
A multi-level triggering philosophy is fundamental to the concept of powerful modern trigger processing. Conventional NIM logic is used to make a loose "low-level" trigger on a time scale of ~100 nsec. This trigger is used to gate data recording electronics such as CAMAC ADCs, TDCs, latches, etc. If the low-level trigger rate is kept below 10^4/sec, for example, then high level decisions on whether or not the data should be recorded by the computer may take as long as 10 μsec, resulting in only 10% dead time. This is long enough for very sophisticated computations, using the fast integrated circuitry and memories now available, to reduce the data recording rate to the order of 100 events/sec. If [...] counter, a hadron calorimeter and μ identifiers. The initial application of the processor will be triggers based on the recoil system. We will use this application as an example showing the broad flexibility of the processor system. The recoil system is shown in cross sectional views in Figure 1.
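The dead-time figure quoted in the multi-level trigger discussion follows from simple arithmetic: the fraction of time spent deciding is roughly the low-level trigger rate times the high-level decision time. The sketch below uses that first-order estimate; the function name is hypothetical, and the estimate ignores triggers that arrive while the processor is already busy.

```python
def dead_time_fraction(low_level_rate_hz: float, decision_time_s: float) -> float:
    """First-order dead-time estimate: rate of accepted low-level triggers
    multiplied by the time each one keeps the system busy."""
    return low_level_rate_hz * decision_time_s


# Numbers from the text: 10^4 low-level triggers per second, 10 microseconds per decision.
fraction = dead_time_fraction(1.0e4, 10.0e-6)
print(f"{fraction:.0%}")
```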
Recoil System
[...] target (which may be up to 2 m long). To simplify track computations, the target to inner chamber distance is equal to the inter-chamber spacing. Surrounding the MWPC are scintillation counters in 15 azimuthal sections and four radial layers. The inner two layers are plastic and the outer two are liquid. In the MWPC the anodes run parallel to the target, spaced every few millimeters azimuthally. The outer cathodes of the MWPC consist of mylar on which a grid of wires has been wound and glued at 3 mm spacing. The mylar is bent and glued to the inside of a honeycomb support cylinder. Thus the wires, which run in circles perpendicular to the beam, measure Z, the distance along the beam line. The signal induced by charge flowing toward the anode is read out on these cathode wires. The wire chambers provide a means for measuring the polar angle, θ, of tracks recoiling from interactions in the target. The energy deposited by particles traversing the scintillators is also measured. Recoiling protons of typical angle with kinetic energy between about 50 and 250 MeV stop in the scintillator and thus their total energy, E, is readily measured. With somewhat more difficulty the energy of protons over 250 MeV can be determined from the dE information measured in the scintillators. This information can also be used to identify the recoiling particle as a proton. Even if nothing is measured directly in the forward going system, X, in γp → Xp, the missing mass, Mx, can be computed from the measurements of the recoil proton E and θ if the photon energy Eγ is known. Eγ is measured by the photon tagging system.
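The missing-mass computation mentioned above is ordinary two-body kinematics. The sketch below is an illustration under stated assumptions rather than the processor's table contents: the target proton is taken to be at rest, E is interpreted as the recoil proton's kinetic energy T, θ is measured from the photon beam direction, and energies are in GeV. The function and constant names are hypothetical.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton rest mass in GeV


def missing_mass_squared(e_gamma: float, t_recoil: float, theta: float) -> float:
    """Mx^2 for gamma p -> X p with the target proton at rest.

    From four-momentum conservation, Mx^2 = (k + p - p')^2, which reduces to
    2*E_gamma*(|p'|*cos(theta) - T) - 2*m*T for a massless photon.
    """
    p_recoil = math.sqrt(t_recoil * (t_recoil + 2.0 * PROTON_MASS_GEV))
    return (2.0 * e_gamma * (p_recoil * math.cos(theta) - t_recoil)
            - 2.0 * PROTON_MASS_GEV * t_recoil)


# Illustrative values only: 100 GeV photon, 100 MeV recoil proton at 1 radian.
mx2 = missing_mass_squared(e_gamma=100.0, t_recoil=0.1, theta=1.0)
print(math.sqrt(mx2) if mx2 > 0 else float("nan"))
```

Because Mx depends only on Eγ, E and θ, each of which is available as a short digital word, a function of this kind is a natural candidate for a table look-up.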
The trigger processor system will be used first to select events with only a proton in the recoil detector. This strongly enhances the sample of diffractively produced forward-going states. Then the processor computes the missing mass and triggers on a selected range, for example 4 < Mx < 5 GeV to enhance diffractive charm events. The system is capable of handling a reasonable number of spurious tracks (δ rays) and secondary interactions. It is estimated that an average of 10 μsec will be required for this trigger.
Briefly, the recoil trigger system will operate as follows (refer to Figure 2). [...] subsystems are described, details of the recoil trigger will be explained as examples.
Section 2: Packaging
The problems associated with packaging of electronic circuitry are quite often grossly underestimated.
The foremost design objectives of the trigger processor were that it be fast and that the design be general in nature so that the hardware will be both easily adaptable to the dynamic requirements of ongoing experiments and reusable and mass-producible for future experiments. These goals dictated a modular system implemented with high-speed ECL technology. In addition, data usually is read from different sources and several modules require two-way communication with the host computer. Because of the processor's size and complexity, it was necessary to provide easy mechanisms for both on-line and off-line diagnostics. These additional requirements and the general availability of CAMAC hardware and software led to the packaging scheme shown in Figure 3. [...] The ECL CAMAC power supply consists of four commercial supplies packaged in a nineteen-inch, rack-mountable case seven inches high.
ECL CAMAC Module Construction Techniques
Owing to the high-speed properties of ECL-10000 circuitry, all system modules were constructed using a microstrip technique, i.e., wire over a ground plane, to provide nearly uniform signal transmission characteristics within the modules. The three module construction techniques used in the project are described and compared below.
An ECL wire-wrap CAMAC kludge-board was designed for the project. It has space for up to 62 sixteen-pin ICs, of which up to eight may be replaced with 24-pin ICs. Most modules needed in low quantities (including diagnostic modules) were wire-wrapped using this board. The characteristic impedance of signal lines on these wire-wrapped boards varied from 100 Ω to 140 Ω; 120 Ω terminating resistors were used.
Modules required in larger quantities and designed in the project's early stages were constructed using three-layer printed circuit (PC) boards with the middle layer a ground plane. Power distribution for these modules is by mini-busses above the board. The characteristic impedance of signal lines on these boards holds fairly constant at 100 Ω (i.e., two oz. copper, ten mil signal line width and 1/32 inch from signal line to center ground plane). This method of construction has two disadvantages. First, due to high-speed requirements, signal paths should not contain forks which add long stubs to the transmission lines. Signals must flow from a driver to the first receiver, then to the next receiver, and so on to the terminating resistor. This requirement limits the number of ICs on a given PC board area to roughly 60% of the number on a conventional TTL design. [...] Table 2 compares the relative parameters of the three construction techniques.
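The 100 Ω figure quoted for the PC boards can be checked roughly against a common closed-form approximation for surface microstrip. The sketch below is only a plausibility check: the relative permittivity of the board material (4.5, typical of glass-epoxy laminate) and the 2.8 mil thickness taken for two-ounce copper are assumptions not stated in the text, and the function name is hypothetical.

```python
import math


def microstrip_impedance_ohms(width_mil: float, height_mil: float,
                              thickness_mil: float, eps_r: float) -> float:
    """Closed-form surface-microstrip approximation:
    Z0 = 87 / sqrt(eps_r + 1.41) * ln(5.98 * h / (0.8 * w + t))."""
    return (87.0 / math.sqrt(eps_r + 1.41)
            * math.log(5.98 * height_mil / (0.8 * width_mil + thickness_mil)))


# Geometry from the text: 10 mil line, 1/32 inch (31.25 mil) above the ground plane.
# Assumed: 2 oz copper is about 2.8 mil thick; eps_r = 4.5 for the laminate.
z0 = microstrip_impedance_ohms(width_mil=10.0, height_mil=31.25,
                               thickness_mil=2.8, eps_r=4.5)
print(round(z0))
```

Under these assumptions the estimate comes out close to the 100 Ω stated for these boards.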
Module Interconnections and Asynchronous Timing Requirements
The ease of interconnecting modules along with the ability to program modules via CAMAC helps meet the design objective of flexibility. Flexibility is necessary so that the processor system will be adaptable to future needs and to the dynamic requirements of ongoing experiments. All interconnections between trigger processor modules and almost all interconnections between these modules and other experimental electronics are via ECL differential drivers and receivers. All interconnections are made at the modules' front panels using 110 Ω multiple twisted-pair ribbon cable with industry standard flat ribbon connections for data and 110 Ω single twisted-pair cable with LEMO connectors for input Ready strobes.
Since the system is totally asynchronous (i.e., not sequential) one [...] Table 3 summarizes the modules designed and built for the first application of this trigger processor system.
Of the eighteen different types and 108 modules built, sixteen types and 102 modules are general in nature and can be used in other experiments, eliminating the major portion of project design and development time.
All modules were specified and designed so that they could be computer diagnosed either individually or as part of subsystems or as a complete system. Additionally all subsystems and the system as a whole can be single-stepped by the host computer providing an additional level of diagnostic capability. These testing mechanisms make it easy to uncover failures both inside modules and in connecting cables. Routine monitoring of system operation during experimental running can be handled by the host computer. Monitoring and diagnostic software are discussed further in a later section.
Several of the system's important modules as well as the track finder subsystem will now be described.
[...] A sequential read is initiated by a signal at the Sequential Read Ready input. The output response is similar to the Random Read except that the read address comes from an internal counter which is incremented when the Output Ready goes high.
Since the internal RAM is a single access memory, logic in the module sorts out the write requests and the two types of read requests and stores them until they can be handled. The stack alternates between reads and writes when both are present by setting priority for the next request based on the type of request currently being serviced. For fast uninterrupted block writes to the stack, a read inhibit line is provided which locks out random or sequential reads.
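The request arbitration described for the stack can be modeled in a few lines. This is a behavioral sketch only, with hypothetical class and method names; the choice to let a write go first when the very first requests collide, and to return `None` for an address not yet written, are assumptions not taken from the text.

```python
from collections import deque


class StackModel:
    """Behavioral model of a single-access memory with queued requests."""

    def __init__(self):
        self.memory = []               # words in the order they were written
        self.pending_writes = deque()
        self.pending_reads = deque()   # ("random", address) or ("sequential", None)
        self.sequential_counter = 0    # internal address counter for sequential reads
        self.read_inhibit = False      # when True, reads are locked out
        self.last_serviced = "read"    # assumption: a write wins the first collision

    def request_write(self, word):
        self.pending_writes.append(word)

    def request_random_read(self, address):
        self.pending_reads.append(("random", address))

    def request_sequential_read(self):
        self.pending_reads.append(("sequential", None))

    def service_one(self):
        """Service one stored request; return ("write", word), ("read", word) or None."""
        can_write = bool(self.pending_writes)
        can_read = bool(self.pending_reads) and not self.read_inhibit
        if can_write and can_read:
            # Alternate: priority goes to the type that was not just serviced.
            do_write = self.last_serviced == "read"
        elif can_write or can_read:
            do_write = can_write
        else:
            return None
        if do_write:
            word = self.pending_writes.popleft()
            self.memory.append(word)
            self.last_serviced = "write"
            return ("write", word)
        kind, address = self.pending_reads.popleft()
        if kind == "sequential":
            address = self.sequential_counter
            self.sequential_counter += 1
        self.last_serviced = "read"
        word = self.memory[address] if address < len(self.memory) else None
        return ("read", word)


stack = StackModel()
for word in (10, 20):
    stack.request_write(word)
stack.request_sequential_read()
stack.request_sequential_read()
history = [stack.service_one() for _ in range(4)]
print(history)
```

Setting `read_inhibit` to `True` before a burst of writes makes the model service writes only, which mirrors the uninterrupted block-write mode.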
Do Loop Indexer
This module acts as a controller that cycles through all values of two indices (I, K) much like a Fortran DO loop except that the upper limits may increase while the loop is in operation. Each pair of I ≥ 0, K ≥ 0 (or K ≥ KMIN(I) as explained below) up to the upper limits at the end of data transmission is presented on the I, K outputs once and only once. Upper limits at any moment are flagged externally by signals at the I Too High (ITH) or K Too High (KTH) inputs to this module. A K Too Low (KTL) input is also available to set the K lower limit (KMIN(I)) for all higher I. KTL is used, for example, to avoid wasting time on backward tracks as discussed in the next section. The module is able to control the random read of data from two stacks (for track finding, for example) while the stacks are still being filled with data (from MWPC readout, for example). A description of the use of this module in the Track Finding Subsystem may be found in the following section.
At the beginning of each cycle the module examines the input control bits (ITH, KTH, CONT, KTL, etc.). Using backing registers it quickly outputs new indices. Simplified, the algorithm followed is this:
1. If neither ITH nor KTH is true but CONT is received, then the new I is equal to the old I and K is incremented.
2. If KTH is true but ITH is not, I is incremented and K is set back. For an I that has never been tried before, K is set to zero; for an I that has already been partially examined, K is set to the value of K that last caused KTH to be set. No I, K pair will be transmitted twice.
3. If ITH is true, I is set back to zero and K is set back to the last value of K tried on the previous pass at this I. If all data was in the last time I was set back, then the pass through all the indices just performed is the last required, and ITH will cause a DONE signal. No more indices will be sent.
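The three rules above can be exercised in a short simulation, in the spirit of the Fortran simulator mentioned later in the paper. This is a simplified sketch, not the module's logic: it ignores KTL and KMIN(I), treats one loop iteration as one module cycle, and models the filling of the two stacks with a hypothetical `schedule` list whose last entry is taken to mean that all data is in.

```python
def run_indexer(schedule, max_cycles=10_000):
    """Return the (I, K) pairs accepted while the index limits grow.

    schedule[c] = (n_i, n_k): number of words present in the I and K stacks
    at cycle c. The last entry persists and means all data has arrived.
    """
    i = k = 0
    resume_k = {}                   # per-I backing value: the K that last caused KTH
    all_in_at_last_reset = False
    pairs = []
    for cycle in range(max_cycles):
        step = min(cycle, len(schedule) - 1)
        n_i, n_k = schedule[step]
        all_in = step == len(schedule) - 1
        ith = i >= n_i              # I Too High
        kth = k >= n_k              # K Too High
        if ith:                     # rule 3
            if all_in_at_last_reset:
                return pairs        # DONE: the pass just completed saw all the data
            all_in_at_last_reset = all_in
            i = 0
            k = resume_k.get(0, 0)
        elif kth:                   # rule 2
            resume_k[i] = k
            i += 1
            k = resume_k.get(i, 0)  # zero for an I never tried before
        else:                       # rule 1: valid pair, continue in K
            pairs.append((i, k))
            k += 1
    raise RuntimeError("indexer did not reach DONE")


# Stacks hold (2, 1) words for five cycles, then fill to their final (3, 3).
pairs = run_indexer([(2, 1)] * 5 + [(3, 3)])
print(pairs)
```

In this example every one of the nine index pairs is produced exactly once even though some are produced before the stacks are full, which is the property claimed for the module.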
The Track Finder Subsystem
The Track Finder Subsystem was designed to find straight line track segments from particles traveling through three evenly spaced wire chambers. These may be planar or concentric as in the recoil system described earlier and shown in Figure 1. The three chambers measure the coordinates of the points where the chambers are crossed by a particle trajectory, Zin, Zmid, and Zout, respectively. The system can also be used without change to correlate hits in U, V, X wire chamber systems, where U and V refer to coordinate axes typically ±20° from X. The Track Finder Subsystem is shown in Figure 4. In the recoil system several wires fire at the site of each hit. The readout circuitry for each chamber provides a list of hits, characterized by the number of adjacent wires in each cluster and the position of the center of the cluster. The Track Finder accepts this as data and outputs a list of tracks found, characterized by the slope of the track (cot a) and the position of its intersection with the beam line, the assumed interaction vertex (Vz).
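With three evenly spaced chambers, a straight track satisfies Zin + Zout = 2 Zmid, and because the target to inner chamber distance equals the chamber spacing, the vertex follows from the same differences. The sketch below is a brute-force software illustration of that geometry, not the hardware algorithm; the function name, the `tolerance` parameter and the example numbers are hypothetical.

```python
def find_tracks(z_in, z_mid, z_out, spacing, tolerance):
    """Return (slope, vertex_z) for every straight-line hit combination.

    Chambers are assumed to sit at distances spacing, 2*spacing and 3*spacing
    from the beam line, so z(r) = vertex_z + slope * r along a track.
    """
    tracks = []
    for zi in z_in:
        for zo in z_out:
            expected_mid = (zi + zo) / 2.0          # straight-line prediction
            for zm in z_mid:
                if abs(zm - expected_mid) <= tolerance:
                    slope = (zo - zi) / (2.0 * spacing)  # dz/dr, the cotangent of the polar angle
                    vertex_z = zi - slope * spacing      # extrapolate one spacing inward
                    tracks.append((slope, vertex_z))
    return tracks


# One real track through (10, 20, 30) plus a spurious middle-chamber hit at 50.
tracks = find_tracks(z_in=[10.0], z_mid=[20.0, 50.0], z_out=[30.0],
                     spacing=10.0, tolerance=0.5)
print(tracks)
```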
The centroid address and widths of groups of wires which have been hit during the current event are [...]
Hardware, whether at the module or subsystem level, must be developed in association with the preparation of software used in its checkout and later in its operation. In this section we describe our software and the knowledge we have gained of the relationship of software to hardware development and system operation.
Checkout software has been prepared for all the modules we have designed. This will make future production of these designs easier. Operational software, loading MLUs from a PDP-11 for example, will also be available.
[...] MLUs whose memories contain the functions or tables that control the physics of the trigger. A general purpose MLU loading software package is being prepared. This will call subroutines prepared by the user for each MLU. Examples of MLU programming in the recoil trigger are described at the end of this section.
Secondly, the experimental user must be convinced that the system as configured does what is intended and may wish to prepare a computer simulation of any new triggering system. A simulation is probably only necessary for more complex systems. It could be used occasionally during running conditions to compare with processor results. Continuous monitoring of processor performance is probably best handled by using Quad Scaler modules to count various internal processor conditions (such as the number of false tracks) and reading these through CAMAC into the experimental on-line computer on every event. The experimenters and the on-line software can monitor these scalers (as is done for other detector parameters), looking for sudden changes or unusual conditions that indicate equipment failure.
The system being described is a new approach to experimental triggering involving a significant jump in complexity in an area where careful scientific practice is essential. Because of this, we have chosen to bias toward overkill in our diagnostic and monitoring capabilities. In the following we describe the four broad areas of software development for the processor.
Simulation and Design Verification Studies
Simulation and design verification studies were required in studying the operation of the Track Finder, particularly the Do Loop Indexer. Although the function served by the Indexer is quite simple to explain, the internal timing required to obtain very high speed is tricky.
The state diagram of the Do Loop Indexer was first embodied in a Fortran program. This program was run against thousands of randomly generated events. The output convinced us that the basic organization of gates and registers was (or in a few cases was not) sound, and the output was complete enough that it was generally not too hard to understand the reason that certain attempts at minimizing the design failed. At least in the case of this module, it would have been a serious mistake to proceed without the simulator. As hardware was produced, the simulator slowly turned into a diagnostic program which calculated what was to be done and then exercised the hardware to ensure that the hardware did what was expected. Eventually, all of the modules in the Track Finder were incorporated into the simulation, so that everything from the PWC receivers through the Track Stack could be exercised as a unit.
Simulation of a different kind is required to verify the loading of the various Memory Look Up modules. This is not a simulation of hardware performance as such, but rather a means of determining that physics is being done correctly by the numbers placed in the MLUs. As explained later, the numbers to be placed in the MLUs are first placed in disk files. The simulator can predict the output of a chain of MLUs by generating input data and then following the path, referring to the disk files to get MLU contents as required. System level hardware checkout will be incorporated into this simulator as well.
A physics user of this system will be concerned with two software areas. One is the programming of [...] where the various input and output fields are found within the input or output port, and builds the MLU load accordingly. The subroutine normally is a function of a number of parameters which are stored in a common parameter file. These parameters can be entered from the on-line computer terminal. By using the same parameter file for all MLUs in the system, consistency is assured.
An MLU load file is also maintained by the Loading Program. Whenever a parameter is changed or a new subroutine initiated, the subroutine is called and the MLU memory load is prepared and placed in the MLU load file. When appropriate (beginning of the next data run or as requested from the computer console) all MLUs are loaded and verified in a few seconds from the designated MLU load file. At the beginning of each data run the MLU load file, the MLU parameter file, and all other parameters that define completely the processor configuration to be used during the run are written on the data tape. This provides a permanent record for later analysis. Comments are also recorded to aid in interpreting the processor configuration during experimental analysis. Details on the MLU loading package and its use are provided in the System Documentation referred to earlier.
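The idea of building an MLU load from a user subroutine and a description of the bit fields can be sketched as follows. This is a hypothetical illustration, not the loading package itself: the function names, the field layout, the parameter name `threshold` and the example user subroutine are all assumptions made for the example.

```python
def build_mlu_load(user_function, input_fields, output_fields, params):
    """Tabulate user_function over every possible input address.

    Fields are (name, bit_offset, width) triples describing where each
    quantity sits within the input or output port.
    """
    address_bits = max(offset + width for _, offset, width in input_fields)
    load = []
    for address in range(1 << address_bits):
        # Unpack the input fields from the address bits.
        inputs = {name: (address >> offset) & ((1 << width) - 1)
                  for name, offset, width in input_fields}
        outputs = user_function(inputs, params)
        # Pack the output fields into one memory word.
        word = 0
        for name, offset, width in output_fields:
            word |= (outputs[name] & ((1 << width) - 1)) << offset
        load.append(word)
    return load


def pulse_height_sum(inputs, params):
    """Example user subroutine: sum two pulse heights and flag large sums."""
    total = inputs["inner"] + inputs["outer"]
    return {"sum": total, "over": int(total >= params["threshold"])}


INPUT_FIELDS = [("inner", 0, 2), ("outer", 2, 2)]
OUTPUT_FIELDS = [("sum", 0, 3), ("over", 3, 1)]
load = build_mlu_load(pulse_height_sum, INPUT_FIELDS, OUTPUT_FIELDS, {"threshold": 4})
print(len(load), load[(3 << 2) | 2])
```

Keeping the parameters in one dictionary shared by all such subroutines plays the role of the common parameter file: every table is regenerated from the same values.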
Programming the MLUs: The Recoil Trigger as an Example
In earlier sections we have described the recoil trigger in general outline and its track finding subsystem in some detail. Once the track parameters are determined, much of the burden of the recoil trigger decision falls on the MLU modules. We shall describe here how these units perform the tasks of calibration and correction of scintillator pulse heights, of testing different hypotheses of particle type and kinetic energy, and finally of combining the resulting information and making the trigger decision. [...] This algorithm is matched fairly well to the kinetic energy resolution of the detectors and, therefore, loses no information while compacting a large dynamic range into a few (in this case, five) bits.
Another application of the MLU module is the combination of various trigger information into the final trigger criteria, often with sliding thresholds. The calculated missing mass observed for a recoiling proton, the existence of other tracks in the recoil system coming from the same vertex, the scaled number of downstream tracks in the recoil system, and the numbers of neutral particles or low energy electrons observed are all used as input to an MLU module loaded with predetermined combinations acceptable for triggering. For those events in the missing mass range of highest interest, the other criteria may be set loose. For study, one may trigger with one or more criteria removed and others tightened to get a purer sampling of events. Separate bits corresponding to different types of triggers are used. Some are prescaled external to the processor before being recombined for final master trigger generation.
In addition to the attempted calculations from all possible input bit combinations, the inputs are tested for consistency. In many applications, some input combinations are not physically possible or meaningful. Typically, one number among the possible output numbers (notice, one number rather than one bit) is reserved to signify such data status. At other times, a bit is reserved for this purpose and the processor cycling is aborted rather than follow such configurations to the end.
These examples show how the MLU modules are appropriate for use whenever all possibilities for input information can be preprocessed to give output information. The amount of memory used here is already as large as is found in many on-line computers. This, combined with the greater speed and the ability to use off-line analysis in generating output tables, gives these devices more power than one can obtain, for example, by using on-line computers for event selection.
Section 5: Summary
The status to date of the Recoil Triggering System is as follows. All modules are designed and production quantities on all but one design have been built in-house or have been received from vendors. Forty percent of the hardware including the Track Finder Subsystem has been tested. System diagnostic software and on-line data taking software are currently being worked on in parallel.
