The muon pretrigger system of the HERA-B experiment finds muon track candidates as one of the inputs of the first level trigger system (FLT). Because of the interaction rate of 40 MHz required to achieve an accuracy of 0.17 on sin(2β) after one year of running, the total input of the muon pretrigger system is about 10 GByte/s. The latency to define muon track candidates should not exceed 1 µs for the processing units. The muon pretrigger is therefore implemented as about 100 large size VME modules in a highly parallelized architecture. We present the concept, design and implementation of the system as well as performance studies and first physics results obtained with the system in HERA-B.
I. INTRODUCTION
The HERA-B experiment at the electron proton collider HERA at DESY in Hamburg, Germany, studies the properties of B mesons produced in hadronic interactions of the 920 GeV protons of the HERA storage ring with an internal wire target in the halo of the beam [1], [2]. The main emphasis of the experiment is to measure CP symmetry violation by investigating the gold plated decay $B \to J/\psi\, K^0_S \to l^+ l^- \pi^+ \pi^-$, where $l$ stands for e or µ. For reconstructing one gold plated decay, $3 \times 10^{11}$ primary interactions have to take place. In order to achieve significant statistics, i.e. 1000 gold plated decays per year, an interaction rate of 40 MHz has to be reached. Taking into account the average frequency of filled bunches inside HERA of about 8.5 MHz, this leads to a mean number of interactions per bunch crossing (BX) of 4.7. As a consequence, up to 200 charged tracks are generated in a single event, leading to occupancies of 20% in some detectors. Only one in a million interactions will produce a B meson, and the branching ratios of the decays of interest are of the order of $10^{-5}$.
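The rate arithmetic above can be checked with a few lines of Python. This is only a sketch using the numbers quoted in the text; the variable names are illustrative and the conversion of the required running time into days is added here for orientation.

```python
# Worked arithmetic for the rates quoted in the text.
INTERACTION_RATE_HZ = 40e6       # required interaction rate
FILLED_BUNCH_RATE_HZ = 8.5e6     # average frequency of filled bunches
INTERACTIONS_PER_DECAY = 3e11    # primary interactions per gold plated decay
DECAYS_PER_YEAR = 1000           # target statistics

# Mean number of interactions per filled bunch crossing.
mean_interactions_per_bx = INTERACTION_RATE_HZ / FILLED_BUNCH_RATE_HZ

# Beam time needed to collect the target statistics at this rate.
interactions_needed = INTERACTIONS_PER_DECAY * DECAYS_PER_YEAR
running_time_s = interactions_needed / INTERACTION_RATE_HZ
running_time_days = running_time_s / 86400.0

print(f"{mean_interactions_per_bx:.1f} interactions per BX")  # 4.7
print(f"{running_time_s:.2e} s of running")                   # 7.50e+06
print(f"{running_time_days:.1f} days of running")             # 86.8
```

The required 7.5 × 10^6 s of beam time shows why a rate as high as 40 MHz is needed to reach 1000 decays within one year of running.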
Therefore an elaborate trigger system is required to select events of interest with a high efficiency and to suppress background events by several orders of magnitude. This is realized in a four level trigger system where the lower levels consist of specialized hardware modules while the higher levels are based on PC farms. The first level trigger (FLT) is designed to search for track candidates. Seeds for the FLT tracks are provided by three distinct pretriggers, one for electron tracks, one for muon tracks, and one for tracks with high transverse momenta. The rate reduction factor of the FLT is required to be about 200 to achieve a manageable input rate for the second level trigger (SLT). This system reduces the rate by a factor of 25 using additional information mainly of the vertex detector system. The third level trigger (TLT) uses all detector information to perform cuts on masses and vertices resulting in a reduction factor of 20. The complete online reconstruction of a triggered event is done in the fourth level trigger (4LT).
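The effect of the successive reduction factors can be illustrated with a short sketch. The reduction factors are those quoted above; the input rate of 10.4 MHz (the detector readout frequency) is an assumption made here for illustration, and the function name `cascade` is hypothetical.

```python
# Assumption: the FLT sees one event per detector readout cycle (10.4 MHz).
INPUT_RATE_HZ = 10.4e6

# Reduction factors as quoted in the text.
STAGES = [("FLT", 200), ("SLT", 25), ("TLT", 20)]


def cascade(input_rate_hz, stages):
    """Return the output rate after each trigger level."""
    rates = []
    rate = input_rate_hz
    for name, factor in stages:
        rate = rate / factor
        rates.append((name, rate))
    return rates


for name, rate in cascade(INPUT_RATE_HZ, STAGES):
    print(f"{name} output: {rate:.0f} Hz")
```

Under this assumption the three levels together reduce the rate by a factor of 10^5, from about 10 MHz to about 100 Hz at the input of the 4LT.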
II. FIRST LEVEL TRIGGER SYSTEM
Already in the first level of the trigger system a primary search for tracks is performed.
A. Strategy of the FLT
The FLT starts from track seeds delivered by three distinct particle identification devices and the corresponding pretrigger systems. The ECAL pretrigger system uses the deposited energy and its distribution in the cells of the electromagnetic calorimeter to define electron candidates [3]. In addition, photons with high energy can be detected. Information from the pad readout of two superlayers of the muon system is used by the muon pretrigger system to define muon track candidates. There is a third pretrigger system, the high-$p_T$ pretrigger, to extend the trigger to non-gold plated decay modes. It uses three layers of tracking chambers with pad readout inside the magnet to define candidates for particles with high momenta [4].
Using the track information of the candidates defined by the pretrigger systems, the FLT track finding units (TFU) search for combinations of hits in the next tracking layer in the direction of the target. If hits are found, a search region, the `region of interest´ (ROI), is determined for the next layer; otherwise the track candidate is rejected. After a successful recursive track search through all layers used by the FLT, the kinematic parameters of the tracks are derived by track parameter units (TPU). Either by counting tracks or by combining pairs of tracks and calculating their invariant masses, a trigger decision is derived by applying programmable cuts in the trigger decision unit (TDU) [5].
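The layer-by-layer search can be pictured with a toy model. The following sketch is one-dimensional and follows only a single hit per layer; the function names, the ROI representation as an interval and the rule for computing the next ROI are all assumptions made for illustration and do not describe the actual TFU logic.

```python
def follow_track(seed_roi, layers, next_roi):
    """Toy ROI-driven track search.

    seed_roi: (low, high) interval provided by a pretrigger.
    layers:   list of sets of hit positions, ordered towards the target.
    next_roi: hypothetical function (hit, layer_index) -> (low, high).
    Returns the list of hits followed, or None if the candidate is rejected.
    """
    roi = seed_roi
    path = []
    for index, hits in enumerate(layers):
        low, high = roi
        in_roi = sorted(h for h in hits if low <= h <= high)
        if not in_roi:
            return None            # no hit in the ROI: reject the candidate
        hit = in_roi[0]            # toy choice: follow the first hit only
        path.append(hit)
        roi = next_roi(hit, index)  # ROI for the next layer
    return path


def fixed_window(hit, layer_index):
    """Hypothetical ROI rule: a window of +-2 units around the last hit."""
    return (hit - 2, hit + 2)


print(follow_track((8, 12), [{10, 50}, {11}, {12}], fixed_window))
print(follow_track((8, 12), [{10}, {20}], fixed_window))
```

The first call finds a hit in every layer and returns the path; the second one is rejected in the second layer because no hit falls inside the ROI.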
B. Requirements and Implementation
In order to achieve an FLT trigger decision after at most 12 µs, during which the detector information is stored in a front end driver (FED) buffer system, all pretrigger systems and the FLT are realized using specialized hardware modules in a highly parallelized and pipelined architecture. Due to the latency limitation, as little detector data as possible is used (about 1/6 of all available data). As shown in figure 1, these data are transferred synchronously from the FED system to the pretrigger and FLT processors, while the processors themselves communicate via asynchronous message interconnections. To keep the system as flexible as possible, programmable logic chips are used in all hardware modules and internal calculations are based on look-up tables.
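The 12 µs latency budget translates into a number of bunch crossings that the FED buffers have to hold. The sketch below assumes that one event is stored per readout cycle at the 10.4 MHz readout frequency quoted for the muon detector; the variable names are illustrative.

```python
FLT_LATENCY_S = 12e-6    # maximum FLT decision time from the text
READOUT_RATE_HZ = 10.4e6  # assumption: one buffered event per readout cycle

bx_period_ns = 1e9 / READOUT_RATE_HZ
buffer_depth_bx = FLT_LATENCY_S * READOUT_RATE_HZ

print(f"BX period: {bx_period_ns:.1f} ns")                # 96.2
print(f"Events to be buffered: {buffer_depth_bx:.0f}")    # 125
```

Under this assumption the FED buffers have to keep the data of roughly 125 consecutive bunch crossings until the FLT decision arrives.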
III. THE MUON PRETRIGGER SYSTEM
The muon pretrigger system uses a part of the available information from the muon detector to define track candidates for the FLT.
A. Muon Detector
The muon system, as shown in figure 2, consists of 4 superlayers (MU1 to MU4) of muon chambers and 3 absorbers (MF1 to MF3). The superlayers MU1 and MU2 consist of 3 layers of tube chambers with 0° and ±20° stereo angles. MU2 is the only muon superlayer which is not used in the trigger but only for reconstruction; MU1 is used by the FLT. The two superlayers MU3 and MU4 are used both by the muon pretrigger and by the FLT. They consist of one 0° layer with cathode pad and wire readout, where the pad readout is used by the muon pretrigger while the FLT uses the wire readout. The superlayers MU3 and MU4 are separated by about 1 m in z direction with very little absorber material between them in order to minimize the effect of multiple scattering. This results in clean information about the positions and directions of particles for the muon pretrigger to initiate the FLT search algorithm. For all 4 superlayers the region with the highest track densities near the proton beampipe is equipped with gas pixel chambers. In order to allow for a similar algorithm in the muon pretrigger hardware, 6 × 4 pixel cells are combined to `pseudo pads´.
B. Muon Pretrigger
The muon pretrigger system uses the pad or pseudo pad data provided by the FED system to determine muon track seeds. Track seeds are defined by temporal and spatial coincidences of hits in MU3 and MU4.
The coincidence schemes are shown in figure 3. They are different for the pad and the pixel system as the deflection of tracks due to the magnetic field is higher for the outer part of the system, i.e. the pad system, where the momenta are lower.
Therefore, for each pad in MU3, 3 × 2 corresponding pads in MU4 are combined in a logical `OR´. If the MU3 pad and at least one of the 6 pads in MU4 are hit, a track seed is generated. In the pixel system 2 × 2 pseudo pads are combined in an `OR´. Apart from this, the generation of track seeds is the same. The data from each pretrigger channel, defined by the coincidence data of one pad or pseudo pad column in MU3, are processed in parallel. Figure 4 gives an overview of the dataflow in the muon pretrigger system.
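The coincidence logic described above can be sketched in software. The (column, row) indexing and, in particular, the position of the 3 × 2 and 2 × 2 windows relative to the MU3 pad are assumptions made for illustration; the real geometry is defined by the schemes in figure 3. The function name `find_track_seeds` is hypothetical.

```python
# Hypothetical MU4 windows, given as (column, row) offsets relative to the
# MU3 pad. The actual offsets are defined by the schemes in figure 3.
PAD_WINDOW = [(dc, dr) for dc in (-1, 0, 1) for dr in (0, 1)]  # 3 x 2 pads
PIXEL_WINDOW = [(dc, dr) for dc in (0, 1) for dr in (0, 1)]    # 2 x 2 pseudo pads


def find_track_seeds(mu3_hits, mu4_hits, window):
    """Return the MU3 pads that have at least one hit MU4 pad in the window.

    mu3_hits, mu4_hits: sets of (column, row) tuples of hit pads.
    window: list of (column, row) offsets combined in a logical OR.
    """
    seeds = []
    for col, row in sorted(mu3_hits):
        if any((col + dc, row + dr) in mu4_hits for dc, dr in window):
            seeds.append((col, row))
    return seeds


mu3 = {(5, 3), (9, 9)}
mu4 = {(4, 3)}
print(find_track_seeds(mu3, mu4, PAD_WINDOW))    # [(5, 3)]
print(find_track_seeds(mu3, mu4, PIXEL_WINDOW))  # []
```

In the example the MU4 hit lies one column away from the MU3 pad, so it is accepted by the wider window assumed for the pad system but not by the narrower one assumed for the pixel system.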
The data rate to be processed by the muon pretrigger system is defined by the number of detector channels in the pad system and in the pixel system and by the readout frequency of the detector of 10.4 MHz. For the FLT decision additional information to tag the event data is added by the fast control system (FCS). Therefore the data input rate of the processing units of the muon pretrigger amounts to 17.9 GByte/s in the pad system. In the pixel system the data read from the detector are combined to pseudo pads by pixel mapping boards (PMB). This reduces the input rate of the pixel system to 1.6 GByte/s. About 100 large 9U VME boards and 300 optical transmitters and receivers are used to perform all data transmission and coincidence calculations. The latency of the muon pretrigger system is 1.8 µs.
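The quoted rates can be expressed per readout cycle, which gives a feeling for the amount of data each coincidence step has to digest. The sketch below assumes 1 GByte = 10^9 bytes; the resulting per-cycle volumes include the FCS tagging information mentioned above.

```python
READOUT_RATE_HZ = 10.4e6       # detector readout frequency
PAD_RATE_BYTES_S = 17.9e9      # pad system input rate (1 GByte = 1e9 bytes assumed)
PIXEL_RATE_BYTES_S = 1.6e9     # pixel system input rate after the PMBs
PRETRIGGER_LATENCY_S = 1.8e-6  # latency of the muon pretrigger

pad_bytes_per_cycle = PAD_RATE_BYTES_S / READOUT_RATE_HZ
pixel_bytes_per_cycle = PIXEL_RATE_BYTES_S / READOUT_RATE_HZ
latency_in_cycles = PRETRIGGER_LATENCY_S * READOUT_RATE_HZ

print(f"Pad system:   {pad_bytes_per_cycle:.0f} bytes per readout cycle")
print(f"Pixel system: {pixel_bytes_per_cycle:.0f} bytes per readout cycle")
print(f"Latency:      {latency_in_cycles:.1f} readout cycles")
```

About 1.7 kByte in the pad system and about 150 bytes in the pixel system have to be processed every 96 ns, and the pretrigger latency corresponds to about 19 readout cycles.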
C. Muon Pretrigger Hardware
The muon pretrigger is designed as a modular system of about 100 large size VME boards, consisting of 4 major components. The system is schematically displayed in figure 6. The pretrigger link boards (PLB) are located in the front end driver (FED) crates next to the FED daughter cards, from which they receive the digitized pad information of the superlayers MU3 and MU4 at a rate of 10.4 MHz. For the pixel system an additional board, the pixel mapping board (PMB), is needed to convert the pixel readout data to the pseudo pad format used by the muon pretrigger. The PLB consists of 8 channels, each covering the input information from one (pseudo) pad column and working independently of the other channels. The data are multiplexed into two cycles of 16 bits of data and tagged with the BX number and a cycle number. Furthermore, the data are serialized with 32 bits at 25 MHz and transmitted via pretrigger optical links (POL) [6] over a distance of about 60 m to the pretrigger coincidence units (PCU), located in the electronics trailer outside the experimental area.
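The multiplexing and tagging step of one PLB channel can be sketched as follows. The bit layout of the 32 bit link word (16 data bits, 8 bits of BX number, 1 cycle bit, remaining bits unused) is purely an assumption for illustration, as is the assumption that each cycle occupies exactly one link word; the function names are hypothetical.

```python
# Assumed layout of one 32 bit link word (not taken from the hardware):
# bits 0-15 pad data, bits 16-23 BX number, bit 24 cycle number.
DATA_BITS = 16
BX_BITS = 8


def pack_word(data, bx, cycle):
    """Pack 16 data bits, the BX number and the cycle number into one word."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit into 16 bits")
    if not 0 <= bx < (1 << BX_BITS):
        raise ValueError("BX number must fit into the assumed 8 bits")
    if cycle not in (0, 1):
        raise ValueError("cycle must be 0 or 1")
    return data | (bx << DATA_BITS) | (cycle << (DATA_BITS + BX_BITS))


def unpack_word(word):
    """Inverse of pack_word: return (data, bx, cycle)."""
    data = word & 0xFFFF
    bx = (word >> DATA_BITS) & 0xFF
    cycle = (word >> (DATA_BITS + BX_BITS)) & 0x1
    return data, bx, cycle


def multiplex(pad_bits, bx):
    """Split 32 pad bits of one BX into two tagged cycles of 16 bits."""
    low = pad_bits & 0xFFFF
    high = (pad_bits >> 16) & 0xFFFF
    return [pack_word(low, bx, 0), pack_word(high, bx, 1)]


# Bandwidth check: two 32 bit words per BX against the link capacity.
needed_bit_s = 2 * 32 * 10.4e6   # 665.6 Mbit/s
available_bit_s = 32 * 25e6      # 800 Mbit/s
print(needed_bit_s <= available_bit_s)  # True
```

With these assumptions the link is used to about 83% of its capacity, which is consistent with transmitting two cycles per bunch crossing at a link frequency of 25 MHz.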
One PCU determines the coincidences of 4 pretrigger channels, each corresponding to one (pseudo) pad column in MU3 and the corresponding MU4 columns, independently of the other channels. Logically, the PCU is divided into three parts:
The input part consists of 8 optical receiver piggy backs, the chips for the parallelization of the data and a dual ported RAM bank, used to synchronize the data from different optical links. In the processing part of the PCU the coincidences are determined using complex FPGAs. In general, 15 possible coincidences can be found per cycle, but due to internal limitations of the FPGAs used not all of them can be processed. The two selections are shown in figure 5. The data in one cycle are divided into 4 blocks out of which at most 2 coincidences are selected (`2 out of 4´). A second selection is made if the number of coincidences found exceeds 5 per cycle. In this case the 5 coincidences which are nearest to the beampipe are chosen (`5 out of 8´). The selection steps are implemented using internal look-up tables (LUT) inside the FPGA. Only if at least one coincidence is found in a cycle are the coincidences stored in the zero suppression FIFOs. Up to this point the data are processed in a pipeline architecture, synchronously to the BX clock of 10.4 MHz, using an internal processing frequency on the PCU of 25 MHz.
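The two selection steps and the zero suppression can be illustrated in software. The sketch below rests on several assumptions that are not taken from the hardware description: the coincidence flags of one cycle are held in 16 slots forming 4 blocks of 4, the slots are ordered by increasing distance from the beampipe, `2 out of 4´ keeps at most 2 coincidences per block, and within a block the ones nearest to the beampipe are kept. The function names are hypothetical.

```python
def select_coincidences(flags):
    """Apply the '2 out of 4' and '5 out of 8' selections to one cycle.

    flags: list of 16 booleans, assumed to be ordered by increasing
           distance from the beampipe.
    Returns the slot indices of the selected coincidences.
    """
    if len(flags) != 16:
        raise ValueError("expected 16 coincidence slots")
    kept = []
    for start in range(0, 16, 4):
        block = [i for i in range(start, start + 4) if flags[i]]
        kept.extend(block[:2])   # '2 out of 4': at most 2 per block
    return kept[:5]              # '5 out of 8': the 5 nearest to the beampipe


def zero_suppress(cycles):
    """Keep only cycles with at least one selected coincidence."""
    fifo = []
    for cycle_number, flags in enumerate(cycles):
        selected = select_coincidences(flags)
        if selected:
            fifo.append((cycle_number, selected))
    return fifo


empty = [False] * 16
busy = [True] * 16
print(zero_suppress([empty, busy]))
```

For the fully occupied cycle the first step keeps 8 candidates and the second step reduces them to 5, while the empty cycle never reaches the FIFO.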
After the zero suppression step all further processing is message driven and therefore independent of the BX clock. The last part of the PCU performs a serialization of the coincidences found and asserts a flag indicating to the pretrigger message generator (PMG) that valid data are available. One message generator covers 8 PCU channels. The information about the coincidence provided by the PCU serves as an input to a LUT. This translates the information from the format used inside the muon pretrigger to a message format understood by the FLT tracking processors (TFUs). The message generated by the PMG defines an initial region of interest (ROI) for the TFUs. The maximum possible processing frequency of the PMG is 25 MHz, which might be reduced to about 20 MHz depending on the distribution of PCU channels where coincidences are found. The same hardware with modified firmware is used in the high-$p_T$ pretrigger system.
The interface between the muon pretrigger and the FLT is handled by LVDS multiplexer boards (LVDS-MUX). They receive input data from 2 PMGs and transform the logic levels of the electrical signals from PECL to LVDS, which is used in the FLT inter processor messaging.
IV. CONTROLLING AND MONITORING OF THE MUON PRETRIGGER
The online software initializes, controls and monitors the muon pretrigger hardware. It consists of several processes as shown in figure 7:
• `mpre_srv´ and `mpre_slave´: running on each VME CPU of the muon pretrigger system. They access the hardware based on commands issued from the other processes and communicate via shared memory with each other. `mpre_srv´ processes execute all hardware commands which are finished after a short time, while all other commands, e.g. monitoring functions, are executed by the `mpre_slave´ processes. Processes running on different VME CPUs operate independently of each other.
• `MPRE_BOSS´: the central coordinating process. It serves as an interface to all outside systems like DAQ or FLT by translating their requests into sets of commands to be executed by `mpre_srv´ and `mpre_slave´ processes. In addition, it keeps track of the status of the hardware and of the other processes.
• `mpre_monitor´: the central process collecting all monitoring information from the `mpre_slave´ processes and publishing it
• `mpre_errlog´: collects the information and error messages from all processes
• `mpre_con´: an expert user interface
Besides `mpre_slave´ and `mpre_con´ all processes are booted by the HERA-B DAQ environment. They communicate using the standard HERA-B protocol, `really powerful messaging´ (RPM). All commands from the DAQ are issued as state transitions using the `state machine control´ protocol (SMC).
V. EFFICIENCY OF THE MUON PRETRIGGER
In order to determine the efficiency of the muon pretrigger system, data are taken with the pretrigger in spy mode, i.e. not used as a trigger but recording its trigger information in the archived event data. This allows a comparison of the coincidences which should have been found by the muon pretrigger hardware with the ones actually found. As data taken with a random trigger do not contain enough muon pretrigger coincidences to perform such an analysis, a specialized code has to be run on the SLT to The messages which should have been generated by the PMG modules according to the data taken and MUPRESIM are compared with the messages recorded in the data. The x- and y-positions of the messages, belonging to individual pads, are determined, and for all pads where at least 10 messages are found in MUPRESIM, the messages are compared. In addition, some channels with known hardware problems are removed from the analysis. The efficiency is finally calculated as the ratio of perfectly matched messages from the hardware to the messages found in MUPRESIM. Figure 8 shows the distribution of this ratio.
Apart from some inefficient channels with problems under investigation, which have an efficiency of 0, the main part of the muon pretrigger hardware has a high efficiency: 86% of the channels have an efficiency of more than 0.8, and 50% of more than 0.99.
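The efficiency definition used here can be written down compactly. The following sketch is not the analysis code; the representation of a message as an (event, pad) tuple and the function names are assumptions for illustration. Only the cut of at least 10 simulated messages per pad, the removal of known bad channels and the ratio itself follow the description above.

```python
from collections import Counter


def pad_efficiencies(sim_messages, hw_messages, min_sim=10, excluded=()):
    """Efficiency per pad: matched hardware messages / simulated messages.

    sim_messages, hw_messages: iterables of (event, pad) tuples, where
    'pad' is any hashable pad identifier (hypothetical representation).
    Pads with fewer than min_sim simulated messages or listed in
    'excluded' are skipped.
    """
    sim_counts = Counter(sim_messages)
    matched = sim_counts & Counter(hw_messages)  # multiset intersection

    sim_per_pad = Counter()
    matched_per_pad = Counter()
    for (event, pad), n in sim_counts.items():
        sim_per_pad[pad] += n
    for (event, pad), n in matched.items():
        matched_per_pad[pad] += n

    return {
        pad: matched_per_pad[pad] / n
        for pad, n in sim_per_pad.items()
        if n >= min_sim and pad not in excluded
    }


def fraction_above(efficiencies, threshold):
    """Fraction of pads with an efficiency above the given threshold."""
    if not efficiencies:
        return 0.0
    above = sum(1 for eff in efficiencies.values() if eff > threshold)
    return above / len(efficiencies)


sim = [(e, "A") for e in range(10)] + [(e, "B") for e in range(5)]
hw = [(e, "A") for e in range(9)]
eff = pad_efficiencies(sim, hw)
print(eff)                       # {'A': 0.9}
print(fraction_above(eff, 0.8))  # 1.0
```

In the example pad B is dropped because it has fewer than 10 simulated messages, and pad A reaches an efficiency of 0.9 because one of its 10 simulated messages has no counterpart in the hardware record.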
VI. FIRST ANALYSES USING THE MUON PRETRIGGER
The running of HERA-B in 2000 was mainly devoted to commissioning of the detector and the trigger systems. As the FLT system was not fully operational until the last days of this year's running, the second level trigger system (SLT) was used to account for the missing tracking part of the FLT. This made it possible to take data for J/ψ production with the working ECAL and muon pretrigger systems and the specialized SLT using the track candidates from the pretriggers. About 40% of the available data sample was analyzed. By requiring a match between the vertex detector tracks found in the trigger and the ones found in the offline reconstruction, cutting moderately on |p| (>5 GeV/c) and $p_T$ (>0.8 GeV/c), applying some geometric fits and leaving out high multiplicity events (by counting photons in the RICH detector), the signal can be clearly seen, as shown in figure 9. The number of J/ψ → µµ events is determined by fitting the sum of an exponential and a Gaussian distribution to the data.
Taking into account the very limited trigger setup used, without active FLT, these results prove that the muon pretrigger system works very well. 
VII. ACKNOWLEDGEMENTS

VIII. REFERENCES
