Feasibility of the Hardware Muon Trigger Track Finder Processor in CMS by Kluge, A & Wildschek, T
78 IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 46, NO. 2, APRIL 1999
Feasibility of the Hardware Muon Trigger
Track Finder Processor in CMS
A. Kluge and T. Wildschek
Abstract—This paper describes a feasibility study for the design
of the muon trigger track finder processor in the high-energy
physics experiment compact muon solenoid (CMS), planned for
2005, at CERN. It covers the specification, proposed method, and
a prototype implementation. Comparisons between several other
measurement methods and the proposed one are carried out. The
task of the processor is to identify muons and measure their
transverse momenta and locations within 350 ns. It uses data
from almost 200 000 detector cells of drift tube muon chambers.
The processor searches for muon tracks originating from the
interaction point by joining the track segments provided by the
drift tube muon chamber electronics to full tracks. It assigns
transverse momentum to each reconstructed track using the
track’s bend angle.
Index Terms—Muon trigger, track finder, transverse momen-
tum, VHDL hardware simulation.
I. TRACK FINDER PROCESSOR ENVIRONMENT
THE detector compact muon solenoid (CMS) [1] will work at the hadron collider LHC. Protons will collide with a
center of mass energy of 14 TeV. Every 25 ns a bunch crossing
occurs.
The detector CMS will be built around a high-field super-
conducting solenoid leading to a compact design for the muon
spectrometer, hence the name compact muon solenoid (CMS).
The solenoid has an inner radius of 3 m generating a uniform
magnetic field of 4 T parallel to the beam axis. The magnetic
flux is returned through a 1.8-m-thick iron yoke instrumented
with muon chambers. The magnetic field in the return yoke is
1.8 T. The overall dimensions of the detector are a length of
about 20 m and a diameter of 14 m.
The muon detector fulfills three basic tasks: muon iden-
tification, trigger, and momentum measurement. The muon
detector is placed behind the calorimeters and the magnet
coil. It consists of four muon stations interleaved with the
iron return yoke plates. The stations are numbered from 1
to 4 from inside out. A system of drift tubes (DT) [2] is
applied in the barrel region, while cathode strip chambers
(CSC) cover the forward region. In addition, resistive plate
chambers (RPC) cover the entire muon detector [1]. Fig. 1
shows the r-φ view of the detector and a muon traversing the
tracker, the calorimeters, magnet coil, and muon system.
The track finder processor described in this paper processes
data from the drift tube system. The DT-trigger primitive
generator (TPG) [2] first processes the information of the
chamber locally. Up to two track segments [position and angle,
see Fig. 2(a) and (b)] per muon chamber are delivered.
Manuscript received December 4, 1997; revised September 29, 1998 and
December 10, 1998.
The authors are with CERN, CH-1211 Geneva 23, Switzerland.
Publisher Item Identifier S 0018-9499(99)03707-7.
Fig. 1. r-φ view of the detector CMS and a muon track.
The TPG provides a position resolution of 1.25 mm in stations
one and two and 2.5 mm in stations three and four [3]–[7].
The choice of this input resolution to the track finder processor
is motivated as follows: 1.25 mm is the finest resolution that
can be provided by the TPG at reasonable expense. The track
finder processor should have a fine momentum resolution at
high momenta, where the complementary RPC-based trigger
suffers from a poor position resolution. The relevant quantity
for momentum measurement is not the linear position, but
rather the azimuthal angle. The choice of resolution given
yields approximately constant angle resolution in the four
stations (about 0.3 mrad).
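As a rough plausibility check of the quoted angular resolution, the small-angle relation (angular resolution ≈ position resolution / radius) can be evaluated per station. The station radii below are assumed round values, not taken from this paper; only the position resolutions come from the text.

```python
# Position resolution per station as stated in the text (mm).
POSITION_RESOLUTION_MM = {1: 1.25, 2: 1.25, 3: 2.5, 4: 2.5}

# ASSUMPTION: approximate radial distance of each barrel station from the
# beam axis (m). These values are illustrative and not given in the text.
STATION_RADIUS_M = {1: 4.0, 2: 4.9, 3: 5.9, 4: 7.0}


def angular_resolution_mrad(station):
    """Small-angle estimate delta_phi = delta_x / r; mm divided by m gives mrad."""
    return POSITION_RESOLUTION_MM[station] / STATION_RADIUS_M[station]


resolutions = {s: angular_resolution_mrad(s) for s in STATION_RADIUS_M}
for station, res in resolutions.items():
    print(f"station {station}: {res:.2f} mrad")
```

Under these assumed radii all four stations come out at a few tenths of a mrad, which is consistent with the "about 0.3 mrad" quoted above.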
The DT-chamber system is divided into 12 φ-segments and
5 wheels along the beam axis (see Fig. 4). There are 4 muon
stations. Thus the entire system comprises 240 chambers.
Since each chamber delivers up to two track segments, up to
480 track segments are delivered to the regional drift
tube trigger, the track finder.
Track segments from different stations are collected by the
track finder (see Fig. 3). The task of the track finder processor
is to find muon tracks originating from the interaction point
and to measure their transverse momenta and locations in
azimuth φ and pseudorapidity η. The track finder selects the
four highest-pt muons in the detector and forwards them to the
global muon trigger.
II. TRACK FINDER PROCESSOR SPECIFICATIONS
In the following, the main features of the track finder
processor are listed and described briefly.
0018–9499/99$10.00 © 1999 IEEE
KLUGE AND WILDSCHEK: FEASIBILITY OF HARDWARE MUON TRIGGER 79
Fig. 2. (a) Up to two track segments per chamber are given out. (b) A track
segment consists of the spatial coordinate φ (11 bit), the bend angle φb (8
bit), and quality (3 bit).
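The 22-bit track segment word described in Fig. 2(b) can be illustrated with a small packing sketch. The field widths are from the caption; the field order within the word and the function names are assumptions made for illustration only.

```python
PHI_BITS, PHIB_BITS, QUALITY_BITS = 11, 8, 3
WORD_BITS = PHI_BITS + PHIB_BITS + QUALITY_BITS  # 22 bits per track segment


def pack_segment(phi, phi_b, quality):
    """Pack raw (unsigned) field values into one 22-bit word.

    ASSUMPTION: layout is [phi | phi_b | quality] from MSB to LSB.
    """
    for name, value, bits in (("phi", phi, PHI_BITS),
                              ("phi_b", phi_b, PHIB_BITS),
                              ("quality", quality, QUALITY_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return (phi << (PHIB_BITS + QUALITY_BITS)) | (phi_b << QUALITY_BITS) | quality


def unpack_segment(word):
    """Inverse of pack_segment: return (phi, phi_b, quality)."""
    quality = word & ((1 << QUALITY_BITS) - 1)
    phi_b = (word >> QUALITY_BITS) & ((1 << PHIB_BITS) - 1)
    phi = word >> (PHIB_BITS + QUALITY_BITS)
    return phi, phi_b, quality


word = pack_segment(phi=1234, phi_b=200, quality=5)
print(f"{word:0{WORD_BITS}b}", unpack_segment(word))
```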
A. Output Quantities:
• Transverse Momentum pt: Muons with a pt below
2.0 GeV/c do not reach the chambers due to the bending
in the magnetic field and energy loss in the absorbers. The
pt-range above 2.0 GeV/c is divided into 25 pt-classes
(see Table I). The pt-value is given out as a 5-bit number.
A sixth bit indicates the charge of the particle.
• Location in φ: A φ-resolution of 1.4° or 25 mrad is
provided by encoding the location in 8 bits.
• Location in η: The η-coordinate can be derived from the
place where a particle crossed detector wheel boundaries.
The η-value is given in a 2-bit code. An option to improve
the η-resolution is to include trigger primitives of the middle
(θ) drift tube layers, which measure the z-coordinate.
• Quality Information: Quality information indicates the
confidence that the found track is a real track and not
a ghost track. Moreover, it gives the pt-measurement
resolution that can be expected from this track. Two 2-bit
words are foreseen.
Fig. 3. Muon trigger track finder processor data chain. Numbers give the
latency of the components.
Fig. 4. DT-chamber segmentation.
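The quoted output resolution in azimuth follows directly from dividing the full circle into the bins of an 8-bit code. The sketch below checks the arithmetic; the linear coding of the azimuth in `encode_phi` is an assumption, since the text does not specify the encoding.

```python
import math

PHI_OUT_BITS = 8
N_BINS = 1 << PHI_OUT_BITS                 # 256 azimuth bins
BIN_WIDTH_RAD = 2 * math.pi / N_BINS       # about 24.5 mrad
BIN_WIDTH_DEG = 360.0 / N_BINS             # about 1.4 degrees


def encode_phi(phi_rad):
    """Map an azimuth in radians to an 8-bit bin index.

    ASSUMPTION: simple linear coding over the full circle.
    """
    return int((phi_rad % (2 * math.pi)) / BIN_WIDTH_RAD) % N_BINS


print(f"{BIN_WIDTH_DEG:.2f} deg, {BIN_WIDTH_RAD * 1e3:.1f} mrad")
print(encode_phi(math.pi))
```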
B. Requirements of the System
• Dead Time Free: The first level trigger architecture re-
quires a dead time free operation. That means that even
in case of a trigger accept by the global trigger the track
finder processor must stay operational for the subsequent
events. The trigger system is capable of accepting data
from each single bunch crossing, with a data repetition
rate of 40 MHz.
• Processing Time—Latency: It is important to keep the
processing time of the track finder processor as low as
possible, because during the time an event is evaluated
in the level one trigger all corresponding detector data
must be stored in the data pipeline. For the track finder
processor 23 bunch crossings or 575 ns are reserved
[8]. This number includes sorting of the four highest-pt
muons. For the track finding and pt-measurement only
350 ns are available.




TABLE II. TRACK FINDER PROCESSOR SPECIFICATIONS
• Programmability: The trigger system has to be flexible
enough to permit changes in the algorithm and in the de-
tector geometry. Even if the designed chamber geometry
is not going to be changed, misalignment of the chambers
must be accounted for.
• Output Segmentation: The trigger system must output the
information about the four muons with the highest pt in the
detector.
• Technology: It must be possible to implement the hard-
ware using today’s technology.
Table II summarizes the track finder processor specifica-
tions.
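The latency figures in the requirements above can be cross-checked against the 25 ns bunch spacing. The following is only an arithmetic sketch; attributing the remaining bunch crossings to the sorting stage and other overhead is our reading of the text, not a stated specification.

```python
BUNCH_SPACING_NS = 25        # one bunch crossing every 25 ns (40 MHz)
TOTAL_BUDGET_BX = 23         # bunch crossings reserved for the track finder
TRACK_FINDING_NS = 350       # budget for track finding and pt measurement

total_budget_ns = TOTAL_BUDGET_BX * BUNCH_SPACING_NS           # 575 ns
track_finding_bx = TRACK_FINDING_NS // BUNCH_SPACING_NS        # 14 crossings
# ASSUMPTION: whatever remains is available for sorting and other overhead.
remaining_bx = TOTAL_BUDGET_BX - track_finding_bx              # 9 crossings

print(total_budget_ns, track_finding_bx, remaining_bx)
```

Because operation must be dead time free, a fully pipelined implementation would hold one event per pipeline step, i.e., 14 events in flight during track finding.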
III. IMPLEMENTATION OPTIONS
Several methods for the implementation of the track finder
processor have been evaluated: pattern comparison, histogram
method, neural networks, and extrapolation method. It should
be pointed out that the assessment of the various technologies
might be quite different by the time CMS starts up in 2005.
However, for this feasibility study we have focused on what
is achievable with today’s technology, in accordance with the
technology requirement listed in the previous section.
A. Template Matching (Pattern Comparison)
The classical method in track finding hardware implemen-
tations is to search for predefined tracks or bit patterns. The
actual hit patterns are compared to the predefined patterns.
Application of content addressable memories [9]–[13] is an
obvious solution. However, it has to be stated that content
addressable memories are available commercially only with
small storage capacity. Examples can be found in [14] and
[15]. The storage depth is of the order of 1024 × 48-bit words.
Simulations have been conducted to estimate the number of
patterns to store. This number depends on the input resolution
used, the bending of the tracks, and the amount of multiple
scattering. Due to the high magnetic field and large amount
of material in the CMS muon system, the number of patterns
exceeds 2 10 [16], if one were to use the full resolution
for storing the patterns. Full resolution is not needed for
track finding, only for pt-assignment. So one could use a
two-stage design, where the first stage finds tracks using a
coarse position resolution, while the second stage assigns pt
using full resolution. In such a design, the problem of
back-mapping arises: In a conventional one-stage design, the track
parameters are output directly by the track finding. In the two-
stage design, the track finding stage has to output pointers to
the full-resolution data, such that the second stage can retrieve
the full-resolution data from a pipeline memory and use them
for assigning pt. The conceptual simplicity of the one-stage
template matching method, however, would be lost.
In view of the fine granularity of the detector and the high
required pt-resolution combined with the high number of
detector channels, the nonprojective chamber geometry, and the high
bending power of the detector, the template matching method
does not appear feasible within the available calculation time.
B. Histogramming Method
When employing a histogram method one has to find
adequate histogram functions. In the case of the track finder
the best function values would be transverse momentum pt
and location φ as a function of the
spatial and angular track segment coordinates (φ, φb). In all
stations but station three transverse momenta can be derived
from the bending angle φb. In close vicinity to station three
the bending angle φb has a zero crossing. Thus transverse
momenta cannot be deduced from track segments in station
three. Hence a method where the desired quantities, pt and
φ, are assigned by a function and are detectable directly from
the histogram can be ruled out.
An alternative possibility is to find functions of the hit
coordinates (φ and φb) giving a calculated hit coordinate in
a reference plane. Since such a function for station three
cannot be found, station three has to be the reference plane.
Function values are spatial and angular coordinates of the
tracks in station three. They are entered in the two-dimensional
φ-φb-histogram. A peak will form in the histogram when
track segments come from the same track and thus
enter the same histogram bin. The location of the peak
corresponds to the location of the track but not to the
transverse momentum pt. That means the histogram method can
be used only to find tracks but not to assign a transverse
momentum. In addition to the number of entries for each bin,
one has to store the relative address of the track segments
which caused the entry. Once the peak in the histogram is
found one can select the track segments of the perceived
track(s) using their addresses stored with each bin. They are
used to find the hit coordinates to calculate the transverse
momentum pt. However, the compactness of the histogram
method is lost.
Fig. 5. Principle of the track finder algorithm (three-step scheme).
For the task of assembling track segments to a complete
track the full φ-resolution (0.3 mrad or 11 bits) and
φb-resolution (10 mrad or 8 bits) are not needed. Assuming that
eight bits for φ and five bits for φb are sufficient, the histogram
still has a dimension of size 256 × 32 (φ × φb). Each histogram
bin has to store the number of entries and the addresses of at
least four track segments which caused the entries. This means
the peak finder has to find the highest entry in a (256 × 32
=) 8192-bin histogram. Given the timing constraints this is
also not practicable.
While proceeding in such a manner is a common approach
in software solutions, a hardware implementation does not
seem practicable. The size of the histogram requires a huge
amount of logic units. Moreover, the calculation time would
by far exceed the required maximum latency.
The strength of the histogram method, namely producing a
histogram with bins of the desired features, cannot be exploited
to the full extent. No function can be found for all input data
which produces the transverse momentum pt and location φ.
Therefore, the architecture loses its compactness.
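To make the bookkeeping of the histogram approach concrete, the sketch below fills a 256 × 32 histogram with track segment entries, stores the address of each contributing segment per bin, and searches for the peak. It is a software illustration only; the bin coordinates and address strings are invented example data, and the mapping from measured coordinates to reference-plane bins is assumed to have been done beforehand.

```python
from collections import defaultdict

PHI_BINS, PHIB_BINS = 256, 32          # 8 bits for phi, 5 bits for phi_b
TOTAL_BINS = PHI_BINS * PHIB_BINS      # 8192 bins to be searched


def fill_histogram(entries):
    """entries: iterable of (phi_bin, phib_bin, segment_address)."""
    histogram = defaultdict(list)
    for phi_bin, phib_bin, address in entries:
        # Each bin keeps the addresses of the segments that caused its entries.
        histogram[(phi_bin, phib_bin)].append(address)
    return histogram


def find_peak(histogram):
    """Return (bin, addresses) of the most populated bin, or None if empty."""
    if not histogram:
        return None
    best_bin = max(histogram, key=lambda b: len(histogram[b]))
    return best_bin, histogram[best_bin]


# Hypothetical example: three segments of one track fall into the same bin.
entries = [(100, 16, "st1/seg0"), (100, 16, "st2/seg1"),
           (100, 16, "st4/seg0"), (37, 5, "st1/seg1")]
peak_bin, peak_addresses = find_peak(fill_histogram(entries))
print(peak_bin, peak_addresses)
```

In software the peak search is a single pass over the filled bins; in hardware the equivalent comparison over all 8192 bins is what makes the approach impractical within the latency budget.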
C. Neural Networks
Recently, intensive research has been conducted on the
application of neural nets in high-energy physics [17]. This
includes software and off-line triggers as well as hardware
triggers. In several more high-energy physics experiments
neural networks are considered for application and some are
already in use (CDF [18], CP-LEAR [19], H1 [20], [21],
NEMO, WA92 [22]). However, it has to be said that until now
no first level trigger was employed using neural nets only. This
is due to the relatively long processing time and to the limited
complexity of implementable algorithms. The response time of
typical commercially available digital neural networks is found
to be between 1 and 10 ms. A recent application of a digitally
programmable analog neural network [23] is reported in [20].
The processing time in the described application is as low as
50 ns. Analog designs, however, typically have precisions of
a few percent. The position input to the track finder processor
has a resolution of 11 bits, corresponding to a precision of
0.05%. Analog designs are therefore out of the question.
Fig. 6. If a track segment is found to be within the extrapolation window
given by φextra and thresholdext, the extrapolation is considered successful.
In conclusion, the hardware implementation of neural nets
is not yet sufficiently advanced for this task.
However, as designs of neural net implementations
evolve, especially with respect to processing speed and number
of inputs, they may become a possible solution for first level
triggering and thus for the track finder.
D. Extrapolation Method
The extrapolation method, a typical software approach, was
worked out for the hardware implementation.
The basic principle is to attempt to match track segments
caused by the same track. This is done by extrapolating into
the next station from a track segment using the spatial and
angular measurement.
While pattern matching methods usually deliver the wanted
track property directly, the extrapolation method requires three
steps:
• pairwise matching of track segments by extrapolation;
• assembling track segment pairs to full tracks;
• assigning the track properties transverse momentum and
location.
Fig. 5 illustrates the principle of the track finder algo-
rithm. Global track finding methods such as the pattern match
method directly set the input data into relationship to the
wanted features. While this would be an advantage in many
applications, in this case it complicates the implementation.
When performing a pattern match one looks for track segment
combinations which belong to one track (track finding). For
this task full measurement precision is not needed. However,
after a pattern has been recognized the link to the original track
segment data is not available any longer. Thus, patterns must
be formed by track segment data with full resolution so that
each pattern allows directly the determination of the wanted
features with high precision.
Fig. 7. Relationship between bend angle measurement φb in the source station and deflection of a muon φtarget − φsource.
Fig. 8. Relationship between bend angle φb in station three and deflection of the muon is ambiguous.
Fig. 9. (a) Matching track segment pairs are combined to a full track. (b)
Using the deflection φ2 − φ1 or the bending angle φb to determine pt.
However, the extrapolation method splits the process of
measuring the transverse momentum into three steps. This
approach allows for a flexible use of the resolution in each
step according to the requirements. The track finding is per-
formed with reduced resolution, thus resulting in a reduced
number of track patterns to recognize. As a consequence, the
hardware expense is smaller and the execution time shrinks.
For assigning the transverse momentum the full resolution
of the track segment measurements is still available.
IV. EXTRAPOLATION METHOD
The pairwise matching is based on the principle of
extrapolation. Using the spatial coordinate φ and the
angular measurement φb of the source track segment,
an extrapolated hit coordinate φextra in another chamber
may be calculated. If a target track segment is found to be
at the extrapolated coordinate within a certain extrapolation
window, the extrapolation is considered successful (see Fig. 6).
Simulations have been conducted to prove the feasibility of
the extrapolation between the stations [3], [4]. Fig. 7 shows the
relation between the bend angle φb in the source station and
the deviation of the particle track between target and source
station, φtarget − φsource, for several station pairs. The graphs
show unambiguous relationships proving the feasibility of
extrapolation between these station pairs. The same condition
can be found in all other station pairs except for those
extrapolating from station three. Fig. 8 shows the situation when
extrapolation is done from station three. No unambiguous
relationship can be found. Moreover, for small bend angles
no prediction can be made at all. This effect is caused by
the zero crossing of the bend angle. However, since all other
extrapolations are feasible these problems can be circumvented
by extrapolating toward station three instead of from station
three (see left part of Fig. 5).
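The extrapolation criterion can be sketched as a lookup of the expected deflection from the source bend angle followed by a window comparison. The linear lookup table and the threshold value below are placeholders chosen for illustration; in the real system the table contents would come from simulation of each station pair.

```python
def extrapolate(phi_source, phib_source, deflection_lut):
    """Extrapolated coordinate: source position plus the tabulated deflection."""
    return phi_source + deflection_lut[phib_source]


def extrapolation_successful(phi_target, phi_extra, threshold):
    """True if the target segment lies inside the extrapolation window."""
    return abs(phi_target - phi_extra) <= threshold


# ASSUMPTION: toy lookup table, deflection proportional to the signed bend
# angle. Real table contents are station-pair specific and not given here.
toy_lut = {bend: 2 * bend for bend in range(-128, 128)}

phi_extra = extrapolate(phi_source=1000, phib_source=10, deflection_lut=toy_lut)
print(phi_extra,
      extrapolation_successful(1024, phi_extra, threshold=8),
      extrapolation_successful(1040, phi_extra, threshold=8))
```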
B. Track Segment Assembling—Acceptance Study
Once track segment pairs are found they are linked together
to full tracks. Track segment pairs of different chamber pairs
may be matched to each other if they have one track segment
in common. This scheme is illustrated in Fig. 9(a).
However, due to geometrical inefficiencies and chamber
failures missing hits must be accounted for. Simulation studies
for the acceptance of muons were conducted for two possibil-
ities [3]. The first option is to require tracks consisting of at
least three out of four track segments or track segments in the
two innermost stations. The second option is to require tracks
with at least two out of four track segments. Fig. 10 compares
the results. When simulating the requirement of three out of
four matching track segments only about 82% of all muon
tracks are found. An acceptance of more than 95% is achieved
when the requirements are loosened to two out of four matched
track segments. Consequently, the track finder accepts tracks
consisting of only two matching track segments. That means
even a single track segment pair is already considered a valid
track.
Fig. 10. This plot compares two track finder requirements. Circles show the case where a track has to have at least three track segments or track segments
in the two innermost stations. Squares require only two track segments for a valid track.
C. pt-Assignment
In order to assign transverse momentum pt we use the
track’s bend angle. Two methods are available [3], [5], [6],
[24]. One method uses the difference of positions in two
distinct stations [see Fig. 9(b)]. Two spatial coordinates are
sufficient. Fig. 11 illustrates the relation between transverse
momentum pt and the aforementioned difference. One expects the
absolute value of the difference in bend angle to decrease
with increasing pt. However, due to the zero crossing of the
bend angle in station three the absolute value of the difference
of the angles at first rises with pt (except for φ2 − φ1). An
unambiguous relationship between difference of positions and
transverse momentum is shown for the difference φ2 − φ1. For
other station pairs this relationship is ambiguous. In such a
case the bend angle measurement φb of a single track segment
can be used to measure pt [3], [4].
Fig. 12 shows the pt-resolution achieved by the method
described.
The drift tube trigger primitive generator also delivers
quality information. This quality indicates how many layers
of a chamber contributed to a track segment. It is also an
indication of the measurement resolution of the track segments
[2]. These quality bits are used to select the algorithm in order
to provide the most accurate pt-measurement.
D. Performance
The presented simulation results show that extrapolation,
track assembling, and pt assignment are possible. Fig. 13
shows the efficiency of the track finder processor for
pt-thresholds of 20, 40, and 50 GeV/c [3]. The results take
background into account by superimposing on average 20
minimum-bias events and by using a full simulation of the
particle interactions with the material of the detector. The mo-
mentum resolution is given by the steepness of the efficiency
curves at the nominal threshold values.
V. ARCHITECTURE AND HARDWARE ALGORITHM
The basic architecture of the track finder processor [16]
is described. The mapping of the chamber structure onto the
hardware level is discussed. Due to the bending of the tracks
and the nonprojective geometry of the chamber system muons
cross segment boundaries. This requires a large amount of
interconnection between processing units. Using the hardware
description language VHDL a simulation of the processor
model was conducted. The model was used to prove the
functionality of the algorithm. Moreover, it served to optimize
the system partitioning with respect to the amount of inter-
connections between processing units and processing latency.
The FPGA prototype described later was designed using the VHDL model
[4]–[6], [16], [25], [26].
Fig. 11. Difference of azimuthal hit coordinates (deflection of the muon between two stations) over transverse momentum pt for several station pairs.
A. Logic Segmentation
Processing of the entire muon data of the detector within one
logical unit is impossible and also unnecessary. The amount
of 10-kbit (480 track segments times 22 bit per track segment)
data per crossing cycle yields an input data rate of about
400 Gb/s. The large amount of data to be processed causes
a severe integration problem. When splitting up the processor
in several physical units an interconnection problem between
those units arises. Fortunately, a muon track passes only a
small number of detector segments [16].
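The quoted data volume and rate follow from the segmentation given earlier. A short arithmetic sketch, using only numbers stated in the text:

```python
N_PHI_SEGMENTS = 12
N_WHEELS = 5
N_STATIONS = 4
SEGMENTS_PER_CHAMBER = 2
BITS_PER_SEGMENT = 22          # 11 (position) + 8 (bend angle) + 3 (quality)
CROSSING_RATE_HZ = 40e6        # one bunch crossing every 25 ns

chambers = N_PHI_SEGMENTS * N_WHEELS * N_STATIONS        # 240
track_segments = chambers * SEGMENTS_PER_CHAMBER         # 480
bits_per_crossing = track_segments * BITS_PER_SEGMENT    # 10 560 bits
rate_gbps = bits_per_crossing * CROSSING_RATE_HZ / 1e9   # about 422 Gb/s

print(chambers, track_segments, bits_per_crossing, round(rate_gbps, 1))
```

The exact product is slightly above the rounded figures of 10 kbit and 400 Gb/s used in the text.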
In order to render communication between processing units
possible at a minimum expense, the logical structure of the
chamber system is mirrored inside the track finder processor
hardware. In Fig. 14 the logical segmentation of the track
finder processor is shown.
A sector processor matches the track segments identified by
the drift tube trigger primitive generator logic [2] and tries
to form up to two complete tracks. If the sector processor
succeeds, it assigns a transverse momentum and determines
the location in φ and η of each track. Tracks which traverse
more than one detector segment are given out by the sector
processor of the detector segment where the track’s innermost
track segment is found.
Of the 60 (12 φ-sectors times 5 wheels) times two possible
tracks identified by the sector processors, only the four tracks
with the highest pt are retained by the detector sorter. All
the information on these tracks (pt, charge, location, quality) is
forwarded to the global muon trigger. The latter combines the
track finder processor information with the trigger information
given by the RPC-system [1], [27].
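Functionally, the detector sorter reduces up to 120 candidates to the four with the highest pt. The sketch below shows this selection in software; the candidate record layout (`pt_class`, `sector`) is hypothetical, and a hardware sorter would use a comparator network rather than a heap.

```python
import heapq


def select_highest_pt(candidates, n=4):
    """Return the n candidates with the largest pt class, highest first."""
    return heapq.nlargest(n, candidates, key=lambda c: c["pt_class"])


# Hypothetical candidates from different sector processors.
candidates = [
    {"sector": 0, "pt_class": 3},
    {"sector": 7, "pt_class": 25},
    {"sector": 12, "pt_class": 7},
    {"sector": 31, "pt_class": 18},
    {"sector": 44, "pt_class": 1},
    {"sector": 59, "pt_class": 12},
]
selected = select_highest_pt(candidates)
print([c["pt_class"] for c in selected])
```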
B. Track Finder Processor Algorithm
In Fig. 15 a block diagram of a sector processor is dis-
played. The sector processor is divided into three parts—the
extrapolator (EU), the track assembler (TA), and the pt-, location-,
and quality-assignment units (AU) [16].
The extrapolation unit EU attempts to match track segment
pairs of distinct stations using the extrapolation criteria
described earlier. When track segment pairs meet these criteria,
the information is forwarded to the track assembler TA.
Since tracks may cross detector segment boundaries, the
information of the extrapolation units of the neighboring
detector segments is also routed to the track assembler TA.
The track segment linker (TSL) and track selector (TSEL)
evaluate all extrapolation results in order to find up to two
tracks with the innermost track segment in its own detector
segment. They forward the relative track segment addresses of
the track segments of found tracks to the track segment router
TSR. During the execution of the track assembler algorithm
the track segment data are stored in a buffer memory located in
the TSR. The relative addresses are used by the track segment
router TSR to extract the corresponding track segment data
out of the buffer memory.
The track segment data are forwarded to the pt-, φ-, η-, and
quality-assignment units (the PAU and the corresponding φ, η, and quality AUs).
Fig. 12. Resolution of the transverse momentum pt as a function of this momentum itself.
Fig. 13. Efficiency curves for pt-thresholds of 20, 40, and 50 GeV/c. Higher thresholds are not likely to be needed.
Fig. 14. Logical segmentation of the track finder processor.
Short Description of the Track Finder Algorithm: In the
following, a short overview of the track finder algorithm is
given (see Fig. 15). The algorithm reduces the number of
possible track candidates from about 1800 to two. Fig. 16
illustrates the first part of the reduction.
The extrapolators (EXT) match track segment pairs to each
other. For each possible track segment pair the information
bit indicating whether the track segments belong to each
other and the quality word of the matched track segments
are given out. After the extrapolation 1800 possibilities to
assemble valid track candidates exist. The track candidates are
called track segment patterns [Fig. 16(b), (c)]. Recognizing
1800 patterns or track candidates can be done easily by
employing a pattern comparison method. However, each of
the patterns has a quality information attached to it. In order
to select the two highest rank patterns a comparison of quality
numbers would be necessary. This takes too much time.
The extrapolation result selector (ERS) selects the two best
extrapolations for each source track segment. It encodes the
extrapolation result bits into address words, each
indicating the target track segments of up to two
successful extrapolations. The extrapolation quality is also given out
[Fig. 16(d)].
The TSL attempts to link track segments together starting
from the innermost track segment. As the extrapolation result
selector (ERS) delivers up to two extrapolation addresses
per source track segment more than one track candidate
may be found originating from the innermost track segment
[Fig. 16(e)]. In order to cope with inefficiencies of the chamber
system a given number of track segment linker modules start
in stations other than station one. The track segment linking
scheme reduces the number of track candidates from 1800
to 72. A quality information remains attached to each track
candidate.
For each innermost source track segment the single track
selector (STS) retains only the track candidate with the highest
extrapolation quality. A total of 22 track candidates can survive
[Fig. 16(f)].
The cancel out units (COL) cancel tracks using track seg-
ments already contained by longer tracks. Thus the COL also
erase track patterns which are part of longer track patterns. An
example is a track consisting of track segments in station two
and three which are found to be equal to track segment two
and three of a track containing segments from all four stations.
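The cancel-out step can be illustrated with tracks represented as sets of (station, segment) pairs. This is one possible reading of the rule described above, namely that a track is cancelled when a longer track already uses at least one of its segments; the data representation is ours.

```python
def cancel_out(tracks):
    """Drop every track that shares a segment with a strictly longer track.

    tracks: list of frozensets of (station, segment_index) pairs.
    """
    kept = []
    for track in tracks:
        shadowed = any(len(other) > len(track) and track & other
                       for other in tracks)
        if not shadowed:
            kept.append(track)
    return kept


full_track = frozenset({(1, 0), (2, 1), (3, 1), (4, 0)})
sub_track = frozenset({(2, 1), (3, 1)})       # part of full_track
independent = frozenset({(1, 1), (2, 0)})     # shares nothing with the others

surviving = cancel_out([full_track, sub_track, independent])
print(surviving == [full_track, independent])
```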
The track class selector (TCS) selects the two highest
ranking tracks out of the remaining 22 track candidates and
forwards the relative addresses of the matched track segments.
The selection criterion is the number of track segments involved
in a track candidate.
The track segment router (TSR) uses the relative addresses
mentioned above to extract the track segment data out of the
buffer memory and outputs the corresponding track segment
data.
Extrapolations between six possible station pairings are
conducted in parallel (1-2, 1-3, 1-4, 2-3, 2-4, 4-3). Muons can
cross detector segment boundaries. Thus it is necessary also
to compare track segments with track segments from at least
five neighboring detector segments (two adjacent sectors in
azimuth and three adjacent segments along the beam axis).
Consequently, for each
possible start track segment (two in each chamber) 12 target
track segments have to be checked as to whether they fulfill the
extrapolation criteria. Thus in total 144 extrapolations (six station pairings
times two track segments per chamber times twelve target track
segments) have to be conducted. Fig. 17 shows the block
diagram of an extrapolation unit responsible for an extrapolation
from one source track segment to 12 target track segments. The
outputs are the extrapolation results (er) and the extrapolation
quality words (eq) for each of the target track segments. For
calculation of the extrapolation value static memory-based
lookup tables are employed. The comparison is done using
subtractors and window comparators. The extrapolation result
selector (ERS) employs priority encoders.
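The selection performed after the 12 window comparisons can be sketched as follows. The function picks up to two successful extrapolations per source segment; ranking by quality first and breaking ties by lowest target address is an assumption about how a priority encoder might be arranged, not a description of the actual circuit.

```python
STATION_PAIRINGS = 6
SOURCE_SEGMENTS_PER_CHAMBER = 2
TARGETS_PER_SOURCE = 12
N_EXTRAPOLATIONS = (STATION_PAIRINGS * SOURCE_SEGMENTS_PER_CHAMBER
                    * TARGETS_PER_SOURCE)                       # 144


def select_extrapolations(results, qualities, max_selected=2):
    """Return target addresses of the best successful extrapolations.

    results:   12 booleans, the extrapolation result bits (er)
    qualities: 12 integers, the extrapolation quality words (eq)
    """
    successful = [addr for addr, ok in enumerate(results) if ok]
    # ASSUMPTION: highest quality wins; ties go to the lowest address.
    successful.sort(key=lambda addr: (-qualities[addr], addr))
    return successful[:max_selected]


results = [False] * TARGETS_PER_SOURCE
qualities = [0] * TARGETS_PER_SOURCE
for addr, quality in ((3, 2), (7, 5), (9, 5)):
    results[addr] = True
    qualities[addr] = quality

best_two = select_extrapolations(results, qualities)
print(N_EXTRAPOLATIONS, best_two)
```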
B. Track Assembling
The extrapolation result selectors (ERS) output the ad-
dresses of target track segments of successful extrapolations.
The track assembler employs a dynamic track segment link-
ing scheme [16] to assemble track segment pairs to up to
two full tracks. An example is illustrated in Fig. 18. In the
example it is assumed that a muon track is composed of four
track segments (track segment 0 in stations one and four
and track segment 1 in stations two and three). Using the
addresses provided by the extrapolation result selectors (in the
extrapolators) the scheme assembles the track in a sequential
way. The extrapolator EU extrapolating from station one to
two outputs the address for the target track segment TS2_1.
The extrapolator starting the extrapolation from this track
segment outputs the target track segment TS3_1, and so on.
Fig. 15. Block diagram of a sector processor.
Fig. 16. Reduction of possible track candidates by the track finder algorithm.
Fig. 17. Extrapolation unit extrapolates from one track segment and compares the extrapolated value to 12 target track segments. It outputs the extrapolation
result (er) and the extrapolation quality (eq) for each comparison.
Fig. 18. Principle of track segment linker operation.
The hardware implementation employs simple multiplexers
(Fig. 18). Target track segment addresses of extrapolation
between stations one and two are routed to the select input
of the multiplexer. Target track segments of all extrapolators
of extrapolations between stations two and three are routed to
the data input. As a consequence, the multiplexer outputs the
target track segment address in station three. This architecture
is repeated for extrapolations between station three and four.
Once the addresses of matching track segments are known, the
track segment data are extracted from a buffer memory using
multiplexer arrays (TSR).
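The multiplexer chain can be mimicked in software by using each target address as the select input for the next stage. The tables below encode the example of the text (segment 0 in stations one and four, segment 1 in stations two and three); the remaining table entries and the function names are made up for illustration.

```python
def mux(data_inputs, select):
    """A multiplexer: forward the data input chosen by the select value."""
    return data_inputs[select]


# Target segment address per source segment, one table per station pair.
# Index = source segment address, value = matched target segment address.
ext_1_to_2 = [1, 0]   # station 1 segment 0 matches station 2 segment 1
ext_2_to_3 = [0, 1]   # station 2 segment 1 matches station 3 segment 1
ext_3_to_4 = [1, 0]   # station 3 segment 1 matches station 4 segment 0


def link_track(start_segment):
    """Follow the chain of target addresses from station one outward."""
    ts2 = mux(ext_1_to_2, start_segment)
    ts3 = mux(ext_2_to_3, ts2)   # address from stage 1-2 selects the 2-3 output
    ts4 = mux(ext_3_to_4, ts3)   # same architecture repeated for 3-4
    return (start_segment, ts2, ts3, ts4)


linked = link_track(0)
print(linked)
```

Because each stage is a plain table lookup, the chain resolves in a fixed number of steps regardless of how many candidates exist, which is what makes the scheme attractive for a fixed-latency pipeline.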
C. Assignment
The transverse momentum is assigned using static
memory-based lookup tables. The two location coordinates can be
derived directly from the measured track segment data and the
track segment addresses, respectively.
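A static lookup table of this kind can be sketched as below. The table width, the use of the absolute bend angle as the address, and the table contents are purely hypothetical and chosen for illustration; the real contents would come from detector simulation.

```python
ADDRESS_BITS = 4  # hypothetical address width of the lookup table

def build_pt_lut(address_bits: int) -> list:
    """Fill the table once. Contents are illustrative, not CMS calibration.

    A larger bend angle corresponds to a lower transverse momentum,
    so the stored code decreases with the address.
    """
    size = 1 << address_bits
    return [size - 1 - addr for addr in range(size)]

PT_LUT = build_pt_lut(ADDRESS_BITS)

def assign_pt(phi_inner: int, phi_outer: int) -> int:
    """Return the momentum code for a track from its bend angle.

    The address saturates at the last table entry, as a fixed-width
    memory address would.
    """
    bend = abs(phi_outer - phi_inner)
    addr = min(bend, len(PT_LUT) - 1)
    return PT_LUT[addr]

print(assign_pt(10, 12))  # 13
print(assign_pt(0, 100))  # 0
```

Because the table is filled in advance, the assignment itself is a single memory access with a fixed latency.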
VII. FPGA PROTOTYPE
This section describes the prototype of the muon track finder
processor.
The goals of the realization of the FPGA-prototype were:
• to demonstrate that the VHDL-model of the processor can
be implemented in hardware with reasonable expense;
• to show that the designed and simulated algorithm also
works implemented in hardware;
• to show that the general design concept is feasible.
For economic reasons, XILINX 4000 FPGA technology was
employed rather than an ASIC. It was not our aim to
build a prototype capable of fulfilling the timing specifications of
the CMS first level trigger [8].
FPGA’s with an input/output (I/O) pin count of not more
than 192 pins were employed. Given this restriction, each
I/O pin is used to insert or extract two data bits to or
from each physical unit within one internal clock cycle. As the
technology used does not allow a synchronous design
with a clock frequency in excess of 50 MHz, the internal clock
frequency was set to 20 MHz. As a consequence, the
I/O clock frequency is 40 MHz. This is clearly
not an option for a final implementation. However, as mentioned
earlier, the FPGA prototype was not designed to fulfill the timing
specifications of CMS.
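The two-bits-per-pin scheme can be illustrated with a short sketch. The split into a first and a second half of the data word is an assumption made for the example; the text does not specify which bits share a pin.

```python
INTERNAL_CLOCK_MHZ = 20
BITS_PER_PIN = 2  # two data bits per pin per internal clock cycle

# The I/O clock must run faster by the multiplexing factor.
io_clock_mhz = INTERNAL_CLOCK_MHZ * BITS_PER_PIN

def multiplex(bits: list) -> tuple:
    """Split the bits of one internal cycle into two I/O clock phases.

    Each phase needs only half as many pins as there are bits.
    """
    if len(bits) % 2:
        raise ValueError("expected an even number of bits")
    half = len(bits) // 2
    return bits[:half], bits[half:]

def demultiplex(phase0: list, phase1: list) -> list:
    """Reassemble the full data word on the receiving side."""
    return phase0 + phase1

word = [1, 0, 1, 1, 0, 0]
phase0, phase1 = multiplex(word)
print(io_clock_mhz)                        # 40
print(len(phase0))                         # 3
print(demultiplex(phase0, phase1) == word) # True
```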
In all, the FPGA processor employs 19 FPGA’s and 19
lookup tables. A total of 240 000 FPGA gates (of which only 60 to 70%
are used) or 10 000 configurable logic blocks are available. The
printed circuit board carries 10 000 component pins.
A total of 1236 input bits are brought onto the board
each clock cycle, and 362 output bits are delivered each cycle.
As the board is operated in a time-multiplexed mode,
only half the number of I/O pins is necessary, i.e., 618 input
pins and 181 output pins. The FPGA processor evaluates each
event within 29 cycles (14 cycles are required for the final
system). The printed circuit board is designed in the 9U VME
standard. However, due to the large number of I/O bits, the
board is about 15 cm longer than the standard VME board.
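The pin counts and latencies quoted above can be checked with a few lines of arithmetic. Two assumptions are made here and are not stated explicitly in this section: that the 29 prototype cycles are counted at the 20 MHz internal clock, and that one cycle of the final system corresponds to the 25 ns bunch crossing interval.

```python
INPUT_BITS = 1236
OUTPUT_BITS = 362
BITS_PER_PIN = 2  # time-multiplexed operation

input_pins = INPUT_BITS // BITS_PER_PIN
output_pins = OUTPUT_BITS // BITS_PER_PIN

# Assumption: prototype cycles are internal clock cycles at 20 MHz (50 ns).
PROTOTYPE_CYCLE_NS = 50
prototype_latency_ns = 29 * PROTOTYPE_CYCLE_NS

# Assumption: one cycle of the final system equals one bunch crossing (25 ns).
FINAL_CYCLE_NS = 25
final_latency_ns = 14 * FINAL_CYCLE_NS

print(input_pins, output_pins)  # 618 181
print(prototype_latency_ns)     # 1450
print(final_latency_ns)         # 350
```

Under these assumptions the 14 cycles of the final system correspond to the 350 ns processing time required of the track finder.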
As the FPGA prototype demonstrates, implementing the logic
of the designed algorithms does not pose a problem.
The number of logic gates required is
comparatively low. However, the I/O count of the physical
units is very high. In the prototype design only a reduced
number of bits is processed. For the final implementation
some 2500 bits have to be processed each cycle in each
sector processor. This task could become even more daunting
because discussions are ongoing to increase the amount of data
provided by the trigger primitive generators.
There are several options to overcome this problem. One
is to transmit a group of track segments sequentially at
80 MHz. Another is to transmit in a bit-serial format using
an optical link receiver or a fast serial copper link directly on
the board. However, this problem is not solved yet.
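To give a feeling for the two options, the following sketch estimates the number of pins or serial links needed per sector processor. The 40 MHz cycle rate (one cycle per 25 ns bunch crossing) and the serial link rate of 1000 Mbit/s are assumptions chosen for illustration only.

```python
import math

BITS_PER_CYCLE = 2500  # from the text: bits per cycle per sector processor
CYCLE_RATE_MHZ = 40    # assumption: one cycle per 25 ns bunch crossing

def pins_needed(transfer_rate_mhz: float) -> int:
    """Pins required when each pin carries several bits per cycle."""
    transfers_per_cycle = transfer_rate_mhz / CYCLE_RATE_MHZ
    return math.ceil(BITS_PER_CYCLE / transfers_per_cycle)

def serial_links_needed(link_rate_mbit_s: float) -> int:
    """Bit-serial links required to carry the aggregate data rate."""
    aggregate_mbit_s = BITS_PER_CYCLE * CYCLE_RATE_MHZ
    return math.ceil(aggregate_mbit_s / link_rate_mbit_s)

print(pins_needed(40))            # 2500
print(pins_needed(80))            # 1250
print(serial_links_needed(1000))  # 100, for a hypothetical 1000 Mbit/s link
```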
The design of the prototype board showed clearly that
today’s FPGA’s are not suited to the track finder requirements.
Both the I/O count of the packages and the data propagation
preclude a final design of the track finder system based on
today’s FPGA’s.
However, the functionality of the implemented
track finder algorithm, momentum measurement algorithm,
and mapping of the detector geometry
onto the hardware was proven to be adequate for the
system requirements. Moreover, it was shown clearly
that the VHDL model can be implemented in hardware at
reasonable expense.
A feasibility study for a possible final implementation
employing application specific integrated circuits (ASIC’s) has
been conducted [16], [26]. It clearly demonstrated that today’s
ASIC technology is sufficiently advanced to implement a
processor that fulfills all requirements.
VIII. CONCLUSION AND FURTHER PERSPECTIVES
In the section on implementation options and also in [16] it
is shown that no previously implemented system is suitable
for the track finder processor environment.
Conventional methods applied in hardware triggers fail for
the track finder. In particular, the most common approach,
pattern comparison, must be ruled out because of the
large amount of hardware it requires. Instead, the extrapolation
method is introduced. It is shown that the algorithm meets the
track finder specifications and can be implemented
with a minimum of hardware. The algorithm and its hardware
representation were optimized using VHDL and FORTRAN
simulation. Simulation shows clearly that the simple
design concept is efficient: it reduces the data flow in subsequent steps
(extrapolation, track assembly, and property assignment)
and selects the highest ranking track candidate after each
reduction step without sacrificing measurement accuracy.
The FPGA prototype demonstrated
that the algorithm (described in VHDL) can be implemented in
hardware with reasonable effort, and it clearly shows
the proper functionality of the implemented system. The use of
simple logic modules, such as multiplexers, comparators,
subtractors, and logic gates, proves to be key to
the success of the design. However, the number of bits
to be processed in parallel poses a
challenge to the hardware implementation.
The design of the track finder processor, as introduced
in this work, can by no means be regarded as complete.
Although both the simulation and the prototype have already delivered
satisfactory results, the track finder design presented here
represents only a first step toward the final implementation. The
work suggests an algorithm and an implementation method.
However, as the surrounding environment of the processor
evolves, both the specifications and the
implementation of the track finder processor will have to be
refined accordingly.
REFERENCES
[1] CMS, “The compact muon solenoid,” CERN/LHCC 94–38, LHCC/P1,
Tech. Proposal, Dec. 15, 1994.
[2] M. De Giorgi et al., “Design and simulations of the trigger electronics
for the CMS muon barrel chambers,” CMS TN/95-01, CERN, Jan. 12,
1995.
[3] T. Wildschek, “Design and simulation of the CMS first level muon
trigger track finder,” Ph.D. dissertation, Technische Universität Wien,
1998.
[4] A. Kluge and T. Wildschek, “Track finding processor in the DTBX
based CMS barrel muon trigger,” in Proc. 1st Workshop on Electronics
for LHC Experiments, CERN/LHCC/95-56, Oct. 1, 1995.
[5] A. Kluge and T. Wildschek, “Track finding processor in the DTBX
based CMS barrel muon trigger,” in Proc. 2nd Workshop on Electronics
for LHC Experiments, CERN/LHCC/96-39, Oct. 21, 1996.
[6] A. Kluge and T. Wildschek, “The track finder of the CMS first level
muon trigger,” in Proc. 3rd Workshop on Electronics for LHC
Experiments, CERN/LHCC/96-39, Oct. 1997.
[7] A. Kluge and T. Wildschek, “The hardware muon track finder processor
in CMS: Specification and method,” CMS Note 1997/091.
[8] A. Kluge and W. Smith, “CMS level 1 trigger latency,” CMS TN/96-33,
CERN, Mar. 8, 1996.
[9] H. W. den Bok et al., “Track recognition with an associative pattern
memory,” Nucl. Instrum. Methods, vol. A200, pp. 107–114, 1991.
[10] M. Dell’Orso and L. Ristori, “VLSI structures for track finding,” Nucl.
Instrum. Methods, vol. A278, pp. 436–440, 1989.
[11] S. R. Amendolia et al., “The AMchip: A VLSI associative memory for
track finding,” Nucl. Instrum. Methods, vol. A315, pp. 446–448, 1992.
[12] S. R. Amendolia et al., “The AMchip: A full-custom CMOS VLSI
associative memory for pattern recognition,” IEEE Trans. Nucl. Sci.,
vol. 39, p. 795, 1992.
[13] T. Kohonen, Content Addressable Memories, 2nd ed. New York:
Springer-Verlag, 1987.
[14] Cypress Semiconductor, Data Sheet Advanced Information Cy7C915
1k × 42 SmartCAM.
[15] Music Semiconductors, Data Sheet MU9C1640 CacheCAM, June 4,
1993.
[16] A. Kluge, “The hardware track finder processor in CMS at CERN,”
Ph.D. dissertation, Technical University of Vienna, CERN-THESIS-98-
016, Oct. 1997.
[17] H. Kolanski, “Application of artificial neural networks in particle
physics,” DESY 95-061, Apr. 1995, ISSN 0418-9833 or Nucl. Instrum.
Methods, vol. A367, pp. 14–20, 1995.
[18] C. Loomis and J. Conway, “Using an analog neural network to trigger
on tau leptons at CDF,” in Proc. AIHENP’95, 1995.
[19] G. Athanasiu, P. Pavlopoulos, and S. Vlachos, “A neural network trigger
system for the CP-LEAR experiment,” in Proc. AIHENP’95, 1995.
[20] S. Schiek and G. Schmidt, “Application of a high speed analog neural
network chip for first level triggering at the H1-Experiment at HERA,”
in Proc. Int. Conf. Artificial Neural Networks ICANN’95, Paris, France,
Oct. 3–9, 1995, vol. 2, pp. 363–368, ISBN 2-910085-18-X, 1995.
[21] Fent et al., “The realization of a second level neural network trigger for
the H1 experiment at HERA,” in AIHENP’96, Lausanne, Switzerland,
Sept. 2–8, 1996.
[22] C. Baldanza, “Results from a neural trigger based on the MA16
microprocessor,” Int. J. Mod. Phys., vol. C6, p. 567, 1995 or DFUB95/2,
1995.
[23] K. Hoen et al., “70 input 20 nanosecond pattern classifier,” in IEEE Int.
Conf. Neural Networks, vol. 3, pp. 1854–1859, 1994.
[24] T. Wildschek, “Simulation of the silicon tracker/vertex detector of the
ATLAS experiment at the large hadron collider at CERN,” Diploma
thesis, Technische Universität Wien, 1993.
[25] A. Kluge and T. Wildschek, “The hardware muon track finder processor
in CMS—System and algorithm,” CMS Note 1997/092.
[26] A. Kluge and T. Wildschek, “The hardware muon track finder processor
in CMS: Prototype and final implementation,” CMS Note 1997/093.
[27] N. Neumeister et al., “CMS global trigger,” CMS TN/97-009, Jan. 20,
1997.
