New approach for the CMS Muon Trigger Track Finder processor by Erö, J
NEW APPROACH FOR THE CMS MUON TRIGGER TRACK FINDER
PROCESSOR
János Erö
Institute for High Energy Physics (HEPHY), Vienna - Austria
CERN, Div. EP; CH-1211 Geneva 23 - Switzerland
ABSTRACT
The CMS Group of HEPHY/Vienna is developing the Barrel
Track Finder Processor for the CMS Muon Trigger. The
first design approach proved its feasibility, but also
revealed many drawbacks caused by the numerous
connections between Processors belonging to
neighbouring muon sectors. These connections also led
to additional synchronisation problems. A new approach
to the required muon track elements makes it possible
to reduce the neighbour connections and to develop a
global track finding method that results in a faster
process. As a further result, the hardware might
use FPGAs instead of the planned ASIC technology.
We cover the new design and its simulation results.
1. INTRODUCTION
In 1997 the first phase of the Muon Trigger Track Finder
development was finished. The design and the prototype
showed the feasibility of the basic ideas and the
possibility of building hardware of this type [1], [2].
They also revealed, however, many drawbacks that made it
necessary to consider new solutions and improvements
leading to simpler and more maintainable hardware with
the same level of functionality. The main problem areas
were the following:
• The design contained only the Track Finding
Functions. It did not contain Input Receiver,
Synchronisation and Redistribution modules.
• The design required a very large number of
Neighbour Input and Internal Result connections.
Thus the electronics was limited by the chips’ pin
count rather than by the logic resources.
• The Track Linker and Selection blocks were hard to
oversee and their functionality was difficult to analyse.
• As described in [3], the Parameter (Feature)
Assignment Unit was not optimised; the choice of
the most effective algorithms was still open.
• The design could be considered too optimistic
concerning the electronics: although feasible, it was
probably only realisable with expensive fast ASICs.
• No recommendations were issued concerning
system-wide aspects, like system control and startup,
layout and cabling.
Starting from this design, called the “Baseline Design”,
a new design was therefore developed, the “Simplified Design”.
2. MAIN FIELDS OF IMPROVEMENTS
Considerations about the amount of information needed
to separate and reconstruct muon tracks resulted in a basic
simplification of the design [3]. It was shown that
following the muon tracks into the sideways neighbours
is not necessary. Muons crossing sector
boundaries can be reconstructed from their path
before the crossing. This simplified approach has many
consequences:
• The Track Finder Processor does not need to get
Extrapolation Results from its sideways neighbours.
This decreases the number of interconnections and
eliminates one stage where extra synchronisation was
necessary.
• The Track Finder Processor still needs Extrapolation
Results from its neighbour in the next wheel (same
sector). These are, however, the result of one single set
of Extrapolators. To avoid the extra synchronisation
mentioned above, it is therefore simpler to
perform the Extrapolations of the Next Wheel
Neighbour in the Processor itself. This requires some
additional electronics, but simplifies the design considerably.
• Because tracks are no longer followed up inside the
neighbour sectors, the number of possible tracks decreases
considerably. This allows the use of a very simple Track
Linking and Selection algorithm that can be implemented
more easily in electronic circuits.
• The independent sideways neighbours and the duplicated
Extrapolation of the next wheel neighbours might
result in muons being found twice, by
neighbouring Track Finder Processors. To avoid
counting them as separate muons, the duplicates should be
cancelled, which is the task of the Fake-Pair Cancellation
Unit.
• As each Processor performs Extrapolations both in
the Next-Wheel Neighbours (with follow-up) and in the
Sideways Neighbours (without follow-up), it should
receive Track Segment Input Data from these
Neighbours and also send its own Track Segment Data to
them. To minimise the number of necessary
synchronisation steps, every Processor should
synchronise the Input Data from its own sector and
send them, already synchronised, to its neighbours [4]. Thus
the Data from the Neighbours do not need to be
synchronised again. This synchronisation and
transmission take place in the Sector Receiver Unit of
each Processor.
The Baseline Design foresaw a unified Track Finding
approach for both the Barrel Region and the Endcap
Region. Later simulations [5] showed, however, that both
the detector geometry and the necessary
algorithms require different handling of these regions.
Thus separate Track Finder Processors will be developed
for the Endcap CSC detectors and the Barrel DT
detectors. The Overlap Region between them requires
special attention. This region will be processed by both
Processors, and the exact boundary between their fields of
activity should remain flexible. The Barrel Track Finder
Processor will receive the CSC Input Data, reformat them
to its own data format and synchronise them with the DT Track
Segments. After this the CSC Data will be processed in

As seen in Fig. 1, the Track Finder Processor consists of
five basic units: the Input Receiver, the Extrapolator, the
Track Assembler, the Fake-Pair Cancellation and the
Parameter Assignment. The Processor uses the Track
Segment (TS) Data as input to the Extrapolators; the rest of
the Track finding process uses only the bitmap data of the
successful Extrapolations. Only the Parameter
Assignment requires the TS Data again. As this happens
several clock cycles later, the TS Data have to be
delayed in a FIFO pipeline until then.
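The FIFO delay of the TS Data can be illustrated with a small software model. This is only a sketch: the class name `DelayLine`, the depth of 3 clock cycles and the dictionary used as a TS record are assumptions made for illustration, as the text only states that the delay is "several clock cycles".

```python
from collections import deque


class DelayLine:
    """Fixed-depth FIFO: a value written now is read back `depth` clocks later."""

    def __init__(self, depth, fill=None):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        # Pre-fill so that the first `depth` reads return the fill value.
        self._fifo = deque([fill] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one clock cycle: push `value`, return the oldest entry."""
        oldest = self._fifo[0]
        self._fifo.append(value)  # maxlen drops the oldest entry automatically
        return oldest


if __name__ == "__main__":
    # Hypothetical depth of 3 clock cycles between Extrapolation and
    # Parameter Assignment; the real depth depends on the final pipeline.
    ts_pipeline = DelayLine(depth=3)
    for bx in range(6):
        ts_word = {"bx": bx}  # stand-in for a Track Segment record
        print(bx, ts_pipeline.clock(ts_word))
```

In this model the Parameter Assignment would read the output of `clock()` in the same cycle in which the corresponding Track Linker result becomes available.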
3.1. The Input Receiver
The Input Receiver contains the Link Receivers
that accept the Track Segments from the Trigger Server.
After synchronisation it also sends these data to
the neighbours, except for the Track Segments of the 1st
Station, which cannot be the target of an Extrapolation. As a
further task, the Input Data from the neighbours have to be
received, but not synchronised, as this has already been done
in the corresponding neighbour’s Input Receiver.
The Track Segments sent by the Trigger Server are
composed of an 11-bit Phi value, a 10-bit Phi_b (Bending
Angle) value and 3 bits of TS Quality [6]. As the final
decision concerning the Optical Links between the
Trigger Servers and the Track Finder Processors has not
yet been made, it is not clear what the Link Receivers will
look like and how much space they will require. According to
current agreements they will be built on daughterboards, which
allows greater flexibility both in the technology used
and in later maintenance. The size of these
boards is, however, still undetermined. As a consequence
one cannot yet tell whether the Link Receivers can be placed
on the Track Finder Processor’s mainboard, or whether a
separate Input Receiver board is necessary.
In the second case the Input Receiver Board will only
receive the own sector’s Optical Links and perform their
transmission to the neighbours, but will not receive
the neighbours’ data. This will be done on the Track
Finder Processor’s mainboard, reducing the number of
required connections between the Input Receiver and the
mainboard.
3.2. The Extrapolator
The Extrapolator Unit performs all Extrapolations that
use the Processor’s own Sector as Extrapolation Source
and, additionally, the sideways and next wheel neighbours’
Sectors as Extrapolation Target. To follow up Muon Tracks in the
Next Wheel it also contains Extrapolators that use the
Next Wheel’s Stations #2, #3 and #4 as Source and the
same wheel’s Stations as Target - muons that have the next
wheel as Source cannot have the same (previous for the

As seen in Fig. 2, each Extrapolator contains two Lookup
Tables (LUTs), one for the Extrapolation Window’s upper
limit and one for the lower limit. The Source’s Phi value
is added to both LUT results and the Target Phi
position is compared to these limits. An
Extrapolation is considered successful if the Target
Phi is inside the window.
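The window test described above can be sketched in a few lines. The sketch assumes that both LUTs are addressed by the Source's Phi_b value and hold signed offsets, that the window limits are inclusive, and that wrap-around in Phi can be ignored; none of these details is fixed by the text. The function name and the toy LUT contents are purely illustrative.

```python
def extrapolation_ok(source_phi, source_phi_b, target_phi, lut_low, lut_high):
    """Return True if target_phi lies inside the extrapolation window.

    lut_low / lut_high map the Source's Phi_b (used as table address,
    an assumption) to the lower / upper window offset relative to source_phi.
    """
    low_limit = source_phi + lut_low[source_phi_b]
    high_limit = source_phi + lut_high[source_phi_b]
    return low_limit <= target_phi <= high_limit


# Toy LUT contents (not from the paper): the window centre moves linearly
# with the bending angle and the window has a half-width of 2 units.
TOY_LUT_LOW = [(addr - 128) // 4 - 2 for addr in range(256)]
TOY_LUT_HIGH = [(addr - 128) // 4 + 2 for addr in range(256)]

if __name__ == "__main__":
    # Straight track (address 128): window is source_phi - 2 .. source_phi + 2.
    print(extrapolation_ok(100, 128, 101, TOY_LUT_LOW, TOY_LUT_HIGH))
    print(extrapolation_ok(100, 128, 105, TOY_LUT_LOW, TOY_LUT_HIGH))
```

In hardware the two additions and the two comparisons run in parallel; the sequential Python code only mirrors the logic, not the timing.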
This means each Source TS needs one set of LUTs. To
keep the procedure equally simple for the
Extrapolations where the Target is in a sideways
neighbour, there are separate LUTs for these cases. The
number of Comparators corresponds to the number of
Targets, that is 4 for the same wheel and 2 for the next
wheel.
For the own wheel, the 1-2, 1-3, 1-4, 2-3, 2-4 and 4-3
Extrapolations are required. Taking into account 2 TS in
each Station and 3 directions (own Sector, right
neighbour and left neighbour) this means 6x2x3=36
Extrapolators. For the next wheel no Extrapolations are
performed from Station #1, thus the number of
Extrapolators is 3x2x3=18. This gives a total of 54
Extrapolators.
According to the simulations it is sufficient for the
Extrapolators to use only 8 Phi_b bits and 8 Phi bits, thus
the LUTs are 256x8 bit tables. This results in 221’184
LUT bits, taking into account both the low and high limit
LUTs. This amount of memory does not fit into
present FPGAs, and even if technological development
made this possible, another problem would arise:
with all Extrapolators in one chip, this chip would need
480 input and 660 output pins, not counting the Clock and
Reset inputs. Therefore partitioning is necessary.
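The Extrapolator and LUT bit counts quoted above can be reproduced with a short calculation. The variable names are illustrative; the list of next wheel pairs is derived here from the statement that no Extrapolations start from Station #1.

```python
# Source-Target station pairs for the own wheel, as listed in the text.
OWN_WHEEL_PAIRS = ["1-2", "1-3", "1-4", "2-3", "2-4", "4-3"]
# Next wheel: same pairs without those starting at Station #1.
NEXT_WHEEL_PAIRS = [p for p in OWN_WHEEL_PAIRS if not p.startswith("1-")]

TS_PER_STATION = 2         # two Track Segments per Station
DIRECTIONS = 3             # own Sector, right neighbour, left neighbour
LUTS_PER_EXTRAPOLATOR = 2  # lower and upper window limit
LUT_WORDS = 256            # 8 address bits
LUT_WORD_BITS = 8          # 8 data bits

own_wheel = len(OWN_WHEEL_PAIRS) * TS_PER_STATION * DIRECTIONS
next_wheel = len(NEXT_WHEEL_PAIRS) * TS_PER_STATION * DIRECTIONS
total_extrapolators = own_wheel + next_wheel
total_lut_bits = (total_extrapolators * LUTS_PER_EXTRAPOLATOR
                  * LUT_WORDS * LUT_WORD_BITS)

print(own_wheel, next_wheel, total_extrapolators, total_lut_bits)
# prints: 36 18 54 221184
```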
If the Extrapolator Unit is partitioned into 3 Chips, a
possible scheme that results in FPGAs of almost the same
size is the following:
Chip A: 1-2 and 2-4 Extrapolations,
Chip B: 1-3 and 2-3 Extrapolations,
Chip C: 1-4 and 4-3 Extrapolations.
This partitioning also tries to minimise the pin count, as
some Stations’ Phi values are used both as Source and
Target. Hardware level simulations were made with this
scheme and proved its feasibility using Altera
EPF10K200EGC599-1 FPGAs. The timing simulation
showed correct behaviour at the full 40 MHz clock with a
delay of 2 BX (bunch crossings). The design fills the Chip’s
total Memory Area but leaves 80% of the Logic Area
free.
3.3. The Track Linker
The Baseline Design’s Track Segment Linker and single
Track Selection unit used a complicated and hard to
maintain procedure to find the two highest priority
muon tracks, and required many steps - thus a
doubling of the working frequency was foreseen to meet
the timing requirements. This also meant using high speed
ASICs in the final realisation.
The Simplified Design deals with a reduced number of
possible Muon Tracks, as those that leave the
Processor’s Sector are not followed up in the neighbouring
sectors after they have first crossed a neighbour Station. This
allows a new approach for the Track Linker. The Linker
can set up logic conditions for the existence of all
different Tracks. To limit the number of conditions, these
do not contain the Target of the very last Extrapolation;
they just check that such an Extrapolation exists. The Linker
forms a Priority List of the Tracks fulfilling these conditions
and selects the highest priority ones. The Priority List is
ordered from the highest priority Tracks to the
lowest ones:
1-2-3-4; 1-2-3; 1-2-4; 1-3-4; 2-3-4;
1-2; 1-3; 1-4; 2-3; 2-4; 3-4
As can be seen, longer Tracks have priority over shorter
ones, and among Tracks of the same length those
originating in lower Stations have priority over those
originating in higher ones. Within one group the Tracks
remaining in the same sector have priority over those
leaving the sector.
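The ordering rules can be checked against the Priority List given above. The sketch below generates all station combinations of length 2 to 4 and sorts them by decreasing length and then by station numbers; using the plain tuple order within one length is an assumption that happens to reproduce the printed list. The preference for Tracks remaining in the same sector is not modelled.

```python
from itertools import combinations

STATIONS = (1, 2, 3, 4)


def priority_list():
    """All station patterns with 2 to 4 Stations, highest priority first."""
    tracks = [c for n in range(2, 5) for c in combinations(STATIONS, n)]
    # Longer Tracks first; within one length, lower Stations first.
    return sorted(tracks, key=lambda t: (-len(t), t))


if __name__ == "__main__":
    print("; ".join("-".join(str(s) for s in t) for t in priority_list()))
    # prints: 1-2-3-4; 1-2-3; 1-2-4; 1-3-4; 2-3-4; 1-2; 1-3; 1-4; 2-3; 2-4; 3-4
```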
As the Track Linker has the task of finding two Muon
Tracks, there is a certain possibility that the second
track found will be a sub-track of the first one. To avoid this,
the Track Linker, after finding the first Muon Track, removes
from the Priority Table all shorter Tracks
that are sub-tracks of the first one, and then repeats the
search for the highest priority track among the
remaining track candidates. This way the block scheme of

This Track Finder Scheme was also simulated at
hardware level. It fits into an Altera
EPF10K100EBC356-1 FPGA. The design was set up to
perform the task in a 7 stage pipeline. The maximum speed
was 30 MHz, which is still less than the 40 MHz required by
CMS, but technological development allows us to
expect higher speeds in the near future (Altera 20K series
or Xilinx Virtex). If these expectations are not
fulfilled, the circuit has to be redesigned with more
pipeline stages, each with less functionality.
The Track Linker delivers the TS addresses of the two
highest priority Muon Tracks found to the Fake Pair
Cancellation Unit.
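The two-pass search of the Track Linker, with the cancellation of sub-tracks after the first pass, can be modelled as follows. This is a sketch only: a Track is represented here as a tuple of hypothetical `(station, ts_index)` pairs, the priority key covers only Track length and station numbers, and the hardware evaluates these conditions in parallel logic rather than by sorting.

```python
def priority_key(track):
    """Longer Tracks first, then Tracks starting in lower Stations."""
    stations = tuple(station for station, _ in track)
    return (-len(track), stations)


def find_two_tracks(candidates):
    """Return up to two Tracks; the second is never a sub-track of the first."""
    ordered = sorted(candidates, key=priority_key)
    if not ordered:
        return []
    first = ordered[0]
    # Cancel every candidate whose Track Segments are all part of `first`.
    remaining = [t for t in ordered[1:] if not set(t) < set(first)]
    return [first] + remaining[:1]


if __name__ == "__main__":
    candidates = [
        ((1, 0), (2, 0)),          # sub-track of the 1-2-3 Track below
        ((2, 0), (3, 0)),          # sub-track of the 1-2-3 Track below
        ((2, 1), (4, 0)),          # independent second muon
        ((1, 0), (2, 0), (3, 0)),  # longest Track, found first
    ]
    for track in find_two_tracks(candidates):
        print(track)
```

The TS addresses of the two returned Tracks correspond to what the Track Linker hands over to the Fake Pair Cancellation Unit.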
3.4. The Fake Pair Cancellation Unit
As mentioned above, the Simplified Design does not
follow up the Muon Tracks in the neighbours. This also means
that Track portions belonging to the same Muon
Track might be found twice, by neighbouring Processors. In
principle they can be cancelled by the later Sorter stages based
on their similar physical parameters; however, this can be
ineffective, as different parameter assignment methods
might give slightly different results for the same track
in neighbouring sectors. The safest way is to cancel these
tracks based on their identical TS location. This is easy to
do for the next wheel neighbours, as their Track Finder
Processors are plugged into the same Crate, but more
complicated for sideways neighbours, where it requires
Inter-Crate Connections. Therefore another solution is being
considered, which performs these cancellations in the Global
Sorter, but requires forwarding the TS addresses there.
The simple Fake Pair Cancellation compares the TS
addresses of the found Muon Tracks and cancels
those that contain lower priority TS combinations. This
logic scheme requires a limited number of FPGA gates
and can be implemented as part of the Track Linker Chip,
adding one BX to the total delay.
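A possible form of the TS address comparison is sketched below. The exact cancellation criterion is not given in the text, so the sketch makes two assumptions: two Tracks are treated as the same muon if they share at least one Track Segment, and the Track with more Track Segments has the higher priority. The TS address format `(wheel, sector, station, ts_index)` and the function name are hypothetical.

```python
def cancel_fake_pairs(tracks):
    """Keep only the highest priority Track among Tracks sharing a TS.

    `tracks` is a list of frozensets of TS addresses collected from
    neighbouring Processors.
    """
    # Stable sort: longer Tracks first, original order kept for equal length.
    ordered = sorted(tracks, key=len, reverse=True)
    kept = []
    for track in ordered:
        if any(track & other for other in kept):
            continue  # shares a TS with a higher priority Track: cancel it
        kept.append(track)
    return kept


if __name__ == "__main__":
    # Hypothetical TS address: (wheel, sector, station, ts_index)
    found_here = frozenset({(0, 3, 1, 0), (0, 3, 2, 0), (0, 3, 3, 0)})
    found_by_neighbour = frozenset({(0, 3, 2, 0), (0, 3, 3, 0)})
    other_muon = frozenset({(1, 3, 2, 1), (1, 3, 4, 0)})
    for track in cancel_fake_pairs([found_by_neighbour, found_here, other_muon]):
        print(sorted(track))
```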
3.5. The Parameter Assignment Unit
The last stage of the Track Finder Processor is the
Parameter Assignment. The output of the Processor
contains at most two found Tracks, each described by the
Muon’s Pt in 5 bits, Charge in 1 bit, Phi in 8 bits, Eta in
2 bits and Track Quality in 2 bits.
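The field widths listed above add up to 18 bits per Track. The following sketch packs and unpacks such a word; the order of the fields inside the word and the function names are assumptions, as the text only specifies the widths.

```python
# (field name, width in bits); the order within the word is an assumption.
FIELDS = (("pt", 5), ("charge", 1), ("phi", 8), ("eta", 2), ("quality", 2))
TOTAL_BITS = sum(width for _, width in FIELDS)  # 18 bits per Track


def pack_track(values):
    """Pack a dict of field values into one integer, first field in the MSBs."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit into {width} bits")
        word = (word << width) | value
    return word


def unpack_track(word):
    """Inverse of pack_track: split the integer back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


if __name__ == "__main__":
    track = {"pt": 17, "charge": 1, "phi": 200, "eta": 2, "quality": 3}
    word = pack_track(track)
    print(TOTAL_BITS, format(word, "018b"), unpack_track(word) == track)
```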
The Track Quality here is a simple number showing how
many valid Extrapolations contributed to the
resulting Track. Later simulations should show whether it
makes sense to modify this number according to the quality
of the participating input Track Segments.
The Eta value at this stage simply shows where the Muon
Track leaves the wheel towards the Next Wheel
Neighbour. This value will be forwarded to the Eta Track
Finder, which refines it to 5-6 bits if a Track
Matching is possible [7].
The Phi value will be calculated as the position in
Station #2, as recommended in [8]. This means forwarding
the Station #2 TS position if the Muon Track
contains a Station #2 TS, and an interpolation or
extrapolation if no Station #2 TS is
available.
The Charge is a simple function of the Track bending and
can be derived from the TS Phi_b values.
For determining the Pt value, the present high level
simulations promise more effective methods than those
used in the Baseline Design. The most probable solution
will be to always perform the Pt calculation using two
methods in parallel and to choose the result of the better one.
Both methods will use LUTs and are expected to need a
delay of 2 BX to produce results. No hardware simulation
has been made so far.
4. SYSTEM CONSIDERATIONS
It is planned to build the Muon Track Finder Processors
on 9U high, 340 mm deep boards. They will contain a
single VME connector; the other backside connectors will
be plugged into a custom-designed motherboard. The
Simplified Design requires all Processors of
the same sector (one Wedge) to be placed in one Crate, as in
this Design there are no more sideways neighbour connections
(except Inputs) but there are next wheel neighbour ones.
This also means that the Baseline Design’s Wheel Sorter
will be replaced by a Wedge Sorter in each Crate.
If the Input Link Receiver size allows all parts of the
Processor to fit on one single board, two Wedges
(2x6 Processors plus Eta Track
Finder and Wedge Sorter) can be housed in one Crate. With
3 Crates per Rack, 2 Racks are necessary. If the Input Link
Receiver requires its own board, one Crate can only
contain the Processors of one Wedge. In this case
four Racks have to be foreseen.
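The Rack counts follow from simple arithmetic. The total of 12 Wedges used below is not stated in this section; it is inferred from the quoted numbers (2 Racks x 3 Crates x 2 Wedges) and is consistent with the twelve sectors of the CMS barrel. The function name is illustrative.

```python
import math

WEDGES = 12          # inferred, see the note above
CRATES_PER_RACK = 3  # as stated in the text


def racks_needed(wedges_per_crate):
    """Number of Racks for a given number of Wedges housed in one Crate."""
    crates = math.ceil(WEDGES / wedges_per_crate)
    return math.ceil(crates / CRATES_PER_RACK)


print(racks_needed(2))  # Link Receivers on the mainboard: prints 2
print(racks_needed(1))  # separate Input Receiver boards: prints 4
```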
In order to have a constant environment, water cooled fan
units are foreseen below each Crate.
5. HARDWARE DEVELOPMENT
As mentioned above, the Simplified Design is being
developed in an environment close to the hardware. The modules
are described either as Altera AHDL descriptions or

Fig. 4 shows the development path from the AHDL or
VHDL description to the programmed chip or chip
simulation. Using the VHDL behavioural description as the
design base offers a big advantage, as it allows a direct
comparison between the behavioural simulation and the
resulting chip simulation. Unfortunately, using
the VHDL files directly as inputs to the Altera MaxPlus®
development system was not successful, because this
program imposes strong limitations on the VHDL code in
order to synthesise it correctly. Third party tools, like
Leonardo®, promise a better solution here, which is being
investigated.
6. SYSTEM SETUP AND CONTROL
For board level control, checking and data spying,
JTAG Boundary Scan methods seem to be the best
choice. They allow testing the board’s connections and
signals and programming the FPGAs. Extending the JTAG
chains beyond board level, however, would introduce
serious electrical and logistic difficulties. Therefore the
JTAG chains are planned to remain board-internal loops
controlled by the VME interface.
The system level control should be modular, easily
maintainable and allow remote operation. Based on the
results of the CMS DAQ groups at the Test Beams [8] we
investigate the possibilities of a Java based environment




REFERENCES
[1] A. Kluge, T. Wildschek, The Track Finder of the CMS First
Level Muon Trigger, Proceedings of the Third Workshop
on Electronics for LHC Experiments, London, Great
[2] A. Kluge, T. Wildschek, The Hardware Muon Trigger Track
Finder Processor in CMS - Specification and Method,
Architecture and Algorithm, Prototype and Final
[3] G.M. Dallavale, M. Fierro, V. Genchev, C. Grandi,
N. Neumeister, P. Porth, H. Rohringer, A simplified Track
[4] W.H. Smith, CMS Synchronisation Workshop
[5] G.M. Dallavale, N. Neumeister, C-E. Wulz, B.P. Padley,
G. Wrochna, J. Hauser, D. Acosta, Issues Related to the
Separation of the Barrel and Endcap Muon Trigger Track-
[6] R. Martinelli, A.J. Ponte Sancho, P. Zotto, Design of the
[7] M. Kloimwieder, Improving the K-Assignment of the
DTBX Based First Level Regional Muon Trigger, CMS
Note in progress
[8] G. Wrochna, Muon Trigger Objects, CMS Draft Note, see:
cmsdoc.cern.ch/~wrochna/tmp/trobj.ps
