The Gigafitter in track reconstruction for the search of rare events at hadron colliders by RAFANELLI, NICOLA
Università di Pisa
Faculty of Mathematical, Physical and Natural Sciences
Master's Degree Programme in Physics
Academic Year 2007/2008
Master's Thesis
The Gigafitter in track reconstruction
for the search of rare events
at hadron colliders
Supervisor:
Prof.
Mauro Dell’Orso
Candidate:
Nicola Rafanelli
Contents
Introduction
1 Rare b decays
1.1 The Standard Model
1.2 Flavor Changing Neutral Currents
1.3 Heavy flavor production in hadron collisions
2 Experimental apparatuses
2.1 Tevatron
2.1.1 CDF II detector
2.2 LHC
2.2.1 ATLAS: A Toroidal Lhc ApparatuS
3 The Fast Tracker
3.1 Online track reconstruction
3.2 General structure of FTK
3.2.1 Data Organizer
3.2.2 Associative Memory
3.3 Track fitting with linear constraints
3.4 The FTK predecessor: SVT
3.4.1 Track Fitter++
4 Gigafitter
4.1 General description
4.2 Hardware description
4.2.1 Pulsar Board
4.2.2 Mezzanine Board
4.2.3 Virtex-5
4.2.4 Clock
4.3 Input FIFO
4.3.1 Input protocol
4.4 Combiner
4.5 Fitter
4.5.1 Scalar products with DSP48E slice
4.6 Track selection: the Comparator
4.7 Hit spy
4.8 Formatter
4.9 Firmware on the Pulsar Board
4.9.1 VME interface
4.10 Next steps
4.11 The Gigafitter contribute to SVT
4.12 Switching from SVT to FTK
5 Application to rare b decays
5.1 FTK simulation
5.1.1 Generation of internal FTK data banks
5.2 FTK tracking performance
5.3 Application to $B^0_s \to \mu^+\mu^-$
Conclusions
Bibliography
Introduction
The high luminosity and energy reached at modern hadron colliders allow the
search for rare events that could not be detected until now. In particular, the
detection of some events that are highly suppressed in the current formulation of
the Standard Model (SM) would reveal the presence of new physics beyond the SM.
On the other hand, the same characteristics that give hadron colliders an
increased capability of producing rare events also bring such a large number of
background events, and such an amount of data, that the current storage capability
is overwhelmed. Therefore, a trigger system is imperative in order to drastically
reduce the event rate by selecting interesting events. Trigger algorithms benefit
greatly from the availability of reconstructed particle trajectories but, to be
effective, the reconstruction must be very efficient and must be done at full
tracker resolution. For a long time this approach was systematically discarded
because the available and proposed implementations were cumbersome and too
expensive. Full trajectory reconstruction of all events was introduced for the
first time in the Level-2 trigger of the CDF experiment, operating at the Tevatron
collider. The goal was reached thanks to the SVT processor, which has shown that
this approach is not only possible but also very effective, and so successful that
it earned a Panofsky prize[1].
The new Large Hadron Collider (LHC), with its centre-of-mass energy of
$\sqrt{s} = 14$ TeV and a bunch spacing of 25 ns, will be a very challenging
environment for the existing trigger technology, especially in the high luminosity
run at a peak luminosity of $10^{34}\ \mathrm{cm^{-2}\,s^{-1}}$. The Level-1
trigger has to collect events at a rate of 40 MHz, while the Level-2 will work at a
maximum input rate of 75 kHz. These requirements push the existing hardware to its
limits, with the risk of limiting the ability of the experiment to collect
interesting physics samples.
The Fast Tracker (FTK) is dedicated hardware for on-line pattern recognition of
tracking detector data that has been proposed for the ATLAS experiment. FTK is an
evolution of the SVT processor which can perform high-quality track
reconstruction, with performance close to that of the off-line software. This will
be performed at the very high event rates output by the Level-1 trigger, i.e. up
to 50-100 kHz.
In this thesis I present my work and results on the development of the
Gigafitter: a processor, based on the most recent FPGA technology, designed to
reconstruct high-quality track parameters at a rate of one fit per nanosecond and
with a latency of 50-100 ns. The Gigafitter has been developed to replace the SVT
Track Fitter with the goal of improving the system performance by overcoming
current hardware limitations. The computational power provided by the new Xilinx
Virtex-5 FPGA chips allows the use of the Gigafitter inside the FTK processor, in
the much more demanding environment of the ATLAS experiment.
In the first chapter, I briefly summarize the theoretical aspects that lead to
the choice of rare decays of b quarks for the search for new physics. In the second
chapter some characteristics of the CDF and ATLAS experiments are described, in
order to understand the working environments of the Gigafitter. In the third
chapter the FTK processor is presented, showing how it performs track
reconstruction in a very short time. My personal contribution to the Gigafitter
development is presented in Chapter 4. I describe in detail the structure of the
firmware I have written to perform track fitting, and show the technical solutions
used to implement a linearized fitting algorithm and the selection of tracks.
Finally, in the last chapter I present some studies that have been carried out in
order to estimate the Fast Tracker performance in the physics case of rare
$B^0_s \to \mu^+\mu^-$ decay events.
Chapter 1
Rare b decays
1.1 The Standard Model
The Standard Model (SM) is the current theory that describes the electromagnetic,
weak and strong interactions, where the electromagnetic and the weak interactions
are unified in the electro-weak interaction. The theory has been extensively tested
over the past years and has been able to reproduce the experimental data
effectively. The basic building blocks of matter, known as matter particles, are
twelve fermions and their anti-particles, grouped into leptons and quarks. In the
theory both leptons and quarks are organized in three generations, as shown in the
following table:
Generation    1st          2nd            3rd             Q
Leptons       $e$          $\mu$          $\tau$          $-1$
              $\nu_e$      $\nu_\mu$      $\nu_\tau$      $0$
Quarks        $u$          $c$            $t$             $+2/3$
              $d$          $s$            $b$             $-1/3$
The charged leptons interact via the electromagnetic and weak interactions, the
neutrinos only via the weak interaction, while quarks experience all three
interactions. The other fundamental building blocks of the theory are five bosons:
$\gamma$ (photon), $W^\pm$, $Z^0$, $g$ (gluon) and the not yet observed $H$
(Higgs). Exhaustive explanations of the model can be found in [2] and [3]. In this
section I briefly mention the basic features of a part of the model, known as the
heavy-flavor sector, that is of interest for this thesis. Its experimental
investigation is mostly based on the study of decays of hadrons containing heavy
quarks (c or b quarks) into lighter states. The flavor changing interactions in the
SM are due to weak interactions between quarks. Weak eigenstates $u_1, u_2, u_3$
and $d_1, d_2, d_3$ are linear combinations of the mass eigenstates $u, c, t$ and
$d, s, b$:

$$
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
= U \begin{pmatrix} u \\ c \\ t \end{pmatrix},
\qquad
\begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}
= D \begin{pmatrix} d \\ s \\ b \end{pmatrix}
\tag{1.1}
$$
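The charged weak interaction couples u-like and d-like quarks with the Lagrangian[4]

$$
\mathcal{L}_{Wq} = -\frac{g}{2\sqrt{2}}
\left[ \bar{u}\gamma^\mu (1-\gamma^5)\, d\, W^+_\mu
     + \bar{d}\gamma^\mu (1-\gamma^5)\, u\, W^-_\mu \right]
\tag{1.2}
$$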
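where $g$ is the coupling constant. The current terms $\bar{u}_i d_i$ with
$i = 1 \ldots 3$ lead to the generation mixing:

$$
(\bar{u}_1, \bar{u}_2, \bar{u}_3)
\begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}
= (\bar{u}, \bar{c}, \bar{t})\, U^\dagger D
\begin{pmatrix} d \\ s \\ b \end{pmatrix}
\tag{1.3}
$$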
where the row vector represents the three functions of the u-like states and the
column vector those of the d-like states. $V_{CKM} = U^\dagger D$ is a
$3 \times 3$ unitary matrix called the Cabibbo-Kobayashi-Maskawa matrix.
The matrix can be written as:

$$
V_{CKM} =
\begin{pmatrix}
V_{ud} & V_{us} & V_{ub} \\
V_{cd} & V_{cs} & V_{cb} \\
V_{td} & V_{ts} & V_{tb}
\end{pmatrix}
\tag{1.4}
$$
where the $V_{ij}$ elements are complex numbers. The unitarity constraint reduces
the matrix observables to four parameters. It can be shown that possible
parameterizations use three mixing angles and a complex phase. The
Kobayashi-Maskawa parameterization is one of the possible parameterizations:

$$
\begin{pmatrix}
c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\
-s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\
s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}
\end{pmatrix}
\tag{1.5}
$$
where $s_{ij} = \sin\theta_{ij}$, $c_{ij} = \cos\theta_{ij}$ and $\delta$ is a
phase responsible for all CP-violating phenomena in flavor changing processes in
the SM. For the neutral current interaction the Lagrangian is:

$$
\mathcal{L}_{Zq} = -\frac{g}{4\cos\theta_W}
\left\{ \bar{u}\gamma^\mu \left[ (1-\gamma^5)\tau_3 - 2\sin^2\theta_W\, Q \right] u\, Z_\mu
      + \bar{d}\gamma^\mu \left[ (1-\gamma^5)\tau_3 - 2\sin^2\theta_W\, Q \right] d\, Z_\mu \right\}
\tag{1.6}
$$
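where only $\bar{u}u$-like or $\bar{d}d$-like interactions may occur. As for the
charged current, we can write:

$$
(\bar{u}_1, \bar{u}_2, \bar{u}_3)
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
= (\bar{u}, \bar{c}, \bar{t})\, U^\dagger U
\begin{pmatrix} u \\ c \\ t \end{pmatrix}
\tag{1.7}
$$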
where $U^\dagger U = 1$, because $U$ is unitary. Therefore, there is no mixing and,
consequently, Flavor Changing Neutral Current interactions cannot occur at tree level.
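The unitarity argument above can be checked numerically. The following is only a minimal sketch, not part of the original analysis: it builds two arbitrary unitary matrices with the parameterization of Eq. (1.5), using purely illustrative angles and phases (they are not measured values), and compares the neutral-current combination $U^\dagger U$ with the charged-current combination $U^\dagger D$. The helper names `mixing_matrix`, `dagger`, `matmul` and `distance_from_identity` are hypothetical.

```python
import cmath
import math


def mixing_matrix(t12, t23, t13, delta):
    """3x3 unitary matrix in the form of Eq. (1.5): three angles, one phase."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s23, c23 = math.sin(t23), math.cos(t23)
    s13, c13 = math.sin(t13), math.cos(t13)
    ep = cmath.exp(1j * delta)   # e^{+i delta}
    em = cmath.exp(-1j * delta)  # e^{-i delta}
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep,
         c12 * c23 - s12 * s23 * s13 * ep,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep,
         -c12 * s23 - s12 * c23 * s13 * ep,
         c23 * c13],
    ]


def dagger(m):
    """Hermitian conjugate (conjugate transpose) of a 3x3 matrix."""
    return [[m[j][i].conjugate() for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def distance_from_identity(m):
    """Largest absolute deviation of any element of m from the identity."""
    return max(abs(m[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))


# Illustrative (hypothetical) rotations for the u-like and d-like sectors.
U = mixing_matrix(0.3, 0.2, 0.1, 0.5)
D = mixing_matrix(0.7, 0.4, 0.2, 1.0)

neutral = matmul(dagger(U), U)  # appears in Eq. (1.7)
charged = matmul(dagger(U), D)  # V_CKM of Eq. (1.3)

print("U^dagger U deviation from identity:", distance_from_identity(neutral))
print("U^dagger D deviation from identity:", distance_from_identity(charged))
print("V^dagger V deviation from identity:",
      distance_from_identity(matmul(dagger(charged), charged)))
```

With these inputs the neutral-current product is the identity up to rounding, so no flavor mixing appears, while $U^\dagger D$ is unitary but not diagonal, which is the origin of generation mixing in the charged current.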
Despite the great success of the model, parts of it are still under
investigation. Some phenomena (dark matter, neutrino masses, ...) do not find an
explanation in the current formulation of the SM. Many extensions to the Standard
Model have been proposed, such as SUSY, MSSM and others. These models, to which
we can generically refer as New Physics (NP), become effective at energies greater
than the SM cut-off, where a direct observation of new particles is expected. This
moves the direct discovery of NP to higher and higher energies, which are
difficult and expensive, or even impossible, to reach in the near future.
Therefore, the only remaining option is an indirect search for NP, which must
exploit the detection of tiny effects and corrections to the SM. From the
experimental point of view, the detection would be possible only in those
processes where these corrections dominate, that is, those highly suppressed in
the SM. This results in very rare events that have to be detected against huge
background samples.
1.2 Flavor Changing Neutral Currents
Flavor Changing Neutral Currents (FCNC) may occur in the Standard Model only
beyond the tree level, requiring at least a one-loop transition because of the
absence of mixing in the neutral weak interaction. These processes are therefore
strongly suppressed. In the B physics area the FCNC interactions involve the
transitions $b \to s$ and $b \to d$, as in the $B^0_{(s)} \to \mu^+\mu^-$ decay.
The expected SM branching ratios (BRs) for these processes are:

$$ \mathcal{B}(B^0_s \to \mu^+\mu^-) = (3.42 \pm 0.54) \times 10^{-9} \tag{1.8a} $$
$$ \mathcal{B}(B^0 \to \mu^+\mu^-) = (1.00 \pm 0.14) \times 10^{-9} \tag{1.8b} $$
In the presence of New Physics, additional contributions enter the calculation
of the BR and can raise it by orders of magnitude. Hence, the $B^0_s$ decay mode
to muons can be a very important benchmark for the SM and for the physics beyond
it. From an experimental point of view this decay is preferred to
$B^0_{(s)} \to e^+e^-$ and $B^0_{(s)} \to \tau^+\tau^-$ because of a more
efficient suppression of background.

Figure 1.1: Two Feynman diagrams of FCNC for the $B^0_s \to l^+l^-$ process, where $l$ is a
lepton. This process is strongly suppressed in the Standard Model and is
consequently a good way to test the model.
1.3 Heavy flavor production in hadron collisions
Figure 1.2: Examples of $b\bar{b}$ production Feynman diagrams in $p\bar{p}$ collisions. The
reported processes are known as direct production, gluon fusion, flavor
excitation and gluon splitting.
The study of processes involving heavy-flavored quarks is part of the physics
program of almost every major particle physics experiment. In the past these
studies have been carried out at different accelerator machines: $e^+e^-$ general
purpose colliders, such as LEP, fixed target experiments, as CLEO, hadron
colliders, as the Tevatron, and $e^+e^-$ colliders at the $b\bar{b}$ resonances,
known as B-Factories. In recent years, results obtained at the Tevatron collider
and the predictions for rare b-decay studies at the LHC have given new motivation
to investigate this part of high-energy physics at hadron colliders. In this type
of machine the b-quark pair production rate is very high. For example, at typical
Tevatron luminosities the production rate of b-hadrons within the CDF experiment
acceptance is about 1,000 per second, which is far larger than the rates
obtainable at leptonic machines.
                                        BaBar                  Belle                  CDF                  LHC
Luminosity [$\mathrm{cm^{-2}\,s^{-1}}$]   $4.6\times10^{33}$     $8.3\times10^{33}$     $1\times10^{32}$     $1\times10^{34}$
$\sigma_{b\bar{b}}$                       1.15 nb                1.15 nb                100 µb               500 µb
Production rate                         5 Hz                   10 Hz                  1000 Hz              500 kHz
$\sigma_{b\bar{b}}/\sigma_{had}$          0.25                   0.25                   $\approx 10^{-3}$

Table 1.1: Comparison of the $b\bar{b}$ production rate in different environments. For
the LHC, predicted values are reported.
Tab. 1.1 compares different experiments to point out the differences in the
production rate and in the luminosity needed to reach such values. Another
relevant advantage of b-physics at hadron colliders is the available
centre-of-mass energy. At the Tevatron ($\sqrt{s} = 1.96$ TeV) or at the LHC
($\sqrt{s} = 14$ TeV) all species of b-hadrons can be produced. Also, the
relativistic factor $\beta\gamma$ (Lorentz boost) of b-hadrons produced in hadron
collisions is larger than at B-Factories, with a consequently longer decay length,
allowing more precise measurements of the time evolution of heavy-flavour states.
In particular, this was important for the measurement of the $B^0_s$ oscillation
frequency[5]. The very large $b\bar{b}$ cross-section at hadron colliders comes,
however, with a much larger total $pp$ (or $p\bar{p}$) cross-section. As a result,
the interesting events have a signal-to-background ratio, i.e. with respect to the
total production, of $O(10^{-9})$. Searching for such a rare signal is a primary
challenge at hadron colliders, which has been faced with sophisticated trigger
systems capable of identifying interesting events while rejecting most of the
background. One system designed for this purpose for the ATLAS experiment at the
LHC is the FTK processor, the evolution of the SVT processor now in use in CDF at
the Tevatron collider. The next chapter is an introduction to the two experiments
and their trigger systems.
Chapter 2
Experimental apparatuses
In this chapter I give a brief description of the experimental apparatuses to which
the work of this thesis refers.
The first section contains a description of the Tevatron collider and the CDF
experiment. Some details of this detector are needed in order to understand the
use of the SVT system, which is the FTK predecessor and will be used for testing
the performance of the Gigafitter in the field. After that, the CERN hadron
collider LHC and the ATLAS experiment are described, with attention focused on
those parts that are relevant for the operation of the Fast Tracker.
2.1 Tevatron
Tevatron[6] is a particle accelerator at the Fermi National Accelerator Laboratory
in Batavia, Illinois. Before the construction of the LHC¹ it was the highest energy
particle collider in the world. It is a synchrotron accelerating protons and
anti-protons that started operating in 1987 with the so-called Run 0. Afterwards it
was continuously upgraded until the year 2000, when Run II started. In Run II the
center-of-mass energy provided by the accelerator is 1.96 TeV, colliding proton
and anti-proton beams collected in bunches spaced in time by 396 ns. One of the
most relevant characteristics of such machines is the instantaneous luminosity
($L$). This is related to the rate of production of a particular physical state by
the relation:

$$ \mathrm{rate}\,[\mathrm{s^{-1}}] = L\,[\mathrm{cm^{-2}\,s^{-1}}] \cdot \sigma\,[\mathrm{cm^{2}}] \tag{2.1} $$

where $\sigma$ is the production cross-section for the process of interest. The
instantaneous luminosity depends on the accelerator parameters and can be
expressed by the formula:

$$ L = f\,\frac{n_1 n_2}{4\pi\sigma_x\sigma_y} \tag{2.2} $$

where $f$ is the frequency of the collisions (50Hz in the Tevatron), $n_i$ the
number of particles in each beam, and $\sigma_{x,y}$ the transverse beam widths
(using a Gaussian approximation for the transverse beam shape). During Run II the
peak luminosity reached the value of $L = 3\times10^{32}\ \mathrm{cm^{-2}\,s^{-1}}$.
Fig. 2.1 shows a sketch of Fermilab’s accelerator system that points out the
complexity of the machine.

¹ LHC will be operational in 2009.
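Eqs. (2.1) and (2.2) translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the rate check reuses the BaBar column of Table 1.1, and the beam parameters passed to the luminosity function are placeholder values chosen for the example, not Tevatron machine parameters.

```python
import math

NB_TO_CM2 = 1e-33  # 1 nb = 1e-9 b, with 1 b = 1e-24 cm^2


def production_rate(luminosity, cross_section_cm2):
    """Eq. (2.1): rate [1/s] = L [cm^-2 s^-1] * sigma [cm^2]."""
    return luminosity * cross_section_cm2


def instantaneous_luminosity(f_hz, n1, n2, sigma_x_cm, sigma_y_cm):
    """Eq. (2.2): L = f n1 n2 / (4 pi sigma_x sigma_y), Gaussian beams."""
    return f_hz * n1 * n2 / (4.0 * math.pi * sigma_x_cm * sigma_y_cm)


# Rate check with the BaBar entries of Table 1.1:
# L = 4.6e33 cm^-2 s^-1 and sigma_bb = 1.15 nb.
babar_rate = production_rate(4.6e33, 1.15 * NB_TO_CM2)
print(f"b-bbar pairs per second at BaBar: {babar_rate:.2f}")

# Placeholder beam parameters (assumptions made only for this example).
lumi = instantaneous_luminosity(
    f_hz=1.0e6,         # hypothetical collision frequency
    n1=1.0e11,          # hypothetical particles per bunch, beam 1
    n2=1.0e11,          # hypothetical particles per bunch, beam 2
    sigma_x_cm=30e-4,   # hypothetical 30 micron transverse width
    sigma_y_cm=30e-4,
)
print(f"Luminosity for the placeholder beams: {lumi:.2e} cm^-2 s^-1")
```

The BaBar rate comes out close to the 5 Hz listed in Table 1.1, and the luminosity function makes explicit that $L$ grows linearly with each bunch population and inversely with the transverse beam area.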
Figure 2.1: The figure shows Fermilab’s accelerator system[7]
2.1.1 CDF II detector
The upgraded CDF detector[8] is a multi-purpose solenoidal magnetic spectrometer,
placed at one of the interaction points of the Tevatron collider. Fig. 2.2 gives a
schematic view of the detector and its parts; each part has a different role in
determining the properties of the particles produced in the collisions.
A superconducting solenoid generates a 1.4 T magnetic field in the inner part
of the detector, which houses the tracking system devoted to detecting charged
particles and measuring their charge signs, trajectories and momenta. Outside the
solenoid are the electromagnetic and hadronic calorimeters, which measure the
energy of the emerging particles. The outermost part of the detector is used to
detect muons, which are able to pass through the calorimeters without being
absorbed.

Figure 2.2: Section of the CDF II detector
The tracking system of CDF (Fig. 2.3) is the sub-system of the detector that
is exploited by SVT to reconstruct the particle trajectories. Going from the
innermost part outwards through the shell structure, it is composed of three
silicon detectors, L00, SVXII and ISL, and the drift chamber COT. Layer 00 (L00)
is a single-layer micro-strip detector placed just outside the beryllium
beam-pipe. It allows a very precise measurement of the impact parameter but it
cannot be used by SVT because its pedestal fluctuations generate a huge amount of
data. Its information is only used after the SVT filter, at a reduced event rate.
SVXII (Silicon Vertex) is a fine-resolution micro-strip vertex detector that
provides five three-dimensional samplings of tracks. It is segmented into twelve
30° azimuthal sectors (wedges) in the $\phi$ direction and into six barrels in the
$z$ direction. The SVXII pseudo-rapidity² coverage is $|\eta| \lesssim 2$, which
corresponds to a total length of 192 cm along the beam axis. The Intermediate
Silicon Layer (ISL) is a silicon tracker placed between SVXII and the drift
chamber, covering the $|\eta| \lesssim 2$ pseudo-rapidity range. As with L00, ISL
is not used for track reconstruction by SVT. A large volume between the silicon
trackers and the solenoid is occupied by the Central Outer Tracker (COT). This
multi-wire, open-cell drift chamber provides charged particle tracking at large
radii in the $|\eta| \lesssim 1.5$ pseudo-rapidity region.

Figure 2.3: Elevation view of a quadrant of the inner portion of the CDF II detector,
showing the tracking volume surrounded by the solenoid and the forward
calorimeters

² The pseudo-rapidity $\eta$ is defined as $\eta = -\ln\tan(\theta/2)$, where $\theta$ is
the polar angle (colatitude) with respect to the beam axis.
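The pseudo-rapidity limits quoted for the tracking detectors are easier to picture as polar angles. The following short sketch, with hypothetical function names, implements the definition $\eta = -\ln\tan(\theta/2)$ given in the footnote together with its inverse, and prints the polar angle corresponding to each coverage limit mentioned in the text.

```python
import math


def pseudorapidity(theta_rad):
    """eta = -ln(tan(theta / 2)), theta being the polar angle in radians."""
    return -math.log(math.tan(theta_rad / 2.0))


def polar_angle(eta):
    """Inverse relation: theta = 2 * atan(exp(-eta)), in radians."""
    return 2.0 * math.atan(math.exp(-eta))


# Coverage limits quoted in the text: COT (1.5) and SVXII / ISL (2).
for eta_limit in (1.5, 2.0):
    theta_deg = math.degrees(polar_angle(eta_limit))
    print(f"|eta| = {eta_limit}: polar angle of about {theta_deg:.1f} degrees")
```

A particle emitted perpendicular to the beam ($\theta = 90°$) has $\eta = 0$, and larger $|\eta|$ corresponds to directions closer to the beam axis.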
CDF Trigger System
The very high rate of collision events and the large amount of data coming from all
parts of the detector at a hadron collider force us to find a way to reduce the
amount of data that has to be collected and permanently stored. To make a fast
and reasonable selection of interesting events within the background, an advanced
trigger system is needed. At a typical Tevatron instantaneous luminosity
$L \approx 100\times10^{30}\ \mathrm{cm^{-2}\,s^{-1}}$ and with an inelastic
$p\bar{p}$ cross-section of $\sigma_{p\bar{p}} \approx 60$ mb, approximately
$2.3\times10^{6}$ inelastic collisions per second occur, corresponding to one
interaction per bunch-crossing on average. The average size of the information
associated with each event is 140 kbytes. Even with a deadtime-less read-out of
the detector, recording all events would require an approximate throughput and
storage rate of 350 Gbyte/s, largely beyond the possibility of currently available
technology. The read-out system has to reduce the 2.3 MHz interaction rate to the
100 Hz storage rate attainable at CDF.
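The bandwidth figures in this paragraph follow from multiplying an event rate by the event size. The sketch below redoes that arithmetic with the numbers quoted in the text (2.3 MHz, 100 Hz, 140 kbytes); the function name is hypothetical and decimal prefixes (1 kbyte = 1000 bytes) are an assumption.

```python
def bandwidth_bytes_per_s(event_rate_hz, event_size_bytes):
    """Data volume per second needed to record every event at a given rate."""
    return event_rate_hz * event_size_bytes


EVENT_SIZE_BYTES = 140e3      # 140 kbytes per event, decimal prefix assumed
INTERACTION_RATE_HZ = 2.3e6   # inelastic interaction rate quoted in the text
STORAGE_RATE_HZ = 100.0       # storage rate attainable at CDF

untriggered = bandwidth_bytes_per_s(INTERACTION_RATE_HZ, EVENT_SIZE_BYTES)
triggered = bandwidth_bytes_per_s(STORAGE_RATE_HZ, EVENT_SIZE_BYTES)

print(f"Without a trigger: {untriggered / 1e9:.0f} Gbyte/s")
print(f"After the trigger: {triggered / 1e6:.0f} Mbyte/s")
```

The untriggered figure comes out at roughly 320 Gbyte/s, the same order as the approximate 350 Gbyte/s quoted above, while the triggered stream is four orders of magnitude smaller.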
The CDF trigger system is based on three levels, each level receiving filtered
events from the previous one and determining, with more time available for
processing, whether the event data satisfy one of a set of predefined criteria.
The first two levels are implemented in dedicated hardware while the third level
is software based. Fig. 2.4 shows how information flows through the different
parts of the Data Acquisition System (DAQ). At Level-1, approximate sets of
information are extracted from the raw data coming from the calorimeters, the COT
and the muon system by custom-designed hardware that performs a first selection,
reducing the rate to approximately 50 kHz, which is the input rate of the Level-2.

Figure 2.4: Data flow for the CDF detector trigger/DAQ system
At Level-2, an asynchronous system of custom-designed hardware processes the
events accepted by the Level-1. Additional information from the shower-maximum
strip chambers in the central calorimeter and the axial hits in the SVXII is
combined with the Level-1 primitives to produce Level-2 primitives. At this stage
information from SVXII is combined with Level-1 COT track primitives by the
SVT hardware system to form two-dimensional tracks with resolution similar to
the off-line one. The event rate is then reduced to 600 Hz for the Level-3 input.
Data from Level-2 are transferred to Level-3, based on software analysis running
on commercial CPUs, where the event reconstruction benefits from full detector
information and improved resolution with respect to the preceding trigger levels,
including three-dimensional track reconstruction, tight matching between tracks
and calorimeter or muon information, and more precise calibrations. If an event
satisfies the Level-3 requirements the corresponding event record is transferred
to mass storage at a maximum rate of 20 Mbyte/s. A peculiar aspect of the CDF
trigger system that must be pointed out once more is the use of reconstructed
tracks of charged particles from the first trigger level onwards. At Level-1 the
reconstruction is made by XFT (eXtremely Fast Tracker) with the information from
the COT, while at the second level SVT also uses full-resolution information
coming from SVXII, computing the track parameters within 20 µs.
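The rates quoted for the three levels can be combined into per-level rejection factors. The sketch below assumes, for illustration, that the chain is 2.3 MHz of interactions, 50 kHz into Level-2, 600 Hz into Level-3 and 100 Hz to storage, as quoted in the text; the function name is hypothetical. It also works out the event rate that the 20 Mbyte/s storage limit would allow for 140 kbyte events (decimal prefixes assumed).

```python
def rejection_factors(rates_hz):
    """Rate reduction achieved between consecutive stages of a trigger chain."""
    return [upstream / downstream
            for upstream, downstream in zip(rates_hz, rates_hz[1:])]


# Interaction rate, Level-2 input, Level-3 input, storage rate (from the text).
CDF_RATES_HZ = [2.3e6, 50e3, 600.0, 100.0]

factors = rejection_factors(CDF_RATES_HZ)
for level, factor in zip(("Level-1", "Level-2", "Level-3"), factors):
    print(f"{level}: rate reduced by a factor of about {factor:.0f}")

# Event rate allowed by the 20 Mbyte/s storage limit for 140 kbyte events.
max_storage_rate_hz = 20e6 / 140e3
print(f"Storage bandwidth limit: about {max_storage_rate_hz:.0f} events/s")
```

Under these assumptions the two hardware levels together provide most of the rejection, and the storage bandwidth limit corresponds to an event rate slightly above the 100 Hz quoted for CDF.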
2.2 LHC
LHC (Large Hadron Collider) is the world’s largest and highest-energy particle
accelerator complex, designed to collide two counter-rotating beams of protons or
heavy ions. It is being built in a circular tunnel 27 km in circumference, buried
50 to 175 m underground. Proton-proton collisions are foreseen at a
centre-of-mass energy of 14 TeV (7 times the energy available at the Tevatron).
Moreover, the instantaneous luminosity expected for the first three years (low
luminosity run) is $10^{33}\ \mathrm{cm^{-2}\,s^{-1}}$, rising to
$10^{34}\ \mathrm{cm^{-2}\,s^{-1}}$ in the high luminosity run. Given that the
total p-p cross section at 14 TeV is expected to be $\approx 80$ mb, this very
high luminosity yields an event production rate of about $10^{9}\ \mathrm{s^{-1}}$
in the high luminosity run. A further very challenging attribute of the machine is
the bunch spacing of 25 ns, which gives a crossing rate of 40 MHz with
approximately 600 million inelastic events per second.
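The LHC figures in this paragraph can be cross-checked with the rate relation of Eq. (2.1). The sketch below uses only numbers quoted in the text; the function names are hypothetical, and the estimate of the number of inelastic events per crossing assumes, as a simplification, that every 25 ns slot contains colliding bunches.

```python
MB_TO_CM2 = 1e-27  # 1 mb = 1e-3 b, with 1 b = 1e-24 cm^2


def crossing_rate_hz(bunch_spacing_s):
    """Bunch-crossing rate for a given bunch spacing."""
    return 1.0 / bunch_spacing_s


def event_rate_hz(luminosity, cross_section_mb):
    """Eq. (2.1) with the cross-section given in millibarn."""
    return luminosity * cross_section_mb * MB_TO_CM2


def mean_events_per_crossing(rate_hz, crossings_hz):
    """Average number of events per crossing (all crossings assumed filled)."""
    return rate_hz / crossings_hz


crossings = crossing_rate_hz(25e-9)       # 25 ns bunch spacing
total_rate = event_rate_hz(1e34, 80.0)    # high luminosity run, 80 mb
pile_up = mean_events_per_crossing(600e6, crossings)

print(f"Crossing rate: {crossings / 1e6:.0f} MHz")
print(f"Total p-p event rate: {total_rate:.1e} per second")
print(f"Inelastic events per crossing: about {pile_up:.0f}")
```

The total rate comes out at 8e8 per second, consistent with the order of magnitude quoted above, and the many simultaneous inelastic events per crossing illustrate why the LHC is such a demanding environment for tracking at trigger level.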
Collisions at the LHC take place inside four experiments: ALICE[9], devoted to
the study of heavy ion collisions, LHCb[10], dedicated to B physics, and the
two large general purpose experiments ATLAS[11] and CMS[12].
2.2.1 ATLAS: A Toroidal Lhc ApparatuS
The high interaction rates, particle multiplicities and energies that the LHC
accelerator provides, as well as the requirements for precision measurements, have
set new standards for the design of particle detectors. ATLAS is one of the two
general purpose detectors that have been built for probing p-p collisions in such
a demanding environment. The ATLAS Collaboration is formed by about 1800
physicists and engineers from 146 different institutions located all around the
world. Fig. 2.5 shows an overview of the detector, which is 46 m long, 22 m high
and weighs some 7000 tons.

Figure 2.5: Overview of the ATLAS detector

The ATLAS detector consists of four major components:
• Inner tracker: measures the momentum of each charged particle
• Calorimeter: measures the energies carried by the particles
• Muon spectrometer: identifies and measures muons
• Magnet system: bends charged particles for momentum measurement
As shown in [13], the magnet configuration comprises a thin superconducting
solenoid surrounding the inner-detector cavity, and three large superconducting
toroids (one barrel and two end-caps) arranged with an eight-fold azimuthal
symmetry around the calorimeters. This fundamental choice drove the design of the
rest of the detector. The inner detector is immersed in a 2 T solenoidal field.
Pattern recognition, momentum and vertex measurements are achieved with a
combination of discrete high-resolution semiconductor pixel and strip detectors in
the inner part of the tracking volume, and straw-tube tracking detectors with the
capability to generate and detect transition radiation in its outer part.
High-granularity liquid-argon (LAr) electromagnetic sampling calorimeters, with
excellent performance in terms of energy and position resolution, cover the
pseudorapidity range $|\eta| < 3.2$. The hadronic calorimetry in the range
$|\eta| < 1.7$ is provided by a scintillator-tile calorimeter, which is separated
into a large barrel and two smaller extended barrel cylinders, one on either side
of the central barrel. In the end-caps ($|\eta| > 1.5$), LAr technology is also
used for the hadronic calorimeters, matching the outer $|\eta|$ limits of the
end-cap electromagnetic calorimeters. The LAr forward calorimeters provide precise
electromagnetic energy measurements, and extend the pseudorapidity coverage to
$|\eta| = 4.9$.
The calorimeter is surrounded by the muon spectrometer. The air-core toroid
system, with a long barrel and two inserted end-cap magnets, generates strong
bending power in a large volume within a light and open structure. Multiple-
scattering effects are thereby minimised, and excellent muon momentum resolu-
tion is achieved with three layers of high precision tracking chambers. The muon
instrumentation includes, as a key component, trigger chambers with timing res-
olution of the order of 1.5-4 ns.
The inner detector[13] (ID) needs a deeper description because of his role in
the hardware reconstruction of the particles trajectories by the FTK processor.
The ID consists of three independent but complementary sub-detectors (Fig.2.6).
Figure 2.6: Drawing showing the sensors and structural elements traversed by a charged
track of 10 GeV pT in the barrel inner detector ( = 0:3).
At inner radii, high-resolution pattern recognition capabilities are available using
discrete space-points from silicon pixel layers and stereo pairs of silicon microstrip
(SCT) layers. At larger radii, the transition radiation tracker (TRT) comprises
many layers of gaseous straw tube elements interleaved with transition radiation
material. With an average of 36 hits per track, it provides almost continuous
tracking to enhance the pattern recognition and improve the momentum resolution
over $|\eta| < 2.0$ and electron identification complementary to that of the calorimeter
over a wide range of energies.
ATLAS Trigger System
The trigger and data acquisition system plays an essential role in the identification
of interesting rare physical events at hadron colliders. The immense number ($10^7$)
of data channels that come out of the detector and the rate of $10^9$ events per second
make the choice of data to be stored a very challenging matter. The ATLAS
trigger system[13] has three distinct levels: L1, L2 and the event filter. Each trigger
level refines the decisions made at the previous level and, where necessary, applies
additional selection criteria. The data acquisition system receives and buffers
the event data from the detector-specific readout electronics, at the L1 trigger
accept rate, over 1600 point-to-point readout links. The first level uses a limited
amount of the total detector information to make a decision in less than 2.5 $\mu$s,
reducing the rate to about 75 kHz. The two higher levels access more detector
information for a final rate of up to 200 Hz with an event size of approximately 1.3
Mbyte. The L1 trigger searches for high transverse-momentum muons, electrons,
photons, jets, and $\tau$-leptons decaying into hadrons, as well as large missing and
total transverse energy. Its selection is based on information from a subset of
detectors. High transverse-momentum muons are identified using trigger chambers
in the barrel and end-cap regions of the spectrometer. Calorimeter selections
are based on reduced-granularity information from all the calorimeters. Results
from the L1 muon and calorimeter triggers are processed by the central trigger
processor, which implements a trigger ’menu’ made up of combinations of trigger
selections. Pre-scaling of trigger menu items is also available, allowing optimal
use of the bandwidth as luminosity and background conditions change. Events
passing the L1 trigger selection are transferred to the next stages of the detector-
specific electronics and subsequently to the data acquisition via point-to-point
links. In each event, the L1 trigger also defines one or more Regions-of-Interest
(RoI’s), i.e. those regions within the detector where its selection process has
identified interesting features. The RoI data include information on the type of
feature identified and the criteria passed, e.g. a threshold. This information is
subsequently used by the high-level trigger.
Figure 2.7: ATLAS trigger levels. Input and output rates are shown.
The L2 selection is seeded by the RoI information provided by the L1 trigger
over a dedicated data path. L2 selections use, at full granularity and precision, all
the available detector data within the RoI’s (approximately 2% of the total event
data). The L2 menus are designed to reduce the trigger rate to approximately 3.5
kHz, with an event processing time of about 40 ms, averaged over all events. The
final stage of the event selection is carried out by the event filter, which reduces the
event rate to roughly 200 Hz. Its selections are implemented using offline analysis
procedures within an average event processing time of the order of four seconds.
It is clear that, in order to perform the software selection within the required time,
thousands of CPUs are needed.
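As a rough cross-check of this statement, the number of processors that must work concurrently can be estimated as the input rate multiplied by the average processing time per event. The sketch below uses only the rates and latencies quoted above; the function name and the assumption that each CPU handles one event at a time are illustrative simplifications, not a description of the actual ATLAS farm.

```python
def concurrent_cpus(input_rate_hz: float, mean_time_s: float) -> float:
    """Average number of events in flight = input rate x mean processing time.

    Assumes (for illustration only) that each CPU handles one event at a time.
    """
    return input_rate_hz * mean_time_s


# Numbers quoted in the text above.
L1_OUTPUT_RATE_HZ = 75e3   # L1 accept rate, i.e. the L2 input rate
L2_OUTPUT_RATE_HZ = 3.5e3  # L2 accept rate, i.e. the event-filter input rate
L2_TIME_S = 40e-3          # average L2 processing time per event
EF_TIME_S = 4.0            # average event-filter processing time per event

l2_cpus = concurrent_cpus(L1_OUTPUT_RATE_HZ, L2_TIME_S)
ef_cpus = concurrent_cpus(L2_OUTPUT_RATE_HZ, EF_TIME_S)
print(f"L2: about {l2_cpus:.0f} CPUs, event filter: about {ef_cpus:.0f} CPUs")
```

With these inputs the estimate is of the order of thousands of CPUs for L2 and about ten thousand for the event filter, consistent with the statement above.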
Chapter 3
The Fast Tracker
As seen in the previous chapters, the study of rare decays in a hadronic environment
cannot be carried out without a trigger system that is able to extract the
interesting event information from the background hiding the signal. With the
high rate of interactions and the QCD background (with cross sections of the order of tens
of mb), the search for events with cross sections of the order of nb or fb is a
challenging matter. The challenge will be even harder at the Large Hadron Collider,
the next frontier for HEP. The design of this accelerator places it at
the top in both collision energy and intensity, with a CM energy of $\sqrt{s} = 14$ TeV,
a peak luminosity of $10^{34}$ cm$^{-2}$ s$^{-1}$ and a bunch crossing every 25 ns. Providing
high quality track reconstruction over the full ATLAS detector by the start of
processing in the level-2 computer farm can be an important element in facing the
search for rare events in such a demanding environment. The Fast Tracker processor
(FTK)[14] is a highly parallel processor dedicated to the efficient execution of a
fast track finding algorithm[15], based on the idea of a large bank of pre-computed
hit patterns. FTK is an evolution of the Silicon Vertex Tracker (SVT)[16] currently
operating in the CDF experiment. The SVT reconstructs tracks in real time
with sufficient precision to measure, for instance, b quark decay vertices. The pattern
recognition algorithm consists in associating fired detector channels, called
hits, into track candidates with low spatial resolution, called roads. Then, the
tracks are fitted and their parameters precisely determined. In this way, the fit
processing time benefits from the great reduction in combinatorics brought by the
association into roads. The location of FTK, with respect to the Data Acquisition
data flow, is sketched in Fig. 3.1. FTK has access to the whole amount of tracking
data at a very high rate (up to $10^5$ events/s) in order to perform data reduction
for trigger applications. Hits in roads with transverse momentum ($p_T$) above the
interesting threshold can be filtered from a huge number of other hits and written
to temporary memory banks accessible by the level-2 trigger logic, thereby
reducing its processing time.
Figure 3.1: The figure shows a sketch of the different logical parts of the FTK system and
their connections.
3.1 Online track reconstruction
As mentioned above, FTK is the evolution of SVT, but the system has been completely
re-designed in order to significantly improve its performance. The track reconstruction
algorithm[15] is actually the same in the two systems and it is divided
into two parts: the first part implements the so-called pattern recognition, i.e. it
finds low resolution tracks (roads) by associating the signals coming from the detector
(hits); the second part fits high resolution tracks within the roads by applying
a simplified linear algorithm to the full resolution hits.
The first step in finding the roads associated with the hits in the tracking detector is
to construct low resolution hits. This is necessary in order to keep the pattern
bank (the memory where patterns are stored) within an acceptable size. Every
detector layer is segmented into many bins. For each event, a number of particle
tracks traverse the detector. Each track crosses one bin per layer, generating hits.
Therefore, each event is associated with specific strings of hits and misses (coded
as 1s and 0s, respectively, on each detector layer). In order to reduce the spatial
resolution, the fired bins are then grouped into super-bins by executing a logic OR of
adjacent bins. All tracks of physical interest correspond to bit patterns that are
explicitly enumerated and stored in an appropriate data bank. From the data
bank the fired roads are extracted and associated with the high resolution hits, which
are then used to fit the track parameters. A conceptual sketch of this procedure
is shown in Fig. 3.2.
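The procedure just described can be illustrated with a minimal software sketch. It is only a toy model under stated assumptions: the super-bin width, the function names, the four-layer geometry and the pattern bank are all hypothetical, and the real system performs the comparison in dedicated hardware rather than in a loop.

```python
from typing import Dict, List, Sequence, Set, Tuple

BINS_PER_SUPERBIN = 4  # hypothetical super-bin width (adjacent bins ORed together)


def to_superbin(bin_index: int) -> int:
    """Return the super-bin containing a full-resolution bin."""
    return bin_index // BINS_PER_SUPERBIN


def find_roads(
    hits: Sequence[Set[int]],
    pattern_bank: Sequence[Tuple[int, ...]],
) -> Dict[int, List[List[int]]]:
    """Return {road id: full-resolution hits per layer} for every fired road.

    hits[layer] is the set of fired bins in that layer; each pattern holds
    one super-bin per layer. A road fires only if all its super-bins are hit.
    """
    # Logic OR of adjacent bins: a super-bin is fired if any of its bins is.
    fired = [{to_superbin(b) for b in layer_hits} for layer_hits in hits]
    roads: Dict[int, List[List[int]]] = {}
    for road_id, pattern in enumerate(pattern_bank):
        if all(sb in fired[layer] for layer, sb in enumerate(pattern)):
            # Attach the full-resolution hits that fall inside the road.
            roads[road_id] = [
                sorted(b for b in hits[layer] if to_superbin(b) == sb)
                for layer, sb in enumerate(pattern)
            ]
    return roads


# Toy event in a four-layer detector (all numbers are made up).
event_hits = [{5, 17}, {6}, {9, 30}, {11}]
bank = [(1, 1, 2, 2), (4, 4, 7, 7), (0, 1, 2, 3)]
roads = find_roads(event_hits, bank)
print(roads)
```

Only the first pattern fires in this toy event, and the hits returned with it are the only combinations that the subsequent fit has to examine.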
It must be pointed out that FTK does not actually take part in trigger decisions:
it just computes the track parameters of interesting events, rejecting most of the
background, and stores this information in a memory buffer that the level-2 CPUs can
access at a high rate. It has been designed to work with event information coming
from the level-1 trigger, with an input event rate of 50-100 kHz, and it must
perform pattern recognition and track fitting within 10-20 microseconds. Pattern
recognition is performed with negligible time delay by the so-called Associative
Memory (AM)[17], which is essentially a Content Addressable Memory (CAM)[18],
i.e., a device that compares in parallel an input set of hits with all stored patterns
and returns the address of the matching location. Track fitting is performed by
the Gigafitter, the dedicated hardware that I have contributed to developing, based
on a modern FPGA chip. The Gigafitter applies a fast linear fit, described in section
3.3, in order to calculate the $\chi^2$ and the fit parameters.
Figure 3.2: A schematic representation of the pattern recognition algorithm.
(a) Every layer is subdivided in bins.
(b) Some bins are fired by particle tracks.
(c) Bins are grouped in super-bins.
(d) A logic OR is performed with the bins that are part of the same super-bin. The low resolution hits are now available.
(e) A fired road has been found with pattern recognition.
(f) High resolution hits are extracted within the road.
3.2 General structure of FTK
FTK consists of three main cooperating parts: the Data Organizer (DO), the
pipelined Associative Memory (AM) and the Gigafitter. While the DO and the
Gigafitter are implemented with programmable chips (FPGAs), which provide the
flexibility needed in the development and debugging of an R&D project, the AM chips are
standard cell ASICs. In this section I will describe the structure and functions of
the DO and the AM.
3.2.1 Data Organizer
The Data Organizer[19] interfaces the FTK with the DAQ system. The DO (Fig. 3.3)
receives all full resolution hits found in the tracking detector for events selected by
the level-1 trigger. It transforms the full resolution hits, as provided by the DAQ, into the
format and precision required by the AM. The full resolution hits are then buffered
inside the Data Organizer in a structured internal database, which allows a
very fast retrieval of the hits belonging to a given road. The database consists of 16
buffers called Event Storage Units (ESU). Each ESU can store a high resolution
event waiting for the roads from the AM; it is therefore possible to handle up to
16 events simultaneously. The DO is part of a data driven pipeline based on a simple
pipeline transfer protocol, driven by an asynchronous Data Strobe (DS) signal.
The maximum DS frequency is 40 MHz. Start event and End event words are
added to the data flow for synchronization in the AM boards and in downstream parts.
When roads are received back from the AM, all the detector hits contained in the roads
are immediately fetched from the ESUs and, with the road information, sent to the
DO. Every DO board can serve up to 2 layers of the Inner Detector. The complete
road with the associated high resolution hits is then sent to the Gigafitter.
The Data Organizer is an evolution of the Hit Buffer [20] which performs similar
functions inside the SVT processor. With respect to the Hit Buffer, the Data
Organizer is able to store and manage up to 16 events. While the Hit Buffer
processes one event at a time, the Data Organizer is able to overlap the processing
of two events: an event is received and stored in the internal database, while roads
from a previously written event are being processed.
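The role of the internal database can be made concrete with a small software model. This is a sketch under my own assumptions, not the firmware design: the class and method names, the super-bin width and the indexing of the hits by (layer, super-bin) are hypothetical; only the limit of 16 Event Storage Units is taken from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

N_ESU = 16  # number of Event Storage Units quoted in the text


class DataOrganizerSketch:
    """Toy model: buffers full-resolution hits and retrieves them by road."""

    def __init__(self, bins_per_superbin: int = 4) -> None:
        self.bins_per_superbin = bins_per_superbin  # hypothetical value
        # event id -> {(layer, super-bin): [full-resolution bins]}
        self.esu: Dict[int, Dict[Tuple[int, int], List[int]]] = {}

    def store_event(self, event_id: int, hits: Iterable[Tuple[int, int]]):
        """Store (layer, bin) hits; return the coarse hits to send to the AM."""
        if len(self.esu) >= N_ESU:
            raise RuntimeError("all Event Storage Units are busy")
        table: Dict[Tuple[int, int], List[int]] = defaultdict(list)
        for layer, bin_index in hits:
            table[(layer, bin_index // self.bins_per_superbin)].append(bin_index)
        self.esu[event_id] = table
        return sorted(table)

    def fetch_road(self, event_id: int, road: Tuple[int, ...]) -> List[List[int]]:
        """Return the stored full-resolution hits of each layer inside a road."""
        table = self.esu[event_id]
        return [list(table.get((layer, sb), [])) for layer, sb in enumerate(road)]

    def release(self, event_id: int) -> None:
        """Free the storage unit once all roads of the event are processed."""
        del self.esu[event_id]


do = DataOrganizerSketch()
coarse = do.store_event(7, [(0, 5), (1, 6), (2, 9), (2, 10), (3, 11)])
road_hits = do.fetch_road(7, (1, 1, 2, 2))
do.release(7)
print(coarse, road_hits)
```

Because several events can be stored at once, a new event can be written while the roads of an earlier one are still being served, which is the overlap described above.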
Figure 3.3: Data Organizer
3.2.2 Associative Memory
The Associative Memory[21] is a dedicated hardware system that must be able to store
all trajectories of interest and extract the ones compatible with a given event. A
trajectory is compatible with an event if all (or most) detector channels crossed
by that trajectory have fired in that event. Such a task is very conveniently
parallelized in a modular architecture in which each module contains a pattern.
The pattern includes both the memory required for storing a single trajectory and
the logic needed to compare the coordinates of all fired detector channels with
those associated to the stored trajectory. Each module must receive, as inputs,
the complete configuration of fired detector channels of each event.
The AM boards are assembled into a pipeline: data (all hits and found roads)
exiting the first board are fed into the second and so on until the end of the
pipeline. In such a way, the AM is fully scalable: boards can be simply added to
increase the pattern bank size, with the only drawback of increased data latency.
On the other hand, if all boards were connected to the same input buses, the
maximum number of boards would be limited by the bus fanout. Different AM
boards can work on different events.
The AM board has a modular structure, consisting of 4 smaller boards, the
Local Associative Memory Banks (LAMB). Each LAMB contains 32 AM chips, 16
per face. A start event and an end event word separate hits and roads belonging
to different events. Each board input is provided with deep FIFOs. When an AM
board starts to process an event, the hits are popped from the FIFO. Popped hits
are simultaneously sent to the four LAMBs and to the output registers that drive
the connections to the downstream board.
(a) There are 4 LAMBs per board with 32 AM
chips each.
(b) Data are sent in parallel to the four LAMBs
and then to the AM chips through INDI chips.
Figure 3.4: The AM board and data flow within it
The parallel structure of the AM board reflects the internal structure of the AM chip.
The chip is composed of identical modules called pattern modules, each
one containing a candidate road and the logic to make the comparison. The hit
buses are fed in parallel to all the pattern modules, which return their address if
there is a correspondence. The structure of the pattern module is itself highly parallel and
scalable. It contains several words[17], each word storing the address of one hit in
the corresponding layer. All words continuously compare their content to what is
on the layer bus and, if a match is found, the corresponding flip-flop (FF) is set.
This strategy is illustrated in Fig. 3.5.
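The behaviour of a pattern module can be mimicked in software as follows. This is a minimal sketch, assuming a four-layer detector and made-up pattern contents; the class and function names are hypothetical, and the inner loop over modules stands in for a comparison that the chip performs on all modules at the same time.

```python
from typing import Iterable, List, Sequence, Tuple


class PatternModule:
    """One stored road: a word per layer plus a match flip-flop per layer."""

    def __init__(self, address: int, words: Sequence[int]) -> None:
        self.address = address
        self.words = tuple(words)
        self.ff = [False] * len(self.words)

    def reset(self) -> None:
        """Clear all flip-flops at the start of an event."""
        self.ff = [False] * len(self.words)

    def see_hit(self, layer: int, superbin: int) -> None:
        """Compare the hit on the layer bus with the stored word."""
        if self.words[layer] == superbin:
            self.ff[layer] = True

    def matched(self) -> bool:
        """True when every layer has seen a matching hit."""
        return all(self.ff)


def process_event(
    modules: List[PatternModule], hits: Iterable[Tuple[int, int]]
) -> List[int]:
    """Broadcast every (layer, super-bin) hit and return matched addresses."""
    for module in modules:
        module.reset()
    for layer, superbin in hits:
        for module in modules:  # done in parallel in the hardware
            module.see_hit(layer, superbin)
    return [module.address for module in modules if module.matched()]


# Made-up bank of four patterns and one event with a hit in each layer.
modules = [
    PatternModule(0, (3, 7, 1, 4)),
    PatternModule(1, (3, 2, 5, 4)),
    PatternModule(2, (9, 7, 0, 0)),
    PatternModule(3, (8, 8, 8, 4)),
]
matched = process_event(modules, [(0, 3), (1, 7), (2, 1), (3, 4)])
print(matched)
```

In this toy event only pattern 0 has all its flip-flops set, while patterns 2 and 3 see a single matching layer and produce no output.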
As soon as hits are downloaded into an AM board, locally matched roads set
the request to be read out from the LAMB. Once the event is completely read out
from the hit FIFO, the AM board makes the last matched roads available within
a few clock cycles. Roads must cross the whole AM pipeline to go back to the Data
Organizer.
Figure 3.5: Example of pattern matching with four layers in a pattern module. In Patt
0 all layers are matched and the address is sent to the output bus. In Patt 2
and Patt 3 there is only one match and no output is produced.
As described above, a pattern could be considered matched when
there is a match in all layers. However, because of detector inefficiencies, this
requirement is too strict and leads to poor efficiency of the algorithm. Thus the
AM chip is provided with a module (the majority module) that makes the comparison
requiring a smaller number of matches (e.g., only 5 layers out of 6). However, this
approach brings a new problem: the so-called ghost tracks. To understand what
they are, it is convenient to consider an example. If, in a six-layer detector, a particle
track is detected in every layer, with the majority logic there will be a match
not only in the 6/6 road, but also in all the patterns that differ from the
6/6 one in only one layer, generating different roads for the same physical event. Also,
if the track is detected not by all the layers but only by 5 out of 6, there will
be a match for all the patterns that differ from each other only in the missing layer.
Of course, it is possible to find the ghost tracks after the fit of the track parameters,
and this is indeed what is done by the Ghost Buster in SVT. But, in order to save
computing time, a better approach is to eliminate multiple roads before the
track fitting; the part of FTK that does this job is the Road Warrior. The Road
Warrior groups the roads with the same physical information (i.e. the fired bins
in the layers) and pushes into the data stream only one road per group.
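The interplay between majority logic and ghost roads can be sketched in a few lines. The grouping rule used here is a simplifying assumption of mine, not the actual Road Warrior algorithm: a road is dropped when every layer it matched is matched with the same super-bin by a road that is already kept. All names and numbers are hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple

Signature = Tuple[Optional[int], ...]  # matched super-bin per layer, None if missed


def match_signature(pattern: Sequence[int], fired: Sequence[Set[int]]) -> Signature:
    """Super-bin of the pattern in each layer where it is fired, else None."""
    return tuple(sb if sb in fired[layer] else None for layer, sb in enumerate(pattern))


def n_matched(sig: Signature) -> int:
    return sum(s is not None for s in sig)


def majority_roads(bank, fired, min_layers: int) -> List[Tuple[int, Signature]]:
    """Roads with at least min_layers matched layers (majority logic)."""
    roads = []
    for road_id, pattern in enumerate(bank):
        sig = match_signature(pattern, fired)
        if n_matched(sig) >= min_layers:
            roads.append((road_id, sig))
    return roads


def covered(sig: Signature, other: Signature) -> bool:
    """True if all hits used by sig are also used by other."""
    return all(s is None or s == o for s, o in zip(sig, other))


def road_warrior(roads: List[Tuple[int, Signature]]) -> List[int]:
    """Keep one road per group, preferring roads with more matched layers."""
    kept: List[Tuple[int, Signature]] = []
    for road_id, sig in sorted(roads, key=lambda r: (-n_matched(r[1]), r[0])):
        if not any(covered(sig, kept_sig) for _, kept_sig in kept):
            kept.append((road_id, sig))
    return [road_id for road_id, _ in kept]


# Six-layer toy detector; one track fires super-bins 1..6 in layers 0..5.
bank = [(1, 2, 3, 4, 5, 6), (1, 2, 3, 4, 5, 9), (7, 2, 3, 4, 5, 6), (1, 2, 8, 8, 5, 6)]
fired_all = [{1}, {2}, {3}, {4}, {5}, {6}]
candidates = majority_roads(bank, fired_all, min_layers=5)
print([r for r, _ in candidates], road_warrior(candidates))

# Same track with the last layer inefficient: two 5/6 roads share the same hits.
fired_5 = [{1}, {2}, {3}, {4}, {5}, set()]
candidates_5 = majority_roads(bank, fired_5, min_layers=5)
print([r for r, _ in candidates_5], road_warrior(candidates_5))
```

In both toy events several roads fire for one track, and a single road survives the duplicate removal, so only one set of hit combinations reaches the track fitter.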
3.3 Track fitting with linear constraints
The role of the track fitting step is to find, among all hit combinations in a road,
the physical tracks and to calculate their parameters with offline quality. To introduce
the problem, I first describe the parameters that have to be computed.
A charged particle traveling in a uniform magnetic field follows a helical trajectory
that can be completely described by five parameters. Considering a
cylindrical $(r, \phi, z)$ coordinate system, where $z$ is along the beam direction and $\phi = 0$
is in the horizontal plane, the parameters chosen to describe the helix are:
$c$: the signed half-curvature of the helix, defined as $c = q/2R$, where $R$ is the
radius of the helix and $q$ is the charge sign. The magnitude of this quantity
is directly connected to the transverse momentum: $p_T = 0.3\,eBR$.
$\phi_0$: the direction of the particle at the point of closest approach to the z-axis.
$d_0$: the impact parameter, i.e., the distance of the closest point of the helix to the
z-axis, defined as $d_0 = |q| \cdot (\sqrt{x_0^2 + y_0^2} - R)$.
$\cot\theta$: where $\theta$ is the polar angle of the particle at the point of its closest approach
to the z-axis. This is directly related to the longitudinal component of the
momentum: $p_z = p_T \cot\theta$.
$z_0$: the z coordinate of the point of closest approach to the z-axis.
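As a worked illustration of how these parameters translate into momentum components, the sketch below converts a half-curvature and a $\cot\theta$ into $p_T$ and $p_z$. It assumes a unit-charge particle and the usual practical units ($p_T$ in GeV/c, $B$ in tesla, $R$ in metres), for which the relation above reads $p_T = 0.3\,BR$; the function name and the input values are hypothetical.

```python
from typing import Tuple


def helix_momentum(c: float, cot_theta: float, b_field_tesla: float) -> Tuple[int, float, float]:
    """Return (charge sign, pT, pz) from helix parameters.

    c is the signed half-curvature q/(2R) in 1/m; momenta are in GeV/c.
    Assumes a unit-charge particle in a uniform field of b_field_tesla.
    """
    if c == 0:
        raise ValueError("zero curvature: straight track, pT is unbounded")
    radius_m = 1.0 / (2.0 * abs(c))        # R = 1 / (2|c|)
    pt = 0.3 * b_field_tesla * radius_m    # pT = 0.3 B R
    pz = pt * cot_theta                    # pz = pT cot(theta)
    charge = 1 if c > 0 else -1
    return charge, pt, pz


# Made-up example: |c| = 0.1 1/m in a 2 T field, cot(theta) = 0.5.
charge, pt, pz = helix_momentum(-0.1, 0.5, 2.0)
print(charge, pt, pz)
```

For these inputs the radius is 5 m, giving a transverse momentum of about 3 GeV/c and a longitudinal momentum of about 1.5 GeV/c for a negatively charged particle.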
The reconstruction of the charged-particle trajectory consists in determining the
above parameters through a helix fit. Fitting a helix can be a tough problem, but
pattern recognition makes it easier, allowing it to be solved within the time imposed
by the Level-2 trigger. The algorithm used is based on Principal Component Analysis
and on the linearization of the problem[22] (linearity is guaranteed by the small road
size and allows the processing time to be greatly reduced). To explain the method I
will use, as an example, the case of the SVT processor in the CDF detector, where
only three parameters are fitted $(c, d, \phi)$ to reconstruct the projection of the helix
on the $r$-$\phi$ plane. To transfer it to the FTK environment, only the number of
coordinates and parameters has to be changed.
In SVT, tracks are reconstructed from six coordinates: the positions of the hits
on four of the five SVX II layers and the $\phi$ and $c$ measured by XFT on the COT
data. Every track can be thought of as a point in $\mathbb{R}^6$, represented as $\vec{x} = (x_1, \ldots, x_6)$.
A track is described by 3 parameters and thus has 3 degrees of freedom. Hence
it is possible to write three constraint functions $\tilde{f}_i(\vec{x}) = 0$, $i = 1, 2, 3$, that reduce
the degrees of freedom from the initial six to three. If these functions are known,
substituting a group of six coordinates into them makes it possible to determine whether
it is a track.
The three constraint equations, representing a 3-dimensional surface embedded
in $\mathbb{R}^6$, can be very complex, but they can be locally linearized, i.e. the surface can
be locally replaced by its 3-dimensional tangent plane:
\[ \tilde{f}_i(\vec{x}) \simeq \vec{v}_i \cdot (\vec{x} - \vec{x}_0) = \sum_{j=1}^{6} v_{ij} x_j + q_i, \qquad i = 1, 2, 3 \tag{3.1} \]
This approximation is very good within a road, where the coordinates differ only by small
displacements. The constants $v_{ij}$ and $q_i$ depend only on the detector geometry
and can be calculated offline with simulations of the detector or, with a more
reliable method, by using real data and principal component analysis. In
a real detector, uncertainties make the coordinates fluctuate statistically.
The $x_i$ belonging to the same track are correlated through a covariance matrix
$\sigma_{ij} = \langle x_i x_j \rangle - \langle x_i \rangle \langle x_j \rangle$.
Figure 3.6: Within a road, the hit coordinates in each layer are separated by small
displacements. The track parameters and $\chi^2$ can then be approximated with
functions linear in the displacement coordinates.
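Fluctuations are then induced in $f_1(\vec{x}), \ldots, f_3(\vec{x})$, which are correlated
through the correlation matrix:
$$F_{ij} \simeq \sum_{k,l=1}^{6} \frac{\partial f_i}{\partial x_k} \frac{\partial f_j}{\partial x_l} \sigma_{kl} \qquad (3.2)$$
calculated at first order, where the derivatives are evaluated at $\langle \vec{x} \rangle = \vec{x}_0$.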
It is possible then to compute the quantity:
$$\chi^2 = \sum_{i,j=1}^{3} f_i \, (F^{-1})_{ij} \, f_j \qquad (3.3)$$
that is distributed as a $\chi^2$ with three degrees of freedom and can be used to
determine if a combination $\vec{x}$ of coordinates is compatible with a real track. Equation
3.3 defines a region of probability, next to the 3-dimensional surface defined by
the constraint functions, where the probability for a set of coordinates to be a
real track becomes greater as the representative point approaches the surface. So,
we can apply a cut on the $\chi^2$ value to accept an $\vec{x}$ as a track (Fig. 3.7). The
calculation can be simplified with a diagonalization of the correlation matrix $F$,
by executing a rotation of the coordinate system in $R^6$. In this operation we will
find that three eigenvalues $\lambda_i$ are negligible with respect to the others. The larger
$\lambda_i$ have eigenvectors in the directions lying on the constraint surface, while the
small $\lambda_i$ have eigenvectors perpendicular to the same surface.
In the new coordinate system, redefining the $f_i$, it is possible to write:
$$\chi^2 = \chi_1^2 + \chi_2^2 + \chi_3^2 \qquad (3.4)$$
Figure 3.7: The figure shows a representation in a 3D space of the constraint surface. The
surface is approximated in a small area with the tangent plane. The rotated
coordinate system is also sketched. The points representing $\vec{x}$ coordinates are
distributed in the probability region next to the constraint. The $\chi^2$ is the
distance of the point from the surface.
which is easier to evaluate and represents the distance of $\vec{x}$ from the constraint
surface, and where the $\chi_i$ are the redefined constraints, which can be linearized
as in equation 3.1. $\vec{x}$ depends on the track parameters that are now to be
found ($x_i = x_i(d, \phi, c)$). The dependence of the parameters on the coordinates $x_i$ can
also be linearly approximated as:
$$c = \vec{v}_c \cdot (\vec{x} - \vec{x}_0), \qquad d = \vec{v}_d \cdot (\vec{x} - \vec{x}_0), \qquad \phi = \vec{v}_\phi \cdot (\vec{x} - \vec{x}_0) \qquad (3.5)$$
The constants $\vec{v}_c$, $\vec{v}_d$, $\vec{v}_\phi$ can be evaluated, like the $\vec{v}_i$ and $c_i$, from a Monte Carlo
simulation or from the analysis of real track data during the training of the system.
Further details concerning the application of principal component analysis to track
fitting can be found in [22][23]. To summarize, the whole procedure of linear track
fitting can be subdivided into three steps:
• Calculate the $\chi^2$ of the set of hit coordinates with:
$$\chi^2 = \chi_1^2 + \chi_2^2 + \chi_3^2, \qquad \chi_i = \sum_{j=1}^{6} v_{ij} x_j + c_i, \qquad i = 1, 2, 3$$
• Select only the tracks with $\chi^2 < \chi^2_{th}$, where $\chi^2_{th}$ is a
given threshold.
• Calculate the track parameters with $p = \vec{v}_p \cdot \vec{x} + p_0$, where $p$ is a generic
parameter.
The major advantage of this algorithm is that it turns a difficult fit procedure into
a few sums and scalar products that can be carried out in a very short time.
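As an illustration, the three steps above can be sketched in a few lines of Python. This is only a minimal sketch of the arithmetic, not the SVT or FTK firmware: the function name `linear_fit` and all the numerical constants are hypothetical and were chosen only to make the result easy to check by hand.

```python
def linear_fit(x, v_chi, c_chi, v_par, p0, chi2_threshold):
    """Return (chi2, parameters) if the hit combination passes the cut, else None."""
    # Step 1: chi_i = sum_j v_ij * x_j + c_i, then chi2 = sum_i chi_i^2
    chi = [sum(v * xj for v, xj in zip(row, x)) + c for row, c in zip(v_chi, c_chi)]
    chi2 = sum(value * value for value in chi)
    # Step 2: keep only combinations below the chi2 threshold
    if chi2 >= chi2_threshold:
        return None
    # Step 3: p = v_p . x + p0 for every track parameter
    params = [sum(v * xj for v, xj in zip(row, x)) + p for row, p in zip(v_par, p0)]
    return chi2, params


# Hypothetical constants (assumptions for this example, not real SVT constants).
V_CHI = [[1, -1, 0, 0, 0, 0], [0, 0, 1, -1, 0, 0], [0, 0, 0, 0, 1, -1]]
C_CHI = [1, 1, 1]
V_PAR = [[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]
P0 = [0.5, 0.0, -1.0]

# The first combination satisfies all three constraints, the second does not.
accepted = linear_fit([1, 2, 3, 4, 5, 6], V_CHI, C_CHI, V_PAR, P0, chi2_threshold=1.0)
rejected = linear_fit([3, 2, 3, 4, 5, 6], V_CHI, C_CHI, V_PAR, P0, chi2_threshold=1.0)
```

The whole fit reduces to six scalar products of length six (three for the constraints, three for the parameters), which is what makes the algorithm suitable for a hardware implementation.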
3.4 The FTK predecessor: SVT
The Silicon Vertex Trigger (SVT), one of the major components of the CDF trigger,
works with the same algorithm described for FTK, but it is used to calculate
the parameters of the 2D projection of the trajectory on the $r$-$\phi$ plane and
not all the 3D parameters as in FTK. SVT consists of 12 identical systems that
work in parallel, each processing data from an SVXII wedge (30 degrees in $\phi$). The
system calculates the track parameters using the information from the five SVXII
layers and the $\phi$ and $c$ parameters reconstructed by XFT in the
level-1 trigger (XFT uses the central drift chamber (COT) information). Using
the expressions introduced in the previous section, the SVT
processor has to reconstruct 3 parameters ($\phi$, $d$, $c$) in a 7-dimensional space (5 SVX
II and 2 XFT coordinates). The roads are here composed of 6 super-bins: five
are the super-bins defined on SVXII with a resolution of 250 µm and the sixth
is an azimuthal angle with 5° resolution from XFT (the second XFT coordinate is
unused by the AM). A 4/5 majority logic is used on the SVXII coordinates, but
the COT information must exist for the whole pattern to be considered valid.
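The acceptance rule just described can be written as a one-line check. The following Python fragment is a hedged sketch of that rule only: the function name `road_is_valid` and its arguments are hypothetical, and the real AM implements the majority logic in hardware.

```python
def road_is_valid(svx_layers_hit, xft_present, min_svx_layers=4):
    """4/5 majority on the SVXII layers, with the XFT (COT) information mandatory.

    svx_layers_hit: five booleans, one per SVXII layer (True if the layer fired).
    xft_present:    True if the XFT super-bin of the pattern is matched.
    """
    fired = sum(1 for hit in svx_layers_hit if hit)
    return xft_present and fired >= min_svx_layers
```

For example, a pattern with four SVXII layers fired and the XFT match is accepted, while one with all five SVXII layers fired but no XFT match is rejected.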
The data flow inside SVT, sketched in Fig. 3.8, can be described as follows:
• Input data from SVXII are sent to the Hit Finder (HF), which converts the
information into bit words for the pattern recognition;
• these words are sent to the Merger board, which merges this information with
the information from XFT, defining a bus to be sent to the AM;
• the AM system includes the AMB (Associative Memory Board), where the
pattern recognition takes place, and the AMS (Associative Memory
Sequencer), which manages the communication between the AMBs and the other
boards with the SVT communication protocol. The Road Warrior function,
described for FTK, is also implemented in the AMS;
• the hits bus from the HF is also sent to the Hit Buffer (HB), where the hits
are stored waiting for the roads to come from the AMS; when roads are ready, it
retrieves the corresponding hits and sends both road and hits to the Track
Fitter (TF++);
• the Track Fitter computes the parameters, with the full spatial resolution,
of all the possible tracks corresponding to the road. It finds all the possible
combinations of hits and applies a $\chi^2$ cut. All the tracks that pass the cut
are sent, with the parameters and the road information, to a Merger board;
• in the Merger the buses coming from 12 data streams, each one corresponding
to one wedge, are merged into a bus to be read by the Ghost Buster. The
GB, as described for FTK, eliminates the ghost tracks and sends all the
information to the Level-2 trigger system.
3.4.1 Track Fitter++
The running Track Fitter (TF++), described in [24], is one of the last upgraded
parts in SVT. It is implemented in Pulsar boards, every board fitting tracks from a
Figure 3.8: Overview of the SVT architecture
SVXII $\phi$ sector or wedge. The TF was upgraded because the old implementation
was not sufficient for the upgraded system; in particular it was too slow to process
large numbers of tracks and could not handle more than 128k patterns. The TF++
is now compatible with 512k patterns and has gained a factor of 2-3 in speed. The
system receives road packets from the HB, computes track parameters for multiple
combinations of hits in the packet, and outputs this information to a merger and
then to the GB. The TF++ uses the three FPGA chips mounted on the Pulsar
board (see section 4.2.1 for a short description of the board). One FPGA is in
charge of all I/O. Upon receiving road-hit packets, it forms all combinations of
hits in a road and creates input words for the other two FPGAs, which do all the
fitting. Each of these two chips is connected to two mezzanine cards mounted
on the board, where the constants needed for the fitting algorithm are stored in
RAMs.
TF++ limits To introduce the improvements that the Gigafitter project, presented
in this thesis, will bring to SVT, I will now show the limits of
the TF++. One of the difficulties encountered in implementing the TF in Pulsar
FPGAs is the width of the multiplications that can be carried out in these chips.
The hit information coming from the HB is 15 bits wide. To fit track parameters,
15-bit multipliers are therefore needed, but the Altera chips on Pulsar have only
8 × 8 bit multipliers. This makes the fitting algorithm a little more complex. To
reduce the width of the multiplications, the hit information is redefined as the
sum of the position $x_0$ of a superstrip (a low resolution bin) vertex (7 bits) and the
distance $d$ of the hit from the vertex within the superstrip (8 bits), i.e. $x = x_0 + d$. The
products $x_{0i} \cdot c_i$ of the superstrip coordinate and the coefficient of the linearized
fitting algorithm are then precalculated and stored in a RAM called SS_MAP.
This leaves only the products $d_i \cdot c_i$ to be calculated, which fit in the 8 × 8 multipliers.
This procedure needs a large amount of memory that is subtracted from the
coefficient memory, seriously limiting the size of the AM pattern bank and thus the SVT
efficiency. Another limitation of the current TF is the clock frequency of
its FPGAs, which cannot be greater than 250 MHz, while in the new Virtex FPGAs
the clock can run up to 550 MHz. The problem is not limited to a longer processing
time, but also affects the quality of the reconstructed tracks. Fitting the same track
many times, deleting one layer in each fitting process, would eliminate inaccurate
track reconstruction due to noisy hits, which are recurrent, in particular, in the
high luminosity run. With Level-2 time requirements this strategy is not possible,
in the current implementation. The introduction of the Gigafitter, as I will show in
the next chapter, will solve these difficulties with a new design based on the newest
FPGA technology.
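The splitting of a 15-bit hit into a superstrip part and a local part, described above for the TF++, can be illustrated with a short Python sketch. It rests on assumptions that are not stated in the text: the 7 superstrip bits are taken to be the most significant bits of the hit word, the table `ss_map` stands in for the SS_MAP RAM, and all function names are hypothetical.

```python
SS_BITS = 7   # superstrip index width (from the text)
D_BITS = 8    # distance-from-vertex width (from the text)


def build_ss_map(coefficient):
    """Precompute x0 * c for every superstrip vertex x0 (the role of SS_MAP)."""
    return [(ss << D_BITS) * coefficient for ss in range(1 << SS_BITS)]


def split_hit(x):
    """Split a 15-bit hit into (superstrip index, distance d from its vertex)."""
    if not 0 <= x < (1 << (SS_BITS + D_BITS)):
        raise ValueError("hit coordinate must fit in 15 bits")
    return x >> D_BITS, x & ((1 << D_BITS) - 1)


def product_with_lookup(x, coefficient, ss_map):
    """Compute x * c as a table lookup plus one narrow multiplication d * c."""
    ss, d = split_hit(x)
    return ss_map[ss] + d * coefficient
```

The result is identical to the full-width product $x \cdot c$, but the only multiplication left at run time involves the 8-bit quantity $d$. The price is one precomputed table per coefficient, which is the memory cost mentioned above.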
Chapter 4
Gigafitter
The Gigafitter (GF) is an R&D project approved in 2005 as part of the upgrade
of the SVT processor that has been carried out to meet the Run IIb requirements.
It was born from the idea of replacing the existing Track Fitter with the aim of
achieving a shorter SVT processing time, a better SVT efficiency, a more stable
performance as a function of the instantaneous luminosity and a better SVT
acceptance due to the capability to handle larger AM banks.
The basic idea of the GF is to implement the fitting algorithm core inside a
Xilinx Virtex-5 FPGA chip[25]. A single chip, with a clock running up to 550
MHz, contains memories for a total of several Mbytes and hundreds of 18 × 25
multipliers and adders inside fast DSPs. The fit of track coordinates in the SVT and
FTK processors is, thanks to the linear approximation presented in the previous
chapter, a matter of scalar products. The advantage of using DSP-like processors,
packed in large numbers inside the FPGA, is that many fits can be performed in parallel,
remarkably reducing the time necessary to fit a set of coordinates. Furthermore,
the high density of the packaging inside the Virtex-5 and the amount of memory
and logic available make it possible to do inside a single chip what is now done with several
FPGA chips by the TF++.
The computing power of the GF, thanks to the use of the most recent FPGA
technology, is largely beyond the old TF capability, making it a very good choice for
track fitting inside FTK, which is designed to operate in a much more demanding
environment such as the ATLAS experiment. From the point of view of FTK, which
is still under development, testing the Gigafitter with real data in SVT, a system
that has proved its capability, will provide a powerful way to analyze the processor
potential. For this reason, while designing the Gigafitter, the requirements
imposed by FTK were always taken into consideration. In particular the scalability
of the design, implied in the use of a parallel structure, allows a simple relocation
of the GF from SVT to FTK.
In my thesis work I have written the basic firmware of the Gigafitter, that is the
part executing the fit operation. It calculates the $\chi^2$ and the track parameters,
applies a selection on the tracks based on $\chi^2$ and other track attributes and
sends the reconstructed tracks to the subsequent board. I have also written the
firmware to read and write, from a CPU, the RAMs used to store the fitting
constants and other necessary registers. This work is a first approach to the
Gigafitter architecture that has the goal of designing a working fitting data stream
for one SVXII $\phi$ sector.
In this chapter I will show the logical structure of the Gigafitter in all
its parts, and I will describe the hardware and the functions that I have implemented
in the firmware, also presenting some of the simulations that I have performed in
order to verify the proper functioning of the system.
4.1 General description
The GF is based on the use of one Pulsar board[26], an existing motherboard
already used in the SVT upgrade, and four mezzanine boards mounted on it (Fig.
4.1). Three mezzanines, which are the real core of the system, perform the
fitting algorithm inside a Virtex-5 FPGA chip, while the fourth mezzanine has
a large RAM to perform non-linear corrections after the fitting procedure. The
system must receive 12 wedge data streams coming from the SVT Hit Buffer, each
fitting mezzanine receiving data from 4 wedges. This is the first important improvement
over the old TF, where one Pulsar board is used for each wedge and an additional
one to merge data (the job of 13 boards is squeezed into one board). All the
computational power is thus inside the mezzanine cards, while the Pulsar provides
compatibility with the whole system, communication via the VME protocol and
the merging of data.
The structure of the system core, involved in the fit implementation, is naturally
divided into five different parts: the Combiner, a large RAM, the Fitter, the
Comparator and the Formatter. Concurrency in operations mixes these functions
to some extent, but the segmentation makes it possible to point out the fundamental parts of the
Gigafitter. Fig. 4.2 shows a schematic view of the GF structure and the
interconnections between its parts. The role of these parts, all included in the Virtex-5
Figure 4.1: Gigafitter final structure
chip, is as follows:
• The Combiner receives hits and road information. Because a road
can have more than one hit per layer and every hit can be associated with a
track, the Combiner forms the candidate tracks by generating the combinations,
i.e., finding all the possible tracks that can be produced from the given
hits. For this operation it stores the hits inside shift registers, one for every
layer. On the basis of the hit information, as I will show later, it retrieves
the fit constants from a large RAM, associating the right constant with the
corresponding hit. When a road has at least one hit per SVXII layer (5/5),
it also generates all the combinations of hits with one missing layer each
(4/5). Then, it serially sends the hits and the constants, with hit quality
information, to the Fitter.
• The Fitter receives the hit and constants bus and, exploiting the large number
of DSPs inside the chip, simultaneously calculates the track parameters and
the constraint functions. The scalar products are performed with a serial
pipeline composed of an 18 × 25 multiplier and an adder/accumulator inside
the DSP processor. Once the results are ready, the constraint functions are
sent to the Comparator while the track parameters are stored in a FIFO
system waiting for the $\chi^2$ Comparator decision. The additional information
Figure 4.2: Internal structure of one Gigafitter fitting line
coming from the Combiner, regarding the hit combination layout (layers used
and quality of the hits), is simply passed on to the Comparator as it is.
• The Comparator calculates the $\chi^2$ from the three constraint functions and
checks whether it is under the desired threshold. If the track does not pass the $\chi^2$ cut
it is discarded with all its parameters; otherwise, the
next step depends on the type of track. If the track comes from a road with
only 4/5 hits, the track is accepted and the parameters and the $\chi^2$ are stored
into small FIFOs, while if it is one of the five 4/5 combinations coming from
a 5/5 road the parameters are stored and a quality function $g$ (for goodness)
of the track is calculated. The $g$ is a function of the $\chi^2$, the configuration
of the layers, and the quality of the hits. Then the track waits for the other
four 4/5 combinations coming from the same 5/5 hit combination. The $g$
functions of the tracks are then compared and only the best is accepted.
This selection is performed, as I will explain, to increase the resolution on
track parameters.
• The job of the Formatter is to read the parameters and the $\chi^2$ of the accepted
tracks and to merge all this information with the hits, the road number, and
some status data, pushing them into a FIFO using the SVT protocol.
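The Comparator decision for the five 4/5 combinations coming from the same 5/5 hit combination can be sketched as follows. This is a hedged illustration, not the firmware logic: the text does not give the form of $g$, so the scoring function is passed in by the caller, `example_goodness` is a purely hypothetical placeholder, and the convention that a lower $g$ is better is an assumption.

```python
def select_best(candidates, chi2_threshold, goodness):
    """Apply the chi2 cut, then keep the candidate with the best (lowest) g."""
    passing = [track for track in candidates if track["chi2"] < chi2_threshold]
    if not passing:
        return None  # every combination was discarded by the chi2 cut
    return min(passing, key=goodness)


def example_goodness(track):
    # Hypothetical g: the chi2 plus a fixed penalty when a low-quality hit is used.
    return track["chi2"] + (0.0 if track["all_hits_good"] else 10.0)


# Five 4/5 combinations from one 5/5 hit combination (invented numbers).
candidates = [
    {"missing_layer": 0, "chi2": 3.0, "all_hits_good": True},
    {"missing_layer": 1, "chi2": 2.5, "all_hits_good": False},
    {"missing_layer": 2, "chi2": 50.0, "all_hits_good": True},
    {"missing_layer": 3, "chi2": 7.0, "all_hits_good": True},
    {"missing_layer": 4, "chi2": 4.0, "all_hits_good": True},
]
best = select_best(candidates, chi2_threshold=20.0, goodness=example_goodness)
```

With these invented numbers the combination with the lowest $\chi^2$ is not the one selected, because it uses a low-quality hit; this is the kind of trade-off a quality function depending on more than the $\chi^2$ is meant to capture.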
Since the maximum SVT transfer rate is 40 MHz, while in the FPGA the clock
can run up to 550 MHz, an input FIFO and an output FIFO are implemented.
Some logic is needed to manage the SVT input and output operations. After the
Gigafitter has fitted the tracks and pushed the information into the output FIFO,
other logic generates the output DS, acting in accordance with the HOLD signal.
Another important part of the GF is the VME interface used to communicate with
a CPU via the VME protocol. This is partially implemented inside the Virtex
but mainly in the Altera FPGA chips mounted on the Pulsar board (see section
4.2.1).
4.2 Hardware description
4.2.1 Pulsar Board
PULSAR[26] (as "PULSer And Recorder") is a general purpose 9U VME interface
board for HEP applications. It has been designed for the CDF trigger upgrade
Figure 4.3: Pulsar top side
but its design is general enough that it can potentially be used in many other
applications. It is a motherboard provided with many different interfaces and
three Altera Apex20k400[27] FPGA chips. It can hold four mezzanine cards that
are connected to the two central DATA I/O chips. The mezzanine card approach
allows Pulsar to interface with any data path. The DATA I/O chips are connected to the
third FPGA (CONTROL), which has configurable I/O connections. All the chips on
the board can be accessed via the VME interface from a PC, using the connectors on the
backplane. To program the Altera chips I used the VHDL hardware description
language with the support of Leonardo Spectrum[28] for synthesis and Quartus
II[29] for place and route. This choice was driven by the fact that the VHDL
language and these design tools were used in other CDF applications based on
Pulsar boards, e.g. in the old TF.
4.2.2 Mezzanine Board
The mezzanine has been designed by INFN Padova, following the experience of
the mezzanine built for the L2 calorimeter trigger upgrade[30]. The interface with
the Pulsar is based on two 64-pin PMC IEEE 1386 connectors[31], mounted on
the back side of the board (Fig. 4.4a). On the front side there are four connectors
through which the Gigafitter will receive data from the AMB, each one corresponding
to a wedge. Between the connectors (52-pin KEL_8830E-052-170S) there are
24 receivers and one driver that translate the LVDS (Low Voltage
Differential Signaling) signals, standard for the SVT protocol, to the TTL standard
(a) Back view of the mezzanine
(b) Front view of the mezzanine
Figure 4.4: The mezzanine with the two connectors to the Pulsar on the back side (a), and the
four connectors for the cables coming from the previous board, the Virtex-5,
the two EEPROMs, the test and programming pins and the power supply on the
front side (b)
used inside the mezzanine and in the Pulsar. The number of receivers and drivers
follows the number of bits in input (24 bits per wedge) and output (one HOLD
signal per wedge). National DS90LV032ATM and DS90LV031ATM are used. The
signals are then sent directly to the Virtex-5, which is mounted on the board with
the necessary power supply and two EEPROMs. The two EEPROMs will contain
the data to program the FPGA at every startup. Next to the EEPROMs there is
a connector providing the programming interface while, on the opposite side of
the chip, there is a 20-pin connector that is used in firmware testing to give
direct input/output access to signals in the Virtex.
4.2.3 Virtex-5
The Virtex-5 family[32] provides the most advanced FPGA devices from
Xilinx. Built on a 65-nm copper process technology, these chips provide a very
high logic density. The family comprises different platforms, each platform
containing a different ratio of features to address the needs of a wide range of
applications. The device mounted on the mezzanine is the XC5VSX95T of the
SXT platform (optimized for DSP and memory-intensive applications), which,
when the mezzanine was designed, was the device containing the largest number
of the very useful DSP slices.
The basic constituents of the device are:
• DSP slices (DSP48E), which are DSP-like processors containing a 25 × 18 two's
complement multiplier, an adder/subtracter/accumulator and additional
logic to perform several operations. This is the feature that drove the choice
of this device and it is intensively exploited in the GF design;
• I/O blocks, which are the package interface;
• Configurable Logic Blocks (CLBs), which provide the basic logic functions,
shift registers and distributed RAMs;
• Block RAMs, which are composed of 36-Kbit true dual-port RAM blocks with
optional dual 18-Kbit mode, provided with multi-rate FIFO support logic;
• Clock Management Tile (CMT) blocks, composed of two DCM (Digital Clock
Manager) blocks and one PLL (Phase-Locked Loop) clock generator.
A summary of the device features is reported in Tab. 4.1.

Resource                                   XC5VSX95T
CLB array (rows × columns)                 160 × 46
Slices                                     14,720
Max distributed RAM (Kb)                   1,520
DSP48E slices                              640
Block RAM blocks, 18 Kb                    488
Block RAM blocks, 36 Kb                    244
Block RAM, max (Kb)                        8,784
CMTs                                       6
PCI Express endpoint blocks                1
Ethernet MAC blocks                        4
Maximum RocketIO GTP transceivers          16
Total I/O banks                            19
Max user I/O                               640

Table 4.1: Virtex-5 XC5VSX95T internal resources
Thanks to the internal blocks described and the density with which they are packed inside the chip,
in addition to the clock that can run up to 550 MHz (very fast compared with
the external 40 MHz clock used on the Pulsar boards in SVT), Virtex-5 can provide
all the computing power needed for the Gigafitter project. To program the
Virtex-5 I used the tools provided by the manufacturer (ISE 9.2i[33]) and the
Verilog hardware description language. Since there was no previous experience
in programming Virtex devices inside CDF, the choice of Verilog was based on
my previous knowledge of the language. The firmware has been simulated with
the post place and route simulation model generated inside ISE and run in Mentor
Graphics ModelSim[34].
4.2.4 Clock
All the GF parts are synchronized by the 40 MHz SLink_Clock generated by the
Pulsar board. All the logic implemented on the Pulsar operates at this rate. The
clock is also sent to the mezzanine cards, where it enters the Virtex through a
dedicated pin. A DCM-PLL system generates a new clock, locked to the SLink_Clock,
that is used inside the FPGA and runs at the maximum speed allowed by internal
delays. Simulations show that the effective clock frequency allowed is mainly
limited by I/O delays. This limit vanishes thanks to the I/O FIFOs, but a better
optimization is possible to approach the nominal value of 550 MHz. The DCM
generates a LOCKED signal when the internal clock is locked to the SLink_Clock,
which is the standard configuration while running. In case of SLink_Clock glitches
and at power-on, the DCM resets LOCKED to logic 0 and the Gigafitter cannot
operate. It therefore sends a HOLD signal to the HB.
4.3 Input FIFO
The SVT system employs a uniform protocol for data transfer. Each word is 25
bits wide. The communication between the receiving and the transmitting boards
is based on two signals: a strobe signal (DS) from the transmitting board and a
HOLD signal from the receiving board. The strobing time is defined by the strobe
positive edge; 1 to 0 transitions are meaningless. If the HOLD signal is set to logic
0 (HOLD is an active high signal), the transmitting board can send data. After
changing data on the bus, the transmitting board provides the time to latch them
with the DS rising edge (Fig. 4.5).
Figure 4.5: DS signal sent by the HB to the Gigafitter with the data bus.
The HB DS signal must therefore be managed properly, after the translation
from the LVDS to the TTL standard performed by the mezzanine receivers. Since the
DS is not a free running clock, it could not be used as write clock for a dual clock
FIFO (a simulation showed that Virtex FIFOs need a free running clock to operate
properly) and also its utilization as clock enable was prohibited by the DS slowness,
compared to the FPGA internal clock frequency. An alternative solution to the
dual clock FIFO is then implemented. A single clock FIFO is used but data are
first registered using the DS as clock, while the same DS generates the FIFO write
enable (WE) with the simple logic sketched in Fig. 4.6. When the GF can’t
Figure 4.6: The logic used to receive DS from Hit Buffer. WCLK is the Virtex internal
clock
receive data it must set the HOLD signal, and this can happen in two situations:
either its input FIFO is getting full or the internal clock is not ready at power-on.
Therefore the HOLD signal is generated from the ALMOST_FULL flag of the
FIFO (the ALMOST_FULL threshold is currently set at half the FIFO depth, but
it could be optimized to reduce the size of the FIFO) and the LOCKED signal
that is set by the DCM when the internal clock phase is locked to the Pulsar clock
(SLink_Clock).
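The two conditions above combine into a single boolean expression. The following Python fragment is a hedged behavioural sketch of that expression only: the function name, the argument names and the way the fill level is represented are assumptions, and the actual logic is implemented in the Verilog firmware.

```python
def hold_signal(fifo_fill, fifo_depth, dcm_locked, almost_full_fraction=0.5):
    """Return True when HOLD must be asserted (HOLD is active high).

    fifo_fill:            number of words currently stored in the input FIFO
    fifo_depth:           total FIFO capacity in words
    dcm_locked:           state of the DCM LOCKED signal
    almost_full_fraction: ALMOST_FULL threshold as a fraction of the depth
    """
    almost_full = fifo_fill >= fifo_depth * almost_full_fraction
    return almost_full or not dcm_locked
```

Asserting HOLD well before the FIFO is actually full leaves room for the words that the transmitting board may still send before it reacts to the signal.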
4.3.1 Input protocol
The input data to the Gigafitter, as in the old TF [35], are provided by the SVT
Hit Buffer board. The data consist of the silicon hit information measured by the
SVT Hit Finder, the track parameters measured by the XFT, and the Associative
Memory road identification number (road ID) from the AM pattern recognition.
The contents of the input data, listed in Tab. 4.2, are organized into packets. As
shown in Tab. 4.2, the Gigafitter receives packets made of multiple SVT words
per road. The minimum number of input words per road (one packet) is eight:
five SVXII hit words, two XFT words, and one AM road word. One of the two XFT
words is used for AM pattern recognition. When there is more than one hit in
a layer, the GF receives more than eight words per road (x01 in Tab. 4.2). The
GF receives several packets per event when the AM system finds multiple track
candidates. The last word of a road is tagged with the EP (End of Packet) flag
while the EE tags the end of an event.
Table 4.2: The input data to the Gigafitter from the HB (word bits 22 to 0)
EE  EP  data
        layer | z | x0
        layer | z | x1
        layer | z | x01
        :
        layer (XFT) | XFT information for the AM
        sign | c from the XFT | φ from the XFT
    1   AM road ID
        :
1   1   L2B error flags | PA | Event Tag
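The packet structure of Tab. 4.2 can be illustrated with a short Python sketch that groups a stream of words into roads. The bit positions of EE and EP (22 and 21) and the 21-bit data field are assumptions read off the table header; the function names are hypothetical.

```python
EE_BIT = 22                 # assumed position of the End of Event flag
EP_BIT = 21                 # assumed position of the End of Packet flag
DATA_MASK = (1 << 21) - 1   # assumed 21-bit data field


def decode_word(word):
    """Split one input word into its EE flag, EP flag and data field."""
    return {
        "ee": bool((word >> EE_BIT) & 1),
        "ep": bool((word >> EP_BIT) & 1),
        "data": word & DATA_MASK,
    }


def split_event(words):
    """Group the words of one event into road packets.

    Returns the list of roads (each a list of data fields, the last one
    being the AM road ID word) and the data field of the EE word.
    """
    roads, current = [], []
    for w in words:
        d = decode_word(w)
        if d["ee"]:                 # the EE word closes the event
            return roads, d["data"]
        current.append(d["data"])
        if d["ep"]:                 # EP tags the last word of a road
            roads.append(current)
            current = []
    raise ValueError("stream ended without an EE word")
```

A road with one hit per layer gives an eight-word packet; any additional hit in a layer simply lengthens the packet before the EP word.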
4.4 Combiner
The information coming from the AM boards includes the roads found by the
pattern recognition and the corresponding hits. The Combiner is the part in
charge of generating the hit combinations. It interfaces with the input FIFO
through a control module that also recognizes the protocol words. Every road
can include zero, one or more hits per layer, but it must always have the
XFT hits: to obtain a good resolution in p_T, the track reconstruction needs
the information from the COT. The module reads the layer information and stores
all the hits found on the same layer inside a shift register. It checks for the
presence of the XFT words and, if it receives the EP without having found them,
it discards the data. When it receives the EP signal, it loops over all the layers
in order to form the combinations.
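The loop over the layers can be pictured with the following Python sketch. It is a behavioural model, not the firmware: the data layout (one list of hits per SVXII layer) and the use of `None` as a placeholder for a layer without hits are assumptions made for the example.

```python
from itertools import product


def combine(svx_hits, xft_words):
    """Form all hit combinations of one road.

    svx_hits  : list with one list of hits per SVXII layer
    xft_words : XFT words received before the EP
    """
    if not xft_words:
        return []  # road discarded: EP arrived without the XFT words
    # A layer without hits contributes a single placeholder
    per_layer = [hits if hits else [None] for hits in svx_hits]
    return [tuple(c) for c in product(*per_layer)]


# Two hits on the second layer, no hit on the fourth one
road = [[1], [2, 3], [4], [], [5]]
for combination in combine(road, ["xft_am", "xft_fit"]):
    print(combination)
```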
Coefficients and Constants When combinations are ready, the Combiner must
retrieve the right constants and parameters for the fitting algorithm. As shown in
the previous chapter, a set of coefficients and constants is needed for each region
where the linear approximation is good enough. Because coefficients and constants
are stored and retrieved together, in this section I will refer to them simply as
constants. It has been demonstrated that in the φ coordinate one set of constants
is good enough for one SVXII sector (30°). Therefore, every fitting line has its own
set of constants. The choice of the proper set of constants must be made on the
basis of the z coordinate, the hit layout, and the Long Cluster (LC) information.
I will now explain what the latter is, and how and why it is used.
The SVXII detector is segmented into small bins to achieve a better spatial
resolution. After the digitization of the event information, more than one bin in a
layer could have fired, and these bins could be contiguous. In that case only one hit
is recorded and the Long Cluster flag is raised to account for the poor spatial
resolution of the measurement. A long cluster can arise for different reasons: two
particles cross the layer in adjacent bins and cannot be resolved; a particle crosses
the layer at a large angle of incidence, traversing two bins; induced noise. The LC
information is therefore very important, because it tells us about the error on the
coordinate measurement. To take the coordinate resolution into account in the fitting
algorithm, the LC must take part in the selection of the constants. Moreover, a
different set of constants is necessary for each combination of used layers. The used
layers are identified by the hit map, which is a 6-bit word. Each bit in the map
corresponds to a single logical layer, where c and φ from the XFT are considered
together.
A look-up table, called MAKE_ADDRESS, is addressed with 3 bits from the z of the
innermost hit in the combination, 5 bits for the LC map and 6 bits for the hit map.
The output of this RAM is an 8-bit address for the Constants RAM. Each location of
the Constants RAM contains all the information needed to fit the selected
combination, namely 7 coefficients and one constant for each of the 3 parameters and
3 constraints ((7 + 1) × 6), where each coefficient and constant is 14 bits wide. This
leads to a total of (7 + 1) × 6 × 14 = 672 bits.
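The addressing scheme and the word size can be checked with a few lines of Python. The order in which the three fields are concatenated into the 14-bit MAKE_ADDRESS index is an assumption of this sketch (the text only gives the field widths), and the function name is hypothetical.

```python
Z_BITS, LC_BITS, HIT_BITS = 3, 5, 6   # field widths quoted in the text


def make_address_index(z, lc_map, hit_map):
    """Build the 14-bit index of the MAKE_ADDRESS look-up table.

    Assumed field order (MSB to LSB): z, LC map, hit map.
    """
    for value, bits in ((z, Z_BITS), (lc_map, LC_BITS), (hit_map, HIT_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range")
    return (z << (LC_BITS + HIT_BITS)) | (lc_map << HIT_BITS) | hit_map


# Width of one Constants RAM word: (7 coefficients + 1 constant)
# x 6 scalar products x 14 bits
CONSTANTS_WORD_BITS = (7 + 1) * 6 * 14

print(make_address_index(z=2, lc_map=0b00100, hit_map=0b111011))
print(CONSTANTS_WORD_BITS)
```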
Once the constants have been retrieved from the RAM, hits and constants must
be passed to the Fitter. As I will show in the next section, the Fitter must be
fed with one hit and the corresponding coefficients per clock cycle in order to
calculate each term of the scalar products. The constants must also be sent in
parallel in order to optimize the timing inside the DSP slices. A serializer is in
charge of reading the constants and the combinations of hits and of filling a FIFO
for the Fitter.
4.5 Fitter
The Fitter is the core of the Gigafitter, the part where the scalar products of the
linearized algorithm are calculated. In this first approach to the GF structure, where
the main aim was to write a complete working fitting line for one wedge, I wrote this
component trying to make the most effective use of the DSP arrays in the chip. The
Fitter calculates six scalar products in parallel, corresponding to the 3 constraints
(χ_1, χ_2, χ_3) and the 3 track parameters. Every product ($p = \sum_i c_i h_i + c_0$)
is calculated in a serial pipeline where one hit is multiplied by the corresponding
coefficient and then added to the other products and to the constant c_0. This
procedure benefits from the structure of the DSP slice, which is optimized for
pipelined calculations, and allows the use of only one DSP slice (one multiplier and
one adder/accumulator) per scalar product (other calculation structures, such as
cascade or tree, require several multipliers and adders). The optimization of internal
resources is very important since we want to implement a highly parallel structure,
also with a view to the FTK environment, which is of course very resource demanding.
To approach the calculation in this way, hits and constants are transmitted to the
Fitter on a wide bus which, at every clock cycle, provides a hit (for example the hit
on the first layer) with a set of six coefficients, i.e. the coefficients needed to
calculate the partial products h_i · c_i. The constants c_0 are also provided in
parallel and added to the first partial product. The system is controlled by a simple
state machine and is designed never to be busy. To understand when a set of
coordinates is finished and the accumulator must be prepared for the next one, the
control state machine receives the EV (End Vector) bit and dynamically changes the
DSP configuration. Although it is always able to receive data, the Fitter interfaces
with the FIFO where the hit combinations are stored through the same control state
machine, which monitors the FIFO EMPTY signal and sends an RE (read enable) signal.
Since the sets of coordinates are composed of five SVX layers plus c and φ from the
XFT, we have to perform seven multiplications per parameter. Even if only
combinations with 4/5 layers are fitted, the Fitter always receives seven words
(h_i and c_i) from the Combiner; one c_i coefficient in the combination will be 0,
corresponding to the deleted or missing layer.
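The serial multiply-accumulate scheme can be modelled in a few lines of Python. This is a purely arithmetic sketch with made-up integer values: it ignores the pipeline registers and the bit widths, and the function names are hypothetical.

```python
def macc(hits, coeffs, c0):
    """One scalar product p = sum_i c_i * h_i + c0, one term per 'clock cycle'.

    The constant is added together with the first product, so no cycle
    is spent clearing the accumulator.
    """
    acc = 0
    for i, (h, c) in enumerate(zip(hits, coeffs)):
        if i == 0:
            acc = c0 + h * c   # first term: load constant plus first product
        else:
            acc += h * c       # following terms: accumulate
    return acc


def fit(hits, coeff_sets, constants):
    """Six scalar products computed 'in parallel' on the same seven hits."""
    return [macc(hits, cs, c0) for cs, c0 in zip(coeff_sets, constants)]


# Seven input words; the coefficient of the missing layer (index 3) is 0
hits = [10, 20, 30, 0, 50, 7, 3]
coeffs = [1, 2, 3, 0, 5, 6, 7]
print(macc(hits, coeffs, c0=100))
```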
Figure 4.7: Fitter internal structure
Figure 4.8: Sketch of a DSP48E slice, with the 18-bit hit, coefficient and constant
inputs. Used resources are highlighted. Until the scalar product is complete, the
partial product P is sent back to the adder via the Z multiplexer. Z is controlled by
the Control state machine via OPMODE. ALUMODE is fixed to use the DSP in MACC
configuration.
4.5.1 Scalar products with DSP48E slice
The DSP48E slice, described in the design guide [36], is the digital signal
processing element of the Virtex-5 FPGA chip. The features implemented in this
device are a 25 × 18 bit two's complement multiplier and an add/subtract function
that is extended to work as a logic unit. This logic unit can perform a host of
bitwise logical operations when the multiplier is not used. The DSP48E slice
includes a pattern detector and a pattern bar detector that can be used for
convergent rounding, overflow/underflow detection for saturation arithmetic, and
auto-resetting counters/accumulators.
The multiplier inside the slice operates on two's complement operands, while the
data sent by the Combiner are coded as a magnitude and a sign bit. A conversion is
therefore needed before the multiplication, and it is done with simple logic gates.
The scalar product is then converted back into signed (sign and magnitude) integers.
The device is used inside the Fitter in a MACC (Multiply Accumulate) configuration,
selected by the 7-bit OPMODE. As shown in Fig. 4.8, hits and coefficients enter on
the A and B buses. The C input bus is used for the constant. The width of the input
buses is fixed to 18 bits, which is the maximum width of the B input and is
sufficient for both hits and constants. At every clock cycle the inputs are fed with
a hit and the corresponding coefficient. The constant is also registered at every
clock cycle, but it is added only once. The use of the internal pipeline, by
registering the A and B inputs and using the M register, increases the speed of the
operations, allowing a higher clock frequency. With this configuration it takes 3
clock cycles for the first pair of inputs to be multiplied and accumulated, leading
to a total latency of 9 clock cycles for a scalar product. By controlling the OPMODE,
the Z multiplexer is first set to add the constant to the first product and then set
to accumulate the subsequent products. When all the products are accumulated a DV
(Data Valid) signal is generated. All these operations are controlled by the Control
state machine.
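The conversion between the sign-and-magnitude coding and two's complement can be sketched as follows. The position of the sign bit (taken here as the MSB of the word) and the function names are assumptions of this example; the firmware performs the equivalent operation with combinational logic.

```python
WIDTH = 18  # width of the DSP input buses


def sm_to_twos(word, width=WIDTH):
    """Sign-magnitude word (MSB = sign, assumed) to a two's complement pattern."""
    sign = (word >> (width - 1)) & 1
    magnitude = word & ((1 << (width - 1)) - 1)
    value = -magnitude if sign else magnitude
    return value & ((1 << width) - 1)


def twos_to_sm(word, width=WIDTH):
    """Two's complement pattern back to sign-magnitude."""
    sign_bit = 1 << (width - 1)
    value = word - (1 << width) if word & sign_bit else word
    magnitude = abs(value)
    if magnitude >= sign_bit:
        # the most negative two's complement value has no sign-magnitude code
        raise OverflowError("value not representable in sign-magnitude")
    return (sign_bit | magnitude) if value < 0 else magnitude


print(hex(sm_to_twos(0x83, width=8)))   # -3 coded on 8 bits
print(hex(twos_to_sm(0xFD, width=8)))
```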
Control state machine A state machine called Control presides over the operations
inside the DSP48E slices. It is composed of three states and takes as input the
EMPTY and EV signals in addition to the global RESET. The machine sends the RE
signal to the input FIFO, the MODE signal to the DSP, and the DV and NCVERROR
signals to the Fitter output. The RESET signal sets the machine to the WAIT state
(1 in Fig. 4.9), where the machine stays if EMPTY = 1 and where RE = 1, MODE = 1,
NCVERROR = 0. The MODE signal enables the DSP to add the c_0 constant, so the
constant is the first term to be summed in the accumulator. When EMPTY becomes 0
the machine switches to the GO state, where it stays during normal operations. In
this state RE = 1, MODE = 0, NCVERROR = 0. If the input FIFO becomes empty
(EMPTY = 1) without an EV signal, the machine switches to the ERROR state, where
NCVERROR is generated. This is necessary because it means that the FIFO became
empty before it had sent all seven hits of the combination. In this case the data in
the DSP pipeline, where some c_i h_i products have already been accumulated, are
corrupted, so an error flag must be generated. The machine exits the ERROR state
only when it receives the EV or the RESET, switching to WAIT. When EV becomes 1 the
machine switches to WAIT to add the c_0 constant of the subsequent hit combination.
Then, if the FIFO is not empty, the machine returns to the GO state. With this
machine, the Fitter is always ready to receive data, without losing any clock cycle.
Moreover, no clock cycle is lost to reset the accumulator, because the first product
is directly summed to the constant c_0. The DV output follows EV, except in the
ERROR state, where it is always 0. This signal, generated here and suitably delayed
to take into account the pipeline latency, indicates that the scalar product is ready.
Figure 4.9: Control state machine. The output signals on each state are: WAIT: RE=1,
MODE=1, NCVERROR=0; GO: RE=1, MODE=0, NCVERROR=0; ERROR: RE=1, MODE=0,
NCVERROR=1.
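The transitions described above can be summarized in a small behavioural model. The following Python sketch follows the text (three states, EV taking priority over EMPTY in the GO state); it is not the firmware source, and the function names are hypothetical.

```python
WAIT, GO, ERROR = "WAIT", "GO", "ERROR"

# Output signals of each state, as listed in the caption of Fig. 4.9
OUTPUTS = {
    WAIT:  {"RE": 1, "MODE": 1, "NCVERROR": 0},
    GO:    {"RE": 1, "MODE": 0, "NCVERROR": 0},
    ERROR: {"RE": 1, "MODE": 0, "NCVERROR": 1},
}


def next_state(state, empty, ev, reset=False):
    """Next state of the Control machine for the given inputs."""
    if reset:
        return WAIT
    if state == WAIT:
        return WAIT if empty else GO
    if state == GO:
        if ev:                      # end of vector: prepare the next c_0
            return WAIT
        return ERROR if empty else GO
    return WAIT if ev else ERROR    # ERROR is left only on EV (or RESET)


def data_valid(state, ev):
    """DV follows EV, except in the ERROR state where it is always 0."""
    return int(bool(ev) and state != ERROR)


state = WAIT
for empty, ev in [(1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (1, 0), (0, 1)]:
    state = next_state(state, empty, ev)
    print(state, OUTPUTS[state])
```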
The resulting product is a 48-bit two's complement bus, which is directly
converted into signed integers. Only part of the bits are significant, because the
bus exceeds the desired resolution on the parameters. Moreover, the constraint
functions must be squared and summed but, as we saw before, the maximum width of the
DSP B input is 18 bits, which is more than enough for the required resolution. The
bits that are beyond the desired resolution are therefore ignored. Every parameter
has a different number of significant bits and also has its first significant bit in
a different position, reflecting the desired precision on the specific measurement.
The relevant bit fields can be changed in the firmware. For the Gigafitter
implementation in SVT, I kept the old number of output bits in order to fit the
current output protocol, but it is important to point out that the Fitter is ready to
handle increases in the parameter resolution. The following table shows the number of
selected bits for all the calculated quantities.
Quantity    d         c        φ    χ_1, χ_2, χ_3
# of bits   10+sign   7+sign   13   14
The simulation of the Fitter module is shown in Fig. 4.10.
Figure 4.10: Simulation of the Fitter module. Only one parameter is shown. Three clock
cycles after the EV the scalar product is ready. If the EMPTY signal becomes 1
without an EV the error flag is raised.
4.6 Track selection: the Comparator
After the fitting algorithm has been carried out, the next step is to choose which
hit combinations represent real tracks. The Comparator is the part of the GF in
charge of this selection. Its structure is shown in Fig. 4.11.
The Comparator receives the reconstructed parameters and constraint functions,
storing the parameters and calculating the χ² from the three χ_i. Then, the χ² is
compared with a threshold value stored in a register. If the χ² value is under the
threshold it is temporarily registered. The χ² is calculated inside a DSP48E slice,
where each χ_i is squared and added to the others.
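In arithmetic terms the operation is a sum of three squares followed by a comparison. A minimal Python sketch, with hypothetical function names and made-up integer values, is:

```python
def chi2(constraints):
    """Sum of the squared constraint functions chi_1, chi_2, chi_3."""
    return sum(c * c for c in constraints)


def passes_cut(constraints, threshold):
    """A combination is kept when its chi2 is under the threshold."""
    return chi2(constraints) < threshold


print(chi2([1, -2, 3]))
print(passes_cut([1, -2, 3], threshold=20))
```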
Figure 4.11: The Comparator internal structure
Track goodness As said before, the Fitter receives from the Combiner all the
combinations based on 4/5 hits on the SVXII layers. Therefore, when a road has a hit
on each of the five SVXII layers, the Fitter receives five candidate tracks (the 4/5
combinations). Only the best track must be sent to the GB. This is an important
feature introduced in the Gigafitter in order to increase the efficiency by trying to
eliminate the tracks containing noisy hits. It also aims to improve the parameter
resolution by selecting the track that grants, together with a good χ², the desired
resolution on a selected parameter. Clearly the layout of the used layers affects the
parameter resolution, and the same layout does not give the best resolution on every
parameter. So, the selection of the track is driven by the need for a better
resolution on one parameter rather than on the others. Another important track
attribute we must take into account is the quality of the hits (Long Cluster bit).
The Comparator receives this information directly from the Combiner.
The Long Cluster bit is sent from the HB to the Gigafitter (it is the second MSB
of the SVXII hits) and gives simple information about the hit quality. When more
than one contiguous bin in a layer has fired, only one hit is recorded but an LC
(Long Cluster) flag is set to denote a poor resolution on the measurement. The
Combiner receives the LC information on a 5-bit bus, in addition to 5 bits for the
used layers. A different layer layout leads to a different resolution on the track
parameters; e.g. the absence of the hit in the innermost layer strongly degrades the
resolution on the impact parameter. I introduced a method to merge this information
with the χ² in order to form a simple parameter for the comparison of the tracks.
There are 5 possible layer layouts and 12 LC layouts, because some are rejected
before the Gigafitter (there cannot be more than two LCs in one track, and only one
if it is on the innermost layer), leading to 60 possible combinations. Moreover, many
combinations are similar or identical. If in a 5/5 combination there is an LC on a
layer, it affects the quality of the 4/5 combinations where the corresponding layer
is used. On the other hand, the LC information is useless if the corresponding layer
is not used: in this case the two combinations, the one with the LC on the missing
layer and the one without it, are exactly the same. In order to classify the tracks
and group similar cases I applied the mask in Tab. 4.3.
To compile Tab. 4.3 I used the following criteria:
• the innermost layer is the most important; the absence of its hit or an LC on it
seriously weakens the resolution on the impact parameter;
• the optimal case is when there are no LCs;
• one LC is better than two;
• the quality is worse if the LC or the unused layer is on the innermost or on the
outermost layer. In a linear fit it is better to have a larger distance between the
measured points to reduce the error on the fitted parameters.
Table 4.3: The table shows the mask applied to layer and LC layouts with a quality
evaluation. The innermost layer is the first from the left.
This mask is only a preliminary attempt to group similar cases, in order to
reduce the number of bits needed to describe the information on the hit quality.
I called this bus q (quality). Together with the calculated χ², q addresses a ROM
that looks up the value of the function g = g(q, χ²) (g as in goodness). With the
sole purpose of testing the hardware, I have implemented the ROM with the temporary
function g = q + χ², only to have a response from the ROM. Further studies on the
CDF data are necessary to tune the goodness of the track. As soon as these studies
are available, the only thing to do will be to change the ROM content.
The Comparator stores the g function of the first track, and then replaces it with
any better g coming from the following tracks. This continues until the last track
arrives; then the best track can finally be stored in a FIFO, waiting to be packed
by the Formatter. Every time the g function is better than the previous one, the
Comparator generates a BETTER signal that is used internally to store the χ² and the
parameters, and in the Hit Spy module to store the corresponding hits. When the last
track has been compared, i.e. when the Comparator has received the EC signal from the
Combiner, a BEST signal is generated to push the track information (χ², parameters
and hits) into the small FIFO before the Formatter. Both the χ² cut and the
g-function comparison can be turned off with the NOCUT and NOCOMP signals. The
signals can be set via the VME interface by writing the corresponding configuration
register, as I will show in section 4.9.1. This feature can be useful for debugging,
for checking the selection with an offline analysis, or for comparing the Gigafitter
results with the TF++ output, which does not have the g-function comparison.
Fig. 4.12 shows the simulation of the Comparator with and without the cut and the
g comparison.
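The selection logic, including the NOCUT and NOCOMP switches, can be modelled as follows. This Python sketch rests on two assumptions that the text does not state explicitly: a lower g is considered better (natural for the temporary function g = q + χ²), and the χ² cut is applied before the comparison. The track representation as a dictionary is also hypothetical.

```python
def goodness(q, chi2):
    """Temporary test function stored in the ROM: g = q + chi2."""
    return q + chi2


def select_tracks(tracks, chi2_threshold, nocut=False, nocomp=False):
    """Select the output tracks among the combinations of one set.

    tracks: list of dicts with the keys 'q' and 'chi2'.
    nocut : disable the chi2 cut.
    nocomp: disable the g comparison (all surviving tracks are output).
    """
    passed = [t for t in tracks if nocut or t["chi2"] < chi2_threshold]
    if nocomp or not passed:
        return passed
    best = passed[0]
    for t in passed[1:]:
        # BETTER: the new g improves on the stored one (lower g assumed better)
        if goodness(t["q"], t["chi2"]) < goodness(best["q"], best["chi2"]):
            best = t
    return [best]   # BEST: pushed to the FIFO after the last track


candidates = [{"q": 3, "chi2": 10}, {"q": 1, "chi2": 5}, {"q": 0, "chi2": 50}]
print(select_tracks(candidates, chi2_threshold=20))
print(select_tracks(candidates, chi2_threshold=20, nocomp=True))
print(select_tracks(candidates, chi2_threshold=20, nocut=True, nocomp=True))
```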
AM road ID selection Every reconstructed track belongs to a specific road
identified by the road ID, which is a 21-bit word. As happens with the parameters,
the road ID sent by the Combiner is stored pending the Comparator decision. Even
if many tracks belong to the same road, the road ID is registered for every track,
and only the IDs of the tracks that pass the Comparator selection are sent to a
small FIFO. Like the parameter and χ² FIFOs, this FIFO will be read by the
Formatter.
4.7 Hit spy
A sketch of the Hit Spy module is shown in Fig. 4.13. The five hits from SVXII
and the XFT φ, which enter the DSP slices, must not be lost, because they must be
sent to the Gigafitter output together with the fitted parameters. The 9 MSBs of φ
are used for XFT track identification, a piece of information needed by the
Ghostbuster, which works after the Gigafitter. We do not need to output all 15 bits
of the SVXII hits, because part of the information is included in the AM road ID
word. Therefore, we have to store the 9 MSBs of the XFT φ and the 10 MSBs of the
SVXII hits.
A simple module, which I called Hit Spy, stores at every clock cycle the data
coming from the hit bus. It packs the hits in pairs (x0x1, x2x3 and x4φ) to
push them into a small FIFO. The hits are first registered, then written into
two parallel registers whose outputs are merged into the FIFO input. I decided
to store the hits in pairs to prepare them for the output protocol which, as I will
show later, needs them packed in this way. The module is managed by a control
module which understands the EV and EMPTY signals, as the Control state
machine does, and generates the signals that enable the write operations into the
registers and into the FIFO. The hits are then sent to the output if the track passes
the χ² cut and the comparison based on the g function. The method used to store the
information according to the Comparator decision is similar to the one described
for the parameters in the previous section. The only difference is that now data must
be read and written in 3 clock cycles, which are necessary to read all six hits.
(a) No χ² cut or comparison between tracks is performed. All tracks are output.
(b) Only the χ² cut is applied. All the tracks that pass the cut are output.
(c) The comparison between tracks is performed. Only one track within a set
of combinations is output, if it has passed the χ² cut.
Figure 4.12: Simulation of the Comparator module. The BEST signal is set to 1 when a
track is ready for output and can be pushed into the output FIFO. There
is a transition in the χ² output bus only when the FIFO is read with the
FIFO_RE signal. The figures show how the module responds to the NOCUT
and NOCOMP signals.
Figure 4.13: Hit Spy module. Hits are first coupled and pushed into the first FIFO.
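The truncation and pairing performed by the Hit Spy can be sketched in Python as below. The 15-bit width of the SVXII hits comes from the text; the width of the XFT φ word (`phi_width`) is a hypothetical value chosen only to make the example runnable, and the pairs are returned as tuples without assuming any bit packing order.

```python
def msbs(value, width, keep):
    """Keep the `keep` most significant bits of a `width`-bit value."""
    return value >> (width - keep)


def spy_pairs(svx_hits, xft_phi, svx_width=15, phi_width=12):
    """Truncate the six inputs and group them as (x0, x1), (x2, x3), (x4, phi).

    phi_width is a hypothetical width of the XFT phi word.
    """
    if len(svx_hits) != 5:
        raise ValueError("five SVXII hits expected")
    x = [msbs(h, svx_width, 10) for h in svx_hits]
    phi = msbs(xft_phi, phi_width, 9)
    return [(x[0], x[1]), (x[2], x[3]), (x[4], phi)]


# Three pairs, i.e. three write cycles into the small FIFO
for pair in spy_pairs([0x7FFF, 0x4000, 0x0020, 0x001F, 0x1234], 0xABC):
    print(pair)
```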
4.8 Formatter
The Formatter manages the conversion of the GF data into the SVT output protocol.
Its inputs are the track parameters and the χ² from the Comparator, the hits from
the Hit Spy, and the road ID and other information that is not used in the
Gigafitter. Each piece of information is held in a small FIFO, so the Formatter has
the logic to control them all.
I used the TF++ output protocol [24] to make the GF fully compatible with the
SVT system. However, the old protocol did not provide all the AM road ID bits. So
I have placed the two missing bits (Road ID(20-19)) in the following words,
dedicated to hit transmission, exploiting the two spare bits. The protocol is shown
in Tab. 4.4.
Table 4.4: Gigafitter output protocol (word bits 22 to 0)
EE  EP  data
        1 | z | φ
        Road(18-17) | sign | c | d
        Sector | AM Road ID (16-0)
        Road(19) | x1 | x0
        Road(20) | x3 | x2
        χ² | x4
    1   Gigafitter Status | Track Number
        :
1   1   L2B error flags | PA | Event Tag
The information concerning each track is contained in seven words. While in
the input protocol the EP bit marks the end of a road, in the output protocol the
EP bit is set on the last word of each track. The EE word is a copy of the
input EE word. The z in the first word is composed of the z values of the innermost
and the outermost layers used in the track reconstruction (3 + 3 bits). The track
number is composed of the nine MSBs of the XFT φ. Since there cannot be two
tracks with the same XFT information, it is used for track identification.
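The way the 21-bit road ID is spread over four output words (Tab. 4.4) can be illustrated with the following Python sketch. Only the bit ranges are taken from the table; the dictionary keys and function names are hypothetical.

```python
def split_road_id(road_id):
    """Split the 21-bit AM road ID into the fields carried by the output words."""
    if not 0 <= road_id < (1 << 21):
        raise ValueError("road ID must fit in 21 bits")
    return {
        "road_16_0": road_id & 0x1FFFF,        # third word
        "road_18_17": (road_id >> 17) & 0x3,   # second word
        "road_19": (road_id >> 19) & 0x1,      # spare bit of the x1/x0 word
        "road_20": (road_id >> 20) & 0x1,      # spare bit of the x3/x2 word
    }


def merge_road_id(fields):
    """Rebuild the road ID, as a downstream board would have to do."""
    return (fields["road_16_0"]
            | (fields["road_18_17"] << 17)
            | (fields["road_19"] << 19)
            | (fields["road_20"] << 20))


fields = split_road_id(0x15A5A5)
print(fields, hex(merge_road_id(fields)))
```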
Formatter structure The internal structure is shown in Fig. 4.14. All operations
are controlled by a finite state machine (Formatter Control). If the input
FIFOs are not empty, at every clock cycle it pushes a word into the output FIFO. To
follow the right word sequence, each word corresponds to a state of the machine:
when data are ready at the input, in each state the machine pushes the
corresponding word into the FIFO. When the machine outputs the last word of the
track, it also reads the output FIFO status. If the FIFO is almost full it waits
until the ALMOST_FULL flag is lowered. The ALMOST_FULL threshold is programmed
to allow at least 7 more words to be stored. In this way data are not lost in normal
operation. Data could be lost only if the output FIFO does not lower the
ALMOST_FULL flag for a long time, letting the input FIFOs fill up. Therefore, the
output FIFO depth must be sufficient to absorb the fluctuations in the data latency,
and error flags must be raised if the FIFOs at the input of the Formatter fill up.
Fig. 4.15 shows a simulation of the Formatter module.
Figure 4.14: Formatter structure
4.9 Firmware on the Pulsar Board
The firmware described so far is all implemented on the mezzanine card, inside
the Virtex FPGA. Data from the mezzanine must be read by the Pulsar in order to
communicate with the next SVT board. In the final Gigafitter configuration, the
Pulsar board will be in charge of receiving data from three mezzanine cards. It must
then refine the parameter values in order to take into account small nonlinearities
of the problem. The pre-calculated terms needed for this operation will be stored
in a large RAM mounted on the fourth mezzanine card. Since, at present, only one
fitter mezzanine card is available, this function has not been implemented.
Moreover, only the fit in one wedge has been implemented, so no merging is needed
for the first tests.
I have written a simple firmware which is able to read the mezzanine output
FIFO described before, and to send its content to the next board with the SVT
transfer protocol. A sketch of the functions implemented in the Pulsar FPGAs is
shown in Fig. 4.16.
Figure 4.15: Formatter simulation. The end of a track (EP) and the end of an event
(EE) are highlighted. When the track FIFO is EMPTY the write operation into
the output FIFO is not enabled.
Figure 4.16: A sketch of some of the functions implemented in the Pulsar FPGAs
The signals needed to read the FIFO (RE, EMPTY) are managed by the Control
FPGA and repeated in the DATA I/O. The data from the FIFO are likewise first
registered inside the DATA I/O and then sent to Control, where they are stored
inside a FIFO. All operations inside the Pulsar are synchronized with the 40 MHz
S_Link clock. This frequency cannot be used to generate the output DS, since the
SVT protocol is implemented with a 30 MHz maximum clock frequency. Therefore,
the S_Link clock frequency is divided in order to generate the right DS signal
and the FIFO RE signal.
The most important function of the Pulsar board is to manage the communications
via the VME protocol. As I have shown, in order to run the fitting algorithm
the Combiner must retrieve data from two RAMs. The content of these RAMs
must be loaded from a PC at power-on and must be accessible, in order to check
its integrity. In addition, input and output SPY BUFFERs are planned (see Sec. 4.10
for a short description), which must be readable at any time. Finally, other
registers must be written and read from the PC, e.g. the register containing the
χ² threshold value. Therefore, a VME interface is necessary to communicate
with the PC.
4.9.1 VME interface
The VME protocol [37], as it is used in the Pulsar board, can be briefly described
as follows. A dedicated chip is mounted on the Pulsar to translate the VME standard
into the TTL signals used on the Pulsar. Thanks to this chip we can send, from a PC
to the desired FPGA (either DATA I/O or Control), an address bus (VMEAddress), a
bidirectional data bus (VMEData), and two signals which enable the read and write
operations (VMERead and VMEWrite). We can read or write the resource (RAM or
register) at the address VMEAddress by receiving or sending data on the VMEData bus.
Most of the resources that must be accessed via VME are on the mezzanine
cards. Thus, part of the VME interface is implemented in the Virtex-5, but the
core of the interface is implemented in the DATA I/Os. Each DATA I/O is in
charge of the read and write operations on two mezzanine cards. The basic idea
for the interface implementation comes from the TF++ VME interface. We send
VMEAddress, VMEData, VMERead and VMEWrite to all the devices we want to
access. Each device is labelled with an address and, when it receives the VMERead
or the VMEWrite signal, it compares the VMEAddress with its own code (Decode
Address). In case of a match, it performs the read or write operation. While
this approach is rather simple for the devices on the Pulsar board, this is not the
case for the devices implemented in the mezzanine cards, because of the limited
width of the mezzanine connectors. Of the 72 lines available in the two PMC
IEEE 1386 connectors, we used 25 lines for the GF data flow. Other lines will
probably be needed to communicate the fitter status to the Pulsar. As a
first approximation, it follows that 42 lines are available for the VME interface.
These lines are used as follows:
• 5 lines are used to send a code (Opcode) that identifies the resource to be
accessed.
• 34 lines form a bidirectional bus called VMEBus. It is used for data and
addresses. The address is used to address the resource (RAMs or Spy Buffers)
selected by the Opcode.
• 2 lines are used for the read and write signals (ReadPulse and WritePulse).
• One line is used for a signal called BusBusy, which switches the VMEBus
direction. When BusBusy = 1 the Pulsar can write on the VMEBus, while
some tri-state buffers in the mezzanine disable the write operation on the
bus. When BusBusy = 0 the roles of the Pulsar and of the mezzanine card
are inverted.
Clearly, in a write operation BusBusy is always set to 1, while the read
operation is a little more complicated. A finite state machine is in charge of
switching the VMEBus in order to send the address, send the ReadPulse, and switch
the bus to read the data.
On the mezzanine, each RAM and register compares its own address (Decode
Address) with the Opcode. If there is a match, tri-state buffers are set in
order to connect the resource to the VMEBus. This is all that is needed for the
write and read operations on the configuration registers, the Spy Buffers and the
MAKE_ADDRESS RAM. The operations on the Constants RAM cannot be performed in the
same way because of the width of its words (672 bits). Words are sent in 48 packets
of 14 bits each, and an additional code (PACKET) is used to identify the packet.
The packets are stored in the mezzanine and written together into the RAM. The
code used to address the RAMs is shown in Tab. 4.5.
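The packet transfer of one Constants RAM word can be sketched in Python as follows. The numbers (48 packets of 14 bits, 672 bits in total) come from the text; the convention that packet 0 carries the least significant bits, and the function names, are assumptions of this example.

```python
PACKET_BITS = 14
N_PACKETS = 48            # 48 x 14 = 672 bits


def to_packets(word):
    """Split a 672-bit word into (PACKET code, 14-bit payload) pairs.

    Packet 0 is assumed to carry the least significant bits.
    """
    if not 0 <= word < (1 << (PACKET_BITS * N_PACKETS)):
        raise ValueError("word does not fit in 672 bits")
    mask = (1 << PACKET_BITS) - 1
    return [(k, (word >> (k * PACKET_BITS)) & mask) for k in range(N_PACKETS)]


def from_packets(packets):
    """Reassemble the word on the mezzanine once all packets are stored."""
    word = 0
    for k, payload in packets:
        word |= payload << (k * PACKET_BITS)
    return word


word = (1 << 671) | 0x1234
packets = to_packets(word)
print(len(packets), from_packets(packets) == word)
```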
The described interface allows the necessary RAMs to be filled with constants and
all the configuration registers to be set. During the tests, data have been
successfully written into RAMs and registers.
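As a reading aid for the VMEAddress layout of Tab. 4.5, the following Python sketch decodes a 24-bit address into its fields. The bit ranges are an assumption: they are inferred from the field order in the table together with the 5-bit Opcode and the two unused LSBs, which leaves 14 bits for the Address. The field and function names are hypothetical.

```python
def decode_vme_address(addr):
    """Decode a 24-bit VMEAddress (assumed layout, MSB to LSB):
    ZZ [23:22] | M [21] | Opcode [20:16] | Address [15:2] | unused [1:0]
    """
    if not 0 <= addr < (1 << 24):
        raise ValueError("VMEAddress must fit in 24 bits")
    opcode = (addr >> 16) & 0x1F
    address = (addr >> 2) & 0x3FFF
    return {
        "fpga": (addr >> 22) & 0x3,      # ZZ: selects the FPGA
        "mezzanine": (addr >> 21) & 0x1,  # M: selects the mezzanine
        "wedge": opcode >> 3,             # two MSBs of the Opcode
        "device": opcode & 0x7,           # three LSBs of the Opcode
        "address": address,
        # Constants RAM only: six MSBs are the PACKET code,
        # the remaining eight bits address the RAM location
        "packet": address >> 8,
        "ram_address": address & 0xFF,
    }


example = (2 << 22) | (1 << 21) | (0b10101 << 16) | (((47 << 8) | 0x5A) << 2)
print(decode_vme_address(example))
```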
Bit:    23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Field:  Z Z M Opcode Address x x
Table 4.5: VMEAddress to access resources on the Gigafitter. The two MSBs (ZZ) are
used to choose the FPGA, while M is used to choose the mezzanine. The two MSBs
of the Opcode are used to select a wedge and the other three the desired device.
In the case of the Constants RAM, the six MSBs of the Address are used for
the PACKET code.
4.10 Next steps
During my thesis work I have developed the firmware that lets the Gigafitter fit
the tracks coming from one SVT wedge. A complete GF must fit tracks from all 12
SVT wedges in order to provide complete φ coverage. Therefore, some more work
must be done. The fitting line must be replicated for the 12 wedges, implementing
4 lines in each mezzanine card. The only difference between two fitting lines lies
in the content of the Constants RAM. Once all the fitting lines are available, the
data coming from each of them must be merged into one bus. A simple finite state
machine, placed in the Pulsar board, could take care of this task. The firmware on
the Pulsar will also handle the RAM placed in the fourth mezzanine, in order to
apply small corrections to the track parameters, taking into account nonlinearities
in the fitting problem.
Input and output spy buffers must be implemented. They are RAMs addressed
with an incremental address, where incoming and outgoing data are temporarily
duplicated for monitoring and debugging purposes. When needed, the spy buffers
can be read from VME in order to check the GF results with an offline analysis.
A test mode should also be implemented. It would allow access to internal
partial results in the fitting pipeline, e.g. DSP outputs, g functions and other
registers. This mode would be very useful to debug the firmware and to understand
the output results.
Finally, a complete error handling system will be necessary in order to manage
the error flags coming from all the parts of the Gigafitter. Some of these error
flags must be sent to the downstream boards, as shown in section 4.8, and others
must be accessible from the VME interface.
None of these missing GF parts presents any difficulty in principle and,
moreover, none affects the feasibility of the system. This first implementation of
the Gigafitter will be tested parasitically in the real environment of CDF.
4.11 The Gigafitter contribution to SVT
As said before, the Gigafitter computation power will reduce the SVT
processing time, where the track fitter is one of the major bottlenecks, leading
also to a better efficiency with stable performance as a function of instantaneous
luminosity. Scalar products in the old low-density FPGAs occupied a large chip
area and suffered from timing problems. For this reason 8 × 8 bit multipliers were
implemented, even though the words to be multiplied were wider (18 bits). The
information carried by the most significant bits was then included in pre-calculated
terms, stored in very large memories (one word for each pattern). This feature is the
actual limit to the AM size we can use, and it will be removed by using the dedicated
25 × 18 bit hardware multipliers inside the new FPGAs. The size of each set of
constants will be drastically reduced. We will be able
to use a large number of different sets to allow the reconstruction of tracks that were
historically discarded because of hardware limitations. This means a better SVT
efficiency. It is also possible to fit the same track several times, removing one
particular layer in each fit. The layer configuration producing the best track quality
will be chosen. The different fits will be performed in parallel, without any increase
in latency. As the instantaneous luminosity of the Tevatron collider increases, it is
important to be able to evaluate track parameters under the assumption that there
could be a noisy hit in the fitted combination. This discrimination capability
reduces the degradation of the SVT efficiency and of the impact parameter resolution
caused by high detector occupancy. A larger AM pattern bank will translate into
three important improvements:
• Lepton coverage improvement in the forward region: the muon, electron and
tau coverage at CDF is increased by L2 SVT high-quality
tracking in the forward region, using the SVX alone where the COT is
missing. H physics acceptance can be improved by coupling L2 high-quality
tracking with the high-quality L2 calorimetric measurements provided by the
last calorimetric trigger upgrade.
• Extension of the SVT acceptance in track Pt, which will significantly improve
the online b-tagging capability. With this upgrade SVT will be able to
reconstruct tracks down to 1.5 GeV/c in Pt, while now only tracks with
Pt > 2 GeV/c are used.
• Extension of the SVT acceptance in track impact parameter, which will
significantly improve the lifetime measurements. The plan is to achieve
sensitivity to impact parameters up to 2-3 mm, while now only tracks with
impact parameters smaller than 1.5 mm are available.
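The strategy described in this section, fitting the same track several times with one
layer removed each time and keeping the configuration with the best quality, can be
illustrated with a toy model. The Python sketch below is not the SVT fit: it assumes
straight-line tracks, uses three-point collinearity conditions as linear constraints,
and computes the constants on the fly instead of reading them from a Constants RAM.
All names are hypothetical.

```python
def line_constraint(radii, i, j, k):
    """Coefficients of a linear constraint that vanishes when hits i, j, k
    lie on a straight line x = a + b * r."""
    coeffs = [0.0] * len(radii)
    coeffs[i] = radii[j] - radii[k]
    coeffs[j] = radii[k] - radii[i]
    coeffs[k] = radii[i] - radii[j]
    return coeffs


def constraints_without(radii, dropped):
    """Constraint set for the layout in which layer `dropped` is ignored."""
    kept = [layer for layer in range(len(radii)) if layer != dropped]
    return [line_constraint(radii, a, b, c)
            for a, b, c in zip(kept, kept[1:], kept[2:])]


def chi2(hits, constraints):
    """Fit quality: sum of the squared constraint values (scalar products)."""
    return sum(sum(c * x for c, x in zip(row, hits)) ** 2
               for row in constraints)


def best_layout(hits, radii):
    """Try every drop-one-layer layout and return the one with the lowest chi2."""
    scores = {d: chi2(hits, constraints_without(radii, d))
              for d in range(len(radii))}
    return min(scores, key=scores.get), scores
```

With radii `[1, 2, 3, 4, 5]` and hits `[3, 5, 11, 9, 11]` (a straight line with a noisy
hit on layer 2), `best_layout` selects the layout that drops layer 2. In hardware the
layouts would be evaluated in parallel rather than in a loop.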
4.12 Switching from SVT to FTK
The Gigafitter has been developed in the SVT environment, where it will replace
the existing track fitter. However, it will also be introduced in the new FTK
processor, which is expected to improve the ATLAS trigger performance. Although
the general structure of the device will remain the same, its use in the
new environment will require some changes in the firmware. In
the next chapter I will show how the FTK parameters are tuned on the ATLAS
structure.
First of all, as presented in chapter 3, FTK reconstructs complete 3D tracks by
fitting the trajectory helix with 5 parameters, while in SVT only the projection on
the r-φ plane is reconstructed. The additional parameters (cot θ, z0) require their
own constants in order to evaluate equation 3.1. Therefore, two additional parallel
fitting lines, each with its own DSP processor, are needed in the fitter module.
The different geometry of the ATLAS experiment with respect to CDF produces
a very different definition of sectors. In order to apply the linearized fit, ∼160k
sectors are defined. Each sector needs its own set of constants. This subdivision
is necessary to guarantee a good linear approximation inside each sector. In FTK,
7 logical layers are defined, each with a 2-dimensional coordinate. With 14
coordinates and 5 parameters to be fitted, 9 constraint functions must be calculated,
for a total of 5 + 9 = 14 scalar products, each over 14 coordinates (14 × 14
multiplications), to be computed in parallel. The increased number of
products leads to a larger resource utilization inside the chip. While the increased
number of DSPs can be easily accommodated, the size of the constants RAM will be a
critical parameter. The RAM would contain at least one word per sector, each containing
14 coefficients and one constant. Moreover, in order to repeatedly fit the same
track by removing one layer in each fit, a different set of constants for each layer
layout is needed. This would result in more than 1 million sets of constants, a
very large number to handle despite the steady growth of resource density
inside modern FPGA chips. This problem is still under investigation at the level
of the FTK architecture, and solutions such as FPGA-array boards have been
proposed. In the next chapter I present a study of FTK performance, with
particular attention to the selection of B0s → µ+µ−.
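The figure of more than 1 million sets of constants can be checked with a short
calculation. The Python sketch below assumes that the layouts are the full 7-layer
configuration plus one configuration for each removed layer (8 in total); this count is
an assumption, since the text only states that each layer layout needs its own set.
The number of words per set is an upper estimate based on the full 14-coordinate layout.

```python
N_SECTORS = 160_000                      # "~160k sectors" quoted in the text
N_LAYERS = 7                             # logical layers
N_COORDS = 2 * N_LAYERS                  # one 2-dimensional coordinate per layer
N_PARAMS = 5                             # helix parameters
N_CONSTRAINTS = N_COORDS - N_PARAMS      # constraint functions
N_PRODUCTS = N_PARAMS + N_CONSTRAINTS    # scalar products per fit

# Assumption: one full layout plus one layout per removed layer.
N_LAYOUTS = 1 + N_LAYERS
n_sets = N_SECTORS * N_LAYOUTS

# Each scalar product needs 14 coefficients and one constant.
words_per_product = N_COORDS + 1
words_per_set = N_PRODUCTS * words_per_product

print(f"scalar products per fit: {N_PRODUCTS}")
print(f"sets of constants:       {n_sets:,}")
print(f"words per set (max):     {words_per_set}")
```

Under these assumptions the number of sets is 1,280,000, consistent with the estimate
given in the text.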
Chapter 5
Application to rare b decays
5.1 FTK simulation
A simulation program (FTKSim) was created for the FTK. It is able to process
complete ATLAS events and run the FTK algorithm to produce exactly the
same list of tracks that would be produced by the actual device. The purpose of
this program is manifold:
• evaluation of tracking performance parameters that can then be used in fast,
parametric detector simulations for high-statistics studies of physics
performance;
• detailed and reliable evaluation of the physics performance of the FTK, by
feeding it complete events produced by the full ATLAS detector simulation;
• evaluation of the crucial parameters needed for hardware design;
• evaluation of the large set of numeric constants needed for programming the
FTK device.
These goals are attained by an intermediate-level simulation that describes the
algorithm and the FTK internal data accurately but at a high level, and avoids
detailed hardware simulation, thus attaining sufficient speed for the simulation of
moderately sized samples of complete events. The core code is based on a similar
simulator previously created for the SVT processor (QUICKSVT [38]), inheriting
its overall organization and basic algorithms, while the data structures have been
recreated in accordance with the ATLAS/ATHENA data structures [39][40].
FTKSim is a standalone program, written in C/C++, which interfaces
with ROOT [41]. It contains two main connected modules simulating the
Associative Memory (AM) and the Track Fitter (TF) respectively, and two separate
modules that produce the two main data banks needed for programming:
the pattern bank (PB) for the AM, and the fit constants (FC) for the TF. For
this study, the 8 SCT layers were grouped in 4 pairs (r-φ + stereo), in order
to effectively work with a total of 7 layers ("logical layers"), each providing a
2-dimensional measurement (space point). This is not expected to significantly
affect the evaluation of performance, and it saves computing time. In
addition, it allows a more meaningful comparison with the off-line reconstruction
program (iPatRec [42]), which uses the same procedure.
5.1.1 Generation of internal FTK data banks
The internal FTK data banks encode information about the geometry and resolution
of the various parts of the detector, in addition to an appropriate subdivision
of the detector in regions. In order to simulate real operating conditions with
unknown detector misalignments, the needed information on detector geometry
and resolution was extracted from samples of simulated data rather than from a
nominal geometric description of the detector. This is attained for both the pattern
bank and the fit constants by processing training samples of single-muon events:
20 million events for the fit constants and 100 million for the pattern banks.
The events were processed by the full ATLAS detector simulation (code versions
10.0.6 and 12.0). Realistic resolution and multiple scattering effects were simulated.
For the sole purpose of training, effects producing large deviations of the
track from its average trajectory or multiple tracks were turned off (e.g. hard
interactions with detector material, delta rays, etc.) in order to optimize the
performance on clean tracks. This avoids low-probability special cases that
bring no real advantage to system efficiency. Most tracks cross each of the
7 defined "logical layers" (because of detector overlaps, a track can sometimes
cross two modules of the same layer). FTKSim was configured to reconstruct any
track leaving at least 6 hits on any combination of these 7 layers. A combination
of 7 modules (one per layer) that can be crossed by the same track is called a
"sector".
Figure 5.1: The barrel is subdivided in geometrical regions, each covering about 90°.
The figure shows the logical subdivision of each layer; the module size is
arbitrary. Two regions are highlighted: the non-overlapping part is in green
while the overlap is in blue. The two sequences of red modules show two sectors.
Figure 5.2: Distribution of the track parameters for the training sample: curvature
[(GeV/c)⁻¹], impact parameter [cm], cot(θ), φ [rad], z0 [mm].
AM organization: defining sectors and regions. In the first step, it is necessary
to identify all valid sectors. They were determined from the input data set,
by selecting the possible sectors that are hit by tracks with sufficient frequency.
An important parameter for both sectors and patterns is the coverage. This is a
purely geometric quantity, defined as the probability that a track (with parameters
within a fiducial range) intersects the detector in a set of points that are
within a sector/pattern contained in the bank. In short, it is the fraction of
reconstructible tracks when only purely geometric effects are considered. The fiducial
range of track parameters is defined by their generated distribution (almost flat
in the chosen range), as shown in Fig. 5.2. In our work we found a list of 157,901
sectors, providing a geometrical coverage of 98.6%. All sectors are then grouped to
form regions (see Fig. 5.1). Each region is implemented in hardware as a separate
Associative Memory bank, and therefore patterns crossing a region boundary are
not allowed by the system. This has no impact on the efficiency, because regions
have been defined with a generous overlap: 8 regions are used, each defined by a
range of the φ coordinate and covering approximately 1/4 of the detector, so that
each pair of contiguous regions overlaps by about 50%. Patterns located in the
overlap region are not duplicated, because this would generate a fake track; for this
reason each of these patterns is arbitrarily assigned to one of the banks. All regions
extend along z for the full detector length.
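The sector selection and the coverage defined above can be sketched in a few lines of
Python. This is only an illustration and not the FTKSim code: each track is represented
by the tuple of module identifiers it crosses (one per logical layer), and the
frequency threshold `min_tracks` is a hypothetical parameter.

```python
from collections import Counter


def select_sectors(training_sectors, min_tracks):
    """Keep the sectors crossed by at least `min_tracks` training tracks."""
    counts = Counter(training_sectors)
    return {sector for sector, n in counts.items() if n >= min_tracks}


def coverage(test_sectors, bank):
    """Fraction of tracks whose sector is contained in the bank."""
    if not test_sectors:
        raise ValueError("need at least one track to evaluate the coverage")
    return sum(1 for s in test_sectors if s in bank) / len(test_sectors)
```

For instance, with three toy sectors A, B and C crossed by 5, 3 and 1 training tracks
and `min_tracks=2`, the bank contains A and B, and a test sample made of tracks in
A, B, C and A has a coverage of 0.75. The same counting applies to patterns, with
super-bin identifiers in place of module identifiers.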
Fit constants and linearity. After the sectors are defined, the fit constants are
determined for each sector. The constants will be used to solve Eq. 3.1. They
are evaluated by comparing hit positions with the originally generated true track
parameters. Only tracks hitting at least 7 different layers were used for constants
generation.
It has been explicitly verified that each sector covers a region of space small
enough to make the linear fitting approximation accurate within the whole sector [43].
The linear approximation, as shown in Fig. 5.3, is tested by verifying that
the difference between the track parameters extracted using Eq. 3.1 and the real
parameters has a distribution centered at 0 for each value of the real parameter. If
the linear approximation were not satisfactory, deviations from zero would be visible
for particular values of the parameter. In this case, however, the linear fit was
found to be a good approximation for all the track parameters.
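The linearity test described above amounts to profiling the residual (fitted minus true
value) in bins of the true parameter and checking that the mean stays close to zero in
every bin. A minimal Python sketch follows; the binning, the tolerance and the function
names are hypothetical choices made for illustration, not those used in the thesis.

```python
def residual_profile(true_values, fitted_values, lo, hi, n_bins):
    """Mean of (fitted - true) in equal-width bins of the true value.
    Empty bins are reported as None."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for t, f in zip(true_values, fitted_values):
        if lo <= t < hi:
            b = min(int((t - lo) / width), n_bins - 1)
            sums[b] += f - t
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def is_linear(profile, tolerance):
    """True if the mean residual is within the tolerance in every filled bin."""
    return all(abs(m) <= tolerance for m in profile if m is not None)
```

A fit with a residual non-linearity, for example a cubic term growing with the true
value, would show mean residuals drifting away from zero in the outer bins and would
fail the `is_linear` check.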
Generation of pattern banks. In the second step, valid patterns are determined
inside each sector, to be stored in the Associative Memory. Each module is
subdivided into a number of bins of equal size (super-bins), each of them a rectangle
in the φ-z space. A pattern is a combination of 7 such bins, one for each of the
7 modules of the sector being considered. They are generated by the same algorithm
used to find valid sectors: the training set of tracks is scanned and the valid
patterns are identified as those that have a sizeable probability of being 'hit' by
a valid track. The size of the bins has been varied to find a compromise between
the size of the pattern bank (see Fig. 5.4) and the hit occupancy within each
pattern, which determines the number of track candidates that must be fit. Fig.
5.4 shows the efficiency of the pattern bank as a function of the number of training
tracks used to produce the bank. The different curves in the figure correspond to
different road sizes. For this work, we have chosen a size in the r-φ plane of 5 mm
for Pixels and 10 mm for SCT detectors; both extend in z for the length of a full
module. This choice yields a bank size of ∼10⁶ patterns for each of the 8 regions
(see Tab. 5.1). In this case only the barrel detector (central region) is included in
the calculation. This is compatible with implementation in 2 AM boards of the
type currently in use in the CDF SVT.
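The construction of a pattern from the hits of a track can be sketched as follows.
The Python example assumes that each hit is given as a local r-φ coordinate in mm
within its module, and that the 7 logical layers are 3 Pixel layers followed by the
4 paired SCT layers; since the bins span the full module length in z, only the r-φ
coordinate enters. Names and the `min_tracks` threshold are hypothetical.

```python
import math
from collections import Counter

# Super-bin widths in mm: 5 mm for the 3 Pixel layers, 10 mm for the 4 SCT layers.
BIN_WIDTH_MM = [5.0, 5.0, 5.0, 10.0, 10.0, 10.0, 10.0]


def pattern_of(hits_mm):
    """Map the 7 local r-phi hit coordinates of a track to its super-bin pattern."""
    if len(hits_mm) != len(BIN_WIDTH_MM):
        raise ValueError("expected one hit per logical layer")
    return tuple(math.floor(x / w) for x, w in zip(hits_mm, BIN_WIDTH_MM))


def build_bank(training_tracks, min_tracks=1):
    """Scan the training tracks and keep the patterns hit often enough."""
    counts = Counter(pattern_of(track) for track in training_tracks)
    return {p for p, n in counts.items() if n >= min_tracks}
```

Wider bins reduce the number of distinct patterns, and therefore the bank size, at the
price of more hits falling in each pattern, which is the compromise discussed above.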
Figure 5.3: Difference between the track parameters reconstructed by FTK and the
truth parameters, as a function of the truth parameter itself, for (a) 1/pT, (b) φ,
(c) impact parameter, (d) cot(θ), (e) z0. As expected, if the linear approximation
is good the differences are centered at 0 for all truth values.
Figure 5.4: Growth of geometric coverage with increasing size of the pattern bank for
each region.
[Plot: bank efficiency (0 to 1) vs number of training tracks (10³ to 10⁶), one region;
curves for 1 mm Pixel / 3 mm SCT, 2 mm Pixel / 5 mm SCT and 5 mm Pixel / 10 mm SCT.]
Bank   Patterns   Muon eff. (%)   Pion eff. (%)
0      1053348    95.8 ± 0.4      95.9 ± 0.2
1      983741     95.7 ± 0.3      95.8 ± 0.2
2      1017193    96.0 ± 0.3      95.5 ± 0.2
3      1026228    96.1 ± 0.4      95.3 ± 0.2
4      1019529    95.8 ± 0.4      95.2 ± 0.2
5      1029316    96.1 ± 0.3      95.7 ± 0.2
6      1044599    96.4 ± 0.3      95.6 ± 0.2
8      1008758    96.7 ± 0.3      95.2 ± 0.2
Table 5.1: Size and coverage of FTK pattern banks.
5.2 FTK tracking performance
After the generation of the pattern banks and the fit constants, it is possible to
evaluate the performance of FTK and compare it with the full off-line reconstruction
program (iPatRec v10.0.6). The results are shown in Fig. 5.5 and demonstrate
that the FTK is able to provide at the trigger level a resolution comparable with
the full-fledged off-line tracking program. Fig. 5.6 shows the impact parameter
resolution as a function of the transverse track momentum. The resolution measured
by FTK has a shape similar to the off-line one, with an additional 30 µm added in
quadrature. The reason for this difference is not yet completely understood, but the
precision is sufficient for trigger purposes. The reconstruction efficiency on the
single muon sample is also good, as shown in Fig. 5.7, with performance
very close to the off-line. The small differences can be ascribed to the fact
that the off-line reconstruction is based on the complete tracker, while FTK only
uses the silicon-based part. The FTK algorithm is actually expected to yield a
performance very close to the theoretical maximum when non-linearities are small.
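As a reminder of what "added in quadrature" means for the impact parameter
resolution, the Python sketch below combines an off-line resolution with the additional
30 µm term. The off-line values used in the example are hypothetical inputs chosen for
illustration and are not taken from Fig. 5.6.

```python
import math

EXTRA_TERM_UM = 30.0   # additional term quoted in the text, in micrometres


def ftk_resolution_um(offline_resolution_um):
    """Combine the off-line resolution with the extra term in quadrature:
    sigma_FTK = sqrt(sigma_offline**2 + 30**2)."""
    return math.hypot(offline_resolution_um, EXTRA_TERM_UM)


# Hypothetical off-line resolutions (in micrometres), for illustration only.
for sigma_off in (20.0, 40.0, 80.0):
    print(f"off-line {sigma_off:5.1f} um -> FTK {ftk_resolution_um(sigma_off):5.1f} um")
```

The extra term dominates where the off-line resolution is much better than 30 µm and
becomes a small correction where the off-line resolution is much worse.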
In this document a specific application is analyzed in detail: reconstruction of
rare B0s decays in the mode B0s → µ+µ−.
5.3 Application to B0s → µ+µ−
The small yield makes this mode accessible only in hadron collisions. The current
best limits are from the Tevatron (CDF, 1.9 fb⁻¹: < 5.8 × 10⁻⁸ [44]). The power of
this channel in searching for supersymmetry is comparable with direct searches:
this is a possible candidate for the first detection of physics beyond the SM. The
Tevatron searches proved that this mode can be separated from background and
measured in spite of the complexity of hadronic collisions, provided that good
tracking and a strong trigger are available. The B0 → µ+µ− case will not be treated
in this section because, with the expected ATLAS mass resolution of 60 MeV/c²
[45], it will not be possible to separate the two signals in the search.
The presence of muons allows selection at Level-1 of the trigger. The current
ATLAS strategy [46] is mainly based on requiring both muons to pass the threshold
pT > 6 GeV/c and to be within |η| < 2.5. The possibility of triggering on a single
muon has only been considered for low-luminosity periods [47] due to its high rate
[48]. As can be seen in Tab. 5.2, the main background comes from in-flight decay of
light mesons (π and K); this is strongly suppressed at Level-2 by matching with Inner
Detector tracks and improved pT measurement [45]. The residual background is
mainly from semi-leptonic decays of bottom and charmed hadrons. However, this
high rate is not a problem if the next level of selection is fast: the presence of FTK
allows offline-type selections to be performed in a small fraction of the time needed
by a CPU farm, thus making it possible to handle much larger Level-1 rates, provided
the Level-2 output rate from the selection is low enough.
Figure 5.5: Comparison of the resolution obtained by the off-line reconstruction code
iPatRec (blue) and FTKSim (red) for the track parameters: (a) impact parameter [cm],
(b) curvature [(GeV/c)⁻¹], (c) φ [rad], (d) z0 [mm], (e) cot(θ).
Figure 5.6: Resolution on the impact parameter as a function of the track inverse
momentum.
[Plot: impact parameter resolution [cm] vs pT [GeV/c], from 5 to 20 GeV/c.]
Figure 5.7: Efficiency vs azimuthal angle φ.
[Plot: efficiency (fraction, 0 to 1) vs φ [rad] for FTK and iPatRec.]
Source   Barrel   EndCap   Barrel+EndCap
π/K      7.0      9.8      16.8
b        1.9      2.1      4.0
c        1.1      1.3      2.4
W        0.004    0.005    0.009
Total    10.0     13.2     23.2
Table 5.2: Level-1 rates (kHz) for the MU6 trigger (one muon with pT > 6 GeV/c and |η| < 2.5), estimated for instantaneous luminosity 1 × 10³³ cm⁻² s⁻¹ [48].
The performance of the following Level-2 selection based on FTK was studied, assuming a Level-1 trigger requiring just one muon with pT > 6 GeV/c:
1. Find a second muon with pT > pTmin.
2. Both muons must have an impact parameter between 100 µm and 2 mm.
3. The reconstructed B0s candidate is required to point to the beam spot in the transverse plane, by applying a cut on its impact parameter: dB0s < 100 µm.
4. 4.8 GeV/c² < M(µµ) < 6 GeV/c².
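The four cuts listed above can be sketched as a simple predicate. This is only an illustration, not the FTKSim or ATLAS trigger code: the `Muon` and `DimuonCandidate` classes, their field names, and the function `passes_level2` are hypothetical, and the Level-1 requirement on the first muon is assumed to have been applied already.

```python
from dataclasses import dataclass


@dataclass
class Muon:
    pt: float  # transverse momentum in GeV/c (hypothetical field)
    d0: float  # signed impact parameter in mm (hypothetical field)


@dataclass
class DimuonCandidate:
    mu1: Muon    # muon that fired Level-1 (pT > 6 GeV/c assumed upstream)
    mu2: Muon    # second muon found at Level-2
    d0_b: float  # impact parameter of the B0s candidate w.r.t. the beam spot, mm
    mass: float  # di-muon invariant mass in GeV/c^2


def passes_level2(cand: DimuonCandidate, pt_min: float = 3.0) -> bool:
    """Apply cuts 1-4 of the proposed Level-2 selection."""
    # Cut 1: second muon above the pTmin threshold.
    second_muon_ok = cand.mu2.pt > pt_min
    # Cut 2: both muons displaced, impact parameter between 100 um and 2 mm.
    displaced = all(0.100 <= abs(m.d0) <= 2.0 for m in (cand.mu1, cand.mu2))
    # Cut 3: candidate points back to the beam spot (impact parameter < 100 um).
    pointing = abs(cand.d0_b) < 0.100
    # Cut 4: di-muon mass window.
    in_window = 4.8 < cand.mass < 6.0
    return second_muon_ok and displaced and pointing and in_window
```

Scanning `pt_min` from 2 to 6 GeV/c over a set of candidates would produce an efficiency curve as a function of pTmin.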
These cuts were applied to signal and background samples generated using Pythia. The background sample was formed by bb̄ QCD production, which is assumed to be the main background, at least at trigger level. Pile-up and minimum-bias events were not considered. The samples were processed by the fully detailed ATLAS tracking simulation (code version 10.0.6), followed by our FTK simulator; detector noise was added in the simulation using the standard values. In this version of the code only an older, 6-layer implementation of FTKSim was available, which has lower performance than the current version, so the results are conservative in this context. Since the ATLAS muon reconstruction capability has been shown to be very good, detailed muon reconstruction was not deemed necessary, and it is simply assumed that the muons are always correctly identified; this is not expected to influence the results significantly.
Fig. 5.8 shows the distributions of the quantities chosen for the cuts. The plots compare the background (green) and signal (blue) distributions of the reconstructed B0s → µ+µ− mass and impact parameters. These are the most effective cuts. The final result of this study is shown in Fig. 5.9 and detailed in Tab. 5.3 as a function of the cut pTmin.
Figure 5.8: The plots show the distribution of mass and impact parameters, evaluated using the FTK variables, for the signal and the background. The applied cuts are shown by the red vertical bars.
pTmin                            2 GeV/c   3 GeV/c   4 GeV/c   5 GeV/c   6 GeV/c
2 µ FTK in |η| < 1               0.385
σ(µ) > 100 µm                    0.295
σ(Bs) < 100 µm                   0.360
4.8 GeV/c² < M(µµ) < 6 GeV/c²    0.353
pT cut on second µ               0.344     0.228     0.170     0.116     0.061
Efficiency relative to L1        0.234     0.150     0.107     0.074     0.053
Total efficiency                 0.041     0.029     0.020     0.013     0.009
Table 5.3: Signal efficiencies
The plots show that the efficiency for collecting B0s → µ+µ− events increases by a factor of about 3 when lowering the second muon threshold from 6 to 3 GeV/c.
Even considering the lower efficiency of the muon system below 6 GeV/c (80% at 3 GeV/c), these results imply an increase in the size of the collected samples with respect to the original di-muon trigger scenario, which raises the sensitivity to low branching fractions even with limited integrated luminosity.
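As a rough cross-check of the quoted gain, the ratio can be computed directly from the "Efficiency relative to L1" row of Tab. 5.3. The sketch below assumes, for illustration only, that the 80% muon-system efficiency at 3 GeV/c enters as a single multiplicative factor; the names `eff_rel_l1` and `gain` are hypothetical.

```python
# Efficiency relative to Level-1 from Tab. 5.3, keyed by pTmin in GeV/c.
eff_rel_l1 = {2: 0.234, 3: 0.150, 4: 0.107, 5: 0.074, 6: 0.053}


def gain(pt_low: int, pt_high: int = 6, muon_eff: float = 1.0) -> float:
    """Ratio of signal efficiencies between two second-muon thresholds.

    muon_eff is an assumed single multiplicative correction for the
    muon-system efficiency at the lower threshold.
    """
    return muon_eff * eff_rel_l1[pt_low] / eff_rel_l1[pt_high]


raw_gain = gain(3)                 # tracking-only gain, 0.150 / 0.053
net_gain = gain(3, muon_eff=0.80)  # including the 80% muon-system efficiency
print(f"raw gain: {raw_gain:.2f}, net gain: {net_gain:.2f}")
```

Under these assumptions the tracking-only gain is about 2.8, and it stays above 2 after the muon-system correction.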
Figure 5.9: Efficiency of the proposed Level-2 selection as a function of the second muon threshold pTmin. The efficiencies are relative to the Level-1 selection. Curves: FTK, iPatRec, Truth; a 6 GeV pT cut is applied to the first muon.
Given the similarity between FTK and off-line tracking performance, it can be expected that FTK selection efficiencies will also be similar in regions that have not been explicitly simulated in this study, such as the forward region, which could be included in FTK by simply adding further regions. With these assumptions we can evaluate the total number of B0s → µ+µ− events collected and make a comparison with previous studies (Tab. 5.4).
A crucial point is obviously the amount of background passing this selection. This was evaluated by simulating generic bb̄ events with exactly the same procedures adopted for the B0s → µ+µ− signal. The sample contained about 20k events, which is sufficient for a first-order evaluation.
The efficiencies observed on background are reported in Tab. 5.5.
These numbers allow estimating the output Level-2 rate, knowing that the Level-1 rate is ≈ 3.5 kHz at L = 10³³ cm⁻² s⁻¹. With the pT > 3 GeV/c cut, the efficiency is of order 10⁻³, yielding rates of the order of 10 Hz. This has a large statistical uncertainty due to the small number of events passing the cuts (6), but this does not affect the conclusion that the rate is very reasonable. It can be further reduced if necessary by bringing to Level-2 other cuts that have been proposed for
                 CDF            ATLAS            FTK             FTK
                 780 pb⁻¹       30 fb⁻¹          30 fb⁻¹         30 fb⁻¹
                                                 pT1 > 6 GeV     pT1 > 6 GeV
                                                 pT2 > 6 GeV     pT2 > 3 GeV
                 off-line       off-line         Level-2         Level-2
Signal           0              27               178             546
Background       1              93
L2 Rate                                          Few Hz          10 Hz
Limit (95% CL)   < 10⁻⁷         < 6.6 × 10⁻⁹
Table 5.4: Expected event yields from the FTK trigger, under Standard Model assumptions for the branching fraction of B0s → µ+µ−.
pT(µ2) cut                       2 GeV/c       3 GeV/c       4 GeV/c       5 GeV/c   6 GeV/c
|η(µ2)| < 1                      0.508
σ(µ) > 100 µm                    0.375
σ(Bs) < 100 µm                   0.266
4.8 GeV/c² < M(µµ) < 6 GeV/c²    0.014
pT(µ2)                           0.491         0.312         0.174         0.100     0.045
LVL1 Eff.                        5.25 × 10⁻³   2.82 × 10⁻³   8.08 × 10⁻⁴   0         0
Total efficiency                 6.80 × 10⁻⁴   3.66 × 10⁻⁴   1.04 × 10⁻⁴   0         0
Table 5.5: Efficiency of selection cuts on background events.
off-line analysis, like isolation or 3D vertexing. They can all be implemented at earlier selection stages thanks to the prompt availability of the FTK track list.
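The rate estimate quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that the relevant background efficiency is the "LVL1 Eff." entry of Tab. 5.5 at pT > 3 GeV/c, and it uses a simple Poisson 1/√N approximation for the relative statistical uncertainty from the 6 surviving events; both choices are illustrative readings of the text rather than the procedure actually used in the study.

```python
from math import sqrt

L1_RATE_HZ = 3.5e3     # Level-1 rate at L = 1e33 cm^-2 s^-1 (from the text)
EFF_REL_L1 = 2.82e-3   # Tab. 5.5, "LVL1 Eff." at pT > 3 GeV/c (assumed reading)
N_PASSED = 6           # background events passing the cuts (from the text)

# Expected Level-2 output rate: input rate times selection efficiency.
rate_hz = L1_RATE_HZ * EFF_REL_L1

# Approximate relative statistical uncertainty for a small Poisson count.
rel_unc = 1.0 / sqrt(N_PASSED)

print(f"Level-2 rate ~ {rate_hz:.1f} Hz, relative uncertainty ~ {rel_unc:.0%}")
```

With these inputs the rate comes out close to 10 Hz with a relative uncertainty of roughly 40%, in line with the order-of-magnitude statement in the text.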
Conclusions
In this thesis I have presented a new powerful processor that has been developed to
provide fast track fitting inside the trigger system of hadron collider experiments.
It has been shown that the attributes of experiments operating at modern hadron colliders, such as the Tevatron and the LHC, grant the necessary conditions for the search of rare events. Some particular events, such as the rare decay B0s → µ+µ−, if detected, would reveal the presence of physics beyond the current formulation of the Standard Model. However, hadron collider experiments, which provide sufficient statistics for the search of such rare events, also produce a very large amount of background. Therefore, sophisticated trigger systems have been developed, in which the use of reconstructed particle trajectories from the earliest trigger decisions is a powerful and effective tool.
It has been shown how the Fast Tracker processor, which is the evolution of
SVT, would provide the reconstruction of track parameters for the Level-2 ATLAS
trigger decisions with off-line quality. It applies a two-stage algorithm, based on
pattern recognition of interesting events and fitting of track parameters with a
linearized algorithm.
Then, I have shown my personal contribution to the development and testing
of the core of the Gigafitter firmware: a processor, based on FPGA technology,
where the linearized fit algorithm is implemented by exploiting DSP arrays that can run at clock frequencies of up to 550 MHz. The Gigafitter has been developed to overcome the limitations of the SVT Track Fitter but, thanks to its great computational power, it will also be used as the FTK fitter.
I have implemented a complete Gigafitter fitting pipeline, which processes data
coming from an SVT wedge. For each part of the firmware I have presented the
technical solutions I have introduced in order to reduce the computing time and
the resource utilization. This work is a first step in the necessary development of the device, which will need further effort to reach the desired goal rate of one fit per nanosecond. However, the firmware I have written has demonstrated that the resources available in the chip are sufficient for the complete Gigafitter implementation at FTK. A complete simulation of each part of the firmware ensures a correct response of the system, which can now be tested on real data and in the real environment at CDF.
In the last chapter I have presented some studies of FTK performance, based on the FTK simulation software. It has been shown that the FTK processor can reconstruct all track parameters with resolution and efficiency close to those of the off-line ATLAS algorithm. Its use on fully simulated events containing B0s → µ+µ− signal and bb̄ background has shown that FTK will be able to significantly improve the signal acceptance, while keeping the event rate below the acceptable threshold for the ATLAS trigger system.
Bibliography
[1] http://www.aps.org/programs/honors/prizes/panofsky.cfm.
[2] F. Mandl and G. Shaw. Quantum Field Theory. John Wiley and Sons, 1993.
[3] D. H. Perkins. Introduction to High-Energy Physics. Cambridge University
Press, 2000.
[4] Chris Quigg. Gauge Theories of the Strong, Weak, and Electromagnetic
Interactions. The Benjamin/Cummings Publishing Company, Inc., 1983.
[5] A. Abulencia et al. Measurement of the B0s–B̄0s oscillation frequency. Phys.
Rev. Lett., vol. 97, 2006.
[6] Fermilab Tevatron main page. http://www-bdnew.fnal.gov/tevatron/.
[7] Fermilab National Accelerator Laboratory. http://www.fnal.gov/.
[8] The Collider Detector at Fermilab. http://www-cdf.fnal.gov/.
[9] ALICE (A Large Ion Collider Experiment). http://aliceinfo.cern.ch/.
[10] LHCb (the Large Hadron Collider beauty experiment).
http://lhcb.web.cern.ch/lhcb/.
[11] ATLAS (A Toroidal LHC ApparatuS). http://atlas.web.cern.ch/.
[12] CMS (Compact Muon Solenoid). http://cms.cern.ch/.
[13] The ATLAS experiment at the CERN Large Hadron Collider.
https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasTechnicalPaper.
[14] A. Bardi R. Carosi M. Dell’Orso M. D’Onofrio A. Annovi, M.G. Bagliesi and
et al. The fast tracker processor for hadron collider triggers. IEEE Trans
Nucl. Sci., vol. 48(no. 3), June 2001.
[15] L. Ristori M. Dell’Orso. A highly parallel algorithm for track finding. Nucl.
Instr. Meth., vol. 278:pp. 436–438, 1990.
[16] J. Berryhill A. Cerri A. G. Clark R. Culbertson et al. A. Bardi, S. Belforte.
SVT: an online silicon vertex tracker for the CDF upgrade. Nucl. Instr.
Meth., vol. A409:pp. 658–661, May 1998.
[17] M. Dell’Orso and L. Ristori. VLSI structures for track finding. Nucl. Instr.
Meth., vol. A278(no. 2):pp. 436–440, June 1989.
[18] Bitossi M. Chiozzi S. Damiani C. Dell’Orso M. Giannetti P. Giovacchini P.
Marchiori G. Pedron I. Piendibene M. Sartori L. Schifano F. Spinella F. Torre
S. Tripiccione R. Annovi A., Bardi A. A VLSI processor for fast track finding
based on content addressable memories. IEEE Trans Nucl. Sci., vol. 53(no.
4), August 2006.
[19] A. Bardi R. Carosi M. Dell’Orso P. Giannetti et al. A. Annovi, M. G. Bagliesi.
The data organizer: a high traffic node for tracking detector data. IEEE
Trans. Nucl. Sci. (publication status to be verified).
[20] S. Donati G. Gagliardi S. Galeotti P. Giannetti et al. S. Belforte, M. Dell’Orso.
The SVT hit buffer. IEEE Trans. Nucl. Sci., vol. 43(no. 3):pp. 1810–1813,
June 1996.
[21] Bardi A. Carosi R. Dell’Orso M. Giannetti P. Iannaccone G. Morsani F. Pietri
M. Varotto G. Annovi A., Bagliesi M.G. A pipeline of associative memory
boards for track finding. IEEE Trans Nucl. Sci., vol. 48(No. 3):595, 2001.
[22] H. Wind. Principal component analysis and its application to track find-
ing, volume III of Formulae and methods in experimental data evaluation.
European Physical Society, 1984.
[23] G. Gagliardi. Un progetto di trigger per il processo B → J/ψ + X al collider
adronico di Fermilab. PhD thesis, Università degli studi di Pisa - Facoltà di
Scienze M.F.N., 1993.
[24] Un-Ki Yang Jahred Adelman, Mel Shochet. SVT track fitter upgrade.
CDF/DOC/TRIGGER/CDFR/7872, July 2006.
[25] Virtex-5 data sheets webpage.
http://www.xilinx.com/support/documentation/virtex-5_data_sheets.htm.
[26] Pulsar board project. http://hep.uchicago.edu/~thliu/projects/Pulsar/.
[27] Altera APEX20K overview.
http://www.altera.com/products/devices/apex/overview/apx-overview.html#2.5.
[28] Leonardo spectrum. http://www.mentor.com/synthesis.
[29] Quartus ii. http://www.altera.com/products/software/sfw-index.html.
[30] A. Canepa M. Casarsa M. Covery G. Cortiana M. Dell’Orso G. Flanagan H.
Frisch P. Giannetti O. Gonzalez T. Liu D. Lucchesi R. Northrop D. Pantano
M. Piendibene L. Ristori L. Rogondino V. Rusu S. Torre Y. Tu Y. Veszpremi
M. Vidal S.M. Wang L. Sartori, A. Bhatti. Mezzanine card specifications for
level-2 calorimeter trigger upgrade. CDF/DOC/TRIGGER/CDFR/8533.
[31] http://catalog.tycoelectronics.com/amp/bin/amp.connect?c=1&m=bypn&i=13&pn=1-120527-1.
[32] Virtex-5 family overview.
http://www.xilinx.com/support/documentation/data_sheets/ds100.pdf.
[33] Xilinx ise.
http://www.xilinx.com/products/design_resources/design_tool/index.htm.
[34] Mentor graphics modelsim. http://www.model.com/.
[35] T. Nakaya et al. Hardware design and specification of the svt track fitter.
CDF/DOC/TRIGGER/5026, May 2001.
[36] Virtex-5 FPGA XtremeDSP design considerations user guide.
http://www.xilinx.com/support/documentation/user_guides/ug193.pdf.
[37] Vita standard organization. http://www.vita.com/.
[38] The CDF Coll. SVTSIM web page. http://www-cdf.fnal.gov/upgrades/daq
trig/trigger/svt/svtsim.html.
[39] The ATLAS Coll. ATLAS computing technical design report. Technical
report, July 2004. ATLAS TDR-017, CERN-LHCC-2005-022.
[40] ATHENA web page.
http://atlas.web.cern.ch/nAtlas/GROUPS/SOFTWARE/OO/architecture/.
[41] R. Brun et al. ROOT object oriented data analysis framework. Available:
http://root.cern.ch/.
[42] R. Clift and A. Poppleton. IPATREC: inner detector pattern-recognition and
track-fitting, 1994.
[43] S. Belforte et al. Silicon vertex trigger technical design report.
CDF/DOC/TRIGGER/PUBLIC/3108, 1995.
[44] T. Aaltonen et al. Search for B0s → µ+µ− and B0d → µ+µ−
decays with 2 fb⁻¹ of pp̄ collisions, 2007. Available:
http://www.citebase.org/abstract?id=oai:arXiv.org:0712.1708.
[45] The ATLAS Coll. ATLAS detector and physics performance technical design
report. Technical report, CERN, 1999. CERN-LHCC-99-014, ATLAS-TDR-
14, volume 1-2.
[46] ATLAS high-level trigger, data acquisition and control technical design re-
port. Technical report, 2003. ATLAS-TDR-016.
[47] S. George. The ATLAS b-physics trigger. Technical report, 2004. ATL-DAQ-
2004-004.
[48] The ATLAS Coll. Level-1 trigger technical design report. Technical report,
1998. COM-PHYS-2004-053.
[49] Virtex-5 user guide.
http://www.xilinx.com/support/documentation/user_guides/ug190.pdf.
[50] http://www.national.com/ds/ds/ds90lv031a.pdf.
[51] http://www.national.com/ds/ds/ds90lv032a.pdf.
[52] R. Culbertson M. Dell’Orso H. Frisch P. Giannetti E. Meschi T. Nakaya G.
Punzi L. Ristori M. Shochet F. Spinella P. Wilson S. Belforte, J. Berryhill
and A. Zanetti. Specification of the XTRP, SVT, and level 2 interfaces.
CDF/DOC/TRIGGER/CDFR/4578, February 2000.
[53] Roger J.N. Phillips Vernon D. Barger. Collider Physics - Updated Edition.
Addison-Wesley, 1997.
[54] G. Volpi. Rare decays of B mesons and baryons at the Tevatron and
the LHC. PhD thesis, Facoltà di Scienze Matematiche Fisiche e Naturali -
Università degli Studi di Siena, 2008.
[55] Sullivan G. Toback D. Wahl J. Wilson P. Campbell M. Frisch H., Shochet M.
Conceptual design of a deadtimeless trigger for the CDF trigger upgrade.
CDF/DOC/TRIGGER/CDFR/2038, December 1994.
[56] ATLAS collaboration. Atlas technical paper.
https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasTechnicalPaper.
