Development of readout electronics for the ATLAS tile calorimeter at the HL-LHC by Carrió Argos, Fernando
Development of Readout Electronics for
the ATLAS Tile Calorimeter
at the HL-LHC
Department of Electronic Engineering
Doctoral Programme in Electronic Engineering
DOCTORAL THESIS
Fernando Carrió Argos
SUPERVISOR
Dr. Alberto Valero Biot
Valencia, May 2017
A dissertation submitted to the University of València
for the degree of Doctor of Philosophy.
Declaration
This dissertation is the result of my own work, except where explicit
reference is made to the work of others. It has not been submitted for
another qualification to this or any other university.
Fernando Carrió Argos
Acknowledgements
Many people contributed to the success of this thesis and I would like to thank
all of them.
First of all, I would like to express my gratitude to my advisor Dr. Alberto
Valero for the guidance and suggestions offered during all these years at IFIC
and CERN. He was always accessible and willing to help me even during one of
the most important periods of his life (congratulations on your baby).
Thanks to my friends and colleagues from the IFIC TileCal group: Luca,
Juan, Damián, Leonor, Sergi, Yesenia, Pedro, and Paco. You are an incredible
team that makes this group feel like family. My thanks also go to
Professors Antonio Ferrer, Victoria Castillo and Emilio Higón for all the
support and advice, especially during my first months at CERN. I would like to
thank all of you for giving me the opportunity to collaborate on this fascinating
project.
This period was most probably not as hard as it could have been, thanks to my
friends from the University of Valencia and IFIC: David, Javier Navarro, Ramón,
Gessamí, Xavier and Javier Collado. Thank you so much for the coffees and
conversations which saved me in the worst moments.
Of course, none of this work would have been possible without the hard work
of a great community: the TileCal collaboration. I want to thank Christian
Bohm, Steffen Muschter and Eduardo Valdés from Stockholm University for
the time spent discussing firmware, clock domains and remote programming.
Mark Oreglia, Kelby Anderson and Fukun Tang from the University of Chicago,
thank you for sharing your invaluable experience with readout electronics and
hardware. Special thanks to Alexander Paramonov from Argonne National
Laboratory for the fruitful discussions during the coffees at the testbeam and
for all the support during the writing of this document. I am also grateful to
Giulio Usai from the University of Texas for being an excellent leader in the projects
in which we worked together. I hope we can continue discussing for a long time
in the increasingly longer upgrade meetings.
I would also like to thank Carlos Solans, Júlio Viera and Pablo Moreno, with
whom I shared most of my time working on the MobiDICK project at CERN.
I am grateful to Manoel Barros and Diego Barrientos for the endless
discussions about the technical details of digital electronics and printed
circuit board design. I learned a lot from you!
Finally, I would like to thank my family and Anna for the unconditional
support they have given me during these years and, especially, during the
last months of this thesis. Without them this thesis would not be possible.
To all of you: THANK YOU VERY MUCH!
Preface
The Large Hadron Collider (LHC) is one of the largest particle accelerators in the
world. It has been used to explore energy frontier physics since 2010, with a
collaboration composed of more than 7,000 scientists from 60 different countries.
After a major upgrade that will occur in the 2020s, the LHC will become the
High Luminosity LHC (HL-LHC). The HL-LHC will increase the instantaneous
luminosity by a factor of 5 compared to the LHC. The integrated luminosity of
the HL-LHC program will be 10 times the integrated luminosity of the LHC.
The HL-LHC R&D efforts involve a large community in Europe, but also in
the US and Japan. The design of the HL-LHC and the consequent upgrade of the
experiments represent an exceptional technological challenge.
New accelerator technologies are under development, such as superconducting
magnets and cavities, as well as high-throughput electronics to receive and
process the extraordinary amount of data generated by the experiments. In
addition, the new readout and trigger architecture planned for ATLAS at the
HL-LHC requires a complete redesign of the front-end and back-end electronics
systems to cope with the new requirements in radiation levels, data bandwidth
and clock distribution.
This thesis focuses on the development of readout electronics for the
ATLAS experiment at the HL-LHC, particularly on the design of the Tile
Preprocessor (TilePPr) prototype envisaged for the readout of the Tile
Calorimeter and communication with the ATLAS trigger system.
Chapters 1 and 2 present an introduction to the LHC and HL-LHC
experiments, followed by an extensive review of the Tile Calorimeter and the
plans for the ATLAS Phase II Upgrade for the HL-LHC.
The TilePPr prototype hardware design is fully described in Chapter 3,
followed by the results of signal integrity simulations that confirmed the correct
design of the PCB. At the end of the chapter some experimental results obtained
during the initial tests with the first prototypes are presented.
Chapter 4 describes all the firmware developments implemented for the
operation of the Demonstrator module in the TilePPr prototype and in the
DaughterBoard. This chapter includes a detailed description of all the firmware
blocks designed for the front-end and back-end electronics, focusing on the
development of high-speed data links with fixed and deterministic latency.
Chapter 5 presents the development of FPGA-based circuits for the precise
measurement of phase differences between clocks. A phase measurement circuit,
called OSUS, based on oversampling techniques is discussed. The experimental
results with the OSUS circuit obtained from its implementation in the TilePPr
prototype are presented here. The OSUS circuit permits the synchronization of
the Demonstrator module and the LHC clock, as well as the monitoring of the
phase stability of clocks with a precision of about 30 ps RMS.
Chapter 6 includes a description of the testbeam setup and some of the
experimental physics results obtained. During these testbeam campaigns the
TilePPr prototype was the main readout system in the back-end electronics
operating the Demonstrator module.
Finally, the conclusions and future plans for this work are given at the end
of this document.
Contents
1 The Large Hadron Collider
1.1 Introduction
1.1.1 LHC experiments
1.2 The ATLAS experiment
1.2.1 Trigger and Data AcQuisition system
1.3 The Tile Calorimeter
1.3.1 Optics
1.3.2 Front-end electronics
1.3.3 Back-end electronics
1.4 Data flow of the TileCal readout chain
2 ATLAS Upgrades for HL-LHC
2.1 TDAQ architecture
2.2 Tile Calorimeter Upgrade
2.2.1 Front-end electronics
2.2.2 Power supplies
2.2.3 Back-end electronics
3 Design of the TilePPr prototype
3.1 Specifications of the TilePPr prototype
3.2 Components and functionality
3.2.1 Field Programmable Gate Arrays
3.2.2 TTC receiver block
3.2.3 Clocking unit
3.2.4 Configuration
3.2.5 Power distribution
3.2.6 Other components
3.3 Physical design and PCB layout
3.3.1 Stack-up
3.3.2 Signal integrity studies
3.3.3 IR drops
3.4 Post-layout simulations
3.4.1 Insertion and return losses
3.5 Characterization tests
3.5.1 Introduction to jitter
3.5.2 Eye diagrams
4 Integration of the TilePPr in the Demonstrator
4.1 GBT protocol
4.2 Tile GBT-FPGA IP core
4.2.1 Front-end Tile GBT links
4.2.2 Back-end Tile GBT links
4.3 Data format
4.3.1 Downlink data format
4.3.2 Uplink data format
4.4 Front-end electronics firmware
4.4.1 Data Packer
4.4.2 MainBoard Interface module
4.4.3 Charge Injection System block
4.4.4 Integrator block
4.4.5 GBTx configuration module
4.4.6 Monitoring block
4.4.7 DCS module
4.4.8 ADC block
4.5 Back-end electronics firmware
4.5.1 IPbus
4.5.2 Link Controller
4.5.3 Encoder
4.5.4 DCS FSM
4.5.5 TTC module
4.5.6 Decoder
4.5.7 Integrator Readout block
4.5.8 Pipeline module
4.5.9 Readout interfaces
4.5.10 Latency block
4.6 Data path delays and latency measurements
4.6.1 Digital path latency measurements
4.6.2 ADC interface path latency
5 Clock distribution in the Tile Calorimeter
5.1 Current clock distribution architecture
5.2 Clock distribution architecture in the HL-LHC
5.2.1 Synchronization of the Demonstrator module
5.3 DMTD method
5.3.1 Digital approximation of the DMTD method
5.4 OverSampling to UnderSampling method
5.4.1 Performance of the OSUS circuit
5.5 Implementation of the OSUS circuit
5.5.1 Synchronization of the Demonstrator with the TTC system
5.5.2 Studies on clock stability
6 Testbeam setup and results
6.1 Introduction
6.2 Testbeam setup
6.2.1 Beam elements
6.3 Clock distribution
6.4 Data acquisition
6.5 Calibration systems
6.5.1 Pedestal and linearity runs
6.5.2 Charge Injection System
6.5.3 Cesium scans
6.6 Demonstrator physics program
6.6.1 Data quality
6.6.2 Offline data analysis
7 Conclusions
8 Summary
8.1 Introduction
8.1.1 ATLAS experiment
8.1.2 TileCal hadronic calorimeter
8.1.3 Upgrades of the ATLAS experiment and the High Luminosity LHC
8.1.4 Demonstrator project
8.2 TilePPr prototype
8.3 Objectives
8.4 Methodology
8.5 Conclusions
List of Acronyms
Bibliography
List of Figures
List of Tables
Chapter 1
The Large Hadron Collider
1.1 Introduction
The Large Hadron Collider (LHC) [1] is the world’s largest and most powerful
particle accelerator. It is installed and operated at the European Organization
for Nuclear Research (CERN) in a circular tunnel 27 km in circumference and
100 m underground, crossing the border between France and Switzerland, close
to Geneva. The LHC is composed of superconducting magnets and is designed to
collide proton beams at a center-of-mass energy of $\sqrt{s} = 14$ TeV,
delivering an instantaneous luminosity of
$L = 1 \times 10^{34}$ cm$^{-2}$ s$^{-1}$.
The LHC is the last stage of a series of accelerators that progressively
increase the energy of the proton beams. Figure 1.1 shows a diagram of CERN’s
accelerator complex and how the accelerators are interconnected. Linac2 and
the PS Booster compose the first stages of the accelerator complex, where the
proton beam reaches an energy of 1.4 GeV. Then, the beam is injected into the
Proton Synchrotron (PS), which accelerates the beam up to 25 GeV. The Super
Proton Synchrotron (SPS) receives the protons from the PS and accelerates them
to 450 GeV before injecting them into the LHC. Finally, in the LHC the proton
beams are accelerated to their maximum energy in two separate beam pipes, where
the beams travel in opposite directions before colliding.
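The energy gain contributed by each stage of the injection chain can be tabulated with a short sketch. The stage energies are the ones quoted above; the 7 TeV per-beam value for the LHC is an assumption derived from the 14 TeV design center-of-mass energy being shared equally by the two beams, and the names `CHAIN_GEV` and `stage_gains` are purely illustrative.

```python
# Proton energies at the exit of each stage, in GeV, as quoted in the text.
# Assumption: the LHC per-beam energy is half of the 14 TeV design
# center-of-mass energy (7 TeV per beam).
CHAIN_GEV = [
    ("Linac2 + PS Booster", 1.4),
    ("PS", 25.0),
    ("SPS", 450.0),
    ("LHC", 14000.0 / 2),
]


def stage_gains(chain):
    """Return (stage, energy, gain factor over the previous stage)."""
    gains = []
    previous = None
    for name, energy in chain:
        factor = None if previous is None else energy / previous
        gains.append((name, energy, factor))
        previous = energy
    return gains


if __name__ == "__main__":
    for name, energy, factor in stage_gains(CHAIN_GEV):
        suffix = "" if factor is None else f" (x{factor:.1f} over previous stage)"
        print(f"{name:>20}: {energy:8.1f} GeV{suffix}")
```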
Figure 1.1: CERN’s accelerator complex.
1.1.1 LHC experiments
Seven experiments installed around the LHC analyze the particles produced in
the collisions. The LHC collides the proton beams at four interaction points
around the accelerator ring, corresponding to the locations of the main
experiments: ATLAS (A Toroidal LHC ApparatuS) [2], CMS (Compact Muon
Solenoid) [3], ALICE (A Large Ion Collider Experiment) [4] and LHCb (Large
Hadron Collider beauty) [5].
The two largest experiments, ATLAS and CMS, are multi-purpose
experiments located on opposite sides of the LHC ring. ATLAS and CMS were
designed and optimized to measure the properties of the strong and electroweak
forces with high precision at the TeV scale, and to search for new physics
beyond the Standard Model. In 2012, both experiments announced the discovery of
the Higgs boson [6] with a mass around 125 GeV. LHCb studies B-physics
and CP violation, whereas the ALICE experiment investigates the quark-gluon
plasma through heavy-ion collisions.
The other three experiments at the LHC are much smaller in size. The
TOTEM (TOTal Elastic and diffractive cross-section Measurement) [7]
experiment is near the CMS detector and performs high-precision measurements of
the proton size and the LHC luminosity. LHCf (LHC forward) [8] is located
near the ATLAS detector and studies particle production in the forward
region of the collisions as a simulation of cosmic rays in laboratory
conditions. The most recently approved LHC experiment is MoEDAL (Monopole and
Exotics Detector at the LHC) [9]. MoEDAL is installed in the walls of the LHCb
cavern and searches directly for the magnetic monopole.
1.2 The ATLAS experiment
The ATLAS experiment [2] is a general-purpose detector designed to study
the products of p-p collisions at the LHC. The ATLAS detector is the largest
detector in the LHC and is about 45 meters long, more than 25 meters high,
and has an overall weight of approximately 7,000 tons.
Figure 1.2: The ATLAS experiment.
The two proton beams collide at the center of the ATLAS detector, producing
particles in all directions. The ATLAS experiment is composed of different
sub-detectors to measure the different types of particles emerging from the
collisions: the Inner Detector (ID), the electromagnetic and hadronic
calorimeter systems, and the Muon Spectrometer (MS).
Built around the beam pipe, the ID is located at the innermost part of ATLAS.
It was designed for tracking and vertexing by measuring the trajectories of
the charged particles generated in the collisions. A solenoidal magnet
surrounds the ID, generating a magnetic field of 2 T which bends the
trajectories of the charged particles. The curvature of the trajectories is
used to calculate the particle momentum. Surrounding the ID, the Liquid Argon
(LAr) and Tile (TileCal) calorimeters measure the deposited energy and
reconstruct the direction of different types of particles. The outermost layer
of ATLAS is composed of the MS and a toroid magnet system, where a muon tracking
system measures the trajectories of the muons beyond the calorimeters. Three
superconducting air-core toroid magnets surrounding the ATLAS detector
generate a field of 0.5 T on average to bend the trajectories of charged
particles. The ATLAS detector provides high-precision measurements of different
types of particles and processes.
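The link between track curvature and momentum can be illustrated with the textbook relation p_T [GeV/c] ≈ 0.3 · |q| · B [T] · R [m] for a charged particle moving perpendicular to a uniform magnetic field. The sketch below is only that idealized relation evaluated for the 2 T solenoid field quoted above; it is not the ATLAS track reconstruction, and the function names are hypothetical.

```python
# Conversion constant: speed of light in units that give GeV/c from T * m.
GEV_PER_TESLA_METER = 0.299792458


def transverse_momentum_gev(b_tesla, radius_m, charge=1):
    """p_T (GeV/c) of a particle with |charge| e bending on a circle of
    radius `radius_m` in a uniform field `b_tesla` (idealized case)."""
    return GEV_PER_TESLA_METER * abs(charge) * b_tesla * radius_m


def bending_radius_m(pt_gev, b_tesla, charge=1):
    """Inverse relation: bending radius (m) for a given p_T (GeV/c)."""
    return pt_gev / (GEV_PER_TESLA_METER * abs(charge) * b_tesla)


if __name__ == "__main__":
    solenoid_field_t = 2.0  # ID solenoid field quoted in the text
    for pt in (1.0, 10.0, 100.0):
        radius = bending_radius_m(pt, solenoid_field_t)
        print(f"p_T = {pt:6.1f} GeV/c -> bending radius = {radius:8.2f} m")
```

Higher-momentum tracks bend less, which is why the momentum resolution of a tracker degrades as p_T grows.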
The design requirements of the ATLAS detector include the following aspects:
• Very good electromagnetic calorimetry for electron and photon separation
and measurement, complemented by a full coverage hadronic calorimetry
for accurate jet and missing transverse energy ($E_T^{miss}$) measurements.
• High-precision muon measurements, with the capability to guarantee
accurate measurements at the highest luminosity using the external muon
spectrometer alone.
• Efficient tracking at high luminosity for high-$p_T$ lepton-momentum
measurements, electron and photon identification, τ-lepton and heavy-flavour
identification, and full event reconstruction capability at low luminosity.
• Triggering and measurement of particles at low $p_T$, providing high
efficiencies for most physics processes of interest at LHC.
• Large acceptance in pseudorapidity (η) with almost full azimuthal angle
(φ) coverage. The azimuthal angle is measured around the beam axis,
while the pseudorapidity is measured with respect to the plane
perpendicular to the beam line and derived from the polar angle (θ):
$$\eta = -\ln\left(\tan\left(\frac{\theta}{2}\right)\right) \qquad (1.1)$$
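Equation (1.1) and its inverse can be evaluated numerically with a minimal sketch such as the one below. The function names are illustrative, and angles are assumed to be in radians with 0 < θ < π.

```python
import math


def pseudorapidity(theta):
    """Pseudorapidity from the polar angle theta (radians), Eq. (1.1)."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse of Eq. (1.1): polar angle (radians) for a given eta."""
    return 2.0 * math.atan(math.exp(-eta))


if __name__ == "__main__":
    for eta in (0.0, 1.0, 1.7):
        theta_deg = math.degrees(polar_angle(eta))
        print(f"eta = {eta:3.1f} -> theta = {theta_deg:6.2f} deg")
```

A particle emitted perpendicular to the beam line has η = 0, and |η| grows without bound as the direction approaches the beam axis.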
1.2.1 Trigger and Data AcQuisition system
The current ATLAS trigger system [10] is composed of three levels of event
selection. While the Level 1 (L1) trigger system is completely based on custom
hardware designed for the ATLAS detector, the Level 2 (L2) and the Event
Filter (EF) levels are largely based on Commercial Off-The-Shelf (COTS)
components. Each trigger level refines the event selection that the previous
level provided, thus reducing the trigger rate. A schematic of the ATLAS
Trigger and Data AcQuisition (TDAQ) system is shown in Figure 1.3.
Figure 1.3: ATLAS trigger and data acquisition system.
The L1 decision is based on reduced-granularity information provided by
the Resistive Plate Chambers (RPC) and Thin-Gap Chambers (TGC) for
high-$p_T$ muons, and by the calorimeters for electromagnetic clusters, jets,
τ-leptons, $E_T^{miss}$ and large $E_T$. The L1 trigger reduces the event rate
from 40 MHz to a maximum of 100 kHz. The L1 decision must arrive at the readout
electronics in less than 2.5 µs; meanwhile, the front-end electronics keep the
events in pipeline memories.
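The L1 latency budget translates directly into a minimum pipeline depth: every bunch crossing that occurs while the decision is pending must be held in memory. The sketch below is a back-of-the-envelope estimate using only the 40 MHz crossing rate (25 ns spacing) and the 2.5 µs latency quoted above; the names are illustrative and real pipelines may include additional margin.

```python
BUNCH_SPACING_NS = 25      # 40 MHz bunch-crossing rate
L1_MAX_LATENCY_NS = 2500   # 2.5 us latency budget quoted in the text


def min_pipeline_depth(latency_ns=L1_MAX_LATENCY_NS,
                       spacing_ns=BUNCH_SPACING_NS):
    """Smallest number of pipeline cells that covers the L1 latency."""
    return -(-latency_ns // spacing_ns)  # integer ceiling division


if __name__ == "__main__":
    print("Minimum pipeline depth:", min_pipeline_depth(), "bunch crossings")
```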
The L2 trigger decision is based on Regions of Interest (RoIs). The RoIs
are regions of the detector where the L1 has identified possible trigger objects
within the event; inside them, the L2 uses full-granularity and full-precision
data. The RoIs are stored in the Read Out Buffers (ROBs) until the L2 trigger
system processes them. The L2 trigger system uses the RoI information on
coordinates, energy and type of signature to reduce the amount of data, and
reduces the event rate below 3.5 kHz, with an average event processing time of
approximately 40 ms.
In the final trigger decision level, the Event Filter (EF) processes the
complete events built in the Event Builder (EB) system, and the selected events
are permanently stored in the CERN computer center for further physics analysis.
Although the EF was initially designed to reduce the output to about 200 Hz,
during Run 1 (2010-2012) the trigger event output was 800 Hz.
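The rate reduction achieved by each trigger level follows from the figures quoted above, as the sketch below shows. It also gives a rough estimate of how many events L2 handles concurrently, under the assumption (not stated in the text) that L2 receives events at the maximum L1 rate and that each event occupies L2 for the average 40 ms; all names are illustrative.

```python
# Event rates at the output of each stage, in Hz, as quoted in the text.
TRIGGER_RATES_HZ = [
    ("bunch crossings", 40_000_000),
    ("L1 output", 100_000),
    ("L2 output", 3_500),
    ("EF output (design)", 200),
]


def rejection_factors(rates):
    """Rate reduction achieved by each level relative to its input."""
    return [
        (rates[i][0], rates[i - 1][1] / rates[i][1])
        for i in range(1, len(rates))
    ]


def events_in_flight(input_rate_hz, processing_time_ms):
    """Average number of events processed at once (Little's law)."""
    return input_rate_hz * processing_time_ms / 1000


if __name__ == "__main__":
    for level, factor in rejection_factors(TRIGGER_RATES_HZ):
        print(f"{level:>20}: rate reduced by a factor of {factor:.1f}")
    # Assumption: L2 input rate equals the maximum L1 accept rate.
    print("Events in flight at L2:", events_in_flight(100_000, 40))
```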
In addition, the Data AcQuisition (DAQ) system provides infrastructure for
the configuration, control and monitoring of the ATLAS detector, while the
Detector Control System (DCS) supervises the detector services, such as power
supplies or gas systems.
1.3 The Tile Calorimeter
The Tile Calorimeter (TileCal) [2] [11] is a sampling calorimeter which uses
steel as absorber and scintillator tiles as active medium. It covers the region
|η| < 1.7, behind the liquid argon electromagnetic calorimeter. TileCal is
divided into a central Long Barrel, 5.6 meters in length, and two Extended
Barrels, each 2.6 meters in length. The radial depth of TileCal is approximately
7.4 λ (interaction lengths). Each barrel is azimuthally divided into 64 wedges
of size ∆φ ∼ 0.1, made of steel plates and scintillator tiles, with a total
weight of 2,600 metric tons for the complete detector.
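The quoted wedge size follows directly from dividing the full azimuth into 64 equal modules:

```latex
\Delta\phi = \frac{2\pi}{64} \approx 0.098 \approx 0.1
```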
The combination of scintillator tiles oriented radially and normal to the beam
axis with wavelength-shifting (WLS) fiber readout on the tile edges allows for
almost seamless azimuthal calorimeter coverage. The WLS fibers are grouped into
bundles defining 5,182 calorimetric cells. The fiber bundles are read out by
9,852 PhotoMultiplier Tubes (PMTs), providing an approximately projective
geometry in pseudorapidity. There is a gap region between the long and the
extended barrels, which is instrumented with special cells. The front-end
electronics and readout optics are highly integrated within the mechanical
structure of TileCal. The PMTs and all the readout electronics are housed in
aluminum units, called super-drawers, located at the outermost part of TileCal.
The front-end electronics also provide analog sums of channels from cells with
the same η coordinate, forming trigger towers which are the basis for the L1
trigger processing. The low-voltage power supplies of the front-end electronics
are mounted in an external steel box at one side of the super-drawer, which
contains the connections for power and other services. For calibration, the
calorimeter is equipped with a laser system, a Charge Injection System (CIS)
and a $^{137}$Cs radioactive source, which are employed to calibrate the
detector response to the electromagnetic scale with high precision. The
structure of the TileCal modules is depicted in Figure 1.4.
Figure 1.4: Structure of a TileCal module and main components.
1.3.1 Optics
The TileCal active medium is composed of scintillating tiles of eleven different
sizes [12], each 3 mm thick, with radial lengths ranging from 97 mm to 187 mm
and azimuthal lengths ranging from 200 mm to 400 mm; the different tile sizes
correspond to different radial depths. Ionizing particles crossing the tiles
induce the production of ultraviolet scintillation light in the polystyrene
base material of the tiles, and this light is subsequently converted to visible
light by wavelength shifting.
The tiles are surrounded by a plastic sleeve to protect the tile and improve
the scintillation light yield due to its high reflectivity of 95%. In addition, the
plastic sleeve contains a mask pattern to reduce the optical non-uniformity of
the tiles to a level below 5% for the sum of signals of both sides of the tile. The
WLS fibers are attached to the tile edges to collect the light produced in the
scintillators and shift its wavelength to a longer one. Each WLS fiber collects
light from tiles and routes it to the PMTs inserted into the super-drawers.
The WLS fibers are grouped together in bundles and coupled to the PMTs.
The fiber grouping defines a three-dimensional cell structure to form three radial
sampling depths, approximately 1.5, 4.1 and 1.8 λ thick at η = 0. These cells
have dimensions ∆η ×∆φ = 0.1 × 0.1 in the first two layers and 0.2 × 0.1 in
the last layer. The depth and η-segmentation of the barrel and extended barrel
modules are shown in Figure 1.5. Each tile is read out by two different PMTs
providing redundancy and sufficient information to partially equalize signals
produced by particles crossing the calorimeter at different positions.
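As a quick consistency check, the three sampling depths quoted at η = 0 add up to the total radial depth of approximately 7.4 λ given for TileCal in Section 1.3:

```latex
1.5\,\lambda + 4.1\,\lambda + 1.8\,\lambda = 7.4\,\lambda
```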
[Figure 1.5 image: map of the TileCal cells (A1-A16, B9-B15, BC1-BC8, C10, D0-D6 and E1-E4) along the beam axis (0 to 1500 mm scale), with lines of constant η from 0.0 to 1.6 and dimension marks at 2280 mm and 3865 mm.]
Figure 1.5: Segmentation in depth and η of the TileCal modules for half of a
long barrel (left) and for an extended barrel (right). The TileCal cell
distribution is symmetric with respect to the interaction point at the origin.
1.3.2 Front-end electronics
The Long Barrel and Extended Barrels are subdivided into four partitions (EBA,
LBA, LBC and EBC), as depicted in Figure 1.6. Each partition contains 64
super-drawers, for a total of 256 super-drawers. The front-end electronics [13]
and readout components are housed inside the super-drawers, while the rest
of the trigger and readout electronics are located off-detector in the ATLAS
counting rooms (USA15). Figure 1.7 depicts a block diagram of the TileCal
electronics.
Figure 1.6: Tile Calorimeter partitions. EBA and EBC partitions correspond to
the Extended Barrels and LBA and LBC partitions to the central Long Barrel.
Photomultiplier block
The PMT block is the key element in the readout chain, as it measures the
light produced by the scintillating tiles. It is composed of a mechanical
structure made of a steel cylinder and a mu-metal shield for magnetic shielding,
and contains a light mixer, a PMT, a high voltage divider and the 3-in-1 card
(Figure 1.8). The light mixer mixes the light from the readout fibers to ensure
uniform illumination of the photocathode. The PMT blocks are inserted into the
aluminum structure of the super-drawers, ensuring an accurate placement of the
light mixer and WLS fiber bundle for each tile. The main components of a PMT
block are the following:
• Photomultipliers: the PMT converts the light signal from the fiber bundles
into an electric charge. The Hamamatsu R5900 PMT was selected to read
out the tiles. This PMT has a compact size of 28 × 28 × 28 mm$^3$
and a dynode structure with 8 amplification stages.
Figure 1.7: Block diagram of the TileCal electronics.
• Light mixers: since the PMT response depends on the position of the
illuminated area on the photocathode surface, a light mixer is used to mix the
light coming from all the fibers in the bundle, so that there is no correlation
between the position of the fiber and the area of the photocathode receiving
the light.
• Magnetic shielding: the mu-metal and iron magnetic shielding of the PMT
must screen residual fields from the ATLAS solenoid and toroids that could
cause gain variations. It should provide protection against magnetic fields
of up to 500 gauss in any direction.
• HV dividers: the primary purpose of the divider is to partition the high
voltage between the dynodes of the PMT. The High Voltage (HV) divider
also serves as a socket that connects the PMT to the front-end electronics
without any interconnecting wires. This design minimizes the capacitance
between the PMT and the electronics, which is important to reduce noise and
avoid unreliable connections.
Figure 1.8: Scheme of the PMT block.
• 3-in-1 cards: these boards provide high-gain and low-gain shaped pulses
for the digitizer boards, the charge injection calibration system, and the slow
integration of the PMT signals for monitoring and calibration.
Digitizer system
The 3-in-1 cards amplify and shape the PMT signals, generating two output
signals, high gain and low gain, with a gain ratio of 64. The high- and
low-gain signals are transmitted to the Digitizer boards, where they are
digitized every 25 ns by 10-bit Analog-to-Digital Converters (ADCs). The
Digitizer board is equipped with two depth-configurable pipeline memories in
the TileDMU ASIC [14], which store the digitized data until the reception of an
L1 accept (L1A) signal. Each TileDMU ASIC receives the digitized data from three
channels, that is, six channels per Digitizer board. Upon reception of an L1A
signal, a data frame containing up to 16 samples is copied into the
derandomizer buffers in the Interface board for transmission to the ROD system
in the back-end electronics. To reduce the total data bandwidth during normal
operation, only one of the two gains and only 7 samples are read out. The
sampling clock is provided by the TTCrx ASIC [15] and can be adjusted
collectively for all ADCs in a Digitizer board in steps of 106 ps. This phase
adjustment is necessary to ensure that the central sample is near the pulse
peak. The motherboard also contains an analog part that provides a voltage
reference to the ADCs and two 8-bit Digital-to-Analog Converters (DACs) that
provide a pedestal for the AC-coupled inputs. Each super-drawer of the LB
contains up to 8 digitizer boards, whereas the EB modules contain 6 digitizer
boards.
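The numbers in this paragraph can be combined into a rough estimate of the dynamic range and of the raw sample payload per super-drawer. The sketch below is an upper-bound estimate under stated assumptions: every digitizer channel is read out, only ADC sample bits are counted (headers, checksums and link encoding are ignored), and the L1A rate is the 100 kHz maximum quoted in Section 1.2.1. The function names are illustrative.

```python
import math

ADC_BITS = 10            # ADC resolution quoted in the text
GAIN_RATIO = 64          # ratio between high and low gain
CHANNELS_PER_BOARD = 6   # two TileDMUs with three channels each
L1A_RATE_HZ = 100_000    # maximum L1 accept rate (Section 1.2.1)


def effective_range_bits(adc_bits=ADC_BITS, gain_ratio=GAIN_RATIO):
    """Dynamic range covered by the two gains together, in bits."""
    return adc_bits + math.log2(gain_ratio)


def payload_bits_per_event(n_boards, n_samples=7, n_gains=1):
    """Raw ADC sample bits per L1A for one super-drawer (no headers)."""
    return n_boards * CHANNELS_PER_BOARD * n_samples * n_gains * ADC_BITS


def payload_rate_mbps(n_boards, **kwargs):
    """Raw sample payload in Mb/s at the maximum L1A rate."""
    return payload_bits_per_event(n_boards, **kwargs) * L1A_RATE_HZ / 1e6


if __name__ == "__main__":
    print("Effective dynamic range:", effective_range_bits(), "bits")
    for label, boards in (("LB", 8), ("EB", 6)):
        print(label, payload_bits_per_event(boards), "bits/event,",
              payload_rate_mbps(boards), "Mb/s")
```

Reading out both gains with 16 samples would multiply the payload by more than a factor of four, which illustrates why normal operation is restricted to one gain and 7 samples.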
Interface board
The Interface board [16] is the digital link with the back-end electronics
system. Each module hosts one Interface board, which receives and distributes
the TTC signals to the electronics, collects and formats data from the digitizer
boards, and transmits the digitized data via an optical link to the ROD system.
The Interface board implements a redundant readout system with two output fibers
to reduce possible errors due to single-event upsets, although at any given time
only one of the two fibers is connected to the ROD system. Figure 1.9 shows a
block diagram of the Interface board.
Figure 1.9: Block diagram of the functional blocks and data flow of the Interface
board.
Adder board
The adder boards receive the analog signals from up to six 3-in-1 cards
composing a trigger tower and perform an analog sum. Two analog sums are
sent to the L1 trigger system. The first comprises the sum of all
the cells, while the second contains only the last-layer cell.
1.3.3 Back-end electronics
The back-end electronics system is installed in the counting rooms of the
cavern (USA15), located 70 meters away from the detector, and contains two
different sub-systems: the Read Out Driver (ROD) system and the Trigger,
Timing and Control (TTC) system [17]. The back-end electronics is organized in
four partitions, each dedicated to the readout of one Long or Extended Barrel
section. These units are physically split into different crates: a 6U Versa
Module Eurocard (VME) TTC crate and a 9U VME ROD crate.
Read Out Driver
The Read Out Driver [18] is the core element of the back-end electronics,
receiving data from 8 super-drawers through optical links. A total of 32 RODs,
divided among four VME crates (one per barrel), are needed to read out the
Tile Calorimeter, that is, eight RODs per rack. The ROD module is composed of a
ROD motherboard and four Processing Units (PU), each populated with two
commercial Digital Signal Processor (DSP) chips that process the data before
its transmission to the ATLAS DAQ system.
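The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures from the text (32 RODs, 8 super-drawers per ROD, four ROD crates); the function name `rod_system_summary` is illustrative and does not belong to any TileCal software.

```python
# Cross-check of the ROD counts quoted in the text.
N_RODS = 32                # RODs needed to read out TileCal
SUPER_DRAWERS_PER_ROD = 8  # optical inputs handled by each ROD
N_ROD_CRATES = 4           # VME ROD crates


def rod_system_summary(n_rods=N_RODS,
                       drawers_per_rod=SUPER_DRAWERS_PER_ROD,
                       n_crates=N_ROD_CRATES):
    """Return (super-drawers read out, RODs per crate)."""
    if n_rods % n_crates != 0:
        raise ValueError("RODs do not divide evenly among the crates")
    return n_rods * drawers_per_rod, n_rods // n_crates


super_drawers, rods_per_crate = rod_system_summary()
print(super_drawers, rods_per_crate)
```

With the quoted figures this gives 256 super-drawers and eight RODs per crate, which agrees with the 256 optical links of the current readout system mentioned in Section 2.2.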
In addition, one Trigger and Busy Module (TBM) is installed per ROD crate.
This 9U VME module receives and distributes the TTC signals from the local
TTC system to the RODs, and also gathers the busy signals from eight RODs
to provide a combined busy signal to the ROD Busy module.
Trigger, Timing and Control
The back-end TTC system is installed in four VME crates. Each TTC partition
contains a series of VME boards to handle the TTC information in the different
subsystems.
• Local Trigger Processor (LTP): the LTP receives the TTC signals from
the Central Trigger Processor (CTP) and distributes them to the TTCvi
module.
• LTP Interface (LTPI): the LTPI connects multiple LTP modules to the
CTP.
• TTC VME Bus Interface (TTCvi): the TTCvi provides the A and B
channel signals to the TTCex for their encoding and distribution to the
front-end electronics.
• TTC Emitter (TTCex): the TTCex converts the commands received from
the TTCvi to optical signals.
• TTC Optical Coupler (TTCoc): the TTCoc fans out the optical signals
to up to 320 different destinations.
• ROD Busy module: this module monitors the busy signal, and produces
the logical OR of the 16 busy input lines.
Other TileCal-specific modules are also present in the TTC crate. They are
primarily used for calibration purposes but also receive or handle TTC
information.
• Shaft module: it controls the calibration trigger requests. It is primarily
used to share the calibration requests during physics runs.
• TTC Receiver in PCIe Mezzanine Card (TTCpr) module: the TTCpr
provides TTC information to the TDAQ software for the calibration runs.
This board is attached to the Single Board Computer (SBC) of the TTC
crate.
• Laser Read Out Driver: this module provides information from the Laser
calibration system to the TDAQ software and distributes TTC signals to
the Laser system.
The TTC rack location was chosen to minimize the length of the TTC fibers
to the front-end crates and the associated contribution to the trigger latency,
understood as the time difference between the bunch crossing identification
(BCID) and the arrival time of the L1A signal at the front-end electronics
system. In addition, the programmable delay lines of the calibration boards are
configured to reproduce the timing of signals generated by particles
originating from the interaction point.
1.4 Data flow of the TileCal readout chain
The complete readout process is shown in Figure 1.10. It starts by collecting
the light generated by particles crossing the TileCal scintillating tiles and
routing it to the PMTs through WLS fibers. The PMTs convert the light into
an analog electrical pulse that is shaped and amplified by the 3-in-1 cards,
which output two copies of the analog signal with a gain ratio of 1:64. The
analog PMT signals are transmitted to the digitizer boards, where they are
digitized at the LHC frequency and stored in the configurable-depth pipeline
memories of the TileDMU. In parallel to this operation, the low gain analog
signals are summed in groups of five and the result is sent to the L1 trigger
system.
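The role of the pipeline memory can be illustrated with a ring buffer that is written once per bunch crossing and read back when a trigger accept arrives. This is only a behavioural sketch in Python, not the TileDMU logic: the class name `PipelineMemory`, the `latency` parameter and the centring of the frame on the triggered sample are assumptions made for illustration, while the default of 7 samples follows the normal-operation readout described for the Digitizer board.

```python
class PipelineMemory:
    """Ring buffer storing one digitized sample per bunch crossing."""

    def __init__(self, depth):
        self.depth = depth
        self.buffer = [0] * depth
        self.write_ptr = 0   # position of the next write
        self.written = 0     # total number of samples written so far

    def push(self, sample):
        """Store a new sample, overwriting the oldest one when full."""
        self.buffer[self.write_ptr] = sample
        self.write_ptr = (self.write_ptr + 1) % self.depth
        self.written += 1

    def read_frame(self, latency, n_samples=7):
        """Return n_samples in time order, centred on the sample written
        `latency` crossings before the most recent one."""
        if n_samples % 2 == 0:
            raise ValueError("n_samples must be odd to centre the frame")
        half = n_samples // 2
        oldest = latency + half   # age of the first sample in the frame
        newest = latency - half   # age of the last sample in the frame
        if newest < 0 or oldest >= min(self.depth, self.written):
            raise ValueError("requested frame is not in the pipeline")
        return [self.buffer[(self.write_ptr - 1 - age) % self.depth]
                for age in range(oldest, newest - 1, -1)]


pipeline = PipelineMemory(depth=64)
for bunch_crossing in range(100):
    pipeline.push(bunch_crossing)   # sample value = crossing number
print(pipeline.read_frame(latency=10))
```

In this example the most recent sample is crossing 99, so a latency of 10 crossings selects crossing 89 and the frame contains crossings 86 to 92. The pipeline depth must exceed the trigger latency plus half a frame, which is why the depth is configurable.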
[Figure 1.10 block labels: PMT, detector signals, 3-in-1 (gains 64 and 1),
Digitizer (ADC, ADC, PIPELINE), Σ analog trigger sums, Interface (FORMAT,
GLINK, OTx), to ROD.]
Figure 1.10: Block diagram of TileCal readout chain.
In the L1 trigger system, the CTP transmits the L1A signal through the TTC
network via the LTP to the front-end electronics, requesting the selected
events at a mean rate of 100 kHz. When the front-end electronics receive the
L1A signal, the TileDMU transmits the samples corresponding to the requested
BCID to the Interface board. Then, the Interface board builds a data fragment
containing the samples from all channels in the super-drawer and transmits it
to the ROD system in the back-end electronics. The data flow rate is controlled
using a busy feedback signal from the back-end electronics to the CTP. The busy
signal is generated by the ROD modules when their input buffers are full. This
signal is transmitted through the TBM and the ROD Busy module, which forwards
it to the CTP to indicate that no new events can be accepted.
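The busy mechanism amounts to a logical OR over the busy flags of the RODs. The following Python sketch models it with a bounded input buffer per ROD. The class `RodInputBuffer`, its `capacity` parameter and the function `combined_busy` are hypothetical names introduced here, and the buffer size is arbitrary.

```python
from collections import deque


class RodInputBuffer:
    """Bounded event buffer that raises a busy flag when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()

    @property
    def busy(self):
        return len(self.events) >= self.capacity

    def store(self, event):
        if self.busy:
            raise OverflowError("event received while busy")
        self.events.append(event)

    def process_one(self):
        """Remove and return the oldest event, or None if empty."""
        return self.events.popleft() if self.events else None


def combined_busy(busy_lines):
    """Logical OR of the individual busy lines."""
    return any(busy_lines)


rods = [RodInputBuffer(capacity=2) for _ in range(8)]
rods[3].store("event 0")
rods[3].store("event 1")                  # ROD 3 is now full
print(combined_busy(r.busy for r in rods))
rods[3].process_one()                     # space is freed again
print(combined_busy(r.busy for r in rods))
```

A single full ROD is enough to assert the combined signal, so the trigger is throttled until every buffer has room again.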
Chapter 2
ATLAS Upgrades for
HL-LHC
The High Luminosity upgrade of the Large Hadron Collider (HL-LHC) [19] is
planned for the Long Shutdown 3 (LS3) period from 2024 to 2026. The HL-LHC
will provide a nominal instantaneous luminosity of
$\mathcal{L} = 7.5 \times 10^{34}$ cm$^{-2}$ s$^{-1}$, 7.5 times the initial
design luminosity, with an average of 200 inelastic collisions per bunch
crossing. The HL-LHC will deliver an integrated luminosity of 300-350
fb$^{-1}$ per year with the goal of 4000 fb$^{-1}$ by 2035, about 10 times
the integrated luminosity reached with the LHC. The central activity at the
HL-LHC will be the measurement of the properties of the recently discovered
Higgs boson and, in particular, the studies of the Higgs coupling to the different
fermions and bosons, as well as the precise measurement of the trilinear Higgs
self-coupling through the observation of di-Higgs production. The HL-LHC will
produce high statistics data that will permit the study in detail of the Standard
Model and physics Beyond the Standard Model. A temporal overview of the
plans for the LHC evolution towards the HL-LHC is given in Figure 2.1.
The complete upgrade of the ATLAS detector [20] is planned in three different
phases corresponding to the three long shutdown periods. After LS3, the
ATLAS Phase II Upgrade will prepare the different sub-detectors for the
HL-LHC luminosity conditions. The pile-up of events per beam crossing in ATLAS
will increase from 20 to 200, requiring a finer granularity for the detectors
and a new TDAQ architecture able to handle the trigger rates and the amount of
generated data. Moreover, the increase in luminosity will require a new
front-end electronics system more tolerant to radiation.
Figure 2.1: LHC plan for the next ten years, with a series of shutdowns with
dedicated upgrades and increases in energy and luminosity.
Some detectors, such as the Inner Detector, the LAr Forward Calorimeter
and the Forward Muon Wheels, will be more affected by the radiation and will
require the complete replacement of the current detector and electronics. This
is not the case for the calorimeters and muon chambers, whose structures and
active materials do not need to be replaced. Only an upgrade of the front-end
and back-end electronics systems is required in order to cope with the new
radiation levels and data bandwidth.
2.1 TDAQ architecture
The TDAQ system will be completely redesigned for the HL-LHC to address
the performance requirements in combination with the increased trigger rates
and data volumes. The proposed TDAQ for the HL-LHC [21] consists of a
single-level hardware trigger stage, called Level-0 (L0) trigger system, and a
software system called Event Filter (EF). Figure 2.2 presents a block diagram
of the single-level TDAQ architecture for the HL-LHC.
The L0 trigger system receives information from the LAr and TileCal
detectors and from the muon system, reducing the trigger rate from 40 MHz to
1 MHz by applying hardware-based algorithms.
The calorimeters provide coarse granularity data to the L0Calo system to
identify electron, tau and jet candidates, and to calculate
$E_T^{\mathrm{miss}}$. In parallel, the L0Muon system receives data from all
muon subsystems and from the most external cells of TileCal to identify muons.
It also receives information from the Monitoring Drift Tubes (MDT) and RPCs to
improve the muon trigger coverage.
Figure 2.2: Block diagram of the single-level architecture envisaged for the
TDAQ at the HL-LHC [21].
The L0Calo and L0Muon systems provide trigger objects with reconstructed
energies and spatial locations to the Global Event, which combines them into
higher-level signatures. These signatures are then passed to the Level-0
Central Trigger Processor (L0CTP), which makes the Level-0 decision (L0A) based
on several parameters.
Regarding the readout path, all detectors transmit their data to a common
readout system called Front-End LInk eXchange (FELIX) [22]. This system
is used to transmit selected detector data to the Level-0 trigger system using
low latency point-to-point connections and also to interface the detector
electronics with the DCS and TTC systems. The FELIX system sends the event data
to the Data Handlers, where the data are reformatted and then buffered in the
Storage Handler before being transmitted to the EF for the application of the
last trigger algorithms. The Storage Handler buffers the event data while the
EF is processing the events between fills. In the last stage of the trigger
architecture, the EF makes the final trigger decision using software
algorithms, with an output rate of 10 kHz.
The TDAQ collaboration is also considering a second option: an architecture
with two hardware levels (Level-0 and Level-1), where the L0 trigger rate reaches
4 MHz and the Level-1 (L1) trigger system implements more complex algorithms
reducing the event rate to 800 kHz.
In order to fulfill the latency requirements imposed by the new TDAQ system
and to add a margin for future developments, the detectors will implement
large pipeline memories to store 10 µs of data for Level-0 and 35 µs for
Level-1 (in the case of a L0/L1 trigger architecture). The latency, trigger and
data rates between the detector readout and the trigger systems for both
architecture options are summarized in Table 2.1.
Parameter                            L0 schema    L0/L1 schema
L0/L1 trigger rate                   4 MHz        4 MHz / 800 kHz
L0/L1 latency                        10 µs        10 µs / 35 µs
Data rate to L0Calo and L0Muon       40.08 MHz    40.08 MHz
Latency data to L0Calo and L0Muon    1.7 µs       1.7 µs
Data rate to FELIX                   1 MHz        800 kHz
Latency data to FELIX                10 µs        35 µs
Table 2.1: Trigger parameters and readout data rates for the two proposed
TDAQ architectures.
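The latencies in Table 2.1 translate directly into pipeline depths, since one sample per channel is stored for every bunch crossing. The sketch below assumes a 40.08 MHz bunch-crossing clock (the data rate to L0Calo and L0Muon in the table) and rounds up to whole samples; the helper names `pipeline_depth` and `rejection_factor` are illustrative.

```python
import math

LHC_CLOCK_HZ = 40.08e6  # bunch-crossing frequency used in Table 2.1


def pipeline_depth(latency_s, clock_hz=LHC_CLOCK_HZ):
    """Number of samples per channel needed to cover a trigger latency."""
    return math.ceil(latency_s * clock_hz)


def rejection_factor(input_rate_hz, output_rate_hz):
    """Factor by which a trigger stage reduces the event rate."""
    return input_rate_hz / output_rate_hz


print(pipeline_depth(10e-6))          # Level-0 latency of 10 µs
print(pipeline_depth(35e-6))          # Level-1 latency of 35 µs
print(rejection_factor(4e6, 800e3))   # Level-1 stage in the L0/L1 option
```

Under these assumptions the pipelines need about 401 samples per channel for the Level-0 latency and about 1403 samples for the Level-1 latency, before any safety margin is added.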
2.2 Tile Calorimeter Upgrade
The motivation for the upgrade of the Tile Calorimeter is to fulfill the new
requirements set by the HL-LHC. The complete replacement of the readout
electronics is foreseen for the Phase II Upgrade [20] in order to meet the in-
creased radiation tolerance requirements and to be compatible with the TDAQ
architecture, providing more precise and higher granularity information to the
trigger systems.
In the proposed TDAQ architecture, the back-end electronics system will read
out the detector and transmit pre-processed data from cells to the trigger
system at the LHC frequency. In parallel to the data processing for the trigger
system, the data samples will be stored in pipeline memories until the reception
of a Level-0/Level-1 acceptance signal. A block diagram of the upgraded readout
architecture is shown in Figure 2.3.
[Figure 2.3 block labels: detector signals, PMT, signal conditioning and
digitizer (ADC), Daughter board (FPGA, GBTx, Format, QSFP), GBT data at
40 MHz, TilePPr (QSFP, PIPELINE, Signal Reco, TTC, DCS), Σ digital trigger
sums at 40 MHz to the Level-0 trigger, ~1-4 MHz to FELIX.]
Figure 2.3: Block diagram of the TileCal readout architecture for the Phase II
Upgrade.
The new TDAQ architecture requires a considerable increase in bandwidth
and number of links between the front-end and back-end electronics systems,
in addition to the use of high-speed links between the back-end electronics and
trigger systems. While with the current system the detector is read out through
256 optical links with a bandwidth of 800 Mbps for each link, the new readout
architecture requires 4096 optical links (including redundancy) transmitting at
9.6 Gbps per link.
Table 2.2 summarizes the main features of the current and the Phase II
Upgrade readout systems.
                            Current           Phase II
Downlinks
  Nbr. of links             256               2048
  Link bandwidth            80 Mbps (TTC)     4.8 Gbps
Uplinks
  Nbr. of links             256               4096
  Link bandwidth            800 Mbps          9.6 Gbps
Nbr. of readout boards      32 (ROD)          32 (TilePPr)
Nbr. of crates              4 (VME)           4 (ATCA)
BW to DAQ per module        3.2 Gbps (ROS)    40 Gbps (FELIX)
BW to Trigger per module    Analog            ∼500 Gbps
Table 2.2: Comparison between the current and Phase II readout systems.
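The scale of the change in uplink capacity becomes clearer when the link counts and line rates of Table 2.2 are multiplied out. The sketch below computes raw aggregate line rates only: it ignores protocol overhead and the fact that the Phase II link count includes redundancy, so the result is an upper bound rather than a payload bandwidth. The function name is illustrative.

```python
def aggregate_bandwidth_gbps(n_links, link_rate_gbps):
    """Total raw bandwidth of a set of identical links, in Gbps."""
    return n_links * link_rate_gbps


current_uplink = aggregate_bandwidth_gbps(256, 0.8)    # 800 Mbps links
phase2_uplink = aggregate_bandwidth_gbps(4096, 9.6)    # 9.6 Gbps links

print(f"current:  {current_uplink:.1f} Gbps")
print(f"Phase II: {phase2_uplink / 1000:.1f} Tbps")
print(f"ratio:    {phase2_uplink / current_uplink:.0f}x")
```

With the table values the raw uplink capacity grows from about 205 Gbps to about 39 Tbps, a factor of 192.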
The Demonstrator module
In parallel with the development of the readout electronics for the HL-LHC,
the Demonstrator project aims to evaluate and qualify the upgraded trigger and
readout electronics before the complete replacement of the electronics in the
Phase II Upgrade.
A Demonstrator module containing the upgraded front-end electronics was
developed in the framework of the project. It combines the Phase II readout
electronics with the legacy analog interface for the L1 trigger system, since
the fully digital trigger system will not be installed before the Phase II
Upgrade. In the back-end electronics, a Tile PreProcessor (TilePPr) prototype
will read out and operate the Demonstrator module. The TilePPr will store the
samples in pipeline memories and transmit L1-selected data to the ROD modules,
keeping backward compatibility with the current DAQ system. The Demonstrator
module will replace one of the super-drawers in the ATLAS experiment. Its
installation is foreseen during one of the short LHC shutdowns planned for
Run 2.
2.2.1 Front-end electronics
The front-end electronics comprise the set of electronic equipment, boards and
devices dedicated to the data readout and operation of the PMTs which are
hosted in the TileCal modules. The electronics will be housed inside four
aluminum structures per module, called minidrawers, placed in the outermost
part of the detector. The mechanical design of the super-drawers has been
modified to organize the front-end electronics in four independent modules,
improving the access and serviceability and reducing the impact of
single-point failures. Figure 2.4 shows the mechanical structure of the
TileCal detector and how the super-drawers have been organized in minidrawers.
As shown in Figure 2.5, each minidrawer houses up to 12 PMTs together with
their corresponding Front-End Boards (FEB), one High Voltage board to
distribute power to the PMTs, one MainBoard (MB) which receives the PMT
signals and one DaughterBoard (DB) which interfaces with the back-end
electronics.
Figure 2.4: Picture of the TileCal modules and the minidrawers.
DaughterBoard
The DaughterBoard [23] provides a high-speed communication path between the
front-end electronics and the TilePPr modules in the counting rooms (USA15).
In the upgraded electronics, the DBs will be responsible for the reception and
execution of configuration commands for the front-end electronics, as well as
for collecting and transmitting the digitized signals from the PMTs to the
TilePPr. In addition, the DBs will distribute the recovered LHC clock to the
FEBs for the digitization of the PMT signals. Figure 2.6 depicts a block
diagram with the main components of the DaughterBoard.
The DaughterBoard version 4 (Figure 2.7) was used during the 2015-2017
testbeam periods and will be integrated into the Demonstrator module. The
DB version 4 fulfills all the requirements of the HL-LHC and is compatible with
the three FEB options that will be presented in the next sections. The DB was
designed in two independent halves, corresponding to the A and B sides, where
each half serves 6 PMTs on one side of the minidrawer. The DB is populated
with two Xilinx Kintex 7 Field Programmable Gate Arrays (FPGA) connected
to two Quad Small Form-factor Pluggable (QSFP) modules. These connections
provide redundant high-speed communication with the back-end electronics.
Figure 2.5: Detailed drawing of the minidrawer indicating the position of the
different parts of the front-end electronics.
Each side of the DB includes one GBTx [24] chip, which recovers the
LHC clock and distributes it to the FPGA for the operation of the front-end
electronics and communication with the back-end electronics. In addition, the
GBTx provides remote reset and configuration capabilities. Both FPGAs
are connected to the GBTx through LVDS buffers forming two independent
JTAG chains. Each JTAG chain is closed by transmitting the TDO signal to the
back-end electronics through the FPGA that is not being programmed, and
therefore only one FPGA can be programmed at a time. Although during normal
operation of the module the FPGAs will be configured from the on-board flash
memories, remote programming makes it possible to program the FPGAs if the
flash memory is damaged and to update the firmware version stored in the flash
memories.
A DB is connected to a MainBoard through a 400-pin FMC connector. This
connector provides a high-speed path to receive the PMT digitized data and to
configure and operate the FEBs. Regarding the power distribution, each side
of the DB is individually powered with 10 V from the MainBoard.
Different techniques will be implemented in the FPGA to prevent errors
due to Single Event Upsets (SEU) or Multi Event Upsets (MEU). One of these
techniques is memory scrubbing, in which the FPGA configuration memory
is continuously checked and corrected if possible. For those cases where the
memory integrity cannot be recovered, the DB will be remotely reset or
reprogrammed. Moreover, Triple Modular Redundancy (TMR) will be implemented
in firmware to reduce the likelihood of data errors due to SEU or MEU.
Figure 2.6: Block diagram of the DaughterBoard.
Figure 2.7: Picture of the DaughterBoard version 4.
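The two mitigation techniques named above can be illustrated in software. The Python sketch below is a conceptual model only, not the DaughterBoard firmware: `tmr_vote` implements a bitwise majority vote over three redundant copies, and `scrub` restores words that differ from a reference copy. Comparing against a stored reference is a simplification assumed here; the text does not describe how the real scrubber detects and corrects errors.

```python
def tmr_vote(copy_a, copy_b, copy_c):
    """Bitwise majority vote over three redundant copies of a word."""
    return (copy_a & copy_b) | (copy_a & copy_c) | (copy_b & copy_c)


def scrub(memory, reference):
    """Restore every word that differs from the reference copy.

    Returns the number of corrected words. Comparing against a stored
    reference is a simplification chosen for this sketch.
    """
    corrected = 0
    for address, expected in enumerate(reference):
        if memory[address] != expected:
            memory[address] = expected
            corrected += 1
    return corrected


# A single flipped bit in one copy is outvoted by the other two.
print(bin(tmr_vote(0b1011, 0b1011, 0b0011)))

# A flipped bit in the configuration memory is found and repaired.
reference = [0x12, 0x34, 0x56]
memory = list(reference)
memory[1] ^= 0x04
print(scrub(memory, reference), memory == reference)
```

The voter masks an upset in any single copy without interrupting the data flow, while scrubbing prevents upsets from accumulating in memory until two copies disagree with the third.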
MainBoard
The MainBoard [25] controls the FEBs and also provides a high-speed path to
transmit digitized PMT signals to the DaughterBoard. Depending on the FEB
option the MainBoard could also include circuitry to digitize the signals coming
from the FEBs. In this thesis, the MainBoard for the 3-in-1 option is covered
in detail since it is used in the Demonstrator module.
3-in-1 front-end boards
The upgraded 3-in-1 FEB [26] is a revised version of the current FEB installed
in ATLAS. This new FEB is composed of COTS components. It features a dynamic
range of 17 bits, better linearity and lower noise than the previous version.
The FEB comprises the fast signal processing chain, the slow signal processing
chain and the calibration circuitry. Figure 2.8 shows a picture of the
upgraded 3-in-1 card.
Figure 2.8: A 3-in-1 card designed for the HL-LHC.
Figure 2.9 presents a block diagram of the functional blocks of the 3-in-1
card. The fast signal processing chain includes a 7-pole passive LC shaper
that transforms a PMT pulse into a wider pulse. The wide pulse is amplified by
two clamping amplifiers with relative gains of 1 and 32, providing the
so-called low and high gain signals, respectively. The amplified signals are
routed to the MainBoard, where they are digitized with on-board dual-channel
12-bit ADCs. The slow signal processing chain includes a 6-gain programmable
slow integrator, which is used to monitor the average PMT currents for the
Cesium calibration of the detector and for the monitoring of the instantaneous
luminosity. The precise charge injection circuit is connected to the shaper.
The charge injection procedure is used to calibrate each readout channel and
the response of the amplifiers and ADCs, as well as the integrator circuit.
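A rough way to relate the two gains to the quoted 17-bit dynamic range is to add the ADC resolution to the base-2 logarithm of the gain ratio. This is a back-of-the-envelope estimate offered for orientation, not the definition used by the FEB designers, and the function name `dynamic_range_bits` is illustrative.

```python
import math


def dynamic_range_bits(adc_bits, gain_ratio):
    """Estimate the combined dynamic range of a two-gain readout.

    The high-gain branch resolves signals `gain_ratio` times smaller than
    the low-gain branch, which extends the ADC range by log2(gain_ratio).
    """
    return adc_bits + math.log2(gain_ratio)


print(dynamic_range_bits(adc_bits=12, gain_ratio=32))
```

For 12-bit ADCs and a gain ratio of 32 the estimate gives 17 bits, in agreement with the figure quoted for the upgraded 3-in-1 card.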
The 3-in-1 card was selected for the Demonstrator module since it is
backward-compatible with the present analog trigger. A 3-in-1 card can provide
analog signals to the L1 calorimeter trigger system (L1Calo). If inserted into
the ATLAS detector before the HL-LHC upgrade, the Demonstrator will provide
analog trigger sums to the L1Calo system.
Figure 2.9: Functionality of 3-in-1 card for the upgrade of the TileCal detector.
The 3-in-1 MainBoard circuit (Figure 2.10) is divided into four sections, each
controlled by an Altera Cyclone IV FPGA. A section contains the circuitry
required to control and read out three PMTs, for a total of twelve PMTs per
MainBoard. Each FPGA controls three dual-channel ADCs for digitizing the
PMT signals at 40 MSps, six DACs to control the bias voltage levels applied
to the ADC inputs and three ADCs for sampling the integrators at 50 kSps.
All the control and data lines are routed to the DaughterBoard via the FMC
connector. The Cyclone FPGAs are accessed from the DaughterBoard via an
SPI interface, while the digitized PMT signals are sent directly from the ADCs
to the DaughterBoard using LVDS lines at 560 Mbps. In addition, two I2C buses
(one per side) are dedicated to the readout of the integrator ADCs.
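The numbers in this paragraph fit together in a simple way, as the following sketch shows. It uses only figures from the text. Two points are interpretations rather than statements from the source: that each dual-channel ADC serves the two gains of one PMT, and that the 560 Mbps line rate corresponds to a fixed number of serial bits per 40 MSps sample. The text does not say how those bits are split between data and framing.

```python
SECTIONS_PER_MAINBOARD = 4      # one Cyclone IV FPGA per section
PMTS_PER_SECTION = 3
FAST_ADCS_PER_SECTION = 3       # dual-channel (assumed: one channel per gain)
SAMPLE_RATE_SPS = 40_000_000    # fast ADC sampling rate
LVDS_RATE_BPS = 560_000_000     # serial line rate to the DaughterBoard

pmts_per_mainboard = SECTIONS_PER_MAINBOARD * PMTS_PER_SECTION
fast_adc_channels = SECTIONS_PER_MAINBOARD * FAST_ADCS_PER_SECTION * 2
bits_per_sample, remainder = divmod(LVDS_RATE_BPS, SAMPLE_RATE_SPS)

print(pmts_per_mainboard)          # PMTs served by one MainBoard
print(fast_adc_channels)           # fast ADC channels on one MainBoard
print(bits_per_sample, remainder)  # serial bits available per sample
```

This gives 12 PMTs and 24 fast ADC channels per MainBoard, and exactly 14 serial bits per sample on each LVDS line.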
Figure 2.10: Picture of both sides of the MainBoard for the 3-in-1 FEB option.
In the same way as the DaughterBoard, this MainBoard is divided into two
halves called the A and B sides. Each side has its own power distribution, and
Schottky diodes connecting both sides prevent a loss of power if one of the
fLVPS bricks malfunctions. Even if one side of a minidrawer fails completely,
all PMTs on the opposite side can still be read out. The steel structure of
the TileCal modules acts as a good radiation shield, although the end of the
module close to the patch panel is exposed to a higher radiation dose.
A simulated map of the radiation dose in ATLAS is shown in Figure 2.11.
Based on this map, the voltage regulators and FPGAs were placed where the
radiation is lower.
Figure 2.11: Simulated radiation dose in ATLAS after 100 fb$^{-1}$, which is
1/40th of the total integrated luminosity expected for the HL-LHC [27].
QIE FEB
The core of the second FEB design is a Charge (Q) Integrator and Encoder
(QIE) chip [28]. This custom ASIC digitizes the analog PMT signals with
a constant resolution and no dead-time, covering a dynamic range of about
18 bits. Unlike the other two FEB designs, the QIE does not perform
any pulse shaping, but integrates the PMT current for every tick of the LHC
clock so pileup-related noise is reduced.
Figure 2.12 depicts a block diagram of the main blocks of the QIE. The first
stage of the QIE consists of a current splitter dividing the PMT current into
fixed fractions. The output of the splitter is then time-multiplexed between
four gated integrators at 40 MHz and digitized with a flash ADC. Due to its
pipelined operation, the total acquisition time of the QIE corresponds to a
latency of four clock cycles. In addition, the QIE includes a fixed-threshold
Time to Digital Converter (TDC) which is used to measure the time position of
the rising edge of the PMT pulses. The QIE FEB includes calibration circuitry
to calibrate the system with a Cesium source and current injection. The QIE
outputs data through 8 LVDS outputs at 80 Mbps. The data consist of a 9-bit
floating point word for the digitized charge, 2 bits to identify the gated
integrator, and 5 bits for the TDC data.
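The three fields add up to a 16-bit word per bunch crossing, which matches the capacity of 8 LVDS outputs at 80 Mbps for a 40 MHz clock (8 × 80 Mbps = 16 bits × 40 MHz). The sketch below packs and unpacks such a word. The field order (charge in the most significant bits, then the integrator identifier, then the TDC) is an assumption made for illustration, as are the function names; the text does not specify the bit layout.

```python
CHARGE_BITS, CAPID_BITS, TDC_BITS = 9, 2, 5
WORD_BITS = CHARGE_BITS + CAPID_BITS + TDC_BITS


def pack_qie_word(charge, capid, tdc):
    """Pack the three fields into one word (assumed layout, see lead-in)."""
    fields = (("charge", charge, CHARGE_BITS),
              ("capid", capid, CAPID_BITS),
              ("tdc", tdc, TDC_BITS))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (charge << (CAPID_BITS + TDC_BITS)) | (capid << TDC_BITS) | tdc


def unpack_qie_word(word):
    """Inverse of pack_qie_word: return (charge, capid, tdc)."""
    tdc = word & ((1 << TDC_BITS) - 1)
    capid = (word >> TDC_BITS) & ((1 << CAPID_BITS) - 1)
    charge = word >> (CAPID_BITS + TDC_BITS)
    return charge, capid, tdc


word = pack_qie_word(charge=300, capid=2, tdc=17)
print(WORD_BITS, hex(word), unpack_qie_word(word))
```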
[Figure 2.12 block labels: input from PMT; current splitter (outputs I0 to
I3); four gated integrators with range select (phases A to D), each providing
range bits and an analog output; clock and phase control (40 MHz clock;
integrate, reset, select); FADC; output register with ADC bits, CAPID bits and
range bits.]
Figure 2.12: Block diagram describing the operation of the main modules of the
QIE chip [27].
The QIE MainBoard is the least complex of the three types of MainBoards.
It handles point-to-point LVDS signals for control purposes and to read out
the QIE FEBs. Due to the limited number of pins in the FMC connector of the
DaughterBoard v4, the MainBoard includes CPLDs to multiplex the SPI bus
between the DB and the FEBs that is required to control the FEBs.
FATALIC FEB
The third FEB design relies on an ASIC called the Front end for ATLAS TileCal
Integrated Circuit (FATALIC) [29]. FATALIC includes a multi-gain current
conveyor which splits the input signal into three ranges with different gains
(1, 8, 64) covering the full dynamic range of the PMT signal. Each current
conveyor output is followed by a shaper and the readout chain is completed
with a 12-bit ADC operating at 40 MSps. The FATALIC chips transmit two gains:
the internal logic performs the auto-selection between the high and low gains,
while the medium gain is always transmitted. In addition, FATALIC includes a
slow integrator circuit to measure the minimum bias current of the PMT during
operation and to calibrate the detector with a Cesium source. The circuitry for
the CIS calibration is located in the FEBs.
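The auto-selection between gains can be sketched as follows. The selection criterion used here (send the high gain unless its ADC code has reached full scale, otherwise send the low gain) is an assumption made for illustration; the text only states that an automatic choice between high and low gain is made and that the medium gain is always sent. The function name and the saturation threshold are likewise hypothetical.

```python
ADC_FULL_SCALE = 4095  # largest code of a 12-bit ADC


def select_transmitted_gains(high, medium, low, saturation=ADC_FULL_SCALE):
    """Return the two (gain label, sample) pairs sent for one bunch crossing.

    The medium gain is always included. The high gain is preferred for its
    finer resolution and is replaced by the low gain once it saturates.
    """
    auto = ("low", low) if high >= saturation else ("high", high)
    return [("medium", medium), auto]


print(select_transmitted_gains(high=1600, medium=200, low=25))
print(select_transmitted_gains(high=4095, medium=4095, low=900))
```

Sending the medium gain in every crossing provides an overlap region in which the automatically selected branch can be cross-checked.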
The FATALIC MainBoard is based on the 3-in-1 MainBoard and includes four
Altera Cyclone IV FPGAs. Each FPGA is associated with three channels and
serializes the parallel data provided by the FATALIC FEBs, transmitting the
result to the DaughterBoard through the FMC connector. In addition, the
MainBoard provides the clock to the FEBs and interfaces with the DaughterBoard
to receive the configuration commands.
2.2.2 Power supplies
Low Voltage Power Distribution
The low voltage power is distributed to the front-end electronics using a
three-stage power distribution system. The upgraded low voltage schema is
largely based on the current version, owing to the reliable operation of the
current power supplies.
The 200 V power supplies installed in USA15 racks distribute power to the
finger Low Voltage Power Supplies (fLVPS) [30] [31] located at the extreme end
of each module. Each fLVPS includes 8 buck converters designed with COTS
components, called bricks, which deliver 10 V to the MainBoards and
DaughterBoards. Each minidrawer is powered by two bricks for redundancy,
using the diode-OR circuit in the MainBoard. Extra effort has been made to
make the fLVPS bricks radiation-tolerant, with special focus on SEU. The fLVPS
bricks are controlled and monitored through the DCS system.
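The redundancy scheme can be pictured with an idealized diode-OR: the higher of the two brick voltages reaches the load, reduced by the diode forward drop. The sketch below is a simplified model under stated assumptions. The 0.3 V forward drop is a typical Schottky value assumed here, not a figure from the text, and the function name is illustrative.

```python
BRICKS_PER_FLVPS = 8
BRICKS_PER_MINIDRAWER = 2


def diode_or_output(v_brick_a, v_brick_b, diode_drop=0.3):
    """Voltage after an ideal diode-OR of two supplies.

    The higher input wins; `diode_drop` is an assumed Schottky forward
    voltage, not a value taken from the text.
    """
    return max(max(v_brick_a, v_brick_b) - diode_drop, 0.0)


print(BRICKS_PER_FLVPS // BRICKS_PER_MINIDRAWER)  # minidrawers fed per fLVPS
print(diode_or_output(10.0, 10.0))                # both bricks healthy
print(diode_or_output(10.0, 0.0))                 # one brick failed
```

In this model the load voltage is the same whether one or both bricks are working, which is the property that lets a minidrawer survive the failure of a single brick. Eight bricks at two per minidrawer also match the four minidrawers of a module.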
The Point Of Load (POL) regulators of the front-end electronics constitute
the third stage of the LV power distribution system. The POL regulators convert
10 V to the voltages required by the different electronic components. The POL
regulators were selected to provide low noise and were qualified for operation
at the TileCal radiation levels.
High Voltage Power Distribution System
The High Voltage Power Distribution System (HVPS) controls and monitors
the voltages applied to the almost 10,000 PMTs via the DCS system. The
TileCal community is developing two different approaches for the high voltage
distribution: the HV remote [32] and the HV internal [27]. In both approaches
the high voltage power supplies are placed away from the detector in USA15.
The difference lies in the location of the high voltage regulation and control
circuits.
HV remote
In this approach, the control and monitoring electronics will be located away
from the detector, in USA15, and each PMT will receive high voltage indepen-
dently via two long wires. Individual wires will be combined into cables. This
implementation is shown in Figure 2.13.
Figure 2.13: Block diagram of the remote HV power distribution system [27].
The high voltage control circuit of the remote HVPS is an improved version
of the HVPS used during Runs 1 and 2. The first version is capable of
delivering HV to groups of 12 channels, corresponding to an entire minidrawer,
where each channel can be controlled individually.
Continuing with the development of the HV remote system, the local
electronics boards used to control and monitor voltages will be replaced with
commercial computers that interface with the HV remote system through Ethernet.
The second version of the HV remote system will be able to provide high voltage
to 24 channels per HV board, for a total of 512 boards for the complete
detector.
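The quoted total of 512 boards can be reproduced from channel counts given elsewhere in the text. The sketch below assumes 256 modules, a figure derived from the 32 RODs reading out 8 super-drawers each in Chapter 1, with four minidrawers per module and up to 12 PMTs per minidrawer. The resulting channel count exceeds the "almost 10,000" PMTs because it counts every available slot, not only the populated ones.

```python
MODULES = 256                  # assumed: 32 RODs x 8 super-drawers each
MINIDRAWERS_PER_MODULE = 4
PMT_SLOTS_PER_MINIDRAWER = 12  # "up to 12 PMTs" per minidrawer
CHANNELS_PER_HV_BOARD = 24

hv_channels = MODULES * MINIDRAWERS_PER_MODULE * PMT_SLOTS_PER_MINIDRAWER
hv_boards, leftover = divmod(hv_channels, CHANNELS_PER_HV_BOARD)
print(hv_channels, hv_boards, leftover)
```

Under these assumptions there are 12288 HV channels, which fill exactly 512 boards of 24 channels, that is, two boards per module.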
One of the advantages of the HV remote system is that no radiation-tolerant
electronics are needed and access to the HV boards for maintenance will be
easy, since the electronics will be located in USA15. The disadvantage is that
the system needs almost 10,000 long wires.
HV internal
The HV internal system implements a different schema where the high voltage
is sent directly to each module through a common high voltage cable. In the
front-end electronics, each DaughterBoard is connected to one HVOpto board
that regulates and monitors the voltage applied to 12 PMTs. Figure 2.14 shows
a block diagram of the HV internal system.
Figure 2.14: Block diagram of the internal HV power distribution system [27].
The HV bulk power supply in the counting rooms provides high voltage to
four minidrawers in parallel through a single coaxial cable. The core of the
HVOpto board is a Maxim Integrated MAX1329 chip which features 16 multiplexed
analog inputs connected to an ADC, 2 DACs and GPIOs. The DCS system sets
the high voltage individually for each PMT by transmitting commands to the
DaughterBoards through the TilePPrs. The DaughterBoards receive and store
the applied voltages digitized by the ADC of the MAX1329 chip. The MAX1329
chips are read out through an SPI interface. The DaughterBoards collect the
monitoring data and transfer them to the DCS system through the TilePPr.
The advantage of the HV internal option is the reduction in the number
of high voltage cables (where only one per four minidrawers is needed), at the
expense of a more difficult maintenance of the HV system due to the limited
access to the detector.
Active dividers
The passive HV dividers employed to distribute the high voltage to the PMT
dynodes were redesigned to fulfill the HL-LHC requirements. The legacy HV
dividers are implemented with a resistive network that keeps non-linearities
below 1% for the small anode currents generated at the current luminosity.
The new HV dividers [33] were designed with active components (transistors)
to provide a constant gain independently of the anode current at the HL-LHC,
where the anode currents will be larger for some cells due to the higher pulse
rates.
2.2.3 Back-end electronics
The upgraded back-end electronics system will be located, like the current
version, in USA15 racks, 70 meters away from the detector. The Phase II Tile
back-end electronics consists of two different systems: the Tile PreProcessor
modules and the Trigger and DAQ interface (TDAQi) boards. The Tile back-end
electronics system is based on the Advanced Telecommunications Computing
Architecture (ATCA) specifications [34]. The ATCA framework provides a
commercial, standardized platform for high-speed serial interconnects on the
backplane, supporting different I/O interfaces.
CHAPTER 2. ATLAS UPGRADES FOR HL-LHC
Tile PreProcessor module
The Tile PreProcessors [35] will be the first and main element in the back-end
electronics. They will be responsible for the reception and processing of the
digital data coming from the TileCal modules. They will also distribute the DCS
and TTC information for synchronization with the LHC clock to the front-end
electronics.
The TilePPr modules will transmit selected data, integrator data and DCS
information to the FELIX system. They will also provide reconstructed energy
and time per cell to the TDAQi boards. In the opposite direction, the FELIX
system will transmit the LHC clock, TTC commands and DCS configuration to the
TilePPr modules, which distribute them to the on-detector electronics.
The TilePPr module will be designed as a full-size ATCA blade following the
ATCA specifications, where the interfaces with the other ATCA boards in the
same shelf will be achieved through three backplane connectors in locations
called Zone 1 to 3.
• Zone 1 connector provides slow control paths for the shelf management
and the power connection.
• Zone 2 connector routes the different data paths between the TilePPr
module and the different ATCA blades connected in the same shelf.
• Zone 3 connector provides high-speed point-to-point connections to QSFP
connectors in the TDAQi to communicate with the FELIX system and to
the PreProcessing Trigger (PPT) FPGA.
The TilePPr module will be composed of an ATCA carrier board with four
Advanced Mezzanine Card (AMC) [36] slots to host the Compact Processing
Modules (CPM). The ATCA carrier will provide power to the CPMs, as well
as the basic interfaces for the communication with other systems through the
backplane.
The CPMs will be designed with a single AMC form factor and will be populated
with one FPGA and high-speed optical modules to implement all the
required functionalities. The FPGA and optical modules will be selected to
provide the CPM with the capability to read out and operate up to 8 minidrawers
(two of the current TileCal modules). Therefore, 32 TilePPr modules will be
required for the complete readout of the TileCal detector.
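The module count quoted above follows from simple arithmetic, which the short Python sketch below reproduces. The total of 256 TileCal modules is an assumption that is not stated in this section, and all the names are illustrative.

```python
# Illustrative arithmetic only; all names are hypothetical.
MINIDRAWERS_PER_MODULE = 4   # one TileCal module corresponds to four minidrawers
MINIDRAWERS_PER_CPM = 8      # each CPM reads out up to 8 minidrawers
CPMS_PER_TILEPPR = 4         # four AMC slots per ATCA carrier
TILECAL_MODULES = 256        # assumption: total number of TileCal modules


def tileppr_modules_needed(total_modules=TILECAL_MODULES):
    """Return the number of TilePPr blades needed to read out all modules."""
    modules_per_cpm = MINIDRAWERS_PER_CPM // MINIDRAWERS_PER_MODULE
    modules_per_tileppr = modules_per_cpm * CPMS_PER_TILEPPR
    # Ceiling division, so that a partially used blade still counts as one.
    return -(-total_modules // modules_per_tileppr)


print(tileppr_modules_needed())
```

Under these assumptions each TilePPr module serves eight TileCal modules, which is consistent with the 32 modules quoted in the text.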
During operation, the TilePPr modules will distribute the LHC clock, TTC
commands and DCS configuration to the front-end electronics. Furthermore,
the CPM FPGAs in the TilePPr module will compute the reconstructed energy
and time per cell and bunch crossing, transmitting them to the TDAQi board.
At the same time, the CPM FPGAs will store the received samples in pipelined
memories until the reception of an L0/L1 accept signal. When this happens,
the selected data will be extracted from the memories, formatted and transmitted
to the FELIX system. Figure 2.15 presents a block diagram of the TilePPr
and TDAQi boards for the Phase II Upgrade.
[Figure 2.15 block labels: SuperDrawer (x8), MPO12, MPO6, Patch Panel (x4),
Main FPGA (x4), AMC#1 to AMC#4, IPMC, Power Supply, ATCA Switch Interface,
Zone 1 to Zone 3, TilePPr, TileTDAQI, PP FPGA, POD, QSFP (FELIX), L0/L1 Calo,
L0/L1 Calo/Muon, L0 Muon.]
Figure 2.15: Block diagram of the final TilePPr and TDAQi designs.
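The pipelined storage described above can be pictured as a fixed-depth buffer from which a window of samples is extracted when an accept signal arrives. The Python sketch below is a software model only, not the firmware implementation: the pipeline depth, the window of seven samples and the class and method names are assumptions, and the wrap-around of the bunch-crossing counter is ignored.

```python
from collections import deque


class SamplePipeline:
    """Fixed-depth pipeline keeping the most recent samples of one channel."""

    def __init__(self, depth):
        # A bounded deque drops the oldest entry automatically when full.
        self.buffer = deque(maxlen=depth)

    def push(self, bcid, sample):
        """Store one sample for the given bunch-crossing identifier."""
        self.buffer.append((bcid, sample))

    def on_accept(self, bcid, n_samples):
        """Return n_samples consecutive samples starting at bcid.

        Returns None if part of the window has already left the pipeline
        or has not been received yet.
        """
        selected = [s for (b, s) in self.buffer if bcid <= b < bcid + n_samples]
        return selected if len(selected) == n_samples else None


pipeline = SamplePipeline(depth=16)
for bcid in range(40):
    pipeline.push(bcid, 100 + bcid)   # dummy ADC values

print(pipeline.on_accept(30, 7))      # window still inside the pipeline
print(pipeline.on_accept(5, 7))       # too late: samples were overwritten
```

The model makes the main constraint visible: the pipeline depth has to cover the latency of the accept signal, otherwise the requested samples are lost.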
Tile PreProcessor prototype
The TilePPr prototype is the first iteration of the final TilePPr module. It
is the main element of the back-end electronics system in the Demonstrator
project, providing compatibility with the upgraded front-end electronics system
and the current DAQ system. It is capable of operating up to four minidrawers
(one complete TileCal module), and therefore represents one eighth of the final
design.
The first prototype includes all the components required to receive and pro-
cess data from the Demonstrator module, as well as to decode and distribute
TTC signals to the front-end electronics for configuration and synchronization
with the LHC clock. It also interfaces with the DCS system to control and
monitor the high voltage distribution to the PMTs.
Furthermore, this prototype serves as a testbench for the development of
trigger pre-processing algorithms that will be implemented in the TDAQi board
after its installation in the HL-LHC.
The core processing of the TilePPr prototype relies on two high-performance
FPGAs connected to four QSFP optical modules. The TilePPr prototype was
designed as a double mid-size AMC board that can be operated in an ATCA
carrier or in a µTCA shelf. Chapter 3 is devoted to the TilePPr prototype
design, describing it in detail, and Chapters 4 and 5 cover the integration and
operation of the TilePPr within the Demonstrator module.
Trigger and DAQ interface
The Trigger and DAQ interface [37] board will be the interface between the
TilePPr module and the ATLAS TDAQ system. The TDAQi will be designed
as an ATCA Rear Transition Module (RTM) [38] and will be powered and
operated from the TilePPr module through the Zone 3 connectors.
The PPT FPGA in the TDAQi will compute trigger objects with differ-
ent granularity and energy resolution with the data provided by the TilePPr
modules. The trigger data will be transmitted to the trigger systems through
the optical links using a protocol with low, fixed and deterministic latency. In
addition, the TDAQi will provide a point-to-point path between the TilePPr
module and the FELIX system through a QSFP module placed in the TDAQi
board.
Chapter 3
Design of the TilePPr prototype
The TilePPr will be the first element and the main readout component of the
back-end electronics in the Tile Calorimeter after the Phase II Upgrade. It
will also provide the sampling clock and the configuration to the front-end elec-
tronics. This chapter gives a detailed description of the design of the TilePPr
prototype for the Demonstrator project. The system requirements, component
selection, PCB layout design and hardware test verifications are covered in this
chapter.
3.1 Specifications of the TilePPr prototype
The core of this thesis is the design of the first prototype of the TilePPr for
the HL-LHC. The designed prototype is a key element in the Demonstrator
project that aims to validate the proposed readout architecture for the Phase
II Upgrade, as has already been discussed in Chapter 2.
The TilePPr prototype represents a slice of the final TilePPr system, read-
ing out one complete Demonstrator module. The TilePPr prototype was used
for the readout and operation of the three different front-end board options (3-
in-1 cards, QIE and FATALIC) during three testbeam campaigns. More details
about the performance and functionalities of the TilePPr prototype during the
testbeam campaigns will be presented in Chapter 6. In addition, this prototype
is envisaged to be included in the current DAQ architecture to read out and op-
erate the Demonstrator module that will be inserted into the ATLAS experiment
before the Phase II Upgrade. When integrated within the DAQ architecture,
the TilePPr prototype will store the PMT digital samples in pipelines at the
LHC frequency. After reception of an L1A signal, selected data will be
transferred to the RODs with the appropriate format. Therefore, the Demonstrator
module will be transparent to the ATLAS DAQ system, emulating the current
front-end electronics functionalities.
The tasks and specifications that the TilePPr prototype has to
fulfill as part of the Demonstrator project are listed below.
• Readout and operation of one TileCal module with the Phase II front-end
electronics.
• Storage of the digital samples in pipeline buffers.
• Data formatting and transmission to the current RODs.
• Synchronization of the front-end and back-end electronics with the LHC
clock provided by the legacy TTC system through data links with fixed
and deterministic latency.
• Communication through Ethernet for slow control functionalities.
• Implementation of signal processing algorithms for energy and time recon-
struction.
• Communication with the upgraded trigger systems sending pre-processed
data for the trigger decision.
• Communication with the FELIX system.
3.2 Components and functionality
The TilePPr prototype was designed to fulfill all the functional requirements
described above. Its components were selected according to the hardware spec-
ifications for the front-end electronics and the overall ATLAS Trigger and DAQ
systems for the HL-LHC.
Figure 3.1 shows a block diagram of the main components and communication
paths of the TilePPr prototype.
[Figure 3.1 block labels: Xilinx Virtex 7, Xilinx Kintex 7, Xilinx Spartan 6,
MMC, DDR3 (x2), Flash mem. (x2), LMK03806B, CDCE62005 (x2), Si570 (x2),
QSFP (x4), MiniPOD TX, MiniPOD RX, SFP, PIN, TTC recovery, ETH PHY, UART (x2),
FMC connector, AMC connector (ports P0-1, P4, P8-9, P10, P17-20), Power monitor,
Temp. sensors, EEPROM. Signal types: MGT data, clocks, MGT clk, GbE, I2C, PCIe,
IPMI.]
Figure 3.1: General block diagram of the TilePPr prototype.
The following subsections provide a detailed description of the selected com-
ponents and functionalities.
3.2.1 Field Programmable Gate Arrays
The first components selected were the FPGA devices. At least two FPGAs were
needed in the TilePPr prototype: one for the readout of the Demonstrator module
and one for the communication with the trigger systems. The requirement
of having a high-speed readout data path with deterministic and low latency
made the use of high-performance FPGAs with embedded transceivers mandatory.
Xilinx 7 series FPGAs [39] complied with all the requirements and were
chosen for this reason.
The selection of the FPGAs was based on the number of transceivers and
logic capabilities, resulting in the selection of a Xilinx Virtex 7 for the operation
with the front-end electronics and a Xilinx Kintex 7 for the communication
with the trigger systems. Since many of the auxiliary components such as clock
distribution circuitry and power modules are shared between the two FPGAs,
a Xilinx Spartan 6 FPGA was included in the design for configuration and
monitoring.
Readout FPGA
The Readout FPGA is the core processing element of the TilePPr prototype.
The Readout FPGA is in charge of the readout and operation of the Demonstra-
tor module, interface with the FELIX system, synchronization of the front-end
electronics with the LHC clock and interface with the Trigger FPGA. The se-
lected FPGA device for the implementation of the Readout FPGA task is a
Xilinx Virtex 7 XC7VX485T. It contains a large number of logic and DSP re-
sources and 48 high-speed transceivers capable of operating at 10.3125 Gbps.
The TilePPr prototype was also designed to be pin-to-pin compatible with
another FPGA model: the Xilinx Virtex 7 XC7VX415T. Both FPGA models contain
similar resources, with the main difference being in the type of high-speed
transceivers. Table 3.1 summarizes the resources of the pin-to-pin compatible
Virtex 7 FPGAs for the TilePPr prototype.
                            XC7VX415T   XC7VX485T
Logic Cells                   412,160     485,760
CLB Slices                     64,400      75,900
CLB Distributed RAM (Kb)        6,525       8,175
DSP Slices                      2,160       2,800
Max RAM (Kb)                   31,680      37,080
CMTs                               12          14
PCIe Gen2 blocks                    0           4
PCIe Gen3 blocks                    2           0
GTX transceivers                    0          48
GTH transceivers                   48           0
Available User I/O                350         350
Table 3.1: Summary of resources of the selected Virtex 7 FPGAs.
The XC7VX415T includes GTH transceivers supporting rates of 9.6 Gbps,
while the XC7VX485T contains GTX transceivers, which are not specified to
operate at 9.6 Gbps because of a gap in the supported line rates from 8 Gbps
to 9.8 Gbps. This limitation in the data rate of the GTX transceivers is driven
by the frequency range of the dedicated Phase Locked Loops (PLL) in the
transceivers. However, as will be discussed in Chapter 4, the GTX transceivers
can be operated outside the manufacturer specifications, achieving a stable and
reliable communication at 9.6 Gbps.
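As a small illustration of the limitation described above, the following Python sketch checks whether a requested line rate falls inside the unsupported gap. The gap limits are the ones quoted in the text; treating both limits as exclusive and the function name itself are assumptions.

```python
# Gap in the specified GTX line-rate range, as quoted in the text (in Gbps).
GTX_GAP_GBPS = (8.0, 9.8)


def in_gtx_rate_gap(line_rate_gbps, gap=GTX_GAP_GBPS):
    """Return True if the line rate lies inside the unsupported gap."""
    low, high = gap
    # Assumption: the limits themselves are treated as supported rates.
    return low < line_rate_gbps < high


for rate in (9.6, 10.3125):
    location = "inside" if in_gtx_rate_gap(rate) else "outside"
    print(f"{rate} Gbps is {location} the gap")
```

The 9.6 Gbps rate used for the front-end links falls inside the gap, while the 10.3125 Gbps rate mentioned for the XC7VX485T does not.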
As already mentioned in Chapter 2, the DaughterBoard includes two QSFP
modules for the transmission of digital data to the TilePPr prototype. Only one
QSFP module is connected to the TilePPr prototype, while the second, redundant
QSFP is reserved in case the first QSFP malfunctions. Therefore,
16 transceivers of the Readout FPGA are routed to four QSFP modules (Figure
3.2 (a)), providing a maximum bidirectional bandwidth of 160 Gbps with the
front-end electronics. Moreover, another set of 12 transceivers is routed to
an Avago MiniPOD receiver (Figure 3.2 (b)) [40] for the evaluation of other
technologies and testing purposes.
In order to implement the interface with the FELIX system, four transceivers
were routed directly to the AMC backplane connector providing point-to-point
communication with the TDAQi board through the Zone 3 connector of the
ATCA carrier. The FELIX interface path provides an aggregate bandwidth of
40 Gbps between the Readout FPGA and the FELIX system.
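The aggregate bandwidths quoted above are the number of lanes multiplied by the per-lane rate. The sketch below reproduces the two figures in Python, assuming a nominal rate of 10 Gbps on every lane; the dictionary keys and the function name are illustrative.

```python
LANE_RATE_GBPS = 10  # assumption: nominal rate of 10 Gbps on every lane


def aggregate_bandwidth_gbps(n_lanes, lane_rate_gbps=LANE_RATE_GBPS):
    """Raw aggregate line rate of a group of lanes, in Gbps."""
    return n_lanes * lane_rate_gbps


links = {
    "front-end (4 QSFP modules x 4 lanes)": 4 * 4,
    "FELIX (AMC backplane)": 4,
}
for name, lanes in links.items():
    print(f"{name}: {aggregate_bandwidth_gbps(lanes)} Gbps")
```

These are raw line rates, so the usable payload bandwidth is lower once protocol and encoding overheads are taken into account.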
Another four-transceiver set is used for the data path between the Readout
and Trigger FPGAs. The Readout FPGA will use this interface path to send
the reconstructed cell energies and time to the Trigger FPGA. The Readout
FPGA is also connected to 6 channels of an Avago MiniPOD transmitter to
evaluate a different approach where the Readout FPGA can transmit the
pre-processed data directly to the trigger system, reducing the overall latency.
Finally, two GbE ports and one PCIe port are routed to the AMC backplane
connector for the remote operation of the Readout FPGA. These ports will be
used to control and configure the Readout FPGA through the IPBus ports and
will provide remote programming capabilities.
(a) Luxtera QSFP module. (b) Avago MiniPOD RX.
Figure 3.2: Optical modules employed for the high-speed communication paths
in the TilePPr prototype.
Trigger FPGA
The main purpose of the Trigger FPGA is the evaluation of trigger algorithms
and the study of the latency for data transmissions between the Readout and the
Trigger FPGAs. As introduced in Chapter 2, in the final version of the
readout electronics for the HL-LHC, the Trigger FPGA will be located in the
TDAQi, which will also provide communication paths with the FELIX and trigger
systems.
In the TilePPr prototype the selected Trigger FPGA is a Xilinx Kintex 7
XC7K420T. This FPGA model contains high-density logic and DSP resources
to implement the algorithms, and 28 transceivers for the high-speed commu-
nication with the FELIX and trigger systems. Table 3.2 summarizes the logic
resources and transceivers contained in the selected Kintex 7 FPGA.
Half of the channels of the Avago MiniPOD transmitter are connected to six
transceivers of the Trigger FPGA for the evaluation of the communication with
the trigger systems. The Trigger FPGA can transmit at a maximum aggregate
bandwidth of 60 Gbps through the Avago MiniPOD module (10 Gbps per lane).
In addition, four transceivers are used for the transmission of the cell energy
and time from the Readout FPGA to the Trigger FPGA. This communication
path will also be employed to evaluate different data protocols which minimize
the latency between the two FPGAs in the final design.
                            XC7K420T
Logic Cells                  416,960
CLB Slices                    65,150
CLB Distributed RAM (Kb)       5,938
DSP Slices                     1,680
Max RAM (Kb)                  30,060
CMTs                               8
PCIe Gen2 blocks                   1
GTX transceivers                  28
Available User I/O               380
Table 3.2: Summary of resources of the selected Kintex 7 FPGA.
Finally, two additional transceivers provide remote operation and programming
capabilities through the AMC backplane connector using the GbE protocol.
An RJ45 connector placed in the front panel provides an Ethernet connection
for local operation. The Ethernet interface is implemented using a Marvell
88E1111E PHY controller that allows the selection between 10BASE-T,
100BASE-TX or 1000BASE-T protocols.
Slow Control FPGA
The Slow Control (SC) FPGA is used for the monitoring and configuration of
all the components in the TilePPr prototype. The selected FPGA is a Xilinx
Spartan 6 XC6SLX16 [41] which interfaces with the Trigger and Readout FPGAs,
clocking circuitry, power monitoring circuit, all optical modules, on-board
sensors and with the ATCA system through the Module Management Controller
(MMC) [42]. Table 3.3 summarizes the logic and block resources of this FPGA.
                            XC6SLX16
Logic Cells                   14,579
CLB Slices                     2,278
CLB Distributed RAM (Kb)         136
DSP Slices                        32
Max RAM (Kb)                     576
CMTs                               2
Available User I/O               160
Table 3.3: Summary of resources of the selected Spartan 6 FPGA.
The protocol employed for communication with the majority of the on-board
components is I2C. Exceptions are the power monitoring circuit, which uses the
PMBus protocol, the jitter cleaners, which use the Serial Peripheral Interface
(SPI) protocol, and the communication between the SC FPGA and the Trigger and
Readout FPGAs, which is implemented with a custom serial protocol.
In the TilePPr prototype there are two separate I2C chains, as shown in Figure
3.3. The first I2C chain contains the MMC, temperature sensors (MCP9808),
an I2C GPIO expander chip (PCA6416A), an EEPROM memory (24AA32A)
and the SC FPGA. In this I2C chain, the MMC works as the I2C master, reading
the data from the temperature sensors and the monitoring data contained in
registers in the SC FPGA. Upon a user request, the MMC can also extract the
TilePPr ID number from the EEPROM memory or remotely reset the FPGAs through
the I2C GPIO expander chip.
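As an illustration of the temperature readout performed by the I2C master, the Python sketch below converts a raw 16-bit value from an MCP9808 ambient temperature register into degrees Celsius. The register layout in the docstring is an assumption based on the sensor datasheet rather than on this thesis, and the function name is hypothetical; the I2C transaction itself is not modeled.

```python
def mcp9808_raw_to_celsius(raw):
    """Convert a raw 16-bit ambient temperature word to degrees Celsius.

    Assumed layout: bits 15-13 are alert flags, bit 12 is the sign and
    bits 11-0 hold the value in steps of 1/16 degC (two's complement).
    """
    value = raw & 0x1FFF      # discard the three flag bits
    if value & 0x1000:        # sign bit set: negative temperature
        value -= 0x2000       # interpret as 13-bit two's complement
    return value / 16.0


print(mcp9808_raw_to_celsius(0x0194))   # 25.25
print(mcp9808_raw_to_celsius(0x1FFF))   # -0.0625
```

The flag bits are masked before the conversion, so an active alert does not change the decoded temperature.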
[Figure 3.3 block labels: MMC, Spartan 6, PCA9546A (x5), QSFP1 to QSFP4,
MiniPOD TX, MiniPOD RX, SFP, ADN2814, Si570 V7, Si570 K7, Si570 LMK, Si570 RTM,
FMC, RTM, RT controller, IPMI (AMC), 24AA32A, PCA6416A, SN74LVC06,
PROG B V7, PROG B K7, PROG B SP6, MCP9808 (4 x).]
Figure 3.3: Block diagram of the I2C chains in the TilePPr prototype.
The second I2C chain connects the SC FPGA with the rest of the components
in the board. In this I2C chain the SC FPGA is the master device reading the
status and configuring the following components:
• Clock distribution and generation circuitry: clock generators (LMK03806B),
and programmable oscillators (Si570).
• Optical modules: QSFPs, SFP and Avago MiniPOD modules.
• TTC recovery circuitry (ADN2814).
• External boards: Rear Transition Modules and FMC boards.
• Real time controller chip.
I2C switches (PCA9548A) were included to reduce the complexity of the
routing between the SC FPGA and the I2C slaves. In addition, the I2C switches
avoid communication conflicts when two or more I2C slaves have the same
address.
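The way an I2C switch avoids address conflicts can be shown with a toy software model in which only the slaves behind the enabled channels see the bus. The Python sketch below does not access any hardware; the assignment of the four Si570 oscillators to channels 0 to 3, their common address of 0x55 and the class itself are assumptions made for illustration.

```python
class I2CSwitch:
    """Toy model of a multi-channel I2C switch (no real bus access)."""

    def __init__(self, n_channels):
        # One address-to-name map per downstream channel.
        self.channels = [dict() for _ in range(n_channels)]
        self.control = 0   # bit n set means channel n is enabled

    def attach(self, channel, address, name):
        self.channels[channel][address] = name

    def select(self, channel):
        """Enable a single downstream channel and disable the others."""
        self.control = 1 << channel

    def responders(self, address):
        """Names of the devices that would acknowledge this address."""
        return [devices[address]
                for ch, devices in enumerate(self.channels)
                if self.control & (1 << ch) and address in devices]


SI570_ADDR = 0x55   # assumed address, identical for every oscillator
switch = I2CSwitch(4)
for ch, name in enumerate(["Si570 V7", "Si570 K7", "Si570 LMK", "Si570 RTM"]):
    switch.attach(ch, SI570_ADDR, name)

switch.select(1)
print(switch.responders(SI570_ADDR))   # ['Si570 K7']
```

With a single channel enabled only one oscillator acknowledges the shared address, whereas enabling all channels at once would make all four respond.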
The SC FPGA permits complete monitoring and basic operation of the
TilePPr prototype from the ATCA framework through the MMC board. The
MMC is a small mezzanine card included in the TilePPr prototype which im-
plements the Intelligent Platform Management Interface (IPMI) to manage the
power connection to the ATCA system and remote operation of the AMCs. Fig-
ure 3.4 shows a picture of the MMC card v3.5 used in the TilePPr prototype.
Figure 3.4: Picture of the MMC card.
3.2.2 TTC receiver block
The TTC receiver block is needed for synchronization of the TilePPr prototype
and front-end electronics with the legacy TTC system. This block is composed
of two Analog Devices ADN2814 chips which extract the clock and data from
the TTC stream and distribute them to the Readout and Trigger FPGAs.
The TTC signal can be received through two different paths: a standard
SFP connector or an ST-connectorized photo-diode. The two connectors must
not be operated at the same time, and the selection between the SFP and the
photo-diode is done with on-board resistors and capacitors. The output of the
selected optical connector is then routed to an ON Semiconductor NB6N14S
chip. The latter is a differential buffer which repeats the data to two different
ADN2814 chips for the clock and data extraction.
In addition, the NB6N14S chip also buffers the TTC signals to two transceivers
(one in the Trigger FPGA and one in the Readout FPGA). This latter option
permits the implementation of clock and data extraction in the FPGAs using
the transceiver resources.
Figure 3.5 shows a block diagram of the circuitry used for the extraction of
the TTC clock and data.
[Figure 3.5 block labels: SFP, PIN, NB6N14S, ADN2814 (x2), Xilinx Kintex 7,
Xilinx Virtex 7, Xilinx Spartan 6. Signals: Data, Clk, Control, I2C.]
Figure 3.5: Block diagram of the TTC receiver block and its connections with
the Readout, Trigger and SC FPGAs.
3.2.3 Clocking unit
The selection of the clock distribution components is one of the most important
design choices because the reference clock for the transceivers requires high
quality and very low jitter. The selected clocking unit provides all the clocks
needed for the operation of the TilePPr prototype.
The use of high-speed transceivers synchronized with the LHC clock requires
high-performance jitter cleaners, such as the CDCE62005 chip from Texas
Instruments, to meet the transceiver specifications in terms of jitter. The
jitter cleaners are used to clean the recovered LHC clock from the TTC receiver
block and to route it to the transceivers.
The TilePPr is equipped with two CDCE62005 chips connected to the Trigger
and Readout FPGAs. Each CDCE62005 chip has two selectable clock inputs
and 5 clock outputs. One clock input is connected to a programmable local
oscillator (Si570) and the second one is connected to the recovered LHC clock.
All the clock outputs were routed to the FPGA banks in such a way that all
the interfaces can be configured to be synchronous with the LHC clock.
The CDCE62005 chips can be programmed and controlled via the SC FPGA
or through the Trigger and Readout FPGAs using level shifter chips to adapt
the voltage levels between the FPGAs and the CDCE62005. The operation of
the CDCE62005 chips to synchronize the TilePPr with the LHC clock will be
discussed in detail in Chapter 5.
In addition, a Texas Instruments LMK03806B clock generator was included
in the design to provide low-noise clocks for the PCIe and GbE communication
channels with the ATCA carrier. The LMK03806B chip has 14 programmable
clock outputs, half of them connected to the Trigger FPGA and the other half
to the Readout FPGA.
3.2.4 Configuration
All FPGAs in the TilePPr prototype can be configured either through a JTAG
chain or by loading a configuration file directly from non-volatile memories.
The Trigger and Readout FPGAs are connected to Byte Peripheral Interface
(BPI) flash memories (Micron PC28F00AG18F). BPI memories provide initial
configuration of the FPGAs during the power-up sequence. The selected BPI
memories have enough space to be segmented into pages to contain different
configuration files which can be updated remotely [43]. In addition, one of
the advantages of the BPI memories is the reduced access time, thus reducing
the time to re-write the memory and load the configuration into the FPGAs.
However, the SC FPGA is connected to a 128 Mb SPI flash memory (Micron
N25Q128) since the size of the configuration files is much smaller and a parallel
interface is not required.
An on-board Digilent JTAG SMT2 programming module configures all the
FPGAs through the JTAG chain. This programmer can be accessed from the front
panel through a microUSB connector. A second standard through-hole connector
is provided for an external JTAG programmer. The configuration mode of
all the FPGAs can be selected using on-board switches so that the configuration
files are loaded from the local memories or through the JTAG chain.
Figure 3.6 shows all the programmable devices connected to the JTAG chain.
The three FPGAs and the FMC connector are connected to the same JTAG
chain through level shifter chips to adapt the voltage levels between the FPGA
and the Digilent programmer. In case an FPGA is not present or cannot be ac-
cessed through the JTAG port, it has to be bypassed to allow the communication
with the rest of the devices. This is achieved by connecting the corresponding
TDI and TDO ports with the 0 Ω resistors for the FPGAs, or with the switch
for the FMC module.
[Figure 3.6 block labels: Digilent USB, JTAG conn., MMC, Voltage translator
(3.3 V / 1.8 V, x2), FMC (bypass switch), Spartan 6, Kintex 7 and Virtex 7
(0 Ω bypass resistors). Signals: TDI, TCK, TMS, TDO.]
Figure 3.6: Block diagram of the TilePPr JTAG chain with the FMC connector
and FPGAs.
3.2.5 Power distribution
One of the critical points during the design of the TilePPr prototype was the
power distribution. The reduced area available on the board, the number of
power-consuming components, and the required voltages made the design
of the power distribution circuit challenging. In addition, the voltage
regulators powering the FPGA transceivers are required to have low noise to
avoid any jitter contribution affecting the quality of the high-speed signals.
Regulator     Voltage  Imax  Power rail   Device
model         (V)      (A)
2 x LTM4627   1.0      30    VCCINT       Virtex7, Kintex7
LTM4627       1.0      15    MGTAVCC      Virtex7, Kintex7
LTM4601A      1.2      12    MGTAVTT      Virtex7, Kintex7
LTM4618       1.8      6     MGTVCCAUX    Virtex7, Kintex7
LTM4628       3.3      8     P3V3         Clocking, sensors, I2C
              2.5      8     P2V5         Ethernet, Kintex7
LTM4628       1.8      8     P1V8         Virtex7, Kintex7, flash memories
              1.5      8     P1V5         Virtex7, Kintex7, DDR3
LTM8029       5.0      0.6   P5V0 SP6     MIC2230
MIC2230       3.3      0.8   P3V3 SP6     Spartan6 IO Banks
              1.2      0.8   P1V2 SP6     Spartan6 core
ADP7102       5.0      0.3   P5V0 FICER   photo-diode
Table 3.4: Summary of the power modules used in the TilePPr prototype indi-
cating the operating voltage and the maximum current.
The input voltage of 12 V is received either directly via the AMC backplane
connector when the board is inserted into the ATCA shelf, or from an ATX power
connector using a commercial ATX power supply. The power stage of the TilePPr
is composed of a total of 10 voltage regulators: 8 switching regulators power
the different power rails for the Trigger and Readout FPGAs, 1 switching
regulator feeds the SC FPGA IO banks and core, and 1 linear regulator provides
5 V to the photo-diode of the TTC receiver block. Table 3.4 shows a complete
list of the regulators constituting the power stage.
Following the manufacturer recommendations, ferrite beads were introduced
in the transceivers power rails to keep the voltage ripple below 10 mVpp. The
output voltage ripple for all the regulators was simulated using the Linear Tech-
nology software LTSpice IV [44] confirming that the voltage ripple is kept within
10 mVpp.
The power stage includes a protection circuit for monitoring the 12 V power
input to prevent damage when powering the TilePPr prototype through the
ATX power connector. Figure 3.7 shows a block diagram of the protection
circuit, where a Linear Technology LT4363 chip senses the input voltage and
current consumption through the RSense resistor. Resistors R1, R2 and R3 are
used to configure the undervoltage and overvoltage levels. If any power
deviation is detected, the LT4363 chip opens the switch Q1 to protect the
board.
[Figure 3.7 block labels: LT4363 (pins OV, UV, GATE, SNS, OUT), R1, R2, R3,
RG, CG, RSENSE, Q1, VIN, VOUT.]
Figure 3.7: Block diagram of the protection circuit designed for the 12 V input.
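The undervoltage and overvoltage levels set by a three-resistor divider can be estimated with the Python sketch below. It is a generic calculation under loudly stated assumptions: the string is taken to run from VIN through R3, R2 and R1 to ground, with the UV pin tapping the R3/R2 node and the OV pin tapping the R2/R1 node, and the comparator threshold and resistor values are illustrative numbers that come neither from the thesis nor from the datasheet.

```python
def uv_ov_trip_points(r1, r2, r3, v_th):
    """Input voltages at which the UV and OV taps reach the threshold v_th.

    Assumed topology: VIN - R3 - (UV tap) - R2 - (OV tap) - R1 - GND.
    """
    total = r1 + r2 + r3
    v_uv = v_th * total / (r1 + r2)   # UV tap equals v_th at this input voltage
    v_ov = v_th * total / r1          # OV tap equals v_th at this input voltage
    return v_uv, v_ov


# Hypothetical values chosen to bracket a 12 V input.
v_uv, v_ov = uv_ov_trip_points(r1=9e3, r2=3e3, r3=88e3, v_th=1.25)
print(f"undervoltage below {v_uv:.2f} V, overvoltage above {v_ov:.2f} V")
```

With these example values the allowed input window is roughly 10.4 V to 13.9 V; the actual resistor values and threshold of the board would have to be taken from the schematics and the datasheet.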
Power monitoring circuit
A power monitoring circuit monitors all the voltages and currents of the power
stage to identify hardware problems. Moreover, this circuit provides a better
understanding of the power consumption requirements of the FPGAs in different
scenarios, helping in the design of the final TilePPr.
The power monitoring circuit is implemented using two Linear Technology
LTC2974 chips, each of which can measure the currents and voltages of up
to four power rails. The first six power modules listed in Table 3.4 are connected
to the power monitoring circuit.
The current delivered by the power regulators is measured through 500 µΩ
sensing resistors connected in series between the power regulators and the power
rails feeding the FPGAs. For an accurate measurement of the voltage applied to
the FPGAs, sense lines connect the LTC2974 directly to power vias underneath
the FPGAs. For both measurements, the input signals are conditioned with
anti-aliasing filters to remove frequencies above 31.25 kHz. The SC FPGA
reads out the measured values via the PMBus protocol.
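The current measurement described above relies on Ohm's law across the sensing resistor. The Python sketch below shows the conversion and the resulting dissipation for a rail current of 30 A, the largest maximum current listed in Table 3.4; the function names are illustrative, and it is an assumption that the whole current flows through a single sensing resistor.

```python
R_SENSE_OHM = 500e-6   # 500 micro-ohm sensing resistor, as quoted in the text


def current_from_sense_voltage(v_sense, r_sense=R_SENSE_OHM):
    """Rail current in A from the voltage drop across the sensing resistor."""
    return v_sense / r_sense


def sense_resistor_dissipation(current, r_sense=R_SENSE_OHM):
    """Power dissipated in the sensing resistor in W."""
    return current ** 2 * r_sense


i_rail = 30.0                      # A, assumed rail current
v_drop = i_rail * R_SENSE_OHM      # voltage seen by the monitoring chip
print(f"drop: {v_drop * 1e3:.1f} mV")
print(f"current: {current_from_sense_voltage(v_drop):.1f} A")
print(f"dissipation: {sense_resistor_dissipation(i_rail):.2f} W")
```

The small resistance keeps the voltage drop at the 15 mV level even at 30 A, so the sensing resistors barely disturb the 1.0 V rails.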
3.2.6 Other components
FMC connector
The FMC connector interfaces the TilePPr prototype with other FMC cards
expanding its functionalities or providing access to the FPGA IO banks for
testing and debugging purposes. The FMC connector selected for the TilePPr
prototype is a High Pin Count (HPC) connector from Samtec with 400 pins.
Only the Trigger and Readout FPGAs are connected to the FMC connector,
where the Readout FPGA was routed to 34 differential pairs and the Trigger
FPGA to 46 differential pairs. In addition, two high-speed links were routed to
transceivers in the Trigger and Readout FPGAs and one high-speed link was
directly connected to the AMC backplane connector.
The TilePPr prototype will also be employed as the core of the PROMETEO
[45] system designed for the certification of the front-end electronics during
maintenance campaigns. For this reason, the FMC pinout was designed to be
pin-to-pin compatible with the PROMETEO ADC FMC board (Figure 3.8).
Figure 3.8: Picture of the ADC FMC board version 2 for the PROMETEO
project.
DDR3 memories
DDR3 memories from Micron Technology are connected to the Trigger and
Readout FPGAs, where each FPGA will access up to 512 Mb. The DDR3
memories were added for evaluation purposes and to study the performance
of the TilePPr using external memories to absorb the large quantity of data
generated under high trigger rate situations.
USB-UART interfaces
The USB-to-UART bridges (Silicon Labs CP2103GM) are connected to the
Trigger and Readout FPGAs for the implementation of local debug ports to
check the status of the TilePPr in situ. Both debug ports can be accessed from
the front panel through microUSB connectors and provide a maximum data
rate of 1 Mbps using the Universal Asynchronous Receiver/Transmitter (UART)
protocol.
3.3 Physical design and PCB layout
The TilePPr Printed Circuit Board (PCB) was designed with a double AMC form
factor that can be operated either in a µTCA shelf or in an ATCA carrier.
The PCB was designed following the AMC specifications [36], which define
the physical and mechanical dimensions, the maximum allowable component height
and the pinout of the connector. The total size of the PCB is 149 mm ×
183.5 mm and it is divided into separate sections for the FPGAs, clocking
circuitry, communication modules and power regulators. Figure 3.9 shows a
picture of the TilePPr prototype.
Figure 3.9: Picture of the TilePPr prototype indicating the main components.
3.3.1 Stack-up
The stack-up materials and layer thicknesses were selected to fulfill the
high-speed design requirements. The selected stack-up comprises a total of
16 layers used to implement the power rails and signal routing. Figure 3.10
depicts a sketch of the stack-up indicating the layer thickness and materials used.
[Figure 3.10 content: 16 copper layers (Layer1 to Layer16) of 17 µm each,
labeled Top, Bottom, Signal 1 to Signal 4, Power 1 to Power 4 and Ground (x6),
separated by 15 Nelco N4000-13 SI dielectric layers of 65 µm, 100 µm or 120 µm.]
Figure 3.10: Sketch of the TilePPr stack-up.
Nelco N4000-13 SI was chosen as the dielectric material due to its low
dielectric constant (εr) of 3.2 and its dissipation factor (tan δ) of 0.008,
both measured at 10 GHz for striplines. This material is widely used for
high-speed designs since it offers very low dielectric losses at high
frequencies.
The placement of the components required several iterations due to the
large number of components and the limited number of layers. In addition, the
signal routing congestion was minimized by an appropriate selection of the pins
interfacing the FPGAs with the rest of the components.
The final layer distribution employed for the PCB design of the TilePPr is
described below.
• Layers 2, 4, 7, 10, 13 and 15: continuous ground planes.
• Layers 5 and 12: continuous power planes.
• Layers 8 and 9: split power planes.
• Layers 3 and 14: high-speed lines for the QSFPs, Avago MiniPODs, FPGA
high-speed connections and AMC backplane connections.
• Layers 1, 6, 11 and 16: clock lines, DDR3 lines and the remaining slow
control signals.
3.3.2 Signal integrity studies
Signal integrity simulations constitute a fundamental step in the design of
high-speed communication devices. Simulations over a wide range of
frequencies help to prevent interconnection problems prior to manufacturing.
Since the FPGA transceivers transmit at high data rates, the length of the
high-speed interconnects is comparable to the wavelength of the traveling
signals, so the interconnects behave as transmission lines. The geometry and
the properties of the dielectric materials of the interconnects define the
parameters of the transmission lines, such as the characteristic impedance,
propagation delay and dielectric and conductor losses.
High-speed interconnects have to be carefully designed since the noise budget
can be compromised by impedance mismatches along the interconnect,
interference from neighboring interconnects (crosstalk) or high losses
degrading the signal amplitude.
Another important parameter in the design of high-speed interconnects is
the bandwidth of the traveling signal. A high-speed interconnect has to pro-
vide a low attenuation path for all the frequencies within the bandwidth of the
transmitted signal. Since digital signals have a large number of harmonics, the
effective bandwidth of a digital signal is defined as the highest significant fre-
quency component [46]. The effective bandwidth is related to the rise time of
the digital signal. The mathematical expression, considering a rise time based
on 20%-80% thresholds, is shown in Equation 3.1.
\[ BW = \frac{0.23}{t_r} \tag{3.1} \]
where tr corresponds to the rise time (20%-80%) of the signal and BW is the
effective bandwidth.
In the case of the TilePPr prototype, the high-speed transceivers contained
in the Readout and Trigger FPGAs have a typical rise time of 40 ps, so according
to Equation 3.1 the effective bandwidth is close to 6 GHz.
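As a quick numerical check of Equation 3.1, the short sketch below evaluates the effective bandwidth for the 40 ps rise time quoted above. The function name and the default constant argument are illustrative choices, not part of the original design flow.

```python
def effective_bandwidth_hz(rise_time_s: float, k: float = 0.23) -> float:
    """Effective bandwidth from a 20%-80% rise time, as in Equation 3.1."""
    if rise_time_s <= 0:
        raise ValueError("rise time must be positive")
    return k / rise_time_s


# Rise time of the FPGA transceivers quoted in the text: 40 ps.
bw = effective_bandwidth_hz(40e-12)
print(f"BW = {bw / 1e9:.2f} GHz")  # 5.75 GHz, i.e. close to 6 GHz
```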
Pre-layout simulations
Prior to the routing of the PCB, pre-layout simulations were performed to
define the geometry of the single-ended and differential traces. Single-ended
traces were designed to provide a characteristic impedance (Zo) of 50 Ω and
differential lines to provide a differential impedance (Zdiff) of 100 Ω. A 2D
field solver, ANSYS Q3D Extractor [47], was used to simulate multiple
geometries with different trace widths and separations between the
differential traces in order to find the most suitable values. Figure 3.11
shows the characteristic and differential impedance of the simulated
microstrips for different combinations of trace width and trace separation.
(a) Simulation of the Zo for a single mi-
crostrip varying the trace width from 75 µm
to 125 µm.
(b) Simulation of the Zdiff for a differen-
tial microstrip varying the trace width from
75 µm to 100 µm and the trace separation
between the pairs from 75 µm to 100 µm.
Figure 3.11: Simulation of the impedance of microstrip structures.
A similar simulation was done to determine the geometry of the striplines.
Figure 3.12 presents the characteristic and differential impedance of the simu-
lated striplines.
(a) Simulation of the Zo for a single stripline
varying the trace width from 75 µm to
125 µm.
(b) Simulation of the Zdiff for a differential
stripline varying the trace width from 75 µm
to 100 µm and the trace separation between
the pairs from 75 µm to 100 µm.
Figure 3.12: Simulation of the impedance of stripline structures.
Table 3.5 shows the selected values of width and trace separation for the
microstrips and striplines. The geometry parameters were selected based on
the simulation results, so that the chosen geometries provide impedance
values within 10% of the design target (50 Ω single-ended, 100 Ω
differential).
              Microstrip                Stripline
              Single    Differential    Single    Differential
Width         125 µm    100 µm          100 µm    75 µm
Separation    -         100 µm          -         75 µm
Table 3.5: Summary of the selected geometry values for the high-speed inter-
connects.
Impedance discontinuities
One of the most common sources of signal integrity problems is impedance
discontinuities along the high-speed interconnects [48]. Geometry variations
along the traces produce reflections that degrade the signal quality at the
receiver. Differential signal vias traversing layers, or the pads of the
DC-coupling capacitors, are common sources of impedance mismatch in
high-speed designs.
The fraction of the signal reflected back to the transmitter depends on the
impedance mismatch and can be estimated with Equation 3.2.
\[ \rho_d = \frac{Z_d - Z_o}{Z_d + Z_o} \tag{3.2} \]
where Zd is the impedance of the discontinuity, Zo the characteristic impedance
of the line and ρd is the reflection coefficient.
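To give a feeling for the magnitudes involved, the sketch below evaluates Equation 3.2 for a hypothetical case in which a capacitor pad lowers the impedance of a 100 Ω differential line to 90 Ω. The 90 Ω value is an assumption chosen for illustration and is not a simulation result from this work.

```python
def reflection_coefficient(z_disc: float, z_line: float) -> float:
    """Reflection coefficient of a discontinuity, as in Equation 3.2."""
    return (z_disc - z_line) / (z_disc + z_line)


# Hypothetical discontinuity: 90 ohm seen on a 100 ohm differential line.
rho = reflection_coefficient(90.0, 100.0)
print(f"rho = {rho:.4f}")  # about -0.0526: roughly 5% of the amplitude is reflected
```

The negative sign indicates a discontinuity with lower impedance than the line, which is the capacitive behavior expected from pads and via barrels.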
The S-parameters
In this thesis, Scattering parameters (S-parameters) [49] were used to
characterize and validate the designed high-speed interconnects. S-parameters
describe, in the frequency domain, how the interconnects interact with
incident sine waves. When an incident sine wave interacts with an
interconnect, part of the energy can scatter back from the interconnect while
the rest continues its propagation through the interconnect. S-parameters
therefore offer a versatile way to extract the characteristics of the
channel, such as the insertion loss, the return loss or the amount of
crosstalk between lines.
Figure 3.13 shows a representation of a 2-port network with the normalized
wave definitions for S-parameters, where a represents the incident waves and b
the scattered waves.
[Diagram: two-port network with ports 1 and 2, incident waves a1 and a2, and
scattered waves b1 and b2.]
Figure 3.13: Recommended port labeling for an interconnection.
Each S-parameter represents the ratio between the amplitudes of a scattered
wave and an incident wave. Equation 3.3 shows the definition of the
S-parameters for the particular case of a 2-port network.
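\[
S_{11} = \left.\frac{b_1}{a_1}\right|_{a_2=0} \qquad
S_{12} = \left.\frac{b_1}{a_2}\right|_{a_1=0} \qquad
S_{21} = \left.\frac{b_2}{a_1}\right|_{a_2=0} \qquad
S_{22} = \left.\frac{b_2}{a_2}\right|_{a_1=0}
\tag{3.3}
\]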
For example, S11 quantifies the return loss (the reflection coefficient ρd of
Equation 3.2), which corresponds to the portion of the signal that is
reflected back to the source, and S21 quantifies the insertion loss or, in
other words, the portion of the signal that is transferred to the receiver.
The S-parameters are usually expressed in decibels:
\[ S_{ij}(\mathrm{dB}) = 20 \cdot \log_{10}\left(\frac{A_i}{A_j}\right) \tag{3.4} \]
where Ai and Aj correspond to the amplitude of the wave at ports i and j
respectively, and Sij is the value of the S-parameter between ports i and j.
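The decibel scale of Equation 3.4 can be converted back to a linear amplitude ratio when an intuitive number is needed. The following sketch is only illustrative (the helper names are assumptions); the two example values, -2 dB and -75 dB, are the insertion loss and far-end crosstalk levels reported later in this chapter.

```python
import math


def to_db(amplitude_ratio: float) -> float:
    """Amplitude ratio to decibels, as in Equation 3.4."""
    return 20.0 * math.log10(amplitude_ratio)


def from_db(value_db: float) -> float:
    """Decibels back to a linear amplitude ratio."""
    return 10.0 ** (value_db / 20.0)


print(f"{from_db(-2.0):.3f}")   # about 0.794 of the amplitude is transmitted
print(f"{from_db(-75.0):.2e}")  # about 1.78e-04 of the amplitude is coupled
```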
The S-parameters can be extended to an n-port network to represent more
complicated models with several high-speed lines. Equation 3.5 shows the
matrix form of the S-parameters for an n-port network.
\[
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} =
\begin{pmatrix}
S_{11} & S_{12} & \cdots & S_{1n} \\
S_{21} & S_{22} & \cdots & S_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
S_{n1} & S_{n2} & \cdots & S_{nn}
\end{pmatrix} \cdot
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}
\tag{3.5}
\]
where bi, Sij and aj represent a generalization of the notation followed in
Figure 3.13 and Equations 3.3 and 3.4.
The mixed mode S-parameters
The mixed-mode S-parameters are obtained from the S-parameters and de-
scribe the same concept as the S-parameters but considering differential- and
common-mode signals. This form is more convenient for signal integrity analy-
sis. Figure 3.14 shows a sketch of the port definitions for a 4-port network and
its equivalent 2-port differential network.
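[Diagram: four-port network (ports 1 to 4, incident waves a1 to a4, scattered
waves b1 to b4) and its equivalent two-port mixed-mode network (ports 1 and 2,
differential-mode waves ad1, bd1, ad2, bd2 and common-mode waves ac1, bc1,
ac2, bc2).]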
Figure 3.14: Representation of the S-parameters of a 4-port network (left) and
the equivalent S-parameters of 2-port mixed-mode network (right).
Like the S-parameters, the mixed-mode S-parameters can be represented in
matrix form. Equation 3.6 shows the mixed-mode S-parameters corresponding to
a 2-port differential network.
\[
\begin{pmatrix} b_{d1} \\ b_{d2} \\ b_{c1} \\ b_{c2} \end{pmatrix} =
\begin{pmatrix}
S_{DD11} & S_{DD12} & S_{DC11} & S_{DC12} \\
S_{DD21} & S_{DD22} & S_{DC21} & S_{DC22} \\
S_{CD11} & S_{CD12} & S_{CC11} & S_{CC12} \\
S_{CD21} & S_{CD22} & S_{CC21} & S_{CC22}
\end{pmatrix} \cdot
\begin{pmatrix} a_{d1} \\ a_{d2} \\ a_{c1} \\ a_{c2} \end{pmatrix}
\tag{3.6}
\]
where adi, bdi are the differential-mode signals and aci, bci are the
common-mode signals.
Each of the submatrices composing the mixed-mode matrix presented in
Equation 3.6 describes a different response of the characterized line:
• SDD submatrix: differential- to differential-mode signal response.
• SDC submatrix: mode conversion of common- to differential-mode signals.
• SCD submatrix: mode conversion of differential- to common-mode signals.
• SCC submatrix: common- to common-mode signal response.
The most useful mixed-mode S-parameters to characterize a high-speed
interconnect are SDD11, the differential return loss, and SDD21, the
differential insertion loss.
In addition, the mixed-mode S-parameters are used to quantify the amount of
near- and far-end crosstalk induced in a victim line, e.g. with SDD31, or the
amount of differential-mode signal that is converted to common-mode, with the
SCD parameters.
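The relation between single-ended and mixed-mode S-parameters can be sketched as a change of basis. The example below is a minimal illustration and not the procedure implemented in the ANSYS tools: it assumes that single-ended ports 1 and 3 form differential port 1 and ports 2 and 4 form differential port 2, orders the mixed-mode waves as in Equation 3.6, and uses hypothetical transmission values of 0.8 and 0.7 for the two traces of a slightly asymmetric pair. All helper names are illustrative.

```python
import math

K = 1.0 / math.sqrt(2.0)

# Change of basis from single-ended waves (ports 1..4) to mixed-mode waves
# ordered as [d1, d2, c1, c2]. Assumed pairing: ports (1, 3) form
# differential port 1 and ports (2, 4) form differential port 2.
M = [
    [K, 0.0, -K, 0.0],  # d1 = (p1 - p3) / sqrt(2)
    [0.0, K, 0.0, -K],  # d2 = (p2 - p4) / sqrt(2)
    [K, 0.0, K, 0.0],   # c1 = (p1 + p3) / sqrt(2)
    [0.0, K, 0.0, K],   # c2 = (p2 + p4) / sqrt(2)
]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def to_mixed_mode(s_single_ended):
    """Return M * S * M^T; M is orthogonal, so M^T is its inverse."""
    return matmul(matmul(M, s_single_ended), transpose(M))


# Hypothetical matched, uncoupled pair: trace 1->2 transmits 0.8 and
# trace 3->4 transmits 0.7.
S = [
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]

S_MM = to_mixed_mode(S)
sdd21 = S_MM[1][0]  # differential insertion term, (0.8 + 0.7) / 2
scd21 = S_MM[3][0]  # differential-to-common conversion, (0.8 - 0.7) / 2
print(f"SDD21 = {sdd21:.3f}, SCD21 = {scd21:.3f}")
```

With two identical traces the SCD21 term vanishes, which illustrates why the symmetry of a differential pair limits mode conversion.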
DC-coupling capacitors
One source of impedance mismatch is the DC-coupling capacitors used to
interconnect high-speed components with different logic standards [50]. The
increase in area due to the capacitor pads reduces the impedance of the trace
and generates reflections. The use of small capacitor form factors such as
0201 reduces the impedance mismatch since the pads are much closer in size to
the trace width.
In the TilePPr module, DC-coupling capacitors are used to interconnect the
Readout and the Trigger FPGAs, for the communication with the RTM module
through the backplane and for the Avago MiniPOD connectors. No DC-coupling
capacitors are needed for the QSFP lines, since the 0201 capacitors are included
in the optical module.
Figure 3.15 shows the 0201 model employed to compare the impedance mismatch
produced by 0201 and 0402 packages using a 3D field solver, ANSYS HFSS [51].
The model of the interconnect is composed of a pair of DC-coupling capacitors
and the corresponding differential traces.
Figure 3.15: Model of a differential line with 0201 case DC-coupling capacitors
(ANSYS HFSS software).
Figure 3.16 (a) shows the Time Domain Reflectometry (TDR) response of
the 0201 and 0402 packages to a step signal with a rise time of 40 ps. The
TDR technique is another common way to characterize the quality of the high-
speed interconnects. A step signal with a fast rise time is transmitted through
the interconnect and the reflected amplitude is measured in the time domain.
The impedance discontinuity increases or decreases the returned step amplitude
depending on its impedance.
(a) Comparison of the TDR response between
differential interconnects with 0201 (blue)
and 0402 (red) case DC-coupling capacitors
using a step signal with a rise time of 40 ps.
(b) Comparison of the insertion loss (SDD21
parameter) between differential interconnects
with 0201 (blue) and 0402 (red) case DC-
coupling capacitors.
Figure 3.16: Comparison of the TDR and insertion loss simulation results corre-
sponding to two differential interconnects with 0201 and 0402 case DC-coupling
capacitors.
As can be observed, the 0201 package generates a smaller impedance mis-
match than the 0402 package and consequently the insertion loss (SDD21) is
also lower. Based on the results, 0201 package capacitors were used for the
DC-coupling capacitors.
Another technique to mitigate the effect of the 0201 DC-coupling capacitors
is to cut out a rectangular area of the reference plane under the pads to
decrease the capacitance of the line. The TDR results and SDD21 parameters
obtained from the simulation of different sizes of area cuts in the power
plane underneath are shown in Figure 3.17.
As can be observed, the reduction of the capacitance formed by the capacitor
pads results in a smaller impedance mismatch. However, the area cuts have not
been implemented in the TilePPr prototype since the impedance discontinuity
produced by the 0201 capacitors is already small and the implementation of this
technique would split the reference plane due to the high number of capacitors
and their proximity.
(a) TDR simulation results of a differential
interconnect with 0201 case DC-coupling ca-
pacitors for different sizes of area cuts. The
rise time of the step signal is 40 ps.
(b) Insertion loss (SDD21 parameter) of a
differential interconnect with 0201 case DC-
coupling capacitors for different sizes of area
cuts.
Figure 3.17: Results of the TDR and insertion loss simulations corresponding to
a differential interconnect with 0201 case DC-coupling capacitors with different
sizes of area cuts.
Differential vias
When using a via to interconnect traces between different layers, there are
some layout elements that compromise the signal integrity: Non-Functional
Pads (NFP), via stub, and ground vias.
The NFPs are those pads of the via which are not connected to any layer
along the PCB stack-up. Manufacturers include them in the stack-up to improve
the mechanical stability of the via in the PCB laminate. One of the drawbacks of
NFPs in high-speed designs is the impedance discontinuities introduced by the
parasitic capacitances created between the NFPs and the neighboring reference
layers. One way to avoid this problem is to remove the NFPs to decrease the
via capacitance. Another approach is to enlarge the antipad size (area clearance
between the via and the conductors in the same layer).
The second factor which creates reflections is the via stub. Via stubs are
the unused parts of the vias when interconnecting layers, e.g. if a via
connects a trace from the TOP layer to layer 8, the rest of the via from
layer 8 to the BOTTOM layer will create a reflection because of the
impedance discontinuity. This via stub creates a resonance that cancels the
signal components at the frequency for which the length of the via stub
equals λ/4, where λ is the wavelength. The impedance mismatch due to via
stubs can be eliminated by using the back-drilling technique or buried vias
in the design at the expense of higher PCB manufacturing costs. However, a
cost-effective technique is to route the high-speed lines in the outermost
layers of the PCB, thus reducing the stub length.
The third factor related to differential vias which compromises the signal
integrity is the ground vias, which provide the return path of the traveling
signal. If no ground vias are placed sufficiently close to the signal vias,
reflections will be produced.
The via structure used for the design of the TilePPr prototype was simulated
to define the optimum size of the via antipad, the position of the ground vias
and also to evaluate if back-drilling or buried vias were needed for the PCB
layout. Figure 3.18 shows the 3D model employed to perform the simulations
with a differential via connecting the top layer and layer 15.
Figure 3.18: Model of the differential via used in the simulations with ANSYS
HFSS.
Figure 3.19 shows the TDR and insertion loss comparison for a simulation
where the radius of the via antipad is increased from 250 µm to 550 µm in
steps of 100 µm. Antipad radii between 450 µm and 550 µm produce an impedance
mismatch below 5% of the differential impedance and the lowest insertion
loss.
(a) TDR simulation results of a differential
via for different antipad radii. The rise time
of the step signal is 40 ps.
(b) Insertion loss (SDD21 parameter) of a dif-
ferential via for different antipad radii.
Figure 3.19: Results of the TDR and insertion loss simulations of a differential
via layout for different antipad radii.
Finally, a different simulation was performed to determine the frequency of
the via resonance in a worst-case scenario where the via connects the TOP
layer with layer 2, creating a long stub from layer 3 to 16. As can be
observed in Figure 3.20, the via resonance appears above 20 GHz, outside the
effective bandwidth, and thus it does not degrade the quality of the signal.
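A first-order estimate of the stub resonance can be obtained from the quarter-wavelength condition. The sketch below is an approximation under stated assumptions and not a substitute for the 3D simulation: it takes the dielectric thicknesses and the 17 µm copper thickness from the stack-up of Figure 3.10, assumes they are ordered from the top of the board, assumes that the stub spans everything below layer 2, and uses the laminate εr of 3.2 as the effective dielectric constant seen by the via.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)


def stub_resonance_hz(stub_length_m: float, eps_r: float) -> float:
    """Quarter-wavelength resonance of an open stub of the given length."""
    return C0 / (4.0 * stub_length_m * math.sqrt(eps_r))


# Dielectric thicknesses (um) between consecutive copper layers, assumed to
# be ordered from Layer1/Layer2 down to Layer15/Layer16.
DIELECTRICS_UM = [65, 100, 120, 100, 120, 100, 65, 100,
                  65, 100, 120, 100, 120, 100, 65]
COPPER_UM = 17

# Assumed stub: the 14 dielectrics below Layer2 plus copper layers 3 to 16.
stub_um = sum(DIELECTRICS_UM[1:]) + 14 * COPPER_UM
f_res = stub_resonance_hz(stub_um * 1e-6, eps_r=3.2)
print(f"stub = {stub_um} um, resonance = {f_res / 1e9:.1f} GHz")
```

Under these assumptions the estimate is roughly 26 GHz, of the same order as the resonance obtained from the full-wave simulation.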
Figure 3.20: Insertion loss of a differential via with a stub from layer 3 to 16.
The self-resonance frequency of the via is placed around 25 GHz.
3.3.3 IR drops
One critical point when designing boards with components that demand high
currents is the voltage drop produced along the power planes due to the
resistance of the conductors, called IR drop (I × R). The effective area of
the power planes is reduced by the via holes, forming a Swiss cheese effect
that increases the DC resistance. If the power interconnects do not provide a
low resistance path, voltages could drop enough to cause some components to
malfunction.
In order to evaluate the impact of the Swiss cheese effect on the power
distribution, the IR drops were simulated using ANSYS SIwave software [52].
First, the maximum current consumption values of the FPGAs were calculated
using the Xilinx Power Estimator (XPE) tool [53] considering a worst case
defined by the following resource usage scenario [54].
• 80% of Look-Up Tables and registers clocked at 245 MHz.
• 80% of block RAM and DSP at 491 MHz.
• 50% of the MMCM circuits and 25% of the PLLs at 500 MHz.
• 100% of the GPIO using SSTL 1.2 clocked at 1,200 MHz.
• All the routed GTX transceivers running at the specified data rate.
Table 3.6 shows a summary of the estimated maximum current carried by
the different power rails of the TilePPr.
Power Rail        Voltage (V)   Virtex 7      Kintex 7      Total
                                Current (A)   Current (A)   Current (A)
VCCINT+VCCBRAM    1.0           15.02         12.415        27.435
VCCAUX            1.8           0.558         0.611         1.169
MGTVCCAUX         1.8           0.062         0.062         0.124
MGTAVCC           1.0           4.204         2.656         6.86
MGTAVTT           1.2           2.531         1.24          3.771
Table 3.6: Summary of the maximum current consumption of the TilePPr power
rails. Current consumptions were estimated using the Xilinx XPE tool.
According to the simulations, all power rails have a maximum voltage drop
close to or lower than 3% of the operating voltage, as recommended by the
manufacturer. Of all the IR drop simulations, the most restrictive value
corresponds to the VCCINT power rail. This power rail feeds the core of both
the Kintex 7 and Virtex 7 FPGAs with a maximum estimated current that almost
reaches 27.5 A.
Figure 3.21 shows the result of the IR drop simulation in layer 12 for the
VCCINT power rail. As can be observed, the voltage on the VCCINT power rail
drops to a minimum of 0.969 V at the Virtex 7 location. This value is
slightly lower than the operating voltage recommended by the manufacturer.
The IR drop for this power rail could be reduced by using wider power vias or
smaller via antipads in the area underneath the FPGA. Nevertheless, the
TilePPr power monitoring circuitry includes capabilities for trimming the
output voltage of the power modules in case the voltage in the power rails is
under the recommended values.
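The simulated minimum can be related to the 3% guideline with a few lines of arithmetic. The sketch below uses the 1.0 V nominal VCCINT voltage, the 0.969 V minimum and the 15.02 A Virtex 7 current quoted in this section; the lumped equivalent resistance is a rough simplification (it ignores that the plane is shared with the Kintex 7) and the function names are illustrative.

```python
def ir_drop_percent(v_nominal: float, v_min: float) -> float:
    """Voltage drop expressed as a percentage of the nominal rail voltage."""
    return 100.0 * (v_nominal - v_min) / v_nominal


def equivalent_resistance_ohm(v_drop: float, current_a: float) -> float:
    """Lumped resistance that would produce v_drop at the given current."""
    return v_drop / current_a


drop = ir_drop_percent(1.0, 0.969)
r_eq = equivalent_resistance_ohm(1.0 - 0.969, 15.02)
print(f"drop = {drop:.1f} %, equivalent resistance = {r_eq * 1e3:.2f} mOhm")
```

The result, a drop of 3.1% and an equivalent path resistance of about 2 mΩ, shows how small the resistance of the power distribution has to be at these current levels.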
[Color scale of the simulated voltage: from 0.9998 V down to 0.9694 V.]
Figure 3.21: IR drops on layer 12 for the VCCINT power rail. The Kintex 7
FPGA (top) is draining 12.415 A and the Virtex 7 FPGA (bottom) is consuming
15.02 A.
3.4 Post-layout simulations
Post-layout simulations were performed to validate the routing before manufac-
turing the PCB. The mixed-mode S-parameters were extracted using ANSYS
tools and studied to verify that no signal integrity problems will cause errors in
the communications.
The signal integrity studies presented here correspond to the QSFP1 RX0
and RX1 differential lines. The QSFP RX lines were selected for the
simulations since they will be used to receive the data from the front-end
electronics at rates close to 10 Gbps. Figure 3.22 shows the physical model
of lines RX0 and RX1 (highlighted in yellow) employed for the signal
integrity simulations.
3.4.1 Insertion and return losses
Insertion and return losses of the high-speed lines were studied to quantify
the energy not transmitted to the load and to determine if it affects the
signal quality. For an ideal lossless interconnect, the linear magnitudes of
the insertion and return losses are related according to Equation 3.7.
\[ S_{DD21} = \sqrt{1 - S_{DD11}^2} \tag{3.7} \]
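Equation 3.7 can be used to estimate how much insertion loss is caused by reflections alone. The sketch below assumes a lossless interconnect and linear magnitudes, as the equation does; the return loss values of -10 dB and -20 dB are hypothetical inputs chosen for illustration, not simulation results.

```python
import math


def mismatch_insertion_loss_db(return_loss_db: float) -> float:
    """|SDD21| in dB implied by |SDD11| in dB for a lossless line (Eq. 3.7)."""
    s11 = 10.0 ** (return_loss_db / 20.0)
    if s11 >= 1.0:
        raise ValueError("return loss must be below 0 dB")
    s21 = math.sqrt(1.0 - s11 ** 2)
    return 20.0 * math.log10(s21)


# Hypothetical return loss levels.
print(f"{mismatch_insertion_loss_db(-10.0):.3f} dB")  # about -0.458 dB
print(f"{mismatch_insertion_loss_db(-20.0):.3f} dB")  # about -0.044 dB
```

The contribution of reflections to the insertion loss therefore stays well below 1 dB as long as the return loss is better than -10 dB.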
Figure 3.22: Snapshot of the physical model used for the signal integrity
simulations with ANSYS SIwave software. RX0 and RX1 lines are highlighted in
yellow.
Figure 3.23 shows the simulated SDD11 (differential-mode) and SCC11
(common-mode) parameters of the QSFP1 RX0 channel. The reflected energy
corresponds to the impedance mismatch produced by the pads of the FPGA and
the QSFP connector plus the impedance mismatch due to the separation of the
differential pair and the vias where the trace arrives at the QSFP connector
and FPGA pads. The energy reflected back to the source adds to the insertion
loss.
(a) Simulation of the SDD11 parameter. (b) Simulation of the SCC11 parameter.
Figure 3.23: Simulation of the SDD11 and SCC11 parameters for RX0 line.
Figure 3.24 shows the result of the simulation of the SDD21 parameter for the
RX0 line. The results show a low insertion loss of -2 dB at the Nyquist
frequency of 5 GHz, which will not degrade the signal quality [49]. The
Nyquist frequency of a serial link is half the data rate [46].
Figure 3.24: Simulation of the SDD21 parameter for RX0 line.
Line asymmetry
The asymmetry in length between the two traces of a differential pair pro-
duces impedance mismatches converting part of the signal from differential-
to common-mode. Although the maximum length difference between the two
traces was limited during the routing stage to 100 µm, simulations were per-
formed to ensure that no significant quantities of energy are converted from the
differential- to the common-mode.
Figure 3.25 shows the simulated SCD21 parameter corresponding to the RX0
line. As can be observed, there is a negligible part of the differential-mode signal
being converted to common-mode.
Figure 3.25: Simulation of the SCD21 parameter for RX0 line.
Crosstalk studies
Crosstalk to neighboring traces can also be evaluated with mixed-mode
S-parameters. Crosstalk in high-speed designs refers to the part of the
signal energy induced on neighboring lines due to capacitive and inductive
coupling between them. Crosstalk can be categorized into two types:
Near-End crossTalk (NEXT) and Far-End crossTalk (FEXT). NEXT is the noise
induced at the end of a victim line close to the transmitter of the
aggressor, while FEXT refers to the noise induced at the opposite end of the
victim line. Although the high-speed lines in the TilePPr prototype are
separated as far as possible from potential victim or aggressor lines, the
NEXT and FEXT were quantified.
Figure 3.26 shows the NEXT from differential to differential signals (SDD31)
and from differential to common signals (SCD31), considering a 4-port network
composed of the two differential channels RX0 and RX1. The simulated NEXT
between the RX0 and RX1 differential lines is negligible in both cases in the
frequency range of interest.
Figure 3.27 shows the results of the FEXT simulation for differential to
differential signals (SDD41) and differential to common signals (SCD41). Like
the NEXT, the FEXT is negligible, showing an induced noise of -75 dB at the
Nyquist frequency for the differential to differential signals and below
-70 dB for the differential to common signals. According to the results of
the crosstalk studies, the induced noise in the high-speed lines for the
communication between the Readout FPGA and the QSFP modules is negligible and
does not contribute to the degradation of the signal quality.
(a) Simulation of the SDD31 parameter. (b) Simulation of the SCD31 parameter.
Figure 3.26: Simulation of the SDD31 and SCD31 parameters for the study of
the NEXT between the RX0 and RX1 lines.
(a) Simulation of the SDD41 parameter. (b) Simulation of the SCD41 parameter.
Figure 3.27: Simulation of the SDD41 and SCD41 parameters for the study of
the FEXT between the RX0 and RX1 lines.
3.5 Characterization tests
The goal of the characterization tests of the TilePPr prototype was to study
the jitter in the high-speed lines used for the communication between the
front-end and back-end electronics systems. As mentioned in Chapter 2, the
TilePPr will send the LHC clock to the front-end electronics embedded with
the data. For this reason, the lines associated with the communication with
the front-end electronics have to provide sufficiently low jitter, and the
quality of these data signals was evaluated in terms of jitter.
3.5.1 Introduction to jitter
Jitter is defined as the time deviation of a signal with respect to its
expected occurrence in time. This timing shift in the signal has a large
number of causes including crosstalk, Inter Symbol Interference (ISI),
impedance discontinuities, and thermal noise [55]. The total jitter (TJ) of a
high-speed line can be decomposed into two main types: random jitter (RJ) and
deterministic jitter (DJ). The latter is further divided into subtypes, as
depicted in Figure 3.28.
[Diagram: Total jitter splits into Random (RJ) and Deterministic (DJ);
Deterministic splits into Periodic (PJ) and Data Dependent (DDJ).]
Figure 3.28: Classification of jitter.
Random jitter
RJ represents jitter which is uncorrelated with any other signal in the
design, thus producing unpredictable time variations. RJ is considered to
follow a Gaussian distribution, therefore its peak-to-peak values are not
mathematically bounded. Instead, RJ is usually expressed in Root-Mean-Square
(RMS) seconds. The primary contributors to RJ are the thermal noise, shot
noise and pink noise (1/f) generated in electronic devices.
Deterministic jitter
The second contributor to the TJ is the DJ. DJ is defined as a time deviation
of a signal that is repeatable and therefore can be predicted. Since the
jitter is repeated in time, the peak-to-peak value of this jitter is bounded.
For this reason, DJ measurements are expressed in units of peak-to-peak
seconds. This class of jitter is further divided into two types based on its
source.
• Periodic jitter (PJ) refers to timing shifts following a periodic pattern.
This jitter is typically caused by external deterministic noise sources,
such as power supply noise or unstable PLLs, but it is not correlated to the
data.
• Data-Dependent Jitter (DDJ) is defined as the jitter correlated to the
bit sequence in a data stream. DDJ is produced by a combination of
impedance mismatches, the frequency response of the transmission lines and
asymmetries in the duty cycle of the transmitted signal. The main sources
of DDJ are the Inter Symbol Interference (ISI) and the Duty-Cycle
Distortion (DCD).
Total jitter
The total jitter probability density function (PDF) is the result of the
convolution between the PDFs of RJ and DJ.
\[ TJ = RJ * DJ \tag{3.8} \]
TJ is usually quoted at a specific Bit Error Ratio (BER), since TJ is
unbounded due to the contribution of RJ. Equation 3.9 estimates the total
jitter for a specific BER [56].
\[ TJ(BER) = 2 \cdot Q_{BER} \cdot RJ_{RMS} + DJ(\delta\text{-}\delta) \tag{3.9} \]
where DJ(δ-δ) is the DJ obtained through the dual-Dirac method and QBER
is a factor derived from the complementary error function. QBER estimates the
amount of eye closure produced by the RJ for a specific BER. Values of the
combined factor 2·QBER that multiplies RJ(RMS) in Equation 3.9 are shown in
Table 3.7.
BER       2·QBER    BER       2·QBER
10^-12    14.069    10^-16    16.444
10^-13    14.698    10^-17    16.987
10^-14    15.301    10^-18    17.514
10^-15    15.882
Table 3.7: Factor 2·QBER as a function of BER.
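The factors of Table 3.7 can be reproduced from the Gaussian tail probability. The sketch below is a minimal illustration which assumes that the BER maps directly onto the one-sided Gaussian tail (no transition-density correction); under this assumption the product 2·QBER evaluates to the values listed in the table. The example inputs are the mean RJ(RMS) and DJ(δ-δ) measured at 4.8 Gbps, and the function names are illustrative.

```python
from statistics import NormalDist


def q_ber(ber: float) -> float:
    """Number of sigmas at which the one-sided Gaussian tail equals ber."""
    return -NormalDist().inv_cdf(ber)


def total_jitter(rj_rms: float, dj_dd: float, ber: float) -> float:
    """Dual-Dirac total jitter estimate, as in Equation 3.9."""
    return 2.0 * q_ber(ber) * rj_rms + dj_dd


for exponent in range(12, 19):
    ber = 10.0 ** -exponent
    print(f"BER = 1e-{exponent}: 2*Q = {2.0 * q_ber(ber):.3f}")

# Mean values measured at 4.8 Gbps (Table 3.8), in ps.
print(f"TJ(1e-12) = {total_jitter(2.74, 2.44, 1e-12):.2f} ps")  # about 40.99 ps
```

The estimate from the mean values is close to, but not identical to, the measured mean TJ, which is expected because the tabulated TJ is averaged over individual measurements.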
3.5.2 Eye diagrams
The eye diagram is a common technique to detect signal integrity problems in
the time domain. Eye diagrams are generated by overlaying a high number of
logic transitions on a common time scale. The opening in time and amplitude
of the eye diagram gives an estimation of the quality of the signal in terms
of jitter and also provides valuable information about the path attenuation.
Simulations of high-speed lines
In this thesis, eye diagrams were simulated for the receiving and
transmitting lines used for the communication with the front-end electronics.
The simulation was carried out using the ANSYS software tools, where the
S-parameters of the lines were extracted and co-simulated using IBIS-AMI
models for the drivers and receivers. Two different sets of simulations were
performed corresponding to the two data rates of 4.8 Gbps and 9.6 Gbps used
in the TilePPr prototype.
Figure 3.29 shows the simulated eye diagrams. Both diagrams show wide-
open eyes in amplitude and time, indicating a good layout design of the high-
speed lines and low attenuation. The estimated jitter from the simulations
is 1.083 psRMS for the transmitter lines at 4.8 Gbps (Figure 3.29 (a)), and
2.043 psRMS for the receiver lines at 9.6 Gbps (Figure 3.29 (b)). Both values
can be considered very low for the data rates of operation.
(a) Eye diagram for the transmitter lines at
the QSFP connector pins. The simulation is
performed with a data rate of 4.8 Gbps and a
PRBS31 pattern.
(b) Eye diagram for the receiver lines at the
Readout FPGA pins. The simulation is per-
formed with a data rate of 9.6 Gbps and a
PRBS31 pattern.
Figure 3.29: Simulated eye diagrams using the IBIS-AMI models provided by
Xilinx.
The high-frequency losses degrade the rise and fall times, thus reducing
the eye opening in amplitude. This effect is more evident for the receiver
lines operating at 9.6 Gbps than for the transmitter lines running at
4.8 Gbps.
Jitter and BER measurements
A set of measurements was done to evaluate the quality of the signals trans-
mitted to the front-end electronics. The different types of jitter were measured
in the optical signals transmitted through an Avago AFBR-79Q4Z QSFP mod-
ule [57] using a Keysight DCA-X 86100D sampling oscilloscope [58] equipped
with the optical module 86105C.
These measurements were employed to validate the correct design and pro-
duction of the TilePPr prototype since these tests include not only the high-
speed lines but also the rest of the components involved in the data communi-
cation such as clocks, jitter cleaners, and power supplies. An IBERT IP core
from Xilinx [59] was implemented in the Readout FPGA to transmit a PRBS31
pattern. Table 3.8 shows the jitter measurement results done at 4.8 Gbps and
9.6 Gbps data rates with the corresponding standard deviations, and Figure 3.30
shows the eye diagrams generated measuring the optical signal of one QSFP
transmitter.
                  4.8 Gbps             9.6 Gbps
               µ (ps)   σ (ps)      µ (ps)   σ (ps)
RJ(RMS)         2.74     0.28        2.97     0.25
DJ(δ-δ)         2.44     1.31        5.94     1.29
TJ(10^-12)     39.85     3.46       46.5      3.15
TJ(10^-13)     41.61     3.31       48.42     3.02
TJ(10^-14)     43.31     3.16       50.25     2.88
TJ(10^-15)     44.94     2.99       52.02     2.74
TJ(10^-16)     46.51     2.83       53.71     2.59
TJ(10^-17)     48.03     2.65       55.36     2.43
TJ(10^-18)     49.5      2.48       56.95     2.27
Table 3.8: Jitter measurement results for one transmitter of the TilePPr proto-
type at 4.8 Gbps and 9.6 Gbps rates.
The jitter measurements show a good quality of the signal transmitted towards
the front-end electronics with low jitter values. The estimated TJ ensures
that the design is robust enough to operate at the required data rates, showing
a TJ(10^-18) of 0.23 Unit Interval (UI) at 4.8 Gbps and 0.55 UI at 9.6 Gbps. While
the RJ takes similar values for 4.8 Gbps and 9.6 Gbps cases, the deterministic
component presents a higher value at 9.6 Gbps. The Keysight DCA-X 86100D
provided a resolution of 5 fs and 2 fs in the RJ(RMS) measurement for 4.8 Gbps
and 9.6 Gbps respectively. The resolution in the rest of measurements was 50 fs
for 4.8 Gbps and 20 fs for 9.6 Gbps.
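The conversion of the total jitter from picoseconds to Unit Intervals can be sketched as follows. This is only an illustrative calculation that takes the TJ(10^-18) mean values from Table 3.8; the function names are hypothetical. The results are close to the values quoted above.

```python
def unit_interval_ps(line_rate_gbps: float) -> float:
    """Duration of one bit (one Unit Interval) in picoseconds."""
    return 1000.0 / line_rate_gbps


def jitter_in_ui(jitter_ps: float, line_rate_gbps: float) -> float:
    """Express a jitter value given in picoseconds as a fraction of the UI."""
    return jitter_ps / unit_interval_ps(line_rate_gbps)


# TJ(10^-18) mean values taken from Table 3.8
tj_ui_4g8 = jitter_in_ui(49.5, 4.8)
tj_ui_9g6 = jitter_in_ui(56.95, 9.6)

print(f"4.8 Gbps: UI = {unit_interval_ps(4.8):.1f} ps, TJ = {tj_ui_4g8:.3f} UI")
print(f"9.6 Gbps: UI = {unit_interval_ps(9.6):.1f} ps, TJ = {tj_ui_9g6:.3f} UI")
```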
Finally, as part of the validation test for the TilePPr prototype, a BER
test was performed during a period of 115 hours [60]. A Xilinx IBERT IP
core was implemented to control the 16 transceivers connected to the QSFP
modules. The QSFP modules were connected in loopback mode using fibers
and the transceivers were configured to transmit a PRBS31 data pattern at
9.6 Gbps. Since no errors were detected during the whole test period, the obtained
BER was better than 5 · 10^-17 with a confidence level (CL) of 95% [61].
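The quoted limit can be cross-checked with the usual zero-error estimate, BER < -ln(1 - CL) / N, where N is the number of transmitted bits. The sketch below assumes that the bits of the 16 links are added together over the 115 hours; this accounting is an assumption and is not stated explicitly in the text.

```python
import math


def ber_upper_limit(n_bits: float, confidence: float = 0.95) -> float:
    """Upper limit on the BER when zero errors are observed in n_bits."""
    return -math.log(1.0 - confidence) / n_bits


hours = 115
line_rate_bps = 9.6e9
n_links = 16  # assumption: all 16 transceivers contribute to the statistics

n_bits = hours * 3600 * line_rate_bps * n_links
limit = ber_upper_limit(n_bits)
print(f"bits transmitted: {n_bits:.3e}, BER limit (95% CL): {limit:.2e}")
```

Under these assumptions the limit comes out slightly below 5 · 10^-17.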
(a) Optical eye diagram corresponding to one transmitter of the QSFP
module running at 4.8 Gbps.
(b) Optical eye diagram corresponding to one transmitter of the QSFP
module running at 9.6 Gbps.
Figure 3.30: Optical eye diagrams generated with the Keysight DCA-X 86100D
oscilloscope.
Chapter 4
Integration of the TilePPr
in the Demonstrator
As already covered in Chapter 2, the Demonstrator project aims to evaluate the
proposed DAQ architecture and detector readout electronics for the HL-LHC
prior to its installation into the ATLAS detector.
A Demonstrator module was populated with prototypes of all the new com-
ponents and tested during the testbeam campaigns. This module is composed
of four minidrawers containing the required electronics to operate and read out
up to 48 PMTs. Each minidrawer includes twelve 3-in-1 cards, one MainBoard,
one DaughterBoard and, depending on the minidrawer, an HVOpto board or
an HV distribution board. In the back-end electronics, the Tile PreProcessor
prototype distributes the configuration commands and sampling clock to the
front-end electronics and receives the digitized data for every period of the LHC
clock.
The DAQ architecture employed for the readout and operation of the Demonstrator
electronics is similar to the one proposed for the HL-LHC, with the difference
that the Demonstrator module is also able to provide analog trigger
signals to the L1Calo system, permitting its installation into the current ATLAS
detector.
This chapter gives a detailed description of the different firmware modules
included in the DaughterBoard and TilePPr prototype for the Demonstrator.
4.1 GBT protocol
The GigaBit Transceiver (GBT) protocol [62] was developed at CERN for the
data transmission between the front-end and the back-end electronics in the HL-
LHC era as part of the GBT project [63]. This protocol establishes a common
transmission path for the TTC, DAQ and Slow Control (SC) information.
The GBT project includes the development of different radiation tolerant
ASICs implementing the GBT protocol, such as the GBTx or the GBT-SCA [64],
as part of the front-end electronics where COTS components cannot operate
due to the radiation levels. However, in the counting rooms where the radiation
levels are not a concern, the GBT protocol can be implemented in FPGAs.
The VHDL-based firmware version of the GBT protocol, called GBT-FPGA
IP core [65], is supported by CERN and can be implemented in various FPGA
models with embedded transceivers.
Figure 4.1 shows a block diagram of the GBT-FPGA IP core indicating the
frequencies at which each block operates and the data flow. The GBT protocol
transmits a 120-bit frame at the LHC clock frequency of 40 MHz, resulting in
a line rate of 4.8 Gbps.
The implementation of the GBT-FPGA IP core in an FPGA employs four clocks
to handle the data across the different blocks.
• tx frame clk : this clock corresponds to the LHC clock frequency.
• tx word clk : the FPGA transceiver generates this clock internally to transmit
40-bit words and its frequency is 3 times the tx frame clk frequency,
that is, 120 MHz.
• rx frame clk : a Mixed-Mode Clock Manager (MMCM) is used to generate
it from the rx word clk and its frequency corresponds to the LHC clock
frequency.
• rx word clk : the FPGA transceiver recovers this clock from the incoming
data; it has a frequency of 120 MHz. A 40-bit word is received for every
rx word clk cycle.
The first block in the transmitter side is the Scrambler. It is implemented
as a Linear Feedback Shift-Register (LFSR) which produces pseudorandom bit
sequences reducing the occurrence of long streams of ones or zeros, maintaining
the DC-balance in the serial transmitter output signals.
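The effect of an LFSR-based scrambler can be illustrated with a small self-synchronizing scrambler. The polynomial 1 + x^6 + x^7 and the seed used below are chosen only for illustration and are not the ones of the GBT protocol, which are not given in this section; all function names are hypothetical.

```python
def scramble(bits, state=0b1010101):
    """Self-synchronizing scrambler, illustrative polynomial 1 + x^6 + x^7.

    'state' holds the last 7 scrambled bits (bit 0 is the most recent one).
    """
    out = []
    for b in bits:
        s = b ^ ((state >> 5) & 1) ^ ((state >> 6) & 1)
        out.append(s)
        state = ((state << 1) | s) & 0x7F
    return out


def descramble(bits, state=0b1010101):
    """Inverse operation: the state is rebuilt from the received bits."""
    out = []
    for s in bits:
        d = s ^ ((state >> 5) & 1) ^ ((state >> 6) & 1)
        out.append(d)
        state = ((state << 1) | s) & 0x7F
    return out


def longest_run(bits, value):
    """Length of the longest run of 'value' in a bit sequence."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b == value else 0
        best = max(best, cur)
    return best


payload = [0] * 120                      # worst case: a frame of all zeros
scrambled = scramble(payload)
print("longest run of zeros before:", longest_run(payload, 0))
print("longest run of zeros after: ", longest_run(scrambled, 0))
```

With a non-zero seed, the all-zero frame is converted into a pseudorandom sequence without long runs, and the descrambler recovers the original bits.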
Figure 4.1: Block diagram of the transmitter and receiver blocks of the GBT-
FPGA IP core.
Then, the scrambled data is driven into a double interleaved Reed-Solomon
(RS) encoder capable of correcting up to 16 consecutive errors per 120-bit GBT
word. The use of the RS error correction reduces the data bandwidth by about
30%, so the available data bandwidth is 3.2 Gbps. However, the RS block can
be bypassed to use the Forward Error Correction (FEC) field (32 bits) for data.
This operation mode is called GBT Wide-Bus, and it provides an increased user
data bandwidth of up to 4.48 Gbps.
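The bandwidth figures above follow directly from the number of bits per frame and the 40 MHz frame rate. A minimal sketch of this arithmetic, with hypothetical constant and function names, is:

```python
LHC_CLOCK_HZ = 40e6      # one GBT frame per LHC clock cycle
FRAME_BITS = 120         # complete GBT frame
DATA_BITS = 80           # user data field in Frame mode
FEC_BITS = 32            # FEC field, reused for data in Wide-Bus mode


def gbps(bits_per_frame, frame_rate_hz=LHC_CLOCK_HZ):
    """Bandwidth in Gbps for a given number of bits per frame."""
    return bits_per_frame * frame_rate_hz / 1e9


line_rate = gbps(FRAME_BITS)               # serial line rate
frame_mode = gbps(DATA_BITS)               # user data with RS encoding
wide_bus = gbps(DATA_BITS + FEC_BITS)      # user data in Wide-Bus mode

print(f"line rate:  {line_rate:.2f} Gbps")
print(f"Frame mode: {frame_mode:.2f} Gbps")
print(f"Wide-Bus:   {wide_bus:.2f} Gbps")
```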
The output of the RS block is sent to the TX Gearbox block which man-
ages the data transmission between the GBT-FPGA IP core and the FPGA
transceivers, performs the Clock Domain Crossing (CDC) between the two do-
mains, and adjusts the data path widths.
The inverse process is performed in the GBT receiver side, where the received
data passes through the Descrambler, Gearbox, and Decoder blocks. Additionally,
the GBT receiver includes the Frame Aligner block which aligns the 40-bit words
from the transceiver using the received patterns and builds the complete 120-bit
GBT word.
Since the front-end electronics is supposed to use the recovered clock to
transmit data to the TilePPr, both rx frame clk and tx frame clk in the TilePPr
are in the same clock domain. However, the phase relationship between them is
unknown, and thus safe data transfers between these clock domains cannot
be guaranteed due to metastability issues. This unknown phase difference poses
a challenge for those readout systems which have to combine data from both
clock domains. For example, in the TilePPr the received data are stamped with
their corresponding BCID, where the BCID is in the tx frame clk clock domain
and the received data are in the recovered rx frame clk clock domain. A solution
to this problem is proposed in the BE Tile GBT-FPGA IP core section.
GBT data format
The data format of the GBT protocol is described for the GBT Frame mode
in Table 4.1 and for the GBT Wide-Bus mode in Table 4.2. In both modes 4 bits
are reserved for the Header (H) and 4 bits for Slow Control (SC); the user data
field has 80 bits in Frame mode and 112 bits in Wide-Bus mode.
Bits     119–116   115–112   111–32   31–0
Field    H         SC        Data     FEC
Table 4.1: Standard GBT data format with FEC.
Bits     119–116   115–112   111–0
Field    H         SC        Data
Table 4.2: Wide-Bus GBT data format without FEC.
Another important feature of the GBT protocol is the header encoding. The
GBT header is encoded by the transmitter into two symbols via the Data sel
signal (Figure 4.1) adding some extra information to the contents of the frame.
• "0110": when the GBT word is an IDLE word.
• "0101": when the GBT word includes a data word.
In the GBT receiver side, the Data flag signal permits the identification of
the received word.
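As an illustration of the Frame mode layout of Table 4.1 and of the two header symbols, the following sketch packs and unpacks a 120-bit GBT word held in a Python integer. The function names are hypothetical and the code is not part of the GBT-FPGA IP core.

```python
HEADER_IDLE = 0b0110   # GBT word is an IDLE word
HEADER_DATA = 0b0101   # GBT word carries data


def pack_frame(header, sc, data, fec):
    """Build a 120-bit word: H[119:116], SC[115:112], Data[111:32], FEC[31:0]."""
    if not (0 <= header < 1 << 4 and 0 <= sc < 1 << 4):
        raise ValueError("header and SC are 4-bit fields")
    if not (0 <= data < 1 << 80 and 0 <= fec < 1 << 32):
        raise ValueError("data is an 80-bit field and FEC a 32-bit field")
    return (header << 116) | (sc << 112) | (data << 32) | fec


def unpack_frame(word):
    """Split a 120-bit word into its four fields."""
    return {
        "header": (word >> 116) & 0xF,
        "sc": (word >> 112) & 0xF,
        "data": (word >> 32) & ((1 << 80) - 1),
        "fec": word & 0xFFFFFFFF,
    }


def is_data_word(word):
    """Equivalent of the Data flag on the receiver side."""
    return unpack_frame(word)["header"] == HEADER_DATA


word = pack_frame(HEADER_DATA, 0x3, 0x1234, 0xDEADBEEF)
print(hex(word), is_data_word(word))
```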
Standard and Latency Optimized GBT version
The GBT-FPGA IP core can be configured in two different modes depending
on the latency requirements: the Standard and the Latency Optimized (LO)
version.
The Standard version is intended for non-timing critical applications since
it does not provide a deterministic latency, while the LO version guarantees a
low, fixed and deterministic latency of the clock and data in both directions at
the cost of a more complex implementation in the FPGA.
The implementation of the LO GBT-FPGA version requires bypassing those
blocks of the FPGA transceiver that cannot guarantee a fixed and deterministic
latency, such as the elastic buffers used to resolve the phase differences between
the transceiver and logic clock domains. In addition, the LO GBT-FPGA version
includes a phase alignment circuit to guarantee the deterministic phase of
the recovered rx frame clk with respect to the tx frame clk used in the source.
4.2 Tile GBT-FPGA IP core
The Tile GBT-FPGA IP core was designed based on the original LO GBT-
FPGA code to fulfill the requirements for the Tile Calorimeter readout system
at the HL-LHC.
One of the main differences between the LO GBT-FPGA version and the
Tile GBT-FPGA version is the data bandwidth. While the original code imple-
ments a communication path operating at 4.8 Gbps in both directions, the Tile
GBT-FPGA block implements an asymmetric communication path where the
downlink (back-end to front-end) operates at 4.8 Gbps and the uplink (front-end
to back-end) at 9.6 Gbps.
As discussed in Chapter 2, each redundant link of the DaughterBoard transmits
two 12-bit samples corresponding to 2 gains for 6 channels per LHC clock
cycle. Thus, the total data bandwidth required in the uplink to transmit the
digitized data reaches 5.76 Gbps. Doubling the rate of the GBT protocol would
provide a total data bandwidth of 6.72 Gbps (using the SC bits as data), and the
remaining data bandwidth would be insufficient to transmit both the integrator
and slow control data to the TilePPr modules.
For this reason, the uplink Tile GBT-FPGA blocks were configured to op-
erate in Wide-Bus mode, where the upper 16 bits of the 32-bit FEC field were
utilized to transmit data and the lower 16 bits were reserved for a Cyclic Re-
dundancy Check (CRC) word to provide error detection, achieving a total data
bandwidth in the uplink of 8 Gbps. It is important to remark here that the
transceivers are operated outside of the manufacturer specifications, since the
Quad PLL (QPLL) used to generate the internal clocks for the transceiver has a
frequency gap between 8 GHz and 9.8 GHz¹. This issue made the study and
correct selection of the clocking structure and transceiver configuration critical.
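The uplink bandwidth budget discussed above can be reproduced with a short calculation. The sketch assumes, as the text indicates, that the 4 SC bits are counted as data; the variable names are hypothetical.

```python
LHC_CLOCK_HZ = 40e6

# Digitized data: 6 channels x 2 gains x 12 bits per LHC clock cycle
required = 6 * 2 * 12 * LHC_CLOCK_HZ / 1e9

# Uplink at twice the nominal rate: two GBT words per LHC clock cycle
uplink_frame_rate_hz = 2 * LHC_CLOCK_HZ

# Frame mode at double rate: 80 data bits + 4 SC bits per word
doubled_frame_mode = (80 + 4) * uplink_frame_rate_hz / 1e9

# Tile Wide-Bus mode: 16 extra data bits taken from the FEC field
tile_wide_bus = (80 + 4 + 16) * uplink_frame_rate_hz / 1e9

print(f"required for samples: {required:.2f} Gbps")
print(f"Frame mode x2:        {doubled_frame_mode:.2f} Gbps")
print(f"Tile Wide-Bus x2:     {tile_wide_bus:.2f} Gbps")
```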
For the downlink, the Tile GBT-FPGA block was configured to operate
in Frame mode (using the error correction capabilities) at the nominal rate
of the GBT protocol. Minor changes to the original code were made to adapt
it to the TileCal needs. The Tile GBT-FPGA IP core and its implementation
in the DaughterBoards and TilePPr boards are described in detail below.
4.2.1 Front-end Tile GBT links
A total of four FE Tile GBT links were implemented in the DaughterBoard,
corresponding to two links per FPGA to communicate with the back-end elec-
tronics through a QSFP module. Figure 4.2 depicts a sketch of the Daughter-
Board showing the link connections between the QSFPs and the FPGAs for the
downlink (a) and uplink (b).
(a) Connections between the QSFPs and the FPGAs for the downlink communication.
(b) Connections between the QSFPs and the FPGAs for the uplink communication.
Figure 4.2: DaughterBoard link connections.
Depending on which QSFP module is used to operate the minidrawer, up to
two downlinks are routed to the FPGA transceivers, although in the current
firmware version only the A1 and B1 downlinks from QSFP A are implemented for
simplicity. When QSFP A is used, the GBTx A recovers a multiple of the LHC
clock (160 MHz) from the A0 channel which is routed to both FPGAs to drive
the clocking circuitry of the transceivers. In the same way, if QSFP B is used, the
GBTx B extracts the recovered clock from the B0 channel.
¹ This gap only occurs in the GTX transceivers and not in the GTH transceivers.
The recovered 160 MHz clock is connected to a Channel PLL (CPLL) which
generates all the necessary clocks to operate the FE GBT-FPGA block receiver.
In the case of the uplink, the CPLL cannot drive the transmitter at 9.6 Gbps,
and the high-performance QPLL has to be used instead. However, although the
uplink rate is in the QPLL range of operation (despite the gap mentioned
above), the values of its multipliers and dividers do not permit the generation
of the required clocks to drive the high-speed serializer in the transceiver.
A solution to overcome this limitation is to connect the 120 MHz clock
extracted from the Clock and Data Recovery (CDR) of the receiver directly to
the QPLL reference clock input using a Horizontal clock buffer (BUFH). This
technique, which is not recommended by the manufacturer [66], carries the risk
of adding jitter into the transceiver circuitry causing errors and instabilities
in the communication links due to the fact that the reference clock is routed
through the fabric clocking network.
In the first version of the GBT-FPGA IP core, the recovered clock from the
transceiver (rx word clk) was used as clock source for the QPLL with satisfac-
tory results. However, the stability of the transmitter was compromised by the
quality of the recovered clock. The proposed solution to enhance the robust-
ness of the links was to configure the CPLL of the unused receiver (A0, B0) to
generate 120 MHz from the 160 MHz provided by the GBTx. To implement
this, the auto-adapting algorithm of the CDR unit of the receiver was disabled
to prevent the transceiver from locking to incoming data.
The clocking structure implemented for the Tile GBT-FPGA blocks is pre-
sented in Figure 4.3. All the internal clocks needed for the operation of the
DaughterBoard are generated from the rx word clk keeping the uplink and
downlink synchronized in the same clock domain. In addition, the rx frame clk
(40MHz) is derived from the rx word clk keeping a fixed and deterministic phase
difference with the tx frame clk clock from the TilePPr. It is crucial that both
clocks keep a fixed and deterministic phase difference to provide the correct time
stamp to the data in the TilePPr, since the rx frame clk is used as sampling
clock for the digitization of the PMT pulses in the MainBoard.
Figure 4.3: Block diagram of the clocking distribution for the Tile GBT links
in the DaughterBoard.
4.2.2 Back-end Tile GBT links
Sixteen BE Tile GBT-FPGA links were implemented in the Readout FPGA
for the communication with the Demonstrator module. The transmitter part
of the Tile GBT-FPGA was configured in Frame mode and operated at the
nominal GBT data rate of 4.8 Gbps. Only minor modifications were made
to adapt the original code to the downlink requirements. A proper
clock distribution was planned in order to share the maximum number of clock
resources, reducing the number of clock domains.
However, the receiver part required major modifications in order to fulfill
the requirements of the data rate and to fit the high number of links in the
Readout FPGA. The BE Tile GBT-FPGA receiver was designed to operate at
twice the nominal GBT data rate (9.6 Gbps) and in Wide-Bus mode, making
the Tile GBT receiver compatible with the Tile GBT transmitter implemented
in the DaughterBoard. A major problem was found when implementing the 16
links due to the limited number of clocking resources available in the Readout
FPGA and to the complexity of managing such a high number of clock domains.
As already shown in Figure 4.1, each GBT-FPGA receiver requires an MMCM
block to generate the rx frame clk from the rx word clk, while the Readout
FPGA only contains 14 MMCMs.
A proposed solution to overcome this limitation and to implement the 16
Tile GBT links in the Readout FPGA was to replace the functionality of the
MMCM by Blind Oversampling CDR (BO-CDR) circuits [67] [68] [69].
The first modification was performed in the Descrambler block of the GBT-
FPGA IP core. The Descrambler block was modified to be clocked with the
rx word clk (Figure 4.4), where a multiplexer located between the RX Gearbox
and the Descrambler handles the data between both blocks. Since the Header
signal is received every 3 rx word clk clock cycles, a new GBT word is built for
every LHC half clock cycle (the uplink operates at 9.6 Gbps).
After modifying the Descrambler block, the rx frame clk is not required
anymore for the reception of the GBT words, and thus the MMCM can be
removed from the GBT receiver. Nevertheless, the data is still in the rx word clk
clock domain and has to be retimed to the TilePPr clock domain (tx frame clk),
where the phase relationship between both clocks is unknown.
Figure 4.4: Block diagram of the modified Descrambler of the BE Tile GBT-
FPGA module.
The latter issue is solved with the implementation of the BO-CDR circuit
shown in Figure 4.5. This circuit is composed of a set of samplers and synchronizers
with 3 shift registers and one multiplexer (MUX) which are implemented
for each GBT word bit, and a single MUX decision block for the complete GBT
link.
The BO-CDR circuit individually retimes each received GBT word bit from
the recovered rx word clk clock domain to the local rx frame clk clock domain
(indicated as φ0 in Figure 4.5), with the exception of the 4 bits of the header
field which are not used by other firmware blocks in the TilePPr. This solution
requires the implementation of 116 BO-CDR circuits per GBT link for a total
of 1,856 circuits to retime the 16 GBT links.
Figure 4.5: Block diagram of the BO-CDR circuit.
The principle of operation of the BO-CDR consists of oversampling the received
data using 3 copies of a locally generated rx frame clk shifted 120° in
phase from each other. This technique, called 3X oversampling, is represented
in Figure 4.6.
Figure 4.6: Concept of the 3X oversampling technique.
Therefore, the received data is registered using the 3 samplers shown in
Figure 4.5 and retimed into the local rx frame clk clock domain using the synchronizers
and shift registers. The retimed data is stored in the 3 shift registers
and the MUX decision block selects which of the 3 copies will be sent out of the
GBT-FPGA IP core through the data multiplexer (MUX).
Figure 4.7 shows a block diagram of the MUX decision block. It implements
a phase picking algorithm which performs the selection for the entire GBT word
based on the value of the syncx signals at the rx frame clk frequency for every
received GBT word.
Figure 4.7: Block diagram of the MUX decision block.
The syncx signals result from 3X oversampling the HG/LG
bit with the circuit shown in Figure 4.8. As will be covered in the uplink data
format section, the HG/LG bit is used to identify the sample gain of the incoming
data. This bit was found to be the most suitable since it toggles every clock
cycle, providing more statistics to the decision block than any other bit.
Figure 4.8: Synchronization circuit implemented for the acquisition and retiming
of the HG/LG bit.
After the reception of the syncx, the transition edges of the HG/LG are
easily found by applying the XOR operation between two adjacent samples. The
results of the XOR operation are passed to the Phase Selector which provides the
result to the moving average blocks. Then, the Address Control block compares
the output of the moving average blocks and the position of the MUX changing
the shift register addresses if needed.
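The edge detection and averaging steps can be sketched in software as shown below. This is only a behavioural illustration: the pairing of the samples in each XOR, the averaging window and all names are assumptions, and the actual firmware is described by Figure 4.7.

```python
from collections import deque


def edge_flags(s0, s1, s2, s0_next):
    """XOR of adjacent samples of the HG/LG bit.

    s0, s1, s2 are the samples taken with the three clock phases in one
    period and s0_next is the first sample of the following period.
    A 1 marks the interval in which the transition edge was found.
    """
    return (s0 ^ s1, s1 ^ s2, s2 ^ s0_next)


class EdgeAverager:
    """Moving average of the edge flags over the last 'window' periods."""

    def __init__(self, window=16):
        self.history = deque(maxlen=window)

    def update(self, flags):
        self.history.append(flags)
        n = len(self.history)
        return tuple(sum(f[i] for f in self.history) / n for i in range(3))


averager = EdgeAverager(window=4)
# Toggling bit whose edge always falls between the second and third sample
for samples in [(0, 0, 1, 1), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0)]:
    averages = averager.update(edge_flags(*samples))
print(averages)
```

The interval with the highest average indicates where the transition edge lies, which is the information needed to choose a sampling phase far from it.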
The phasor diagram corresponding to the implemented phase picking algo-
rithm is shown in Figure 4.9, where the black arrow represents the phase of the
transition edge with respect to the local rx frame clk, the red arrow is the phase
of the selected data and ϕd is the phase between the received data and the local
rx frame clk.
Figure 4.9: Phasor diagram describing the operation of the phase picking
algorithm implemented in the MUX decision block.
Therefore, the MUX decision block can select the output data depending on
three different situations:
• If 0 < ϕd < 2π/3 then the selected clock phase is φ4π/3.
• If 2π/3 < ϕd < 4π/3 then the selected clock phase is φ0.
• If 4π/3 < ϕd < 2π then the selected clock phase is φ2π/3.
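The three cases above can be written as a small selection function. In each case the selected clock phase lies half a period (π) away from the centre of the interval that contains ϕd. The function name and the returned labels are hypothetical, and the assignment of the exact boundary values, which the list leaves open, is an arbitrary choice.

```python
import math

TWO_PI_OVER_3 = 2 * math.pi / 3


def select_phase(phi_d):
    """Return the label of the clock phase selected for a given phi_d (rad)."""
    phi = phi_d % (2 * math.pi)
    if phi < TWO_PI_OVER_3:
        return "phi_4pi/3"
    if phi < 2 * TWO_PI_OVER_3:
        return "phi_0"
    return "phi_2pi/3"


for angle in (math.pi / 3, math.pi, 5 * math.pi / 3):
    print(f"phi_d = {angle:.2f} rad -> {select_phase(angle)}")
```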
This circuit is also capable of compensating the phase drifts produced be-
tween the clocks due to temperature and/or voltage variations in the electronics,
since it is continuously selecting the most appropriate clock phase to register
the incoming data. However, a special situation occurs when the phase between
both clocks varies and the MUX decision block changes the selected data from φ0 to
φ4π/3, or vice versa. Since the same address of the three shift registers points
to data acquired at different LHC clock cycles, this situation produces the dis-
placement of one local rx frame clk clock cycle in the received data stream. In
order to prevent this mis-synchronization the MUX decision block compensates
the missing or added clock cycle adjusting the address of the shift registers in
real time.
Finally, there is another important operation that the MUX decision block
performs to time stamp the received data properly. As mentioned above, the
front-end electronics transmit the high and low gain samples corresponding to
one LHC clock cycle at the LHC frequency, accompanied by the HG/LG bit. In
order to time stamp the samples with the correct BCID, during the initialization
of the BE GBT-FPGA links the MUX decision block checks the value of the
HG/LG bit with respect to the local tx frame clk. If both signals are not properly
aligned, the MUX decision block uses the shift registers to add a delay of half a
tx frame clk cycle to the output data.
The final architecture of the Tile GBT-FPGA IP core implemented in the
TilePPr is shown in Figure 4.10, where the MMCM was replaced by a BO-CDR
circuit and the rx frame clk clocks are generated with the same MMCM (not
shown in this figure) used to generate the local tx frame clk clock.
Figure 4.10: Block diagram of BE Tile GBT link.
4.3 Data format
The data format for the communication with the upgraded readout electronics
was defined to fulfill the requirements. All configuration commands needed for
the operation of the module, the JTAG data for the remote configuration and the
LHC clock are distributed via the downlink. The uplink path transports all the
readout data from the front-end electronics, monitoring data and configuration
status of the module.
4.3.1 Downlink data format
The downlinks connect the TilePPr prototype to the GBTx chips and
FPGAs in the DaughterBoard using a GBT link configured in Frame mode.
The data format for the downlink is different depending on its destination.
FPGA downlink
The FPGA downlink is used to transmit the configuration commands, read
back requests and the Bunch Counter Reset (BCR) signal. The 84-bit words
are distributed as indicated in Table 4.3. Commands are addressed to different
DB registers by encoding the DB Address field.
Bits     83–50      49    48   47–32        31–0
Field    Not used   BCR   N    DB Address   Data
Table 4.3: Data format fields in the downlink. Bit N indicates that the received
command is new and therefore the corresponding register has to be updated
with the data contained in the Data field.
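A minimal sketch of how such a downlink word could be assembled and decoded is given below, assuming the field boundaries of Table 4.3 (BCR in bit 49, N in bit 48, DB Address in bits 47 to 32 and Data in bits 31 to 0). The function names are hypothetical and do not correspond to the firmware.

```python
def pack_fpga_downlink(bcr, new, db_address, data):
    """Build the 84-bit FPGA downlink word; bits 83..50 are left at zero."""
    return (
        ((bcr & 0x1) << 49)
        | ((new & 0x1) << 48)
        | ((db_address & 0xFFFF) << 32)
        | (data & 0xFFFFFFFF)
    )


def unpack_fpga_downlink(word):
    """Extract the fields of an FPGA downlink word."""
    return {
        "bcr": (word >> 49) & 0x1,
        "new": (word >> 48) & 0x1,
        "db_address": (word >> 32) & 0xFFFF,
        "data": word & 0xFFFFFFFF,
    }


# New command (N = 1) writing 0x12345678 to a register at address 0x00A5
word = pack_fpga_downlink(bcr=0, new=1, db_address=0x00A5, data=0x12345678)
fields = unpack_fpga_downlink(word)
if fields["new"]:
    print(f"update register 0x{fields['db_address']:04X} "
          f"with 0x{fields['data']:08X}")
```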
GBTx downlink
The GBTx downlink is employed to distribute the LHC clock to the FPGA
transceivers, to transmit the JTAG signals for remote configuration and reset of
the DaughterBoard FPGAs. In the back-end electronics, the TilePPr prototype
encodes the JTAG and reset signals into the downlink GBT word and the GBTx
propagates the decoded signals to the FPGAs. The TMS, TCK and TDI signals
are transmitted through the GBTx while the corresponding TDO is transmitted
by the FPGA not being programmed. The data format for controlling the
remote programming and resets is shown in Table 4.4. Only the lower 24 bits
of the GBT word are used.
Bits     23–22   21–20   19–18   17–16   15–8   7–6     5–4     3–2     1–0
Field    TMS B   TDI B   RST B   TCK B   N.U.   TMS A   TDI A   RST A   TCK A
Table 4.4: GBTx downlink data format.
A sketch of the connections between the GBTx in side A and the DaughterBoard
FPGAs is represented in Figure 4.11. The second GBTx in side B is
connected to the FPGAs in the same way but receives the signal through the
QSFP in side B.
Figure 4.11: Connections between the GBTx chip and the DaughterBoard FPGAs
for QSFP A.
4.3.2 Uplink data format
The DaughterBoard transmits a data word including high and low gain samples,
integrator data and monitoring data at the LHC clock frequency. As already
described in the previous section, the uplink implements the Wide-Bus GBT
mode in order to increase the data bandwidth. Since the Wide-Bus
mode does not implement the FEC algorithms, the last two bytes of the uplink
GBT word are reserved for the CRC, which uses bits 115 to 16 as
input. The CRC permits the detection of errors occurring in the uplink
communication path. The CRC algorithm was implemented as an LFSR circuit
and XOR gates. The polynomial G(x) employed for the communication is stated
in Equation 4.1.
G(x) = 1 + x + x^2 + x^4 + x^7 + x^{12} + x^{16}    (4.1)
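A bit-serial software model of a CRC based on the polynomial of Equation 4.1 is sketched below. The initial value (zero), the bit ordering (most significant bit first) and the absence of a final XOR are assumptions, since these details are not given in the text; the function name is hypothetical.

```python
# G(x) without the implicit x^16 term: x^12 + x^7 + x^4 + x^2 + x + 1
CRC_POLY = 0x1097


def crc16(value, n_bits, poly=CRC_POLY, init=0):
    """Bit-serial CRC-16 of the n_bits least significant bits of 'value'."""
    crc = init
    for i in range(n_bits - 1, -1, -1):       # most significant bit first
        bit = (value >> i) & 1
        msb = (crc >> 15) & 1
        crc = (crc << 1) & 0xFFFF
        if msb ^ bit:
            crc ^= poly
    return crc


# 100-bit input corresponding to bits 115..16 of the uplink GBT word
payload = 0x9ABCDEF0123456789ABCDEF01 & ((1 << 100) - 1)
crc = crc16(payload, 100)

# Receiver side: the CRC of payload followed by its CRC is zero
residue = crc16((payload << 16) | crc, 116)
print(f"CRC = 0x{crc:04X}, residue = 0x{residue:04X}")
```

With these conventions the receiver can detect an error by checking that the residue is different from zero.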
The readout and integrator data are transmitted at the LHC clock frequency
using 80 bits of the GBT word. Table 4.5 describes the data format used for the
uplink. For every period of the LHC clock, two GBT words are transmitted
to the TilePPr prototype. In the first half period the low gain samples from
6 PMTs are transmitted, and in the second half period the high gain samples
of the same 6 PMTs. Five bits are reserved for the integrator data. The integrator
data words are divided into 5-bit words following the format described in the
Integrator block section. Additionally, the BCR signal received in the front-end
electronics is transmitted back to the TilePPr prototype. As will be discussed
in the next chapter, the BCR signal is used to calculate the latency of the links in
units of BC.
Bits     95–91        90   89      88    87–76   75–64   63–52   51–40   39–28   27–16
Field    Integrator   NU   HG/LG   BCR   PMT6    PMT5    PMT4    PMT3    PMT2    PMT1
Table 4.5: Data format fields in the uplink for the readout and integrator data.
The HG/LG bit represents the gain: 1 for high gain and 0 for low gain.
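The packing of the readout fields can be sketched as follows, assuming that the layout of Table 4.5 reads as Integrator in bits 95 to 91, a not used bit in position 90, HG/LG in bit 89, BCR in bit 88 and six 12-bit samples from PMT6 (bits 87 to 76) down to PMT1 (bits 27 to 16). The function names are hypothetical.

```python
def pack_uplink(integrator, hg_lg, bcr, pmt_samples):
    """Pack the readout fields into bits 95..16 of the uplink GBT word.

    pmt_samples[0] is the 12-bit sample of PMT1 and pmt_samples[5] of PMT6.
    """
    if len(pmt_samples) != 6:
        raise ValueError("six PMT samples are expected")
    word = ((integrator & 0x1F) << 91) | ((hg_lg & 0x1) << 89) | ((bcr & 0x1) << 88)
    for i, sample in enumerate(pmt_samples):
        word |= (sample & 0xFFF) << (16 + 12 * i)
    return word


def unpack_uplink(word):
    """Recover the readout fields from an uplink GBT word."""
    return {
        "integrator": (word >> 91) & 0x1F,
        "hg_lg": (word >> 89) & 0x1,
        "bcr": (word >> 88) & 0x1,
        "pmt_samples": [(word >> (16 + 12 * i)) & 0xFFF for i in range(6)],
    }


# High gain word (HG/LG = 1) sent in the second half of the LHC clock period
samples = [0x001, 0x002, 0x003, 0x004, 0x005, 0xFFF]
word = pack_uplink(integrator=0b10101, hg_lg=1, bcr=0, pmt_samples=samples)
print(unpack_uplink(word))
```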
The upper bits of the GBT word are used to read back the front-end electronics
configuration and to monitor the DCS values. As for the DB commands,
each message consists of a 16-bit address and a 32-bit argument. Since the DCS and TTC
field has 20 bits, each transmission is performed using three consecutive GBT
words. The two Most Significant Bits (MSB) are used to index the parts of
the total message. The format of a fragment to read back the configuration is
defined in Table 4.6.
Word   Bits 113–112   Bits 111–96
1      00             0x0000
2      01             DB Address
3      10             Parameter[31..16]
4      11             Parameter[15..0]
Table 4.6: Data format fields in the uplink for the TTC and DCS data.
4.4 Front-end electronics firmware
In this section the firmware blocks implemented in the DaughterBoards are
described. Figure 4.12 shows a block diagram of the main functional blocks of
the firmware.
4.4.1 Data Packer
The Data Packer handles the DB registers that are used for remote operation
and configuration of the front-end electronics and builds the uplink GBT word
with the readout and monitoring data. A set of nine 32-bit registers interface
the different blocks shown in Figure 4.12 with the TilePPr in the back-end
electronics. A summary of the register list and the description is shown in
Table 4.7.
Figure 4.12: Block diagram of the DaughterBoard firmware.
The Data Packer decodes the downlink GBT words to fill the corresponding
DB registers that are used for controlling the rest of the firmware blocks. Moreover,
the Data Packer builds the uplink GBT word with the data received from
the different firmware blocks, such as the Integrator and the DCS module. The Data
Packer also extracts the data contained in the DB registers, adding it to the uplink
GBT word. The DB registers are continuously read in a circular way, only
interrupted when the TilePPr prototype requests the reading of a MainBoard
register.
The Data Packer uses the received BCR signal to synchronize the data trans-
mission sending a complete set of high and low gain samples for every LHC clock
cycle. Therefore, the data packing of the samples is deterministic in terms of
latency since samples corresponding to the same BCID are aligned with the
BCR.
Description                         Type
MainBoard command                   W
Firmware version                    R
Read back commands FPGA 0           R
Read back commands FPGA 1           R
XADC reading                        R
Soft Error Mitigation               R
Integrator                          R/W
Status / operation of the HVOpto    R/W
Read back / direct CIS commands     R/W
Table 4.7: Registers for the operation of the DaughterBoard, where the W
indicates that the register can be only written, R that the register can be only
read and R/W that the register can be either read or written.
4.4.2 MainBoard Interface module
The MainBoard Interface module provides a communication path between the
DaughterBoard and the MainBoard through an SPI port. The MainBoard is
populated with four Altera Cyclone IV FPGAs, each one controlling 3 channels.
The DaughterBoard receives the commands from the Data Packer and routes
them to the appropriate MainBoard FPGA. The MainBoard commands permit
the complete configuration of the FEBs, integrator ADCs, and fast readout
ADCs.
The MainBoard commands use 24 bits: 3 bits are reserved for the type of
command, 5 bits for the MainBoard FPGA and channel ID, and 16 bits for the
command and data. The data format is presented in Table 4.8.
Bits    23   22   21   20..18   17..16    15..0
Field   T    E    B    FPGA     channel   command and data
Table 4.8: Data format of the commands transmitted to the MainBoard FPGAs.
The encoding of the 3 MSBs for the selection of the command type is
represented in Table 4.9. Bits 20 to 16 are used to select the FPGA and
channel or to send the commands in broadcast. The encoding of the FPGA and
channel fields is indicated in Tables 4.10 and 4.11.
4.4. FRONT-END ELECTRONICS FIRMWARE
T E B Command type
1 0 0 CIS timing and ADC commands (T commands)
1 0 1 Read back of timing and ADC register values (B commands)
0 1 0 TTC commands (E commands)
0 0 1 Read back of TTC commands (B commands)
Table 4.9: Header identification code for the 3-in-1 commands.
Bits 20..18   FPGA
000           A0
001           A1
010           B0
011           B1
1XX           All FPGAs
Table 4.10: MainBoard FPGA identification code.
The distribution of the command and data fields depends on the type of
command. For the TTC commands (E), the 4 MSB bits are reserved to define
the type of command and the 12 LSB bits for data. In the case of the Timing
and ADC commands (T) the 3 MSB bits are reserved to define the type of
command and the 13 LSB bits are for data.
TTC commands
The TTC commands (E) are used for configuration of 3-in-1 boards and inte-
grator ADCs on the MainBoard. These commands allow the configuration of
3-in-1 cards for physics, CIS calibration runs or to disable the trigger outputs.
A summary of the functionalities controlled through the E commands is listed
below:
• Configuration of the DACs driving the pedestal of the fast ADCs.
• Configuration of the internal switches of the 3-in-1 for physics data taking
or CIS.
• Control of the DAC that charges the capacitors for CIS.
• Charge and discharge of the capacitors for CIS.
• Configuration of the Integrator gain.
• Enable / disable the trigger output.
Bits 17..16   PMT
00            Channel 1
01            Channel 2
10            Channel 3
11            All channels
Table 4.11: PMT identification code.
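The command layout of Tables 4.8 to 4.11 can be illustrated with a short sketch. The Python below only shows the bit packing and is not the firmware or the software used with the Demonstrator. The function and dictionary names (pack_mainboard_command, e_payload, t_payload, COMMAND_TYPE, and so on) are hypothetical, and the sub-command codes passed to the payload helpers are placeholders, since the actual codes are not listed in this section.

```python
# Illustrative packing of the 24-bit MainBoard command word (Tables 4.8 to 4.11).
# All names are hypothetical; only the bit positions come from the text.

# T, E, B header bits (bits 23..21), Table 4.9
COMMAND_TYPE = {
    "T": 0b100,           # CIS timing and ADC commands
    "T_READBACK": 0b101,  # read back of timing and ADC register values
    "E": 0b010,           # TTC commands
    "E_READBACK": 0b001,  # read back of TTC commands
}

# FPGA field (bits 20..18), Table 4.10; 1XX selects all FPGAs
FPGA_CODE = {"A0": 0b000, "A1": 0b001, "B0": 0b010, "B1": 0b011, "ALL": 0b100}

# Channel field (bits 17..16), Table 4.11
CHANNEL_CODE = {1: 0b00, 2: 0b01, 3: 0b10, "ALL": 0b11}


def e_payload(subtype: int, data: int) -> int:
    """E command payload: 4 MSBs select the command, 12 LSBs carry data."""
    if not 0 <= subtype < 16 or not 0 <= data < 4096:
        raise ValueError("subtype must fit in 4 bits and data in 12 bits")
    return (subtype << 12) | data


def t_payload(subtype: int, data: int) -> int:
    """T command payload: 3 MSBs select the command, 13 LSBs carry data."""
    if not 0 <= subtype < 8 or not 0 <= data < 8192:
        raise ValueError("subtype must fit in 3 bits and data in 13 bits")
    return (subtype << 13) | data


def pack_mainboard_command(kind: str, fpga: str, channel, payload: int) -> int:
    """Assemble header, FPGA, channel and 16-bit payload into one 24-bit word."""
    if not 0 <= payload < (1 << 16):
        raise ValueError("payload must fit in 16 bits")
    return (
        (COMMAND_TYPE[kind] << 21)
        | (FPGA_CODE[fpga] << 18)
        | (CHANNEL_CODE[channel] << 16)
        | payload
    )


def unpack_mainboard_command(word: int) -> dict:
    """Split a 24-bit word back into its raw fields."""
    return {
        "header": (word >> 21) & 0b111,
        "fpga": (word >> 18) & 0b111,
        "channel": (word >> 16) & 0b11,
        "payload": word & 0xFFFF,
    }
```

For instance, pack_mainboard_command("E", "B0", 2, e_payload(0x3, 0x0AB)) addresses channel 2 of FPGA B0 with a placeholder E sub-command 0x3 and a 12-bit data value 0x0AB.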
Timing and ADC commands
The Timing and ADC commands (T) can be CIS phase timing commands or
ADC commands. The CIS timing commands configure the phase of the TPH
and TPL pulses that initiate the charge and discharge of the capacitors. The T
commands permit the adjustment of the peak of the CIS pulse with respect to the
sampling clock in steps of 16/25 ns. T commands are also used to configure the ADCs.
Through the MainBoard FPGAs, the ADCs can be configured to send data
patterns for the alignment of the serial data in the DB, to reset, or to read
out internal values of the ADC.
Read back commands
The execution of read back commands (R) generates an 18-bit word which
includes the PMT ID, the last executed command and the corresponding data field.
The use of R commands is very helpful during the operation of the module to
verify that the commands have reached the FPGAs in the MainBoard. After
the execution of an R command, the DB collects the requested data from the
MainBoard FPGAs and transmits it to the TilePPr through the uplink.
4.4.3 Charge Injection System block
The Charge Injection System can be controlled using the TPH and TPL lines,
which connect the DB with the Cyclone IV FPGAs, or by executing E commands
to configure the internal switches of the 3-in-1 cards.
The CIS block controls the charge and discharge of the small and large
capacitors by toggling the TPH and TPL lines. TPH controls the small capacitor
to calibrate the high gain circuitry while TPL controls the large capacitor for
low gain calibration. This block is configured from the back-end electronics with
the BCID positions when the capacitor has to be charged and discharged. The
value of the DAC is configured through a non-synchronous command from the
TilePPr.
4.4.4 Integrator block
The Integrator block reads out the integrator ADCs in the MainBoard through
the I2C bus. Each 16-bit ADC present in the MainBoard digitizes the integrated
current from the 3-in-1 card integrator. The integrator block requests a new
ADC sample every pre-programmed number of orbits, and then transmits the
samples to the Data Packer. The number of orbits is configured through the
Integrator DB register from Table 4.7.
Since 4 bits of the uplink word are reserved for the integrator data, the
transmission of an integrator sample requires 5 clock cycles of tx frame clk.
Thus, the integrator data is divided into 5 blocks, as indicated in Table 4.12.
Word   Bit 95   Bits 94..91
1      V        Channel ID
2      V        bits 3..0
3      V        bits 7..4
4      V        bits 11..8
5      V        bits 15..12
Table 4.12: Data format and sequence of the integrator words transmitted to
the TilePPr. The V bit indicates that the received data is valid.
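The five-word sequence of Table 4.12 can be sketched in a few lines of Python. This is only an illustration of the reassembly on the receiving side, with hypothetical function names; it assumes that the fifth word carries bits 15..12 of the 16-bit ADC sample and that bit 95 and bits 94..91 are counted from bit 0 of the uplink word.

```python
# Illustrative reassembly of one integrator sample from five uplink words
# (Table 4.12). Names are hypothetical; word 5 is assumed to carry bits 15..12.

def extract_integrator_field(uplink_word: int) -> tuple:
    """Return (valid, nibble) from bit 95 and bits 94..91 of an uplink word."""
    valid = (uplink_word >> 95) & 0x1
    nibble = (uplink_word >> 91) & 0xF
    return valid, nibble


def split_sample(channel_id: int, sample: int) -> list:
    """Split a 16-bit sample into the five 4-bit fields, channel ID first."""
    if not 0 <= channel_id < 16 or not 0 <= sample < (1 << 16):
        raise ValueError("channel_id must fit in 4 bits and sample in 16 bits")
    return [channel_id] + [(sample >> shift) & 0xF for shift in (0, 4, 8, 12)]


def assemble_sample(fields: list) -> tuple:
    """Rebuild (channel_id, sample) from five (valid, nibble) pairs."""
    if len(fields) != 5:
        raise ValueError("an integrator sample spans exactly five words")
    if not all(valid for valid, _ in fields):
        raise ValueError("every word of the sequence must have the V bit set")
    channel_id = fields[0][1]
    sample = 0
    for position, (_, nibble) in enumerate(fields[1:]):
        # Words 2 to 5 carry the sample from the least significant nibble up.
        sample |= (nibble & 0xF) << (4 * position)
    return channel_id, sample
```

The split_sample helper is included only so that the round trip can be checked; in the real system the splitting is done by the DaughterBoard firmware.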
4.4.5 GBTx configuration module
The GBTx configuration module is implemented with an 8-bit PicoBlaze processor
[70], which configures the GBTx chips via an I2C interface. The PicoBlaze is
connected to a 100 MHz local oscillator and loads the configuration of the GBTx
chip every time the DaughterBoard is powered on. This module configures the
GBTx enabling all the output clocks required to drive the GTX transceivers and
the e-links needed for the remote JTAG configuration and resets through the
GBTx chip. In addition, timeout and watchdog capabilities are included in the
present GBTx configuration to monitor the correct operation of the chip and
reset the internal FSMs if needed. Once the configuration of the GBTx satisfies
the requirements for the Demonstrator, the GBTx chips will be permanently
programmed with the desired configuration, and the PicoBlaze processor will
be used to reconfigure the GBTx chip and for monitoring purposes.
4.4.6 Monitoring block
Xilinx 7 series FPGAs include a Xilinx Analog-to-Digital Converter (XADC)
block composed of a dual 12-bit ADC operating at 1 Msps and some on-chip
sensors to measure internal voltages and temperatures. Some of the
DaughterBoard FPGA pins can be used to connect analog signals and digitize them
with the internal XADC block. In the DaughterBoard, 10 analog inputs are
connected to the XADC block for remote monitoring of the module parameters. The
monitor block reads out four internal FPGA measurements from the XADC block,
including voltages and temperatures, and eight external measurements
corresponding to the MainBoard voltages and the module temperatures. All these
values are written sequentially to the XADC register and transmitted to the
TilePPr.
4.4.7 DCS module
The DCS module handles the communication with the HVOpto board for the
operation and monitoring of the PMT blocks through the 40-pin connector
in the DB. This module is only functional in minidrawers equipped with the
HVOpto board to provide the high voltage to the PMTs. The communication
protocol used between the DaughterBoard FPGA and the MAX chip in the
HVOpto board is SPI. The DCS module converts the received commands from
the TilePPr to SPI and communicates with the MAX chip in the HVOpto board.
The DCS module receives the commands from the TilePPr through the HVOpto
register. DCS commands are used to enable or disable the HV channels, set the
HV values or request a measurement reading from the MAX chip temperature
and voltage sensors.
4.4.8 ADC block
The ADC block is one of the most critical pieces of the DB firmware. This
block deserializes the data coming from 6 dual-channel 12-bit ADCs
(LTC2264-12) at a data rate of 560 Mbps per ADC channel. For each channel, the
MainBoard ADCs generate one 14-bit serial word per sample of the analog input
signal, where the 12 MSBs contain the sample
data and the two remaining bits are present to ensure software compatibility
with the 14-bit version of these ADCs. Synchronous with the serial data, the
ADC provides two auxiliary signals: a 280 MHz clock signal, called bit clk,
which is used to register the serial data in Double Data Rate (DDR) mode; and
a 40 MHz clock signal, called frame clk, which delimits the boundaries
of the deserialized word. Figure 4.13 shows the timing diagram of the data and
clock signals provided by the ADC chip for the configuration used in the MainBoard.
Figure 4.13: Timing diagram of the ADC data and clock signals. FR signal
is the frame clk, DCO is the bit clk and OUT the output data. Extracted
from [71].
In the DaughterBoard, two ISERDESE2 blocks [72] are used per ADC channel
to deserialize the serial data into a 14-bit parallel word. The selection of
a proper clocking architecture to provide the clocks to the ISERDESE2 is
crucial for the correct deserialization. The most common clocking architecture
for the implementation of ADC interfaces in FPGAs includes an I/O clock buffer
(BUFIO) to provide the bit clk to the ISERDESE2 blocks through a dedicated
clocking network, and a second clock seven times slower than the bit clk, called
frame clk local, which is generated from the bit clk with a regional clock buffer
(BUFR). This clocking architecture is recommended for source-synchronous
data capture since it ensures phase alignment between the bit clk and the
frame clk local.
However, the phase relationship between the generated frame clk local and
the incoming frame clk is unknown since the output phase of the BUFR after
its initialization cannot be predicted. Figure 4.14 shows a timing diagram of
the deserialization process using the ISERDESE2 blocks with a deserialization
factor of 4, where frame clk local and frame clk have different phases.
Figure 4.14: Timing diagram for the deserialization of a 4-bit word with the
ISERDESE2 block.
Most applications that do not require deterministic latency implement this
clocking architecture and use a dedicated block of the ISERDESE2 to shift the
bits of the parallel data until it is aligned with frame clk. This bit-shift
operation in the ISERDESE2 can introduce a non-deterministic latency in the outputs
under some situations [73]. Furthermore, since the deserialized data is clocked
with frame clk local and has to be retimed to the rx frame clk domain, a latency
uncertainty of one LHC clock cycle is introduced in the output of the ADC
block.
Therefore, an alternative method was developed to achieve word alignment
with deterministic and fixed latency using the ISERDESE2 blocks and the rec-
ommended clocking architecture.
As shown in Figure 4.15, the proposed method keeps the same clocking
architecture, where the frame clk local is generated from the bit clk using a BUFR.
After the initialization of the DaughterBoard, the frame clk is deserialized
using the bit clk and the frame clk local. The parallel word is passed to the Word
Alignment FSM (WA-FSM), which reinitializes the BUFR with a reset signal
until the deserialized frame clk is equal to 11111110000000, indicating that
frame clk and frame clk local are phase aligned. In order to obtain a different
output phase after resetting the BUFR, the WA-FSM delays the reset signal in
every iteration in steps of one bit clk period using a shift register.
Figure 4.15: Block diagram of the ADC block including the WA-FSM and the
BO-CDR.
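The word-alignment loop performed by the WA-FSM can be pictured with a toy model. The Python below is a behavioural sketch only and not the firmware: it assumes that the BUFR can start in one of 7 phases, that each phase rotates the deserialized frame clk word by two bits (one bit clk period in DDR mode), and that each additional bit clk period of reset delay moves the BUFR output to the next phase. The function names are hypothetical.

```python
# Toy model of the word-alignment loop: not the firmware, only the idea.
# Assumptions (not from the text): the BUFR can start in one of 7 phases, each
# phase rotates the deserialized frame clk word by 2 bits (one bit clk period
# in DDR mode), and each extra bit clk period of reset delay moves the BUFR
# to the next phase.

ALIGNED_WORD = "11111110000000"   # deserialized frame clk when phases match
DIVIDE_FACTOR = 7                 # frame clk local = bit clk / 7
BITS_PER_BIT_CLK = 2              # DDR capture: two bits per bit clk period


def deserialized_frame_clk(phase: int) -> str:
    """Word seen by the WA-FSM when the BUFR output has the given phase."""
    shift = (phase * BITS_PER_BIT_CLK) % len(ALIGNED_WORD)
    return ALIGNED_WORD[shift:] + ALIGNED_WORD[:shift]


def align(initial_phase: int) -> int:
    """Reset the modelled BUFR with a growing delay until the word is aligned.

    Returns the number of resets that were needed.
    """
    phase = initial_phase % DIVIDE_FACTOR
    resets = 0
    while deserialized_frame_clk(phase) != ALIGNED_WORD:
        resets += 1
        if resets > DIVIDE_FACTOR:
            raise RuntimeError("alignment not reached")
        # Delaying the reset by one more bit clk period shifts the phase.
        phase = (initial_phase + resets) % DIVIDE_FACTOR
    return resets
```

In this simplified picture the loop always terminates within six resets, because the seven possible phases are visited one after another and only one of them produces the aligned pattern.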
In this way, the ADC block monitors and aligns the frame clk local and frame clk
phases. It is important to remark that the phase difference between
frame clk and rx frame clk (the sampling clock distributed to the ADCs) is fixed
by design [71]. In addition, the input data signals are delayed using the IDELAY
blocks in order to compensate for the internal delays between the clock and the
data and to provide sufficient timing margin for sampling the data.
Finally, a CDC issue remains since the data is synchronous with
frame clk and has to be retimed to the LHC clock in the DaughterBoard for
transmission to the Data Packer. This CDC issue was resolved by adding a
BO-CDR circuit at the output of the ISERDESE2.
4.5 Back-end electronics firmware
Different blocks were implemented in the TilePPr for the operation of the
Demonstrator module. This section gives a detailed description of the design
of the TilePPr prototype firmware. Figure 4.16 depicts a block diagram of the
main functional blocks of the TilePPr firmware.
Figure 4.16: Simplified block diagram of the TilePPr prototype firmware
designed to operate the Demonstrator module.
4.5.1 IPbus
The Readout FPGA is controlled via a Gigabit Ethernet port operated with the
IPbus protocol [74]. IPbus is an IP-based control protocol designed
for handling data transactions between FPGA-based ATCA systems and
computers.
The software communicates with the FPGA-based devices through standard
UDP or TCP connections establishing a bidirectional client-server architecture.
The remote access to the clients is achieved by software through the mapping
of a virtual 32-bit address space to 32-bit registers in the FPGAs. The IPbus
protocol supports individual read and write operations on all the registers in
the FPGAs.
The data handling between the software applications and the IPbus module
in the TilePPr is managed by a software application called Control Hub. The
Control Hub manages simultaneous access of multiple software applications to
the TilePPr. The communication between the software applications and the
Control Hub is implemented using TCP/IP while the communication between
the Control Hub and the TilePPr is UDP/IP.
The TilePPr IPbus registers are accessed through purpose-built software
applications written in C++ and Python, which permit the control and
configuration of the TilePPr and the Demonstrator. For this purpose, almost one
thousand 32-bit registers were implemented in the TilePPr firmware.
4.5.2 Link Controller
The Link Controller manages the resets and remote JTAG chain through two
IPbus registers. It can individually reset the transmission and reception parts of
the GBT links by the assertion of specific bits of the IPbus register. In addition,
the Link Controller can also reset the DB FPGAs remotely.
The second register is used to select which DaughterBoard and FPGA are
included in the remote JTAG chain. Through this register, the Link
Controller configures the internal multiplexers to give external access to one
FPGA at a time.
4.5.3 Encoder
The Encoder builds the GBT word which will be transmitted to the front-end
electronics. It encodes the commands with the proper data format and BCR
signals. There is one Encoder per GBT link, where each can receive commands
from three different sources: IPbus registers, DCS FSM, and from the TTC
module.
From the IPbus registers the Encoder manages two types of commands:
asynchronous or synchronous commands. The asynchronous commands are for-
matted and transmitted to the front-end electronics as soon as they are received,
while the synchronous commands are transmitted every orbit at a specific BCID.
Commands received from the IPbus path are defined using 2 IPbus registers:
address and argument registers. The address register indicates to the Encoder
the destination of the command specifying minidrawer, DB side, and the DB
register. The argument register contains the data to be written in the DB regis-
ter. Table 4.13 presents the data format for the synchronous and asynchronous
commands.
31 30 28 27 16 15 14 13 12 11 0
NU MD Index BCID NU Broadcast DB side DB register address
Table 4.13: Data format of the address register for the propagation of
commands. The MD Index field indicates the destination minidrawer, and the
BCID field is left empty for asynchronous commands.
Synchronous and asynchronous commands can be transmitted to a single
DaughterBoard FPGA specified with the 16 upper bits of the address register,
or in broadcast, reaching all the DaughterBoard FPGAs, by asserting the
corresponding bit. An example of synchronous commands are the TTC commands,
which are used to configure the switches of the 3-in-1 cards during the CIS
calibration runs. Other kinds of commands, such as the configuration of the input
bias voltage of the ADCs or the high voltage settings, do not need to be executed
at a certain BCID and are encoded as asynchronous commands.
The TTC commands received from the TTC module are translated into
commands that the front-end electronics can process. Since the legacy TTC
commands are directly formatted by the Encoder and transmitted to the front-
end electronics, the type of TTC command defines if they have to be executed
at a specific BCID or not.
Finally, the DCS commands are received from the DCS FSM. They are
only transmitted for those minidrawers hosting the HVOpto boards since these
commands are employed for the management of the high voltage system in the
front-end electronics.
4.5.4 DCS FSM
The DCS FSM in the TilePPr prototype interfaces the DCS software and the
TilePPr prototype to control and monitor the HVOpto boards. The DCS com-
puter sends commands and requests to the TilePPr through the IPbus interface
which are received and formatted in the DCS FSM before being transmitted
to the Encoder. In addition, the DCS FSM handles the DCS information of
the HVOpto boards. The FSM sends the voltage and temperature from the
individual HVOpto channel to the DCS computer.
4.5.5 TTC module
One of the key modules implemented in the Readout FPGA is the TTC mod-
ule. This module, in combination with the ADN2814 chip presented in Chap-
ter 3, emulates some of the functionalities of the TTCrx chip. The ADN2814
chip recovers a version of the LHC clock with four times the original frequency
(160 MHz), in addition to the encoded TTC data at 160 Mbps. The recovered
clock and data are then passed to the TTC module in the Readout FPGA where
the TTC stream is decoded.
The TTC information is encoded in BiPhase Mark (BPM) to provide a
proper DC balance, and includes two Time Division Multiplexed (TDM)
channels (channel A and channel B). Channel A is dedicated to the propagation of
the L1A signals and channel B is used to transmit commands from the TTC crates
to the detector electronics. The TDM BPM encoding of the TTC channels is
shown in Figure 4.17.
Figure 4.17: TDM BPM encoding.
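As an illustration of the encoding, the following Python sketch interleaves the two channels and applies a biphase mark code. It is not the TTC module implementation. The convention assumed here is the usual one for biphase mark: the line level toggles at the start of every bit cell and toggles again in the middle of the cell only for a 1. With one A bit and one B bit per 25 ns period, each half cell lasts 6.25 ns, which is consistent with the 160 MHz recovered clock. All function names are hypothetical.

```python
# Illustrative TDM + biphase mark encoder; names are hypothetical.
# Convention assumed here: the line level toggles at the start of every bit
# cell, and toggles again in the middle of the cell only when the bit is 1.

def tdm_interleave(channel_a: list, channel_b: list) -> list:
    """Interleave A and B bits: each 25 ns period carries one A and one B bit."""
    if len(channel_a) != len(channel_b):
        raise ValueError("both channels need one bit per bunch crossing")
    stream = []
    for a_bit, b_bit in zip(channel_a, channel_b):
        stream.extend([a_bit, b_bit])
    return stream


def bpm_encode(bits: list, start_level: int = 0) -> list:
    """Return two half-cell line levels per input bit."""
    level = start_level
    levels = []
    for bit in bits:
        level ^= 1            # transition at every cell boundary
        levels.append(level)
        if bit:
            level ^= 1        # extra mid-cell transition encodes a 1
        levels.append(level)
    return levels


def bpm_decode(levels: list) -> list:
    """Recover the bits: a cell whose two halves differ is a 1."""
    return [levels[i] ^ levels[i + 1] for i in range(0, len(levels), 2)]
```

A decoder built this way does not need to know the absolute line polarity, only where the cell boundaries are.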
There are two types of B-channel commands: broadcast and addressed com-
mands. Broadcast commands are executed by all the receivers, while the ad-
dressed commands are only executed by those receivers which have the indicated
TTC address. The B-channel commands are also Hamming encoded providing
the capability to detect up to two bit errors and correct a single bit error per
frame. Some examples of broadcast commands are the BCR or ECR signals,
while the commands used for the configuration of the internal switches of the
3-in-1 cards are addressed.
Therefore, the TTC module implements all the functionalities required to
decode the legacy TTC stream. The decoded B-channel commands are converted to
the new commands and passed to the Encoder for transmission to the front-end
electronics, while the L1A signals are propagated to the pipeline memories
and L1A counters.
In addition, the TTC module can be configured to operate in two different
modes:
• External mode: where the L1A and BCR received from the TTC system
are propagated to the front-end electronics.
• Internal mode: where the propagated TTC signals are generated locally
in the TilePPr. The BCR is generated internally with a 12-bit counter
with the possibility of limiting to 3563 BC (1 orbit) and the L1A signal is
controlled with an IPbus register.
For the physics operation mode, the TTC module is set to external mode
to receive the L1A and B-commands from the legacy TTC system, while the
internal mode is used for calibration runs and testing.
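The internal generation of the BCR can be sketched as a wrapping counter. The Python generator below is a behavioural illustration with hypothetical names; it assumes that the BCR is asserted on the clock cycle in which the counter value is 0, which is not stated explicitly in the text.

```python
# Illustrative model of the internal BCID counter; names are hypothetical.
# Assumption: the BCR is asserted on the clock cycle in which the BCID is 0.

ORBIT_LAST_BCID = 3563   # last bunch crossing of an orbit (from the text)
COUNTER_MASK = 0xFFF     # 12-bit counter


def internal_bcid(n_cycles: int, limit_to_orbit: bool = True):
    """Yield (bcid, bcr) for each LHC clock cycle."""
    bcid = 0
    for _ in range(n_cycles):
        yield bcid, bcid == 0
        if limit_to_orbit and bcid == ORBIT_LAST_BCID:
            bcid = 0                              # wrap after one orbit
        else:
            bcid = (bcid + 1) & COUNTER_MASK      # free-running 12-bit wrap
```

With the orbit limit enabled, the counter runs from 0 to 3563, so a BCR is produced every 3564 clock cycles; without it, the counter wraps naturally after 4096 cycles.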
4.5.6 Decoder
The Decoder is the first module after the GBT receiver module. Each Decoder
(Figure 4.18) routes the data received from four GBT links (A0, A1, B0 and B1)
to other firmware blocks.
The link selection between the redundant A0 and A1 channels or B0 and B1
channels can be configured in auto or manual mode. When configured in auto
mode, the Decoder block dynamically selects one of the two redundant links
based on the result of the comparison between the computed and the original
CRC codes included in the incoming frame. If an error is detected in one of
the redundant links, then the other link is flagged as valid. However, when
configured in manual mode, the Decoder selects a specific channel regardless
of the result of the CRC check.
Finally, after the channel has been selected, the Decoder splits the frame
into subframes and routes them to the different firmware blocks for further
processing.
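The selection between two redundant links can be summarised in a small decision function. The Python below is a sketch under stated assumptions and not the Decoder logic itself: the function name and arguments are hypothetical, the CRC results are taken as already computed booleans, and the behaviour when both links report an error (keep the current link) is an assumption, since the text does not describe that case.

```python
# Illustrative link selection between two redundant GBT links.
# Names and the both-links-bad policy are assumptions, not the firmware logic.

def select_link(crc_ok, mode="auto", manual_link=0, current_link=0):
    """Return the index (0 or 1) of the redundant link to use.

    crc_ok       -- pair of booleans, True when the CRC of that link matches
    mode         -- "auto" or "manual"
    manual_link  -- link forced in manual mode, regardless of the CRC result
    current_link -- link in use before this frame (auto mode only)
    """
    if mode == "manual":
        return manual_link
    if mode != "auto":
        raise ValueError("mode must be 'auto' or 'manual'")
    other_link = 1 - current_link
    if not crc_ok[current_link] and crc_ok[other_link]:
        return other_link     # error on the current link: use the other one
    return current_link       # no error, or both links bad (assumed policy)
```

Keeping the current link when both CRC checks pass avoids switching back and forth between two healthy links.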
Figure 4.18: Block diagram of the Decoder. Only the A0 and A1 channels are
shown for simplicity.
4.5.7 Integrator Readout block
There is one Integrator Readout block per GBT link, receiving the integrator
subframes from its corresponding Decoder. Each Integrator Readout block
includes an FSM to build the integrator data from the received subframes and to
store the results in twelve 512-position FIFO memories, each one dedicated to
a FEB channel.
The integrator data is read out from the FIFO memories using the IPbus
interface. The FIFO memories act as elastic buffers to prevent data loss
when the FIFOs cannot be read during short periods of time because the CPU
does not allocate time for the execution of the integrator readout application.
4.5.8 Pipeline module
The pipeline modules store the received data from the high-speed ADCs in internal
memories. Upon the reception of an L1A signal, the selected samples are
transferred to different readout paths. This module acts as an elastic buffer to
absorb the data generated by consecutive L1A signals, thus avoiding data loss.
The pipelines were implemented using dedicated RAM-blocks in order to
save FPGA logic resources. There is one complete pipeline module per gain and
channel for a total of 96 pipeline modules (48 channels x 2 gains). Figure 4.19
shows the complete block diagram of a pipeline module.
A pipeline module consists of one Main Pipeline (MP), two Derandomizer
Memories (DM) connected to the ROD and FELIX interfaces and one IPbus
Memory Block (IMB) for the IPbus readout path. The MP blocks are
implemented using an 18 Kb RAM-block segmented in 512-sample positions, being
capable of storing up to 12.8 µs of samples. The DM are implemented in the
same type of RAM-blocks but segmented in 16 pages of 16-sample positions
each, and the IMB blocks in 16 pages of 32-sample positions.
Three Control blocks handle the storage of the samples into the 96 pipeline
modules and the data transfer to the Event Packers.
Figure 4.19: Block diagram of the pipeline modules, Event Packers and readout
interfaces.
The MP receives samples from the Decoder and stores them in a circular
buffer at the LHC clock frequency. The MP Control block timestamps each
incoming sample with the corresponding BCID and manages the write address
port by incrementing a 9-bit counter by 1 every LHC clock cycle.
When the MP Control block receives an L1A signal from the TTC system,
the current write address is copied to a pointer table. Then, the Control block
transfers to the DM and IMB memories the 16 data samples preceding the
memory address stored in the pointer table. This handling of pointers avoids
event loss when a second L1A is received during the transfer of the data
selected by a previous L1A.
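The circular buffer and pointer table described above can be modelled in a few lines. The Python class below is a behavioural sketch with hypothetical names, not the RAM-based implementation; it assumes that the write address stored on an L1A is the position of the next sample to be written, so that the 16 positions before it hold the most recent samples.

```python
# Illustrative model of the Main Pipeline; class and method names are
# hypothetical and the model ignores all clocking details.
from collections import deque

DEPTH = 512          # sample positions, addressed with a 9-bit counter
EVENT_SAMPLES = 16   # samples transferred per L1A


class MainPipeline:
    def __init__(self):
        self.memory = [None] * DEPTH
        self.write_address = 0
        self.pointer_table = deque()   # write addresses of pending L1As

    def write(self, sample, bcid):
        """Store one (BCID, sample) pair per LHC clock cycle."""
        self.memory[self.write_address] = (bcid, sample)
        self.write_address = (self.write_address + 1) % DEPTH

    def l1a(self):
        """Copy the current write address to the pointer table."""
        self.pointer_table.append(self.write_address)

    def read_event(self):
        """Return the 16 entries written before the oldest pending L1A."""
        pointer = self.pointer_table.popleft()
        return [
            self.memory[(pointer - EVENT_SAMPLES + i) % DEPTH]
            for i in range(EVENT_SAMPLES)
        ]
```

Because each L1A only stores an address, a second L1A can be accepted while the samples selected by the first one are still being transferred. With 512 positions written every 25 ns, the buffer covers 12.8 µs before old samples are overwritten.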
The DM control block handles the data flow between the DM memories and
the Event Packers. The DM control block receives and stores the selected data
from the MP in the DM memories, and initiates a handshake (start, ready) with
the Event Packers to transfer the data corresponding to 96 channels at once.
Again, the DM blocks include a table of pointers with memory addresses to
avoid data loss during the transfer of the selected data to the Event Packers
and the transmission of event fragments to the FELIX and ROD systems.
The IMB operates differently from the DM. It can be configured
in blocking or unblocking mode. When it is configured in unblocking mode,
the IMB operates similarly to the DM block but stores 32 samples instead of
16 samples. However, since the data bandwidth of the IPbus interface is not
high enough to read out all the selected data, the IMB memories are eventually
overwritten. The unblocking operation mode is only intended for testing and
for adjusting the pipeline delay with respect to the arrival time of the L1A signal.
This is achieved by adjusting the memory addresses stored in the pointer table
of the MP block. On the other hand, when the IMB is configured in blocking
mode, the reception of new L1A signals is blocked until the selected data is read
from the memories.
4.5.9 Readout interfaces
The TilePPr prototype is read out through two different interfaces: the legacy
ROD and the FELIX system.
The Event Packers format the selected samples before they are transmitted
to the legacy ROD and the FELIX system through dedicated interfaces. They also
handle the data flow between the pipelines and the interface paths through
the Control blocks. Both Event Packers operate in the same way: when the
MP receives an L1A signal, 16 high and low gain samples are copied from the
DM blocks to an intermediate register-based memory. After the handshaking
process between the Event Packers and the DM Control blocks is completed,
the Event Packers build an event fragment, which is transferred through the
interface blocks to the next systems in the DAQ architecture.
ROD interface
The ROD Event Packer builds the event fragment using the ROD format and
controls the G-Link encoder to transmit the data to the RODs. The TilePPr
formats the data using the same format as the legacy modules to be
backward-compatible with the legacy system.
Table 4.14 shows the general ROD packet structure, composed of a header
and trailer, 16 DMU blocks with samples from 3 PMTs each, a DMU chip mask
word and the computed CRC word. All DMU blocks have the same size, which
depends on the configuration parameters of the Event Packer. The following
options can be tuned:
• Number of samples: 1 to 16 samples (7 samples is the default value).
• Truncate mode: selected sample bits are 10 MSB, 10 to 1 MSB or 10
LSB. This option is not configurable in the legacy system, being intended
to perform more accurate studies on the digital noise of the Demonstrator
module.
• Gain: high/low gain, auto-gain or bi-gain. In the auto-gain mode the
high gain samples are transmitted unless the samples are overflowed or
underflowed. If this happens, low gain samples are transmitted. In the
bi-gain mode both gains are transmitted at the same time.
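The gain options listed above can be illustrated with a short selection function. The Python below is only a sketch: the function name, the mode strings and the saturation codes (0 for underflow and 4095 for overflow of a 12-bit sample) are assumptions made for the example and are not taken from the Event Packer firmware.

```python
# Illustrative gain selection for one channel; names, mode strings and the
# saturation codes are assumptions made for this example.

ADC_MAX = 4095   # full-scale code of a 12-bit sample (assumed overflow marker)
ADC_MIN = 0      # assumed underflow marker


def select_samples(high_gain, low_gain, mode="auto", n_samples=7):
    """Return the samples to transmit, keyed by gain.

    mode is one of "high", "low", "auto" or "bi"; n_samples ranges from
    1 to 16, with 7 as the default value quoted in the text.
    """
    if not 1 <= n_samples <= 16:
        raise ValueError("n_samples must be between 1 and 16")
    hg = list(high_gain[:n_samples])
    lg = list(low_gain[:n_samples])
    if mode == "high":
        return {"high": hg}
    if mode == "low":
        return {"low": lg}
    if mode == "bi":
        return {"high": hg, "low": lg}
    if mode == "auto":
        # Fall back to low gain when any high gain sample is saturated.
        saturated = any(s >= ADC_MAX or s <= ADC_MIN for s in hg)
        return {"low": lg} if saturated else {"high": hg}
    raise ValueError("unknown gain mode")
```

In this sketch the saturation test is applied to the whole set of selected samples, so a single saturated high gain sample switches the complete channel to low gain.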
Header - Start of Packet
DMU 1 Data block
...
DMU 16 Data block
DMU Chip Mask Word
Global CRC
Trailer - End of the Packet
Table 4.14: TileCal ROD data event format. The size of the DMU data blocks
depends on the operation mode.
Finally, the G-Link encoder encodes the formatted words in frames of 20
bits using the Conditional Inversion with Master Transition (CIMT) coding,
and drives an FPGA transceiver for data transmission to the ROD. The G-Link
encoder block was designed based on the functionalities implemented by the
Agilent HDMP 1032 transmitter chip [75] used in the Interface boards of the
current system.
FELIX interface
The FELIX interface block includes four Event Packers, each of which collects
the data from the pipelines of one minidrawer and builds a FELIX event packet.
The FELIX packet structure, shown in Table 4.15, includes 16 consecutive high
and low gain samples for 12 channels from one minidrawer. As before, the
number of samples included in the event fragment can be configured from 1 to
16 samples, with 16 as the default value.
FELIX Header - Start of Packet
Run parameters
BCID                 | MD ID
L1A ID
HG sample 2 Ch1      | HG sample 1 Ch1
...
HG sample 16 Ch12    | HG sample 15 Ch12
LG sample 2 Ch1      | LG sample 1 Ch1
...
LG sample 16 Ch12    | LG sample 15 Ch12
FELIX Trailer - End of Packet
Table 4.15: FELIX raw data event format. The Run parameters field defines
all the relevant parameters such as the number and type of run, or the DAC
value in the case of a CIS run. On the other hand, the MD ID field includes the
minidrawer number, BCID field includes the BCID corresponding to the first
sample and L1A ID field the L1A number in the run.
The selected data is transmitted through a standard GBT link to the FELIX
system at 4.8 Gbps. The GBT link used for communication with the FELIX
system is operated in Frame mode. The user data field is divided into 10 frames
of 8 bits, with each frame corresponding to one e-link [24].
4.5.10 Latency block
There are four Latency blocks in the Readout FPGA to measure the Round-
Trip Time (RTT) of the GBT links in units of LHC clock cycles. The Latency
block counts the number of clock cycles between the transmitted BCR to the
front-end electronics and the received BCR. The result is accessed through an
IPbus register. This block provides a fast way to detect latency variations in the
digital path greater than 25 ns, which would cause incorrect time stamping
of the incoming data.
4.6 Data path delays and latency measurements
This section presents the measurements performed to obtain the latencies of
the data acquisition path of the Demonstrator module. The data acquisition
path can be separated into two different paths: the digital path and the ADC
interface path. The digital path delay only includes the communication latency
between the DaughterBoard and the TilePPr through the GBT links, while the ADC
interface path delay comprises the delays between the fast ADCs and the
deserialization and retiming modules in the DaughterBoard.
4.6.1 Digital path latency measurements
The digital path delay (δt) for the Demonstrator is not symmetric. The delays
associated to the downlink transmission (δd) and the uplink transmission (δu)
are not equal due to the different structure and configuration of the GBT links.
As indicated in Equation 4.2, the digital path latency is calculated as the sum
of δd and δu.
δt = δd + δu (4.2)
As mentioned above, the TilePPr firmware contains a block to calculate the
RTT in number of LHC clock cycles. Equation 4.3 defines the RTT of the digital
path as the sum of both delays rounded up to the next integer.
RTT = ⌈δd + δu⌉ (4.3)
The RTT was measured in a testbench composed of a TilePPr prototype
and a DaughterBoard connected through a 3 m fiber (Figure 4.20). The
delay between the transmitted and received BCR was measured using a LeCroy
WavePro 760Zi oscilloscope. The measured RTT is 450 ns, which
agrees with the value of 18 clock cycles (18 × 25 ns) measured with the Latency
block. Figure 4.21 shows a screenshot of the Chipscope software [76] used to
perform this test.
Figure 4.20: Testbench for the digital path delay measurement.
The delays of the uplink and downlink were measured individually with the
oscilloscope to obtain the δu and δd values. For this measurement, BCR signals
in the DaughterBoard and TilePPr were connected to external pins. The δd
measured between the transmitted BCR signal in the TilePPr and the received
BCR signal in the DaughterBoard results in a delay of about 242 ns. However,
the measured latency in the uplink is lower, with a value of about 208 ns, since the
Tile GBT block and the transceivers for the uplink operate twice as fast.
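As a quick consistency check, the two individually measured delays can be
combined following Equations 4.2 and 4.3. The Python sketch below is only an
illustration: it assumes a nominal LHC clock period of 25 ns and divides the
total delay by that period before rounding up, and the function name is
hypothetical rather than part of the TilePPr firmware.

```python
import math

LHC_CLOCK_PERIOD_NS = 25.0  # nominal period used in the text (assumption)


def rtt_in_clock_cycles(delta_d_ns, delta_u_ns, period_ns=LHC_CLOCK_PERIOD_NS):
    """Round-trip time in LHC clock cycles, rounded up as in Equation 4.3."""
    delta_t_ns = delta_d_ns + delta_u_ns  # Equation 4.2
    return math.ceil(delta_t_ns / period_ns)


# Downlink and uplink delays measured with the oscilloscope.
rtt = rtt_in_clock_cycles(242.0, 208.0)
print(rtt)
```

With the measured values of about 242 ns and 208 ns the sum is 450 ns, which
corresponds to the 18 clock cycles reported by the Latency block.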
The measured latency values agree with the expected latencies for the uplink
and downlink. Table 4.16 and Table 4.17 summarize the latency introduced by
each transmitter and receiver block of the GTX transceiver, according to the
information provided by Xilinx [77]. The latencies added by the Tile GBT-
FPGA IP core and other firmware blocks involved in the uplink and downlink
communication are shown in Table 4.18.
Additional latency is introduced when the received data are retimed into the local
rx frame clk (80 MHz) domain. The added latency depends on the arrival time
of the data and the selected clock phase in the BO-CDR for sampling the data.
For this reason, the retiming process introduces a latency between 11 and 14
rx word clk (240 MHz) cycles. In addition, the BO-CDR could add three more
rx word clk cycles in order to adjust the RTT latency to the next higher integer
number of LHC clock cycles.
Figure 4.21: Screenshot of the Chipscope software, where the BCR TX signal
corresponds to the transmitted BCR, the BCR RX signal to the received BCR,
and the LAT values to the RTT calculated by the Latency block.
The rest of the delay is due to the optical modules and propagation delay over
the fibers (estimated as 4.9 ns/m for single mode fibers [78] [79]), propagation
delay over the PCB traces and latencies of the clock distribution inside the
FPGAs.
                     Downlink (120 MHz)           Uplink (240 MHz)
Block                tx word clk    Time (ns)     tx word clk    Time (ns)
                     cycles                       cycles
FPGA interface       2              16.63         2              8.32
Buffer (bypassed)    1              8.32          1              4.16
PMA interface        1              8.32          1              4.16
PMA                  1.54           12.78         1.54           6.39
Total                5.54           46.05         5.54           23.02
Table 4.16: Latency introduced by the transmitter GTX transceiver blocks
according to the downlink and uplink configuration. PMA stands for Physical
Medium Attachment sublayer.
                     Downlink (120 MHz)           Uplink (240 MHz)
Block                rx word clk    Time (ns)     rx word clk    Time (ns)
                     cycles                       cycles
FPGA interface       2              16.63         2              8.32
Comma detect         1              8.32          1              4.16
PMA                  5.21           43.35         5.21           21.67
Total                8.21           68.3          8.21           34.15
Table 4.17: Latency introduced by the receiver GTX transceiver blocks accord-
ing to the downlink and uplink configuration.
                     Downlink (120 MHz)           Uplink (240 MHz)
Module               tx/rx word clk Time (ns)     tx/rx word clk Time (ns)
                     cycles                       cycles
TX  Tile GBT         7              58.22         4              16.63
TX  Data Packer      -              -             3              12.48
RX  Tile GBT         6              49.9          7              29.11
RX  BO-CDR           -              -             15             62.38
Total                13             108.12        29             120.59
Table 4.18: Latency introduced by the firmware blocks for the downlink and
uplink communication. The Data Packer and BO-CDR blocks are only present
in the uplink.
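The comparison between expected and measured latencies can be reproduced by
adding the totals of Tables 4.16, 4.17 and 4.18 to the fiber propagation
delay. The sketch below is an illustrative bookkeeping exercise only: the
dictionary layout and function name are assumptions, and the remainder it
prints is the part that the text attributes to the optical modules, PCB traces
and clock distribution inside the FPGAs.

```python
FIBER_DELAY_NS_PER_M = 4.9  # single-mode fiber estimate quoted in the text
FIBER_LENGTH_M = 3.0        # fiber length used in the testbench

# Totals in ns taken from Tables 4.16 (GTX TX), 4.17 (GTX RX) and 4.18 (firmware).
BUDGET_NS = {
    "downlink": {"gtx_tx": 46.05, "gtx_rx": 68.3, "firmware": 108.12},
    "uplink": {"gtx_tx": 23.02, "gtx_rx": 34.15, "firmware": 120.59},
}
MEASURED_NS = {"downlink": 242.0, "uplink": 208.0}


def expected_latency_ns(direction):
    """Sum of the tabulated block latencies plus the fiber propagation delay."""
    blocks = sum(BUDGET_NS[direction].values())
    return blocks + FIBER_DELAY_NS_PER_M * FIBER_LENGTH_M


for direction in ("downlink", "uplink"):
    expected = expected_latency_ns(direction)
    remainder = MEASURED_NS[direction] - expected
    print(f"{direction}: expected {expected:.2f} ns, remainder {remainder:.2f} ns")
```

Under these assumptions the tabulated contributions account for roughly
237 ns of the downlink and 192 ns of the uplink, leaving a remainder of a few
nanoseconds to about 16 ns for the untabulated elements.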
4.6.2 ADC interface path latency
The time between the arrival of the pulse signal at the ADCs and the
reception of the digitized pulse in the TilePPr prototype was also measured. The
complete latency of the uplink (δfe), including the ADC interface path, is defined
in Equation 4.4 and corresponds to the sum of δu and the acquisition time of
the digitized signals (δADC).
δfe = δu + δADC (4.4)
Figure 4.22 shows a block diagram of the testbench used for the measurement
of the ADC interface path delay. The function generator Tektronix AFG3052C
was configured to generate a 10 µs wide square pulse triggered with the transmit-
ted BCR signal from the TilePPr, and then the output of the function generator
was connected to the input of one ADC.
In order to measure the delay from the ADC input to the output of the
deserialization block (δADC), the MSB of the deserialized ADC data in the
DaughterBoard was routed to an external pin. Then, δADC was measured
as the time difference between the rising edge of the square pulse at the input
of the ADC and the rising edge of the MSB.
Figure 4.22: Testbench for the ADC interface path delay measurement.
The measurement showed a δADC of about 297 ns. According to the manufacturer
specifications, the ADC pipelines introduce a latency of 6 LHC clock
cycles, plus about 4.67 ns associated with the propagation delay of the acquired
signal in the ADC [71]. A further 5 LHC clock cycles correspond to the delay
introduced by the deserialization block and the BO-CDR implemented in
the DaughterBoard FPGA. The rest of the latency is due to the data retiming
process and the propagation delay over the PCB traces.
Therefore, according to Equation 4.4, δfe takes a value of about 505 ns using
a 3 meter long fiber. Again, this value was confirmed by measurement of the
time difference between the pulse injection into the ADC and the reception of
the digitized pulse in the TilePPr.
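The breakdown of δADC and the resulting δfe can be checked with a few lines of
arithmetic. The sketch below assumes a nominal 25 ns LHC clock period; the
function and variable names are illustrative only.

```python
LHC_CLOCK_PERIOD_NS = 25.0  # nominal LHC clock period (assumption)


def adc_path_known_ns(pipeline_cycles=6, adc_propagation_ns=4.67,
                      deserializer_cycles=5):
    """Part of delta_ADC explained by the ADC pipeline, the ADC propagation
    delay, and the deserialization block plus BO-CDR."""
    cycles = pipeline_cycles + deserializer_cycles
    return cycles * LHC_CLOCK_PERIOD_NS + adc_propagation_ns


delta_adc_ns = 297.0  # measured ADC interface path delay
delta_u_ns = 208.0    # measured uplink delay

known_ns = adc_path_known_ns()
remainder_ns = delta_adc_ns - known_ns  # retiming and PCB trace propagation
delta_fe_ns = delta_u_ns + delta_adc_ns  # Equation 4.4

print(f"known {known_ns:.2f} ns, remainder {remainder_ns:.2f} ns, "
      f"delta_fe {delta_fe_ns:.0f} ns")
```

With these numbers about 280 ns of δADC is explained by the listed
contributions, and the sum with δu reproduces the quoted 505 ns.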
ADC deserializer tests
A similar setup was also used to validate the deterministic behavior of the
deserialization block in terms of latency. The delay between the arrival time of
the received BCR and the arrival time of the pulse signal (δp) was measured in
number of LHC clock cycles. For this measurement, a simple
pulse detector (Figure 4.23) was connected to the output of the GBT receiver.
Figure 4.23: Block diagram of the pulse detector block.
The pulse detector includes a sorting network [80], a comparator, a subtractor
and a time counter. The implemented sorting network sorts the last three
received samples Sn−2, Sn−1 and Sn into X1, X2 and X3 such that
X1 ≤ X2 ≤ X3 always holds. By selecting the X2 sample, the sorting network
filters out possible sample spikes that would produce false triggers.
Figure 4.24 shows a diagram of the implemented sorting network, where each
branch joining two paths represents a block containing a 2-input comparator and
a switch. The branches switch the paths when the bottom input is higher than
the top one, and leave the paths unchanged when the bottom input is lower than or
equal to the top one.
Figure 4.24: Representation of a sorting network for sorting 3 inputs.
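The behavior of the pulse detector can be illustrated with a small software
model. The sketch below is not the firmware implementation: it assumes an
absolute threshold (the subtractor of Figure 4.23 is not modelled), takes
sample 0 as the arrival of the BCR, and uses hypothetical function names. It
shows how selecting the median X2 of the last three samples rejects an
isolated spike.

```python
def compare_exchange(a, b):
    """One branch of the network: return the pair ordered as (smaller, larger)."""
    return (a, b) if a <= b else (b, a)


def sort3(s0, s1, s2):
    """Three-comparator sorting network returning X1 <= X2 <= X3."""
    s0, s1 = compare_exchange(s0, s1)
    s1, s2 = compare_exchange(s1, s2)
    s0, s1 = compare_exchange(s0, s1)
    return s0, s1, s2


def pulse_delay_cycles(samples, threshold):
    """Clock cycles from sample 0 until the median of the last three samples
    exceeds the threshold, or None if no pulse is found."""
    for n in range(2, len(samples)):
        _, median, _ = sort3(samples[n - 2], samples[n - 1], samples[n])
        if median > threshold:
            return n
    return None


# An isolated spike at index 2 is ignored; the pulse starting at index 5 is
# detected one cycle later, once two of the last three samples are high.
samples = [10, 10, 900, 10, 10, 500, 520, 530, 10]
delay = pulse_delay_cycles(samples, threshold=100)
print(delay)
```

The median filter adds one clock cycle of latency to the detection, which is
constant and therefore does not affect the determinism test.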
The measured δp value remained constant during extensive tests where the
testbench was power cycled and the transceivers were reset. Therefore, the fixed
and deterministic latency of the ADC interface path was confirmed.
Chapter 5
Clock distribution in the Tile Calorimeter
After the ATLAS Phase II Upgrade, the PreProcessor will be responsible for
the distribution of the clock to digitize the PMT analog signals in the front-end
electronics, and also for the synchronization of the readout electronics with the
overall ATLAS DAQ system.
Variations in the propagation delay of the transmitted clock could occur
during operation due to voltage and temperature variations in the electronics
and fibers. Moreover, some elements of the high-speed communication links
could also have a non-deterministic latency, producing time variations in the
distributed clock. These time variations will cause phase shifts of the clock
distributed to the front-end electronics, which have to be detected and monitored
for compensation prior to data taking.
In this thesis, an FPGA-based circuit called OverSampling to UnderSampling
(OSUS) is proposed for the detection of latency variations and phase drifts
between clocks with a precision of about 30 ps RMS. The performance of the
proposed OSUS circuit is also compared with the Digital Dual Mixer Time
Difference (DDMTD) circuit. This chapter also discusses the implementation of
the OSUS circuit in the TilePPr prototype to synchronize the TTC system and
the TilePPr, studies of the clock quality, phase drift monitoring and stability of
the link latency.
5.1 Current clock distribution architecture
As described in Chapter 1, in the current DAQ architecture the TTC system
distributes all the signals needed for the synchronization of the whole detector
electronics and the LHC clock. Figure 5.1 shows a block diagram of the TTC
distribution in the TileCal detector.
Figure 5.1: Sketch of the current clock distribution architecture for the TileCal
detector.
During physics data taking, the CTP receives the LHC clock and the orbit
signal from the LHC accelerator, and propagates them together with the Event
Counter Reset (ECR), L1A and configuration signals to the Tile TTC crates [81].
Then, the TTC modules distribute the LHC clock and control signals to each
one of the 256 TileCal modules through an individual optical fiber.
In the front-end electronics, the TTCrx ASIC recovers and distributes the
LHC clock for the digitization of the PMT signals. When the front-end
electronics receives an L1A signal from the CTP, the Interface board transmits the
data and the corresponding BCID associated with the triggered event to the
RODs. This BCID is compared with the one registered in the ROD to confirm
that front-end and back-end systems are synchronized.
The TTCrx ASIC includes dedicated circuitry to delay the recovered clock
in steps of 104 ps in order to compensate for the particles' time of flight and the
propagation delays associated with the signal acquisition. This delay is measured
with timing calibration runs where the laser system [82] sends controlled light
pulses to the PMTs via dedicated fibers and the TTCrx ASICs are configured
to shift the phase of the sampling clock as close as possible to the peak of the
digitized pulse. Finally, the fine tuning of the sampling clock phase is done
with physics samples.
5.2 Clock distribution architecture in the HL-LHC
The clock distribution schema of the ATLAS experiment will be revised for
the HL-LHC. In this new scenario, the TTC information will be distributed
from the CTP to a number of Local Trigger Interface (LTI) boards and then to
the FELIX systems. The LHC clock and TTC information will be transmitted
between the different elements of the back-end electronics system using a bi-
directional optical network running at 9.6 Gbps, called TTC Passive Optical
Network (TTC-PON) [83].
Finally, the TilePPr will receive the clock and TTC information from the
FELIX system through optical links using the GBT protocol and will propagate
the TTC signals to the front-end electronics, where the clock will be recovered
for the digitization of the analog signals and for the transmission of the digitized
data to the back-end electronics. Figure 5.2 depicts a complete block diagram
which describes the clock distribution schema in TileCal for the HL-LHC.
Following the same clock distribution schema used in the current system,
the front-end electronics will be equipped with a GBTx chip to recover and
distribute the LHC clock to the FEBs for digitization.
5.2.1 Synchronization of the Demonstrator module
The architecture employed to distribute the clock to the Demonstrator module
is a combination of the legacy system and the architecture proposed for the
HL-LHC. In the Demonstrator, the legacy TTC system provides the LHC clock and
TTC information to the TilePPr prototype. Then, the TilePPr prototype
distributes the LHC clock to the front-end electronics over 16 GBT links running
at 4.8 Gbps. The LHC clock is then recovered by the GBTx in the
DaughterBoards, and distributed to the ADCs in the MainBoard for the digitization of
the PMT signals.
Figure 5.2: Sketch of the clock distribution schema in TileCal for the HL-LHC.
However, the clock distribution to the front-end electronics requires the mea-
surement of the phase difference between the LHC clock, recovered from the
TTC system, and the internal clocks generated in the TilePPr prototype to
ensure the synchronization with the legacy DAQ system.
The main source of mis-synchronization is the clock recovery
stage in the TilePPr, where the recovered LHC clock takes an unknown phase
due to the frequency conversions performed with the jitter cleaner (CDCE62005)
to generate the frequencies required for the operation of the GBT links.
In addition, the LHC clock phase can suffer small variations due to voltage and
temperature drifts producing changes in the propagation delay of the electronics
and fibers.
Therefore, the TilePPr prototype requires a method to measure the phase
difference between the clock provided by the TTC system and the clock trans-
mitted to the front-end electronics. This method will provide synchronization
between the front-end and back-end electronics by detecting and compensating
for phase drifts during operations.
5.3 DMTD method
One popular circuit employed to measure the phase difference between periodic
signals is the Dual Mixer Time Difference (DMTD) circuit. The DMTD circuit
was first presented by D.W. Allan in [84] to measure phase differences between
analog signals. The DMTD circuit makes it possible to obtain the phase relationship
between two periodic signals with high resolution. Figure 5.3 presents a block
diagram of the DMTD circuit.
[Block diagram: the inputs u1(t) = cos(2πfn t + ϕ1) and u2(t) = cos(2πfn t + ϕ2)
are each combined with us(t) = cos(2πfs t + ϕs), and a time counter outputs ϕ1 − ϕ2.]
Figure 5.3: Block diagram of the analog DMTD circuit.
The two clocks u1(t) and u2(t) have the same clock frequency but an un-
known phase relationship given by the difference between ϕ1 and ϕ2. Both
clocks are mixed with a third clock signal us(t) with a slightly lower frequency.
Figure 5.4 shows the waveform of the signal resulting from multiplying us and u1.
The mixing operation is mathematically described in Equation 5.1 as the
multiplication of cosine signals.
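\[
\begin{aligned}
u_1(t) \cdot u_s(t) &= \cos(2\pi f_1 t + \varphi_1) \cdot \cos(2\pi f_s t + \varphi_s) \\
&= \frac{1}{2} \cdot \left( \cos(2\pi (f_1 + f_s) t + \varphi_1 + \varphi_s) + \cos(2\pi (f_1 - f_s) t + \varphi_1 - \varphi_s) \right)
\end{aligned}
\tag{5.1}
\]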
The mixing operation between u1(t) and us(t) results in a low-frequency
signal, called U1(t), and a high-frequency signal, which is removed with a low-pass
filter. U2(t) is obtained following the same procedure. The phase difference
(∆ϕ) between the two resulting U(t) signals is equal to the phase difference
between the original clock signals.
Figure 5.4: Waveform signal resulting from multiplying a 100 Hz signal (u1)
and a 95 Hz signal (us).
\[ U_1(t) = \cos(2\pi f_U t + \varphi_1 - \varphi_s) \tag{5.2} \]
\[ U_2(t) = \cos(2\pi f_U t + \varphi_2 - \varphi_s) \tag{5.3} \]
\[ \Delta\varphi = \varphi_1 - \varphi_2 \tag{5.4} \]
Therefore, the phase difference between u1(t) and u2(t) can be easily calcu-
lated using a time counter to measure the time difference between U(t) signals
(∆TU ).
\[ \Delta\varphi = 2\pi f_U \, \Delta T_U \tag{5.5} \]
A smaller difference between the frequency of the input signals and that of us
decreases fU, as can be deduced from Equation 5.1, and therefore
increases the resolution of the method.
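The gain in resolution can be made concrete with the frequencies of Figure 5.4.
The sketch below is a numerical illustration with hypothetical function names:
it applies Equation 5.5 and computes the factor by which a time difference
between the inputs is stretched in the beat signals, which is f1/fU.

```python
import math


def dmtd_phase(delta_t_u, f_u):
    """Phase difference in radians from the time difference of the U signals
    (Equation 5.5)."""
    return 2.0 * math.pi * f_u * delta_t_u


def time_magnification(f_1, f_s):
    """Ratio between the time difference seen on the U signals and the time
    difference between the original inputs, equal to f1 / fU."""
    return f_1 / abs(f_1 - f_s)


f_1, f_s = 100.0, 95.0          # frequencies used in Figure 5.4
f_u = f_1 - f_s                 # beat frequency, 5 Hz
phase = 0.1                     # example phase difference in radians
dt_input = phase / (2.0 * math.pi * f_1)
dt_beat = phase / (2.0 * math.pi * f_u)

print(time_magnification(f_1, f_s), dt_input, dt_beat)
```

For these values the same phase difference corresponds to a time interval 20
times longer on the beat signals, so a counter with a given time step resolves
a 20 times smaller delay between the inputs.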
5.3.1 Digital approximation of the DMTD method
The digital approximation of the analog DMTD technique is based on sampling
the input signals with a sampling rate close to the frequency of the input signal. The
Nyquist theorem [85] establishes that in order to sample all the changes of a signal,
the minimum sampling rate has to be at least two times the highest frequency
contained in the signal. Equation 5.6 refers to this condition in mathematical
terms.
fs ≥ 2fp (5.6)
where fs represents the sampling frequency and fp the highest frequency con-
tained in the signal.
Sampling a signal below this minimum rate results in lower-frequency signals, called alias
signals. This technique is known as undersampling or subsampling. Figure 5.5
shows an example of an alias signal produced when a signal with a frequency
fp is undersampled at a sampling rate of 0.95 · fp.
Figure 5.5: Waveform signal (U1) resulting from undersampling a sinusoidal
signal of 100 Hz (u1) at 95 Hz.
The frequency of the alias signal (fa) can be obtained from Equation 5.7.
fa = |R · fs − fp| (5.7)
where R is the integer for which R · fs is closest to fp.
When a signal is undersampled, a fixed number of samples are generated for
each period of the input signal. The number of alias samples can be derived
from the aliasing theorems presented in [86].
\[ L = \frac{1}{0.5 - \left| 0.5 - \left( \frac{f_p}{f_s} \bmod 1 \right) \right|} \tag{5.8} \]
where mod is the modulus operation.
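Equations 5.7 and 5.8 can be evaluated directly for the example of Figure 5.5.
The following sketch uses hypothetical function names and floating-point
arithmetic, so the results are approximate.

```python
def alias_frequency(f_p, f_s):
    """Alias frequency |R*fs - fp| (Equation 5.7), with R the integer for
    which R*fs is closest to fp."""
    r = max(1, round(f_p / f_s))
    return abs(r * f_s - f_p)


def samples_per_alias_period(f_p, f_s):
    """Number of samples per period of the alias signal (Equation 5.8)."""
    frac = (f_p / f_s) % 1.0
    return 1.0 / (0.5 - abs(0.5 - frac))


# A 100 Hz signal undersampled at 95 Hz, as in Figure 5.5.
f_alias = alias_frequency(100.0, 95.0)
n_samples = samples_per_alias_period(100.0, 95.0)
print(f_alias, n_samples)
```

For these values the alias appears at 5 Hz and is described by about 19
samples per period, which is the ratio between the sampling frequency and the
alias frequency.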
The Digital DMTD circuit (DDMTD) [87] [88] uses two Flip-Flops (FFs) to
sample two periodic digital signals with the same frequency and compares the
phase difference between the output product signals. Figure 5.6 shows a block
diagram of the digital implementation of the DMTD. The digital undersampling
circuit or DDMTD is composed of two FFs for sampling the input signals, a
PLL which generates the sampling clock, two debouncer blocks to filter glitches,
and time counters to measure and monitor the phase difference of the resulting
alias signals.
Figure 5.6: Block diagram of the DDMTD circuit.
Equation 5.7 shows that the closer the sampling clock frequency is to the
input signal frequency, the lower the frequency of the alias signal. Since
the phase of the alias signal is proportional to that of the input signal, a lower
alias frequency provides more accurate measurements. For this reason, the sampling clock is
generated from one of the input clock signals using a PLL with a fractional factor
close to 1. The frequency fs of the sampling clock is presented in Equation 5.9.
\[ f_s = \frac{N}{N+1} f_1 \tag{5.9} \]
where N is a positive integer.
Figure 5.7 shows an example of how the digital undersampling technique
works. The U1 and U2 signals correspond to the alias signals resulting from
undersampling u1 and u2 with us.
Figure 5.7: Timing diagram of the DDMTD circuit with N = 8.
Therefore, the resolution of the DDMTD circuit corresponds to the maxi-
mum time difference between u1 and us, and it can be obtained by subtracting
the input period from the sampling period, as shown in Equation 5.10.
\[ \Delta t_{max} = T_s - T_1 = \frac{N+1}{N} T_1 - T_1 = \frac{1}{N} T_1 = \frac{1}{N+1} T_s \tag{5.10} \]
The phase difference between u1 and u2 (φ12) corresponds to the number of
cycles of us between the rising edge of U1 and the rising edge of U2, multiplied by
the time resolution.
\[ \phi_{12} = \frac{1}{N+1} T_s \, n_{cycles} \tag{5.11} \]
where ncycles is the number of cycles of us between the rising edges of U1 and
U2.
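The operation of the DDMTD circuit can be reproduced with an idealized software
model, free of jitter and metastability. The sketch below is an illustration
under stated assumptions: time is expressed in integer ticks, the input period
is T1 = N·q ticks and the sampling period is Ts = (N+1)·q ticks, so that
fs = N/(N+1)·f1 and the resolution Ts/(N+1) equals q ticks. All function names
are hypothetical.

```python
def square_wave_sample(t_ticks, period_ticks, delay_ticks=0):
    """Value (0 or 1) of a 50 % duty-cycle clock delayed by delay_ticks."""
    phase = (t_ticks - delay_ticks) % period_ticks
    return 1 if phase < period_ticks // 2 else 0


def rising_edges(bits):
    """Indices i where the sequence goes from 0 at i-1 to 1 at i."""
    return [i for i in range(1, len(bits)) if bits[i - 1] == 0 and bits[i] == 1]


def ddmtd_measure(n_factor, delay_ticks, q=2):
    """Estimate the delay of u2 with respect to u1, in ticks (Equation 5.11)."""
    t1 = n_factor * q            # input clock period
    ts = (n_factor + 1) * q      # sampling clock period
    n_samples = 3 * n_factor     # three alias periods
    alias_1 = [square_wave_sample(n * ts, t1) for n in range(n_samples)]
    alias_2 = [square_wave_sample(n * ts, t1, delay_ticks)
               for n in range(n_samples)]
    start = rising_edges(alias_1)[0]
    stop = next(e for e in rising_edges(alias_2) if e >= start)
    n_cycles = stop - start
    return n_cycles * ts // (n_factor + 1)   # n_cycles times the resolution


estimate = ddmtd_measure(n_factor=1000, delay_ticks=137)
print(estimate)
```

In this model a delay of 137 ticks on a 2000-tick input period is recovered
as 138 ticks, that is, within one resolution step of q = 2 ticks.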
Since fs/f1 is close to 1, the number of samples for each period of the input
signal is equal to N. This equality can be easily deduced from Equation 5.12 for
N > 1.
\[ L = \frac{1}{\frac{f_1}{f_s} - 1} = N \tag{5.12} \]
A demonstration of the equivalence between the operation of the digital
DMTD circuit and the analog one is presented below. Equation 5.13 shows the
discretized form of Equation 5.1 where t has been replaced by nTs.
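\[
\begin{aligned}
u_1(n) \cdot u_s(n) &= \frac{1}{2} \cdot \left( \cos(2\pi (f_1 + f_s) n T_s + \varphi_1 + \varphi_s) + \cos(2\pi (f_1 - f_s) n T_s + \varphi_1 - \varphi_s) \right) \\
&= \frac{1}{2} \cdot \left( \cos(2\pi f_1 n T_s + 2\pi n + \varphi_1 + \varphi_s) + \cos(2\pi f_1 n T_s - 2\pi n + \varphi_1 - \varphi_s) \right) \\
&= \frac{1}{2} \cdot \left( \cos(2\pi f_1 n T_s + \varphi_1 + \varphi_s) + \cos(2\pi f_1 n T_s + \varphi_1 - \varphi_s) \right)
\end{aligned}
\tag{5.13}
\]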
where n corresponds to the sample number and is a natural number.
Applying the Euler identity for the cosine, Equation 5.13 can be rewritten
as follows.
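\[
\begin{aligned}
u_1(n) \cdot u_s(n) &= \frac{1}{2} \left( \frac{e^{j(\alpha + \varphi_s)} + e^{-j(\alpha + \varphi_s)}}{2} + \frac{e^{j(\alpha - \varphi_s)} + e^{-j(\alpha - \varphi_s)}}{2} \right) \\
&= \frac{1}{2} \left( e^{j\varphi_s} \cdot \frac{e^{j\alpha} + e^{-j\alpha}}{2} + e^{-j\varphi_s} \cdot \frac{e^{j\alpha} + e^{-j\alpha}}{2} \right) \\
&= \frac{1}{2} \left( e^{j\varphi_s} \cdot \cos(\alpha) + e^{-j\varphi_s} \cdot \cos(\alpha) \right) = \cos(\varphi_s) \cdot \cos(\alpha)
\end{aligned}
\tag{5.14}
\]
where \(\alpha = 2\pi f_1 n T_s + \varphi_1\).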
Then, the discretized form of the multiplication of u1 and us can be expressed
as Equation 5.15.
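\[
\begin{aligned}
u_1(n) \cdot u_s(n) &= \cos(\varphi_s) \cdot \cos(2\pi f_1 T_s n + \varphi_1) \\
&= \cos(\varphi_s) \cdot \cos\left(2\pi \frac{N+1}{N} n + \varphi_1\right) \\
&= \cos(\varphi_s) \cdot \cos\left(2\pi \frac{1}{N} n + \varphi_1\right)
\end{aligned}
\tag{5.15}
\]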
The frequency of the signal product of u1 and us is equal to the frequency
fU of the alias signal in the DDMTD circuit shown in Equation 5.16.
\[ f_U = f_1 - f_s = \frac{1}{N} f_s \tag{5.16} \]
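Finally, u1 · us is redefined as U1 under the conditions shown in Equation 5.17 to
represent it as a digital signal.
\[
U_1(n) =
\begin{cases}
1, & \text{if } -\frac{\pi}{2} < \left( 2\pi \frac{1}{N} n + \varphi_1 \right) < \frac{\pi}{2} \\
0, & \text{otherwise}
\end{cases}
\tag{5.17}
\]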
Note that, since the generated U signals are referenced to us, the argument
ϕs presented in Equation 5.15 is equal to 0.
5.4 OverSampling to UnderSampling method
Implementing multiple DDMTD circuits in an FPGA for the monitoring of different
frequency signals is a complex task since each DDMTD circuit requires
a complete set of PLLs and clock buffers for the generation of the us signal.
The reduced number of clocking resources in FPGAs limits the possibility of
implementing a large number of DDMTD circuits in the same design.
A modification of the original architecture of the DDMTD circuit is proposed
in this thesis. The OverSampling to UnderSampling (OSUS) circuit makes
it possible to measure the phase difference between multiple clock signals with
different frequencies using a single PLL component.
The principle of operation of the OSUS circuit consists of oversampling the
input signals with a sampling clock whose frequency is a fractional multiple,
greater than one, of the frequency of u1. Then, the oversampled signal is
decomposed into alias signals, called U, by selecting and grouping the output
samples. The frequency of the sampling clock for the OSUS circuit is defined in
Equation 5.18.
\[ f_s = M \frac{N}{N+1} f_1 \tag{5.18} \]
where M and N are natural numbers. M corresponds to the oversampling factor
and N is the PLL factor also used in the DDMTD circuit.
Figure 5.8 depicts a block diagram of the OSUS circuit describing its opera-
tion. The OSUS circuit produces M alias signals that are sent to M debouncer
blocks and M time counters. A Phase Control block passes the selected samples
to the debouncer blocks by controlling a demultiplexer.
Figure 5.8: Block diagram of the OSUS circuit.
The complete operation of an OSUS circuit with N = 8 and M = 4 is
presented in the timing diagram of Figure 5.9. Signal u1 corresponds to one
of the two input signals and is sampled with us generating the oversampled
U1 signal. Finally, the Phase Control block constructs the four U_1^k signals
separating the samples in intervals of M·Ts.
Figure 5.9: Timing diagram of the OSUS circuit with N = 8 and M = 4.
The waveform of the oversampled U signal in the digital domain is defined
in Equation 5.19.
\[
U(n) =
\begin{cases}
1, & \text{if } -\frac{\pi}{2} < \left( 2\pi \frac{1}{NM} n + \varphi_1 \right) < \frac{\pi}{2} \\
0, & \text{otherwise}
\end{cases}
\tag{5.19}
\]
In addition, as can be observed in Figure 5.9, each of the decomposed alias
signals U^k has a different phase offset, equal to 2πk/M. The
U^k signals are constructed by selecting every M-th sample of the oversampled U
signal, starting at sample k. The waveform of the decomposed alias signals can
be expressed as follows.
\[
U^k(n) =
\begin{cases}
1, & \text{if } -\frac{\pi}{2} < \left( 2\pi \frac{1}{N} n + \frac{2\pi}{M} k + \varphi_1 \right) < \frac{\pi}{2} \\
0, & \text{otherwise}
\end{cases}
\tag{5.20}
\]
where k represents the index of U and takes values from 0 to M − 1.
Therefore, as can be deduced from Equation 5.20 the advantage of the OSUS
method over the DDMTD circuit is the increase in statistics with the factor M ,
since the OSUS circuit produces M times more measurements than the DDMTD
circuit keeping the same time resolution.
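The decomposition into M alias signals can be illustrated with an idealized
model of the configuration of Figure 5.9 (N = 8, M = 4). The sketch below is
not the firmware implementation: it ignores jitter and metastability, works in
integer ticks with T1 = M·N·q and Ts = (N+1)·q so that fs = M·N/(N+1)·f1, and
uses hypothetical function names.

```python
def rising_edges(bits):
    """Indices i where the sequence goes from 0 at i-1 to 1 at i."""
    return [i for i in range(1, len(bits)) if bits[i - 1] == 0 and bits[i] == 1]


def osus_alias_signals(n_factor, m_factor, delay_ticks=0, q=2, periods=3):
    """Oversample a 50 % duty-cycle clock and split it into M alias signals."""
    t1 = m_factor * n_factor * q       # input clock period
    ts = (n_factor + 1) * q            # sampling period of the oversampling clock
    n_total = periods * m_factor * n_factor
    oversampled = [1 if (n * ts - delay_ticks) % t1 < t1 // 2 else 0
                   for n in range(n_total)]
    # U^k keeps every M-th sample of the oversampled signal, starting at k.
    return [oversampled[k::m_factor] for k in range(m_factor)]


def osus_measure(n_factor, m_factor, delay_ticks, q=2):
    """One delay estimate, in ticks, per alias signal pair."""
    step = m_factor * q                # T1 / N, same resolution as the DDMTD
    pairs = zip(osus_alias_signals(n_factor, m_factor, 0, q),
                osus_alias_signals(n_factor, m_factor, delay_ticks, q))
    estimates = []
    for alias_1, alias_2 in pairs:
        start = rising_edges(alias_1)[0]
        stop = next(e for e in rising_edges(alias_2) if e >= start)
        estimates.append((stop - start) * step)
    return estimates


estimates = osus_measure(n_factor=8, m_factor=4, delay_ticks=20)
print(estimates, sum(estimates) / len(estimates))
```

In this model the circuit yields M = 4 estimates of the same delay, each
within one resolution step (8 ticks) of the true value of 20 ticks, which
illustrates the gain in statistics at constant time resolution.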
5.4.1 Performance of the OSUS circuit
In this section the performance of the OSUS circuit and the DDMTD circuit
are compared considering two figures of merit: the measurement rate and the
acquisition time.
Measurement Rate
The measurement rate is defined here through the minimum time between two phase
measurements. While the phase measurements done with the DDMTD circuit
are separated in time by a complete period of U^k, the OSUS circuit provides phase
measurements separated by 1/M of that period. Thereby, the measurement
rate increases by a factor M with respect to the DDMTD. The improvement
in the measurement rate could be useful for applications using the DDMTD
circuit as part of a control loop.
Acquisition time
On the other hand, the acquisition time refers to the number of Ts periods
needed to collect a complete set of measurements. For the DDMTD method the
acquisition time is defined in Equation 5.21.
\[ Ft_{uc} = (K - 1) N T_s + \Delta\varphi_U \tag{5.21} \]
where K corresponds to the number of samples to be acquired and ∆ϕU is the
phase difference between the U signals being compared.
The acquisition time for the OSUS circuit is lower than for the DDMTD
circuit. Equation 5.22 refers to the time needed to acquire K samples with an
OSUS circuit with M = K.
\[ Ft_{OSUS} = \frac{K - 1}{K} N T_s + \Delta\varphi_{U^k} \tag{5.22} \]
where ∆ϕ_{U^k} is the phase difference between the U^k signals being compared.
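Equations 5.21 and 5.22 can be compared numerically. The sketch below is an
illustration with hypothetical function names; it assumes that N·Ts in both
equations denotes the same alias period TU, and expresses all times in units of
TU.

```python
def acquisition_time_ddmtd(k, t_u, dphi):
    """Time to acquire K measurements with the DDMTD circuit (Equation 5.21)."""
    return (k - 1) * t_u + dphi


def acquisition_time_osus(k, t_u, dphi):
    """Time to acquire K measurements with an OSUS circuit with M = K
    (Equation 5.22)."""
    return (k - 1) / k * t_u + dphi


def improvement_percent(k, t_u, dphi):
    """Relative reduction of the acquisition time achieved by the OSUS circuit."""
    ddmtd = acquisition_time_ddmtd(k, t_u, dphi)
    osus = acquisition_time_osus(k, t_u, dphi)
    return 100.0 * (ddmtd - osus) / ddmtd


# Example: K = 4 measurements and a phase difference of 0.25 alias periods.
t_ddmtd = acquisition_time_ddmtd(4, 1.0, 0.25)
t_osus = acquisition_time_osus(4, 1.0, 0.25)
print(t_ddmtd, t_osus, improvement_percent(4, 1.0, 0.25))
```

For this example the DDMTD circuit needs 3.25 alias periods and the OSUS
circuit 1 alias period, a reduction of about 69 %.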
A comparison between the acquisition time of the DDMTD and the OSUS
circuit is presented in Figure 5.10 (a) and (b). Each line represents a different
∆ϕU with values between 0.1 · TU and 0.5 · TU .
(a) Acquisition time improvement of the
OSUS circuit over the DDMTD circuit. Each
line represents the acquisition time improve-
ment in percentage for M measurements vary-
ing the factor M. Five ∆ϕ cases are presented
taking values from 0.1 · TU to 0.5 · TU .
(b) Comparison between the OSUS circuit
and the DDMTD circuit acquisition times.
Each line corresponds to the acquisition time
for M measurements. Five ∆ϕ cases are pre-
sented taking values from 0.1 · TU to 0.5 · TU .
Figure 5.10: Comparison of the acquisition time of the OSUS and the DDMTD
circuit for different phase shifts.
Debouncing techniques
The theoretical resolution of the DDMTD and OSUS circuits is affected by the
jitter in the clock signals and metastability in the samplers. During the sampling
of the input signals, a high number of glitches are produced around the transition
edge of the U^k signals, degrading the precision of the measurements.
Different debouncing methods to estimate the position
of the positive edge of the U signals were presented in [89] and [90] to mitigate
this problem in the DDMTD circuit: First Edge (FE), Positive Edge Median,
Zero Count (ZC) and Center-Of-Mass. The debouncing circuit provides a time
estimation of the rising edge of the U^k signal.
In this thesis, a new method based on the Average Position (AP) is used
to estimate the position of the positive edges. The AP method is compared
with other techniques of similar complexity, such as the FE, Last Edge (LE) and ZC
methods.
A brief description of the compared debouncing methods is given below.
• First Edge estimates the positive edge time position as the first positive
edge detected.
• Last Edge estimates the positive edge time position as the last positive
edge detected.
• Zero Count counts the number of ones and zeros produced during the
glitch period and estimates the positive edge position where the number
of zeros and ones are equal.
• Average Position calculates the positive edge time position as the average
value between the time position provided by the First Edge and Last Edge
methods.
Figure 5.11 presents a timing diagram of the debouncing operation with the AP
method. The AP method estimates the positive edge using the values obtained
with the FE and LE methods. The FEx and LEx pulses are generated to detect
the first and the last positive glitches of the Ux signals. Then, the estimated
positive edge of the U1 (tstart) and U2 (tstop) is obtained by calculating the
average time between the arrival time of the corresponding FE and LE pulses.
Figure 5.11: Timing diagram of the AP debouncing method describing the
process to estimate the positive edge position of the U1 and U2 signals.
As expressed in Equation 5.23, the estimated phase difference between the
Ux signals corresponds to the time difference between tstart and tstop.
\[ \Delta t = t_{stop} - t_{start} = \frac{(t_3 + t_2)}{2} - \frac{(t_1 + t_0)}{2} \tag{5.23} \]
Finally, Equation 5.23 can be rewritten as presented in Equation 5.24 to
facilitate the implementation of the algorithm in the FPGA.
\[ \Delta t = \frac{(t_2 - t_0)}{2} + \frac{(t_3 - t_1)}{2} \tag{5.24} \]
Thereby, the AP method is implemented in the FPGA with two time counters,
where the number of clock cycles between the FEx and LEx signals is
counted separately and then subtracted.
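The four debouncing estimators can be sketched on a short glitchy bit sequence.
The code below is a software illustration only, with hypothetical function
names. In particular, the Zero Count variant shown here (first edge plus the
number of zeros found before the last edge) is an assumption about one possible
realization and is not taken from the firmware.

```python
def positive_edges(bits):
    """Indices i where the sequence goes from 0 at i-1 to 1 at i."""
    return [i for i in range(1, len(bits)) if bits[i - 1] == 0 and bits[i] == 1]


def first_edge(bits):
    return positive_edges(bits)[0]


def last_edge(bits):
    return positive_edges(bits)[-1]


def zero_count(bits):
    """Assumed ZC variant: first edge shifted by the zeros in the glitch window."""
    fe, le = first_edge(bits), last_edge(bits)
    return fe + bits[fe:le].count(0)


def average_position(bits):
    """AP method: mean of the First Edge and Last Edge estimates."""
    return (first_edge(bits) + last_edge(bits)) / 2


def ap_time_difference(t0, t1, t2, t3):
    """Equation 5.24, with t0, t1 the FE and LE times of U1 and t2, t3 those of U2."""
    return (t2 - t0) / 2 + (t3 - t1) / 2


glitchy = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
print(first_edge(glitchy), last_edge(glitchy),
      zero_count(glitchy), average_position(glitchy))
print(ap_time_difference(3, 8, 20, 27))
```

The last line evaluates Equation 5.24 and gives the same result as
Equation 5.23, (27 + 20)/2 − (8 + 3)/2 = 18, without computing the two
averages explicitly.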
Debouncing techniques comparison
The four debouncing techniques were tested to compare the resolution of the
AP method with that of the other methods. An OSUS circuit was implemented in the
Readout FPGA containing four different sets of debouncer blocks. The result of
the comparison is
presented in Figure 5.12, where each histogram contains 500,000 measurements
corresponding to the phase difference between a pair of 240 MHz clocks. The
OSUS circuit was implemented with a factor M equal to 1 and the PLL was
configured with a factor N equal to 16384.
[Histogram: counts versus phase difference (ns) for the Average Position,
Zeroes, First Edge and Last Edge methods.]
Figure 5.12: Comparison of the four debouncing methods measuring two
240 MHz clocks. The OSUS circuit was configured with M = 1 and the PLL
with N = 16384.
Table 5.1 shows the time resolution corresponding to the four methods com-
pared in Figure 5.12.
Debouncing technique    Resolution (RMS)
Average Position        29.12 ps
Zero Count              29.5 ps
First Edge              44.66 ps
Last Edge               44.57 ps
Table 5.1: Time resolution obtained with the AP, ZC, FE and LE methods.
As can be concluded from Figure 5.12 and Table 5.1, the AP and ZC methods
provide a much better resolution than the FE and LE techniques, with an
improvement of about 15 ps RMS. Although the number of logic resources needed
to implement the AP and ZC algorithms is very similar, the AP shows a slightly
better time resolution than the ZC method. For this reason, the AP method
was used in the TilePPr prototype to debounce the output signals coming from
the OSUS samplers.
5.5 Implementation of the OSUS circuit
The design of the OSUS circuit permits the synchronization of the front-end
electronics with the LHC clock and monitoring of the clock phase stability during
the operation of the Demonstrator module.
As will be described in the next subsections, a total of 17 OSUS circuits were
implemented in the Readout FPGA. The us signal used for sampling is generated
from a 240 MHz clock with an MMCM and a PLL connected in series.
Equation 5.25 shows the multipliers and dividers of the clocking resources which
were selected to obtain a factor N with a value of 16384. Moreover, both clocking
elements were connected without buffers through dedicated clock lines to reduce
the additive jitter [91].
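\[
f_s = f_{in} \cdot \frac{M_{MMCM}}{D_{MMCM} \cdot O_{MMCM}} \cdot \frac{M_{PLL}}{D_{PLL} \cdot O_{PLL}}
    = 240\,\mathrm{MHz} \cdot \frac{64}{1 \cdot 3.625} \cdot \frac{64}{10 \cdot 113}
    = 240\,\mathrm{MHz} \cdot \frac{16384}{16384 + 1}
    = 239.985\,\mathrm{MHz}
\tag{5.25}
\]
where $M_{MMCM}$ and $M_{PLL}$ correspond to the value of the multipliers,
$D_{MMCM}$ and $D_{PLL}$ to the value of the input dividers, and $O_{MMCM}$
and $O_{PLL}$ to the value of the output dividers.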
Table 5.2 describes the features of the implemented OSUS circuit with respect
to the factor M for the selected us clock frequency and N.
fin              240 MHz      120 MHz      80 MHz       40 MHz
Factor M         1            2            3            6
Rt               254.313 fs   508.626 fs   762.939 fs   1,525.878 fs
fU               14.647 kHz   7.323 kHz    4.882 kHz    2.441 kHz
Counter Nbits    14           15           16           17
Table 5.2: Characteristics of the implemented OSUS circuit with respect to the
different clock frequencies that can be measured with the current configuration.
Rt represents the theoretical time resolution of the OSUS circuit.
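The arithmetic behind Equation 5.25 and Table 5.2 can be cross-checked with a
short Python sketch. It is illustrative only: the function and variable names
are hypothetical, and the relations Rt = M/(N · 240 MHz) and
fU = (240 MHz − fs)/M are inferred here from the tabulated values rather than
quoted from the derivation. The results agree with Table 5.2 to within the last
printed digit.

```python
from fractions import Fraction

F_REF_HZ = 240_000_000  # 240 MHz clock feeding the MMCM (from the text)


def sampling_frequency(f_in, m_mmcm, d_mmcm, o_mmcm, m_pll, d_pll, o_pll):
    """Evaluate Equation 5.25 with exact rational arithmetic."""
    mmcm_ratio = Fraction(m_mmcm) / (Fraction(d_mmcm) * Fraction(o_mmcm))
    pll_ratio = Fraction(m_pll) / (Fraction(d_pll) * Fraction(o_pll))
    return Fraction(f_in) * mmcm_ratio * pll_ratio


# Multipliers and dividers quoted in Equation 5.25.
f_s = sampling_frequency(F_REF_HZ, 64, 1, Fraction("3.625"), 64, 10, 113)

# Since f_s / f_ref = N / (N + 1), the factor N follows from the ratio.
ratio = f_s / F_REF_HZ
n_factor = ratio / (1 - ratio)


def osus_row(m_factor):
    """Return (Rt in fs, fU in kHz) for a given factor M (inferred relations)."""
    r_t_fs = float(Fraction(m_factor) / (n_factor * F_REF_HZ)) * 1e15
    f_u_khz = float((F_REF_HZ - f_s) / m_factor) / 1e3
    return r_t_fs, f_u_khz


print(f"fs = {float(f_s) / 1e6:.3f} MHz, N = {n_factor}")
for m in (1, 2, 3, 6):
    r_t_fs, f_u_khz = osus_row(m)
    print(f"M = {m}: Rt = {r_t_fs:.3f} fs, fU = {f_u_khz:.3f} kHz")
```

Exact fractions are used so that the non-integer output divider (3.625) does not
introduce rounding before N is extracted.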
Each pair of samplers corresponding to one OSUS circuit was implemented
in the same SLICEL block. Slices SLICEL and SLICEM are the logic elements
of the Xilinx Series 7 FPGAs. Each slice contains four Look-Up Tables, four
Shift Register Logic blocks, that can be configured as FFs or latches, arithmetic
carry logic and multiplexers to enlarge possible interconnections between the
different blocks.
Figure 5.13 shows how the input signals ux were connected to FFs A and B
through the AX and BX inputs of a SLICEL block. Then, the AQ and BQ
outputs transmit the sampled signals to the Phase Control unit where they are
decomposed into the Ukx signals. The placement of the rest of the blocks is not
critical and they are placed in the nearby slices.
It is important to remark here that the clock skew of the us signal between
the samplers is negligible since both samplers are placed in the same SLICEL.
However, the different propagation delays from the origin of the ux input signals
to the samplers have to be measured and calibrated to avoid systematic phase
errors in the measurements. The delay associated with the internal FPGA
routing of the ux signals was extracted using the Xilinx FPGA Editor software
[93] and compensated at the software level.
5.5.1 Synchronization of the Demonstrator with the TTC system
The synchronization of the Demonstrator module with the TTC system is
achieved through the TilePPr. As introduced in Chapter 4, the ADN2814 chip
recovers a clock with a frequency 4 times the LHC clock frequency (160 MHz),
which is converted by the CDCE62005 chip into a low-jitter clock with a
frequency 3 times the LHC clock (120 MHz). This low-jitter clock drives the
transceivers and is used to generate all the required clocks for the different
firmware blocks in the Readout FPGA. Figure 5.14 depicts a block diagram
showing all the blocks involved in the synchronization process and their
interconnections.
However, the phase relationship between the LHC clock and the local LHC
clock (LHClocal¹) generated in the Readout FPGA is fixed but unknown.
Furthermore, the phase between the two clocks could change every time the
system is initialized due to the frequency conversion in the CDCE62005 chip
and other latency uncertainties, such as the non-deterministic output delay of
the CDCE62005 [94]. These latency variations could produce two different
problems: the mis-synchronization of the front-end electronics with respect to
the LHC clock, and errors during the transmission of received TTC commands due
to CDC issues.
¹ This clock is referred to as tx frame clk in Chapter 4.
Figure 5.13: Block diagram of a SLICEL. The two clock signals (red) are
connected to two registers in the same SLICEL block to minimize the skew delay
of the sampling clock (blue). Figure extracted from [92].
[Diagram labels: TTC system, ADN2814, CDCE62005, Virtex 7 FPGA, TTC PLL,
Phase Unit, OSUS PLL, OSUS, GBT, TTC decoder; signals LHClocal, LHCTTC, us,
ϕLHC, data and control; clocks at 3×, 4× and 6× the LHC frequency.]
Figure 5.14: Block diagram of the synchronization block.
The method for the synchronization of the Demonstrator with the TTC
system presented here combines the dynamic reconfiguration capabilities of the
MMCMs with the phase measurements provided by the OSUS circuit.
First, the Slow Control FPGA configures the local oscillator (Si570) and the
CDCE62005 chip to provide a local clock for the initialization of the transceivers.
Once the TTC decoder block asserts the TTC locked signal, indicating that the
LHC clock provided by the TTC system is present, the Slow Control FPGA
configures the CDCE62005 chip to switch the input clock from the local to the
recovered clock. In addition, the TTC decoder block generates a clock signal
synchronized with the LHC clock (LHCTTC) using logic resources for monitoring
purposes.
The phase difference between the LHCTTC and the LHClocal signals is
obtained with the OSUS circuit. The phase measurements are transmitted to a
computer using the IPbus protocol, where dedicated software calculates the
number of 11 ps steps needed to align both clocks. The computed number is
passed to the Phase Unit, which dynamically reconfigures the TTC PLL to shift
the phase of the recovered clock sent to the CDCE62005. This process is
repeated until the averaged phase difference measured with the OSUS circuit is
below 15 ps.
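The iterative alignment described above can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the software used
with the TilePPr: the class and function names are hypothetical, the sign
convention of the shift is assumed, and the hardware is replaced by a noiseless
toy model. Only the 11 ps step and the 15 ps stop condition come from the text.

```python
STEP_PS = 11.0        # phase-shift step quoted in the text
THRESHOLD_PS = 15.0   # alignment is accepted below this averaged phase difference


def steps_to_align(mean_phase_ps, step_ps=STEP_PS):
    """Number of phase steps that best cancels the measured phase difference."""
    return round(mean_phase_ps / step_ps)


def align_clocks(measure_mean_phase_ps, shift_phase_steps, max_iterations=10):
    """Repeat measure-and-shift until the averaged phase is below threshold.

    Both arguments are callables standing in for the OSUS readout and the
    Phase Unit (hypothetical interfaces). Returns the residual phase and the
    number of shifts applied.
    """
    for shifts_applied in range(max_iterations):
        phase_ps = measure_mean_phase_ps()
        if abs(phase_ps) < THRESHOLD_PS:
            return phase_ps, shifts_applied
        shift_phase_steps(steps_to_align(phase_ps))
    raise RuntimeError("clocks not aligned within the allowed iterations")


class ToyPhaseModel:
    """Noiseless stand-in for the hardware: a phase offset moved in 11 ps steps."""

    def __init__(self, offset_ps):
        self.offset_ps = offset_ps

    def measure(self):
        return self.offset_ps

    def shift(self, steps):
        # Assumed sign convention: positive steps reduce a positive offset.
        self.offset_ps -= steps * STEP_PS


model = ToyPhaseModel(offset_ps=1234.0)
residual_ps, shifts = align_clocks(model.measure, model.shift)
print(f"residual = {residual_ps} ps after {shifts} shift(s)")
```

In a real system the measurement would be an average over many noisy OSUS
readings, so more than one iteration may be needed.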
Figure 5.15 shows a histogram of the phase difference between the LHClocal
and LHCTTC extracted from the synchronization process between the Demon-
strator and the TTC system after the calibration of the OSUS circuit. As can
be observed, after synchronization of the clocks the OSUS circuit measures a
phase difference between both clocks of ∼3.7 ps.
[Plot: counts versus time (ns), 100,000 entries. Fit parameters shown in the
plot: Constant 9817 ± 37.9, Mean 0.003697 ± 0.000096, Sigma 0.03043 ± 0.00007.]
Figure 5.15: Histogram of 100,000 measurements of the phase difference between
the synchronized LHClocal and LHCTTC signals.
In addition, the phase difference between both clock signals was measured
with a LeCroy WavePro 760Zi oscilloscope to cross-check the results obtained
with the OSUS circuit. Figure 5.16 shows the histogram obtained with the
oscilloscope corresponding to the phase difference between the LHClocal and
LHCTTC signals.
The measurement shows a phase difference of 2.005 ns, which corresponds to
the propagation delay difference from the LHClocal and LHCTTC sources to the
output pins connecting the TilePPr with the oscilloscope. The measured value
agrees with the delay difference of 2.004 ns estimated with the Xilinx FPGA
Editor software.
Figure 5.16: Histogram of the phase difference between the synchronized
LHClocal and LHCTTC obtained with a LeCroy WavePro 760Zi oscilloscope.
5.5.2 Studies on clock stability
The OSUS circuit was also used to study the stability of the LHC clock
distributed through the GBT links in two different scenarios: after resetting
the GBT links and during long-term operation.
Each GBT link is equipped with an OSUS circuit measuring the phase dif-
ference between the recovered clock from the front-end electronics LHCFE and
the LHCTTC extracted from the legacy TTC system. The LHCFE signal corre-
sponds to the HG/LG bit of the GBT word (see Chapter 4) before being retimed
by the BO-CDR circuits.
During the first test, the DaughterBoards were remotely reset 100 times
from the TilePPr prototype. After every reset, 1,000 measurements of the phase
difference between the LHCFE and the LHCTTC were acquired for the four links
(A0, A1, B0 and B1) receiving data from the DaughterBoard. Figure 5.17 (a)
shows a histogram containing all the measurements taken during the 100 resets
for all the channels, and Figure 5.17 (b) presents the mean value of each set
of measurements acquired during the overall test. As can be observed, the
maximum delay variation between resets is below 100 ps. The channel-to-channel
skew is produced by internal delays in the distribution of the reference
clock to the transmitter part of the transceivers [39].
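The quantity discussed above, the spread of the per-reset mean phase, can be
computed as in the following Python sketch. The function name and the data are
hypothetical (three toy resets with three measurements each, in ps); they only
illustrate the calculation and are not measured values.

```python
from statistics import mean


def reset_variation_ps(measurements_by_reset):
    """Return (peak-to-peak spread of the per-reset means, list of means)."""
    means = [mean(run) for run in measurements_by_reset]
    return max(means) - min(means), means


# Toy phase differences in ps for one link; each inner list is one reset.
toy_runs = [
    [-9800, -9820, -9780],
    [-9750, -9760, -9740],
    [-9830, -9810, -9820],
]
spread_ps, reset_means = reset_variation_ps(toy_runs)
print(f"per-reset means: {reset_means}, spread: {spread_ps} ps")
```

Applied per link, the same calculation separates the reset-to-reset variation
from the fixed channel-to-channel skew.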
(a) Histogram of the phase difference between the LHCFE and the LHCTTC for
the 4 links (A0, A1, B0 and B1) after resetting the DaughterBoards 100 times
(counts versus phase difference in ns).
(b) Phase difference between the LHCFE and the LHCTTC with respect to the
reset count (phase difference in ns versus reset number).
Figure 5.17: Phase difference variations between the LHCFE and the LHCTTC
signals after 100 resets.
In the second test, the phase difference between the LHCFE and the LHCTTC
clocks was measured 5,000 times every 5 minutes for a total period of 8 hours.
The delay variations with respect to time are presented in Figure 5.18. As
expected, the phase difference between the clock signals was stable throughout
the test, showing small variations below 10 ps.
(a) Histogram of the phase difference between the LHCFE of the 4 links
(A0, A1, B0 and B1) and the LHCTTC during a period of 8 hours (counts versus
phase difference in ns).
(b) Phase difference variations between the LHCFE of the 4 links and the
LHCTTC over time (phase difference in ns versus time in s).
Figure 5.18: Phase difference variations between the LHCFE and the LHCTTC
during a period of 8 hours.
The results validate the capability of the TilePPr to provide a stable clock
for the digitization of analog pulses. Both studies reflect that the expected
phase variations between the transmitted and the received LHC clocks are below
100 ps. In any case, phase drifts are detected and corrected by the periodic
monitoring of clock phases during the operation of the Demonstrator module.
Chapter 6
Testbeam setup and results
Three testbeam campaigns were conducted at the H8 beam line of the CERN
SPS accelerator in the Prévessin area during 2015 and 2016. During 2017, two
more testbeam periods will be conducted to continue with the evaluation of the
upgraded readout electronics presented in Chapter 2. Usually, each testbeam
period consists of 14 days of beam time to complete a set of measurements,
where modules are tested using different energy beams and particles along η
and in perpendicular positions with respect to the beam. This chapter includes
a detailed description of the testbeam setup, facilities, trigger and readout archi-
tecture, as well as an introduction to the physics and calibration results obtained
during the testbeam campaigns.
6.1 Introduction
The main motivations of the testbeam campaigns are to study the stability of
the readout electronics and to validate the trigger and readout architectures
envisaged for the Phase II upgrade. The performance of the three FEB options
and back-end electronics for the HL-LHC was studied with data generated with
electrons, pions and muons, permitting its full characterization in conditions
close to real operation.
During the last two testbeam campaigns in 2016, the TilePPr prototype was
installed in the H8 control room and it was used to read out and operate the
Demonstrator module and to transfer selected data to the legacy RODs and
FELIX prototype. However, during the testbeam session in 2015 the TilePPr
functionalities were implemented in a commercial Xilinx VC707 evaluation board
since the TilePPr prototype was still under development.
6.2 Testbeam setup
The calorimeter setup was located in the H8 beam line of the CERN SPS North
Area. The setup was composed of three TileCal modules placed on a scan-
ning table capable of placing modules at any combination of angle and position
with respect to the incident beam. Two modules, one Extended Barrel and one
Long Barrel, were instrumented with the legacy readout electronics described
in Chapter 1. These modules included PMT blocks with 3-in-1 cards, Digitizer
boards and Interface boards, for the communication with the back-end electron-
ics. The legacy modules played a crucial role during the testbeams permitting
the comparison between the legacy and the new readout electronics.
Figure 6.1 presents the configuration and position of the modules during
the testbeam periods, and Figure 6.2 a picture of the testbeam modules on the
scanning table.
[Diagram labels: LB65 (A and C sides), Module 0 (A and C sides) and EB65
(C side); minidrawers MD#1 to MD#4 equipped with 3-in-1, QIE or FATALIC
front-end boards and HVOpto or HVRemote high voltage; legacy superdrawers.]
Figure 6.1: Configuration of modules and electronics during the October 2016
testbeam period.
One additional Long Barrel module was instrumented with the upgraded
electronics: the FATALIC/QIE module and the Demonstrator module. The
upgraded modules included four minidrawers connected to a cooling system
which controlled the internal temperature of the modules. Each minidrawer
contained one DaughterBoard, one MainBoard and 12 PMT blocks with the
upgraded FEBs.
Figure 6.2: Picture of the testbeam module on the table.
The QIE/FATALIC module was instrumented with both QIE and FATALIC
technologies. Two minidrawers were populated with PMT blocks with QIE
FEBs and two minidrawers with PMT blocks with FATALIC FEBs. Each FEB
option was connected to its own dedicated MainBoard version.
The Demonstrator module was equipped with 45 PMTs with 3-in-1 cards, a
3-in-1 MainBoard and a DaughterBoard.
Regarding the services, both HVPS options were installed in the upgraded
modules to provide high voltage to the PMTs. The two FATALIC minidrawers
were powered with the remote HV system while the QIE minidrawers were
populated with the HV internal system. The Demonstrator module included a
combination of both options: the PMT blocks located in the outer minidrawers
were fed with the HV remote system and the PMTs in the middle minidrawers
were powered with the HV internal system. Each upgraded module received
the low voltage power from a fLVPS attached to the end of the module and
controlled from the legacy DCS software via CANBus.
The back-end electronics combined the legacy and the new readout electron-
ics. It included a reduced version of the legacy back-end electronics composed
of one ROD, one SBC and one TBM. The upgraded modules transmitted the
data to two TilePPrs integrated within the legacy TDAQ software through the
TTC and ROD interfaces.
Finally, a TTC system composed of an LTP, a TTCvi and a TTCex was also
present in the testbeam setup. The TTC system was used to configure the
upgraded and legacy electronics using the TDAQ software and also to provide
the clock for the readout electronics.
6.2.1 Beam elements
A variety of detectors were installed in the beam line to monitor the quality,
position and particle composition of the beam [95]. Figure 6.3 depicts a diagram
of the beam elements installed in the testbeam line.
Figure 6.3: Sketch of the beam elements in the testbeam setup.
Two beam chambers were used to measure the beam position with a resolution
of 0.2 mm. The beam chambers used at the testbeam are Delay Wire
Chambers (DWC) [96], developed at CERN. The DWC is formed by two sandwich
parts, each composed of two cathode planes surrounding a central anode
wire-plane. The sandwich parts are placed orthogonally to each other, giving a
two-dimensional position measurement. Figure 6.4 shows a picture of an
instrumented DWC.
(a) Delay Wire Chamber.
(b) Beam position extracted from the beam chambers data (Yimp [mm] versus
Ximp [mm]; ATLAS Tile Calorimeter, 2016 testbeam data).
Figure 6.4: Picture of one of the beam chambers used for the testbeam setup
and the beam position data.
Since the beam is not composed of a single type of particle, a Muon
Hodoscope (MH) and two Cherenkov Counters (CCs) were installed to provide
particle identification information for the offline data analysis. The
MH, also called Muon wall, is a movable detector made of 12 scintillators placed
behind the Tile modules. It is primarily used in offline analysis to suppress the
low energy tail of the pions in high energy hadron runs.
The CCs were placed in the beam line to improve the particle identification
separating pions and electrons at energies below 50 GeV. During the first two
testbeam campaigns in September 2015 and June 2016 two CCs were installed in
the beam line. One CC was configured to report interactions with kaons, pions
and electrons, but not protons, while the second one only reported interactions
with pions and electrons. During the testbeam of September 2016 a third CC
was installed and configured for identification of electrons.
Two scintillators S1 and S2 (Figure 6.3) were installed in the beam line
as part of the beam trigger system. The scintillator signals were transmitted
to the beam trigger system in the control room via coaxial cables. Since the
scintillator cable lengths were different, signals were time-equalized using delay
boxes before reaching the trigger system.
The beam trigger logic was implemented in a NIM crate using timer counters,
discriminators and Fan-in/Fan-out modules. The beam trigger generates
a Master trigger signal when a beam particle produces a signal on both
scintillators. Then, the Master signal is transmitted to a second NIM crate
initiating the TDC measurement of the beam chamber signals and reading out the
ADCs used for the digitization of the scintillator signals.
In addition to the generation of the Master trigger, the trigger logic propa-
gated the L1A signal to the LTP in the TTC crate, unless the busy signal was
asserted indicating that the readout path was not available. Then, the LTP
transmitted the L1A signal to the TilePPr, RODs and front-end electronics via
the TTCvi and TTCex systems.
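The trigger decision described above reduces to two Boolean conditions, which
the following Python sketch spells out. It is a software illustration only (the
real logic was built from NIM modules); the function names and the toy event
list are hypothetical.

```python
def master_trigger(s1_fired, s2_fired):
    """Master trigger: coincidence of the two beam scintillators S1 and S2."""
    return s1_fired and s2_fired


def l1a_issued(master, busy):
    """The L1A is propagated to the LTP unless the busy signal is asserted."""
    return master and not busy


# Toy events: (S1 fired, S2 fired, busy asserted).
toy_events = [
    (True, True, False),   # coincidence, readout free -> L1A
    (True, False, False),  # no coincidence
    (True, True, True),    # coincidence vetoed by busy
    (False, True, False),  # no coincidence
]
l1a_flags = [l1a_issued(master_trigger(s1, s2), busy) for s1, s2, busy in toy_events]
print(l1a_flags)
```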
6.3 Clock distribution
The TTC system distributed the LHC clock to the legacy modules, RODs and
to the TilePPr prototypes through dedicated optical fibers. As described in
Chapter 1, the TTCrx ASIC located in the Motherboards of the legacy front-
end electronics recovers and fans out the LHC clock to the ADCs. In the
back-end electronics, the RODs also receive the LHC clock and TTC data for
event synchronization.
The clocking architecture employed for transmitting the LHC clock to the
upgraded modules is close to the proposed one for the Phase II Upgrade already
described in Chapter 5. The difference between the clocking distribution in the
testbeam setup and the proposed one for the Phase II is that the LHC clock
source is the legacy TTC system and not the FELIX system.
In the TilePPr, dedicated circuitry was employed to recover the LHC clock
and a jitter cleaner was used to reduce the jitter before the clock is passed to
the FPGA transceivers. The LHC clock was then distributed to the front-end
electronics embedded with the downlink GBT data.
In the DaughterBoard, the GBTx recovered and routed to the FPGA transceivers
a clock with four times the LHC clock frequency. Finally, the recovered LHC
clock was extracted from the transceiver CDR clock and transmitted to the
ADCs as sampling clock. In addition, the DaughterBoard FPGAs also imple-
mented the uplink communication at 9.6 Gbps using the recovered clock and
unifying the uplink and downlink clock domains.
Moreover, during the testbeam operation the TilePPr received commands to
configure the front-end electronics either through the IPbus registers or the TTC
fiber. In the latter scenario, the TilePPr converted the legacy TTC commands
into Phase II commands.
6.4 Data acquisition
The TilePPr prototype was installed in the control room and was the core
of the back-end electronics during the testbeam campaigns of June 2016 and
September 2016. The TilePPr prototype was fully integrated into the TDAQ
architecture. It received the clock, commands and triggers from the legacy TTC
system and also transmitted the triggered data to the RODs. A block diagram
describing the data flow and data acquisition architecture employed during the
testbeam is shown in Figure 6.5.
[Diagram labels: four MDs, PPr, TTC, legacy ROD, ROS, FELIX, IPbus, DCS,
legacy LVPS, PPr config, PPr Interface, Event Dump, Event Builder, Athena,
DQM, ntuple, files, offline.]
Figure 6.5: Complete data acquisition system and data flow for the testbeam
setup.
The Demonstrator module transmitted high and low gain samples to the TilePPr
every 25 ns through 16 GBT links (4 Tile GBT links per minidrawer).
The TilePPr prototype extracted the samples from the GBT data and stamped
them with the corresponding BCID before storing them in the circular pipelines.
When an L1A was received, the TilePPr copied the data from the pipelines and
created an event packet that was transmitted to the next level in the DAQ
architecture.
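The pipeline behaviour described above can be illustrated with a small Python
model. This is a sketch under assumptions, not the firmware: the class name,
the pipeline depth and the choice of copying the most recent samples on an L1A
are hypothetical (the actual latency handling is not described here). The BCID
wraps at 3564, the number of bunch crossings in one LHC orbit.

```python
from collections import deque

BCID_MODULO = 3564  # bunch crossings per LHC orbit


class CircularPipeline:
    """Fixed-depth buffer of (BCID, high-gain sample, low-gain sample) entries."""

    def __init__(self, depth):
        self._entries = deque(maxlen=depth)  # oldest entries are overwritten
        self._bcid = 0

    def push(self, hg_sample, lg_sample):
        """Store one 25 ns sample pair stamped with the current BCID."""
        self._entries.append((self._bcid, hg_sample, lg_sample))
        self._bcid = (self._bcid + 1) % BCID_MODULO

    def build_event(self, n_samples):
        """On an L1A, copy the most recent n_samples entries (assumed policy)."""
        if n_samples > len(self._entries):
            raise ValueError("not enough samples stored in the pipeline")
        return list(self._entries)[-n_samples:]


pipeline = CircularPipeline(depth=8)
for crossing in range(10):
    pipeline.push(hg_sample=100 + crossing, lg_sample=10 + crossing)

event = pipeline.build_event(n_samples=7)
print(event[0], event[-1])
```

The copy in `build_event` leaves the pipeline untouched, so sampling continues
while the event packet is formatted.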
The TilePPr prototype featured three different readout paths already intro-
duced in Chapter 4: FELIX system, legacy ROD and Ethernet readout through
IPbus. During the testbeam campaigns, the IPbus readout path was primarily
used for the configuration of the TilePPr and front-end electronics, daily cali-
bration runs, monitoring and readout of the integrator ADCs during the Cesium
calibration scans.
In the case of the FELIX readout path, 16 samples with the correspond-
ing BCID were formatted and sent to a FELIX prototype system through a
Standard GBT link. The FELIX prototype used during the testbeams was a
reduced custom version of the FELIX system capable of receiving GBT data
at 4.8 Gbps and storing the received data in internal buffers. This system was
implemented using a Xilinx KC705 evaluation board connected through PCIe
to a computer. After the reception of a complete event in the FELIX prototype
system, the binary data was extracted from the KC705 memories through the
PCIe interface and stored in the computer for later offline analysis.
The legacy readout path is important because it allows the integration of the
Demonstrator module into the legacy TDAQ architecture after its installation
into the ATLAS detector. The TilePPr prototype emulated a legacy module
by sending event packets to the RODs through a G-Link interface running at
800 Mbps. After the reception of an L1A, the TilePPr retrieved the data from the
circular pipelines and packed the data into the legacy format for transmission
to the RODs.
The PUs in the RODs used the Optimal Filtering 2 algorithm (OF2) [97]
to reconstruct the amplitude and time of the shaped PMT signals using linear
combinations of the digital samples with a set of weights.
Equations 6.1 and 6.2 represent the magnitudes reconstructed by the OF2
algorithm.
\[
A = \sum_{i=1}^{N} a_i S_i \tag{6.1}
\]
\[
\tau = \frac{1}{A} \sum_{i=1}^{N} b_i S_i \tag{6.2}
\]
where $a_i$ and $b_i$ are the OF2 weights, $S_i$ are the digital samples, $N$ is
the number of digital samples (7 is the default number), $A$ is the amplitude of
the shaped signal in ADC counts, and $\tau$ is the phase of the pulse peak with
respect to the expected sampling time used in the calculation of the weights.
The OF2 algorithm also provides an estimation of the goodness of the
reconstruction, called Quality Factor (QF) and expressed in Equation 6.3.
\[
QF = \sqrt{\sum_{i=1}^{N} \left( S_i - \left( A g_i + A \tau g'_i + p \right) \right)^2} \tag{6.3}
\]
where $g_i$ and $g'_i$ are the normalized amplitudes of the pulse shape function
for the ith sample and its derivative, and $p$ is the pedestal, which is usually
estimated as the average of the first and last samples or simply as the value of
the first sample [98].
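Equations 6.1 to 6.3 can be evaluated with a few lines of Python, as sketched
below. The example is illustrative only: the pulse shape, its derivative and
the weights are toy values constructed so that the linear sums recover A and τ
for the first-order pulse model appearing in Equation 6.3. Real OF2 weights
are derived from the measured pulse shape and noise properties and are not
reproduced here; all names are hypothetical.

```python
import math


def of2_reconstruct(samples, a_weights, b_weights, shape, shape_deriv, pedestal):
    """Return (A, tau, QF) following Equations 6.1, 6.2 and 6.3."""
    amplitude = sum(a * s for a, s in zip(a_weights, samples))
    tau = sum(b * s for b, s in zip(b_weights, samples)) / amplitude
    residuals = (
        s - (amplitude * g + amplitude * tau * dg + pedestal)
        for s, g, dg in zip(samples, shape, shape_deriv)
    )
    quality = math.sqrt(sum(r * r for r in residuals))
    return amplitude, tau, quality


# Toy pulse shape (symmetric) and derivative (antisymmetric), 7 samples.
g = [0.0, 0.2, 0.7, 1.0, 0.7, 0.2, 0.0]
dg = [0.1, 0.3, 0.4, 0.0, -0.4, -0.3, -0.1]

# Toy weights: sum(a)=0, sum(a*g)=1, sum(a*dg)=0; sum(b)=0, sum(b*g)=0, sum(b*dg)=1.
mean_g = sum(g) / len(g)
norm_a = sum((gi - mean_g) * gi for gi in g)
a = [(gi - mean_g) / norm_a for gi in g]
norm_b = sum(d * d for d in dg)
b = [d / norm_b for d in dg]

# Build noiseless samples from known values and reconstruct them.
true_amplitude, true_tau, ped = 500.0, 0.5, 50.0
samples = [true_amplitude * gi + true_amplitude * true_tau * di + ped
           for gi, di in zip(g, dg)]
amplitude, tau, qf = of2_reconstruct(samples, a, b, g, dg, ped)
print(f"A = {amplitude:.3f}, tau = {tau:.3f}, QF = {qf:.3e}")
```

Because the weights sum to zero, the pedestal does not bias the amplitude or
the phase; with noiseless toy samples the quality factor is zero up to
floating-point rounding.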
Figure 6.6 shows a sketch of the reconstructed magnitudes superimposed on
the samples.
Figure 6.6: Picture of a typical pulse shape showing the 7 samples and the
reconstructed pulse shape with the Optimal Filtering algorithm.
The Read Out System (ROS) received the reconstructed energy and time
and the digital samples from the RODs and provided the data fragments to the
Event Builder (EB). The EB reconstructed the energy and time per TileCal cell
using the Athena software [99] of the ATLAS software framework and stored
them on local disks together with the event trigger information provided by the
TilePPr.
During the data taking, the Data Quality Monitoring application (DQM)
accessed the stored events and displayed them in a Graphical User Interface
(GUI) to verify that the recorded data was useful. In addition, the DQM
displayed information regarding the beam line elements, helping in the
identification of problems during the data taking. Figure 6.7 shows a screenshot
of the DQM used in the testbeam setup.
Figure 6.7: Screenshot of the DQM panel.
6.5 Calibration systems
The tests of the calibration systems are fundamental to verify the performance of
the readout electronics before the data taking and to obtain calibration data to
determine the electromagnetic energy scale and stability of the system. During
the testbeam periods, daily pedestal and CIS runs were taken prior to the physics
runs. Cesium scans to obtain the response of the detector were taken twice per
testbeam period.
6.5.1 Pedestal and linearity runs
The first tests performed during the calibration runs are the linearity tests.
The linearity tests confirm that the front-end electronics receives the
configuration commands correctly and that the ADC data is correctly deserialized.
In the linearity tests, the DACs connected to the ADC inputs are configured
to shift the pedestal value from 0 to the maximum ADC range in programmable
steps. Then, pedestal samples are read out through the IPbus registers to
verify the pedestal configuration. Any significant non-linearity of the results
would indicate that commands did not reach the front-end electronics, that the
analog electronics is malfunctioning, or that there is a problem with the
deserialization in the DaughterBoard FPGAs. Figure 6.8 shows the linearity test
results corresponding to the ADCs of one MainBoard section (3 channels).
[Three panels, LG channel 0, LG channel 1 and LG channel 2: ADC counts
(0 to 4000) versus DAC counts (0 to 1000).]
Figure 6.8: Result of the linearity test. The red line represents the fit result of
a first-order polynomial function.
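A first-order polynomial fit of this kind, together with a check of the largest
residual, can be sketched in Python as follows. The function names, the toy DAC
scan and the slope and offset used to generate it are hypothetical; they only
illustrate the linearity check and are not measured values.

```python
def fit_line(x_values, y_values):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    s_xx = sum((x - mean_x) ** 2 for x in x_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def max_abs_residual(x_values, y_values, slope, intercept):
    """Largest deviation of the data from the fitted line."""
    return max(abs(y - (slope * x + intercept)) for x, y in zip(x_values, y_values))


# Toy scan: pedestal (ADC counts) versus DAC setting in steps of 100.
dac_counts = list(range(0, 1001, 100))
adc_counts = [4.0 * dac + 20.0 for dac in dac_counts]  # perfectly linear toy data

slope, intercept = fit_line(dac_counts, adc_counts)
worst = max_abs_residual(dac_counts, adc_counts, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, max residual = {worst:.3e}")
```

A residual well above the pedestal noise at any DAC setting would flag one of
the failure modes listed above.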
The pedestal runs were used to check that the digital path from the ADCs
of the MainBoard to the TilePPr was working correctly. This test validates
the digital path in both directions since the MainBoard is configured to set
the DACs that move the baseline at the ADCs input and the samples are read
out from the TilePPr through IPbus or the RODs. Normally, around 100,000
samples per ADC channel and gain were taken daily. When the readout path
was the ROD, each measurement included 7 samples, while the IPbus path read
out 32 samples per measurement. Figure 6.9 shows the histogram obtained for a
single channel with 100,000 samples, where a low noise of 2.8 ADC
counts RMS can be observed. This value is similar to the RMS noise measured
in the current system [27].
In addition, this test permits detection of increased noise in the ADC
channels or problems with the deserialization in the DaughterBoard, which would
be seen as a non-Gaussian distribution of the pedestal values.
[Plot: counts (logarithmic scale) versus ADC counts; Entries 160000,
Mean 1048, Std Dev 2.799; ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.9: Result of a pedestal run taken with the TilePPr prototype through
the IPbus readout path.
6.5.2 Charge Injection System
The Charge Injection System included in the 3-in-1 cards is used to obtain the
gain factor of each individual ADC channel by injecting a known charge and
measuring the ADC response. CIS runs were taken in two different operation
modes: local and remote. In the local mode, a set of C++ and Python scripts
were used to configure the front-end electronics through the TilePPr. The 3-in-1
card configuration for the switches, charge and BCID execution were transmit-
ted to the TilePPr through the IPbus interface, which encoded and transmitted
the commands to the front-end electronics through the optical links.
In the remote mode, the front-end configuration was controlled by the TDAQ
software which sent the configuration through the TTC system. The TilePPr
decoded the legacy TTC commands, converted them into Phase II format and
transmitted them to the front-end through the GBT links.
Two types of CIS tests were performed during the testbeam to obtain
the gain factor of the 3-in-1 cards to convert amplitudes in ADC counts to pC.
CIS linearity
During the CIS linearity test, a series of CIS runs is taken while increasing
the charge of the capacitors in discrete configurable steps. The CIS linearity
test is done for high and low gains covering the full ADC range. The
reconstructed pulses include a bipolar component introduced by the internal
capacitance of the charge injector switches. This bipolar component can be
measured by configuring the DAC to provide 0 V, and then subtracted from the
reconstructed pulses.
CIS stability
The CIS stability tests consist of repetitive CIS runs with the same charge and
gain to track any variation of the reconstructed charge with time. During the
CIS stability tests the TilePPr configured the front-end electronics to charge
and discharge the selected CIS capacitor at a predefined BCID. It also stored
the CIS pulses in the pipelines before transmitting them to a computer through
the IPbus interface or through the legacy TDAQ infrastructure.
Figure 6.10 shows the result of a CIS linearity test for a single 3-in-1 card.
The measured charge is presented as a function of the injected charge for a low
gain channel showing good linearity. The bottom part of the figure indicates the
difference between the injected and the reconstructed charge. These residuals
are used to study the stability of the circuits with time.
Figure 6.10: Results of a CIS linearity test of a low gain channel and its residuals.
6.5.3 Cesium scans
The Cesium system was designed to measure the PMT gain and optical response
of the calorimeter cells. These measurements make it possible to obtain the
correct high voltage settings for each channel to equalize the response of the
cells. An insufficient signal strength could also indicate broken fibers or a
poor coupling between the fiber bundle and the PMT block.
A hydraulic system moves a 137Cs γ source through a water-filled pipe that
crosses all the scintillator cells in the module. The 137Cs γ source is enclosed in
a small metal capsule. The current from PMTs is integrated and then digitized
by a 16-bit ADC. As explained earlier in Chapter 4, the integrator firmware
block in the DaughterBoard reads out these ADCs and transmits the integrator
samples to the TilePPr prototype.
The integrator samples can be read out at a configurable period in units of orbits (89 µs). Normally they are read out at a rate between 90 Hz and 150 Hz to follow the movement of the 137Cs γ source circulating through the cells. Figure 6.11 shows the result of a Cesium scan for the cell BC4. The maxima of the optical response correspond to the source crossing the scintillating tiles, while the minima correspond to the source crossing the absorber material.
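The relation between the readout period in orbits and the readout rate is simple arithmetic. The sketch below uses the 89 µs orbit period quoted above; the function names are hypothetical and the rounding to the nearest whole orbit is an assumption about how a target rate would be mapped to a setting.

```python
ORBIT_PERIOD_S = 89e-6  # orbit period quoted in the text


def readout_rate_hz(n_orbits):
    """Integrator readout rate for a period of n_orbits orbits."""
    if n_orbits < 1:
        raise ValueError("the period must be at least one orbit")
    return 1.0 / (n_orbits * ORBIT_PERIOD_S)


def orbits_for_rate(rate_hz):
    """Nearest whole number of orbits that gives roughly rate_hz."""
    return round(1.0 / (rate_hz * ORBIT_PERIOD_S))


for target in (90, 150):
    n = orbits_for_rate(target)
    print(f"{target} Hz -> {n} orbits ({readout_rate_hz(n):.1f} Hz)")
```

With these numbers, the quoted range of 90 Hz to 150 Hz corresponds to periods of roughly 125 down to 75 orbits.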
[Figure 6.11 plot: ADC counts versus event number. ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.11: Response of the BC6 cells of the Demonstrator module during a Cesium scan. Each peak corresponds to the response of an individual scintillator tile when the Cesium source passes through the module.
6.6 Demonstrator physics program
The physics program for the testbeam included the evaluation of the Demonstra-
tor and legacy modules with particle beams pointing at different cell positions
and angles. One of the goals of the testbeam program is to characterize the re-
sponse of the detector with the new readout electronics using different types of
particle beams with known energy. The performance of the upgraded electronics
is then compared with data obtained from the legacy modules also installed in
the testbeam.
The measurement of the cell response to electron beams determines the average charge-to-energy conversion factor, called the electromagnetic (EM) scale constant. Electron measurements are also used to evaluate detector characteristics such as linearity, uniformity and energy resolution. In addition, the measurement of muon energy can be used for the calculation of the EM constant.
Other interesting measurements involve hadron beams, where the hadron shower energies are measured. These tests characterize the pion response as a function of the energy and permit the evaluation of new energy reconstruction methods.
The SPS beam is not pure; it is a mixture of pions, electrons and muons. After the data is reconstructed, an offline analysis is needed to separate and identify single-particle events, which are the basis for the different studies and for the calculation of the EM constant.
6.6.1 Data quality
During data taking, and prior to the offline analysis for particle identification, the data collected from each PMT is checked for integrity. The maximum sample is placed in the fourth of the seven samples transmitted to the RODs in order to minimize the error in the energy reconstruction. However, as indicated in Figure 6.12, the maximum could not always be kept in the fourth sample because the arrival time of the particles depends on the beam interaction point. The timing precision of the testbeam trigger system when generating the L1A signals could also displace the maximum sample.
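A check of this kind can be sketched as follows. The code is only an illustration with invented sample values and hypothetical function names; it assumes zero-based indexing, so the fourth sample is index 3.

```python
def max_sample_position(samples):
    """Zero-based index of the largest ADC sample (first one on ties)."""
    return max(range(len(samples)), key=lambda i: samples[i])


def fraction_centered(events, expected_index=3):
    """Fraction of events whose maximum falls in the expected sample."""
    hits = sum(1 for ev in events if max_sample_position(ev) == expected_index)
    return hits / len(events)


# Two hypothetical seven-sample events: one centered, one early by a sample.
events = [
    [50, 52, 300, 800, 400, 90, 55],
    [50, 300, 800, 400, 90, 55, 51],
]
print([max_sample_position(ev) for ev in events])
print(fraction_centered(events))
```

Filling a histogram with the returned indices for every event of a run gives the kind of distribution used to judge whether the readout timing needs adjusting.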
[Figure 6.12 plot: counts versus sample position (0 to 6). Histogram statistics: 33216 entries, mean 2.851, standard deviation 0.3269. ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.12: Histogram of the maximum sample position corresponding to PMT
42 during a run with 50 GeV electron beam.
A second data quality study compares the energy collected by the two PMTs reading out a cell. The correlation between the two energies gives insight into the relation between the PMT gains: the correlation coefficient is expected to be one, or close to one, if the beam targets the cell center. A correlation coefficient different from one would indicate that one of the PMTs is malfunctioning or that non-optimal HV values are applied. Figure 6.13 shows the correlation between the energy of the two PMTs reading out the BC8 cell of the Demonstrator module in a run with a 50 GeV electron beam.
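The correlation coefficient referred to here can be computed directly from the per-event charges of the two PMTs. The sketch below assumes the Pearson coefficient is the quantity meant; the function name and the charge values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-event charges (pC) seen by the two PMTs of one cell.
pmt_a = [5.1, 6.0, 7.2, 8.1, 9.0]
pmt_b = [5.6, 6.7, 7.9, 9.0, 10.1]
print(f"r = {pearson_r(pmt_a, pmt_b):.3f}")
```

Note that the coefficient measures how tightly the two responses track each other; the ratio of the two gains appears instead as the slope of the correlation.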
Finally, the correlation between the reconstructed times of two pulse signals also provides information about the data quality. The reconstructed time of a pulse refers to the time difference between the reconstructed pulse peak and the sampling clock edge. However, since the testbeam trigger system is not synchronized with the beam, the reconstructed peak time varies uniformly within ±25 ns. Figure 6.14 shows the correlation between the reconstructed time of two PMTs connected to cell BC6 (a) and the reconstructed time for only one PMT (b).
[Figure 6.13 plot: charge in PMT 42 (pC) versus charge in PMT 41 (pC). Histogram statistics: 32736 entries, mean x 6.342, mean y 7.09, standard deviation x 3.989, standard deviation y 4.564. ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.13: PMT energy correlation corresponding to cell BC8 during a run
with 50 GeV electron beam. As can be observed, the correlation coefficient
corresponds to a value close to one indicating a good equalization of the PMT
gains.
[Figure 6.14 (a) plot: reconstructed time of PMT 30 (ns) versus reconstructed time of PMT 29 (ns). ATLAS Tile Calorimeter, 2016 testbeam data.]
(a) Correlation between the reconstructed time for PMTs 29 and 30 reading the BC6 cell. The timing correlation coefficient has a value close to one. The displacement of the maximum peak from the fourth sample produces an offset of 12.5 ns on the y axis and 40 ns on the x axis.
[Figure 6.14 (b) plot: counts versus time (ns). Histogram statistics: 51084 entries, mean 13.41, standard deviation 7.508. ATLAS Tile Calorimeter, 2016 testbeam data.]
(b) Reconstructed time for PMT 30. The phase difference between the pulse peak and the expected arrival takes random values in a 50 ns range centered at 12.5 ns.
Figure 6.14: Timing plots for BC6 cell taken during a run with 180 GeV muon
beam.
6.6.2 Offline data analysis
Different selections are applied to the reconstructed event data in order to select single-particle events and to perform the different studies for the detector characterization. The first selection requirement is implemented with the information collected with the beam elements during the run. The data from the beam chambers are used to reject events containing particles with tracks off the beam axis or not parallel to it, since in this case the energy might not be deposited at the expected location in the module. In addition, selections are applied using the scintillator counters to remove events with particles generating undesired showers in the modules.
After the first selections, higher level methods are applied to effectively
isolate specific types of particles for the different studies. For example, Fig-
ure 6.15 (a) shows the energy spectrum measured in the Demonstrator module
with a 100 GeV electron beam where the pions and muons are clearly sepa-
rated. Muons are mixed with pedestal and noise in the low energy side of the
plot, while hadrons and electrons stay at higher energies.
The muon contamination can be easily removed for energies below 10 GeV by requiring the measured cell energy to be higher than 5 GeV. However, the pion-electron separation requires more sophisticated selections, since electron beams usually contain comparable populations of hadrons and electrons. Most methods for electron-pion separation exploit the difference between electromagnetic and hadronic shower profiles. Electromagnetic showers produce a higher energy density in the impact region than hadronic showers, since the energy is deposited in a more concentrated region.
One method to separate pions and electrons, called Hot Cells [100], uses the measured energy of a single-particle event versus the number of cells that contain signal above the cell noise. Figure 6.15 (b) shows the reconstructed energy of a run with a 100 GeV electron beam versus the number of cells (Ncells) over an energy threshold of 0.05 pC [101]. For this analysis, the cells of the Demonstrator and the bottom legacy module are included in the calculation of the energy. The cluster on the top left side corresponds to electrons, which have higher energy concentrated in a lower number of cells; the pions are at the center and the muons in the bottom part of the plot.
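The two quantities plotted against each other in the Hot Cells method can be sketched as below. Only the 0.05 pC threshold comes from the text; the function name, the data layout and the cell charges are hypothetical, and no classification cuts are implied.

```python
def hot_cells_point(cell_charges_pc, threshold_pc=0.05):
    """Return (Ncells, total charge in pC) for one event.

    Ncells counts the cells whose charge exceeds the threshold; the total
    sums every cell that was passed in.
    """
    n_cells = sum(1 for q in cell_charges_pc if q > threshold_pc)
    return n_cells, sum(cell_charges_pc)


# Hypothetical events: a compact shower and a more spread-out one.
compact = [60.0, 35.0, 0.02, 0.01, 0.03]
spread = [12.0, 9.0, 7.5, 4.0, 2.5]
print(hot_cells_point(compact))
print(hot_cells_point(spread))
```

Plotting the second value against the first for every event gives the kind of two-dimensional distribution in which compact, high-energy events fall at low cell counts.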
[Figure 6.15 (a) plot: counts versus energy (pC). ATLAS Tile Calorimeter, 2016 testbeam data.]
(a) Spectrum of the energy measured with
the Demonstrator module during a run with
100 GeV electron beam.
[Figure 6.15 (b) plot: energy (pC) versus N cell, with a color scale from 1 to 10^3. ATLAS Tile Calorimeter, 2016 testbeam data.]
(b) Event energy versus the number of cells
over the threshold for a run with 100 GeV
electron beam.
Figure 6.15: Energy spectrum of a run with 100 GeV electron beam and the
representation of the energy versus the number of Hot Cells.
For electron beams with energies below 20 GeV the pion-electron separation
can be improved by combining the data provided by the Cherenkov chambers
with the Average Density (AvD) [100] expressed in Equation 6.4. The AvD is
calculated as the sum of cell energy densities over the total number of cells with
signal above a threshold.
$$\mathrm{AvD} = \frac{1}{N_{cell}} \sum_{i=1}^{N_{cell}} \frac{E_i}{V_i} \qquad (6.4)$$
where Ei represents the deposited energy in cell i, Vi is the corresponding cell
volume and Ncell is the number of cells with deposited cell energy greater than
0.06 pC.
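Equation 6.4 translates directly into code. The following is a minimal sketch: the 0.06 pC threshold is the one quoted above, while the function name, the (charge, volume) data layout and the numerical values are hypothetical.

```python
def average_density(cells, threshold_pc=0.06):
    """Average Density of Equation 6.4.

    cells is an iterable of (charge_pc, volume_cm3) pairs; only cells with
    charge above the threshold enter both the sum and Ncell.
    """
    hot = [(e, v) for e, v in cells if e > threshold_pc]
    if not hot:
        return 0.0  # no cell above threshold: density undefined, report zero
    return sum(e / v for e, v in hot) / len(hot)


# Hypothetical event: two cells above threshold and one below it.
event = [(8.0, 16.0), (4.0, 16.0), (0.01, 16.0)]
print(average_density(event))  # (0.5 + 0.25) / 2
```

Because Ncell appears in the denominator, events with the same deposited charge but a different number of cells above threshold fall into distinct AvD groups.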
Figure 6.16 shows a scatter plot of the second Cherenkov counter signal versus the AvD in the module for a 20 GeV run at η = −0.25. Electrons are separated on the right side, where the AvD is higher, in two main groups corresponding to different values of Ncell in Equation 6.4.
[Figure 6.16 plot: Cherenkov counter signal (ADC counts) versus AvD (1000·pC/cm³). ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.16: Scatter plot of the Cherenkov counter versus the AvD in the module
for a run with 20 GeV electron beam. Pions are confined in the lower left
quadrant and electrons in the right. The energy density for electrons is separated
into two main regions corresponding to two or three cells above threshold.
These selection requirements allow identification of the different types of particles. As introduced before, the EM scale constant of the Tile Calorimeter cells is determined by measuring the energy of single-particle electron events. Figure 6.17 shows the ratio between the electron energy and the beam energy in a run with a 100 GeV electron beam after applying a cut based on the AvD method. The EM calibration constant obtained is 1.031 pC/GeV with a σ of 0.045 pC/GeV, close to the expected 1.05 pC/GeV calculated in previous testbeams [101].
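The normalization behind the EM constant can be illustrated with a short sketch. It uses the sample mean and standard deviation of the per-event ratios, which is an assumption made for simplicity; the values quoted above come from a Gaussian fit to the distribution. The function name and the charges are hypothetical.

```python
from statistics import mean, stdev


def em_scale(charges_pc, beam_energy_gev):
    """Mean and sample standard deviation of charge / beam energy (pC/GeV)."""
    ratios = [q / beam_energy_gev for q in charges_pc]
    return mean(ratios), stdev(ratios)


# Hypothetical reconstructed charges (pC) of selected 100 GeV electrons.
charges = [103.0, 105.0, 101.0, 99.0, 107.0]
constant, sigma = em_scale(charges, 100.0)
print(f"EM constant = {constant:.3f} pC/GeV, sigma = {sigma:.3f} pC/GeV")
```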
[Figure 6.17 plot: counts versus charge normalized to the beam energy (pC/GeV). Histogram statistics: 1567 entries. Gaussian fit: constant 195.4 ± 7.0, mean 1.031 ± 0.001, sigma 0.04493 ± 0.00110. ATLAS Tile Calorimeter, 2016 testbeam data.]
Figure 6.17: Energy in pC of the cell A8 normalized to the beam energy of
100 GeV electrons at 20 degrees after isolating the electrons.
The study of the muon signals permits the evaluation of the electronics performance, since the TileCal response to muons is close to the pedestal values. Muons are isolated from the electronic noise by applying energy cuts on all the cells that the beam crosses except the cell under study. Figure 6.18 shows an example of muon signal isolation for a run with a 180 GeV muon beam at θ = 0.15. The pedestal shown in the plot corresponds to the reconstructed signal in cell D1 for a run with no beam on the Demonstrator module.
Figure 6.18: Isolated muon signal from the pedestal in cell D1 for a run with
180 GeV muon beam at 20 degrees.
Chapter 7
Conclusions
This PhD dissertation is focused on the design, production and integration tests
of the first prototype of the TilePPr module for the ATLAS Tile Calorimeter
in the HL-LHC.
Before the complete replacement of the readout electronics for the HL-LHC, the Demonstrator project aims to evaluate and qualify the upgraded readout electronics with the installation of a Demonstrator module in the ATLAS experiment. The TilePPr prototype includes all the components required to read out and operate the Demonstrator module, implementing the trigger architecture envisaged for the HL-LHC as well as keeping backward compatibility with the current DAQ system.
A detailed description of the design and validation of the TilePPr prototype has been presented. Extensive signal integrity studies were performed to validate the PCB layout, with special attention paid to the impedance discontinuities along the high-speed traces that degrade the signal integrity, such as the DC-blocking capacitors or the differential vias.
BER tests and eye diagrams were measured with a sampling oscilloscope as part of the validation tests of the prototype. The results presented in Chapter 4 show good performance of the prototype, with a BER better than $5 \cdot 10^{-17}$ for a Confidence Level of 95%, as well as low jitter values in the optical link signals, with a TJ($10^{-18}$) of 49.5 ps with a σ of 2.5 ps at 4.8 Gbps and 57 ps with a σ of 2.3 ps at 9.6 Gbps.
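To give a sense of scale for such a limit, a common way to turn an error-free test into a BER upper bound at a given confidence level is BER < −ln(1 − CL)/N, where N is the number of transmitted bits. Whether this is the exact procedure followed in Chapter 4 is an assumption; the sketch below only illustrates the arithmetic, with hypothetical function names.

```python
import math


def ber_upper_limit(n_bits, confidence=0.95):
    """BER upper bound after n_bits received without any error."""
    return -math.log(1.0 - confidence) / n_bits


def bits_required(ber_limit, confidence=0.95):
    """Error-free bits needed to claim BER below ber_limit."""
    return -math.log(1.0 - confidence) / ber_limit


n = bits_required(5e-17)
print(f"error-free bits needed: {n:.3e}")
print(f"limit recovered from that count: {ber_upper_limit(n):.1e}")
```

Under this assumption a limit of $5 \cdot 10^{-17}$ at 95% confidence corresponds to roughly $6 \cdot 10^{16}$ bits received without error.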
The limited number of clocking resources in the Readout FPGA forced a
major modification of the GBT-FPGA IP core, where the MMCM of the GBT
receivers was replaced by a BO-CDR implemented in logic. This modification
permitted the implementation of a high number of GBT links in FPGAs. More-
over, the TileCal links required the modification of the IP core, where the data
rate was increased from 4.8 Gbps to 9.6 Gbps in the uplink by tuning the con-
figuration of the embedded transceivers and clocking resources. A special effort
was made during the design of the GBT links and all the firmware blocks im-
plemented in logic handling the readout samples, such as the data packers or
deserialization blocks, to provide a fixed and deterministic latency.
The integration of the TilePPr prototype with the Demonstrator module was covered in detail, describing all the firmware blocks implemented in the front-end and back-end electronics required for the operation of the Demonstrator module.
Another important contribution is the implementation of a phase monitoring tool, called OSUS, which synchronizes the Demonstrator module with the LHC clock distributed by the TTC system and monitors the stability of the clock transmitted to the front-end electronics with a precision of about 30 ps RMS. Moreover, the OSUS circuit was used to study the latency variations of the GBT links produced after the reconfiguration of the front-end electronics. As covered in this thesis, the glitches generated during the sampling of the clocks degrade the system resolution. Different debouncing techniques were studied to estimate the time position of the positive edge. These studies made it possible to improve the time resolution by about 15 ps RMS when using the Average Position method instead of the First Edge method.
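The difference between the two estimators can be illustrated on a glitchy sampled clock. The definitions used here are an assumption made for illustration only: First Edge is taken as the index of the first 0-to-1 transition, and Average Position as the mean index of all 0-to-1 transitions in the glitch region. The definitions given in the earlier chapters take precedence, and the bit pattern is invented.

```python
def rising_edges(bits):
    """Indices at which the sampled signal goes from 0 to 1."""
    return [i for i in range(1, len(bits)) if bits[i - 1] == 0 and bits[i] == 1]


def first_edge(bits):
    """Position estimate: index of the first rising edge, or None."""
    edges = rising_edges(bits)
    return edges[0] if edges else None


def average_position(bits):
    """Position estimate: mean index of all rising edges, or None."""
    edges = rising_edges(bits)
    return sum(edges) / len(edges) if edges else None


# Hypothetical sampled clock with glitches around the true transition.
sampled = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
print(first_edge(sampled))        # earliest glitch only
print(average_position(sampled))  # averages over the glitch region
```

Averaging over the whole glitch region makes the estimate less sensitive to where the first spurious transition happens to fall, which is consistent with the better resolution reported for that method.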
Finally, the TilePPr prototype was integrated with the rest of the readout electronics during three testbeam campaigns and served as the main element of the back-end electronics system. Following the TDAQ architecture for the HL-LHC, the TilePPr prototype distributed the LHC clock and configuration commands to the front-end electronics through optical fibers and read out the digitized samples at the LHC frequency. Data was stored in pipeline memories until the reception of an L1A signal, at which point the selected data was formatted and transmitted to the ROD and FELIX systems. Chapter 6 includes a detailed description of the testbeam setup, facilities, trigger and data acquisition architecture, together with the physics analysis performed with the data collected during the testbeam campaigns. These tests compared the performance of the legacy and upgraded readout electronics systems.
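The pipeline-and-trigger scheme can be modelled in software as a fixed-depth ring buffer. This is only a behavioural sketch, not the firmware: the class name, the depth of 256, the latency value and the way the seven samples are addressed are all hypothetical.

```python
from collections import deque


class SamplePipeline:
    """Fixed-depth pipeline that keeps the most recent samples."""

    def __init__(self, depth=256, n_samples=7):
        self.buffer = deque(maxlen=depth)  # oldest entries drop off
        self.n_samples = n_samples

    def push(self, sample):
        """Store one sample per bunch crossing."""
        self.buffer.append(sample)

    def on_l1a(self, latency):
        """Return n_samples consecutive samples starting `latency`
        bunch crossings before the most recent one."""
        start = len(self.buffer) - 1 - latency
        if start < 0 or start + self.n_samples > len(self.buffer):
            raise ValueError("requested window is not in the pipeline")
        return [self.buffer[i] for i in range(start, start + self.n_samples)]


pipeline = SamplePipeline(depth=16)
for bunch_crossing in range(20):   # the sample value is just its index
    pipeline.push(bunch_crossing)
print(pipeline.on_l1a(latency=10))
```

The depth has to exceed the trigger latency plus the window length, otherwise the samples of interest are overwritten before the L1A arrives.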
The future work derived from the TilePPr prototype includes the design of the TilePPr module for the HL-LHC, which will be based on the prototype presented here. The final design will be capable of operating up to 8 complete modules and will be composed of an ATCA carrier with four AMC slots hosting the CPMs. The experience gained on high-speed data transmission techniques applied to the TilePPr prototype will be used for the future TilePPr module. In addition, many of the firmware blocks described in this document, such as the OSUS circuit or the Tile GBT-FPGA IP core, will be used in the final version of the TilePPr module.
Chapter 8
Summary
This thesis has been carried out within the framework of the Tile Calorimeter (TileCal) Demonstrator project. The project aims to evaluate and qualify the readout electronics of the TileCal detector for the High Luminosity Large Hadron Collider (HL-LHC). The plans of the Demonstrator project include the installation of a prototype module with the new electronics inside the A Toroidal LHC ApparatuS (ATLAS) experiment. In addition, the Demonstrator module will be tested with particle beams during several testbeam periods in order to study the performance of the prototypes.
This thesis presents the design, integration and installation of the first Tile PreProcessor (TilePPr) prototype. This prototype has been designed to operate and read out the Demonstrator module as the first and main element of the back-end electronics. The firmware developed for the DaughterBoard and TilePPr prototypes is also presented.
8.1 Introduction
The Large Hadron Collider (LHC) is the largest and most powerful particle accelerator in the world. It is located at the European Organization for Nuclear Research (CERN), 100 meters underground, in a circular tunnel of 27 kilometers that crosses the border between France and Switzerland. The LHC is the last in a chain of accelerators used to increase the energy of the proton beams. Figure 8.1 shows the CERN accelerator complex. The different accelerators increase the energy of the proton beams up to 450 GeV, at which point the beams are injected into the LHC, where they are finally accelerated to their maximum energy.
Figure 8.1: CERN accelerator complex.
Located around the LHC ring, the four main experiments analyze the particles produced in the collisions of the proton beams. These experiments are ATLAS, CMS, ALICE and LHCb. The two largest experiments, ATLAS and CMS, are general-purpose detectors located on opposite sides of the LHC. Both detectors have been designed to measure precisely the properties of the strong and electroweak interactions of elementary particles, as well as to search for new physics beyond the Standard Model. In 2012 both experiments announced the discovery of the Higgs boson with a mass of around 125 GeV. LHCb, on the other hand, studies the violation of CP symmetry, and ALICE investigates the quark-gluon plasma through heavy-ion collisions.
8.1.1 ATLAS experiment
The ATLAS experiment (Figure 8.2) is a general-purpose detector designed to study the particles resulting from the collisions of the proton beams in the LHC. ATLAS is the largest detector at the LHC, 45 meters long and more than 25 meters high, with a total weight of approximately 7,000 tonnes. The ATLAS experiment is composed of different subdetectors that study the particles generated in the collisions. These subdetectors include the inner detector, the electromagnetic and hadronic calorimeters, and the muon spectrometer.
Figure 8.2: The ATLAS experiment.
The inner detector is located in the innermost part of ATLAS, around the beam pipe. This detector allows the reconstruction of the vertices and of the trajectories of the charged particles that cross it. Surrounding it are the electromagnetic and hadronic calorimeters, which measure the energy deposited by the different particles. In the outermost layer of ATLAS, the muon spectrometer measures the trajectory of the charged particles that the calorimeters have not stopped. Finally, the ATLAS detector is surrounded by three toroid magnets that generate a field of 0.5 T, bending the trajectory of charged particles.
8.1.2 TileCal hadronic calorimeter
The Tile hadronic calorimeter is one of the subdetectors of the ATLAS experiment and covers the region |η| < 1.7. TileCal is a sampling calorimeter that uses steel as the absorber material and scintillator as the active medium. The subdetector is divided into a central barrel (LBA, LBC) 5.8 meters long and two extended barrels (EBA, EBC) 2.6 meters long each. Each barrel consists of 64 modules (Figure 8.3), which are in turn divided into cells. Each TileCal cell is read out using two photomultiplier tubes (PMTs) and wavelength-shifting fibers. A total of 9852 PMTs are needed for the complete readout of TileCal.
Figure 8.3: Structure of the TileCal modules.
During collisions, the particles produced at the center of ATLAS that cross TileCal deposit their energy in the detector cells, producing an amount of light that is guided to the PMTs. The PMTs generate an electrical pulse with an amplitude proportional to the light produced in the cell. The 3-in-1 card receives this pulse and shapes it, generating two analog pulses with an amplitude ratio of 1:64. The two pulses generated by the 3-in-1 card are digitized with a 10-bit ADC using a 40 MHz clock synchronous with the beam collisions in the LHC. In addition, the pulses are summed in analog form in groups of up to 5 signals corresponding to cells with the same η, and sent to the Level-1 trigger system.
The digitized signals are stored in the memories of the TileDMU chips until the reception of a trigger signal (L1A) generated by the Level-1 trigger system. This signal indicates that the data have been selected and must be processed. The Level-1 selection occurs at a maximum average rate of 100 kHz. Once the signal is received, the data are transmitted to the Read Out Driver (ROD) boards for processing, and the reconstructed energy and time of the received pulses are transmitted to the next trigger level.
8.1.3 ATLAS experiment upgrades and the High Luminosity LHC
In 2026 the LHC accelerator will be upgraded to become the HL-LHC. The new accelerator will increase the instantaneous luminosity by a factor of 5 compared with the current LHC, and the integrated luminosity by up to a factor of 10. The design of the HL-LHC and the consequent upgrade of the experiments installed in it represent a great technological challenge. The new accelerator entails the development of new accelerator technologies, such as superconducting magnets and cavities, and of electronic systems able to acquire and process the extraordinary amount of data generated by the experiments.
The ATLAS detector upgrade project, called Phase II Upgrade, is divided into three phases that correspond to the three long shutdown periods (Figure 8.4). After the LS3 technical stop, the different subdetectors will have been upgraded to operate under the new HL-LHC luminosity conditions. The number of collisions per bunch crossing will increase from 20 to 200, requiring the subdetectors to provide more precise information, as well as a new data acquisition system capable of handling the volume of data generated.
Some ATLAS subdetectors, such as the inner detector, the Forward LAr Calorimeter and the Forward Muon Wheels, will suffer more from the effects of radiation, requiring the replacement of both the detector and the electronics. Other subdetectors, such as the calorimeters or the muon spectrometer, which are less affected by radiation, will only need to replace the readout and acquisition electronics to cope with the new radiation levels and data bandwidth.
Figure 8.4: LHC plan for the next 10 years, including the technical stops and the upgrades to increase the luminosity.
8.1.4 Demonstrator project
The Demonstrator project aims to evaluate the new data acquisition electronics before the complete replacement of the readout electronics during the upgrade of the ATLAS experiment for the HL-LHC. Within this project the Demonstrator module has been built, which includes prototypes of the new electronic systems. The Demonstrator module is divided into 4 equal parts, each consisting of an aluminum mechanical structure, called a minidrawer, which holds the following readout electronics:
• Up to 12 PMT blocks: each PMT block contains a photomultiplier tube (PMT) and its corresponding modified 3-in-1 card, which shapes the generated signals. The modified 3-in-1 card is designed with discrete components and provides two analog pulses with a ratio of 1:32 and a full width at half maximum (FWHM) of 50 ns. This card is based on the 3-in-1 card currently used in TileCal.
• 3-in-1 MainBoard: allows the operation of up to 12 PMT blocks and includes the ADCs needed to digitize the PMT signals. This board is also in charge of transmitting the digitized signals from the ADCs to the DaughterBoard.
• Adder base board: provides mechanical support and power to the adder cards, which perform the analog sum of the conditioned signals corresponding to cells with the same η. These cards provide the analog event information to the ATLAS Level-1 trigger system.
• DaughterBoard: this board contains two high-performance Field Programmable Gate Arrays (FPGAs) and optical connectors. The DaughterBoard is the interface with the back-end electronics: it transmits the digitized signals together with detector information through high-speed links, and it receives and distributes the synchronization signals and configuration commands.
• HV board: this board regulates the high voltage applied to the PMTs. There are two options for the high-voltage distribution. In the first option, the HVOpto card is supplied with high voltage and regulates the voltage of each channel independently, whereas in the second option the voltage of each PMT is provided individually from outside the detector and the HV board only distributes the supplies.
Figure 8.5 shows the mechanical structure of a minidrawer, indicating the different parts that make up the front-end electronics.
In addition, two alternatives to the 3-in-1 card are also evaluated within this project for the acquisition of the PMT pulses at the HL-LHC: the Front-end ATlAs tiLe Integrated Circuit (FATALIC) chip and the Charge Integrator and Encoder (QIE) chip. FATALIC is an Application-Specific Integrated Circuit (ASIC) that includes a signal conditioning stage with three gains (1, 8, 64) and a 12-bit ADC integrated in the same ASIC. The QIE ASIC, on the other hand, consists of a current splitter with multiple ranges and an internal ADC. Unlike the other options, the QIE provides the integrated value of the current signal generated by the PMT.
Figure 8.5: Detailed view of a minidrawer and the different front-end electronics boards.
8.2 TilePPr prototype
The Tile PreProcessor prototype is the first and most important component of the back-end electronics in the Demonstrator project. This prototype is able to read out and operate one TileCal module, representing one eighth of the final TilePPr system that will be designed for the HL-LHC.
The TilePPr receives and processes digital data from the TileCal module, and it is also in charge of synchronizing the detector and transmitting the configuration commands that operate the front-end electronics. The TilePPr prototype also communicates with the Detector Control System (DCS), configuring and monitoring the voltage applied to the PMTs. During the operation of the Demonstrator module, the TilePPr prototype stores the data transmitted by the front-end electronics in pipeline memories every 25 ns. When the TilePPr prototype receives an L1A signal, it transmits the selected data to the Front-End LInk eXchange (FELIX) system and to the ROD, thus integrating the Demonstrator module into the current acquisition system.
In addition, the TilePPr prototype includes digital tools for monitoring the phase difference between periodic signals. These tools have played an important role in measuring the latency of the optical links with precisions below 30 ps RMS, as well as in synchronizing the Demonstrator module with the Timing, Trigger and Control (TTC) system, which provides the LHC clock and synchronization signals.
As part of the Demonstrator project, the TilePPr prototype has been used to read out and operate the three front-end options (3-in-1, QIE and FATALIC) during 3 test periods in which the electronics was tested with particle beams. In addition, the installation of the Demonstrator module in the current ATLAS experiment is foreseen during one of the short LHC stops in Run 2, where the TilePPr prototype will be used for its operation and readout as the main block of the back-end.
The TilePPr prototype (Figure 8.6) has been designed in a double Advanced Mezzanine Card (AMC) form factor. The board can therefore be operated in an Advanced Telecommunications Computing Architecture (ATCA) carrier or directly in a Micro Telecommunications Computing Architecture (µTCA) system.
Figure 8.6: Picture of the TilePPr prototype.
The core of the board consists of two Xilinx 7 series FPGAs: a Virtex 7 FPGA (Readout FPGA) and a Kintex 7 FPGA (Trigger FPGA). These FPGAs contain high-speed transceivers, a high density of logic resources and a large number of digital signal processing (DSP) blocks. The TilePPr board integrates 4 Quad Small Form-factor Pluggable (QSFP) optical modules for communication with the front-end electronics, as well as Avago MiniPOD modules for evaluation purposes.
During the design of the Printed Circuit Board (PCB), numerous signal
integrity simulations were performed, especially on the lines intended to
operate at 10 Gbps. These simulations were a crucial step both in defining
the geometry of the high-speed traces and in detecting and optimizing the
impedance discontinuities along them.
Signal integrity was analysed using Scattering (S) parameters extracted with
the ANSYS SIwave and HFSS tools, together with Time Domain Reflectometry (TDR)
simulations. Figure 8.7 shows an example of the simulations performed to
reduce the impedance discontinuity introduced by capacitors of size 0201 and
0402, which showed that 0201 capacitors degrade signal integrity less.
(a) Comparison of the TDR response of differential traces with DC-blocking
capacitors in 0201 (blue) and 0402 (red) packages, using a step signal with a
rise time of 40 ps.
(b) Comparison of the insertion loss (SDD21 parameter) of differential traces
with DC-blocking capacitors in 0201 (blue) and 0402 (red) packages.
Figure 8.7: Comparison of the TDR simulation and insertion loss results for
two traces with DC-blocking capacitors in 0201 and 0402 packages.
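A TDR curve such as the one in Figure 8.7 can be read through the standard transmission-line relation between a local impedance Z and the fraction of the incident step that it reflects, ρ = (Z − Z0)/(Z + Z0). The following Python sketch evaluates that relation in both directions. The 100 Ω differential reference impedance and the two dip impedances are illustrative assumptions, not values taken from the simulations.

```python
def reflection_coefficient(z_ohm: float, z0_ohm: float = 100.0) -> float:
    """Fraction of the incident step reflected at a jump from z0_ohm to z_ohm."""
    return (z_ohm - z0_ohm) / (z_ohm + z0_ohm)


def impedance_from_reflection(rho: float, z0_ohm: float = 100.0) -> float:
    """Inverse relation: local impedance seen by a TDR for a reflection rho."""
    return z0_ohm * (1.0 + rho) / (1.0 - rho)


# Hypothetical impedance dips at a capacitor landing pad (assumed values).
for label, z in (("smaller pad", 95.0), ("larger pad", 85.0)):
    rho = reflection_coefficient(z)
    print(f"{label}: Z = {z:.0f} ohm, reflected fraction = {rho:+.3f}")
```

A shallower impedance dip gives a reflection closer to zero, which is the sense in which a smaller capacitor pad disturbs the line less.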
Once the first prototypes had been built, tests were carried out to verify
that they met the design specifications. As part of the evaluation tests of
the TilePPr prototype, eye diagrams were measured for the optical links
operating at 4.8 Gbps and 9.6 Gbps. Figure 8.8 shows an eye diagram obtained
from one of the optical outputs of a QSFP module connected to the TilePPr
prototype and operating at 9.6 Gbps.
Figure 8.8: Eye diagram of the optical output of a QSFP module operating at
9.6 Gbps.
8.3 Objectives
The objectives of this thesis include the development and installation of a
first electronic prototype for the data acquisition of the Demonstrator
module, which demonstrates the feasibility of the new data acquisition
strategy proposed for the TileCal detector at the HL-LHC. This prototype must
also serve as the basis for the future design of the final version of the
back-end electronics that will operate at the HL-LHC. The objectives are
divided into the following points, which largely match the structure of the
thesis:
• Development of a back-end electronics prototype that meets the requirements
of the TileCal data acquisition system for the HL-LHC. This prototype must
be able to operate and read out the Demonstrator module, and to transmit
information to the different elements of the data acquisition system at the
HL-LHC. In addition, the prototype must allow the Demonstrator module to be
integrated into the data acquisition system of the ATLAS experiment.
• Development of FPGA-based digital tools for measuring the phase between two
periodic signals with a precision better than 30 ps RMS. These tools must
allow the correct synchronization of the Demonstrator module with the LHC
through the TilePPr, and serve as a method for studying and detecting phase
variations in the transmitted clocks caused by changes in the temperature or
supply voltage of the electronics and fibers.
• Integration of the TilePPr prototype with the Demonstrator module to study
the performance of the electronics designed for the HL-LHC. This study,
which will include test beam campaigns, will allow the performance of the
new electronics to be compared with that of the electronics currently used
in the TileCal detector. These tests will be carried out at CERN, where the
electronics will be tested under conditions similar to those of operation.
8.4 Methodology
The methodology followed in this doctoral thesis consists of the following
phases:
Study of data acquisition systems for High Energy Physics experiments
The work began with a study of the current data acquisition system of the
ATLAS experiment and, in particular, of the TileCal detector electronics. An
in-depth study of the needs of the new data acquisition system for the
TileCal detector at the HL-LHC was also carried out, in which the
requirements of the new electronics were identified.
Conceptual design of the data acquisition system
Once the needs of the data acquisition system had been defined, a prototype
design based on FPGAs and high-speed optical connectors was proposed. To
prepare this proposal, different architectures of high-speed transceivers
embedded in FPGAs were first studied, looking for those that provide
sufficient bandwidth and can establish high-speed communication with fixed
and deterministic latency. This study also checked that the logic resources
and embedded memory of the FPGAs were sufficient to host the required
algorithms and architectures. Finally, an extensive study was carried out to
define which other components had to be included in the design to implement
the high-speed communication and the remaining required functionalities.
Physical design and verification of the first prototype
Based on the studies carried out for the conceptual design proposal, the
components of the first prototype were selected, such as the FPGAs, memories,
clock devices and optical connectors.
During the physical design of the TilePPr board, extensive signal integrity
studies were performed with 3D electromagnetic field simulation tools. These
studies allow the trace geometry to be designed correctly and the impedance
discontinuities along the high-speed lines, caused by differential vias or
DC-blocking capacitors, to be detected.
As the final part of this phase of the thesis, the manufactured prototypes
were verified. To this end, the quality of the optical communications
implemented with the TilePPr board was studied using a sampling oscilloscope
with optical inputs and a bandwidth of 9 GHz.
Implementation of the different functionalities in FPGA
Once the operation of the TilePPr prototype had been validated, the FPGAs
were programmed to implement the different functionalities. The FPGA firmware
development was divided into two parts. The first part includes all the
blocks needed for communication with the TileCal detector and its integration
into the data acquisition system of the ATLAS experiment. These blocks
provide the following functionalities:
• Transmission of configuration commands and synchronization signals to the
front-end electronics.
• Reception of data from the front-end electronics and storage in pipeline
memories.
• Transmission of selected data to the ROD and FELIX systems.
• Reception of commands and synchronization signals from the TTC system.
Regarding the high-speed links, an extensive study was made of the resources
needed to implement multiple optical links in an FPGA using the GigaBit
Transceiver (GBT) protocol. The modifications that should be made to this
protocol to meet the bandwidth requirements of the TileCal detector were also
identified.
In the second part of the FPGA firmware development, digital FPGA techniques
for measuring the phase between periodic signals with a precision better than
30 ps RMS were studied. Implementing this kind of technique in the TilePPr
prototype is necessary for two fundamental reasons: knowing the phase
difference between the local clocks of the TilePPr and the accelerator clock,
in order to synchronize the front-end electronics with the bunch crossings in
ATLAS; and continuously monitoring the phase of the clock transmitted to the
front-end electronics, in order to study its stability over time.
As a result of this study, a new circuit was proposed, called OverSampling to
UnderSampling (OSUS), based on the Digital Dual Mixer Time Difference (DDMTD)
circuit. The OSUS circuit improves the performance of the DDMTD circuit by
oversampling the input signals and decomposing the resulting signals into
undersampled signals. In addition, different deglitching methods were
analysed to improve the precision of the undersampling technique by
estimating the time of the rising edges of the resulting signals. Following
this analysis, a new method called Average Position is proposed, which
improves the precision of the circuit.
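The time magnification behind the DDMTD technique, on which OSUS builds, can be illustrated with an idealized, jitter-free model. Two clocks of period T are sampled with a helper clock of period T(N+1)/N, so each sample advances the equivalent sampling instant by T/N, and the delay between the clocks appears as a number of helper cycles between the rising edges of the two sampled signals. The sketch below is not the OSUS circuit; the function names, the 25 ns period and N = 1000 are assumptions chosen for illustration.

```python
def sample_clock(delay_ps, period_ps, helper_ps, n_samples):
    """Sample a 50 % duty-cycle clock, delayed by delay_ps, at times k * helper_ps."""
    half = period_ps // 2
    return [1 if (k * helper_ps - delay_ps) % period_ps < half else 0
            for k in range(n_samples)]


def first_rising_edge(bits):
    """Index of the first 0 -> 1 transition in a sampled signal."""
    for k in range(1, len(bits)):
        if bits[k - 1] == 0 and bits[k] == 1:
            return k
    raise ValueError("no rising edge found")


def ddmtd_measure(delay_ps, period_ps=25_000, n=1_000):
    """Estimate the delay (in ps) between two clocks of equal frequency."""
    if period_ps % n:
        raise ValueError("period_ps must be a multiple of n in this integer model")
    step_ps = period_ps // n           # equivalent-time sampling step, T/N
    helper_ps = period_ps + step_ps    # helper clock period, T(N+1)/N
    n_samples = 2 * n + 1              # long enough to contain a rising edge
    edge_a = first_rising_edge(sample_clock(0, period_ps, helper_ps, n_samples))
    edge_b = first_rising_edge(sample_clock(delay_ps, period_ps, helper_ps, n_samples))
    # The edge distance in helper cycles, modulo one beat period, times T/N.
    return ((edge_b - edge_a) % n) * step_ps


print(ddmtd_measure(1234))  # true delay 1234 ps, quantized to the 25 ps step
```

With these assumed numbers the resolution of the ideal model is T/N = 25 ps. In a real circuit, jitter makes the sampled signals toggle several times around each edge, which is the problem the deglitching methods address.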
Collection and analysis of the results obtained with the TilePPr prototype
Finally, the ultimate goal of this thesis was to verify the feasibility of
the proposed data acquisition system for operation at the HL-LHC, while also
being able to integrate the Demonstrator module into the current data
acquisition system. To this end, the TilePPr prototype was installed at the
H8 beam line of the SPS accelerator, together with the Demonstrator module,
which included the front-end electronics designed for the HL-LHC. During the
test beam campaigns, the TilePPr prototype was used to read out the data
digitized by the Demonstrator module, distribute the synchronization signals
to the front-end electronics and transmit the selected data to the current
data acquisition system for later analysis. The data collected during the
tests were then analysed and validated the correct operation of the data
acquisition system designed for the upgrade of the TileCal detector.
8.5 Conclusions
This doctoral thesis focuses on the design, production and integration of the
first TilePPr prototype as part of the data acquisition system of the TileCal
detector at the HL-LHC. The Demonstrator project aims to evaluate the new
data acquisition electronics before their installation for the HL-LHC. As
part of this project, the Demonstrator module has been built, equipped with
the new front-end electronics designed for the HL-LHC.
The TilePPr prototype has been designed to read out and operate the
Demonstrator module, implementing the new data acquisition architecture for
the HL-LHC while allowing the Demonstrator module to be integrated into the
current ATLAS data acquisition system. The plans of the Demonstrator project
include installing the Demonstrator module in the current ATLAS experiment,
replacing one of the present modules.
This document presents a detailed description of the design and validation of
the TilePPr prototype. Extensive signal integrity studies were carried out
during its design, with special attention to identifying and optimizing the
impedance discontinuities along the high-speed lines. As part of the
validation tests of the prototype, Bit Error Ratio (BER) tests and eye
diagram measurements were performed on the optical links. The results
presented in Chapter 4 show a BER better than 5 · 10⁻¹⁷ at a 95 % confidence
level, and a total jitter at a BER of 10⁻¹⁸ of 49.5 ps with a σ of 2.5 ps at
4.8 Gbps and 57 ps with a σ of 2.3 ps at 9.6 Gbps.
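A BER limit quoted with a confidence level follows from the number of bits transmitted. For a test in which no errors are observed, the standard estimate is BER < −ln(1 − CL)/n for n bits sent. The sketch below applies this zero-error formula. Whether the measurement in Chapter 4 was error-free, and over how many links the bits were accumulated, is not stated in this summary, so the single-link duration is only an illustration.

```python
import math


def ber_upper_limit(bits_sent, confidence=0.95):
    """Upper limit on the BER after bits_sent bits with zero errors observed."""
    return -math.log(1.0 - confidence) / bits_sent


def bits_required(ber_limit, confidence=0.95):
    """Error-free bits needed to claim BER < ber_limit at the given confidence."""
    return -math.log(1.0 - confidence) / ber_limit


bits = bits_required(5e-17)
days_single_link = bits / 9.6e9 / 86_400   # assumes one link running at 9.6 Gbps
print(f"{bits:.2e} bits, about {days_single_link:.0f} days on one 9.6 Gbps link")
```

Under these assumptions the limit corresponds to about 6 · 10¹⁶ error-free bits, roughly 72 days on a single 9.6 Gbps link; pooling the bits of several links tested in parallel shortens the test proportionally.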
Because of the limited number of clock resources in the Readout FPGA, the
GBT-FPGA IP core was modified: the Phase-Locked Loop (PLL) of the receiver
was replaced by a Blind Oversampling Clock and Data Recovery (BO-CDR) circuit
implemented with logic resources inside the FPGA. This modification made it
possible to implement the required number of GBT links in the Readout FPGA.
Furthermore, owing to the requirements of the new data acquisition system,
the core also had to be modified to increase the reception bandwidth from
4.8 Gbps to 9.6 Gbps. In addition, the firmware blocks involved in the
acquisition and transmission of data, in both the front-end and back-end
electronics, have been designed to provide fixed and deterministic latency.
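The idea of blind oversampling can be sketched in a few lines: the incoming serial stream is sampled several times per bit with a clock that is not phase-locked to the data, the sample phase where transitions cluster marks the bit boundary, and the data are read from the phase half a bit away. The Python model below is a static simplification (fixed 4x oversampling, no jitter or frequency offset, hypothetical function name), not the BO-CDR implemented in the modified GBT-FPGA core.

```python
def recover_bits(samples, osr=4):
    """Recover one bit per osr samples from a blindly oversampled stream."""
    # Histogram of the sample phases (index modulo osr) at which the level changes.
    edge_hist = [0] * osr
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            edge_hist[i % osr] += 1
    # The most populated phase is taken as the bit boundary.
    edge_phase = max(range(osr), key=lambda p: edge_hist[p])
    # Read the data half a bit after the boundary, away from the transitions.
    pick = (edge_phase + osr // 2) % osr
    return samples[pick::osr]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
# Repeat each bit 4 times and drop one sample to emulate an unknown phase.
oversampled = [b for b in bits for _ in range(4)][1:]
print(recover_bits(oversampled))
```

A practical implementation would update the selected phase continuously to follow jitter and slow drifts between the sampling clock and the data.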
Another important contribution of this thesis is the implementation of a
digital FPGA circuit, called OSUS, for measuring the phase difference between
periodic signals with a precision of 30 ps RMS. This tool allows the
Demonstrator module to be synchronized with the TTC system and the clock
transmitted to the front-end electronics to be monitored. In addition, the
OSUS circuit has been used to study the latency stability of the
communication with the front-end in two situations: after a reset of the
front-end electronics and over long periods of time.
Different deglitching techniques have been studied to improve the resolution
of the OSUS circuit by estimating the time of the rising edges of the
oversampled signals. These studies concluded that the proposed Average
Position method improves the resolution of the circuit by approximately
15 ps RMS with respect to the First Edge method.
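The difference between edge estimators can be seen on a short sampled sequence in which jitter makes the signal toggle several times around the true transition. In the sketch below, `first_edge` takes the first 0 → 1 transition, while `average_position` takes the mean index of all transitions in the glitch window. This averaging rule is only one plausible reading of the Average Position method, whose exact definition is given in the main chapters of the thesis and may differ; the sample sequence is invented for illustration.

```python
def transitions(bits):
    """Indices k at which the sampled level differs from the previous sample."""
    return [k for k in range(1, len(bits)) if bits[k] != bits[k - 1]]


def first_edge(bits):
    """Position of the first 0 -> 1 transition."""
    for k in transitions(bits):
        if bits[k] == 1:
            return k
    raise ValueError("no rising edge found")


def average_position(bits):
    """Mean position of all transitions in the window (assumed definition)."""
    edges = transitions(bits)
    if not edges:
        raise ValueError("no transition found")
    return sum(edges) / len(edges)


# Hypothetical sampled signal with glitches around a rising edge.
glitchy = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
print(first_edge(glitchy), average_position(glitchy))
```

The first-edge estimate is biased towards the earliest glitch, whereas an average over the glitch window tends to sit closer to its centre and fluctuates less from one measurement to the next.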
Finally, during the test beam campaigns, the TilePPr prototype was used as
the main element of the back-end electronics, providing the LHC clock and the
configuration commands to the Demonstrator module, receiving the digitized
PMT signals every 25 ns and transmitting the selected data to the ROD and
FELIX systems. The data collected during these tests have made it possible to
evaluate the performance of the new front-end electronics designed for the
HL-LHC by comparing it with the current electronics.
The design of the TilePPr prototype will be used as the basis for developing
the final version of the TilePPr for the HL-LHC. The future TilePPr will be
able to operate up to 8 complete modules and will consist of an ATCA board
hosting 4 AMC modules. A large part of the firmware blocks presented in this
document, such as the OSUS circuit or the Tile GBT-FPGA IP core, will be used
in the final version of the TilePPr.
List of Acronyms
ADC Analog-to-Digital Converter.
AMC Advanced Mezzanine Card.
ASIC Application-Specific Integrated Circuit.
ATCA Advanced Telecommunications Computing Architecture.
ATLAS A Toroidal LHC ApparatuS.
BC Bunch Crossing.
BCID Bunch Crossing IDentifier.
BCR Bunch Counter Reset.
BER Bit Error Ratio.
BO-CDR Blind Oversampling Clock and Data Recovery.
CDC Clock Domain Crossing.
CDR Clock and Data Recovery.
CERN European Organization for Nuclear Research.
CIS Charge Injection System.
CLB Configurable Logic Block.
CPU Central Processing Unit.
CRC Cyclic Redundancy Check.
CTP Central Trigger Processor.
DAC Digital-to-Analog Converter.
DAQ Data AcQuisition.
DB DaughterBoard.
DCS Detector Control System.
DDMTD Digital Dual Mixer Time Difference.
DMTD Dual Mixer Time Difference.
DSP Digital Signal Processor.
ECR Event Counter Reset.
FEB Front-End Board.
FEC Forward Error Correction.
FELIX Front-End LInk eXchange.
FEXT Far-end crosstalk.
FF Flip-Flop.
FIFO First In, First Out.
FMC FPGA Mezzanine Connector.
FPGA Field Programmable Gate Array.
FSM Finite State Machine.
GBT GigaBit Transceiver.
HG High Gain.
HL-LHC High Luminosity LHC.
IPMI Intelligent Platform Management Interface.
JTAG Joint Test Action Group.
L0A Level-0 trigger Accept.
L1A Level-1 trigger Accept.
LFSR Linear-Feedback Shift Register.
LG Low Gain.
LHC Large Hadron Collider.
LSB Least Significant Bit.
LTP Local Trigger Processor.
MB MainBoard.
MMCM Mixed-Mode Clock Manager.
MSB Most Significant Bit.
MUX MUltipleXer.
NEXT Near-end crosstalk.
OSUS OverSampling to UnderSampling.
PCB Printed Circuit Board.
PLL Phase Locked Loop.
PMT PhotoMulTiplier.
PRBS Pseudo-Random Binary Sequence.
PU Processing Unit.
QSFP Quad Small Form-factor Pluggable.
RAM Random Access Memory.
ROD Read Out Driver.
RTM Rear Transition Module.
RTT Round-Trip Time.
SFP Small Form-factor Pluggable.
TDAQ Trigger and Data AcQuisition.
TDAQi Trigger and Data AcQuisition interface.
TDC Time to Digital Converter.
TDR Time Domain Reflectometry.
TileCal Tile Calorimeter.
TilePPr Tile PreProcessor.
TTC Timing, Trigger and Control.
UI Unit Interval.
VCO Voltage-Controlled Oscillator.
Bibliography
[1] Benedikt et al. LHC Design Report. CERN, Geneva, 2004.
[2] The ATLAS Collaboration. The ATLAS Experiment at the CERN Large
Hadron Collider. Journal of Instrumentation, 3(08):S08003, 2008.
[3] The CMS Collaboration. The CMS experiment at the CERN LHC.
Journal of Instrumentation, 3(08):S08004, 2008.
[4] The ALICE Collaboration. The ALICE experiment at the CERN LHC.
Journal of Instrumentation, 3(08):S08002, 2008.
[5] The LHCb Collaboration. The LHCb Detector at the LHC. Journal of
Instrumentation, 3(08):S08005, 2008.
[6] The ATLAS Collaboration. Observation of a new particle in the search for
the Standard Model Higgs boson with the ATLAS detector at the LHC.
Physics Letters B, 716(1):1–29, 2012.
[7] The TOTEM Collaboration. The TOTEM Experiment at the CERN
Large Hadron Collider. Journal of Instrumentation, 3(08):S08007, 2008.
[8] The LHCf Collaboration. The LHCf detector at the CERN Large Hadron
Collider. Journal of Instrumentation, 3(08):S08006, 2008.
[9] J.L. Pinfold. The MoEDAL Experiment at the LHC – a New Light on the
Terascale Frontier. Journal of Physics: Conference Series, 631(1):012014,
2015.
[10] P. Jenni et al. ATLAS high-level trigger, data-acquisition and
controls: Technical Design Report. Technical Design Report ATLAS. CERN,
Geneva, 2003.
[11] ATLAS Collaboration. Readiness of the ATLAS Tile Calorimeter for LHC
collisions. The European Physical Journal C, 70(4), 2010.
[12] The Tile Calorimeter collaboration. The optical instrumentation of the
ATLAS Tile Calorimeter. Journal of Instrumentation, 8(01):P01005,
2013.
[13] K. Anderson et al. Design of the front-end analog electronics for the
ATLAS tile calorimeter. Nuclear Instruments and Methods in Physics
Research Section A: Accelerators, Spectrometers, Detectors and Associated
Equipment, 551(2–3):469 – 476, 2005.
[14] S. Berglund et al. The ATLAS Tile Calorimeter digitizer. Journal of
Instrumentation, 3(01):P01004, 2008.
[15] J. Christiansen et al. TTCrx Reference Manual 3.9. October 2004.
[16] K. Anderson et al. ATLAS Tile Calorimeter Interface Card. In Proceedings
of the 8th Workshop on Electronics for LHC Experiments, 2002.
[17] B. G. Taylor. TTC distribution for LHC detectors. IEEE Transactions
on Nuclear Science, 45(3):821–828, June 1998.
[18] A. Valero et al. ATLAS TileCal Read Out Driver production. Journal of
Instrumentation, 2(05):P05003, 2007.
[19] G. Arduini et al. High Luminosity LHC: challenges and plans. Journal of
Instrumentation, 11(12):C12081, 2016.
[20] ATLAS Collaboration. Letter of Intent for the Phase-II Upgrade of the
ATLAS Experiment. December 2012. Draft version for comments.
[21] ATLAS TDAQ group. ATLAS Trigger & DAQ - Interfaces with Detector
Front-End Systems Requirement Document for HL-LHC. March 2017.
Draft version for comments.
[22] J. Anderson et al. FELIX: a High-Throughput Network Approach for
Interfacing to Front End Electronics for ATLAS Upgrades. J. Phys. Conf.
Ser., 664(8):082050, 2015.
[23] S. Muschter et al. Development of a readout link board for the
demonstrator of the ATLAS Tile calorimeter upgrade. JINST, 8:C03025, 2013.
[24] P. Moreira et al., CERN. GBTx Manual V0.15. October 2016.
[25] F. Tang et al. Design of Main Board for ATLAS TileCal Demonstrator.
In Proceedings, 19th Real Time Conference (RT2014): Nara, Japan, May
26-30, 2014, 2014.
[26] F. Tang et al. Design of the front-end readout electronics for ATLAS Tile
Calorimeter at the sLHC. IEEE Trans.Nucl.Sci., 60:1255–1259, 2013.
[27] The Tile Calorimeter Collaboration. Initial Design for the Phase-II
Upgrade of the ATLAS Tile Calorimeter System. Number
ATL-COM-TILECAL-2016-053. Geneva, December 2016.
[28] T. Roy et al. QIE: Performance Studies of the Next Generation Charge
Integrator. JINST, 10(02):C02009, 2015.
[29] N. Pillet et al. FATALIC, a wide dynamic range integrated circuit for the
TileCal VFE ATLAS upgrade. Proceedings of TWEPP 2011, 2011.
[30] G. Drake. Design of a new switching power supply for the ATLAS TileCal
front-end electronics. Journal of Instrumentation, 8(02):C02032, 2013.
[31] A. Senthilkumaran et al. Reliability Analysis of a Low Voltage Power
Supply Design for the Front-End Electronics of the ATLAS Tile Calorimeter.
pages 1231–1239, October 2012.
[32] R. Bonnefoy et al. Performances of a Remote High Voltage Power Supply
for the Phase II Upgrade of the ATLAS Tile Calorimeter.
(ATL-COM-TILECAL-2014-082), November 2014.
[33] F. Vazeille et al. NIEL and TID certifications of the active dividers of
the Tile Calorimeter of the ATLAS detector for the Phase II upgrade.
(ATL-TILECAL-INT-2015-001), January 2015.
[34] AdvancedTCA Base Specification. PICMG 3.0 R3.0. PCI Industrial
Computer Manufacturers Group (PICMG), Wakefield, MA, 2008.
[35] F. Carrió et al. The PreProcessors for the ATLAS tile calorimeter phase II
upgrade. In 2015 IEEE Nuclear Science Symposium and Medical Imaging
Conference (NSS/MIC), pages 1–3, October 2015.
[36] Advanced Mezzanine Card Base Specification. PICMG AMC.0 R2.0. PCI
Industrial Computer Manufacturers Group (PICMG), Wakefield, MA,
2006.
[37] F. Carrió. Timing distribution and data flow for the ATLAS Tile
Calorimeter Phase II upgrade. In 2016 IEEE-NPSS Real Time Conference
(RT), pages 1–4, June 2016.
[38] AdvancedTCA Rear Transition Module Zone 3A. PICMG 3.8 R1.0. PCI
Industrial Computer Manufacturers Group (PICMG), Wakefield, MA,
2011.
[39] Xilinx Inc. 7 Series FPGAs Data Sheet: Overview (DS180). 2017.
[40] Avago Technologies. MiniPOD AFBR-812VxyZ, AFBR-822VxyZ 12.5
Gbps/Channel Twelve Channel, Parallel Fiber Optics Modules, March
2013.
[41] Xilinx Inc. Spartan-6 Family Overview (DS160). 2011.
[42] J. Mendez, CERN. CERN MMC - User Guide V1.0. 2015.
[43] R. Kuramoto, Xilinx Inc. QuickBoot Method for FPGA Design Remote
Update (XAPP1081). 2014.
[44] Linear Technology. LTspice IV Getting Started Guide. 2011.
[45] D. Bullock et al. PROMETEO: A portable test-bench for the upgraded
front-end electronics of the ATLAS Tile calorimeter. In Proceedings, 3rd
International Conference on Technology and Instrumentation in Particle
Physics (TIPP 2014): Amsterdam, Netherlands, June 2-6, 2014, volume
TIPP2014, page 409, 2014.
[46] E. Bogatin. Signal and Power Integrity - Simplified. Prentice Hall PTR
Signal Integrity Library. Pearson Education, 2009.
[47] ANSYS Q3D Extractor software. http://www.ansys.com.
[48] Altera Corporation. Via Optimization Techniques for High-Speed Channel
Designs (AN529). (AN-529-1.0), May 2008.
[49] M. Resso and E. Bogatin. Signal Integrity Characterization Techniques.
International Engineering Consortium, 2009.
[50] Altera Corporation. Optimizing Impedance Discontinuity Caused by
Surface Mount Pads for High-Speed Channel Designs (AN530). (AN-530-1.0),
May 2008.
[51] ANSYS HFSS software. http://www.ansys.com.
[52] ANSYS SIwave software. http://www.ansys.com.
[53] Xilinx Inc. Xilinx Power Estimator User Guide (UG440). 2015.
[54] Xilinx Inc. 7 Series FPGAs PCB Design Guide (UG483). 2017.
[55] Tektronix. Understanding and Characterizing Timing Jitter. September
2012.
[56] Agilent Technologies. Jitter Analysis: The dual-Dirac Model, RJ/DJ, and
Q-Scale. December 2004.
[57] Avago Technologies. AFBR-79Q4Z. InfiniBand 4x QDR QSFP Pluggable,
Parallel Fiber-Optics Module.
[58] Agilent Technologies. Infiniium DCA-X 86100D Wide-Bandwidth
Oscilloscope Mainframe and Modules.
[59] Xilinx Inc. Integrated Bit Error Ratio Tester 7 Series GTX Transceivers
v3.0 (PG132). 2016.
[60] F. Carrió et al. Performance of the Tile PreProcessor Demonstrator for the
ATLAS Tile Calorimeter Phase II Upgrade. Journal of Instrumentation,
11(03):C03047, 2016.
[61] J. Redd. Calculating Statistical Confidence Levels for Error-Probability
Estimates. Lightwave Magazine, pages 110–114, 2000.
[62] P. Moreira et al. The GBT: A proposed architecture for multi-Gb/s data
transmission in high energy physics. In Topical Workshop on Electronics
for Particle Physics, pages 332–336, 2007.
[63] P. Moreira et al. The GBT Project. In Topical Workshop on Electronics
for Particle Physics, pages 342–346, 2009.
[64] A. Caratelli et al. The GBT-SCA, a radiation tolerant ASIC for
detector control and monitoring applications in HEP experiments. Journal of
Instrumentation, 10(03):C03034, 2015.
[65] M. Barros et al. The GBT-FPGA core: features and challenges. Journal
of Instrumentation, 10(03):C03021, 2015.
[66] Xilinx Inc. 7 Series FPGAs GTX/GTH Transceivers User Guide
(UG476). 2016.
[67] N. Sawyer, Xilinx Inc. Data to Clock Phase Alignment (XAPP225). 2009.
[68] M. Kubíček et al. Blind Oversampling Data Recovery with Low Hardware
Complexity. Radioengineering, 19(01):74–78, 2010.
[69] C.K.K. Yang. Design of High-speed Serial Links in CMOS.
PhD thesis, Stanford University, 1998.
[70] Xilinx Inc. PicoBlaze 8-bit Embedded Microcontroller User Guide
(UG129). 2011.
[71] Linear Technology. LTC2265-12/ LTC2264-12/LTC2263-12 - 12-Bit,
65Msps/40Msps/ 25Msps Low Power Dual ADCs. Rev. B.
[72] Xilinx Inc. 7 Series FPGAs SelectIO Resources User Guide (UG471).
2016.
[73] M. Defossez, Xilinx Inc. An Interface for Texas Instruments Analog-to-
Digital Converters with Serial LVDS Outputs (XAPP866). 2008.
[74] C. Ghabrous et al. IPbus: a flexible Ethernet-based control system for
xTCA hardware. Journal of Instrumentation, 10(02):C02019, 2015.
[75] Agilent Technologies. HDMP-1032/1034 Transmitter/Receiver ChipSet,
2000.
[76] Xilinx Inc. ChipScope Pro Software and Cores User Guide (UG029). 2012.
[77] 7 Series GTX Transceivers - TX and RX Latency Values, AR#42662.
2016.
[78] Corning SMF-28 Optical Fiber - Product Information. 2002.
[79] J. Coffey. Latency in optical fiber systems. 2017.
[80] R. Mueller et al. Sorting Networks on FPGAs. The VLDB Journal,
21(1):1–23, 2012.
[81] P. Farthouat et al. Local Trigger Processor. (ATL-DA-ES-0033), January
2004.
[82] C. Clement et al. Time Calibration of the ATLAS Hadronic Tile
Calorimeter using the Laser System. 2008.
[83] E.B.S. Mendes et al. The 10G TTC-PON: challenges, solutions and
performance. Journal of Instrumentation, 12(02):C02041, 2017.
[84] D.W. Allan and H. Daams. Picosecond Time Difference Measurement
System. In 29th Annual Symposium on Frequency Control, pages 404–411,
May 1975.
[85] H. Nyquist. Certain topics in telegraph transmission theory. Transactions
of the AIEE, 47:617–644, 1928.
[86] L. Green. The Alias Theorems: Practical Undersampling For Expert
Engineers. June 2001.
[87] R.J. Aliaga et al. PET system synchronization and timing resolution using
high-speed data links. In 2010 17th IEEE-NPSS Real Time Conference,
pages 1–7, May 2010.
[88] P. Moreira et al. Digital dual mixer time difference for sub-nanosecond
time synchronization in Ethernet. In 2010 IEEE International Frequency
Control Symposium, pages 449–453, June 2010.
[89] P. Moreira. Timing Signals and Radio Frequency Distribution Using
Ethernet Networks for High Energy Physics Applications. PhD thesis,
University College of London, 2015.
[90] T. Włostowski. Precise time and frequency transfer in a White Rabbit
network. Master’s thesis, Rijksuniversiteit Groningen, Warsaw University
of Technology, 2011.
[91] Xilinx Inc. 7 Series FPGAs Clocking Resources User Guide (UG472).
2016.
[92] Xilinx Inc. 7 Series FPGAs Configurable Logic Block User Guide
(UG474). 2016.
[93] FPGA Editor software. http://www.xilinx.com.
[94] EP-ESE-BE, CERN. GBT-FPGA Tutorial, Tips & Tricks, June 2016.
[95] B. di Girolamo et al. Beamline instrumentation in the 2004 combined
ATLAS testbeam. (ATL-TECH-PUB-2005-001), July 2005.
[96] J. Spanggaard. Delay Wire Chambers - A Users Guide.
(SL-Note-98-023-BI), March 1998.
[97] B. Salvachua et al. Algorithms for the ROD DSP of the ATLAS Hadronic
Tile Calorimeter. Journal of Instrumentation, 2(02):T02001, 2007.
[98] A. Valero. The Back-End Electronics for the ATLAS
Hadronic Tile Calorimeter at the Large Hadron Collider. PhD thesis,
Universidad de Valencia, 2014.
[99] G. Duckeck et al. ATLAS computing: Technical Design Report.
(CERN-LHCC-2005-022, ATLAS-TDR-017), 2005.
[100] M. Simonyan. Electron-pion separation in the ATLAS Tile hadron
calorimeter. (ATL-TILECAL-PUB-2006-003), 2006.
[101] The Tile Calorimeter collaboration. Testbeam studies of production
modules of the ATLAS Tile Calorimeter. Nuclear Instruments and Methods in
Physics Research Section A: Accelerators, Spectrometers, Detectors and
Associated Equipment, 606(3):362–394, 2009.
List of Figures
1.1 The CERN accelerator complex.
1.2 The ATLAS experiment.
1.3 ATLAS trigger and data acquisition system.
1.4 Structure of a TileCal module and main components.
1.5 Segmentation in depth and η of the TileCal modules for half of a
long barrel (left) and for an extended barrel (right). The TileCal cell
distribution is symmetric with respect to the interaction point at the
origin.
1.6 Tile Calorimeter partitions. The EBA and EBC partitions correspond
to the Extended Barrels and the LBA and LBC partitions to the
central Long Barrel.
1.7 Block diagram of the TileCal electronics.
1.8 Scheme of the PMT block.
1.9 Block diagram of the functional blocks and data flow of the
Interface board.
1.10 Block diagram of the TileCal readout chain.
2.1 LHC plan for the next ten years, with a series of shutdowns with
dedicated upgrades and increases of energy and luminosity.
2.2 Block diagram of the single-level architecture envisaged for the
TDAQ at the HL-LHC [21].
2.3 Block diagram of the TileCal readout architecture for the Phase
II Upgrade.
2.4 Picture of the TileCal modules and the minidrawers.
2.5 Detailed drawing of the minidrawer indicating the position of the
different parts of the front-end electronics. . . . . . . . . . . . . . 24
2.6 Block diagram of the DaughterBoard. . . . . . . . . . . . . . . . 25
2.7 Picture of the DaughterBoard version 4. . . . . . . . . . . . . . . 25
2.8 A 3-in-1 card designed for the HL-LHC. . . . . . . . . . . . . . . 26
2.9 Functionality of 3-in-1 card for the upgrade of the TileCal detector. 27
2.10 Picture of both sides of the MainBoard for the 3-in-1 FEB option. 27
2.11 Simulated radiation dose in ATLAS after 100 fb−1, which is 1/40th
of the total integrated luminosity expected for the HL-LHC [27]. 28
2.12 Block diagram describing the operation of the main modules of
the QIE chip [27]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.13 Block diagram of the remote HV power distribution system [27]. 31
2.14 Block diagram of the internal HV power distribution system [27]. 32
2.15 Block diagram of the final TilePPr and TDAQi designs. . . . . . 35
3.1 General block diagram of the TilePPr prototype. . . . . . . . . . 39
3.2 Optical modules employed for the high-speed communication paths
in the TilePPr prototype. . . . . . . . . . . . . . . . . . . . . . . 42
3.3 Block diagram of the I2C chains in the TilePPr prototype. . . . . 44
3.4 Picture of the MMC card. . . . . . . . . . . . . . . . . . . . . . . 45
3.5 Block diagram of the TTC receiver block and its connections with
the Readout, Trigger and SC FPGAs. . . . . . . . . . . . . . . . 46
3.6 Block diagram of the TilePPr JTAG chain with the FMC
connector and FPGAs. . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.7 Block diagram of the protection circuit designed for the 12 V input. 50
3.8 Picture of the ADC FMC board version 2 for the PROMETEO
project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.9 Picture of the TilePPr prototype indicating the main components. 52
3.10 Sketch of the TilePPr stack-up. . . . . . . . . . . . . . . . . . . . 53
3.11 Simulation of the impedance of microstrip structures. . . . . . . . 55
3.12 Simulation of the impedance of stripline structures. . . . . . . . . 56
3.13 Recommended port labeling for an interconnection. . . . . . . . . 57
3.14 Representation of the S-parameters of a 4-port network (left) and
the equivalent S-parameters of 2-port mixed-mode network (right). 58
3.15 Model of a differential line with 0201 case DC-coupling capacitors
(ANSYS HFSS software). . . . . . . . . . . . . . . . . . . . . . . 60
3.16 Comparison of the TDR and insertion loss simulation results
corresponding to two differential interconnects with 0201 and 0402
case DC-coupling capacitors. . . . . . . . . . . . . . . . . . . . . 61
3.17 Results of the TDR and insertion loss simulations corresponding
to a differential interconnect with 0201 case DC-coupling
capacitors with different sizes of area cuts. . . . . . . . . . . . . . 62
3.18 Model of the differential via used in the simulations with ANSYS
HFSS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.19 Results of the TDR and insertion loss simulations of a differential
via layout for different antipad radii. . . . . . . . . . . . . . . . . 64
3.20 Insertion loss of a differential via with a stub from layer 3 to 16.
The self-resonance frequency of the via is located around 25 GHz. 65
3.21 IR drops on layer 12 for the VCCINT power rail. The Kintex 7
FPGA (top) is drawing 12.415 A and the Virtex 7 FPGA
(bottom) is drawing 15.02 A. . . . . . . . . . . . . . . . . . . . . 67
3.22 Snapshot of the physical model used for the signal integrity
simulations with ANSYS SIwave software. The RX0 and RX1 lines are
highlighted in yellow. . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.23 Simulation of the SDD11 and SCC11 parameters for RX0 line. . . 68
3.24 Simulation of the SDD21 parameter for RX0 line. . . . . . . . . . 69
3.25 Simulation of the SCD21 parameter for RX0 line. . . . . . . . . . 70
3.26 Simulation of the SDD31 and SCD31 parameters for the study of
the NEXT between the RX0 and RX1 lines. . . . . . . . . . . . . 71
3.27 Simulation of the SDD41 and SCD41 parameters for the study of
the FEXT between the RX0 and RX1 lines. . . . . . . . . . . . . 71
3.28 Classification of jitter. . . . . . . . . . . . . . . . . . . . . . . . . 72
3.29 Simulated eye diagrams using the IBIS-AMI models provided by
Xilinx. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.30 Optical eye diagrams generated with the Keysight DCA-X 86100D
oscilloscope. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.1 Block diagram of the transmitter and receiver blocks of the
GBT-FPGA IP core. . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.2 DaughterBoard link connections. . . . . . . . . . . . . . . . . . . 84
4.3 Block diagram of the clocking distribution for the Tile GBT links
in the DaughterBoard. . . . . . . . . . . . . . . . . . . . . . . . . 86
4.4 Block diagram of the modified Descrambler of the BE Tile
GBT-FPGA module. . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.5 Block diagram of the BO-CDR circuit. . . . . . . . . . . . . . . . 88
4.6 Concept of the 3X oversampling technique. . . . . . . . . . . . . 88
4.7 Block diagram of the MUX decision block. . . . . . . . . . . . . . 89
4.8 Synchronization circuit implemented for the acquisition and
retiming of the HG/LG bit. . . . . . . . . . . . . . . . . . . . . . . 89
4.9 Phasor diagram describing the operation of the phase picking
algorithm implemented in the MUX decision block. . . . . . . . . 90
4.10 Block diagram of BE Tile GBT link. . . . . . . . . . . . . . . . . 91
4.11 Connections between the GBTx chip and the DaughterBoard
FPGAs for QSFP A. . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.12 Block diagram of the DaughterBoard firmware. . . . . . . . . . . 95
4.13 Timing diagram of the ADC data and clock signals. The FR signal
is the frame clock, DCO is the bit clock, and OUT is the output data.
Extracted from [71]. . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.14 Timing diagram for the deserialization of a 4-bit word with the
ISERDESE2 block. . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.15 Block diagram of the ADC block including the WA-FSM and the
BO-CDR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.16 Simplified block diagram of the TilePPr prototype firmware
designed to operate the Demonstrator module. . . . . . . . . . . . . 104
4.17 TDM BPM encoding. . . . . . . . . . . . . . . . . . . . . . . . . 107
4.18 Block diagram of the Decoder. Only the A0 and A1 channels are
shown for simplicity. . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.19 Block diagram of the pipeline modules, Event Packers and
readout interfaces. . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.20 Testbench for the digital path delay measurement. . . . . . . . . 115
4.21 Screenshot of the Chipscope software, where the BCR TX signal
corresponds to the transmitted BCR, the BCR RX signals to the
received BCR, and the LAT values correspond to the RTT
calculated by the Latency block. . . . . . . . . . . . . . . . . . . . 116
4.22 Testbench for the ADC interface path delay measurement. . . . . 118
4.23 Block diagram of the pulse detector block. . . . . . . . . . . . . . 119
4.24 Representation of a sorting network for sorting 3 inputs. . . . . . 119
5.1 Sketch of the current clock distribution architecture for the
TileCal detector. . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.2 Sketch of the clock distribution schema in TileCal for the HL-LHC. 124
5.3 Block diagram of the analog DMTD circuit. . . . . . . . . . . . . 125
5.4 Waveform signal resulting from multiplying a 100 Hz signal (u1)
and a 95 Hz signal (us). . . . . . . . . . . . . . . . . . . . . . . . 126
5.5 Waveform signal (U1) resulting from undersampling a sinusoidal
signal of 100 Hz (u1) at 95 Hz. . . . . . . . . . . . . . . . . . . . 127
5.6 Block diagram of the DDMTD circuit. . . . . . . . . . . . . . . . 128
5.7 Timing diagram of the DDMTD circuit with N = 8. . . . . . . . 129
5.8 Block diagram of the OSUS circuit. . . . . . . . . . . . . . . . . . 132
5.9 Timing diagram of the OSUS circuit with N = 8 and M = 4. . . 132
5.10 Comparison of the acquisition time of the OSUS and the DDMTD
circuit for different phase shifts. . . . . . . . . . . . . . . . . . . . 134
5.11 Timing diagram of the AP debouncing method describing the
process to estimate the positive edge position of the U1 and U2
signals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.12 Comparison of the four debouncing methods measuring two 240 MHz
clocks. The OSUS circuit was configured with M = 1 and the
PLL with N = 16384. . . . . . . . . . . . . . . . . . . . . . . . . 137
5.13 Block diagram of a SLICEL. The two clock signals (red) are
connected to two registers in the same SLICEL block to minimize
the skew delay of the sampling clock (blue). Figure extracted
from [92]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.14 Block diagram of the synchronization block. . . . . . . . . . . . . 141
5.15 Histogram of 100,000 measurements of the phase difference
between the synchronized LHClocal and LHCTTC signals. . . . . . . . 142
5.16 Histogram of the phase difference between the synchronized LHClocal
and LHCTTC signals obtained with a LeCroy WavePro 760Zi oscilloscope. 143
5.17 Phase difference variations between the LHCFE and the LHCTTC
signals after 100 resets. . . . . . . . . . . . . . . . . . . . . . . . . 144
5.18 Phase difference variations between the LHCFE and the LHCTTC
signals during a period of 8 hours. . . . . . . . . . . . . . . . . . 144
6.1 Configuration of modules and electronics during the October 2016
testbeam period. . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.2 Picture of the testbeam module on the table. . . . . . . . . . . . 149
6.3 Sketch of the beam elements in the testbeam setup. . . . . . . . 150
6.4 Picture of one of the beam chambers used for the testbeam setup
and the beam position data. . . . . . . . . . . . . . . . . . . . . . 151
6.5 Complete data acquisition system and data flow for the testbeam
setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.6 Picture of a typical pulse shape showing the 7 samples and the
reconstructed pulse shape with the Optimal Filtering algorithm. 155
6.7 Screenshot of the DQM panel. . . . . . . . . . . . . . . . . . . . . 156
6.8 Result of the linearity test. The red line represents the fit result
of a first-order polynomial function. . . . . . . . . . . . . . . . . 157
6.9 Result of a pedestal run taken with the TilePPr prototype through
the IPbus readout path. . . . . . . . . . . . . . . . . . . . . . . . 158
6.10 Results of a CIS linearity test of a low gain channel and its residuals. 159
6.11 Response of the BC6 cells of the Demonstrator module during
a Cesium scan. Each peak corresponds to the response of each
individual scintillator tile when the Cesium source passes through
the module. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.12 Histogram of the maximum sample position corresponding to
PMT 42 during a run with a 50 GeV electron beam. . . . . . . . . 162
6.13 PMT energy correlation corresponding to cell BC8 during a run
with a 50 GeV electron beam. The correlation coefficient is close
to one, indicating a good equalization of the PMT gains. . . . . . 163
6.14 Timing plots for the BC6 cell taken during a run with a 180 GeV
muon beam. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.15 Energy spectrum of a run with a 100 GeV electron beam and the
representation of the energy versus the number of Hot Cells. . . . 165
6.16 Scatter plot of the Cherenkov counter versus the AvD in the
module for a run with a 20 GeV electron beam. Pions are
confined in the lower left quadrant and electrons in the right. The
energy density for electrons is separated into two main regions
corresponding to two or three cells above threshold. . . . . . . . 166
6.17 Energy in pC of the cell A8 normalized to the beam energy of
100 GeV electrons at 20 degrees after isolating the electrons. . . 167
6.18 Muon signal isolated from the pedestal in cell D1 for a run with
a 180 GeV muon beam at 20 degrees. . . . . . . . . . . . . . . . . 167
8.1 CERN accelerator complex. . . . . . . . . . . . . . . . . . . . . 174
8.2 The ATLAS experiment. . . . . . . . . . . . . . . . . . . . . . . 175
8.3 Structure of the TileCal modules. . . . . . . . . . . . . . . . . . 176
8.4 LHC plan for the next 10 years, including the technical shutdowns
and upgrades for the luminosity increase. . . . . . . . . . . . . . 178
8.5 Detailed drawing of a minidrawer and the different front-end
electronics boards. . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.6 Picture of the TilePPr prototype. . . . . . . . . . . . . . . . . . 181
8.7 Comparison of the TDR and insertion loss simulation results
between two traces with DC-blocking capacitors in 0201 and 0402
packages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
8.8 Eye diagram corresponding to the optical output of a QSFP module
operating at 9.6 Gbps. . . . . . . . . . . . . . . . . . . . . . . . 183
List of Tables
2.1 Trigger parameters and readout data rates for the two proposed
TDAQ architectures. . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Comparison between the current and Phase II readout systems. . 21
3.1 Summary of resources of the selected Virtex 7 FPGAs. . . . . . . 40
3.2 Summary of resources of the selected Kintex 7 FPGA. . . . . . . 43
3.3 Summary of resources of the selected Spartan 6 FPGA. . . . . . 43
3.4 Summary of the power modules used in the TilePPr prototype
indicating the operating voltage and the maximum current. . . . 49
3.5 Summary of the selected geometry values for the high-speed
interconnects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.6 Summary of the maximum current consumption of the TilePPr
power rails. Current consumptions were estimated using the
Xilinx XPE tool. . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.7 QBER factor as a function of BER. . . . . . . . . . . . . . . . . . 73
3.8 Jitter measurement results for one transmitter of the TilePPr
prototype at 4.8 Gbps and 9.6 Gbps rates. . . . . . . . . . . . . . 75
4.1 Standard GBT data format with FEC. . . . . . . . . . . . . . . . 82
4.2 Wide-Bus GBT data format without FEC. . . . . . . . . . . . . 82
4.3 Data format fields in the downlink. Bit N indicates that the
received command is new and therefore the corresponding register
has to be updated with the data contained in the Data field. . . 92
4.4 GBTx downlink data format. . . . . . . . . . . . . . . . . . . . . 92
4.5 Data format fields in the uplink for the readout and integrator
data. The HG/LG bit represents the gain, with 1 for high gain
and 0 for low gain. . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.6 Data format fields in the uplink for the TTC and DCS data. . . 94
4.7 Registers for the operation of the DaughterBoard, where W
indicates that the register is write-only, R that the register
is read-only, and R/W that the register can be both read
and written. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.8 Data format of the commands transmitted to the MainBoard
FPGAs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.9 Header identification code for the 3-in-1 commands. . . . . . . . 97
4.10 MainBoard FPGA identification code. . . . . . . . . . . . . . . . 97
4.11 PMT identification code. . . . . . . . . . . . . . . . . . . . . . . . 98
4.12 Data format and sequence of the integrator words transmitted to
the TilePPr. The V bit indicates that the received data is valid. 99
4.13 Data format of the address register for the propagation of
commands. The MD index field indicates the destination
minidrawer, and the BCID field is left empty for asynchronous
commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.14 TileCal ROD data event format. The size of the DMU data blocks
depends on the operation mode. . . . . . . . . . . . . . . . . . . . 112
4.15 FELIX raw data event format. The Run parameters field defines
all the relevant parameters such as the number and type of run,
or the DAC value in the case of a CIS run. The MD ID field
contains the minidrawer number, the BCID field contains the BCID
corresponding to the first sample, and the L1A ID field contains
the L1A number in the run. . . . . . . . . . . . . . . . . . . . . . 113
4.16 Latency introduced by the transmitter GTX transceiver blocks
according to the downlink and uplink configuration. PMA stands
for Physical Medium Attachment sublayer. . . . . . . . . . . . . 116
4.17 Latency introduced by the receiver GTX transceiver blocks
according to the downlink and uplink configuration. . . . . . . . . 117
4.18 Latency introduced by the firmware blocks for the downlink and
uplink communication. The Data Packer and BO-CDR blocks
are only present in the uplink. . . . . . . . . . . . . . . . . . . . . 117
5.1 Time resolution obtained with the AP, ZC, FE and LE methods. 137
5.2 Characteristics of the implemented OSUS circuit with respect to
the different clock frequencies that can be measured with the
current configuration. Rt represents the theoretical time resolution
of the OSUS circuit. . . . . . . . . . . . . . . . . . . . . . . . . . 138
