An analytical model of the delay generator for the triggering of particle detectors at CERN LHC by Gauci, Jordan Lee et al.
An Analytical Model of the Delay Generator for the
Triggering of Particle Detectors at CERN LHC
Jordan Lee Gauci∗, Edward Gatt∗, Giacinto De Cataldo†, Owen Casha∗ and Ivan Grech∗
∗Department of Microelectronics and Nanoelectronics, University of Malta
†Istituto Nazionale di Fisica Nucleare, Sezione di Bari, Italy
∗E-mail: jordan-lee.gauci.10@um.edu.mt
Abstract—This paper presents an analytical model of a tapped
shift-register based delay generator, which is currently imple-
mented in the High Momentum Particle Identification Detector
(HMPID) at CERN and will be upgraded in the coming years.
This work aims to verify whether this delay generator can be
optimized to provide a delay range of 525 ns with a resolution of
1 ns. In particular, this paper studies how the clock jitter affects
the generated delay and its linearity, and predicts how the current
architecture will perform at a higher frequency of operation. The
conclusions drawn from the analytical model are then verified
using both a simulation model and an FPGA implementation of
the delay generator.
Index Terms—Delay Generators, Modelling, Digital System
Design
I. INTRODUCTION
The generation of precise delays has been an area of interest
for a number of years, particularly in high-energy physics
time-of-flight experiments. The heart of the delay generator
is the delay line, which shifts the input signal by a finite
amount of time by means of a digital command or control
voltage. Three important parameters define a delay line: the
delay-step, which is the finest delay that can be achieved; the
delay range, which is the maximum achievable delay; and the
delay jitter, which is the uncertainty in the delay [1].
Delay lines consist of delay elements and can be implemented
using various methods, each with its
own merits and shortcomings. Electrical delay lines can be
classified into two main architectures: tapped delay lines and
single output delay lines. The former consists of a number
of identical cascaded delay elements, where the output of
each stage is tapped and routed to the output by means of a
multiplexer. In this case, the delay resolution is limited by the
propagation delay of a single delay element. The delay range
is theoretically equal to the delay resolution multiplied by the
number of delay stages. On the other hand, a single output
delay line consists of a number of cascaded delay elements,
where the delay at the output is varied by means of a control
voltage [1].
This paper presents an analytical model of a tapped shift-
register based remotely programmable delay generator (shown
in Fig. 1), which is currently implemented in the High Momen-
tum Particle Identification Detector (HMPID) [2]. The HMPID
is one of nineteen detectors located in ALICE (A Large Ion
Collider Experiment) at the CERN Large Hadron Collider
(LHC). The analytical model is proposed as a tool to
investigate how the clock jitter affects the generated delay
and its linearity, and to predict how the current architecture will
perform at a higher frequency of operation.
Fig. 1. Simplified schematic of the tapped shift-register based delay generator. [Figure: a chain of clocked D flip-flops takes the L0 signal as input; a multiplexer, driven by a 7-bit control word, selects the tap that provides the delayed L0 output.]
II. BACKGROUND
The HMPID is a Ring Imaging Cherenkov Detector, where
charged particles impinge on a gas and are converted into
electrons, which are then read by the charge sense amplifiers
(CSA). The signals from the CSA are digitised and processed
by the read-out electronics, upon receiving a trigger signal
from the ALICE Central Trigger Processor [2].
For Run3 (scheduled for the period 2020-2022), most of the
detectors at CERN will be upgraded [3], and a new triggering
scheme will be required. Currently, for Run1 and Run2, the
HMPID uses the Level 0 (L0) trigger, arriving approximately
1.2 µs after a collision. The L0 signal is fed into a delay
generator, where the delay on the L0 signal can be coarsely
adjusted, such that the charge on the CSA is captured at its
peak intensity. The currently implemented delay generator is
capable of providing a maximum delay of 2.5 µs with a
resolution of 25 ns. The architecture consists of a 100-bit
wide shift register, operated using a 40 MHz clock, where the
output of each stage is tapped and routed to the main output
by means of a multiplexer. A 7-bit control word is used to
select the desired delay. This is implemented on an ACTEL
PROASIC3E FPGA. The ACTEL has a phase locked loop
which is capable of generating a 40 MHz clock, with a jitter
of ±250 ps [4].
During Run3, HMPID will use the LM trigger signal,
which arrives 650 ns after an event, instead of L0. While the
same system can be used as before, more precise control of the
delay is necessary and thus a finer resolution is required. The
new specifications call for a remotely programmable delay line
with a maximum delay range of 525 ns and a delay resolution
of 1 ns. The normal operating range of this delay module
should be between 300 ns and 500 ns.
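The relationship between clock frequency, resolution, number of stages and control-word width quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name is hypothetical, and the 1 GHz clock for the upgraded module is assumed here because a 1 ns resolution requires a 1 ns clock period in this architecture.

```python
import math

def shift_register_parameters(clock_hz, delay_range_s):
    """Derive resolution, stage count and control-word width of a tapped shift register."""
    resolution_s = 1.0 / clock_hz                   # one tap corresponds to one clock period
    stages = round(delay_range_s / resolution_s)    # taps needed to span the delay range
    control_bits = math.ceil(math.log2(stages))     # bits needed to address every tap
    return {"resolution_s": resolution_s, "stages": stages, "control_bits": control_bits}

current = shift_register_parameters(40e6, 2.5e-6)   # Run1/Run2 module: 40 MHz, 2.5 us range
upgrade = shift_register_parameters(1e9, 525e-9)    # Run3 specification: assumed 1 GHz clock
print(current)
print(upgrade)
```

Under these assumptions the current module needs 100 stages and a 7-bit control word, in agreement with the description above, while the upgraded specification needs 525 stages and a 10-bit control word.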
III. THE ANALYTICAL MODEL
An analytical model is required to study the behaviour of
the delay generator architecture shown in Fig. 1, when this is
operated by a jittery clock. Starting from an ideal scenario, this
architecture can be modelled by Eq. 1, where Td is the total
delay that can be obtained with N delay elements, Tprop is the
propagation delay of the flip-flops and TMUX is the propagation
delay in the multiplexer. Eq. 1 assumes an ideal clock with
a period of Tclk,ideal and no jitter. Even though TMUX can
be significant, it is ignored in the remainder of the analysis for
simplicity, since it only contributes to the systematic error in
the delay generated.
$$T_d = T_{MUX} + T_{prop} + (N-1)\,T_{clk,ideal}, \qquad N \ge 1 \qquad (1)$$
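As an illustration, Eq. 1 can be evaluated directly. The sketch below uses a hypothetical helper name; the 25 ns period corresponds to the 40 MHz clock of the current module, while the flip-flop propagation delay of 0.3 ns is only an illustrative value and the multiplexer delay is left at zero.

```python
def ideal_delay_ns(n_stages, t_clk_ns, t_prop_ns, t_mux_ns=0.0):
    """Eq. 1: delay at tap N for a jitter-free clock, in nanoseconds."""
    if n_stages < 1:
        raise ValueError("Eq. 1 is defined for N >= 1")
    return t_mux_ns + t_prop_ns + (n_stages - 1) * t_clk_ns

# 40 MHz clock (25 ns period); t_prop_ns = 0.3 is an assumed, illustrative value.
first_tap = ideal_delay_ns(1, 25.0, 0.3)     # only the flip-flop delay contributes
last_tap = ideal_delay_ns(100, 25.0, 0.3)    # last tap of a 100-stage register
print(first_tap, last_tap)
```

The first tap contributes only the propagation delay, and every further tap adds exactly one clock period.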
In practice, clock jitter will have an effect on the output
delay as it will cause variations in the periodic time of the
clock. The real clock can be modelled by Eq. 2, where ∆F is
the frequency deviation at a given instant. One may express
Eq. 2 in the form of Eq. 3,
$$T_{clk,real} = \frac{1}{F_{clk,ideal} + \Delta F} \qquad (2)$$
$$T_{clk,real} = \frac{1}{F_{clk,ideal}} + \varepsilon'_{clk} \qquad (3)$$
where ε′clk is the instantaneous clock error, defined by Eq. 4.
$$\varepsilon'_{clk} = \frac{-\Delta F}{F_{clk,ideal}^{2} + \Delta F\,F_{clk,ideal}} \qquad (4)$$
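The step from Eq. 2 and Eq. 3 to Eq. 4 can be made explicit by subtracting the ideal period from the real period. In the short derivation below, F is used as shorthand for Fclk,ideal (a notation introduced here only for brevity), and the last line is a first-order approximation that holds when the frequency deviation is small compared to the clock frequency.

```latex
\begin{aligned}
\varepsilon'_{clk} &= T_{clk,real} - \frac{1}{F}
                    = \frac{1}{F + \Delta F} - \frac{1}{F} \\
                   &= \frac{F - (F + \Delta F)}{F\,(F + \Delta F)}
                    = \frac{-\Delta F}{F^{2} + \Delta F\,F} \\
                   &\approx -\frac{\Delta F}{F^{2}} \qquad \text{for } |\Delta F| \ll F
\end{aligned}
```

A clock that runs fast (positive ∆F) therefore shortens the period and gives a negative instantaneous error, and vice versa.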
Thus, for every clock cycle, the clock will induce an error
ε′i in each stage i of the delay line, such that Eq. 1 can be
expressed as:
$$T_d = T_{prop} + (N-1)\,T_{clk,ideal} + \varepsilon_N \qquad (5)$$
εN is the total error at stage N of the delay line, that is,
$\varepsilon_N = \sum_{i=1}^{N} \varepsilon_i$, where εi is the relative error between the ideal
clock and the real clock. This implies that even though ε′i
may follow a known probability distribution with a particular
mean and standard deviation, the resultant distribution of the
delay will be transformed due to clock error accumulation.
For instance, as illustrated in Fig. 2, if the distribution of ε′i
is a normal distribution with zero mean, εN would have
a multi-modal distribution with a different mean, which will
be translated to an offset in the transfer characteristic of the
delay generator. As shown via the simulations presented in
Section IV, this offset is random in nature and thus difficult
to predict.
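The accumulation of the per-cycle error along the delay line can be illustrated with a small Monte Carlo sketch. It assumes that the per-cycle errors are independent and normally distributed with zero mean, which is an assumption made here for illustration; the function names, the number of runs and the standard deviation (one third of 0.25 ns) are likewise illustrative choices.

```python
import random
from statistics import pstdev

def accumulated_errors(n_stages, sigma_ns, rng):
    """Running sum of per-cycle clock errors, one entry per stage."""
    total = 0.0
    sums = []
    for _ in range(n_stages):
        total += rng.gauss(0.0, sigma_ns)   # assumed independent, zero-mean error
        sums.append(total)
    return sums

def spread_at_stage(stage, runs, sigma_ns, seed=0):
    """Standard deviation of the accumulated error at a given stage over many runs."""
    rng = random.Random(seed)
    finals = [accumulated_errors(stage, sigma_ns, rng)[-1] for _ in range(runs)]
    return pstdev(finals)

SIGMA_NS = 0.25 / 3.0
spread_100 = spread_at_stage(100, 1000, SIGMA_NS)
spread_400 = spread_at_stage(400, 1000, SIGMA_NS)
print(spread_100, spread_400)
```

Under the independence assumption the spread of the accumulated error grows with the square root of the stage number, so quadrupling the stage number roughly doubles the spread, even though each per-cycle error has zero mean.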
This analytical model also shows that the main advantage
of using the shift-register architecture is that as the number
of taps increases, the average delay between each stage will
approach the periodic time of the ideal clock. This means that
given a sufficient number of stages, the average delay error will
tend to zero. This effect can be clearly predicted by Eq. 6.
$$\mu_{T_d} = \frac{T_{prop}}{N} + T_{clk,ideal} - \frac{T_{clk,ideal}}{N} + \frac{\varepsilon_N}{N} \qquad (6)$$
Fig. 2. Distribution of the clock error (top) and the effect of clock error
accumulation on the distribution of delay error (bottom).
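The convergence predicted by Eq. 6 can be seen numerically. The sketch below evaluates Eq. 6 for a few stage counts; the helper name is hypothetical, and the default values (1 ns clock period, 0.3 ns propagation delay, zero accumulated error) are illustrative assumptions.

```python
def average_stage_delay_ns(n_stages, t_clk_ns=1.0, t_prop_ns=0.3, eps_n_ns=0.0):
    """Eq. 6: total delay at tap N divided by N, in nanoseconds."""
    if n_stages < 1:
        raise ValueError("Eq. 6 is defined for N >= 1")
    return (t_prop_ns / n_stages + t_clk_ns
            - t_clk_ns / n_stages + eps_n_ns / n_stages)

for n in (1, 10, 100, 525):
    # The terms divided by N shrink as N grows, leaving the ideal clock period.
    print(n, round(average_stage_delay_ns(n), 6))
```

With these values the average delay per stage moves from 0.3 ns at the first tap towards the 1 ns clock period as the number of stages increases, provided that the accumulated error grows more slowly than N.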
IV. C++ SIMULATOR MODEL AND RESULTS
A C++ programme that simulates the tapped output delay
generator architecture shown in Fig. 1 was developed, in order
to study how the system will perform when fed by a jittery
clock. The programme consists of a clock and data generator
and a model for the D flip-flop. Simulations were performed
with a sampling resolution of 1 ps and a simulation time of
2 µs. The results from the model with an ideal clock are
presented in Fig. 3, where the delay value is plotted against
the tap number. As expected, the response of the system is
perfectly linear, with an average delay step of 1 ns. The
offset of −0.7 ns stems from the design of the D flip-flop model,
where the clock-to-Q (C2Q) propagation time is equal to 0.3 ns: for N = 1,
Eq. 5 gives Td = Tprop = 0.3 ns, so a line with a gradient of 1 ns per tap
has an intercept of 0.3 ns − 1 ns = −0.7 ns.
A clock with finite jitter was then included in the model and
simulations were performed for different cycle-to-cycle jitter values.
A random number generator based on a normal distribution
was used, with its mean set to zero and a standard deviation
of one third of the maximum jitter required. Fig. 4 shows the
transfer characteristic of the delay generator for a clock jitter
equal to ±250 ps, where the time delay is plotted together with the
line of best fit. As can be seen, since the clock error
is small compared to the periodic time of the clock,
the response of the system is still linear. The method of least
squares was used to calculate the y-intercept of the line of
best fit. The gradient of this line was forced to 1 so that the
clock error affects only the y-intercept,
as per Eq. 5. While an offset of −0.7 ns was predicted for
the ideal case, this has changed to −0.519 ns. This change of
0.181 ns is exactly equal to the mean of the cumulative clock
error. Fig. 5 shows the variation of the offset in the transfer
characteristic of the delay generator for different values of
clock jitter. This delay offset is random in nature and thus
difficult to predict.
Fig. 3. Simulated ideal case, where the average delay is equal to 1 ns, with
an offset of −0.7 ns.
Fig. 4. Simulation of the delay generator with a clock jitter of ±250 ps,
resulting in an offset of −0.519 ns.
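The fitting procedure described in this section can be sketched in a few lines. This is not the C++ simulator used in this work; it is a simplified Python illustration that applies Eq. 5 directly, with hypothetical function names, a 1 ns clock period, a 0.3 ns propagation delay and normally distributed per-cycle errors whose standard deviation is one third of the maximum jitter. With the gradient forced to 1, the least-squares intercept reduces to the mean of (delay − tap number).

```python
import random
from itertools import accumulate
from statistics import mean

T_CLK_NS = 1.0    # assumed 1 GHz clock
T_PROP_NS = 0.3   # clock-to-Q delay of the flip-flop model

def tap_delays_ns(n_stages, max_jitter_ns, seed=0):
    """Delay at every tap following Eq. 5, plus the cumulative clock error."""
    rng = random.Random(seed)
    sigma = max_jitter_ns / 3.0
    errors = [rng.gauss(0.0, sigma) for _ in range(n_stages)]
    cumulative = list(accumulate(errors))        # eps_N for N = 1..n_stages
    delays = [T_PROP_NS + (n - 1) * T_CLK_NS + cumulative[n - 1]
              for n in range(1, n_stages + 1)]
    return delays, cumulative

def unit_gradient_offset_ns(delays):
    """Least-squares intercept of delay = 1 * tap + offset (gradient fixed to 1)."""
    return mean(d - n for n, d in enumerate(delays, start=1))

ideal, _ = tap_delays_ns(525, 0.0)
jittered, cumulative = tap_delays_ns(525, 0.25, seed=3)
print(unit_gradient_offset_ns(ideal))
print(unit_gradient_offset_ns(jittered), mean(cumulative))
```

In this sketch the ideal case reproduces the −0.7 ns offset, and the offset of the jittered case differs from it by exactly the mean of the cumulative clock error, whose value changes with the random seed.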
V. IMPLEMENTATION, TESTING AND RESULTS
The delay generator shown in Fig. 1 has a simple design,
produces linear delay steps and can easily be implemented on
an FPGA. Most importantly, the delay generator is reliable
and does not carry any stability issues since it is an open
loop architecture. The upgraded delay generator was thus
based on this architecture. The implementation consists of
three modules: the clock manager, the delay module, and the
output multiplexer. The clock manager consists of a Xilinx
digital clock manager (DCM) followed by an all-digital phase-locked
loop. A 200 MHz differential clock signal is generated
by an IDT5V9885 programmable clock generator, which has
a jitter of 150 ps. The clock manager multiplies the 200 MHz
signal to obtain a 1 GHz clock. In this case, the jitter is
estimated to rise to 292 ps. This clock is then used to drive
the 525-bit shift register present in the delay module, thus
obtaining the required delay resolution. The last component
in the architecture is the output multiplexer that taps the shift
register such that the required delay can be selected through
a 10-bit control word.
Fig. 5. Variation of the offset in the transfer characteristic of the delay
generator for different values of clock jitter.
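The behaviour of the delay module and the output multiplexer can be illustrated with a cycle-level model. The sketch below is not the VHDL implementation; the class and function names are hypothetical, and propagation delays are ignored so that only whole clock cycles are counted.

```python
class TappedShiftRegister:
    """Cycle-level model of a shift register whose stages can all be tapped."""

    def __init__(self, n_stages):
        self.stages = [0] * n_stages

    def clock(self, data_in):
        # On each clock edge every stage takes the value of the previous one.
        self.stages = [data_in] + self.stages[:-1]

    def output(self, tap):
        # Multiplexer: tap 1 selects the first stage, tap n_stages the last.
        return self.stages[tap - 1]

def cycles_until_output(n_stages, tap, max_cycles=2000):
    """Clock edges between the edge that captures the pulse and its arrival at the tap."""
    register = TappedShiftRegister(n_stages)
    for edge in range(max_cycles):
        register.clock(1 if edge == 0 else 0)   # pulse is sampled on edge 0 only
        if register.output(tap):
            return edge
    raise RuntimeError("pulse never reached the selected tap")

print(cycles_until_output(525, 1), cycles_until_output(525, 300))
```

Selecting tap N gives N − 1 clock periods after the capturing edge, which matches the (N − 1)Tclk,ideal term of Eq. 1.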
A. VHDL Test Bench
A VHDL test bench was developed so that the
delay generator can be simulated using a place-and-route
model of the FPGA. In this case, the offset introduced is
equal to −1.357 ns. Even though there are no errors related
to the clock jitter, another type of error has been introduced
between each stage. This is the systematic error coming from
the physical placement of the shift register on the FPGA,
buffers/drivers and the multiplexer delay. The architecture was
optimised for both speed and floor planning, where the area
occupied by the shift register was constrained to the least
possible area, while also constraining this area within the same
clock region. In total, the architecture occupies 2% of the
FPGA’s slices, 1 DCM_ADV, and 1 PLL_ADV on a Xilinx
Virtex-5 FPGA.
B. Physical Test Bench
Two Xilinx FPGA boards were used to test the system. The
first board emulates the generation of the LM signal, which is
a 25 ns wide trigger signal with a frequency of 200 kHz, and
sends this signal along with the required delay through
the high-speed VHDCI port on the board. The trigger
signal LM and the delayed signal L0 were then read by a
Tektronix TDS820 Oscilloscope and the delay was calculated.
Since the mathematical model of Eq. 5 only describes the
shift register architecture, it was necessary to remove the delay
contribution added by any input and output buffers, look up
tables and the output multiplexer. While an offset of −1.357 ns
was predicted by the VHDL test bench simulation, in this
case the offset has increased to 1.957 ns. This is because,
apart from the systematic offset introduced
by the architecture, the clock error now also affects
the delay error. To isolate the contribution of the clock signal
to the output, the VHDL test bench simulation results can be
subtracted from the measured results, thereby excluding
any systematic errors. This results in an offset equal to
0.134 ns (Fig. 6a). Fig. 6b presents the average delay of the
system including systematic and clock errors. It can be clearly
seen that as the number of taps increases, the average delay
between each stage approaches the period of the ideal clock.
Fig. 6. Measurement results of the delay generator implemented on an FPGA. (a) Error contribution due to the clock jitter only. (b) Average delay obtained against the tap number.
C. Comparison
Table I presents a comparison between the delay generator
currently used by HMPID [4], a DLL-based delay generator
used by the LHCb calorimeter [5] and this work. It can be
seen that this work is the most suitable for use by HMPID as
it satisfies both the delay range and resolution requirements.
In addition, the simulated differential non-linearity (DNL)
is lower than that of the delay generator used by LHCb.
The differential non-linearity is a measure of the difference
between the measured delay and the expected delay. DNL is
measured by the mean of the difference between two adjacent
delay stages [1, 5].
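One possible way to compute such a figure from a list of tap delays is sketched below. The exact formula is an assumption made here for illustration: the DNL is taken as the mean absolute deviation of the step between adjacent taps from the ideal step of 1 ns, and the function name is hypothetical.

```python
from statistics import mean

def dnl_ns(tap_delays_ns, ideal_step_ns=1.0):
    """Mean absolute deviation of adjacent-tap steps from the ideal step (assumed definition)."""
    if len(tap_delays_ns) < 2:
        raise ValueError("at least two tap delays are needed")
    steps = [later - earlier
             for earlier, later in zip(tap_delays_ns, tap_delays_ns[1:])]
    return mean(abs(step - ideal_step_ns) for step in steps)

perfect = dnl_ns([0.3, 1.3, 2.3, 3.3])        # equal 1 ns steps
perturbed = dnl_ns([0.3, 1.3, 2.31, 3.3])     # one tap shifted by 10 ps
print(perfect, perturbed)
```

A constant offset on all taps does not change this figure, since only the differences between adjacent taps enter the calculation.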
TABLE I
A COMPARISON WITH OTHER DELAY GENERATORS WITHIN THE LHC
Work                 Delay Range   Resolution   DNL
Current Module [4]   2.5 µs        25 ns        N/A
DLL Based [5]        25 ns         1 ns         23 ps
This Work            525 ns        1 ns         5 ps
VI. CONCLUSIONS AND FUTURE WORK
This paper has presented an analytical model of a shift-
register based delay generator that will be upgraded for use
by HMPID in the coming years. The improvement led to a
delay range of 525 ns and a delay resolution of 1 ns. Through
this model it was noted that, while the delay response is linear,
a random offset is generated. This offset stems from the clock
error accumulation. If the instantaneous clock error follows
a normal distribution with mean equal to zero, due to clock
error accumulation, the delay error would have a multi-modal
distribution with a different mean. The design of the system
was implemented on a Xilinx FPGA and the results were
verified both through simulations and actual measurements.
This work shows that a shift-register architecture
can function as a delay generator that
filters out clock and systematic errors. However, it is
evident that the offset can pose a problem, particularly because
it is unpredictable. This issue needs further investigation and a
possible solution could be to integrate the command sent to the
multiplexer in a feedback mechanism, where the actual delay
between L0 and LM is counted, and the multiplexer command
is then tuned accordingly.
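Such a feedback mechanism could take the following form. This is only a hypothetical sketch of the idea and not a design that has been implemented or tested: the function names are invented, the measurement is replaced by a stand-in model with a unit gradient and a hidden offset, and the offset value of 1.957 ns is simply the one measured in Section V.

```python
def tune_tap(target_ns, measure_delay_ns, initial_tap,
             resolution_ns=1.0, max_tap=525, iterations=5):
    """Adjust the multiplexer command until the measured delay is within half a step."""
    tap = initial_tap
    for _ in range(iterations):
        error_ns = target_ns - measure_delay_ns(tap)
        correction = round(error_ns / resolution_ns)   # whole taps to add or remove
        if correction == 0:
            break
        tap = min(max(tap + correction, 1), max_tap)   # stay within the available taps
    return tap

def fake_measurement(tap, hidden_offset_ns=1.957):
    # Stand-in for counting the real delay: 1 ns per tap plus an unknown offset.
    return tap * 1.0 + hidden_offset_ns

tuned = tune_tap(400.0, fake_measurement, initial_tap=400)
print(tuned, fake_measurement(tuned))
```

In this toy model the loop settles after one correction, because the offset is constant; an offset that changes between measurements would need repeated tuning.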
ACKNOWLEDGEMENTS
The research work disclosed in this publication is funded by the ENDEAV-
OUR Scholarship Scheme (Malta). The scholarship may be part-financed
by the European Union - European Social Fund (ESF) under Operational
Programme II - Cohesion Policy 2014-2020, “Investing in human capital to
create more opportunities and promote the well being of society.”
REFERENCES
[1] B. Abdulrazzaq, I. A. Halin, S. Kawahito, R. M. Sidek, S. Shafie,
and N. A. Yunus, “A Review on High-Resolution CMOS Delay Lines:
Towards sub-picosecond Jitter Performance,” SpringerPlus, vol. 5, no. 1,
pp. 1–32, 2016.
[2] F. Piuz, W. Klempt, L. Leistam, J. De Groot, and J. Schukraft, Detector
for High Momentum PID - ALICE Technical Design Report. Geneva:
CERN, 1998.
[3] P. Antonioli, A. Kluge, and W. Riegler, “Upgrade of the ALICE Readout
& Trigger System,” CERN, Tech. Rep. CERN-LHCC-2013-019. ALICE-
TDR-015, Sep 2013.
[4] M. Krivda, “Fanin\Fanout Unit - User’s Guide,” August 2007.
[5] J. Mauricio, D. Gascón, X. Vilasís, E. Picatoste, F. Machefert, J. Lefrançois,
O. Duarte, and C. Beigbeder, “Radiation hard programmable delay
line for LHCb calorimeter upgrade,” Journal of Instrumentation, vol. 9,
no. 01, p. C01016, 2014.
