Optimization of multi-gigabit transceivers for high speed data
  communication links in HEP Experiments by Khan, Shuaib Ahmad et al.
Optimization of multi-gigabit transceivers for high speed data communication
links in HEP Experiments
Shuaib Ahmad Khan (a,∗), Jubin Mitra (a), Tushar Kanti Das (a), Tapan K. Nayak (a,b)
(a) Variable Energy Cyclotron Centre, Homi Bhabha National Institute, Kolkata, India
(b) CERN, CH-1211 Geneva 23, Switzerland
Abstract
The scheme of the data acquisition (DAQ) architecture in High Energy Physics (HEP) experiments consists of data
transport from the front-end electronics (FEE) of the online detectors to the readout units (RU), which perform online
processing of the data, and then to the data storage for offline analysis. With major upgrades of the Large Hadron
Collider (LHC) experiments at CERN, the data transmission rates in the DAQ systems are expected to reach a few
TB/sec within the next few years. These high rates are normally associated with the increase in the high-frequency
losses, which lead to distortion in the detected signal and degradation of signal integrity. To address this, we have
developed an optimization technique of the multi-gigabit transceiver (MGT) and implemented it on the state-of-the-art
20nm Arria-10 FPGA manufactured by Intel Inc. The setup has been validated for three available high-speed data
transmission protocols, namely, GBT, TTC-PON and 10 Gbps Ethernet. The improvement in the signal integrity is
gauged by two metrics, the Bit Error Rate (BER) and the Eye Diagram. It is observed that the technique improves the
signal integrity and reduces BER. The test results and the improvements in the metrics of signal integrity for different
link speeds are presented and discussed.
Keywords: HEP, DAQ, Transceiver, FPGA, Signal Integrity
1. Introduction
The major goals of HEP experiments are to probe
the fundamental constituents of the matter and under-
stand the nature of fundamental forces. Advanced re-
search in HEP demands a progressive increase in colli-
sion energies and beam luminosities of the particle ac-
celerators, which are essential for accessing rare probes
with extremely low cross sections [1]. The experiments
are continuously upgraded with sophisticated detectors,
electronics and DAQ systems [2, 3]. The DAQ architectures
have been evolving continuously to cope with
the demands of the experiments [4, 5]. The LHC at
CERN will go through a major upgrade during the long
shutdown (LS2) period, following which the beam lu-
minosities will increase by about an order of magnitude
from their present values. At the same time, the exper-
iments at the LHC are upgrading the detector and DAQ
systems to allow for faster readout of the online data.
∗Corresponding author
Email address: shuaib.ahmad.khan@cern.ch (Shuaib
Ahmad Khan)
The DAQ architecture in HEP experiments consists
of three general steps: (i) the data from the online
detectors are transferred to the FEE through the detector
backplane, (ii) the data from the FEE are transferred to
the RU [6, 7, 8], and (iii) the processed data are further
transferred to data storage. These steps require high-
speed data communication links from one step to the
other. Most of the DAQ systems are designed using
the currently available technology in such a way that they
can be easily upgraded to match the requirements of
the system. Since one of the major concerns is to effi-
ciently acquire data for all the collisions, error resilient
and efficient data transmission with minimal signal at-
tenuation is required. Signal integrity is essential for the
proper Clock and Data Recovery (CDR) [6, 9]. Thus it
is a challenge to minimize the bit error ratio (BER) and
improve signal integrity for increased data rates [10].
In this manuscript we address the challenges of high-
frequency losses arising due to the high data rates for
the DAQ systems in HEP experiments. Using an FPGA,
we present a heuristic optimization technique to tune
the parameters of multi-gigabit transceivers for achieving
the best performance at high speeds for the transmission
of data, trigger, timing and slow control information.
Preprint submitted to Elsevier January 10, 2019
arXiv:1901.02722v1 [physics.ins-det] 9 Jan 2019
The proposed technique helps to improve the
system performance in terms of signal integrity and is
implemented on a state-of-the-art 20nm Intel Arria-10
FPGA [11]. It uses the Intel-Altera on-die Instrumen-
tation tools [12] and does not require the probing of
FPGA pins or transceiver attributes. The full setup is
tested for the link rate of the high-speed communica-
tion protocols frequently used for data transmission in
these experiments. The technique is useful for on-field
system-level debugging, and the parameters can be re-
configured dynamically, allowing the user to configure
the transceivers for optimum performance. The robustness
of the optimization technique has been tested with the
Pseudo Random Binary Sequence 31 (PRBS31) pattern,
which represents the stressed and transitional data con-
ditions. For the statistical reliability of the performed
tests, a large number of data vectors are acquired. Different
performance indicators, such as the BER and eye diagrams,
have been used to verify the improvement in the
quality of the data signal after the execution of the proposed
optimization technique.
The manuscript is organized as follows. In section 2,
we present the data aggregation and processing in HEP
experiments. The important constituents of the high-
speed DAQ system are discussed in section 3. Details of
the transceiver optimization technique with its intricate
features are presented in section 4. Section 5 describes
the FPGA based test setup, and section 6 discusses the
methodology to implement the proposed technique and
its advantages. The test results are presented and dis-
cussed in section 7. The manuscript is summarised in
section 8.
2. Data aggregation and processing
A generalised architecture for the DAQ scheme of the
HEP experiments is presented in Figure 1. The FEE
boards are connected to the detectors and are located in
the radiation zone with proximity to the detector, requir-
ing custom-built radiation hard electronics. The FEE
boards process the analog detector signals and convert
those to digital signals. Design and specifications of
these boards are unique to the individual detector sys-
tem [13]. The particle detectors operate in the harsh ra-
diation zones and in some cases, in high magnetic fields.
The main data storage units, on the other hand, are kept
in low radiation zones. The RUs, which are interme-
diary between FEE and storage, can be placed either
in the radiation zone of the experiment’s cavern or in
a low radiation zone near the data storage units. In an
ideal case, the placing of the RUs near the detectors in
the cavern minimizes the transmission latencies. But it
requires custom-built radiation hard electronics, which
are difficult to obtain. In order to minimize the effect of
radiation, the RUs as well as the trigger system and the
back-end computing nodes, are kept out of the radiation
zone. This allows the use of available electronics with high
processing power and a large ecosystem, and eases
accessibility and maintenance.
[Figure 1 block diagram. Blocks: Detector, FEE (Front End Electronics), RU (Readout Unit), Trigger System, Computing Node (Server/PCs). Links: Data Links, Trigger Links, DAQ Links. T: Multi-Gigabit Transceivers. Regions: Radiation Zone, Outside the Radiation Zone.]
Figure 1: Basic blocks of a typical data acquisition architecture for
HEP experiments.
The RU acts as an interface between detector data
links, the trigger system, and links to storage as well
as computing nodes as shown in Figure 1. The tasks
performed by the FPGA based RUs depend on the detector
specifications and requirements. The main tasks are
data sorting, optical link handling, multiplexing and forwarding
of data from different interfacing links, embedding
control and trigger information, etc. [14]. These
versatile functionalities require the RU to be designed on
custom electronics boards with re-programmable functionality
[15]. The RU is based on up-to-date FPGA technology
with embedded on-chip transceivers. For our tests
we have used the Intel Arria-10 GX FPGA based devel-
opment board [11, 16]. The interfacing links of RU and
the high-speed communication protocols used for the
LHC experiments in the context of the present frame-
work are discussed in the following sections.
3. High-speed protocols
The DAQ architecture in Fig. 1 features three differ-
ent interfacing links: (i) the Data link, which connects
the detector FEE to RU, (ii) the Trigger link, which con-
nects the RU to the trigger system of the experiment,
and (iii) the DAQ link, which takes the data from the
RU to the storage and computing nodes. For the data
link, the Gigabit Transceiver (GBT) protocol architec-
ture [17], developed at CERN, has been found to be
the most suitable. The GBT protocol supports a 4.8 Gb/sec data
transmission rate. It ensures the transmission of data
from the FEE near the detectors in high radiation zone
to the RU, which is located near the counting room in a
low or no radiation zone. The Trigger link uses the Timing,
Trigger and Control system based on Passive Optical
Networks (TTC-PON) technology [18] and operates at
a rate of 9.6 Gigabit per second. It ensures fixed, deterministic
latency and satisfies the timing specification
of the LHC.
The data packets get time-stamped in the RU. Thus
the links from the RU to the computing nodes are not latency
critical. It has been found that the latest promising
technology option of 10-Gigabit Ethernet [5], with its ample
ecosystem, is most suitable for the DAQ links in
the experiments. In Table 1, we give the detailed spec-
ifications of the three interface links used in the HEP
experiments for the acquisition of data.
4. Transceiver optimization
High-speed data communication suffers from
transmission losses and signal integrity issues that are not seen
at normal digital signalling levels [10]. The high-frequency
content of the signal gets degraded due to
dielectric losses, skin effect, discontinuities in connec-
tors, reflections caused by the vias, inadequately placed
traces, etc. We have developed a technique to optimize
the transceiver parameters accurately and offer the best
combination for a given high-speed link. This optimiza-
tion of the transceiver parameters could take care of the
transmission losses [19].
For the high-speed transmission channels with multi-
gigabit rates, the unit interval (UI) for the data bit de-
creases. At high transmission rates, the PCB materials
suffer from frequency dependent losses, hence become
dispersive. This prevents the signal from reaching its
full strength at the shrunk UI window, leading to jitter
and intersymbol interference (ISI). It also disturbs the
deciphering of the signal and the extraction of the em-
bedded clock becomes difficult at the receiver end.
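To make the shrinking of the UI concrete, the short Python sketch below evaluates UI = 1 / (line rate) for the three line rates listed in Table 1. The function name `unit_interval_ps` and the dictionary of rates are introduced here purely for illustration.

```python
def unit_interval_ps(line_rate_gbps):
    """Return the unit interval (duration of one bit) in picoseconds."""
    # 1 / (rate in Gb/s) is a time in ns; multiply by 1000 for ps.
    return 1000.0 / line_rate_gbps


# Line rates taken from Table 1 (Gb/s).
LINE_RATES_GBPS = {
    "GBT": 4.8,
    "TTC-PON (downstream)": 9.6,
    "10 Gb Ethernet": 10.3125,
}

for name, rate in LINE_RATES_GBPS.items():
    print(f"{name}: UI = {unit_interval_ps(rate):.1f} ps")
```

Doubling the line rate from 4.8 Gbps to 9.6 Gbps halves the time available for the signal to settle within one bit period, which is why the frequency dependent losses become more visible at the higher rates.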
An increase of the signal strength is an obvious so-
lution to overcome the attenuation. However, the issue
of high-frequency roll-off remains, and the pattern de-
pendent jitter gets aggravated. Consequently, the signal
does not reach its optimal strength within the interval
and may diffuse further into the next UI leading to ISI.
Increasing the signal strength also raises the overall power
consumption of the transceiver. Noise levels
in the system also increase proportionally. All these
lead to deteriorated metrics of signal integrity and re-
duced drive length. The effects are even more evident
with the use of high-speed interfaces with the systems
which were originally designed for low bandwidth ap-
plications.
To overcome these losses, we have developed the
transceiver optimization technique and a proficient
methodology for the 20nm Arria-10 FPGA. This new
FPGA, with considerably large on-chip resources [11],
is ideal for the processing requirements in the experiments.
4.1. Optimization Technique
For the optimization, the high-frequency components
in the data stream are boosted up on every switch-
ing, using the digital pre-emphasis taps of the on-chip
transceiver. In addition, the low frequency components
are reduced. This technique helps to achieve the same
amount of emphasis with less power dissipation. The
exaggerations are overridden by the attenuation during
transmission and allow for the signal to be recovered
accurately.
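[Figure 2 diagram: transmitter driver with VOD and four taps (1st pre-tap, 2nd pre-tap, 1st post-tap, 2nd post-tap) on delay elements Z^-2, Z^-1, Z^+1 and Z^+2, each with selectable +/- polarity. Z: operator for Z-transform.]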
Figure 2: Voltage output differential (VOD) and tunable pre-emphasis
taps with flexible polarity in the embedded transceiver of FPGA.
The optimization technique has been implemented on
Intel Arria-10 FPGA development board with integrated
reconfigurable transceiver architecture [11]. It incorpo-
rates additional circuitry in buffers for equalisation and
pre-emphasis techniques. The transmitter of the embed-
ded transceiver has five programmable drivers as shown
in Figure 2. Voltage output differential (VOD) controls
the base amplitude. The four pre-emphasis taps are
1st pre-tap, 2nd pre-tap, 1st post-tap and 2nd post-tap.
These taps also include polarity settings. The post taps
are the causal taps and the pre-taps are the anti-causal
taps. These multiple taps and choice of polarity could
handle channel attenuating characteristics. Equalisation
with DC gain and Variable Gain Amplifier (VGA) is on
the receiver side of the transceiver. There are multiple
transceiver parameters with a large span of operating
range and so to scan the system performance for ev-
ery combination of the parameters is a time-consuming
process. Our goal has been to develop an efficient technique
for optimization of transceiver parameters such
that the signals impacted by the high-frequency losses
are recovered.
Table 1: Specifications of three high speed interface links, GBT [17], TTC-PON [18] and 10-Gb Ethernet.
Parameters | GBT | TTC-PON | 10Gb Ethernet
Technology Specification | Custom | XGPON1 with modifications | 802.3ae Specification Standard
Designer Group | CERN | ITU-T with CERN modifications | IEEE
Line Rate | 4.8 Gbps | Downstream: 9.6 Gbps; Upstream: 2.4 Gbps | 10.3125 Gbps per lane
Payload Rate | 3.2 Gbps | Downstream: 7.68 Gbps; Upstream: 640 Mbps | 10 Gbps per lane
Payload Size | 120 bits@40 MHz | Downstream: 192 bits@40 MHz; Upstream: 16 bits@40 MHz | 64 bits@156.25 MHz
Wavelength (nm) | 850 nm (Multi-mode); 1310 nm (Single-mode) | Downstream: 1577 nm; Upstream: 1270 nm | 850 nm (10 Gb BASE-SR)
Network Topology | Point-to-Point | Point-to-Multipoint | Point-to-Point
Encoding | RS ECC with Block Interleaver | 8b/10b | 64b/66b
Synchronous Trigger Support | Yes | Yes | No
Trigger Latency | 150 ns (Optical loop-back) | Downstream: 100 ns; Upstream: 1.6 us | NA
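The rates quoted in Table 1 can be cross-checked with simple arithmetic: a word of a given width transferred at a given clock frequency yields width × frequency bits per second, and a line code such as 8b/10b or 64b/66b scales the line rate by its coding efficiency. The sketch below is illustrative only; the helper name `rate_gbps` is an assumption made here and is not part of any protocol specification.

```python
def rate_gbps(bits_per_word, clock_mhz):
    """Throughput in Gb/s of a word of `bits_per_word` bits sent at `clock_mhz`."""
    return bits_per_word * clock_mhz / 1000.0


# Word width and clock frequency as listed in the "Payload Size" row of Table 1.
gbt = rate_gbps(120, 40)          # GBT: 120 bits at 40 MHz
ttc_down = rate_gbps(192, 40)     # TTC-PON downstream
ttc_up = rate_gbps(16, 40)        # TTC-PON upstream
eth = rate_gbps(64, 156.25)       # 10 Gb Ethernet

# Payload rate recovered from the line rate and the coding efficiency.
ttc_down_from_line = 9.6 * 8 / 10       # 8b/10b encoding
eth_from_line = 10.3125 * 64 / 66       # 64b/66b encoding

print(f"GBT, 120 bits @ 40 MHz:        {gbt:.2f} Gbps")
print(f"TTC-PON downstream payload:    {ttc_down:.2f} Gbps ({ttc_down_from_line:.2f} from 8b/10b)")
print(f"TTC-PON upstream payload:      {ttc_up * 1000:.0f} Mbps")
print(f"10 Gb Ethernet payload:        {eth:.2f} Gbps ({eth_from_line:.2f} from 64b/66b)")
```

For TTC-PON and 10 Gb Ethernet the two routes agree with the payload rates in the table. For GBT, 120 bits at 40 MHz reproduces the 4.8 Gbps line rate rather than the 3.2 Gbps payload rate.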
The pre-emphasis works like a Finite Impulse Response (FIR) filter
with different delays, referred to as the taps, as shown in
Figure 2. An FIR filter is based on a feed-forward
difference equation. The pre-emphasis technique applies
a delay to the signal and adds it back to the original
signal with a weight and inversion as and when required.
However, depending on the characteristics of the transmission
channel, a single delay, weight and inversion may not
be able to provide the required compensation. For this
reason, different delays, weights and
polarities are combined. In this configuration, the
pre-emphasis 1st post-tap is the most useful parameter.
It emphasises the immediate bit period after the transition.
The generation of the differential emphasised
signal, applying the unit delay by the first post-tap, is
shown in Figure 3, assuming VOD = 1 and a tap weight
0 < x < 1. The original positive signal Vp(T) is
combined with Vp(T-1), which is the unit-delayed signal.
The emphasised signal is Vp(T) - x*Vp(T-1), the difference
between the Vp(T) signal and the weighted x*Vp(T-1) signal.
The negative signal is generated similarly. The pre-emphasised
differential signal is the difference between the
pre-emphasised positive and negative signals. The effect of the
2nd post-tap after the transition, depending on the chosen polarity
setting, is shown in Figure 4.
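The 1st post-tap construction described above can be sketched in a few lines of Python. This is a minimal behavioural model assuming VOD = 1, ideal logic levels (0 and 1) and a negative signal that is the logical complement of the positive one; the function name `first_post_tap` and the `initial_bit` argument are introduced here for illustration and do not correspond to any transceiver register.

```python
def first_post_tap(bits, x, initial_bit=0):
    """Differential output of a 1st post-tap pre-emphasis stage.

    bits        : iterable of 0/1 values, the positive signal Vp(T)
    x           : tap weight, 0 < x < 1
    initial_bit : assumed value of the bit preceding the sequence
    """
    prev_p = initial_bit
    prev_n = 1 - initial_bit
    out = []
    for p in bits:
        n = 1 - p                      # negative signal Vn(T)
        emph_p = p - x * prev_p        # Vp(T) - x*Vp(T-1)
        emph_n = n - x * prev_n        # Vn(T) - x*Vn(T-1)
        out.append(emph_p - emph_n)    # pre-emphasised differential signal
        prev_p, prev_n = p, n
    return out


print(first_post_tap([0, 1, 1, 1, 0, 0], x=0.25))
```

With x = 0.25 the bit immediately after each transition has amplitude 1 + x, while repeated bits settle to 1 - x, so the transitions are boosted relative to the low-frequency content without raising the peak swing beyond 1 + x.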
The pre-tap reduces the effect of pre-cursor ISI. Fig-
ure 5 shows the impact of 1st pre-tap and the 2nd pre-tap
on the single and double bit period respectively, before
the occurrence of high-frequency transition depending
on the polarity. Both pre-cursor ISI and post-cursor
ISI are handled by anti-causal and causal taps respec-
tively. However, pre-emphasis alone cannot guarantee
the performance of the system as it is implemented at
the transmitter by pre-conditioning the signal before it
is fed to the channel. There are high-frequency losses
in the transmission channel itself. Hence an equalisa-
tion is required at the receiver end. It compensates for
the low pass characteristics of the physical medium and
amplifies the attenuated high-frequency components of
the incoming signal. An equalizer on the receiver side
lifts the contents inside a band of frequencies and at-
tenuates the rest. The DC gain circuitry gives uniform
amplification to the received spectrum. It enables the
transceivers to operate over longer distances. The VGA
on the receiver optimizes the signal amplitude before
the CDR sampling.
[Figure 3 waveform diagram: original positive signal Vp(T), unit-delayed signal Vp(T-1), weighted tap x*Vp(T-1), pre-emphasised positive signal Vp(T) - x*Vp(T-1), original negative signal Vn(T), pre-emphasised negative signal Vn(T) - x*Vn(T-1), and the pre-emphasised differential signal with levels such as 1+x, 1-x, x-1 and -1-x.]
Figure 3: The pre-emphasis signal generation technique at the 1st
post-tap in embedded FPGA transceivers, 0 < x < 1 is the tap weight.
[Figure 4 waveform diagram: signal without pre-emphasis compared with the 1st post-tap signal, Vp(T) - x*Vp(T-1) - Vn(T) + x*Vn(T-1), and the inverted 2nd post-tap signal, Vp(T) - x*Vp(T-2) - Vn(T) + x*Vn(T-2).]
Figure 4: Pre-emphasis 2nd post-tap (Inverted) compared with pre-
emphasis 1st post-tap and their effect on the signal without pre-
emphasis.
To achieve optimal signal integrity performance,
the transmitter and receiver parameters of the
transceiver on the FPGA chip complement each other and
work together to compensate for the high-frequency
losses. However, over-compensation degrades the
signal quality and adds more jitter, closing the
eye diagram and making it difficult for the receiver to
identify the signal, and hence should be avoided.
[Figure 5 waveform diagram: signal without pre-emphasis compared with the 1st pre-tap signal, Vp(T) + x*Vp(T+1) - Vn(T) - x*Vn(T+1), and the 2nd pre-tap signal, Vp(T) + x*Vp(T+2) - Vn(T) - x*Vn(T+2).]
Figure 5: Pre-emphasis 1st pre-tap and the 2nd pre-tap (Inverted) and
their effect on the signal without pre-emphasis.
5. Test setup
An FPGA based setup has been developed to test the
effectiveness of the proposed optimization technique. The
transceiver is tested for the high-speed links under
stressed conditions. The setup has been used to emulate
the stressed high-speed link conditions and to investigate
the high-frequency losses in the transmission.
It determines the capability of the transceiver system to
recover the data from the degraded signals. Tests are
performed at the system level to operate the setup at a
prescribed BER equal to or better than 10^-12 as per the
IEEE standard.
The test setup, shown in Fig. 6, uses the Arria-10
FPGA development board (10AX115S2F45I1SG device)
for the implementation and testing of the optimization
technique. The FPGA development card is installed
in the PCIe 16-lane slot of the server and draws its power
from the server motherboard. The functions
and specifications of each of the components of
the setup are given in Table 2.
The Intel Quartus-II platform is the firmware development
package used to implement the FPGA logic design. The
transmission links at the specified data rates are implemented
using the Quartus-II Qsys tool. Qsys is Intel's system
integration tool for the quick generation of the interconnect
logic. The signal integrity of the transceiver
links is validated using the Transceiver Toolkit (TTK)
feature of Quartus-II, which has a GUI. The TTK is used to
quickly access, tune and test the transceiver parameter
settings at runtime through a combination of metrics.
The TTK enables us to measure the BER and the eye diagrams
and also to verify the signal integrity in external
loopback mode. Details of the firmware tools, such as
Quartus II, Qsys, TTK, PRBS patterns and auto-sweep
features, may be found in reference [12].
[Figure 6 diagram: server motherboard with a PCIe x16 Gen 3 slot holding the FPGA board through its PCIe connector; Arria 10 FPGA with user loopback logic on silicon; externally pluggable SFP+ module (Tx, Rx); optical loopback through a variable optical attenuator. Notes: optical power meter for optical power measurement (InGaAs detector, range (-70 dBm), resolution 0.01 dBm); Lucent to Ferrule (LC to FC) connector to couple the optical fibre to the power meter (50/125um hybrid connector).]
Figure 6: Arria-10 FPGA card inserted in PCIe x16 slot of server. The optical signal from the externally pluggable SFP+ is looped back via the
fibre equipped with the variable optical attenuator (VOA).
Table 2: Components used in the test setup, their role and specifications.
Component | Role in test setup | Specification
FPGA Test Board | Integrated FPGA based design environment with embedded transceivers on silicon. PCIe connection. Slot for hot pluggable transceiver optical modules. Other accessories. | Intel Arria10 FPGA (20nm mid-range). Transceivers up to 17.4 Gbps [11].
Variable Optical Attenuator (VOA) with optical fibre | Optical power attenuation in the fibre loopback path. | Range: 0 to 60 dB, accuracy +/- 0.8 dB. Fibre (850 nm): multimode 50/125um with Lucent connector (LC), diameter 2 mm, insertion loss < 2.5 dB, length 2 m.
Small Form-factor Pluggable (SFP+) module | External transceiver modules to be coupled to the fibre. Laser at the transmitter and PIN diodes at the receiver ends. | Hot-pluggable footprint, up to 10 Gbps, 850 nm VCSEL laser, duplex LC connector. Link length of 300 m [20].
Workstation with FPGA design platforms | FPGA board powered through PCIe Gen3 x16 slot. Compile and generate the FPGA design with firmware development software. | PCIe Gen3 x16 slots available. Quartus-II platform installed for firmware design and generation. FPGA programmed through USB blaster download cable.
[Figure 7 diagram: on the FPGA (silicon), a data generator (PRBS pattern) drives the transmitter; the optical link loops back to the receiver, which feeds the data receiver (PRBS pattern check). Clocks: core_clk_out, core_clk_in, feedback clock. Connection layout in Qsys (Platform Designer).]
Figure 7: Typical BER test loopback logic on FPGA using Qsys tool.
The serialised data is transmitted, looped back and checked for the
flipped bits at the receiver.
For the data loopback tests [21], multimode optical
fibre equipped with a Variable Optical Attenuator (VOA)
and external pluggable SFP+ modules are used. The
transmitter output is looped back to the receiving
end. The received data is then verified by the data
checker logic on the FPGA for any erroneous bits, as shown
in Figure 7. To test the signal integrity, a variety of
data patterns can be used. However, in each case, a
checker must be available for verification. PRBS patterns
are injected into the test system as they provide
stressed and lengthy patterns with low memory consumption
[22]. Another advantage of using PRBS patterns
for the tests is that boundary synchronisation
is not necessary at the physical layer as the patterns are
time correlated. The Intel soft logic cores are used for the
PRBS data pattern generator and checker [12].
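The remark that no boundary synchronisation is needed can be illustrated with a self-synchronising checker, which predicts every received bit from the bits received just before it. The Python sketch below is a behavioural toy model and not the Intel soft core; it assumes the commonly used PRBS7 polynomial x^7 + x^6 + 1, and the function names `prbs7` and `count_mismatches` are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of PRBS7 using the recurrence a[k] = a[k-7] XOR a[k-6]."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        out.append(new_bit)
    return out


def count_mismatches(rx):
    """Self-synchronising check: predict each bit from the 7 bits received before it."""
    return sum(1 for k in range(7, len(rx)) if rx[k] != (rx[k - 7] ^ rx[k - 6]))


tx = prbs7(1000)

# The checker locks on regardless of where in the pattern reception starts.
print(count_mismatches(tx))        # error-free loopback
print(count_mismatches(tx[13:]))   # reception starting at an arbitrary offset

# A single flipped bit is flagged once per tap of the recurrence.
corrupted = list(tx)
corrupted[100] ^= 1
print(count_mismatches(corrupted))
```

In this toy model a single bit error raises three mismatch flags (at the flipped bit and at the two later positions that use it for prediction), so a practical checker of this type has to account for that multiplication when converting flags into a BER.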
The BER measurement approach was chosen with respect
to the controlled attenuated optical power at the
receiver with the help of VOA. It allowed us to rapidly
characterise the transceiver sensitivity below which the
embedded clock cannot be recovered from the data
stream, and loss of lock occurs [19]. It also deter-
mines the minimum required optical power to achieve
the targeted BER for a system operating at a specified
data rate. The auto-sweep feature of the TTK is used to obtain
the best performing settings of the transceiver
parameters for a specified BER. This optimized
set of transceiver parameters delivers the best
metrics of signal integrity, including the height and width
of the eye diagram. In the next section, we elaborate the
methodology for the optimization of high data rate on-chip
transceivers to reduce the effect of high-frequency
losses.
6. Methodology
The methodology to extract the optimized settings
of the transceiver parameters is explained in the
flowchart in Figure 8. To start with, the optimization
process scans the full range of each transceiver parameter
using the TTK auto-sweep feature while the rest of
the parameters are set at their Intel-default values. It then
records the best performing setting for each
transceiver parameter as indicated by the eye parameters.
At this point, a Solution Matrix (S) is constructed for the
Nth iteration, with N = 1. We then separately group
the transmitter parameters, viz. VOD and pre-emphasis (1st
pre-tap, 2nd pre-tap, 1st post-tap, 2nd post-tap), and
the receiver parameters (DC gain, equalisation control,
VGA). The transmitter and receiver parameters are then
scanned again separately in the range -3 ≤ S ≤ 3 around
the values listed in S, while the receiver and transmitter
parameters, respectively, are held at the values listed in S.
The best performing cases are recorded again, S is updated
with the newer values, and N is incremented by 1. The
latest matrix values are assigned to the
TTK and the loopback test is run. If this does not result
in improved metrics of signal integrity (eye diagram
and BER) compared with those obtained at the Intel
default values, the optimization loop is repeated with
adjusted S values in the defined range until an improvement
in both the eye diagram and the BER is achieved.
The parameters cannot be declared optimized until
a degradation of the signal integrity metrics
from their peak values is observed. The degradation of
the metrics denotes over-compensation and marks the
transition past the maxima for the transceiver parameters.
S is then updated with the best performing
values, rejecting the over-compensated value
set. The final set of S values with the best performing metrics
is known as the Solution Space [19]. The deduced final values
are fed to the transceiver for further analysis. The
results are presented and discussed in the next section.
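The search strategy of this section can be summarised in code. The Python sketch below is a simplified model under several assumptions that are not taken from the source: the measured signal integrity is abstracted into a single hypothetical callable `score` (higher is better, standing in for the eye and BER metrics read from the TTK), each parameter is refined individually within ±3 of its value in S, and the default values, the parameter names and the toy score function are invented for the demonstration. The parameter ranges are those of Table 3.

```python
TX = ["vod", "post1", "pre1", "post2", "pre2"]   # transmitter group
RX = ["dc_gain", "eq", "vga"]                     # receiver group

# Parameter ranges as listed in Table 3.
RANGES = {"vod": (0, 31), "post1": (-31, 31), "pre1": (-31, 31),
          "post2": (-15, 15), "pre2": (-7, 7),
          "dc_gain": (0, 4), "eq": (0, 15), "vga": (0, 7)}


def best_value(settings, name, candidates, score):
    """Best candidate for one parameter with all other parameters held fixed."""
    return max(candidates, key=lambda v: score({**settings, name: v}))


def optimise(defaults, score, window=3, max_iter=20):
    # Stage 1: full-range sweep of each parameter, the rest at their defaults.
    s = {name: best_value(defaults, name, range(lo, hi + 1), score)
         for name, (lo, hi) in RANGES.items()}
    best = score(s)
    # Stage 2: refine the two groups within +/- window of S, the other
    # parameters being held at the values currently listed in S.
    for _ in range(max_iter):
        trial = dict(s)
        for group in (TX, RX):
            for name in group:
                lo, hi = RANGES[name]
                cands = range(max(lo, s[name] - window),
                              min(hi, s[name] + window) + 1)
                trial[name] = best_value(s, name, cands, score)
        new = score(trial)
        if new <= best:        # no gain, or over-compensation: keep previous S
            break
        s, best = trial, new
    return s, best


# Toy stand-in for a measurement: peaks at an arbitrary, made-up setting.
TARGET = {"vod": 28, "post1": 6, "pre1": -2, "post2": 1, "pre2": 0,
          "dc_gain": 2, "eq": 9, "vga": 4}


def toy_score(settings):
    return -sum((settings[k] - TARGET[k]) ** 2 for k in TARGET)


DEFAULTS = {name: 0 for name in RANGES}   # hypothetical defaults
solution, metric = optimise(DEFAULTS, toy_score)
print(solution, metric)
```

In a real setup each call to `score` would correspond to a loopback measurement, so the number of calls, rather than the arithmetic, dominates the run time.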
The proposed technique has definite advantages over
the traditional method, in which the transceiver optimization
is carried out in an extremely time-consuming way
by evaluating the signal integrity for a large number
of permutations and combinations of the parameters.
The parameters and their possible ranges are listed
in Table 3.
Table 3: Transceiver parameters, range of operations for the manual
optimization.
Transceiver parameter | Range of possible values | Number of iterations required
Transmitter Side | |
VOD | 0 to 31 | 32
Pre-emphasis 1st post-tap | -31 to 31 | 63
Pre-emphasis 1st pre-tap | -31 to 31 | 63
Pre-emphasis 2nd post-tap | -15 to 15 | 31
Pre-emphasis 2nd pre-tap | -7 to 7 | 15
Receiver Side | |
DC gain | 0 to 4 | 5
Equalisation | 0 to 15 | 16
VGA | 0 to 7 | 8
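The scale of an exhaustive search follows directly from Table 3. The short calculation below multiplies the number of settings per parameter to obtain the size of the full combination space and, for comparison, sums them to obtain the cost of sweeping each parameter once with the others held fixed. The dictionary keys are informal labels chosen here for readability.

```python
import math

# Number of settings per parameter, from the last column of Table 3.
SETTINGS = {
    "VOD": 32,
    "1st post-tap": 63,
    "1st pre-tap": 63,
    "2nd post-tap": 31,
    "2nd pre-tap": 15,
    "DC gain": 5,
    "Equalisation": 16,
    "VGA": 8,
}

exhaustive = math.prod(SETTINGS.values())   # every combination of all parameters
one_at_a_time = sum(SETTINGS.values())      # one full-range sweep per parameter

print(f"Exhaustive combinations:  {exhaustive:,}")
print(f"One-at-a-time sweep steps: {one_at_a_time}")
```

The full space contains about 3.8 × 10^10 combinations, against 233 settings for a single pass of individual sweeps, which is why a guided search is needed when every setting requires a BER or eye measurement.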
7. Results and discussion
Results are demonstrated and validated for three
different high-speed optical links: 10 Gbps links, the 4.8
Gbps GBT protocol and 9.6 Gbps TTC-PON. The test
system challenges the lock and hold capability of the
CDR circuit, exercises all conceivable instances of
ISI and analyses the receiver sensitivity for any probable
drifts. Drifts at the receiver are caused by
long imbalanced runs in the data transition pattern. The
PRBS31 pattern, of length 2^31 - 1 bits, contains every
combination of 31 bits except one. It gives a pseudo-random
sequence of bits with high and low transitional values as
defined by the logic levels of the FPGA. The different
combinations induce dissimilar ISI configurations. This is
required to stress the transceivers, to test any innate ISI
in a transmitter, and to assess the quality of transmission.
PRBS patterns have a white spectrum in the frequency domain
and are injected to test the robustness of the high-speed links. For
the entire analysis, PRBS31 is used to stress the system.
However, the variation of the eye diagram and BER characteristics
is also studied for PRBS7, PRBS9, PRBS15 and
PRBS23 in addition to PRBS31.
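The properties that make PRBS patterns stressful (near-balanced content, long runs of identical bits, and a long repetition period) can be checked with a small linear feedback shift register model. The sketch below assumes the commonly used generator polynomials for each PRBS order; the source does not state which polynomials or polarity the Intel cores use, and the names `PRBS_TAPS`, `prbs_bits` and `longest_run` are introduced here for illustration.

```python
# Assumed feedback taps (x^a + x^b + 1) for each PRBS order.
PRBS_TAPS = {7: (7, 6), 9: (9, 5), 15: (15, 14), 23: (23, 18), 31: (31, 28)}


def prbs_bits(order, n_bits, seed=1):
    """Return n_bits of the PRBS of the given order as a list of 0/1 values."""
    t1, t2 = PRBS_TAPS[order]
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_bits):
        new_bit = ((state >> (t1 - 1)) ^ (state >> (t2 - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        out.append(new_bit)
    return out


def longest_run(bits, value):
    """Length of the longest run of consecutive `value` bits."""
    best = current = 0
    for b in bits:
        current = current + 1 if b == value else 0
        best = max(best, current)
    return best


period7 = 2 ** 7 - 1
two_periods = prbs_bits(7, 2 * period7)

print("PRBS7 repeats after", period7, "bits:",
      two_periods[:period7] == two_periods[period7:])
print("ones per period:", sum(two_periods[:period7]))
print("longest run of ones:", longest_run(two_periods, 1))
print("longest run of zeros:", longest_run(two_periods, 0))
```

The longest run grows with the PRBS order, so PRBS31 contains runs of up to 31 identical bits in this non-inverted model, which is what exercises the CDR and the receiver baseline most severely.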
[Figure 8 flowchart. Box labels, in approximate flow order:
Start; load the developed TTK design for the specified data rate on the FPGA; select the desired PRBS value; select the loopback mode as External; start the data transmission (TTK parameters at Intel-Altera default); record Eye Width/Height and BER.
Attenuate the received optical signal in steps using the VOA; while the receiver CDR is locked, record BER vs dBm at each attenuated step; once the signal is attenuated beyond the receiver CDR limit, plot BER vs the power in dBm (at Intel default) and reduce the optical power attenuation to zero using the VOA.
Set N = 1; scan each individual transceiver parameter over its full range using the auto-sweep feature of TTK (rest of the parameter values at the Intel-Altera default); record the best performing value of each transceiver parameter with regard to the eye diagram metrics; develop S.Matrix(Nth) with the best performing values.
Group the transmitter parameters and the receiver parameters; scan the transmitter parameters in range (-3 <= S.Matrix(Nth) <= 3) while the receiver parameters are at S.Matrix(Nth); scan the receiver parameters in range (-3 <= S.Matrix(Nth) <= 3) while the transmitter parameters are at S.Matrix(Nth); record the best performing values; update S.Matrix(Nth) with the latest values; set N = N+1; assign the S.Matrix(Nth) values to the TTK and run the data loopback transmission; record Eye Width/Height and BER at the Nth iteration.
Decision: is BER(Nth) < BER(Nth-1) and Eye W/H(Nth) > Eye W/H(Nth-1)? If no, tune the values of S.Matrix and repeat.
Decision: is BER(Nth+1) > BER(Nth) or Eye W/H(Nth+1) < Eye W/H(Nth)? If yes, over-compensation has occurred: S.Matrix(Nth) is the final Solution space and the optimized set (reject the over-compensated S.Matrix).
Assign the Solution space to the TTK; run the transmission in external loopback mode; attenuate the received optical signal in steps using the VOA and record BER vs dBm at each step until the receiver CDR limit; plot BER vs the power in dBm (at Solution space); plot the Intel default settings and the solution space on the multivariate spider chart; Stop.]
Figure 8: Stepwise flow diagram for the Transceiver Optimization. Data transmission is started with the Intel default parameters and a Solution
matrix is derived to achieve the optimized signal integrity.
7.1. Eye Diagram analysis
At the system startup, the transceiver parameters in
TTK are set at the default values. Changes in eye dia-
gram are compared for different PRBS stressed patterns
as the first set of analysis. Eye height and width are plotted
on a three-axis plot with the PRBS pattern on the third
axis, as shown in Figure 9. It is found that PRBS31 has
the most stressed eye metrics and, as anticipated, a more
closed eye is observed for all three link speeds.
[Figure 9 plots: three panels (10 Gbps, GBT 4.8 Gbps, TTC-PON 9.6 Gbps), each showing Eye Width and Eye Height against the PRBS pattern (PRBS 7, PRBS 9, PRBS 15, PRBS 23, PRBS 31).]
Figure 9: Changes in the Eye height and Eye width with PRBS varia-
tion for optical links at three line rates.
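One way to see why longer PRBS patterns stress the link more is to generate one and look at its longest run of identical bits. The sketch below is a minimal illustration that assumes the commonly used generator polynomials (for example x^7 + x^6 + 1 for PRBS7 and x^31 + x^28 + 1 for PRBS31). The function names `prbs_bits` and `longest_run` are hypothetical and not part of TTK.

```python
def prbs_bits(order, tap, count, seed=1):
    """Generate `count` bits from a Fibonacci LFSR with taps at `order` and `tap`."""
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(count):
        # Feedback is the XOR of the two tapped stages.
        new_bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        bits.append(new_bit)
    return bits


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    best = run = 0
    previous = None
    for bit in bits:
        run = run + 1 if bit == previous else 1
        best = max(best, run)
        previous = bit
    return best


# Two full periods of PRBS7 (period 2**7 - 1 = 127 bits).
prbs7 = prbs_bits(order=7, tap=6, count=2 * 127)
prbs7_longest = longest_run(prbs7)
```

A maximal-length sequence of order n repeats every 2^n - 1 bits and contains a run of n identical bits, so PRBS31 has far longer runs without transitions and far more low-frequency content than PRBS7. That is consistent with the more closed eye reported for PRBS31.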
7.2. BER Results
Another important metric of signal integrity is the BER. Its measurement
is statistical in nature, and the estimate is exact only if the number of
tested bits tends to infinity, which is not possible in a real lab test
setup. Hence, a method was proposed in reference [23] to limit the
stressing time of a system to a feasible length while still measuring the
BER with a high confidence level (CL). The CL, expressed as a percentage,
quantifies the quality of the estimate: it is the probability that the
actual error probability of the system is less than the specified limit.
The minimum number of bits that must be tested for a BER measurement with
a specific associated CL is given in equation 1:
\[
n = -\frac{\ln(1 - CL)}{BER} + \frac{\ln\left(\sum_{k=0}^{N} \frac{(n \cdot BER)^{k}}{k!}\right)}{BER},
\qquad T = \frac{n}{R} \tag{1}
\]
Here n is the total number of bits transmitted, N is the number of errors
that occurred during the transmission, T is the test time needed and R is
the line rate. When N = 0 the second term vanishes and the solution is the
closed form given in equation 2:
\[
n = -\frac{\ln(1 - CL)}{BER} \tag{2}
\]
Equation 1 expresses the compromise between the testing time and the
required accuracy of the measurement.
For a 95 percent CL, equation 2 reduces to $n \approx 3/BER$, since
$-\ln(0.05) \approx 3$. Hence, as a rule of thumb, a total of
$3 \times 10^{12}$ bits need to be tested to establish a BER of $10^{-12}$
at 95 percent CL.
7.2.1. BER analysis for various link speeds
The concept is further extended to find the minimum inspection time
required to measure a BER of $10^{-12}$ at different CL with no errors for
the GBT, TTC-PON and 10 Gbps links, as shown in Figure 10. In this paper,
all BER measurements are made over $3 \times 10^{12}$ bits to achieve a
95 percent CL.
[Figure 10 plot: test time needed (seconds, 0 to 2500) versus line rate (Gbps, 2 to 12) for CL = 0.90, 0.95 and 0.99. Marked rates: TTC-PON upstream 2.4 Gbps, GBT 4.8 Gbps, TTC-PON downstream 9.6 Gbps, 10G 10.3125 Gbps.]
Figure 10: Time to achieve a BER of $10^{-12}$ at the line rates of the GBT,
TTC-PON and 10 Gbps optical links for different CL.
The variation of BER with the Intel-default transceiver settings is
recorded with respect to the attenuation of the received optical power,
following the methodology flowchart shown in Figure 8. This test is
executed with the help of a VOA attached to the loopback fibre. The BER
variation is recorded for different PRBS patterns and plotted for the
links operating at 10 Gbps, 4.8 Gbps and 9.6 Gbps, as shown in Figure 11.
[Figure 11 plots: BER (log scale) versus received optical power (dBm), one panel per line rate. Goodness-of-fit R² per PRBS pattern:]
Line rate          PRBS7   PRBS9   PRBS15   PRBS23   PRBS31
10 Gbps            0.99    0.98    0.98     0.99     0.99
GBT 4.8 Gbps       0.98    0.90    0.98     0.99     0.96
TTC-PON 9.6 Gbps   0.99    0.99    0.99     0.98     0.96
Figure 11: BER versus received optical power (dBm) for the transceiver
at Intel FPGA default settings, for different PRBS patterns at three
line rates.
Exponential curve fitting is the best-suited approximation for the BER in
the logarithmic domain [24]. A double-exponential fit function with
constants is used to fit the BER data, as it provides close fits in a
variety of BER plot situations. The fit is obtained by unconstrained
nonlinear optimization [25]. The goodness-of-fit statistic, R-square (R²),
is marked for each PRBS pattern in Figure 11.
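The R² statistic quoted for each fit is the usual coefficient of determination, which can be computed as in the sketch below. The exact form of the fit function is not given in the text. The two-term exponential `two_term_exp` is therefore an assumption, and both function names are hypothetical. The optimization step that finds the constants is not shown.

```python
import math


def two_term_exp(x, a, b, c, d):
    """Assumed double-exponential model: a*exp(b*x) + c*exp(d*x)."""
    return a * math.exp(b * x) + c * math.exp(d * x)


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Sanity checks on made-up numbers: a perfect fit scores 1, and a flat
# line through the mean of the observations scores 0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
flat = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Here `observed` would hold the measured log-scale BER values at each attenuation step and `fitted` the model values at the same received powers.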
The test shown in Figure 11 highlights that, at a specified CL and for a
given received optical power, more errors are received when PRBS31 is
injected as the test data pattern than with the other PRBS patterns. The
outcomes of the tests shown in Figure 9 and Figure 11 reveal that the
metrics of signal integrity degrade as the size of the unique word of
data in the PRBS sequence increases. These results are as anticipated and
well substantiated. They further strengthen the case for PRBS31 as a
strenuous test pattern for validating the proposed methodology. There is,
however, a crossover point for 4.8 Gbps at BER ~ $10^{-10}$. It is left
out of the discussion because our region of interest, BER ~ $10^{-12}$,
lies two orders of magnitude lower.
7.3. Improvement in Transmission
The improvement in the system performance is marked by two metrics of
signal integrity, viz. the BER and the eye diagram. The eye contour at the
Intel-default settings and at the deduced optimized settings of the
transceiver is captured using EyeQ (a GUI feature of TTK). It helps to
estimate and visualize the vertical and horizontal eye opening at the
receiver, as shown in Figure 12. After the transceiver parameter settings
deduced with the proposed technique are applied, there is a notable
increase in the width (Horizontal Phase Step) and height (Vertical Step)
of the eye diagram. Hence the quality of signal transmission is improved.
The optimized values of the transceiver parameters, known as the solution
space and found with the proposed methodology for the targeted BER of
$10^{-12}$, are plotted against the Intel-default set in the form of a
multivariate kiviat diagram for all three link speeds, as given in
Figure 13. This allows a clear comparison of the individual parameters,
one on each axis.
The variation in BER is plotted for the deduced solution-space values of
the transceiver and for the Intel-default set, against the different
attenuation levels of the input optical power at the receiver. It is
shown for PRBS31 for all three links under observation in Figure 14.
From the curves in Figure 14, the least optical power required at the
receiver to attain a preferred BER or better can be determined. The
curves also show that a specific marked BER is achieved at a lower
optical power when the transceiver is operated at the deduced parameter
values listed in the solution space than at the Intel-default set. As a
particular example, the targeted BER of $10^{-12}$ for the optical link
test as per IEEE standards is achieved at lower values of the optical
power, and the improvement at this BER is listed quantitatively in
Table 4 for the three link speeds.
[Figure 12 eye diagrams: eye opening captured at the default and the optimized settings for each line rate.]
Line rate   Settings             Vertical step   Horizontal phase step
10 Gbps     Intel FPGA default   19              41
10 Gbps     Optimized            49              54
4.8 Gbps    Intel FPGA default   28              59
4.8 Gbps    Optimized            63              63
9.6 Gbps    Intel FPGA default   18              43
9.6 Gbps    Optimized            41              50
Figure 12: Eye diagram at the Intel FPGA default and at the optimized
settings of the transceiver.
Table 4: Comparison of optical power (dBm) to attain a BER of $10^{-12}$
for the three high-speed interface links.
Protocol        With default     With optimization   Difference   Improvement
                approach (dBm)   technique (dBm)     (dBm)        (Percentage)
10Gb Ethernet   -9.2             -10.35              -1.15        12.5
GBT             -11.9            -12.7               -0.8         6.7
TTC-PON         -6.45            -9.3                -2.85        44.1
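The Difference and Improvement columns of Table 4 appear to follow from the two measured powers as sketched below. The assumption is that the percentage is the difference relative to the magnitude of the default dBm value, which reproduces the tabulated figures to within rounding. The function name `improvement` is hypothetical. The last returned value, the same difference expressed as a linear power ratio, is not part of the table and is included only for reference.

```python
def improvement(default_dbm, optimized_dbm):
    """Return (difference in dB, percentage of default dBm value, linear power ratio)."""
    difference = optimized_dbm - default_dbm
    percent = abs(difference) / abs(default_dbm) * 100.0
    linear_ratio = 10 ** (abs(difference) / 10.0)
    return difference, percent, linear_ratio


# Values taken from Table 4.
ethernet = improvement(-9.2, -10.35)
gbt = improvement(-11.9, -12.7)
ttc_pon = improvement(-6.45, -9.3)
```

For 10Gb Ethernet this gives a difference of -1.15, an improvement of 12.5 percent and a linear power ratio of about 1.30.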
[Figure 13 kiviat diagrams, one per line rate (10 Gbps, GBT 4.8 Gbps, TTC-PON 9.6 Gbps), each comparing the Solution Space with the Intel-Default set on ten axes: VOD Control, Pre-emphasis 1st Post-Tap, Pre-emphasis 1st Pre-Tap, Pre-emphasis 2nd Post-Tap, Pre-emphasis 2nd Pre-Tap, DC Gain, Equalization Control, VGA, Eye Height and Eye Width.]
Figure 13: Multivariate kiviat diagram showing the solution space and
the Intel FPGA default values for three different link rates.
[Figure 14 plots: BER (log scale) versus received optical power (dBm) for default and optimized settings, one panel per line rate. Goodness-of-fit R²:]
Line rate          Default settings   Optimized settings
10 Gbps            0.99               0.99
GBT 4.8 Gbps       0.99               0.98
TTC-PON 9.6 Gbps   0.96               0.99
Figure 14: Comparison of BER versus the received optical power
for default and optimized transceiver settings, shown separately for
three line rates.
Another clear observation that emerges from the data comparison in
Figure 14 is that the receiver sensitivity, below which loss of lock
occurs, is enhanced owing to the reduction in high-frequency losses when
the proposed optimization technique is applied. This lowers the limit of
the optical power required for proper CDR, so the signal remains
traceable at comparatively lower values of the received optical power.
The quantitative comparisons are given in Table 5.
Table 5: Comparison of optical power for CDR for the three high-speed
interface links.
Protocol        With default       With optimization   Difference   Improvement
                parameters (dBm)   technique (dBm)     (dBm)        (Percentage)
10Gb Ethernet   -14.4              -15                 -0.6         4.17
GBT             -15.34             -16.04              -0.7         4.56
TTC-PON         -11.78             -13.2               -1.42        12.05
The test results shown in Figures 13 and 14 confirm that the effect of
high-frequency losses on the link performance is controlled. This is
achieved once the deduced solution-space values are applied to the TTK,
and a significant improvement in the BER is noted at a given received
optical power. The tests and results validate the usefulness of the
proposed technique for enhancing the transceiver performance and the
signal integrity by compensating for the high-frequency losses.
8. Summary
We have presented a novel transceiver optimization technique to reduce
the high-frequency losses that arise from the increased data transmission
rates in HEP experiments. The technique has been implemented on the
latest 20nm Intel-Altera Arria-10 FPGA. The scheme has been tested and
validated at the link rates of three high-speed communication protocols,
GBT, TTC-PON and 10 Gbps Ethernet, which are most commonly used for
interfacing the detector front-end electronics, trigger and DAQ systems.
The proposed scheme is an optimized approach that reduces the number of
iterations required.
The tests are performed with the PRBS31 pattern at a confidence level of
95 percent. There is a considerable gain in the system performance with
the application of the proposed technique, as indicated by the two
metrics of signal integrity, the BER and the eye diagram. The Intel FPGA
default parameters and the solution-space values are marked on the kiviat
diagram for a quick comparison between the parameters. The results show
that, to attain the marked BER of $10^{-12}$, the required optical power
is reduced by 12.5%, 6.7% and 44.1% for 10 Gbps, GBT and TTC-PON
respectively. The BER is also improved over the received range of optical
power. The CDR capability of the system is also enhanced, as the least
optical power required to recover the data traffic is reduced by 4.17%,
4.56% and 12.05% for 10 Gbps, GBT and TTC-PON respectively. The technique
improves the signal integrity and reduces the BER. It is a heuristic
solution and has potential for practical applications, as it provides
rapid convergence of the solution space to optimized transceiver
settings, which makes its implementation time efficient. The transceiver
optimization technique and its implementation approach would lend
themselves well to users of other FPGAs that allow on-chip assessment of
signal quality, such as the eye diagram.
Acknowledgement
The authors gratefully acknowledge the support of
the ALICE Collaboration at CERN during the period
of the research work. We thank Alex Kluge, Tivadar
Kiss, Erno David of the ALICE Electronics coordina-
tion and the CRU project for their valuable help and ad-
vice. We thank Subhasis Chattopadhyay, Anurag Misra
and Saurabh Srivastava for fruitful suggestions during
the preparation of the manuscript.
References
[1] D. E. Morrissey, T. Plehn, T. M. Tait, Physics searches at the
LHC, Physics Reports 515 (1-2) (2012) 1–113.
[2] W. K. Panofsky, Evolution of particle accelerators, SLAC Beam
Line 27 (1997) 36–44.
[3] W. Smith, Trigger and data acquisition for hadron colliders at
the energy frontier (2013), arXiv preprint arXiv:1307.0706.
[4] S. A. Khan, J. Mitra, E. David, T. Kiss, T. K. Nayak, A po-
tent approach for the development of FPGA based DAQ sys-
tem for HEP experiments, Journal of Instrumentation 12 (2017)
T10010.
[5] J. Toledo, F. Mora, H. Müller, Past, present and future of data
acquisition systems in high energy physics experiments, Microprocessors
and Microsystems 27 (2003) 353–358.
[6] J. Mitra, S. A. Khan, et al., Common readout unit (CRU)-a new
readout architecture for the ALICE experiment, Journal of In-
strumentation 11 (2016) C03021.
[7] Gutiérrez, et al., The ALICE TPC readout control unit, in:
IEEE Nuclear Science Symposium Conference Record, Vol. 1, 2005, p. 575.
[8] S. A. Khan, J. Mitra, T. K. Nayak, Development of a high
speed data acquisition system for the detectors at high luminos-
ity LHC, in: Proceedings of the XXII DAE High Energy Physics
Symposium, Springer, 2018, p. 223.
[9] B. Razavi, Challenges in the design of high-speed clock and data
recovery circuits, IEEE Communications Magazine 40 (2002)
94–101.
[10] S. H. Hall, H. L. Heck, Advanced signal integrity for high-speed
digital designs, John Wiley & Sons, 2011.
[11] I. Altera, Intel Arria 10 Device Overview (2018).
[12] I. Altera, Quartus Prime Standard Edition Handbook Volume 1:
Design and Synthesis (2017).
[13] G. F. Knoll, Radiation detection and measurement, John Wiley
& Sons, 2010.
[14] L. Li, A. M. Wyrwicz, Parallel 2D FFT implementation on
FPGA suitable for real-time MR image processing, Review of
Scientific Instruments 89 (9) (2018) 093706.
[15] S. G. Castillo, K. B. Ozanyan, Field-programmable data acqui-
sition and processing channel for optical tomography systems,
Review of scientific instruments 76 (9) (2005) 095109.
[16] Intel Corporation, Arria 10 FPGA Development Kit User Guide
(2017).
[17] M. B. Marin, S. Baron, et al., The GBT-FPGA core: features
and challenges, Journal of Instrumentation 10 (2015) C03021.
[18] E. Mendes, S. Baron, D. Kolotouros, C. Soos, F. Vasey, The
10G TTC-PON: challenges, solutions and performance, Journal
of Instrumentation 12 (2017) C02041.
[19] I. Altera, High-Speed Link Tuning Using Signal Conditioning
Circuitry in Stratix V Transceivers (2015).
[20] S. Committee, et al., SFF-8431 Specifications for Enhanced
Small Form Factor Pluggable Module SFP+, revision 4.1, July
6, 2009.
[21] S. I. Green, Multichannel bit error rate tester for fiber op-
tic transceiver testing, Review of scientific instruments 73 (8)
(2002) 3125–3127.
[22] H. Badaoui, Y. Frignac, P. Ramantanis, B. E. Benkelfat,
M. Feham, PRQS Sequences Characteristics Analysis by Auto-
correlation Function and Statistical Properties, IJCSI (2010) 39.
[23] D. Mitić, A. Lebl, Ž. Markov, Calculating the required number
of bits in the function of confidence level and error probability
estimation, Serbian Journal of Electrical Engineering 9 (2012)
361–375.
[24] L. J. Ippolito, Appendix B: Error functions and bit error rate,
Satellite Communications Systems Engineering: Atmospheric
Effects, Satellite Link Design and System Performance, pp. 363–366.
[25] S. Chapra, R. P. Canale, Numerical Methods for Engineers: With
Personal Computer Applications.
