An FPGA-based Emulation of the G-Link Chip-Set for the ATLAS Level-1 Barrel Muon Trigger by Aloisio, A et al.
A. Aloisio a,b, F. Cevenini a,b, R. Giordano a,b, V. Izzo a
a INFN - Sezione di Napoli - Via Cintia, 80126, Napoli, Italy







Many High Energy Physics experiments have based their serial links
on the Agilent HDMP-1032/34A serializer/deserializer chip-set
(or GLink). Its success was mainly due to the ability of this pair
of chips to transfer data at ∼1 Gb/s with a deterministic
latency, fixed after each power-up or reset of the link. Despite
this unique timing feature, Agilent discontinued production
and no compatible commercial off-the-shelf chip-set is
available. The ATLAS Level-1 Muon trigger includes serial
links based on GLink to transfer data from the detector
to the counting room. The transmission side of the links will
not be upgraded; however, replacements are needed for the
receivers in the counting room in case of failures.
In this paper, we present a solution to replace GLink
transmitters and/or receivers. Our design is based on the
gigabit serial I/O (GTP) embedded in a Xilinx Virtex 5 Field
Programmable Gate Array (FPGA). We present the architecture and
discuss implementation parameters such as latency
and resource occupation. We compare the GLink chip-set and
the GTP-based emulator in terms of latency, eye diagram and
power dissipation.
I. INTRODUCTION
Trigger systems of High Energy Physics (HEP) experiments
need data transfers to be executed with fixed latency in order to
preserve timing information. This requirement is not
necessarily satisfied by Serializer-Deserializer (SerDes) chip-sets,
whose latency can vary by an integer number of Unit Intervals
(UIs) and/or of clock cycles of the parallel domain. For
instance, the TLK2711A [1] exhibits latency variations of up to
31 UIs on the receiver data-path. The Gigabit link, or GLink,
chip-set [2], produced by Agilent, was able to transfer data at
rates up to 1 Gb/s with a fixed latency, even after a power cycle
or a loss of lock. Serial links of data acquisition systems of
HEP experiments have often been based on the GLink chip-set.
For instance, it has been deployed in the ALICE, ATLAS,
BaBar [3], CDF, CMS, D0 and NEMO [4] experiments, to cite
just some of them. The chip-set became so widely used that
CERN produced a radiation-hard serializer compatible with
it [5]. Unfortunately, a few years ago Agilent discontinued
production of the chip-set, and users needing replacements are
looking for alternative solutions.
Recent FPGAs include embedded multi-Gigabit SerDes, which
offer a wide variety of configurable features. Integrating such
a device in the FPGA brings benefits in terms of power
consumption, size, board layout complexity, cost and
re-programmability. The Level-1 Barrel Muon Trigger of the
ATLAS experiment includes GLink serial links to transfer data
from the detector to the counting room. The transmission side
of the links is on-detector and is unlikely to be upgraded;
however, replacements are needed for the receivers in the
counting room in case of failures. We developed a replacement
solution for GLink transmitters and receivers, based on the
gigabit serial I/O (GTP) embedded in the Xilinx Virtex 5 Field
Programmable Gate Array (FPGA). Our solution preserves the
fixed-latency feature of the original chip-set.
In the coming sections we introduce the present L1 Barrel
Muon Trigger and the GLink chip-set, then we describe the
architecture and the implementation of our design. Finally,
we present test results for our emulator and compare them
with the GLink chip-set.
II. ATLAS BARREL MUON TRIGGER AND DAQ
The ATLAS detector [6] is installed at one of the four
beam-crossing sites of the Large Hadron Collider (LHC) at CERN.
The detector has a cylindrical symmetry and is centered on
the interaction point. ATLAS consists of several subsystems,
among them a muon spectrometer, which in the barrel region
is built in the loops of an air-core toroidal magnet
and includes Resistive Plate Chambers (RPCs). The RPCs are
arranged in towers used for the Level-1 (L1) muon trigger
(Fig. 1). The spectrometer is divided into two halves along the
axis, and each half is in turn divided into 16 sectors. A physical
sector is segmented into two trigger sectors, each including
6 or 7 RPC towers.
The whole trigger system is implemented as a synchronous
pipeline, with a total latency of 2.0 μs, clocked by the Timing,
Trigger and Control (TTC) system [7] of the LHC. The TTC
distributes timing information such as the bunch crossing clock
(at about 40 MHz) and the L1 trigger.
The read-out and trigger electronics of the barrel muon
spectrometer includes an on-detector part and an off-detector
part. A board on the detector, the PAD [8], transfers data to a
Versa Module Eurocard (VME) board in the counting room, the
Sector Logic/RX (SL/RX) [9], via an 800 Mb/s serial link based
on the GLink chip-set. Each SL/RX board includes 8 GLink
receivers and two FPGAs, which handle the received data and the
communication with other off-detector boards.
Figure 1: Left: Cross section of the ATLAS muon spectrometer. Right: Level-1 Trigger and DAQ for the spectrometer.
During the trigger decision, data are stored by the
on-detector electronics. If the event is validated, an L1 accept
signal is broadcast to the PADs, which transfer data to the SL/RX.
The SL/RX board, in turn, sends data to other VME boards for
further processing and storage. More information about the
ATLAS barrel muon trigger and Data Acquisition (DAQ) can be
found in [10].
III. THE GLINK CHIP-SET
The GLink chip-set consists of a serializer (HDMP-1032A)
and a deserializer (HDMP-1034A). The chips work with data
rates up to 1 Gb/s and encode data according to the Conditional
Inversion Master Transition (CIMT) protocol. In order to read
serial data, the receiver extracts a clock from the CIMT stream
and locks its phase to the master transition. The recovered clock
synchronizes all the internal operations of the receiver and is
available as an output. Received data are transferred out of the
device synchronously with the recovered clock, and the chip-set
architecture is such that the overall link latency is deterministic.
Moreover, by means of the dedicated Parallel Automatic
Synchronization System (PASS), it is also possible to output data
synchronously with a local receiver clock, provided that it has a
constant phase relationship with the transmission clock (as
happens in the ATLAS L1 barrel muon trigger, which is clocked
by the LHC machine clock).
We now briefly introduce the CIMT encoding protocol. A
CIMT stream is a sequence of 20-bit words, each containing 16
data bits (D-Field) and 4 control bits (C-Field). The C-Field
flags each word as a data word, a control word or an idle word.
Idle words are used to synchronize the link at start-up
and to keep it phase-locked when no data or control words are
transmitted. The protocol guarantees a transition in the
middle of the C-Field, and the receiver checks for this transition
in received data in order to perform word alignment and to
detect errors. Two encoding modes are available: one compatible
with older chip-sets and an enhanced one, which is more
robust against incorrect word alignments. The DC balance of the
link is ensured by sending each word either inverted or unaltered,
so as to minimize the bit disparity, defined as the difference
between the total numbers of transmitted 1s and 0s. By reading
the C-Field content, the receiver is able to determine whether the
payload is inverted or not and to restore its original form.
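The conditional inversion described above can be illustrated with a short Python sketch. It is only a toy model under simplifying assumptions: the class name `ConditionalInverter`, the single `inverted` flag standing in for the C-Field, and the choice to track the disparity of the 16 data bits only are ours. The actual C-Field codes of the chip-set are not reproduced here; see the datasheet [2].

```python
DATA_BITS = 16
DATA_MASK = (1 << DATA_BITS) - 1


def disparity(word: int) -> int:
    """Number of 1s minus number of 0s in a 16-bit word."""
    ones = bin(word & DATA_MASK).count("1")
    return ones - (DATA_BITS - ones)


class ConditionalInverter:
    """Toy DC-balancing encoder (hypothetical, not the real CIMT C-Field)."""

    def __init__(self) -> None:
        self.running_disparity = 0  # 1s minus 0s sent so far (data bits only)

    def encode(self, data: int) -> tuple[int, bool]:
        """Return (payload, inverted); invert when that reduces the imbalance."""
        d = disparity(data)
        # Same sign: the unaltered word would push the imbalance further.
        inverted = self.running_disparity * d > 0
        payload = (data ^ DATA_MASK) if inverted else data
        self.running_disparity += -d if inverted else d
        return payload, inverted


def decode(payload: int, inverted: bool) -> int:
    """Restore the original word from the payload and the inversion flag."""
    return (payload ^ DATA_MASK) if inverted else payload


if __name__ == "__main__":
    enc = ConditionalInverter()
    for word in (0xFFFF, 0xFFFF, 0xFFF0, 0x0001):
        payload, inverted = enc.encode(word)
        assert decode(payload, inverted) == word
        print(f"{word:04x} -> {payload:04x} inverted={inverted} "
              f"running disparity={enc.running_disparity}")
```

With this rule the running disparity of the data bits never exceeds 16 in absolute value, whatever the payload, which is the property that keeps the link DC-balanced.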
IV. GLINK EMULATION
We built our GLink emulator around the Xilinx GTP
transceiver [11], embedded in Virtex 5 [12] FPGAs. Other
FPGA vendors offer embedded SerDes, for instance Altera with
the GX and Lattice with the flexiPCS. However, the
fixed-latency characteristic of our emulator relies heavily on some
hardware features of the GTP. For a discussion of the
possibility of implementing a fixed-latency link with FPGA-embedded
SerDes, see [13].
A. Architecture
The GTP can serialize and de-serialize 8-, 10-, 16- and 20-bit
words. We configured it to work with 20-bit CIMT-encoded
words at 40 MHz, in order to achieve an 800 Mb/s link. The
receiver clock has an unknown, but fixed, phase offset with respect
to the transmitter clock. In order to transfer data with minimum
latency, the GTP allows its internal elastic buffers to be bypassed,
one being in the data-path of the transmitter and the other in the
data-path of the receiver. When a buffer is bypassed, all phase
differences must be resolved between the external parallel clock
domain and a clock domain internal to the device. We set up
the transmitter to work without the elastic buffer, while we left
two options for the receiver: the first (Configuration1) bypasses
the buffer and has a lower latency, but puts some constraints on
the relative phase between the transmission and reception clocks;
the second (Configuration2) has no phase constraint, but a
higher latency.
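The link parameters quoted above follow from simple arithmetic, which the Python sketch below spells out. The constant and function names are ours, and the clock is taken at its nominal value of 40 MHz (the LHC bunch-crossing clock is only "about 40 MHz").

```python
PARALLEL_CLOCK_HZ = 40_000_000  # nominal parallel clock, one word per cycle
WORD_BITS = 20                  # CIMT word: D-Field + C-Field
DATA_BITS = 16                  # D-Field only


def line_rate_bps(word_bits: int = WORD_BITS,
                  clock_hz: int = PARALLEL_CLOCK_HZ) -> int:
    """Serial line rate: bits per word times words per second."""
    return word_bits * clock_hz


def unit_interval_ns(rate_bps: int) -> float:
    """Duration of one serial bit (Unit Interval) in nanoseconds."""
    return 1e9 / rate_bps


line_rate = line_rate_bps()                            # 20 bit x 40 MHz
payload_rate = line_rate_bps(word_bits=DATA_BITS)      # 16 bit x 40 MHz
print(f"line rate    : {line_rate / 1e6:.0f} Mb/s")
print(f"payload rate : {payload_rate / 1e6:.0f} Mb/s")
print(f"unit interval: {unit_interval_ns(line_rate):.2f} ns")
```

At this rate one UI lasts 1.25 ns, so a latency uncertainty of a few UIs on the serial side is already a sizeable fraction of the 25 ns parallel clock period.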
Figure 2: Simplified block diagram of the emulator.
On the transmitter, a phase control logic instructs the GTP
to align the phase of the internal clock to the transmission clock
and asserts the Ready signal when done. Dedicated logic
encodes incoming 16-bit words into 20-bit CIMT words and
transfers them to the GTP (Fig. 2). The encoder is able to send data,
control or idle words and supports an input flag bit, exactly like
the original chip-set.
On the receiver side, when working in Configuration1, the
phase align and control logic checks whether or not it is
possible to retrieve data from the link with the assigned parallel
clock phase. If it is not possible, the phase must be changed
either in the FPGA or outside. In Configuration2 every phase
offset is legal, therefore no checks are performed. In order to
align received data to the correct word boundary, we added a
CIMT decoder and a word align control logic to the GTP. The
decoder checks the C-Field of incoming CIMT words and, if it
is not valid, flags an error to the word align control logic. When
errors are found, the logic activates the shifter inside the GTP,
changing the word boundary alignment of parallel data. If no
errors are found for a defined number of clock cycles, the align
control logic assumes that parallel data are correctly aligned and
asserts the Aligned signal. The decoder determines whether the
received word is an idle, a control or a data word, extracts the
status of the flag and activates the corresponding outputs.
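The word alignment procedure described above can be modelled in software as a small loop: check the C-Field, slip by one bit on error, and declare alignment after enough consecutive valid words. The Python sketch below is such a model, not the HDL of the emulator. The C-Field pattern `C_FIELD`, the threshold `GOOD_WORDS_TO_LOCK`, the bit ordering and all function names are hypothetical choices made for illustration.

```python
import random

WORD_BITS = 20
DATA_BITS = 16
C_FIELD = (1, 1, 0, 1)     # hypothetical pattern, not an actual CIMT code
GOOD_WORDS_TO_LOCK = 32    # hypothetical "defined number of clock cycles"


def serialize(words, skipped_bits):
    """Build a bit list (D-Field then C-Field for each word) and drop the
    first `skipped_bits` bits, modelling an unknown start-up word phase."""
    bits = []
    for word in words:
        bits.extend((word >> i) & 1 for i in range(DATA_BITS))
        bits.extend(C_FIELD)
    return bits[skipped_bits:]


def align(bits):
    """Slip one bit per C-Field error until GOOD_WORDS_TO_LOCK consecutive
    words are valid. Returns (slips, word_phase), or None if never aligned."""
    pos = slips = good = 0
    while pos + WORD_BITS <= len(bits):
        c_field = tuple(bits[pos + DATA_BITS:pos + WORD_BITS])
        if c_field == C_FIELD:
            good += 1
            if good == GOOD_WORDS_TO_LOCK:
                return slips, pos % WORD_BITS  # "Aligned" would be asserted
        else:
            good = 0
            slips += 1
            pos += 1        # shifter: move the word boundary by one bit
        pos += WORD_BITS    # move on to the next parallel word
    return None


rng = random.Random(0)
words = [rng.getrandbits(DATA_BITS) for _ in range(400)]
result = align(serialize(words, skipped_bits=7))
print("slips, word phase:", result)
```

A wrong boundary can occasionally show a valid-looking C-Field by chance, which is why the lock is declared only after a run of consecutive valid words rather than after the first one.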
For completeness, we note that our emulator supports all the
CIMT encoding modes of the HDMP-1032/34A chip-set, but not
the 20/21-bit modes of the older HDMP-1022/24.
B. Physical Implementation
A full-duplex emulator (transmitter and receiver) requires
around 500 Look Up Tables (LUTs) and 400 Flip Flops (FFs),
which is about 3% of the logic resources available in a Xilinx
Virtex 5 LX50T FPGA (Table 1). Such a small resource
requirement will allow us to integrate all eight GLink receivers
of the SL/RX board in the FPGA, with an impact of just 6% on
the fabric resources.
The latencies of the transmitter and the receiver are 6.75 and
5.75 parallel clock cycles, respectively (6.75 for the receiver in
Configuration2). Details of the contributions of the internal
blocks are given in Table 2. For each component we report the
latency both in clock cycles and as an absolute value. For
comparison, we recall that the latencies of the GLink transmitter
and receiver are 1.4 and 3.0 parallel clock cycles, respectively.
Hence, our emulator has a higher latency than the original
chip-set; however, this is not an issue for our application.
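Cycle counts convert to absolute times through the 25 ns period of the 40 MHz parallel clock. As a cross-check, the Python sketch below sums the per-block latencies listed in Table 2 and converts the totals to nanoseconds. The dictionary layout and function names are ours, and the clock is taken at its nominal frequency.

```python
CLOCK_PERIOD_NS = 25.0  # nominal 40 MHz parallel clock

# Per-block latencies in parallel clock cycles, as listed in Table 2.
TRANSMITTER = {"encoding (fabric)": 4.5, "GTP": 2.25}
RECEIVER = {"GTP": 4.75, "decoding (fabric)": 1.0}


def total_cycles(blocks: dict) -> float:
    """Sum the latency contributions of the blocks of one data-path."""
    return sum(blocks.values())


def to_ns(cycles: float) -> float:
    """Convert parallel clock cycles to nanoseconds."""
    return cycles * CLOCK_PERIOD_NS


tx_cycles = total_cycles(TRANSMITTER)
rx_cycles = total_cycles(RECEIVER)
link_cycles = tx_cycles + rx_cycles

for name, cycles in (("transmitter", tx_cycles),
                     ("receiver", rx_cycles),
                     ("link", link_cycles)):
    print(f"{name:12s}: {cycles:5.2f} cycles = {to_ns(cycles):7.2f} ns")
```

The sums reproduce the 168.75 ns, 143.75 ns and 312.5 ns totals of Table 2, that is 6.75, 5.75 and 12.5 cycles for the transmitter, the receiver and the whole link.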
We note that a GLink receiver dissipates ∼800 mW and a
transmitter ∼700 mW (typical at 1 Gb/s). Each GTP pair
(transmitter and receiver) dissipates ∼300 mW (typical at 3 Gb/s),
hence the power dissipation of the emulator is lower than that
of the original chip-set.
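To give a feeling for the scale of the saving, the Python sketch below applies the typical figures quoted above to a board with eight receivers. It is a rough estimate under our own assumptions: each emulated receiver is charged with the dissipation of a full GTP pair at 3 Gb/s, which should overestimate an 800 Mb/s receive-only channel, and the power of the FPGA fabric logic is ignored. All names are ours.

```python
GLINK_RX_MW = 800   # typical GLink receiver dissipation at 1 Gb/s
GLINK_TX_MW = 700   # typical GLink transmitter dissipation at 1 Gb/s
GTP_PAIR_MW = 300   # typical GTP transmitter + receiver at 3 Gb/s
N_RECEIVERS = 8     # receivers hosted by one board


def total_mw(per_channel_mw: int, channels: int = N_RECEIVERS) -> int:
    """Total dissipation for a number of identical channels."""
    return per_channel_mw * channels


glink_pair_mw = GLINK_RX_MW + GLINK_TX_MW
glink_receivers_mw = total_mw(GLINK_RX_MW)
gtp_upper_bound_mw = total_mw(GTP_PAIR_MW)  # assumption: full pair per receiver

print(f"one GLink pair      : {glink_pair_mw} mW vs one GTP pair: {GTP_PAIR_MW} mW")
print(f"8 GLink receivers   : {glink_receivers_mw / 1000:.1f} W")
print(f"8 GTP pairs (bound) : {gtp_upper_bound_mw / 1000:.1f} W")
```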
Table 1: Resources used by an implementation of a GLink
transmitter/receiver in a Xilinx Virtex 5 LX50T.
Resource    Occupied    Percentage    Available
LUTs        651         2.3 %         28,800
Registers   408         1.4 %         28,800
Slices      265         3.7 %         7,200
DCMs        2           17 %          12
GTPs        1           8.3 %         12










Table 2: Latency of the emulator, in parallel clock cycles and in ns.
Component                          Clock cycles    Latency (ns)
Transmitter
Total Encoding Latency (fabric)    4.5             112.5
Total GTP Latency                  2.25            56.25
Total Transmitter Latency          6.75            168.75
Receiver
Total GTP Latency                  4.75            118.75
Total Decoding Latency (fabric)    1               25
Total Receiver Latency             5.75            143.75
Total Link Latency                 12.5            312.5
V. TEST RESULTS
Figure 3: Eye diagram comparison between GLink and the GTP.
In order to test our link, we used two off-the-shelf
boards [14] built around a Virtex 5 LX50T FPGA. The boards
route the serial I/O pins of one of the GTPs of the FPGA to
SubMiniature version A (SMA) connectors. We connected the
transmitter and receiver GTPs with a pair of 50 Ω coaxial
cables, each with a 5 ns delay. Transmitted and received payloads
were available on single-ended test points as well as on
Low-Voltage Differential Signaling (LVDS) outputs, and were
monitored with an oscilloscope to observe latency variations. We
used a dual-channel clock generator providing two 40-MHz clock
outputs with a fixed phase offset. In this way, we emulated the
TTC system of the ATLAS experiment, which is used to clock
data into and out of the link.
We checked that our emulator correctly transmits data to an
Agilent GLink receiver chip and correctly receives data from an
Agilent GLink transmitter chip. For this test, we used an ML505
board and a custom board hosting a GLink transmitter and a
receiver. The test showed that the emulator correctly exchanges
data with a GLink chip in both CIMT encoding modes supported
by the HDMP-1032/34A chip-set.
We present an eye diagram comparison between the Agilent
GLink transmitter and the GTP (Fig. 3). We fed the transmitters
with the same payload, a sequence of pseudo-random 16-bit words.
We probed the signal on the positive line of the differential pair,
at the far end of a 5 ns, 50 Ω coaxial cable. Between the
transmitter and the cable there was a 10 nF decoupling capacitor.
We terminated the negative line on its characteristic impedance to
keep the differential driver balanced. We note that the GLink
eye is 50 ps wider than the GTP's. Although the GTP has a smaller
voltage swing (400 mV) than GLink (600 mV), GLink has rise and
fall times that are around 30% and 15% shorter, respectively.
The timing jitter on the GTP's edges is ∼210 ps, while for the
Agilent transmitter it is ∼180 ps. This difference could be due
to the fact that, for GLink, generating the high-speed serial
clock from the 40-MHz oscillator requires only the internal Phase
Locked Loop (PLL). In our clocking scheme for the GTP, instead,
we used a Delay Locked Loop (DLL) of the FPGA to multiply the
40-MHz clock and obtain the 80-MHz clock. Therefore, the total
jitter on the transmitted serial stream includes the jitter
contributions of both the PLL and the DLL. Moreover, we used a
single-ended oscillator to source the PLL of the GTP, while the
User Guide recommends using a differential oscillator.
We performed Bit Error Ratio (BER) measurements on the
link implemented with our emulator. We used a custom
Bit Error Ratio Tester (BERT) [15], which checks the received
payload against a local copy and flags an error whenever they
differ. More than $10^{13}$ bits have been transferred and
no errors have been observed, corresponding to a BER below
$10^{-12}$, estimated with a 99% confidence level [16]. We did not
perform BER measurements for a design integrating multiple
GLink receivers in the same FPGA. However, other studies [17]
have shown that the GTP has a good tolerance both to the logic
activity in the FPGA fabric and to the switching activity of
surrounding I/Os.
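The relation between the number of error-free bits and the BER limit can be made explicit. For a test with zero observed errors, the standard estimate (see for instance [16]) gives an upper limit BER < -ln(1 - CL) / N, where N is the number of transferred bits and CL is the confidence level. The Python sketch below evaluates it for the figures quoted above. The function names are ours.

```python
import math


def ber_upper_limit(bits_transferred: float, confidence: float) -> float:
    """Upper limit on the BER after an error-free transfer of
    `bits_transferred` bits, at the given confidence level (0 < CL < 1)."""
    return -math.log(1.0 - confidence) / bits_transferred


def bits_needed(target_ber: float, confidence: float) -> float:
    """Error-free bits required to claim BER < target_ber at the given CL."""
    return -math.log(1.0 - confidence) / target_ber


limit = ber_upper_limit(1e13, 0.99)
needed = bits_needed(1e-12, 0.99)
print(f"BER limit after 1e13 error-free bits (99% CL): {limit:.2e}")
print(f"bits needed for BER < 1e-12 (99% CL)         : {needed:.2e}")
```

About 4.6 x 10^12 error-free bits are enough for a 10^-12 limit at 99% confidence, so a transfer of more than 10^13 bits supports the quoted figure with some margin.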
VI. CONCLUSIONS
Data rates and transmission protocols of SerDes embedded
in FPGAs can be changed by simply re-programming the
device. By suitably configuring a GTP transceiver and adding a
few logic resources from the FPGA fabric (∼3% of the total), we
have achieved a complete replacement for the GLink
chip-set. Our emulator transfers data with a fixed latency, which
was a crucial feature of the original chip-set. We experimentally
verified the compatibility of our emulator with GLink both in
transmission and reception. Our receiver offers two
configuration options: the first has a shorter internal data-path
and minimum latency, but puts some constraints on the relative
phase between the transmission and reception parallel clocks;
the second has no phase constraint, but a higher latency.
Since the emulator has a tiny footprint in terms of logic
resources, in a future upgrade of the SL/RX it will allow us to
integrate all the GLink receivers of the board in a single FPGA,
still leaving most of the device resources free for trigger and
readout tasks. Hence, the layout of the upgraded board would
be simpler than the present one. Moreover, a GTP pair
dissipates less power than the GLink chip-set, so the power
dissipation due to data de-serialization will be lower in the
upgrade.
ACKNOWLEDGMENT
The authors are thankful to Giovanni Guasti and Francesco
Contu from Xilinx Italy for their support and help in
configuring the GTP transceiver. This work is partly supported as a
PRIN project by the Italian Ministero dell’Istruzione,
Università e Ricerca Scientifica.
REFERENCES
[1] TLK2711A - 1.6 TO 2.7 GBPS TRANSCEIVER,
Texas Instruments, 2007 [On-line]. Available:
http://focus.ti.com/lit/ds/symlink/tlk2711a.pdf
[2] Agilent HDMP 1032-1034 Transmitter-
Receiver Chip-set Datasheet, 2001, Agilent
[On-line]. Available: http://www.physics.ohio-
state.edu/~cms/cfeb/datasheets/hdmp1032.pdf
[3] P. Sanders, “The BaBar trigger, readout and event gather-
ing system”, IEEE Trans. on Nucl. Sci., Vol. 45, Issue 4,
Part 1, August 1998 pp. 1894-1897
[4] F. Ameli, “The Data Acquisition and Transport Design for
NEMO Phase 1”, IEEE Trans. on Nucl. Sci., Vol. 55, Issue
1, Part 1, Feb. 2008 pp. 233-240
[5] P. Moreira, T. Toifl, A. Kluge, G. Cervelli, F. Faccio, A.
Marchioro, J. Christiansen, “GLink and gigabit Ethernet
compliant serializer for LHC data transmission”, In Nuclear
Science Symposium Conference Record, 15-20 Oct.
2000, Vol. 2, pp. 9/6 - 9/9
[6] ATLAS Collaboration, ATLAS Detector and
Physics Performance - Technical Design Re-
port - Volume I, May 1999 [On-line]. Avail-
able: http://atlas.web.cern.ch/Atlas/GROUPS/
PHYSICS/TDR/physics_tdr/printout/Volume_I.pdf
[7] B.G. Taylor for the RD12 Project Collaboration, TTC Dis-
tribution for LHC Detectors, IEEE Trans. on Nucl. Sci.,
Vol. 45, No. 3, June 1998, pp. 821-828.
[8] F. Pastore, E. Petrolo, R. Vari, S. Veneziano, “Perfor-
mances of the Coincidence Matrix ASIC of the ATLAS
Barrel Level-1 Muon Trigger”, In Proc. of the 11th Work-
shop on Electronics for LHC Experiments, Heidelberg,
Germany, 12-16 Sept 2005.
[9] G. Chiodi, E. Gennari, E. Petrolo, F. Pastore, A. Salamon,
R. Vari, S. Veneziano, “The ATLAS barrel level-1 Muon
Trigger Sector-Logic/RX off-detector trigger and acquisition
board”, In Proc. of Topical Workshop on Electronics
for Particle Physics, Prague, Czech Republic, 03 - 07 Sep
2007, pp. 232-237
[10] F. Anulli et al., The Level-1 Trigger Barrel System of the
ATLAS Experiment at CERN, 2009 [On-line]. Available:
http://cdsweb.cern.ch/record/1154759/files/ATL-DAQ-PUB-2009-001.pdf
[11] Virtex-5 FPGA RocketIO GTP Transceiver User
Guide - UG196 (v1.7), Xilinx, 2008 [On-line]. Avail-
able: http://www.xilinx.com/support/documentation/
user_guides/ug196.pdf
[12] Virtex-5 FPGA User Guide - UG190
(v4.3), Xilinx, 2008 [On-line]. Available:
http://www.xilinx.com/support/documentation/
user_guides/ug190.pdf
[13] A. Aloisio, F. Cevenini, R. Giordano, V. Izzo, “High-
Speed, Fixed-Latency Serial Links with FPGAs for Syn-
chronous Transfers”, IEEE Trans. on Nucl. Sci., to be pub-
lished
[14] ML505/ML506/ML507 Evaluation Platform User
Guide - UG347 (v3.0.1), Xilinx, 2008 [On-line]. Avail-
able: http://www.xilinx.com/support/documentation/
boards_and_kits/ug347.pdf
[15] A. Aloisio, F. Cevenini, R. Cicalese, R. Giordano, V. Izzo,
“Beyond 320 Mbyte/s With 2eSST and Bus Invert Cod-
ing on VME64x”, IEEE Trans. on Nucl. Sci., Volume 55,
Issue 1, Feb. 2008, pp. 203-208
[16] Statistical Confidence Levels for Estimating Error
Probability, Maxim, 2007 [On-line]. Available:
http://pdfserv.maxim-ic.com/en/an/AN1095.pdf
[17] A. Aloisio, F. Cevenini, R. Giordano, V. Izzo,
“Characterizing Jitter Performance of Multi Gigabit
FPGA-Embedded Serial Transceivers”, In Real Time Conference
Record, Beijing, China, 10-15 May 2009
