

#### PAPER • OPEN ACCESS

## DTS-100G — a versatile heterogeneous MPSoC board for cryogenic sensor readout

To cite this article: T. Muscheid et al 2023 JINST 18 C02067



PUBLISHED BY IOP PUBLISHING FOR SISSA MEDIALAB



RECEIVED: October 21, 2022 REVISED: December 15, 2022 ACCEPTED: January 17, 2023 PUBLISHED: February 28, 2023

Topical Workshop on Electronics for Particle Physics Bergen, Norway 19–23 September 2022

# DTS-100G — a versatile heterogeneous MPSoC board for cryogenic sensor readout

T. Muscheid,<sup>*a*,\*</sup> A. Boebel,<sup>*b*</sup> N. Karcher,<sup>*a*</sup> T. Vanat,<sup>*b*</sup> L. Ardila-Perez,<sup>*a*</sup> I. Cheviakov,<sup>*b*</sup> M. Schleicher,<sup>*a*</sup> M. Zimmer,<sup>*b*</sup> M. Balzer<sup>*a*</sup> and O. Sander<sup>*a*</sup>

 <sup>a</sup>Institute for Data Processing and Electronics, Karlsruhe Institute of Technology, Karlsruhe, Germany
<sup>b</sup>Electronics Development Group, Deutsches Elektronen-Synchrotron, Hamburg, Germany

*E-mail:* timo.muscheid@kit.edu

ABSTRACT: Heterogeneous devices such as the Multi-Processor System-on-Chip (MPSoC) from Xilinx are extremely valuable in custom instrumentation systems. This contribution presents the joint development of a heterogeneous MPSoC board called DTS-100G by DESY and KIT. The board is built around a Xilinx Zynq UltraScale+ chip and makes all of its high-speed transceivers available through QSFP28, 28 Gbps FireFly, FMC, and FMC+ interfaces. The board is not designed for a particular application, but can be used as a generic DAQ platform for a variety of physics experiments. The DTS-100G board was successfully developed, built and commissioned. ECHo-100k is the first experiment that will employ the board. This contribution shows the system architecture and explains how the DTS-100G board is a crucial component in the DAQ chain.

KEYWORDS: Data acquisition circuits; Digital signal processing (DSP); Electronic detector readout concepts (solid-state); Front-end electronics for detector readout

<sup>\*</sup>Corresponding author.

<sup>© 2023</sup> The Author(s). Published by IOP Publishing Ltd on behalf of Sissa Medialab. Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

#### Contents

1. Introduction
2. DTS-100G MPSoC board
3. Transceiver characterization
4. Integration into the ECHo experiment
5. Summary

#### 1 Introduction

Modern experiments in the field of particle physics use a large number of highly sensitive sensors for measuring the energy of particles, which enables further exploration of the universe [1-3]. Since the measurement of particles often takes place at a timescale of micro- or even nanoseconds, a very fine-grained time resolution is essential for data acquisition. Furthermore, the observation time is often in the range of several months or even years. Developing adequate readout electronics suitable for these applications is a complex task. The system must be capable of simultaneously processing data from all sensors at a high sample rate. FPGAs are often used in applications requiring real-time signal processing to effectively reduce the data rate to downstream components. MPSoCs, which include both an FPGA and several CPUs, can be deployed to increase the versatility of the readout system. The Programmable Logic (PL) on the device implements all deterministic workloads dealing with the data flow, whereas the Processing System (PS) controls the communication with the board components and storage systems.

The current generation of Xilinx MPSoCs features high-speed links with speeds of up to 32.75 Gbps [4] connected directly to the PL. Xilinx offers several evaluation boards built around these MPSoCs, but they expose only a limited subset of the MPSoC features. While these boards are suitable for prototyping, full-scale readout electronics mostly require custom hardware.

If real-time data reduction is not possible on the front-end, the data must be transferred to a back-end server. In recent years, several new interfaces for optical data transmission, such as FireFly [5] and QSFP28, have been developed, which promise reliable, high-throughput board-to-board data transmission. These new optical transceivers enable transmission rates of up to 100 Gbps and allow transfer of the data for post-processing and storage. Within the Detector Technologies and Systems (DTS) program of Helmholtz,<sup>1</sup> DESY (Deutsches Elektronen-Synchrotron) and KIT (Karlsruhe Institute of Technology) developed the DTS-100G board as a universal and flexible platform. It functions as both a versatile MPSoC board for reading out various cryogenic sensors and an evaluation platform for high-speed optical transmission capabilities. This work aims to describe the board capabilities and its first application within the readout system for the Electron Capture of <sup>163</sup>Ho experiment (ECHo), which investigates the upper limit of the electron neutrino mass.

<sup>1</sup> https://www.helmholtz-detectors.de/

#### 2 DTS-100G MPSoC board

The DTS-100G board, shown in Figure 1, has dimensions of $258.5 \times 198$ mm with a thickness of 2.56 mm. It contains 18 signal and power layers. The board is built around an MPSoC from the Xilinx Zynq UltraScale+ family. The first version of the board is equipped with an XCZU11EG-FFVC1760-2-E, but it can also be used with a footprint-compatible ZU17EG or ZU19EG device. The resources available in the PL of these MPSoCs differ, but the processing system is the same, so only small changes to the build system are needed to switch devices. A second revision of the board, with a few minor modifications and fixes, is currently in preparation.



Figure 1. DTS-100G MPSoC board.

The outstanding feature of the board is its connectivity. One of the development goals was to make use of all the MPSoC's high-speed links and provide them to users in a variety of connector types. The high-speed data transmission interfaces are directly accessible from the PL. The chosen MPSoC package offers 16 GTY and 32 GTH transceiver lanes with speeds of up to 32.75 Gbps and 16.3 Gbps, respectively [4]. The DTS-100G links all available high-speed lanes to various connectors. Four GTY transceivers of the MPSoC are connected to a FireFly connector and four to a QSFP28 connector, enabling 100 Gbps board-to-board data transmission speed on each of these ports. The remaining eight GTY transceivers, as well as 16 GTH transceivers, are linked to a VITA 57.4 FMC+ connector for mezzanine card attachment. The other 16 GTH transceivers of the MPSoC can be accessed through two additional standard FMC connectors.
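The lane allocation described above can be checked with a few lines of arithmetic. The following is only a bookkeeping sketch: the lane counts come from the text, while the dictionary layout and variable names are illustrative, and the 25 Gbps per-lane figure is the design line rate quoted in section 3.

```python
# Lane counts are taken from the text; the data layout is illustrative only.
GTY_AVAILABLE = 16
GTH_AVAILABLE = 32

ALLOCATION = {
    "FireFly": {"GTY": 4, "GTH": 0},
    "QSFP28": {"GTY": 4, "GTH": 0},
    "FMC+": {"GTY": 8, "GTH": 16},
    "2x FMC": {"GTY": 0, "GTH": 16},
}

gty_used = sum(lanes["GTY"] for lanes in ALLOCATION.values())
gth_used = sum(lanes["GTH"] for lanes in ALLOCATION.values())
fmc_plus_lanes = sum(ALLOCATION["FMC+"].values())

# Four lanes at the 25 Gbps design line rate give 100 Gbps per optical port.
DESIGN_LINE_RATE_GBPS = 25
port_rate_gbps = {
    name: ALLOCATION[name]["GTY"] * DESIGN_LINE_RATE_GBPS
    for name in ("FireFly", "QSFP28")
}

print(f"GTY used: {gty_used}/{GTY_AVAILABLE}, GTH used: {gth_used}/{GTH_AVAILABLE}")
print(f"FMC+ lanes: {fmc_plus_lanes}")
print(f"Optical port rates (Gbps): {port_rate_gbps}")
```

The sums confirm that every transceiver of the package is routed to a connector and that the FMC+ connector carries 24 lanes in total.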

For data storage, the board offers a SO-DIMM slot with up to 16 GB of DDR4 RAM connected to the PS and 8 GB of on-board DDR4 RAM directly accessible from the PL. The processing system is connected to an M.2 connector that may be used for SATA storage or as a PCIe link. Additionally, a DisplayPort, USB 3.0, EEPROM, 2x UART over USB (FTDI), 1 Gb Ethernet and $2 \times 1$ Gb QSPI flash memory are available. The clock for the board components is provided by a configurable jitter cleaner (Si5345). Configuration of the individual components is performed via I<sup>2</sup>C and SPI buses from the MPSoC.

#### 3 Transceiver characterization

The board was designed for acquiring measurement data from physics experiments, so the performance of the high-speed links received special attention during the design process. To verify the signal integrity of the high-speed links, Bit Error Rate (BER) [6] tests were performed on the GTY transceivers. Communication on the FMC+ connector was characterized by attaching a loopback FMC+ HSPC card by Samtec [7], which generates a 156.25 MHz reference clock and loops all high-speed links of the FMC+ connector as well as several low-speed links. For characterization of the signal integrity of FireFly and QSFP28, 0.0 dB loopback modules were attached to these connectors. Since the ports were operated in a low-attenuation loopback channel, the IBERT IP core was set to Low-Power Mode (LPM). In the first step, the data rate for each transceiver was set to 10 Gbps. Successful operation at this data rate is a key requirement for the first application of the DTS-100G within the ECHo experiment, which is introduced in section 4. Over several hours, $3 \times 10^{14}$ bits were generated and transferred by each transceiver. During this measurement, not a single bit error was detected, resulting in a BER of less than $5 \times 10^{-15}$. Figure 2 shows an example eye diagram for one of the transceivers at a BER of $1 \times 10^{-10}$.



**Figure 2.** Eye diagram of one FMC+ GTY transceiver at 10 Gbps; the other channels behave similarly. The measurement was performed using PRBS-31 with no pre- or post-emphasis compensation.

Subsequently, the transceivers were operated at 25 Gbps, which is the maximum line rate the DTS-100G is designed for. Analyzing the IBERT results revealed that reliable communication in this edge case has not yet been fully achieved. On the FMC+ transceivers, the test pattern had to be reduced from PRBS-31 to PRBS-15. With this modification, no bit errors were detected during 24 hours of measurement with a transmission of $2.16 \times 10^{15}$ bits on each line. Communication via FireFly and QSFP28 was improved by additionally increasing the pre- and post-cursor of the transmission signal. However, reliable communication on these links was only possible using PRBS-9, reaching a BER of less than $1 \times 10^{-12}$ on both connections during the 24-hour measurement. Possible sources of the signal distortions at the 25 Gbps line rate were identified on the PCB. The second revision of the board includes modifications to the ground layer below the transceiver pins and to the via structure of the transceivers to address these issues. However, the improvements still have to be tested after assembly of the revised board.
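The bit counts quoted for the two IBERT runs follow directly from line rate and measurement time. The sketch below reproduces them and applies the simple zero-error estimate BER < 1/N for N error-free bits. The function names are illustrative, and the 1/N rule is a rough estimate assumed here rather than a method the text spells out.

```python
def bits_transferred(line_rate_gbps: float, hours: float) -> float:
    """Number of bits sent on one lane at the given line rate and duration."""
    return line_rate_gbps * 1e9 * hours * 3600


def hours_needed(bits: float, line_rate_gbps: float) -> float:
    """Measurement time required to send `bits` bits on one lane."""
    return bits / (line_rate_gbps * 1e9) / 3600


def zero_error_ber_estimate(bits: float) -> float:
    """Rough bound: with no error observed in `bits` bits, BER is below about 1/bits."""
    return 1.0 / bits


# 10 Gbps run: 3e14 bits per transceiver, no errors observed.
hours_10g = hours_needed(3e14, 10)
ber_10g = zero_error_ber_estimate(3e14)

# 25 Gbps run on the FMC+ lanes: 24 hours per line.
bits_25g = bits_transferred(25, 24)

print(f"10 Gbps: {hours_10g:.2f} h for 3e14 bits, BER estimate below {ber_10g:.1e}")
print(f"25 Gbps: {bits_25g:.3e} bits in 24 h")
```

At 10 Gbps the run takes a little over eight hours, in line with "several hours", and the 1/N estimate lies below the quoted limit of $5 \times 10^{-15}$. At 25 Gbps, 24 hours correspond to the quoted $2.16 \times 10^{15}$ bits.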

Nonetheless, because the transceivers already support 25 Gbps line rates with a BER below $1 \times 10^{-12}$ for FireFly and QSFP28 lines, the board allows 100G Ethernet [8] communication. The four links of each connector operate at 25 Gbps each, leading to a total data rate of 100 Gbps. For testing the reliability of this protocol, the DTS-100G was connected to a Xilinx VCU108 evaluation board [9]. The VCU108 offers two separate QSFP28 ports: one of them was connected to the FireFly of the DTS-100G, while the other was attached to the QSFP28. Over the course of 64 hours, bidirectional UDP Ethernet packets with Reed-Solomon Forward Error Correction (RS-FEC) were sent. On each connection, more than $3 \times 10^{11}$ packets, consisting of around 9000 bytes each, were transmitted, and not a single packet error was detected. An effective payload data rate of 99.20 Gbps was achieved in both cases.
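As a plausibility check, the packet count, packet size, test duration and payload rate quoted above can be related to each other. This is only a rough sketch: the text gives the packet size as "around 9000 bytes", which is treated as exactly 9000 payload bytes here, and all variable names are illustrative.

```python
TEST_HOURS = 64
PAYLOAD_BYTES = 9000          # assumption: "around 9000 bytes" taken as exact
REPORTED_RATE_GBPS = 99.20    # effective payload rate quoted in the text
MIN_PACKETS = 3e11            # "more than 3e11 packets" per connection

test_seconds = TEST_HOURS * 3600
bits_per_packet = PAYLOAD_BYTES * 8

# Packets that fit into the test duration at the reported payload rate.
packets_at_reported_rate = REPORTED_RATE_GBPS * 1e9 * test_seconds / bits_per_packet

# Lower limit on the payload rate implied by the minimum packet count alone.
rate_floor_gbps = MIN_PACKETS * bits_per_packet / test_seconds / 1e9

print(f"Packets at {REPORTED_RATE_GBPS} Gbps over {TEST_HOURS} h: {packets_at_reported_rate:.3e}")
print(f"Rate implied by {MIN_PACKETS:.0e} packets: {rate_floor_gbps:.2f} Gbps")
```

Under these assumptions, the reported rate corresponds to slightly more than $3 \times 10^{11}$ packets per connection, so the quoted figures are mutually consistent.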

#### 4 Integration into the ECHo experiment

The ECHo experiment (Electron Capture of <sup>163</sup>Ho) aims to investigate the electron neutrino mass [1]. It uses Metallic Magnetic Calorimeters (MMCs) [10], a type of cryogenic sensor, to absorb and determine the energy of X-ray photons emitted during the decay of <sup>163</sup>Ho to <sup>163</sup>Dy. By fitting the recorded energy spectrum of the photons against a modelled function, the maximum neutrino energy can be determined. In order to minimize thermal influence in the cryogenic environment, a microwave SQUID multiplexing [11] principle is used, which allows readout of multiple sensors on a single line. One multiplexer chip contains 800 sensors with a total bandwidth of 4 GHz between 4 and 8 GHz. The concept for the readout electronics is described in [12]. Mixing of the readout tones between radio frequency and baseband as well as the digitization of the combs are performed by custom hardware [13]. The main part of the software-defined-radio system is an MPSoC board responsible for the generation of the tones as well as for processing of the modulated signals. The 4 GHz comb is split into 5 subbands of 800 MHz, where each complex subband consists of separate I and Q data streams. Conversion is performed at a clock rate of 1 GHz by three four-channel DACs (AD9144) and five two-channel ADCs (AD9680) with internal digital down-conversion. In total, a conversion rate of 160 Gbps in both directions is achieved. The third DAC is only partially used for generation of the readout tones, so its two unused channels could be used for the generation of a flux-ramp signal, which is required for linearization of the SQUIDs [14]. However, this task is currently being carried out by an extension board with a slower DAC (MAX5898), which was developed for the first prototype of this system [15]. This LVDS DAC operates at a sample rate of 125 MHz and is able to convert two channels with 16 bits each.
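The converter figures in this paragraph can be tied together with a short calculation. In the sketch below, the subband, converter and rate numbers are taken from the text; the sample width is not stated explicitly and is derived here from the quoted 160 Gbps, so it should be read as an inference. Variable names are illustrative.

```python
SUBBANDS = 5
STREAMS_PER_SUBBAND = 2        # separate I and Q streams
SAMPLE_RATE_GSPS = 1.0         # 1 GHz conversion clock
TOTAL_RATE_GBPS = 160.0        # per direction, as quoted

DAC_CHIPS, CHANNELS_PER_DAC = 3, 4   # AD9144
ADC_CHIPS, CHANNELS_PER_ADC = 5, 2   # AD9680

streams = SUBBANDS * STREAMS_PER_SUBBAND
dac_channels = DAC_CHIPS * CHANNELS_PER_DAC
adc_channels = ADC_CHIPS * CHANNELS_PER_ADC
spare_dac_channels = dac_channels - streams

# Inferred, not stated in the text: bits per sample implied by the total rate.
bits_per_sample = TOTAL_RATE_GBPS / (streams * SAMPLE_RATE_GSPS)

print(f"I/Q streams per direction: {streams}")
print(f"DAC channels: {dac_channels} ({spare_dac_channels} spare), ADC channels: {adc_channels}")
print(f"Implied sample width: {bits_per_sample:.0f} bit")
```

Ten I/Q streams occupy all ten ADC channels and ten of the twelve DAC channels, which leaves the two spare DAC channels mentioned for a possible flux-ramp signal.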

Communication between the MPSoC board and the converters is realized by means of the JESD204B protocol, which encodes the data in the 8b/10b scheme and requires one GTY or GTH high-speed transceiver connection per effective byte. For the first prototype of the ECHo readout electronics, containing only one of the five subbands, the Xilinx ZCU102 evaluation board was used. However, the full-scale system now requires 20 high-speed lanes with a bidirectional data rate of 10 Gbps each, so this evaluation board is not suitable. Instead, the DTS-100G, with its FMC+ connector providing 24 high-speed links, fits this application very well. Since both the ZCU102 and the DTS-100G are built around an MPSoC of the Zynq UltraScale+ family, adapting the firmware to the new board was simple. By adding the DTS-100G board support package, including the device tree and the configuration of the MPSoC, to our custom build system based on Yocto [15], the firmware could quickly be deployed and installed on the board.
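The lane requirement follows from the rule of one transceiver per effective byte together with the 8b/10b overhead. The sketch below assumes 16-bit samples (inferred from 160 Gbps spread over ten streams at 1 GS/s, as described in the converter paragraph); the variable names are illustrative.

```python
STREAMS = 10                 # 5 subbands x (I, Q)
BYTES_PER_SAMPLE = 2         # assumption: 16-bit samples, inferred from 160 Gbps
SAMPLE_RATE_GSPS = 1.0
FMC_PLUS_LANES = 24          # high-speed links on the FMC+ connector

# JESD204B as described in the text: one transceiver lane per effective byte.
lanes_required = STREAMS * BYTES_PER_SAMPLE

# 8b/10b coding sends 10 line bits for every 8 payload bits.
payload_per_lane_gbps = 8 * SAMPLE_RATE_GSPS
line_rate_per_lane_gbps = payload_per_lane_gbps * 10 / 8

spare_lanes = FMC_PLUS_LANES - lanes_required

print(f"Lanes required: {lanes_required} at {line_rate_per_lane_gbps:.0f} Gbps each")
print(f"Spare FMC+ lanes: {spare_lanes}")
```

This reproduces the 20 lanes at 10 Gbps quoted above and shows that the 24 links of the FMC+ connector leave four lanes unused.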

The processing chain in the PL consists of several stages that reduce the data rate step by step. After separating the channels from each other and demodulating the data in order to reconstruct the raw sensor signal, the samples containing information about the decay energy are extracted, while the samples during idle state are discarded [16]. Each 800 MHz subband is handled by a separate chain. Processing of the full bandwidth therefore requires five parallel instances of the entire chain. The parallel chains are merged by a switch module that combines the individual signals into a single data stream. A general overview of the firmware is given in Figure 3(a). At the output, the data rate to be stored on the DDR4 or transmitted to a server is around 240 Mbps, depending on the exact decay rate of the isotope. The extracted pulses are stored in packets with meta information and up to 1024 samples. Figure 3(b) shows a superimposition of extracted samples from multiple detected pulses with four different energy levels. Owing to this data reduction, the standard 1 Gb Ethernet connection is sufficient to send the data packets for post-processing. However, high-speed data transfer might be used in the future in case access to raw data is desired, such as for system calibration.
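The effect of the processing chain on the data rate can be put into numbers. This is a back-of-the-envelope sketch using the 160 Gbps converter rate from the readout description and the approximate 240 Mbps output rate given above; the actual output rate depends on the decay rate, and the variable names are illustrative.

```python
INPUT_RATE_GBPS = 160.0      # ADC data entering the PL
OUTPUT_RATE_MBPS = 240.0     # approximate rate after pulse extraction
ETHERNET_RATE_MBPS = 1000.0  # standard 1 Gb Ethernet link

reduction_factor = INPUT_RATE_GBPS * 1000 / OUTPUT_RATE_MBPS
link_utilization = OUTPUT_RATE_MBPS / ETHERNET_RATE_MBPS

print(f"Data reduction factor: about {reduction_factor:.0f}")
print(f"1 Gb Ethernet utilization: {link_utilization:.0%}")
```

The chain reduces the data rate by a factor of several hundred, and the resulting stream occupies roughly a quarter of the raw capacity of the 1 Gb Ethernet link.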





(a) Full-scale firmware for readout of the ECHo experiment.

(b) Superimposition of extracted pulses.

Figure 3. Usage of the DTS-100G as part of the ECHo electronics.

#### 5 Summary

The DTS-100G MPSoC board is a universal platform suitable for the readout of particle physics detectors. The FMC+ connector with 24 high-speed links, together with the 100 Gbps data transmission interfaces, allows acquisition of measurement data with both a high sampling rate and a high bit width for maximum sensitivity. Characterization of the GTY transceivers showed error-free data transmission at 10 Gbps and, with shortened PRBS test patterns, at 25 Gbps, as well as effective data rates of over 99 Gbps on both the four-lane FireFly and QSFP28 interfaces.

The first application of the DTS-100G will be the ECHo-100k experiment, which investigates the upper limit of the neutrino mass. The DTS-100G, as part of the readout electronics, is responsible for the generation of 400 readout tones and real-time processing of the modulated signals. In total, 160 Gbps of bidirectional data is handled by the transceivers. Since the DTS-100G provides 24 high-speed links on an FMC+ connector, it is well suited to this application. The PL within the MPSoC offers sufficient flexibility for implementation of the required processing chain. A full vertical slice of the ECHo readout electronics is currently under commissioning and characterization in preparation for the production of 15 platforms required by the experiment.

#### References

- [1] L. Gastaldo et al., The electron capture in <sup>163</sup>Ho experiment ECHo, Eur. Phys. J. Spec. Top. 226 (2017) 1623.
- [2] QUBIC collaboration et al., QUBIC I: overview and science program, JCAP 2022 (2022) 034.
- [3] M. Faverzani et al., The HOLMES experiment, J. Low Temp. Phys. 184 (2016) 922.
- [4] Xilinx, DS891, Zynq US+ MPSoC DS: Overview. v1.9 (2021).
- [5] Samtec, Firefly: ECUE Series in 28 Gbps (OIF CEI-28G-VSR) Applications (2013).
- [6] Xilinx, PG196, IBERT for UltraScale GTY Transceivers. v1.2 (2021).
- [7] Samtec, VITA 57.4 FMC+ HSPC/HSPCe LOOPBACK CARD (2019).
- [8] IEEE Computer Society, 802.3ba, Part 3: Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications (2010).
- [9] Xilinx, UG1066, VCU108 Evaluation Board User Guide. v1.5 (2019).
- [10] A. Fleischmann, C. Enss and G.M. Seidel, Metallic Magnetic Calorimeters, Springer, Berlin Heidelberg (2005).
- [11] S. Kempf, M. Wegner, L. Gastaldo, A. Fleischmann and C. Enss, Multiplexed readout of MMC detector arrays using non-hysteretic rf-SQUIDs, J. Low Temp. Phys. 176 (2014) 426.
- [12] O. Sander, N. Karcher, O. Krömer, S. Kempf, M. Wegner, C. Enss and M. Weber, Software-defined radio readout system for the ECHo experiment, IEEE Trans. Nucl. Sci. 66 (2019) 1204.
- [13] R. Gartmann, N. Karcher, R. Gebauer, O. Krömer and O. Sander, Progress of the ECHo SDR readout hardware for multiplexed MMCs, J. Low Temp. Phys. 209 (2022) 726.
- [14] J.A.B. Mates, K.D. Irwin, L.R. Vale, G.C. Hilton, J. Gao and K.W. Lehnert, Flux-ramp modulation for SQUID multiplexing, J. Low Temp. Phys. 167 (2012) 707.
- [15] N. Karcher, Ausleseelektronik für magnetische Mikrokalorimeter im Frequenzmultiplexverfahren, Ph.D. Thesis, Karlsruher Institut für Technologie (KIT) (2022).
- [16] N. Karcher, T. Muscheid, T. Wolber, D. Richter, C. Enss, S. Kempf and O. Sander, Online demodulation and trigger for flux-ramp modulated SQUID signals, J. Low Temp. Phys. 209 (2022) 581.