# THE UPGRADE PATH FROM LEGACY VME TO VXS DUAL STAR CONNECTIVITY FOR LARGE SCALE DATA ACQUISITION AND TRIGGER SYSTEMS

Chris Cuevas, David Abbott, Fernando Barbosa, Hai Dong, William Gu, Edward Jastrzembski, Scott Kaneta, B. Moffit, Nick Nganga, Ben Raydo, Alexander Somov, William Mark Taylor, Jeff Wilson, Thomas Jefferson National Accelerator Facility, Newport News, Virginia, U.S.A

Abstract

Jefferson Lab has designed new instrumentation modules that take advantage of the higher performance and elegant backplane connectivity of the VITA 41 (VXS) standard. These modules are required to meet the 200 kHz trigger rates envisioned for the 12GeV experimental program. Upgrading legacy VME designs to the high speed gigabit serial extensions that VXS offers comes with significant challenges in electronic engineering design and in firmware and software development. This paper details our system design approach, including the critical system requirement stages, and explains the pipeline design techniques and the selection criteria for FPGAs that require embedded gigabit serial transceivers. The entire trigger system operates synchronously at 250MHz, using global clock and synchronization signals distributed to each front-end readout crate, where the VXS switch slot distributes these signals to all front-end modules. The readout of the buffered detector signals relies on 2eSST over the standard VME64x path at >200MB/s. We have achieved an aggregate rate of 64Gb/s of trigger information from payload to switch slots within one VXS crate, and we present results using production modules in a two crate test configuration with both VXS crates fully populated. The VXS trigger modules that reside in the front-end crates will be ready for production orders by the end of the fiscal year. VXS global trigger modules are in the design stage now and will be completed in time to meet the installation schedule for the 12GeV Physics program.

# VXS SELECTION CRITERIA

VME with serial extensions, or VXS, was selected as the 12GeV data acquisition backplane foundation for the front-end detector readout and trigger hardware interface. The 6GeV experimental program relied on FastBus ADCs and TDCs, with VME single board computers providing the configuration and readout of Physics data, along with custom electronics developed for the trigger system [1]. The new requirements for 12GeV experiments demand higher trigger and data readout rates, as shown in Table 1.

We have a successful record with VME and VME64x custom data acquisition and trigger modules that were developed during the 6GeV program to replace aging and obsolete hardware. When VITA 41 (VXS) [2] was initially emerging, we decided that the specifications for the new serial extensions would meet and exceed our requirements for creating digital energy summation for hundreds of ADC channels within a VXS card enclosure. The VXS backplane is also an elegant solution for distributing critical global clock and synchronization signals to each front-end module without cumbersome cabling or additional rear transition hardware. The VXS selection offered low risk and allowed a logical upgrade path for our legacy hardware. The VME consortium was promoting the addition of gigabit connectivity to a very large customer base, and the new VXS standard remains a viable choice for the future.

Table 1: Experiment Rates and L1 Trigger Channel Count.

|                               | Hall B (CLAS12)   | Hall D (GlueX)     |
|-------------------------------|-------------------|--------------------|
| L1 Trigger Channels           | 4,000             | 4,500              |
| #VXS Crates (Total Crates)    | 38 (43)           | 25 (52)            |
| #Single Board CPU             | 43                | 52                 |
| L1 Trigger Rate               | <20 kHz           | <200 kHz           |
| L1 Data Rate                  | 60 MB/s           | 3 GB/s             |
| L3 Farm Data Rate (Disk MB/s) | 10 kHz (60 MB/s)  | 20 kHz (300 MB/s)  |

#### SYSTEM TOPOLOGY OVERVIEW

VXS offers dual star switch capability with 18 payload slots connected to two central switch slots. The VITA 41.0 specification lists 4 full duplex gigabit connections, or "lanes", from each payload slot to each switch slot and a line rate of 10 Gb/s per lane, so each payload slot is offered 80 Gb/s of bandwidth with dual star connectivity. At the start of our R&D hardware design phase in 2005, we were using the Xilinx Virtex-IV series with gigabit transceivers capable of 3.125Gb/s. Our production payload modules will use Xilinx Virtex-V transceivers capable of 6.25Gb/s line speeds. All four full duplex lanes are routed on the circuit board, but presently only two lanes are used, to one switch slot. The front-end payload modules are sixteen channel, 250MHz flash ADC boards [4] that use these gigabit lanes to transmit digital sum information to one of the VXS switch slots [5]. This switch slot module is a Crate Trigger Processor (CTP), which collects 64Gb/s from the 16 payload modules and computes a crate sum that is transmitted optically over a parallel fibre optic cable [6] to a Sub-System Processor (SSP) and finally to a Global Trigger Processor (GTP) [7], which determines whether a Physics trigger condition was satisfied.
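The bandwidth figures quoted in this overview follow from simple lane arithmetic. The sketch below reproduces them in Python. The function and variable names are illustrative only and are not taken from any Jefferson Lab software; the per-module and per-lane payload rates are derived here from the quoted 64Gb/s total rather than stated in the text.

```python
# Lane arithmetic for the VXS dual star topology (illustrative names only).

def aggregate_gbps(links: int, lanes_per_link: int, gbps_per_lane: float) -> float:
    """Total bandwidth of `links` connections, each made of `lanes_per_link` lanes."""
    return links * lanes_per_link * gbps_per_lane

# VITA 41.0 figures as quoted in the text: from one payload slot, 4 lanes
# to each of the 2 switch slots, at a 10 Gb/s line rate per lane.
vxs_per_payload_slot = aggregate_gbps(links=2, lanes_per_link=4, gbps_per_lane=10.0)

# Present use: 16 payload modules, 2 lanes each, to one switch slot (the CTP),
# for a quoted total of 64 Gb/s.
CTP_TOTAL_GBPS = 64.0
PAYLOAD_MODULES = 16
LANES_IN_USE = 2

per_module_gbps = CTP_TOTAL_GBPS / PAYLOAD_MODULES  # derived, not quoted
per_lane_gbps = per_module_gbps / LANES_IN_USE      # derived, not quoted

print(vxs_per_payload_slot, per_module_gbps, per_lane_gbps)
```

A derived payload rate of 2 Gb/s per lane is what a 2.5 Gb/s line rate would carry if the link used 8b/10b line coding. That coding is an assumption made here for the comparison; the text itself states only the line rates.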

550 Hardware

# HARDWARE DESIGN CHALLENGES

# High Speed Gigabit Serial Transmission

Reliable transmission of gigabit serial data over the VXS backplane requires an investment in high speed circuit board layout and routing tools. The FPGA selection requirements include at least four full duplex gigabit transceivers, a user I/O pin count >500, and fast integrated block memory with multi-rate FIFO logic. We use circuit board routing simulation tools such as Mentor Graphics HyperLynx [8], which are invaluable for simulating and verifying circuit board signal integrity on the gigabit transmission paths before manufacturing. The FPGA devices that we use are capable of 6.25Gb/s serial transfer, and we have designed our circuit boards with signal integrity techniques using standard FR4 circuit board material to achieve >2.5Gb/s, which meets the data transfer bandwidth requirements.

Another significant investment required for the hardware verification of the gigabit transceivers was a digital signal analyzer with 8GHz bandwidth. It is used to measure and record the backplane and fibre optic gigabit transceiver performance and to perform jitter analysis on the critical system clock and synchronization signals with at least 1ps resolution. We have purchased the Tektronix jitter analysis software, which is a critical tool for the verification of our system clock and for measurements of the phase controlled, jitter attenuated clock provided by the Signal Distribution (SD) switch card in every crate.

The investment in firmware development tools from the FPGA industry leaders, Xilinx and Altera, was also taken into consideration for the upgrade path to VXS. The costs of firmware simulation and verification tools from industry leading vendors must be considered as well, because the license and maintenance fees are not trivial.

We use the Xilinx Aurora protocol for serial transmission; it is robust and simple, and it is included with the FPGA development tools. Forward error correcting algorithms have been considered, but are not implemented at this time.

# Fiber Optic Distribution Network

As shown in Figure 1, the digital sum value from each CTP in the front-end crate, and the distribution of the global clock, synchronization and trigger commands from the global trigger hardware, use separate fibre optic cables. The crate sum fibre link is shown in orange, and the critical timing signals distributed to each front-end crate are shown in blue. Each fibre optic link makes use of Avago POP4 fibre optic transceivers and parallel OM3 rated glass fibre cable with MTP connections. These fibre optic transceivers operate at 3.125 Gb/s per lane for an aggregate bandwidth of 10 Gb/s, which is ample for the summing information that is sent forward to the global trigger processing hardware. The fibre link used for the distribution of the global clock, critical timing signals and trigger commands runs at 1.25 Gb/s.
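The relation between the 3.125 Gb/s transceiver rate and the 10 Gb/s aggregate can be checked with a short calculation. The sketch below is a minimal illustration under two assumptions that the text does not state explicitly: that the crate sum link bonds four lanes, and that the link uses 8b/10b line coding (8 payload bits per 10 transmitted bits). All names are hypothetical.

```python
# Payload bandwidth of a bonded parallel optic link (sketch under stated assumptions).
from fractions import Fraction

def payload_gbps(lanes: int, line_rate_gbps: Fraction, code_efficiency: Fraction) -> Fraction:
    """Usable bandwidth = lanes x line rate x fraction of bits that carry payload."""
    return lanes * line_rate_gbps * code_efficiency

LANES = 4                            # assumption: four bonded lanes per link
LINE_RATE = Fraction("3.125")        # Gb/s, transceiver rate quoted in the text
EFFICIENCY_8B10B = Fraction(8, 10)   # assumption: 8b/10b line coding

crate_sum_link = payload_gbps(LANES, LINE_RATE, EFFICIENCY_8B10B)
print(float(crate_sum_link))  # 10.0
```

Exact fractions are used instead of floats so that the result compares equal to 10 without rounding concerns.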

# Global Trigger Essentials

The global trigger SSP modules collect the 10Gb/s trigger information streams from each of the front-end crates via the parallel fibre optic connections. These trigger data streams represent energy sums from calorimeter detectors and hit patterns from Time-Of-Flight (TOF) detectors. The SSPs are VXS payload modules, so after an SSP has performed the required subsystem alignment, the trigger data is passed forward to the GTP on the VXS backplane. The final trigger word is cabled to the Trigger Supervisor (TS) [10] and is distributed to the front-end crates and modules to initiate data readout.



Figure 1: Gigabit Serial Links.

#### System Latency and Transport Protocol

The Level 1 trigger is formed from the detector signals that have been selected for a given Physics event. The energy sums from a calorimeter, for instance, are created in firmware and transported on the VXS backplane using Xilinx's Aurora<sup>TM</sup> [11] protocol. Aurora<sup>TM</sup> is the low latency communication protocol between FPGAs on payload and switch boards. Other serial gigabit protocols such as Ethernet or PCIe cannot be used because their latency is too large to meet the 3.2 µs requirement. This requirement comes from our choice of a commercial Time-to-Digital Converter ASIC [12]. Ethernet and PCIe also include other properties that are not required for the transfer of the digital sum information and would add significant overhead compared to Aurora<sup>TM</sup>. The Aurora<sup>TM</sup> protocol is also used on the fibre optic communication interface, where it provides full duplex lane bonding to achieve aggregate bandwidth sufficient to transfer information from each crate to the global trigger crate modules.
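Because the trigger system runs synchronously at 250MHz, the 3.2 µs latency requirement can be expressed as a budget of clock cycles that the whole trigger pipeline must fit into. The sketch below shows that conversion and a simple budget check. The helper name and the stage names and latencies in the usage example are placeholders chosen for illustration; they are not measurements from the system.

```python
# Latency budget expressed in 250 MHz clock cycles (illustrative sketch).

CLOCK_MHZ = 250            # system clock, from the text
LATENCY_BUDGET_NS = 3200   # 3.2 us requirement, from the text

CLOCK_PERIOD_NS = 1000 // CLOCK_MHZ                    # 4 ns per cycle
BUDGET_CYCLES = LATENCY_BUDGET_NS // CLOCK_PERIOD_NS   # 800 cycles

def remaining_cycles(stage_cycles: dict) -> int:
    """Cycles left in the budget after the listed stages (negative if over budget)."""
    return BUDGET_CYCLES - sum(stage_cycles.values())

# Placeholder stage latencies in clock cycles. These are NOT measured values.
example_stages = {"stage_a": 100, "stage_b": 250, "stage_c": 300}
print(BUDGET_CYCLES, remaining_cycles(example_stages))  # 800 150
```

Working in integer nanoseconds and cycles avoids floating point rounding when checking whether a set of pipeline stages fits the budget.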

# Data Rates with Linux Single Board Computers

The VXS crates use single board computers (SBC), and the Data Acquisition Group at Jefferson Lab has evaluated the performance of SBCs offered by various companies.

In our present two crate test station, one SBC is a GE-7875 (Intel Core 2 Duo) and the other VXS crate uses a Concurrent v717 (Intel mobile core i7). These SBCs have similar interrupt response times of $\sim 5\mu s$ and a maximum network throughput of 116MB/s using only 5% of the CPU resource. Both SBCs support the 2eSST VME transfer mode and, with a 200 kHz trigger rate, we achieve a nominal VME backplane transfer of >40MB/s. Each crate can accept 256 coaxial detector signals, and typical occupancy for a fully populated crate will be <10%.

#### LATEST PERFORMANCE RESULTS



Figure 2: Two VXS crate test station.

A photo of two VXS crates, which are almost fully populated with payload boards, is shown in Figure 2.

Each payload module has 16 coaxial inputs, and these inputs are captured with a 12-bit flash ADC. A fully populated crate can accept 256 coaxial detector signals, and all channels participate in creating an energy sum value which is transferred every 4ns. Depending on the location of the detector apparatus, the typical number of channels for a Physics trigger that will have relevant signal occupancy is <10%. For our two crate testing, we inject pulse signals to at least 12% of the channels to simulate a typical occupancy from a detector apparatus.
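The channel counts and ADC resolution given above fix the worst-case width of the energy sums. The sketch below works this out, assuming a plain unsigned sum of raw 12-bit samples; the text does not describe the firmware's actual word format, pedestal handling or scaling, so the function names and the "full width" assumption are illustrative only.

```python
# Worst-case bit widths for module and crate energy sums (illustrative sketch).

ADC_BITS = 12              # from the text
CHANNELS_PER_MODULE = 16   # from the text
CHANNELS_PER_CRATE = 256   # from the text
SAMPLE_PERIOD_NS = 4       # one sum transferred every 4 ns

def bits_for_sum(n_inputs: int, bits_per_input: int) -> int:
    """Bits needed to hold the largest possible unsigned sum of n_inputs values."""
    max_sum = n_inputs * ((1 << bits_per_input) - 1)
    return max_sum.bit_length()

def module_sum(samples: list) -> int:
    """Sum one 4 ns sample from every channel of a module."""
    assert len(samples) == CHANNELS_PER_MODULE
    return sum(samples)

module_bits = bits_for_sum(CHANNELS_PER_MODULE, ADC_BITS)  # 16
crate_bits = bits_for_sum(CHANNELS_PER_CRATE, ADC_BITS)    # 20

# Bits per nanosecond equals Gb/s: a full-width module sum every 4 ns.
module_gbps = module_bits / SAMPLE_PERIOD_NS               # 4.0
print(module_bits, crate_bits, module_gbps)
```

Under this assumption a full-width 16-bit module sum every 4ns corresponds to 4 Gb/s per module, which is the same figure obtained by dividing the quoted 64Gb/s crate total by 16 payload modules.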

Figure 3 shows the data transfer rate versus the Level 1 trigger rate with channel occupancy at 12.5%. The aggregate data transfer rate of >40MB/s is from both VXS crate SBCs. Selected input channels produce pulses at trigger rates set with a programmable generator, and several trigger rate points are shown. The Trigger Interface can store up to 256 trigger events (a block) before signalling the SBC for VME 2eSST block readout. This block-event capability allows the fully pipelined system to accept a 200 kHz trigger rate and transfer the Physics event data from the SBC to the software trigger storage server. Figure 3 also shows the advantage of event blocking: more events per block reduce the CPU response time and data overhead per event.
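The benefit of event blocking can be illustrated with a simple cost model: if every readout signal to the SBC costs one interrupt response of about 5 µs (the figure quoted for the SBCs), then grouping events into blocks divides that fixed cost by the block size. The model below is a simplification assumed for illustration, with hypothetical function names; it ignores the transfer time itself and is not a description of the actual readout software.

```python
# Fixed per-readout overhead amortized over a block of events (simplified model).

TRIGGER_RATE_HZ = 200_000     # from the text
INTERRUPT_RESPONSE_US = 5     # approximate SBC interrupt response, from the text
MAX_BLOCK = 256               # maximum events per block, from the text

def readouts_per_second(trigger_rate_hz: int, events_per_block: int) -> float:
    """Number of block-ready signals the SBC must service each second."""
    return trigger_rate_hz / events_per_block

def response_time_fraction(trigger_rate_hz: int, events_per_block: int,
                           response_us: int = INTERRUPT_RESPONSE_US) -> float:
    """Fraction of each second spent only on interrupt response (model assumption)."""
    return (trigger_rate_hz * response_us) / (events_per_block * 1_000_000)

for block in (1, 16, MAX_BLOCK):
    print(block,
          readouts_per_second(TRIGGER_RATE_HZ, block),
          response_time_fraction(TRIGGER_RATE_HZ, block))
```

In this model, single-event readout at 200 kHz would spend the entire second on interrupt response alone, while a 256-event block reduces that share to well under one percent.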



Figure 3: Trigger rate vs. Data rate and Block-level effects.

#### CONCLUSION

We have met the requirements and milestones established for the new front-end data acquisition hardware and trigger system functions defined at the conceptual design phase for the 12GeV experimental upgrade project.

The upgrade path to VITA 41, or 'VXS', presented many significant engineering design challenges for both circuit board and firmware development. Careful planning, including budgeting for essential circuit board design and simulation tools and for the required test equipment, is critical for success. We now have the essential experience with high speed gigabit serial backplanes, FPGA transceiver technology, and firmware development.

In the next two years we will be challenged with production testing of the new front-end and trigger hardware, and will be required to meet the implementation goals for pre-commissioning the detector readout and trigger systems in the 12GeV experimental areas. The two crate VXS test station has been an excellent foundation for the development of data acquisition and trigger commissioning 'tools' that will be essential for full qualification of the VXS hardware systems needed when the 12GeV Physics programs begin.

Creative Commons Attribution 3.0 (CC BY 3.0). Copyright © 2011 by the respective authors.

#### ACKNOWLEDGEMENT

Authored by Jefferson Science Associates, LLC under U.S. DOE Contract No. DE-AC05-06OR23177. The U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce this manuscript for U.S. Government purposes.

#### REFERENCES

- [1] CLAS-Note 91-017; "A VXI Based Trigger for the CLAS Detector at CEBAF"; D. Doughty Jr., J. Englert, R. Hale, S. Lemon, CNU; C. Cuevas, D. Joyce, CEBAF; 1991.
- [2] ANSI/VITA 41.0-2006; VMEbus Switched Serial Standard, (2006)
- [3] VME and Critical Systems Magazine; "Mil tech gets smart with OpenVPX, VXS designs"; May, 2011 p. 30-31
- [5] "Integrated Tests of a High Speed VXS Switch Card and 250Msps Flash ADCs", H. Dong, C. Cuevas, D. Curry, E. Jastrzembski, F. Barbosa, J. Wilson, M. Taylor, B. Raydo. *IEEE Nuclear Science Symposium* N15-375, 2007.
- [6] 2010 Panduit Corporation; "Panduit® QuickNet™ MTP Trunk Cable Assemblies" http://www.panduit.com
- [7] "The Global Trigger Processor: A VXS Switch Module for Triggering Large Scale Data Acquisition Systems"; S.R. Kaneta, C. Cuevas, H. Dong, W. Gu, E. Jastrzembski, N. Nganga, B.J. Raydo, J. Wilson, JLAB, Newport News, Virginia, USA; ICALEPCS 2011, WEPMS017
- [8] 2010 Mentor Graphics; "HyperLynx® Signal Integrity Tools and Circuit Board Verification"; http://www.mentor.com
- [9] "Delivering Phase Controlled Jitter Attenuated Clock Signals to Data Acquisition System"; IEEE Nuclear Science Symposium NP2.S-185; 2011
- [10] Jefferson Lab September, 2011; "The Trigger and Clock Distribution for 12GeV Upgrade Experiments"; DAQ Group and Fast Electronics Group
- [11] Xilinx SP002 "Aurora v2.4 Protocol Specification"; January 10, 2006: http://www.xilinx.com/aurora
- [12] "F1: An Eight Channel Time-to-Digital Converter Chip for High Rate Experiments"; Braun, G; Fischer, H; Franz, J; Grünemaier, A; Heinsius, FH; Hennig, L; Königsmann, KC; Niebuhr, M; Schierloh, M; Schmidt, T et al. 5th Conference on Electronics for LHC Experiments, Snowmass, CO, USA, 20 - 24 Sep 1999, pp. 383-387