Abstract. The Muon-to-Central-Trigger-Processor Interface is part of the Level-1 trigger system of the ATLAS experiment at the Large Hadron Collider at CERN. This paper describes the upgrade of the Muon-to-Central-Trigger-Processor Interface. The upgraded interface will use optical inputs and provide full-precision region-of-interest information for muon candidates to the topological trigger processor of the Level-1 trigger system. It will be implemented as an ATCA blade receiving 208 optical serial links from the ATLAS muon trigger detectors. Two high-end processing FPGAs will eliminate double counting of identical muon candidates in overlapping regions, send candidate information to the topological trigger, and send multiplicities to a third FPGA. The third FPGA will combine the candidate information, send muon multiplicities to the Central Trigger Processor and provide readout data to the ATLAS data acquisition system. A System-on-Chip module will provide communication with the ATLAS run control system for control, configuration and monitoring of the new Muon-to-Central-Trigger-Processor Interface.
Introduction

The ATLAS Trigger System
The ATLAS experiment [1] is a general-purpose experiment at the Large Hadron Collider (LHC) at CERN. It observes proton-proton collisions at a center-of-mass energy of 13 TeV. With about 25 interactions in every bunch crossing (BC) of the LHC beams, and one bunch crossing every 25 ns, there are about 10^9 interactions per second that potentially produce interesting physics. A trigger system is therefore needed to select the events with interesting physics content so that they can be recorded to permanent storage at a reasonable rate. The ATLAS trigger system consists of a first-level trigger based on custom electronics and firmware, which reduces the event rate to a maximum of 100 kHz, and a higher-level trigger system based on commercial off-the-shelf computers, network components and processing software, which reduces the event rate to around 1 kHz.
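The rates quoted above follow from simple arithmetic. As a worked check, taking the 25 ns bunch spacing at face value (every bunch crossing filled, which is an idealisation), the symbols R_int, R_BC, R_L1 and R_HLT below are introduced here only for illustration:

```latex
\begin{align*}
R_{\text{int}} &= \frac{25~\text{interactions}}{25~\text{ns}} = 10^{9}~\text{s}^{-1},\\
R_{\text{BC}} &= \frac{1}{25~\text{ns}} = 40~\text{MHz},\\
\frac{R_{\text{BC}}}{R_{\text{L1}}} &= \frac{40~\text{MHz}}{100~\text{kHz}} = 400,\\
\frac{R_{\text{L1}}}{R_{\text{HLT}}} &\approx \frac{100~\text{kHz}}{1~\text{kHz}} = 100 .
\end{align*}
```

Under this assumption the first-level trigger keeps at most one bunch crossing in 400, and the higher-level trigger keeps roughly one in 100 of those.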
Fig. 1. The ATLAS Level-1 trigger system
The first-level trigger (see Figure 1) uses reduced-granularity information from the calorimeters and from dedicated muon trigger detectors. The trigger information is based on multiplicities and topologies of trigger candidate objects. The muon trigger is based on Resistive Plate Chambers (RPC) in the barrel region and Thin-Gap Chambers (TGC) in the endcap and forward region. The Muon-to-Central-Trigger-Processor Interface (MUCTPI) [2] combines the muon candidate counts from RPC and TGC. The Central Trigger Processor (CTP) combines all trigger object multiplicities from the calorimeter and muon trigger, and the topology flags from the topological trigger, and makes the final Level-1 decision based on rules described in a trigger menu. The Level-1 trigger decision is sent back to the detector front-end electronics using the timing, trigger, and control (TTC) system.
The MUCTPI
For each BC the MUCTPI receives up to two muon candidates from each of the 208 muon sectors, 64 in the barrel region and 144 in the endcap and forward region. The MUCTPI counts muon candidates for six different pT-thresholds. It avoids double counting of single muons that are detected in more than one muon sector due to geometrical overlap of the muon chambers and the trajectory of the muon in the magnetic field; this is called "overlap handling". Figure 2 shows the geometrical coverage of one of the 16 boards of the current MUCTPI with 4 barrel, 6 endcap and 3 forward sectors.
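The counting and overlap handling described above can be illustrated with a minimal C++ sketch. It is not the MUCTPI firmware algorithm: the `MuonCandidate` fields, the `OverlapTest` predicate and the rule "keep the candidate with the higher threshold" are all assumptions made for illustration. In the real system the decision of which candidate pairs overlap would come from configuration data such as lookup tables.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical candidate record; field names and ranges are assumptions.
struct MuonCandidate {
    int sector;     // muon sector index, 0..207
    int roi;        // region-of-interest index inside the sector
    int threshold;  // highest pT threshold passed, 1..6
};

constexpr std::size_t kThresholds = 6;
using Multiplicities = std::array<unsigned, kThresholds>;

// Returns true if two candidates from different sectors are taken to be
// the same physical muon (assumed to be supplied by the caller).
using OverlapTest =
    std::function<bool(const MuonCandidate&, const MuonCandidate&)>;

// Count candidates per pT threshold, counting each overlapping pair once.
// Of two overlapping candidates, the one with the lower threshold is
// dropped; on a tie the later one in the list is dropped.
Multiplicities countMuons(const std::vector<MuonCandidate>& cands,
                          const OverlapTest& overlaps) {
    std::vector<bool> dropped(cands.size(), false);
    for (std::size_t i = 0; i < cands.size(); ++i) {
        for (std::size_t j = i + 1; j < cands.size(); ++j) {
            if (dropped[i] || dropped[j]) continue;
            // Overlap is only considered between different sectors.
            if (cands[i].sector == cands[j].sector) continue;
            if (!overlaps(cands[i], cands[j])) continue;
            if (cands[j].threshold > cands[i].threshold) {
                dropped[i] = true;
            } else {
                dropped[j] = true;
            }
        }
    }

    Multiplicities counts{};  // zero-initialised
    for (std::size_t i = 0; i < cands.size(); ++i) {
        if (dropped[i]) continue;
        const int t = cands[i].threshold;
        assert(t >= 1 && t <= static_cast<int>(kThresholds));
        ++counts[static_cast<std::size_t>(t - 1)];
    }
    return counts;
}
```

Calling `countMuons` with the candidates of one bunch crossing and a predicate that flags neighbouring sectors as overlapping returns six counts, one per threshold, in which a muon seen by two sectors contributes only once.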
ATLAS Upgrade
Upgrade Plans
The MUCTPI upgrade is part of the overall trigger upgrade of ATLAS on the road to the high-luminosity LHC (HL-LHC), starting around 2025. The upgrade is in line with the development of the New Small Wheel (NSW) of the muon trigger system [3]. The required improvements to the MUCTPI are the following:
• Send full-precision information on muon candidates to the topological trigger processor;
• Replace the electrical connections between the muon sector logic and the MUCTPI by optical links, in order to remove bulky and difficult-to-maintain cables and to increase the bandwidth so that more candidates, more precise position information and additional flags from muon identification algorithms can be sent;
• Allow the overlap handling to be improved by taking into account possible overlap between octants, which is currently not possible;
• Fit within the current tight latency requirement of eight BC (200 ns);
• Be compatible with the ATLAS upgrades for the HL-LHC. 
The New MUCTPI
The new MUCTPI (see Figure 3) will be built as a single ATCA blade, compared to 18 VME modules in the current system. It will receive 208 optical links using fibre ribbons and optical receiver modules (Avago MiniPODs). It will use two state-of-the-art FPGAs (Xilinx Virtex UltraScale) as Muon Sector Processors for the overlap handling, the counting of muon candidates, and the provision of muon candidates to the topological processor. The counts are also passed to a third FPGA (Xilinx Kintex UltraScale), which will act as Trigger and Readout Processor and provide the total count of muon candidates to the CTP and readout data to the data acquisition system. A Control Processor implemented by a Xilinx Zynq System-on-Chip (SoC) will integrate the MUCTPI into the ATLAS run control system for sending control commands (e.g. start, stop, pause, run calibration), loading configuration data (e.g. lookup-table files, algorithm parameters), and collecting monitoring data (e.g. counters, selected event data).
MUCTPI Run Control
RemoteBus Software
The processor part of the SoC is used to communicate with the ATLAS Trigger and Data Acquisition (TDAQ) run control system. TCP/IP is adopted as a reliable communication protocol. The design follows a client-server, request-response approach: the client is a TDAQ controller running on a PC that sends requests, and the server is a process on the SoC that receives the requests, processes them, and sends responses. As with the previous MUCTPI, communication is synchronous; multiple clients and multi-threaded servers are allowed. This newly designed software, RemoteBus, provides several modes of working:
• Single reads and writes from and to memory on the Muon Sector Processor and Trigger and Readout Processor FPGAs, as well as block read and write functions (as with the previous MUCTPI using VME);
• Extensibility through user-defined functions, typically for more complex serial protocols for auxiliary hardware, e.g. I2C, SPI, JTAG, etc.;
• Queuing of requests: several requests are bundled before being sent together in order to mitigate the latency overhead due to network transport.
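The queuing mode can be pictured with the following C++ sketch. It is not the RemoteBus API: the class `RequestQueue`, the `Op` codes, the message layout (a count followed by three words per request) and the `Transport` callback standing in for the TCP round trip are all hypothetical. The point it illustrates is that any number of queued requests costs a single network round trip.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical request codes; the real RemoteBus codes are not given in the text.
enum class Op : std::uint32_t { Read = 1, Write = 2 };

struct Request {
    Op op;
    std::uint32_t address;
    std::uint32_t value;  // ignored for reads
};

// Stand-in for one TCP request-response round trip: takes the encoded
// request message and returns the response words.
using Transport =
    std::function<std::vector<std::uint32_t>(const std::vector<std::uint32_t>&)>;

class RequestQueue {
public:
    explicit RequestQueue(Transport send) : send_(std::move(send)) {
        assert(send_);  // a callable transport is required
    }

    void read(std::uint32_t address) {
        pending_.push_back({Op::Read, address, 0});
    }

    void write(std::uint32_t address, std::uint32_t value) {
        pending_.push_back({Op::Write, address, value});
    }

    std::size_t pending() const { return pending_.size(); }

    // Encode all pending requests into one message and send it in a single
    // round trip. Assumed layout: [count, (op, address, value) * count].
    std::vector<std::uint32_t> flush() {
        std::vector<std::uint32_t> message;
        message.reserve(1 + 3 * pending_.size());
        message.push_back(static_cast<std::uint32_t>(pending_.size()));
        for (const Request& r : pending_) {
            message.push_back(static_cast<std::uint32_t>(r.op));
            message.push_back(r.address);
            message.push_back(r.value);
        }
        pending_.clear();
        return send_(message);
    }

private:
    Transport send_;
    std::vector<Request> pending_;
};
```

A client would call `write` and `read` as often as needed and then `flush` once, so the fixed per-transaction latency is paid once per bundle rather than once per request.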
This software was named "RemoteBus" because it implements functions similar to remote procedure calls and because its read and write operations resemble those of the previous MUCTPI over the VME bus. Every RemoteBus Client (thread) has its own TCP socket and its own RemoteBus Server thread (see Figure 4). The RemoteBus Server reads from and writes to the other processor FPGAs using the Xilinx AXI Chip2Chip protocol [4] for communication between FPGAs, and executes functions for auxiliary hardware on the server.
Some requests are pre-defined in base classes implemented for communication between any two computers, e.g. READ(N), WRITE(N). Additional requests are added depending on the hardware of the server (i.e. the MUCTPI). All parameters, for both request and response, are 32-bit data words, and are added to the message or retrieved from the message in a stack-like way. Additional request types can be added as functions to the server and client, using C++ inheritance.
The Yocto/OpenEmbedded development framework [5] is used for creating the Linux operating system, for compiling the application software (RemoteBus) and for providing all files necessary to boot and run the SoC. Two derived classes, "ZC706Client" and "ZC706Server", were implemented for the Xilinx ZC706 (Zynq) evaluation board. Requests were added for the hardware of the evaluation board, in particular for DC/DC converters, clock configuration, and temperature/voltage monitors.
The minimal latency for a request-response transaction is around 75 μs. The bandwidth is limited by the Ethernet throughput and reaches about 50 Mbyte/s for 10 kword blocks, which is about 10 times the throughput of the previous MUCTPI using VME. No particular effort was made to optimize the network. Running multiple clients or client threads is safe and increases performance. RemoteBus is currently being used for testing the MUCTPI prototype.
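The stack-like message handling and the extension of request types through C++ inheritance, as described above, can be sketched as follows. This is an illustration under stated assumptions, not the RemoteBus source: the class names `Message`, `BaseServer` and `DemoBoardServer`, the request and status codes, the push order (parameters first, request code last) and the fixed temperature value are all hypothetical, and a `std::map` stands in for the FPGA memory reached over the chip-to-chip link.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical request and status codes.
constexpr std::uint32_t kRead = 1;
constexpr std::uint32_t kWrite = 2;
constexpr std::uint32_t kReadTemperature = 100;
constexpr std::uint32_t kOk = 0;
constexpr std::uint32_t kUnknown = 0xFFFFFFFFu;

// A message is a stack of 32-bit words: last pushed, first popped.
class Message {
public:
    void push(std::uint32_t word) { words_.push_back(word); }
    std::uint32_t pop() {
        assert(!words_.empty());
        const std::uint32_t word = words_.back();
        words_.pop_back();
        return word;
    }
    std::size_t size() const { return words_.size(); }

private:
    std::vector<std::uint32_t> words_;
};

// Base server with the pre-defined READ and WRITE requests.
class BaseServer {
public:
    virtual ~BaseServer() = default;

    // The client pushes the parameters first and the request code last,
    // so the server pops the code first.
    Message handle(Message request) {
        const std::uint32_t code = request.pop();
        Message response;
        dispatch(code, request, response);
        return response;
    }

protected:
    virtual bool dispatch(std::uint32_t code, Message& request, Message& response) {
        switch (code) {
        case kRead: {
            const std::uint32_t address = request.pop();
            response.push(memory_[address]);  // unwritten addresses read as 0
            return true;
        }
        case kWrite: {
            const std::uint32_t address = request.pop();
            const std::uint32_t value = request.pop();
            memory_[address] = value;
            response.push(kOk);
            return true;
        }
        default:
            response.push(kUnknown);
            return false;
        }
    }

private:
    std::map<std::uint32_t, std::uint32_t> memory_;  // stand-in for FPGA memory
};

// Derived server adding a board-specific request via inheritance.
class DemoBoardServer : public BaseServer {
protected:
    bool dispatch(std::uint32_t code, Message& request, Message& response) override {
        if (code == kReadTemperature) {
            response.push(42000u);  // placeholder reading in milli-degrees Celsius
            return true;
        }
        return BaseServer::dispatch(code, request, response);
    }
};
```

In this sketch a write request is built by pushing the value, then the address, then `kWrite`; the derived server handles its own code and delegates everything else to the base class, so new hardware only requires overriding `dispatch`.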
Port of TDAQ software on embedded Linux
As an alternative approach, the ATLAS Level-1 Central Trigger team has started to evaluate porting the ATLAS TDAQ run control software to embedded Linux. In that case, the TDAQ controller would run directly on the processor part of the SoC. This evaluation uses the Yocto/OpenEmbedded framework and is currently under way.
Conclusions
The new MUCTPI prototype became available at the start of May 2017 and is currently being tested. The run control path has been tested with Xilinx Zynq evaluation boards. RemoteBus software was developed with functions for accessing memory in the processor FPGAs, as well as for auxiliary hardware. A port of the ATLAS TDAQ software to Xilinx Zynq with embedded Linux is under way. The Yocto/OpenEmbedded development framework is used for building the Linux operating system and the RemoteBus software. In conclusion, trigger electronics are not only becoming fully optical, much denser, and more intelligent for processing, but also more intelligent to control.
