Abstract-The linear accelerators employed to drive Free Electron Lasers (FELs), such as the X-ray Free Electron Laser (XFEL) currently being built in Hamburg, require sophisticated control systems. The Low Level Radio Frequency (LLRF) control system should stabilize the phase and amplitude of the electromagnetic fields in accelerating modules with tolerances below 0.02% for amplitude and 0.01 degree for phase to produce an ultra-stable electron beam that meets the conditions required for Self-Amplified Spontaneous Emission (SASE). The LLRF control system of a 32-cavity accelerating module of the XFEL accelerator requires acquisition of more than 100 analogue signals sampled at a frequency of around 100 MHz. Data processing in a real-time loop should complete within a few hundred nanoseconds. Moreover, the LLRF control system should be reliable, upgradeable and serviceable. The Advanced Telecommunications Computing Architecture (ATCA) standard, developed for telecommunication applications, can fulfil all of these requirements.
I. INTRODUCTION

MODERN linear accelerators require a powerful digital Low Level Radio Frequency (LLRF) system that will stabilize the electromagnetic field in accelerating cavities, operating in pulsed mode, with pulse-to-pulse rms tolerances as low as 0.02% for amplitude and 0.01 degree for phase for the injector [1]. These numbers are derived from the required beam energy spread, bunch compression and bunch arrival time. For the linac the requirements are less stringent, with 0.1% for amplitude and 0.2 deg [2] for phase for uncorrelated errors, assuming that the repetitive correlated errors are reduced by feedforward and slow drifts are compensated by beam-based feedback systems. To obtain such good stability, a digital controller with fast feedback and adaptive feed-forward is required [3]. The LLRF system measures the probe signals in accelerating cavities and digitizes them [4]. The digital signal, in the form of a Vector Sum (VS), is delivered to the real-time controller. The processed in-phase and quadrature components (I and Q) are connected to the Vector Modulator (VM) that modifies the phase and amplitude of the 1.3 GHz reference signal. The modulated signal is amplified and delivered to the accelerating cavities as presented in Fig. 1. The system composed of 32 cavities, LLRF hardware and a single klystron is called an RF station [5]. The forward and reflected power signals can be used instead of the probe signal to maintain the operation of the accelerator when a hardware failure occurs, e.g., a broken cable, DAQ or downconverter module [6], [7]. Moreover, the reflected power signals can be used to calculate cavity detuning caused by Lorentz force and to compensate the detuning using piezo actuators. The distributed LLRF control system for the X-ray Free Electron Laser (XFEL) will consist of 20 RF stations supervising 640 cavities.
High reliability, availability and modular design of the LLRF system are as crucial as operational demands [8] .
Currently, VME is the most popular system architecture used in High Energy Physics (HEP) installations. This architecture, which originated in the late 1970s, has several disadvantages which make it less and less suitable for current highly demanding control systems. The ATCA architecture eliminates most of the VME weaknesses. Modern telecommunication standards, adopted recently for High Energy Physics applications, like ATCA or AMC have the potential to fulfil the above-mentioned requirements [9]-[12]. The standard offers native redundancy of the most critical elements, like power supply and communication interfaces, and useful features, such as hot-swapping or shelf management and monitoring [9]-[11], [13]-[16].
The prototype system, dedicated for the XFEL accelerator, was built with the application of the ATCA standard. Most of the functionality of the LLRF system is implemented in AMC modules. The application of pluggable AMC modules allows an easy upgrade of the system in the future.
The authors propose the architecture for the LLRF system of a single RF station. The demonstration of the ATCA-based LLRF system was performed at the FLASH accelerator at DESY.
A. Low Level RF Controller
The LLRF controller calculates appropriate control signals for the RF field source based on the measurements of the RF field in individual cavities to achieve the required RF field stability.
The simplified block diagram of the algorithm is presented in Fig. 2. The first stage of the algorithm is I-Q demodulation of the signals for individual measurement channels. After a unique calibration of each detected vector, the vectors are summed together to create the Vector Sum (VS) of the fields in the whole accelerating module. This sum is compared with a user-provided Set Point (SP) value and the resulting error signal is processed by a P-Controller in a feedback loop. To create the final control signal, multiple corrections are applied and the resulting I-Q vector is provided at the output.
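The core of this algorithm (calibration, vector sum, comparison with the set point and proportional feedback) can be illustrated with a minimal Python sketch. It is not the authors' firmware: the function names, the four-cavity example, the calibration values and the gain are assumptions chosen only for illustration, and the additional corrections mentioned above are omitted.

```python
import cmath
import math

def vector_sum(probes, calibrations):
    """Sum the calibrated I-Q vectors (complex numbers) into one Vector Sum."""
    return sum(p * c for p, c in zip(probes, calibrations))

def p_controller(vs, set_point, gain):
    """Proportional feedback: scale the error between Set Point and Vector Sum."""
    return gain * (set_point - vs)

# Hypothetical example: four cavities with unit amplitude and small phase offsets.
offsets_deg = (0.0, 2.0, -1.0, 0.5)
probes = [cmath.rect(1.0, math.radians(d)) for d in offsets_deg]
# Assumed calibration: rotate each vector back onto the common reference.
calibrations = [cmath.rect(1.0, -math.radians(d)) for d in offsets_deg]

vs = vector_sum(probes, calibrations)      # close to 4 + 0j
control = p_controller(vs, set_point=4.2 + 0j, gain=10.0)
print(vs, control)
```

Representing each I-Q pair as a complex number keeps the calibration a single complex multiplication (rotation and scaling) per channel.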
B. Centralized Versus Distributed LLRF System
The LLRF system consists of several ATCA carrier blades with three AMC slots each. An ATCA carrier blade supplies the main processing power (FPGA and DSP) and all the functionality required for AMC modules, whereas these modules provide various functionality of the LLRF system. The ATCA-based LLRF control system requires the following AMC devices:
• 8-channel data acquisition module,
• Vector Modulator (VM),
• Trigger and timing module.
A single RF station of the XFEL LLRF control system installed in the accelerator tunnel should supervise 32 superconducting accelerating cavities driven by a single klystron. Groups of eight cavities form four cryomodules (#1-#4) [17]. Therefore, each RF station should be connected to the cavities using 96 RF cables (probe, forward and reflected power signals). The length of the RF cables should be minimized in order to reduce temperature drifts of the cables, signal cross-talk and attenuation. Therefore, two different architectures of the LLRF system were taken into consideration:
• centralised system,
• distributed system.
The centralised system is composed of a single 14- or 16-slot vertical ATCA shelf [18] delivering the infrastructure required by the ATCA standard, e.g., cooling, redundant power supply and Shelf Management (ShM). ATCA carrier blades with various AMC modules and RTMs are installed in the ATCA shelf. The shelf is placed in the central position, next to the cryomodules, as presented in Fig. 3. The length of the cables is acceptable in such a case; however, a significant amount of space is required for the ATCA shelf.
On the other hand, the LLRF system of a single RF station can be split into a few subsystems, each supervising a single cryomodule. The distributed system could be designed using four ATCA shelves. In such a case, 5- or 6-slot horizontal ATCA shelves [19] can be used. The dimensions of the shelves make it possible to install them just above the cryomodules, in close proximity to the accelerating cavities. This solution allows short RF cables to be used. However, the application of four ATCA shelves with the redundant equipment required by the standard would unacceptably increase the cost of the whole accelerator LLRF system.
The remedy for this problem could be a semi-distributed system composed of two horizontal shelves installed above the cryomodules #1-#2 and #3-#4 respectively. The concept of a semi-distributed LLRF system is depicted in Fig. 4. In such a case, Master and Slave systems can be distinguished. The Slave system collects all the data from cryomodules #1 and #2 (cavity probes, reflected and forward power, etc.), digitizes them and sends them to the Master system. The main LLRF controller located in the Master shelf collects data from cryomodules #3 and #4 and receives data from the Slave system. The processed data are provided to the Vector Modulator module and forwarded to the preamplifier and finally to the klystron supplying the cavities. Therefore, low latency connectivity between Master and Slave subsystems is required. The connectivity can be realized using optical fibre and Multi-Gigabit Transceivers (MGTs) available in Xilinx Virtex chips. The control data can be sent using either PCI Express or Gb Ethernet standards. A more detailed discussion concerning data requirements and transmission in the LLRF system based on the ATCA standard can be found in [20]. In the case of the distributed system, redundancy of the main interfaces could be required [21].
The analysis of the system has shown that the horizontal ATCA shelves with 5 or 6 slots can provide the required functionality of the system (including Lorentz force detuning system). The space available above the cryomodules is large enough to house two standard horizontal ATCA shelves [22] .
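The channel counts behind this sizing can be checked with a short calculation. The sketch below is only a rough estimate under stated assumptions: it counts carrier blades for the data acquisition, Vector Modulator and timing AMC modules, assumes that the Vector Modulator and timing modules both sit in the Master shelf, and ignores the CPU blade and the Lorentz force detuning hardware.

```python
import math

CAVITIES_PER_STATION = 32
SIGNALS_PER_CAVITY = 3        # probe, forward and reflected power
CHANNELS_PER_DAQ = 8          # 8-channel data acquisition AMC module
AMC_SLOTS_PER_CARRIER = 3     # three AMC slots per ATCA carrier blade

def daq_modules(cavities):
    """Number of 8-channel DAQ modules needed for the given number of cavities."""
    return math.ceil(cavities * SIGNALS_PER_CAVITY / CHANNELS_PER_DAQ)

def carriers_needed(amc_modules):
    """Number of carrier blades needed to host the given number of AMC modules."""
    return math.ceil(amc_modules / AMC_SLOTS_PER_CARRIER)

total_signals = CAVITIES_PER_STATION * SIGNALS_PER_CAVITY    # 96 RF cables
station_daq = daq_modules(CAVITIES_PER_STATION)              # 12 modules

# Semi-distributed case: each shelf serves two cryomodules (16 cavities).
shelf_daq = daq_modules(16)                                  # 6 modules
slave_carriers = carriers_needed(shelf_daq)                  # 2 carriers
# Assumption: the Master shelf also hosts the VM and timing AMC modules.
master_carriers = carriers_needed(shelf_daq + 2)             # 3 carriers

print(total_signals, station_daq, slave_carriers, master_carriers)
```

Under these assumptions the carrier blades of either shelf fit into a 5- or 6-slot shelf, which is consistent with the analysis above.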
The electronic equipment of the distributed LLRF system installed above the cryomodules will be exposed to higher gamma and neutron radiation doses than in case of the centralised system. Therefore, it is recommended to monitor gamma and neutron fluence in the nearest proximity of the ATCA and AMC equipment.
II. ATCA-BASED LLRF CONTROL SYSTEM
Since the FLASH accelerator is an experimental facility and most of its hardware is installed in an experimental hall outside the tunnel, the centralised system has been chosen. Only three accelerating cryomodules, called ACC 4, ACC 5 and ACC 6 respectively, were available for the demonstration. Therefore, the hardware used during the demonstration consists of four in-house developed carrier blades. Three of them were used for data acquisition and one for reference timing and RF signal modulation. The data acquisition modules, Vector Modulator (VM) and Timing module were implemented as AMC modules installed on carrier blades. The block diagram of the LLRF control system with various links emphasised is presented in Fig. 5 .
The system is managed using an ATCA computation blade Adlink 6900 with a PCIe interface available on the backplane (zone 2). The CPU, running the Debian Linux operating system, is used as the Root Complex required for PCIe. The DOOCS server running on the Adlink CPU was responsible for the configuration of the submodules of the LLRF system. The LLRF system is controlled from the DESY control room via Gb Ethernet. Analogue RF signals at 1.3 GHz from 24 cavities (probe signals, forward and reflected powers) are connected to downconverters installed on RTM modules. The downconverters, supplied with a Local Oscillator (LO) signal in the GHz range, convert them to an intermediate frequency in the MHz range. The signal is digitized using commercially available AMC modules TAMC900. The vector sum signals, partially calculated in the FPGA devices available on the TAMC900s, are sent to the main FPGA present on the carrier blade installed in slot #3. The data transmission uses a custom Low Latency protocol based on the LVDS standard. The measured latency is around 100 ns. The main Feed-Forward and Feed-Back controller, implemented in the FPGA chip available on the carrier blade in slot #3, generates the I-Q signal that is connected to the VM. The VM is responsible for modulating the RF signal used to drive a preamplifier and a 10 MW klystron. The timing module is responsible for generating the clock and trigger signals required in the LLRF system. The hardware used during the demonstration, with Ethernet cables and Xilinx programmers, is depicted in Fig. 6 (only two carrier blades are installed in the shelf).
A. ATCA Carrier Blade
The ATCA carrier designed for LLRF system is equipped with three AMC bays. The board provides low latency computation for Real-Time (RT) controller loop (Xilinx Virtex 5 FPGA) and floating point Digital Signal Processor for more complex applications required for system diagnostics. The photograph of the carrier blade with an RTM module and three AMC modules is presented in Fig. 7 .
The carrier blade provides all the necessary interfaces for the RT controller, system management and diagnostics required by the ATCA standard. A more detailed discussion of various links can be found in [20] . The FPGA device is connected using Low Latency Links (LLL) with all three AMC modules. The main carrier blade with the VM has low latency connectivity to all other carrier blades, see Fig. 5 . Each carrier blade has a built-in, eight channel PCIe x1 switch connected to the AMC modules, the FPGA on the carrier blade and two channels on the backplane (zone 2). The PCIe chain link between the carrier blades is created during activation using Intelligent Platform Management Interface (IPMI) mechanisms, see Fig. 5 .
The reference timing signal, in the MHz range, is generated by the Timing module from the Master Oscillator (MO) and distributed together with trigger signals using a custom timing backplane.
All analogue signals are transmitted from the RTM module using Erni connectors (zone 3). 
B. AMC Modules
Most of the functionality of the LLRF system, required for its basic operation, was implemented using hot-swappable AMC modules:
• 8-channel data acquisition module,
• Vector Modulator,
• Trigger and timing generation module.
The DAQ TAMC900 module was manufactured by the TEWS company, whereas the VM and timing modules were developed in-house. Since most of the AMC modules used for the LLRF system (VM, Timing and IO) require all the functionality defined by the AMC standard (MMC, power subsystem), communication interfaces and processing power, the AMC modules were divided into two submodules called AMC_A and AMC_B respectively. The digital AMC_B board includes all the digital hardware required by the AMC standard and FPGA processing power, whereas the AMC_A module is fitted with all the hardware required for subsystem functionality, e.g., VM, timing hardware or IOs [23].
1) Local Timing Generation and Distribution:
The LLRF control system requires synchronization of devices within the ATCA crate by means of timing signals. Two types of timing signals are distributed locally within the ATCA shelf: the trigger and the clock signals. Both signal types are generated by the AMC Timing Receiver card (called AMC_TM). The trigger signals are derived from a fiber-optic receiver part of the AMC_TM. Three different CMOS-level trigger signals are decoded in the FPGA and sent to the AMC connector of the Timing Receiver. Three independent clocks are also generated by the AMC_TM. The clock frequency can be flexibly programmed in the range from 10 MHz to 100 MHz. Clocks are synchronized to the RF phase reference signal delivered to the ATCA shelf from the Master Oscillator [24] system. The required jitter value for the clock is less than 5 ps. The jitter demonstrated at the output of the AMC_TM in laboratory conditions varies between 1 ps and 1.3 ps depending on the clock frequency [25] , [26] . The timing distribution scheme is shown in Fig. 8 . The AMC_TM can be located in any AMC bay on the carrier. The timing signals are distributed over the carrier to the remaining AMC bays. There is also a bidirectional connection between the ATCA carrier blade and the custom Zone 3 Backplane (Z3B). The Z3B allows for distribution of the trigger and clock signals to other carrier blades inside the ATCA shelf. This architecture allows for distribution of the timing signals from any AMC bay on any carrier to all other boards within the shelf. The redundancy is also supported by a possibility of switching between two AMC_TM cards in case of failure of one of the modules.
2) Vector Modulator: The AMC Vector Modulator module (AMC_VM) is used as an actuator device delivering an RF output signal of the LLRF control system. AMC_VM output signal drives the klystron. Vector Modulator card contains an I-Q modulator circuit [27] that can change the amplitude and phase of the MO phase reference signal. In the presented system, the modulator I and Q inputs are driven by high-speed DACs. The DAC control signals are transmitted to the module via the low-latency link from the ATCA carrier. The AMC_VM module also contains DC offset conditioning circuits, RF power amplifier, programmable attenuator, RF-gate and diagnostic circuits. The attenuator allows for implementation of an automatic output power level control. The RF-gate protects the AMC_VM output from high level signals coming from the klystron input and it can also be used for fast shutting down of the klystron drive signal in case of interlock or fault detection. A detailed AMC_VM card description is given in [28] .
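The way an I-Q modulator sets the amplitude and phase of a carrier can be shown with the textbook relation I cos(wt) - Q sin(wt) = A cos(wt + phi), where A = sqrt(I^2 + Q^2) and phi = atan2(Q, I). The Python sketch below models an ideal modulator only; the function names and example values are assumptions, and the DAC, offset and attenuator stages of the AMC_VM are not modelled.

```python
import math

def iq_to_polar(i, q):
    """Convert I-Q drive values into amplitude and phase (degrees)."""
    return math.hypot(i, q), math.degrees(math.atan2(q, i))

def modulate(i, q, f_carrier, t):
    """Ideal I-Q modulator output at time t: I*cos(wt) - Q*sin(wt)."""
    w = 2.0 * math.pi * f_carrier
    return i * math.cos(w * t) - q * math.sin(w * t)

# Hypothetical drive values applied to the 1.3 GHz reference signal.
i_drive, q_drive = 0.5, 0.5
f_ref = 1.3e9
t = 0.1e-9

amplitude, phase_deg = iq_to_polar(i_drive, q_drive)
direct = modulate(i_drive, q_drive, f_ref, t)
# The same sample expressed as an amplitude- and phase-shifted carrier.
polar = amplitude * math.cos(2.0 * math.pi * f_ref * t + math.radians(phase_deg))
print(amplitude, phase_deg, direct, polar)
```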
3) Data Acquisition Module: The TAMC900 single width, mid-height AMC has been selected for data acquisition. It consists of eight LTC2254 14-bit, 105 MS/s ADCs, two modules of 250 MHz 18-bit 1 Mword QDR-II SRAM at two independent channels and the XC5VLX30T Virtex-5 FPGA. Data from ADCs can be transmitted to the CPU by up to x8 PCIe link. The configuration of the on-board peripherals is maintained by a XC95144XL CPLD. The TAMC900 provides three clock inputs and three trigger inputs. The three external clock inputs and the PCIe reference clock are routed to a flexible clocking scheme that allows independent clocking of the ADCs in two groups. The trigger inputs are routed directly to the FPGA. The inputs to ADCs are connected via the AMC connector and the carrier blade to the downconverters on the RTM. The module also features a set of LVDS connections [29] , used as a low-latency link to the remaining parts of the controller.
4) Radiation Monitoring Module:
The Radiation Monitoring Module (RMM) is designed as an AMC module [30]. It is not indispensable for the operation of the LLRF control system, although it could be helpful when the system is installed in the tunnel and exposed to radiation. The module, equipped with gamma and neutron radiation detectors, allows the doses absorbed by the LLRF hardware to be estimated. A RadFET detector is used for gamma dosimetry, whereas a Static Random Access Memory (SRAM) chip is applied as a reference error counter or, when calibrated, as a neutron fluence detector [31], [32].
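The principle of using an SRAM as an error counter can be sketched as follows: a known pattern is written, read back later, and the flipped bits are counted; with a calibrated per-bit upset cross-section, the count can be converted into a fluence estimate using N = sigma x fluence x number of bits. The code below is an illustration only; the pattern, the memory size and the cross-section value are hypothetical and are not taken from the RMM design.

```python
def count_upsets(reference: bytes, readback: bytes) -> int:
    """Count the bits that differ between the written pattern and the readback."""
    if len(reference) != len(readback):
        raise ValueError("buffers must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(reference, readback))

def neutron_fluence(upsets, n_bits, cross_section_cm2_per_bit):
    """Estimate the fluence (n/cm^2) from the upset count and a calibrated cross-section."""
    return upsets / (n_bits * cross_section_cm2_per_bit)

# Hypothetical 1 KiB memory filled with a checkerboard pattern.
reference = bytes([0x55] * 1024)
readback = bytearray(reference)
readback[10] = 0x54      # one flipped bit
readback[500] = 0x57     # one flipped bit

upsets = count_upsets(reference, bytes(readback))
# The cross-section below is a made-up calibration constant.
fluence = neutron_fluence(upsets, n_bits=8 * len(reference),
                          cross_section_cm2_per_bit=1e-13)
print(upsets, fluence)
```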
Since the LLRF control system was installed outside the accelerator tunnel and it was not exposed to radiation during tests, the RMM module was not tested.
C. Lorentz Force Detuning System
In the XFEL, piezoelectric actuators are to be used to compensate the changes in the cavity resonant frequency caused by Lorentz forces deforming the cavities during the RF pulse. This system requires the generation of control signals at the level of hundreds of volts, which is uncommon for an ATCA-based system. This part has not yet been fully implemented and tested. Two possibilities are considered:
• a separate crate connected to the ATCA system via an optical cable, • a special, custom-made ATCA board.
D. Distributed LLRF Controller
The presented architecture of the hardware platform requires special considerations for the LLRF controller firmware design. To ensure efficient usage of resources, the control algorithm blocks have been distributed among several FPGA chips connected with each other using Low Latency Links.
As presented in Fig. 9 , the blocks related to field component detection are moved directly to FPGAs on the TEWS AMC modules. Each module calculates a partial Vector Sum. The final VS calculation and the main part of the feedback algorithm is executed on the carrier board installed in slot #3, see Fig. 5 . The intermediate control signal is transmitted to the Vector Modulator board where the final corrections are applied and the control signal is provided at the output.
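The split into partial and final Vector Sums relies only on the associativity of addition, as the following minimal sketch shows. It assumes, hypothetically, that the probe signals of 24 cavities are divided evenly among three modules and that all calibration coefficients are equal to one; the actual channel assignment in the presented system is not modelled.

```python
def partial_vector_sum(iq_samples, calibration):
    """Vector Sum of the calibrated I-Q samples handled by one module."""
    return sum(v * c for v, c in zip(iq_samples, calibration))

# Hypothetical I-Q samples (complex numbers) for 24 cavities.
cavities = [complex(1.0 + 0.01 * k, 0.02 * k) for k in range(24)]
calib = [1.0 + 0j] * 24

# Assumed assignment: eight cavities per module.
modules = [slice(0, 8), slice(8, 16), slice(16, 24)]
partials = [partial_vector_sum(cavities[s], calib[s]) for s in modules]

final_vs = sum(partials)                          # computed on the main carrier
direct_vs = partial_vector_sum(cavities, calib)   # single-device reference
print(final_vs, direct_vs)
```

Only one complex value per module has to cross the low-latency link, instead of one value per cavity.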
Distributing the algorithm in this way requires several interconnections between FPGAs on several boards. The communication links were implemented as differential lines using the LVDS standard. In the presented application, the latency of the whole algorithm is important: excessive latency reduces the maximum achievable feedback loop gain. To minimize the latency and the number of lines used, a Double Data Rate (DDR) serial transmission protocol is used. The achieved latency of a single link is around 100 ns.
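The trade-off between the number of lines and the serialization time can be estimated with simple arithmetic: with DDR signalling each line carries two bits per clock cycle. The sketch below uses a hypothetical payload size, lane count and clock frequency; these are not the parameters of the presented link, and the result covers serialization only, not the remaining contributions to the measured latency.

```python
def transfer_time_ns(payload_bits, lanes, clock_mhz, ddr=True):
    """Time needed to shift a payload over serial lines, in nanoseconds."""
    bits_per_cycle = lanes * (2 if ddr else 1)   # DDR uses both clock edges
    cycles = -(-payload_bits // bits_per_cycle)  # ceiling division
    return cycles * 1e3 / clock_mhz

# Hypothetical link: 32-bit payload, 4 differential lines, 100 MHz clock.
ddr_ns = transfer_time_ns(32, lanes=4, clock_mhz=100.0, ddr=True)
sdr_ns = transfer_time_ns(32, lanes=4, clock_mhz=100.0, ddr=False)
print(ddr_ns, sdr_ns)
```

For the same clock and number of lines, DDR halves the serialization time; alternatively, it halves the number of lines needed for a given time.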
E. Diagnostics and Control Software
DOOCS [33] is a distributed object oriented control system designed to provide a reliable software platform for high energy experiments conducted in the DESY research center. It provides functionality from the device server level up to the operator console [34]. The DOOCS server for the ATCA-based LLRF control system was installed on the ADLink carrier board working under control of the DESY distribution of the Linux operating system. Therefore, it was able to communicate directly with each system component through the PCIe bus. The main task of the server was to process input calibration coefficients taken from the operator control panels presented in Fig. 10, interpret them, recalculate the control parameters and send them back to the appropriate hardware subsystems. The server was synchronized with the hardware system through the USR1 signal generated by the PCIe device driver in response to the interrupt from the LLRF controller. Therefore, after every operation phase of the accelerator, it updates the current information about the system activities. Consequently, the operators can monitor the status of the machine in real time and react quickly to detected anomalies.
F. Intelligent Platform Management Interface
The Intelligent Platform Management Controller (IPMC) is a vital part of every intelligent Field Replaceable Unit (FRU) in ATCA. It is responsible for communication with the redundant Shelf Managers (ShMs) as well as Advanced Mezzanine Cards (AMCs) in case of ATCA carrier boards. The entirety of management communication concerning any given FRU is governed by the IPMC. The states of various sensors, including voltage, current and temperature, need to be monitored by it in order to be able to detect and possibly prevent failures. They must at least inform the System Manager about such occurrences.
Moreover, the IPMC has to deal with Electronic Keying (EK), that is, linking the devices in an ATCA shelf according to the interfaces they are able to support. This removes the need for manual cable manipulation at the front panel, since all the connections have been moved to the backplane. In the case of the Carrier Board (CB) for the LLRF control system, the PCI Express (PCIe) protocol has been used for sending the configuration parameters to the LLRF controller. Each CB stores information about the possible connections in a non-volatile EEPROM. These data are presented to the ShM during activation, to which the ShM responds with requests to activate certain links. The IPMC, knowing the relative position of its CB in the shelf as well as the states of the other CBs, enables only those connections that are necessary to create the PCIe chain as presented in Fig. 5. This procedure is repeated every time a board is activated or deactivated in the shelf, and the IPMC is responsible for maintaining the proper links. This is done automatically, without user intervention, thanks to dedicated algorithms implemented in the IPMC.
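The matching idea behind Electronic Keying can be illustrated with a simplified sketch: a link on a backplane channel is enabled only if both boards advertise a common interface type for that channel. The data layout, channel numbers and interface names below are hypothetical; the real mechanism exchanges link descriptor records via IPMI and is considerably more detailed.

```python
def enabled_links(descriptors_a, descriptors_b):
    """Return {channel: link_type} for channels where both boards share a type."""
    enabled = {}
    for channel, types_a in descriptors_a.items():
        common = types_a & descriptors_b.get(channel, set())
        if common:
            # Pick one common type deterministically (alphabetical order).
            enabled[channel] = sorted(common)[0]
    return enabled

# Hypothetical capabilities stored by two neighbouring carrier boards.
carrier = {1: {"PCIe"}, 2: {"PCIe", "LVDS"}, 3: {"MGT"}}
neighbour = {1: {"PCIe"}, 2: {"LVDS"}, 4: {"MGT"}}

links = enabled_links(carrier, neighbour)
print(links)
```

Channels 3 and 4 stay disabled in this example because only one side advertises them.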
III. INITIAL TESTS OF THE SYSTEM IN FLASH TUNNEL
The main goal of the tests of the prototype ATCA system at FLASH was to demonstrate that this standard fulfils many technical aspects required for XFEL. The performed demonstration was only an initial test of the system performed in two steps: running in the laboratory and controlling an accelerating field in 24 cavities of the FLASH accelerator with feedback.
A. DAQ-Diagnostics and Readout
For diagnostic purposes, the data from all the TAMC900 channels can be stored in QDR memory during the RF pulse and transferred via the PCI Express interface between the pulses. The ADC data bus is 15 bits wide, as the converters provide 14 bits of sample data and an overflow flag. The block diagram of the firmware is presented in Fig. 11. The Integral Interface [35] is used to communicate with the firmware control registers. DMA has been implemented for the transfer of the ADC samples, achieving data rates of 140 MB/s for an x1 link and 400 MB/s for an x4 link.
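A short sketch shows how such a 15-bit word could be unpacked in software and how long a readout would take at the quoted DMA rates. The bit layout is an assumption made for illustration (overflow flag in bit 14, sample in two's complement in bits 13 to 0), as are the buffer size and the interpretation of 1 MB as 10^6 bytes; the actual firmware format may differ.

```python
def unpack_adc_word(word):
    """Split a 15-bit word into a signed 14-bit sample and an overflow flag."""
    overflow = bool(word & 0x4000)       # assumed position of the flag: bit 14
    raw = word & 0x3FFF                  # 14-bit sample field
    sample = raw - 0x4000 if raw & 0x2000 else raw   # two's complement (assumed)
    return sample, overflow

def transfer_time_ms(n_samples, bytes_per_sample, rate_mb_per_s):
    """Readout time in milliseconds for a buffer at a given DMA rate."""
    return n_samples * bytes_per_sample / (rate_mb_per_s * 1e6) * 1e3

minus_one = unpack_adc_word(0x3FFF)              # (-1, False)
clipped = unpack_adc_word(0x4000 | 0x1FFF)       # (8191, True)

# Hypothetical buffer: 70000 samples stored as 2 bytes each.
t_x1 = transfer_time_ms(70000, 2, 140.0)         # x1 link
t_x4 = transfer_time_ms(70000, 2, 400.0)         # x4 link
print(minus_one, clipped, t_x1, t_x4)
```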
B. Tests in Laboratory
The goal of the tests in the laboratory was to put all the analogue and digital components in one ATCA shelf, connect the test signals, install the software and make it all run together. The tested system comprised four carrier blades equipped with nine TAMC900 ADC modules, one AMC vector modulator and one AMC timing module. All the boards were managed by an ATCA CPU-6900 blade located in slot 1. The configuration is presented in Fig. 12(a). The three ATCA blades with ADC modules were coupled with RTM modules where the analogue signals were connected: 24 probe, 24 forward and 24 reflected power signals from 24 cavities. The RTM modules are visible in Fig. 12(b).
During the tests, the software installed on the CPU was controlling other modules over the backplane using PCIe protocol. The software controlled 15 FPGA chips with firmware performing cavity field detection, timing configuration and distribution, analogue vector modulator control and data acquisition. The laboratory test also helped to improve software for IPMI running on every module in the shelf.
C. Tests of the 24 Cavity System
An RF station for XFEL will consist of 32 cavities in 4 accelerating modules driven by one klystron. The FLASH accelerator was an excellent place for testing the system. In 2009 the modules ACC4/5/6 (24 cavities in total) were driven by one klystron. This is very similar to the XFEL setup. The system was installed in the experimental hall and real signals of the FLASH accelerator were connected. The system was running in feedback mode, controlling the field in 24 cavities and driving a 10 MW klystron. The dependency of the residual amplitude and phase noise on the loop gain has been measured with the detectors in the loop, due to the difficulty of the implementation and vector-sum calibration of a second LLRF control system in parallel. Previous measurements with beam and parallel LLRF installations have shown that the measurement results are not dominated by internal noise sources. The gain of the feedback loop was changed from 0 to 60 and the amplitude and phase stability of the vector sum of 24 cavities was calculated for each feedback gain. Figs. 13 and 14 present the results from the experiment, where the dashed line is the minimum value. The optimal point was achieved for a gain equal to 35; the corresponding amplitude and phase stability values are the minima marked in Figs. 13 and 14. The achieved stability fulfils the requirements for field stability in the linac section of the XFEL accelerator (0.1% for amplitude and 0.2 deg for phase).
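The kind of evaluation described here can be sketched as follows: for each gain, the pulse-to-pulse rms spread of the vector-sum amplitude (relative, in %) and phase (in degrees) is computed, and the gain with the smallest spread is selected. The code is a minimal illustration, not the authors' analysis; all numbers in it are synthetic and do not reproduce the measured results.

```python
import math
import statistics

def field_stability(vector_sums):
    """Return (relative rms amplitude spread in %, rms phase spread in degrees)."""
    amps = [abs(v) for v in vector_sums]
    # Valid for phases well away from +/-180 degrees (no unwrapping is done).
    phases = [math.degrees(math.atan2(v.imag, v.real)) for v in vector_sums]
    amp_rms = statistics.pstdev(amps) / statistics.mean(amps) * 100.0
    return amp_rms, statistics.pstdev(phases)

def best_gain(scan):
    """Return the gain with the smallest amplitude spread in a {gain: (amp, phase)} scan."""
    return min(scan, key=lambda g: scan[g][0])

# Synthetic pulse-to-pulse vector sums for one gain setting.
pulses = [1.001 + 0j, 0.999 + 0j, 1.001 + 0j, 0.999 + 0j]
amp_rms, phase_rms = field_stability(pulses)

# Synthetic scan results, for illustration only.
scan = {10: (0.05, 0.05), 20: (0.01, 0.01), 30: (0.03, 0.02)}
optimum = best_gain(scan)
print(amp_rms, phase_rms, optimum)
```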
IV. SUMMARY
During the tests in the FLASH accelerator the centralized version of the system has been used. The system was designed using the ATCA and AMC specifications. The main objective of the project was to verify and prove that modern standards, developed for telecommunication applications, could be useful for building of the LLRF control system.
The ATCA standard allows complex systems to be designed and offers the high reliability, availability and serviceability desired in HEP. These features require the implementation of sophisticated mechanisms, like shelf management and redundancy. Since the standard was developed for telecommunication, it supports a wide variety of gigabit digital interfaces. They are necessary in LLRF control systems, but the standard does not consider the distribution of high-frequency analogue signals and low-jitter reference clocks, which are also required in HEP applications.
The developed system has proved that analogue high-frequency signals can coexist with digital signals (e.g., PCIe, GbE). There were some problems with the clock distribution. The bus topology used for signal distribution on the ATCA carrier blades introduces a large clock jitter. Therefore, additional signal splitters on the carrier blades are necessary to improve clock stability. This has been corrected in the second revision of the carrier blade.
Since the standard was developed for telecommunication, the sophisticated subsystems required by LLRF are not available on the market. The whole system apart from the TAMC900 DAQ boards has been custom-made. The custom components, never tested before, included the carrier blades, AMC_B modules, Vector Modulator, Timing module, Timing backplane and Intelligent Platform Management/Module Management Controllers. The components integrated together seamlessly and operated very well, fulfilling all the expectations.
The PCI Express bus has turned out to be a very efficient solution for data acquisition for monitoring purposes. However, the latency introduced by this standard is too large for the distributed real-time LLRF controller. Two interfaces were evaluated for low-latency LLRF data transmission: an interface based on a pure LVDS serial connection and the gigabit transceivers offered by Xilinx FPGAs. Both solutions allow data transmission with latencies in the range of a hundred ns; therefore, they make it possible to realize a distributed controller based on 4 carrier blades. Electronic keying was implemented for all signals available on the ATCA backplane (PCIe, LVDS and Xilinx gigabit transceivers).
The system has very large computational power; multiple digital signal processors and FPGAs make the system easily upgradable. The implementation of remote firmware upgrade via the IPMI protocol based on the HPM.1 standard is under development.
Estimating the reliability of the system requires much more time, and the reliability is not known at the moment. Plenty of design errors have been detected and fixed in the next revision. A new version of the blades is being manufactured and new tests are planned. Many questions are still unanswered; however, the demonstration at FLASH has proven that the ATCA standard can be used for instrumentation purposes where many channels of small signal levels must be processed with low noise and low latencies on the order of a hundred nanoseconds.
The ATCA and AMC specifications allow a complex LLRF system to be designed; however, the development costs are relatively high in comparison with other standards (e.g., VME).
The PICMG "xTCA for Physics" subgroup is currently developing a new TCA standard with RTM extension that offers similar functionality to ATCA. The new standard, intended for HEP applications, is very promising from the LLRF point of view. It requires lower development cost in comparison with the ATCA. Moreover, some components developed for the presented system (AMC modules) can still be used in the new TCA standard.
