EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH European Laboratory for Particle Physics



Large Hadron Collider Project

LHC Project Report 1151

## DIGITAL SIGNAL PROCESSING FOR THE MULTI BUNCH LHC TRANSVERSE FEEDBACK SYSTEM

P. Baudrenghien, W. Höfle, G. Kotzian, V. Rossi CERN, Geneva, Switzerland

### Abstract

For the LHC a VME card has been developed that contains all functionalities for transverse damping, diagnostics and controlled bunch-by-bunch excitation. It receives the normalized bunch-by-bunch position from two pick-ups via Gigabit Serial Links (SERDES). A Stratix II FPGA is responsible for resynchronising the two data streams to the bunch-synchronous clock domain (40.08 MHz) and then applying all the digital signal processing. In addition to the classic functionalities (gain balance, rejection of closed orbit, pick-up combinations, one-turn delay) it contains 3-turn Hilbert filters for phase adjustment with a single pick-up scheme, a phase equalizer to correct for the non-linear phase response of the power amplifier, and an interpolator to double the processing frequency followed by a low-pass filter to precisely control the bandwidth. Using two clock domains in the FPGA, the phase of the feedback loop can be adjusted with a resolution of 10 ps. Built-in diagnostic memory (observation and post-mortem) and excitation memory for setting-up are also included. The card receives functions to continuously adjust its parameters as required during injection, ramping and physics.

Presented at EPAC'08, 11th European Particle Accelerator Conference, Genoa, Italy - June 23-27, 2008

CERN, CH-1211 Geneva 23 Switzerland


#### **INTRODUCTION**

The LHC transverse feedback system must cover the frequency range from the lowest betatron frequency of 3 kHz up to 20.04 MHz, half the bunch repetition frequency of 40.08 MHz, in order to damp all possible coupled-bunch dipole oscillations. The power system of the transverse damper is described in more detail in [1]. The minimum required sampling rate is twice the required bandwidth, i.e. sampling at the bunch repetition frequency of 40.08 MHz. In total there are four independent transverse damper systems in the LHC, one per ring and transverse plane. Two 50 $\Omega$ coupler-type pick-ups are used to detect the betatron oscillations. A beam position front-end has been developed [2] to process the signals from the pick-ups and generate a digital input to the Digital Signal Processing Unit (DSPU) at a rate of 40.08 MHz. The front-end band-pass filters the input signal at 400.8 MHz using a sampled line filter and down-converts it to baseband using the 400.8 MHz reference clock. The I and Q baseband signals from the sum and delta signals are digitized using a 40.08 MHz beam-synchronous clock. A normalised beam position is computed digitally for each bunch from the digitized I and Q values of the delta and sum signals. The function of the DSPU described in this paper is to provide the signal processing necessary to combine the signals from the two pick-ups, delay them appropriately and synchronize the generated kicker signal with the beam. It will be controlled via function generators.
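The normalisation step mentioned above can be illustrated with a minimal sketch. The exact formula used in the front-end is not given here, so the projection of delta onto sum below is an assumption, and the function name `normalized_position` is hypothetical.

```python
import numpy as np

def normalized_position(i_delta, q_delta, i_sum, q_sum):
    """Per-bunch position as the in-phase part of delta/sum (assumed formula)."""
    delta = np.asarray(i_delta, dtype=float) + 1j * np.asarray(q_delta, dtype=float)
    total = np.asarray(i_sum, dtype=float) + 1j * np.asarray(q_sum, dtype=float)
    # Projecting delta onto sum removes the dependence on the common RF phase
    # and on the bunch intensity.
    return np.real(delta * np.conj(total)) / np.abs(total) ** 2

# Two bunches with the same relative offset but different intensities,
# both seen with an arbitrary common RF phase of 0.3 rad (made-up numbers).
rot = np.exp(1j * 0.3)
sum_sig = np.array([1.0, 2.5]) * rot
delta_sig = 0.1 * sum_sig
pos = normalized_position(delta_sig.real, delta_sig.imag, sum_sig.real, sum_sig.imag)
```

In this sketch both bunches yield the same normalised value even though their sum signals differ, which is the property the normalisation is meant to provide.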

### **OVERVIEW OF SIGNAL PROCESSING**

Fig. 1 shows an overview of the signal processing on the DSPU VME board. The bit streams from the two BPM modules are received on the DSPU board and resynchronized using a common revolution frequency clock. Each data stream is clocked into a memory and then read out synchronously with the 40.08 MHz clock. The gain balance block normalizes the signals, which are proportional to $\sqrt{\beta}$ at the corresponding pick-up locations. A digital notch filter removes the closed orbit variation. An optional Hilbert filter [3] can be switched on. It shifts the pick-up signals in betatron phase by the amount needed to achieve an optimal feedback phase. The signals from the two pick-ups are then combined using two coefficients $b_1$ and $b_2$. These can be programmed by a function generator, enabling adjustment during the accelerating cycle. If the Hilbert filters are used for the phase adjustment, the system can run on a single pick-up, or $b_1$ and $b_2$ can both be set equal to one so that the signals from both pick-ups are added to improve the S/N ratio. After the pick-up signals have been combined, a delay is added that complements the electronic and cable delays so that the total equals the particle time of flight from pick-up to kicker plus one turn. The fine delay is adjusted by a change of clock domain, using an externally delayed clock for the further signal processing. A sign bit can be used to switch from damping to anti-damping for a particular bunch.
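One common way to reject the closed orbit in a bunch-by-bunch feedback is to subtract, for each bunch, the position measured one turn earlier. The sketch below illustrates that idea only; the actual notch structure in the DSPU is not specified in this text. The function name and the figure of 3564 bunch slots per turn at 40.08 MHz are assumptions made for illustration.

```python
import numpy as np

BUNCH_SLOTS_PER_TURN = 3564  # assumption: 25 ns slots per LHC turn at 40.08 MHz

def closed_orbit_notch(x, period=BUNCH_SLOTS_PER_TURN):
    """y[n] = x[n] - x[n - period]: subtract the same bunch one turn earlier."""
    x = np.asarray(x, dtype=float)
    y = x.copy()               # the first turn has no history and passes through
    y[period:] -= x[:-period]
    return y

# Toy machine with 8 bunch slots per turn to keep the example small.
period, turns, tune = 8, 64, 0.31
n = np.arange(period * turns)
orbit = np.tile(np.linspace(-1.0, 1.0, period), turns)       # static offset per bunch
betatron = 0.05 * np.cos(2 * np.pi * tune * (n // period))   # turn-by-turn oscillation
orbit_only = closed_orbit_notch(orbit, period)
out = closed_orbit_notch(orbit + betatron, period)
```

After the first turn, the static per-bunch offset cancels exactly while the betatron oscillation, which changes from turn to turn, survives. Such a filter has notches at all revolution harmonics.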

While a 40.08 MHz clock rate is, strictly speaking, sufficient for a 20.04 MHz bandwidth, the further processing requires a faster clock rate. Following interpolation to 80.16 MHz, a 32-tap FIR filter is used to compensate the power amplifier phase response. Using further FIR filters, the gain can be optimized for injection damping and instability control, and the roll-off beyond 20 MHz can be shaped. Overall loop gain adjustment is provided via the reference to the DAC.
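The interpolation step can be sketched as zero-stuffing followed by a low-pass FIR filter. The 31-tap Hamming-windowed sinc below is an illustrative design chosen for this example, not the coefficient set used in the DSPU, and the function names are hypothetical.

```python
import numpy as np

F_IN = 40.08e6       # bunch-synchronous rate from the text
F_OUT = 2 * F_IN     # 80.16 MHz after interpolation

def lowpass_taps(num_taps=31, cutoff_hz=20.04e6, fs=F_OUT):
    """Windowed-sinc low-pass (illustrative design, not the DSPU coefficients)."""
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(2 * cutoff_hz / fs * k) * np.hamming(num_taps)
    return h / h.sum()          # unity gain at DC

def interpolate_by_two(x, taps):
    """Zero-stuff to twice the rate, then low-pass to remove the spectral image."""
    up = np.zeros(2 * len(x))
    up[::2] = x
    # The factor 2 restores the amplitude lost by inserting zeros.
    return 2.0 * np.convolve(up, taps, mode="same")

# A 1 MHz test tone sampled at 40.08 MHz, interpolated to 80.16 MHz.
x = np.sin(2 * np.pi * 1e6 * np.arange(400) / F_IN)
y = interpolate_by_two(x, lowpass_taps())
```

Away from the edges of the record, `y` follows the same 1 MHz tone sampled at 80.16 MHz. The cut-off frequency and tap count of the low-pass filter are the parameters that would set the roll-off beyond 20 MHz.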



Figure 1: Overview of functionalities in DSPU module.

## PICK-UP MIXING COEFFICIENTS

The two pick-up mixing coefficients follow from

$$b_{1,2} = -\frac{1}{2} \left( \frac{\cos(\Delta \phi_{Qkm})}{\cos(\Delta \phi/2)} \mp \frac{\sin(\Delta \phi_{Qkm})}{\sin(\Delta \phi/2)} \right) \tag{1}$$

 $\Delta \phi = \phi_2 - \phi_1$  is the phase advance between the two pickups and

$$\Delta \phi_{Qkm} = 3\pi Q_f + \phi_k - (\phi_2 + \phi_1)/2 \tag{2}$$

with  $Q_f$  denoting the fractional tune,  $\phi_1$ ,  $\phi_2$  and  $\phi_k$  the betatron phase advances at pick-up and kicker locations with respect to a fixed reference. The coefficients are normalized such that the resulting vector is independent of the phase adjustment,  $b_1^2 + b_2^2 + 2b_1b_2\cos(\Delta\phi) = 1$ .
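Equations (1) and (2) can be evaluated directly, and the normalization condition checked numerically. The sketch below does this for made-up phase advances and fractional tune; these numerical values and the function name are purely illustrative.

```python
import math

def mixing_coefficients(q_frac, phi_1, phi_2, phi_k):
    """Return (b1, b2) from Eqs. (1) and (2); all phases in radians."""
    d_phi = phi_2 - phi_1
    d_phi_qkm = 3 * math.pi * q_frac + phi_k - (phi_2 + phi_1) / 2
    c = math.cos(d_phi_qkm) / math.cos(d_phi / 2)
    s = math.sin(d_phi_qkm) / math.sin(d_phi / 2)
    # Upper sign of the "-/+" in Eq. (1) gives b1, lower sign gives b2.
    return -0.5 * (c - s), -0.5 * (c + s)

# Hypothetical optics values, for illustration only.
phi_1, phi_2, phi_k, q_frac = 0.4, 1.7, 2.9, 0.31
b1, b2 = mixing_coefficients(q_frac, phi_1, phi_2, phi_k)
norm = b1**2 + b2**2 + 2 * b1 * b2 * math.cos(phi_2 - phi_1)
```

The quantity `norm` equals one for any choice of the phases. Equation (1) is singular when $\Delta\phi$ is a multiple of $\pi$, so the two pick-ups must not be separated by such a phase advance.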

## KICKER AND POWER AMPLIFIER PHASE RESPONSE COMPENSATION

The kicker and power amplifier transfer functions are described in more detail in [1]. Due to the low-pass characteristic of the power system, its phase shifts by $-90^{\circ}$ over the full range of operating frequencies, from a few kHz to 20 MHz. A 32-tap FIR filter has been developed that linearizes the overall phase response to achieve a constant group delay. Figure 2 shows the impulse response of this filter, clocked at 80.16 MHz. The values plotted are the coefficients of the FIR filter.
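One textbook way to linearize the phase of a low-pass stage is to cascade it with a filter whose impulse response is the time-reversed copy of the stage's own response. The sketch below illustrates that principle only. It is not claimed to be the design of the filter in Figure 2, the one-pole model and the 1 MHz corner frequency are assumptions, and this approach also squares the magnitude response, which a real design would have to address.

```python
import numpy as np

FS = 80.16e6        # FIR clock rate from the text
F_CUT = 1.0e6       # hypothetical -3 dB frequency of the power stage
N_TAPS = 32

# One-pole low-pass model of the power stage, sampled at FS (assumption).
# The response is truncated to 32 taps, so the model is only approximate.
a = np.exp(-2 * np.pi * F_CUT / FS)
h_amp = (1 - a) * a ** np.arange(N_TAPS)

# Time-reversed copy: same magnitude response, negated phase plus a pure delay.
h_eq = h_amp[::-1]

# The cascade is the autocorrelation of h_amp, which is symmetric about its
# centre tap and therefore has exactly linear phase (constant group delay).
h_total = np.convolve(h_amp, h_eq)
group_delay_samples = (len(h_total) - 1) / 2
```

The symmetry of `h_total` is what guarantees the constant group delay, here 31 samples at 80.16 MHz.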



Figure 2: Impulse response (coefficients) of the 32-tap phase correction filter for the power amplifier.

## HARDWARE AND FIRMWARE IMPLEMENTATION

The DSPU module consists of a single 220 mm deep VME card (Figure 3). On the back side, the P1 connector (top) implements the 16-bit data VME interface, while P2 connects to a custom-designed backplane distributing timing, bunch-synchronous clocks, functions, interlock lines, JTAG and a low-noise linear power supply reserved for the analog circuitry.

Figure 3: The ADT DSPU VME module, EDA-01777

The data streams from the two pick-ups (16-bit words, bunch-by-bunch position at 40.08 MHz) arrive on the front-panel serial connectors at a data rate of 1 Gbit/s. The SERDES (TI TLK1501) convert the data into parallel 16-bit words that are fed into the FPGA (Altera Stratix II EP2S90). The third SERDES channel is intended for future extensions. The first processing block (Figure 1) resynchronizes the data with the bunch-synchronous 40.08 MHz clock, derived from the 80.16 MHz clock received on the board, and extracts the revolution frequency marker, a pointer to bunch number 1. It is used in several places in the LHC Low Level [4]. The FPGA contains the signal processing blocks described above and implements the VME interface for setting the various parameters and for reading the logging memories.

Eight channels (the two input signals, the output, two intermediary signals and three spares) are stored in two banks of memory (eight synchronous SRAM chips, IDT71V67603, 256k x 36, upgradable to 512k x 36). The Observation memory bank records data at a rate that can be set by the user from a maximum of 40.08 MHz (6.4 ms record length) down to 1.223 kHz (209 s record length). When the recording rate is reduced, decimation filtering is applied to the 40.08 MHz data flow. This circular memory can be stopped by a hardware or software trigger and read back by the user. The Post-Mortem memory records the same signals, but it operates at a constant 40.08 MHz rate and can only be stopped by the machine-wide post-mortem trigger distributed on the backplane. The data stored for the LHC Post-Mortem system will be used to analyse the causes of an accidental abort (dump) of the beam.

A block of FPGA embedded memory (2 x 4096 x 16 bit) is used to inject a perturbation into the loop. When it is not operating at the full bunch repetition rate, an interpolation filter is applied to the memory record before it is added to the signal. Coupled with the observation memory, this can be used for measuring the step or frequency response. The card receives four serially encoded functions, transmitted on the backplane (P2 connector). They are used to change gain, delay and phase shift (pick-up mixing coefficients) continuously during injection, ramping and physics, as needed. A 14-bit DAC (AD9754, 125 MSPS) clocked at 80.16 MHz converts the FPGA output into an analogue drive signal.
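The record lengths quoted above for the Observation memory follow from the memory depth and the recording rate. The short calculation below reproduces them under two assumptions that are not stated in the text: that "256k" means 256 000 samples per channel, and that the slowest rate corresponds to a decimation factor of $2^{15}$.

```python
F_BUNCH = 40.08e6               # full recording rate from the text, in Hz
SAMPLES_PER_CHANNEL = 256_000   # assumption: "256k" read as 256 000 samples
MAX_DECIMATION = 2**15          # assumption: gives the quoted 1.223 kHz

def record_length_s(decimation):
    """Record length in seconds when one sample in `decimation` is stored."""
    return SAMPLES_PER_CHANNEL * decimation / F_BUNCH

fastest = record_length_s(1)                  # roughly 6.4e-3 s
slowest_rate = F_BUNCH / MAX_DECIMATION       # roughly 1.223e3 Hz
slowest = record_length_s(MAX_DECIMATION)     # roughly 209 s
```

With these assumptions the sketch gives about 6.4 ms at the full rate and about 209 s at the lowest rate, in line with the figures in the text.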

A standardized design flow is used for hardware and firmware: FPGA code is developed with Visual Elite, schematics and PCB design are done with Cadence [5].

#### REFERENCES

- [1] E. V. Gorbachev et al., EPAC'08, Genoa, June 2008, THPC121, these proceedings (2008).
- [2] P. Baudrenghien, D. Valuch, LLRF07 Workshop, October 2007, Knoxville TN, USA (2007).
- [3] V. Rossi, CERN-SL-2002-047-HRF, CERN, Geneva, Switzerland (2002).
- [4] J. Molendijk, https://edms.cern.ch/file/695885/1/LHC\_Bunch\_Synchronous\_Digital\_Data\_Transmission.pdf, CERN edms 695885, CERN, Geneva, Switzerland (2006).
- [5] J. Molendijk et al., EPAC'06, Edinburgh, June 2006, TUPCH196, pp. 1474–1476 (2006).