# Development of a new first-level trigger board for the Pierre Auger Observatory based on the Cyclone<sup>TM</sup> FPGA

Z. Szadkowski<sup>a,b</sup>, Karl-Heinz Becker<sup>b</sup> and Karl-Heinz Kampert<sup>b</sup>

(a) University of Łódź, Pomorska 151, 90-236 Łódź, Poland.

(b) Bergische Universität Wuppertal, Fachbereich C - Physik, D-42097 Wuppertal, Germany

Presenter: Z. Szadkowski (zszadkow@physik.uni-wuppertal.de) ger-szadkowski-Z-abs2-he15-poster

The Pierre Auger Observatory surface array will contain 1600 water Cherenkov detectors working autonomously and processing local shower triggers. After the PMT signals are digitized by 40 MHz FADCs, the data are processed by Altera PLD chips working as trigger/memory circuitry. We present a new, powerful and cost-effective realization based on the Altera FPGA family Cyclone. It allows more sophisticated trigger definitions at lower power consumption, with a simpler board design and lower cost. The hardware and software design is presented, and results of detailed tests in the lab and in the field are discussed. The new design may be used for the completion of the Southern site and is at present also the most cost-effective option for Auger North.

# 1. Introduction

The Cherenkov light of each Pierre Auger water tank is read out by three 9-inch photo-multiplier tubes (PMTs). The signals from the anodes (low-gain channel) and last dynodes (high-gain channel) are transmitted via equal-length shielded cables to a Front-End Board attached to a Station Controller. In the Front-End Board, the 6 signals of a water tank are filtered by anti-aliasing 5-pole Bessel filters with a cut-off frequency of 20 MHz. Sampling is done continuously at 40 MHz by 10-bit Flash ADCs. Using 5 bits of overlap between the two ADCs of a PMT, a dynamic range of 15 bits is achieved. The outputs of the six ADCs are processed by Altera<sup>®</sup> Programmable Logic Devices (PLDs) [1] working as trigger/memory circuitry (TMC). They are supported by additional Dual-Port Random Access Memory (DPRAM) serving as a temporary buffer. The TMC evaluates the ADC outputs for interesting trigger patterns, stores the data in buffer memory, and informs the detector station micro-controller when a trigger occurs. The station controller sends trigger packets and, when requested, event data to the observatory campus via a wireless network. A hierarchical event trigger is used to select events of interest and reject uninteresting events in order not to exceed the rate constraints imposed by the station micro-controller, the communications link bandwidth, and the central data acquisition system.
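The 15-bit dynamic range quoted above follows from 10 + 10 - 5 bits. The short Python sketch below illustrates one way the two channels of a PMT could be merged. It assumes that the 5-bit overlap corresponds to a gain ratio of 2^5 = 32 between the channels and that the high-gain channel is used until it saturates; the function name `combine` and the saturation rule are illustrative assumptions, not the procedure used in the station software.

```python
ADC_BITS = 10                      # resolution of each Flash ADC (from the text)
OVERLAP_BITS = 5                   # overlap between the two channels (from the text)
SHIFT = ADC_BITS - OVERLAP_BITS    # assumed gain ratio of 2**5 = 32

ADC_MAX = (1 << ADC_BITS) - 1      # 1023, treated here as "saturated"


def combine(high_gain: int, low_gain: int) -> int:
    """Merge one high-gain and one low-gain sample into high-gain units."""
    if high_gain < ADC_MAX:
        return high_gain           # high-gain channel is still in range
    return low_gain << SHIFT       # otherwise rescale the low-gain channel


# Largest representable value and the number of bits needed to hold it.
dynamic_range_bits = (ADC_MAX << SHIFT).bit_length()
print(dynamic_range_bits)          # 15
```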

# 2. 3rd generation - Cyclone<sup>TM</sup> design

The Altera<sup>®</sup> FPGA family Cyclone<sup>TM</sup><sup>1</sup> allows a significant simplification of the construction of the surface detector trigger, while decreasing the total cost and improving the performance parameters. Unlike the currently used ACEX<sup>®</sup> chips, Cyclone<sup>TM</sup> chips contain much more internal memory, which removes the need to support the TMC with external memory. The TMC utilizes two types of channels: a fast one to investigate the shower profile and a slow one for calibration data. The memory in the fast channel requires 2 buffers · 768 words · 64 bits = 24 M4K RAM blocks. The remaining 28 M4K RAM blocks (in the selected Cyclone<sup>TM</sup> chip, EP1C12Q240I7) can be used for the limited-size memory in the slow channel (a 3584-word FIFO is estimated to be sufficient). Nevertheless, we decided to keep the larger external memory rather than occupy internal memory blocks, in order to allow implementing new trigger conditions [2].

<sup>1</sup> Very recently, the Cyclone II family has appeared on the market.
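The memory budget quoted for the fast and slow channels can be checked with a few lines of Python. The sketch assumes 4096 data bits per M4K block (parity bits not counted) and takes the total of 52 blocks from the 24 + 28 split given in the text. The resulting FIFO word width is only an upper bound derived here, not a figure from the design.

```python
M4K_DATA_BITS = 4096          # assumed data bits per M4K block, parity excluded
TOTAL_M4K_BLOCKS = 52         # implied by the text: 24 used + 28 remaining

fast_bits = 2 * 768 * 64      # 2 buffers x 768 words x 64 bits
fast_blocks = fast_bits // M4K_DATA_BITS

remaining_blocks = TOTAL_M4K_BLOCKS - fast_blocks
remaining_bits = remaining_blocks * M4K_DATA_BITS

# Widest word a 3584-word FIFO could have if it used all remaining blocks.
fifo_word_bits = remaining_bits // 3584

print(fast_blocks, remaining_blocks, fifo_word_bits)   # 24 28 32
```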



Figure 1. A prototype Cyclone<sup>TM</sup> Front End Board.

The power consumption is reduced at the same time, as the core is supplied by only 1.5 V. Furthermore, compared to the previous two-chip ACEX design, a single chip avoids problems with chip synchronization [3]. The registered performance is at the level of 130 MHz. Such a high internal speed would allow increasing the sampling frequency, thereby improving the time resolution of the triggers. Moreover, the additional resources allow implementing new kinds of triggers, e.g. a 16-point Fast Fourier Transform calculating all complex coefficients in real time, which may be useful to recognize specific types of events [4]. The structure of the internal routines is fully pipelined. Some complex routines (mainly wide-bus adders) that require two clock cycles for reliable high-speed operation in ACEX<sup>®</sup> chips are implemented in the Cyclone<sup>TM</sup> in a single clock cycle, thanks to the newer Cyclone<sup>TM</sup> architecture, which offers much higher registered performance.

The Cyclone<sup>TM</sup> chips offer sufficient resources and internal memory to implement the full algorithm in a single chip. However, such an approach would impose limits on future optimizations or on implementations of new ideas. As a compromise between simplicity and flexibility for future modifications, we chose a two-chip design: the Cyclone<sup>TM</sup> combined with the DPRAM supporting the slow memory, as used in the previous designs.

# 3. New triggers

The currently used Time over Threshold (ToT) trigger [5, 6] performs a simple one-bit analysis: it counts the number of FADC samples exceeding a fixed threshold within a sliding time window of consecutive samples. Because the applied threshold is low, equivalent to only 20% of the signal of a passing vertical muon, special attention has to be paid to avoiding variations caused by pedestal and temperature fluctuations. New powerful FPGA chips allow implementing much more sophisticated algorithms in order to improve stability, reliability and noise suppression. In particular, making use of the full 10-bit information of the FADC traces enables selecting events with much higher precision.
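The one-bit ToT analysis described above can be sketched in Python as follows. The function name and the parameters `threshold`, `window` and `min_count` are hypothetical and chosen for illustration only; the text does not give the window length or the occupancy used in the detector.

```python
from collections import deque


def tot_trigger(samples, threshold, window, min_count):
    """Yield True/False per sample: is the ToT condition met?"""
    bits = deque(maxlen=window)   # models the 1-bit shift register
    count = 0                     # number of set bits inside the window
    for s in samples:
        if len(bits) == window:
            count -= bits[0]      # bit that is about to leave the window
        bit = 1 if s > threshold else 0
        bits.append(bit)          # a full deque drops its oldest bit here
        count += bit
        yield count >= min_count


# Illustrative trace: four samples above threshold inside an 8-sample window.
trace = [0] * 5 + [3] * 4 + [0] * 5
fired = list(tot_trigger(trace, threshold=2, window=8, min_count=4))
```

Only a counter and a one-bit delay line are needed, which is why this trigger fits easily into logic elements.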

A natural estimator as an extension of the ToT idea is the integral of all FADC values in a sliding window. The integral of the time bins corresponds to the charge from PMTs and to the Cherenkov flux collected in the tank. A second estimator is the sum of the squared time bins in a sliding window. Here, the idea is to distinguish a contribution of a signal from significantly high-level time bins from low-level ones close to noise. A third estimator evaluated is the on-line calculated Area/Peak ratio.

#### 3.1 Area over Threshold (AoT) trigger

The observed temperature-dependent long- and short-term drifts of the pedestal call for additional routines to compensate for that effect. In practice, we opted for the well-known sigma-delta algorithm. An 18-bit counter controlling an effective pedestal is incremented or decremented by a single count depending on a comparison of the ADC value with the current pedestal. The running sum of the 18-bit compensated ADC values in a sliding window of 128 consecutive samples is calculated by means of two adders, one of which is driven by a 128-word shift register. Next, the sum is compared with a preset threshold to produce the AoT trigger.

AoT : 
$$\sum_{i=0}^{len} (A_i - ped_i) > Thr_A$$
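A minimal Python sketch of the pedestal tracking and the running sum is given below. It assumes that the 18-bit pedestal counter is a fixed-point number with 10 integer and 8 fractional bits, and that the threshold is expressed in the same fixed-point units (ADC counts times 256). The name `aot_trigger` and the test trace are illustrative and do not come from the firmware.

```python
from collections import deque

FRAC_BITS = 8     # assumed fractional bits of the 18-bit pedestal counter
WINDOW = 128      # sliding-window length from the text


def aot_trigger(samples, threshold, ped_start=0):
    """Yield (running_sum, fired) for every ADC sample."""
    ped = ped_start << FRAC_BITS                   # fixed-point pedestal
    delay = deque([0] * WINDOW, maxlen=WINDOW)     # 128-word shift register
    total = 0                                      # sum over the window
    for adc in samples:
        value = (adc << FRAC_BITS) - ped           # compensated sample
        # Sigma-delta step: move the pedestal one count towards the ADC value.
        if value > 0:
            ped += 1
        elif value < 0:
            ped -= 1
        # Two adders: add the newest value, subtract the one leaving the window.
        total += value - delay[0]
        delay.append(value)
        yield total, total > threshold


# Flat pedestal of 50 counts with a 20-sample pulse of 100 counts on top.
pulse = [50] * 200 + [150] * 20 + [50] * 200
out = list(aot_trigger(pulse, threshold=100_000, ped_start=50))
flat = list(aot_trigger([50] * 400, threshold=100_000, ped_start=50))
```

Because the pedestal moves by a single fractional count per sample, it follows slow drifts while remaining almost unaffected by a short pulse.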

Contrary to the ToT trigger, where the length of the sliding window could be tuned dynamically, the length of the delay chain is fixed in the AoT. The ToT chain was implemented in logic elements (LEs) as a 1-bit shift register. Implementing 18-bit shift registers in LEs would consume an unacceptable amount of resources. However, shift registers can be implemented in internal memory blocks without access to the intermediate stages.

#### 3.2 Power over Threshold (PoT) trigger

The digitized FADC values are composed of the pedestal and charge collected from the PMTs. In other words, the FADC traces can be considered as some rare signal on a noisy background. The power of the signal can be calculated by

PoT : 
$$\sum_{i=0}^{len} (A_i - ped_i)^2 > Thr_P$$

Sometimes the ToT trigger can be generated by rather accidental conditions, since the shape of the FADC traces is ignored. For example, a relatively high bump (not high enough to generate a simple threshold trigger) may be too narrow to exceed the occupancy threshold and would then not be registered by the ToT trigger. On the other hand, the occupancy threshold may be exceeded by the tail of a low-level signal, since the ToT threshold needed to fire a time bin is relatively low.

With the above definition of the PoT, the sensitivity of a trigger to noise is reduced considerably. The contribution of time bins just above the pedestal is significantly suppressed in comparison to the relatively higher bumps.
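The difference between the estimators can be illustrated with two invented, pedestal-subtracted traces: a narrow high bump and a long low tail. All sample values and the ToT threshold below are made up for illustration. The sketch only shows how occupancy, area and power rank the two shapes differently.

```python
def window_estimators(signal, tot_threshold):
    """Return (occupancy, area, power) for one window of samples."""
    occupancy = sum(1 for s in signal if s > tot_threshold)  # ToT count
    area = sum(signal)                                       # AoT sum
    power = sum(s * s for s in signal)                       # PoT sum
    return occupancy, area, power


narrow_bump = [0] * 60 + [40, 90, 40] + [0] * 65   # short, high pulse
long_tail = [0] * 28 + [4] * 100                   # low, long pulse

print(window_estimators(narrow_bump, tot_threshold=3))  # (3, 170, 11300)
print(window_estimators(long_tail, tot_threshold=3))    # (100, 400, 1600)
```

In this example the long tail dominates both the occupancy and the area, whereas the power of the narrow bump is about seven times larger.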

The implementation of the algorithm is similar to the AoT case; only an additional multiplier is needed. However, a direct multiplication of 18-bit data uses a lot of resources. The Cyclone<sup>TM</sup> FPGA unfortunately does not contain embedded DSP blocks, so the multiplication routine has to be implemented as a soft core. To reduce the resource occupancy, note that processing the full 18-bit bus appears wasteful. As in the case of the AoT, a shift register has to be implemented in the internal memory. However, the width of a bus implemented this way should be a multiple of the Embedded System Block width of 8 bits. Thus, we have chosen 16 instead of 18 bits so as not to block memory needed by other routines. Neglecting the two least significant bits from the compensated ADC routine seems to be a good compromise between performance and resource utilization, and a 16-bit bus still provides sufficient accuracy. Note that the digital filter works independently and still compensates the FADC traces with an additional 8-bit fractional counter.

Calculating a square by a direct multiplication of the same data is inefficient. Therefore, let us consider 16-bit data as a two-byte structure. Moreover, multiplication by a constant value being a power of two is de facto a shift of data:

$$(ADC[15..0])^2 = \left((ADC[15..8])^2 \ll 16\right) + \left((ADC[15..8] \cdot ADC[7..0]) \ll 9\right) + (ADC[7..0])^2$$

The first and third terms of the above formula correspond to the calculation of a square of 8-bit data. The results are 16 bits wide. The most effective way is to treat the ADC value as an 8-bit input address for a ROM with a 16-bit output, which contains the square coefficients. However, since both terms require the same set of coefficients, it is more efficient to store those coefficients in the DPRAM, preloaded during the FPGA configuration. The DPRAM would then work in read-only mode. Only the second term requires a real multiplier.
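The decomposition can be checked numerically with the following Python sketch. It is written with left shifts, since multiplying by 2^16 and 2^9 moves bits towards the most significant end. The table `SQUARES` stands in for the preloaded memory; the names are illustrative.

```python
# 256 entries of 16 bits each, as the preloaded ROM/DPRAM would hold them.
SQUARES = [v * v for v in range(256)]


def square16(adc: int) -> int:
    """Square a 16-bit value with two table look-ups and one 8x8 multiply."""
    high = (adc >> 8) & 0xFF      # ADC[15..8]
    low = adc & 0xFF              # ADC[7..0]
    cross = high * low            # the only real multiplication
    return (SQUARES[high] << 16) + (cross << 9) + SQUARES[low]


# Exhaustive check against direct multiplication over the full 16-bit range.
all_match = all(square16(v) == v * v for v in range(1 << 16))
print(all_match)                  # True
```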

#### 3.3 Area/Peak Ratio trigger

The area/peak ratio is a useful estimator that is at present calculated only offline. Preliminary results show that the values of the area/peak ratio are contained in a relatively narrow dynamic range. That parameter could also be used for trigger purposes. Cyclone<sup>TM</sup> chips offer an lpm\_divide routine supporting the division of two integer numbers. The area is already calculated in the AoT routine. A parallel routine finds the peak value in a sliding window online. The division routine is pipelined over 9 clock cycles to assure sufficient registered performance.
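A software model of the online Area/Peak calculation might look like the sketch below. It assumes non-negative, pedestal-subtracted samples and keeps only the integer quotient of the division. The function name and the short test trace are illustrative, and the 9-stage pipeline latency of the hardware divider is not modelled.

```python
from collections import deque


def area_over_peak(signal, window):
    """Yield the integer Area/Peak ratio for each full sliding window."""
    buf = deque(maxlen=window)
    area = 0
    for s in signal:
        if len(buf) == window:
            area -= buf[0]        # sample that is leaving the window
        buf.append(s)
        area += s
        if len(buf) == window:
            peak = max(buf)       # hardware would track this in parallel
            yield area // peak if peak > 0 else 0


ratios = list(area_over_peak([0, 0, 10, 20, 10, 0, 0], window=3))
print(ratios)                     # [1, 1, 2, 1, 1]
```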

The AoT, PoT and Area/Peak ratio triggers have been implemented in a preliminary form and will be tested under real conditions in the field. To optimize the consumption of memory and logic registers, all new triggers are merged into a single routine so that they share the same shift registers as far as possible. All new triggers work in parallel. Additional flags will indicate the type of trigger.

# 4. Conclusions

The currently used ACEX design for generating first-level triggers in the Pierre Auger surface detectors offered the most cost-effective solution when the construction of the observatory started. The Cyclone design presented here is being developed as part of R&D work to increase the integration of the system, to simplify it, and to allow implementing new features and triggers that could not be implemented in the ACEX design due to lack of resources. More resources allow better optimization of the code, and the reduced power consumption increases the long-term reliability. Higher registered performance allows the use of faster data sampling to increase the time resolution.

### References

- [1] Programmable Logic Devices Family, http://www.altera.com/products/devices/dev-index.jsp
- [2] Z. Szadkowski, "A Proposal of a Single Chip Surface Detector Trigger Based On Altera<sup>®</sup> Cyclone<sup>TM</sup> Family", *in Proc. 28th ICRC*, Tsukuba, 2003.
- [3] Z. Szadkowski, "The concept of an ACEX<sup>®</sup> cost-effective first level surface detector trigger in the Pierre Auger Observatory", *Nucl. Instr. Meth.*, 2005, in print.
- [4] Z. Szadkowski, "16-point Discrete Fourier Transform based on the Radix-2 FFT algorithm implemented into Cyclone FPGA as the UHERC trigger for horizontal air showers", *Contribution to this conference*.
- [5] Pierre Auger Collaboration, "Properties and Performance of the Prototype Instrument for the Pierre Auger Observatory", *Nucl. Instr. Meth.* A523, pp. 50-95, 2004.
- [6] Z. Szadkowski, D. Nitz, "Implementation of the first level surface detector trigger for the Auger Observatory Engineering Array", *Nucl. Instr. Meth.* A545, pp. 624-631, 2005.