# FINAL REPORT

## **Project Title:**

A Fast Topological Trigger for Real Time Analysis of Nanosecond Phenomena; Opening the Gamma Ray Window to Our Universe

#### Principal Investigator/Program Director:

F. Krennrich Department of Physics and Astronomy 12 Physics Ames, Iowa 50011-3160 515-294-3736 krennrich@iastate.edu

## Period Covered by this Progress Report:

August 1, 2007 - August 31, 2010

## **Recipient Organization:**

Iowa State University Ames, Iowa, 50011

DOE Award Number: DE-FG02-07ER41497

## **1** Scientific/Technological Opportunity

This proposal was to enable the development of a proof-of-principle nanosecond trigger system designed to perform a real-time analysis of fast Cherenkov light flashes from air showers. The basic building blocks of the trigger system were designed and constructed under the ADR program (DE-FG02-07ER41497) during 2007/2008, and a performance test of the prototype system was carried out in 2009. This led the VERITAS collaboration to adopt the trigger system developed under this grant and make it part of the VERITAS upgrade. The VERITAS upgrade (funded by the NSF-MRI program), including the trigger technology developed by our group, was completed in 2012 and is now fully operational.

In summary, the successful development and demonstration of the fast topological trigger technology allowed us to make significant advances for the field. As was shown with VERITAS, triggering on lower energies is synonymous with observing sources at greater distance, or farther back in time. Currently, none of the modern imaging atmospheric Cherenkov telescopes (IACTs) have this capability. The development and implementation of fast real-time trigger technology substantially increases the collection area of IACTs at the lowest energies: as a consequence, the energy threshold of an array like VERITAS was lowered from 100-200 GeV to 80 GeV, in part due to the new FPGA-based camera trigger. Given the difficulties and expense associated with reaching energies below 100 GeV by other means, e.g., using large reflectors (see HESS-II phase), going to high altitude, etc., the technology developed in this work has helped to achieve this task at substantially lower cost with already existing IACTs.

## 2 Completion of the Trigger Development:

To test the concept of a topological array trigger, we have designed a system that has three levels of triggering. Level 1 consists of the discriminated outputs of the front-end electronics, which are already provided by the VERITAS telescopes. At low thresholds, the individual pixel rates approach 10 MHz, dominated by pile-up from single photo-electrons due to the night sky background. For a telescope camera having 500 channels, the aggregate Level 1 rate is then of order 5 GHz. The Level 2 trigger does the pattern recognition and centroid calculation processing for each individual telescope. Our goal is to achieve a 10 MHz output rate from Level 2. This is the most challenging part of the system, as it must process a large number of channels at very high speeds. Note that in our system architecture, we have incorporated an intermediate level of triggering, Level 1.5. The Level 3 trigger receives the data from the Level 2 Processor for each telescope and performs the parallax calculations to look for correlations. For a typical modern IACT, the maximum Level 3 accept rate is 1 kHz, but we have set a goal for performance up to 100 kHz for future applications.
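
The rate budget quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that only restates the numbers given in the text; the helper names are hypothetical, and the rejection factors are simple ratios of the quoted rates, not measured quantities.

```python
# Rates quoted in the text, in Hz.
PIXEL_RATE_HZ = 10e6          # per-pixel Level 1 rate at low threshold
N_CHANNELS = 500              # pixels in one telescope camera
L2_OUTPUT_GOAL_HZ = 10e6      # target Level 2 output rate per telescope
L3_ACCEPT_TYPICAL_HZ = 1e3    # typical maximum Level 3 accept rate
L3_ACCEPT_GOAL_HZ = 100e3     # design goal for future applications


def aggregate_rate(pixel_rate_hz, n_channels):
    """Total Level 1 rate summed over all camera channels."""
    return pixel_rate_hz * n_channels


def rejection_factor(rate_in_hz, rate_out_hz):
    """Factor by which a trigger level must reduce its input rate."""
    return rate_in_hz / rate_out_hz


l1_rate = aggregate_rate(PIXEL_RATE_HZ, N_CHANNELS)
l2_rejection = rejection_factor(l1_rate, L2_OUTPUT_GOAL_HZ)
l3_rejection_typical = rejection_factor(L2_OUTPUT_GOAL_HZ, L3_ACCEPT_TYPICAL_HZ)
l3_rejection_goal = rejection_factor(L2_OUTPUT_GOAL_HZ, L3_ACCEPT_GOAL_HZ)

print(f"Aggregate Level 1 rate: {l1_rate / 1e9:.1f} GHz")
print(f"Level 1 -> Level 2 rejection: {l2_rejection:.0f}x")
print(f"Level 2 -> Level 3 rejection: {l3_rejection_typical:.0f}x (typical), "
      f"{l3_rejection_goal:.0f}x (goal)")
```

In this simple picture, Level 2 must reduce the 5 GHz aggregate rate by a factor of about 500 to reach the 10 MHz goal.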

The physical configuration for our application is shown in Fig. 1. We assume a 500-channel telescope and divide the camera into 3 regions, each of which has a small overlap region with its neighbor. The signals from the front ends are received by I/O Modules that reside in a 9U x 160 mm crate. The input signals are converted to low voltage differential signals (LVDS) on the I/O Modules, and sent across the backplane using point-to-point routing to a corresponding Level 1.5 Processor. The Level 1.5 Processors look for 3-fold coincidences of neighboring pixels in a cell of 7 pixels within a programmable time window. When this criterion is met, the pixel coordinates and a timestamp for the event are captured and sent from the three Level 1.5 Processors to a single Level 2 Processor that resides in the crate, over point-to-point cable connections on the front panels. The Level 2 Processor sorts the data based on the timestamps and calculates several quantities, including the first and second moments of the hit pixels and the number of pixels hit. From this information, the radial distance $r$ and the angle $\phi$ of the centroid of the pixel pattern for each camera can be calculated. This information, along with the timestamp, is then sent to a central Level 3 Processor, which is a high-speed PC with a 4-channel PCI input card for receiving and decoding the data sent by Level 2. It then calculates the parallax width parameter from all the telescopes in the array. If the event is accepted, the Level 3 Processor uses the timestamp to calculate the delay needed in sending the event accept signal to the front ends.
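
As an illustration of how Level 3 might use the timestamps to combine per-telescope records, the sketch below groups fragments that carry the same timestamp. It rests on several assumptions: the record layout `(telescope_id, timestamp, r, phi)`, the function name `build_events`, and the exact-match rule are all hypothetical, and a real system would presumably need a tolerance for arrival-time differences between telescopes. The parallax width calculation is not defined in this report and is not attempted here.

```python
from collections import defaultdict


def build_events(fragments, min_telescopes=2):
    """Group Level 2 records by timestamp (hypothetical exact-match rule).

    fragments: iterable of (telescope_id, timestamp, r, phi) tuples.
    Returns {timestamp: {telescope_id: (r, phi)}}, ordered by timestamp,
    keeping only timestamps seen by at least min_telescopes telescopes.
    """
    by_time = defaultdict(dict)
    for telescope_id, timestamp, r, phi in fragments:
        by_time[timestamp][telescope_id] = (r, phi)
    return {
        ts: tels
        for ts, tels in sorted(by_time.items())
        if len(tels) >= min_telescopes
    }


# Two telescopes report at timestamp 100; one reports alone at 250.
fragments = [
    (1, 100, 0.5, 10.0),
    (2, 100, 0.6, 12.0),
    (3, 250, 0.4, 5.0),
]
events = build_events(fragments)
print(events)
```

Only the timestamp shared by two telescopes survives; the isolated fragment is dropped before any array-level calculation would be made.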

The Level 1 signals in our system are differential ECL signals. The I/O Module has an input and an output for these signals, through which ECL receivers can spy on the signals. This allows the existing Level 2 system of the experiment to operate autonomously. The signals come in to the module on 10-pair connectors, are routed to an ECL receiver, and are immediately routed to an output connector so that the existing cabling for the experiment can be used. A block diagram of the card, and a picture of the module, is shown in Fig. 5. There is no termination on the ECL lines on the I/O Module, and careful layout techniques are employed to ensure that the 100-Ohm characteristic impedance of the input cables is maintained. The input signals are converted to LVDS on the I/O Modules, and sent across the backplane using point-to-point routing to a corresponding Level 1.5 Processor. In order to handle the overlap region, each Level 1 signal is copied twice: once for processing as part of the primary region, and a second time in case that pixel is part of the overlap region. In order to avoid having customized I/O Modules, we do not identify those pixels that are part of the overlap regions on the I/O Modules; all pixels are duplicated and processed in the same way, so that all I/O Modules can be fabricated identically. The specific routing of signals from the I/O cards to the L1.5 Processor is handled on the backplane.

Figure 1: Physical configuration of the Level 2 Trigger for a 500-channel telescope camera. The camera is divided into 3 regions with small overlaps. Each region is processed separately in a custom Level 2 crate.

The crate that we have designed is 9U x 160 mm. It is a hybrid design, which uses a commercial VME backplane for J1, and a custom design for the J2-J3 portion. The J1 VME is used to access the Level 1.5 and Level 2 Processors in the crate, primarily for setup configuration and diagnostics. The J2-J3 backplane is custom, and handles the point-to-point routing of signals from the I/O Modules to the respective Level 1.5 Processors. A picture of the crate, along with a graphic of the layout of the J2-J3 backplane, is shown in Fig. 6.

The Level 1.5 (L1.5) Processors receive the Level 1 signals from the backplane. Each receives a unique set of approximately 180 signals from the camera, corresponding to the pixel map shown in Fig. 4, with only the overlap pixels shared between different processors. The signals are received and processed by a Xilinx Virtex-5 Field Programmable Gate Array (FPGA), model XC5VLX50, running at 400 MHz. In the FPGA, each pixel signal is stretched by a variable-width one-shot, from 4 ns up to 40 ns, before entering the pipelined coincidence logic. The algorithm first registers the signals to the 400 MHz clock and then processes them, looking for the 3-fold coincidence of neighboring pixels described earlier. Each pixel is processed as a primary (center) pixel in a cell of 7 pixels, and also as a secondary pixel in up to six neighboring cells. When the coincidence criterion is met, the L1.5 Processor latches all pixels that become active during the next few time slices following the coincidence. This provides a variable-width acceptance window to allow for pixels that arrive slightly later than the others, depending on the nature of the wave front, night sky background, etc. The processor also latches a 32-bit timestamp along with the active pixels. The latched pixel bit pattern and timestamp are stored in an event FIFO. When the FIFO is no longer empty, the active bits in the bit pattern are converted to addresses, which are then sent to the Level 2 Processor, along with the timestamp. The transmission is accomplished using a ribbon cable across the front panels, operating at 50 MHz. The 400 MHz clock is created internally by the Xilinx Digital Clock Manager (DCM) from a 50 MHz source distributed in phase to each L1.5 Processor by the Level 2 Processor. The timestamps are synchronized not only across the three L1.5 Processors, but across the system as well, and are used by the Level 3 Processor later on to bring the event fragments together from across the system. A block diagram and picture of the L1.5 Processor are shown in Fig. 7.
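
The coincidence search described above can be illustrated in software. The following is a minimal sketch, not the FPGA firmware: it assumes hexagonal pixels addressed by axial coordinates, treats time in ticks of the 400 MHz clock, and interprets the criterion as "at least three active pixels anywhere in the 7-pixel cell". The names `cell`, `stretch`, `find_coincidences`, and the example camera patch are all hypothetical, and the latching of late pixels is omitted.

```python
# Axial-coordinate offsets of the six neighbours of a hexagonal pixel.
HEX_NEIGHBORS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
TICK_NS = 2.5  # one period of the 400 MHz clock


def cell(center):
    """Return the 7 pixels of a cell: the center plus its six neighbours."""
    q, r = center
    return [center] + [(q + dq, r + dr) for dq, dr in HEX_NEIGHBORS]


def stretch(hits, width_ticks):
    """Emulate the one-shot: each hit keeps its pixel active for width_ticks.

    hits maps pixel -> list of arrival ticks.
    Returns a dict mapping tick -> set of active pixels.
    """
    active = {}
    for pixel, ticks in hits.items():
        for t0 in ticks:
            for t in range(t0, t0 + width_ticks):
                active.setdefault(t, set()).add(pixel)
    return active


def find_coincidences(hits, camera, width_ticks, multiplicity=3):
    """List (tick, center) pairs whose cell holds >= multiplicity active pixels."""
    found = []
    active = stretch(hits, width_ticks)
    for tick in sorted(active):
        lit = active[tick]
        for center in sorted(camera):
            if sum(p in lit for p in cell(center)) >= multiplicity:
                found.append((tick, center))
    return found


# Hypothetical patch of pixels; three neighbouring hits arrive one tick apart,
# plus one isolated noise hit far from the others.
camera = {(q, r) for q in range(-3, 4) for r in range(-3, 4)}
hits = {(0, 0): [10], (1, 0): [11], (0, 1): [12], (3, -3): [10]}

narrow = find_coincidences(hits, camera, width_ticks=2)  # 5 ns one-shot
wide = find_coincidences(hits, camera, width_ticks=3)    # 7.5 ns one-shot
print("narrow:", narrow)
print("wide:", wide)
```

With the 2-tick one-shot the three pulses never overlap, so nothing fires; with 3 ticks they overlap at tick 12 and each of the three hit pixels sees a 3-fold cell. This shows how the programmable one-shot width trades timing tolerance against accidental coincidences.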

The L1.5 Processor has access to the J1 VME backplane. This is used to program setup parameters such as the one-shot width, masking of bad channels, etc. We also plan to incorporate skew adjustment for the L1 signals received from the backplane, since the traces have different widths, and we seek an accuracy of 2 ns. The L1.5 Processor also has a diagnostic state machine, into which data that mimics L1 signals can be loaded, and which can operate at 400 MHz. Lastly, we have incorporated a diagnostic mode, where results from the L1.5 processing can be stored in an on-board memory that is accessible from VME.

Data from the L1.5 Processor is received by the Level 2 (L2) Processor via the front-panel connections and is stored in internal buffer FIFOs implemented within a Xilinx XC4VLX40 FPGA. This FPGA contains 18,432 logic slices and can implement up to 288 kbits of RAM. Multiple 48-bit DSP logic blocks are available for numeric processing. Logic clock rates in excess of 200 MHz are easily achieved. Each L1.5 Processor sends lists of hit pixels when an event occurs, along with a timestamp for the event. Note that once the data is timestamped, the system tolerates latencies and fluctuations in later processing (within limits). The L2 FPGA collects the list of all pixels whose timestamps fall within a programmable range and converts all pixel numbers to X-Y coordinates using lookup tables. A pipeline sums over all the pixels, forming the moments traditionally used in the image analysis (sum of $X$, sum of $Y$, sum of $X^2$, sum of $Y^2$ and sum of $XY$). After counting how many pixels were part of the event, a data record of the moments, plus the timestamp associated with the event, is transmitted over a 2 Gbps fiber optic link to the L3 Processor, where the data is merged with that of other L2 Processors. The fiber optic link is bi-directional. The link from L3 to L2 also provides the clock reference and the once-per-second synchronization pulse to reset the timestamp counters in the L1.5 Processors, and allows for distribution of commands including error detection and system reset. A picture of the L2 Processor is shown in Fig. 8.
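
The moment sums listed above are straightforward to express in software. The sketch below is an illustration under stated assumptions, not the pipeline implemented in the FPGA: the lookup table contents, the function names, and the conversion of the first moments to a centroid radius and angle via `hypot` and `atan2` are hypothetical choices made for the example.

```python
import math


def accumulate_moments(pixel_ids, xy_lookup):
    """Sum the pixel count and the first and second moments over the hit pixels."""
    m = {"n": 0, "sx": 0, "sy": 0, "sxx": 0, "syy": 0, "sxy": 0}
    for pid in pixel_ids:
        x, y = xy_lookup[pid]  # lookup table: pixel number -> (X, Y)
        m["n"] += 1
        m["sx"] += x
        m["sy"] += y
        m["sxx"] += x * x
        m["syy"] += y * y
        m["sxy"] += x * y
    return m


def centroid_polar(m):
    """Convert the first moments to the centroid's radial distance and angle."""
    cx = m["sx"] / m["n"]
    cy = m["sy"] / m["n"]
    return math.hypot(cx, cy), math.atan2(cy, cx)


# Hypothetical lookup table with three pixels in a row along the X axis.
xy_lookup = {0: (1, 0), 1: (2, 0), 2: (3, 0)}
moments = accumulate_moments([0, 1, 2], xy_lookup)
r, phi = centroid_polar(moments)
print(moments, r, phi)
```

Keeping the sums as integers, as done here, mirrors what a fixed-point pipeline would accumulate; the division and trigonometry can then be deferred to a later stage.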

The L2 Processor receives and redistributes the master 50 MHz clock to the three L1.5 cards that share the same crate. The clock may either come from the fiber optic link from Level 3 or may be synthesized from the 10 MHz output of a GPS receiver connected to the front panel. A phase-locked-loop clock distribution chip with skew control allows for compensation of delays so that all L1.5 Processors within a crate are matched in phase. The L1.5 Processors receive the 50 MHz reference clock and internally multiply it to 400 MHz for use in the coincidence logic. The physical connection between L1.5 and L2 consists of a four-pair RJ45 cable for the clock distribution plus a 20-pair short ribbon cable for the pixel data. Every second a marker signal is sent from L2 to all L1.5 Processors for timestamp synchronization. Like L1.5, the L2 Processor also has access to the J1 VME backplane. This is used primarily to load a diagnostic state machine, into which data that mimics L1.5 inputs can be loaded. As was done in the L1.5 Processor, the L2 also has a diagnostic mode, where results from the L2 processing can be stored in an on-board memory that is accessible from VME.

Figure 2: Pictures of the Level 2 Processor.
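
The clock and timestamp figures quoted in this section can be cross-checked numerically. The sketch below assumes that the 32-bit timestamp counter increments once per 400 MHz clock period; the report does not state the counting frequency, so this is an assumption made only for illustration.

```python
REF_CLOCK_HZ = 50_000_000     # reference clock distributed by the L2 Processor
LOGIC_CLOCK_HZ = 400_000_000  # coincidence-logic clock inside the L1.5 FPGA
TIMESTAMP_BITS = 32           # width of the latched timestamp

# Multiplication factor applied to the reference clock.
multiplier = LOGIC_CLOCK_HZ // REF_CLOCK_HZ

# Duration of one logic clock period in nanoseconds.
tick_ns = 1e9 / LOGIC_CLOCK_HZ

# Assumption: the counter advances once per logic clock period, so it
# accumulates this many counts between once-per-second reset markers.
counts_per_second = LOGIC_CLOCK_HZ
counter_capacity = 2 ** TIMESTAMP_BITS
rollover_seconds = counter_capacity / LOGIC_CLOCK_HZ

print(f"Clock multiplier: x{multiplier}")
print(f"Tick length: {tick_ns} ns")
print(f"Counts per second: {counts_per_second} of {counter_capacity} available")
print(f"Free-running rollover after {rollover_seconds:.2f} s")
```

Under this assumption a 32-bit counter would roll over only after roughly 10.7 s, so a once-per-second reset leaves ample headroom.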

Several of the key goals of this proposal were achieved, including the development of a camera trigger system based on 400 MHz FPGAs [Krennrich 2009] and a real-world demonstration of the system, with all its advantages over previous trigger systems. Some of the achievements of this trigger and its performance in VERITAS are described in the next section.

## **3** Performance of Trigger in VERITAS:

Compared to the initial VERITAS configuration in 2007, recent improvements, including the trigger upgrade, reduce the exposure time by at least a factor of 2 for a given $\gamma$-ray flux. After extensive tests including in-situ trigger efficiency measurements, the new VERITAS L2 trigger system was installed for regular operation in November 2011. This system is a joint ANL/ISU development (Anderson et al. 2008; Krennrich et al. 2009). The R&D was enabled by a DOE ADR grant to Prof. Krennrich (2007-2009) and by LDRD funds (ANL) to Dr. Byrum, and was chosen by the VERITAS collaboration as part of VERITAS-II. This new system is FPGA-based and provides significantly more control for tuning the trigger and optimizing its performance.

The new system allows one to align the digital pulses from the discriminators to better than 0.2 ns (see Figure 3), thus enabling a reduced effective coincidence resolving time, from 10 ns in the old system to as short as 3 ns, without significant efficiency loss. Shorter coincidence times reduce the number of noise triggers from night sky background fluctuations, thereby stabilizing the L2 rates while enabling a lower energy threshold. Besides providing consistent and stable L2 rates, the new trigger system also offers the possibility to measure the efficiency of the system in situ using cosmic-ray and $\gamma$-ray events. Since one of the trigger systems uses passive splitters, the digital pulses are routed to a second (parasitic) system whose response is recorded in the data stream by a VERITAS FADC channel. This setup has allowed us to measure the efficiency<sup>1</sup> of the trigger across the camera, showing that the new trigger system operates essentially at 100% efficiency, even when using a 5 ns coincidence resolving time.

Figure 3: The arrival times of the trigger signals at the FPGAs that form time coincidences are shown before (right) and after (left) the timing calibration.
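
The benefit of a shorter coincidence time can be estimated with the standard first-order expression for accidental coincidences, in which $n$ uncorrelated channels of rate $r$ and resolving time $\tau$ give an accidental rate of roughly $n r^n \tau^{n-1}$. The sketch below applies this scaling to the 3-fold requirement; it is an order-of-magnitude illustration under the assumption $r\tau \ll 1$, it ignores the number of pixel combinations in the camera, and it is not a measurement from VERITAS. The function names are hypothetical.

```python
import math


def accidental_rate(pixel_rate_hz, window_s, fold):
    """First-order accidental rate of `fold` channels: n * r**n * tau**(n - 1)."""
    return fold * pixel_rate_hz ** fold * window_s ** (fold - 1)


def window_gain(old_window_s, new_window_s, fold):
    """Ratio of accidental rates after changing only the coincidence window."""
    return (new_window_s / old_window_s) ** (fold - 1)


# 3-fold coincidence, going from the old 10 ns window to 3 ns or 5 ns.
ratio_3ns = window_gain(10e-9, 3e-9, fold=3)
ratio_5ns = window_gain(10e-9, 5e-9, fold=3)

print(f"Accidentals at 3 ns relative to 10 ns: {ratio_3ns:.2f}")
print(f"Accidentals at 5 ns relative to 10 ns: {ratio_5ns:.2f}")
```

For a fixed pixel rate, this simple scaling suggests that the accidental 3-fold rate falls roughly with the square of the coincidence window.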

Understanding the camera trigger efficiency at the few percent level is an important prerequisite for performing deep exposures, as it limits and allows one to estimate the systematic uncertainties. Further details are given in §5.1. The FPGA-based trigger design also allows one to implement real-time image analysis to further reduce background and dead time of the data acquisition system. The ISU group is currently exploring new trigger schemes, e.g., the implementation of a real-time stereo analysis. Furthermore, we are developing a pass-through trigger for cosmic-ray muons, which are extremely useful for instrument calibration purposes.



Figure 4: Camera map of the efficiency of the ANL/ISU L2 trigger relative to the previous L2 trigger. These data were taken with a 5 ns coincidence width setting in the ANL/ISU trigger.

<sup>1</sup> The efficiency study was led by Prof. Weinstein.

### References

[Krennrich 2009] Krennrich, F. et al., AIP Conf. Proc., 1085, 894 (2009).

[FasterTechnology 2008] See http://www.fastertechnology.com/products\_p6.html

[Ong 2005] See, for instance, Rene Ong's Rapporteur Talk at the 29th International Cosmic Ray Conference (Pune) (2005).

[Krennrich & Lamb 1995a] Krennrich, F. and Lamb, R.C., Experimental Astronomy, 6, 285-292 (1995a).

[Krennrich & Lamb 1995b] Krennrich, F. and Lamb, R.C., Proc. of Towards a Major Atmospheric Cherenkov Detector III, Padua (1995b).

[VERITAS] See http://veritas.sao.arizona.edu/

[HESS] See http://www.mpi-hp.mpg.de/hfm/HESS/HESS.html

[Hillas] Private communication with A.M. Hillas.

[LeBohec, Krennrich & Sleege 2005] LeBohec, S., Krennrich, F. & Sleege, Astroparticle Physics, 23, 238 (2005).

[Hillas 1985] Hillas, A.M., in Proc. 19th I.C.R.C. (La Jolla), Vol. 3, p. 445 (1985).

[see http://gamma3.astro.ucla.edu/future\_cherenkov] See http://gamma3.astro.ucla.edu/future\_cherenkov

[Mirzoyan et al. 1994] Mirzoyan, R. et al., Nucl. Instr. Meth. A351, 513 (1994).

[Karle 1994] Karle, A., PhD Thesis, Ludwig Maximilians University (Munich) (1994).

[Punch et al. 1992] Punch, M. et al., Nature, 358, 477 (1992).

[Weekes 1989] Weekes, T.C., ApJ, 342, 379 (1989).