# The performance of a Pre-Processor Multi-Chip Module for the ATLAS Level-1 Trigger

J. Krause, U. Pfeiffer, K. Schmitt, O. Stelzer Kirchhoff-Institut für Physik, Universität Heidelberg, Germany

### Abstract

We have built and tested a mixed-signal Multi-Chip Module (MCM) to be used in the Pre-Processor of the ATLAS Level-1 Calorimeter Trigger. The MCM performs high-speed digital signal processing on four analogue trigger input signals. Results are transmitted serially at a data rate of 800 MBd. Nine chips of different technologies are mounted on a four-layer copper substrate. Analogue-to-digital converters and serialiser chips are the major consumers of electrical power on the MCM; the total for all dies amounts to 7.5 W. Special cut-out areas are used to dissipate heat directly to the copper substrate. In this paper we report on design criteria, the MCM technology chosen for substrate and die mounting, experience with MCM operation, and measurement results.

### I. INTRODUCTION

The event selection at the ATLAS experiment requires a fast three-level trigger system for the selection of physics processes of interest. The first trigger level (Level-1 Trigger) is designed to reduce the event rate from the 40 MHz LHC bunch-crossing rate down to the first-level accept rate of 75 kHz [1]. This is achieved by applying algorithms to a three-dimensional energy map generated by the ATLAS calorimetry (Calorimeter Trigger) and by the selection of coincidences in muon trigger chambers (Muon Trigger). The Pre-Processor is located at the front-end of the Calorimeter Trigger. It receives about 7200 analogue input signals (trigger towers) from the electromagnetic and the hadronic calorimeters. Its task is to provide the digital data for the Calorimeter Trigger algorithms, which identify 'local' and 'global' energy distributions within the calorimetry. Local trigger algorithms are an electron/photon trigger, a hadron/tau trigger and a jet trigger. Global trigger algorithms are an $E_T$-miss and a sum-$E_T$ trigger.

The maximum latency to reach a Level-1 trigger decision is 2.0 $\mu$s including cable delays. This and the large number of trigger tower signals place tight constraints on the Pre-Processor electronics, requiring fast signal processing in Integrated Circuits (ICs) and a high level of integration on Multi-Chip Modules (MCMs). Inside a Multi-Chip Module, many electronic components, made from different semiconductor materials, are mounted on a multi-layer substrate. In addition, a Multi-Chip Module is a packaging technology which encapsulates the entire system hermetically.

An overview of the tasks of the Pre-Processor system is given in Section II. This is followed by a functional description of the Multi-Chip Module in Section III, which includes a description of the production technique and the layout. The thermal performance of the MCM was obtained from calculations, simulations, and measurements described in Section IV. The final operation of the MCM was demonstrated by system measurements described in Section V.

### II. TASKS OF THE PRE-PROCESSOR

The reliability of the Pre-Processor is important for the running of the ATLAS experiment, because all the Level-1 Calorimeter Trigger input data have to pass through it. The tasks that the Pre-Processor system has to perform on its 7200 analogue input signals can be summarised as follows [2]:

- **Preprocessing:** Provide the trigger algorithms with digital data containing the transverse energy deposited, identified to a unique interaction in time (bunch-crossing identification). The preprocessing is done at 40 MHz with a maximum latency of 17 clock cycles (425 ns).
- **Readout of event data:** Raw trigger data from the Pre-Processor are needed to be able to tell what has caused a trigger and to allow monitoring of the performance of the trigger system.
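The latency figure quoted in the list above follows directly from the clock period. A minimal Python sketch of that conversion is given here; the function name `cycles_to_ns` is an illustrative assumption and not part of the Pre-Processor design.

```python
# Convert a latency in clock cycles to nanoseconds.
# The 40 MHz clock and the 17-cycle budget are taken from the text;
# the function name is an illustrative assumption.

def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Return the duration of `cycles` clock ticks in nanoseconds."""
    period_ns = 1000.0 / clock_mhz  # 40 MHz -> 25 ns per bunch crossing
    return cycles * period_ns


preprocessing_latency_ns = cycles_to_ns(17, 40.0)
print(preprocessing_latency_ns)  # 425.0
```

The same helper can be applied to any other latency quoted in bunch crossings.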

Figure 1 shows the number of analogue input signals and serial output links of the Pre-Processor. The need for speed and compactness within the Pre-Processor requires the use of a large number of chips (dies), each optimised for its specific task. A Pre-Processor Module will process 64 analogue trigger tower signals. The signals are processed by commercial integrated circuits (ICs) and application specific ICs (ASICs), most of which are located on 16 Multi-Chip Modules per Pre-Processor Module.

### III. FUNCTIONAL DESCRIPTION OF THE MCM

The boundaries of the MCM were chosen at points of the processing chain where only a few signals enter or leave the MCM package. The tasks of the Multi-Chip Module within these boundaries are:

- to digitise four analogue trigger tower signals at 40 MHz with 8 bit resolution;
- to preprocess the data of each trigger tower in terms of energy calibration and bunch-crossing identification;
- to serialise preprocessed trigger tower data using high-speed gigabit chip sets. The user data rate is 640 Mbit/s (16 bit at 40 MHz) and the serial data rate is 800 MBd (including protocol bits);
- to provide deadtime-free readout of four trigger towers.

Figure 1: The tasks of the Pre-Processor System and its input and output signals.

In order to achieve these objectives, the MCM consists of different semiconductor devices such as mixed-signal and pure digital chips. Some of them are commercially available and others are application specific. The readout and most of the digital signal processing inside the MCM are done by a prototype Pre-Processor ASIC (FeAsic<sup>1</sup>) developed at the ASIC laboratory of the University of Heidelberg [3]. The four analogue input signals are digitised by two dual flash ADCs from Analog Devices (AD9058). One mixed-signal Flip-Chip Interconnection ASIC (Finco) includes all necessary functions for data pre-multiplexing, signal level conversion, reference voltage generation, in-circuit testing and temperature monitoring. The output data are serialised by two gigabit transmitters (HDMP-1012) from Hewlett Packard running at a serial data rate of 800 MBd. If the Finco ASIC performs data pre-multiplexing, a single transmitter serialising at 1600 MBd is sufficient.
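The link rates quoted in this section are related by simple arithmetic, sketched below in Python. The frame length per clock tick and the resulting number of protocol bits are derived here on the assumption that one serial frame is sent per 40 MHz tick; they are not stated explicitly in the text, and all variable names are illustrative.

```python
# Link-rate arithmetic for the serialiser stage.
# From the text: 16 user bits per 40 MHz tick, 800 MBd on the line.
# Assumption: exactly one serial frame is transmitted per clock tick.

CLOCK_MHZ = 40
USER_BITS_PER_TICK = 16
LINE_RATE_MBD = 800

user_rate_mbit_s = USER_BITS_PER_TICK * CLOCK_MHZ    # user data rate
frame_bits = LINE_RATE_MBD // CLOCK_MHZ              # line bits per tick (derived)
protocol_bits = frame_bits - USER_BITS_PER_TICK      # overhead per frame (derived)
payload_fraction = user_rate_mbit_s / LINE_RATE_MBD  # share of line rate carrying data

# With pre-multiplexing in the Finco, two 800 MBd streams
# share a single link running at twice the rate.
multiplexed_rate_mbd = 2 * LINE_RATE_MBD

print(user_rate_mbit_s, frame_bits, protocol_bits, payload_fraction)
print(multiplexed_rate_mbd)
```

Under this assumption, 80 % of the line rate carries user data, and the pre-multiplexed option reproduces the 1600 MBd figure.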



Figure 2: Block diagram of the Pre-Processor MCM

Figure 2 shows a block diagram of the MCM. It illustrates the scope of the MCM: wide parallel data buses are kept internal, and only the analogue input, control, and high-speed serial data signals cross the boundary of the MCM package. Real-time signal processing proceeds from left to right.

#### A. MCM production technique

For the processing of the large number of channels (64 trigger towers per Pre-Processor Module), a laminated MCM-L technique was chosen to combine small feature sizes with low prices. The design process of the laminated multi-layer structure is based on an industrially available production technique for high-density printed circuit boards. The process, which is offered by Würth Elektronik [4], is called TWINflex<sup>®</sup>. It is characterised by its use of plasma-etched micro-vias, where plasma is used for 'dry' etching of the insulating material (Polyimide). Plasma etching enables precise via contacts between layers with diameters from 100 $\mu$m down to 50 $\mu$m. Plasma-etched vias can be placed within a surface-mount pad, or even in a pad suitable for Flip-Chip bonding, as used for the Finco ASIC mounting.

The body of the demonstrator MCM is a combination of three flexible Polyimide foils laminated on a rigid copper substrate to form four routing layers. The layer cross-section consists of a *core* foil of 50 $\mu$m thickness, which carries 18 $\mu$m copper plating on either side. Plasma etching is used for 'buried' via connections to adjacent layers, and routing structures are formed in copper using conventional etching techniques. The core foil is surrounded by *outer* foils of 25 $\mu$m Polyimide, which are copper plated on one side only. The actual contact through the core foil is made with electroplated copper, after which the routing structures are formed. The electroplating process increases the track thickness from 18 $\mu$m to 25 $\mu$m. The foils are laminated together with adhesive.



Figure 3: Cross-section of the flexible MCM part after laminating. Staggered vias are used for the connection through all layers.

Figure 3 shows the final laminated and flexible part of the MCM. A combination of three vias (staggered vias) is needed to accomplish a contact from the top to the bottom layer. The flexible multi-layer is further processed by milling of predefined cutout regions. Finally the flexible part is glued onto a copper substrate of 800  $\mu$ m thickness.

<sup>1</sup> FeAsic (Front-End ASIC) stems from a former name for the Pre-Processor.

To reduce the thermal resistance of high-power chips, the use of cutout regions and thermal vias was investigated. A cutout region is necessary for the high-power G-link die to ensure optimal thermal contact to the copper substrate. The G-link die is glued directly onto the copper substrate inside its cutout. It is connected to bond pads on the top layer of the cross-section using a standard ultrasonic wire-bond technique. Staggered vias are grouped as closely as possible to form thermal vias, which improve heat conduction to the substrate for all four FADCs.

Components such as capacitors and resistors are connected to the multi-layer structure using surface-mount technology (SMD). Advanced Flip-Chip mounting was investigated for a further reduction of the occupied bonding space. A high-density SMD connector was used to permit a quick replacement of a broken MCM. If built-in tests have identified a broken MCM, it can be replaced without any soldering. Figure 4 shows the final MCM cross-section. Each chip is encapsulated individually, and a global elastic encapsulation material is used to absorb stress arising from the use of different materials and to hermetically protect all components from their environment.



Figure 4: Side view of the hermetically sealed demonstrator MCM. High-density SMD connectors were used to allow quick replacement upon component failure.

#### B. MCM layout

This section describes the physical layout of the MCM. It summarises various aspects of the design process. Figure 5 shows a picture of the G-link die attachment inside a cutout region. A few wire-bonds were placed down through the cutout region onto the copper base to conduct the ground substrate potential.



Figure 5: Die attachment on surface and inside a cutout region.

A via-in-pad technology was used for Flip-Chip pads of the Finco footprint. This pad layout has reached the maximum possible routing density for the TWINflex<sup>®</sup> MCM-L technology. The pads were arranged in a two-dimensional matrix as shown in Figure 6. The use of micro-vias for Flip-Chip pads has made it necessary to fill the via holes before reflow soldering of the Finco die, and to use a solder-mask for the pad contacts.



Figure 6: Finco footprint for Flip-Chip mounting.

On the top layer, a cross-hatched ground shape surrounds bonding and SMD pads. This reduces electromagnetic coupling between signals and stabilises the ground potential. A cross-hatched rather than a solid shape is needed because moisture escaping from the cross-section during drying can otherwise destroy the MCM.

The final MCM is shown in Figure 7 after glob-top encapsulation of individual chips and prior to final hermetic encapsulation. The layout has a form factor of 4.3 cm  $\times$  3.7 cm enclosing an area of 15.9 cm<sup>2</sup>. The total height is 1.21 cm including heatsink and SMD connectors. The amount of silicon area is 80.15 mm<sup>2</sup> for 9 dies, resulting in a ratio of 5 % for the silicon to substrate area. The SMD connector pin-count is 120, whereas the internal pad count is 613. A total of 1380 vias were used for 271 signal nets with a total track length of about 5 m.
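The layout figures in the preceding paragraph can be cross-checked with a few lines of Python. All inputs are taken from the text; the derived per-net averages are illustrative additions that the paper does not quote, and the variable names are assumptions.

```python
# Cross-check of the MCM layout figures quoted in the text.

width_cm, height_cm = 4.3, 3.7
silicon_area_mm2 = 80.15
via_count = 1380
signal_nets = 271
total_track_length_mm = 5000.0  # "about 5 m" in the text

substrate_area_cm2 = width_cm * height_cm
# 1 cm^2 = 100 mm^2, so convert before forming the ratio.
silicon_fraction = silicon_area_mm2 / (substrate_area_cm2 * 100.0)

# Derived averages (not quoted in the paper).
vias_per_net = via_count / signal_nets
mean_track_length_mm = total_track_length_mm / signal_nets

print(f"substrate area: {substrate_area_cm2:.1f} cm^2")   # 15.9 cm^2
print(f"silicon fraction: {silicon_fraction:.0%}")        # 5%
print(f"vias per net: {vias_per_net:.1f}")
print(f"mean track length per net: {mean_track_length_mm:.1f} mm")
```

The computed area and silicon fraction agree with the quoted 15.9 cm<sup>2</sup> and 5 %.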



Figure 7: MCM after glob-top encapsulation of individual chips and prior to final hermetic encapsulation.

### IV. THERMAL PERFORMANCE

The thermal performance was first calculated by considering one-dimensional heat flow. These calculations can be done at an early stage of the design process and can help to choose the optimal cooling approach before manufacturing. In order to determine the temperature behaviour of the module, a two-dimensional temperature simulation was performed. After manufacturing, the adequacy of the thermal design was confirmed by temperature measurements.

#### A. Thermal calculations

Figure 8 illustrates the temperature rise ($\Delta T = R \cdot P$) for each chip, which is the product of the thermal resistance $R$ and the power dissipation $P$. From that figure, one can see that the FADC cooling is improved by about 87 % by using thermal vias. Furthermore, the cutout technology used for the high-power G-link die has improved the cooling by about 90 %.
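The one-dimensional estimate can be sketched in Python as follows. The thermal resistances and the power value below are hypothetical placeholders chosen only to illustrate the calculation; they are not values from the paper. Only the 45 °C heatsink temperature is taken from the caption of Figure 8, and reading "improvement" as the relative reduction of the temperature rise is an assumption.

```python
# One-dimensional heat-flow estimate: delta_T = R * P.
# All resistances and powers below are HYPOTHETICAL placeholders;
# only the 45 degC heatsink temperature is quoted in the paper.

HEATSINK_TEMP_C = 45.0


def temperature_rise(r_th_k_per_w: float, power_w: float) -> float:
    """Temperature rise across a thermal resistance (K) for a given power (W)."""
    return r_th_k_per_w * power_w


def relative_improvement(rise_before: float, rise_after: float) -> float:
    """Fractional reduction of the temperature rise (assumed meaning of 'improvement')."""
    return (rise_before - rise_after) / rise_before


power_w = 1.0                                     # hypothetical chip power
rise_plain = temperature_rise(20.0, power_w)      # hypothetical R without thermal vias
rise_with_vias = temperature_rise(2.6, power_w)   # hypothetical R with thermal vias

die_temp_c = HEATSINK_TEMP_C + rise_with_vias
print(f"{relative_improvement(rise_plain, rise_with_vias):.0%}")  # 87%
print(f"{die_temp_c:.1f} degC")                                   # 47.6 degC
```

With these placeholder numbers the sketch reproduces an improvement of the same order as the one read from Figure 8; the actual resistances would have to be taken from the design data.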



Figure 8: Temperature rise for each chip assuming a uniform heatsink temperature of 45 °C.

#### B. Thermal simulations and measurements

Figure 9 shows a two-dimensional temperature simulation (air map). A fan is blowing vertically from the bottom to the top and the air map is overlaid on the MCM layout to identify the component positions. From the initial air temperature of 25 °C the air is heated up by 1.9 °C to 26.9 °C. Its maximum value is reached above the FADCs and the Finco die, whereas it is less heated above the G-links sitting in cutouts.



Figure 9: Two-dimensional temperature simulation.

Two-dimensional temperature measurements were done using an infrared-sensitive camera (Amber Radiance). Its wavelength sensitivity ranges from 3 to 5 $\mu$m, with a resolution of $256 \times 256$ pixels and a dynamic range of 12 bits. The camera displays temperature as grayscale and calculates a mean value, which can be used to measure the mean temperature of the MCM. Two measurements of the un-clocked MCM were performed: one with fan cooling throughout, and a second that started un-cooled, with the fan turned on after 2.1 minutes. Figure 10 shows the transient temperature behaviour over the 4 minutes following power-on.



Figure 10: Two-dimensional temperature measurements.

The infrared picture in Figure 11 was taken 4 minutes after the MCM was powered on. At this point (labeled 5 in Figure 10) the MCM has reached an equilibrium temperature of 34.6 °C $\pm$ 1.5 °C. The fan blows from the bottom to the top of the picture, so the cooling effect is better for the chips close to the fan. All the chips at the bottom of the picture are slightly colder than their adjacent neighbours at the top. This is particularly true for the FADC sitting in front of the SMD connector. The measured temperature distribution corresponds qualitatively to the simulation in Figure 9.



Figure 11: Infrared picture used for temperature measurements.

### V. SYSTEM MEASUREMENT RESULTS

This section describes measurement results obtained as part of a modular Pre-Processor test system. The aim was to demonstrate the operation of the demonstrator MCM, with all its real-time preprocessing and its high-speed serial data transmission of trigger tower data. The MCM test set-up consists of two VME motherboards, each equipped with a Common Mezzanine Card (CMC). One motherboard carries a Pre-Processor CMC card and the other carries a G-link receiver CMC card. As input to the Pre-Processor card, a signal with the pulse shape of a liquid-argon calorimeter was generated by an Arbitrary Function Generator (AFG).



Figure 12: Correlation of the analogue input signal with the high-speed serial bit-stream (G-link signal).

The G-link output signals from the MCM were connected via a 1 m long coax cable to the G-link receiver card. The G-link receiver decodes the serial bit-stream and provides the parallel output data to a motherboard FPGA for readout.

Figure 12 shows the correlation of the analogue input signal with the high-speed serial bit-stream. The bunch-crossing-identified data appear after a latency of 9 bunch-crossings (225 ns) in one G-link bit-stream. This latency is attributed as follows: one clock tick from the FADC, seven clock ticks from the FeAsic, and one clock tick from the G-link.
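The latency budget above can be tallied in a short Python sketch. The per-stage tick counts and the 25 ns bunch-crossing period (40 MHz) are from the text; comparing the sum against the 17-cycle preprocessing limit from Section II is an illustrative check, and the dictionary layout and names are assumptions.

```python
# Tally of the measured MCM latency in bunch-crossing ticks.
# Stage contributions are taken from the text; names are illustrative.

BUNCH_CROSSING_NS = 25.0         # one tick of the 40 MHz clock
PREPROCESSING_BUDGET_TICKS = 17  # maximum latency quoted in Section II

latency_ticks = {"FADC": 1, "FeAsic": 7, "G-link": 1}

total_ticks = sum(latency_ticks.values())
total_ns = total_ticks * BUNCH_CROSSING_NS
margin_ticks = PREPROCESSING_BUDGET_TICKS - total_ticks

print(total_ticks, total_ns)  # 9 225.0
print(margin_ticks)           # 8
```

On these figures the MCM itself accounts for 9 of the 17 clock cycles allowed for preprocessing, leaving 8 cycles for the rest of the chain.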



Figure 13: Air temperature profile.

A temperature model of the MCM package was extracted and used for a board-level simulation of the final Pre-Processor motherboard. This simulation has identified a chevron-like MCM placement as best suited for low temperature profiles. Figure 13 shows the temperature profile of the air flow above 16 MCMs mounted on a 9U VME board and Figure 14 shows the temperature profile of the motherboard itself. The maximum air temperature is 49.3 °C and the maximum board temperature is 31.4 °C.



Figure 14: Board temperature profile.

### VI. CONCLUSIONS

The measurement results described here have demonstrated the operation of the demonstrator MCM. This includes all the preprocessing from the analogue line receiver circuit up to the reception of high-speed serial data at 800 MBd. Temperature simulations and measurements have shown the beneficial effect of cutouts and thermal vias.

### VII. REFERENCES

- [1] ATLAS Level-1 Trigger Group, *ATLAS First-Level Trigger Technical Design Report*, ATLAS TDR-12, CERN/LHCC/98-14, CERN, Geneva, 24 June 1998. http://atlasinfo.cern.ch/Atlas/GROUPS/DAQTRIG/TDR/tdr.html
- [2] Pfeiffer, U. et al., *ATLAS Level-1 Calorimeter Trigger System Architecture*, Fourth Workshop on Electronics for LHC Experiments, CERN/LHCC/98-36, Rome, Italy, 21-25 September 1998. http://www.asic.ihep.uni-heidelberg.de/atlas/publications.html
- [3] Mass, A. et al., *Front-End Digitization and Readout System for the ATLAS Level-1 Calorimeter Trigger*, Second Workshop on Electronics for LHC Experiments, CERN/LHCC/96-39, Balatonfüred, Hungary, 23-27 September 1996. http://wwwasic.ihep.uni-heidelberg.de/atlas/publications.html
- [4] Würth Elektronik, http://www.wuerth-elektronik.de