First Level Trigger for H1, using the latest FPGA generation by Urban, M et al.
First Level Trigger for H1, using the latest FPGA generation
M. Urban, U. Straumann, University of Zurich
A. Rausch, J. Becker, University of Heidelberg
DESY, 22603 Hamburg, Germany
urban@mail.desy.de
Abstract
To cope with the higher luminosities after the HERA
upgrade, H1 [1] is building a set of new MWPCs, which
provide information to distinguish between beam
background and true ep interactions. The first level
trigger uses the latest APEX 20K400 FPGAs with 500
user I/O pins to find tracks in 10’000 digital pad signals.
This allows the event vertex to be reconstructed and a
cut to be placed on its position.
The system works deadtime-free in a pipelined
manner at a clock frequency of 41.6 MHz. The pipelines
needed for data acquisition are also programmed into the
same FPGAs.
I. OVERVIEW
For the upgrade project many of the components of
the H1 detector have to be modified or redesigned. The
new central inner multiwire proportional chamber (CIP)
consists of five cylindrical detector layers with cathode
pad readout [2]. The total active length of the detector is
2.20 m and its inner diameter is 30 cm. The size of the
pads is matched to the anticipated resolution of the
reconstruction of the event origin on the beam axis. The
total number of readout channels is nearly 10000, ten
times larger than in the previous system [1]. 
The trigger system is supposed to reconstruct tracks
from the hit patterns of the CIP. From the distribution of
the track origins it should differentiate between true ep
collisions and background events. In contrast to the
previous system it should not only recognize tracks
originating from the nominal vertex region, but also
actively count background tracks, in order to improve
the rejection of beam-related background events.
The new CIP system consists of three parts: 
1. The active detector, including the signal amplifier
and discriminator electronics in an ASIC [5].
2. The link system, which multiplexes the digital
chamber data and transmits them to the electronics
trailer via optical fibers at 3.3 Gbit/sec per cable
[4].
3. The trigger and data acquisition system, located in
the electronics trailer. This part is described in this
note.
The CIP upgrade is expected to be fully operational
and ready for data taking in June 2001.
II. TRIGGER ALGORITHM
In the first step, the trigger needs to recognize tracks
and reconstruct their origin on the beam axis. For this
purpose all possible hit patterns in the five layers of the
readout pads of the CIP are stored together with their
origin. Allowed hit patterns are arranged around each of
the 106 pads of the middle plane. 45 pads are members
of such a local environment (lightly shaded in Figure 1).
Because the pad size is adjusted in each layer (projective
geometry), a given pattern always points to the
same origin on the beam axis, independent of the
absolute position of the corresponding central pad.
Therefore the logical track patterns and their origins are
the same for the local environment of every central pad,
which simplifies the implementation of the trigger
algorithm significantly.
A list of all these track patterns is maintained, in which
the patterns active in a given event are flagged by a
single bit each. The array of these flags is called the
hitlist.
The hitlist is sorted according to the origin of the
tracks along the beam axis into 15 groups (bins) of about
16 cm width.
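The pattern lookup described above can be sketched in a few lines of Python. This is a toy model and not the FPGA implementation: the pattern table `PATTERNS`, the pad indexing, and the requirement that all five layers have a hit are assumptions made for this example. It only illustrates that one table of relative patterns can be reused for every central pad.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical pattern table (not from the paper): for each pattern, the pad
# offsets in the five layers relative to the central pad of the middle layer,
# and the z bin (0..14) of the track origin assigned to that pattern.
PATTERNS: List[Tuple[Tuple[int, int, int, int, int], int]] = [
    ((0, 0, 0, 0, 0), 7),
    ((-2, -1, 0, 1, 2), 5),
    ((2, 1, 0, -1, -2), 9),
]


def build_hitlist(hits: Sequence[Set[int]], n_central_pads: int) -> List[List[bool]]:
    """Return hitlist[central_pad][pattern] = True if the pattern is active.

    hits[layer] is the set of pad indices that fired in that layer.
    Assumption: a pattern is active only if all five layers have a hit.
    """
    hitlist = []
    for central in range(n_central_pads):
        flags = [
            all(central + off in hits[layer] for layer, off in enumerate(offsets))
            for offsets, _z_bin in PATTERNS
        ]
        hitlist.append(flags)
    return hitlist


def active_bins(hitlist: List[List[bool]]) -> List[int]:
    """List the z bin of every active pattern (the hitlist sorted into bins)."""
    return [PATTERNS[p][1] for flags in hitlist for p, on in enumerate(flags) if on]
```

Because the same `PATTERNS` table is applied at every central pad, the logic for one local environment can simply be replicated, which is the simplification the projective geometry provides.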
The whole detector is divided into 16 ϕ sectors, and the
pattern recognition is implemented separately for each of
them. The next step in the algorithm is therefore to count
the active track patterns of the hitlist in each bin, and
then to add the contents of all bins with the same z
position over all ϕ sectors. The result of this operation
is a set of 15 numbers of 10 bits each, the so-called
vertex histogram.
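The two summing steps can be sketched as follows. This is a minimal Python illustration, assuming that each ϕ sector delivers the z bin of every active pattern as a plain list. The saturation at the 10 bit maximum is also an assumption, since the text does not say how an overflow would be handled.

```python
from typing import Iterable, List

N_BINS = 15            # z bins along the beam axis
N_SECTORS = 16         # phi sectors
MAX_COUNT = 2**10 - 1  # largest value of a 10 bit number


def sector_histogram(active_bins: Iterable[int]) -> List[int]:
    """Count the active track patterns of one phi sector in each z bin."""
    hist = [0] * N_BINS
    for z_bin in active_bins:
        hist[z_bin] += 1
    return hist


def vertex_histogram(sectors: List[Iterable[int]]) -> List[int]:
    """Add the bins with the same z position over all phi sectors."""
    if len(sectors) != N_SECTORS:
        raise ValueError("expected one entry per phi sector")
    total = [0] * N_BINS
    for active in sectors:
        for z_bin, count in enumerate(sector_histogram(active)):
            total[z_bin] += count
    # Assumption: counts saturate at the 10 bit maximum instead of wrapping.
    return [min(count, MAX_COUNT) for count in total]
```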
Figure 1: Examples of active track patterns and their
origin on the beam axis. The lightly shaded regions in
the detector indicate two examples of local environments
around a central pad, where tracks are recognised.
The example in Figure 2 shows the vertex
histogram of a good ep event, where most tracks have
been reconstructed with an origin around bin 6. The
content of each bin corresponds to the number of
active track patterns originating from that region of the
beam axis.
Finally, information about the quality of the event has
to be extracted from the vertex histogram and
summarised in a single 16 bit trigger word, which is then
processed by the central trigger system of H1 together
with other first level trigger information. Several
methods for analysing the vertex histogram are being
discussed. The best performance seems to be achieved
by cutting on the ratio of background tracks (dark region
in Figure 2) to signal tracks. Monte Carlo simulations of
the new system using real events from H1 data indicate
that the new system improves the background rejection
by about a factor of 20 compared to the present
system. The implementation should allow for maximal
flexibility in generating the trigger word.
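One possible form of such a ratio cut is sketched below in Python. The set of background bins, the threshold, and the treatment of events without signal tracks are hypothetical choices for illustration, because the text leaves the final method open. The comparison is written without a division, which is one way to keep such a cut simple in hardware.

```python
from typing import Sequence, Set


def passes_ratio_cut(
    hist: Sequence[int],
    background_bins: Set[int],
    max_num: int,
    max_den: int,
) -> bool:
    """Accept the event if background / signal <= max_num / max_den.

    hist is the vertex histogram; background_bins (hypothetical) marks the
    bins counted as background, and all other bins count as signal.
    """
    background = sum(n for b, n in enumerate(hist) if b in background_bins)
    signal = sum(n for b, n in enumerate(hist) if b not in background_bins)
    # Assumption: events without any signal track are rejected.
    if signal == 0:
        return False
    # Cross-multiplied form of the ratio comparison, avoiding a division.
    return background * max_den <= max_num * signal
```

With `max_num=1` and `max_den=2`, for example, an event passes only if it contains at most half as many background tracks as signal tracks.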
III. HARDWARE
The implementation of the trigger and data
acquisition part of the new CIP system is concentrated
in a total of six VME crates located in the electronics
trailer. Four of these crates are identical and contain one
trigger card per ϕ sector, which is the heart of the
system. Each trigger card contains two FPGAs into which
both the complete trigger algorithm and the pipelines for
the data acquisition of the raw detector data of the
respective ϕ sector are programmed. The two additional
crates house the standard H1 trigger and detector control
electronics, as well as the data acquisition and slow
control computers.
Each of the four trigger crates (6U height) has a
standard VME D16 backplane in the upper half of the
crate. In the lower half a custom-built backplane acts as
the backbone for the input data distribution. On the rear
side of this backplane the receiver boards for
the optical links [4] from the detector are mounted. One
board receives the data of two neighbouring ϕ sectors of
the same detector plane and partially demultiplexes the
data down to four-fold multiplexing. The transfer
speed per data line on the backplane is 41.6 Mbit/sec.
The backplane then redistributes these data to the
trigger cards, which are plugged in from the front side
and each need the data of all five planes of one ϕ
sector.
In addition the custom backplane contains the clock
and control signal distribution, as well as the upper VME
data lines D16 ... D31.
The trigger cards [3] contain two Altera APEX
20K400 FPGAs [6] into which the complete trigger
algorithm and the data acquisition pipelines are
programmed. The 20K400 has 500 user I/O pins
available and consists of about 1600 logic array blocks
(LABs, often called configurable logic blocks, CLBs).
Each block of 16 LABs (a so-called MegaLAB) shares a
common memory space of about 2000 bits, the access to
which can be organised in many different ways. Each
LAB consists of 10 logic elements (LEs), the smallest
unit of logic in the APEX. Each LE contains, among other
things, a programmable register and a lookup table for
the input definition. There are 16’640 LEs in a 20K400
and a total of 1 million equivalent gates.
FPGAs compete with ASICs and DSPs as
programmable devices for fast parallel applications. With
the increasing number of gates in the available FPGAs,
these devices are well matched to the demands of fast
trigger systems in particle physics experiments. Because
the Altera APEX devices combine a large number of logic
elements with random-access memory, large lookup
tables, the logic for the trigger algorithms and the
pipelines for data storage can all be placed
within the same device, which suits the needs of most
pipelined trigger systems.
To realize the trigger algorithm needed for the CIP
system, nearly 14’000 LEs are needed for each ϕ sector.
This means that about 500’000 transistors have to work
reliably. The trigger algorithm alone would therefore in
principle fit into a single FPGA; however, to store the
raw data of 32 bunch crossings (BX) in a pipeline
(implemented as a FIFO memory) two FPGAs are needed.
The total amount of information processed by each FPGA
amounts to 392 MByte/sec.
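The behaviour of such a raw data pipeline can be modelled in software as a fixed-depth buffer. The Python sketch below is an illustration only: the class name `RawDataPipeline` and its methods are hypothetical, and the FPGA implementation as a FIFO memory is not reproduced here.

```python
from collections import deque


class RawDataPipeline:
    """Software model of a pipeline that keeps the last `depth` BX of raw data."""

    def __init__(self, depth: int = 32) -> None:
        self.depth = depth
        self._buf = deque(maxlen=depth)

    def clock(self, raw_word: int) -> None:
        """Store the raw data of one BX; the oldest entry drops out when full."""
        self._buf.append(raw_word)

    def read(self, bx_ago: int) -> int:
        """Return the raw data stored `bx_ago` BX before the newest entry."""
        if not 0 <= bx_ago < len(self._buf):
            raise IndexError("requested BX is no longer (or not yet) stored")
        return self._buf[-1 - bx_ago]
```

A depth of 32 BX leaves room for a trigger decision that arrives some 22 BX after the collision, since the corresponding entry is then still stored and can be read back with `read(22)`.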
The programming of the FPGAs is supported by a
hardware development package called Quartus [6]. This
software tool offers Verilog design entry for the logic
and contains fitters and routers to make optimal use of
the logic resources in the FPGA. A simulation tool allows
the logic to be checked together with all details of the
timing, taking into account the effective length and
capacitance of the connections within the device after
placing and routing. Although the Quartus software was
initially buggy and unstable, it has since evolved into a
reliable and easy-to-use tool, which allows short
turnaround times.
A Lattice iSpL PLD connects the APEX devices with
the VME bus. Six EEPROMs store the configuration code
that is programmed into the FPGAs after a system reset.
The raw data stored in the pipelines of the FPGAs are
transferred to the CPUs through the VME bus backplane
by readout software. The VME CPU controls the data
transfer, compresses the pad information (zero
suppression) and transmits the event information to the
central event building systems [7].
Figure 2: Vertex histogram of a single event (Monte Carlo
simulation). The nominal vertex region is centered around bin
number 6. The dark region is assigned to background.
Special sum cards sum the data of all ϕ sectors,
analyse and evaluate the histogram and form the 16 bit
trigger word, which is sent to the central trigger system.
Timing: At HERA the bunches cross every 96 ns (= 1 BX),
so an ep collision can occur every 96 ns. The trigger
decision of the H1 level 1 trigger is taken about 22 BX
after the collision. All trigger hardware and data
storage therefore need to be pipelined until the level 1
trigger decision arrives. Taking cable delays into
account, the maximum tolerable latency for the CIP
trigger electronics is limited to 1 µs or about 10 BX.
The FPGAs run with a 41.6 MHz clock, phase locked to the
10.4 MHz BX clock, allowing for 4 computational steps
per BX.
Since the raw data are delivered four-fold multiplexed
on each input line, the first step in the FPGA logic is to
demultiplex the data, which takes one BX. The track
recognition and evaluation of the hitlist also take one
BX. Next, two BX are needed to count the active
patterns in each bin (8 level adder cascade). Finally one
more BX is needed to multiplex the 15 numbers of 8 bits
each onto 32 data lines for transmitting the result to
the sum cards. All this timing has been verified with the
Quartus simulation; in addition, some of the critical
steps have been measured with an oscilloscope to verify
the simulation [3].
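The output multiplexing works out because 15 numbers of 8 bits are 120 bits, while 32 lines with 4 clock steps per BX offer 128 bit slots. The Python sketch below packs and unpacks the counts accordingly. The bit ordering and the function names are assumptions for illustration, not the format actually used between the trigger cards and the sum cards.

```python
from typing import List, Sequence

N_LINES = 32      # data lines to the sum card
STEPS_PER_BX = 4  # clock steps per BX (41.6 MHz / 10.4 MHz)
N_BINS = 15       # counts per phi sector
BITS_PER_BIN = 8


def multiplex(counts: Sequence[int]) -> List[int]:
    """Pack 15 counts of 8 bits into 4 words of 32 bits, one word per clock step."""
    if len(counts) != N_BINS or any(not 0 <= c < 2**BITS_PER_BIN for c in counts):
        raise ValueError("expected 15 counts in the range 0..255")
    bits = 0
    for i, count in enumerate(counts):
        bits |= count << (BITS_PER_BIN * i)  # assumed ordering: bin 0 in the lowest bits
    mask = (1 << N_LINES) - 1
    return [(bits >> (N_LINES * step)) & mask for step in range(STEPS_PER_BX)]


def demultiplex(words: Sequence[int]) -> List[int]:
    """Inverse of multiplex: recover the 15 counts from the 4 words."""
    bits = 0
    for step, word in enumerate(words):
        bits |= word << (N_LINES * step)
    return [(bits >> (BITS_PER_BIN * i)) & (2**BITS_PER_BIN - 1) for i in range(N_BINS)]
```

In this scheme the upper 8 bits of the last word remain unused.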
The sum card will contain a four level adder cascade
and lookup tables for the trigger word. Including the
input demultiplexing, we expect that this will take no
more than a further 4 BX. The total latency of the trigger
logic is therefore 9 BX or about 0.86 µs, well within the
requirement of the level 1 system.
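The latency budget can be checked with a few lines of Python. The stage durations are those quoted in the text, and the 4 BX for the sum card is the expected upper limit rather than a measured value. The stage labels are only descriptive names chosen here.

```python
BX_NS = 96  # length of one bunch crossing in ns

# Number of BX per processing stage, as quoted in the text.
STAGES_BX = {
    "input demultiplexing": 1,
    "track recognition and hitlist evaluation": 1,
    "counting active patterns per bin": 2,
    "output multiplexing to the sum cards": 1,
    "sum card (expected upper limit)": 4,
}

LATENCY_LIMIT_NS = 1000  # maximum tolerable latency of 1 µs

total_bx = sum(STAGES_BX.values())
total_ns = total_bx * BX_NS
steps_per_bx = round(41.6 / 10.4)  # FPGA clock cycles per BX

print(f"{total_bx} BX = {total_ns} ns, limit {LATENCY_LIMIT_NS} ns")
print(f"{steps_per_bx} clock steps per BX")
```

The sum reproduces the 9 BX stated above, which at 96 ns per BX is 864 ns.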
Present status: The trigger cards including the
FPGAs (mounted as ball grid arrays) have been
successfully tested together with the custom backplane
and the readout CPUs [7]. The sum cards and the final
readout software are currently being designed at the
University of Heidelberg.
IV. REFERENCES
[1] H1 Collaboration: The H1 Detector, H1 internal
report H1-96-01 and Nucl. Instr. and Meth. A 386,
1997, see also www-h1.desy.de/
[2] M. Cuje et al.: H1 High Luminosity Upgrade 2000
CIP and Level 1 vertex Trigger, H1 internal report
H1-IN-535(01/1998)
[3] M. Urban, Diploma Thesis Univ. Heidelberg, 5/2000
[4] S. Lüders, these proceedings.
[5] Documentation on the CIPix:
wwwasic.kip.uni-heidelberg.de/~feuersta/projects/CIPix/index.html
[6] Altera Internal Notes, www.altera.com, 5/2000
[7] J. Becker, H1 internal presentation, 1/2000; and
Diploma Thesis Univ. Heidelberg, to app. 11/2000.
Figure 3: The trigger card. On the rear (left) the VME connector and (right) the high density connector (250
pins) for the input signals from the custom backplane can be seen.
