Digital Signal Processing for Particle Detectors in Front-End Electronics by Naaranoja, Tiina
Master’s Thesis
Physics
Digital Signal Processing for Particle Detectors
in Front-End Electronics
Tiina Naaranoja
(2014)
Supervisor: Prof. Risto Orava
Ph.D. Paul Aspell
Reviewers: Prof. Risto Orava
Univ. Lect. Kenneth Österberg
University of Helsinki
Department of Physics
PL 64 (Gustaf Hällströmin katu 2)
00014 Helsingin yliopisto Finland
Faculty: Faculty of Science. Department: Department of Physics
Author: Tiina Naaranoja
Title: Digital Signal Processing for Particle Detectors in Front-End Electronics
Subject: Physics
Level: Master’s Thesis. Month and year: September 2014. Number of pages: 69
Keywords: Digital Signal Processing, DSP, Electronics, ASIC, Particle Detectors, CERN, LHC, CMS, GEM
Abstract:
The Large Hadron Collider (LHC) at CERN is currently being started up after a long shutdown. Another similar maintenance and upgrade period is due to take place in a few years. The luminosity and maximum beam energy will be increased after the shutdowns. Many upgrade projects stem from the increased demands of the changed environment and from the opportunity for installation work during the shutdowns. The CMS GEM collaboration proposes to upgrade the muon system of the CMS experiment by adding Gas Electron Multiplier (GEM) chambers.
The new GEM detectors need new front-end electronics. There are two parallel development branches for mixed-signal ASICs: one with analog signal processing only (the VFAT3 chip) and another with both analog and digital signal processing (the GdSP chip). This thesis covers the development of the digital signal processing for the GdSP chip. The design is described at the algorithm level and with block diagrams.
The signal originating in the triple-GEM detector poses special challenges for the signal processing. The time constant of the analog shaper is programmable because of irregularities in the GEM signal. This in turn poses challenges for the digital signal processing, because the pulse peaking time and the signal bandwidth depend on the chosen time constant.
The basic signal processing techniques and needs are common to many detectors. Most of the digital signal processing shares its requirements with an existing, well-tested front-end chip. Time pick-off and trigger production were not among these shared tasks. Several time pick-off methods were considered and compared with simulations. The simulations were performed first with Simulink running on Matlab and then with Cadence tools using the Verilog hardware description language.
Time resolution is an important attribute determined jointly by the detector and the signal processing. It is related to the probability of associating the measured pulse with the correct event.
The effect of the different time pick-off methods on time resolution was compared with simulations. Only the most promising designs were developed further. Constant Fraction Discriminator and Pulse Recognition, the two most promising algorithms, were compared against the analog Constant Fraction Discriminator and Time over Threshold time pick-off methods. The time resolutions obtained with a noiseless signal were found to be comparable. At least in gas detector applications, digital signal processing should not be ruled out for fear of deteriorated time resolution.
The proposed digital signal processing chain for the GdSP includes Baseline Correction, Digital Shaper, Integrator, Zero Suppression and Bunch Crossing Identification. The Baseline Correction includes options for fixed baseline removal and a moving average filter. In addition, it contains a small memory, which can be used, for example, as a test signal input or as a look-up table. Pole-zero cancellation is proposed for digital shaping. The integrator filters high-frequency noise. The Constant Fraction Discriminator was found to be optimal for Bunch Crossing Identification.
Contents
1. Introduction
1.1. GEMs for CMS
1.2. GEM-electronics
1.3. Motivations for Digital Signal Processing
2. Review on Detector Signal Processing
2.1. Signal generation and amplification
2.2. Time resolution
2.3. Sampling
2.4. Filters, Transfer Functions and z-transform
2.5. Noise Sources
3. Digital Signal Processing Algorithms
3.1. General requirements and similarities with S-Altro chip
3.2. Baseline Correction
3.3. Digital Shaping
3.3.1. Pole-Zero Cancellation
3.3.2. Single Delay Line Shaping
3.3.3. Peak Sharpening
3.4. Bunch Crossing ID, Amplitude and trigger
3.4.1. Piece-wise linear fitting
3.4.2. Time over Threshold
3.4.3. Deconvolution method and pulse recognition
3.4.4. Constant fraction discriminator
3.4.5. Peak finder
3.4.6. Zero-crossing identification
3.5. Zero Suppression
4. Simulations
4.1. Migrating filters from S-Altro
4.2. Preliminary Comparison of Different Time pick-off Methods
4.2.1. Approximation of GEM Signal
4.2.2. Simulation Methods
4.2.3. Simulation results
4.3. Comparison between digital and analog BXID methods
4.3.1. Simulated GEM signal
4.3.2. Simulation methods
4.3.3. Simulation results
4.4. Required Effective Number of Bits
4.4.1. Analytical Estimation
4.4.2. Differential Pulse Height Spectrum
4.4.3. ENOB and time resolution
5. Proposal for the GdSP Signal Processing Chain
5.1. Digital Signal Processing Core and Chain for one channel
5.2. Baseline Correction
5.3. Integrator
5.4. Digital Shaper
5.5. Constant Fraction Discriminator
5.6. Zero-Suppression
6. Conclusions
A. Acronym Glossary
B. Block diagrams of S-Altro DSP filters
B.1. First Baseline Correction
B.2. Digital Shaper / Tail Cancellation Filter
B.3. Second Baseline Correction
B.4. Zero Suppression
C. Block diagrams of Simulink models
C.1. Piece-wise Linear Fitting
C.2. Deconvolution method
C.3. Peak Finder
C.4. Zero-Crossing Identification
C.5. Pulse Recognition
C.6. Constant Fraction Discriminator
C.7. Peak sharpening
C.8. Integrator
D. Block Diagram of Pulse Recognition (Verilog Model)
1. Introduction
1.1. GEMs for CMS
The Compact Muon Solenoid (CMS) is one of the four large experiments at the Large Hadron Collider (LHC) at CERN (Conseil Européen pour la Recherche Nucléaire). It is a general-purpose detector designed to cover as wide a range of the physics at the LHC as possible. Its achievements so far include participation in the discovery of the Higgs boson. The future holds studies of the discovered Higgs boson and searches for supersymmetry and extra dimensions. The detector consists of onion-like layers of different detector types: innermost the inner tracker, consisting of a pixel detector and a silicon tracker, then the electromagnetic calorimeter, the hadronic calorimeter, the superconducting solenoid yielding a 4 T magnetic field, and outermost the muon system. The layout is shown in figure 1.1. Geometrically the detector is divided into the cylindrical barrel, which makes up the bulk, and the planar endcaps at the ends of the cylinder. [1]
Figure 1.1.: A perspective view of the CMS detector (Image source [1])
The muon system is a tracker that is used for muon identification, momentum measurement and triggering. It includes Cathode Strip Chambers (CSCs), Drift Tubes (DTs) and Resistive Plate Chambers (RPCs). The DTs are used in the barrel and the CSCs in the endcaps. Both have good position resolution and good background rejection. The RPCs form an independent triggering system. They have the good time resolution that is needed for correct bunch crossing assignment, but coarser position resolution. [1]
Figure 1.2.: Transverse section of the CMS detector. The proposed GEM detectors are indicated with red boxes on the endcap. (Image source [2])
Figure 1.3.: The CMS detector opened for maintenance. The barrel is on the left and the endcap on the right.
The CMS GEM Collaboration proposes to install Gas Electron Multiplier (GEM) chambers in the forward region of the CMS endcaps (see figure 1.2). They would extend the CMS muon system, which currently has reduced redundancy in the forward region (pseudorapidity 1.6 < |η| < 2.4), where only CSCs are installed. GEMs are an attractive option because they have a large gain (exceeding 10^4), good spatial resolution and good time resolution, and they can cope with the anticipated high rates. The first GEMs would be installed in the region labeled GE1/1 during the second LHC Long Shutdown in 2017-2018. The region labeled GE2/1 would be equipped during the third LHC Long Shutdown, scheduled beyond 2020. [2]
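The pseudorapidity limits quoted above translate into polar angles through the standard definition η = -ln tan(θ/2), where θ is measured from the beam axis. A minimal Python sketch of the conversion follows; the function name is illustrative and does not come from the thesis.

```python
import math


def polar_angle_deg(eta):
    """Polar angle theta in degrees for pseudorapidity eta,
    using the standard definition eta = -ln(tan(theta / 2))."""
    return math.degrees(2.0 * math.atan(math.exp(-eta)))


# Edges of the forward region proposed for the GEM chambers.
theta_inner = polar_angle_deg(2.4)   # closest to the beam line
theta_outer = polar_angle_deg(1.6)   # farthest from the beam line
```

With this definition the region 1.6 < |η| < 2.4 corresponds to polar angles between roughly 10.4° and 22.8° from the beam line.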
Figure 1.4.: Triple GEM with CMS gap configuration. (Image source [3])
Figure 1.5.: GEM foil. Electron microscope image of a GEM foil (left, image source [5]) and the field inside the holes (right, image source [3]).
GEMs are gaseous micro-pattern detectors first introduced in 1996 [4]. The primary ionization occurs in the drift region, which is effectively a proportional counter. What sets the detector apart from a normal proportional counter is the GEM foil shown in Fig. 1.5. The surfaces of the foil are made of a conducting material with an insulating layer between them. Micrometer-scale holes are etched into the foil. A large difference in electric potential between the two surfaces of the foil results in a high field inside the holes, as seen in Fig. 1.5. When electrons from the primary ionization enter the holes, they are accelerated enough to cause secondary ionization. Triple-GEMs have three layers of GEM foils. After the electrons have been multiplied in the GEM foils, they are collected on electrodes (strips), and the signal is amplified and processed in the front-end electronics (see e.g. [6]).
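As a rough feel for the numbers: assuming, purely for illustration, that the three foils of a triple-GEM share the multiplication equally and that no electrons are lost between the foils, the total gain is the product of the three single-foil gains. Real detectors neither split the gain evenly nor transfer all electrons, so the Python sketch below only shows the arithmetic. The function names are hypothetical.

```python
def total_gain(foil_gains):
    """Total gain of a foil stack as the product of the single-foil gains
    (assumption: no electrons are lost between the foils)."""
    gain = 1.0
    for g in foil_gains:
        gain *= g
    return gain


def equal_share_foil_gain(target_gain, n_foils=3):
    """Single-foil gain needed if every foil contributes equally (assumption)."""
    return target_gain ** (1.0 / n_foils)


per_foil = equal_share_foil_gain(1e4)   # about 21.5 per foil
```

Under these assumptions a total gain of 10^4 requires a gain of only about 21.5 per foil, which illustrates how stacking foils reaches a large overall gain with modest multiplication in each stage.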
At present there are triple-GEM chambers installed in the Common Muon and Proton Apparatus for Structure and Spectroscopy (COMPASS) [7], Total Cross Section, Elastic Scattering and Diffraction Dissociation (TOTEM) [8] and Large Hadron Collider beauty (LHCb) [9, 10] experiments. COMPASS has 10 years of experience with GEMs. The COMPASS and TOTEM GEMs use an Ar/CO2 gas mixture in proportions 70:30 and the LHCb GEMs an Ar/CO2/CF4 mixture. Tests have confirmed that the faster gas mixture Ar/CO2/CF4 in proportions 45:15:40 also has better time resolution [11].
Triple-GEM prototypes, both small ones (10 × 10 cm^2) for initial testing and two full-scale superchambers, have been built and tested in the CERN test beam facilities. Both analog (APV25 chip) and digital (VFAT2) front-ends were used. With the analog readout a spatial resolution below 110 µm and a time resolution of 4 ns were achieved. With the digital readout the spatial resolution degraded to 270 µm. [2]
1.2. GEM-electronics
A detector such as a GEM needs a large amount of electronics for signal readout and triggering. For some parts existing electronics can be used, some need to be custom designed, and for some there are ongoing generic development projects. Most of the envisaged electronics falls into the two latter categories.
Figure 1.6.: Diagram of the proposed GEM electronics. (Image source [27])
The overall design of the GEM electronics is shown in Fig. 1.6. The GEM superchamber is divided into segments that each work as an independent detector. Each segment has 128 strips and one front-end (FE) Application Specific Integrated Circuit (ASIC). One surface of the GEM superchamber is a Printed Circuit Board (PCB), which serves both as the electronics board and as the closure of the gas volume. On the detector-side surface the innermost layer is dedicated to the electrode strips. The outer layers are dedicated to Low-Voltage Differential Signaling (LVDS) lines for communication with the FE ASICs, and to clock, ground, and high- and low-voltage supply [12].
The electronic control and readout system has components both on the detector and off the detector in the counting room. The GigaBit Transceiver (GBT) [13] ASIC, a generic development project, is foreseen to control and read out the FE ASICs on the detector. The off-detector system provides the interface to the CMS Data Acquisition (DAQ), TTC and trigger systems. The design will be based on Advanced Mezzanine Cards (AMC) used with the Micro Telecommunications Computing Architecture (µTCA) standard. It will contain Gigabit Link Interface Boards (GLIB) [14] from a generic development project and custom Field Programmable Gate Arrays (FPGA).
Figure 1.7.: Major steps of the integrated circuit design flow (left, image source [15]) and an example of an end product, the VFAT2 front-end ASIC (right, image source [17]).
The FE ASICs will be custom designed. The design process usually takes some years; in this case it is foreseen to take over two years [11]. The main steps of the design flow are shown in Fig. 1.7. The design flow begins with the system definition and requirements. The detector itself is often still under development while its front-end ASIC is being designed, which frequently leads to requirements changing in the middle of the design process. The behavior and logic of the system are designed and tested with simulations. A digital design is written in a Hardware Description Language (HDL) such as Verilog. In HDL there are three different levels of abstraction: behavioral level, Register Transfer Level (RTL) and gate level. At the behavioral level the model describes the algorithm but has no regard for the structural realization. An RTL model is synthesizable, which means that the logic can be expressed as a circuit consisting of discrete primitive operations. The RTL model uses an outside clock and has well-defined timing bounds. The gate-level code is usually generated from the RTL model with synthesis tools. It has only logical values (0, 1, z, x) and uses only the primitive operations (AND, OR etc.) defined by the library that is being used. After more testing comes placing and routing, which for a digital design is done with automated tools. The gates are expressed as transistors on the semiconductor substrate, and the routing via conducting lines is planned. Masks can be created from the resulting model, and the chips are fabricated using the masks. At first only a few prototype chips are fabricated and tested. The behavior of the real chip might differ from the expected, in which case some level of redesign is needed.
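The difference between the behavioral and register-transfer views can be illustrated outside an HDL. The Python sketch below is only an analogy, not Verilog and not part of the GdSP design: it describes the same moving average filter once as a plain statement of the algorithm and once as explicit registers updated on each clock tick. The filter length, the function and class names, and the power-of-two length that turns the division into a bit shift are all assumptions made for the example.

```python
def moving_average_behavioral(samples, length):
    """Behavioral view: states what is computed, not how hardware does it."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - length + 1): n + 1]
        out.append(sum(window) // length)
    return out


class MovingAverageRegisters:
    """Register-transfer view: explicit state updated once per clock tick."""

    def __init__(self, log2_length):
        self.shift = log2_length
        self.delay_line = [0] * (1 << log2_length)  # one register per tap
        self.accumulator = 0                        # running-sum register

    def clock(self, sample):
        """One clock tick: shift the delay line and update the running sum."""
        oldest = self.delay_line.pop()              # sample leaving the window
        self.delay_line.insert(0, sample)           # newest sample enters
        self.accumulator += sample - oldest
        return self.accumulator >> self.shift       # divide by 2**log2_length


adc_codes = [3, 7, 8, 100, 4, 5, 6, 2, 9]           # made-up integer samples
rtl_filter = MovingAverageRegisters(log2_length=2)
rtl_output = [rtl_filter.clock(code) for code in adc_codes]
```

Both descriptions produce the same output sequence for integer input, but only the second one fixes the storage elements and the work done per clock cycle, which is the information a synthesis tool needs.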
The COMPASS GEMs use the APV25 (Analogue Pipeline Voltage mode) [16] front-end ASIC, originally designed for the CMS tracker. It has 128 analog channels that are sampled and multiplexed for analog readout. In TOTEM the readout is based on the VFAT2 (Very Forward Atlas and Totem) [17] front-end chip, designed in the CERN microelectronics group primarily for TOTEM. It has 128 analog input channels. The output is binary information from comparators. The LHCb experiment uses Carioca-GEM [19], which is an adaptation of the CARIOCA (CERN and Rio Current-mode Amplifier) ASIC. It has 8 analog input channels and, like VFAT2, binary outputs from a discriminator. Common to all of these front-end ASICs is that they have at least an amplifier and a shaper for every channel. Of these, the VFAT2 chip is closest to the CMS GEM requirements.
Figure 1.8.: Block diagrams of VFAT3 and GdSP front-end ASICs (Credits Paul Aspell)
The CMS GEM front-end ASIC will be based on VFAT2. The requirements for trigger
and tracking are similar to those of TOTEM, but there are some differences. The new
chip will need to be compatible with a different readout system using a different interface.
The method for trigger production will be different: the trigger signal will be processed
instead of reading out the comparator signal directly. The charge collection time needs to be
longer. Since the optimal collection time is unknown, the pulse shaping time of the shaper
will be programmable. Two alternative options are being investigated: VFAT3
and GdSP (Gas detector digital Signal Processing). The block diagrams of both designs are
shown in Fig. 1.8. The main difference between VFAT3 and GdSP is that the latter will
have Analog to Digital Converters (ADC) and additional digital signal processing (DSP).
For optimal time resolution in VFAT3, a digital Time over Threshold (ToT) technique will
be used after the comparator for precise trigger timing. An alternative using an analog
Constant Fraction Discriminator (CFD) technique is also being investigated [36]. The GdSP will
use a digital time pick-off method that is discussed in sections 3.4 and 5.5. Many parts of
the DSP on GdSP are based on the DSP of the S-Altro (Super-ALICE TPC Read Out) [20]
chip, which in turn is based on the ALTRO (ALICE TPC Read Out) [21] chip currently in use
in the Time Projection Chamber (TPC) detector of A Large Ion Collider Experiment (ALICE).
The S-Altro was used as a role model, a source of inspiration and, for some parts, as a cautionary
example in the design process. Three of the five filters in the GdSP DSP are based on S-Altro.
The baseline correction filter was largely changed, the pole-zero filter was scaled down and
zero suppression was used unchanged. The advantage of the S-Altro blocks is that they have
been tested in use. Many of the filters are unchanged from the ALTRO chip, which is in use in
ALICE, and valuable feedback on how they work in practice was received. The digital
shaper, which was changed since ALTRO, has been tested with the 16-channel S-Altro demonstrator
chip [20].
This thesis work concerns the design of the digital signal processing part of the GdSP chip.
The focus is on the algorithm level and conceptual design. New designs and ideas were
initially tested using Simulink [24], which builds upon Matlab [23]. The benefit was a fast
design process and easy testing. Based on the Simulink designs, models were written in
the Verilog language [25]. The Simulink translator was not used, since automatically generated
code tends to be non-optimized and difficult to read and modify. The Verilog models were
designed and tested using Cadence tools [38].
1.3. Motivations for Digital Signal Processing
None of the preceding GEM front-end ASICs have ADCs and DSP. The main reason
has been the power consumption of ADCs. If there is to be an ADC on all 128 channels, their
power consumption needs to be very low. Only recently has the development of low-power
ADCs reached the point where it is feasible to consider including them. The Successive
Approximation Register (SAR) ADCs [22] currently being developed at AGH University are
envisioned for GdSP. They have a very low power consumption of 1 mW per channel at a 40 MHz
sampling rate. The development work is still ongoing, but promising.
The benefits of DSP include additional processing, which leads to a cleaner trigger: more
of the background artefacts can be distinguished from the signal. An important aspect is the
reduction of read-out data. With the luminosity increase in the LHC upgrade, overflowing buffers
in front-end ASICs are threatening to become a very real problem. Important attributes,
such as the timing or the center of a charge cluster, can be extracted from the signal. The data is
reduced when only the necessary attributes are read out. Digital processing is also more
flexible than analog, and the filters are relatively easy to make programmable.
As discussed, the position resolution was seen to be worse with digital readout in GEM
tests. This could be remedied with cluster processing. The center of gravity of the clusters
could be evaluated either on-chip for coarser position resolution or off-chip for finer resolution.
If the clusters are to be processed off-chip, the pulse amplitude needs to be read
out. Alternatively the data could be reduced by reading out only the central channel of a
cluster. Unfortunately a single algorithm cannot satisfy all of these desires simultaneously.
In GdSP, the option of letting the user choose between pulse amplitude readout and
binary readout was pursued. There is no certainty that cluster processing will be needed. If
the GEMs are to work in the same fashion as the RPCs in the muon system, very fine position
resolution is not needed.
2. Review on Detector Signal Processing
The amount of literature on signal processing for particle detectors is very limited. Some
radiation detection textbooks, such as Knoll [6], contain a chapter or two on signal processing.
The abundant literature on digital signal processing covers only part of the signal processing
relevant for particle detectors. The whole picture is obtained by combining information
from signal-processing-specific and radiation-detection-specific literature.
The signal processing chain for particle detectors begins in the detector and ends with
formatted data. The main objective of the signal processing is to remove irrelevant information
(noise, baseline fluctuations etc.) from the signal and to extract the wanted information
from it. Each step in the chain depends on the previous step and often also on the following
steps. The signal processing depends on the detector type and on what the detector is to be
used for. The properties of the detector, such as the charge collection time, determine what
kind of time constant is used in the preamplifier. This in turn has large implications for the
following signal processing. The use for which the detector is built determines which
attributes of the signal are needed. In trackers the knowledge that there was a pulse is all
that is needed in the end. For other applications (calorimeters, for example) the pulse height
is a very important attribute. The importance of timing varies with the application as well.
2.1. Signal generation and ampliﬁcation
The signal begins in the detector. In gas detectors the particle being detected causes ionization
in the detector. The electrodes start collecting charge while the electrons are
still on their way and have not yet reached the electrode: the moving charge induces charge
on the electrode. The full charge is collected when the electrons reach the electrode.
The Shockley-Ramo theorem from the late 1930s predicts the form of the induced signal.
It uses the concept of the weighting field $\vec{E}_w$, which is the electric field inside the detector when
one electrode is connected to unit potential (i.e. 1 V). It is a calculated quantity and not
a real field. The induced current and charge are given by
$$ i = -q\,\vec{E}_w \cdot \vec{v} \tag{2.1} $$
and
$$ Q = q \int \vec{E}_w \, dx \tag{2.2} $$
respectively, where $\vec{v}$ is the velocity of the moving charge [26].
In GEMs the moving charge is screened from the electrodes by the GEM foils everywhere
except in the collection region shown in Fig. 2.1. When a charged particle enters the GEM,
it ionizes the gas in the drift zone. The electrons and ions are distributed along the track
of the particle. The ions drift towards the drift cathode and are not used in the measurement.
The primary electrons drift towards the first GEM foil. When they reach the foil, they are
distributed in time by the drift time across the drift region. In the GEM foil holes the strong local
electric field accelerates the electrons, which acquire enough energy to ionize the gas.
The secondary electrons formed join the primary electrons. The electrons are multiplied by a
factor of around 20 [27]. This is repeated at each GEM foil. The total charge collected is the
sum of all these electrons.
Figure 2.1.: Schematic cross-section of a triple GEM detector. The dimensions do not correspond
to CMS GEMs, but the concept is the same. (Image source [10])
Figure 2.2.: Triple GEM signal with 2 ns shaping time. On the y-axis the signal amplitude is
given in arbitrary units. (Image source [10])
Three signals measured from MIPs in a triple GEM by Ziegler et al. are shown in Fig. 2.2. As
seen from the figure, the signal from a triple GEM is very irregular. The total duration of the
signal is consistent with the charge collection time that is obtained by summing the drift times
in the drift and collection regions [10]. The irregularities in the signal are thought to originate
from clusters created during the primary electron formation [10]. The irregularities of the
signal need to be addressed in the amplifier.
The amplification is usually performed in the stages shown in Fig. 2.3 a). In some applications
the first stage, the preamplifier, is the only necessary stage. The last stage, the shaper (e.g. pole-zero
cancellation), is optional. Often an RC circuit is used to discharge the
accumulated charge from the amplifier in a controlled way, as illustrated in Fig. 2.3 b). Without it the output
would be a step function with a height corresponding to the total charge collected. The
preamplifier is usually followed by a shaping amplifier. Fig. 2.3 c) shows an example of a
shaping amplifier: a CR-RC shaper with amplification. Additional shaping, for example pole-zero
cancellation, is sometimes needed.
Figure 2.3.: Simplified diagrams of the signal amplification and shaping with their output.
Panels: a) amplification and shaping chain (preamplifier, shaping amplifier, shaper); b) preamplifier
with R0 and C0, whose output rises in τc and decays as e^(-t/RC); c) CR-RC shaping with a
differentiator (R1, C1) and an integrator (R2, C2), whose output peaks at τp.
After the preamplifier the pulse has a rise time that equals the charge collection time
in the detector, τc. If the preamplifier is accompanied by the resistor R0, the charge is
discharged with the decay constant τ0 = R0C0. The decay constant is independent of the
detector capacitance Ci if the amplification satisfies
$$ A \gg \frac{C_i + C_0}{C_0}. $$
In the shaping amplifier the differentiator modifies the pulse decay. The differentiator's impulse
response is of the form
$$ V \sim e^{-t/\tau_1}, $$
where the decay constant is τ1 = R1C1. The integrator modifies the rise time of the pulse.
It has the time constant τ2 = R2C2. The impulse response of the integrator is of the form
$$ V \sim \left(1 - e^{-t/\tau_2}\right). $$
In the special case where τ1 = τ2 = τ, the combined impulse response of the differentiator
and integrator is given by
$$ V = V_0 \, \frac{t}{\tau} \, e^{-t/\tau}, \tag{2.3} $$
where V0 is the input pulse height. The pulse has a peak when t = τ. In this case the peaking
time τp and the shaping time τ are equal; later in this text they are used as synonyms. Equation
2.3 applies to the preamplifier output if the collection time τc < τ. If this condition is not
met, the pulse suffers from ballistic deficit, since not all of the charge is included in the integration
[6].
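The shape in equation 2.3 can be checked numerically. The following is a minimal sketch, assuming an ideal CR-RC response; the function names, the 50 ns shaping time and the 25 ns sampling grid are illustrative choices and not part of the GdSP design.

```python
import math

def cr_rc_pulse(t, tau, v0=1.0):
    """Equation 2.3: V(t) = V0 * (t / tau) * exp(-t / tau) for t >= 0."""
    if t < 0:
        return 0.0
    return v0 * (t / tau) * math.exp(-t / tau)

def peaking_time(tau, step=0.01):
    """Locate the pulse maximum on a fine time grid (same unit as tau)."""
    n = int(10 * tau / step)
    k_best = max(range(n + 1), key=lambda k: cr_rc_pulse(k * step, tau))
    return k_best * step

def sample_pulse(tau, period=25.0, n_samples=16):
    """Sample the pulse every `period`, as an ADC clocked at 1/period would."""
    return [cr_rc_pulse(k * period, tau) for k in range(n_samples)]

tau = 50.0                      # assumed shaping time in ns
t_peak = peaking_time(tau)      # expected to be close to tau
samples = sample_pulse(tau)     # 25 ns sampling: two samples on the rising edge
peak_index = samples.index(max(samples))
```

With these assumed values the maximum falls at t = τ with height V0/e, and the largest 25 ns sample is the one taken at 50 ns.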
In GEMs the collection time depends on the fill gas and the detector geometry. It is expected
to be around 40 ns [27] for CMS GEMs. In addition to the ballistic deficit, too short a shaping
time can lead to a deformed pulse shape because of the irregularities in the signal [28]. For
these reasons the shaping time needed is foreseen to be at least 50 ns. This is twice the
sampling period of the ADCs. The consequences are discussed in the following chapters.
2.2. Time resolution
In the context of detectors, time resolution might not mean what one first thinks. It does
not refer to the sampling frequency of the electronics. It is not necessarily even the precision
of the measurement with respect to time. One way of thinking about the time resolution is
as the variation in the peaking time of the signal. Another is to think of it as the
inherent uncertainty in the time measurement.
The time t0, when the particle enters the detector¹, is deduced from the time the pulse is
detected. When these latencies are plotted, they are spread over time. The time resolution
is defined as the standard deviation² of the latency. For a continuous signal the time spectrum
could take the form shown in Fig. 2.4.
Figure 2.4.: Time resolution of continuous signal. (Counts versus time; the latency is measured
from t0 and the time resolution is the width of the distribution.)
When the signal is discrete, the time resolution cannot always be calculated in a straightforward way.
Fig. 2.5 shows a histogram of the latencies when the time resolution is much smaller than
the sampling period. One clearly cannot simply take the standard deviation here. The
Gaussian fit presented in the figure is obtained using the error function. It is assumed
that the temporal distribution can be approximated with the normal distribution
$$ f(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \tag{2.4} $$
¹ In simulation t0 is known. In measurements, for example, a faster detector can be used as a reference or
a coincidence setup can be used.
² Sometimes the full width at half maximum is used [6].
Figure 2.5.: Time resolution of discrete signal. (Normalized accumulated timestamps n/N versus
bunch crossings from interaction, for Tp = 250 ns, shown with a Gaussian fit.)
where µ is the mean and σ the standard deviation. The error function
$$ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt \tag{2.5} $$
is closely related to the integral of the normal distribution. The fraction of counts inside the
interval from µ − nσ to µ + nσ is given by erf(n/√2). The timing efficiency ηt is defined
as the fraction of counts in the correct time bin. It can be tied to the error function by
$$ \eta_t = \operatorname{erf}\left(\frac{n}{\sqrt{2}}\right), \tag{2.6} $$
where the width of the time bin is ∆t = 2nσ. Combining the equations and solving for the
standard deviation gives the time resolution
$$ \sigma_t = \frac{1}{2} \, \frac{\Delta t}{\sqrt{2}\,\operatorname{erf}^{-1}(\eta_t)}. \tag{2.7} $$
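Equations 2.6 and 2.7 can be evaluated with the Python standard library. This is a sketch only: the helper names are hypothetical, and the inverse error function is obtained from the standard normal quantile function rather than from any method described in the text.

```python
from math import erf, sqrt
from statistics import NormalDist

def erfinv(y):
    """Inverse error function for -1 < y < 1, via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / sqrt(2.0)

def time_resolution(eta_t, bin_width):
    """Equation 2.7: sigma_t = bin_width / (2 * sqrt(2) * erfinv(eta_t))."""
    return bin_width / (2.0 * sqrt(2.0) * erfinv(eta_t))

def timing_efficiency(sigma_t, bin_width):
    """Equation 2.6 with n = bin_width / (2 * sigma_t)."""
    return erf(bin_width / (2.0 * sqrt(2.0) * sigma_t))

# A 25 ns bin that contains the +-1 sigma interval corresponds to sigma_t = 12.5 ns.
eta_one_sigma = erf(1.0 / sqrt(2.0))
sigma_t = time_resolution(eta_one_sigma, 25.0)
```

The two functions are inverses of each other for a fixed bin width, which gives a simple consistency check on the algebra.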
2.3. Sampling
The required sampling frequency is usually given by the Nyquist-Shannon theorem. If the signal
is band-limited so that |f| < fN, the signal can be uniquely determined by samples
taken with frequency fsampling > 2fN [29]. In particle detectors the signal is not periodic,
but it is possible to take its Fourier transform. For example, a signal shaped with a
CR-RC filter with a single shaping time has a peak when t = τ. The dominant frequency in the
Fourier transform of the signal can be estimated from the rising edge as fsignal = 1/(4τ).
This is the frequency of the sine wave that is closest in form to
the signal. The sampling is usually synchronous with the LHC clock, which has a 40 MHz
frequency. This means that the shaping time should be τ > 12.5 ns. In GdSP the shaping
time is programmable, ranging from 25 ns to 200 ns, and the sampling period is 25 ns.
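The limit on the shaping time follows from requiring the estimated signal frequency to stay below half of the sampling frequency. Written out as a worked step, under the quarter-period estimate used above:

```latex
f_{\mathrm{signal}} = \frac{1}{4\tau} < \frac{f_{\mathrm{sampling}}}{2}
\quad\Longrightarrow\quad
\tau > \frac{1}{2 f_{\mathrm{sampling}}}
     = \frac{1}{2 \cdot 40\,\mathrm{MHz}}
     = 12.5\,\mathrm{ns}
```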
The digitization process results in quantization noise. The digitized value is rounded and often
slightly different from the real value. The range of the error e is
$$ -\Delta/2 < e \le \Delta/2, $$
where ∆ is the step size of the quantization. The noise created is random white noise
with no correlation with the input signal. The probability distribution is uniform over the
quantization error range. The relation between the step ∆ and the signal can have a great effect
on the overall signal to noise ratio [29].
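For an error that is uniformly distributed over one quantization step, the rms value is ∆/√12, a standard result that is not stated in the text above. The sketch below checks it on a finely stepped ramp; the ramp and the function names are assumptions made for illustration.

```python
import math

def quantize(value, step):
    """Round a value to the nearest multiple of the quantization step."""
    return step * math.floor(value / step + 0.5)

def quantization_rms(values, step):
    """Root-mean-square of the quantization error e = quantized - value."""
    errors = [quantize(v, step) - v for v in values]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# A slow ramp covering ten quantization steps visits all error values evenly.
ramp = [k / 10000.0 for k in range(100000)]
rms = quantization_rms(ramp, 1.0)
expected = 1.0 / math.sqrt(12.0)   # about 0.2887 for a step of 1
```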
In real ADCs there is always some noise (mostly thermal kTC-noise) and non-linearity
(harmonic distortion). The effective number of bits (ENOB) is a measure of how close the ADC
is to the ideal. ENOB is given by
$$ \mathrm{ENOB} = \frac{\mathrm{SINAD} - 1.76\,\mathrm{dB}}{6.02}, \tag{2.8} $$
where SINAD is the SIgnal to Noise And Distortion ratio [30]. In simpler words, ENOB is the
number of bits of an ideal ADC reduced by the noise and distortion in the ADC.
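Equation 2.8 is simple enough to state as a helper. The sketch below also includes its inverse for an ideal converter; the function names and the 56 dB example value are hypothetical and not a measured figure for any ADC.

```python
def enob(sinad_db):
    """Equation 2.8: effective number of bits from SINAD given in dB."""
    return (sinad_db - 1.76) / 6.02

def ideal_sinad_db(n_bits):
    """SINAD of an ideal n-bit converter, obtained by inverting equation 2.8."""
    return 6.02 * n_bits + 1.76

# Illustrative value only: a 10-bit ADC measured at 56 dB SINAD
# would behave like an ideal converter with about 9 bits.
example = enob(56.0)
```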
2.4. Filters, Transfer Functions and z-transform
The calculations in signal processing can be carried out equivalently in the time domain and
the frequency domain. With detector signal processing it is often more convenient to operate in the
time domain. For some filters, like the pole-zero cancellation filter, the frequency domain comes more
naturally.
In the time domain, a filter's operation is characterized by the impulse response h[n]. It gives the
output y[n] for a given input x[n]. The impulse response is obtained from the output when
the input is a delta function:
$$ y[n] = h[n] \ast x[n] =
\begin{pmatrix} h_1 \\ h_2 \\ h_3 \\ \vdots \end{pmatrix} =
\begin{pmatrix}
h_1 & 0 & 0 & 0 & \cdots \\
h_2 & h_1 & 0 & 0 & \cdots \\
h_3 & h_2 & h_1 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{pmatrix} \tag{2.9} $$
The transfer function is a generalization of the impulse response. It maps the filter input to
the output in a specified domain for a linear time-invariant (LTI) system³. A discrete-time
sequence (input, output, transfer function) can be represented in the discrete frequency domain
via the z-transform. The variable z is defined as z = e^(iω), where ω is the angular frequency and
i is the imaginary unit [30].
For example, for the discrete-time first-order pole-zero filter discussed in sections 3.3.1 and
5.4 the output y[n] is given by y[n] = y[n−1]·K − x[n−1]·L + x[n], where x[n] are the
samples of the input signal and K and L are constants with values between zero and one. The
z-transformed transfer function is easy to calculate after the whole function is transformed
using:
$$ \begin{aligned}
x[n] &\rightarrow X(z) \\
y[n] &\rightarrow Y(z) \\
x[n-1] &\rightarrow z^{-1}X(z) \\
y[n-1] &\rightarrow z^{-1}Y(z)
\end{aligned} $$
³ Filters in detector signal processing are in general linear time-invariant systems, or at least aspire to be.
The z-transformed output is
$$ Y(z) = z^{-1}Y(z) \cdot K - z^{-1}X(z) \cdot L + X(z). $$
After a short rearrangement the transfer function is found:
$$ H(z) = \frac{Y(z)}{X(z)} = \frac{1 - L \cdot z^{-1}}{1 - K \cdot z^{-1}} = \frac{z - L}{z - K}, \tag{2.10} $$
where z = L is the zero and z = K is the pole.
The response of the filter has its maximum at the pole z = K and goes to zero at
z = L. This is easy to see from equation 2.10. If z = K, the denominator goes to zero and
the transfer function tends towards infinity. When in turn z = L, the numerator goes to
zero, as does the transfer function.
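The difference equation behind equation 2.10 can be run directly on a list of samples. The sketch below is a floating-point illustration with a hypothetical function name and arbitrary coefficient values; a hardware implementation would use fixed-point arithmetic.

```python
def pole_zero_filter(x, K, L):
    """First-order pole-zero filter: y[n] = K*y[n-1] - L*x[n-1] + x[n].

    The filter starts from zero state, i.e. y[-1] = x[-1] = 0.
    """
    y = []
    y_prev = 0.0
    x_prev = 0.0
    for sample in x:
        out = K * y_prev - L * x_prev + sample
        y.append(out)
        y_prev, x_prev = out, sample
    return y

# Impulse response for K = 0.5, L = 0.25 (values chosen so the arithmetic is exact).
impulse = [1.0] + [0.0] * 7
h = pole_zero_filter(impulse, K=0.5, L=0.25)

# When pole and zero coincide, H(z) = 1 and the signal passes unchanged.
unchanged = pole_zero_filter([3.0, 1.0, 4.0], K=0.5, L=0.5)
```

Expanding equation 2.10 as a power series in z⁻¹ gives h[0] = 1 and h[n] = (K − L)·K^(n−1) for n ≥ 1, which is what the loop produces.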
In general the step response of a first-order filter in the discrete time domain with a real pole K
and a real zero L has the form
$$ h[n] = e^{-K \cdot n}\left(1 - \frac{L}{K}\right) + \frac{L}{K}. \tag{2.11} $$
This can be split into four cases:
$$ h[n] = \begin{cases}
1 & L = K \\
e^{-K \cdot n} & L = 0 \\
a + b \cdot e^{-K \cdot n} & K > L, \text{ where } 0 < a < 1 \text{ and } 0 < b < 1 \\
a - b \cdot e^{-K \cdot n} & K < L, \text{ where } a > 1 \text{ and } b > 0
\end{cases} \tag{2.12} $$
where a = L/K and b = |1 − L/K| are constants. When the pole and the zero are equal, the filter
passes all signals unchanged. In the single-pole case (zero at the origin) the filter shaves off
an exponential from the signal tail. When the zero is not at the origin and the pole is larger than the
zero, in addition to canceling the exponential, a constant value is added to the signal after the
pulse. This is useful for the correction of undershoots. When the zero is larger than
the pole, the leading edge of the pulse is softened and the peaking time increases. The filter
is most effective in tail cancellation when the zero is at the origin, L = 0, and the pole is as large as
possible, K = 1.
When several filters are in cascade, each first-order filter operates individually. This is the
case in the exemplary filter. The total transfer function is a product of the first-order filter
transfer functions:
$$ H(z) = \frac{1 - L_1 \cdot z^{-1}}{1 - K_1 \cdot z^{-1}} \cdot
\frac{1 - L_2 \cdot z^{-1}}{1 - K_2 \cdot z^{-1}} \cdot
\frac{1 - L_3 \cdot z^{-1}}{1 - K_3 \cdot z^{-1}} \cdot
\frac{1 - L_4 \cdot z^{-1}}{1 - K_4 \cdot z^{-1}} \tag{2.13} $$
Correspondingly, the total impulse response is the convolution of the first-order impulse responses,
h[n] = h1[n] ∗ h2[n] ∗ h3[n] ∗ h4[n].
It is worth noting that there is usually more than one possible design with the same
transfer function: a transfer function does not determine a unique design.
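The relation between a cascade and its stages can be illustrated numerically. This sketch uses two stages with arbitrary coefficients instead of four, and all function names are hypothetical; it compares the impulse response of the cascade with the convolution of the single-stage impulse responses.

```python
def first_order(x, K, L):
    """One pole-zero stage: y[n] = K*y[n-1] - L*x[n-1] + x[n], zero initial state."""
    y, y_prev, x_prev = [], 0.0, 0.0
    for s in x:
        out = K * y_prev - L * x_prev + s
        y.append(out)
        y_prev, x_prev = out, s
    return y

def cascade(x, stages):
    """Feed the signal through the stages one after another."""
    for K, L in stages:
        x = first_order(x, K, L)
    return x

def convolve(a, b):
    """Direct discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

N = 12
impulse = [1.0] + [0.0] * (N - 1)
stages = [(0.5, 0.25), (0.75, 0.5)]      # (K, L) pairs, illustrative values

h_total = cascade(impulse, stages)
h1 = first_order(impulse, *stages[0])
h2 = first_order(impulse, *stages[1])
h_conv = convolve(h1, h2)[:N]            # first N samples are unaffected by truncation
h_swapped = cascade(impulse, stages[::-1])
```

Swapping the order of the stages gives the same impulse response, a small instance of two different designs sharing one transfer function.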
2.5. Noise Sources
The noise accompanying the detector signal can originate in the detector itself, parasitic capacitances
between channels, the preamplifier, the analog shapers and the digitization process. The digital
signal processing contributes to the noise as well. An often used measure of noise is the signal
to noise ratio (SNR). For systems where the absolute level of noise is known or calculated
but the signal amplitude is unknown, the noise is often expressed as equivalent noise charge
(ENC). It expresses the noise as the equivalent charge at the input of an ideal system. The unit is
rms electrons. The SNR can be easily calculated from the ENC when the detector response is
known.
Figure 2.6.: Noise sources in charge ampliﬁers (Image source [26]).
For the noise sources from the detector to the analog shaper, the concept of a weighting function
w(t) is used in noise calculations. It is the mirror image of the impulse response of the
amplifying and shaping circuit. The noise is expressed as ENC. The amplifier noise can
be modeled using the equivalent circuit shown in Fig. 2.6. The detector is modeled as a signal
generator and a capacitance in parallel. The noise is modeled with voltage sources in series
and current sources in parallel with the detector and a perfect amplifier. ENC is the noise
charge at the input of the amplifier in this model. It is also a measure of how well the
amplifier and shaper filter noise.
The noise is a sum of the series white noise, the 1/f-noise and the parallel white noise [26]:
$$ \begin{aligned}
\mathrm{ENC}^2 &= \mathrm{ENC}^2_{\mathrm{series}} + \mathrm{ENC}^2_{1/f} + \mathrm{ENC}^2_{\mathrm{parallel}} \\
&= \frac{1}{2} e_n^2 C_{in}^2 \int_{-\infty}^{\infty} \left[ w'(t) \right]^2 dt
 + \pi C_{in}^2 A_f \int_{-\infty}^{\infty} \left[ w^{(1/2)}(t) \right]^2 dt
 + \frac{1}{2} i_n^2 \int_{-\infty}^{\infty} \left[ w(t) \right]^2 dt \\
&= \frac{1}{2} e_n^2 C_{in}^2 \frac{A_1}{\tau_p} + \pi C_{in}^2 A_f A_2 + \frac{1}{2} i_n^2 A_3 \tau_p
\end{aligned} \tag{2.14} $$
The average series noise voltage spectral density is e_n² = 4kTRs, where Rs is the equivalent series
noise resistance. The 1/f-noise spectral density is given by Af/f, where Af is a constant.
The parallel noise current spectral density is i_n² = 2qI0 for a current through the detector and
i_n² = 4kT/Rp for a resistance in parallel with the detector. The constants A1, A2 and A3
depend on the weighting function shape. τp is the peaking time of the impulse response [26].
Since the series noise is inversely proportional to the peaking time and the parallel noise is directly
proportional to it, there exists an optimal shaping time that minimizes the noise. The 1/f-noise
is independent of the shaping time, but is affected by the form of the weighting function.
The 1/f-noise has a fractal nature: it looks the same independent of the time scale. It can
be minimized by adjusting the weighting function shape. The 1/f-noise is proportional to the
constant A2. For example, for a triangular weighting function A2 = 0.88, for a Gaussian A2 = 1.0
and for CR-RC A2 = 1.18. Hence, of these, the best choice for minimizing 1/f-noise would be a triangular
weighting function [26].
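The optimal peaking time mentioned above can be made explicit by differentiating the last line of equation 2.14 with respect to τp. The following worked step is a sketch that treats e_n, i_n, C_in and the shape constants as independent of τp:

```latex
\frac{\partial\,\mathrm{ENC}^2}{\partial \tau_p}
  = -\frac{1}{2} e_n^2 C_{in}^2 \frac{A_1}{\tau_p^2} + \frac{1}{2} i_n^2 A_3 = 0
\quad\Longrightarrow\quad
\tau_{p,\mathrm{opt}} = \frac{e_n C_{in}}{i_n} \sqrt{\frac{A_1}{A_3}},
\qquad
\mathrm{ENC}^2_{\min} = e_n i_n C_{in} \sqrt{A_1 A_3} + \pi C_{in}^2 A_f A_2 .
```

At this peaking time the series and parallel contributions are equal, and the 1/f term remains as a floor that only the weighting function shape can change.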
Most of the noise is integrated over in the shaper. Noise that averages to zero and has a
frequency outside the passband is filtered out. Noise that consists of unipolar pulses is integrated
to a constant baseline. The series and parallel white noise are both temperature dependent.
This results in a baseline that shifts with temperature.
As discussed previously, the input to the ADC needs to be band-limited. Frequencies above
the Nyquist frequency result in aliasing. The losses in signal to noise ratio due to aliasing
cannot be recovered by any digital signal processing means. In practice this means that
at least some analog low-pass filtering is always needed before sampling [26]. The ADCs
contribute to the signal to noise ratio as discussed previously: when the number of bits is
high, the quantization noise is low.
How well the bandwidth of the system is matched to the signal frequency affects the SNR
in digital signal processing as well as in analog shaping. In addition, any arithmetic
in a digital system may suffer from rounding errors. To minimize this error, the whole digital
signal processing can be performed with increased precision and rounded only after the full
processing chain. The benefits are usually well worth the increase in chip area.
3. Digital Signal Processing Algorithms
3.1. General requirements and similarities with S-Altro chip
There are six general tasks that the signal processing unit should be able to perform:
1. Baseline removal.
2. Noise ﬁltering.
3. Pulse shaping and/or tail cancellation.
4. Pulse time pick-oﬀ.
5. Trigger production.
6. Zero suppression.
Noise originating from the detector is mostly filtered in the analog front-end, but slow temperature
drifts in the baseline and the noise originating in the front-end chip itself are
propagated. Low-frequency noise, such as the temperature drift, can be filtered with the
baseline when an adaptive method is used for calculating the baseline instead of a fixed baseline.
Some time pick-off methods are highly sensitive to high-frequency noise.
Figure 3.1.: S-Altro prototype block diagram (Image source [20]). Each of the 16 acquisition
channels contains a CSA, an ADC, a digital signal processor (Baseline Correction 1, Digital
Shaper, Baseline Correction 2, Zero Suppression, Data Format) and a Multi-Event Buffer.
Pile-up¹ might become a problem with the relatively high hit rates in the forward region
and the long shaping times necessary for gas detectors. The pulse shape can be modified to
reduce pulse pile-up. Either the whole pulse is shortened or only a long tail is removed.
Many of the requirements for GdSP are similar to those of the S-Altro chip. The block diagram of the
S-Altro prototype is shown in Fig. 3.1. The Verilog code of the digital signal processing part
of the prototype was available as a starting point for the GdSP development. As seen from
the figure, baseline removal, digital shaping and zero suppression are common tasks for the
chips. The signals from the TPC and the GEM both benefit from long shaping times. Differences
arise in the requirement for timing information and trigger production for GEMs. The GEM
signal might also benefit from high-frequency noise filtering, which could be needed to obtain
good resolution for the timing extraction algorithms.
3.2. Baseline Correction
After digitization the signal has an arbitrary baseline value, which is usually chosen so
that as much of the dynamic range as possible can be used while
still allowing some fluctuations in the baseline. If the signal is scaled too tightly to match the
ADC range, even small baseline fluctuations take the signal outside the dynamic range. In
addition to the added constant baseline, the baseline fluctuates slowly with temperature. As
seen in section 2.5, the thermal noise current, which after analog filtering is seen
mostly as a baseline level, depends on the temperature. The restoration of the baseline to zero
is important, because many techniques in digital signal processing, such as the pole-zero
filter, are based on the assumption that the baseline has been removed.
There are several possible approaches to Baseline Correction (BC). The easiest is to subtract
only a constant baseline. This is adequate only if there are no significant fluctuations in
the baseline. For elimination of the slow fluctuations a form of high-pass filter can be used.
Usually it is based on subtracting an averaged signal: the moving average of the signal is
calculated continuously and subtracted from the signal. In applications for fast detectors,
where the pulse rise times are short, the moving average of a few samples can simply be
subtracted from the signal [37]. This creates an undershoot after a pulse that scales with the
pulse height and duration, as illustrated in Fig. 3.2. For short pulses and moderate rates
the dead time created after a pulse can be tolerated. The pulses in gas detector applications
are as a rule too long and the rates in the LHC too high. Such a dead time after the pulse
would lower the detection efficiency considerably. In S-Altro the problem of the undershoot
is solved by freezing the baseline calculation for the duration of the pulse, as illustrated in
Fig. 3.3. The signal is compared to a low and a high threshold. When the signal is outside the
thresholds, the calculation is frozen. The drawback of this solution is the risk that the baseline
calculation gets frozen unintentionally and remains at an erroneous value until reset.
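The effect of freezing the baseline during a pulse can be sketched in a few lines. This is an illustration only and not the S-Altro or GdSP logic: the function name, the window length, the threshold values and the rule of seeding the baseline with the first sample are all assumptions.

```python
from collections import deque

def subtract_baseline(samples, window=8, low=-3.0, high=3.0):
    """Subtract a moving-average baseline that is frozen during pulses.

    The baseline is updated only while the corrected signal stays inside
    the band [low, high]; outside the band it keeps its last value.
    """
    history = deque(maxlen=window)
    baseline = None
    corrected = []
    for s in samples:
        if baseline is None:
            baseline = float(s)       # seed with the first sample (assumed quiet)
            history.append(s)
            corrected.append(0.0)
            continue
        out = s - baseline
        corrected.append(out)
        if low <= out <= high:        # quiet sample: update the moving average
            history.append(s)
            baseline = sum(history) / len(history)
    return corrected

signal = [100] * 11 + [130, 160, 140, 115] + [100] * 10
frozen = subtract_baseline(signal)
# Infinite thresholds never freeze, which gives plain moving-average subtraction.
plain = subtract_baseline(signal, low=float("-inf"), high=float("inf"))
```

With the thresholds active the output returns to zero right after the pulse, while the plain moving average is pulled up by the pulse and produces a negative undershoot afterwards.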
In the S-Altro the baseline correction is performed in two stages, as seen in the prototype
block diagram in Fig. 3.1. The first Baseline Correction block (BC1) offers several options
for coarser baseline correction. The second Baseline Correction block (BC2) uses the
moving average subtraction with pulse exclusion for finer corrections of baseline drift. The
BC1 offers options for constant pedestal subtraction, subtraction of a pre-recorded baseline
from SRAM, a conversion mode, where the SRAM is used as a look-up table, a test mode, where
the input signal is read from the SRAM, self-calibrated variable pedestal (Vpd) subtraction
and several combinations of these. The ALICE experiment takes data in bursts. Data
acquisition windows alternate with periods when only background is present. The Vpd value
is calibrated outside the acquisition window and kept constant during the acquisition.
CMS takes data continuously, which means that this approach is not applicable; there are
no acquisition windows and hence no time between them. The idea behind the pre-recorded
baseline is to remove systematic perturbations during an acquisition window. This could be
used also in continuous mode, if there are systematic perturbations occurring with a known,
exact frequency. The justification for having SRAM for each channel in GdSP would not
come from baseline correction, but rather from the conversion mode and the test mode. The
presence of moving average subtraction does not exclude the alternative baseline correction
methods. Offering the user a few options for baseline correction is seen as wise. It is,
for example, difficult to foresee whether the stability of simple constant pedestal subtraction
is valued over the more precise moving average subtraction. Although extra caution is taken
to make the moving average calculation as stable as possible, the long-term stability is
ultimately revealed only in use.
¹ When sequential pulses are close enough to overlap significantly, the latter pulse will have increased height.
Figure 3.2.: Moving average subtraction. (Amplitude versus time for the input signal with its
moving average, and for the output signal with the dead time after the pulse.)
Figure 3.3.: Moving average subtraction with pulse exclusion. (Amplitude versus time for the
input signal with double thresholds and the moving average frozen during the pulse, and for the
output signal.)
The moving average calculation can be implemented as Inﬁnite Impulse Response (IIR) or
Finite Impulse Response (FIR) ﬁlter. Fig. 3.4 shows an example of IIR implementation of
baseline calculation. The double threshold scheme is also included. This ﬁlter is used in S-
Altro for Vpd calculation. For every sample the diﬀerence between the signal and calculated
baseline is divided by an exponent of two and added to the baseline. When the baseline is
below the signal, it is increased. When the baseline is above the signal, it is decreased. As
the baseline approaches the correct value, the corrections get smaller. The effects of a given
sample are diluted with time, but they cease to affect the output completely only at reset.
Figure 3.4.: IIR implementation of moving average filter with double threshold scheme [20].
Figure 3.5.: Moving average unit using a) accumulative sum b) direct sum FIR filter.
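The IIR baseline update described above can be sketched in a few lines of Python. This is only an illustration, not the S-Altro implementation: the function name, the floating-point arithmetic, the choice to start from the first sample and the way the double thresholds freeze the update (the sample must lie within `lo` and `hi` of the current baseline) are all assumptions.

```python
def iir_baseline(samples, shift=4, lo=None, hi=None):
    """Track the baseline with a first-order IIR update.

    shift:  the correction (sample - baseline) is divided by 2**shift.
    lo, hi: optional thresholds relative to the baseline (assumed form of the
            double threshold scheme); outside [lo, hi] the baseline is frozen.
    """
    baseline = float(samples[0])  # assumption: initialise from the first sample
    history = []
    for x in samples:
        diff = x - baseline
        inside = (lo is None or diff >= lo) and (hi is None or diff <= hi)
        if inside:
            baseline += diff / (1 << shift)  # small step towards the signal
        history.append(baseline)
    return history


# Flat pedestal of 100 with a short pulse on top of it.
signal = [100] * 50 + [150] * 5 + [100] * 50
free = iir_baseline(signal)                   # pulse pulls the baseline upwards
frozen = iir_baseline(signal, lo=-5, hi=5)    # pulse is excluded from the update
```

Without thresholds the pulse shifts the baseline, and the shift decays only gradually afterwards; with thresholds the baseline stays on the pedestal.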
Two examples of FIR filters are shown in Fig. 3.5. Filter a) is used in S-Altro for
moving average calculation in BC2. It uses an accumulative sum. Filter b) uses a direct sum.
In both cases the samples are read into a pipeline. With the direct sum, the sums over two,
four and eight samples are taken and divided by two, four and eight respectively. The average
to be used is selected from these options.
With the accumulative sum the same averages over two, four or eight samples can be calculated.
Let us assume that the average over eight samples is calculated. Every time the pipeline
moves, the newest sample z8 is added to the sum and the oldest sample z0 is subtracted:
sum[n] = sum[n − 1] + z8[n] − z0[n], (3.1)
where n denotes the current time. With nine samples z0 to z8 in the pipeline, this gives
the sum over the eight newest samples z1 to z8. This is easy to see by inserting the previous sum
sum[n − 1] = z8[n − 1] + z7[n − 1] + z6[n − 1] + z5[n − 1] + z4[n − 1] + z3[n − 1] + z2[n − 1] + z1[n − 1]
into the accumulative sum (eq. 3.1) and remembering how the pipeline shifts, i.e. z0[n] =
z1[n − 1] and so forth.
The calculated sum is divided by the selected power of two (two, four or eight) by
shifting bits. The number of samples in the sum is altered by taking a different sample in the
pipeline as the new sample.
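The equivalence of the accumulative sum (eq. 3.1) and the direct sum can be checked with a small Python sketch. The function names, the integer samples and the zero-initialised pipeline are assumptions made for illustration only.

```python
from collections import deque


def accumulative_average(samples, shift=3):
    """Moving average over 2**shift samples using the running sum of eq. 3.1."""
    window = 1 << shift
    # Pipeline holds z0 (oldest) ... z_window (newest); assumed to start at zero.
    pipeline = deque([0] * (window + 1), maxlen=window + 1)
    total = 0
    out = []
    for x in samples:
        pipeline.append(x)                   # shift the pipeline by one sample
        total += pipeline[-1] - pipeline[0]  # add newest, subtract oldest
        out.append(total >> shift)           # divide by 2**shift with a bit shift
    return out


def direct_average(samples, shift=3):
    """Same average, but re-summing the newest 2**shift samples every time."""
    window = 1 << shift
    return [sum(samples[max(0, n - window + 1):n + 1]) >> shift
            for n in range(len(samples))]


data = [3, 7, 12, 5, 9, 40, 38, 11, 6, 4, 8, 10, 2, 1, 7, 9, 13, 5]
```

With exact integer arithmetic the two functions return identical values; any difference seen in hardware would have to come from effects outside this idealised model.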
In principle the direct and accumulative sums give the same results, but there is a suspicion
that the accumulative sum might have stability issues. The suspicion is based on the experiences
of users of the ALTRO chip, which has an accumulative sum implementation in its baseline
correction. Instability with the accumulative sum was not observed in simulations (Section 4).
IIR filters are in general more efficient to implement. This efficiency can be measured
by counting the adders in the design. In contrast to the two adders in the
IIR implementation of the baseline calculation, the FIR implementation requires five adders, and
even then its range of programmability is more restricted. The downside of IIR filters is
that they can be unstable [29, 30]. In baseline calculation instability would manifest itself
as a wildly oscillating value for the calculated baseline. Stability needs to be checked
through analysis of the region of convergence, i.e. the poles of the filter need to be inside the
unit circle. FIR filters are always stable, because all their poles are at the origin.
3.3. Digital Shaping
The main aim of digital shaping is to shorten the signal and consequently reduce the effect
of pulse pile-up. In gas detectors ion tails are a common reason for needing digital shaping.
One of the benefits of GEM detectors is that they do not suffer from ion tails, because the
signal is formed from electrons alone. The need for shaping would instead come from the
long shaping times required by the long charge collection time. The anticipated muon
trigger rate in the CMS high η region is 1 MHz. This corresponds to pulses that arrive on average
every 40 sampling clocks. With the longest shaping time in the analog filter, the pulse itself
is almost 40 clocks long. In this environment pulse pile-up is likely to happen, but it is
uncertain whether it will be enough to cause problems if left untreated. The time pick-off
techniques are in general designed to tolerate moderate pulse pile-up.
3.3.1. Pole-Zero Cancellation
Pole-zero cancellation is an often used method in detector signal processing, both in analog and
digital form. It is usually used to eliminate either the undershoot after a pulse [6] or a long
ion tail [20].
Figure 3.6.: Pole-zero cancellation with analog electronics. a) Signal from preampliﬁer. b)
Signal from CR-RC shaper. c) CR-RC shaper with pole-zero cancellation. d)
Signal with pole-zero cancellation. (Image source [6])
The best demonstration of pole-zero cancellation is to use an analog filter and an oscilloscope
with a particle detector. The pole-zero filter has only one control, which adjusts the resistor
Rpz in Fig. 3.6. Turning the control one way makes the pulse tail longer. Turning it the
other way shortens the tail and eventually results in undershoot.
A digital pole-zero filter behaves similarly, but the poles K and zeros L are given
directly as variables. In Section 2.4 the transfer function of the pole-zero filter was calculated
and its impulse response was investigated.
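As an illustration of how a pole K and a zero L act on a pulse tail, the following Python sketch implements a single first-order stage. It assumes the common form H(z) = (1 − L z⁻¹)/(1 − K z⁻¹), i.e. y[n] = x[n] − L·x[n−1] + K·y[n−1]; the exact form derived in Section 2.4 may differ, and the function name is hypothetical.

```python
def pole_zero_stage(x, K, L):
    """First-order pole-zero filter: y[n] = x[n] - L*x[n-1] + K*y[n-1]."""
    y = []
    prev_x = 0.0
    prev_y = 0.0
    for sample in x:
        out = sample - L * prev_x + K * prev_y
        y.append(out)
        prev_x, prev_y = sample, out
    return y


# A slow exponential tail 0.9**n. Placing the zero on the tail's pole (L = 0.9)
# cancels it, and the new pole K = 0.5 replaces it with a faster tail 0.5**n.
slow_tail = [0.9 ** n for n in range(20)]
fast_tail = pole_zero_stage(slow_tail, K=0.5, L=0.9)
```

Choosing L slightly off the tail's pole leaves a residual long tail or an undershoot, which mirrors the behaviour of the analog control described above.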
Pole-Zero cancellation is the ﬁlter of choice in S-Altro. Its block diagram can be found in
appendix B. Its Digital Shaper2 is a 4th order pole-zero ﬁlter. It is used primarily to cancel
ion tails in the signal. It consists of four ﬁrst order pole-zero ﬁlters in cascade. The poles
and zeros are constrained to be real, positive and inside the unit circle. These constraints
ensure that the ﬁlter is stable. [20]
2 It also goes by the name Tail Cancellation Filter.
The pole-zero filter has one obvious limitation: it cannot make the pulse rise time any
shorter. It can shorten the pulse tail, correct undershoots and lengthen the pulse rise time.
3.3.2. Single Delay Line Shaping
Figure 3.7.: Delay line shaping. The output signal is the sum of the original signal and the
delayed, inverted and scaled signal. (Figure labels: input signal; delayed, scaled signal; output signal; horizontal axis: time.)
Single Delay Line (SDL) shaping, illustrated in Fig. 3.7, has its origins in coaxial cables
that are shorted at the receiving end. The signal is reflected from the shorted end and the
reflection cancels the signal tail [6]. The same effect can be achieved with a digital filter that
delays and scales down the signal and subtracts it from the original signal. The resulting pulse
has a duration equal to the delay. To avoid ballistic deficit in the pulse height, the delay needs
to be at least as long as the pulse peaking time.
This technique is most effective when the signal has either a step function shape or a long
exponential tail. For GdSP, SDL was considered for use on the preamplifier signal, which would
allow the analog shaping to be skipped entirely.
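A minimal Python sketch of digital SDL shaping, assuming the simple form y[n] = x[n] − g·x[n − d] with a hypothetical function name and parameters, shows both cases mentioned above: a step becomes a rectangular pulse of length d, and an exponential tail is cancelled when g matches the decay over d samples.

```python
def sdl_shape(x, delay, gain=1.0):
    """Subtract a delayed, scaled copy of the signal from the signal itself."""
    return [x[n] - gain * (x[n - delay] if n >= delay else 0.0)
            for n in range(len(x))]


# Step input: the output is a rectangular pulse lasting `delay` samples.
step = [0.0, 0.0] + [1.0] * 10
step_out = sdl_shape(step, delay=3)

# Exponential tail a**n: with gain = a**delay the tail vanishes after the delay.
a = 0.8
tail = [a ** n for n in range(20)]
tail_out = sdl_shape(tail, delay=3, gain=a ** 3)
```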
3.3.3. Peak Sharpening
Peak sharpening, or resolution enhancement, is more commonly used with distributions
(e.g. spectroscopy) and in image processing. It improves the apparent peak
resolution by making the peak artificially narrower, which makes superimposed peaks easier to
isolate. It is related to deconvolution, where the response function broadening the peak or
pulse is known and can be countered. Deconvolution is discussed further in Section 3.4.3,
where it is used in the context of the time pick-off algorithm. Unlike deconvolution,
peak sharpening does not require the convolving function to be known. There are at least
two approaches to peak sharpening. One is based on subtracting the second derivative of
the signal [31]. Alternatively, an FIR filter calculating a weighted sum of a few samples could be
used [35]. The limited time for the design process and the prioritization of time pick-off over
digital shaping allowed only one peak sharpening filter to be designed and tested. The second
derivative technique was chosen as it was simpler and more interesting. It has a simple and
elegant mathematical foundation, whereas the FIR filter is a collection of fitted parameters. The
second derivative technique is effective but very sensitive to noise. From the point of view of
SNR, an FIR filter might have been a better option.
Figure 3.8.: Peak sharpening using second derivative. On left a Lorentzian peak in red is
shown together with negative of its second derivative in green. On right is output
peak. (Image source [31])
Peak sharpening using the second derivative subtracts the second derivative x′′ of the signal
x. The output signal y is given by
y = x − k · x′′ , (3.2)
where k is an adjustable parameter. For a discrete-time filter realization, equation 3.2 takes
the form
y[n] = x[n] − k · ({x[n + 1] − x[n]} − {x[n] − x[n − 1]}) . (3.3)
Fig. 3.8 shows a Lorentzian peak with its second derivative and the resulting enhanced peak.
The inverted second derivative will approximately cancel the signal at the sides and add to
the signal at the peak. In the output peak, the baseline on each side of the peak is not
exactly zero. The wiggles are minimized by adjusting k appropriately. The total area under
the second derivative is zero. This means that the area under the peak is conserved. Because
the technique is linear, the proportionality of pulse height to collected charge is conserved
and normal calibration techniques are applicable.
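The second derivative technique can be tried out with a short Python sketch. It subtracts k times the discrete second difference from the signal, as in y = x − k·x′′. The Gaussian test pulse, the value of k and the function name are illustrative assumptions only.

```python
import math


def sharpen(x, k):
    """Return x[n] - k * (x[n+1] - 2*x[n] + x[n-1]); end samples are copied."""
    y = list(x)
    for n in range(1, len(x) - 1):
        second_diff = (x[n + 1] - x[n]) - (x[n] - x[n - 1])
        y[n] = x[n] - k * second_diff
    return y


# Gaussian test pulse centred at sample 50 with sigma = 8 samples.
pulse = [math.exp(-0.5 * ((n - 50) / 8.0) ** 2) for n in range(101)]
sharp = sharpen(pulse, k=20.0)

area_in, area_out = sum(pulse), sum(sharp)   # area is conserved
peak_in, peak_out = max(pulse), max(sharp)   # peak becomes higher
```

At the peak the second difference is negative, so the subtraction raises the peak, while on the flanks it is positive and lowers the signal, which narrows the pulse.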
3.4. Bunch Crossing ID, Amplitude and trigger
In general a trigger is a binary logic pulse marking a pulse of significant height. In particle
detectors the first level trigger tells that a particle has been detected.
The trigger and time-tag from the filter need to have a fixed time delay with respect to the
beginning of the pulse. If the trigger is given when the signal crosses a threshold, the trigger
may have time walk for two different reasons (see Fig. 3.9). The time walk might originate
in a varying pulse rise time. The rise time is usually assumed constant, but with short shaping
times the GEM signal rise time does vary. The rise time walk cannot usually be
eliminated. Another source of time walk is the varying pulse height: pulses with different
amplitudes are given different timing. The amplitude time walk can be eliminated using two
different strategies. One can use threshold comparison and use the pulse amplitude to correct
the timing. The second strategy is to pick a characteristic of the pulse, for example the peak,
that has a fixed delay from the beginning of the pulse and use it for trigger production.
In the applications foreseen for the GdSP-chip timing information is more crucial than
amplitude information. The needed accuracy for the measured amplitude is not known. It is
not even clear whether amplitude information is needed outside the chip. There are two basic
techniques for extracting amplitude information. In the simplest case, the value
of the largest sample in a pulse is taken. For greater accuracy, a weighted sum of samples around
the peak of the pulse can be used. The piece-wise linear fitting algorithm uses the weighted
sum method and also uses it in the calculation of timing information. The other bunch crossing
identification (BXID) algorithms can be used with either of the amplitude extraction
methods.
Figure 3.9.: Time walk originating from a) variations in rise time and b) variations in pulse
height (Image source [6])
Some assumptions were made in designing the BXID:
1. Pulses are synchronous with the sampling clock.
2. Phase diﬀerence between the clock and pulses can be adjusted i.e. signal can be delayed
with precision of few nanoseconds in the analog front-end.
3. Timing information is needed only for bunch crossing assignment (time-bins are 25 ns
wide).
The first assumption is met in LHC experiments in general, but not necessarily in other
possible applications. In the GdSP analog front-end the signal delay can be adjusted. If either
of these two assumptions is not met, the best time resolution that can be delivered
is half a sampling period (12.5 ns). If better resolution is required, there are other possible
approaches: use a BXID algorithm that natively delivers a time-stamp with sub-clock precision
(for example piece-wise linear fitting), use a faster ADC, interpolate the signal and use an existing
block (for example the Constant Fraction Discriminator (CFD)) with a faster clock, or interpolate
inside the BXID block (for example Zero-Crossing Identification (ZCI) and CFD are well
suited for this).
The S-Altro neither produces a trigger nor extracts pulse timing information. This is, however,
a common task in detector electronics and there is no need to invent a new filter. To
get the best performance, several known methods were compared. Multiple simulations were
designed to help in designing the filters and in choosing the best of them. The
filters initially considered were Piece-wise linear fitting [32], Time over Threshold, the
Deconvolution method [33, 34], CFD, Peak Finder and ZCI. One previous study including Peak
Finder, Deconvolution filter, ZCI and CFD was found [35]. It was conducted with quite
a different setup in mind (a calorimeter as the detector and a constant, short shaping time) and
the results could not be reliably generalized to GdSP.
3.4.1. Piece-wise linear ﬁtting
Piece-wise linear fitting (PWLF) was developed by Buzuloiu for time and amplitude pick-off
in HEP applications [32]. The amplitude is calculated with a weighted sum of three samples.
The amplitude time walk is obtained from a Look-Up Table (LUT) with another weighted
sum as input. The parameters for the weighted sums are calculated by fitting a line to the
pulses in the sample space (Fig. 3.10).
Figure 3.10.: a) Sampled pulse b) Pulse representation in sample space (Image source [32])
Unlike the other time pick-off methods described in this chapter, this method is designed
to compute the time walk with time steps smaller than the sampling period. This means it
is more precise than a mere bunch crossing algorithm.
For the GEM application only bunch crossing identification was needed, and PWLF was
considered too complex and too demanding of chip resources.
3.4.2. Time over Threshold
Time over Threshold (ToT) is based on trigger production at threshold crossing. The timing
is corrected afterwards. At threshold crossing a counter is started. It is stopped when the
signal returns below the threshold. The reading on the counter is proportional to the pulse
amplitude. The pulse amplitude and the time walk can be read from a look-up table (LUT)
using the time over threshold as input. The operation principle of ToT is illustrated in Fig.
3.11.
ToT is often the only available method offline, when the only information on the signal is
binary values from a comparator. It is also a valid method when used on an analog signal (e.g.
see the simulation results using analog ToT in Fig. 4.12).
In digital designs ToT has certain challenges. As a part of a digital signal processor, it
would be easiest to use the sampling clock for the counter. The pulse length varies with the
shaping time from 4 to 32 sampling clocks. Taking an average pulse length of 14 clocks
and a quantization error of half a sampling clock, the uncertainty of the ToT value becomes
0.5/14 ≈ 3.6 %. To get a more accurate value one needs a faster counter. In this case the signal
needs to be interpolated. The accuracy of the threshold crossing then depends strongly on the
accuracy of the samples on either side of the crossing. This leaves the method highly sensitive to noise.
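The counter-based ToT measurement with the sampling clock can be sketched as follows. The function name, the strict comparison against the threshold and the example pulse are assumptions for illustration; the look-up table step is omitted.

```python
def time_over_threshold(samples, threshold):
    """Return (start index, ToT in sampling clocks) of the first pulse, or None."""
    start = None
    for n, s in enumerate(samples):
        if start is None and s > threshold:
            start = n                 # threshold crossed: start the counter
        elif start is not None and s <= threshold:
            return start, n - start   # back below threshold: stop the counter
    return None                       # no complete pulse in the record


pulse = [0, 1, 3, 6, 8, 6, 3, 1, 0]
result = time_over_threshold(pulse, threshold=2)

# Relative quantization uncertainty for the average pulse length quoted above.
relative_uncertainty = 0.5 / 14
```

The counter value is an integer number of clocks, which is the source of the half-clock quantization error discussed above.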
Figure 3.11.: ToT operation principle. (Figure labels: threshold, amplitude A versus time, start clock, stop clock, clock, LUT, timing correction, pulse amplitude.)
The need to choose between relatively high uncertainty and more complex algorithm that
is sensitive to noise led to the decision to exclude ToT from the simulations discussed in
Section 4.
3.4.3. Deconvolution method and pulse recognition
The deconvolution method aims to restore the signal to the form prior to the convolution
in preampliﬁer and ﬁlters[34]. It has been successfully used in Preshower electronics[33]. In
Preshower application it is assumed that the signal from the detector can be approximated
by a delta peak. In practice it means that the original signal pulse is shorter than 25 ns.
The deconvolution method is based on matrix inversion of the impulse response of the
amplifier (and filters) h(∆t) = [h1 h2 h3 · · ·], where ∆t is the sampling period. When
signal S with samples si is convoluted with the matrix H with elements hij from impulse
response, then the convoluted signal V with samples vj is obtained:
\[
\begin{pmatrix}
h_1 & 0 & 0 & 0 & \cdots \\
h_2 & h_1 & 0 & 0 & \cdots \\
h_3 & h_2 & h_1 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} s_1 \\ s_2 \\ s_3 \\ \vdots \end{pmatrix}
=
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{pmatrix}
\qquad (3.4)
\]
The original signal S can be found by using matrix inversion: S = H^{-1}HS = WV, where
W = H^{-1}. [34]
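Because H in eq. 3.4 is lower triangular, applying W = H⁻¹ is equivalent to forward substitution. The Python sketch below demonstrates this on a made-up impulse response; the function names and the numbers are illustrative assumptions, and h1 must be non-zero for the inversion to exist.

```python
def convolve(s, h):
    """v = H s for the lower-triangular matrix H built from impulse response h."""
    return [sum(h[j] * s[i - j] for j in range(min(i, len(h) - 1) + 1))
            for i in range(len(s))]


def deconvolve(v, h):
    """Recover s from v = H s by forward substitution (requires h[0] != 0)."""
    s = []
    for i in range(len(v)):
        known = sum(h[j] * s[i - j] for j in range(1, min(i, len(h) - 1) + 1))
        s.append((v[i] - known) / h[0])
    return s


h = [0.5, 0.25, 0.125]                 # hypothetical sampled impulse response
original = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0]
measured = convolve(original, h)
recovered = deconvolve(measured, h)
```

The two input impulses, which overlap after convolution, are separated again in `recovered`.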
The deconvolution ﬁlter used in Preshower is a variation of the theme. It uses three
samples. Three quantities are calculated
α = w1v1 (3.5)
β = w1v2 + w2v1 (3.6)
γ = w1v3 + w2v2 + w3v1 (3.7)
and a bunch crossing is assigned when β is the greatest quantity: α < β > γ [33]. If the
deconvolution method is used on a non-ideal pulse, the result is a short pulse instead of a
one-clock-long delta peak, as shown in Fig. 3.12. The quantities α, β and γ are needed for
unambiguous bunch crossing assignment, i.e. for finding the first sample on a pulse. The
method as it is used in Preshower works well only when the peaking time is shorter than or equal
to the sampling period. When it was longer, the calculation of suitable weights
proved impossible in simulations (see Section 4).
Figure 3.12.: Simulated deconvolution on i) ideal pulse and ii) non-ideal pulse [34]
One of the strengths of the deconvolution method is the extremely short dead time. It can
tell apart two pulses at just two samples distance from each other [34]. With triple GEMs
this might become a disadvantage. It would be possible to have multiple triggers on one pulse
when using short shaping times, because the clusters in the GEM signal are not integrated
fully.
For these reasons an alternative scheme was devised that acquired the name Pulse
Recognition (PuR). It has roots in the deconvolution method used in Preshower. The quantities
α, β and γ are calculated as described above. The difference is that the temporal distance n
between samples v2 and v3 can be adjusted and the weights w1, w2 and w3 are not calculated
using the deconvolution method. The temporal distance between samples v1 and v2 is a constant 3
sampling periods. The idea is to iteratively find weights for which the inequality α < β > γ
is true only when the sample v2 is at the top of a pulse. In other words, it recognizes the
pulse.
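The structure of the PuR test can be sketched in Python as below. The sample spacing follows the description above (v1 three samples before v2, v3 n samples after v2), but the function names, the example pulse and in particular the weights are arbitrary placeholders, not values from this work.

```python
def pur_quantities(samples, i, weights, n, gap=3):
    """Compute (alpha, beta, gamma) with v2 = samples[i]."""
    w1, w2, w3 = weights
    v1 = samples[i - gap]   # fixed distance of 3 sampling periods before v2
    v2 = samples[i]
    v3 = samples[i + n]     # adjustable distance n after v2
    alpha = w1 * v1
    beta = w1 * v2 + w2 * v1
    gamma = w1 * v3 + w2 * v2 + w3 * v1
    return alpha, beta, gamma


def pur_trigger(samples, i, weights, n, gap=3):
    """True when beta is the greatest quantity: alpha < beta > gamma."""
    alpha, beta, gamma = pur_quantities(samples, i, weights, n, gap)
    return alpha < beta > gamma


pulse = [0, 1, 4, 8, 10, 9, 6, 3, 1, 0]
weights = (1.0, -0.5, -0.5)   # placeholder weights for illustration only
at_peak = pur_quantities(pulse, 4, weights, n=2)
```

With placeholder weights the condition is not guaranteed to hold only at the peak; achieving that is exactly what the weight search is for.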
The following method was used for finding the weights. The sampled average pulse and the
temporal distance n were assumed known. First the samples in the averaged pulse that are
wanted as v1, v2 and v3 at the time of trigger were identified as signal(t1), signal(t2) at the
peak, and signal(t3). Weight w1 is set to 1. Weight w2 is calculated from the requirement
that the inequality α < β > γ holds for the identified samples, but not when the time is
shifted. By assuming that signal(t1) signal(t2) and shifting either signal(t1), signal(t2)
or signal(t3) or all of them by one clock one gets inequalities
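\[ w_2 \le w_1 \left( 1 - \frac{\mathrm{signal}(t_3 - \Delta t)}{\mathrm{signal}(t_2)} \right) \qquad (3.8) \]
\[ w_2 \le w_1 \left( 1 - \frac{\mathrm{signal}(t_2)}{\mathrm{signal}(t_1 + \Delta t)} \right) \qquad (3.9) \]
from which the smaller is picked as the value for w2. The last remaining weight can then
be calculated using
\[ w_3 = w_2 \left( 1 - \frac{\mathrm{signal}(t_2 + \Delta t)}{\mathrm{signal}(t_1 + \Delta t)} \right) + w_1 \left( 1 - \frac{\mathrm{signal}(t_3 - \Delta t) - \mathrm{signal}(t_2 + \Delta t)}{\mathrm{signal}(t_1 + \Delta t)} \right) . \qquad (3.10) \]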
The best value for n was found iteratively. Simulations were run with several different values
of n and the one yielding the best time resolution was chosen. The optimum was usually found
when signal(t1) was among the first samples on the pulse.
An approximate total charge Q at trigger (β = Q) can be obtained when the weights
are calibrated using the same scaling factor for all of them. For the GEMs a total charge
measurement is not necessary and scaling was not performed. For very accurate charge
measurement the charge should be calculated using 3 additional weights, as is done in Preshower.
3.4.4. Constant fraction discriminator
A constant fraction discriminator (CFD) produces a trigger when a constant fraction k of the
pulse peak amplitude is reached. Because the rise time is constant, the timing for the pulses
does not depend on the pulse amplitude. The trigger is given at fraction f of the peaking
time tp. This method is at its best when the pulse does not have a sharp peak but the
pulse shape is constant. Most commonly k = f = 1/2.
Traditionally, in both analog and digital filters, the signal is divided into two components.
One signal (i) is delayed by time td and the other (ii) is multiplied by −k. The two signals
are summed (iii) and the triggering time t is found when the signal (iii) crosses zero.
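The traditional scheme can be sketched in Python as follows. The delay in samples, the example pulse and the function name are assumptions; the trigger is taken at the first sample where the summed signal (iii) goes from negative to non-negative.

```python
def cfd_trigger(samples, delay=2, k=0.5):
    """Return the index where delayed(x) - k*x first crosses zero upwards."""
    previous = 0.0
    for n, x in enumerate(samples):
        delayed = samples[n - delay] if n >= delay else 0.0   # signal (i)
        current = delayed - k * x                             # (i) + (ii) = (iii)
        if previous < 0 <= current:
            return n
        previous = current
    return None


pulse = [0, 0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0]
small = cfd_trigger(pulse)
large = cfd_trigger([5 * s for s in pulse])   # same shape, five times higher
```

Scaling the pulse scales signal (iii) by the same factor, so its zero crossing and therefore the trigger time stay the same.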
Figure 3.13.: Constant fraction discriminator. a) Timing is independent of pulse height. b)
Waveforms used in CFD. (Figure labels: a) t, t0, tp, A, kA, B, kB, time; b) (i), (ii), (iii), t, zero crossing.)
Besides reaching the same fraction of the pulse peak height at the time of trigger, the pulses
have another common attribute: they have the same gradient at this point. For pulses that
do not have a constant derivative this can be exploited. Triggering on the pulse derivative alone
would not give an unambiguous trigger. The correct derivative would be found both before and
after the maximum derivative. The first of these would often, but not always, be below the
threshold separating signal from noise. This would lead to unpredictable timing. Instead of the
derivative, the ratio of two sequential samples
a = sample(t + ∆t) / sample(t)
was used. It has the same form and values for different pulse heights, assuming that the
baseline is properly removed. It has its maximum value at the foot of the pulse. It falls from
its initial value, reaching 1 at the peak amplitude, and nears zero asymptotically. The maximum
value of a depends on the shaping time. For shorter shaping times the pulse rises with
a steeper slope, which leads to higher values of a. Picking a value for a that corresponds to
the maximum derivative was found to work best. The filter obtained with this approach
corresponds to a CFD with a fixed time delay of one sampling period (td = ∆t) and adjustable
k(a). The function mapping a → k depends on the form of the pulse. Most often it is more
practical to determine a from the waveform than from analytical calculations.
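A possible realization of the sample-ratio trigger is sketched below in Python. It assumes that the trigger fires the first time the ratio a falls to or below a chosen value while the signal is above a noise threshold; this firing rule, the function name and the numbers are illustrative assumptions, and the baseline is taken to be already removed.

```python
def ratio_trigger(samples, a_trig, threshold):
    """Return the first index n with samples[n] > threshold (threshold >= 0)
    and samples[n + 1] / samples[n] <= a_trig, or None if there is none."""
    for n in range(len(samples) - 1):
        if samples[n] > threshold and samples[n + 1] / samples[n] <= a_trig:
            return n
    return None


pulse = [0, 0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0]
small = ratio_trigger(pulse, a_trig=1.5, threshold=0.5)
large = ratio_trigger([5 * s for s in pulse], a_trig=1.5, threshold=0.5)
```

Since the ratio of neighbouring samples does not change when the pulse is scaled, both pulses trigger at the same sample.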
3.4.5. Peak ﬁnder
Peak finder is a simple, straightforward algorithm. It compares three consecutive samples,
sn−1, sn and sn+1. If the middle sample is the greatest, i.e. sn−1 < sn ≥ sn+1, the peak is
found. It is one of the four time pick-off methods listed in the paper on BXID for calorimeters
[35]. It is used at least in the ALICE EmCal trigger [37]. As it is simple, it takes very little chip
resources: a peak finder for one channel needs only two comparators.
Figure 3.14.: Peak finder principle. (Figure text: three samples Sn-1, Sn and Sn+1 versus time; simple peak finder condition Sn-1 < Sn ≥ Sn+1; link shown in the figure: http://www.sciencedirect.com/science/article/pii/S0168900206007121)
The limitation of the peak finder is that it works best for sharp peaks. This limits its
effective use to shorter shaping times. The amplitude information is given by the value of
the peak sample.
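The peak finder condition translates directly into code. In the Python sketch below the function name and the optional noise threshold are assumptions added for illustration; the two comparisons correspond to the two comparators mentioned above.

```python
def find_peaks(samples, threshold=0):
    """Return indices n where samples[n-1] < samples[n] >= samples[n+1]
    and samples[n] is above the (assumed) noise threshold."""
    return [n for n in range(1, len(samples) - 1)
            if samples[n] > threshold
            and samples[n - 1] < samples[n] >= samples[n + 1]]


flat_top = [0, 1, 3, 5, 5, 2, 0]
peaks = find_peaks(flat_top)
amplitudes = [flat_top[n] for n in peaks]   # amplitude is the peak sample value
```

The asymmetric comparison (strictly greater than the previous sample, greater than or equal to the next) gives a single trigger even when two neighbouring samples are equal.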
3.4.6. Zero-crossing identiﬁcation
Zero-crossing identiﬁcation (ZCI) is a variation of peak ﬁnder. The simple peak ﬁnder
described above only works for discrete samples. ZCI is implemented as readily for continuous
as for discrete signals. It is based on the fact that the derivative of the signal at the peak is
zero. The signal is ﬁrst diﬀerentiated. The zero-crossing in the diﬀerentiated signal coincides
with the peak in the original signal as shown in Fig. 3.15. Amplitude information is given
by the amplitude of the original signal at the time of zero-crossing.
Figure 3.15.: Zero-crossing of diﬀerentiated signal. The derivative (green) crosses zero when
the Gaussian pulse (blue) reaches its peak. Time and amplitude are in arbitrary
units.
The algorithm is the same for a discrete signal. One only has to remember that the signal is
delayed by 1/2 sampling period when differentiated. ZCI is one of the four BXID methods listed in
the calorimeter study [35]. The digital algorithm is as simple as the peak finder and needs
as many resources. It has the same limitation as well: it only works well for a well-defined
peak, which is obtained when the pulse rise time is not much longer than the sampling period.
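A minimal discrete sketch of ZCI, assuming the derivative is approximated by the first difference of consecutive samples (the function name and the test pulse are hypothetical):

```python
def zero_crossing_peaks(samples):
    """Locate peaks as positive-to-non-positive crossings of the first difference."""
    # diff[n] = s[n+1] - s[n] approximates the slope half a sampling
    # period after sample n (the 1/2 sample delay of the differentiator).
    diff = [samples[n + 1] - samples[n] for n in range(len(samples) - 1)]
    peaks = []
    for n in range(1, len(diff)):
        # Slope positive before sample n and non-positive after it:
        # the differentiated signal crosses zero at sample n.
        if diff[n - 1] > 0 and diff[n] <= 0:
            # Amplitude is read from the original signal at the crossing.
            peaks.append((n, samples[n]))
    return peaks


pulse = [0, 1, 4, 9, 7, 3, 1, 0]   # hypothetical sampled pulse
print(zero_crossing_peaks(pulse))  # prints [(3, 9)]
```

With a first-difference derivative the crossing condition is algebraically the same as the three-sample peak finder test, which is consistent with the two methods needing similar resources.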
3.5. Zero Suppression
In a typical detector a given channel has nothing going on most of the time. The zero
signal carries no information, and including it in the data readout is a waste of bandwidth.
In reality the signal is rarely strictly zero, because even in the absence of a pulse there is
still noise. The task of Zero Suppression (ZS) is to discern the meaningful signal pulse
from the meaningless background noise and tag the meaningful signal for readout. In its
classic form ZS compares the signal to a fixed threshold, which is set above the noise level
and below the signal pulse height. Everything below the threshold is interpreted as zero and
everything above as meaningful signal. The meaningful signal can be either flagged for data
packet formatting or processed instantly. The method applies when whole pulses are read
out. Alternatively, in some applications only pulse height data or binary data (hit or no
hit) might be needed. The extraction of pulse height or binary data is the task of the BXID
algorithms described previously.
The ZS filter in S-Altro flags data for a separate data formatting block. The ZS consists
of five parts: comparison to an absolute threshold, a glitch filter, adding flags to pre-samples,
adding flags to post-samples and merging two flagged regions that are close to each other. The ZS
block flags data for readout and storage. Fig. 3.16 illustrates how ZS selects the data for
readout. Initially a sample is flagged when it is above an absolute threshold. The glitch
filter removes the flags from events that are shorter than expected for a signal pulse. In order
to save the complete pulse, samples before the threshold crossing can be flagged as pre-samples.
Similarly, samples after the signal returns below threshold can be flagged as post-samples. If two
flagged pulses are close together, the samples between them are flagged as well, merging the flagged
regions. The filter is discussed in more detail in section 5.6.
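The five steps can be sketched offline in Python as follows. This is an illustration of the flagging logic only, not the S-Altro implementation; the function name and the parameters `min_len`, `pre`, `post` and `max_gap` are hypothetical stand-ins for the programmable settings.

```python
def zero_suppress(samples, threshold, min_len=2, pre=2, post=2, max_gap=2):
    """Return one readout flag per sample."""
    n = len(samples)
    # 1. Comparison to an absolute threshold.
    above = [s > threshold for s in samples]
    # Collect runs of consecutive samples above threshold as (first, last).
    runs, start = [], None
    for i, flag in enumerate(above + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    # 2. Glitch filter: drop runs shorter than expected for a pulse.
    runs = [(a, b) for a, b in runs if b - a + 1 >= min_len]
    # 3. and 4. Extend each run with pre-samples and post-samples.
    runs = [(max(0, a - pre), min(n - 1, b + post)) for a, b in runs]
    # 5. Merge regions separated by at most max_gap unflagged samples.
    merged = []
    for a, b in runs:
        if merged and a - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    flags = [False] * n
    for a, b in merged:
        for i in range(a, b + 1):
            flags[i] = True
    return flags


data = [0, 0, 0, 5, 6, 5, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0]
flags = zero_suppress(data, threshold=3, min_len=2, pre=1, post=1, max_gap=1)
print([i for i, f in enumerate(flags) if f])  # prints [2, 3, 4, 5, 6]
```

In the example the single sample of height 9 is rejected by the glitch filter, while the three-sample pulse is kept together with one pre-sample and one post-sample.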
Figure 3.16.: Zero Suppression in S-Altro. The chain of boxes below the waveform represents flag
bits. (Image source [20])
4. Simulations
4.1. Migrating ﬁlters from S-Altro
The Verilog code from S-Altro was available as a starting point for the DSP development.
The overall architecture of the two chips is very different, but the requirements for the basic
building blocks, the filters, were very similar. For the initial assessment of suitability for
GdSP purposes the basic DSP blocks were extracted from the S-Altro design and a test
channel was constructed, where the filters could be easily examined individually and together.
Each filter had a bypass option. The filters' behavior was simulated using the Cadence products
NCLaunch and SimVision [38]. After the initial assessment the blocks were edited.
The Digital Shaper was observed to have less effect on the signal length than expected.
When using shaping times above 100 ns, the effects were barely noticeable and no visible
undershoot was observed even with the maximum settings L = 0 and K = 1. The filter
worked most effectively with 25 ns shaping time.
Two separate Baseline Correction (BC) blocks were considered excessive and they were
merged. The reasoning behind separate BC blocks was that the Digital Shaper (DS) needs
at least rough baseline correction to operate normally, while the Moving Average Unit (MAU)
in BC2 can react unpredictably to pulse pile-up, which the DS was thought to reduce.
In the case of GEM detectors the DS was not seen to reduce pulse pile-up enough
to justify separate BC blocks.
Figure 4.1.: Shifted baseline in BC2 with TPC data. (Plot titled "TPC event": amplitude [ADC] versus time bin [25 ns]; traces Din, Dout, Bsl, H Thrsh and L Thrsh.)
Based on the feedback from Christian Lippmann, the BC2 was investigated with data
from two TPC events that had caused problems with ALTRO. Both cases had a saturated
pulse followed by a shifted baseline. The behavior was reproduced with the S-Altro filters when
using tight relative thresholds. It was found that the baseline shift had little to do with the
saturated pulse. It was instead caused by noise crossing the threshold and freezing the
baseline calculation. A zoomed view of the event is shown in Fig. 4.1, where Din is the
input signal to BC2, Dout the output signal from BC2, Bsl the calculated baseline and H Thrsh
and L Thrsh the double thresholds. When the input signal is outside the threshold limits,
the baseline calculation is frozen. The baseline of the signal shifted outside the thresholds while the
calculation was frozen, and the calculation thus remained frozen. To prevent this, the duration for which
the calculation remains frozen after a pulse should have some dependence on the pulse
duration. This was solved by disabling the freeze for short pulses, one to two clock cycles long. The
latency after reset before the threshold scheme is applied was made programmable. A soft
reset for the MAU alone was implemented. The adjustments increase the stability of the filter
and make recovery from errors easier, but the filter is never completely stable while the
double threshold scheme is used. The modifications therefore include the possibility to bypass
the double threshold scheme completely. The same result might be achieved by setting the thresholds to
maximum, but bypassing the thresholds is more reliable.
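The lock-up mechanism can be reproduced with a toy model. The sketch below is not the S-Altro MAU: it assumes a plain moving average over a hypothetical `window` of samples, with thresholds `high` and `low` taken relative to the current baseline estimate, and it models only the freeze and the bypass option.

```python
from collections import deque


def track_baseline(samples, window=4, high=3, low=3, use_thresholds=True):
    """Moving-average baseline that is frozen while the input is outside
    the band [baseline - low, baseline + high]."""
    history = deque([samples[0]] * window, maxlen=window)
    baseline = []
    for s in samples:
        bsl = sum(history) / window
        inside = (bsl - low) <= s <= (bsl + high)
        # The average is updated only with samples inside the band,
        # unless the double threshold scheme is bypassed.
        if inside or not use_thresholds:
            history.append(s)
        baseline.append(bsl)
    return baseline


# Hypothetical input: the true baseline steps from 10 to 20 ADC counts.
data = [10] * 5 + [20] * 20
print(track_baseline(data)[-1])                        # prints 10.0
print(track_baseline(data, use_thresholds=False)[-1])  # prints 20.0
```

With the thresholds active the estimate never leaves 10, because every later sample lies outside the band and the update stays frozen. With the bypass the average follows the shift after `window` samples.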
Some of the main changes to the BC blocks came from the desire to enable resetting while
taking data. In S-Altro the reset could always be made outside the acquisition window
and no attention had been given to artefacts arising from reset. In continuous data taking
it would be advantageous to be able to reset without producing artefacts that might be
interpreted as signal. Also programmability and feedback from the chip were taken into
special consideration. Troubleshooting is easier when more data is available.
4.2. Preliminary Comparison of Diﬀerent Time pick-oﬀ
Methods
There are many different existing algorithms for extracting timing information from a pulse.
The purpose of these simulations was to determine which of the algorithms seemed most promising.
Only the most promising ones would be examined and developed further.
There has been a previous study comparing Peak Finder, ZCI, CFD and a combination of a
Peak-Sharpening algorithm and Peak Finder [35]. The study served as a good starting point
and source of inspiration, but a further study was needed since the methods were compared
using only one, relatively short shaping time, whereas in this application the shaping time
is programmable.
In many ways these simulations are only tentative and are only valid for comparison within
the simulations. While adequate for the objectives of the simulations, they do not fully
describe a realistic system. The signal in these simulations is sampled but not digitized.
In other words it is a set of double precision floating point numbers. This was convenient
for finding suitable values for the algorithm variables. The noise used in the simulations is
based on a Gaussian random number generator. Besides noise, signal and design, the time
resolution is affected by the phase of the signal relative to the sampling clock. In these
simulations the phase was not optimized and could add substantially to the time resolution. In
later simulations, which were compared to analog methods, the phase was optimized.
4.2.1. Approximation of GEM Signal
For the input signal a dummy GEM signal was generated. At the time of the simulations
no large amount (over 100 pulses) of data from either detector measurements or simulations
was available. For the simulation purposes a simple approximation of the signal was drafted.
The signal from GEMs appears to consist on average of three separate pulses close together
that have random heights (see Fig. 2.2). This signal was approximated with three
boxes with random heights, as demonstrated in Fig. 4.2a. The signal was normalized so that the
sum of the box heights was one. The signal was then convolved with the transfer function
of the analog front-end:
\[ h(t) = \left(\frac{t}{\tau}\right)^{2} e^{-\frac{2t}{\tau}} \tag{4.1} \]
where τ is the shaping time. The analog shaping time is programmable with five options,
25, 50, 100, 150 and 200 ns, which were also used in the simulations. The resulting shaped
signal is shown in Fig. 4.2b. The signal statistics, pulse amplitude resolution and peaking
time resolution, shown in Fig. 4.3, are close to those of a GEM signal when using an Ar/CO2 gas mixture
with proportions 70:30.
Figure 4.2.: Creation of dummy GEM signal for simulations. a) Signal from GEMs is approximated
by three boxes with random height. b) The approximated GEM
signal is convolved with the analog front-end transfer function.
Figure 4.3.: Pulse height resolution and peaking time resolutions of the dummy signal (using
peak finder with 1 ns sampling).
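The construction of the dummy signal can be sketched in plain Python. The 1 ns time step matches the sampling mentioned for Fig. 4.3, but the box width of 20 ns, the uniform distribution of the heights and all function names are assumptions made for this illustration only.

```python
import math
import random


def shaper_response(tau, dt=1.0, length_factor=10):
    """Sample h(t) = (t/tau)^2 * exp(-2t/tau) of Eq. 4.1 on a dt grid (ns)."""
    n = int(length_factor * tau / dt)
    return [((k * dt) / tau) ** 2 * math.exp(-2.0 * (k * dt) / tau)
            for k in range(n)]


def dummy_gem_input(rng, box_width=20):
    """Three adjacent boxes with random heights, normalized to sum to one."""
    heights = [rng.random() for _ in range(3)]
    total = sum(heights)
    signal = []
    for h in heights:
        signal.extend([h / total] * box_width)  # box_width samples per box
    return signal


def convolve(x, h):
    """Direct discrete convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


rng = random.Random(1)                  # fixed seed for repeatability
h = shaper_response(tau=50.0)           # 50 ns shaping time
shaped = convolve(dummy_gem_input(rng), h)
print(h.index(max(h)))                  # prints 50
```

Setting the derivative of Eq. 4.1 to zero gives a maximum at t = τ with h(τ) = e^−2, which is why the sampled response peaks at index 50 for τ = 50 ns.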
4.2.2. Simulation Methods
The simulations were conducted using Matlab Simulink. Initially crude models of the different
methods were designed. The considered algorithms, listed in order from least
promising to most promising, were
• Time over Threshold (ToT)
• Piece-Wise Linear Fitting (PWLF)
• Deconvolution method
• Peak Finder
• Zero-Crossing Identification (ZCI)
• Pulse Recognition (PuR)
• Constant Fraction Discriminator (CFD)
where ToT was excluded already prior to simulations.
Figure 4.4.: Screen capture of Simulink simulation. The stimulus to the Design Under Test
(DUT) comes from Matlab workspace variables and noise from a random number
generator. The amplitude and timestamps from the trigger are saved as
workspace variables. The input data, produced trigger and corresponding amplitude
are monitored with Scope. The zoomed Scope waveforms are shown on the
right.
The simulations were conducted in rounds of test cases. There were altogether six test
cases:
1. Design verification. The designs were tested with a single test pulse. The criterion for
verification was that one trigger was observed for the pulse and that the output amplitude
scaled with the input pulse amplitude.
2. Effect of filtering noise and peak sharpening on time resolution.
3. Performance over shaping time. Reliable and even performance with the different
pulse peaking times was desired. The effects of the detector were excluded by using the
same pulse 100 times. Only the random noise superimposed on the signal caused
deviations from perfect performance. The time resolution, amplitude resolution and
excess triggers caused by noise were measured. Based on the results some of the designs
were dropped from further development.
4. Time resolution with a realistic detector. The dummy signal used had similar time
and amplitude resolution to GEM detectors. Random noise was superimposed on the
signal.
5. Sensitivity to pulse pile-up. The minimum temporal distance between two equal peaks
was determined.
After or during each test case the models would be either discarded, modified or kept as they
were. One of the models, PuR, started as a repeated modification of the Deconvolution filter. The
final block diagrams of the Simulink models can be found in Appendix C.
A simulation test bench is shown in Fig. 4.4. The signal is read from the Matlab workspace
and the noise is generated with a Gaussian random number generator. The input Tevent from
the Matlab workspace counts the bunch crossings from the incidence of the particle on the detector and
the beginning of the pulse. It simulates a counter that is started when the detector is stimulated. The
signal may be amplified before it enters the Design Under Test (DUT). The amplitude output
and the counter were sampled using the produced trigger and read into Matlab variables. In
most simulations the different models were tested simultaneously by setting them in parallel
in the test bench.
4.2.3. Simulation results
The most important results from the simulations were the time resolutions given by the
different methods. When interpreting the plots one should pay little attention to the exact
values of the time resolution, which will change greatly with calibration. More important is the
shape of the curve with respect to peaking time. For example the peak finder has relatively
good time resolution with the short 25 ns shaping time. With longer shaping times the time
resolution deteriorates.
Fig. 4.5 shows the results for time resolution when
(a) an ideal detector was assumed; only random noise gives rise to the time resolution
(b) a realistic detector was assumed; in addition to random noise the detector signal
has inherent time resolution.
The signal to noise ratio in the plots is S/N = 16 for high noise and S/N = 50 for low noise.
The expected ratio is between them, around S/N = 20...30. In all cases it can be seen that
the peak finder algorithms give increasing time resolution for increasing shaping time. In the
detector independent study CFD and PuR gave similar time resolution. With a low level of
noise the time resolution is beneath the detection limit. With a GEM-like signal and low-level
noise CFD seems to give better time resolution than PuR. When the noise level is increased,
the difference diminishes.
Figure 4.5.: Time resolution of different time pick-off algorithms with (a) random noise and
(b) realistic detector and random noise. Lowest measurable time resolution for
the setup is 3 ns. (Both panels: time resolution [ns] versus shaping time [ns] for PuR, CFD and PF/ZCI, each with high and low noise.)
The Deconvolution method and Piece-Wise Linear Fitting (PWLF) are not included in
the graphs. The Deconvolution method consistently yielded circa 19 ns time resolution with
all shaping times. This corresponds to the case where half of the events are assigned the
correct timing. When one considers the principle of deconvolution, the reason for this time
resolution becomes apparent. The deconvolution aims to restore the original signal. The
original signal is, however, around 60 ns long and consists of multiple peaks with random
relative heights. The pulse from a deconvolution filter would hence be at least two clocks
long and have a random form. In a way the deconvolution method sees the original signal too
clearly and is more susceptible to irregularities in the signal.
The PWLF was only tested with one shaping time, τ = 50 ns, and assuming an ideal detector.
It gave results competitive with the other methods. For low noise the time resolution was
σt = 2.4 ns and for higher noise σt = 9.6 ns. The time resolution presumably could have
been even better if the filter complexity had been increased. The problem with PWLF was not
the time resolution; it had the potential to give the best time resolution of all the filters.
The problem was complexity. Even the first order version was so heavy that it would eventually
limit the number of channels on the chip.
Figure 4.6.: Peak Sharpener, Integrator and noise. (a) Input signal. (b) Output from Peak
Sharpener. (c) and (d) Output from Integrator. On the x-axis is time and on the y-axis
ADC counts.
The peak sharpener in question was found to amplify noise too much to be useful, as
illustrated in Fig. 4.6. The ATLAS study [35] had found FIR peak sharpening slightly
useful. For best filtering performance, the FIR filter should be long enough to accommodate
the whole pulse with the longest shaping time. This would give a latency of at least 20 clocks for the
filter. The latency would be problematic, because it is added to the trigger latency.
The signal smoothing Integrator was found helpful with high-level, high-frequency noise.
In Fig. 4.6 (c) and (d) the output signal from the integrator is shown after the first and second
integration. CFD, PF and ZCI benefited from the integrator in most cases. With PuR the
benefits were not as consistent and clear: the integrator interfered with its
operation in as many cases as it helped.
Figure 4.7.: Time resolution and pulse pile-up. The distance between two equal pulses before
timing is affected for (a) CFD, (b) PuR, (c) PF and (d) ZCI. (Each panel: clocks between pulses versus Tp (ns). Panel titles: "Dead time for Constant fraction (best time resolution)", "Dead time for Pulse Recognition", "Dead time for Peak finder" and "Dead time for Zero−crossing".)
Fig. 4.7 illustrates the required delay between pulses. Gray indicates the region
where the triggering efficiency is not reduced, but the timing is altered. Black indicates
dead time: the trigger efficiency is reduced and only one pulse of the two is seen, but the
timing is correct. Purple marks a region where both triggering and timing efficiencies
are reduced: only one pulse is seen and the timing for it is altered. The minimum time
between pulses for no effects from pulse pile-up was found to be shortest for ZCI and longest
for PuR. The dead time was shortest for the peak finder and longest for CFD. The effects seem
to depend more on the shaping time than on the BXID algorithm.
4.3. Comparison between digital and analog BXID methods
The two most promising methods in the previous simulations, CFD and Pulse Recognition, were
developed into Verilog models. The time resolution of the models was compared to analog
methods using the same signal. The analog methods were investigated at Université Libre
De Bruxelles.
4.3.1. Simulated GEM signal
The signal used in the simulations was obtained from Garﬁeld simulations of the GEM
detector conducted in Université Libre De Bruxelles (ULB) [36]. The simulation is for
gas mixture Ar:CO2:CF4 (45:15:40) and contains 500 separate events for a MIP entering
the GEM. The same signal was used in simulations using digital and analog time pick-oﬀ
methods.
Figure 4.8.: Garﬁeld simulations on GEM signal (Image source [36])
The ULB group shared the convolved, unscaled signal. The Garfield simulation gives
results in arbitrary units. The average signal height was scaled to correspond to the expected
pulse height from a MIP. All events were multiplied by the same factor so that the deviations
in the pulse height were maintained.
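The scaling step can be sketched as follows. The function name, the use of the maximum sample as pulse height and the example numbers are assumptions; the only property taken from the text is that a single common factor is applied to all events.

```python
def scale_events(events, target_mean_height):
    """Scale all events by one factor so the mean pulse height hits the target."""
    heights = [max(ev) for ev in events]          # pulse height per event
    factor = target_mean_height / (sum(heights) / len(heights))
    # One common factor keeps the relative spread of pulse heights intact.
    scaled = [[factor * s for s in ev] for ev in events]
    return scaled, factor


# Two hypothetical events in arbitrary units with heights 1 and 3.
events = [[0.0, 1.0, 0.5], [0.0, 3.0, 1.0]]
scaled, factor = scale_events(events, target_mean_height=4.0)
print(factor)                      # prints 2.0
print([max(ev) for ev in scaled])  # prints [2.0, 6.0]
```

The ratio between the two pulse heights is 3 both before and after scaling, so event-to-event fluctuations are preserved.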
4.3.2. Simulation methods
Table 4.1.: Settings for best time resolution
Shaping time   Constant Fraction Discriminator   Pulse Recognition
[ns]           a      order of integration       n    v1   v2     v3
25             4.43   1st                        2    1    0      -0.4
50             3.98   1st                        3    1    -0.4   -0.3
100            2.86   2nd                        5    1    -2.2   4.2
250            1.60   2nd                        9    1    -1.7   2.9
500            1.27   2nd                        9    1    -0.2   0
Figure 4.9.: Screen capture of simulation using NCLaunch and SimVision.
The simulations were conducted using the Cadence products NCLaunch and SimVision [38].
The Verilog Hardware Description Language (HDL) was used. SimVision was launched either from
within NCLaunch or directly from a terminal, depending on the design complexity. Fig. 4.9
shows a screen capture of a simulation with NCLaunch and SimVision. Snippets of the test
bench code in a text editor are shown above and waveform data in the SimVision window below.
The uppermost waveform is the input signal to the DSP core and the first below it is the
baseline corrected input signal to the BXID blocks.
The greatest difference between the Simulink models and the Verilog models is that in the
latter all variables, inputs and outputs have less precision and a tighter range. In Simulink,
when the required range of the variables was yet unknown, the variables were all of type
double. Based on these previous simulations the ranges and precision needed for the variables
to represent all the required values could be drafted. The values for the variables used in the
simulation are presented in Table 4.1. The hard coded ranges and precisions tie the filters
to the designed range of shaping times (25 ns to 200 ns), resulting in poorer time resolution
when using shaping times outside the range. Finding appropriate settings for PuR with 500
ns shaping time proved impossible, because the variable range for n was too small. With shaping
times 25 ns and 500 ns PuR is effectively used as a simple peak finder.
Figure 4.10.: Block diagram of the simulation test bench. (Inputs: data file from GEM simulations and noise file with a mix of three frequencies. The test DSP core holds channels Ch1 (CFD), Ch2 (PR), Ch3 (CFD) and Ch4 (CFD). Outputs: timestamp file for analysis and, on screen, timing efficiency, multi-triggers and lost events.)
A block diagram of the simulation is shown in Fig. 4.10. For a more realistic outcome the
filters were used with the full DSP chain. A truncated test version of a DSP core with two
channels was created. The channels within the core used the same settings. Two additional
channels were used with different settings. The main difference in settings was the use of the
integrator. The input signals were read from an input file. Optional noise could be read from a
separate noise file. At this point the noise was not used. Some approximate key results were
displayed on screen at the end of the simulation. They were mainly used in the calibration
(e.g. fine-tuning the signal phase). The time resolution was calculated with Matlab using
the output file for timestamps.
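As an illustration of the last step, the sketch below computes a time resolution from trigger timestamps. It assumes that the time resolution is the standard deviation σt of the difference between trigger time and true event time; the actual Matlab analysis is not reproduced here, and the function name and example values are hypothetical.

```python
import statistics


def time_resolution(trigger_times_ns, event_times_ns):
    """Standard deviation of (trigger time - event time), in ns."""
    residuals = [t - t0 for t, t0 in zip(trigger_times_ns, event_times_ns)]
    # Population standard deviation over all analysed events.
    return statistics.pstdev(residuals)


# Hypothetical triggers that land one clock (25 ns) late for half the events.
triggers = [100.0, 125.0, 100.0, 125.0]
events = [0.0, 0.0, 0.0, 0.0]
print(time_resolution(triggers, events))  # prints 12.5
```

A constant offset between trigger and event time does not affect the result, which is why only the spread matters once the signal phase has been calibrated.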
The simulations for the analog methods were conducted at ULB by Thierry Maerschalk
under the supervision of Gilles De Lentdecker [36]. They were based on models written in the Python
programming language. The simulations were conducted at this point without noise.
4.3.3. Simulation results
Figure 4.11.: Time resolution using digital (a) Constant Fraction Discriminator and (b) Pulse
Recognition using Front-End gain 12.5 mV/fC without noise. (Both panels: time resolution (ns) versus shaping time (ns).)
Figure 4.12.: Time resolution using analog methods a) Time over Threshold and b) Constant
Fraction Discriminator without noise (Image from [36])
Fig. 4.11 shows the time resolution for the digital methods and Fig. 4.12 for the analog methods.
With short shaping times the time resolution is larger, because the GEM signal is not fully
integrated in the analog front-end. This results in a distorted and irregular pulse shape. As the
shaping time increases the pulses become more uniform. The time resolution of the analog
methods follows this principle. With the digital methods this effect is seen with the shorter
shaping times. With longer shaping times the calculations need to be increasingly accurate,
as the relative differences between sample values become smaller. The digital system also
suffers from quantization noise, which may interfere with the finer calculations and increase
the time resolution for longer shaping times. As a result the digital methods have an optimal
shaping time around 50 ns. For this shaping time the time resolution of the digital and analog
methods is almost the same: around 5.4 ns for the digital methods and around 5.2 ns for the
analog methods.
4.4. Required Eﬀective Number of Bits
The Effective Number of Bits (ENOB) required for good time resolution is an important
specification for the ADCs. The ENOB is also discussed in section 2.3. It is
assumed that the ADC input and output ranges are perfectly matched and the ENOB can
be expressed as random noise. The effect of ENOB was assessed in three different ways:
by analytical calculation, by using the differential pulse height spectrum and by comparing the time
resolution with different noise levels.
The primary, central channels closest to the incident particle collect the most charge. A
much smaller signal can be seen on the neighboring channels. The signal on neighboring
channels can be used to get better spatial resolution. The eﬀects of the ENOB are assessed
also for the neighboring channels.
4.4.1. Analytical Estimation
The noise resulting from the ENOB, N_ADC, is added in quadrature to the noise of the analog front-end, N_FE, giving
the total noise
\[ N_{tot} = \sqrt{N_{FE}^{2} + N_{ADC}^{2}}. \]
The expected equivalent noise charge for the front-end is N_FE = 1100 e. The chip has two
options for amplifier gain: g = 12.5 mV/fC and g = 50 mV/fC. Analytically one can calculate a
value for ENOB for which the two noise sources are balanced: N_FE = N_ADC ⇒ ENOB = 9.8
for g = 12.5 mV/fC and ENOB = 7.8 for g = 50 mV/fC. With these values for ENOB neither
noise source dominates and in principle the ADC noise should not significantly increase the
time resolution.
4.4.2. Diﬀerential Pulse Height Spectrum
A traditional way to quantify the tolerable noise level is to take a differential pulse height
spectrum of the signal. When a threshold can be set between the peaks representing the noise
and the signal, the noise is tolerable.
The simulated GEM signal has approximately 1 to 5% overlap with the front-end noise
alone. A pulse height region without any pulses between the signal and noise cannot be
found. The separation between the signal and noise does depend on ENOB, as illustrated in
Figs. 4.13 and 4.14. The applied threshold T in ADC counts is on the x-axis. On the y-axis is
the number of pulses N per threshold interval dT = 10, i.e. the number of pulses
that have an amplitude in the range from T − dT to T.
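The binning behind such a spectrum can be sketched as below. The treatment of the interval edges (amplitudes in the half-open range from T − dT, exclusive, to T, inclusive) and the function name are assumptions, and the amplitudes in the example are made up.

```python
import math


def differential_spectrum(amplitudes, d_t=10):
    """Count pulses per threshold interval: key T holds amplitudes in (T - dT, T]."""
    counts = {}
    for a in amplitudes:
        if a <= 0:
            continue  # pulses at or below zero are not counted
        t = math.ceil(a / d_t) * d_t   # upper edge T of the interval
        counts[t] = counts.get(t, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical amplitudes in ADC counts: a noise cluster and a signal cluster.
amps = [3, 8, 10, 11, 355, 358.4]
print(differential_spectrum(amps))  # prints {10: 3, 20: 1, 360: 2}
```

A usable threshold exists when there is a range of T between the noise cluster and the signal cluster in which the counts are zero.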
The collected charge was estimated to be 14 fC on the primary channel and 6 fC on
neighbors [27]. The pulse amplitude on primary channel would be then 358.4 ADC counts
with g = 50 mV/fC and 153.6 ADC counts with g = 12.5 mV/fC. The total RMS noise in
ADC counts using g = 50 mV/fC was estimated to be 4.6 for ENOB=9, 6.0 for ENOB=7
and 16.6 for ENOB=5. Using g = 12.5 mV/fC it was estimated to be 1.5 for ENOB=9, 4.2
for ENOB=7 and 16.0 for ENOB=5.
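The quoted totals can be cross-checked with the quadrature sum of section 4.4.1. The sketch below rests on two assumptions that are not stated in the text: the size of one ADC count is derived from 14 fC at 50 mV/fC corresponding to 358.4 counts, and the ADC noise is modeled as 2^(9 − ENOB) counts. The latter was chosen only because it is consistent with the numbers above.

```python
import math

ELECTRON_CHARGE_FC = 1.602176634e-4      # elementary charge in fC
MV_PER_COUNT = 14 * 50 / 358.4           # 700 mV <-> 358.4 counts (assumption)


def total_noise_counts(enob, gain_mv_per_fc, enc_electrons=1100):
    """Quadrature sum of front-end and ADC noise, in ADC counts."""
    # Front-end noise: equivalent noise charge -> mV -> ADC counts.
    n_fe = enc_electrons * ELECTRON_CHARGE_FC * gain_mv_per_fc / MV_PER_COUNT
    # ADC noise model: ASSUMPTION, picked to match the quoted values.
    n_adc = 2.0 ** (9 - enob)
    return math.hypot(n_fe, n_adc)


for gain in (50, 12.5):
    print(gain, [round(total_noise_counts(e, gain), 1) for e in (9, 7, 5)])
# prints:
# 50 [4.6, 6.0, 16.6]
# 12.5 [1.5, 4.2, 16.0]
```

Under these assumptions the front-end contributes about 4.5 counts at g = 50 mV/fC and about 1.1 counts at g = 12.5 mV/fC, so the ADC term dominates at 5 ENOB for both gains.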
The differential pulse height spectra are shown in Figs. 4.13 and 4.14. The
minimum value for ENOB depends on the amplifier gain and the collected charge on the
channel. The best separation between signal and noise is achieved on the primary channel
with gain g = 50 mV/fC. In this case they are separated even with 5 ENOB, as can be seen
in Fig. 4.13 (c). The worst case is the neighboring channel with gain g = 12.5 mV/fC. Here
the signal is barely distinguishable from the noise even with 9 ENOB, as seen in Fig. 4.14
(d).
4.4.3. ENOB and time resolution
The same Verilog-based test bench was used as when comparing the digital designs to the
analog ones. The test bench is shown in Fig. 4.10. A pulse amplitude corresponding to the
primary channel was used. The noise was a mixture of three different frequencies. Modeling
the noise with random numbers is likely to result in overestimation of the noise and of the
time resolution. It is fully possible that the time resolution measured in laboratory tests would
be better.
Fig. 4.15 shows the dependence of the time resolution on ENOB for CFD and PuR for
the primary channel. Fig. 4.15 (b) has exactly the expected form of an approximate rectangular
hyperbola. The time resolution approaches its optimal value when ENOB tends to infinity
and begins to increase rapidly below 8 ENOB. With the greater gain (see Fig. 4.15 (a))
the time resolution begins to increase correspondingly below 7 ENOB. Unexpectedly the
best time resolutions of the two methods differ by over 1 ns. With the greater gain
some clipping of the pulses was seen with the larger amplitudes. This was not
estimated to be frequent enough to cause significant deterioration in the time resolution. When
compared to the time resolution with lower gain, the time resolution with CFD is around
the same value and with PuR it is improved. This hints that PuR benefits from the
increased accuracy in the calculations as the pulse is magnified.
For the neighbor channels it is expected that the base level of time resolution is higher,
resulting from the smaller S/N. A higher required ENOB is also expected.
Figure 4.13.: Differential pulse height spectrum for primary channel for shaping time 100 ns
and amplifier gain (a)-(c) g = 50 mV/fC and (d)-(f) g = 12.5 mV/fC. (Each panel: dN/dT versus T with dT = 10, Ch = 1, Tp = 100 ns; ENOB = 9 in (a) and (d), ENOB = 7 in (b) and (e), ENOB = 5 in (c) and (f).)
Figure 4.14.: Differential pulse height spectrum for neighboring channel for shaping time 100
ns and amplifier gain (a)-(c) g = 50 mV/fC and (d)-(f) g = 12.5 mV/fC. (Each panel: dN/dT versus T with dT = 10, Ch = 2, Tp = 100 ns; ENOB = 9 in (a) and (d), ENOB = 7 in (b) and (e), ENOB = 5 in (c) and (f).)
[Figure 4.15, panels (a) and (b): time resolution (ns) plotted against ENOB for CFD and Pulse Recognition. Panel titles as printed: (a) Tp = 50 ns, g = 50 mV/fC; (b) Tp = 50 ns, g = 12 mV/fC.]
Figure 4.15.: Time resolution with different ENOBs for amplifier shaping time τ = 50 ns and
gains (a) g = 50 mV/fC and (b) g = 12.5 mV/fC.
5. Proposal for the GdSP Signal Processing Chain
5.1. Digital Signal Processing Core and Chain for one channel
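[Figure 5.1: block diagram. Blocks shown: Baseline Correction, Digital Shaper, Integrator, Zero Suppression, Constant Fraction Discriminator, a signed-to-unsigned conversion and a >>2 shift. Signals: ADC data (input), Pulse, Flag and trigg (outputs); two multiplexers are selected by mode. Bus widths range from 10 to 13 bits.]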
Figure 5.1.: Data path on the Digital Signal Processing chain.
The DSP core contains either 128 or 64 channels and a 10 bit counter. The counter is
used to read the SRAM in BC1 time-wise. The DSP chain for one channel is illustrated in
Fig. 5.1. For simplicity only the data path is shown.
The data input to the DSP chain comes directly from an ADC. The data outputs Pulse and
Flag go to SRAM, where the data is delayed for the duration of the CMS trigger latency. The
data is formatted after the SRAM. The formed trigger signal trigg has a direct route to the
E-Port for readout.
The DSP has two modes of operation: Tracker and Waveform. Baseline Correction (BC),
Digital Shaper and Integrator are used in the same fashion in both modes. In the tracker mode
the Zero Suppression (ZS) block is completely unused and the outputs come from the Constant
Fraction Discriminator (CFD): Pulse contains the extracted pulse amplitude information and
Flag is the trigger signal. The choice of whether amplitude information or binary data is
read out is made in the data formatting block.
In waveform mode complete pulses are read out. In this mode Pulse contains the processed
waveform data and Flag marks meaningful data for readout. The CFD can be used
simultaneously to produce the trigger.
Many of the DSP blocks make use of a threshold value to distinguish signal from noise.
These blocks are BC2, ZS and CFD. All three filters have independently adjustable threshold
values that are common to all channels. A component that is adjusted separately for each
channel is added to the thresholds. This NoiseCh variable indicates the noise level
on the channel.
It is common practice to register the output from a filter. This prevents the bit rise time
from causing instability. These register levels add to the latency of the filter. If a shorter
trigger latency were needed, it could be investigated whether stable filters can be achieved
without the registers.
The reset scheme for the DSP chain includes two resets. One is the active-low common reset
RstB for all blocks. It clears all registers and calculated values to zero. The second reset,
ma_rstB, is distributed through the configuration registers. It resets exclusively the Moving
Average Unit (MAU) in BC2. Both the common reset and the MAU reset are designed to be usable
while the detector is on-line and taking data. Table 5.1 lists the system inputs that are
common to all DSP filters in the core.
Table 5.1.: System inputs common to all blocks in the DSP core.
System inputs
I/O Name Width Description
Input Clk 1 40 MHz clock signal
Input RstB 1 Active low reset
5.2. Baseline Correction
The Baseline Correction (BC) is divided into two blocks, BC1 and BC2. Both blocks
are optional and can be bypassed to reduce latency (see Fig. 5.2). Although both can
be bypassed, it is recommended that at least one of them is used. All the input and
output signals are summarized in Table 5.2.
Figure 5.2.: Block diagram of Baseline Correction
Figure 5.3.: Block diagram of Baseline Correction 1
The block diagram of BC1 is shown in Fig. 5.3. It is based on the BC1 in the S-Altro
ASIC. Features that were not compatible with continuous data taking were discarded in
the migration process. The main features of BC1 are fixed pedestal subtraction and a small
10x1024 SRAM. The filter offers multiple usage options. The options in BC1 are chosen
through chip registers using the first five bits of the input variable named Control. Not all
combinations of the Control bits are meaningful. The most useful options (and binary values
for Control, with x marking a bit that does not matter) are
• Fixed pedestal subtraction from input signal (Control=1xxx0)
• Test mode: Signal is read from SRAM and fixed pedestal is subtracted (Control=1x011)
• Conversion mode: SRAM is used as LUT for input and fixed pedestal is subtracted (Control=1x1x1)
• Subtraction of periodic disturbances written to SRAM (Control=0x010)
• Subtracting converted signal: using SRAM as LUT in signal→baseline conversion (Control=0x1x0)
• Recording input signal to SRAM (Control=x101x)
• Writing data to SRAM through control register interface (Control=x000x)
BC1 can be a powerful tool in case of systematic periodic disturbances or other systematic
distortions. The fixed pedestal subtraction offers a foolproof baseline correction when
low frequency baseline shifts are not a significant problem.
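The Control patterns listed above can be checked mechanically. The following is a minimal Python sketch, not part of the GdSP design: the pattern strings are copied from the list, while the mode names, the helper functions and the treatment of Control as a five-character string are assumptions made for illustration. It also shows that some Control values fit more than one pattern, which is one way to see why not every combination is meaningful.

```python
# Illustrative sketch only. Pattern strings are taken from the list of BC1
# options; the mode names and helper functions are hypothetical.
BC1_MODES = {
    "fixed pedestal subtraction": "1xxx0",
    "test mode": "1x011",
    "conversion mode": "1x1x1",
    "periodic disturbance subtraction": "0x010",
    "converted signal subtraction": "0x1x0",
    "record input to SRAM": "x101x",
    "write SRAM via registers": "x000x",
}


def matches(bits: str, pattern: str) -> bool:
    """True if a bit string fits a pattern in which 'x' matches either value."""
    return len(bits) == len(pattern) and all(
        p in ("x", b) for b, p in zip(bits, pattern)
    )


def modes_for(bits: str) -> list:
    """Return the names of all listed options whose pattern fits the bits."""
    return [name for name, pattern in BC1_MODES.items() if matches(bits, pattern)]


if __name__ == "__main__":
    for control in ("10011", "10000", "11011"):
        print(control, modes_for(control))
```

The value 11011, for example, fits both the test mode pattern and the recording pattern, so the written order of the bits and the intended combination would need to be checked against the register documentation.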
The BC2 filter subtracts a self-calibrating baseline. The filter is based on the S-Altro BC2,
which in turn is an unchanged version of the one in the ALTRO ASIC. In the adaptation to GdSP
the filter was updated significantly based on feedback. The block diagram of the filter can be
found in Fig. 5.4. As seen from the figure, the double threshold scheme and the control logic
for the moving average calculation form a considerable part of the filter.
Figure 5.4.: Block diagram of Baseline Correction 2
Figure 5.5.: Block diagram of Moving Average Unit.
The double threshold scheme excludes pulses from the baseline calculation. The input
signal is compared to high and low thresholds relative to the calculated baseline. If the signal
is outside the threshold limits, the baseline calculation is suspended. The calculation resumes
when the signal returns inside the threshold limits. Unlike in S-Altro, the usage of the
threshold scheme is optional. The risks of the double threshold scheme were discussed in
the context of S-Altro BC and the simulation results (see sections 3.2 on page 18 and 4.1 on
page 34). The downside of threshold-less calculation is that the pulses will be followed by
undershoots, resulting in dead time as discussed in section 3.2. The upside is that the
calculation is extremely predictable and stable, as experienced in use in the ALICE EmCal
trigger boards [37].
The control logic contains a latency counter, a postmask counter and a flat beat counter. The
latency counter lets the registers in the MAU fill after reset before the double
threshold scheme is applied. The postmask counter keeps the baseline calculation suspended for
the specified time even after the signal has returned inside the threshold limits. The flat beat
counter makes certain that there are no pulses present in the signal at the moment when
the double threshold scheme is applied after reset. It keeps counting up as long as the
signal is within the threshold limits. If the signal is not within the limits, the counter is
reset. When the counter reaches a programmable value, it raises a flag signifying that no pulses
are present in the signal.
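The behaviour of the flat beat counter can be sketched in a few lines of Python. This is only an illustration of the counting rule described above. The function name, the argument names and the choice to report the flag as "counter has reached the programmed value" on every sample are assumptions; the text does not say whether the hardware latches the flag.

```python
def flat_flag(within_limits, flat_length):
    """Per-sample 'no pulses present' flag.

    within_limits: iterable of booleans, True when the sample lies inside
                   the threshold limits (hypothetical input for this sketch).
    flat_length:   programmable number of consecutive in-limit samples
                   required before the flag is raised.
    """
    count = 0
    flags = []
    for inside in within_limits:
        # Count up while the signal stays inside the limits, otherwise restart.
        count = count + 1 if inside else 0
        flags.append(count >= flat_length)
    return flags


if __name__ == "__main__":
    history = [True, True, False, True, True, True]
    print(flat_flag(history, 3))
```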
The baseline calculation is based on the Moving Average Unit (MAU) that is illustrated in
Fig. 5.5. It is the Finite Impulse Response filter with direct sum described in section 3.2.
The MAU calculates the moving average over two, four or eight samples. The baseline is
subtracted from the signal if the filter is enabled.
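The interplay of the moving average and the double threshold scheme can be illustrated with a short Python sketch. It is a behavioural model under stated assumptions, not the BC2 implementation: thresholds are given as offsets from the current baseline, the baseline is updated before it is subtracted, the average is taken over the samples stored so far while the window fills, and the latency, postmask and flat beat counters are left out. All names are hypothetical.

```python
from collections import deque


def subtract_baseline(samples, taps=4, thr_high=10, thr_low=10, use_thresholds=True):
    """Subtract a moving-average baseline, optionally frozen during pulses."""
    if taps not in (2, 4, 8):
        raise ValueError("taps must be 2, 4 or 8")
    window = deque(maxlen=taps)  # most recent samples accepted into the average
    baseline = 0
    out = []
    for x in samples:
        inside = baseline - thr_low <= x <= baseline + thr_high
        # Update the baseline unless the threshold scheme is on and the
        # sample lies outside the limits (assumed freezing rule).
        if not use_thresholds or not window or inside:
            window.append(x)
            baseline = sum(window) // len(window)
        out.append(x - baseline)
    return out


if __name__ == "__main__":
    signal = [100] * 8 + [150] * 3 + [100] * 4
    print(subtract_baseline(signal))                        # pulse kept at full height
    print(subtract_baseline(signal, use_thresholds=False))  # undershoot after pulse
```

With the thresholds enabled the pulse samples leave the baseline untouched. With the thresholds disabled the pulse pulls the average up, which reduces the pulse height and produces the undershoot mentioned in the text.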
Table 5.2.: Baseline Correction inputs, outputs and variables from configuration registers.
Data in and out
I/O     Name       Width  Description
Input   Din        10     Data input from ADC. Unsigned.
Output  Dout       13     Data output to next block. Signed two's complement.
Register variables
I/O     Name       Width  Description
Input   Control    7      Selects the data path in the Baseline Correction.
Input   Add        10     Address for the SRAM in BC1.
Input   Fpd        10     Fixed pedestal value.
Input   Sram_data  10     Data to be written to the SRAM in BC1.
Input   Rd         1      Read enable for the SRAM in BC1.
Input   Wr         1      Write enable for the SRAM in BC1.
Input   Edges      6      Sets the time before and after a pulse, when the baseline is frozen.
Input   flat       4      Length of time period after reset during which there are no pulses present before threshold scheme is turned on.
Input   glitch     1      Prevents short glitches from freezing baseline calculation.
Input   latency    5      Latency after reset before threshold scheme is turned on.
Input   NoiseCh    10     Noise per channel is added to the thresholds.
Input   override   1      Turns the threshold scheme off.
Input   TapsEn     2      Selects the number of samples in the moving average calculation.
Input   ThrshB2H   9      High threshold. When signal is above this threshold, baseline calculation is frozen.
Input   ThrshB2L   9      Low threshold. When signal is below this threshold, baseline calculation is frozen.
Output  BslOut     13     Baseline that is being removed from the signal.
Other
I/O     Name       Width  Description
Input   Time       10     Output from a counter on DSP-core level. Used for reading or writing the SRAM in BC1 time-wise.
5.3. Integrator
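[Figure 5.6: block diagram. Labels: Din, Dout, pipeline registers s2, s1 and s0 (Z-1, reset by RstB), adders with sums sum2 and sum4, avg, and a multiplexer selected by taps. Bus widths shown: 13, 14, 15 and 2 bits.]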
Figure 5.6.: Block diagram of Integrator.
The integrator is used to filter high frequency noise that is close to the sampling frequency.
It takes the average of two or four sequential samples. These correspond to the numerical
integral and second order numerical integral over time covering one sampling period.
The block diagram of the integrator is shown in Fig. 5.6. The input Din is saved into a
short pipeline containing the registered samples s2, s1 and s0. The input taps is used to select
the signal for the output Dout. The signal can be routed directly from the input when there is
no wish to use the integrator. The other options are the sum of the input and the first sample
in the pipeline, and a sum where the last two samples in the pipeline are added to that sum. The
division by two or four is performed by omitting LSB(s) in the selection.
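The selection logic described above can be sketched in Python as follows. This is a behavioural illustration only: the register naming (s0 holding the most recent stored sample) and the cleared pipeline at start-up are assumptions, and the mapping of taps values follows Table 5.3. The right shift drops the LSBs, which for two's complement values rounds toward minus infinity.

```python
def integrate(samples, taps=0b11):
    """Average 1, 2 or 4 sequential samples depending on taps (00, 01/10, 11)."""
    s0 = s1 = s2 = 0  # pipeline registers, assumed cleared as after reset
    out = []
    for din in samples:
        sum2 = din + s0          # input plus first sample in the pipeline
        sum4 = sum2 + s1 + s2    # plus the last two samples in the pipeline
        if taps == 0b00:
            dout = din           # integrator bypassed
        elif taps == 0b11:
            dout = sum4 >> 2     # divide by four by dropping two LSBs
        else:
            dout = sum2 >> 1     # divide by two by dropping one LSB
        out.append(dout)
        s2, s1, s0 = s1, s0, din  # shift the pipeline
    return out


if __name__ == "__main__":
    print(integrate([4, 8, 4, 8], taps=0b01))
    print(integrate([4, 4, 4, 4, 4], taps=0b11))
```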
The integrator is needed for obtaining good time resolution in the bunch crossing assignment
when there is noise present. Its importance grows with the shaping time. The
process shifts the signal frequency spectrum toward lower frequencies and reduces the pulse
height as a side effect. The severity of these side effects depends on the shaping time in the
analog shaper. With long shaping times, the effects are barely noticeable. Without the noise
treatment the Constant Fraction Discriminator will give poorer time resolution and is likely
to produce false triggers on noise even with a moderate level of noise. On the other hand, with
a very low level of noise, integration will have an adverse effect on time resolution. The
integrator has been made programmable so that the optimal level of integration may be chosen.
Table 5.3.: Integrator inputs, outputs and variables from configuration registers.
Data in and out
I/O     Name  Width  Description
Input   Din   13     Data input from previous block. Signed two's complement.
Output  Dout  13     Data output to next block. Signed two's complement.
Register variables
I/O     Name  Width  Description
Input   taps  2      Selects whether integration is off, first order or second order. Corresponding values for taps are 00, (01 or 10) and 11.
5.4. Digital Shaper
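[Figure 5.7: block diagram. Labels: three pole-zero filter instances f1, f2 and f3 (ports in, K, L and result) with coefficients K1, L1, K2, L2, K3 and L3, intermediate results r1, r2 and r3, a register (Z-1, reset by RstB), a multiplexer selected by sel_filt, input filt_in and output filt_out. All buses are 13 bits wide.]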
Figure 5.7.: Digital Shaper block diagram
The Digital Shaper (DS) is a 3rd order pole-zero ﬁlter (see Fig. 5.7). It is adapted from
S-Altro with only one change. It was downscaled from 4th order cascade to 3rd order. The
transfer function of the ﬁlter was derived in chapter 2.4 on page 13:
\[ H(z) = \frac{1 - L_1 z^{-1}}{1 - K_1 z^{-1}} \cdot \frac{1 - L_2 z^{-1}}{1 - K_2 z^{-1}} \cdot \frac{1 - L_3 z^{-1}}{1 - K_3 z^{-1}} \tag{5.1} \]
The three zeros Li and poles Ki adjust the passband of the filter. As a rule of thumb,
when the value for the poles is increased, the signal tail is shortened, and when the value for
the zeros is increased, the peaking time is increased.
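Each factor of the transfer function in equation 5.1 corresponds to the difference equation y[n] = x[n] - L·x[n-1] + K·y[n-1]. The following Python sketch applies three such sections in cascade using floating point numbers. It illustrates the mathematics only; it does not reproduce the transposed direct form or the fixed-point arithmetic of the hardware, and the function names are hypothetical.

```python
def pole_zero_section(x, zero, pole):
    """First order section with H(z) = (1 - zero*z^-1) / (1 - pole*z^-1)."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        yn = xn - zero * x_prev + pole * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def digital_shaper(x, zeros, poles):
    """Cascade of first order sections, one per (zero, pole) pair."""
    for zero, pole in zip(zeros, poles):
        x = pole_zero_section(x, zero, pole)
    return x


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    # A single pole at 0.5 with the zero at 0 gives a geometric tail.
    print(pole_zero_section(impulse, 0.0, 0.5))
    # Equal zeros and poles cancel, so the signal passes unchanged.
    print(digital_shaper([1.0, 2.0, 3.0, 0.0], (0.5, 0.25, 0.125), (0.5, 0.25, 0.125)))
```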
The pole-zero filter in Fig. 5.8 is a type of transposed direct form filter. The optimal form
of the filter has been carefully studied for S-Altro and the probability of overflow has been
reduced compared to ALTRO [20].
The values for zeros Li and poles Ki can be within the range [0,1[. As the Verilog language
does not support floating point arithmetic, the variables need to be expressed as integers.
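[Figure 5.8: block diagram. Labels: inputs in, L and K, multipliers mult1 and mult2 (ports N, P and R) with products m1 and m2, an adder and a subtractor, a register (Z-1, reset by RstB), internal signals a and c, output result. All buses are 13 bits wide.]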
Figure 5.8.: Block diagram of Pole-zero ﬁlter in DS.
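[Figure 5.9: block diagram. Labels: inputs N and P (13 bits), sign bit N[12], a multiplexer, 26-bit products, >>13 shifts, internal labels C2, AX and temp, output R (13 bits).]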
Figure 5.9.: Multiplication in DS.
The trick is to use an integer variable in the arithmetic operation and then divide the result
with a bit shift. In the multiplication block (Fig. 5.9) the signal and the variable (K
or L) are first multiplied and then the 13 LSBs of the result are shifted away. As the variables
K and L are 13 bits wide, the result is the same as multiplying with a floating point number
that has a non-negative value below one. Strictly speaking, the variables K_{i,var} in the filter
are not the same as the poles K_{i,eq} in equation 5.1. They follow the relation
K_{i,eq} = K_{i,var} / 2^{13}.
The same applies to the zeros.
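The integer multiplication followed by a 13 bit shift can be sketched in Python as below. The helper names are hypothetical, and multiplying the magnitude and restoring the sign afterwards is an assumption about how negative samples are handled; the sketch is meant to show the relation between the register value and the coefficient in equation 5.1, not the exact rounding of the hardware.

```python
FRAC_BITS = 13  # K and L are 13 bits wide, so K_eq = K_var / 2**13


def to_register(coefficient):
    """Convert a coefficient in [0, 1) to its 13 bit integer register value."""
    if not 0.0 <= coefficient < 1.0:
        raise ValueError("coefficient must be in the range [0, 1)")
    return int(coefficient * (1 << FRAC_BITS))


def fixed_multiply(sample, k_var):
    """Multiply a signed sample by K_var / 2**13 using integers only."""
    magnitude = (abs(sample) * k_var) >> FRAC_BITS  # shift away the 13 LSBs
    return -magnitude if sample < 0 else magnitude  # assumed sign handling


if __name__ == "__main__":
    k = to_register(0.75)
    print(k, fixed_multiply(100, k), fixed_multiply(-100, k))
```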
It is debatable whether a digital shaper is even needed in GdSP. The GEM detector does
not suffer from long ion tails as the TPC does. A signal that has peaking and decay times of the
same order of magnitude has proven challenging for digital shaping. One of the ideas
for the DS has been to use it to shape the signal directly from the preamplifier. In any case
the signal would benefit from shortened pulses.
Table 5.4.: Digital Shaper inputs, outputs and variables from configuration registers.
Data in and out
I/O     Name      Width  Description
Input   filt_in   13     Data input from previous block. Signed two's complement.
Output  filt_out  13     Data output to next block. Signed two's complement.
Register variables
I/O     Name      Width  Description
Input   sel_filt  1      Enable filter
Input   L1        13     First order zero
Input   L2        13     Second order zero
Input   L3        13     Third order zero
Input   K1        13     First order pole
Input   K2        13     Second order pole
Input   K3        13     Third order pole
5.5. Constant Fraction Discriminator
[Figure 5.10: block diagram with three parts labelled CFD, Threshold comparison and Hysteresis. Labels: input Din, registers (Z-1, reset by RstB), variables a, Thrsh and merge, internal signals Sn, Sn_a[16:3], frelation, fthrsh, f1, f2, flag_r and flag_merge, outputs trigg, amplitude and flag.]
Figure 5.10.: Constant Fraction Discriminator block diagram. The ﬁlter consists of the actual
CFD logic, threshold comparison that separates the signal from noise and ﬂag
merging that adds hysteresis to the comparators.
The Constant Fraction Discriminator (CFD) in Fig. 5.10 contains a relational comparison
between two sequential samples, a threshold comparison and a hysteresis block. The relational
operation is the core of the CFD. The comparison is true at a certain fraction of the pulse
height. The fraction can be adjusted with the variable a. The idea behind the filter was
discussed in depth in chapter 3.4.4.
The multiplication with the variable a follows the same technique for introducing decimal
points that was described for the DS multiplication block (Fig. 5.9). The variable effectively
has a maximum value of 15.875 and a minimum step of 0.125. In other words, it ranges
approximately from zero to 16 with one decimal digit of precision. This was the minimum
requirement: the decimal digits are needed with the longer shaping times, where the filter is
very sensitive to noise on the signal. The digital noise would be reduced if more decimals were
added, but the benefits of reduced digital noise would most likely be
overshadowed by the sensitivity to other noise sources.
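The stated maximum of 15.875 and step of 0.125 are what a 7 bit register with three fractional bits gives, since 127/8 = 15.875 and 1/8 = 0.125. The Python sketch below shows this encoding together with the scaling of a sample. The names are hypothetical, and the sketch deliberately does not model which two samples are compared or in which direction, since that is defined in the earlier chapter and not repeated here.

```python
A_BITS = 7       # width of the register variable a (Table 5.5)
A_FRAC_BITS = 3  # assumed number of fractional bits, giving a step of 1/8


def a_to_value(a_reg):
    """Effective multiplier represented by the integer register value."""
    if not 0 <= a_reg < (1 << A_BITS):
        raise ValueError("a must fit in 7 bits")
    return a_reg / (1 << A_FRAC_BITS)


def scale_sample(sample, a_reg):
    """Multiply a sample by a using integers and drop the three LSBs."""
    return (sample * a_reg) >> A_FRAC_BITS


if __name__ == "__main__":
    print(a_to_value(127), a_to_value(1))  # largest value and smallest step
    print(scale_sample(100, 20))           # a = 2.5
```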
The threshold comparison discerns signal from noise. Requiring both that the two samples
have the right relation and that the signal is above threshold ensures that a trigger is given
only on meaningful signal.
On a threshold crossing there tends to be jitter when enough noise is present. Using
hysteresis to eliminate the jitter is common practice in analog electronics. This kind of
hysteresis is emulated by merging small gaps in the flag. The flag given by the comparison
of two samples stays up from the point when the condition for the slope is satisfied until the
end of the pulse (when the effects of noise are eliminated). The threshold comparison flag
is up whenever the signal is above the threshold. Combining these flags with AND gives
a flag that goes up when the condition for the pulse slope is satisfied and returns down when
the signal goes below threshold. Any gaps in the flag are caused by noise. These gaps will
cause false triggers if they are not treated. Merging short gaps in the flag was found to be
an effective way to eliminate the noise-induced triggers. After the merging, the resulting flag
is clipped to a trigger signal that is one clock cycle long.
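The gap merging and clipping steps can be illustrated with the Python sketch below. It works on a complete list of flag values at once, whereas the hardware has to do the same with a short pipeline and a fixed delay, so it shows the intended effect rather than the implementation. The function names and the max_gap parameter (standing in for the merge setting) are assumptions.

```python
def merge_gaps(flag, max_gap):
    """Fill gaps of at most max_gap samples that lie between two raised flags."""
    merged = list(flag)
    n = len(merged)
    i = 0
    while i < n:
        if merged[i]:
            i += 1
            continue
        j = i
        while j < n and not merged[j]:
            j += 1
        # The gap covers indices i..j-1. Fill it only if it is bounded by
        # raised flags on both sides and is short enough.
        if i > 0 and j < n and (j - i) <= max_gap:
            for k in range(i, j):
                merged[k] = True
        i = j
    return merged


def clip_to_trigger(flag):
    """Keep only the first sample of every raised stretch (one clock long)."""
    trigger = []
    previous = False
    for f in flag:
        trigger.append(f and not previous)
        previous = f
    return trigger


if __name__ == "__main__":
    # Combined slope AND threshold flag with a one-sample gap caused by noise.
    flag = [False, False, True, True, False, True, True, True, False, False]
    print(sum(clip_to_trigger(flag)))                 # triggers without merging
    print(sum(clip_to_trigger(merge_gaps(flag, 1))))  # triggers with merging
```

Without merging, the noise gap splits one pulse into two raised stretches and therefore two triggers. With merging, only the first rising edge produces a trigger.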
The amplitude given by the CFD is only indicative. Amplitude information was not
foreseen to be used for the GEM detectors and amplitude extraction is only an uncalibrated
byproduct of the filter. The amplitude is proportional to the pulse height and should be
adequate for a center of gravity calculation for better spatial resolution. If precise amplitude
information were desired (it might be needed for applications such as calorimetry or energy
loss measurements), an additional amplitude extraction filter should be designed. For
example, a weighted sum of three samples following the trigger might give the desired result.
In the simulations in section 4, CFD and Pulse Recognition (PuR) were almost identical
in delivered time resolution and required resources on chip. By itself CFD is a much lighter
algorithm, but to operate reliably when noise is present it needs to be accompanied by an
integrator. The only remaining arguments in favor of CFD are the space required in the
configuration registers and the ease of use. CFD has only one parameter that needs to be
calibrated. In contrast, PuR has four of them. Even when the one variable needed for the
integrator is added, the number of parameters is still halved compared to PuR.
Table 5.5.: Constant Fraction Discriminator inputs, outputs and variables from configuration registers.
Data in and out
I/O     Name       Width  Description
Input   Din        13     Data input from previous block. Signed two's complement.
Output  amplitude  10     Pulse amplitude information. Gives pulse amplitude simultaneously to trigger, otherwise zero.
Output  trigg      1      Trigger signal.
Register variables
I/O     Name       Width  Description
Input   a          7      Relation between two sequential samples at the moment the trigger is given.
Input   Thrsh      10     Threshold between noise and meaningful signal.
Input   merge      2      Level of hysteresis applied to threshold crossing.
5.6. Zero-Suppression
[Figure 5.11: simplified block diagram. Labels: input Din (11 bits), Offset, signed-to-unsigned conversion (11 to 10 bits), comparison cmp against thrd, data pipeline (z-11, registers z10 to z0), flag pipeline (z-11) with sequence mask pipeline (z-4), presample mask pipeline (z-4), flag merger pipeline (z-3) and postsample counter (z-1), configuration inputs seq_mask, premask and postmask, internal flags q0, f0, f1, f2, fx and m0, outputs Dout (10 bits) and flag.]
Figure 5.11.: Zero Suppression simpliﬁed block diagram.
The Zero Suppression (ZS) filter block diagrams can be found in Fig. 5.11-5.15. The filter
has been extracted from S-Altro completely unchanged and is the same version as used in
ALTRO. The reasoning behind zero suppression and the basic properties of the ZS were
discussed briefly in chapter 3.5.
[Figure 5.12: block diagram. Labels: cmp, seq_mask[0], seq_mask[1], q0, q1, q2, q3 and four registers (Z-1, reset by RstB).]
Figure 5.12.: Block diagram of sequence mask pipeline in ZS.
[Figure 5.13: block diagram. Labels: q0, premask[0], premask[1], f0, f1, f2, f3, fx and four registers (Z-1, reset by RstB).]
Figure 5.13.: Block diagram of pre-sample mask pipeline in ZS.
Two alternatives were considered for the filter outputs. The chosen option was delayed,
but otherwise completely unaltered, signal data together with a flag marking the meaningful
data. This method gives more possibilities for the data formatter after the SRAM, as all
of the data is available. The other alternative was to have only one output that contained
zero suppressed data. Outside the meaningful data the signal would be forced to zero. This
method would need a slightly smaller SRAM.
[Figure 5.14: block diagram. Labels: f0, m0, m1, m2 and three registers (Z-1, reset by RstB).]
Figure 5.14.: Block diagram of Flag merger pipeline in ZS.
[Figure 5.15: block diagram. Labels: f1, f2, fx, postmask[2:0], pstscnt, a register (Z-1, reset by RstB), a decrement (-1), a comparison with zero and two multiplexers.]
Figure 5.15.: Block diagram of Post-sample counter in ZS.
The ZS contains options for glitch filtering (sequence mask pipeline), use of pre- and
post-samples and merging of two clusters. The principles of the flag formation were illustrated
in Fig. 3.16 on page 33. The glitch filter (see Fig. 5.12) removes the flag from pulses that have
only 1, 2 or 3 samples above threshold, depending on the value assigned to the variable seq_mask.
The pre-sample mask (see Fig. 5.13) adds samples to the flag before the threshold crossing.
Including pre-samples allows values to be read out starting from the baseline value. The number
of pre-samples is given by the variable premask. The maximum number of pre-samples is three. The
post-sample counter (see Fig. 5.15) adds a maximum of seven post-samples after the signal returns
below threshold. In addition to reading out the entire pulse, the use of post-samples allows
a possible undershoot after the pulse to be caught. The flag merger (see Fig. 5.14) merges two
flag clusters if there are only one or two samples between them. In data formatting, a time tag
and a channel label are added to the data samples. The added labels are usually both 10 bit
words. Instead of sending the extra labels, the readout of extra data samples is preferred
when the number of communicated bits is the same.
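The flag formation described above can be sketched offline in Python. This is a simplified model and not the pipelined ZS implementation: the order in which the steps are applied, the strict "greater than" comparison and all function and parameter names are assumptions. The parameters min_len, pre, post and merge_gap stand in for the effect of seq_mask, premask, postmask and the flag merger.

```python
def runs(flag):
    """Return (start, end) index pairs, end exclusive, of consecutive True values."""
    found, start = [], None
    for i, f in enumerate(flag):
        if f and start is None:
            start = i
        elif not f and start is not None:
            found.append((start, i))
            start = None
    if start is not None:
        found.append((start, len(flag)))
    return found


def zs_flag(samples, thrd, min_len=1, pre=0, post=0, merge_gap=0):
    """Flag the samples that would be kept by a zero suppression step."""
    n = len(samples)
    above = [s > thrd for s in samples]
    # Glitch filter: drop clusters with fewer than min_len samples above threshold.
    clusters = [(a, b) for a, b in runs(above) if b - a >= min_len]
    # Extend every cluster by pre-samples and post-samples.
    clusters = [(max(0, a - pre), min(n, b + post)) for a, b in clusters]
    # Merge clusters separated by at most merge_gap samples.
    merged = []
    for a, b in clusters:
        if merged and a - merged[-1][1] <= merge_gap:
            merged[-1] = (merged[-1][0], max(b, merged[-1][1]))
        else:
            merged.append((a, b))
    flag = [False] * n
    for a, b in merged:
        for i in range(a, b):
            flag[i] = True
    return flag


if __name__ == "__main__":
    data = [0, 0, 5, 0, 0, 0, 6, 7, 8, 0, 0, 9, 9, 0, 0, 0]
    print([i for i, f in enumerate(zs_flag(data, 3)) if f])
    print([i for i, f in enumerate(zs_flag(data, 3, min_len=2, merge_gap=2)) if f])
```

In the second call the single sample at index 2 is removed as a glitch and the two remaining clusters, which are two samples apart, are merged into one.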
Table 5.6.: Zero Suppression inputs, outputs and variables from configuration registers.
Data in and out
I/O     Name      Width  Description
Input   Din       11     Data input from previous block. Signed two's complement.
Output  Dout      10     Data out. Delayed unsigned data.
Output  flag      1      Flag marking meaningful signal.
Register variables
I/O     Name      Width  Description
Input   Offset    10     Offset added to the signal before converting to unsigned.
Input   thrd      10     Threshold between noise and meaningful signal.
Input   seq_mask  2      Minimum number of samples above threshold for glitch filter.
Input   postmask  3      Number of post-samples flagged after returning below threshold.
Input   premask   2      Number of pre-samples flagged before threshold crossing.
6. Conclusions
As no front-end chip exists that would exactly meet the CMS GEM requirements, a novel
chip needed to be designed. Two approaches were considered: VFAT3 with analog signal
processing and GdSP with both analog and digital signal processing (DSP).
The design of the DSP in the GdSP was greatly affected by the properties of the GEM
signal. The irregularities in the signal shape lead to long shaping times in analog shaping.
This in turn has to be taken into consideration in the DSP. Some of the requirements originate
in the CMS experiment set-up. The experiment has to operate for long periods of time and it
takes data continuously. This puts extra significance on delivering as sparse data as possible.
It also increases the probability that the chip has to be reset while taking data. Resetting
in these conditions should be part of normal operation. It should not cause, for instance,
readout of disturbances in the signal that were caused by the reset.
The DSP was designed to have two modes of operation. In tracker mode, only binary
timing information and optional pulse amplitude are read out. In waveform mode, the whole
pulse is read out. The DSP required methods for baseline correction, digital shaping, noise
filtering, zero suppression and time pick-off. For three of these methods, existing filter
models could be used. These filters were migrated from the S-Altro chip. Two of them were
changed to a varying degree in the adaptation process. Two filters needed to be designed
from scratch. An integrator was used for high frequency noise filtering. For time pick-off,
several different methods were considered. The Constant Fraction Discriminator (CFD) proved
to be the best alternative.
The migrated blocks were verified with simulations. Different simulation approaches were
used to find the best time pick-off algorithm. The design of the DSP chain with all the filters
was verified with simulations.
Of the time pick-off algorithms, CFD and Pulse Recognition (PuR) gave similar time
resolution. When CFD is used with noise filtering and PuR without, CFD and PuR are of the same
size. The only difference is in the number of variables, which is considerably higher for PuR.
Piece-Wise Linear Fitting would be a suitable option when high accuracy is needed and a
lower number of channels on chip is required.
The DSP sets high requirements for the ADC. The chip has a minimum of 64 channels and one
ADC per channel. Due to the high number of channels the power consumption per channel
needs to be low. In addition to the requirement for low power, the ADC needs to have a high
ENOB. Time resolution comparable to analog methods would demand an ENOB above 9. An
ADC with the combination of low power and high precision at the relatively high 40 MHz
sampling frequency does not presently exist for the CMOS 130 nm process. A novel SAR-ADC
meeting the criteria is being developed. At a checkpoint in 2013 the SAR-ADC had not reached
sufficient ENOB with 40 MHz sampling frequency. The GdSP development was put on hold and the
development was focused on VFAT3.
Acknowledgements
In addition to my supervisors and thesis reviewers, several people have helped on the way.
I would like to thank the Technical Student programme and especially Laura Saulnier at
CERN for making the internship possible. As I've understood, the CMS experiment is the correct
party to thank for the funding.
The whole PH-ESE-ME section at CERN was helpful and eager to answer my questions.
Special thanks go to Massimiliano De Gaspari for discussing the peculiarities of the S-Altro
chip. Eduardo García, who designed the S-Altro DSP, was kind enough to visit CERN and
answer my questions even though he no longer worked there.
Christian Lippmann was kind enough to share experiences with using ALTRO on the
ALICE TPC. The feedback had a great impact on the design and hopefully made it much
better.
I asked for a wish list for front-end DSP from people I came into contact with during
the design development, mainly in the CMS GEM collaboration and GEM users at the University of
Helsinki. Thank you for sharing your ideas and opinions.
My family and friends have been great at cheering throughout the writing process. Thanks
to my officemate at CERN, Marko, for teaching me the importance of cat memes and
videos. And last but not least, thanks to Henri Riihimäki, who has stood by me throughout
the process.
Bibliography
[1] The CMS Collaboration (2008): The CMS experiment at the CERN LHC, JINST 3,
S08004
[2] M. Tytgat et al. (2013): Status of the Triple-GEM Project for the Upgrade of the CMS
Muon System, Proc. of MPGD2013, Zaragoza, Spain, July 2013
[3] A. Sharma et al. (2011): An overview of the design, construction and performance
of large area triple-GEM prototypes for future upgrades of the CMS forward muon
system, Proc. of MPGD2011, Kobe, Japan, Aug. 2011
[4] F. Sauli (1997): GEM: A new concept for electron amplification in gas detectors, Nucl.
Instrum. Methods A386, 531-534
[5] A. Sharma (2012): A GEM Detector System Upgrade of the High-η Muon Endcap
Stations GE1/1 + ME1, IVth CMS GEM Workshop, CERN, Nov. 2012
[6] G.F. Knoll (2010): Radiation Detection and Measurement, John Wiley & Sons
[7] B. Ketzer et al. (2001): GEM detectors for COMPASS, IEEE Trans. Nucl. Sci. 48,
1065
[8] The TOTEM Collaboration (2008): The TOTEM Experiment at the CERN Large
Hadron Collider, JINST 3, S08007
[9] The LHCb Collaboration (2008): The LHCb Detector at the LHC, JINST 3, S08005
[10] M. Ziegler, P. Cwetanski and U. Straumann (June 1999): A triple GEM detector for
LHCb, LHCb internal note TRAC 99-024
[11] CMS GEMs Collaboration (2012): A GEM Detector System for an Upgrade of the
CMS Muon Endcaps, Technical Proposal, CMS IN 2012/001, CERN
[12] A. Marinov (2012): GEM PCB Development, GEM Tests & 904 Test Facility,
IVth CMS GEM Workshop, CERN, Nov. 2012
[13] P. Moreira et al. (2010): The GBT SerDes ASIC prototype, Published in J. Instrum.
5 C11016, presented at Topical Workshop on Electronics for Particle Physics 2010,
Aachen, Germany, Sep. 2010
[14] P. Vichoudis et al. (2010): The Gigabit Link Interface Board (GLIB), a flexible system
for the evaluation and use of GBT-based optical links, Published in J. Instrum. 5
C110167, presented at TWEPP2010, Aachen, Germany, Sep. 2010
[15] Image source: http://en.wikipedia.org/wiki/File:Integrated_circuit_design.png,
retrieved 13 March 2014
[16] R. Turchetta et al. (2001): Design and results from the APV25, a deep sub-micron
CMOS front-end chip for the CMS tracker, Nucl. Instrum. Methods A 466, 359-365
[17] P. Aspell et al. (2007): VFAT2: A front-end system on chip providing fast trigger
information and digitized data storage for the charge sensitive readout of multi-channel
silicon and gas particle detectors, TWEPP2007, Prague, Czech Republic, Sep. 2007
[18] M. Alfonsi et al. (2007): Production and performance of LHCb triple-GEM detectors
equipped with the dedicated CARDIAC-GEM front-end electronics, Nucl. Instrum.
Methods A572, 12-13
[19] W. Bonivento et al. (2005): Design and performance of the front-end electronics of the
LHCb Muon Detector, Presented at LECC, Heidelberg, Germany, Sep. 2005
[20] E. García (2012): Novel Front-end Electronics for Time Projection Chamber Detectors,
Ph.D. Thesis, Universidad Politécnica de Valencia
[21] B. Mota (2003): Time-Domain Signal Processing Algorithms and their Implementation
in the ALTRO chip for the ALICE TPC, Ph.D. Thesis, Ecole Polytechnique Fédérale
de Lausanne
[22] J. Moroń (2013): Development of variable sampling rate low power 10-bit SAR ADC
in 130 nm IBM technology, TWEPP2013, Perugia, Italy, Sep. 2013
[23] Matlab, The Language of Technical Computing,
http://www.mathworks.se/products/matlab/, visited 23.8.2014
[24] Simulink, Simulation and Model-Based Design,
http://www.mathworks.se/products/simulink/, visited 23.8.2014
[25] D. K. Tala: Verilog tutorial, http://www.asic-world.com/verilog/veritut.html, visited
1.6.2012 - 23.8.2014
[26] V. Radeka (2011): Signal Processing for Particle Detectors, in Schopper, H. and
Fabjan, C. (ed.): Elementary Particles, Subvolume B: Detectors for Particles and Radiation,
Springer
[27] P. Aspell (2012): GEMs for CMS, From an electronics perspective, Seminar, CERN,
May 2012
[28] F. Guilloux (2012): GdSP/VFAT3 ASIC, CFE analogue prototype, presentation at
GEMs for CMS Electronics meeting, CERN, Oct. 2012
[29] A.V. Oppenheim, R.W. Schafer (1989): Discrete-Time Signal Processing, Prentice Hall
[30] W. Kester (2003): Mixed-signal and DSP Design Techniques, Elsevier Science
[31] T. O'Haver: Resolution enhancement (Peak Sharpening),
http://terpconnect.umd.edu/~toh/spectrum/ResolutionEnhancement.html, visited
25.4.2014
[32] V. Buzuloiu (1992): A fast and precise peak finder for the pulses generated by future
HEP detectors, Proc. CHEP'92 2, 827-831, Sep. 1992
[33] P. Bloch and E. Tournefier (1999): BC assignment and charge reconstruction with
voltage sampling Preshower electronics, Preshower Internal document, CERN, March
1999
[34] S. Gadomski et al. (1992): The deconvolution method of fast pulse shaping at hadron
colliders, Nucl. Instrum. Methods Phys. Res. A320, 217-227
[35] I. Brawn et al. (1995): Bunch-Crossing Identification for the ATLAS First-Level
Calorimeter Trigger, 293-296. 18. CERN-LHCC-95-56, Oct 1995
[36] Th. Maerschalk, G. De Lentdecker, G. Mullier (2013): Timing Resolution Techniques
- TOT and CFD - and Fast Simulation, VIth CMS GEM Workshop, CERN, May 2013
[37] J. Kral (2012): L0 trigger for the EMCal detector of the ALICE experiment, Nucl.
Instrum. Methods A693, 261-267
[38] CADENCE NCLAUNCH TUTORIAL, http://www.ee.virginia.edu/~mrs8n/cadence/nclaunchtut.pdf,
visited 18.8.2014
Appendices
A. Acronym Glossary
ADC Analog to Digital Converter.
ALICE A Large Ion Collider Experiment.
ALTRO ALICE TPC Read Out. Signal processing and read out front end chip developed
for ALICE's TPC-detector. Used extensively throughout ALICE.
AMC Advanced Mezzanine Cards.
APV Analogue Pipeline Voltage mode.
ASIC Application Specific Integrated Circuit.
ATLAS A Toroidal LHC ApparatuS experiment.
BC Baseline Correction.¹
BX Bunch Crossing.¹
BXID Bunch Crossing IDentification.
CARIOCA CERN and Rio Current-mode Amplifier.
CFD Constant Fraction Discriminator. A pulse time pick-off method with both analog and
digital implementations.
CMS Compact Muon Solenoid.
COMPASS Common Muon and Proton Apparatus for Structure and Spectroscopy.
CSC Cathode Strip Chamber.
CERN Conseil Européen pour la Recherche Nucléaire = European Laboratory for Particle
Physics.
DAQ Data Acquisition.
DSP Digital Signal Processing.
DT Drift Tube.
ENC Equivalent Noise Charge.
ENOB Effective Number Of Bits.
FE Front-end.
FIR Finite Impulse Response.
¹ In the literature BC sometimes refers to bunch crossing. Here, for clarity, BX is used for
Bunch Crossing and BC for Baseline Correction.
FPGA Field Programmable Gate Array.
GBT GigaBitTransceiver.
GdSP Gas detector/digital Signal Processing.
GEM Gas Electron Multiplier.
GLIB Gigabit Link Interface Board.
HEP High Energy Physics.
HDL Hardware Description Language.
IIR Infinite Impulse Response.
LHC Large Hadron Collider.
LHCb Large Hadron Collider beauty.
LUT Look-Up Table.
LSB Least Significant Bit.
LTI Linear Time-Invariant.
LVDS Low-Voltage Differential Signaling.
MA Moving Average.
MAF Moving Average Filter.
MAU Moving Average Unit.
MIP Minimum Ionizing Particle.
MSB Most Significant Bit.
PCB Printed Circuit Board.
PWLF Piece-Wise Linear Fitting.
VFAT Very Forward Atlas Totem. Microelectronics front-end chip for tracking and
triggering. Developed for and used mainly in Totem.
rms Root mean square.
RPC Resistive Plate Chamber.
RTL Register Transfer Level.
S-Altro Super-ALTRO. Microelectronics front end chip for amplifying, digitizing, processing
and reading out detector data. Designed with gas detectors in mind. Updated version
of the Altro chip used in ALICE.
SAR Successive Approximation Register.
SINAD SIgnal to Noise And Distortion ratio.
SNR Signal to Noise Ratio.
S/N Signal to Noise Ratio.
SRAM Static Random Access Memory.
ToT Time over Threshold.
TOTEM Total Cross Section, Elastic Scattering and Diffraction Dissociation.
TPC Time Projection Chamber.
TTC Timing, Trigger and Control.
uTCA Micro Telecommunications Computing Architecture (MicroTCA). Electronics system
for telecommunication. Micro refers to the smaller size of the system compared to its
predecessors.
ZCI Zero-Crossing Identification.
B. Block diagrams of S-Altro DSP filters
The block diagrams correspond to the Verilog code of the S-Altro prototype that was
available. They differ in some parts from the block diagrams in the thesis of Eduardo García [20].
B.1. First Baseline Correction
Figure B.1.: First Baseline Correction
Figure B.2.: IIR filter
B.2. Digital Shaper / Tail Cancellation Filter
Figure B.3.: Digital Shaper. Cascade of four first order pole-zero filters (stages f1 to f4
with coefficients K1 to K4 and L1 to L4, input filt_in, output filt_out).
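The cascade of Figure B.3 can be illustrated with a short floating-point sketch. It assumes that every stage implements the difference equation y[n] = x[n] - L·x[n-1] + K·y[n-1], which is the usual form of an ALTRO-type tail cancellation stage; the S-Altro code itself works on 13-bit fixed-point words, and the function names and coefficient values below are illustrative only.

```python
def pole_zero_stage(samples, K, L):
    """First-order pole-zero filter: y[n] = x[n] - L*x[n-1] + K*y[n-1].

    Assumed form of one stage; state starts from zero.
    """
    out = []
    x_prev = 0.0
    y_prev = 0.0
    for x in samples:
        y = x - L * x_prev + K * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out


def digital_shaper(samples, coeffs):
    """Run the samples through a cascade of stages, one (K, L) pair per stage."""
    for K, L in coeffs:
        samples = pole_zero_stage(samples, K, L)
    return samples


# An exponential tail 0.5**n is cancelled by a zero placed at L = 0.5.
tail = [0.5 ** n for n in range(6)]
print(pole_zero_stage(tail, K=0.0, L=0.5))
```

Placing the zero of a stage on top of a pole of the incoming pulse removes that exponential component, which is why a cascade of such stages can shorten a long signal tail.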
Figure B.4.: First order pole-zero filter used in digital shaper.
Figure B.5.: Custom multiplication operation used in Digital shaper. Input P is assumed to
be always positive, whereas N can be positive or negative.
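The sign handling described in the caption of Figure B.5 might look as follows in software. This is only one plausible reading of the diagram: it assumes that a negative N is negated before the multiplication, that the upper half of the double-width product is kept, and that the sign is restored afterwards. The function name and the width parameter are hypothetical.

```python
def scaled_multiply(P, N, width=13):
    """Multiply unsigned P by signed N and keep the upper `width` bits.

    Assumed behaviour: the product of the magnitudes is truncated, so the
    result is rounded towards zero for both signs of N.
    """
    assert 0 <= P < (1 << width), "P is assumed to be non-negative"
    magnitude = -N if N < 0 else N
    product = P * magnitude        # up to 2*width bits wide
    result = product >> width      # keep the upper half of the product
    return -result if N < 0 else result


print(scaled_multiply(4096, 100))   # 4096 / 2**13 = 0.5, so half of 100
print(scaled_multiply(4096, -101))  # magnitude is truncated before negation
```

Working on the magnitude makes the truncation symmetric around zero, whereas an arithmetic right shift of a negative two's complement product would round towards minus infinity.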
B.3. Second Baseline Correction
Figure B.6.: Second Baseline Correction abstracted block diagram.
Figure B.7.: Double threshold scheme.
Figure B.8.: Moving Average Filter control logic.
Figure B.9.: Moving Average Filter.
B.4. Zero Suppression
Figure B.10.: Zero suppression abstracted block diagram (input Din, threshold thrd, output
Dout; sequence mask pipeline, presample mask pipeline, postsample counter, flag merger
pipeline, data pipeline and flag pipeline).
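The basic idea behind the zero suppression of Figure B.10 can be sketched as follows: samples above the threshold are kept together with a configurable number of samples before and after them. The sketch leaves out the sequence mask and the flag merger, works on a complete list instead of a clocked pipeline, and assumes a strict comparison against the threshold. The function and parameter names are illustrative.

```python
def zero_suppress(samples, threshold, presamples=0, postsamples=0):
    """Return one keep/discard flag per sample.

    A sample is kept if it exceeds the threshold, or if it lies within
    `presamples` before or `postsamples` after such a sample.
    """
    n = len(samples)
    keep = [False] * n
    for i, s in enumerate(samples):
        if s > threshold:
            first = max(0, i - presamples)
            last = min(n, i + postsamples + 1)
            for j in range(first, last):
                keep[j] = True
    return keep


data = [0, 1, 5, 6, 1, 0, 0, 0]
flags = zero_suppress(data, threshold=3, presamples=1, postsamples=2)
print([s for s, k in zip(data, flags) if k])
```

In hardware the presamples require the data to be delayed relative to the flags, which is one reason for a data pipeline that is longer than the flag logic.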
Figure B.11.: Sequence mask pipeline.
Figure B.12.: Presample mask pipeline.
Figure B.13.: Post sample counter.
Figure B.14.: Flag merger pipeline.
C. Block diagrams of Simulink models
Simulink models of the time pick-off methods, the integrator and the peak sharpener were used
in the simulations described in section 4.2. The block diagrams represent the models as they
were at the end of the simulations, when each model was either excluded as an option or
converted into a Verilog model.
C.1. Piece-wise Linear Fitting
Figure C.1.: Piece-Wise Linear Fitting block diagram (inputs Din and Thrsh; outputs trigger,
Amplitude and timewalk; constants a0 = 2.7, a1 = -4.5, a2 = 2.7, b0 = 0.015, b1 = -0.026,
b2 = 0.015)
C.2. Deconvolution method
Figure C.2.: Deconvolution filter block diagram (inputs Thrsh and Din; outputs Trigger and
Amplitude; constants v1 = 507, v2 = -5984, v3 = 59220)
C.3. Peak Finder
Figure C.3.: Peak Finder block diagram (inputs Din and Thrsh; outputs trigger and Amplitude)
C.4. Zero-Crossing Identification
Figure C.4.: Zero-Crossing Identification version A block diagram (inputs Din, Thrsh and
BX counter; outputs trigger, Amplitude and Tstamp)
Figure C.5.: Zero-Crossing Identification version B block diagram (inputs Din and Thrsh;
outputs trigger and Amplitude)
C.5. Pulse Recognition
Figure C.6.: Pulse Recognition (PuR) block diagram (inputs n, Din, Thrsh, BX counter, v1, v2
and v3; outputs Tstamp and Amplitude)
Figure C.7.: Deconvolution1 in PuR
Figure C.8.: Pipeline in PuR
C.6. Constant Fraction Discriminator
Figure C.9.: Constant fraction discriminator block diagram (inputs Din, Thrsh, a and
BX counter; outputs Tstamp and Amplitude)
C.7. Peak sharpening
Figure C.10.: Peak sharpener block diagram (inputs Din and k; output Dout)
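A minimal sketch of peak sharpening, assuming the model follows the resolution enhancement idea of [31], in which a scaled second derivative is subtracted from the signal. The centred difference, the untouched edge samples and the function name are assumptions made for this illustration; a clocked implementation would use delayed samples and therefore produce its output at least one sample later.

```python
def sharpen(samples, k):
    """Subtract k times the discrete second derivative from each sample.

    y[n] = x[n] - k * (x[n+1] - 2*x[n] + x[n-1]); the first and last
    samples are passed through unchanged.
    """
    out = list(samples)
    for n in range(1, len(samples) - 1):
        second_diff = samples[n + 1] - 2 * samples[n] + samples[n - 1]
        out[n] = samples[n] - k * second_diff
    return out


pulse = [0, 1, 4, 1, 0]
print(sharpen(pulse, k=1))  # the peak grows and the flanks undershoot
```

The second derivative is negative at the top of a pulse and positive on its flanks, so the subtraction raises the peak and narrows the pulse at the cost of undershoot and amplified noise.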
C.8. Integrator
Figure C.11.: Integrator block diagram (input Din; outputs Smooth2 and Smooth4)
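The two outputs of Figure C.11 are read here as running averages over the last two and the last four samples, which is an interpretation of the diagram rather than a statement from the text. Under that assumption the integrator can be sketched as below; the function name is illustrative and samples before the start of the record are taken to be zero.

```python
def smooth(samples, taps):
    """Causal running average over the last `taps` samples.

    Missing samples before the start of the record count as zero, as they
    would with delay registers that are reset to zero.
    """
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - taps + 1): n + 1]
        out.append(sum(window) / taps)
    return out


din = [4, 4, 4, 4]
print(smooth(din, 2))  # two-sample average
print(smooth(din, 4))  # four-sample average
```

With tap counts that are powers of two, the division reduces to a right shift in hardware.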
D. Block Diagram of Pulse Recognition
(Verilog Model)
Figure D.1.: Block diagram of Pulse Recognition.
Figure D.2.: Multiplication used in PuR.
