Instrumentation of CdZnTe detectors for measuring prompt gamma-rays emitted during particle therapy by Födisch, Philipp
From the National Center for Radiation Research in Oncology – OncoRay
Director: Prof. Dr. Michael Baumann
Instrumentation of CdZnTe detectors
for measuring prompt gamma-rays
emitted during particle therapy
Dissertation
submitted for the academic degree of
Doctor of Medical Technology
Doctor rerum medicinalium (Dr. rer. medic.)
presented to
the Faculty of Medicine Carl Gustav Carus
of Technische Universität Dresden
by
Dipl.-Ing. Philipp Födisch
from Wolfen
Dresden 2016

1st reviewer: Prof. Dr. rer. nat. habil. Wolfgang Enghardt
2nd reviewer: Prof. Dr.-Ing. habil. Uwe Hampel
Date of the oral examination: 12 May 2017
signed: PD Dr. rer. nat. habil. Steffen Löck
Chair of the doctoral committee

Contents
1. Introduction
1.1. Aim of this work
2. Analog front-end electronics
2.1. State-of-the-art
2.2. Basic design considerations
2.2.1. CZT detector assembly
2.2.2. Electrical characteristics of a CZT pixel detector
2.2.3. High voltage biasing and grounding
2.2.4. Signal formation in CZT detectors
2.2.5. Readout concepts
2.2.6. Operational amplifier
2.3. Circuit design of a charge-sensitive amplifier
2.3.1. Circuit analysis
2.3.2. Charge-to-voltage transfer function
2.3.3. Input coupling of the CSA
2.3.4. Noise
2.4. Implementation and Test
2.5. Results
2.5.1. Test pulse input
2.5.2. Pixel detector
2.6. Conclusion
3. Digital signal processing
3.1. Unfolding-synthesis technique
3.2. Digital deconvolution
3.2.1. Prior work
3.2.2. Discrete-time inverse amplifier transfer function
3.2.3. Application to measured signals
3.2.4. Implementation of a higher order IIR filter
3.2.5. Conclusion
3.3. Digital pulse synthesis
3.3.1. Prior work
3.3.2. FIR filter structures for FPGAs
3.3.3. Optimized fixed-point arithmetic
3.3.4. Conclusion
4. Data interface
4.1. State-of-the-art
4.2. Embedded Gigabit Ethernet protocol stack
4.3. Implementation
4.3.1. System overview
4.3.2. Media Access Control
4.3.3. Embedded protocol stack
4.3.4. Clock synchronization
4.4. Measurements and results
4.4.1. Throughput performance
4.4.2. Synchronization
4.4.3. Resource utilization
4.5. Conclusion
5. Experimental results
5.1. Digital pulse shapers
5.1.1. Spectroscopy application
5.1.2. Timing applications
5.2. γ-ray spectroscopy
5.2.1. Energy resolution of scintillation detectors
5.2.2. Energy resolution of a CZT pixel detector
5.3. γ-ray timing
5.3.1. Timing performance of scintillation detectors
5.3.2. Timing performance of CZT pixel detectors
5.4. Measurements with a particle beam
5.4.1. Bremsstrahlung Facility at ELBE
6. Discussion
7. Summary
8. Zusammenfassung
Appendices
A. Waveform diagrams
B. Rear Transition Module
List of acronyms and abbreviations
3D: Three-dimensional
ADC: Analog-to-digital converter
ALM: Adaptive Logic Module
ALU: Arithmetic logic unit
AMC: Advanced mezzanine card
ARP: Address Resolution Protocol
ASIC: Application-specific integrated circuit
BGO: Bi₄Ge₃O₁₂, bismuth germanate
CAR: Cathode-over-anode ratio
CFD: Constant fraction discriminator
COTS: Commercial off-the-shelf
CSA: Charge-sensitive amplifier
CZT: CdZnTe, cadmium zinc telluride
DAQ: Data acquisition
DESY: Deutsches Elektronen-Synchrotron
DOI: Depth of interaction
DSP: Digital signal processor, digital signal processing
ELBE: Elektronen Linearbeschleuniger für Strahlen hoher Brillanz und niedriger Emittanz
ENC: Equivalent noise charge
FCS: Frame check sequence
FIFO: First-In-First-Out
FIR: Finite impulse response
FMC: FPGA Mezzanine Card
FPGA: Field-programmable gate array
FR-4: Flame Retardant 4
FSM: Finite state machine
FWHM: Full width at half maximum
GAGG: Gd₃Al₂Ga₃O₁₂:Ce, cerium-doped gadolinium aluminum gallium garnet
GBP: Gain-bandwidth product
GMII: Gigabit Media Independent Interface
GSPS: Gigasamples per second
HZDR: Helmholtz-Zentrum Dresden-Rossendorf
ICMP: Internet Control Message Protocol
IC: Integrated circuit
IFG: Interframe gap
IIR: Infinite impulse response
IP: Intellectual property
IP: Internet Protocol
LTI: Linear time-invariant
LUT: Look-up table
MAC: Media Access Control
MCH: MicroTCA Carrier Hub
MDIO: Management Data Input/Output
MicroTCA: Micro Telecommunications Computing Architecture
MSPS: Megasamples per second
MWD: Moving Window Deconvolution
NaI: Sodium iodide
OSI: Open Systems Interconnection
PCB: Printed circuit board
PC: Personal computer
PGI: Prompt gamma imaging
PGT: Prompt gamma timing
PHY: Physical layer transceiver
PLL: Phase-locked loop
PPS: Pulse per second
PTP: Precision Time Protocol
RGMII: Reduced Gigabit Media Independent Interface
RMS: Root-mean-square
RTL: Register-transfer level
RTM: Rear Transition Module
RTT: Round-trip time
SFD: Start frame delimiter
SGMII: Serial Gigabit Media Independent Interface
SNR: Signal-to-noise ratio
SoC: System on a chip
TCA: Telecommunications Computing Architecture
TIA: Transimpedance amplifier
UDP: User Datagram Protocol
VHDL: Very High Speed Integrated Circuit Hardware Description Language
VMEbus: Versa Module Europa bus

1. Introduction
Cancer is a deadly disease. In Germany, the total number of cancer cases
increased from 338,300 in 1997 to 477,950 in 2012 (Batzler et al., 1999; Kaatsch
et al., 2015). The quoted studies report that over the same period the number of associated
cancer deaths remained roughly constant at slightly above 210,000. Regardless of how the
statistics are interpreted, the decreasing cancer mortality rate is based on advances in medical science.
The successes in treating cancer are driven by a continuous improvement of medical tech-
nology. Nowadays, surgery is assisted by computers, chemotherapy benefits from computed
tomography imaging, and radiation therapy involves cutting-edge technologies for particle
accelerators and detector systems. An effective cure is achieved by combining the available
techniques. Among these, radiotherapy is one of the main cancer treatments (Krause and
Supiot, 2015). For this purpose, a high-energy beam is generated by a particle accelerator
and used to destroy the cancer cells in the patient’s body. This is generally a photon beam
(X-rays), currently the standard type of radiation used for treatment, but it can also be a
particle beam (proton or ion beam). Particle beams have the same biological effect on tumor
cells as photon beams, but they can be adjusted to stop at a defined position in the tumor,
so that healthy tissue behind the target is irradiated less.
Thus the largest dose is deposited in the tumor, while healthy tissue is spared.
Currently, 60 proton therapy facilities are in operation around the world, 6 of which are
located in Germany (PTCOG, 2016). Given the facilities under construction or in the
planning stage, the total number will increase rapidly in the coming years. In proton therapy, a
major task is the development of a treatment plan that delivers most of the dose to the tumor
cells by adjusting beam parameters. An a posteriori verification of the applied dose is still
challenging, as the protons are stopped inside the tissue and carry no direct information
about their stopping location out of the patient. Consequently, secondary signals produced by
interactions of the protons with the target must be measured to make progress in range
verification. Monitoring the proton beam would ultimately allow the full precision of proton
therapy to be exploited.
A promising approach for an indirect measurement of the finite range of a proton beam is
based on the detection of emitted γ-rays. These γ-rays are promptly emitted as a result of
nuclear reactions of protons with tissue. Thus the emission of prompt γ-rays correlates with
the range of the proton beam and therefore with the deposited dose in the tumor (Fiedler
et al., 2011). The emission spectrum is dominated by reactions with ¹²C and ¹⁶O nuclei, which
are observable as lines at 4.4 MeV and 6.1 MeV (Rohling, 2015). In order to locate the range, or at
least a range shift, several methods have been proposed and are currently being evaluated for
clinical use. To this end, a pinhole camera (Kim et al., 2009) and a knife-edge slit
camera (Smeets, 2012; Richter et al., 2016) for prompt gamma imaging (PGI) have been
investigated. For systems with passive collimation, “the difficulty is obviously to find a
trade-off between good spatial resolution and high detection efficiency” (Smeets, 2012). Moreover,
it has been demonstrated that a time-of-flight measurement is feasible for detecting range
shifts (Golnik, 2016). The prompt gamma timing (PGT) method requires “a stabilized photon
detection (in terms of energy and timing) at high throughput rates” (Golnik, 2016). A further
technique was proposed, which correlates the measured prompt γ-ray spectrum to nuclear
reaction cross sections (Verburg and Seco, 2014). The data acquisition (DAQ) required a
digitizer running at about 1 GSPS synchronously with the particle accelerator. Finally, the most
complex system for PGI is a Compton camera (Richard et al., 2009; Peterson et al., 2010;
Roellinghoff et al., 2011; Kormoll et al., 2011b; Hueso-González et al., 2014; McCleskey
et al., 2015; Polf et al., 2015; Taya et al., 2016). “A slight scepticism has settled down in the
scientific community concerning Compton cameras for PGI. Due to the technical complexity
and high radiation background, only few experimental results hint at their applicability in a
clinical environment.” (Hueso-González, 2016).
All quoted studies about methodology lead to the conclusion that range verification in proton
therapy requires new developments in electronics. In general, a successful translation into
practical clinical use-cases is possible, if nuclear instruments rise to the challenges of:
• Energy range: Prompt γ-rays are emitted at energies up to ≈ 7 MeV (Schumann et al.,
2015). In the case of PGI with a Compton camera, scattered events with energies below
100 keV must also be detectable by the electronics for improved event statistics
(Rohling, 2015). Thus the dynamic range of the measurement must be extraordinarily high.
• γ-ray flux: Depending on the material and the volume of the detector, the estimated
detector load is in the few-Mcps range (Pausch et al., 2016). For CdZnTe (CZT) detectors,
which have been proposed for use in a Compton camera (Kormoll, 2013), the load
is lower. As clinical benchmarks are still ongoing work, instruments must be
designed for the highest possible performance in terms of dead time and achievable event
throughput.
• Energy resolution: For all methods, discrimination of prompt γ-rays according to em-
pirically defined ranges is convenient. Moreover, in case of a Compton camera, the
precision of reconstructed scattering angles depends on energy resolution (Rohling,
2015). The energy resolution of the readout electronics must exceed the intrinsic res-
olution of the detector material. Hence, the electronic noise and processing system
should not limit the overall spectroscopy performance.
• Time resolution: Regardless of the method, a timing of incident γ-rays relative to the
accelerator frequency or another detector in coincidence enables filtering of usable
events. In case of a time-of-flight measurement, requirements for timing are intensified
to the range of picoseconds (Golnik, 2016). A precise timing demands a low-jitter clock
distribution and high bandwidth for fast signal rise times.
• Spatial resolution: For PGI, i.e. pinhole-camera, knife-edge slit camera, or Compton
camera, a high segmentation of the detector enhances the precision of obtained im-
ages. But an increased spatial resolution demands an increased number of readout
channels for the DAQ system, which concurrently multiplies complexity and amount of
data (Kormoll, 2013).
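
As a rough illustration of the energy-range requirement listed above, the following Python sketch computes the ratio between the highest prompt γ-ray energy (≈ 7 MeV) and the lowest scatter energy of interest (100 keV), and the number of ADC bits needed to span that range. The 1 keV step size and all function names are assumptions made for this example only; they are not specifications of the system described in this work, and electronic noise is ignored.

```python
import math

E_MIN_KEV = 100.0    # lowest scatter energy of interest (from the text)
E_MAX_KEV = 7000.0   # highest prompt gamma-ray energy (from the text)
STEP_KEV = 1.0       # assumed quantization step, illustrative only


def dynamic_range(e_min_kev, e_max_kev):
    """Ratio of the largest to the smallest energy that must be measured."""
    return e_max_kev / e_min_kev


def min_adc_bits(e_max_kev, step_kev):
    """Smallest number of bits whose code range covers e_max_kev in steps of step_kev."""
    return math.ceil(math.log2(e_max_kev / step_kev))


ratio = dynamic_range(E_MIN_KEV, E_MAX_KEV)
ratio_db = 20.0 * math.log10(ratio)  # amplitude ratio expressed in dB
bits = min_adc_bits(E_MAX_KEV, STEP_KEV)

print(f"dynamic range: {ratio:.0f}:1 ({ratio_db:.1f} dB), bits for {STEP_KEV} keV steps: {bits}")
```

Under these assumptions the ratio is 70:1; a finer step or a margin for noise and baseline shifts would increase the required resolution accordingly.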
A detector system serving all the emphasized features is necessary for an implementation
of a Compton camera, but is not available. In general, before designing electronic systems,
a decision between proprietary or standardized components has to be made. Proprietary
systems tend to be compact and optimally adapted to a single task. The extreme case of that
design approach is the application-specific integrated circuit (ASIC). For use in experimental
systems, ASICs are expensive, less versatile, and hard to access. In contrast, standardized
components benefit from the versatility of their modular architecture at the cost of
additional overhead. For nuclear physics, the Versa Module Europa bus (VMEbus) standard is
widely used for instrumentation of control and experimental systems. Recent experimental
results for range verification were based on electronics built with VMEbus systems (Kormoll,
2013; Hueso-González, 2016; Golnik, 2016). Moreover, an emerging trend for MicroTCA
systems in the physics community is noticeable. In conjunction with field-programmable
gate arrays (FPGAs), that platform seems to fit the generalized requirements of a prototype
detector system for range verification, i.e. high data throughput, high signal bandwidth, and
high input channel count with precise clock synchronization and distribution.
1.1. Aim of this work
Starting from the development of a prototype Compton camera for range verification
in proton therapy (Kormoll, 2013), the integration of an ASIC-based system (Födisch
et al., 2013) into that camera failed due to the limited energy range, time resolution, and
throughput of the customized readout electronics. The only available ASIC for the readout
of a CZT pixel detector could not meet the requirements of a clinical scenario. With this
experience and knowledge of the present limitations, the development and investigation of
a new prototype with commercial off-the-shelf (COTS) components was started.
Electronics were replaced by highly integrated components, and analog signal processing was
discarded in favor of digital pulse shape analysis and processing. Despite digitization,
new developments in analog front-end electronics were necessary to acquire signals
from a CZT pixel detector (Födisch et al., 2016a). Moreover, new algorithms for improving
digital pulse processing with an FPGA have been developed (Födisch et al., 2016c) and
efficiently implemented (Födisch et al., 2016d). To cope with the challenges of high data
throughput and synchronization in distributed systems, an Ethernet-based interface opti-
mized for timing applications was implemented (Födisch et al., 2016b). Finally, CZT pixel
detectors were characterized for their potential use in a Compton camera application.

2. Analog front-end electronics
Cadmium zinc telluride (CdZnTe, CZT) radiation detectors are suitable for a variety of ap-
plications, due to their high spatial and energy resolution at room temperature. However,
state-of-the-art detector systems require high-performance readout electronics. Though an
ASIC is an adequate solution for the readout, no commercial circuit meets the requirements
of high dynamic range and high throughput. Consequently, this chapter describes
the development of analog front-end electronics with operational amplifiers for an 8×8 pixe-
lated CZT detector. For this purpose, we modeled an electrical equivalent circuit of the CZT
detector with the associated charge-sensitive amplifier (CSA). Based on a detailed network
analysis, the circuit design is completed by numerical values for various features such as
ballistic deficit, charge-to-voltage gain, rise time, and noise level. The performance
is verified with synthetic detector signals and with a pixel detector. The experimental
results with the pixel detector assembly and a ²²Na radioactive source emphasize the depth
dependence of the measured energy. After fitting the weighting potential, a correction of
the energy based on the derived depth of interaction (DOI) was feasible, thus improving the
energy resolution.
2.1. State-of-the-art
CdZnTe is a room-temperature semiconductor material for radiation detectors (Del Sordo
et al., 2009). It is available in compact detector units with highly segmented pixel layouts
and is ideally suited for high energy resolution γ-ray spectroscopy (Verger et al., 2005) and
3D imaging (Wahl et al., 2015). As has been previously reported, CZT detectors have been
used in Compton camera systems (Kormoll et al., 2011a; Hueso-González et al., 2014;
Lee et al., 2016). With regard to recent investigations, they have the potential to build an
imaging system for proton therapy (McCleskey et al., 2015; Polf et al., 2015; Taya et al., 2016;
Golnik et al., 2016). State-of-the-art readout systems for highly segmented CZT detectors
are conventionally built with an ASIC (He et al., 1999; Gan et al., 2016). Available ASICs
are optimized for γ-ray spectroscopy. Low energy range, usually up to 2 MeV (McCleskey
et al., 2015; Gan et al., 2016; Födisch et al., 2013), limited count rate capability, and poor
availability and product life cycle are unsolved challenges of an ASIC-based readout system
for an imaging system in proton therapy. In this environment, high energies up to 7 MeV
and count rates up to 1 Mcps have to be handled (Schumann et al., 2015; Hueso-González
et al., 2015; Pausch et al., 2016). Instead of using an ASIC for the readout electronics, COTS
operational amplifiers have been used for the front-end electronics (Ramachers and Stewart,
2007). Along with space-saving multi-channel analog-to-digital converters (ADCs) and an
FPGA, all tasks related to the signal acquisition and processing can be done with a COTS
system. A programmable digital system benefits from its versatility, which is needed for the
evaluation of a detector system for new applications like proton therapy. Even for application
in a fixed installation, a system made of COTS components provides the advantages of
proven reliability and life-cycle support.
In general, front-end electronics are the key element of overall performance of a detector
system. Our goal is to maximize the dynamic range of the front-end electronics since the
CZT is exposed to high-energy γ-rays, but also has to detect low-energy scatter events
used for Compton imaging. Beyond this main objective, which cannot be met with
a state-of-the-art ASIC, the timing information of an interaction must also be preserved by the
readout system. Most ASICs merely include simple analog signal processing (e.g. leading-
edge trigger for timing and peak-hold circuit for energy information) making pulse shape
analysis or advanced timing algorithms difficult or even impossible. As the design of front-
end electronics is a tradeoff between signal bandwidth, noise, complexity, size, and costs,
the best solution must be driven by the application. For the prototype of a Compton camera,
we investigate a space-saving and simple circuit design with minimal components. The
system must include at least 65 analog readout channels, which are set up with COTS
voltage feedback operational amplifiers.
2.2. Basic design considerations
2.2.1. CZT detector assembly
For a medical imaging application, we use a CZT detector as the scattering layer in a Comp-
ton camera. A reasonable choice for this task is a pixelated CZT radiation detector from
Redlen Technologies (Redlen Technologies, 2011). The detector size is 19.42×19.42 mm²
with a thickness of 5 mm. Opposite the continuous planar electrode on the back side, there
are 64 pixel electrodes arranged in an 8×8 array on the front side. The size of a pixel pad
is 2.2×2.2 mm² for all pixels except the corner pixels with 1.98×1.98 mm². In addition, a
steering grid surrounds all pixels. The inter-pixel space is 0.26 mm. The bulk device and
an assembled detector are shown in fig. 2.1. The detector is mounted on a printed circuit
board (PCB) with the continuous planar electrode on top. A bond wire is attached to a cop-
per pad on the carrier PCB. Furthermore, a conductive adhesive on each pixel pad ensures
the electrical connection to the pixel array on the PCB. An underfill between the pixel array
and the PCB supports the adhesive connection and improves mechanical stability. We also
decided to use a 3.2 mm-thick FR-4 material for the PCB to enhance the robustness of the
assembly. The electrodes of the detector are accessible via rugged high-speed connectors
on the bottom side. To shield the detector against visible light, a 3D-printed cap is attached
above the detector on the top of the carrier board. Our front-end electronics are designed to
work with this type of detector assembly, and the readout boards are plugged into the side
faces of the detector carrier board (see fig. 2.22). Therefore a stacked system with arbitrary
depth, as required for the evaluation of a suitable Compton camera setup, can be easily
constructed. Further investigations on ruggedization of CZT detectors and detector assemblies
have been presented in (Lu et al., 2015).
Figure 2.1.: A 5 mm thick pixelated CZT detector from Redlen Technologies
(Redlen Technologies, 2011) with 8×8 electrodes (each 2.2 mm×2.2 mm)
on the front side (left) and a continuous planar electrode (19.42 mm×19.42 mm)
on the back side (right). The detector is mounted on a 3.2 mm-thick carrier
board with rugged connectors on the bottom side.
2.2.2. Electrical characteristics of a CZT pixel detector
From the electrical point of view, a CZT detector can be modeled with the equivalent cir-
cuit shown in fig. 2.2. With an external operating voltage at the electrodes of the detector,
the terminals are referred to as cathode and anode in accordance with the applied polarity.
Usually, the continuous electrode is biased with a negative potential and the pixel electrodes
are at ground potential. For an ideal detector material, this would force the negative charge
carriers (electrons) to move towards the anode and the positive charge carriers (holes) to
move towards the cathode. As a consequence of charge trapping due to structural defects,
impurities, and irregularities of the material (Awadalla, 2015), the mobility and lifetime of the
holes in CZT are very poor compared to the electrons (Spieler, 2005). Only the moving
electrons induce a signal on the electrodes, while the portion of the signal due to the holes
can be neglected. Thus, if the generated electrons move to the position-sensitive side of
the detector, the overall detection performance is improved. As the readout electronics are
directly connected to the electrodes, the electrical characteristics of the detector influence
the dynamic behavior of the entire circuit. Consequently, the network model for the readout
electronics must include the electrical equivalent circuit of the detector. In general, a very simple
equivalent circuit is adequate to model the properties of the detector. As summarized in
fig. 2.2, it is a passive two-terminal component with a permittivity and a specific
conductance. A capacitor represents the permittivity of the material and the electrical conductivity
is modeled by a resistor. For the evaluated pixelated CZT detector, the capacitance can be
roughly approximated by the model of the parallel-plate capacitor with an electrode area A
separated by the distance d. Therefore the capacitance C is calculated by
\[ C = \varepsilon_0 \varepsilon_r \frac{A}{d}, \tag{2.1} \]
where ε0 is the vacuum permittivity and εr is the relative permittivity of CZT. For the detector
in this study, the values are A = 377 mm² and d = 5 mm. In practical terms, the bias voltage
does not influence the relative permittivity in the applicable voltage range up to -600 V. The
capacitance of the detector is therefore largely independent of the bias voltage (Garson
et al., 2007). The value of εr depends on the manufacturing process, but ranges from 10
to 11 (De Antonis et al., 1996; Spieler, 2005). Thus, the entire bulk capacitance is in the
range from 6.6 pF to 7.4 pF. The capacitance of a single pixel can also be calculated by
eq. 2.1, where the size of the pixel determines the value of A (Rossi et al., 2006). With the
same assumption for εr, the pixel capacitance is in the range from 69 fF to 95 fF, including
the smaller corner pixels. Besides the estimation of capacitance, the bulk resistivity of the
detector is needed to model the electrical characteristics. We measured the leakage current
of the assembled CZT detector from fig. 2.1 with a precision high-voltage source with current
monitor (Iseg SHQ series) (iseg, 2014). This device reports a current of 10 nA±1 nA with a
detector bias voltage of -500 V. For a homogeneous material, the resistance is defined as
\[ R = \rho \frac{d}{A}, \tag{2.2} \]
where ρ is the resistivity of the material. Our measurement corresponds to a resistor of
50 GΩ for the two-terminal equivalent circuit or a resistivity of 4·10¹¹ Ωcm. This is in
accordance with the values from the datasheet (ρ > 10¹¹ Ωcm) from Redlen (Redlen Technologies,
2011).
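
The estimates above can be checked with a short calculation. The following Python sketch applies the parallel-plate model of eq. 2.1 and the resistance relation of eq. 2.2 to the geometry and measurement values quoted in the text. The function names are chosen for this example only, and the model carries the same simplifications as the text (no fringe fields, no stray capacitance, homogeneous material).

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def plate_capacitance(eps_r, side_mm, thickness_mm):
    """Parallel-plate capacitance (eq. 2.1) of a square electrode, in farad."""
    area_m2 = (side_mm * 1e-3) ** 2
    return EPS0 * eps_r * area_m2 / (thickness_mm * 1e-3)


def bulk_resistance(bias_v, leakage_a):
    """Equivalent resistance from the bias voltage and the leakage current, in ohm."""
    return abs(bias_v) / leakage_a


def resistivity_ohm_cm(resistance_ohm, side_mm, thickness_mm):
    """Resistivity from eq. 2.2 solved for rho, in ohm*cm."""
    area_cm2 = (side_mm * 0.1) ** 2
    return resistance_ohm * area_cm2 / (thickness_mm * 0.1)


D_MM = 5.0  # detector thickness

# Bulk capacitance for the two limits of the relative permittivity (10 to 11).
c_bulk_lo = plate_capacitance(10.0, 19.42, D_MM)
c_bulk_hi = plate_capacitance(11.0, 19.42, D_MM)

# Smallest (corner pixel, eps_r = 10) and largest (regular pixel, eps_r = 11) pixel values.
c_pixel_lo = plate_capacitance(10.0, 1.98, D_MM)
c_pixel_hi = plate_capacitance(11.0, 2.2, D_MM)

# Leakage measurement: 10 nA at a bias of -500 V.
r_detector = bulk_resistance(-500.0, 10e-9)
rho = resistivity_ohm_cm(r_detector, 19.42, D_MM)

print(f"bulk capacitance : {c_bulk_lo * 1e12:.2f} pF to {c_bulk_hi * 1e12:.2f} pF")
print(f"pixel capacitance: {c_pixel_lo * 1e15:.1f} fF to {c_pixel_hi * 1e15:.1f} fF")
print(f"R = {r_detector / 1e9:.0f} GOhm, rho = {rho:.2e} Ohm cm")
```

The computed values agree with the ranges quoted above to within rounding of the last digit.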
Figure 2.2.: An electrical equivalent circuit of a CZT detector between the anode and
cathode terminals. The bulk resistivity is modeled with the resistor RD, which is
typically several tens of GΩ. The capacitor CD represents the parallel-plate
geometry of the electrodes. Its value can be estimated with eq. 2.1.
Finally, the electrical characteristics of a CZT crystal mainly depend on the manufacturing
process. If they cannot be experimentally verified, the values for the components of the
electrical equivalent circuit can be estimated from the geometry of the detector and the constants
from the literature by eqs. 2.1 and 2.2. The equivalent circuit shown in fig. 2.2 is the simplest
electrical representation of the detector unit. It does not model a frequency dependency
with a complex permittivity. Additionally, stray capacitances introduced by traces and the
carrier board itself, cross-coupling between pixels, and any inductances of the connectors
are ignored. However, a well-designed PCB layout can minimize these effects.
2.2.3. High voltage biasing and grounding
A fundamental operating condition for a CZT detector is the presence of an electric field
between the electrodes. Thus, the charge carriers generated by incident radiation move to-
wards the electrodes, and an electric current is measurable. Typical electric field strengths
for CZT detectors are in the range of 1 kV/cm (Redlen Technologies, 2011). As the con-
tinuous electrode of the detector is biased with a negative voltage, the ground potential is
connected to the pixelated electrodes on the opposite side. In general, the high voltage with
low ripple is generated by an external power supply connected to the detector via a cable
or PCB traces. To reduce any pickup noise related to electromagnetic interference, we put
a high-voltage filter close to the detector electrode. This is in the simplest case a passive
first-order low-pass filter (RC network shown in fig. 2.3).
Figure 2.3.: Basic connection scheme of the CZT detector. A high voltage supply with an
RC low-pass filter and a resistor RB are used to generate the bias voltage for
the cathode. The readout electronics are directly connected to the cathode and
anodes of the detector. The anodes are biased by the ground signal of the readout
electronics, and consequently both ground potentials must be tied together.
The value of the resistor RB should
be chosen to minimize the voltage drop across the high-voltage filter and maximize the
voltage across the detector. The value of the filter capacitor C should be as high as possible to
achieve the best noise filtering. According to eqs. 2.1 and 2.2, the capacitance is inversely
proportional to the resistance. Therefore, the capacitor C must be chosen such that the
insulation resistance of the dielectric material is much higher than the resistance of the detector.
Commercial capacitors of class 1 have an insulation resistance of more than 100 GΩ
with a maximum capacitance of 10 nF. Thus, a passive first-order low-pass filter with a cutoff
frequency below 1 Hz is possible (e.g. R = 47 MΩ, C = 10 nF). As a noise-filtered voltage is
the output of the RC circuit, it cannot be directly connected to the electrode in order to bias
the detector. One reason for this is that the filter capacitor C would be in parallel with the
capacitance CD of the detector. A bias resistor RB of 47 MΩ separates the filter network from
the detector. Another point that has to be taken into account is the path of current flow gen-
erated by the detector. The current should not flow into the high-voltage source. This can
be ensured by choosing a high resistance for biasing the detector, so that the time constant
RBC is much larger than the time constant of the readout electronics (Grupen and Shwartz,
2011). The active components of the readout electronics have their own power supply, which
is separated from the high-voltage supply. Further, the anodes are biased with the ground
potential of the readout electronics (signal ground in fig. 2.3). Both grounds have to be at
the same potential and must be tied together. The electric field between the cathode and
the anodes is therefore referenced to a known potential.
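The numbers quoted for the bias network can be checked with a few lines of Python. This is a minimal sketch, not part of the original design flow: the filter values R, C, and RB are the ones given in the text, while the detector resistance of 50 GΩ is an assumption borrowed from the stability example in sec. 2.3.1, and the leakage of the filter capacitor is ignored.

```python
import math

def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def detector_voltage_fraction(r_detector, r_filter, r_bias):
    """Fraction of the supply voltage across the detector (DC series divider)."""
    return r_detector / (r_detector + r_filter + r_bias)

R_FILTER = 47e6     # filter resistor R from the text, ohm
C_FILTER = 10e-9    # filter capacitor C from the text, farad
R_BIAS = 47e6       # bias resistor RB from the text, ohm
R_DETECTOR = 50e9   # detector resistance RD, ohm (assumed value)

f_cut = rc_cutoff_hz(R_FILTER, C_FILTER)
tau_bias = R_BIAS * C_FILTER  # time constant RB*C in seconds
fraction = detector_voltage_fraction(R_DETECTOR, R_FILTER, R_BIAS)

print(f"filter cutoff frequency: {f_cut:.3f} Hz")
print(f"RB*C time constant:      {tau_bias:.2f} s")
print(f"voltage across detector: {100.0 * fraction:.2f} % of the supply")
```

With these values the cutoff frequency stays well below 1 Hz, and the time constant RB·C of roughly half a second is many orders of magnitude longer than the sub-microsecond detector pulses.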
2.2.4. Signal formation in CZT detectors
As implied, incident radiation hitting the detector generates free charge carriers. These
electrons and holes move towards the electrodes because of the applied electric field. The
generated charge is proportional to the incident γ-ray energy, and the signal of the
detector is an electric current. The induced current through an electrode is defined as
$$ i = q \, \vec{v} \cdot \vec{E}_0(\vec{x}) , \tag{2.3} $$
where q is the moving charge, $\vec{v}$ is the instantaneous velocity of the charge q and $\vec{E}_0(\vec{x})$ is
the weighting field associated with the electrode at the position $\vec{x}$ of the charge (He, 2001).
The weighting potential $\varphi_0(\vec{x})$ is defined as
$$ \vec{E}_0(\vec{x}) = -\nabla \varphi_0(\vec{x}) , \tag{2.4} $$
by setting one electrode to unit potential and all others to zero. For a parallel-plate geometry
of a detector, where the widths in x and y dimensions of the electrodes are much larger
than the thickness z, the electric field inside the detector is distributed homogeneously with
a constant field strength. Therefore, the weighting field $\vec{E}_0(\vec{x})$ is equal to the electric field,
and, solving eq. 2.4 in the z-dimension, the normalized weighting potential ϕ0C(z) of the
cathode is a linear function (He, 2001):
$$ \varphi_{0C}(z) = z , \quad 0 \le z \le 1 . \tag{2.5} $$
The solution for the Poisson equation in two dimensions was given by Rossi et al. (2006) and
Wermes (2006). These authors presented an equation for the calculation of the weighting
potential for a detector with a segmented electrode layout. By setting one dimension of the
pixel area to zero and normalizing the thickness z of the detector to 1, the weighting potential
ϕ0A of a single pixel with the normalized width a can be calculated as
$$ \varphi_{0A}(x, z) = \frac{1}{\pi} \arctan\left( \frac{\sin(\pi z)\,\sinh\left(\pi \frac{a}{2}\right)}{\cosh(\pi x) - \cos(\pi z)\,\cosh\left(\pi \frac{a}{2}\right)} \right) , \tag{2.6} $$
where x and z are the coordinates and 0 ≤ z ≤ 1. The weighting potentials are shown in
fig. 2.4.
[Figure 2.4: weighting potential of the pixel anode (top) and of the cathode (bottom), each plotted from 0 to 1 versus z / mm from 0 to 5 mm.]
Figure 2.4.: The weighting potentials of the cathode (bottom) and pixel anode (top) of the
5 mm-thick detector. The cathode is located at position z = 0 mm. The right
plots show the weighting potential under the collecting electrode (x = y = 0 mm).
The weighting field E0(z) under the collecting electrode (x = 0, y = 0) can be
calculated by solving eq. 2.4 with eq. 2.6. This results in the following equation:
$$ E_0(z) = \frac{\sinh\left(\pi \frac{a}{2}\right)}{\cos(\pi z) - \cosh\left(\pi \frac{a}{2}\right)} . \tag{2.7} $$
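The relation between eqs. 2.6 and 2.7 can be checked numerically. The following sketch evaluates eq. 2.6 with `math.atan2`, so that the arctangent stays on the continuous branch between 0 and π, and compares the central difference of the potential along z at x = 0 with eq. 2.7 as printed. The normalized pixel width a = 0.3 is a hypothetical value chosen only for illustration, and the function names are not from the original work. The sketch does not settle the overall sign or the orientation of the z axis; it only shows that the z-derivative of eq. 2.6 has the form of eq. 2.7.

```python
import math

def phi_pixel(x, z, a):
    """Eq. 2.6, with the arctangent taken on the branch (0, pi) via atan2."""
    s = math.sinh(math.pi * a / 2.0)
    c = math.cosh(math.pi * a / 2.0)
    numerator = math.sin(math.pi * z) * s
    denominator = math.cosh(math.pi * x) - math.cos(math.pi * z) * c
    return math.atan2(numerator, denominator) / math.pi

def e0_center(z, a):
    """Eq. 2.7 as printed, valid on the pixel axis (x = 0)."""
    return math.sinh(math.pi * a / 2.0) / (
        math.cos(math.pi * z) - math.cosh(math.pi * a / 2.0)
    )

def dphi_dz(z, a, h=1e-6):
    """Central difference of eq. 2.6 along z at x = 0."""
    return (phi_pixel(0.0, z + h, a) - phi_pixel(0.0, z - h, a)) / (2.0 * h)

A_PIXEL = 0.3  # hypothetical normalized pixel width

for z in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"z = {z:.1f}  phi = {phi_pixel(0.0, z, A_PIXEL):.4f}  "
          f"dphi/dz = {dphi_dz(z, A_PIXEL):+.4f}  eq. 2.7 = {e0_center(z, A_PIXEL):+.4f}")
```

In this parametrization the magnitude of the weighting field is largest close to z = 0 and falls off quickly with distance, which is the behavior expected for a pixel that is small compared to the detector thickness.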
As we can estimate the weighting field of the detector with eq. 2.7, the next step is to calcu-
late the expected electric current to simulate the transient behavior of the detector signals.
With the assumption that the electric field E is constant and homogeneous across the detector
because all anodes are at the same potential, the electric field is calculated by
$$ E = \frac{V}{d} , \tag{2.8} $$
where V is the applied bias voltage and d is the thickness of the detector. Then, the velocity
v of a moving charge q in the detector is calculated by
$$ v = \mu_e E , \tag{2.9} $$
where µe is the mobility of the charge carriers (electrons). A typical value of µe for CZT
is about 1000 cm²/(Vs) (He et al., 1998; Cho et al., 2011). With the ionization energy
Ẽi = 4.64 eV for CZT (Spieler, 2005) and the energy Ẽγ of incident radiation, the generated
moving charge q is defined as
$$ q = e \, \frac{\tilde{E}_\gamma}{\tilde{E}_i} , \tag{2.10} $$
where e is the elementary charge. By inserting eqs. 2.7, 2.8, 2.9, 2.10 into eq. 2.3, the detector
current can be numerically estimated. An example of incident radiation of Ẽγ = 511 keV
is shown in fig. 2.5.
[Figure 2.5: cathode current / nA (left) and anode current / nA (right), each from 0 to 140 nA, versus time / ns (0 to 600 ns) for E = 800 V/cm, 1000 V/cm, and 1200 V/cm.]
Figure 2.5.: Induced current i on the detector electrodes for incident γ-rays with an energy
of 511 keV. The left plot shows the current induced on the cathode and the right
plot shows the current induced on an anode pixel. The signals are calculated
with eq. 2.3. An increased electric field strength E causes a higher current i with
shorter drift times tD. The generated charge remains constant.
Fig. 2.5 shows the currents flowing through the electrodes for an interaction at the cathode
side of the detector. If the DOI is closer to the anode side, the total charge collection is
incomplete, as the fraction of charge from the holes cannot be measured. This circumstance
introduces a depth dependence and requires a correction for spectroscopic applications.
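For the cathode, the estimate described above reduces to a few multiplications. The sketch below assumes a constant cathode weighting field of 1/d (the linear potential of eq. 2.5 scaled to the detector thickness), considers only the electron contribution, and uses the material constants quoted in the text. The function names and the parameter `drift_fraction` are illustrative and not taken from the original work.

```python
E_CHARGE = 1.602176634e-19   # elementary charge e, C
E_ION_EV = 4.64              # ionization energy of CZT, eV (Spieler, 2005)
MU_E = 1000.0                # electron mobility, cm^2/(V s)
THICKNESS_CM = 0.5           # 5 mm thick detector, cm

def generated_charge(e_gamma_ev):
    """Eq. 2.10: charge generated by a gamma-ray of the given energy (eV)."""
    return E_CHARGE * e_gamma_ev / E_ION_EV

def cathode_pulse(e_gamma_ev, field_v_per_cm, drift_fraction=1.0):
    """Return (current in A, drift time in s) of the rectangular cathode pulse.

    drift_fraction is the part of the detector thickness the electrons
    have to cross (1.0 for an interaction directly at the cathode).
    """
    velocity = MU_E * field_v_per_cm                                  # eq. 2.9, cm/s
    current = generated_charge(e_gamma_ev) * velocity / THICKNESS_CM  # eq. 2.3 with E0 = 1/d
    drift_time = drift_fraction * THICKNESS_CM / velocity
    return current, drift_time

for field in (800.0, 1000.0, 1200.0):
    current, drift_time = cathode_pulse(511e3, field)
    print(f"E = {field:6.0f} V/cm: i = {current * 1e9:5.1f} nA, tD = {drift_time * 1e9:5.0f} ns")
```

For 511 keV and 1000 V/cm this gives a current of about 35 nA over a drift time of 500 ns. The product of current and drift time is the generated charge and does not depend on the field strength.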
2.2.5. Readout concepts
The results from fig. 2.5 give an estimate of the required sensitivity for the readout electron-
ics. The signals are in the range of several nA (e.g. 8 nA for 100 keV at the cathode with an
electric field of 1200 V/cm). In addition, the drift time tD can be as short as 40 ns if the inter-
action takes place in the last 10 % of the detector volume at the anode side. As the detector
signals are electric currents, the simplest method for acquiring the signals is to use a shunt
resistor and a measurement of voltage drop across this resistor. Finally, the measured tran-
sient voltage signal could be fed to an arbitrary signal processing system. In fact, a single
resistor would be the simplest, smallest, and cheapest solution for the front-end electronics.
With a resistor in the range of some MΩ, i.e. small enough to force the detector current to
flow into it, a voltage drop in the range of some mV is generated. This is sufficient for most
signal processing systems. However, this concept suffers from a lack of signal bandwidth.
As the left circuit in fig. 2.6 shows, the induced current will flow through RS, but the RSCS time
constant of the two-terminal circuit determines the bandwidth and therefore the rise time of
the circuit. Moreover, the detector adds its own capacitance to the shunting elements. The
-3 dB bandwidth of the output signal is given by
$$ f_{-3\,\mathrm{dB}} = \frac{1}{2\pi R_S (C_D + C_S)} , \tag{2.11} $$
where CD is the capacitance of the detector and CS is the shunting capacitance of the readout
electronics. Even with an optimistically small value of 1 pF for CD + CS and a shunt resistor of 1 MΩ,
the resulting bandwidth is only 159 kHz. Moreover, the rise time tr of the output pulse is
inversely proportional to the cutoff frequency fc of a low-pass filter. For a first-order low-pass filter, the
rise time from 10 % to 90 % of the step response is calculated by
$$ t_r = \frac{\ln(0.9) - \ln(0.1)}{2\pi f_c} , \tag{2.12} $$
where fc is the -3 dB cutoff frequency. In other words, the rise time for the given example is
about 2.2 µs. Most of the current pulses would no longer be detectable.
To increase the bandwidth for such a current-to-voltage converter, a transimpedance ampli-
fier (TIA) is an appropriate solution (fig. 2.6, middle circuit). This configuration forces the
generated current to flow into the negative terminal (virtual ground) of the amplifier. That
means CD and CS are still present, but do not have the same impact on the time constant
as the shunting readout circuit. Instead, the rise time and current-to-voltage gain of an ideal
TIA are determined by the feedback network. Similarly to the example with the shunting
resistor, the TIA is assumed to have a current-to-voltage gain of 1 MΩ, but the bandwidth
is now limited by the parasitic capacitance of the feedback resistor RF. The resulting band-
width of the TIA is about 1.6 MHz with a rise time of 220 ns, if the parasitic capacitance of the
resistor is around 100 fF. This is sufficient to detect events near the cathode with drift times
longer than the rise time of the TIA, but most of the events will suffer from a significant pulse
amplitude loss. To further increase the bandwidth of a TIA, several methods can be ap-
plied (Brisebois, 2015), but, nevertheless, the achievable gain-bandwidth product is too low
with regard to the rise times of anode signals and an adequate signal-to-noise ratio (SNR).
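The bandwidth and rise-time figures of the shunt resistor and of the TIA follow directly from eqs. 2.11 and 2.12. The sketch below recomputes them with the component values from the text; the helper names are illustrative, and the TIA is treated as an ideal amplifier whose bandwidth is set only by RF and its parasitic capacitance.

```python
import math

def bandwidth_hz(r_ohm, c_farad):
    """Eq. 2.11: -3 dB bandwidth of a first-order RC network."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def rise_time_s(f_c_hz):
    """Eq. 2.12: 10 % to 90 % rise time of a first-order low-pass filter."""
    return (math.log(0.9) - math.log(0.1)) / (2.0 * math.pi * f_c_hz)

# Shunt resistor: RS = 1 MOhm, CD + CS = 1 pF (values from the text)
shunt_bw = bandwidth_hz(1e6, 1e-12)
# Ideal TIA: RF = 1 MOhm with 100 fF parasitic capacitance (values from the text)
tia_bw = bandwidth_hz(1e6, 100e-15)

print(f"shunt: {shunt_bw / 1e3:.0f} kHz, rise time {rise_time_s(shunt_bw) * 1e6:.2f} us")
print(f"TIA:   {tia_bw / 1e6:.2f} MHz, rise time {rise_time_s(tia_bw) * 1e9:.0f} ns")
```

Both rise times are simply ln(9) times the respective RC time constant, so reducing the effective capacitance by a factor of ten shortens the rise time by the same factor.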
Preserving the pulse shape by means of a current-to-voltage converter is quite challenging;
however, the detector generates a charge, which can be measured with a modification of
the feedback network. A single capacitor CF in the feedback loop of the amplifier results in a
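current-integrating circuit (fig. 2.6, right). Finally, the current iC through the capacitor satisfies
$$ i_D = -i_C = -C_F \frac{\mathrm{d}v_\mathrm{Out}}{\mathrm{d}t} . \tag{2.13} $$
Therefore, the voltage vOut at the output becomes
$$ v_\mathrm{Out} = -\frac{1}{C_F} \int i_D \, \mathrm{d}t = -\frac{q}{C_F} + v_0 . \tag{2.14} $$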
This results in an output voltage whose amplitude is proportional to the moving charge q
generated by the detector. This circuit is referred to as the charge-sensitive amplifier (CSA).
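The integrating behavior of eq. 2.14 can be illustrated with a discrete-time sum. This is a minimal sketch of an ideal CSA without feedback resistor and with infinite open-loop gain. The rectangular pulse of 35 nA over 500 ns and the feedback capacitance of 100 fF are example values, and the function name is hypothetical.

```python
def csa_output(current_samples, dt, c_f, v0=0.0):
    """Ideal integrator: v_out[k] = v0 - (1 / CF) * sum(i[0..k]) * dt (eq. 2.14)."""
    v_out = []
    v = v0
    for i in current_samples:
        v -= i * dt / c_f
        v_out.append(v)
    return v_out

DT = 1e-9          # time step, s
C_F = 100e-15      # feedback capacitance, F (example value)
I_PULSE = 35e-9    # rectangular detector current, A (example value)
N_SAMPLES = 500    # 500 ns pulse duration

pulse = [I_PULSE] * N_SAMPLES
v_out = csa_output(pulse, DT, C_F)
q = I_PULSE * N_SAMPLES * DT  # total charge of the pulse, C

print(f"charge: {q * 1e15:.1f} fC, final output: {v_out[-1] * 1e3:.1f} mV")
```

The output ramps linearly while the current flows and then holds the value -q/CF, here about -175 mV, independent of the pulse shape.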
Both the TIA and the CSA are basic negative-voltage-feedback operational amplifier circuits.
[Figure 2.6: three circuit diagrams driven by the detector current iD at the input iIn with output vOut: shunt resistor RS with shunt capacitance CS (left), operational amplifier with feedback resistor RF (middle), operational amplifier with feedback capacitor CF (right).]
Figure 2.6.: Three different readout circuits for the conversion and amplification of the detector
current iD to a voltage vOut. The left circuit converts the current to a voltage
by means of a shunt resistor. The transimpedance amplifier (middle) is also
used as a current-to-voltage converter, whereas the CSA (right) is used for a
charge-to-voltage conversion.
As the gain of the TIA is designed with a resistor in its feedback path, the gain of the CSA
is determined by its feedback capacitance. Nevertheless, a resistor has a parasitic capaci-
tance and a capacitance also has a parasitic resistance. Thus, both circuits are adequately
modeled by an operational amplifier with an RC feedback network. Equally importantly, an
operational amplifier also introduces a shunting resistor and capacitor. These are in parallel
with the impedance of the detector and abstracted by the resistor RD and capacitor CD in
fig. 2.7. On the whole, the model of a CSA is a mixture of the three circuits shown in fig. 2.6.
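[Figure 2.7: circuit diagram of an operational amplifier (inputs vP and vN, output vO) with the feedback network CF in parallel with RF and the shunting elements RD and CD at the input iIn; the output is vOut.]
Figure 2.7.: A real configuration of a readout circuit as CSA. The shunting impedance cannot
be eliminated, but the effects are reduced by the operational amplifier. The
resistor RF from the feedback path also has the parasitic capacitance CF. This
circuit will be used for further analysis.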
For illustration, the simulated signal waveforms of the three readout circuits with typical val-
ues are shown in fig. 2.8. With this example, it is clearly visible that the CSA achieves the
highest amplitude. As the output voltage of the CSA begins to increase at time t = t0 and
reaches its peak amplitude vpeak at time tD when the current flow stops, the resulting output
voltage for an input current pulse with rectangle shape and constant amplitude iD is
$$ v_\mathrm{peak}(R_F) = i_D R_F \left( 1 - e^{-\frac{t_D}{R_F C_F}} \right) . \tag{2.15} $$
[Figure 2.8: output vOut / V (0 to 0.15 V) versus time t / ns (0 to 800 ns, with t0 and tD marked). Left: R (1 MΩ || 1 pF), TIA (1 MΩ || 100 fF), and CSA (100 fF). Right: RF = ∞, 25 MΩ, 5 MΩ, and 1 MΩ.]
Figure 2.8.: Output signals from the three different readout circuits (left). The resistor-based
current-to-voltage conversion has poor gain and bandwidth. These are improved
by the transimpedance amplifier (TIA). The largest gain is achieved with the
CSA. With a constant parasitic capacitance CF = 100 fF and an increased feedback
resistor RF, the TIA becomes a CSA (right).
Therefore, the maximum output voltage vmax for the same input signal is calculated by replacing
RF with a resistor R = ∞. Thus, we can define vpeak over vmax as the peak ratio P of
the CSA with the time constant τ = RFCF as
$$ \frac{v_\mathrm{peak}(R_F)}{v_\mathrm{max}(R)} = \frac{i_D R_F \left( 1 - e^{-\frac{t_D}{R_F C_F}} \right)}{i_D R \left( 1 - e^{-\frac{t_D}{R C_F}} \right)} \tag{2.16} $$
$$ P = \lim_{R \to \infty} \frac{v_\mathrm{peak}(R_F)}{v_\mathrm{max}(R)} = \frac{\tau}{t_D} \left( 1 - e^{-\frac{t_D}{\tau}} \right) . \tag{2.17} $$
Eq. 2.17 describes the attenuation of the peak amplitude. Following Knoll (2010), the
degree to which the infinite-time-constant amplitude has been decreased is called the ballistic
deficit B. With eq. 2.17, a numerical expression of B can be defined as follows:
$$ B = 1 - P . \tag{2.18} $$
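Eqs. 2.17 and 2.18 are easy to evaluate for a set of feedback resistors. The sketch below uses CF = 100 fF and the resistor values 1 MΩ, 5 MΩ, and 25 MΩ that appear in fig. 2.8; the drift time of 400 ns is an example value, and the function names are illustrative. As in the derivation, a rectangular current pulse is assumed.

```python
import math

def peak_ratio(tau_s, t_d_s):
    """Eq. 2.17: peak ratio P for a rectangular current pulse of duration tD."""
    x = t_d_s / tau_s
    return -math.expm1(-x) / x   # (1 - exp(-x)) / x, numerically stable for small x

def ballistic_deficit(tau_s, t_d_s):
    """Eq. 2.18: B = 1 - P."""
    return 1.0 - peak_ratio(tau_s, t_d_s)

C_F = 100e-15      # feedback capacitance, F
T_DRIFT = 400e-9   # drift time, s (example value)

for r_f in (1e6, 5e6, 25e6):
    tau = r_f * C_F
    print(f"RF = {r_f / 1e6:4.0f} MOhm: tau = {tau * 1e6:4.2f} us, "
          f"P = {100 * peak_ratio(tau, T_DRIFT):5.1f} %, "
          f"B = {100 * ballistic_deficit(tau, T_DRIFT):5.1f} %")
```

With RF = 25 MΩ the time constant is 2.5 µs and the peak ratio for a 400 ns pulse is about 92 %, whereas with RF = 1 MΩ only about a quarter of the ideal amplitude is reached.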
The calculations are based on the assumption that the input current pulse has a rectangular
shape and a constant value during the drift time tD, as it is seen by the cathode. If we
consider a pixelated detector, where the size of the pixel is small compared to the continuous
electrode and detector thickness, the current pulse as shown in fig. 2.5 only rises when the
moving charge reaches the pixel electrode. The drift time tD is the same as for
the cathode current, but most of the charge is deposited at the end of the pulse. Thus, the
anode current can be simply imagined as a rectangular pulse with a shorter drift time and
higher current than the cathode signal. Consequently, the ratio τ/tD is larger than the
ratio for the continuous electrode if the same CSA is used. Thus, the ballistic deficit has
a greater impact on the cathode signal than on anode signals. The influence of tD and τ
is illustrated in fig. 2.9. Eq. 2.18 is essential to select an optimized feedback time constant
dependent on the characteristics of the detector. Usually, the charge-to-voltage gain is a
requirement imposed by the lowest measurement range of the application and is set by
a carefully selected feedback capacitance (see eq. 2.14).
[Figure 2.9: peak ratio P / % versus drift time tD / ns (0 to 400 ns) for RF = ∞, 25 MΩ, 5 MΩ, and 1 MΩ (left); ballistic deficit B / % (0 to 40 %) versus τ / tD on a logarithmic axis from 10^0 to 10^3 (right).]
Figure 2.9.: The peak ratio P of the output signal from the CSA is decreased by a smaller
time constant τ = RFCF (left). Moreover, the ballistic deficit B according to
eq. 2.18 depends on the time constant τ and the drift time tD of the moving
charge (right).
One parameter which can be
optimized is the value of the feedback resistor. Of course, the best choice is a value close
to infinity, since it matches the ideal CSA. However, this is practically not useful, as every
current pulse from the detector forces the amplifier to integrate the charge onto its steady
state voltage level. After a while, the amplifier overflows, when the output voltage reaches
the level of the supply voltage (or even far below this level). Then, it cannot process another
event until the feedback capacitor has discharged. A conventional method for the discharge
of the capacitor uses a switch, which is added to the feedback network. This requires an
additional reset logic and active control, as featured by an ASIC. A more prevalent practice
is to reset the amplifier by making an appropriate choice of the feedback resistor RF. The
feedback resistor discharges the capacitor CF with the characteristic time constant τ. An
optimally adjusted value of the resistor is a tradeoff between the ballistic deficit, count rate,
and frequency response of the amplifier.
2.2.6. Operational amplifier
For the design and investigation of a CSA with an operational amplifier, we will analyze the
circuits in time and frequency domains. Our notation for a complex number in the frequency
domain is
s = σ + jω , (2.19)
where j is the imaginary unit, σ is the real part, and ω is the imaginary part in the range
of real positive values. For many circuit designs, it is useful and sufficient to analyze a
network containing an operational amplifier with ideal constraints, which means the device
has an infinitely high input impedance, infinite open-loop voltage gain, etc. For a realistic
and detailed analysis, an operational amplifier can be modeled as shown in fig. 2.10. In
contrast to the ideal operational amplifier, the input impedance ZI is not infinite. The basic
function of an operational amplifier is the amplification of the voltage drop across its positive
and negative input terminals. The output voltage is therefore given by
$$ v_O = (v_P - v_N) A , \tag{2.20} $$
where vP is the potential at the positive terminal and vN at the negative terminal.
[Figure 2.10: equivalent circuit of an operational amplifier with the input terminals vP and vN, the input impedance ZI with the voltage vD across it, a controlled voltage source A·vD, and the output impedance ZO at the output vO.]
Figure 2.10.: A model of a realistic operational amplifier (Horowitz and Hill, 2015). The voltage
difference vD between the two terminals is amplified by the bandwidth-limited
factor A. The input impedance ZI should be as high as possible,
whereas the output impedance ZO should be close to zero.
For an ideal
operational amplifier, the transfer function A(s) has no frequency dependencies, so that the
corresponding frequency response
HA(f ) = A(0) = AOL (2.21)
is constant. AOL is the zero-frequency open-loop gain. For a realistic operational amplifier,
the frequency response of the open-loop gain has the shape of a low-pass filter. Con-
sequently, we model the transfer function A(s) of the operational amplifier as a first-order
low-pass filter in the frequency domain using
$$ A(s) = \frac{A_\mathrm{OL}}{1 + s\tau} \tag{2.22} $$
with the finite open-loop voltage gain AOL and the -3 dB cutoff frequency
$$ f_c = \frac{1}{2\pi\tau} . \tag{2.23} $$
A characteristic parameter which simplifies the frequency response of an operational ampli-
fier is the gain-bandwidth product (GBP). At frequencies larger than fc , the GBP is constant
for the first-order low-pass filter frequency response. At the frequency fGBP, the open-loop
gain equals the unity gain. With the maximum open-loop voltage gain AOL, the functional
relationship is given by
fcAOL = fGBP . (2.24)
With the given parameters AOL and fGBP of an operational amplifier, we can set
$$ \tau = \frac{A_\mathrm{OL}}{2\pi f_\mathrm{GBP}} \tag{2.25} $$
and by setting s = jω with ω = 2πf, the transfer function of the open-loop gain for periodic
sinusoidal signals is
$$ A(j 2\pi f) = \frac{A_\mathrm{OL}}{1 + j \frac{f}{f_\mathrm{GBP}} A_\mathrm{OL}} . \tag{2.26} $$
The frequency-dependent gain GA(f) and phase shift ΦA(f) are derived from eq. 2.26.
$$ G_A(f) = |A(j 2\pi f)| = \frac{A_\mathrm{OL}}{\sqrt{1 + \left( \frac{f}{f_\mathrm{GBP}} A_\mathrm{OL} \right)^2}} \tag{2.27} $$
$$ \Phi_A(f) = \angle A(j 2\pi f) = -\tan^{-1}\left( \frac{f}{f_\mathrm{GBP}} A_\mathrm{OL} \right) \tag{2.28} $$
[Figure 2.11: gain GA(f) / dB (left) and phase shift ΦA(f) / deg from 0 to -90 (right) versus frequency f / Hz, with fc and fGBP marked on the frequency axis.]
Figure 2.11.: The gain GA(f) and the phase shift ΦA(f) of the frequency response of an
operational amplifier with zero-frequency gain AOL. The frequency response is
modeled as a first-order low-pass filter with the cutoff frequency fc = fGBP/AOL.
The attenuation of AOL is -20 dB/decade above fc.
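The single-pole model of eqs. 2.24 to 2.28 can be evaluated with Python's complex arithmetic. In the sketch below, the open-loop gain of 70 dB and the gain-bandwidth product of 1.6 GHz are assumed example values for a fast FET-input amplifier and are not taken from the text; the function names are illustrative.

```python
import cmath
import math

def open_loop_gain(f_hz, a_ol, f_gbp_hz):
    """Eq. 2.26: single-pole open-loop gain A(j*2*pi*f)."""
    return a_ol / complex(1.0, f_hz * a_ol / f_gbp_hz)

def gain_db(f_hz, a_ol, f_gbp_hz):
    """Eq. 2.27, expressed in dB."""
    return 20.0 * math.log10(abs(open_loop_gain(f_hz, a_ol, f_gbp_hz)))

def phase_deg(f_hz, a_ol, f_gbp_hz):
    """Eq. 2.28, expressed in degrees."""
    return math.degrees(cmath.phase(open_loop_gain(f_hz, a_ol, f_gbp_hz)))

A_OL = 10.0 ** (70.0 / 20.0)   # assumed zero-frequency gain of 70 dB
F_GBP = 1.6e9                  # assumed gain-bandwidth product, Hz
F_C = F_GBP / A_OL             # eq. 2.24: cutoff frequency

for f in (F_C / 100.0, F_C, 10.0 * F_C, F_GBP):
    print(f"f = {f:12.4g} Hz: gain = {gain_db(f, A_OL, F_GBP):6.2f} dB, "
          f"phase = {phase_deg(f, A_OL, F_GBP):7.2f} deg")
```

At fc the gain has dropped by 3 dB with a phase shift of -45°, one decade above fc it is about 20 dB lower, and at fGBP the gain is practically 0 dB with a phase shift close to -90°.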
2.3. Circuit design of a charge-sensitive amplifier
As briefly described in sec. 2.2.5, an operational amplifier integrates a charge, as there is a
capacitor CF in the feedback and the time constant of the RFCF network is large compared
to the drift time of the moving charge at its input. In general, a two-terminal equivalent circuit
adequately represents the feedback circuit. Thus, the RFCF network is summarized with the
impedance ZF. In accordance with fig. 2.2, the impedance of the detector is modeled with the
two-terminal impedance ZD. Both impedances are connected to the inverting terminal of the
operational amplifier. In addition, there are parasitic shunt impedances between the negative
and the positive input terminal of the operational amplifier. As they are in parallel with the
impedance of the detector, all additional shunt impedances are absorbed by the model of ZD.
For a simplified circuit analysis, the positive terminal of the operational amplifier is grounded.
If the operational amplifier requires a single supply operation, the positive terminal is biased
towards the desired potential for the virtual ground. For a circuit analysis, this is negligible.
Therefore, the initial circuit for the analysis is shown in fig. 2.12.
[Figure 2.12: circuit diagram of an operational amplifier (inputs vP and vN, output vO) with the feedback impedance ZF (current iZF) and the shunt impedance ZD (current iZD) connected to the negative input; the input current i enters at iIn and the output is vOut.]
Figure 2.12.: An operational amplifier with a current input and a voltage feedback connected
to the negative input terminal. With ZF = CF||RF, this configuration is used for
the charge-to-voltage amplification. ZD represents all impedances connected
between the input iIn and ground (e.g. capacitance of the detector and parasitic
capacitance of the negative input terminal).
2.3.1. Circuit analysis
Kirchhoff’s current law is used for the fundamental circuit analysis. For the node at the
negative terminal of the operational amplifier, the sum of all currents must be zero.
0 = i − iZD + iZF (2.29)
Further, the currents can be expressed in terms of the feedback impedance ZF and shunt
impedance ZD.
$$ i_{ZF} = \frac{v_O - v_N}{Z_F} \tag{2.30} $$
$$ i_{ZD} = \frac{v_N}{Z_D} \tag{2.31} $$
According to eq. 2.20, and setting vP = 0, the voltage vN in eqs. 2.30 and 2.31 can be
substituted with
$$ v_N = \frac{-v_O}{A} . \tag{2.32} $$
The solution of eq. 2.29 with eqs. 2.30-2.32 results in the current-to-voltage transfer function
$$ \frac{-v_O}{i} = \frac{A Z_D Z_F}{Z_D (A + 1) + Z_F} , \tag{2.33} $$
where A is the transfer function of the operational amplifier according to eq. 2.22. Another
useful method for the circuit analysis is the principle of superposition. Because the system
can be described with linear equations, a superposition of all current and voltage sources is
possible. That means that first, if the current source for i is turned off (replaced with an open
circuit), the voltage source for vO is acting alone and the resulting voltage at the negative
input terminal of the operational amplifier is
$$ v_N|_{i=0} = v_O \frac{Z_D}{Z_D + Z_F} . \tag{2.34} $$
Second, if the voltage source vO is turned off (replaced with a short), the current source is
acting alone and the voltage at the negative input terminal of the operational amplifier is
$$ v_N|_{v_O=0} = i \frac{Z_D Z_F}{Z_D + Z_F} . \tag{2.35} $$
Finally, the superposition for the voltage node vN is the sum of eqs. 2.34 and 2.35:
$$ v_N = v_N|_{i=0} + v_N|_{v_O=0} . \tag{2.36} $$
By inserting eqs. 2.32, 2.34 and 2.35 into eq. 2.36, the output voltage vO can be rewritten as
$$ v_O = -A \left( v_O \frac{Z_D}{Z_D + Z_F} + i \frac{Z_D Z_F}{Z_D + Z_F} \right) . \tag{2.37} $$
The derived eq. 2.37 is the same as eq. 2.33, but shows the terms for the basic block
structure shown in fig. 2.13 in an intuitive and easily readable form.
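The equivalence of eqs. 2.33 and 2.37 can also be checked numerically. The sketch below evaluates both forms with complex impedances at several frequencies and additionally inserts the result into Kirchhoff's current law (eq. 2.29). The component values are those of the stability example given later in this section; the amplifier parameters (70 dB open-loop gain, 1.6 GHz gain-bandwidth product) are assumptions, and all function names are illustrative.

```python
import math

def parallel_rc(r_ohm, c_farad, f_hz):
    """Impedance of a resistor in parallel with a capacitor."""
    return r_ohm / (1.0 + 2j * math.pi * f_hz * r_ohm * c_farad)

def open_loop(f_hz, a_ol=10.0 ** 3.5, f_gbp_hz=1.6e9):
    """Single-pole amplifier model (eq. 2.26) with assumed parameters."""
    return a_ol / complex(1.0, f_hz * a_ol / f_gbp_hz)

def transimpedance_eq_2_33(a, z_d, z_f):
    """-vO / i according to eq. 2.33."""
    return a * z_d * z_f / (z_d * (a + 1.0) + z_f)

def transimpedance_superposition(a, z_d, z_f):
    """-vO / i obtained by solving eq. 2.37 for vO."""
    beta = z_d / (z_d + z_f)
    z_parallel = z_d * z_f / (z_d + z_f)
    return a * z_parallel / (1.0 + a * beta)

def kcl_residual(a, z_d, z_f, i=1.0):
    """Left-over current of eq. 2.29 when the solution of eq. 2.33 is inserted."""
    v_o = -i * transimpedance_eq_2_33(a, z_d, z_f)
    v_n = -v_o / a                                   # eq. 2.32
    return i - v_n / z_d + (v_o - v_n) / z_f         # eqs. 2.29 to 2.31

max_rel_diff = 0.0
max_residual = 0.0
for f in (1e3, 1e5, 1e6, 1e7, 1e8):
    a = open_loop(f)
    z_d = parallel_rc(50e9, 11.9e-12, f)    # 50 GOhm || (6.7 pF + 5.2 pF)
    z_f = parallel_rc(25e6, 100e-15, f)     # 25 MOhm || 100 fF
    direct = transimpedance_eq_2_33(a, z_d, z_f)
    superposed = transimpedance_superposition(a, z_d, z_f)
    max_rel_diff = max(max_rel_diff, abs(direct - superposed) / abs(direct))
    max_residual = max(max_residual, abs(kcl_residual(a, z_d, z_f)))

print(f"largest relative difference: {max_rel_diff:.2e}")
print(f"largest KCL residual:        {max_residual:.2e}")
```

Both quantities stay at the level of floating-point rounding, which indicates that the two derivations describe the same transfer function.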
[Figure 2.13: block diagram with the input block ZDZF/(ZD+ZF), a summing node Σ, the amplifier block A, and the feedback block ZD/(ZD+ZF); the signals are labeled i, vI, vD, and vO.]
Figure 2.13.: Block diagram of the closed-loop transfer function with input current i and output
voltage vO. This structure shows the voltage feedback network and loop
gain, which are essential for a stability analysis. The voltage difference vD at
the negative input terminal of the amplifier is the sum of the feedback voltage
and an input current-dependent voltage vI.
The basic block structure is used to identify the voltage feedback and the stability of the loop.
It is obvious that the feedback network β is
$$ \beta = \frac{Z_D}{Z_D + Z_F} \tag{2.38} $$
and the closed-loop voltage gain G is (Horowitz and Hill, 2015)
$$ G = \frac{A}{1 + A\beta} = -\frac{v_O}{v_I} . \tag{2.39} $$
The closed-loop gain G becomes 1/β if the loop gain Aβ is much greater than one. On the
contrary, the closed-loop gain becomes infinite if the loop gain Aβ = -1. At this frequency,
the system tends to be unstable, and oscillates. As A and β are defined to have positive real
values, the case where the loop gain becomes -1 only occurs if the magnitude of the loop gain is 1 and a phase
shift of 180° is introduced by the frequency response of Aβ. A phase shift of a sinusoidal
signal by 180° is equal to a multiplication by -1. To make the feedback circuit stable, the
phase shift therefore has to be less than 180° at the frequency fi, where the loop gain is 1. A
sufficient phase margin at the frequency fi is required to ensure a stable operation over the
entire temperature range, and also to cover tolerances of the integrated circuits used.
[Figure 2.14: Bode plot versus frequency / Hz (10^0 to 10^9): magnitude / dB (0 to 70 dB) of the open-loop gain A, the feedback 1/β, and the loop gain Aβ, with fi = 13 MHz marked; phase / deg (0 to -180) of the same quantities with the phase margin marked.]
Figure 2.14.: The Bode plot of the closed-loop transfer function. The loop gain is 1 at the
frequency fi with a phase margin of roughly 90°. This is sufficient for a stable
operation. See text for numerical values of the example.
The example in fig. 2.14 is calculated with the parameters from table 2.1 for the OPA657. The
feedback impedance was chosen to be 100 fF in parallel with 25 MΩ. Further, the detector
capacitance of 6.7 pF in parallel with a 50 GΩ resistor and the parasitic input capacitance of
5.2 pF form the shunting impedance ZD. The stability is investigated at the frequency where
the loop gain is 1. Mathematically, this intersection can be calculated by solving
$$ |A(j 2\pi f)\,\beta(j 2\pi f)| = 1 . \tag{2.40} $$
From the Bode plot shown in fig. 2.14, this point can be found at
$$ \log_{10}(|A|) - \log_{10}\left(\left|\frac{1}{\beta}\right|\right) = 0 . \tag{2.41} $$
In this plot, the reciprocal of the feedback network β intersects the open-loop gain at approximately
13 MHz. At this frequency, the phase shift of the loop gain is 88°, resulting in
a phase margin φ of 92°, which is sufficient for a stable operation. The phase margin φ is
calculated by
$$ \phi = 180^\circ - \left( \angle A(j 2\pi f_i) + \angle \beta(j 2\pi f_i) \right) , \tag{2.42} $$
where the phase shifts are counted as positive lag angles.
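The crossover frequency and the phase margin of the example can be reproduced approximately with a short script. The feedback and shunt impedances are the values given in the text. Table 2.1 is not repeated here, so the open-loop gain of 70 dB and the gain-bandwidth product of 1.6 GHz are assumptions for the OPA657; the bisection search and all function names are likewise only an illustrative sketch. The phase returned by `cmath.phase` is negative for a lag, so the margin is computed as 180° plus that angle.

```python
import cmath
import math

A_OL = 10.0 ** (70.0 / 20.0)        # assumed open-loop gain (70 dB)
F_GBP = 1.6e9                       # assumed gain-bandwidth product, Hz
R_F, C_F = 25e6, 100e-15            # feedback network from the text
R_D, C_D = 50e9, 6.7e-12 + 5.2e-12  # detector plus amplifier input capacitance

def parallel_rc(r_ohm, c_farad, f_hz):
    """Impedance of a resistor in parallel with a capacitor."""
    return r_ohm / (1.0 + 2j * math.pi * f_hz * r_ohm * c_farad)

def loop_gain(f_hz):
    """A(j*2*pi*f) * beta(j*2*pi*f) with eqs. 2.26 and 2.38."""
    a = A_OL / complex(1.0, f_hz * A_OL / F_GBP)
    z_d = parallel_rc(R_D, C_D, f_hz)
    z_f = parallel_rc(R_F, C_F, f_hz)
    return a * z_d / (z_d + z_f)

def crossover_frequency(f_lo=1e3, f_hi=1e9, steps=200):
    """Bisection on a logarithmic axis for |A*beta| = 1 (eq. 2.40).

    Assumes that the loop gain magnitude falls monotonically from
    above 1 at f_lo to below 1 at f_hi.
    """
    for _ in range(steps):
        f_mid = math.sqrt(f_lo * f_hi)
        if abs(loop_gain(f_mid)) > 1.0:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)

f_i = crossover_frequency()
phase_lag_deg = -math.degrees(cmath.phase(loop_gain(f_i)))
phase_margin_deg = 180.0 - phase_lag_deg

print(f"fi = {f_i / 1e6:.1f} MHz, phase lag = {phase_lag_deg:.1f} deg, "
      f"phase margin = {phase_margin_deg:.1f} deg")
```

Under these assumptions the script yields a crossover frequency of about 13 MHz and a phase margin of about 92°, in line with the values read from the Bode plot.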
Besides the stability of the circuit, the effective input impedance also depends on the open-loop
voltage gain of the amplifier, and is changed by the feedback network (Horowitz and
Hill, 2015). The effective input impedance of the CSA should be very low, to make sure
that all current flows into the amplification circuit. Regarding fig. 2.12, the effective input
impedance from the CSA is defined by the ratio of the voltage at the input terminal iIn and
the input current i. The voltage at the input terminal is vN, so the effective input impedance
Z*I is
$$ Z_I^* = \frac{v_N}{i} . \tag{2.43} $$
Solving eq. 2.29 for vN with eqs. 2.30-2.32, Z*I can be expressed as
$$ Z_I^* = \frac{Z_D Z_F}{Z_D (A + 1) + Z_F} . \tag{2.44} $$
As A becomes infinitely large, as we assume for an ideal operational amplifier, the effective
input impedance is zero. This satisfies the principle of the virtual ground. If we assume an
ideal detector without a parasitic impedance from the CSA, ZD is infinite. For this case, the
effective input impedance
$$ \lim_{Z_D \to \infty} Z_I^* = \frac{Z_F}{A + 1} \tag{2.45} $$
is determined by the open-loop gain and the feedback network ZF. To make sure that all
current flows into the CSA and is integrated on the feedback capacitor, the impedance from
eq. 2.45 must be small compared to the shunting impedance ZD. Consequently, an opera-
tional amplifier with a large open-loop gain is required.
2.3.2. Charge-to-voltage transfer function
The CSA is designed to measure a charge with a voltage output signal, where the peak
amplitude is proportional to the charge seen at the input. The relation of the output voltage
vO to the input charge Q is the charge-to-voltage transfer function
$$ \frac{v_O}{Q} = H_Q . \tag{2.46} $$
As the charge is defined to be
$$ \frac{\mathrm{d}Q}{\mathrm{d}t} = i , \tag{2.47} $$
where i is the electrical current, the corresponding Laplace transformation of eq. 2.47 is
$$ \mathcal{L}\{Q'(t)\} = sQ(s) = i(s) . \tag{2.48} $$
If we replace i in the current-to-voltage transfer function from eq. 2.33 by eq. 2.48 and set
the feedback network $Z_F = \frac{1}{sC_F}$ and parasitic impedance $Z_D = \frac{1}{sC_D}$ to single capacitors, then
the charge-to-voltage transfer function is given by
$$ H_Q = \frac{-A}{C_D + C_F (A + 1)} . \tag{2.49} $$
The simplification of the impedances to single capacitors is valid, as we want to investigate
the frequency response of the charge-to-voltage conversion. If the feedback time constant
is chosen appropriately according to eq. 2.18, there is no significant peak amplitude loss
in the voltage signal. The peak amplitude of the voltage signal is therefore independent of
the feedback resistor RF. The impedance ZD can also be simplified for this analysis, as the
equivalent input resistance of the amplifier and detector is much larger than the effective
input impedance according to eq. 2.44. For an ideal operational amplifier with infinite open-
loop gain A, HQ from eq. 2.49 becomes −1/CF over the entire frequency domain. Since
the open-loop voltage gain of an operational amplifier is not independent of frequency, a
more realistic charge-to-voltage transfer function is obtained by replacing A from eq. 2.49 by
eq. 2.22
$$ H_Q(s) = \frac{-A_\mathrm{OL}}{C_D + C_F (A_\mathrm{OL} + 1) + s\tau (C_D + C_F)} . \tag{2.50} $$
The charge-to-voltage gain is then given by
$$ |H_Q(j 2\pi f)| = G_Q(f) = \frac{A_\mathrm{OL}}{\sqrt{\left( C_D + C_F (A_\mathrm{OL} + 1) \right)^2 + \left( \frac{f}{f_\mathrm{GBP}} A_\mathrm{OL} (C_D + C_F) \right)^2}} . \tag{2.51} $$
The steady state of the system is derived by calculating the zero-frequency gain:
$$ G_{QS} = \lim_{f \to 0} G_Q(f) = \frac{A_\mathrm{OL}}{C_D + C_F (A_\mathrm{OL} + 1)} . \tag{2.52} $$
Eq. 2.52 shows that a finite open-loop gain AOL attenuates the measured peak voltage. The
measured fraction of charge as a ratio of GQS over the ideal charge-to-voltage gain 1/CF is
given by
$$ G_{QS} C_F = \frac{A_\mathrm{OL}}{\frac{C_D}{C_F} + A_\mathrm{OL} + 1} . \tag{2.53} $$
It is obvious that the measured peak amplitude is decreased by an increased detector capac-
itance CD or a reduced open-loop voltage gain AOL. Thus, the operational amplifier should
provide a high and stable open-loop voltage gain for an improved system performance as
illustrated in fig. 2.15.
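Eq. 2.53 can be tabulated for a few open-loop gains. The sketch below uses the capacitance ratio CD/CF = 119 that results from the values of the stability example (11.9 pF over 100 fF) together with the gains 60 dB, 70 dB, and 80 dB; the helper names are illustrative.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain from dB to a linear factor."""
    return 10.0 ** (gain_db / 20.0)

def charge_fraction(cd_over_cf, a_ol):
    """Eq. 2.53: measured fraction of charge G_QS * CF."""
    return a_ol / (cd_over_cf + a_ol + 1.0)

CD_OVER_CF = 119.0  # 11.9 pF / 100 fF

for gain_db in (60.0, 70.0, 80.0):
    fraction = charge_fraction(CD_OVER_CF, db_to_linear(gain_db))
    print(f"AOL = {gain_db:.0f} dB: measured fraction = {100.0 * fraction:.1f} %")
```

For this capacitance ratio, an amplifier with 60 dB open-loop gain measures roughly 89 % of the charge, whereas 80 dB raises the fraction to almost 99 %.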
An equally important parameter of the CSA is the rise time of the output voltage in response
to a charge step at its input. As shown by eq. 2.12, the rise time is inversely proportional to the cutoff
frequency of the system. Therefore, to determine the rise time, we have to calculate its cutoff
frequency, which is derived by using eq. 2.51:
$$ \frac{|H_Q(j 2\pi f_c)|}{|H_Q(0)|} = \frac{1}{\sqrt{2}} \tag{2.54} $$
$$ f_c = f_\mathrm{GBP} \, \frac{\frac{C_D}{C_F} + (A_\mathrm{OL} + 1)}{A_\mathrm{OL} \left( \frac{C_D}{C_F} + 1 \right)} . \tag{2.55} $$
[Figure 2.15: fraction GQS·CF / % (50 to 100 %) versus CD / CF (0 to 900) for AOL = ∞, 80 dB, 70 dB, and 60 dB.]
Figure 2.15.: The measured fraction of charge dependent on the ratio of input capacitance
CD over feedback capacitance CF. The fraction is increased with an increased
zero-frequency open-loop voltage gain AOL of the operational amplifier.
Eq. 2.55 shows that the upper bandwidth limit is lowered with an increased ratio of the
shunting capacitance CD over CF. The bandwidth is extended with a lower value of AOL, but
this will reduce the measured fraction of charge, as shown in fig. 2.15. This also means that
the SNR is decreased and therefore the resolution of the charge measurement is also de-
creased. AOL should therefore be as large as possible. For this assumption, the bandwidth
of the CSA is
\[
\lim_{A_{OL} \to \infty} f_c = \frac{f_{GBP}}{\frac{C_D}{C_F} + 1} . \tag{2.56}
\]
Eq. 2.56 shows that the cutoff frequency of the CSA is directly proportional to the
gain-bandwidth product of the amplifier, so the rise time is inversely proportional to it.
The response of the transfer function from eq. 2.49 to a unit step is shown in fig. 2.16.
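Eq. 2.55 and eq. 2.56 translate directly into a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the rise-time relation t_r = ln(9)/(2π·fc) is the usual 10 % to 90 % rise time of a single-pole low-pass, assumed here to correspond to eq. 2.12.

```python
import math


def cutoff_frequency(f_gbp, cd_over_cf, a_ol=math.inf):
    """Cutoff frequency of the CSA in Hz.

    Uses eq. 2.55 for a finite linear open-loop gain a_ol and
    eq. 2.56 in the limit of an infinite open-loop gain.
    """
    if math.isinf(a_ol):
        return f_gbp / (cd_over_cf + 1.0)
    return f_gbp * (cd_over_cf + a_ol + 1.0) / (a_ol * (cd_over_cf + 1.0))


def rise_time_10_90(f_c):
    """10 % to 90 % rise time of a single-pole low-pass (assumed relation)."""
    return math.log(9.0) / (2.0 * math.pi * f_c)


if __name__ == "__main__":
    # Assumed example: f_GBP = 1.6 GHz, A_OL = 70 dB, C_D/C_F = 53.05.
    a_ol = 10.0 ** (70.0 / 20.0)
    f_c = cutoff_frequency(1.6e9, 53.05, a_ol)
    print(f"f_c = {f_c / 1e6:.2f} MHz, t_r = {rise_time_10_90(f_c) * 1e9:.2f} ns")
```

A finite AOL slightly raises the cutoff frequency compared with eq. 2.56, at the cost of the reduced charge fraction discussed above.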
[Figure 2.16: step response plot, charge / As (0 to 1) versus time / ns (0 to 60), for
AOL = ∞ with CD/CF = 10, AOL = ∞ with CD/CF = 100, and AOL = 60 dB with CD/CF = 100.]
Figure 2.16.: The output signal of the CSA with a unity step at its input. The rise time is
increased with an increasing ratio of CD/CF. The gain-bandwidth product of
the operational amplifier was assumed to be fGBP = 1 GHz. In this example,
the rise times are 17 ns (solid line), 32 ns (dashed line) and 35 ns (dotted line).
The decreased rise time related to a smaller open-loop gain AOL is caused by
the attenuation of the peak amplitude.
2.3.3. Input coupling of the CSA
As shown in fig. 2.3, at least one of the electrodes is biased at a negative high voltage.
Thus, the amplifier at the cathode must be protected against the bias voltage, since it is
not common for integrated circuits to operate at high input voltages in the range of several
hundreds of volts. A capacitor in series to the amplifier input blocks the bias voltage of the
detector, but allows the detector current to flow. This electrical circuit is shown in fig. 2.17,
where the coupling capacitor CB is represented by the impedance ZB. In this configuration,
the parasitic impedance ZI of the operational amplifier and the detector impedance ZD are
separated by the impedance ZB.

[Figure 2.17: circuit diagram of the operational amplifier (inputs vP and vN, output vO = vOut)
with the feedback impedance ZF, the input impedance ZI, the detector impedance ZD at the
terminal iIn, and the coupling impedance ZB; the currents i, iZI, iZF, iZD, and iZB are marked.]
Figure 2.17.: A functionally equivalent configuration to fig. 2.12 for the charge-to-voltage
amplification. The input impedance at the terminal iIn is a Pi network instead of
the single impedance ZD. ZB is a high-voltage coupling capacitor to protect the
low-voltage input terminals of the operational amplifier.

As the influence of the coupling capacitor on the current-to-voltage transfer function is not
apparent at first glance, a calculation with an infinite loop gain yields the simplified equation
\[
-\frac{v_O}{i} = \frac{Z_D Z_F}{Z_B + Z_D} . \tag{2.57}
\]
To eliminate the influence of the impedance ZB, it must be much smaller than ZD. In the
case that all impedances are represented by a single capacitor, the steady-state charge-to-
voltage gain is ultimately given by
\[
\frac{v_O}{Q} = \frac{1}{C_F \left(1 + \frac{C_D}{C_B}\right)} . \tag{2.58}
\]
Eq. 2.58 shows that the coupling capacitor must be much larger than the detector capaci-
tance to avoid a peak amplitude loss, thus improving the SNR.
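The amplitude loss caused by the coupling capacitor follows from eq. 2.58 as the factor 1/(1 + CD/CB). A short sketch with assumed values (CD = 5 pF and the listed CB values are hypothetical and only serve as an illustration):

```python
def coupling_amplitude_fraction(c_d, c_b):
    """Fraction of the ideal amplitude Q / C_F that remains with a
    coupling capacitor c_b in series (eq. 2.58)."""
    return 1.0 / (1.0 + c_d / c_b)


if __name__ == "__main__":
    c_d = 5e-12  # assumed detector capacitance in F
    for c_b in (100e-12, 1e-9, 10e-9):  # assumed coupling capacitors in F
        loss = 100.0 * (1.0 - coupling_amplitude_fraction(c_d, c_b))
        print(f"C_B = {c_b * 1e9:6.2f} nF: amplitude loss = {loss:.3f} %")
```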
2.3.4. Noise
As has been pointed out, the noise caused by the electronics and the corresponding SNR
characterize the quality of the CSA with respect to the achievable energy resolution. The
precision of the charge measurement and the achievable timing are directly related to the
SNR. Regarding both values, the front-end electronics should not limit the intrinsic resolu-
tion of the detector. Thus, to reduce the electronics noise, the noise sources of the detector
system must be identified in the first step. The main source for the electronics noise is the op-
erational amplifier. But the passive components of an electronic circuit also generate noise.
The resulting noise at the output is the sum of all noise sources at the input, amplified by
the noise gain. The essential noise sources of the electronics circuit are shown in fig. 2.18.
[Figure 2.18: circuit diagram of the CSA (inputs vP and vN, output vO = vOut) with the noise
sources vnA, inA, inRF, inD, and inRB at the input terminal iIn.]
Figure 2.18.: The CSA configuration with its dominant noise sources at the input terminal
iIn. For noise analysis, the current noise sources from the bias resistor (inRB),
equivalent resistor of the detector (inD), feedback resistor (inRF), and amplifier
(inA) can be substituted by a single source, as they act in parallel. Finally, the
noise analysis is made with a current noise source in in parallel (parallel noise)
and a voltage noise source vn in series (series noise) to the input terminal.

The noise contribution of the operational amplifier is simplified to a model, where its noise is
characterized by an equivalent voltage noise source vnA and current noise source inA at the
inverting input terminal. Additionally, all resistors in the system contribute to the total noise
by adding thermal noise. The equivalent noise current of a resistor inR at temperature T is
given by
\[
i_{nR} = \frac{4kT}{R} \tag{2.59}
\]
where k is the Boltzmann constant (Spieler, 2005) and the unit of noise current is A²/Hz.
The equivalent resistor RD of the detector, the biasing resistor RB, and the equivalent re-
sistor RF of the feedback impedance contribute to the total noise following eq. 2.59. Each
is represented by a current source in fig. 2.18. As there are multiple noise sources in the
system, they can all be absorbed into a single source for the voltage noise vn and a single
source for the current noise in with
\[
i_n = i_{nRD} + i_{nRB} + i_{nRF} + i_{nA} , \tag{2.60}
\]
as they are connected in parallel. Voltage noise sources can be summed, as they are
connected in series. As we assume that all noise sources have a flat frequency spectrum
(white noise), the resulting noise spectral density at the output is shaped by the noise transfer
function (noise gain).
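As a worked example of eq. 2.59 and eq. 2.60, the thermal noise of a resistor can be computed and summed with other parallel sources. This is a sketch under stated assumptions: the temperature of 300 K and the resistor and amplifier values in the usage part are assumed for illustration, and the sources are added as power spectral densities in A²/Hz, in line with the unit given for eq. 2.59.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def thermal_noise_current_psd(resistance, temperature=300.0):
    """Thermal noise current density 4kT/R of a resistor in A^2/Hz (eq. 2.59)."""
    return 4.0 * BOLTZMANN * temperature / resistance


def total_parallel_noise_psd(resistances, amp_noise_density, temperature=300.0):
    """Sum of the parallel current noise sources (eq. 2.60) in A^2/Hz.

    amp_noise_density is the amplifier current noise in A/sqrt(Hz);
    it is squared before it is added to the resistor contributions.
    """
    total = amp_noise_density ** 2
    for resistance in resistances:
        total += thermal_noise_current_psd(resistance, temperature)
    return total


if __name__ == "__main__":
    # Assumed example: 82 MOhm feedback resistor, 1.3 fA/sqrt(Hz) amplifier noise.
    psd = total_parallel_noise_psd([82e6], 1.3e-15)
    print(f"i_n = {math.sqrt(psd) * 1e15:.2f} fA/sqrt(Hz)")
```

For resistors in the range of tens of MΩ, the thermal contribution is of the order of 10 fA/√Hz and therefore not negligible against the amplifier current noise.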
The noise spectral density is usually expressed in units of nV/√Hz. With regard to the
block diagram of the closed-loop transfer function from fig. 2.13, the current noise source
is amplified by the current-to-voltage transfer function, whereas the voltage noise source is
amplified by the closed-loop voltage gain. As illustrated in fig. 2.19, the transfer function Gin
from the input noise current in to the output noise voltage vOn is given by eq. 2.33, and the
transfer function Gvn from the input noise voltage vn to vOn is given by eq. 2.39.

[Figure 2.19: block diagram with the forward blocks ZD·ZF/(ZD + ZF) and A, the feedback block
ZD/(ZD + ZF), two summing nodes, and the signals i, in, vn, vI, vD, and vO.]
Figure 2.19.: Block diagram of the closed-loop transfer function with additional current noise
source in and voltage noise source vn. Both sources act at the negative input
terminal of the operational amplifier but have different transfer functions.
[Figure 2.20: log-log plot of the noise density / (nV Hz^-1/2) versus frequency / Hz with the
total, voltage, and current noise densities. The frequency fI and the level vn/β are marked.
Annotations: vn·Gvn = vn·(CD/CF + 1); in·Gin = in/(2π·f·CF); vOn = sqrt((vn·Gvn)² + (in·Gin)²).]
Figure 2.20.: The noise spectral density log-log plot for the ideal CSA with the capacitor CF
in the feedback network and the capacitor CD at its input terminal. In the lower
frequency range, the noise density is dominated by the current noise. Above
the frequency fI, the portion of the voltage noise dominates the total noise
density vOn.
For a clear view on the components of the resulting spectral noise density in fig. 2.20, the
calculations are based on the ideal model of the CSA. Fig. 2.20 shows the contribution of the
input current noise, which has a typical 1/f shape and dominates the low frequency range
(referred to as flicker noise). The noise density in the upper frequency range is dominated
by the input voltage noise but remains flat. If the open-loop gain is sufficiently large, the
voltage noise is amplified by the factor 1/β. A more realistic noise spectral density is shown
in fig. 2.21. This illustration emphasizes the impact of the feedback resistor RF and the
gain-bandwidth product fGBP. The density of the flicker noise is limited in its upper value. It has
the shape of a typical low-pass filter, which is determined by the time constant RFCF of the
feedback impedance. The component related to the voltage noise has the spectral density
of a band-pass filter, where the upper-corner frequency fc is limited by the gain-bandwidth
product and the detector capacitance CD. At zero frequency, the voltage noise is limited to
the value of vn.

[Figure 2.21: log-log plot of the noise density / (nV Hz^-1/2) versus frequency / Hz with the
total, voltage, and current noise densities. The frequencies fL1, fL2, fI, and fC and the levels
vn, vn/β, and RF·in are marked. Annotations: fL1 = 1/(2π·RF·(CD + CF)); fL2 = 1/(2π·RF·CF);
fI = sqrt((RF·in)² − vn²)/(vn·2π·RF·(CD + CF)); fC = fGBP/(CD/CF + 1).]
Figure 2.21.: The noise spectral density log-log plot for the CSA with the resistor RF and
capacitor CF in the feedback network and the capacitor CD at its input terminal.
The frequency response of the current noise density has the shape of a low-
pass filter, whereas the voltage noise density has the response of a band-pass
filter. The time constant τ = RFCF determines the low cutoff frequency fL2 of
both responses. The voltage noise density is limited by the corner frequency
fc. Thus, the total noise density is bounded over the whole frequency range.
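The two noise terms of the ideal CSA annotated in fig. 2.20 can be evaluated with a few lines of code. The sketch below uses hypothetical function names and assumed component values; the crossover frequency fI is obtained by equating both terms, which corresponds to the limit of a very large RF in the fI expression of fig. 2.21.

```python
import math


def voltage_noise_term(v_n, c_d, c_f):
    """Output density of the voltage noise, v_n * (C_D / C_F + 1), in V/sqrt(Hz)."""
    return v_n * (c_d / c_f + 1.0)


def current_noise_term(i_n, c_f, frequency):
    """Output density of the current noise, i_n / (2 pi f C_F), in V/sqrt(Hz)."""
    return i_n / (2.0 * math.pi * frequency * c_f)


def total_noise_density(v_n, i_n, c_d, c_f, frequency):
    """Quadratic sum of both terms (v_On in fig. 2.20)."""
    return math.hypot(voltage_noise_term(v_n, c_d, c_f),
                      current_noise_term(i_n, c_f, frequency))


def crossover_frequency(v_n, i_n, c_d, c_f):
    """Frequency f_I at which both terms are equal for the ideal CSA."""
    return i_n / (2.0 * math.pi * v_n * (c_d + c_f))


if __name__ == "__main__":
    # Assumed values: v_n = 4.8 nV/sqrt(Hz), i_n = 14 fA/sqrt(Hz),
    # C_D = 5.3 pF, C_F = 100 fF.
    f_i = crossover_frequency(4.8e-9, 14e-15, 5.3e-12, 100e-15)
    print(f"f_I = {f_i / 1e3:.1f} kHz")
```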
The noise spectral density determines the root-mean-square amplitude of the output noise
(rms noise) over a given bandwidth. As the noise spectral density is bounded over the entire
frequency range, the total rms noise Vrms is calculated by
\[
V_{rms} = \sqrt{\int_0^{\infty} \left[ (v_n G_{vn})^2 + (i_n G_{in})^2 \right] \mathrm{d}f} \tag{2.61}
\]
Any additional signal processing with filters (pulse shapers) must be adapted
to the noise spectral density in order to achieve optimal results regarding the SNR.
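The integral of eq. 2.61 can be approximated numerically for any bounded output noise density. The following sketch integrates on a logarithmic frequency grid with the trapezoidal rule. It does not reproduce the transfer functions of eq. 2.33 and eq. 2.39; a single-pole low-pass with assumed values stands in for the output density, because its rms value has the known closed form e_n·sqrt(π/2·fc).

```python
import math


def rms_noise(density, f_low, f_high, points=4000):
    """Square root of the integral of density(f)**2 from f_low to f_high.

    density: function returning the output noise density in V/sqrt(Hz).
    The integration uses the substitution df = f d(ln f) and the
    trapezoidal rule on a logarithmic grid.
    """
    step = math.log(f_high / f_low) / (points - 1)
    total = 0.0
    prev_val = density(f_low) ** 2 * f_low
    for k in range(1, points):
        f = f_low * math.exp(k * step)
        val = density(f) ** 2 * f
        total += 0.5 * (prev_val + val) * step
        prev_val = val
    return math.sqrt(total)


def single_pole_density(e_n, f_c):
    """White noise density e_n shaped by a single-pole low-pass (stand-in model)."""
    return lambda f: e_n / math.sqrt(1.0 + (f / f_c) ** 2)


if __name__ == "__main__":
    # Assumed values: 270 nV/sqrt(Hz) plateau, 30 MHz corner frequency.
    density = single_pole_density(270e-9, 30e6)
    print(f"V_rms = {rms_noise(density, 1.0, 1e12) * 1e3:.3f} mV")
```

The finite integration limits are an approximation of the range from zero to infinity; they have to be chosen wide enough that the neglected parts of the spectrum do not matter.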
2.4. Implementation and Test
Table 2.1.: A list of suitable commercial off-the-shelf operational amplifiers from different
vendors. They are compared in terms of our selection criteria: gain-bandwidth product
(fGBP), zero-frequency open-loop gain (AOL) and input impedance (ZI). vn is
the equivalent voltage noise source and in the current noise source. All values are
extracted from the datasheets (Analog Devices, 2013; Linear Technology, 2014,
2015; Texas Instruments, 2015).

Vendor            Product       fGBP / MHz   AOL / dB   ZI / (Ω || pF)   vn / (nV/√Hz)   in / (fA/√Hz)
Analog Devices    ADA4817       1050         65         0.5 T || 1.4     4               2.5
Linear Techn.     LTC6268       500          108        0.5 T || 0.55    4.3             5.5
Linear Techn.     LTC6268-10    4000         108        0.5 T || 0.55    4.0             7.0
Texas Instrum.    OPA657        1600         70         0.5 T || 5.2     4.8             1.3

A major design guideline was the reduction of the total amount of components per readout
channel. This could be achieved using a single operational amplifier to build the CSA without
additional gain stages. Nevertheless, this circuit covers the desired measurement range and
does not need any pulse shapers for its basic operation. In order to fulfill the requirements,
the detector system must process energies up to 7 MeV and should be compatible with a
digitizer system with an input voltage range of 2 V. According to eq. 2.10 and the steady-state
charge-to-voltage gain of the CSA from eq. 2.52, the feedback capacitance CF should
be at least 120 fF. A higher gain is acceptable, since an attenuation of the pulse amplitude
can be made with less effort. Since the feedback capacitance and the electrical characteris-
tics of the detector are fixed design parameters, the operational amplifier and the feedback
resistance RF are the remaining components for an optimization of the readout circuit. As
a first step, the drift time tD of the moving charge must be taken into account to choose an
appropriate value of the time constant τ of the CSA. With regard to the induced currents
through the anode and cathode shown in fig. 2.5, we expect rise times up to 450 ns for the
cathode and up to 200 ns for the anode. Because the anode signals carry the information
used for spectroscopy, the ballistic deficit from eq. 2.18 should be minimized. A value of
1 % is appropriate. Therefore, the ratio τ/tD must be greater than 50, which corresponds to
a value of 10 µs for the time constant τ or 83 MΩ for the feedback resistor RF (the nearest
matching part has 82 MΩ). With a loose constraint of 5 % (τ/tD = 9.66) for the ballistic deficit
on the cathode signal, we selected a feedback resistor of 47 MΩ for that readout channel.
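The sizing of CF and RF can be retraced with a short calculation. This is a sketch under assumptions: the charge is taken as Q = E/Ei·e with a pair creation energy of Ei = 4.64 eV (the value quoted in table 2.3; eq. 2.10 is assumed to have this form), and the amplitude loss is modeled for a linear charge ramp of duration tD into a CSA with decay constant τ, which is assumed to correspond to eq. 2.18.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C


def generated_charge(energy_ev, pair_energy_ev=4.64):
    """Charge in C generated by a deposited energy (assumed Q = E / E_i * e)."""
    return energy_ev / pair_energy_ev * ELEMENTARY_CHARGE


def min_feedback_capacitance(max_energy_ev, v_range, pair_energy_ev=4.64):
    """Smallest C_F in F that keeps the ideal amplitude Q / C_F within v_range."""
    return generated_charge(max_energy_ev, pair_energy_ev) / v_range


def amplitude_fraction(tau_over_td):
    """Peak amplitude fraction for a linear charge ramp (assumed model)."""
    return tau_over_td * (1.0 - math.exp(-1.0 / tau_over_td))


def feedback_resistance(tau, c_f):
    """Feedback resistor in Ohm for a given time constant tau = R_F * C_F."""
    return tau / c_f


if __name__ == "__main__":
    c_f = min_feedback_capacitance(7e6, 2.0)
    print(f"C_F >= {c_f * 1e15:.1f} fF")
    for ratio in (50.0, 9.66):
        deficit = 100.0 * (1.0 - amplitude_fraction(ratio))
        print(f"tau/t_D = {ratio:5.2f}: ballistic deficit = {deficit:.2f} %")
    print(f"R_F = {feedback_resistance(10e-6, 120e-15) / 1e6:.1f} MOhm")
```

Under this model, the ratios τ/tD of 50 and 9.66 give deficits close to the 1 % and 5 % constraints named in the text.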
For the implementation of the CSA with a COTS operational amplifier, we establish four im-
portant features for the parametric selection. First, the operational amplifier must have a
large input impedance, which is achieved by operational amplifiers with a field-effect tran-
sistor at their input terminals. Therefore, the voltage noise gain of the feedback network
(1/β) is reduced. Second, it must have a high open-loop voltage gain, so that the effective
input impedance regarding eq. 2.44 is minimized. The third feature is that the equivalent
voltage and current noise must be very low, and the fourth is that the operational amplifier
must have a sufficiently large gain-bandwidth product to satisfy the requirements of rise time.
In a pulsed beam application, the achievable timing of the detector signals is important, and
must be further investigated. For an advanced analysis, the rise time should not be limited
and therefore the gain-bandwidth product should be as large as possible. In table 2.1, we list
some COTS operational amplifiers from different vendors, which match the selection criteria.
All operational amplifiers fulfill the requirements of a high input impedance and a relatively
large gain-bandwidth product. For the first tests, we choose the OPA657 because of its
larger open-loop gain and bandwidth in comparison to the ADA4817. For the second part,
we choose the LTC6268-10 because of its outstanding parameters and very low parasitic
capacitance. Both operational amplifiers are available in an almost pin-compatible package.
Consequently, the evaluation procedure can be done with the same hardware. After a first
attempt with the LTC6268, which runs on our OPA657 PCB layout, we decided to populate
the hardware with the LTC6268-10. Unfortunately, this circuit tends to oscillate, as also ob-
served in (Brisebois, 2015). Thus the implementation of the CSA with the OPA657 shows
the best performance for the first prototype (see fig. 2.22).
Figure 2.22.: The detector assembly with unpopulated readout boards (left) and a close-up
of a populated printed circuit board with 17 CSAs (right).
Despite the higher parasitic input capacitance, a great advantage of the OPA657 over the
LTC6268 is its wide supply voltage range of 12 V. This provides a larger headroom for pulse
pile-ups until the output voltage of the amplifier saturates. Thus, this amplifier is best-suited
for high count rates with high energies. Nevertheless, the count rate capability mainly de-
pends on the signal processing system, which is limited by the input voltage range of the
digitizer and the digital pulse processing system (Abbene and Gerardi, 2015). Moreover,
any rate limitations must be determined by field experiments. The readout board contains
17 CSAs, where one channel is equipped for the readout of the cathode. All necessary
functionality, including low voltage power supply, high voltage filters, biasing, and decoupling
capacitors, is included on the PCB. Each CSA has a passive low-pass filter at its output.
The bandwidth is limited to 21.654 MHz (16.15 ns rise time), which is sufficient for the CZT
detector signals. One channel of the readout board is used for the evaluation of the CSA
and contains a test input circuit. This circuit consists of a termination resistor and a series
capacitor of 0.3 pF±0.05 pF for the charge injection in accordance with (Knoll, 2010). The
theoretical performance parameters of the CSA are listed in table 2.2. These values are
estimates and will be verified with experimental results.
2.5. Results
The readout board is evaluated with the test pulse input and with the detector shown in
fig. 2.22. The test pulse input is sourced by a signal generator with a step voltage input.
According to the current-voltage relation of a capacitor, the step voltage applied to the test
input capacitor generates a current flowing into the CSA. With a varying shape of the voltage
signal, arbitrary detector signals can be synthesized.

Table 2.2.: Numerical estimation of electrical characteristics of different readout channels.
The calculations refer to a CSA based on the OPA657. The value of equivalent
noise charge (ENC) is given in units of the elementary charge e.

Parameter                           Cathode            Anode (Pixel)      Test input
RF || CF                            47 MΩ || 100 fF    82 MΩ || 100 fF    82 MΩ || 100 fF
CD/CF                               119                53.05              55
Ballistic deficit                   95.36 %            98.79 %            99.9 %
Charge fraction                     96.34 %            98.32 %            98.26 %
Steady-state gain                   9.19 V/pC          9.71 V/pC          9.82 V/pC
Cutoff frequency                    13.84 MHz          30.11 MHz          29.08 MHz
Rise time                           25.27 ns           11.61 ns           12.03 ns
Noise level (rms)                   2.62 mV            1.82 mV            1.80 mV
Noise level (rms), BW limited       2.01 mV            1.26 mV            1.21 mV
ENC (rms), BW limited               1365 e             810 e              769 e
Peak amplitude at 511 keV in CZT    162.11 mV          171.38 mV          173.33 mV
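The ENC and peak amplitude rows of table 2.2 are linked to the noise level and the steady-state gain by simple relations. The sketch below assumes ENC = Vrms/(G·e) and a peak amplitude of E/Ei·e·G with Ei = 4.64 eV (the value quoted in table 2.3). The function names are hypothetical.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def equivalent_noise_charge(v_rms, gain_v_per_c):
    """ENC in units of the elementary charge for an rms noise voltage in V
    and a charge-to-voltage gain in V/C (assumed relation)."""
    return v_rms / gain_v_per_c / ELEMENTARY_CHARGE


def peak_amplitude(energy_ev, gain_v_per_c, pair_energy_ev=4.64):
    """Expected pulse height in V for a deposited energy in eV."""
    return energy_ev / pair_energy_ev * ELEMENTARY_CHARGE * gain_v_per_c


if __name__ == "__main__":
    # Anode channel values from table 2.2: 1.26 mV (BW limited), 9.71 V/pC.
    gain = 9.71e12  # V/C
    print(f"ENC  = {equivalent_noise_charge(1.26e-3, gain):.0f} e")
    print(f"Peak = {peak_amplitude(511e3, gain) * 1e3:.1f} mV at 511 keV")
```

Small differences to the tabulated peak amplitudes are expected, because the gains in the table are rounded to two decimal places.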
2.5.1. Test pulse input
The most important feature of front-end electronics is the noise performance. There are
various methods to analyze the noise of a linear time-invariant (LTI) system like the CSA. A
common method is described in the IEEE Std 1241-2010, referred to as “Sine-wave testing
and fitting” (IEEE, 2011). It is known that the response of an LTI system to a pure sine-
wave is a sine-wave with the same frequency, but potentially different amplitude and phase.
Tests with a sine-wave have the advantage that the waveform can be generated very accu-
rately and the interpretation is done by standardized instruments and tools. If a sine-wave
is applied to the input, the output depends on the transfer function of the system and is
superimposed with noise. The input sine-wave is derived a posteriori by a four-parameter
sine-wave fit, and the residual is the noise level, as described in (IEEE, 2011). For this pur-
pose, the waveform is captured by our digitizer with 100 MSPS and 14 bit resolution. The
digitizer board provides an SNR of 71.3 dB at around -1 dB of full scale input (2.3 Vpp). This
corresponds to an rms noise level of 201 µV. These values are measured at an input
frequency of 2 MHz. With the same setup, the sine-wave test with the test input of the CSA
results in an rms noise level of 1.24 mV. The result is shown in fig. 2.23. This is in accordance
with the predicted values in table 2.2 for the test input.
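The principle of the sine-wave test can be sketched in a few lines. The code below is not the implementation used for the measurement. It is a plain-Python illustration of an iterative four-parameter least-squares sine fit (amplitude as two quadrature terms, offset, and frequency), in the spirit of the procedure in (IEEE, 2011). The frequency is given in rad per sample, and the initial guess is assumed to be close to the true frequency, e.g. the nominal generator setting.

```python
import math
import random


def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(columns, y):
    """Least-squares coefficients of the given basis columns (normal equations)."""
    k = len(columns)
    ata = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    aty = [sum(p * v for p, v in zip(columns[i], y)) for i in range(k)]
    return solve_linear(ata, aty)


def sine_fit_4param(y, w_guess, iterations=8):
    """Fit y[n] = a*cos(w n) + b*sin(w n) + c and refine w (rad/sample)."""
    idx = range(len(y))
    w = w_guess
    ones = [1.0] * len(y)
    cos_c = [math.cos(w * n) for n in idx]
    sin_c = [math.sin(w * n) for n in idx]
    a, b, c = least_squares([cos_c, sin_c, ones], y)  # start: fixed frequency
    for _ in range(iterations):
        cos_c = [math.cos(w * n) for n in idx]
        sin_c = [math.sin(w * n) for n in idx]
        # Derivative of the model with respect to w (linearized frequency term).
        slope = [n * (-a * s + b * co) for n, s, co in zip(idx, sin_c, cos_c)]
        a, b, c, dw = least_squares([cos_c, sin_c, ones, slope], y)
        w += dw
    model = [a * math.cos(w * n) + b * math.sin(w * n) + c for n in idx]
    return model, w


def residual_rms(y, model):
    """Root-mean-square value of the fit residual (the noise estimate)."""
    return math.sqrt(sum((v - m) ** 2 for v, m in zip(y, model)) / len(y))


if __name__ == "__main__":
    # Synthetic record (assumed values): 2 MHz sine sampled at 100 MSPS with
    # Gaussian noise of 1.24 mV rms and a small frequency offset of 150 Hz.
    random.seed(1)
    w_true = 2.0 * math.pi * (2.0e6 + 150.0) / 100e6
    data = [math.sin(w_true * n + 0.3) + 0.01 + random.gauss(0.0, 1.24e-3)
            for n in range(4096)]
    fit, w_fit = sine_fit_4param(data, 2.0 * math.pi * 2.0e6 / 100e6)
    print(f"residual rms = {residual_rms(data, fit) * 1e3:.3f} mV")
```

The rms value of the residual is the quantity that is compared with the predicted noise level. Harmonic distortion would have to be removed separately, which this sketch does not attempt.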
Moreover, other tests were made to validate the pulse shape at the output of the CSA. The
signal generator was set up to generate a step input with a rise time of approximately 8 ns.
Fig. 2.24 shows the changes in output signals for a varying amplitude AT of the test pulse
input. The fit of the peak heights to the input amplitude shows an excellent linearity, with a
maximum deviation of 2 mV. The linear fit results in a gain of 3.05, which correlates with the
ratio of test input capacitance over feedback capacitance.
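The measured gain can be compared with the expected ratio of the capacitances. In this sketch, the voltage gain of the test input is assumed to be CT/CF multiplied by the charge fraction of the channel (98.26 % for the test input in table 2.2), and the tolerance of the test capacitor is propagated to the gain. The function name is hypothetical.

```python
def pulse_voltage_gain(c_test, c_f, charge_fraction=1.0):
    """Ratio of the output step to the test pulse step: (C_T / C_F) * fraction."""
    return c_test / c_f * charge_fraction


if __name__ == "__main__":
    c_f = 100e-15       # feedback capacitance in F
    fraction = 0.9826   # charge fraction of the test input (table 2.2)
    for c_test in (0.25e-12, 0.30e-12, 0.35e-12):  # 0.3 pF +/- 0.05 pF
        gain = pulse_voltage_gain(c_test, c_f, fraction)
        print(f"C_T = {c_test * 1e12:.2f} pF: gain = {gain:.2f}")
```

The measured gain of 3.05 lies within the range spanned by the tolerance of the test capacitor.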
[Figure 2.23: left panel, ∆Sinefit / mV (-4 to 4) versus time mod T / ns (0 to 500); right panel,
normalized counts / a.u. (0 to 1) versus ∆Sinefit / mV (-3 to 4), annotated with µ = 3.357 µV,
σ = 1.238 mV, and N = 16k.]
Figure 2.23.: Noise measurement of the CSA with a sine-wave test. A 2 MHz sine-wave
was applied to the test input and the difference between the output signal and
a four-parameter sine-wave fit (∆Sinefit) is plotted against the period T of the
sine-wave (left). All harmonic distortions were eliminated by the test procedure,
revealing the noise level (standard deviation σ of ∆Sinefit, right). Over N=16k
samples were used.
[Figure 2.24: left panel, output / mV (0 to 300) versus time / µs (0 to 40) for AT = 7 mV,
AT = 55 mV, and AT = 100 mV; right panel, pulse height / V (0 to 2) versus test pulse amplitude
AT / V (0 to 0.57) with the measurement and the linear fit.]
Figure 2.24.: A measurement with different rectangular pulses with amplitudes AT injected to
the test input capacitance. The output signal was recorded with the 100 MSPS
digitizer (left). Each pulse height is the rms value of 1k events (right). The
linearity error is in the range from -2 mV to 1.5 mV.
The test pulse input was also used to evaluate the timing capabilities of the CSA. As before,
the signals were captured with the digitizer and processed offline. The test pulses were
synchronized to a known timing reference signal (sine-wave signal). The timing performance
was obtained from the difference between the zero crossing point of the sine-wave reference
and a digital constant fraction trigger (fraction = 0.2) on the output signal of the CSA. Both
timestamps were calculated by software. The results of the timing measurement are shown
in fig. 2.25. At high signal amplitudes, the timing performance is in the range of several
picoseconds (32 ps standard deviation). This value increases with lower signal amplitudes,
because the SNR decreases as well. This rough estimate of timing performance shows that
the CSA does not limit the timing performance of the CZT detector, which is in the range of
some nanoseconds (Meng and He, 2005).
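A digital constant fraction trigger can be realized in several ways. The following sketch shows one simple variant and is not necessarily the one used for the measurement: the threshold is set to a fraction of the pulse maximum, and the crossing time is refined by linear interpolation between two samples. A baseline of zero is assumed, and the function name is hypothetical.

```python
def constant_fraction_time(samples, fraction=0.2, sample_period=1.0):
    """Time at which the rising edge crosses fraction * peak.

    samples: baseline-corrected waveform, sample_period: time per sample.
    The crossing is interpolated linearly between the two adjacent samples.
    """
    threshold = fraction * max(samples)
    for k in range(1, len(samples)):
        if samples[k - 1] < threshold <= samples[k]:
            part = (threshold - samples[k - 1]) / (samples[k] - samples[k - 1])
            return (k - 1 + part) * sample_period
    raise ValueError("no threshold crossing found")


if __name__ == "__main__":
    # Synthetic pulse with a linear rising edge, sampled at 100 MSPS (10 ns).
    pulse = [0.0] * 5 + [i / 10 for i in range(1, 11)] + [1.0] * 5
    t = constant_fraction_time(pulse, fraction=0.25, sample_period=10e-9)
    print(f"trigger time = {t * 1e9:.1f} ns")
```

The interpolation is what allows a time resolution far below the sampling period of 10 ns, provided that the SNR of the rising edge is high enough.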
[Figure 2.25: two scatter plots of the pulse height / mV versus ∆t / ps with a color scale from
0.1 to 0.9; left panel, pulse heights from 291 mV to 297 mV and ∆t from -218 ps to 342 ps; right
panel, pulse heights from 2221 mV to 2228 mV and ∆t from -72 ps to 72 ps.]
Figure 2.25.: Results of about 1k timing measurements with the 100 MSPS digitizer. The
pulse height is the maximum value of the output from the CSA. The time ∆t
is the difference between the pulse trigger and the timing reference from the
signal generator. The standard deviations of the measurements are σx = 131 ps
and σy = 1.15 mV for the left plot and σx = 32 ps and σy = 1.21 mV for the right
plot. The colormap is in logarithmic scale.

2.5.2. Pixel detector
The performance of the CSA is finally evaluated with the Redlen pixel detector and the
hardware shown in fig. 2.22. The detector is biased at -600 V and the signals are captured
with the digitizer. The waveforms of a single pixel and the cathode are stored for an offline
analysis. The first test should measure the energy on both electrodes dependent on the
DOI. As shown in fig. 2.4, the relationship between the DOI and the weighting potentials
should be experimentally verified with data from the detector. The DOI is measured either
by the ratio of cathode-over-anode energy (Li et al., 2001) or by a direct measurement of the
drift time of the moving charge. The drift time of charge correlates with the rise time of the
output signal from the CSA (Verger et al., 2007). For our analysis, we calculate the rise time
as the time difference between two thresholds of the rising edge of the signal. On both the
cathode and the anode signal, we set the thresholds to 10 % and 90 % of the peak amplitude.
The data was collected with a 22Na radioactive source and an arbitrarily selected pixel of the
2×2 center array. The anode was used to trigger an event while the cathode signal was
captured at the same time. This measurement shows the expected relationship of the DOI
and the measured cathode and anode energy (fig. 2.26). The peak amplitude of the 511 keV
photopeak of the cathode decreases linearly in correlation with the rise time. Events near
the cathode cause the longest rise times. The linear shape matches the expected weighting
potential of the cathode. The pulse height of the anode signal also decreases dependent
on the calculated rise time. The DOI is clearly visible along the predicted shape of the
weighting potential, as shown in fig. 2.4. The expected peak amplitudes for the photopeak
are in accordance with the calculated values in table 2.2. A summary of the measured key
metrics for the CSA is given in table 2.3.
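The rise time between two thresholds of the rising edge can be computed as sketched below. This is a minimal illustration with hypothetical function names. It assumes a baseline-corrected waveform and refines both threshold crossings by linear interpolation.

```python
def crossing_time(samples, level, sample_period):
    """Interpolated time of the first upward crossing of level."""
    for k in range(1, len(samples)):
        if samples[k - 1] < level <= samples[k]:
            part = (level - samples[k - 1]) / (samples[k] - samples[k - 1])
            return (k - 1 + part) * sample_period
    raise ValueError("no crossing of the requested level found")


def rise_time(samples, sample_period, low=0.1, high=0.9):
    """Time between the low and high fraction of the peak amplitude."""
    peak = max(samples)
    t_low = crossing_time(samples, low * peak, sample_period)
    t_high = crossing_time(samples, high * peak, sample_period)
    return t_high - t_low


if __name__ == "__main__":
    # Synthetic pulse: linear edge over 20 samples at 100 MSPS (10 ns period).
    pulse = [0.0] * 4 + [i / 20 for i in range(21)] + [1.0] * 4
    print(f"rise time = {rise_time(pulse, 10e-9) * 1e9:.0f} ns")
```

For real anode signals, the low threshold is the critical one, because the slow initial component of the signal makes the crossing sensitive to noise.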
Figure 2.26.: Measurement of a 22Na radioactive source with the Redlen CZT detector and
the CSA readout board. The peak amplitudes of the output pulses from the
CSA are plotted against the 10 % - 90 % rise times. The weighting potentials
of the cathode (left) and an anode (right) are clearly visible along the 511 keV
photopeak (158 mV for the cathode and 182 mV for the anode at rise times of
410 ns and 200 ns).
Figure 2.27.: Correlation of the cathode-over-anode ratio (CAR) with the measured rise
times of the cathode pulses from the 22Na radioactive source measurement
(left). The rise time has a strong correlation to the CAR and therefore to the
depth of interaction (DOI) in the detector volume. The correlation of the anode
rise time to the DOI is smeared out (right).

The correlation between the rise time of the cathode signal and the cathode-over-anode
ratio (CAR) is shown in fig. 2.27. Both values carry information about the DOI. In contrast,
the correlation between the cathode and anode rise time is smeared out. This is caused by
uncertainties in the crossing of the low-level threshold, because of the slow rising component
of the anode signal at the beginning of charge movement. The low-level trigger is difficult to
hit exactly without additional effort. However, the measurements show that the DOI can be
calculated by the rise time of the cathode signal or the ratio of cathode-over-anode energy.
In conclusion, the measurement of the cathode rise time is more precise than the anode
rise time, because the cathode signal rises with a nearly linear slope. An advantage of the
rise time estimation is that the information is derived from one detector signal instead of
two, as required by the CAR calculation. Additionally, this calculation is error prone due to
charge-sharing events on the pixelated anode side.
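As a rough illustration of the CAR method, the ratio of the two amplitudes can be mapped to a depth. The sketch below rests on strong simplifications that are assumptions here: a linear cathode weighting potential, an anode amplitude that does not depend on the depth, amplitudes that are already corrected for the different channel gains, and a detector thickness of 5 mm. The function names are hypothetical.

```python
def cathode_over_anode_ratio(cathode_amplitude, anode_amplitude):
    """CAR of one event from gain-corrected cathode and anode amplitudes."""
    if anode_amplitude <= 0.0:
        raise ValueError("anode amplitude must be positive")
    return cathode_amplitude / anode_amplitude


def depth_from_car(car, thickness_mm=5.0):
    """Estimated distance of the interaction from the anode plane in mm.

    Assumes a linear cathode weighting potential; the CAR is clipped
    to the physically meaningful range from 0 to 1.
    """
    return min(max(car, 0.0), 1.0) * thickness_mm


if __name__ == "__main__":
    car = cathode_over_anode_ratio(158.0, 182.0)  # example amplitudes in mV
    print(f"CAR = {car:.3f}, depth = {depth_from_car(car):.2f} mm from the anode")
```

Events with shared charge between neighboring pixels violate the assumption of a depth-independent anode amplitude, which is one reason why the CAR calculation is error-prone.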
Table 2.3.: Measured parameters of the front-end electronics. All channels are bandwidth-
limited to about 21.654 MHz by a passive RC low-pass filter. If appropriate, the
standard deviation σ is noted for the measured value.

Parameter                    Measured value           Channel      Comment
Rise time 10 % - 90 %        14.86 ns, σ = 371 ps     Test input   Input pulse: 3.0 ns fall time, Vout: 2 Vpp
Noise level (rms)            1.24 mV                  Test input   Sine-wave test
τ1 (82 MΩ || 100 fF)         8.091 µs, σ = 86 ns      Cathode      Waveform fit: f(t) = e^(−t/τ1)
τ2 (47 MΩ || 100 fF)         4.964 µs, σ = 181 ns     Anode        Waveform fit: f(t) = e^(−t/τ2)
Peak amplitude at 511 keV    158 mV                   Cathode      Pulse height at longest drift time
Peak amplitude at 511 keV    182 mV                   Anode        Pulse height at longest drift time
Steady-state gain            8.95 V/pC                Cathode      at 511 keV with Ẽi = 4.64 eV
Steady-state gain            10.31 V/pC               Anode        at 511 keV with Ẽi = 4.64 eV

It is evident that the anode energy has to be corrected dependent on the DOI. There are
several approaches for the depth correction, which are all carried out empirically. The results
are achieved with best-fit functions, based on e.g. polynomial (Dönmez et al., 2007) or
exponential (Hong et al., 2004; Cho and Lee, 2016) equations. Our approach is based on the
weighting potential of a pixel, as this is the cause for the incorrect measurement. For depth
correction, we select the events of the 511 keV photopeak along its cluster with reduced peak
height and rise time, as shown in fig. 2.28. A fit of the weighting potential according to eq. 2.6
matches the data points. Thus, the derived mathematical relationship is used to correct the
anode energy, which is shown in fig. 2.28. We also evaluated polynomial equations for the
fitting function, resulting in comparable results with a fourth-degree polynomial (Scharnagl,
2015). The presented measurements were processed without any additional pulse shapers.
This results in an energy resolution of 4.3 % (FWHM) for the 511 keV photopeak. With the
use of additional digital pulse shapers, these results can be improved.
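The relation between the standard deviation of the photopeak and the quoted FWHM resolution assumes a Gaussian peak shape, for which FWHM = 2·sqrt(2·ln 2)·σ. A short sketch with hypothetical function names:

```python
import math

# Conversion factor between standard deviation and FWHM of a Gaussian peak.
FWHM_FACTOR = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fwhm_from_sigma(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return FWHM_FACTOR * sigma


def relative_resolution(sigma, peak_position):
    """Relative FWHM resolution of a peak (dimensionless)."""
    return fwhm_from_sigma(sigma) / peak_position


if __name__ == "__main__":
    sigma_kev, peak_kev = 9.5, 511.0
    print(f"FWHM = {fwhm_from_sigma(sigma_kev):.1f} keV, "
          f"resolution = {100.0 * relative_resolution(sigma_kev, peak_kev):.1f} %")
```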
[Figure 2.28: left panel, rise time of the cathode / ns (0 to 450) versus anode photopeak
amplitude / mV (130 to 190); right panel, 22Na spectrum, counts (logarithmic, 10^0 to 10^2) versus
anode energy / keV (0 to 1275), annotated with µ = 511 keV and σ = 9.5 keV.]
Figure 2.28.: The 511 keV photopeak along the signal rise time at the cathode with the cor-
responding data points used for the fit of the weighting potential (left). The
corrected spectrum of the anode energy is independent of the depth of inter-
action (right). The selected pixel has an energy resolution of 9.5 keV (standard
deviation) for the photopeak, which corresponds to 4.3 % FWHM. The spec-
trum is recorded without any additional pulse shapers.

2.6. Conclusion
This chapter has presented a circuit design and implementation of analog front-end elec-
tronics for a CZT pixel detector. Starting from the electrical equivalent circuit of the CZT,
its electrical characteristics were discussed and a connection scheme presented. A short
summary of the signal formation in a CZT pixel detector was provided, and the weighting
potentials of the electrodes shown. Finally, these equations were applied for an analysis of
the detector signals. After a comparison of different readout circuits, we focused our investi-
gation on the CSA for the readout of the detector signals. As an ASIC-based solution is not
available for the application with high γ-ray energies and high count rates, we designed the
front-end electronics for the 8×8 pixel detector with COTS operational amplifiers.
We have shown a detailed analysis of the CSA in conjunction with the electrical model of
the CZT detector. The limits of the design in terms of gain, bandwidth, and noise were given
with exact equations and numerical values. The performance of the readout electronics was
measured with synthesized detector signals from a test pulser and with a pixel detector of
size 20×20×5 mm³ from Redlen. The measurements with the test pulse showed that the
rms noise level of 1.24 mV is below the nominal intrinsic resolution of about 8 keV (FWHM at
122 keV) of the detector. Furthermore, we have shown that the SNR is also sufficient for a
timing far below 1 ns, which outstrips the expected CZT performance. All results have been
achieved without an additional pulse shaper. We also investigated the performance of the
CSAs with the detector from Redlen, and were able to verify that the claimed depth depen-
dence is in accordance with the calculated weighting potential of a pixel. We presented a
measurement of a 22Na radioactive source and showed a correction of the measured en-
ergy based on the DOI. After depth correction, we obtained an energy resolution of about
22 keV (4.3 % FWHM at 511 keV). Finally, the front-end electronics fulfill our requirements
and operate from several keV up to 7 MeV with sub-nanosecond timing capabilities.
3. Digital signal processing
Many relationships in technical systems can be described with a sufficient accuracy by linear
relations. For a time-limited observation, the parameters of the system can be regarded as
constant. Such systems are referred to as linear time-invariant (LTI) systems. An LTI system
has at least one input signal x(t) and one output signal y(t). The output signal is derived by
a convolution of the input signal and the impulse response h(t) of the LTI system:
\[
y(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(\tau) \, h(t - \tau) \, \mathrm{d}\tau , \tag{3.1}
\]
where the impulse response h(t) is the output of the system with the Dirac delta distribution
δ(t) at its input. In general, a signal is the assignment of a function value to a time. Discrete-
time signals are derived by sampling a continuous-time signal, where the sampling period
T is constant. To distinguish between continuous-time signals and discrete-time signals,
discrete-time signals are denoted as x [n], where nT = t and x [n] = x(nT ) = x(t). Thus, the
convolution formula from eq. 3.1 in discrete-time domain is given by (Proakis, 1995):
\[
y[n] = x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k] \, h[n - k] . \tag{3.2}
\]
A large and powerful collection of mathematical techniques exists for calculating and analyzing
the characteristics of (discrete-time) LTI systems (Oppenheim and Schafer, 2007; Hoffmann,
2013). The well-known methods will be utilized and applied in the following sections of this
chapter.
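For sequences of finite length, the sum in eq. 3.2 reduces to a finite double loop. The following minimal sketch implements this direct form; samples outside the given lists are treated as zero, and the function name is hypothetical.

```python
def convolve(x, h):
    """Discrete-time convolution y[n] = sum_k x[k] * h[n - k] (eq. 3.2)
    for finite sequences that are zero outside their index range."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, x_k in enumerate(x):
        for m, h_m in enumerate(h):
            y[k + m] += x_k * h_m
    return y


if __name__ == "__main__":
    # A two-tap moving sum applied to a short sequence.
    print(convolve([1.0, 2.0, 3.0], [1.0, 1.0]))
```

Convolving with a unit impulse returns the sequence itself, which is the discrete-time counterpart of the impulse response definition given above.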
3.1. Unfolding-synthesis technique
During the period of this work, V.T. Jordanov described “a technique that allows the syn-
thesis of virtually any pulse shape, either exactly or as a close approximation” (Jordanov,
2016). This permits an easy adaptation of signal processing to the requirements of the application,
e.g. optimized energy resolution, throughput, or timing performance. The approach
developed in this thesis is very similar to that of Jordanov (2016), which is illustrated in fig. 3.1.
If the signal of the detector is considered to have the shape of a short pulse, the output x(t)
is the result of a convolution of the transfer function of the signal conditioning electronics
and the Dirac delta distribution δ(t). After digitization of x(t), the discrete-time signal x[n] is
convolved with the inverse transfer function h^{-1}[n] of the signal conditioning electronics to
remove the frequency response of the front-end electronics. As this step reverses the initial
convolution, it is referred to as deconvolution or unfolding. In general, this method can be
applied to any measurement system whose dynamic properties are known (Hessling,
2008). With our approach (Födisch et al., 2016c), the unfolding system returns a step-like
function xS[n], which can simply be considered as the integral of the detector pulse. Finally,
for synthesizing an arbitrary pulse shape at the output y[n], merely the step response of the
synthesizing system hS[n] must be designed in the time or frequency domain. An illustration of
the signal processing chain is sketched in fig. 3.2.
[Figure 3.1 (block diagram): Detector, δ(t) → Signal conditioning electronics, h(t) → x(t) → Fast ADC → x[n] → Unfolding system, h^{-1}[n] → xS[n] → Synthesizing system, hS[n] → y[n]]
Figure 3.1.: Functional block diagram of a system using the unfolding-synthesis technique,
proposed by V.T. Jordanov. The diagram is adopted from (Jordanov, 2016).
[Figure 3.2: four panels showing δ(t), x[n], xS[n], and y[n]; horizontal axes: time t or nT / a.u.; vertical axes: samples / a.u., 0 to 1]
Figure 3.2.: Steps of signal processing with unfolding-synthesis technique. The Dirac delta
distribution δ(t) (top left) is convoluted with the signal conditioning electronics
and digitized by the ADC (signal x [n], top right). The unfolding system gen-
erates a step-like output xS [n] (bottom left), which is ultimately transformed to
an arbitrary pulse shape y [n] by the synthesizing system (bottom right). For
illustration, a cusp shape (Kowalski, 1970) is shown.
The implementation of a deconvolution system is ideally abstracted by an infinite impulse
response (IIR) filter as shown in sec. 3.2 and (Födisch et al., 2016c), whereas the synthesizing
system is predestined for an implementation with a finite impulse response (FIR) filter,
as shown in sec. 3.3 and (Födisch et al., 2016d). The coefficients hFIR of the FIR filter for an
arbitrary discrete-time pulse shape hp[n] are derived by
hFIR = hd ∗ hp , (3.3)
where hd is an approximation of the first derivative in discrete-time domain, e.g.
y ′[n] = y [n]− y [n − 1] , (3.4)
which corresponds to the filter coefficients hd = (1,−1).
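Eqs. (3.3) and (3.4) can be illustrated with a minimal Python sketch. It assumes an idealized unfolded signal xS[n] (a clean step) and a hypothetical triangular target shape hp[n]; the names `convolve`, `h_p`, `h_d`, `h_fir`, and `x_s` are chosen for this illustration only and do not stem from the original implementation.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences, cf. eq. (3.2)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


# Hypothetical target pulse shape h_p[n]: a short symmetric triangle.
h_p = [0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
h_d = [1.0, -1.0]             # first-difference approximation, eq. (3.4)
h_fir = convolve(h_d, h_p)    # FIR coefficients of the synthesizing system, eq. (3.3)

# Idealized unfolded signal x_S[n]: a step of height 2 starting at n = 5.
step_start, step_height = 5, 2.0
x_s = [0.0] * step_start + [step_height] * 20

# The synthesizing FIR filter turns the step into the scaled pulse shape.
y = convolve(x_s, h_fir)[:len(x_s)]
```

In this sketch, the difference filter hd turns the step into a single impulse, so the output reproduces hp[n] scaled by the step height and delayed to the step position.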
3.2. Digital deconvolution
In the application of semiconductor detectors, the CSA is widely used in front-end electron-
ics. The output signal is shaped by a typical exponential decay. Depending on the feedback
network, this type of front-end electronics suffers from the ballistic deficit problem, or an
increased rate of pulse pile-ups. Moreover, spectroscopy applications require a correction
of the pulse-height, while a shortened pulse-width is desirable for high-throughput applica-
tions. For both objectives, digital deconvolution of the exponential decay is convenient. With
a general method and the signals of our custom CSA for CZT detectors, we show how the
transfer function of an amplifier is adapted to an IIR filter. This section investigates differ-
ent design methods for an IIR filter in the discrete-time domain and verifies the obtained
filter coefficients with respect to the equivalent continuous-time frequency response. Finally,
the exponential decay is shaped to a step-like output signal that will be processed by the
synthesizing system for an arbitrary pulse shaping.
3.2.1. Prior work
A γ-ray detector system based on a semiconductor detector such as CZT usually consists
of the detector crystal, the analog readout electronics for the amplification of the detector
signal, and a pulse processing unit. Nowadays, the pulse processing is mainly integrated by
an ASIC or by a digital circuit in an FPGA. As we recently showed (Födisch et al., 2016a),
the front-end electronics can be appropriately implemented by a CSA with a continuous re-
set through an RC feedback circuit. This type of amplifier discharges the integrated detector
current from the feedback capacitor C with the resistor R. Thus the typical signal shape with
an exponential decay is seen at the output of the CSA. A well-known problem of that circuit
is the ballistic deficit. This is caused by continuous discharge of the feedback capacitor,
even though the current of the detector is integrated. If the ratio of the RC time constant
over the integration time decreases, the ballistic deficit dominates the measured peak am-
plitude (Födisch et al., 2016a). To eliminate this effect and to reconstruct the initial charge
by a deconvolution of the exponential decay, Stein et al. presented the Moving Window
Deconvolution (MWD) (Stein et al., 1994, 1997; Georgiev and Gast, 1993; Georgiev et al.,
1994; Stein et al., 1996). Later, Jordanov et al. (Jordanov and Knoll, 1994; Jordanov, 1994;
Jordanov et al., 1994) described the same approach (Stein et al., 1996). However, both al-
gorithms are derived by an analysis of the exponential decay in the time domain. As a result,
a deconvolution of the exponential decay in the discrete-time domain is described by (Stein
et al., 1996)
$$ y[n] = x[n] + k \sum_{i=-\infty}^{n-1} x[i] , \tag{3.5} $$
where y[n] is the deconvolution of the signal x[n], which is the value of the continuous signal
x(t) at the discrete time t = nT with the sampling interval T. The value of k is set to (1 − k′),
where k′ = e^{−T/τ} is “the decay constant of the preamplifier transfer function for one sampling
interval” (Stein et al., 1994). The authors proposed an alternative value k = T/τ in (Stein
et al., 1996) assuming τ ≫ T. Both parameters for eq. (3.5) transform the exponential
decay of the signal into a step-like signal, as shown in fig. 3.3. The discrete-time signal x[n]
[Figure 3.3: plot of x[n] and y[n]; horizontal axis: time nT / τ, 0 to 5; vertical axis: sample value / a.u., 0 to 1]
Figure 3.3.: A sampled output signal x[n] of an amplifier with a typical exponential decay at
sampling interval T (simulated). The decay has the time constant τ = 20T, and
the parameter k of the deconvolution is k = 1 − e^{−T/τ}. This results in a step-like
signal y[n] whose amplitude is corrected for the ballistic deficit.
shown in fig. 3.3 corresponds to an output signal of an ideal CSA with a feedback resistor R
and feedback capacitance C . For a rectangular shaped input current pulse with amplitude
I , where the current flows in the time interval from ta to tb, the continuous-time output signal
x(t) of the amplifier is calculated by
$$ x(t) = I R \left[ \left(1 - e^{\frac{t_a - t}{\tau}}\right) \theta(t - t_a) - \left(1 - e^{\frac{t_b - t}{\tau}}\right) \theta(t - t_b) \right] . \tag{3.6} $$
Here θ(t) is the Heaviside step function, and tb > ta > 0. Regarding Stein’s approach for the
MWD, the presented deconvolution is calculated by the recursive representation of eq. (3.5),
which is derived by
$$
\begin{aligned}
y[n] &= y[n-1] + d && (3.7) \\
x[n] + k \sum_{i=-\infty}^{n-1} x[i] &= x[n-1] + k \sum_{i=-\infty}^{n-2} x[i] + d && (3.8) \\
d &= x[n] + (k-1)\, x[n-1] && (3.9) \\
y[n] &= y[n-1] + x[n] + (k-1)\, x[n-1] && (3.10)
\end{aligned}
$$
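As a quick numerical check of eqs. (3.5) and (3.10), the following Python sketch evaluates both forms on the same test signal. The signal is an assumption made for illustration (an ideal exponential decay with an instantaneous rise and τ = 20T, as in fig. 3.3), and the function names are hypothetical.

```python
import math


def deconvolve_sum(x, k):
    """Direct evaluation of eq. (3.5): y[n] = x[n] + k * sum of x[i] for i < n."""
    y, running_sum = [], 0.0
    for sample in x:
        y.append(sample + k * running_sum)
        running_sum += sample
    return y


def deconvolve_recursive(x, k):
    """Recursive form, eq. (3.10): y[n] = y[n-1] + x[n] + (k - 1) * x[n-1]."""
    y, y_prev, x_prev = [], 0.0, 0.0
    for sample in x:
        y_prev = y_prev + sample + (k - 1.0) * x_prev
        y.append(y_prev)
        x_prev = sample
    return y


tau_over_T = 20.0                          # tau = 20 T
k = 1.0 - math.exp(-1.0 / tau_over_T)      # k = 1 - exp(-T / tau)

# Assumed test signal: 10 baseline samples, then a unit-amplitude decay.
x = [0.0] * 10 + [math.exp(-n / tau_over_T) for n in range(100)]

y_sum = deconvolve_sum(x, k)
y_rec = deconvolve_recursive(x, k)
```

For this idealized input, both forms return the same sequence: zero on the baseline and a flat step of unit height from the pulse onset onwards.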
According to the time-shifting property of the z-transformation (Oppenheim and Schafer,
2007)
$$ x[n - k] \;\overset{\mathcal{Z}}{\longleftrightarrow}\; z^{-k} X(z) , \tag{3.11} $$
eq. (3.10) can be rewritten as
$$ \frac{Y(z)}{X(z)} = \frac{1 + (k - 1)\, z^{-1}}{1 - z^{-1}} . \tag{3.12} $$
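The intermediate step between eq. (3.10) and eq. (3.12) can be written out as follows. Applying eq. (3.11) term by term to eq. (3.10) and collecting Y(z) and X(z) gives

$$
\begin{aligned}
Y(z) &= z^{-1}\, Y(z) + X(z) + (k - 1)\, z^{-1} X(z) \\
Y(z) \left(1 - z^{-1}\right) &= X(z) \left(1 + (k - 1)\, z^{-1}\right) ,
\end{aligned}
$$

so that, in the notation of eq. (3.13), the coefficients read b_0 = 1, b_1 = k − 1, and a_1 = 1.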
It is obvious that eq. (3.12) is an equivalent of the generalized transfer function of an IIR
filter, which is defined as (Oppenheim and Schafer, 2007)
$$ H(z) = \frac{\sum_{k=0}^{M} b_k z^{-k}}{1 - \sum_{k=1}^{N} a_k z^{-k}} . \tag{3.13} $$
For further clarification, the fundamental operation of the pulse shapers described by Stein
et al. (Stein et al., 1994) or by Jordanov et al. (Jordanov and Knoll, 1994) is a deconvolution
of the transfer function of the amplifier and can be substituted by an IIR filter. Recently,
Jordanov presented an unfolding-synthesis technique (Jordanov, 2016) that also demands
an accurate deconvolution of the amplifier transfer function. Both constructed their digital
algorithms intuitively by an extensive analysis of the signals in the time domain. By doing
so, they neglected the established design methods for digital filters. Consequently, we will
show a further analysis of the amplifier transfer function in the s-domain (frequency domain
of continuous-time signals) and design the corresponding digital filter for the deconvolution
in the z-domain (equivalent frequency domain of discrete-time signals). Finally, we will verify
our algorithms with the signals of a CZT detector in conjunction with a CSA.
3.2.2. Discrete-time inverse amplifier transfer function
The charge-to-voltage transfer function of an ideal CSA with an RC feedback network and
the voltage vO at its output is given by (Födisch et al., 2016a)
$$ H(s) = \frac{v_O}{Q} = \frac{sR}{1 + sRC} . \tag{3.14} $$
By normalizing the charge Q to the feedback capacitance C with Q/C = vQ, the transfer
function HQ becomes
$$ H_Q(s) = \frac{v_O}{v_Q} = \frac{s\tau}{1 + s\tau} , \tag{3.15} $$
where τ = RC is the characteristic time constant of the CSA. The transfer function is iden-
tical to that of a first-order high-pass filter. Therefore, the signal seen at the output of the
amplifier is a convolution of the charge input signal and a high-pass filter. Moreover, as we
want to reconstruct the input signal, the deconvolution of the high-pass filter is realized with
the inverse transfer function of the amplifier. A deconvolution in the discrete-time domain
requires an adequate approximation of HQ^{-1} in the z-domain. Because the design methods
for discrete-time filters that transform continuous-time filters are numerous, we will focus our
investigations on a set of established methods and will test the accuracy of the transforma-
tion from the s-plane to the z-plane. At first, with the corresponding Laplace-transformation
of the difference quotient of the continuous-time signal x(t)
$$ s X(s) \;\overset{\mathcal{L}}{\longleftrightarrow}\; \frac{dx(t)}{dt} = \lim_{h \to 0} \frac{x(t + h) - x(t)}{h} , \tag{3.16} $$
the equivalent difference quotient for the discrete-time signal x(nT ) is
$$ \lim_{h \to T} \frac{x(nT + h) - x(nT)}{h} = \frac{x(nT + T) - x(nT)}{T} = \frac{x[n + 1] - x[n]}{T} . \tag{3.17} $$
By using eqs. (3.11) and (3.17), the transformation of the s-domain to the z-domain is there-
fore derived by
$$ s \longrightarrow \frac{z - 1}{T} , \tag{3.18} $$
which is referred to as the forward difference method. In the same way, but by setting the
difference quotient to (x(t) − x(t − h))/h, the corresponding backward difference is defined by
$$ s \longrightarrow \frac{z - 1}{zT} . \tag{3.19} $$
It is clear that these design methods replace the continuous-time differentials with a discrete-
time difference. The exact relation of s and z in the context of the z-transformation is given
by
$$ z = e^{sT} \iff s = \frac{1}{T} \ln(z) , \tag{3.20} $$
which cannot be used for the expression of a discrete series of samples with respect to
eq. (3.11). Therefore, another substitution of s is derived by solving the differential equation
corresponding to H(s) by the approximation of an integral with the trapezoidal rule (Oppen-
heim and Schafer, 2007; Proakis, 1995). A replacement of s with
$$ s \longrightarrow \frac{2}{T}\, \frac{z - 1}{z + 1} , \tag{3.21} $$
is referred to as a bilinear transformation. Besides the approximation of s with a suitable
expression in terms of z , there exist further design techniques based on the mapping of the
zeros and poles of the transfer function in the s-plane directly into zeros and poles in the z-
plane (Proakis, 1995). The matched-z transformation, as described in (Proakis, 1995), maps
the zeros zk and poles pk of the continuous-time transfer function H(s) to the discrete-time
transfer function H(z) with the relation
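$$ H(s) = \frac{\prod_{k=1}^{M} (s - z_k)}{\prod_{k=1}^{N} (s - p_k)} \longrightarrow \frac{\prod_{k=1}^{M} \left(z - e^{z_k T}\right)}{\prod_{k=1}^{N} \left(z - e^{p_k T}\right)} = H(z) , \tag{3.22} $$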
where T is the sampling interval. After all of the above, the transfer functions in the z-
domain used to deconvolute the continuous-time amplifier transfer function HQ of eq. (3.15)
are summarized in tab. 3.1 (detailed results are illustrated in (Födisch et al., 2016c)).
Table 3.1.: Summary of the investigated design methods for an infinite impulse response fil-
ter. All methods rely on an approximation of the continuous-time transfer function
in the z-domain.
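Design method                           Transfer function
Forward difference, eq. (3.18)          $H_\mathrm{FD}(z) = \dfrac{1 + \left(\frac{T}{\tau} - 1\right) z^{-1}}{1 - z^{-1}}$
Backward difference, eq. (3.19)         $H_\mathrm{BD}(z) = \dfrac{\left(\frac{T}{\tau} + 1\right) - z^{-1}}{1 - z^{-1}}$
Bilinear transformation, eq. (3.21)     $H_\mathrm{BL}(z) = \dfrac{\left(\frac{T}{2\tau} + 1\right) + \left(\frac{T}{2\tau} - 1\right) z^{-1}}{1 - z^{-1}}$
Matched-z transformation, eq. (3.22)    $H_\mathrm{MZ}(z) = \dfrac{1 - e^{-T/\tau} z^{-1}}{1 - z^{-1}}$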
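The four filters of tab. 3.1 can be compared with a short Python sketch. It is a simplified illustration under stated assumptions: the input is eq. (3.6) sampled with I·R = 1 and with pulse edges placed exactly on the sampling grid (hypothetical values n_a = 10, n_b = 15), τ = 20T, and only the settled step height is compared, not the transient behavior. All function names are chosen for this example.

```python
import math


def deconvolution_numerator(method, T, tau):
    """Numerator coefficients (b0, b1) from tab. 3.1; the denominator is always 1 - z^-1."""
    r = T / tau
    table = {
        "forward": (1.0, r - 1.0),
        "backward": (r + 1.0, -1.0),
        "bilinear": (r / 2.0 + 1.0, r / 2.0 - 1.0),
        "matched_z": (1.0, -math.exp(-r)),
    }
    return table[method]


def apply_filter(b0, b1, x):
    """Difference equation y[n] = y[n-1] + b0 * x[n] + b1 * x[n-1]."""
    y, y_prev, x_prev = [], 0.0, 0.0
    for sample in x:
        y_prev = y_prev + b0 * sample + b1 * x_prev
        y.append(y_prev)
        x_prev = sample
    return y


def csa_pulse(n_samples, n_a, n_b, T, tau):
    """Eq. (3.6) sampled at t = n*T with I*R = 1; current flows from n_a*T to n_b*T."""
    x = []
    for n in range(n_samples):
        value = 0.0
        if n >= n_a:
            value += 1.0 - math.exp((n_a - n) * T / tau)
        if n >= n_b:
            value -= 1.0 - math.exp((n_b - n) * T / tau)
        x.append(value)
    return x


T, tau = 1.0, 20.0                     # tau = 20 T, as in fig. 3.3
n_a, n_b = 10, 15                      # hypothetical pulse edges in samples
x = csa_pulse(2000, n_a, n_b, T, tau)
ideal_step = (n_b - n_a) * T / tau     # Q / C for I * R = 1

relative_height = {}
for method in ("forward", "backward", "bilinear", "matched_z"):
    b0, b1 = deconvolution_numerator(method, T, tau)
    relative_height[method] = apply_filter(b0, b1, x)[-1] / ideal_step
```

Under these assumptions, the forward difference, backward difference, and bilinear filters settle at the ideal step height, whereas the matched-z filter without gain correction settles about 2.5 % lower (factor (1 − e^{−T/τ})·τ/T). The sketch does not reproduce the transient differences between the methods.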
The transfer functions shown in tab. 3.1 are all identical in the limit T ≪ τ. But with real
constraints, where τ is chosen to be as small as possible to meet the requirements of the
application, and T is chosen to be as large as possible in the range of the Nyquist frequency,
the results of the presented IIR filters are slightly different. After an analysis of the filter
responses in the time-domain with the example shown in fig. 3.3, it is apparent that the
bilinear transformation performs best for the specified values, as the voltage step is expected
to have a unity value without the presence of the ballistic deficit effect (fig. 3.4). As the
methods based on the difference quotient obviously result in distorted output signals, these
approaches will be neglected for our application. Moreover, the results of the matched-z
transformation are improved if this method is extended by an additional matched gain. The
[Figure 3.4: plot of x[n], yFD[n], yBD[n], yBL[n], and yMZ[n]; horizontal axis: time nT / τ, 0 to 5; vertical axis: sample value / a.u., 0 to 1]
Figure 3.4.: A comparison of the results obtained by different transfer functions for the
deconvolution of the signal x [n]. The filter based on the forward difference
method (yFD) produces a slight undershoot, whereas the backward difference
method (yBD) analogously shows an overshoot. Further, the filters based on
the bilinear (yBL) or matched-z (yMZ) transformations seem to have a flat re-
sponse with respect to the exponential decay of the input signal.
discrete-time transfer function HMZ is adjusted by the gain correction constant G , so that
$$ G = \frac{|H(j\omega)|}{\left|H_\mathrm{MZ}\!\left(e^{j\omega T}\right)\right|} \tag{3.23} $$
matches the ratio of the magnitudes of the frequency responses at a specific frequency ω.
For the best gain matching of the inverse high-pass filter regarding the transfer function in
the s-domain, ω is set to the characteristic 3 dB corner frequency 1/τ , as carried out by a
comparison of different values for ω (Födisch et al., 2016c). Consequently, the optimum IIR
filter coefficients for the ideal inverse amplifier transfer function are
$$ b_n = \left\{ G\!\left(\tfrac{1}{\tau}\right),\; -G\!\left(\tfrac{1}{\tau}\right) e^{-T/\tau} \right\} \tag{3.24} $$
$$ a_n = \{ 1, -1 \} \tag{3.25} $$
where the specific gain matching constant G (ω) for the described system is calculated as
G (ω) =
2
∣∣sin(ωT2 )∣∣√(ωT )2 + 1
ωτ
√
e
−2T
τ − 2e−Tτ cos(ωT ) + 1
. (3.26)
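The gain matching of eq. (3.23) can be checked numerically against a closed form. The Python sketch below evaluates eq. (3.23) with complex arithmetic and compares it with the closed form that follows from the inverse of eq. (3.15), whose magnitude is sqrt((ωτ)² + 1)/(ωτ), and from HMZ in tab. 3.1. The values T = 1 and τ = 20T and all function names are assumptions for illustration.

```python
import cmath
import math


def inverse_highpass(s, tau):
    """Inverse of eq. (3.15): (1 + s*tau) / (s*tau)."""
    return (1.0 + s * tau) / (s * tau)


def matched_z(z, T, tau):
    """H_MZ(z) from tab. 3.1."""
    return (1.0 - math.exp(-T / tau) / z) / (1.0 - 1.0 / z)


def gain_numeric(omega, T, tau):
    """Eq. (3.23), evaluated directly with complex arithmetic."""
    h_s = inverse_highpass(1j * omega, tau)
    h_z = matched_z(cmath.exp(1j * omega * T), T, tau)
    return abs(h_s) / abs(h_z)


def gain_closed_form(omega, T, tau):
    """Closed form of the gain correction constant G(omega)."""
    numerator = 2.0 * abs(math.sin(omega * T / 2.0)) * math.sqrt((omega * tau) ** 2 + 1.0)
    denominator = omega * tau * math.sqrt(
        math.exp(-2.0 * T / tau) - 2.0 * math.exp(-T / tau) * math.cos(omega * T) + 1.0
    )
    return numerator / denominator


T, tau = 1.0, 20.0
G = gain_closed_form(1.0 / tau, T, tau)    # gain matched at the 3 dB corner frequency
b = (G, -G * math.exp(-T / tau))           # numerator coefficients, eq. (3.24)
a = (1.0, -1.0)                            # denominator coefficients, eq. (3.25)
```

For τ = 20T, G(1/τ) is about 1.025. With this factor, the sum of the numerator coefficients is very close to T/τ, which is the value shared by the other three filters in tab. 3.1.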
3.2.3. Application to measured signals
For a proof of concept, we will examine the deconvolution of the amplifier signals sampled
from the developed CSAs. The amplifiers are equipped with a test pulse input so that pulses
with an exponential decay and arbitrary pulse heights can be synthesized. The time con-
stant of the amplifier is in the range of several microseconds, as the feedback resistor is
chosen to be 82 MΩ and the feedback capacitance is in the range of 100 fF. The exact value
of τ = RC must be experimentally verified, because the electronic components are afflicted
with tolerances. An example of a measured pulse from the CSA is shown in fig. 3.5. The
[Figure 3.5: plot of x[n] and the fit x*(t) = A e^{−t/τ}; horizontal axis: time nT / µs, 0 to 40; vertical axis: samples / ADC codes, 0 to 1000]
Figure 3.5.: A sampled output signal x[n] from the CSA with a test pulse at its input. The exponential
decay of the signal is clearly visible, but the curve fit of the function
x*(t) = A e^{−t/τ} exposes a deviation from the ideal shape.
values are recorded at a sampling rate of 100 MSPS with 14 bit resolution at an input
voltage range of approximately 2.3 V. A curve fit is applied to estimate the time constant
τ. For this example, the numerical value of τ is 8.91 µs, which corresponds to an 82 MΩ
resistor in parallel with a 108.7 fF capacitor. The derived curve fit is in accordance with
the expected feedback network, but obviously it does not cover all the features of the pulse
shape. Further, a full analysis of the time constant by the best fit function indicates that the
parameter τ is nearly constant over the whole output voltage range. Yet at the same time,
a deconvolution of the exponential decay with the estimated τ by the matched-z IIR filter
with gain correction reveals distortion with regard to the predicted flat step-like function. The
results of the deconvolution of a measured signal with the inverse high-pass filter are shown
in fig. 3.6. In terms of the proposed methods for the deconvolution, the transfer function of
[Figure 3.6: plot of x[n] and y[n]; horizontal axis: time nT / µs, 0 to 40; vertical axis: samples / ADC codes, 0 to 1000]
Figure 3.6.: The sampled signal x [n] and a deconvolution with the matched-z IIR filter with
gain correction. The output of the filter y [n] traces the variations with respect to
the ideal exponential decay. The anticipated flat step-like response is distorted.
the CSA cannot be approximated with the inferred model of a first-order high-pass filter from
eq. (3.15). However, the costs of a detailed analysis of all parasitic components of the entire
network model are much higher than the benefits. Nevertheless, the transfer function of the
CSA has been empirically found by the use of the System Identification Toolbox provided
by MathWorks (MathWorks, 2016b). The tool reports the goodness of fit between the test
and reference data to be 93.09 % (normalized root mean square error) for a continuous-time
model with one pole and one zero for the transfer function. This is improved to 96.48 % by a
model with four zeros and four poles. Anticipating the digital implementation of the IIR filter,
we chose a model with an accuracy of 96.47 % and three zeros and three poles (Födisch
et al., 2016c). Furthermore, the parameters of the inverse continuous-time transfer function
are transformed by the matched-z method and are gain-corrected at the 3 dB corner fre-
quency. This results in an IIR filter with the coefficients quoted in (Födisch et al., 2016c). The
output y [n] from the IIR filter with these coefficients, calculated by a double-precision floating
point arithmetic (64 bit), is shown in fig. 3.7. In addition to the analysis in the discrete-time
[Figure 3.7: plot of x[n] and y[n]; horizontal axis: time nT / µs, 0 to 40; vertical axis: samples / ADC codes, 0 to 1000]
Figure 3.7.: The input x [n] and output y [n] of the third-order IIR filter for the deconvolution
of the exponential decay with the coefficients quoted in (Födisch et al., 2016c).
The output has the expected step-like shape with a flat top.
domain, the characteristics of the derived IIR filter are shown in (Födisch et al., 2016c).
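The estimation of the time constant described in this section can be sketched in Python as follows. The fitting procedure actually used is not specified in the text, so a least-squares line through the logarithm of the samples is assumed here, and the data are synthetic (an ideal decay with τ = 8.91 µs sampled at 100 MSPS). The function name and the amplitude of 1000 ADC codes are hypothetical.

```python
import math


def fit_decay_constant(samples, T):
    """Least-squares line through ln(x[n]) versus n*T; returns (A, tau)."""
    points = [(n * T, math.log(v)) for n, v in enumerate(samples) if v > 0.0]
    count = len(points)
    mean_t = sum(t for t, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    slope = sum((t - mean_t) * (y - mean_y) for t, y in points) / sum(
        (t - mean_t) ** 2 for t, _ in points
    )
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


T = 10e-9             # sampling interval at 100 MSPS
R = 82e6              # feedback resistor in ohms
tau_true = 8.91e-6    # time constant used to generate the synthetic decay

# Synthetic, noise-free decay covering 40 microseconds.
x = [1000.0 * math.exp(-n * T / tau_true) for n in range(4000)]

A, tau = fit_decay_constant(x, T)
C = tau / R           # feedback capacitance implied by tau = R * C
```

With τ = 8.91 µs and R = 82 MΩ, the implied capacitance is approximately 108.7 fF, in line with the values quoted above. On measured data, noise and the baseline would have to be handled before taking the logarithm.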
3.2.4. Implementation of a higher order IIR filter
Nowadays, digital filters are realized with general purpose digital signal processors (DSPs)
or FPGAs. We will concentrate our efforts on an implementation of a higher order IIR filter
based on an FPGA from Xilinx. Along with the transfer function described by eq. (3.13), the
difference equation for the third-order IIR is
$$ y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] + b_3 x[n-3] + a_1 y[n-1] + a_2 y[n-2] + a_3 y[n-3] . \tag{3.27} $$
The difference equation can be comfortably mapped to the known Direct Form implementa-
tion structures of an IIR filter (Meyer-Baese, 2007). Still, at the same time, the coefficients
need quantization. With a look at the resources provided by a state-of-the-art FPGA from Xil-
inx, a word length for the coefficients of 25 bits is easily realizable, since these devices have
embedded 18 bit×25 bit multipliers (Xilinx, 2014). These come along with a pre-adder and a
post-adder in a so-called DSP slice. The built-in logic blocks, with their multiply-accumulate
structures, are highly optimized for the requirements of digital filters. But a straightforward
implementation of a digital filter with restricted word lengths for the coefficients must be
evaluated with regard to quantization errors. We examined the output of the Direct Form
IIR filter in the time domain with a variable word length for coefficient quantization. Even
a length of 25 bits results in clear distortions. Satisfactory results were achieved only with
a word length of more than 34 bits. With this implementation on a Xilinx FPGA, one multiplication
is mapped to two DSP slices. Thus a Direct Form implementation based on the
difference equation (3.27), with its seven coefficient multiplications, requires 14 DSP slices.
As we target the processing of highly segmented detectors with more than 65 readout channels,
the impact of the quantization error is noticeable in an excessive consumption of slice logic
resources of the FPGA. As
already stated by Oppenheim, the Direct Form realization of an IIR filter is very sensitive to
quantization errors, because each polynomial root of the transfer function is affected by all
of the quantization errors of the coefficients (Oppenheim and Schafer, 2007). On the whole,
the quantization modifies the location of the zeros and poles in the complex z-plane, and
therefore the response of the filter is changed. This problem is solved by a factorization of
the numerator and denominator polynomials of the transfer function in the form
$$ H(z) = b_0 \prod_{k=1}^{M} \frac{1 - c_k z^{-1}}{1 - d_k z^{-1}} , \tag{3.28} $$
where M is the order of the IIR filter and ck and dk are the zeros and poles in the complex
z-plane. For our derived filter, all zeros and poles are real values where the imaginary part
is zero. Thus, the transfer function can be rewritten to a cascade of three Direct Form filters
of first-order (fig. 3.8). The coefficients −bk1 and ak1 are the roots of the numerator and
[Figure 3.8 (block diagram): x → (1 + b11 z^{-1})/(1 − a11 z^{-1}) → (1 + b21 z^{-1})/(1 − a21 z^{-1}) → (1 + b31 z^{-1})/(1 − a31 z^{-1}) → gain b0 → y]
Figure 3.8.: A Cascade Form infinite impulse response (IIR) filter of the third-order. Each
block is built by a first-order Direct Form IIR filter.
denominator polynomials and must be quantized. As proven by (Crochiere and Oppenheim,
1975) and (Oppenheim and Schafer, 2007), the Cascade Form of an IIR filter is generally
much less sensitive to coefficient quantization than the equivalent Direct Form implementa-
tion. Each first-order filter section of the cascade structure has the form shown in fig. 3.9.
Both the feedforward path (all-zero system) and the feedback path (all-pole system) purely
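[Figure 3.9 (signal-flow graph): input x, two adders, two unit delays z^{-1}, and two multipliers with the coefficients bk1 (feedforward path) and ak1 (feedback path), output y]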
Figure 3.9.: A Direct Form implementation of a first-order infinite impulse response fil-
ter. The multiply-accumulate architecture from the built-in slices of the field-
programmable gate array is adapted to this basic filter structure.
match the multiply-accumulator architecture built into the FPGA. An additional register be-
tween the two paths increases the performance in terms of the maximum clock frequency,
but also introduces a delay of one clock cycle.
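The equivalence of the Direct Form of eq. (3.27) and the Cascade Form of eq. (3.28) can be illustrated with a floating-point Python sketch. The zeros, poles, and gain below are hypothetical values chosen for this example, not the coefficients of the filter derived in this work, and quantization is not modeled.

```python
def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def direct_form(b, a, x):
    """Eq. (3.27): y[n] = sum_k b[k] x[n-k] + sum_k a[k] y[n-k], with a = (a1, a2, ...)."""
    y = []
    for n in range(len(x)):
        acc = sum(b_k * x[n - k] for k, b_k in enumerate(b) if n - k >= 0)
        acc += sum(a_k * y[n - k] for k, a_k in enumerate(a, start=1) if n - k >= 0)
        y.append(acc)
    return y


def first_order_section(zero, pole, x):
    """(1 - zero*z^-1) / (1 - pole*z^-1), i.e. b_k1 = -zero and a_k1 = pole in fig. 3.8."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for sample in x:
        y_prev = sample - zero * x_prev + pole * y_prev
        y.append(y_prev)
        x_prev = sample
    return y


def cascade(b0, zeros, poles, x):
    """Eq. (3.28): gain b0 followed by one first-order section per zero/pole pair."""
    y = [b0 * v for v in x]
    for zero, pole in zip(zeros, poles):
        y = first_order_section(zero, pole, y)
    return y


# Hypothetical real zeros and poles of a third-order filter.
b0 = 1.05
zeros = [0.95, 0.80, 0.60]
poles = [1.00, 0.85, 0.55]

# Expand the factorized form into Direct Form coefficients.
num, den = [1.0], [1.0]
for zero in zeros:
    num = poly_mul(num, [1.0, -zero])
for pole in poles:
    den = poly_mul(den, [1.0, -pole])
b = [b0 * v for v in num]
a = [-v for v in den[1:]]    # sign convention of eqs. (3.13) and (3.27)

x = [0.0] * 3 + [0.9 ** n for n in range(60)]
y_direct = direct_form(b, a, x)
y_cascade = cascade(b0, zeros, poles, x)
```

In floating-point arithmetic both structures give the same output up to rounding. The difference discussed in the text only appears once the coefficients are quantized: in the Cascade Form each quantized coefficient moves exactly one zero or pole, whereas in the Direct Form each quantized coefficient affects all roots of its polynomial.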
Results
We have implemented the presented Cascade Form structure of the third-order IIR filter
in VHDL. The design was synthesized and tested with Xilinx FPGAs. Moreover, the tools
reported a theoretical maximum frequency of 217 MHz for the filter running on a Kintex 7
XC7K325T. With an optional register between the feedforward path and feedback path, this
value increases to 295 MHz (167 MHz on Spartan 6 XC6SLX45T). For our purposes and
for most detector applications, this is sufficient. The fixed-point arithmetic uses the entire
bit widths of the built-in 25 bit×18 bit multipliers. Even though the fixed-point representation
of the poles and zeros is sufficient with a word length of 18 bits, the remaining 25 bits word
length for the sample is absolutely required. As the output of a cascade and the word
length of the feedback signal are limited to 25 bit, this truncation propagates a rounding
error from each stage to the next. However, the generic VHDL design of the filter is in good
agreement with our requirements while running on the Kintex 7 architecture, but on the
outdated 18 bit×18 bit architecture (e.g. Spartan 6), we observed unacceptable distortions
due to rounding errors. The design of our third-order IIR filter is mapped to seven DSP
slices of the Xilinx FPGA. Each Direct Form module consumes two DSP slices, while the
multiplication with the gain correction constant b0 consumes an additional multiplier of a DSP
slice.
For illustration and in reference to the prior work of Stein and Jordanov, the step-like output
of the IIR filter is turned into a trapezoidal pulse shape by applying a differentiator of the form
$$ y[n] = x[n] - x[n - M] , \quad M \geq 1 , \tag{3.29} $$
where M is the window length, and an average filter with
$$ y[n] = \frac{1}{N} \sum_{k=0}^{N-1} x[n - k] , \tag{3.30} $$
where N is the number of samples. The implementation of eq. (3.29) is a straightforward
FIR filter. Further, the recursive difference equation corresponding to eq. (3.30),
y[n] = y[n − 1] + (x[n] − x[n − N])/N, is another Direct Form
implementation of an IIR filter. Together, these filters implement the MWD algorithm. The
results for a recorded sequence with our CSA and a CZT detector are shown in fig. 3.10.
This example shows that the obtained transfer function of the CSA is sufficiently robust to
be applied to an identical amplifier because the recorded sequence was captured from an
arbitrarily selected pixel with another dedicated amplifier. A single pixel calibration further
improves the results, since the absolute values of the electronic components vary.
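The trapezoidal shaping by eqs. (3.29) and (3.30) can be sketched in Python. The input is an assumed ideal unit step standing in for the deconvolved signal; M = 128 and N = 64 are taken from fig. 3.10, while the step position and the function names are hypothetical.

```python
def differentiator(x, M):
    """Eq. (3.29): y[n] = x[n] - x[n - M], with x[n] = 0 for n < 0."""
    return [x[n] - (x[n - M] if n >= M else 0.0) for n in range(len(x))]


def moving_average(x, N):
    """Eq. (3.30) in recursive form: y[n] = y[n-1] + (x[n] - x[n - N]) / N."""
    y, acc = [], 0.0
    for n in range(len(x)):
        acc += (x[n] - (x[n - N] if n >= N else 0.0)) / N
        y.append(acc)
    return y


M, N = 128, 64        # window length and averaging length as in fig. 3.10
step_start = 50       # hypothetical position of the step

# Idealized deconvolved signal: a unit step.
x_s = [0.0] * step_start + [1.0] * 400

trapezoid = moving_average(differentiator(x_s, M), N)
flat_top = [n for n, v in enumerate(trapezoid) if v == 1.0]
```

For this ideal step, the output rises linearly over N samples, stays at the step height for M − N + 1 samples, and falls back to zero over another N samples.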
[Figure 3.10: plot of the amplifier output, the deconvolution, and the MWD output; horizontal axis: time nT / µs, 0 to 20; vertical axis: samples / ADC codes, 0 to 1500]
Figure 3.10.: An example of the deconvolution with the obtained inverse amplifier transfer
function. Advanced pulse processing (MWD) with a differentiator (M = 128)
and an average filter (N = 64) accomplishes the pulse shaping with respect to
the well-known trapezoidal shaper from (Stein et al., 1994) or (Jordanov and
Knoll, 1994).
3.2.5. Conclusion
Based on the fundamental description from (Stein et al., 1994) of the Moving Window De-
convolution, we investigated the deconvolution of an exponential decay related to an output
signal of a CSA. We showed that the approach of Stein et al. for signal processing relies on
an inversion of a first-order high-pass filter representing the amplifier transfer function. Thus
the deconvolution can be realized with an infinite impulse response (IIR) filter. In the context
of established design methods for digital filters, we derived the optimum coefficients for the
demanded IIR filter. In conclusion, the matched-z transformation with a gain correction at the
3 dB corner frequency performed best while regarding the output in the time and frequency
domain. Moreover, we examined the signal processing with measured signals from a CSA.
We revised the apparent model of a first-order high-pass filter for the transfer function of the
amplifier because it fails in terms of accuracy. After an estimation of the transfer function by
experimental data, we enhanced this model to the third-order. Finally, we implemented the
inverse amplifier transfer function by means of an IIR filter with an adapted Cascade Form
structure. The results showed that the proposed method for the deconvolution is realizable
with little computational effort in a Xilinx Kintex 7 FPGA. The presented method and imple-
mentation of the signal processing are essential for our advanced pulse processing of the
CZT detector signals with high resolution and high throughput.
3.3. Digital pulse synthesis
Contemporary FPGAs are predestined for the application of IIR and FIR filters. Their em-
bedded DSP blocks for multiply-accumulate operations enable efficient fixed-point compu-
tations, in cases where the filter structure is accurately mapped to the dedicated hardware
architecture. This section presents a generic systolic structure for high-order FIR filters,
efficiently exploiting the hardware resources of an FPGA in terms of routability and timing.
Although this seems to be an easily implementable task, the synthesizing tools require an
adaptation of the straightforward digital filter implementation for an optimal mapping. Using
the example of a symmetric FIR filter with 90 taps, we demonstrate the performance of the
proposed structure with FPGAs from Xilinx and Altera. The implementation utilizes less than
1 % of slice logic and runs at clock frequencies up to 526 MHz. Moreover, an enhancement
of the structure ultimately provides an extended dynamic range for the quantized coefficients
without the costs of additional slice logic.
3.3.1. Prior work
Nowadays, FIR filters are a major application of FPGAs in the context of digital signal pro-
cessing. An FPGA design is traditionally implemented in a hardware description language
on the register-transfer level (RTL). Therefore, the design flow from the RTL top level down
to the logic and circuit level postulates clear definitions and constraints for an optimal im-
plementation result. As a consequence, the architecture of the device impacts the design,
and a straight separation of the abstraction layers is practically impossible for an efficient
realization in terms of logic utilization or speed. Although some approaches to hardware ef-
ficient filter structures exist (Meher et al., 2008; Park and Meher, 2014), the key to a valuable
exploitation of the hardware resources is an adaptation of the FIR filter structure to the ar-
chitecture of the FPGA. Moreover, at the early stage of filter design, a reduction of hardware
complexity is possible (Mehrnia and Willson, 2016). At last, an adaptation of the mathemat-
ical modelling to the dedicated silicon (DSP blocks) maximizes performance (Xilinx, 2005)
while versatility decreases.
In fact, an implementation of numerical algorithms in an FPGA, including FIR filters, is a
trade-off between targeted precision, allocated logic, achievable clock frequency, and al-
lowed latency. An operation with a fixed-point arithmetic is usually preferred if resources
or speed weigh more than the highest precision. Along with the ongoing enhancement of
DSP capabilities, the requirements of digital filters grow evolutionarily in accordance with
the hardware. Thus, state-of-the-art FPGAs have hundreds of built-in DSP blocks, capable
of fixed-point operations with a precision of at least 18 bits (Altera Corporation, 2011; Xil-
inx, 2014). Consequently, we will adapt a symmetric FIR filter structure to a generic FPGA
architecture and meet the challenges regarding routability, timing, and precision.
Several convenient structures of FIR filters can be found in the literature, e.g. direct-form
realization (Proakis and Manolakis, 2013), transposed direct-form (Meyer-Baese, 2007),
and symmetry exploiting direct-form structures for linear-phase systems (Oppenheim and
Schafer, 2007). The literature covers the basic mathematical operations and structures
but bypasses the technical aspects concerning the prevalent DSP block architectures of con-
temporary FPGAs. Nevertheless, the major manufacturers of FPGAs support their platforms
with practical explanations (Altera Corporation, 2010; Taylor, 2012; Zatrepalek, 2012).
Besides the hardware-mapped structures, Shen (2010) proposed an improved precision of a
symmetric FIR filter without increasing the coefficient bit-widths. The presented
parallel implementation utilizes an accumulator in combination with variable shift
operations. Although FPGAs have “flexible multidata bus routing capabilities” (Shen, 2010), the
suggested shift of the accumulator input contradicts the architecture of state-of-the-art
embedded DSP blocks, e.g. (Altera Corporation, 2011; Xilinx, 2014). As Shen omits a
benchmark with an FPGA implementation, the synthesis results of the optimized structure on a
Xilinx FPGA were presented by Yuan et al. (2012). These results show a further
consumption of slice logic in addition to the DSP blocks, but neglect the specific architecture of the
FPGA and do not properly include the DSP blocks in their optimization.
Aim of this contribution
High-order digital filters require a cascade of sub-filters for an efficient realization (Mehrnia
and Willson, 2016) or, alternatively, an efficiently mapped hardware design. With regard to
recent hardware architectures, a FIR filter whose length exceeds the number of cascaded
DSP blocks (DSP chains) in an FPGA can be referred to as a high-order filter. A typical DSP
chain includes at least several tens of DSP blocks. This section presents the methodology
and implementation of a systolic FIR filter in an FPGA, which ideally matches the prevalent
embedded DSP block architecture. Consequently, the direct implementation of large
structures avoids the cascading into small filters. The challenges of such a design are discussed,
solved by way of example, and evaluated against the key performance indicators of an FPGA
implementation: logic utilization and clock frequency. Moreover, in reference to the “bit compression”
introduced by Shen (2010), we adapt the parallel method and implement an improved
precision of the frequency response without additional logic utilization.
3.3.2. FIR filter structures for FPGAs
Within the scope of high-order parallel FIR filters, it is suitable to design the filter with a
linear phase. Thus, the coefficients h[k] of the FIR system of order M satisfy the
symmetry condition h[M − k] = h[k], where k = 0, 1, ..., M, and the number N of coefficients
and taps of the FIR system is N = M + 1 (Oppenheim and Schafer, 2007). Consequently,
the generalized equation for the output y of a symmetric FIR filter with an even number N of
taps can be expressed as (Oppenheim and Schafer, 2007):

$$ y[n] = \sum_{k=0}^{\frac{M-1}{2}} h[k] \left( x[n-k] + x[n-M+k] \right). \tag{3.31} $$

In the case of symmetric FIR filters, the number of coefficient multipliers is essentially
halved (Oppenheim and Schafer, 2007). Other types of symmetry, e.g. point symmetry
or an odd number N of taps, can be treated analogously to the methodology presented here.
In this structure, the taps of the input samples x[n] are folded around half of the filter length by N/2
pre-adders before the sums are multiplied by h[k]. Finally, the products are summed by
N/2 − 1 post-adders. Hence, a basic arithmetic logic unit for a DSP operation incorporates
a pre-adder, a multiplier and a post-adder as shown in fig. 3.11.
Figure 3.11.: The prevalent architecture for a DSP operation in an FPGA with pre-adder
(C = A + B), multiplier (E = C × D) and post-adder (Y = E + F). The bit-widths
W of the inputs and outputs depend on the FPGA architecture.
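The folding of eq. (3.31) can be checked numerically. The following is a minimal sketch in Python, not the VHDL design of this work; the function names, the integer coefficients and the input samples are hypothetical and chosen only for illustration. It compares the folded form, which needs N/2 multiplications per output sample, with a plain direct-form convolution.

```python
def fir_direct(h, x):
    """Reference direct form: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_folded(h_half, x):
    """Symmetric FIR per eq. (3.31): only the first N/2 coefficients are stored."""
    half = len(h_half)        # N/2 multipliers
    M = 2 * half - 1          # filter order (odd), N = M + 1 taps

    def sample(n):
        return x[n] if n >= 0 else 0

    y = []
    for n in range(len(x)):
        acc = 0                                       # running sum (post-adders)
        for k in range(half):
            pre = sample(n - k) + sample(n - M + k)   # pre-adder: C = A + B
            acc += h_half[k] * pre                    # multiplier and post-adder
        y.append(acc)
    return y


h_half = [1, -3, 5]                 # hypothetical integer coefficients h[0..2]
h_full = h_half + h_half[::-1]      # h[M - k] = h[k], N = 6 taps
x = [4, 0, -2, 7, 1, 3, -5, 2]      # hypothetical input samples
print(fir_folded(h_half, x) == fir_direct(h_full, x))
```

Integer coefficients are used so that the comparison is exact; with floating-point coefficients the two results may differ by rounding.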
Contemporary FPGA architectures embed multiply-accumulate blocks for digital filters, ei-
ther by integrated hardware primitives or by synthesized logic blocks. The mapping of
eq. (3.31) to parallel DSP blocks according to the direct-form implementation is convenient,
but results in a large adder tree for the running sum, which slows down the maximum clock
frequency of the system. If the design targets a high throughput at the highest clock fre-
quency, a systolic structure with a pipeline will maximize the performance of the FIR filter,
as long as latency is negligible. The basic structure is described in the z-domain by using the
time-shifting property of the z-transformation, $x[n-k] \overset{z}{\leftrightarrow} z^{-k} X(z)$ (Oppenheim and Schafer,
2007), and eq. (3.31), resulting in:

$$ Y = \sum_{k=0}^{\frac{M-1}{2}} h_k \left( z^{-k} + z^{-M+k} \right) X, \tag{3.32} $$
where $h_k = h[k]$. Furthermore, a pipeline is embedded into the systolic structure by adding
registers to the output of each running sum element. In terms of the z-transformation, the
pipeline register is a unit delay $z^{-1}$. Therefore, the pipelining of the FIR filter can be
expressed as:

$$ Y z^{-1} = z^{-1} \sum_{k=0}^{\frac{M-3}{2}} h_k \left( z^{-k} + z^{-M+k} \right) X + z^{-1} h_{\frac{M-1}{2}} \left( z^{\frac{-M+1}{2}} + z^{\frac{-M-1}{2}} \right) X \tag{3.33} $$

$$ Y z^{-2} = z^{-1} \left[ z^{-1} \sum_{k=0}^{\frac{M-5}{2}} h_k \left( z^{-k} + z^{-M+k} \right) X + z^{-1} h_{\frac{M-3}{2}} \left( z^{\frac{-M+3}{2}} + z^{\frac{-M-3}{2}} \right) X \right] + z^{-2} h_{\frac{M-1}{2}} \left( z^{\frac{-M+1}{2}} + z^{\frac{-M-1}{2}} \right) X. \tag{3.34} $$
Eq. (3.33) illustrates the decomposition of the last term of the running sum with a pipeline
register. In addition, the second pipeline stage is formed by eq. (3.34). Finally, the complete
decomposition of the adder tree with (M − 1)/2 unit delays reveals the FIR filter function in
terms of iteration of the basic systolic element
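$$ \begin{aligned} Y_k &= z^{-1} Y_{k-1} + z^{-k} h_k \left( z^{-k} + z^{-M+k} \right) X \\ &= z^{-1} Y_{k-1} + h_k \left( z^{-2k} + z^{-M} \right) X \end{aligned} \tag{3.35} $$

$$ Y_0 = h_0 \left( 1 + z^{-M} \right) X, \tag{3.36} $$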
where the output Y from eq. (3.32) is equal to the output of the last pipeline stage at position
k = (M − 1)/2 from eq. (3.35). The block diagram of the symmetric systolic FIR filter is
shown in fig. 3.12. A detailed mapping of this structure to the Xilinx specific architecture is
shown in (Xilinx, 2005).
Figure 3.12.: A linear phase FIR filter of order M (odd number). The filter function is folded
around half of the filter length N (even number) according to eq. (3.32), and
the adder tree for the running sum is replaced by the systolic structure derived
from eq. (3.35).
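As a quick plausibility check of the iteration in eqs. (3.35) and (3.36), the smallest non-trivial case M = 3 (N = 4 taps, two systolic elements) can be written out. This worked instance is an illustration added for clarity and is not part of the original derivation:

```latex
\begin{aligned}
Y_0 &= h_0\left(1 + z^{-3}\right)X,\\
Y_1 &= z^{-1}Y_0 + h_1\left(z^{-2} + z^{-3}\right)X\\
    &= \left(h_0 z^{-1} + h_1 z^{-2} + h_1 z^{-3} + h_0 z^{-4}\right)X\\
    &= z^{-1}\left[h_0\left(1 + z^{-3}\right) + h_1\left(z^{-1} + z^{-2}\right)\right]X
     = z^{-1}\,Y .
\end{aligned}
```

The bracket in the last line is eq. (3.32) for M = 3, so the output of the last element is the filter output delayed by (M − 1)/2 = 1 clock cycle.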
It is clear that the pipeline registers at the outputs of the multiply-accumulate operations
cause a delay of k clock cycles, but the derived iteration from eq. (3.35) also inserts an initial
pipeline delay, as the first folded sum from eq. (3.36) is only valid after passing the tapped
delay line with M stages. A modification of the fundamental filter function from eq. (3.32)
brings it to

$$ Y = \sum_{k=0}^{\frac{M-1}{2}} h_{\frac{M-1}{2}-k} \left( z^{k} + z^{-1-k} \right) X, \tag{3.37} $$

and yields an alternative iterated function

$$ Y_k = z^{-1} Y_{k-1} + h_{\frac{M-1}{2}-k} \left( 1 + z^{-1-2k} \right) X \tag{3.38} $$

$$ Y_0 = h_{\frac{M-1}{2}} \left( 1 + z^{-1} \right) X, \tag{3.39} $$

where the initial delay is reduced to the unit delay. The corresponding block diagram is
shown in fig. 3.13. The overall latency is determined by the number of DSP blocks, which in
this case is N/2.

Figure 3.13.: A block diagram of the FIR filter structure from eq. (3.38) with reduced initial
pipeline delay.
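The iteration of eqs. (3.38) and (3.39) can be simulated by treating every factor $z^{-d}$ as a shift of a sample sequence by d positions. The sketch below is a behavioural Python model under that assumption, not the synthesized VHDL; the function names, coefficients and input samples are hypothetical. For the chosen values, the output of the last systolic element coincides with the causal direct-form filter output.

```python
def delay(seq, d):
    """Multiply a sequence by z^-d: shift right by d samples, zero-filled."""
    return ([0] * d + list(seq))[:len(seq)]


def fir_direct(h, x):
    """Reference direct form: y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_systolic(h_half, x):
    """Iterate eq. (3.38): Y_k = z^-1 Y_{k-1} + h_{(M-1)/2-k} (1 + z^(-1-2k)) X."""
    K = len(h_half) - 1               # (M - 1) / 2, index of the last element
    y = [0] * len(x)                  # Y_{-1} = 0, so k = 0 reproduces eq. (3.39)
    for k in range(K + 1):
        folded = [a + b for a, b in zip(x, delay(x, 1 + 2 * k))]    # pre-adder
        product = [h_half[K - k] * s for s in folded]                # multiplier
        y = [a + b for a, b in zip(delay(y, 1), product)]            # post-adder fed by the registered sum
    return y


h_half = [2, -1, 3, 7]                # hypothetical h[0..3], N = 8 taps
x = [1, 0, 0, 5, -2, 4, 0, 3, 6, -1, 0, 0]
print(fir_systolic(h_half, x) == fir_direct(h_half + h_half[::-1], x))
```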
The featured symmetric systolic structure is generic so as to match the DSP block archi-
tecture of state-of-the-art FPGAs. Although the vendors put forward the systolic FIR filter,
the synthesis of the structure faces two major challenges, routability and timing, which are
discussed in the following paragraphs. The achieved results are highlighted afterwards.
Routability
Firstly, as the DSP blocks are embedded in an FPGA at dedicated locations in multiple
chain-like structures, the interconnection capabilities are limited. The routing flexibility
decreases as the filter order increases. Thus, successful mapping and routing of
filters that are large relative to the number of DSP blocks mainly depends on the ability
of the development tools to exploit the DSP chain architecture. Even though a generic
VHDL design permits an unrestricted synthesis, several failures were observed during the
implementation process. In practice, physically separated DSP chains limit a realization of
high-order FIR filters beyond the length of a DSP chain. An efficient concatenation of DSP
chains is impossible for the evaluated tools without adaptation. Moreover, the mapping of the
systolic structure to dedicated slices becomes more error prone as the number of
multiply-accumulate operations exceeds the total number of available DSP blocks. In this case, the
tools cannot map the systolic FIR filter to the DSP blocks and distributed arithmetic built on
the slice logic at the same time. In conclusion, the implementation of systolic FIR filters
that span multiple DSP chains requires further efforts in the design. As manual
placement and routing is inadequate, the generic structure of the filter is adapted
so as to support the routability.

Figure 3.14.: The symmetric systolic FIR structure with additional registers for improved
routability and timing. Registers between the systolic elements break the
dedicated routes of DSP chains.
The routability is improved by inserting further pipeline registers between the elements of the
cascaded running sum. Consequently, the iterated function from eq. (3.38) ultimately changes to

$$ Y_k = \begin{cases} z^{-1} Y_{k-1} + h_{\frac{M-1}{2}-k} \left( z^{-b_k} + z^{-1-2k-b_k} \right) X & \text{(3.40)} \\ z^{-1} Y_{k-1}, & \text{(3.41)} \end{cases} $$

where eq. (3.41) covers the case that a register is inserted at an arbitrary position k satisfying
the condition (M − 1)/2 > k > 1. The elements $b_k$ of eq. (3.40) represent the number of
injected registers before position k. At least one register between the DSP blocks breaks
the systolic structure to match a two-column chain architecture. Dedicated routes of DSP
chains are thus replaced by slice logic using a simple register (fig. 3.14). Moreover,
this method also facilitates the routing of the systolic FIR structure within DSP blocks in
combination with distributed arithmetic. As a result, the number of DSP blocks or
the length of DSP chains no longer restricts the order of realizable digital filters.
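The effect of the injected registers can be sketched with the same sequence-shift model. The code below is one possible reading of eqs. (3.40) and (3.41), assuming that a break adds a plain register stage in the running sum and that all later elements delay their input samples by the number $b_k$ of registers injected so far. The parameter name `breaks`, the coefficients and the samples are hypothetical. Under these assumptions the filter response is unchanged apart from an additional latency equal to the total number of injected registers.

```python
def delay(seq, d):
    """Multiply a sequence by z^-d: shift right by d samples, zero-filled."""
    return ([0] * d + list(seq))[:len(seq)]


def fir_direct(h, x):
    """Reference direct form: y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_systolic_breaks(h_half, x, breaks=()):
    """Systolic FIR with extra registers, following eqs. (3.40) and (3.41).

    breaks holds the positions k in front of which one extra register is
    placed. Returns the output sequence and the total number of breaks.
    """
    K = len(h_half) - 1
    y = [0] * len(x)
    b = 0                                   # b_k: registers injected so far
    for k in range(K + 1):
        if k in breaks:
            y = delay(y, 1)                 # eq. (3.41): plain register stage
            b += 1
        folded = [p + q for p, q in zip(delay(x, b), delay(x, 1 + 2 * k + b))]
        product = [h_half[K - k] * s for s in folded]
        y = [p + q for p, q in zip(delay(y, 1), product)]   # eq. (3.40)
    return y, b


h_half = [3, -1, 4, 1, -5, 9]               # hypothetical h[0..5], N = 12 taps
x = [1, 0, 0, 2, -3, 5, 7, -2, 4, 6, -1, 8, 0, 0, 0, 0]
y, total = fir_systolic_breaks(h_half, x, breaks=(2, 4))
reference = delay(fir_direct(h_half + h_half[::-1], x), total)
print(total, y == reference)
```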
Timing
Secondly, on the scale of an FPGA, DSP chains are located far apart. Each type of
device comes with its specific physical dimensions and arrangement of configurable
logic. The minimum clock period for the circuit primarily depends on the longest
path between the logic elements. The interconnections of DSP chains can therefore be a
bottleneck in terms of timing and thus limit the maximum achievable clock frequency. Even
if the tools are capable of mapping systolic FIR filters spanning multiple DSP chains, the
path lengths cannot be automatically reduced. In this case, a partial break with one register
or more at selected positions in accordance with the DSP chain lengths reduces the
overall path lengths (fig. 3.14). The maximum clock frequency of the FPGA design is
consequently the maximum sample rate of the filter. That limit is verified by timing constraints with
the design tools.
Results
For the proof of concept, we chose an FPGA from Xilinx (Artix 7, XC7A35T-3CSG324) and
one from Altera (Cyclone 5, 5CEFA5F23C6). Their DSP blocks (Altera Corporation, 2011; Xilinx,
2014) are able to perform the basic operations from fig. 3.11. The bit-widths of the generic
VHDL design are set to $(W_A, W_B, W_C, W_D, W_E, W_F) = (15, 15, 16, 18, 34, 36)$, where
the 15-bit wide input samples ($W_A$, $W_B$) are adapted to our application, and the
bit-widths of the multiplier ($W_C$, $W_D$) and post-adder ($W_E$, $W_F$) are chosen to avoid overflows
and to fit into both FPGA architectures. As an example, a low-pass filter with a cutoff
frequency $f_\mathrm{pass}$ of one-tenth of the sampling frequency $f_s$ and a stopband frequency $f_\mathrm{stop}$
of one-eighth of $f_s$ was selected. Furthermore, the stopband attenuation $A_\mathrm{stop}$ is chosen
to be 102 dB, which corresponds to the quantization noise floor of 18 bit signed coefficients.
Estimating the number $N_\mathrm{FIR}$ of taps with (Lyons, 2010)

$$ N_\mathrm{FIR} \approx \frac{A_\mathrm{stop}}{22 \left( f_\mathrm{stop} - f_\mathrm{pass} \right)}, \tag{3.42} $$

where the frequencies are normalized to $f_s$, 186 taps are required to achieve that frequency
response. However, we truncated $N_\mathrm{FIR}$ to 180, because the symmetry exploiting structure
then fits exactly into the total of 90 DSP blocks of the Xilinx device. The Altera device includes
150 DSP blocks. Finally, the straightforward
structure and its adaptation with dedicated register stages at distinct positions in the data
path were evaluated and compared to the state-of-the-art FIR compiler tools (Xilinx, 2015c;
Altera Corporation, 2015) in tab. 3.2.
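The numbers quoted above follow directly from eq. (3.42). The short Python sketch below reproduces them; the function name is hypothetical, and the expression 20·log10(2^(b−1)) for the noise floor of b-bit signed coefficients is a common rule of thumb assumed here, not a formula stated in the text.

```python
import math


def estimate_taps(a_stop_db, f_pass, f_stop):
    """Eq. (3.42); f_pass and f_stop are normalized to the sampling frequency."""
    return a_stop_db / (22.0 * (f_stop - f_pass))


n_est = estimate_taps(102.0, 1 / 10, 1 / 8)   # about 185.5
n_taps = math.ceil(n_est)                     # rounded up to a whole number of taps
n_used = 2 * 90                               # two taps per DSP block, 90 DSP blocks
floor_db = 20 * math.log10(2 ** (18 - 1))     # assumed rule of thumb for 18 bit signed

print(n_taps, n_used, round(floor_db, 1))
```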
With identical configurations for the bit-widths, both tools were capable of implementing the
systolic structure of the 90 taps FIR filter. Apparently, a Xilinx implementation is capable of
including the tapped delay line and the output registers into DSP blocks in various ways.
Therefore, the tool maps the investigated structures efficiently to the hardware architecture,
consuming very little (<0.5 %) additional logic. In contrast, the DSP architecture of
Altera is not as versatile, as only one variant of the FIR filter structure (full break with $z^{-1}$)
yields competitive results with less than 1 % additional logic. As a result, the structure with
one additional register stage at the output of each systolic element performs best in terms
of resource utilization.
As discussed, the timing performance is improved by further registers breaking the
dedicated routes. The obtained values from the tools are shown in tab. 3.3. The best timing
performance is achieved by the fully pipelined structure with two additional registers at each
output. Further registers have no noticeable impact on timing. The values for the clock
frequencies were obtained from the slow process corner of the timing reports. For the Xilinx
implementation, the timing constraints were successively tightened to reach the limit.
Besides the improved timing in terms of maximum clock frequency, the overall latency of the
filter is increased depending on the injected register stages.

Table 3.2.: Exemplary logic utilization of 90-tap FIR filters, implemented with Xilinx Vivado
2016.2 and Altera Quartus Prime 16.0. The systolic structures are compared to the Xilinx
FIR Compiler 7.2 (Xilinx, 2015c) and the Altera FIR Compiler 16.0 (Altera Corporation, 2015).

Systolic structure (90 taps)   Xilinx XC7A35T-3CSG324              Altera 5CEFA5F23C6
                               LUT (1)   Reg. (3)   DSP (4)        ALM (2)   Reg. (3)   DSP (4)
available                      20,800    41,600     90             29,080    58,160     150
straightforward                5         4          90             1,345     2,537      76
partial break ($z^{-1}$)       3         18         90             1,292     2,434      76
full break ($z^{-1}$)          60        152        90             188       441        88
full break ($z^{-2}$)          45        3,429      90             1,009     3,736      88
Xilinx FIR Compiler 7.2        3,201     4,293      90             not applicable
Altera FIR Compiler 16.0       not applicable                      801       2,883      45

(1) look-up table, (2) adaptive logic module (an ALM implements a LUT), (3) register, (4) DSP block with
pre-adder, multiplier, and post-adder
In conclusion, at least one variant of our generic design utilizes less logic and operates
at a higher clock frequency than the compiler-generated implementation. The efficiency
is caused by the simplicity of the approach, whereas the automated design tools include
excessive configuration options.
3.3.3. Optimized fixed-point arithmetic
For the implementation of high-order FIR filters, the coefficient quantization becomes sig-
nificant, as the granularity of coefficients increases. Thus, an increased dynamic range for
the quantization of coefficients preserves the accuracy of the digital filter. The precision of a
signed fixed-point operation depends on the bit-width b supported by the hardware. Hence,
the transformation from a real coefficient h to the corresponding integer value $I \in \mathbb{Z}$ used
for a signed fixed-point calculation is, in general, described as:

$$ I = \operatorname{round}\left( h \, 2^{b-1} \right). \tag{3.43} $$
Table 3.3.: Maximum achievable clock frequencies of 90-tap FIR filters

Systolic structure (90 taps)   Xilinx Vivado 2016.2   Altera Quartus Prime 16.0
                               XC7A35T-3CSG324        5CEFA5F23C6
straightforward                238.10 MHz             145.65 MHz
partial break ($z^{-1}$)       303.03 MHz             148.82 MHz
full break ($z^{-1}$)          476.19 MHz             218.77 MHz
full break ($z^{-2}$)          526.32 MHz             231.75 MHz
Xilinx FIR Compiler            434.78 MHz             not applicable
Altera FIR Compiler            not applicable         213.72 MHz
The reversal of the fixed-point representation to a real number $h_I$ is calculated by:

$$ h_I = \frac{I}{2^{b-1}}. \tag{3.44} $$

The resulting rounding error is $h - h_I$, which decreases with increasing bit-width b. An
efficient method for an improved fixed-point precision with a limited bit-width b was shown
by Shen (2010). This method, referred to as “bit compression”, performs a left shift
operation on the integer value until all redundant sign extension bits are removed.
This equals a multiplication by $2^Q$, and we calculate Q as follows:

$$ Q = \left\lfloor \log_2\left( \frac{2^{b-1}}{|h|} \right) - (b - 1) \right\rfloor. \tag{3.45} $$
The removal of sign extension bits increases the dynamic range of the fixed-point
representation of the coefficients, but requires that all products are normalized to a common base
before they are added by the accumulator. Therefore, Shen’s parallel implementation
performs a normalization at the input of the running sum accumulator. This
operation is incompatible with the DSP block architecture and is therefore synthesized on
distributed logic. Thus, Shen’s FIR filter structure does not exploit the entire performance
of a hardware mapped systolic FIR filter.
In general, the normalization of a partial term S from eq. (3.37) to a common base is realized
in fixed-point representation by:

$$ 2^{-d_k} S = \operatorname{round}\left( h_k 2^{b-1+Q_k} \right) \left( z^{-k} + z^{-1-k} \right) X, \tag{3.46} $$

where $Q_k$ is calculated by eq. (3.45) and $d_k$ is a normalization factor, which must be
individually calculated for each coefficient. Moreover, eq. (3.46) reveals that the multiplication by $2^{d_k}$
(left shift operation) performs the normalization and can be applied to the delayed input samples
(fig. 3.15).
For our approach, the left shift by $d_k$ for normalization is restricted by the bit-widths
of the DSP block architecture, and therefore $Q_k$ must be limited to an upper value. The
bounded value $Q'_k$ is determined by the largest bit shift that can be applied to the input
samples without exceeding the bit-width $W_C$ of the pre-adder. That method benefits from a
generously designed pre-adder with regard to the bit-width. In the contemporary DSP
architecture from Xilinx (Xilinx, 2014), the pre-adder operates at $W_C = 25$ bit and the
multiplier supports 25 bit × 18 bit operations. Thus, for 16-bit input samples and a 25-bit wide
pre-adder, the limit of $Q'_k$ is 9 bit. An example of the calculation is shown in tab. 3.4.

Figure 3.15.: A block diagram of the systolic FIR filter with enhanced dynamic range for
fixed-point coefficients by including left shift operations.
Table 3.4.: Exemplary numerical values for a maximized dynamic range of the coefficient
quantization.

h_k           b    I        two's-complement         Q_k   Q'_k   d_k   h_k 2^(b-1+Q'_k)   shifted two's-complement
2.99·10^-8    18   0        00'0000'0000'0000'0000   24    9      0     2                  00'0000'0000'0000'0010
-3.99·10^-5   18   -5       11'1111'1111'1111'1011   14    9      0     -2678              11'1111'0101'1000'1010
4.99·10^-3    18   654      00'0000'0010'1000'1110   7     7      2     83718              01'0100'0111'0000'0110
-3.99·10^-1   18   -52298   11'0011'0011'1011'0110   1     1      8     -104595            10'0110'0111'0110'1101
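The entries of tab. 3.4 can be reproduced with a few lines of Python. This is a sketch with hypothetical function names; the relation d_k = 9 − Q'_k is not stated explicitly in the text and is inferred here from the tabulated values, and the limit of 9 bit is the difference between the 25-bit pre-adder and the 16-bit input samples.

```python
import math


def quantize(h, b=18):
    """Eq. (3.43): signed fixed-point integer for a real coefficient h."""
    return round(h * 2 ** (b - 1))


def shift_q(h, b=18):
    """Eq. (3.45): number of redundant sign extension bits that can be removed."""
    return math.floor(math.log2(2 ** (b - 1) / abs(h)) - (b - 1))


def compress(h, b=18, q_max=9):
    """Return (Q_k, Q'_k, d_k, shifted integer) for one coefficient."""
    q = shift_q(h, b)
    q_lim = min(q, q_max)                    # Q'_k bounded by the pre-adder width
    d = q_max - q_lim                        # d_k, inferred from tab. 3.4 (assumption)
    shifted = round(h * 2 ** (b - 1 + q_lim))
    return q, q_lim, d, shifted


def twos_complement(value, b=18):
    """Two's-complement bit string of a signed integer with b bits."""
    return format(value & (2 ** b - 1), "0{}b".format(b))


for h in (2.99e-8, -3.99e-5, 4.99e-3, -3.99e-1):
    q, q_lim, d, shifted = compress(h)
    print(h, quantize(h), q, q_lim, d, shifted, twos_complement(shifted))
```

The smallest coefficient illustrates the benefit: without the shift it is quantized to zero, whereas the shifted representation retains a non-zero value.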
Results
For the evaluation of the proposed structure from fig. 3.15, we designed a low-pass filter
of order 179 with the window method (Lyons, 2010) based on Nuttall’s window (Nuttall,
1981). Furthermore, the result of the FIR filter generated by the Xilinx FIR Compiler is
compared to that of the proposed structure with additional bit shift operations and a floating
point calculation. The frequency responses, which are derived by a Fourier transform of the
simulated impulse responses of the filters, are shown in fig. 3.16.
A comparison of the frequency responses of the compiled FIR filter structures from Xilinx and
Altera with 18 bit signed coefficients reveals no significant differences. However, our proposed
structure with shift operations for coefficient normalization results in an improved stopband
attenuation, even though it also uses 18 bit signed coefficient multipliers. On a Xilinx FPGA,
that structure utilizes exactly the same logic resources as the equivalent variant
without left shift operations. The critical path delays remain unchanged, so the
maximum clock frequency is not reduced.
Figure 3.16.: The frequency responses of a symmetric systolic FIR filter with 90 taps
depending on the numeric representation of the coefficients. [Plot: magnitude in dB (0 to −140 dB)
versus frequency normalized to $f_s$ (0 to 0.5); curves: Xilinx FIR compiler (18 bit signed),
proposed structure, floating-point.]
3.3.4. Conclusion
In this section, a systolic structure for a symmetric FIR filter was proposed, where the systolic
elements ideally match the prevalent DSP block architecture of FPGAs. Thus, the derived
iterative mathematical functions for the systolic structure support the mapping of a digital
filter to various architectures with different constraints. Moreover, a generic FIR filter design
was synthesized by the tools from Altera and Xilinx to evaluate the efficiency of the structure
in terms of logic utilization and maximum clock frequency. The results confirmed that the
proposed structure with additional register stages improves routability and timing of high-
order FIR filters and is superior to state-of-the-art FIR compiler tools. Furthermore, to yield
an increased dynamic range for coefficient quantization, we enhanced the structure by shift
operations, thus improving the precision of fixed-point arithmetic. Finally, the exploitation of
the entire DSP blocks enables an efficient realization of high-order FIR filters with fixed-point
arithmetic in FPGAs, while utilizing less than 1 % additional slice logic and running at clock
frequencies above 200 MHz.
4. Data interface
State-of-the-art detector readout electronics require high-throughput data acquisition (DAQ)
systems. In many applications, e.g. for medical imaging, the front-end electronics are set up
as separate modules in a distributed DAQ. A standardized interface between the modules
and a central data unit is essential. The requirements on such an interface are varied, but
almost always demand a high data throughput. Beyond this challenge, a Gigabit Ethernet
interface is well suited to the broad requirements ranging from systems on a chip (SoC) up
to large-scale DAQ systems. We have implemented an embedded protocol stack for an
FPGA capable of high-throughput data transmission and clock synchronization. A versatile
stack architecture for the User Datagram Protocol (UDP) and the Internet Control Message
Protocol (ICMP) over the Internet Protocol (IP), as well as for the Address Resolution Protocol (ARP)
and the Precision Time Protocol (PTP), is presented. With a point-to-point connection
to a host in a Micro Telecommunications Computing Architecture (MicroTCA) system, we
achieved the theoretical maximum data throughput limited by UDP both for 1000BASE-T
and 1000BASE-KX links. Furthermore, we show that the random jitter of a synchronous
clock over a 1000BASE-T link for a PTP application is below 60 ps.
4.1. State-of-the-art
Distributed data acquisition systems are common in many fields of application,
such as nuclear physics and medical imaging. Depending on the application, there are various
requirements for the interconnections of submodules. The main challenge for an interface is
the user acceptance with respect to handling and interoperability of different device types.
In addition, the data throughput of the interface is an important criterion for usability and
should not limit the performance of the whole DAQ system. Even though proprietary inter-
faces can fulfill these requirements, standardized technologies benefit from industry-proven
components and are essential for reliable applications. A popular and well accepted
specification is the IEEE 802.3 Standard for Ethernet (IEEE, 2015), which specifies the
physical layer used by Ethernet. To date, connections of up to 100 Gbit/s are specified
and are being established by the industry. Nevertheless, for embedded systems, a
Gigabit Ethernet connection is the state-of-the-art. A widespread technology is known as
1000BASE-T, which defines 1 Gbit/s Ethernet over twisted pair copper cables. The
application of Gigabit Ethernet is not restricted to Local Area Networks. It also finds
its way into board-to-board applications. For example, the backplane of a MicroTCA system should
implement at least one port for an Ethernet connection (PICMG, 2011), which is usually
implemented as 1000BASE-KX on the physical layer. A link on the electrical backplane uses
two differential pairs to establish a Gigabit Ethernet connection. With Gigabit Ethernet, the
possibilities of applications range from a high speed data transfer to clock synchronization
in a distributed DAQ system (Moreira et al., 2009). This work is related to the implementa-
tion and test of an embedded Gigabit Ethernet protocol stack for FPGAs. With a versatile
stack architecture, we will demonstrate the performance of high-throughput data transfers
with UDP and clock synchronization over the PTP. Our aim is to investigate the maximum
achievable data throughput with an FPGA-based SoC as data source and a PC as receiver.
For our application we need a high-throughput DAQ to cope with the count rate expected
for prompt gamma imaging in particle therapy (Hueso-González et al., 2015). This will be
evaluated with a 1000BASE-T and 1000BASE-KX link on a MicroTCA system. In addition,
we will demonstrate the performance of a synchronized point-to-point connection as shown
in (Girerd et al., 2009) with a Xilinx FPGA and different hardware for the physical layer.
4.2. Embedded Gigabit Ethernet protocol stack
The high-throughput data interface should connect an FPGA to a host, and should be syn-
chronizable to a master over an Ethernet link. For this purpose, an FPGA mezzanine card
was designed (Födisch et al., 2014), which equips commonly used FPGA boards with an
Ethernet interface. Moreover, the data interface must be capable of operating on standardized
hardware platforms, e.g. a MicroTCA advanced mezzanine card (AMC) with an FPGA and
backplane Ethernet. The hardware used for the investigations is shown in fig. 4.1, and is the
base for the embedded protocol stack running in an FPGA.
Figure 4.1.: The hardware used for our investigations. An FPGA mezzanine card was de-
signed (left) for evaluation of throughput and synchronization. Additionally, the
interface must be compatible with standardized hardware platforms (right), e.g.
a MicroTCA advanced mezzanine card (designed by Deutsches Elektronen-
Synchrotron, DESY).
Physical layer
The embedded Gigabit Ethernet protocol stack connects to the physical layer through the
data link layer of the Open Systems Interconnection (OSI) model, as shown in
fig. 4.2. Higher level functions shall be implemented above the embedded protocol stack
in an application layer. The IEEE 802.3 Standard for Ethernet defines different types of
copper based connections between two transceivers over the physical layer. For 1000BASE-T,
four pairs of wires are used for the signal transmission. The 1000BASE-KX
technology uses two pairs for the transmission of data. The specific signaling and coding in the
physical layer will be done with industry-proven integrated circuits (ICs). For the
1000BASE-T signal coding we will use the IC 88E1111 from Marvell (Marvell, 2004) and the DP83865 from
Texas Instruments (Texas Instruments, 2004). To access the physical layer according to the
1000BASE-KX technology, we will use a GTX Transceiver of the Xilinx Kintex 7 FPGA (Xilinx,
2015a) in combination with the “Xilinx 1G/2.5G BASE-X PCS/PMA Core” (Xilinx, 2015b). All
physical layer transceivers (PHYs) have a common interface to the overlying data link layer
and its Media Access Control (MAC). The MAC connects to the PHY via the Gigabit Media
Independent Interface (GMII). The MAC is to be implemented in the FPGA. Due to clear
specifications on high data throughput and hardware, we do not intend to provide
compatibility with other PHYs with Reduced or Serial Gigabit Media Independent Interface (RGMII or
SGMII) or with the lower speeds specified for 10BASE-T or 100BASE-T.

Figure 4.2.: Layer stack according to the OSI model for our embedded Gigabit Ethernet
protocol stack. The higher level protocols are implemented in the transport layer.
For evaluation of a 1000BASE-T link we use the external ICs DP83865 from
Texas Instruments and 88E1111 from Marvell. All other components are
implemented in a single FPGA.
Data link layer
The data link layer with respect to fig. 4.2 contains the MAC and a management interface.
The MAC controls the access to the PHY and transmits the data in an Ethernet packet. It
processes the input and output signals of the GMII with a frequency of 125 MHz. An Eth-
ernet packet encapsulates the Ethernet frame by adding the preamble and the start frame
delimiter (SFD). The MAC composes (and also decomposes) the Ethernet packet from the
Ethernet frame with 8 bits per clock cycle (8 ns). This is essential for a maximum line rate
of 1 Gbit/s. The standard for Ethernet demands that two consecutive Ethernet packets are
separated by the interframe gap (IFG) of at least 96 bit times (96 ns).
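The timing figures above can be illustrated with a small Python model of the packet composition. This is a sketch only, not the VHDL MAC of this work; the function names are hypothetical. The preamble and SFD byte values (seven bytes 0x55 followed by 0xD5) are the values defined by IEEE 802.3 as they appear on the GMII.

```python
GMII_CLOCK_HZ = 125_000_000        # GMII transfers one byte per clock cycle (8 ns)
PREAMBLE = bytes([0x55] * 7)       # 7 preamble bytes
SFD = bytes([0xD5])                # start frame delimiter
IFG_BYTES = 12                     # interframe gap: 96 bit times


def ethernet_packet(frame: bytes) -> bytes:
    """Encapsulate an Ethernet frame by prepending preamble and SFD."""
    return PREAMBLE + SFD + frame


def wire_time_ns(frame: bytes) -> float:
    """Time on the link for one packet including the following interframe gap."""
    cycles = len(ethernet_packet(frame)) + IFG_BYTES
    return cycles * 1e9 / GMII_CLOCK_HZ


frame = bytes(1518)                # placeholder for a maximum-size standard frame
print(len(ethernet_packet(frame)), wire_time_ns(frame))
```

For a 1518-byte frame, the packet and the interframe gap together occupy 1538 clock cycles of 8 ns each.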
Usually all PHYs provide a management interface for the configuration of their internal reg-
ister sets. The Management Data Input/Output (MDIO) interface is used for a basic link
configuration (e.g. autonegotiation advertisement).
Transport layer
The transport layer shall provide a stack for higher-level protocols encapsulated in the
Ethernet frame. Its architecture must be easily extensible for any desired protocol in the
layer stack. We target a maximized data throughput from the application layer for UDP. The
theoretical data throughput of UDP with a payload of 1472 Byte, which corresponds to a
Maximum Transmission Unit (MTU) of 1500 Byte for the Ethernet frame, is 114.09 MiB/s. If the
host supports jumbo frames with an MTU of 9000 Byte, the maximum data throughput increases
to 118.34 MiB/s. The embedded protocol stack should not limit the frame size of an
Ethernet packet. Although various implementations of Gigabit Ethernet protocol stacks
(Löfgren et al., 2005; Dollas et al., 2005; Kühn et al., 2008; Uchida, 2008; Herrmann et al.,
2009; Alachiotis et al., 2010; Lieber and Hutchings, 2011; Nagy et al., 2011; Alachiotis et al.,
2012; Sasi et al., 2013; Mahmoodi et al., 2014; Zhou and Yao, 2014; Batmaz and Dogan,
2015) have been published, none of them achieves the theoretical maximum data
throughput with UDP. Only (Uchida, 2008) reached maximum performance with a TCP/IP
processor. A comparison of slice logic resources, as done in (Löfgren et al., 2005;
Herrmann et al., 2009; Alachiotis et al., 2010; Lieber and Hutchings, 2011; Alachiotis et al.,
2012; Mahmoodi et al., 2014), is not our intention, because each implementation is based
on a different FPGA architecture. Although slice logic utilization is an important design
criterion, it varies with the generic configuration (e.g. FIFO depths) as well as with the
supported features (e.g. checksum calculations). Thus, a comparison to other implementations
without the context of the application is difficult. To provide all the necessary
functionality, we need a protocol stack that serves ARP, ICMP, PTP and UDP with the focus
on maximum data throughput. The header of a protocol should be partially configurable through
a user interface, while derived fields (e.g. length fields) are calculated automatically. All
stack layers support a checksum calculation if it is required by the protocol. In terms of the
classification proposed in (Löfgren et al., 2005), our requirements belong to a “Medium UDP/IP” core.
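The quoted throughput limits can be reproduced with a short calculation. The following Python sketch assumes a per-packet overhead of 66 Byte on the wire (preamble and SFD, Ethernet header, FCS, a minimum IFG of 12 Byte, an IPv4 header without options and the UDP header) and a line rate of 125 MByte/s. This byte budget is not listed in the text and is an assumption, although it reproduces the stated values; all names are illustrative.

```python
# Per-packet overhead on the wire in bytes (assumed budget, not from the text).
PREAMBLE_SFD = 8
ETH_HEADER = 14
FCS = 4
MIN_IFG = 12
IPV4_HEADER = 20  # assumes an IPv4 header without options
UDP_HEADER = 8
OVERHEAD = PREAMBLE_SFD + ETH_HEADER + FCS + MIN_IFG + IPV4_HEADER + UDP_HEADER

LINE_RATE = 125_000_000  # Gigabit Ethernet: 1 Gbit/s = 125e6 Byte/s
MIB = 2 ** 20


def udp_throughput_mib_s(payload: int) -> float:
    """Theoretical UDP payload throughput in MiB/s for one payload size."""
    efficiency = payload / (payload + OVERHEAD)
    return efficiency * LINE_RATE / MIB


if __name__ == "__main__":
    for payload in (1472, 8972):
        print(f"{payload} Byte: {udp_throughput_mib_s(payload):.2f} MiB/s")
```

With these assumptions the function yields about 114.09 MiB/s for a payload of 1472 Byte and about 118.34 MiB/s for 8972 Byte.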
4.3. Implementation
4.3.1. System overview
Our implementation is designed as an intellectual property (IP) core in the hardware
description language VHDL. It includes the MAC as well as an embedded protocol stack. For
the 1000BASE-KX implementation, the PHY is already included in the FPGA. As shown in
fig. 4.3, the Gigabit Ethernet IP core gets its data from the application layer through a
common First-In-First-Out (FIFO) interface. The asynchronous FIFO is designed to operate at
frequencies of 125 MHz with a bit width of 32 bit and stores at least the payload of one
packet. It is used to stream the application data with high throughput into the transport layer
(UDP) of the core. Another interface to the core is built with a 32 bit microcontroller (Harboe,
2016). The microcontroller with its bus interface limits the data throughput of this interface
to far below the limits of the protocols. It is therefore used for slow-control applications
over UDP, ICMP and ARP. The MDIO interface is also handled by the microcontroller and is not
shown. Fig. 4.3 shows the signals of the GMII and their directions between the MAC (embedded
in the Gigabit Ethernet IP core) and the PHY. The same signals are used for the 1000BASE-KX
implementation with the embedded Xilinx PHY. The system clocks as well as the necessary
clocks for the PHY are generated by built-in phase-locked loops (PLLs) of the FPGA.

[Figure: block diagram. Labels inside the FPGA: Gigabit Ethernet IP core, application layer
(FIFO interface), microcontroller (bus interface), PLLs (receive clock, reference clock,
external clock, system clocks). GMII signals to the PHY (1000BASE-T or 1000BASE-KX):
RXD[7:0], RX_DV, RX_ER, GTX_CLK, TXD[7:0], TX_EN, TX_ER.]
Figure 4.3.: An overview of the SoC with the embedded Gigabit Ethernet IP core and its
interfaces. In case of the 1000BASE-KX implementation, the PHY will be included
in the FPGA.
4.3.2. Media Access Control
The functions of the MAC are restricted to the basic needs for interfacing the GMII. Fig. 4.4
shows the basic structure of the module for the transmission datapath of the MAC. It
composes the Ethernet packet, starting with the preamble and the SFD, which are initially
stored in a shift register. In the following states, data is passed through this
register and the arithmetic logic unit (ALU) for the checksum calculation. Finally, the 32 bit
frame check sequence (FCS) is added at the end of the frame. The finite state machine
(FSM) of the module controls this dataflow and keeps the IFG at a programmable number of
clock cycles. The module for the receiving datapath is built in the same way. It extracts
the Ethernet frame from a received Ethernet packet and passes it to the transport layer. The
MAC logic is capable of running at the speed of the transceiver clocks (125 MHz), so there is
no need for additional FIFOs for clock domain crossing. This results in a deterministic latency
for the complete datapath from the transport layer to the physical layer and vice versa. An
example of a transmitted Ethernet packet is shown in fig. A.1. The waveforms were captured
with an integrated logic analyzer (Xilinx Chipscope).

[Figure: block diagram. Labels: simplified FSM (idle, pre, sendlastcrc, gap), shift register
8x8, register 32, CRC ALU; signals tx_en, txd[7:0], tx_busy, phy_txen, phy_txer, phy_txd[7:0].]
Figure 4.4.: Basic structure of the MAC for the transmission datapath. The output signals
are connected to the GMII and the input is sourced by the transport layer.
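The composition of an Ethernet packet described for the transmission datapath can be sketched in software. The following Python example is a behavioural illustration, not the VHDL implementation: it prepends the preamble (seven bytes 0x55) and the SFD (0xD5) and appends the 32 bit FCS. It assumes that the FCS is the standard Ethernet CRC-32, which `zlib.crc32` computes, transmitted least significant byte first. The function names and the dummy frame are hypothetical.

```python
import zlib

PREAMBLE = bytes([0x55] * 7)  # seven preamble bytes
SFD = bytes([0xD5])           # start frame delimiter


def frame_check_sequence(frame: bytes) -> bytes:
    """32 bit FCS over the Ethernet frame, least significant byte first."""
    return zlib.crc32(frame).to_bytes(4, "little")


def build_packet(frame: bytes) -> bytes:
    """Compose preamble, SFD, frame and FCS in transmission order."""
    return PREAMBLE + SFD + frame + frame_check_sequence(frame)


def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Receiver side check: recompute the FCS and compare it with the last 4 bytes."""
    frame, received = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return frame_check_sequence(frame) == received


# Dummy 60 byte frame (header and payload are placeholders, not real addresses).
example_frame = bytes(range(60))
example_packet = build_packet(example_frame)
```

In this sketch the receiver strips the first eight bytes and validates the remainder with `fcs_is_valid(example_packet[8:])`.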
4.3.3. Embedded protocol stack
With a look at the OSI reference model and its layers for network communication, the stack
architecture implies a dataflow from the top layer to the bottom layer. That means that the
application passes its data from the transport layer to the data link layer until it is transmitted
by the physical layer. Data is thus “pushed” from the source to the sink, and we call this
dataflow the “Data-Push” model (fig. 4.5). The scheme in fig. 4.5 implies that the
application layer has valid data which is transported through the UDP layer (layer 3). The
underlying IP layer (layer 2) starts its transmission with a delay of one clock cycle, beginning
with its own data for the IP header. The data coming from the upper layer has to be buffered
in the underlying layer while this layer sends its own data. The same situation occurs when
the IP layer passes its data to the Ethernet layer (layer 1). Finally, the dataflows initiated at
time t0 and t1 are encapsulated at time t4 and t3, respectively. Data buffers are needed to
keep this data valid for the duration of this latency. As a consequence, a layer has
to buffer at least the data of the overlying layer. If two layers have valid data and pass it
to a shared underlying layer, the number of data buffers doubles. A consistent dataflow
through all layers with the “Data-Push” model therefore requires an appropriate number of
data buffers, and the model consumes additional memory for redundant data.

[Figure: timing diagram over the clock cycles t0 to t4]
Layer 3: UDP data
Layer 2: IP header, UDP data
Layer 1: ETH header, IP header, UDP data, FCS
Figure 4.5.: An example of a dataflow through the stack layers driven by the “Data-Push”
model
An alternative approach for the dataflow is shown in fig. 4.6. We call this the “Data-Pull”
model. In contrast to the “Data-Push” model from fig. 4.5, the dataflow is initiated by the
low-level layer. The data of the overlying layers is just passed through a single register stage
at the time when it is encapsulated into the frame of the underlying layer. This reduces
the number of data buffers to a single register at each interconnection. In
the example shown in fig. 4.6, the latency from the UDP data in layer 3 to the time when
it is encapsulated in layer 1 is reduced to two clock cycles (from t3 to t5). Each register
stage in the underlying layer introduces a delay of one clock cycle. The dataflow could
be optimized to zero latency by omitting the register stages, but this would cause timing
problems. Data buffers are still needed in the application layers, but the buffer redundancy
of the “Data-Push” model is eliminated. The costs for this implementation are
a simple arbiter and control logic, which are far below those of the “Data-Push” approach.

[Figure: timing diagram over the clock cycles t0 to t5]
Layer 3: UDP data
Layer 2: IP header, UDP data
Layer 1: ETH header, IP header, UDP data, FCS
Figure 4.6.: An example of a dataflow through the stack layers driven by the “Data-Pull”
model
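The idea of the “Data-Pull” model can be illustrated with a small behavioural sketch. In the following Python example, generators stand in for the layers: the lowest layer drives the transfer, sends its own header first and only then requests data from the overlying layer, so that no layer stores a copy of the data above it. The sketch is not cycle accurate and does not model the register stages or the arbiter; the function names and the header contents are placeholders.

```python
from typing import Iterator, List

pull_log: List[str] = []  # order in which the layers deliver their bytes


def source(name: str, data: bytes) -> Iterator[int]:
    """Top layer: hands out one byte per pull request."""
    for byte in data:
        pull_log.append(name)
        yield byte


def encapsulate(name: str, header: bytes, upper: Iterator[int]) -> Iterator[int]:
    """Layer N: sends its own header, then pulls layer N+1 byte by byte."""
    for byte in header:
        pull_log.append(name)
        yield byte
    # The overlying layer is asked for data only now; every byte is passed
    # straight through instead of being stored in a packet buffer.
    yield from upper


udp = source("udp", b"DATA")
ip = encapsulate("ip", b"IPHD", udp)
eth = encapsulate("eth", b"ETHHDR", ip)
frame = bytes(eth)  # the lowest layer drives the whole transfer
```

After the transfer, `pull_log` shows that the UDP source was not touched before both headers had been sent.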
The basic scheme of the interconnections of layers is shown in fig. 4.7. All modules in the
same layer N+1 pass their state to the arbiter logic. In the simplest case this is a FIFO
state which indicates whether there is valid data to send or not. If there is valid data, the
arbiter decides which module of a layer is served first and passes this information to the
underlying layer N. The module from layer N controls the dataflow of the overlying layer with
its control bus. Finally, the data from layer N+1 is multiplexed to the receiving module in
layer N. A real data transfer of the implemented “Data-Pull” model is shown in fig. A.2. At
its initial clock cycle (position 1), the example in fig. A.2 shows that the UDP layer has valid
data to send (signal “udp_fifo_empty” is low). Through the arbiter bus, the IP layer
also reports that there is valid data to send (signal “ip_fifo_empty” is low). With this condition
the Ethernet layer starts the transmission of data (signal “tx_en” is high) at position 2. At
position 14 the Ethernet layer pulls the data from the overlying layer by setting the signal
“tx_next_eth” to high. The Ctrl Demux of the interconnection logic shown in
fig. 4.7 switches this signal to the IP layer (signal “tx_start_ip” is high). At the next clock cycle
(position 15), the IP layer transmits its data, which appears with an additional delay of one
clock cycle in the frame of the Ethernet layer (signal “txd”, position 16). The IP layer
encapsulates the application data from the UDP layer into its frame in the same way. This can
be seen by the control signals “tx_next_ip” and “tx_start_udp” at position 33 and by the UDP
data (signal “udp_txd”) and the IP data (signal “ip_txd”) at positions 34 and 35, respectively.
Finally, the MAC composes the entire packet as shown in fig. A.1.

[Figure: block diagram. Labels: Layer N, 0; Layer N+1, 0; Layer N+1, 1; Arbiter N+1;
Ctrl Demux; Data Mux. Signals: data bus, control bus, arbiter bus.]
Figure 4.7.: Interconnections of layers with an arbiter and control logic. This architecture
eliminates the need for redundant data buffers in a layer stack.
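The decision of the arbiter can be sketched as follows. The Python example assumes a fixed-priority policy in which the first module with a non-empty FIFO is served; the text only states that the arbiter decides which module is served first, so the policy, the function name and the module names are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def arbitrate(states: Sequence[Tuple[str, bool]]) -> Optional[str]:
    """Select the module of layer N+1 that is served next.

    states: (module name, fifo_empty) pairs, listed in descending priority.
    Returns the first module with valid data, or None if all FIFOs are empty.
    """
    for name, fifo_empty in states:
        if not fifo_empty:
            return name
    return None


# Example for the modules above the IP layer (ICMP listed with higher priority).
selected = arbitrate([("icmp", True), ("udp", False)])
```

In hardware, the returned selection would correspond to the information passed over the arbiter bus, which then steers the Ctrl Demux and the Data Mux.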
4.3.4. Clock synchronization
An important issue in a distributed DAQ is a uniform clock distribution. Although a dedicated
clock line is a simple and precise solution, it cannot be used for an absolute synchronization
of all timestamps in the system. For this purpose, an additional data signal for the
transmission of a known timestamp reference is needed. The PTP offers the possibility to
synchronize the timestamps over a data link. Additionally, a Gigabit Ethernet link has the
property that a transmission clock is embedded in the datastream, because the transferred
data is synchronous to this reference clock. As a consequence, a receiver can recover this
clock frequency. In a 1000BASE-T application, the slave recovers the clock of the master
from the datastream. This task is done by the PHY (see fig. 4.8). It is therefore possible to
synchronize the clock signals as well as the timestamps over a single Gigabit Ethernet link.
It is also known that the clock offset between a master and a slave cannot be corrected with PTP
below a resolution of 8 ns (this corresponds to the period of the 125 MHz transceiver clock)
without a phase alignment of the clocks. An accurate implementation already exists with the
White Rabbit project (Moreira et al., 2009), but it does not support a 1000BASE-T link by
default. A synchronization over a 1000BASE-T link was done by (Girerd et al., 2009). They
achieved a precision of 180 ps with the DP83865 from Texas Instruments and an FPGA from Altera.
The limiting factor was the jitter of the built-in PLL of the FPGA. Our implementation is based on
Xilinx FPGAs with an improved jitter. We therefore want to determine the absolute precision which
is achievable with these devices and different ICs for the physical layer. The implemented
clocking scheme is shown in fig. 4.8. Each PHY is configured through the MDIO interface to act
as a master or as a slave. During the autonegotiation procedure, these configurations are
advertised.

[Figure: clocking scheme. Labels on the master side: FPGA with Gigabit Ethernet IP core with
PTP, PLL, master clock 125 MHz, GMII 125 MHz, synchronized clock, pulse per second, PHY
(master), crystal oscillator 25 MHz. Labels on the slave side: PHY (slave), crystal oscillator
25 MHz, recovered clock 125 MHz, GMII 125 MHz, FPGA with Gigabit Ethernet IP core with PTP,
PLL, synchronized clock, pulse per second. Both PHYs are connected by the 1000BASE-T link.]
Figure 4.8.: Scheme of the clock synchronization through a point-to-point connection over
a 1000BASE-T link. One PHY acts as master and embeds the clock reference
into the datastream. The slave recovers a synchronized clock signal with a
frequency of 125 MHz. A PLL inside the FPGA is used to build up the clock
tree. Our Ethernet IP core provides synchronized timestamps and a pulse per
second for the test setup.
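The timestamp synchronization with PTP can be illustrated with the delay request-response exchange defined in IEEE 1588. The text does not state which PTP mechanism the core uses, so the following Python sketch is an assumption. The names t1 to t4 follow the usual convention (Sync sent by the master, Sync received by the slave, Delay_Req sent by the slave, Delay_Req received by the master), and a symmetric link delay is assumed.

```python
from typing import Tuple


def ptp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Return (slave clock offset, one-way link delay) from four timestamps.

    t1 and t4 are taken with the master clock, t2 and t3 with the slave clock.
    All values must use the same unit, e.g. nanoseconds.
    """
    master_to_slave = t2 - t1  # link delay + offset
    slave_to_master = t4 - t3  # link delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Example in ns: the slave clock is 96 ns ahead, the link delay is 40 ns.
example_offset, example_delay = ptp_offset_and_delay(t1=1000, t2=1136, t3=2000, t4=1944)
```

If the timestamps are taken with the 125 MHz transceiver clock, they are multiples of 8 ns, which is consistent with the 8 ns limit of the offset correction mentioned above.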
4.4. Measurements and results
For our performance test on a 1000BASE-T link we use the Xilinx evaluation board SP605
equipped with a Spartan 6 (LX45T) FPGA and the PHY 88E1111 from Marvell. We also
use an FPGA Mezzanine Card (FMC) equipped with two PHYs from Texas Instruments
(DP83865) attached to the SP605. The host is a MicroTCA crate equipped with an AMC
CPU module from Concurrent Technologies (AM 900/412-42) and a MicroTCA Carrier Hub
(MCH) from N.A.T. (NAT-MCH-PHYS). The CPU module provides two 1000BASE-T ports at
the front and two 1000BASE-KX ports at the backplane. A 1000BASE-KX link to the CPU is
established with a Kintex 7 (325T) on the HGF-AMC from DESY/KIT through the MicroTCA
backplane and the switch from the MCH. The operating system on the host is Ubuntu.
To evaluate the MAC layer and the latency of the entire stack, we measured its output
signals on the GMII. Maximum throughput is achieved if the transmit enable signal (see
“phy_txen” in fig. A.1) is high all the time except during the IFG. A constant latency
is achieved if the beginning of a transmission cycle and the arrival time of a packet at the
receiver have a time deviation much smaller than a clock cycle. Both conditions were
verified experimentally, which indicates that the MAC layer is capable of transferring the
maximum throughput with a constant latency (see fig. A.3). The measurement shown in
fig. A.3 was captured with an oscilloscope during a transmission of UDP packets with a
fixed payload of 20 Byte. This test was chosen to verify the maximum throughput, the constant
latency of the core (see measurements “packet length” and “IFG” in fig. A.3) and the overall
system latency between two PHYs (see measurement “Phy1-Phy2 latency” in fig. A.3). For
this setup we used the Marvell 88E1111 on both sides, connected by a cable of 50 cm length.
4.4.1. Throughput performance
To check the performance of our FPGA implementation with a 1000BASE-T PHY, we
established a point-to-point connection between the FPGA and the CPU module. The host
serves a UDP socket where the incoming data throughput is measured. The data from the
FPGA contain an increasing 32 bit counter value which is used to identify a missing packet
or a corrupted datastream. For this measurement, the theoretical throughput of UDP on a
Gigabit Ethernet link is our reference. As mentioned in sec. 4.2, this value is 114.09 MiB/s
for a payload of 1472 Byte. If the payload is decreased, the data throughput decreases as well
because of the increasing share of protocol overhead. Tab. 4.1 shows the achievable data
throughput dependent on the UDP payload. Additional overhead in the Ethernet packet limits
the line rate. Although the Ethernet standard limits the MTU per frame to 1500 Byte (this
results in a UDP payload of 1472 Byte), it is also common to use jumbo frames with an MTU
of 9000 Byte (8972 Byte UDP payload). Our implementation supports jumbo frames, and
this performance is also evaluated with the MicroTCA host. The measurement of data
throughput at the host also requires a measurement of the time for the corresponding amount
of bytes. Because Linux is not a real-time operating system, the resolution of these time
measurements is coarser than the nanosecond scale. The setup with a Linux host is nevertheless
sufficient for estimating the average data throughput. Our application measures the incoming
bytes on the socket, checks whether the data is valid and prints out the error rate and the
average data throughput every ten seconds. The results of the data throughput tests with three
different devices on the physical layer are shown in tab. 4.2. An example of a measurement of
data throughput over 9 hours is shown in fig. 4.9. During this measurement all data were
transferred without errors. The standard deviation of the data throughput was 2196 Byte/s.
This is caused by uncertainties in the time measurement and the latency of packet buffering
in the operating system.

Table 4.1.: Theoretical data throughput dependent on the payload of a UDP packet.
UDP payload / Byte   Data throughput / (MiB/s)   Fraction of line rate (1 GBit/s)
8972                 118.339                     99.3 %
1472                 114.094                     95.7 %
1024                 111.991                     93.9 %
512                  105.597                     88.6 %
256                  94.775                      79.5 %
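The check of the received counter values can be sketched as follows. The Python example assumes that every packet carries one counter value which increases by one from packet to packet and wraps around at 32 bit. The text only states that an increasing 32 bit counter is transmitted, so this layout and the function name are assumptions.

```python
from typing import Iterable, Optional


def count_missing(counters: Iterable[int], modulo: int = 2 ** 32) -> int:
    """Count skipped counter values in a received sequence (with wrap-around)."""
    missing = 0
    previous: Optional[int] = None
    for value in counters:
        if previous is not None:
            gap = (value - previous) % modulo
            if gap > 0:
                missing += gap - 1  # a gap of 1 means consecutive packets
        previous = value
    return missing


# Example: the packet with the counter value 2 was lost.
lost = count_missing([0, 1, 3, 4])
```

A host application along these lines would report an error whenever the returned value is greater than zero.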
The results show an excellent performance up to the theoretical limit of the UDP data
throughput. The values above this limit are caused by frequency uncertainties of the
transmission clock. The reference values from tab. 4.1 are calculated for a clock frequency
of 125 MHz. A fixed deviation from that frequency and the mentioned lack of a precise time
measurement on a Linux system can cause a data throughput value above the reference
value. These tests also show the importance of an efficient host as receiver. If the host is
not configured appropriately, packet losses will happen. In our configuration, packet losses
at the host receiver only occur at payloads smaller than approximately 256 Byte. This is
caused by the excessive load of more than 388 kPackets/s.

[Figure: histogram of normalized counts / a.u. versus data throughput / (MiB/s);
µ = 114.112 MiB/s, σ = 2196 Byte/s, N = 3.2k]
Figure 4.9.: Distribution of data throughput for 1472 Byte UDP payload measured over
9 hours.

Table 4.2.: Measured data throughput dependent on the UDP payload. The tests were
performed with different hardware platforms. Data throughput is the mean value for
more than 100 s.
Physical layer   Hardware          UDP payload / Byte   Data throughput / (MiB/s)
1000BASE-T       TI DP83865        8972                 118.344
1000BASE-T       Marvell 88E1111   8972                 118.345
1000BASE-KX      Xilinx GTX        8972                 118.316
1000BASE-T       TI DP83865        1472                 114.112
1000BASE-T       Marvell 88E1111   1472                 114.097
1000BASE-KX      Xilinx GTX        1472                 114.079
1000BASE-T       TI DP83865        1024                 112.015
1000BASE-T       Marvell 88E1111   1024                 111.995
1000BASE-KX      Xilinx GTX        1024                 111.977
1000BASE-T       TI DP83865        512                  105.643
1000BASE-T       Marvell 88E1111   512                  105.641
1000BASE-KX      Xilinx GTX        512                  105.619
1000BASE-T       TI DP83865        256                  94.827
1000BASE-T       Marvell 88E1111   256                  94.836
1000BASE-KX      Xilinx GTX        256                  94.823

As a second test we measured the performance of the ICMP protocol layer with ordinary Ping
requests from the host to the FPGA device. A Ping request is processed by an interrupt
routine of the microcontroller inside the FPGA. The host generated at least 1000 Ping
requests at an interval of 200 ms. The results are shown in tab. 4.3. The Ping requests were
sent while the FPGA transmitted UDP packets with maximum data throughput. The arbiter of the
IP layer prioritizes the ICMP protocol, so that a parallel UDP datastream did not block the
ICMP layer. Because of the switch of the MCH for the 1000BASE-KX backplane links, the
round-trip time (RTT) for a Ping request is higher than for a point-to-point connection. If the
UDP layer application was turned off, the RTT of the backplane link decreased to 0.252 ms
with a standard deviation of 0.051 ms. The same test was done for the ARP layer, which has a
higher priority than the IP layer in the protocol stack. The Linux host generated additional
ARP requests with the arping command while the FPGA handled the ICMP and UDP layers. As a
result, all requests were served without losses with an RTT below 1 ms for all hardware
platforms.

Table 4.3.: The results of the ICMP layer test with Ping requests. The host generated 1000
Ping requests and the round-trip time is measured (min, mean, and max value).
Physical layer   Hardware          Min / ms   Mean / ms   Max / ms   Std.-dev / ms
1000BASE-T       TI DP83865        0.203      0.315       0.444      0.067
1000BASE-T       Marvell 88E1111   0.189      0.307       0.547      0.066
1000BASE-KX      Xilinx GTX        0.740      1.091       1.441      0.162
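The packet rate of more than 388 kPackets/s quoted above for a payload of 256 Byte can be checked with a one-line estimate. The Python sketch assumes a saturated link and a per-packet overhead of 66 Byte on the wire (preamble and SFD, Ethernet header, FCS, a minimum IFG of 12 Byte, IPv4 header, UDP header); this overhead is an assumption that is not stated in the text.

```python
LINE_RATE = 125_000_000  # Byte/s on a Gigabit Ethernet link
WIRE_OVERHEAD = 66       # assumed bytes per UDP packet besides the payload


def packets_per_second(payload: int) -> float:
    """Packet rate on a saturated link for a given UDP payload size in bytes."""
    return LINE_RATE / (payload + WIRE_OVERHEAD)


rate_256 = packets_per_second(256)    # small packets
rate_1472 = packets_per_second(1472)  # standard MTU
```

For 256 Byte the estimate is slightly above 388000 packets per second, while a payload of 1472 Byte leads to roughly 81000 packets per second, which illustrates why small payloads stress the host receiver.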
4.4.2. Synchronization
The performance of the clock synchronization is limited by the accuracy of the clock recovery
system in the signal chain of the 1000BASE-T slave. Whereas the clock of the master can
achieve the desired precision by choosing an appropriate clock source, the precision of the
clock of the slave depends on the components in its signal chain for clock distribution and
recovery (see fig. 4.8). For our evaluation hardware, the quality of the signal chain is mainly
determined by the PHY, which is responsible for the clock recovery from the datastream.
The second component which influences the achievable precision is the FPGA, where the
recovered clock is used for timestamp generation. Usually a PLL inside the FPGA is used
to build the clock tree for all clock domains. We therefore want to evaluate whether the
built-in PLL limits the overall system. At first we measured the phase noise of a clock source
with very low jitter, which is used as the input signal for the PLL. The phase noise
is correlated with the random jitter and therefore determines the precision of the timing
system. The measurements of the low jitter clock and of the built-in PLL of
the FPGA (Spartan 6 LX45T) with that input are shown in fig. 4.10. These measurements
were taken with an HA7062B phase noise analyzer from Holzworth Instrumentation and the
signal generator SMA100A from Rohde & Schwarz as low jitter clock source. Although
various configurations are possible for the PLL, for this measurement we set both the
multiplier and the divider value to 8. The integrated phase noise of the Xilinx PLL was found
to be 6.47 ps in the range of 10 Hz to 1 MHz. Without any additional hardware, this
constitutes a design limit for the precision of synchronous timestamp generation with the
FPGA.

[Figure: phase noise / (dBc/Hz) from -180 to -80 versus frequency offset / Hz from 10^1 to
10^6; curves: Xilinx PLL 125 MHz, clock source 125 MHz]
Figure 4.10.: A measurement of the phase noise of a 125 MHz low jitter clock source (red)
which sources a PLL in a Spartan 6 LX45T. The corresponding output of the
PLL (blue) has an integrated phase noise of 6.47 ps in the range of 10 Hz to
1 MHz.

To find out the random jitter of the clock synchronization over a 1000BASE-T link,
we set up a point-to-point link between a master and a slave PHY and measured the clock
to clock jitter in the time domain. The clock of the master triggers the measurement of the
time difference between the rising edges of both clocks. The results of the measurement for
the PHY DP83865 (master and slave) are shown in fig. 4.11. The clock signal is measured
at an output pin of the FPGA with a frequency of 125 MHz (see fig. 4.8). It is buffered
with an output register (ODDR2 primitive from Xilinx). During the measurement illustrated
in fig. 4.11, the FPGA sent UDP packets with maximum throughput and PTP packets at
an interval of 1 s. For another measurement, we bypassed the PLL and distributed the
recovered slave clock to the timing logic with an ordinary built-in clock buffer. In this
configuration, the jitter increased from 55 ps to 64 ps. Finally, we repeated the measurement of
fig. 4.11 with the PHY 88E1111 for the master and the slave. With this setup we achieved a
clock to clock jitter of 70 ps. In both setups the clock source of the master was a crystal
oscillator with approximately 6 ps random jitter (measured with the phase noise analyzer in
the range from 10 Hz to 1 MHz). As a result, we can state that the precision is influenced
by all components in the signal chain. It depends mainly on the clock recovery system
of the PHY and the ability of the PLL of the FPGA to reduce random jitter.

In addition to the measurement of the clock to clock jitter, we have taken measurements to
estimate the precision of synchronization with timestamps generated by the master and the slave.
Both devices run on the synchronized clock signal with the same frequency. An absolute
synchronization of the timestamps is performed with PTP every second. Each synchronized
device generates one pulse per second (PPS) at an output of the FPGA. A measurement of
the time difference between the PPS signals of the master and the slave is shown in fig. A.4.
The measurement ran for 13 hours in the lab and shows that the timestamps are
synchronized with a random jitter of 59 ps. The digital logic for timestamp generation in the
FPGA was sourced by a clock signal with a frequency of 125 MHz from the built-in PLL. For
the measurements shown in fig. A.4 we used the DP83865. The same measurements were
repeated with the 88E1111 PHY and resulted in a random jitter of 72 ps for the timestamp
synchronization in a short-term measurement (approx. 2 hours). All measurements show
a constant offset of up to 8 ns which cannot be reduced with the PTP.
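The relation between a measured phase noise curve and the integrated jitter in ps can be sketched as follows. The Python example uses the common conversion in which the single-sideband phase noise L(f) in dBc/Hz is integrated over the offset range and the RMS jitter is sqrt(2 ∫ 10^(L(f)/10) df) / (2π f0). The text does not state how the analyzer performs the integration, so the trapezoidal rule and the function name are assumptions, and the flat noise floor in the example is synthetic data, not the measured curve of fig. 4.10.

```python
import math
from typing import Sequence


def rms_jitter_seconds(offsets_hz: Sequence[float],
                       noise_dbc_hz: Sequence[float],
                       carrier_hz: float) -> float:
    """Integrate single-sideband phase noise to RMS jitter (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(offsets_hz)):
        lin_prev = 10 ** (noise_dbc_hz[i - 1] / 10)  # dBc/Hz to linear ratio per Hz
        lin_curr = 10 ** (noise_dbc_hz[i] / 10)
        area += 0.5 * (lin_prev + lin_curr) * (offsets_hz[i] - offsets_hz[i - 1])
    phase_rms_rad = math.sqrt(2 * area)  # factor 2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * carrier_hz)


# Synthetic example: flat -140 dBc/Hz from 10 Hz to 1 MHz on a 125 MHz carrier.
example_jitter = rms_jitter_seconds([10.0, 1e3, 1e6], [-140.0, -140.0, -140.0], 125e6)
```

For a real measurement, the offset frequencies and noise values would be taken from the exported trace of the phase noise analyzer, ideally with many points per decade.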
[Figure: histogram of normalized counts / a.u. versus Δt / ns (master to slave);
µ = 3.901 ns, σ = 54.9 ps, N = 483k]
Figure 4.11.: Clock to clock jitter between the master and the slave PHY. Both are the
DP83865. The standard deviation σ of the distribution is about 55 ps.
4.4.3. Resource utilization
Our Ethernet IP core can be configured with an arbitrary FIFO size for the application above
the UDP layer. Our basic configuration consists of two channels for the application interface
to the UDP layer. One channel is interfaced by the microcontroller and one by
a high-throughput application. On each interface there is one FIFO for the payload with a
depth of at least two UDP packets. The payload size of one packet is set by generics and
is 1472 Byte by default. For the support of jumbo frames this size can be easily adjusted
to 8972 Byte. Larger FIFO depths and payload sizes are possible as well. The FIFOs are
implemented in dual-port block memory integrated in the Xilinx FPGA. They can also be
placed in distributed slice registers. The ICMP, PTP and ARP layers can each store one packet
to send. All receiving datapaths are configured to store one packet as well. Because
various configurations are possible, we refrain from a comparison to other implementations.
The resource utilization reported by the Xilinx tools (ISE 14.7) for the Ethernet stack with
support for ARP, PTP, ICMP and UDP is presented in tab. 4.4 and tab. 4.5. The 1000BASE-T
implementation with support for PTP and an MTU of 1500 Byte on a Spartan 6 FPGA with
6822 slices occupies 1792 slices, corresponding to 26.27 % of the total slices (16.42 %
without PTP).

Table 4.4.: Slice logic utilization for the Gigabit Ethernet stack with a Spartan 6 (LX45T)
Module     Slices   Slice Reg   LUTs   LUTRAM   BRAM
MAC        140      459         345    16       0
Ethernet   246      479         648    0        0
ARP        163      498         492    0        0
IP         274      546         669    0        0
ICMP       60       169         108    24       1
UDP        237      551         573    1        9
PTP        672      1890        2071   0        0
Total      1792     4592        4906   41       10

The implementation on the Kintex 7 is designed for a 1000BASE-KX link on a MicroTCA
backplane. It uses a Xilinx IP core with a GTX transceiver as PHY. This
consumes additional logic but does not need an external PHY. This implementation aims only
at maximum data throughput and is not designed to perform a synchronization over the
MicroTCA backplane. Thus, the PTP layer is not included. The 1000BASE-KX implementation
without support for PTP and with an MTU of 1500 Byte on a Kintex 7 FPGA with 50959 slices
occupies 1359 slices, corresponding to 2.67 %. All reports for slice logic utilization also
include some logic for internal tests and debug options.

Table 4.5.: Slice logic utilization for the Gigabit Ethernet stack with a Kintex 7 (325T)
Module        Slices   Slice Reg   LUTs   LUTRAM   BRAM
GMII_to_GTX   446      997         826    71       0
MAC           94       299         276    32       0
Ethernet      137      378         419    0        0
ARP           173      498         485    0        0
IP            250      546         684    0        0
ICMP          61       169         114    24       1
UDP           198      580         580    1        5
Total         1359     3467        3384   128      6
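The quoted utilization percentages follow directly from the slice counts in tab. 4.4 and tab. 4.5. The following Python sketch recomputes them; the dictionary layout and the function name are illustrative, the numbers are taken from the tables and the text.

```python
# Occupied slices per module, copied from tab. 4.4 and tab. 4.5.
SPARTAN6_SLICES = {"MAC": 140, "Ethernet": 246, "ARP": 163, "IP": 274,
                   "ICMP": 60, "UDP": 237, "PTP": 672}
KINTEX7_SLICES = {"GMII_to_GTX": 446, "MAC": 94, "Ethernet": 137, "ARP": 173,
                  "IP": 250, "ICMP": 61, "UDP": 198}

SPARTAN6_TOTAL_SLICES = 6822  # Spartan 6 (LX45T)
KINTEX7_TOTAL_SLICES = 50959  # Kintex 7 (325T)


def utilization_percent(modules: dict, device_slices: int, exclude: tuple = ()) -> float:
    """Share of occupied slices in percent, optionally without some modules."""
    used = sum(count for name, count in modules.items() if name not in exclude)
    return 100.0 * used / device_slices


spartan_with_ptp = utilization_percent(SPARTAN6_SLICES, SPARTAN6_TOTAL_SLICES)
spartan_without_ptp = utilization_percent(SPARTAN6_SLICES, SPARTAN6_TOTAL_SLICES, ("PTP",))
kintex = utilization_percent(KINTEX7_SLICES, KINTEX7_TOTAL_SLICES)
```

Rounded to two decimals, the three values agree with the percentages stated in the text.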
4.5. Conclusion
Motivated by the need for a high-throughput UDP application, we have presented a complete
stack architecture for a Gigabit Ethernet interface on an FPGA. The stack was built for the
protocols UDP, ICMP, IP, ARP and PTP and can be easily extended or cut down in functionality.
For a straightforward implementation we showed two basic models for the dataflow in a
stacked architecture. Our embedded Gigabit Ethernet protocol stack is designed with the
“Data-Pull” model to eliminate redundant buffers. A clear modular architecture for each layer
with a control and arbiter logic at the interconnections keeps this implementation versatile.
The underlying MAC and physical layer are also replaceable. All modules are written in
VHDL and tested on Xilinx Spartan 6 and Kintex 7. We demonstrated the achievable data
throughput with a UDP application on a 1000BASE-T and a 1000BASE-KX link. In both
cases we reached the maximum data throughput of 114.1 MiB/s with an MTU of 1500 Byte
and 118.3 MiB/s with jumbo frames of 9000 Byte. The overall performance for other use
cases is also excellent. Finally, we investigated the performance of a clock synchronization
over a 1000BASE-T link. Depending on the PHY, we achieved a precision of 55 ps for
the clock to clock jitter between the master and the slave. An absolute synchronization of
timestamps was done with PTP. The long-term test showed a standard deviation of 59 ps
for the synchronized timestamps. Due to the generic data interface, the UDP/IP stack can
be easily adapted to detector applications where a high data throughput is required. With
regard to the requirements of precise timing applications, the relative timing is in the
sub-nanosecond range whereas the absolute accuracy remains within the limits of PTP.

5. Experimental results
5.1. Digital pulse shapers
The fundamental methods for processing signals from nuclear electronics have been
described in (Kowalski, 1970; Nicholson, 1974; Spieler, 2005; Knoll, 2010). All of these works
cover the concepts based on analog electronics. Nowadays, nuclear electronics are gradually
adapted to digital signal processing. However, the digital counterparts of the traditional
analog processing chain have to be validated experimentally.
At first, the established pulse shapers for spectroscopy applications, often referred to as slow
shapers, are constructed and compared in the digital domain. Furthermore, the necessary
fast shapers for extracting timing information are derived with respect to special features of
the signals from CZT detectors. For this purpose, detector-like stimuli were injected into the
CSA (fig. 5.1), which responds with a typical exponentially decaying signal.
[Figure: block diagram and waveform plot. Block labels: Agilent 33500B signal generator,
CSA, FMC digitizer (100 MSPS, 14 bit), R&S SMA 100A signal generator, 100 MHz clock,
10 MHz sine, 10 MHz reference, signal points A, B and C. Plot: samples / ADC codes versus
time nT / µs for A: test pulse input and B: amplifier output.]
Figure 5.1.: The setup used for the evaluation of energy and timing properties of the
hardware. A step input with variable rise time (signal A) is used to generate a
detector-like output from the CSA (signal B) for the digitizer. The entire system
is sourced by a master clock generator.
The digital pulse shapers were evaluated with known test signals from the CSA with regard
to spectroscopy and timing performance. Afterwards, the performance of digital pulse
processing was evaluated with real detector signals from scintillation and CZT detectors. To
prove the versatility of the algorithms, signals of different types of detectors were digitized
and processed.
5.1.1. Spectroscopy application
It has been demonstrated (Hartog and Muller, 1947), that an optimum pulse shape in time
domain reduces the equivalent noise charge of a pulse height analysis. In general, the well-
known cusp pulse shape is regarded as the optimum pulse shape (Kowalski, 1970). “This
pulse shape has become the standard by which the performance of other methods of pulse
shaping are compared” (Knoll, 2010). The ideal cusp pulse shape has infinite length and
only a theoretical relevance. Whereas the cusp pulse is not exactly realizable by analog
electronics, digital electronics enable unlimited pulse shape synthesis. Since the shape of
a cusp is described by an exponential decay with time constant τ symmetric around the
peak, the amplitude falls off the -40 dB level at a duration above 5τ . Consequently infinite
cusp shape is sufficiently well approximated by an FIR filter of that length. An example is
illustrated in fig. 5.2. The shown example with a test pulse at the input of the presented CSA
[Figure 5.2, plots. Left: samples / ADC codes versus time nT / µs with the traces amplifier output, deconvolution and cusp shaper. Right: normalized counts versus pulse height / ADC codes; distribution with µ = 1265, σ = 2.15, N = 13k.]
Figure 5.2.: The output of the amplifier with a test pulse of 55 mV at its input. The deconvo-
lution of the exponential decay is implemented by an IIR filter. Finally, the output
of the IIR filter is shaped by an FIR filter with arbitrary step response, which is
in this case a cusp shape (left). The analysis of about 13k peak heights reveals
a standard deviation of 2.15 bins.
reveals an energy resolution of 2.15 bins (standard deviation σ) for a step input of 55 mV,
which corresponds to an SNR of 76.64 dB at -1 dB full scale input (14 bit). The presented
measurement is used as a reference for additionally evaluated digital pulse shapers. In
reference to the comparison of analog pulse shaping circuits from (Kowalski, 1970), the re-
sults are normalized to the relative noise charge QN,rel, where the approximated cusp shape
represents the unit value. On the basis of the established CR-(RC)n shapers mentioned in
(Spieler, 2005), the n filter stages of the equivalent analog circuit can be implemented by
an IIR filter. Increasing the number of digital low-pass filter stages requires no notable effort, in
contrast to an analog implementation. Moreover, we investigated suitable pulse shapes based
on well-known window functions for digital signal processing (Heinzel et al., 2002). Window
functions are implemented by an FIR filter as described in sec. 3.3.
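As an illustration of the FIR approximation of the cusp, the following Python sketch builds a two-sided exponential truncated at 5τ on each side of the peak and derives FIR taps whose response to a unit step reproduces that shape. The time constant of 20 samples, the helper names and the use of an ideal unit step as input (standing in for the deconvolved CSA signal) are assumptions made for this example; it is not the implementation described in sec. 3.3.

```python
import math

def cusp_shape(tau_samples, half_width_taus=5.0):
    """Two-sided exponential (cusp), truncated at +/- half_width_taus * tau."""
    half = int(round(half_width_taus * tau_samples))
    return [math.exp(-abs(k - half) / tau_samples) for k in range(2 * half + 1)]

def fir_from_step_response(shape):
    """FIR taps whose unit-step response follows `shape` and then returns to zero."""
    padded = [0.0] + list(shape) + [0.0]
    return [padded[k + 1] - padded[k] for k in range(len(padded) - 1)]

def fir_filter(taps, x):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k]."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]

tau = 20                                  # assumed time constant in samples (hypothetical)
shape = cusp_shape(tau)
taps = fir_from_step_response(shape)
step = [1.0] * len(taps)                  # idealized step after deconvolution (assumption)
response = fir_filter(taps, step)

# Level of the truncated edge relative to the peak, in dB.
edge_db = 20 * math.log10(shape[0] / max(shape))
```

The edge level evaluates to about -43 dB, which is consistent with the -40 dB criterion used for the truncation at 5τ.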
In fig. 5.3, the kernels of the pulse shapes are adapted from the predefined window functions
from the DSP system toolbox of Matlab (MathWorks, 2016a). A measurement with the same
procedure as illustrated in fig. 5.2 was used to evaluate the performance of the developed
[Figure 5.3, plots: amplitude versus time nT / µs. Left: CR-RC, CR-(RC)2, CR-(RC)4, CR-(RC)6, CR-(RC)8. Right: cusp, Gaussian, Kaiser, Nuttall, triangular.]
Figure 5.3.: Illustration of typical step responses of the well-known CR-(RC)n shapers im-
plemented by an IIR filter (left) and synthesizable pulse shapes derived from
established window functions (right). The window shapers are suitable for an
implementation by an FIR filter.
digital pulse shapers. As mentioned, the standard deviation of the peak heights is normalized
to that of the cusp shaper to obtain the relative noise charge QN,rel. The results related
to noise performance are shown in fig. 5.4.
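The reference figures of the cusp measurement can be retraced with a few lines of Python. The quoted SNR of 76.64 dB is reproduced if the signal amplitude is taken as the full 14-bit code range (2^14 codes) scaled by -1 dB and divided by the standard deviation of 2.15 codes. That this is the definition used for the measurement is an assumption of this sketch, as are the function names.

```python
import math

ADC_BITS = 14
SIGMA_CUSP = 2.15          # standard deviation of the cusp-shaped peak heights, ADC codes (fig. 5.2)
INPUT_LEVEL_DBFS = -1.0    # input level relative to full scale

def snr_db(sigma_codes, bits=ADC_BITS, level_dbfs=INPUT_LEVEL_DBFS):
    """SNR for an input at level_dbfs, assuming full scale equals 2**bits codes."""
    amplitude_codes = (2 ** bits) * 10 ** (level_dbfs / 20)
    return 20 * math.log10(amplitude_codes / sigma_codes)

def relative_noise_charge(sigma_shaper, sigma_reference=SIGMA_CUSP):
    """Q_N,rel: peak-height standard deviation normalized to the cusp shaper."""
    return sigma_shaper / sigma_reference

reference_snr = snr_db(SIGMA_CUSP)
```

With these assumptions `reference_snr` evaluates to about 76.64 dB, and the cusp shaper itself has a relative noise charge of exactly 1 by construction.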
[Figure 5.4, plots: QN,rel versus shaper peaking time / µs. Left: CR-RC, CR-(RC)2, CR-(RC)4, CR-(RC)6, CR-(RC)8. Right: cusp, Gaussian, Kaiser, Nuttall, triangular.]
Figure 5.4.: Resulting relative noise charge QN,rel for the IIR filter based pulse shapers (left)
and the FIR filter based pulse shapers (right). The relative noise charge is
normalized to the optimum case of the cusp shaper.
The application of digital pulse shapers reveals the expected performance with regard to
their analog equivalents. With an increasing number of cascaded IIR filter stages of the
CR-(RC)n shapers, the relative noise charge decreases. On top of that, the cusp shaper
performs best, whereas among the window shapers only the triangular pulse shape results
in a notable improvement of energy resolution. Other window based shapers, e.g. with a
Gaussian pulse shape, perform in the range of the CR-(RC)n shapers. The numerical values
are summarized in tab. 5.1.
The measured results of the digital pulse shapers are in accordance with the traditional
analog implementations presented in (Kowalski, 1970). A major benefit from the digital im-
plementation is actually that arbitrary pulse shapes can be synthesized. Even pulse shapes
which are “not exactly realizable” (Kowalski, 1970) by an analog implementation can be
synthesized and implemented by a digital circuit. By using the presented shapers for a
peak height analysis, the energy resolution of the signal processing chain is improved, as
the analog counterparts are practically not implementable. However, the presented performance
values have been derived from a signal with a fixed rise time. For realistic signals, e.g.
the cathode signal of the CZT pixel detector, the rise time varies and the performance of the
pulse shapers changes considerably due to a potential loss of peak amplitude (fig. 5.6).
The results of a peak height analysis with a fixed amplitude at the test input of the CSA but
a varying rise time are shown in fig. 5.5.
[Figure 5.5, plots: QN,rel versus shaper peaking time / µs. Left: CR-RC, CR-(RC)2, CR-(RC)4, CR-(RC)6, CR-(RC)8. Right: cusp, Gaussian, Kaiser, Nuttall, triangular.]
Figure 5.5.: Relative noise charge QN,rel for CZT-like signals with a varying rise time up to
500 ns. The corresponding energy resolution is worsened in comparison to the
measurement with a fixed rise time. The shapers with a widely spread top, e.g.
the Kaiser window, achieve the best results.
As shown in fig. 5.5, the relative noise charge is worsened for signals with varying rise times.
The shapers with a widely spread top achieve the best performance. Presuming that a
flat top improves the performance, the impact of a shaper with a flat top is shown by way of
example in fig. 5.6. It is obvious that the proposed window shapers have to be adapted to the features
[Figure 5.6, plots: samples / a.u. versus time nT / µs. Left and right panel each show input signals with rise times tp1 = 0 µs, tp2 = 0.5 µs and tp3 = 1 µs together with the corresponding shaped signals.]
Figure 5.6.: Illustration of the amplitude loss and therefore degradation of energy resolution
as a consequence of rise time variations (left). An adaptation of the pulse shaper
to the expected rise times eliminates the attenuation and improves energy res-
olution. A truncation of the pulse shape to a flat top with a duration equal to the
maximum rise time is convenient (right).
of signals from a CZT. Pulse shapes with a truncated flat top equal to the duration of the
maximum expected rise time perform best, as the peak amplitude is preserved. Truncated
shapers and the results with the same setup from fig. 5.5 are illustrated in fig. 5.7. With the
adaptation of shapers to the expected rise time variation, the energy resolution of a peak
[Figure 5.7, plots. Left: amplitude versus time nT / µs (truncated pulse shapes). Right: QN,rel versus shaper peaking time / µs for cusp, Gaussian, Kaiser, Nuttall, triangular.]
Figure 5.7.: Adaptation of the pulse shapes by truncation with a flat top (left). The resulting
relative noise charge QN,rel is improved and the optimum peaking time of the
shaper depends on the detector amplifier configuration.
height analysis is improved in contrast to the measurement shown in fig. 5.5. Again, the
cusp shaper achieves the best result, whereas the triangular (trapezoidal) shape outperforms
the typical window shapers. Consequently, the trapezoidal shaper even outperforms the
cusp shaper if the relative noise charge is weighted by resource utilization in an FPGA.
With a look at the achieved energy resolution (tab. 5.1), the numerical validation and the
ranking of performance are comparable to the analog circuits presented in (Kowalski, 1970).
However, the flexibility of an arbitrary pulse shape synthesis is not accessible with analog
electronics. Thus, for high energy resolution spectroscopy applications with a CZT pixel detector,
digital pulse processing is necessary, or even indispensable with regard to varying rise times of the
detector signal.
Table 5.1.: Resulting relative noise charge QN,rel and corresponding peaking time tp of dif-
ferent shapers for fixed rise time signals and variable rise times up to 500 ns
as expected from the CZT detector. The last column shows the results for the
truncated shapers with a flat top of 500 ns. The CR-(RC)n shapers cannot be
truncated by the cascaded IIR filter structure.
                   Fixed rise time      Variable rise time    + truncated shaper
Shape              tp         QN,rel    tp          QN,rel    tp         QN,rel
Cusp               5.50 µs    1.00      15.00 µs    7.97      6.00 µs    1.21
Triangular         2.00 µs    1.06      13.50 µs    2.61      2.00 µs    1.24
Nuttall            2.50 µs    1.15       6.50 µs    1.50      2.50 µs    1.28
Gaussian           2.00 µs    1.17       5.00 µs    1.50      2.00 µs    1.28
Kaiser             2.00 µs    1.28       3.90 µs    1.50      1.50 µs    1.39
CR-(RC)n, n = 1    1.59 µs    1.33       2.50 µs    1.49      -          -
CR-(RC)n, n = 2    1.36 µs    1.22       3.18 µs    1.47      -          -
CR-(RC)n, n = 4    1.59 µs    1.16       4.32 µs    1.48      -          -
CR-(RC)n, n = 6    1.82 µs    1.14       5.23 µs    1.48      -          -
CR-(RC)n, n = 8    2.27 µs    1.13       6.14 µs    1.48      -          -
5.1.2. Timing applications
The established methods for timestamp generation from a voltage signal are mainly based
on triggering on a threshold crossing of a shaped signal. Detecting that point in time on a sharp
rising edge with low noise distortion is much easier and less error-prone than
processing noisy signals with long rise (or fall) times. Hence, the pulse shapes derived for
peak amplitude measurement, which often slow down the rising edge, are practically not
suitable for extracting timing information, as the crossing point is smeared out.
In order to find the optimum timing method for CZT detector signals, the performance
of a digital constant fraction discriminator (CFD) applied to the unprocessed output signal of
the charge sensitive amplifier is analyzed first. With the same setup used for the energy
measurements with the test pulse (fig. 5.1), the input and output from the CSA as well as
the timing reference from the master clock source have been sampled by the digitizer. For a
proof of concept, the signals were initially processed offline in software. The
timing reference is extracted by a four parameter sine fit, such that zero crossing points are
identified by the calculated phase angle of the sine wave. The digital CFD is implemented to
operate on a stored sequence of samples containing the whole pulse of a triggered event.
The peak value of the pulse is extracted and the position of the corresponding fraction before
the peak position is used for calculating the timestamp information. A linear interpolation
between the two sample points enclosing the threshold results in a timestamp accuracy better
than the sampling period of 10 ns of the FMC digitizer. The intention of that experiment is to
obtain the maximum achievable timing resolution with the CFD method applied to signals equal
to that of a CZT pixel detector.
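The described procedure (peak search, fraction threshold, backward search along the leading edge, linear interpolation) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used for the measurement: the function names, the idealized ramp pulses and the zero baseline are hypothetical.

```python
def digital_cfd(samples, fraction, period_ns=10.0):
    """Timestamp in ns at which the leading edge crosses fraction * peak.

    Assumes a positive-going pulse on a zero baseline (assumption of this sketch).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    peak_index = max(range(len(samples)), key=lambda i: samples[i])
    threshold = fraction * samples[peak_index]
    # Walk back from the peak to the last sample at or below the threshold.
    i = peak_index
    while i > 0 and samples[i - 1] > threshold:
        i -= 1
    if i == 0:
        raise ValueError("leading edge is not contained in the record")
    lo, hi = samples[i - 1], samples[i]
    # Linear interpolation between the two samples enclosing the threshold.
    return (i - 1 + (threshold - lo) / (hi - lo)) * period_ns

def ramp_pulse(rise_samples, start=5, length=40, amplitude=1000.0):
    """Idealized CSA output: linear rise over rise_samples, then flat."""
    return [amplitude * min(max((k - start) / rise_samples, 0.0), 1.0)
            for k in range(length)]

t_fast = digital_cfd(ramp_pulse(4), 0.45)   # 40 ns rise time
t_slow = digital_cfd(ramp_pulse(8), 0.45)   # 80 ns rise time
walk_ns = t_slow - t_fast
```

For these two idealized pulses, which start at the same instant, the timestamps differ although the interpolation resolves fractions of the 10 ns sampling period. This rise time dependence is the time walk discussed further below.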
Depending on the amplitude of the output signal, the precision of the digital CFD method
is measured by the standard deviation of the timestamp difference σ(∆t) between the zero
crossing point of the sine wave and the crossing point of the fraction dependent threshold
(fig. 5.8). The best fraction for the CFD is determined experimentally and lies around a
[Figure 5.8, plots of σ(∆t) / ps. Left: versus amplitude / ADC codes (rise time = 10 ns, fraction = 0.45). Middle: versus fraction of CFD for A = 7046, A = 9389 and A = 14020. Right: versus rise time / ns (fixed amplitude A = 7046, fraction = 0.45).]
Figure 5.8.: The achievable timing precision (standard deviation σ of the timestamp differ-
ence ∆t) is improved with an increased amplitude A of the signal, i.e. the SNR
is increased (left). Further, the precision depends on the selected fraction of the
digital CFD (middle) and the rise time of the signal (right).
value of 0.45. For the evaluated signal shape, that value corresponds to the part of
the signal where the slope is closest to linear. An increasing SNR also improves
the timing precision. Our setup achieves a resolution in the range of 30 ps, which almost
hits the timing limits of the signal generator from Agilent (Keysight Technologies, 2016). The
measurement with a fixed input amplitude but varying rise time (fig. 5.8, right) confirms
the assumption, that a slow rising edge degrades the timing performance. As a result, the
synchronized digitizer is capable of a timing precision much better than the sampling period.
Depending on the features of the detector signal, a timing far below 1 ns is possible, which
is at first glance sufficient for a timing application with a CZT detector.
Besides the degradation of timing precision σ(∆t), the varying rise times also have an impact
on the accuracy of timing. With a slow rise time, the absolute timestamp difference between the
zero crossing of the reference signal and the threshold crossing of the detector signal is
increased. Although the time walk can be reduced by small fractions for the CFD (fig. 5.9),
that effect dominates the overall timing performance. Moreover, the precision
would be degraded by a smaller fraction. An accuracy in the range of several
[Figure 5.9, plot: rise time / ns versus ∆t / ns for fCFD = 0.05, fCFD = 0.25 and fCFD = 0.45.]
Figure 5.9.: Illustration of the rise time dependent time walk of the constant fraction discrim-
inator (CFD). A lower fraction fCFD of the CFD reduces the walk, but also lowers
the precision σ.
tens of nanoseconds is unacceptable for most pulsed beam applications, as the time walk
overlaps with the repetition period of the accelerator. To overcome the rise time dependent time
walk of the detector signals, the variation in rise time has to be compensated. Several
approaches have been proposed for large volume detectors (Fouan and Passerieux, 1968;
Kozyczkowski and Bialkowski, 1976), but they are all based on sophisticated analog circuits. In
the digital domain, the rise time compensation can be constructed intuitively as sketched in
tab. 5.2.
To fulfill the first requirement of a fast rising edge for precise timing, a convenient method is the
calculation of the first derivative of the signal data (tab. 5.2, first row). By doing so, the nearly linear
slope of the CSA output is transformed into a rectangular shaped signal which correlates with
the current of the detector. A timestamp generated from that signal would be independent
of the charge collection time, as it marks the time when the prompt event hits the detector and
the current therefore starts to flow. However, the calculation of the derivative in the
time domain also amplifies high frequency noise. With regard to practical constraints,
the resulting SNR is too low for precise timing. Nevertheless, the introduced noise level can
be reduced by an additional low-pass filter. An averaging FIR filter transforms the assumed
rectangular current pulse into a triangular current pulse (tab. 5.2, second row). If the length
of the averaging shaper equals the minimum drift time, the amplitude of the shaped signal
remains constant at its maximum level.
Table 5.2.: Steps for compensating a variable rise time to a fixed one. Firstly, the output
signal of the CSA is differentiated to obtain a nearly rectangular shape. Secondly,
the rectangle pulse shape is averaged to reduce noise and to limit the rise time
to a fixed value.
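Row   Input               Filter                          Output
1     Y (CSA output)      1 − z^−1                        Y´ (nearly rectangular pulse)
2     Y´                  1 + z^−1 + ... + z^−n           Y* (pulse with fixed rise time)
[The input and output cells of the original table are small signal plots over the time axis −2T ... nT.]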
Finally, the necessary differentiation with coefficients of the form bd = (1, −1) and the
unscaled averaging filter with n coefficients of value 1, ba = (1, ..., 1), can be implemented by
a single FIR filter. After a convolution of both kernels, the resulting FIR filter is of the form
$$ b_c = (1, 0, \ldots, 0, -1) , \tag{5.1} $$
where the number of zero coefficients is n−1. The application of the filter from eq. 5.1
to a simulated signal is illustrated in fig. 5.10. Here, a signal with different rise times t1, t2
[Figure 5.10, plot: samples / a.u. versus time nT with the time marks 0, t1, t2, t3, tf, tc; traces for the rise times t1, t2 and t3 and the corresponding compensated signals.]
Figure 5.10.: Application of the compensation filter from eq. 5.1 to a simulated signal with
different rise times t1, t2, t3. The variation in rise time is compensated at the
cost of an amplitude loss.
or t3 is transformed by the filter to a signal with constant rise time (tc − tf ). The rise time is
compensated at the cost of a reduced amplitude of the output signal, which also degrades
the SNR and therefore timing precision, as illustrated in fig. 5.11.
[Figure 5.11, plot: rise time / ns versus ∆t / ns for the fractions fCFD = 0.05, 0.25 and 0.45. Annotated (σ, µ) pairs in ns, in the order printed. fCFD = 0.05: (0.33, 13.33), (0.70, 13.48), (1.05, 13.63), (1.50, 13.63). fCFD = 0.25: (0.25, 35.60), (0.52, 35.74), (0.76, 35.88), (1.01, 36.03). fCFD = 0.45: (0.25, 53.68), (0.53, 53.87), (0.79, 53.90), (1.02, 54.01).]
Figure 5.11.: Resulting accuracy and precision of the digital constant fraction discriminator
applied to a rise time compensated signal with a fixed amplitude. The com-
pensation filter was chosen to compensate rise times larger than 100 ns. The
accuracy is below 400 ps, while the precision is at least ten times better than
the sampling period T of 10 ns.
Finally, the walk of the measurement shown in fig. 5.9 is compensated for a minimum rise
time of 100 ns. With the sampling period of 10 ns, the compensation FIR filter is of length 11
with 9 zero coefficients according to eq. 5.1. The measurement with the rise time compensation
shows that the accuracy is improved to a value of about 400 ps, whereas the precision
remains almost constant in comparison to the uncompensated dataset shown in fig. 5.8.
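The construction of the compensation filter and its effect on idealized signals can be retraced with the following Python sketch. It convolves the differentiator with an averaging kernel of n = 10 coefficients (100 ns at a sampling period of 10 ns) and applies the result to linear ramps with different rise times. The ramp model and all function names are assumptions of this example.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def fir_filter(taps, x):
    """Direct-form FIR: y[m] = sum_k taps[k] * x[m - k]."""
    return [sum(t * x[m - k] for k, t in enumerate(taps) if m - k >= 0)
            for m in range(len(x))]

def ramp(rise_samples, start=5, length=80):
    """Idealized CSA output: linear rise to 1.0 over rise_samples, then flat."""
    return [min(max((k - start) / rise_samples, 0.0), 1.0) for k in range(length)]

N_AVG = 10                       # 100 ns minimum rise time / 10 ns sampling period
b_d = [1, -1]                    # differentiator
b_a = [1] * N_AVG                # unscaled averaging filter
b_c = convolve(b_d, b_a)         # combined kernel of eq. 5.1

def samples_to_peak(rise_samples, start=5):
    """Rise duration (in samples) and peak value of the compensated signal."""
    y = fir_filter(b_c, ramp(rise_samples, start))
    peak = max(y)
    first_peak = next(k for k, v in enumerate(y) if v >= peak - 1e-12)
    return first_peak - start, peak

results = {r: samples_to_peak(r) for r in (10, 20, 40)}
```

In this idealized model every input with a rise time of at least 10 samples reaches its maximum after exactly 10 samples, while the peak value drops in proportion to the ratio of the filter length to the rise time. This is the amplitude loss mentioned above.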
5.2. γ-ray spectroscopy
The developed hardware and algorithms are evaluated with different detectors for γ-ray
spectroscopy. For evaluation and calibration of the detectors, a 22Na radioactive source is
used. The positron emitted in its β+ decay annihilates with an electron and produces a coincident
pair of γ-rays with an energy of 511 keV each, which can be used for energy as well as timing
measurements. For an advanced analysis of the performance of the CZT pixel detector,
additional radioactive sources have been used. Firstly, a 241Am source is used to derive
the energy resolution in the low energy range of 60 keV. Secondly, for linearity and a more
practically relevant use case, a radium paint source was investigated. The experiments with
radioactive sources are intended to validate the performance of the algorithms with various
detectors and especially to reveal the characteristics of the CZT pixel detector.
5.2.1. Energy resolution of scintillation detectors
Several measurements with scintillation detectors were performed to evaluate the devel-
oped digital algorithms with signals of different features. Each scintillation crystal used has
different characteristics regarding energy and timing resolution. The performance of various
scintillation materials with a conventional setup was reported by (Roemer et al., 2015). For
our investigation, we concentrate on the application and test of the deconvolution algorithm
and pulse shaping for energy measurement and timestamp recovery. For that purpose, three
different scintillation detectors were chosen (tab. 5.3).
Table 5.3.: Overview of used scintillation detectors with associated readout electronics for
evaluation of the digital pulse processing algorithms.
Scintillation material Readout electronics High-voltage
NaI HZDR proprietary +525 V
CeBr3 Hamamatsu R2059 -1200 V
Gd3Al2Ga3O12 (GAGG) Philips XP 2972 -1500 V
Firstly, a NaI detector with built-in amplifier from HZDR inventory was examined. The detector
operates at +525 V and the output signal has the typical exponential decay with a time
constant of approximately 43.46µs. The noise filtered signal is suitable for a peak-height
analysis. A spectroscopy application potentially suffers from pulse pile-up
if the count rate is high in comparison to the decay time constant. To overcome this
problem, the traditional analog approach uses additional pulse shaping amplifiers as a combination
of high-pass filters and low-pass filters (Knoll, 2010) to shorten the pulse width.
However, this requires highly adapted readout electronics and increases the total number of
components, which is unfavorable for high channel counts. Fig. 5.12 compares the pulse
processing: the left plot represents the traditional analog pulse-height analysis without
pulse shortening and the right plot illustrates the application of the proposed digital pulse
processing. Despite the simplified hardware, the achievable energy resolution for
[Figure 5.12, plots: samples / mV versus time nT / µs. Left: amplifier output and low-pass FIR filter. Right: amplifier output, deconvolution and shaper.]
Figure 5.12.: Signals from a NaI detector and applied digital pulse processing. The left plot
shows the application of a simple noise filter implemented by a 100-tap low-
pass FIR filter (fc = 500 kHz). The pulse-width (solid line) is ultimately short-
ened by the proposed pulse shaping algorithms (right).
that type of detector signals is indeed not significantly improved by digital pulse processing
(see also (Di Fulvio et al., 2016)), as long as the ballistic deficit problem is negligible and the
analog processing chain matches the electrical characteristics of the detector, i.e. bandwidth
and gain. The obtained pulse-height spectra from a 22Na radioactive source are shown in
fig. 5.13.
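The relative energy resolution of the NaI detector follows from the peak position µ and the standard deviation σ annotated in fig. 5.13. The following Python sketch assumes a Gaussian photopeak, for which the FWHM equals 2·sqrt(2·ln 2)·σ. The assignment of the two value pairs to the left and right panel is taken from the order of the annotations in the figure and is an assumption of this example.

```python
import math

# FWHM of a Gaussian in units of its standard deviation (about 2.355).
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))

def fwhm_percent(sigma, peak_position):
    """Relative FWHM in percent for a Gaussian peak."""
    return 100.0 * FWHM_PER_SIGMA * sigma / peak_position

# Values read from fig. 5.13 (511 keV photopeak of 22Na, in mV).
nai_classical = fwhm_percent(1.200, 25.038)   # without additional pulse shaping
nai_shaped = fwhm_percent(1.214, 25.213)      # deconvolution and trapezoidal shaper
```

Both values lie slightly above 11 %, in line with the statement that the digital pulse processing does not change the energy resolution of this detector significantly.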
[Figure 5.13, plots: counts (logarithmic) versus pulse height / mV, 22Na spectrum. Left: µ = 25.038 mV, σ = 1.200 mV. Right: µ = 25.213 mV, σ = 1.214 mV.]
Figure 5.13.: Exemplary comparison of the classical pulse-height analysis without additional
pulse shaping (left) and with the proposed deconvolution algorithm and trapezoidal
shaper. The energy resolution of the NaI detector is not significantly improved
and remains at approximately 11 % FWHM for the 511 keV photopeak.
Secondly, the characteristics of a CeBr3 detector are verified by digital signal processing methods.
For this purpose, the current-to-voltage gain for that detector is realized by a single 50 Ω
resistor, as the photomultiplier tube provides sufficiently large gain. For this reason, the
bandwidth is not significantly limited by an operational amplifier. The signals of the detector
were digitized by an Agilent MSO1234 Oscilloscope with 20 GSPS and 4 GHz bandwidth.
An example of a digitized waveform is shown in fig. 5.14. Fitting the exponential decay as
shown in fig. 5.14 to the entire dataset of 60123 waveforms, the distribution of the character-
istic light decay time results in a standard deviation of 1.2 ns with a mean value of 23.4 ns.
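The fitting method used for the exponential decay is not specified here. As a minimal sketch, a log-linear least-squares fit of the model A·exp(−t/τ) is shown below in Python. This is a simple stand-in technique, and the synthetic, noise-free test pulse (amplitude 36.87 mV, τ = 23.4 ns, sampled at 20 GSPS) is an assumption of this example. With noisy data the logarithm biases the fit, so a non-linear fit would normally be preferred.

```python
import math

def fit_exponential_decay(t, v):
    """Least-squares fit of ln(v) = ln(A) - t / tau; returns (A, tau)."""
    pts = [(ti, math.log(vi)) for ti, vi in zip(t, v) if vi > 0]
    n = len(pts)
    mean_t = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mean_t) ** 2 for p in pts)
    sxy = sum((p[0] - mean_t) * (p[1] - mean_y) for p in pts)
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_t), -1.0 / slope

T_NS = 0.05                                   # sampling period at 20 GSPS
t = [k * T_NS for k in range(2000)]           # 100 ns of the decaying tail
v = [36.87 * math.exp(-tk / 23.4) for tk in t]   # synthetic, noise-free pulse
amplitude, tau = fit_exponential_decay(t, v)
```

Applied to each recorded waveform, such a fit yields one decay time per event, from which a mean value and a standard deviation can be formed as reported above.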
[Figure 5.14, plot: samples / mV versus time nT / ns with the amplifier output and the curve fit A·e^(−t/τ).]
Figure 5.14.: An example of the output signals of the CeBr3 detector (left). The sampled voltage
signal correlates with the current generated by the photomultiplier tube.
A curve fit to the exponential decay reveals the typical decay time constant
τ of the scintillation material of 23.4 ns with a standard deviation of 1.2 ns.
The 511 keV photopeak of the 22Na source is visible at a signal amplitude of
36.87 mV (right). The FWHM of the photopeak is 6.75 mV (18.3 %).
The characteristic value of the light decay time from fig. 5.14 is in accordance with the reported
values of 19.3 ns±3 ns (Roemer et al., 2015) and 20.0 ns±1 ns (Ra et al., 2008). A
smaller value of 17.0 ns was reported by (Shah et al., 2005), but has been measured with
a different method (delayed coincidence method (Bollinger and Thomas, 1961)).
The goodness of fit is improved by higher signal amplitudes corresponding to an increased
SNR, but the peak height of the fit results in a poor energy resolution of 18.3 % (FWHM,
511 keV). As the energy of the incident radiation hitting the scintillator is proportional to the
total amount of light produced (Knoll, 2010), the integral of the voltage pulse is proportional
to the energy. The numerical integration of that pulse can be realized by an IIR filter, and finally,
pulse shaping can be applied for the peak height analysis similar to the pulse shaper for
the NaI detector from fig. 5.12. A simple digital integrator is derived by applying the bilinear
transformation to the Laplace operator 1/s. This results in an IIR filter with the coefficients
$$ b_n = \frac{T}{2}\,(1, 1) , \qquad a_n = (1, -1) , \tag{5.2} $$
where T is the sampling period. Along with a trapezoidal shaper, the peak height of the
shaped pulse is proportional to the numerical integration of the voltage pulse (fig. 5.15). To
[Figure 5.15, plots. Left: samples / mV versus time nT / µs with amplifier output, integration and shapers of 0.20 µs, 0.28 µs and 0.36 µs. Right: counts (logarithmic) versus energy / keV, 22Na spectrum with µ = 511.0 keV, σ = 25.7 keV.]
Figure 5.15.: A signal from a CeBr3 detector with proper digital pulse processing (left). Different
shaper lengths were applied to achieve an optimized energy resolution.
The illustrated spectrum of a 22Na radioactive source reveals an energy reso-
lution of 11.8 % FWHM at the 511 keV photopeak (right) with a digital shaper
of length 0.20µs.
optimize the energy resolution, different lengths of trapezoidal shapers have been applied
for pulse processing. The flat top of the trapezoidal shaper was set to a constant value of
about 5τ ≈ 120 ns. The averaging component of the shaper was modified in order to find
the best length.
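The integrator of eq. 5.2 corresponds to the difference equation y[n] = y[n−1] + (T/2)·(x[n] + x[n−1]), i.e. the trapezoidal rule. The following Python sketch applies it to a synthetic exponential pulse. The pulse parameters (36.87 mV, τ = 23.4 ns, T = 0.05 ns) are borrowed from the CeBr3 measurement for illustration only, and the function name is hypothetical.

```python
import math

def bilinear_integrator(x, period):
    """IIR filter with b = (T/2)(1, 1), a = (1, -1), zero initial conditions."""
    y, prev_x, acc = [], 0.0, 0.0
    for xn in x:
        acc += 0.5 * period * (xn + prev_x)   # y[n] = y[n-1] + T/2 (x[n] + x[n-1])
        prev_x = xn
        y.append(acc)
    return y

T_NS = 0.05            # sampling period (20 GSPS)
TAU_NS = 23.4          # decay time constant
AMPLITUDE_MV = 36.87   # pulse amplitude
n_samples = int(10 * TAU_NS / T_NS)
pulse = [AMPLITUDE_MV * math.exp(-k * T_NS / TAU_NS) for k in range(n_samples)]

integrated = bilinear_integrator(pulse, T_NS)
charge = integrated[-1]                      # in mV * ns
expected = AMPLITUDE_MV * TAU_NS             # analytic integral of A * exp(-t / tau)
```

The final value of the integrator output agrees with the analytic integral A·τ to within about 0.1 % and scales linearly with the pulse amplitude, which is the property exploited for the energy measurement.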
Although it is not the intention to optimize the signal chain for high resolution spectroscopy with
CeBr3 detectors, this analysis is necessary to get a rough estimate of the energy resolution
of the detector and to use the calibrated energy data for selecting events of a further timing
measurement. The timing measurement with a 22Na source can be improved if an energy
cut is applied to the photons with 511 keV. Thus, the timing spectrum is cleaned of random
coincidences and scattered events.
The methods for processing signals from a CeBr3 detector were also applied to a GAGG scintillation
detector. The results are illustrated in fig. 5.16.
The obtained decay time τ from the GAGG scintillator of approximately 121 ns is larger
than the value of 88 ns±10 ns from the datasheet (Roemer et al., 2015; Furukawa Denshi,
[Figure 5.16, plots. Samples / mV versus time nT / µs with amplifier output and curve fit A·e^(−t/τ); samples / mV versus time nT / µs with amplifier output, integration and shaper; counts (logarithmic) versus energy / keV, 22Na spectrum with µ = 511.0 keV, σ = 27.0 keV.]
Figure 5.16.: An output signal of the GAGG detector and corresponding curve fit of the ex-
ponential decay (top left). The mean value of the decay time τ is 121 ns with a
standard deviation of 10.1 ns (top right). The current pulse is integrated by an
IIR filter and processed by a trapezoidal shaper (bottom left) for energy mea-
surement (bottom right). The energy resolution of the 511 keV photopeak is
63.45 keV (12.42 % FWHM).
2016), but ranges in the specified tolerance. The decay time τ was again used to set up the
trapezoidal shaper for energy measurement. The flat top of the shaper is set to match the
entire current integration time of about 5τ ≈ 600 ns. The averaging component of the shaper
is set to 700 ns. Thus, the total pulse width of the shaper is 2µs.
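The quoted total pulse width can be checked with a short Python sketch of a trapezoidal shaper. The kernel below is one common way to obtain a trapezoidal step response and is not necessarily the structure used in this work. The sampling period of 10 ns and the function names are assumptions of this example.

```python
def trapezoid_taps(rise_samples, flat_samples):
    """FIR taps whose unit-step response is a trapezoid of unit height."""
    up = [1.0 / rise_samples] * rise_samples
    return up + [0.0] * flat_samples + [-u for u in up]

def step_response(taps):
    """Running sum of the taps, i.e. the response to a unit step."""
    out, acc = [], 0.0
    for t in taps:
        acc += t
        out.append(acc)
    return out

T_NS = 10.0                       # assumed sampling period (hypothetical)
rise = int(700 / T_NS)            # averaging component of 700 ns
flat = int(600 / T_NS)            # flat top of about 5 tau = 600 ns
taps = trapezoid_taps(rise, flat)
shape = step_response(taps)
total_width_ns = len(taps) * T_NS
```

With a rising part of 700 ns, a flat top of 600 ns and a falling part of 700 ns, the kernel spans 2000 ns, which matches the total pulse width of 2µs stated above.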
The obtained energy resolution is worse than the reported values of 6.3 % (Iwanowska et al.,
2013) or 6.47 % (Roemer et al., 2015). “The mismatch between the measured and the reported
value is mainly due to the choice of the PMT. The XP2972 PMT is optimized for
timing applications.” (Golnik, 2016). This statement is also valid for the measurements with
the CeBr3 crystal and the Hamamatsu R2059 PMT. Here, it is the timing performance of both
detectors that is of interest. It was shown that an IIR filter is suitable to perform an integration
of detector signals, as required for charge measurement if a current-to-voltage amplification
is utilized in the readout electronics. This approach simplifies the conventional setup
with a QDC with complex triggering and gating as implemented in previous prototypes of
the Compton camera (Kormoll, 2013; Golnik, 2016). Thus, the complexity can be reduced
and the throughput is increased. Further, a significant improvement of the peak height spectra
with respect to analog pulse processing could not be shown, as the energy resolution was
mainly limited by the readout electronics of the scintillation detectors. However, a major benefit is
the ease of adjusting the signal integrator and pulse shaper with respect to the detector.
5.2.2. Energy resolution of a CZT pixel detector
Most of the investigations have been done with the CZT detectors shown in fig. 5.17.
Figure 5.17.: Two CZT detectors from Redlen. These crystals were used to benchmark the
key parameters of the electronics, algorithms and detector characteristics.
Both detectors were manufactured by Redlen and have dimensions of 20×20×5 mm3. It
is clearly visible that the crystals came from different production runs, as the electrode
layout differs. The detector with the number 16119 (fig. 5.17, left), referred to as “CZT1”,
has a chamfer on the top left corner and no steering grid electrode between the pixels. The
detector with the number 32784, referred to as “CZT2”, has a pixel layout with a steering grid.
A steering grid is not needed for the operation of a CZT, but can improve energy
resolution and reduce charge-sharing effects (Jung et al., 2005). Nevertheless, recent
electrode layouts of the pixel detectors from Redlen do not include a steering grid, as energy
resolution is primarily improved by optimizing properties of the material (e.g. purity of the
crystal) and therefore a steering grid is not needed anymore (Grosser, 2015). Consequently,
the steering grid was connected to the potential of the anodes for all measurements. Besides
the different electrode layouts, the detectors differ in the measured dark current by a factor of
10. For the evaluation of spectroscopy performance, CZT1 with the lower dark current of about
10 nA at a bias voltage of 500 V was used.
In the first step, the optimal pulse processing for best energy resolution was determined experimentally
with the methods applied to the test pulses. As the energy resolution of the CZT
practically depends on the depth-of-interaction, events near the cathode side of the detector
promise best signal quality. By selecting a low energy γ-ray source irradiating the detector
from the cathode side, the interaction will mostly take place near the cathode electrode, as
low energy photons are well stopped by the CZT, while attenuation decreases for higher
energy photons, which are then more likely to be absorbed in the deeper regions of the
material at the anode side or even cross the detector without an interaction. The cross sections
of a CdZnTe compound are shown in (Kormoll, 2013; NIST, 2016). Consequently, an
241Am source with a γ-ray emission at 60 keV was used for initial measurements of energy
resolution and to find the optimal pulse shaping with respect to spectroscopy applications.
From the theoretical point of view, the optimum shaping frequency can be estimated by the
intersection frequency fI of the current and voltage noise spectral density shown in fig. 2.21.
Rough estimates with assumptions on total voltage noise and current noise place the
intersection at about 200 kHz, which corresponds to a rise time of 1.75µs. However, not all noise
sources can be estimated exactly, thus an experimental verification of optimum rise times
for the digital pulse shapers is necessary. As demonstrated, different pulse shapes have an
impact on energy resolution and require an appropriate adaptation to the detector signal.
The window based pulse shapes with a flat top have been used for evaluation of energy res-
olution of the pixel detector (fig. 5.18). The best results are achieved with the cusp shaper
[Figure 5.18, two panels: anode ∆E / keV and cathode ∆E / keV versus shaper peaking time / µs for the cusp, Gaussian, and triangular shapers]
Figure 5.18.: Obtained energy resolution (FWHM) of the 60 keV photopeak from a measurement
with a 241Am radioactive source dependent on the applied digital pulse
shaper. The results are in accordance with the test pulse setup and reveal
an optimum peaking time for the triangular shaper of about 4.0 µs for the anode
and about 2.5 µs for the cathode.
with a peaking time of 13.5 µs (10.0 µs) for the anode (cathode). As such a long peaking time is
impractical for a high-throughput application, the triangular (trapezoidal) shaper is
considered to perform best, as it is easily implementable in an FPGA without utilizing additional
multipliers. The resulting energy spectrum is illustrated in fig. 5.19.
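As a worked check of the theoretical estimate quoted in this section, the 200 kHz intersection frequency and the 1.75 µs rise time are consistent with the common single-pole rule of thumb relating rise time and bandwidth. The factor 0.35 is an assumption here, since the text does not state which relation was used:

\[ t_r \approx \frac{0.35}{f_I} = \frac{0.35}{200\,\mathrm{kHz}} = 1.75\,\mu\mathrm{s} \]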
Further investigations have been done with a 22Na source with characteristic γ-ray emissions
at 511 keV and 1275 keV. With the selected trapezoidal shaper, the uncorrected energy plot
of the cathode and anode peak amplitudes reproduces the plot discussed in fig. 2.26, but
with an improved resolution (fig. 5.20).
After depth correction, the energy resolution of the 511 keV photopeak is about 10.81 keV.
The obtained energy resolutions verify the assumption made by (Kormoll, 2013; Rohling,
2015) also for the investigated pixel detector. Both assumed the energy resolution RCZT
(FWHM) of a CZT detector in general as a function of the deposited energy L to be
\[ R_{\mathrm{CZT}} = 6\,\mathrm{keV} + 0.15\,\mathrm{keV} \cdot \sqrt{L/\mathrm{keV}} \tag{5.3} \]
A calculation for the 60 keV (511 keV) photopeak with eq. 5.3 results in an energy resolution
[Figure 5.19: 241Am spectrum, counts versus energy / keV; fit parameters µ = 60.0 keV, σ = 1.8 keV]
Figure 5.19.: Obtained spectrum from a measurement with the 241Am radioactive source
with the derived shaper constants. The bi-parametric spectrum (left) clearly
highlights the 60 keV photopeak and multi-pixel events (cathode energy is
larger than anode energy). The one-dimensional projection of single pixel
events reveals an energy resolution of about 4.23 keV (FWHM) at 60 keV. The
events in the range of 35 keV belong to characteristic X-ray escapes of the
CdZnTe compound.
of 7.2 keV (9.4 keV), which is roughly in accordance with the measured value of 4.23 keV
(10.81 keV).
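The comparison above can be reproduced with a few lines of Python. This is a minimal sketch that evaluates eq. 5.3 at the two photopeak energies; the function name `r_czt_fwhm_kev` is hypothetical, and the measured values are simply the ones quoted in the text.

```python
import math


def r_czt_fwhm_kev(energy_kev: float) -> float:
    """Energy resolution (FWHM, in keV) assumed for CZT according to eq. 5.3."""
    return 6.0 + 0.15 * math.sqrt(energy_kev)


# Measured FWHM values quoted in the text for the 60 keV and 511 keV photopeaks.
measured_fwhm_kev = {60.0: 4.23, 511.0: 10.81}

for energy, fwhm_measured in measured_fwhm_kev.items():
    fwhm_model = r_czt_fwhm_kev(energy)
    print(f"{energy:6.1f} keV: eq. 5.3 gives {fwhm_model:.1f} keV, "
          f"measured {fwhm_measured:.2f} keV")
```

In this sketch the model overestimates the FWHM at 60 keV and slightly underestimates it at 511 keV, which matches the numbers given in the text.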
Moreover, to check spectroscopic linearity and detector performance at a more realistic γ-
ray irradiation with multiple energies, the spectrum of a radium paint source was recorded,
calibrated and depth corrected. The result is shown in fig. 5.22.
The peaks below 100 keV correspond to characteristic X-rays emitted by the source. The
identified γ-ray lines were emitted by a 226Ra decay. A list of detected peaks is noted in
tab. 5.4. The prominent X-ray lines at about 76 keV and 85 keV belong to the characteristic
Table 5.4.: Detected peaks with a measurement of a radium paint source. The results were
used to estimate linearity of the CZT detector upon a wider energy spectrum.
label  type                 Energy     meas. Energy  ∆E
a      Cd,Te X-ray escapes  ≈ 45 keV   47.1 keV      n.a.
b      Pb X-rays (Kα)       ≈ 75 keV   76.4 keV      n.a.
c      Pb X-rays (Kβ)       ≈ 85 keV   86.8 keV      n.a.
1      226Ra γ-ray          186 keV    185.2 keV     -0.8 keV
2      214Pb γ-ray          242 keV    242.7 keV     +0.7 keV
3      214Pb γ-ray          295 keV    294.5 keV     -0.5 keV
4      214Pb γ-ray          352 keV    352.0 keV     -0.0 keV
5      214Bi γ-ray          609 keV    609.6 keV     +0.6 keV
6      214Bi γ-ray          1120 keV   1120.2 keV    +0.2 keV
7      214Bi γ-ray          1764 keV   1761.0 keV    -3.0 keV
X-ray energies of Pb. Moreover, these X-rays potentially cause an interaction with the detec-
tor material, where the element Cd emits characteristic X-rays of about 23 keV and 27 keV
(Kα and Kβ) and the Te element emits X-rays of about 27 keV and 30 keV. These X-rays
cannot be used for a measurement of linearity, as the energy resolution of the detector is not
Figure 5.20.: Calibrated measurement of a 22Na source illustrated as bi-parametric spectrum
with cathode and anode energy (left) and visualization of the depth dependency
of the anode energy (right). To improve energy resolution, the anode
energy must be corrected depending on the depth-of-interaction, which correlates
with the cathode rise time.
[Figure 5.21: 22Na spectrum, counts versus energy / keV; fit parameters µ = 511.0 keV, σ = 4.6 keV]
Figure 5.21.: The results of a measurement with a 22Na source and optimized trapezoidal
shaper after depth correction. The energy resolution of the 511 keV photopeak
is 4.6 keV (standard deviation). This corresponds to 11 keV FWHM (2.2 %). All
events of a single pixel are shown.
as precise as needed for this exercise. But linearity can be well calculated with the known
γ-ray emissions of the 226Ra decay chain. The response of the detector is linear
within ±1 keV when the value at 1761 keV is ignored, because its statistics are not
meaningful with only a few counts.
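The linearity statement can be checked directly against the γ-ray lines of tab. 5.4. The following sketch recomputes the deviations from the tabulated nominal and measured energies; the helper name `deviations_kev` is hypothetical and the data are copied from the table.

```python
# (nominal energy, measured energy) in keV for the gamma-ray lines 1-7 of tab. 5.4
gamma_lines_kev = [
    (186, 185.2),
    (242, 242.7),
    (295, 294.5),
    (352, 352.0),
    (609, 609.6),
    (1120, 1120.2),
    (1764, 1761.0),
]


def deviations_kev(lines):
    """Return measured minus nominal energy for each line, rounded to 0.1 keV."""
    return [round(measured - nominal, 1) for nominal, measured in lines]


all_deviations = deviations_kev(gamma_lines_kev)
# The 1764 keV line is excluded, as in the text, because of its poor statistics.
used_deviations = deviations_kev(gamma_lines_kev[:-1])
max_abs_deviation = max(abs(d) for d in used_deviations)

print("deviations / keV:", all_deviations)
print("max |deviation| without the 1764 keV line:", max_abs_deviation, "keV")
```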
5.3. γ-ray timing
5.3.1. Timing performance of scintillation detectors
Even though the research targets an application of CZT detectors for prompt γ-ray imaging,
the timing performance of scintillation detectors is of interest for two reasons. On
the one hand, the digital algorithms should be verified for various types of detector signals,
and on the other hand, the scintillation detectors provide an excellent reference for a coincidence
measurement with a 22Na source.
[Figure 5.22, two panels: spectrum of the radium paint source, counts versus energy / keV (0 to 350 keV, peaks labeled (a) to (c) and (1) to (4)) and counts versus energy / MeV (0.2 to 1.8 MeV, peaks labeled (1) to (7))]
Figure 5.22.: Spectrum of a radium paint source. The energy region below 100 keV (left) is
dominated by characteristic X-rays, while γ-ray emissions were detected up to
2 MeV (right).
Thus, a straightforward timing characterization
was performed by measuring the annihilation photons emitted by the 22Na source. For that,
two detectors face the radioactive source at an angle of 180° to each other. An event triggering both
detectors is used to calculate the time difference between the two occurrences. For real
coincidences, the variation of timestamp differences reveals the combined timing precision
of both detectors. Thus, to characterize an unknown material like CZT, the characteristics
of the reference detector must be well-known. As reported by (Pausch et al., 2016), CeBr3
combines an excellent time resolution of 164 ps FWHM at 511 keV (Fraile et al., 2013) and
very good energy resolution <4 % FWHM at 662 keV (Shah et al., 2005). While these studies
utilize highly optimized setups for this kind of detector, (Roemer et al., 2015) reports a
value of 365 ps±50 ps for the timing resolution at 511 keV. Consequently, the CeBr3 detec-
tor is assumed to be an excellent reference for coincidence measurements with the 22Na
source.
For timing measurements, the waveforms were recorded with 20 GSPS and 4 GHz band-
width again, and the trigger was externally generated by an FPGA utilizing the FMC digitizer
card. The measurement was performed with the oscilloscope instead of the FMC digitizer
to ensure that the bandwidth did not limit timing performance. At the end, the timestamps of
each event were calculated by a software implementing the digital CFD already described.
That trigger method is used to evaluate the timing performance of all investigated detectors.
Several setups for the digital CFD have been tested. In general, best results have been
achieved with a threshold of one fourth of the peak amplitude and an additional FIR low-pass
filter adapted to the minimum rise times of the detector signals. When the bandwidth limit was
set too tight, the timing resolution worsened, because the sharpness of the rising edge
was smeared out. But if the bandwidth was not properly limited, the rms noise level was relatively
high and distorted the triggering timestamp. The bandwidth limit and the fraction of the CFD
threshold have to be individually identified depending on the shape and noise of the detector
signal. The further investigations of digital pulse processing with scintillation detectors have
been done to prove the versatility of the approaches. Moreover, digital timing outclasses an
analog timing circuit with regard to handling. Major efforts for analog CFD timing are related
to adjusting delay times and fractions. The digital CFD technique requires no delay times,
as the samples are stored in a memory and can be traced back to an individual fraction.
Finally, the investigations were necessary to elaborate a well-known reference detector for
timing measurements with a CZT detector. A measurement with two scintillation detectors
is shown in fig. 5.23.
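The idea of tracing stored samples back to a fraction of the peak amplitude can be sketched as follows. This is not the implementation used in this work but a minimal illustration under stated assumptions: the waveform is baseline-subtracted with positive polarity, the threshold crossing is refined by linear interpolation between two samples, and the optional FIR low-pass is a plain convolution. The function name `digital_cfd` and its parameters are hypothetical.

```python
import numpy as np


def digital_cfd(samples, dt, fraction=0.25, fir_taps=None):
    """Return the time at which the rising edge crosses fraction * peak.

    samples  : baseline-subtracted waveform with positive polarity (assumption)
    dt       : sampling interval, in the time unit of the returned timestamp
    fraction : CFD fraction of the peak amplitude (one fourth in the text)
    fir_taps : optional symmetric FIR low-pass coefficients (assumption)
    """
    x = np.asarray(samples, dtype=float)
    if fir_taps is not None:
        # mode="same" keeps the length; a symmetric kernel adds no time shift
        x = np.convolve(x, np.asarray(fir_taps, dtype=float), mode="same")

    peak_index = int(np.argmax(x))
    threshold = fraction * x[peak_index]

    # Walk back from the peak until the previous sample is below the threshold.
    i = peak_index
    while i > 0 and x[i - 1] >= threshold:
        i -= 1
    if i == 0:
        raise ValueError("no threshold crossing found before the peak")

    # Linear interpolation between sample i-1 (below) and sample i (at or above).
    sub_sample = (threshold - x[i - 1]) / (x[i] - x[i - 1])
    return (i - 1 + sub_sample) * dt


# Synthetic example: the timestamp does not depend on the pulse amplitude.
pulse = np.array([0.0, 0.0, 0.0, 4.0, 8.0, 10.0, 10.0])
print(digital_cfd(pulse, dt=1.0), digital_cfd(3.0 * pulse, dt=1.0))
```

Because the threshold scales with the peak amplitude, the timestamp is independent of the pulse height as long as the pulse shape is unchanged, which is the property a CFD is meant to provide.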
[Figure 5.23, timing panel: normalized counts versus ∆t (CeBr3 - GAGG) / ns; fit parameters µ = 3.654 ns, σ = 762 ps, N = 2k]
Figure 5.23.: Measurement of coincident events in the GAGG and CeBr3 detector. A 22Na
source was placed between the two detectors centered with a distance of about
70 mm. Events with a full absorption were selected from the coincident energy
spectra (left) for calculating the difference of timestamps. The fit of a Gaussian
distribution to the timing measurement (right) reveals a standard deviation of
762 ps, which corresponds to a timing resolution of 1.8 ns (FWHM).
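The conversion from the standard deviation of the Gaussian fit to the FWHM quoted in the caption can be written out as a short sketch. The function name `sigma_to_fwhm` is hypothetical; the factor is the exact Gaussian relation.

```python
import math

# FWHM of a Gaussian: 2 * sqrt(2 * ln 2) * sigma, approximately 2.3548 * sigma
FWHM_FACTOR = 2.0 * math.sqrt(2.0 * math.log(2.0))


def sigma_to_fwhm(sigma: float) -> float:
    """Convert a Gaussian standard deviation to its full width at half maximum."""
    return FWHM_FACTOR * sigma


# Standard deviation of the CeBr3-GAGG timing distribution (fig. 5.23), in ns
fwhm_ns = sigma_to_fwhm(0.762)
print(f"FWHM = {fwhm_ns:.2f} ns")
```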
The investigations of the digital timing with a GAGG and a CeBr3 detector have been done
to select the best timing reference for evaluating CZT timing performance. The summed
timing performance of both scintillation detectors is assumed to be much better than the
expected timing performance of the CZT (tab. 5.5). With regard to (Roemer et al., 2015),
where the timing resolution of the GAGG (CeBr3) was estimated to be 3.9 ns (365 ps), the
measured timing resolution for our setup primarily depends on the timing resolution of the
GAGG detector. Moreover, our digital approach with a resolution of 1.8 ns outstrips the
traditional analog setup utilized in (Roemer et al., 2015). Furthermore, the digital timing
benefits from easily adaptable parameters due to varying rise times in contrast to the analog
CFD timing. On the other hand, fast timing detectors are characterized by fast signal rise
times, which require an increased bandwidth for the continuous-time signal and therefore
an increased sampling rate for the quantized discrete-time signal. Assuming a rise time of
1 ns for a fast scintillation detector, the sampling rate of the ADC has to be at least 700 MSPS to
digitize the signal with a bandwidth of about 350 MHz.
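The numbers in this estimate follow from two standard relations, sketched below. The rule of thumb bandwidth ≈ 0.35 / rise time and the Nyquist criterion (sampling rate of at least twice the bandwidth) are assumptions made here to reproduce the quoted values; the function names are hypothetical.

```python
def bandwidth_from_rise_time(rise_time_s: float) -> float:
    """Single-pole rule of thumb: bandwidth (Hz) of a signal with the given rise time."""
    return 0.35 / rise_time_s


def minimum_sampling_rate(bandwidth_hz: float) -> float:
    """Nyquist criterion: the sampling rate must be at least twice the bandwidth."""
    return 2.0 * bandwidth_hz


bandwidth_hz = bandwidth_from_rise_time(1e-9)          # 1 ns rise time
sampling_rate_sps = minimum_sampling_rate(bandwidth_hz)

print(f"bandwidth     = {bandwidth_hz / 1e6:.0f} MHz")
print(f"sampling rate = {sampling_rate_sps / 1e6:.0f} MSPS")
```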
5.3.2. Timing performance of CZT pixel detectors
The characteristics (e.g. rise time and decay time) and performance (e.g. dependence of
timing on energy) of suitable scintillation materials for prompt γ-ray timing have been stud-
ied extensively (Shah et al., 2005; Ra et al., 2008; Fraile et al., 2013; Roemer et al., 2015).
Such experimental data also exist for CZT detectors, but the published results are very
diverse, as instruments (planar, coplanar, orthogonal strip, or pixel detector), methods (CFD
timing, waveform fit, etc.) and setups (high voltage bias, event selection, etc.) vary strongly.
For example, (Parnham et al., 1995) reports a timing resolution of 5.3 ns (FWHM at 511 keV) for a
3×3×2 mm3 planar CZT detector. (Vaska et al., 2005) reports a value of 10 ns (FWHM
at 511 keV) even for a thinner planar detector from the same manufacturer (eV Products)
with dimensions 5×5×1.4 mm3. The difference between the two studies was the applied bias
voltage of 600 V and 100 V, respectively, which corresponds to electric field strengths of 3000 V/cm and
714 V/cm. All studies quoted in tab. 5.5 allow the conclusion that an increased bias voltage
results in an improved timing performance for CZT detectors. At the same time, the dark
current related to the bias voltage is increased. Consequently, the SNR decreases and the
energy resolution is worsened. Published timing measurements related to
various kinds of CZT applications and setups are summarized in tab. 5.5.
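The quoted field strengths follow from dividing the bias voltage by the detector thickness, assuming a uniform field across a planar detector. A minimal sketch with a hypothetical helper name:

```python
def field_strength_v_per_cm(bias_voltage_v: float, thickness_mm: float) -> float:
    """Mean electric field of a planar detector, assuming a uniform field."""
    return bias_voltage_v / (thickness_mm / 10.0)


# (Parnham et al., 1995): 600 V across 2 mm; (Vaska et al., 2005): 100 V across 1.4 mm
parnham = field_strength_v_per_cm(600.0, 2.0)
vaska = field_strength_v_per_cm(100.0, 1.4)

print(f"Parnham et al.: {parnham:.0f} V/cm, Vaska et al.: {vaska:.0f} V/cm")
```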
Table 5.5.: Overview of timing performance of different types of CZT detectors. Besides the
geometry and electrode layout, an important parameter for timing is the applied
bias voltage for the detector.
Reference                      Detector                               Bias voltage  Timing resolution
(Parnham et al., 1995)         planar (3×3×2 mm3)                     600 V         5.3 ns (511 keV)
                               planar (5×5×10 mm3)                    2000 V        18.6 ns (511 keV)
(Amrami et al., 2000)          4×5 pixel layout (10×12.5×5 mm3)       unknown       4-84 ns (511 keV)
(Amrami et al., 2001)          4×4 pixel layout (10×10×4 mm3)         unknown       16.8 ns (511 keV) with cathode signal
                               4×4 pixel layout (10×10×4 mm3)         unknown       5.1 ns (511 keV) with pixel signal
(Okada et al., 2002)           planar (4×4×2 mm3)                     200 V         17.0 ns (60 keV)
                               planar (4×4×2 mm3)                     500 V         12.0 ns (60 keV)
(Meng and He, 2005)            11×11 pixel layout (10×10×10 mm3)      1400 V        11.6 ns (511 keV)
(Vaska et al., 2005)           planar (5×5×1.4 mm3)                   100 V         10 ns (511 keV)
                               coplanar grid (15×15×7.5 mm3)          1000 V        21 ns (511 keV)
(Drezet et al., 2007)          orthogonal strip (20×16×0.9 mm3)       500 V         1.9 ns (511 keV)
                               planar (20×16×0.9 mm3)                 500 V         2.6 ns (511 keV)
(Komarov et al., 2012)         9 pixel layout (20×20×5 mm3)           1000 V        >25 ns (511 keV)
(Hueso-González et al., 2014)  orthogonal strip (20×20×5 mm3)         500 V         2.8 ns (< 12.5 MeV)
Our measurements with the pixel detectors were strictly made with a bias voltage of 600 V,
as this results in a good energy resolution and avoids any destruction in the material due to
an overvoltage. Comparable results to our setup were published by (Meng and He, 2005)
and (Komarov et al., 2012). Both studies investigated the timing of a pixel detector at the
cathode signal. Although their electric field strengths were larger than the one used in our
setup, a timing resolution of several tens of nanoseconds seems to be reasonable for a
CZT pixel detector. However, this value has to be experimentally verified with our front-end
electronics, algorithms and detectors. For this purpose, the two detectors shown in fig. 5.17
were examined.
The setup for timing measurements with the CZT detector is straightforward and equal to
the measurement with two scintillation detectors. An event detected simultaneously by both
detectors within a relatively large coincidence window (>4 µs) is recorded and stored by the digitizer.
As a waveform fit is not implementable with reasonable effort in a high-throughput
application (Meng and He, 2005), we also studied a linear fit based on a FIR filter (Wohsmann,
2014). But at the moment, the approach with digital pulse shapers for timing in
conjunction with a digital CFD algorithm performs best with regard to SNR and robustness.
The results of a measurement with the 22Na source and the CZT detector in coincidence
with the CeBr3 detector are shown in fig. 5.24.
[Figure 5.24, timing panel: normalized counts versus timestamp difference tCeBr3 - tCZT1 / ns; fit parameters µ = -0.01 ns, σ = 13.92 ns, N = 6k]
Figure 5.24.: Results of a timing measurement with a 22Na source and the CZT1 pixel detec-
tor in coincidence with the CeBr3 detector. The bi-parametric energy plot (left)
is used to select coincident events with a full absorption of the 511 keV photo-
peak. The distribution of the timestamp differences of selected events with an
energy of 511 keV±50 keV is shown in the right plot. The standard deviation σ
of 13.92 ns corresponds to a timing resolution of 32.71 ns (FWHM).
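To illustrate why the reference detector barely affects this result, the contribution of the CeBr3 detector can be removed in quadrature. This sketch assumes independent Gaussian timing contributions of both detectors, which the text does not state explicitly, and uses the 365 ps value reported by (Roemer et al., 2015) for CeBr3; the function name is hypothetical.

```python
import math


def subtract_in_quadrature(fwhm_combined: float, fwhm_reference: float) -> float:
    """Remove the reference detector's contribution from a coincidence FWHM.

    Assumes independent Gaussian contributions that add in quadrature.
    """
    if fwhm_reference > fwhm_combined:
        raise ValueError("reference resolution exceeds the combined resolution")
    return math.sqrt(fwhm_combined**2 - fwhm_reference**2)


# Combined CZT1-CeBr3 FWHM (fig. 5.24) and CeBr3 FWHM (Roemer et al., 2015), in ns
czt_only_ns = subtract_in_quadrature(32.71, 0.365)
print(f"CZT1 contribution: {czt_only_ns:.3f} ns")
```

Under these assumptions the CeBr3 detector changes the result by only a few picoseconds, so the measured coincidence resolution can be attributed almost entirely to the CZT detector.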
A closer look at the dependencies of timing of the CZT detector, shown in fig. 5.25, re-
veals that timing performance is not significantly degraded with a lowered energy related
to scattered events in the CZT detector. But with regard to measured cathode-over-anode
ratio, which corresponds to the DOI in the detector, a slight S-shaped structure is visible
in fig. 5.25 (right). The varying timing performance at different depths in the detector can
be explained by a non-uniformity of the electric field caused by an inhomogeneous material
(Butcher et al., 2013). Further investigations were made with the two crystals at a pulsed
particle beam in sec. 5.4.1.
Figure 5.25.: Measured dependencies of CZT timing performance. Timing performance of
scattered events with an energy below 511 keV is not significantly worsened
(left). But depending on the depth-of-interaction (cathode-over-anode ratio,
right), an S-shaped structure is visible.
Moreover, the CZT-CZT coincidence timing was measured similarly
to the CZT-CeBr3 setup. The measurement shown in fig. 5.26 confirms the statement
that the timing performance is not significantly worsened below 511 keV. Additionally, this
claim is supported by the measurements presented in (Okada et al., 2002) with a 241Am
radioactive source at 60 keV. Actually, timing performance for scattered low-energy photons
is more important than for fully absorbed photons with high energies, because only scattered
photons with relatively low scattering angles, and therefore low energies, are considered to
be valid events for an image reconstruction with a Compton camera.
[Figure 5.26, timing panel: normalized counts versus timestamp difference tCZT1 - tCZT2 / ns; fit parameters µ = -0.76 ns, σ = 25.33 ns, N = 48k]
Figure 5.26.: Measured coincidence timing with the two different CZT crystals (CZT1,
CZT2). The energy over timing plot reveals no significant worsening for scat-
tered photons below 511 keV (left). The standard deviation σ of timing distri-
bution for all events is 25.33 ns, which corresponds to a timing resolution of
59.53 ns (FWHM).
5.4. Measurements with a particle beam
5.4.1. Bremsstrahlung Facility at ELBE
A prompt γ-ray imaging system is challenged by a high flux of high-energy photons. For
an image reconstruction based on prompt γ-ray timing, a detector with a timing resolution
of several hundreds of picoseconds is desirable (Pausch et al., 2016; Golnik, 2016). A CZT
pixel detector can definitely not fulfill that requirement. But the performance estimation of
CZT for energies above 2 MeV or any rate limitations are still unsolved challenges of CZT
detectors in general, and are also of high relevance for Compton cameras or any other po-
tentially upcoming applications. Such experimental data was acquired at the linear electron
accelerator ELBE at HZDR. At the bremsstrahlung facility of the ELBE accelerator, photons
with energies up to 12.5 MeV are generated. The energy range of these bremsstrahlung photons
covers the energy range of prompt γ-rays in a clinical proton beam. The bunches are
delivered at a repetition frequency of 13 MHz (period of 76.92 ns). A Compton camera setup equal to that
described in (Hueso-González et al., 2014) was evaluated, but without a scintillation detector
as absorber (fig. 5.27).
Figure 5.27.: Setup of two CZT detectors with attached readout electronics. Both detectors
face the beam on their anode sides. The distance between both layers is 2 cm.
The anode side of the setup faced the photon beam. Thus, high-energy γ-rays
are more likely to be absorbed on the cathode side. The signals of the cathode
and a center pixel were sampled and stored with the digitizer. In a first run, both detectors
were investigated independently to suppress additional scattering events. By using the calibration
obtained with a 22Na source and the assumption that the response of the detector
can be extrapolated from the low energy range up to several MeV, the measured results
show that CZT pixel detectors are capable of detecting photon energies up to 8 MeV. At the
same time, timestamps were calculated relative to the bunch frequency of the accelerator.
The results for both detectors are illustrated in fig. 5.28.
Figure 5.28.: Amplitude over timing spectrum of both CZT detectors (left: CZT1, right:
CZT2). The detectors are capable of detecting photon energies up to 8 MeV.
CZT2 has a better timing resolution, which needs further investigation.
It is clearly visible that detector CZT2 has a better timing resolution over the entire energy
range. By plotting the DOI, i.e. the rise time of the cathode signal, over the timestamp difference ∆t, a
distortion of the timing spectrum due to a potential non-uniformity of the detector crystal becomes
apparent (fig. 5.29). As a result, it can be stated that the timing resolution depends on the
Figure 5.29.: Dependencies of timing performance on DOI, i.e. rise time of the cathode signal.
CZT2 reveals a uniform timing distribution as expected (right), but timing
performance of CZT1 is potentially degraded by a non-uniformity in the detec-
tor material. The measurements were performed with exactly the same readout
electronics.
CZT material, as the measurements were recorded with the same readout electronics. To
explain these effects, further investigations with a larger set of CZT crystals are necessary. At
this time, only two detectors were available for experiments. The overall timing performance
of both detectors in the relevant energy range up to 8 MeV is summarized in fig. 5.30.
Besides energy and timing related characteristics of the CZT, the maximum detector load
and throughput of the electronics are important for a prompt γ-ray imaging system. For this
purpose, the photon flux of the beam was increased to a level, where the monitoring detector
(3"×3" BGO) counts 638,660 events per second. At the same time, the cathode signal of the
CZT1 was recorded for a duration of 50 µs per triggered event. After digital pulse processing,
a separation of events without pile-up effects is easily implementable by a leading edge
[Figure 5.30, two panels: normalized counts versus ∆t (CZT1 - ELBE) / ns with fit parameters µ = 41.6 ns, σ = 5.19 ns, N = 80k, and normalized counts versus ∆t (CZT2 - ELBE) / ns with fit parameters µ = 41.6 ns, σ = 2.99 ns, N = 69k]
Figure 5.30.: Timing performance over the entire energy range up to 8 MeV. CZT1 (left) re-
veals a timing resolution of 12.2 ns (FWHM), whereas CZT2 (right) performs
better with 7.03 ns (FWHM). All events of total number N are plotted. The
standard deviation of the distribution is σ.
trigger with a constant level threshold. The results are shown in fig. 5.31. By counting all
[Figure 5.31, two panels: samples / V versus time nT / µs for the amplifier output, deconvolution, and fast shaper; normalized counts versus triggered cathode events / (50 µs) with fitted λ = 5.26, N = 99k]
Figure 5.31.: Recorded sequence of the cathode signal at a high photon flux. A mean value
of 5.083 counts per 50µs was extracted. The number of occurrences has the
shape of a Poisson distribution with the average number of events λ.
detected pulses in a total number of 100k recorded sequences, a mean value of 5.083
counts per 50µs was obtained. This corresponds to 101,660 counts per second in the CZT
pixel detector. The estimate of count rate capability is slightly increased by using the average
number λ of the fitted Poisson distribution (≈105 kcps).
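The two count rate estimates follow from dividing the number of events per recorded sequence by the sequence length of 50 µs. A minimal sketch using only the values quoted above; the variable names are hypothetical.

```python
sequence_length_s = 50e-6     # duration of one recorded sequence
mean_counts = 5.083           # arithmetic mean of counts per sequence
poisson_lambda = 5.26         # average number of events from the Poisson fit

rate_from_mean_cps = mean_counts / sequence_length_s
rate_from_fit_cps = poisson_lambda / sequence_length_s

print(f"rate from mean counts: {rate_from_mean_cps:,.0f} cps")
print(f"rate from Poisson fit: {rate_from_fit_cps / 1e3:.0f} kcps")
```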
Finally, the total amount of coincident events in the stacked layout was estimated. By calculating
the timestamp difference between the cathode trigger events of both detectors instead of the difference to
the accelerator frequency, a coincidence timing spectrum was obtained (fig. 5.32). An event
was triggered on the cathode of the detector on top of the stack, while the cathode signal
of the detector in the second layer was observed. A cluster of coincident events is clearly
visible in fig. 5.32. For this experiment, the monitoring detector counted about 78,000 events
per second. However, the bunch structure due to random coincidences is not visible, but is
potentially more prominent at higher photon fluxes. To estimate the total number of coin-
cident events in a CZT-CZT configuration, events within the cluster of 45 ns were selected
(fig. 5.33). The coincident rate for all interactions within this region is 5.34 %. With the same
Figure 5.32.: Coincident timing spectrum of cathode triggers on both CZT detectors. The
measurement contains 129,788 events.
Figure 5.33.: Selected events in a coincident window of 45 ns. The total amount of events
within this region is 6,926. This corresponds to a coincident rate of 5.34 %.
setup, a coincident rate of about 3 % was measured with a 22Na source.
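For reference, the coincident rate at the ELBE beam is the ratio of the events selected within the 45 ns window (fig. 5.33) to all recorded events (fig. 5.32):

\[ \frac{6926}{129788} \approx 0.0534 = 5.34\,\% \]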
6. Discussion
For completeness, it has to be mentioned that the presented experiments at the ELBE
beam have been repeated with a proton beam at OncoRay. Due to the higher frequency
of the particle accelerator (106.3 MHz), a measurement of CZT timing performance was not
satisfactory. The bunch structure was only visible for selected high energy events. But a
measurement of coincident events with the stacked CZT setup verified the results obtained
at the ELBE bremsstrahlung beam. Moreover, the spectroscopy performance of the CZT for
prompt γ-ray emissions in realistic scenarios has yet to be evaluated.
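One way to see why the higher accelerator frequency is problematic is to compare the bunch period with the CZT timing resolution. The sketch below assumes that the FWHM values measured at ELBE (fig. 5.30) carry over to the proton beam, which is an assumption made here for illustration only; the function name is hypothetical.

```python
def bunch_period_ns(frequency_mhz: float) -> float:
    """Time between two consecutive bunches for a given repetition frequency."""
    return 1e3 / frequency_mhz


# Timing resolutions (FWHM, in ns) measured at ELBE, see fig. 5.30
czt_fwhm_ns = {"CZT1": 12.2, "CZT2": 7.03}

for facility, frequency_mhz in (("ELBE", 13.0), ("OncoRay", 106.3)):
    period_ns = bunch_period_ns(frequency_mhz)
    for detector, fwhm_ns in czt_fwhm_ns.items():
        print(f"{facility}: period {period_ns:.2f} ns, "
              f"{detector} FWHM / period = {fwhm_ns / period_ns:.2f}")
```

At 13 MHz the bunch period is several times larger than the FWHM of either detector, whereas at 106.3 MHz both are of the same order, so consecutive bunches can hardly be separated.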
It has been demonstrated that the developed instruments can be easily adapted to various
types of detectors. However, investigations and experimental results rely on the application
of CZT pixel detectors in a Compton camera. A Compton camera approach with one or two
CZT detectors in the scatter layers was initially pursued, but finally rejected by the group due
to several doubts. Concerns were justified by experimental results with a CZT orthogonal
strip detector and several design studies based on simulation.
An outstanding result of investigations with the orthogonal strip (or cross-strip) CZT detec-
tor was a good energy resolution of about 3.3 % (FWHM at 511 keV), but “the depth-of-
interaction determination was not working satisfactorily” (Kormoll, 2013). This was caused
by the fact that the readout electronics were only able to trigger events near the cathode
side of the detector. Consequently, the entire detector volume was not exploited, resulting in
a poor efficiency of about 10 % (Hueso-González et al., 2014). The subsequent experimental
results with the pixel detector presented in this work overcome the triggering issue on
multiple cathodes. Moreover, redesigned electronics improved energy resolution, enabled
DOI determination, and maximized the total number of useable events throughout the entire
volume of the detector. It could be shown that CZT pixel detectors for application to range
assessment in particle therapy require high dynamic range analog front-end electronics and
digital pulse processing at high sampling rates. The applicability of an ASIC remains an open
question, as a suitable integrated circuit has not yet been developed.
A fundamental assumption for all design studies of a Compton camera by means of simu-
lations was a threshold of 100 keV for CZT detectors. “In reality, a lower energy threshold
of 50 keV is rather not practical for the CZT detector” (Rohling, 2015). That statement is
disproved for the investigated CZT pixel detector, and even the low energy range of CZT detectors
is not limited to 50 keV. The measurements with a 241Am source indicated that a threshold
of 10 keV is realizable. Consequently, the number of valid events due to Compton scattering
in the CZT detector can be increased by utilizing high dynamic range front-end electronics
and a pixel detector.
Finally, the low efficiency and the small number of valid events attained by experiments were mainly
caused by the orthogonal strip layout of the detector in combination with the readout elec-
tronics used for the first prototype. Meaningful conclusions regarding the Compton camera
must be derived from experimental results with the pixel detector and digital pulse process-
ing. Also, larger pixel detectors with a thickness of at least 10 mm or 15 mm should be taken
into account for simulations and experimental setups. The lack of efficiency of a Compton
camera (CC) was also confirmed by other groups. “The detection efficiency for the CC [...]
were found to be too low to make the current system clinically viable” (Polf et al., 2015).
But their detection system was purely designed with CZT detectors in the scatter and ab-
sorber layer. Moreover, their ASIC-based readout electronics have been developed for low
energies (<2 MeV) and low count rate environments, which substantiates the demand for
suitable integrated electronics.
In general, the high channel count of pixel detectors is still challenging, as long as the system
simultaneously requires a large signal bandwidth (above 10 MHz). We have developed an
interface card utilizing 64 readout channels at 32.5 MSPS as a Rear Transition Module (RTM)
for a MicroTCA system. Signal processing is implemented in the corresponding AMC, i.e. the
readout of a detector with 8×8 pixels fits into a single slot of a MicroTCA system (fig. B.1).
Final steps of construction and functional tests are currently underway. That implementation
exposes the requirements on hardware for digital signal processing. The incoming data rate
from these 64 channels is 24.69 GByte/s. By multiplexing two ADC channels to a serial lane
with differential signaling, the interface must be clocked at 780 MHz. Such data rates can be
easily handled by FPGAs (Glöckner, 2015), but increase requirements on interconnections
with regard to signal integrity and the total pin count.
In conclusion, all experimental data with CZT based Compton cameras suffer from limita-
tions due to electronics. Unfortunately, sufficient experimental data with the digitalized CZT
detector setup could not be obtained during this study in time. Thus the Compton camera
approach cannot be abandoned until clinical experiments have been performed with appro-
priate electronics and high-performance grade detectors. Even if the Compton camera is not
feasible for range verification in particle therapy, the proposed instrumentation and digital al-
gorithms can be ultimately applied to any potentially satisfying method to improve overall
detection performance. New developments in the application of nuclear detector systems
should utilize digital sampling techniques. Furthermore, pulse processing can be imple-
mented at a higher level of abstraction (Wohsmann, 2014; Scharnagl, 2015). Processing
detector signals at an algorithmic level, either in real-time by FPGAs or offline by software,
can be regarded as state-of-the-art.
7. Summary
Background: The irradiation of cancer patients with charged particles, mainly protons and
carbon ions, has become an established method for the treatment of specific types of tumors.
In comparison with the use of X-rays or γ-rays, particle therapy has the advantage that the
dose distribution in the patient can be precisely controlled. Tissue or organs lying near the
tumor will be spared. A verification of the treatment plan with the actual dose deposition by
means of a measurement can be done through range assessment of the particle beam. For
this purpose, prompt γ-rays are detected, which are emitted by the affected target volume
during irradiation.
Motivation: The detection of prompt γ-rays is a task related to radiation detection and mea-
surement. Nuclear applications in medicine can be found in particular for in vivo diagnosis.
In that respect, the spatially resolved measurement of γ-rays is an essential technique for
nuclear imaging. However, the technical requirements of radiation measurement during particle
therapy are much more challenging than those of classical applications. For this purpose,
appropriate instruments beyond the state-of-the-art need to be developed and tested for
detecting prompt γ-rays. Hence the success of a method for range assessment of particle
beams is largely determined by the implementation of electronics. In practice, this means
that a suitable detector material with adapted readout electronics, signal and information pro-
cessing, and data interface must be utilized to solve the challenges. Thus, the parameters
of the system (e.g. segmentation, time or energy resolution) can be optimized depending on
the method (e.g. slit camera, time-of-flight measurement or Compton camera). Regardless
of the method, the detector system must have a high count rate capability and a large measuring
range (≥7 MeV). For a subsequent evaluation of a suitable method for imaging, the
mentioned parameters must not be restricted by the electronics. Digital signal processing is
well suited to multipurpose tasks, and the performance of
such an implementation has to be determined with respect to these demands.
Materials and methods: In this study, the instrumentation of a detector system for prompt γ-
rays emitted during particle therapy is limited to the use of a cadmium zinc telluride (CdZnTe,
CZT) semiconductor detector. The detector crystal is divided into an 8×8 pixel array by segmented
electrodes. Analog and digital signal processing are tested with this
type of detector as an example, with the aim of applying a Compton camera to range assessment. The
electronics are implemented with commercial off-the-shelf (COTS) components. Where possible,
functional units of the detector system are digitized and implemented in a field-programmable
gate array (FPGA). An efficient implementation of the algorithms in terms of
timing and logic utilization is fundamental to the design of digital circuits. The measurement
system is characterized with radioactive sources to determine the measurement dynamic
range and resolution. Finally, the performance is examined in terms of the requirements of
particle therapy with experiments at particle accelerators.
Results: A detector system based on a CZT pixel detector has been developed and tested.
Although the use of an application-specific integrated circuit would be convenient, this approach
was rejected because no available circuit met the requirements. Instead,
a multichannel, compact, and low-noise analog amplifier circuit with COTS components has
been implemented. Finally, the 65 information channels of a detector are digitized, processed
and visualized.
Advanced digital signal processing transforms the traditional approaches of nuclear electronics
into algorithms and digital filter structures for an FPGA. With regard to the characteristic
signals (e.g. varying rise times, depth-dependent energy measurement) of a CZT pixel detector,
it could be shown that digital pulse processing results in a very good energy resolution
(≈ 2 % FWHM at 511 keV) and permits a time measurement in the range of some tens
of nanoseconds. Furthermore, the experimental results have shown that the dynamic range
of the detector system (≈ 10 keV to 7 MeV) could be significantly improved compared to the
existing prototype of the Compton camera. Even count rates of ≈ 100 kcps in a high-energy
beam could be processed with the CZT pixel detector. This, however, is merely a limit
of the detector due to its volume and is not related to the electronics. In addition, the versatility
of digital signal processing has been demonstrated with other detector materials (e.g.
CeBr3). In anticipation of high data throughput in a distributed data acquisition from multiple
detectors, a Gigabit Ethernet link has been implemented as the data interface.
Conclusions: To fully exploit the capabilities of a CZT pixel detector, digital signal
processing is absolutely necessary. A decisive advantage of the digital approach is the ease
of use in a multichannel system. With digitization, a necessary step has thus been taken
to master the complexity of a Compton camera. Furthermore, the technology assessment
shows that a CZT pixel detector meets the requirements of measuring prompt γ-rays
during particle therapy. The previously used orthogonal strip detector must be replaced by
the pixel detector in favor of increased efficiency and improved energy resolution. With the
integration of the developed digital detector system into a Compton camera, it must ultimately
be determined whether this method is applicable for range assessment in particle therapy.
Even if another method is more convenient in a clinical environment due to practical
considerations, the detector system of that method may benefit from the presented instrumentation of
a digital signal processing system for nuclear applications.
8. Zusammenfassung
Background: The irradiation of cancer patients with charged particles, mainly protons
or carbon ions, has become an established method for the treatment of
specific types of tumors. Compared with the use of X-rays or γ-rays,
particle therapy has the advantage that the dose distribution in the patient can be controlled
more precisely. Tissue and organs surrounding the tumor are thereby spared.
The treatment plan can be verified against the actual dose deposition by measurement
through range control of the particle beam. For this purpose,
prompt γ-rays are detected, which are emitted by the irradiated target volume
during irradiation.
Research question: The detection of prompt γ-rays is a task of radiation
measurement technology. Radiation applications in medical engineering are found in particular in
in vivo diagnostics. The spatially resolved measurement of γ-rays is already
a central component of nuclear medicine imaging; however, the technical
requirements of radiation detection during particle therapy are far more demanding
than those of classical applications. Beyond the state of the art,
suitable instruments for detecting prompt γ-rays must be developed
and tested for this purpose. The electronic implementation largely determines the
success of a method for range control of particle beams. Specifically, this means
that a suitable detector material with adapted readout electronics, signal and
information processing, and data interface must be used to solve the problem.
The parameters of the system (e.g. segmentation, time or energy
resolution) can thus be optimized depending on the method (e.g. slit camera, time-of-flight
measurement or Compton camera). Regardless of the method, the detector system must have a high
count rate capability and a large measuring range (≥7 MeV). For the subsequent
evaluation of a suitable imaging method, the parameters mentioned
must not be restricted by the electronics. Digital signal processing is predestined for
universal tasks, and the performance of such an implementation
is to be determined with regard to the requirements set.
Materials and methods: In this work, the instrumentation of a detector system for prompt γ-rays
is limited to the use of a cadmium zinc telluride (CdZnTe,
CZT) semiconductor detector. The detector crystal is divided into an 8×8
pixel array by segmented electrodes. Analog and digital signal processing is tested with this
detector type as an example and is aimed at range control with a Compton
camera. The electronics are implemented with mass-produced integrated circuits.
Where possible, the functional units of the detector system are digitized and implemented in
a field-programmable gate array (FPGA). An efficient implementation of the
algorithms in terms of timing and logic utilization is fundamental to the design
of the digital circuits. The measurement system is characterized with radioactive test sources with regard to
dynamic range and resolution. Finally, the performance is examined
with regard to the requirements of particle therapy in experiments at the particle
accelerator.
Results: A detector system based on CZT pixel detectors was developed
and tested. Although the use of an application-specific integrated circuit
would be expedient, this approach was rejected because no available circuit
met the requirements. Instead, a multichannel, compact, and low-noise
analog amplifier circuit was built with mass-produced integrated circuits.
Ultimately, the 65 information channels of a detector are digitized, processed and
visualized. Advanced digital signal processing transfers the traditional
approaches of nuclear electronics into algorithms and digital filter structures for an FPGA. It
could be shown that, with regard to the characteristic
signals (including varying rise times and depth-dependent energy measurement) of a
CZT pixel detector, digital pulse processing enables a very good energy resolution (≈ 2 % FWHM at 511 keV) as well as a
time measurement in the range of some tens of nanoseconds. Furthermore, the experimental
results have shown that the dynamic range of the detector system (≈ 10 keV to 7 MeV) could be
significantly improved compared to the existing prototype of the Compton camera.
Count rates of ≈ 100 kcps in a high-energy beam could also be processed with
the CZT pixel detector. This, however, is merely a limitation of the
detector due to its volume, not of the electronics. In addition, the
versatility of digital signal processing was demonstrated with other detector materials (including
CeBr3). In anticipation of a high data throughput in a distributed
data acquisition from multiple detectors, a Gigabit Ethernet
link was implemented as the data interface.
Conclusions: To fully exploit the capabilities of a CZT pixel detector,
digital signal processing is absolutely necessary. A decisive advantage
of the digital approach is its ease of handling in a multichannel system. With
digitization, a necessary step has been taken to make the complexity of a Compton
camera manageable. Furthermore, the technology assessment shows that a CZT
pixel detector withstands the requirements of particle therapy for measuring prompt γ-rays.
The strip detector used so far must be replaced by the pixel detector in favor of increased
efficiency and improved energy resolution. With the
integration of the developed digital detector system into a Compton camera, it must
finally be examined whether this method is applicable to range control in particle
therapy. Even if another method turns out to be more practicable under clinical
conditions, that detector system can also benefit from the presented
instrumentation of a digital signal processing system.
A. Waveform diagrams
[Waveform diagram: signals tx_en, txd, tx_busy, phy_tx_en and phy_txd plotted over clock positions 0 to 76.]
Figure A.1.: An example of a composed Ethernet packet through the MAC layer for GMII.
The transmission starts at position 3 with the preamble (signal “phy_txd”). The
MAC also adds the SFD (pos. 10), padding data (pos. 65-71) for a minimum
payload length of 46 Byte and the FCS (pos. 71-75). The IFG is controlled with
the tx_busy signal.
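The frame composition named in the caption of Figure A.1 can be illustrated in software. The following Python sketch is not the VHDL implementation of this work; it only mirrors the steps listed in the caption (preamble, SFD, padding to a payload of 46 bytes, FCS). The function name compose_frame, the zero-valued padding bytes and any MAC addresses passed to it are assumptions made for illustration.

```python
import zlib

PREAMBLE = bytes([0x55] * 7)  # seven preamble octets
SFD = bytes([0xD5])           # start frame delimiter
MIN_PAYLOAD = 46              # minimum Ethernet payload length in bytes


def compose_frame(dst_mac: bytes, src_mac: bytes,
                  ethertype: int, payload: bytes) -> bytes:
    """Return the octet stream a MAC would hand to the PHY for one frame."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    # Assumption: short payloads are padded with zero bytes up to 46 bytes.
    padded = payload + bytes(max(0, MIN_PAYLOAD - len(payload)))
    frame = dst_mac + src_mac + ethertype.to_bytes(2, "big") + padded
    # The FCS is the CRC-32 over header and payload, sent low byte first.
    fcs = zlib.crc32(frame).to_bytes(4, "little")
    return PREAMBLE + SFD + frame + fcs
```

For a 40-byte IP datagram the function returns 72 octets (8 for preamble and SFD, 14 for the header, 46 for the padded payload, 4 for the FCS), which is consistent with the span from position 3 to position 75 quoted in the caption.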
[Waveform diagram: signals udp_fifo_empty, ip_fifo_empty, udp_tx_en, udp_txd, tx_start, udp_tx_next, ip_tx_en, ip_txd, ip_tx_next, tx_en and txd plotted over clock positions 0 to 63.]
Figure A.2.: Example of the dataflow for a UDP packet through the transport layers with the
“Data-Pull” model.
Figure A.3.: The transmission enable signal (signal “PHY1_TX_EN”) of the transmitting MAC
and the receive enable signal (signal “PHY2_RX_EN”) of the receiving PHY. The
oscilloscope measurement verifies that the MAC keeps the IFG at 96 ns while
sending data with maximum throughput and constant latency. The packet length
of 592 ns corresponds to a UDP payload of 20 Byte.
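The timing values in the caption of Figure A.3 follow from simple octet counting, which the short Python sketch below reproduces. It assumes a GMII link at 1 Gbit/s (one octet per 8 ns clock cycle), an IPv4 header without options (20 bytes) and an 8-byte UDP header; the function names are hypothetical and chosen only for this illustration.

```python
BYTE_TIME_NS = 8             # one octet per 125 MHz GMII clock cycle
IFG_NS = 12 * BYTE_TIME_NS   # interframe gap of 12 octet times (96 bit times)


def frame_octets(udp_payload: int) -> int:
    """Octets on the wire for one UDP/IPv4 frame, preamble through FCS."""
    ip_datagram = 20 + 8 + udp_payload   # IPv4 header + UDP header + data
    eth_payload = max(46, ip_datagram)   # pad to the Ethernet minimum
    return 7 + 1 + 14 + eth_payload + 4  # preamble, SFD, MAC header, FCS


def frame_time_ns(udp_payload: int) -> int:
    """Duration of one frame on the wire in nanoseconds."""
    return frame_octets(udp_payload) * BYTE_TIME_NS


def payload_rate_mbit(udp_payload: int) -> float:
    """UDP payload throughput in Mbit/s for back-to-back frames."""
    cycle_ns = frame_time_ns(udp_payload) + IFG_NS
    return udp_payload * 8 / cycle_ns * 1000
```

With a UDP payload of 20 bytes, frame_time_ns(20) gives 592 ns and IFG_NS gives 96 ns, matching the values read from the oscilloscope measurement.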
Figure A.4.: Oscilloscope measurement of the time difference between the master and the
slave. Each FPGA outputs a PPS signal (for this measurement every 268.4 ms)
which shows the precision of the absolute timestamp synchronization with PTP.
The standard deviation of the time difference is 58.932 ps with a constant offset
of about 3.94 ns.

B. Rear Transition Module
Figure B.1.: Top and bottom side of the Rear Transition Module (RTM) with 64 ADC channels
for the MicroTCA implementation of the readout electronics. Each channel
is capable of sampling at 32.5 MSPS (12 bit, 1 Vpp). Total width of the PCB is
182.5 mm and total height is 148.5 mm.
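As a rough plausibility check, the channel count, sampling rate and resolution given in the caption of Figure B.1 can be turned into a raw data rate and a quantization step. The Python sketch below is only an illustration; the variable names are hypothetical, and the comparison with a 1 Gbit/s link assumes that all channels stream raw samples continuously, which a system with pulse processing in the FPGA would not do.

```python
N_CHANNELS = 64          # ADC channels on the RTM
SAMPLE_RATE_HZ = 32.5e6  # samples per second and channel
RESOLUTION_BITS = 12     # bits per sample
FULL_SCALE_VPP = 1.0     # input range in volts peak-to-peak
LINK_RATE_GBIT = 1.0     # assumed Gigabit Ethernet line rate

# Raw sample stream of all channels in Gbit/s.
raw_rate_gbit = N_CHANNELS * SAMPLE_RATE_HZ * RESOLUTION_BITS / 1e9

# Voltage step of one ADC code in millivolts.
lsb_mv = FULL_SCALE_VPP / 2**RESOLUTION_BITS * 1e3

# Factor by which the raw stream exceeds the assumed link rate.
reduction_factor = raw_rate_gbit / LINK_RATE_GBIT

print(f"raw rate: {raw_rate_gbit:.2f} Gbit/s, LSB: {lsb_mv:.3f} mV, "
      f"required reduction: {reduction_factor:.1f}x")
```

The raw stream of about 25 Gbit/s exceeds a single Gigabit Ethernet link by more than an order of magnitude, so only reduced data, for example pulse parameters extracted in the FPGA, could be sent over such a link.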

Bibliography
Abbene L and Gerardi G. 2015. High-rate dead-time corrections in a general
purpose digital pulse processing system. J Synchrotron Radiat, 22(5):1190–1201.
doi: 10.1107/S1600577515013776.
Alachiotis N, Berger SA, and Stamatakis A. 2010. Efficient PC-FPGA Communication over
Gigabit Ethernet. In: 10th Int Conf Computer and Information Technology, pp. 1727–1734.
IEEE. doi: 10.1109/CIT.2010.302.
Alachiotis N, Berger SA, and Stamatakis A. 2012. A versatile UDP/IP based PC↔FPGA
communication platform. In: Int Conf Reconfigurable Computing and FPGAs, pp. 1–6.
IEEE. doi: 10.1109/ReConFig.2012.6416725.
Altera Corporation. 2010. Implementing FIR Filters and FFTs with 28-nm Variable-Precision
DSP Architecture. White Paper WP-01140-1.0.
Altera Corporation. 2011. Enabling High-Performance DSP Applications with Arria V or
Cyclone V Variable-Precision DSP Blocks. White Paper WP-01159-1.0.
Altera Corporation. 2015. FIR II IP Core, UG-01072, 2015.10.01.
Amrami R, Shani G, Hefetz Y, Levy M, Pansky A, and Wainer N. 2000. PET properties
of pixellated CdZnTe detector. In: Proc 22nd Ann Int Conf Engineering in Medicine and
Biology Society, pp. 94–97. IEEE. doi: 10.1109/IEMBS.2000.900677.
Amrami R, Shani G, Hefetz Y, Pansky A, and Wainer N. 2001. Timing performance
of pixelated CdZnTe detectors. Nucl Instrum Methods Phys Res A, 458(3):772–781.
doi: 10.1016/S0168-9002(00)00810-X.
Analog Devices. 2013. ADA4817-1/ADA4817-2, 1 GHz FastFET Op Amps, Datasheet.
Awadalla S. 2015. Solid-State Radiation Detectors: Technology and Applications. CRC
Press.
Batmaz B and Dogan A. 2015. UDP/IP Protocol Stack with PCIe Interface on FPGA. In:
Proc Int Conf Embedded Systems and Applications, pp. 49–53. WorldComp.
Batzler WU, Schön D, Baumgardt-Elms C, Schüz J, Eisinger B, Stegmaier C, and Lehnert
M. 1999. Krebs in Deutschland: Häufigkeiten und Trends. Gesamtprogramm zur Krebs-
bekämpfung. Arbeitsgemeinschaft Bevölkerungsbezogener Krebsregister in Deutschland,
2 Edition.
Bollinger LM and Thomas GE. 1961. Measurement of the Time Dependence of Scintil-
lation Intensity by a Delayed-Coincidence Method. Rev Sci Instrum, 32(9):1044–1050.
doi: 10.1063/1.1717610.
Brisebois G. 2015. Op Amp Combines Femtoamp Bias Current with 4GHz Gain Bandwidth
Product, Shines New Light on Photonics Applications. LT J Analog Innovation, 25(2):4–9.
Butcher J, Hamade M, Petryk M, Bolotnikov AE, Camarda GS, Cui Y, De Geronimo
G, Fried J, Hossain A, Kim KH, Vernon E, Yang G, and James RB. 2013. Drift
Time Variations in CdZnTe Detectors Measured With Alpha Particles and Gamma Rays:
Their Correlation With Detector Response. IEEE Trans Nucl Sci, 60(2):1189–1196.
doi: 10.1109/TNS.2012.2234762.
Cho HY and Lee CS. 2016. Improvement of the energy resolution in a pixellated CdZnTe
detector using depth sensing based on pulse rise-time correlation. J Instrum, 11(02):
C02081. doi: 10.1088/1748-0221/11/02/C02081.
Cho HY, Lee JH, Kwon YK, Moon JY, and Lee CS. 2011. Measurement of the drift mobilities
and the mobility-lifetime products of charge carriers in a CdZnTe crystal by using a tran-
sient pulse technique. J Instrum, 6(01):C01025. doi: 10.1088/1748-0221/6/01/C01025.
Crochiere RE and Oppenheim AV. 1975. Analysis of linear digital networks. Proc IEEE, 63
(4):581–595. doi: 10.1109/PROC.1975.9793.
De Antonis P, Morton EJ, and Menezes T. 1996. Measuring the bulk resistivity of CdZnTe
single crystal detectors using a contactless alternating electric field method. Nucl Instrum
Methods Phys Res A, 380(1-2):157–159. doi: 10.1016/S0168-9002(96)00335-X.
Del Sordo S, Abbene L, Caroli E, Mancini AM, Zappettini A, and Ubertini P. 2009. Progress
in the Development of CdTe and CdZnTe Semiconductor Radiation Detectors for Astro-
physical and Medical Applications. Sensors, 9(5):3491–3526. doi: 10.3390/s90503491.
Di Fulvio A, Shin T, Hamel M, and Pozzi S. 2016. Digital pulse processing for NaI(Tl) detec-
tors. Nucl Instrum Methods Phys Res A, 806:169–174. doi: 10.1016/j.nima.2015.09.080.
Dollas A, Ermis I, Koidis I, Zisis I, and Kachris C. 2005. An Open TCP/IP Core for Reconfig-
urable Logic. In: Proc 13th Ann Symp Field-Programmable Custom Computing Machines,
pp. 297–298. IEEE. doi: 10.1109/FCCM.2005.20.
Dönmez B, Kim J, and He Z. 2007. 3D position sensing on UltraPeRL CdZnTe detectors.
In: Nucl Sci Symp Conf Rec, pp. 420–423. IEEE. doi: 10.1109/NSSMIC.2007.4436361.
Drezet A, Monnet O, Mathy F, Montemont G, and Verger L. 2007. CdZnTe detectors for
small field of view positron emission tomographic imaging. Nucl Instrum Methods Phys
Res A, 571(1-2):465–470. doi: 10.1016/j.nima.2006.10.292.
Fiedler F, Dersch U, Golnik C, Kormoll T, Müller A, Rohling H, Schöne S, and Enghardt W.
2011. The use of prompt γ-rays for in-vivo dosimetry at therapeutic proton and ion beams.
In: Nucl Sci Symp Conf Rec, pp. 4453–4456. IEEE. doi: 10.1109/NSSMIC.2011.6152493.
Födisch P, Lange B, and Kaever P. 2013. Eine Ausleseelektronik für CZT-Detektoren mit
dem RENA-3 IC von Nova R&D. In: 104. Tag Studiengr Elektr Instrum, pp. 135–143.
DESY.
Födisch P, Sandmann J, Lange B, and Kaever P. 2014. Taktsynchronisierung und Zeitmes-
sung in einem verteilten Datenerfassungssystem. In: 105. Tag Studiengr Elektr Instrum,
pp. 238–242. DESY.
Födisch P, Lange B, and Kaever P. 2015. Ein VHDL basierter Gigabit Ethernet Protokoll-
stapel für FPGAs. In: 106. Tag Studiengr Elektr Instrum, pp. 52–76. DESY.
Födisch P, Berthel M, Lange B, Kirschke T, Enghardt W, and Kaever P. 2016a. Charge-
sensitive front-end electronics with operational amplifiers for CdZnTe detectors. J Instrum,
11(09):T09001. doi: 10.1088/1748-0221/11/09/T09001.
Födisch P, Lange B, Sandmann J, Büchner A, Enghardt W, and Kaever P. 2016b. A syn-
chronous Gigabit Ethernet protocol stack for high-throughput UDP/IP applications. J In-
strum, 11(01):P01010. doi: 10.1088/1748-0221/11/01/P01010.
Födisch P, Wohsmann J, Lange B, Schönherr J, Enghardt W, and Kaever P. 2016c. Digital
high-pass filter deconvolution by means of an infinite impulse response filter. Nucl Instrum
Methods Phys Res A, 830:484–496. doi: 10.1016/j.nima.2016.06.019.
Födisch P, Bryksa A, Lange B, Enghardt W, and Kaever P. 2016d. Implementing High-Order
FIR Filters in FPGAs. arXiv:1610.03360 [cs.DC].
Fouan JP and Passerieux JP. 1968. A time compensation method for coinci-
dences using large coaxial Ge(Li) detectors. Nucl Instrum Methods, 62(3):327–329.
doi: 10.1016/0029-554X(68)90384-4.
Fraile L, Mach H, Vedia V, Olaizola B, Paziy V, Picado E, and Udías J. 2013. Fast timing
study of a CeBr3 crystal: Time resolution below 120 ps at 60Co energies. Nucl Instrum
Methods Phys Res A, 701:235–242. doi: 10.1016/j.nima.2012.11.009.
Furukawa Denshi. 2016. Ce:GAGG Scintillator Crystal, product information. [last up-
date: 2014, retrieved: 03.11.2016]. URL: http://www.furukawa-denshi.co.jp.
Gan B, Wei T, Gao W, Liu H, and Hu Y. 2016. A low-noise 64-channel front-end readout
ASIC for CdZnTe detectors aimed to hard X-ray imaging systems. Nucl Instrum Methods
Phys Res A, 816:53–61. doi: 10.1016/j.nima.2016.01.070.
Garson A, Li Q, Jung IV, Dowkontt P, Bose R, Simburger G, and Krawczynski H. 2007.
Leakage currents and capacitances of thick CZT detectors. In: Nucl Sci Symp Conf Rec,
pp. 2258–2261. IEEE. doi: 10.1109/NSSMIC.2007.4436597.
Georgiev A, Gast W, and Lieder RM. 1994. An analog-to-digital conversion
based on a moving window deconvolution. IEEE Trans Nucl Sci, 41(4):1116–1124.
doi: 10.1109/23.322868.
Georgiev A and Gast W. 1993. Digital pulse processing in high resolution, high throughput,
gamma-ray spectroscopy. IEEE Trans Nucl Sci, 40(4):770–779. doi: 10.1109/23.256659.
Girerd C, Autiero D, Carlus B, Gardien S, Marteau J, and Tromeur W. 2009. MicroTCA im-
plementation of synchronous Ethernet-Based DAQ systems for large scale experiments.
In: 16th IEEE-NPSS Real Time Conf, pp. 22–27. IEEE. doi: 10.1109/RTC.2009.5321791.
Glöckner T. 2015. Entwurf, Implementierung und Test eines JESD204B-IP-Cores für ein
Highspeed-ADC-Board mit einem Xilinx-FPGA. Dresden University of Applied Sciences,
Bachelor thesis.
Golnik C, Bemmerer D, Enghardt W, Fiedler F, Hueso-González F, Pausch G, Roemer K,
Rohling H, Schöne S, Wagner L, and Kormoll T. 2016. Tests of a Compton imaging proto-
type in a monoenergetic 4.44 MeV photon field – a benchmark setup for prompt gamma-
ray imaging devices. J Instrum, 11(06):P06009. doi: 10.1088/1748-0221/11/06/P06009.
Golnik C. 2016. Treatment verification in proton therapy based on the detection of prompt
gamma-rays. Technische Universität Dresden, Dissertation.
Grosser A. 2015. Director, Product Management, Redlen Technologies, Inc. – Private
communication.
Grupen C and Shwartz B. 2011. Particle Detectors. Cambridge University Press, 2 Edition.
Harboe Ø. 2016. The Zylin ZPU, The worlds smallest 32 bit CPU with GCC toolchain. [last
update: 21.04.2015, retrieved: 03.11.2016]. URL: https://github.com/zylin/zpu.
Hartog HD and Muller FA. 1947. Optimum Instrument Response for Dis-
crimination against Spontaneous Fluctuations. Physica, 13(9):571–580.
doi: 10.1016/0031-8914(47)90027-X.
He Z, Knoll GF, and Wehe DK. 1998. Direct measurement of product of the electron mobility
and mean free drift time of CdZnTe semiconductors using position sensitive single polarity
charge sensing detectors. J Appl Phys, 84(10):5566–5569. doi: 10.1063/1.368601.
He Z, Li W, Knoll GF, Wehe DK, Berry J, and Stahle CM. 1999. 3-D position sensitive
CdZnTe gamma-ray spectrometers. Nucl Instrum Methods Phys Res A, 422(1-3):173–
178. doi: 10.1016/S0168-9002(98)00950-4.
He Z. 2001. Review of the Shockley-Ramo theorem and its application in semicon-
ductor gamma-ray detectors. Nucl Instrum Methods Phys Res A, 463(1-2):250–267.
doi: 10.1016/S0168-9002(01)00223-6.
Heinzel G, Rüdiger A, and Schilling R. 2002. Spectrum and spectral density estimation
by the Discrete Fourier transform (DFT), including a comprehensive list of window
functions and some new flat-top windows. [last update: 2002, retrieved: 03.11.2016].
URL: http://hdl.handle.net/11858/00-001M-0000-0013-557A-5.
Herrmann FL, Perin G, de Freitas JPJ, Bertagnolli R, and dos Santos Martins JB. 2009.
A Gigabit UDP/IP Network Stack in FPGA. In: 16th Int Conf Electronics, Circuits, and
Systems, pp. 836–839. IEEE. doi: 10.1109/ICECS.2009.5410757.
Hessling JP. 2008. A novel method of dynamic correction in the time domain. Meas Sci
Technol, 19(7):075101. doi: 10.1088/0957-0233/19/7/075101.
Hoffmann R. 2013. Signalanalyse und -erkennung: Eine Einführung für Informationstech-
niker. Springer.
Hong J, Bellm EC, Grindlay JE, and Narita T. 2004. Cathode depth sensing in CZT detec-
tors. In: Proc SPIE, Vol 5165, pp. 54–62. doi: 10.1117/12.506216.
Horowitz P and Hill W. 2015. The Art of Electronics. Cambridge University Press, 3 Edition.
Hueso-González F, Golnik C, Berthel M, Dreyer A, Enghardt W, Fiedler F, Heidel K, Kormoll
T, Rohling H, Schöne S, Schwengner R, Wagner A, and Pausch G. 2014. Test of Comp-
ton camera components for prompt gamma imaging at the ELBE bremsstrahlung beam.
J Instrum, 9(05):P05002. doi: 10.1088/1748-0221/9/05/P05002.
Hueso-González F. 2016. Nuclear methods for real-time range verification in proton therapy
based on prompt gamma-ray imaging. Technische Universität Dresden, Dissertation.
Hueso-González F, Enghardt W, Fiedler F, Golnik C, Janssens G, Petzoldt J, Prieels
D, Priegnitz M, Roemer KE, Smeets J, Vander Stappen F, Wagner A, and Pausch
G. 2015. First test of the prompt gamma ray timing method with heteroge-
neous targets at a clinical proton therapy facility. Phys Med Biol, 60(16):6247–6272.
doi: 10.1088/0031-9155/60/16/6247.
IEEE. 2011. IEEE Standard for Terminology and Test Methods for Analog-
to-Digital Converters. IEEE Std 1241-2010 (Revision of IEEE Std 1241-2000).
doi: 10.1109/IEEESTD.2011.5692956.
IEEE. 2015. IEEE Standard for Ethernet. IEEE Std 802.3-2015 (Revision of IEEE Std 802.3-
2012).
iseg. 2014. High Voltage Power Supply, SHQ High Precision Series. Operator’s Manual,
Rev. 2014-02-13-17-34, iseg Spezialelektronik GmbH.
Iwanowska J, Swiderski L, Szczesniak T, Sibczynski P, Moszynski M, Grodzicka M, Kamada
K, Tsutsumi K, Usuki Y, Yanagida T, and Yoshikawa A. 2013. Performance of cerium-
doped Gd3Al2Ga3O12 (GAGG:Ce) scintillator in gamma-ray spectrometry. Nucl Instrum
Methods Phys Res A, 712:34–40. doi: 10.1016/j.nima.2013.01.064.
Jordanov VT. 1994. Deconvolution of pulses from a detector-amplifier configuration. Nucl
Instrum Methods Phys Res A, 351(2-3):592–594. doi: 10.1016/0168-9002(94)91394-3.
Jordanov VT. 2016. Unfolding-synthesis technique for digital pulse processing. Part 1: Un-
folding. Nucl Instrum Methods Phys Res A, 805:63–71. doi: 10.1016/j.nima.2015.07.040.
Jordanov VT and Knoll GF. 1994. Digital synthesis of pulse shapes in real time for high
resolution radiation spectroscopy. Nucl Instrum Methods Phys Res A, 345(2):337–345.
doi: 10.1016/0168-9002(94)91011-1.
Jordanov VT, Knoll GF, Huber AC, and Pantazis JA. 1994. Digital techniques for real-time
pulse shaping in radiation measurements. Nucl Instrum Methods Phys Res A, 353(1-3):
261–264. doi: 10.1016/0168-9002(94)91652-7.
Jung I, Garson AB, Perkins JS, Krawczynski H, Matteson J, Skelton RT, Burger A, and Groza
M. 2005. Thick pixelated CZT detectors with isolated steering grids. In: Nucl Sci Symp
Conf Rec, pp. 206–210. IEEE. doi: 10.1109/NSSMIC.2005.1596237.
Kaatsch P, Spix C, Katalinic A, Hentschel S, Luttmann S, Stegmaier C, Caspritz S, Christ
M, Ernst A, Folkerts J, Hansmann J, and Klein S. 2015. Krebs in Deutschland 2011/2012.
Gesundheitsberichterstattung-Hefte, 10. doi: 10.17886/rkipubl-2015-004.
Keysight Technologies. 2016. 33500B Series Waveform Generators.
Kim JW, Kim D, and Yim H. 2009. Pinhole Camera Measurements of Prompt Gamma-rays
for Detection of Beam Range Variation in Proton Therapy. J Korean Phys Soc, 55(4):
1673–1676. doi: 10.3938/jkps.55.1673.
Knoll GF. 2010. Radiation Detection and Measurement. John Wiley & Sons, 4 Edition.
Komarov S, Yin Y, Wu H, Wen J, Krawczynski H, Meng LJ, and Tai YC. 2012. Investigation
of the limitations of the highly pixilated CdZnTe detector for PET applications. Phys Med
Biol, 57(22):7355–7380. doi: 10.1088/0031-9155/57/22/7355.
Kormoll T, Fiedler F, Golnik C, Heidel K, Kempe M, Schoene S, Sobiella M, Zuber K,
and Enghardt W. 2011a. A Prototype Compton Camera for In-Vivo Dosimetry of
Ion Beam Cancer Irradiation. In: Nucl Sci Symp Conf Rec, pp. 3484–3487. IEEE.
doi: 10.1109/NSSMIC.2011.6152639.
Kormoll T, Fiedler F, Schöne S, Wüstemann J, Zuber K, and Enghardt W. 2011b. A Compton
imager for in-vivo dosimetry of proton beams - a design study. Nucl Instrum Methods Phys
Res A, 626-627:114–119. doi: 10.1016/j.nima.2010.10.031.
Kormoll T. 2013. A Compton Camera for In-vivo Dosimetry in Ion-beam Radiotherapy.
Technische Universität Dresden, Dissertation.
Kowalski E. 1970. Nuclear Electronics. Springer Verlag, 1 Edition.
Kozyczkowski JJ and Bialkowski J. 1976. Amplitude and rise time compensated tim-
ing optimized for large semiconductor detectors. Nucl Instrum Methods, 137(1):75–83.
doi: 10.1016/0029-554X(76)90251-2.
Krause M and Supiot S. 2015. Advances in radiotherapy special feature. Br J Radiol, 88
(1051):20150412. doi: 10.1259/bjr.20150412.
Kühn W, Gilardi C, Kirschner D, Lang J, Lange S, Liu M, Perez T, Yang S, Schmitt L, Jin D,
Li L, Liu Z, Lu Y, Wang Q, Wei S, Xu H, Zhao D, Korcyl K, Otwinowski JT, Salabura P,
Konorov I, and Mann A. 2008. FPGA based compute nodes for high level triggering in
PANDA. J Phys Conf Ser, 119(2):022027. doi: 10.1088/1742-6596/119/2/022027.
Lee W, Bolotnikov A, Lee T, Camarda G, Cui Y, Gul R, Hossain A, Utpal R, Yang G, and
James R. 2016. Mini Compton Camera Based on an Array of Virtual Frisch-Grid CdZnTe
Detectors. IEEE Trans Nucl Sci, 63(1):259–265. doi: 10.1109/TNS.2015.2514120.
Li W, He Z, Knoll G, Wehe D, and Berry J. 2001. Experimental results from an Imarad
8×8 pixellated CZT detector. Nucl Instrum Methods Phys Res A, 458(1-2):518–526.
doi: 10.1016/S0168-9002(00)00913-X.
Lieber P and Hutchings B. 2011. FPGA Communication Framework. In:
Int Symp Field-Programmable Custom Computing Machines, pp. 69–72. IEEE.
doi: 10.1109/FCCM.2011.39.
Linear Technology. 2014. LTC6268/LTC6269 500MHz Ultra-Low Bias Current FET Input Op
Amp, Datasheet 62689f.
Linear Technology. 2015. LTC6268-10/LTC6269-10 4GHz Ultra-Low Bias Current FET Input
Op Amp, Datasheet 626810f.
Löfgren A, Lodesten L, Sjoholm S, and Hansson H. 2005. An analysis of FPGA-based
UDP/IP stack parallelism for embedded Ethernet connectivity. In: 23rd NORCHIP Conf,
pp. 94–97. doi: 10.1109/NORCHP.2005.1596997.
Lu P, Gomolchuk P, Chen H, Beitz D, and Grosser A. 2015. Ruggedization of CdZnTe de-
tectors and detector assemblies for radiation detection applications. Nucl Instrum Methods
Phys Res A, 784:44–50. doi: 10.1016/j.nima.2015.01.022.
Lyons RG. 2010. Understanding Digital Signal Processing. Addison Wesley, 3 Edition.
Mahmoodi MR, Sayedi SM, and Mahmoodi B. 2014. Reconfigurable Hardware Implementa-
tion of Gigabit UDP/IP Stack Based on Spartan-6 FPGA. In: 6th Int Conf Information Tech-
nology and Electrical Engineering, pp. 1–6. IEEE. doi: 10.1109/ICITEED.2014.7007955.
Marvell. 2004. 88E1111 Datasheet, Integrated 10/100/1000 Ultra Gigabit Ethernet
Transceiver.
MathWorks. 2016a. MATLAB and DSP System Toolbox.
MathWorks. 2016b. MATLAB and System Identification Toolbox.
McCleskey M, Kaye W, Mackin D, Beddar S, He Z, and Polf J. 2015. Evaluation of a
multistage CdZnTe Compton camera for prompt γ imaging for proton therapy. Nucl Instrum
Methods Phys Res A, 785:163–169. doi: 10.1016/j.nima.2015.02.030.
Meher PK, Chandrasekaran S, and Amira A. 2008. FPGA Realization of FIR Filters by Efficient
and Flexible Systolization Using Distributed Arithmetic. IEEE Trans Signal Process,
56(7):3009–3017. doi: 10.1109/TSP.2007.914926.
Mehrnia A and Willson AN. 2016. FIR Filter Design Using Optimal Factoring:
A Walkthrough and Summary of Benefits. IEEE Circuits Syst Mag, 16(1):8–21.
doi: 10.1109/MCAS.2015.2510178.
Meng L and He Z. 2005. Exploring the limiting timing resolution for large volume CZT
detectors with waveform analysis. Nucl Instrum Methods Phys Res A, 550(1-2):435–445.
doi: 10.1016/j.nima.2005.04.076.
Meyer-Baese U. 2007. Digital Signal Processing with Field Programmable Gate Arrays.
Springer, 3rd Edition.
Moreira P, Serrano J, Wlostowski T, Loschmidt P, and Gaderer G. 2009. White
Rabbit: Sub-Nanosecond Timing Distribution over Ethernet. In: Int Symp Precision
Clock Synchronization for Measurement, Control and Communication, pp. 1–5. IEEE.
doi: 10.1109/ISPCS.2009.5340196.
Nagy F, Hegyesi G, Valastyán I, Imrek J, Király B, and Molnár J. 2011. Hardware Accelerated
UDP/IP Module for High Speed Data Acquisition in Nuclear Detector Systems. In:
Nucl Sci Symp Conf Rec, pp. 810–813. IEEE. doi: 10.1109/NSSMIC.2011.6154544.
Nicholson PW. 1974. Nuclear Electronics. John Wiley & Sons Ltd.
NIST. 2016. XCOM: Photon Cross Sections Database. National Institute
of Standards and Technology. [last update: 04.10.2016, retrieved: 03.11.2016].
URL: https://www.nist.gov/pml/xcom-photon-cross-sections-database.
Nuttall A. 1981. Some windows with very good sidelobe behavior. IEEE Trans Acoust, 29
(1):84–91. doi: 10.1109/TASSP.1981.1163506.
Okada Y, Takahashi T, Sato G, Watanabe S, Nakazawa K, Mori K, and Makishima K. 2002.
CdTe and CdZnTe detectors for timing measurements. IEEE Trans Nucl Sci, 49(4):1986–1992.
doi: 10.1109/TNS.2002.801709.
Oppenheim AV and Schafer RW. 2007. Discrete-Time Signal Processing. Prentice Hall, 3rd
International Edition.
Park SY and Meher PK. 2014. Efficient FPGA and ASIC Realizations of a DA-Based
Reconfigurable FIR Digital Filter. IEEE Trans Circuits Syst II, Exp Briefs, 61(7):511–515.
doi: 10.1109/TCSII.2014.2324418.
Parnham KB, Eissler EE, Jovanovic S, and Lynn KG. 1995. A Study of the Timing
Properties of Cd0.9Zn0.1Te. In: Nucl Sci Symp Conf Rec, pp. 136–138. IEEE.
doi: 10.1109/NSSMIC.1995.504194.
Pausch G, Petzoldt J, Berthel M, Enghardt W, Fiedler F, Golnik C, Hueso-González F,
Lentering R, Roemer K, Ruhnau K, Stein J, Wolf A, and Kormoll T. 2016. Scintillator-Based
High-Throughput Fast Timing Spectroscopy for Real-Time Range Verification in
Particle Therapy. IEEE Trans Nucl Sci, 63(2):664–672. doi: 10.1109/TNS.2016.2527822.
Peterson SW, Robertson D, and Polf J. 2010. Optimizing a three-stage Compton camera
for measuring prompt gamma rays emitted during proton radiotherapy. Phys Med Biol, 55
(22):6841–6856. doi: 10.1088/0031-9155/55/22/015.
PICMG. 2011. MicroTCA enhancements for rear I/O and precision timing (PICMG-MTCA-4-R1.0).
PCI Industrial Computer Manufacturers Group (PICMG).
Polf JC, Avery S, Mackin DS, and Beddar S. 2015. Imaging of prompt
gamma rays emitted during delivery of clinical proton beams with a Compton camera:
feasibility studies for range verification. Phys Med Biol, 60(18):7085–7099.
doi: 10.1088/0031-9155/60/18/7085.
Proakis JG. 1995. Digital Signal Processing: Principles, Algorithms and Applications.
Prentice Hall International, 3rd Edition.
Proakis JG and Manolakis DK. 2013. Digital Signal Processing. Pearson Education Limited,
Pearson New International Edition.
PTCOG. 2016. Particle therapy facilities in operation and in a planning stage. Particle Therapy
Co-Operative Group (PTCOG). [last update: October 2016, retrieved: 03.11.2016].
URL: http://www.ptcog.ch.
Ra S, Kim S, Kim HJ, Park H, Lee S, Kang H, and Doh SH. 2008. Luminescence and
Scintillation Properties of a CeBr3 Single Crystal. IEEE Trans Nucl Sci, 55(3):1221–1224.
doi: 10.1109/TNS.2008.920253.
Ramachers Y and Stewart DY. 2007. Energy resolution improvement in room-temperature
CZT detectors. J Instrum, 2(12):P12003. doi: 10.1088/1748-0221/2/12/P12003.
Redlen Technologies. 2011. M1757 CZT Radiation Detector datasheet.
Richard MH, Chevallier M, Dauvergne D, Freud N, Henriquet P, Le Foulher F, Letang JM,
Montarou G, Ray C, Roellinghoff F, Testa E, Testa M, and Walenta AH. 2009. Design
Study of a Compton Camera for Prompt γ Imaging During Ion Beam Therapy. In: Nucl
Sci Symp Conf Rec, pp. 4172–4175. IEEE. doi: 10.1109/NSSMIC.2009.5402293.
Richter C, Pausch G, Barczyk S, Priegnitz M, Keitz I, Thiele J, Smeets J, Stappen FV,
Bombelli L, Fiorini C, Hotoiu L, Perali I, Prieels D, Enghardt W, and Baumann M. 2016.
First clinical application of a prompt gamma based in vivo proton range verification system.
Radiother Oncol, 118(2):232–237. doi: 10.1016/j.radonc.2016.01.004.
Roellinghoff F, Richard MH, Chevallier M, Constanzo J, Dauvergne D, Freud N, Henriquet
P, Le Foulher F, Létang J, Montarou G, Ray C, Testa E, Testa M, and Walenta A. 2011.
Design of a Compton camera for 3D prompt-γ imaging during ion beam therapy. Nucl
Instrum Methods Phys Res A, 648:S20–S23. doi: 10.1016/j.nima.2011.01.069.
Roemer K, Pausch G, Bemmerer D, Berthel M, Dreyer A, Golnik C, Hueso-González F, Kormoll
T, Petzoldt J, Rohling H, Thirolf P, Wagner A, Wagner L, Weinberger D, and Fiedler
F. 2015. Characterization of scintillator crystals for usage as prompt gamma monitors in
particle therapy. J Instrum, 10(10):P10033. doi: 10.1088/1748-0221/10/10/P10033.
Rohling H. 2015. Simulation studies for the in-vivo dose verification of particle therapy.
Technische Universität Dresden, Dissertation.
Rossi L, Fischer P, Rohe T, and Wermes N. 2006. Pixel Detectors: From Fundamentals to
Applications. Springer.
Sasi A, Saravanan S, Pandian SR, and Sundaram RS. 2013. UDP/IP stack in FPGA for
hard real-time communication of Sonar sensor data. In: Ocean Electronics (SYMPOL),
pp. 1–6. IEEE. doi: 10.1109/SYMPOL.2013.6701940.
Scharnagl C. 2015. Entwurf eines numerischen Algorithmus als digitale Schaltung und Vergleich
der Implementierungsmöglichkeiten in einem FPGA. Dresden University of Applied
Sciences, Diploma thesis.
Schumann A, Petzoldt J, Dendooven P, Enghardt W, Golnik C, Hueso-González F, Kormoll
T, Pausch G, Roemer K, and Fiedler F. 2015. Simulation and experimental verification
of prompt gamma-ray emissions during proton irradiation. Phys Med Biol, 60(10):4197–4207.
doi: 10.1088/0031-9155/60/10/4197.
Shah K, Glodo J, Higgins W, van Loef E, Moses W, Derenzo S, and Weber M. 2005.
CeBr3 scintillators for gamma-ray spectroscopy. IEEE Trans Nucl Sci, 52(6):3157–3159.
doi: 10.1109/TNS.2005.860155.
Shen Z. 2010. Improving FIR Filter Coefficient Precision [DSP Tips & Tricks]. IEEE Signal
Process Mag, 27(4):120–124. doi: 10.1109/MSP.2010.936777.
Smeets J. 2012. Prompt gamma imaging with a slit camera for real time range control in
particle therapy. Université Libre de Bruxelles, Dissertation.
Spieler H. 2005. Semiconductor Detector Systems. Oxford University Press.
Stein J, Scheuer F, Gast W, and Georgiev A. 1996. X-ray detectors with
digitized preamplifiers. Nucl Instrum Methods Phys Res B, 113(1-4):141–145.
doi: 10.1016/0168-583X(95)01417-9.
Stein J, Georgiev A, Büchner A, and Gast W. 1994. Circuit arrangement for the digital
processing of semiconductor detector signals. United States Patent 5,307,299.
Stein J, Georgiev A, Büchner A, and Gast W. 1997. Schaltungsanordnung für die digitale
Verarbeitung von Halbleiterdetektorsignalen. Europäische Patentschrift EP0550830B1.
Taya T, Kataoka J, Kishimoto A, Iwamoto Y, Koide A, Nishio T, Kabuki S, and Inaniwa
T. 2016. First demonstration of real-time gamma imaging by using a handheld Compton
camera for particle therapy. Nucl Instrum Methods Phys Res A, 831:355–361.
doi: 10.1016/j.nima.2016.04.028.
Taylor AP. 2012. Ins and Outs of Digital Filter Design and Implementation. Xcell, 78:36–41.
Texas Instruments. 2004. DP83865 Gig PHYTER V 10/100/1000 Ethernet Physical Layer,
Datasheet SNLS165B.
Texas Instruments. 2015. OPA657 1.6-GHz, Low-Noise, FET-Input Operational Amplifier,
Datasheet SBOS197F.
Uchida T. 2008. Hardware-Based TCP Processor for Gigabit Ethernet. IEEE Trans Nucl
Sci, 55(3):1631–1637. doi: 10.1109/TNS.2008.920264.
Vaska P, Bolotnikov A, Carini G, Camarda G, Pratte JF, Dilmanian FA, Park SJ, and James
RB. 2005. Studies of CZT for PET applications. In: Nucl Sci Symp Conf Rec, pp. 2799–2802.
IEEE. doi: 10.1109/NSSMIC.2005.1596916.
Verburg JM and Seco J. 2014. Proton range verification through prompt gamma-ray spectroscopy.
Phys Med Biol, 59(23):7089–7106. doi: 10.1088/0031-9155/59/23/7089.
Verger L, Ouvrier-Buffet P, Mathy F, Montemont G, Picone M, Rustique J, and Riffard C.
2005. Performance of a new CdZnTe portable spectrometric system for high energy applications.
IEEE Trans Nucl Sci, 52(5):1733–1738. doi: 10.1109/TNS.2005.856716.
Verger L, Gros d’Aillon E, Monnet O, Montémont G, and Pelliciari B. 2007. New trends in
γ-ray imaging with CdZnTe/CdTe at CEA-Leti. Nucl Instrum Methods Phys Res A, 571
(1-2):33–43. doi: 10.1016/j.nima.2006.10.023.
Wahl CG, Kaye WR, Wang W, Zhang F, Jaworski JM, King A, Boucher YA, and He Z. 2015.
The Polaris-H imaging spectrometer. Nucl Instrum Methods Phys Res A, 784:377–381.
doi: 10.1016/j.nima.2014.12.110.
Wermes N. 2006. Pixel Vertex Detectors. arXiv:physics/0611075 [physics.ins-det].
Wohsmann J. 2014. Entwurf und Implementierung eines digitalen Signalverarbeitungsalgorithmus
für die Ausleseelektronik eines CdZnTe-Pixeldetektors. Dresden University of
Applied Sciences, Diploma thesis.
Xilinx. 2005. DSP: Designing for Optimal Results. Xcell Publications.
Xilinx. 2014. 7 Series DSP48E1 Slice, UG479 (v1.8).
Xilinx. 2015a. 7 Series FPGAs GTX/GTH Transceivers, UG476.
Xilinx. 2015b. 1G/2.5G Ethernet PCS/PMA or SGMII v15.0, PG047.
Xilinx. 2015c. FIR Compiler v7.2, PG149.
Yuan J, Feng QY, and Wang D. 2012. Design of High-Precision
FIR Filter Based on Verilog HDL. Adv Mat Res, 433-440:5198–5202.
doi: 10.4028/www.scientific.net/AMR.433-440.5198.
Zatrepalek R. 2012. Using FPGAs to Solve Tough DSP Design Challenges. Xcell, 78:
42–47.
Zhou S and Yao L. 2014. Gigabit Ethernet Data Transfer Based on FPGA. In: Trustworthy
Computing and Services: Int Conf ISCTCS, pp. 290–296. Springer Berlin Heidelberg.
doi: 10.1007/978-3-662-43908-1_37.
Acknowledgments
I thank Prof. Dr. Wolfgang Enghardt for entrusting me with this topic and for supervising
the thesis.
I thank Prof. Dr. Uwe Hampel for agreeing to act as second reviewer.
Special thanks go to Prof. Dr. Peter Kaever for his technical supervision of the work and for
the freedom he gave me in pursuing the topic. The opportunities I was given to work out the
research results are of inestimable value.
I would also like to thank Prof. Dr. Jens Schönherr for the constructive collaboration and for
supervising and reviewing the diploma theses.
I warmly thank Bert Lange for his daily support and for the constant and honest exchange
of ideas.
My thanks also go to Marc Berthel for his support through the nights of the experiments at
the particle accelerators and for his help in setting up and testing the detectors. I also thank
Dr. Thomas Kormoll for teaching me about CZT detectors.
I would likewise like to thank Dr. Christian Golnik, Dr. Fernando Hueso-González, and Katja
Römer for their technical support during the beam times.
I also thank Dr. Andree Büchner and Timo Kirschke for sharing their valuable experience in
analog circuit design.
I thank my parents, Dr. Holger Födisch and Astrid Födisch, for everything they took off my
shoulders over the past years.
My greatest thanks go to my (future) wife Doreen, for the support and understanding you have
shown me and for always waiting for me. I also thank you for caring so lovingly for our
children Amelie and Ferdinand.
Appendix 1
Technische Universität Dresden
Medizinische Fakultät Carl Gustav Carus
Doctoral regulations (Promotionsordnung) of 24.10.2014
Declarations for the opening of the doctoral procedure
1. I hereby declare that I have produced the present work without impermissible help from
third parties and without using any aids other than those stated; ideas taken directly
or indirectly from external sources are identified as such.
2. In selecting and evaluating the material and in preparing the manuscript, I received
support from the following persons:
Prof. Dr. W. Enghardt, Prof. Dr. P. Kaever
3. No other persons were involved in the intellectual production of the present work. In
particular, I have not used the services of a commercial doctoral consultant. Third
parties have received neither direct nor indirect monetary benefits from me for work
connected with the content of the submitted dissertation.
4. The work has not previously been submitted in the same or a similar form to another
examination authority, either in Germany or abroad.
5. The contents of this dissertation have been published in the following form:
[1] Födisch P, Lange B, and Kaever P. 2013. Eine Ausleseelektronik für CZT-Detektoren
mit dem RENA-3 IC von Nova R&D. In: 104. Tag Studiengr Elektr Instrum, pp. 135–
143. DESY
[2] Födisch P, Sandmann J, Lange B, and Kaever P. 2014. Taktsynchronisierung
und Zeitmessung in einem verteilten Datenerfassungssystem. In: 105. Tag Studiengr
Elektr Instrum, pp. 238–242. DESY
[3] Födisch P, Lange B, and Kaever P. 2015. Ein VHDL basierter Gigabit Ethernet
Protokollstapel für FPGAs. In: 106. Tag Studiengr Elektr Instrum, pp. 52–76. DESY
[4] Födisch P, Berthel M, Lange B, Kirschke T, Enghardt W, and Kaever P. 2016a.
Charge-sensitive front-end electronics with operational amplifiers for CdZnTe detectors.
J Instrum, 11(09):T09001. doi: 10.1088/1748-0221/11/09/T09001
[5] Födisch P, Lange B, Sandmann J, Büchner A, Enghardt W, and Kaever P. 2016b. A
synchronous Gigabit Ethernet protocol stack for high-throughput UDP/IP applications.
J Instrum, 11(01):P01010. doi: 10.1088/1748-0221/11/01/P01010
[6] Födisch P, Wohsmann J, Lange B, Schönherr J, Enghardt W, and Kaever P. 2016c.
Digital high-pass filter deconvolution by means of an infinite impulse response filter.
Nucl Instrum Methods Phys Res A, 830:484–496. doi: 10.1016/j.nima.2016.06.019
[7] Födisch P, Bryksa A, Lange B, Enghardt W, and Kaever P. 2016d. Implementing
High-Order FIR Filters in FPGAs. arXiv:1610.03360 [cs.DC]
6. I confirm that there have been no previous unsuccessful doctoral procedures.
7. I confirm that I accept the doctoral regulations of the Medizinische Fakultät of the
Technische Universität Dresden.
8. I have taken note of and followed the citation guidelines for dissertations at the
Medizinische Fakultät of the Technische Universität Dresden.
9. I agree to the “Guidelines for safeguarding good scientific practice, for avoiding
scientific misconduct, and for dealing with violations” of the Technische Universität
Dresden.
Appendix 2
I hereby confirm compliance with the following current legal requirements
in the context of my dissertation (items not ticked are not relevant to my dissertation):
[ ] the approving vote of the ethics committee for clinical studies, epidemiological
investigations involving personal data, or matters concerning the Medizinproduktegesetz
(German Medical Devices Act)
[ ] compliance with the provisions of the Tierschutzgesetz (German Animal Welfare Act)
[ ] compliance with the Gentechnikgesetz (German Genetic Engineering Act)
[x] compliance with the data protection regulations of the Medizinische Fakultät and the
Universitätsklinikum Carl Gustav Carus.
