Charge Pump Clock Generation PLL for the Data Output Block of the Upgraded ATLAS Pixel Front-End in 130 nm CMOS by Kruth, A et al.
Charge Pump Clock Generation PLL for the Data Output Block of the Upgraded ATLAS
Pixel Front-End in 130 nm CMOS
A. Kruth^a, G. Ahluwalia^a, D. Arutinov^a, M. Barbero^a, M. Gronewald^a, T. Hemperek^a, M. Karagounis^a,
H. Krueger^a, N. Wermes^a, D. Fougeron^b, M. Menouni^b, R. Beccherle^c, S. Dube^d, D. Ellege^d,
M. Garcia-Sciveres^d, D. Gnani^d, A. Mekkaoui^d, V. Gromov^e, R. Kluit^e, J. Schipper^e.
a University of Bonn, Physics Department, Nussallee 12, 53115 Bonn, Germany
b CPPM, Aix-Marseille Universite Marseille, CNRS/IN2P3, Marseille, France
c INFN, Genova via Dodecaneso 33, IT-16146 Genova, Italy
d LBNL, 1 Cyclotron Road, Berkeley, CA 94720, USA
e NIKHEF, Science Park 105, 1098 XG Amsterdam, Netherlands
kruth@physik.uni-bonn.de
Abstract
FE-I4 is the 130 nm ATLAS pixel IC currently under devel-
opment for upgraded Large Hadron Collider (LHC) luminosi-
ties. FE-I4 is based on a low-power analog pixel array and
digital architecture concepts tuned to higher hit rates [1]. An
integrated Phase Locked Loop (PLL) has been developed that
locally generates a clock signal for the 160 Mbit/s output data
stream from the 40 MHz bunch crossing reference clock. This
block is designed for low power, low area consumption and re-
covers quickly from loss of lock related to single-event tran-
sients in the high radiation environment of the ATLAS pixel
detector. After a general introduction to the new FE-I4 pixel
front-end chip, this work focuses on the FE-I4 output blocks
and on a ﬁrst PLL prototype test chip submitted in early 2009.
The PLL is nominally operated from a 1.2 V supply and consumes
3.84 mW of DC power. Under nominal operating conditions,
the control voltage settles to within 2% of its nominal
value in less than 700 ns. The nominal operating frequency for
the ring-oscillator based Voltage Controlled Oscillator (VCO) is
fVCO = 640 MHz.
The last sections deal with a fabricated demonstrator that pro-
vides the option of feeding the single-ended 80MHz output
clock of the PLL as a clock signal to a digital test logic block
integrated on-chip. The digital logic consists of an eight bit
pseudo-random binary sequence generator, an eight bit to ten
bit coder and a serializer. It processes data with a speed of
160Mbit/s. All dynamic signals are driven off-chip by custom-
made pseudo-LVDS drivers.
I. INTRODUCTION TO THE NEW PIXEL
DETECTOR FRONT-END CHIP
FE-I3 is the pixel detector front-end chip of the current AT-
LAS experiment at the LHC. Simulations have shown that due
to the architecture of this chip, it will suffer from various sources
of inefﬁciency and its performance will degrade signiﬁcantly
with increased LHC luminosities [2]. Furthermore, the sen-
sors of the innermost pixel layers will suffer from severe per-
formance degradation after a few years of operation in the hos-
tile radiation environment close to the interaction point. It is
for these reasons that an international collaboration is already
working on a new silicon detector front-end chip called FE-I4
suitable for LHC upgrades scheduled for 2013 or later. The
ﬁrst upgrade will be the Insertable B-Layer (IBL). As it imposes
complex engineering efforts to disassemble the present detector,
a new layer of pixels will be inserted into the present tracker at
a radius of r ≈ 3.7 cm. A second upgrade will be a full re-
placement of the complete tracker using four to ﬁve pixel layers
between ≈ 3.7 cm and ≈ 25 cm together with silicon strips at
larger radii in about 2020. FE-I4 is meant to serve for both up-
grades.
Among its new features are an increased die area of 18.8 mm ×
20.2 mm but smaller individual pixels of 50 μm × 250 μm. One
front-end chip consists of 336 × 80 pixels. The active area of
the front-end pixel chip has been increased from 75% to 90%.
In order to ﬁt the clustered nature of physical hits, the new ar-
chitecture groups four pixels into one digital region with a ﬁve
deep buffer for local hit storage. The hit processing logic works
in a way that not every hit is sent to the periphery of the chip.
Instead hits are stored locally in the pixel region until the de-
cision about the relevance of the hit is made. This reduces the
trafﬁc on the double column bus by a factor of 400.
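As a quick cross-check of the geometry figures quoted above, the following sketch multiplies out the pixel count and pixel size and compares the result with the die area. The variable names are illustrative, and treating the full pixel matrix as the "active area" is an assumption made here, since the text does not define the term precisely.

```python
# Cross-check of the FE-I4 geometry figures quoted in the text.
# Assumption: "active area" is taken to be the full pixel matrix.

ROWS, COLS = 336, 80                    # pixels per front-end chip
PIXEL_A_MM, PIXEL_B_MM = 0.050, 0.250   # pixel size: 50 um x 250 um
DIE_A_MM, DIE_B_MM = 18.8, 20.2         # die size in mm

n_pixels = ROWS * COLS
pixel_area_mm2 = PIXEL_A_MM * PIXEL_B_MM
matrix_area_mm2 = n_pixels * pixel_area_mm2
die_area_mm2 = DIE_A_MM * DIE_B_MM
active_fraction = matrix_area_mm2 / die_area_mm2

print(f"pixels per chip:   {n_pixels}")
print(f"pixel matrix area: {matrix_area_mm2:.1f} mm^2")
print(f"die area:          {die_area_mm2:.1f} mm^2")
print(f"active fraction:   {active_fraction:.1%}")
```

Under this assumption the ratio comes out slightly below 90%, which is compatible with the rounded figure given in the text.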
FE-I4 will be manufactured in a 130 nm standard CMOS pro-
cess technology. The thin SiO2 gates of the 130 nm technology
node give natural radiation hardness to the transistor devices de-
spite high radiation levels and make the use of enclosed layout
transistors no longer a hard requirement which helps to increase
the packing density.
The output stages of the FE-I4 are located in the periphery of the
chip. The clock signal for the data processing at 160 Mbit/s is
locally generated on-chip by a single ring-oscillator based PLL
and is used in the FE-I4 data output block.
II. PHASE LOCKED LOOP
Figure 1 depicts the block diagram of the PLL with its
main building blocks: Phase Frequency Detector (PFD), Charge
Pump (CP), Loop Filter (LF), differential VCO, Frequency Di-
vider (FD) and Output Buffers (BUF). The architecture is that
of a classic type II charge pump PLL. The advantage of a type
II PLL over a type I PLL is that it provides better correction of
the PLL output for errors at the input. Additionally the loop gain
and stability properties are set independently of each other and the
PFD of a type II PLL does not only detect phase mismatch but





























Figure 1: Schematic block diagram of the PLL. Clock outputs: fout = 640 MHz, 320 MHz, 160 MHz, 80 MHz and 40 MHz.
The nominal VCO oscillation frequency is fVCO =
640 MHz. At the time the design of the PLL started, it had
not been decided whether the 160 Mbit/s front-end output data
would be processed at 160 MHz single-edge or 80 MHz double
edge. The PLL prototype can provide both clock frequencies
derived from fVCO. Besides, the choice of a higher frequency
fVCO eases the task of generating lower frequency outputs with
a clean 50% duty cycle required for double edge data processing.
Furthermore, the physical dimensions of the capacitive elements
required in the LF are smaller (cf. Eq. 1) and the devices
consume less die area. This enables an on-chip integration of the
complete LF without external components. Due to synergy with
other projects, the PLL also provides higher frequency clocks
at fOUT = 320 MHz and fOUT = 640 MHz. The mentioned
benefits come at the price of a slightly increased power consumption
for the VCO and the high frequency divider stages.
The PLL will be located in the periphery of the FE-I4 chip and
the increased power consumption of a single PLL on the chip is
negligible compared to the overall power budget.









where ICP is the charge pump current (cf. Fig. 3), KVCO is
the VCO gain, Rnotch and Cnotch are loop filter elements (cf.
Fig. 4) and N = 16 is the frequency division factor of the loop.
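For orientation, the textbook second-order approximation of a charge pump PLL (see e.g. [3]) relates these quantities as shown below. This is the standard form and not necessarily the exact expression used by the authors. The symbols \omega_n (natural frequency) and \zeta (damping factor) are introduced here for illustration, K_VCO is assumed to be given in rad/(s·V), and the smaller loop filter capacitors are neglected.

```latex
\omega_n = \sqrt{\frac{I_{CP}\,K_{VCO}}{2\pi\,N\,C_{notch}}},
\qquad
\zeta = \frac{R_{notch}}{2}\,\sqrt{\frac{I_{CP}\,K_{VCO}\,C_{notch}}{2\pi\,N}}
```

In this form, for fixed I_CP, K_VCO and \omega_n the required C_notch scales as 1/N, so a larger division factor allows a smaller capacitor.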
A. Phase Frequency Detector and Loss of Lock
Detection
The PFD uses a classical architecture with additional loss of
lock detection circuitry (see Fig. 2). The loss of lock detection
latches the DN signal of the PFD output with the rising edge of
the fFB signal coming from the feedback branch of the control
loop, delayed by a certain time T, and provides the result as
the signal Fb2Fast. Likewise, it latches the UP signal with the
rising edge of the fREF reference clock signal, delayed by T,
and provides the result as the signal Ref2Fast. This delay time
T determines the sensitivity of the loss of lock detection. A loss
of lock resulting in DN = high (or UP = high) for longer than T
(neglecting the propagation delay of a D-flipflop) will cause the
signal Fb2Fast (or Ref2Fast, respectively) to go high, indicating
severe changes in VCTRL. The value for T has to be chosen
large enough in order to prevent
















Figure 2: Schematic of the phase frequency detector and the loss of
lock detection.
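The detection principle described in this section can be captured in a very small behavioural sketch: a PFD output pulse raises the corresponding flag only if it is still high a time T after the clock edge that started it. The function name, the value of T and the pulse widths are hypothetical; the real circuit is a D-flipflop clocked by a delayed edge, and its propagation delay is ignored here.

```python
# Behavioural sketch of one loss-of-lock flag (illustrative only).
# A PFD output (DN or UP) rises on a clock edge; a D-flipflop samples it
# a time T later. The flag is high only if the pulse is still high then.
# DELAY_T_NS and the pulse widths are hypothetical values.

DELAY_T_NS = 2.0  # assumed detection delay T


def flag_is_high(pulse_width_ns: float, delay_t_ns: float = DELAY_T_NS) -> bool:
    """Return True if a PFD pulse lasts longer than the delay T."""
    return pulse_width_ns > delay_t_ns


pulses_ns = {
    "locked (narrow DN pulse)": 0.2,
    "loss of lock (wide DN pulse)": 5.0,
}
flags = {name: flag_is_high(width) for name, width in pulses_ns.items()}

for name, flag in flags.items():
    print(f"{name}: Fb2Fast = {'high' if flag else 'low'}")
```

A larger T makes the flag less sensitive, because only longer PFD pulses are then reported.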
B. Charge Pump
The charge pump uses a differential architecture with a com-
plementary dummy branch (see Fig. 3). Thus the charging and
the discharging current source provide an almost constant current
without switching on or off. While the main branch is controlled
by the UP and the DN signal coming from the PFD, the complementary
branch is controlled by the inverted UP and DN signals. The
inverted signals are delayed by the propagation delay of the
inverters used. The switching transistors M1 to M4 in the charge
pump are minimum size devices and thus the charge injected
into the loop ﬁlter upon breaking the current path is minimized.
As a consequence spikes on VCTRL due to charge injected from














Figure 3: Schematic of the charge pump with its dummy branch.
C. Loop Filter
The first branch of the LF (cf. Fig. 4) with the capacitance
Cpole gives a low-pass characteristic to the control loop. However,
with this frequency pole alone the control loop is unstable.
The second branch of the LF (Rnotch, Cnotch) creates
a frequency notch in order to increase the phase margin of the
open-loop transfer function. As a rule of thumb, Cnotch should
be larger than 10 × Cpole in order to ensure sufficient phase
margin. The third branch of the LF (Rripple, Cripple) forms another
non-dominant frequency pole that filters high frequency
noise on VCTRL. The characteristic frequency response of
the overall control loop can still be considered a second order
system. The sum of all the capacitance values in the LF is
CSUM ≈ 10 pF. All capacitors are vertical natural caps fully
integrated on chip. The die area consumption of the PLL core is






Figure 4: Schematic of the loop ﬁlter.
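A short derivation may help to see where the rule of thumb for the capacitor ratio comes from. The sketch below assumes the common topology in which C_pole sits in parallel with the series combination of R_notch and C_notch, and it neglects the ripple branch. Z(s), \omega_z and \omega_p are symbols introduced here; the series branch shows up as a zero of the filter impedance.

```latex
Z(s) = \frac{1 + s\,R_{notch}\,C_{notch}}
            {s\,(C_{pole} + C_{notch})
             \left(1 + s\,R_{notch}\,\dfrac{C_{pole}\,C_{notch}}{C_{pole} + C_{notch}}\right)}

\omega_z = \frac{1}{R_{notch}\,C_{notch}},
\qquad
\omega_p = \frac{C_{pole} + C_{notch}}{R_{notch}\,C_{pole}\,C_{notch}},
\qquad
\frac{\omega_p}{\omega_z} = 1 + \frac{C_{notch}}{C_{pole}}
```

With C_notch larger than 10 × C_pole, the pole lies more than a factor of 11 above the zero, which leaves room for the phase lead of the zero to build up around the loop crossover frequency.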
D. Differential Voltage-Controlled Oscillator
The VCO consists of three inverters connected as a ring os-
cillator and a fourth inverter that serves as a buffer. The in-
verters are differential pairs loaded with PFET active loads and







Figure 5: Schematic of a VCO inverter stage.
The differential architecture guarantees an oscillator with
50% duty cycle output. Both the differential pairs and the
cross-coupled stages are fed by tail current sources. The con-
trol voltage VCTRL at the output of the LF controls the tail cur-
rent sources directly whereas the PMOS loads are controlled
by the inverted VCTRL. As a result the oscillator can be tuned
over a wide frequency range and an oscillation frequency of
fVCO = 640 MHz is guaranteed for 3σ process variations without
additional external tuning. The implemented VCO design is
a trade-off between an extended VCO tuning range and noise
sensitivity.
E. Frequency Dividers and Output Buffers
The FDs consist of four custom-made divide-by-two toggle
flip-flops. The VCO output frequency of fVCO = 640 MHz is
consecutively divided down to 320 MHz, 160 MHz, 80 MHz and
finally to 40 MHz, equaling a total frequency division factor of
N = 16.
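The divider chain arithmetic can be written out as a minimal sketch. The function and variable names are illustrative and not taken from the design; the last two lines check that both clocking options mentioned earlier (160 MHz single edge, 80 MHz double edge) give the same data rate.

```python
# Sketch of the four-stage divide-by-two chain described in the text.
F_VCO_MHZ = 640.0
N_STAGES = 4


def divider_outputs(f_in_mhz: float, stages: int) -> list[float]:
    """Return the output frequency after each divide-by-two stage."""
    outputs = []
    f = f_in_mhz
    for _ in range(stages):
        f /= 2
        outputs.append(f)
    return outputs


outputs_mhz = divider_outputs(F_VCO_MHZ, N_STAGES)
total_division = 2 ** N_STAGES

# Two ways to clock a 160 Mbit/s stream from the divided clocks:
rate_single_edge = outputs_mhz[1] * 1  # 160 MHz, one bit per clock cycle
rate_double_edge = outputs_mhz[2] * 2  # 80 MHz, two bits per clock cycle

print(outputs_mhz, total_division, rate_single_edge, rate_double_edge)
```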
In the output buffering stages, the differential clock signals from
the dividing chain are converted to single-ended clock signals.
Before the clock signals are sent out of the chip, the lower fre-
quency clock signals are all gated with the 640MHz clock for
clock alignment. It is also possible to disable the lower fre-
quency clocks in order to save dynamic power consumption.
The periphery of the test chip includes silicon proven LVDS
drivers integrated into the pads that send the dynamic signals
off chip [1].
III. INTEGRATED DIGITAL TEST LOGIC
The digital test logic integrated on the fabricated PLL test
chip consists of an eight bit pseudo random binary sequence
generator, an eight bit to ten bit coder and a serializer. The clock
signal for the test logic can either be an external clock or the
80MHz single-ended output of the PLL core. The output data
of the serializer is a 160Mbit/s double data rate bit stream. The
integration of the test logic on-chip provides a built-in self-test
for the PLL output signal integrity. The test logic implemented
resembles a large part of the future FE-I4 data output block.
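To illustrate the kind of data source described here, the sketch below implements an eight bit linear feedback shift register and relates the clock to the data rates. The feedback taps (bits 8, 6, 5, 4 in the usual Fibonacci numbering) are an assumption chosen because they give a maximal-length sequence; the polynomial actually used on the test chip is not stated in the text.

```python
# 8-bit Fibonacci LFSR as a stand-in for the on-chip PRBS generator.
# Assumption: feedback taps [8, 6, 5, 4], a standard maximal-length set.
# The polynomial of the real test chip is not given in the text.


def lfsr8_step(state: int) -> int:
    """Advance the 8-bit register by one shift."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def lfsr8_period(seed: int = 0x01) -> int:
    """Count the shifts needed to return to a non-zero seed."""
    state = lfsr8_step(seed)
    steps = 1
    while state != seed:
        state = lfsr8_step(state)
        steps += 1
    return steps


CLOCK_MHZ = 80.0                          # PLL output used by the test logic
line_rate_mbit = CLOCK_MHZ * 2            # double data rate: both clock edges
payload_rate_mbit = line_rate_mbit * 8 / 10  # 8 data bits per 10 coded bits

print(lfsr8_period(), line_rate_mbit, payload_rate_mbit)
```

A maximal-length eight bit register cycles through all 255 non-zero states before repeating, and the all-zero state is a lock-up state that must be avoided as a seed.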
IV. SIMULATION RESULTS















Figure 6: Settling of VCTRL with 3σ process variations (T = 40 °C; slow, fast and nominal corners).
The simulation is based on a parasitic extraction of the PLL
core with layout parasitic capacitances included. The PLL
VCTRL settles in less than tsettle = 1.5μs in all process corners.
Under nominal conditions VCTRL settles in tsettle ≈ 650 ns to
an accuracy of 2% of its ﬁnal value. In order to investigate
the PLL response to single-event transients, charges of 3 pC in
1.5 ns pulses [5] have been injected into various nodes of the
control loop. Figure 7 shows the settling of VCTRL being interrupted
by a charge injection at t = 900 ns into the very same
node that controls the oscillation frequency of the VCO. Furthermore,
Fig. 7 sketches the reaction of the loss of lock detection.
While VCTRL is rising, the VCO is oscillating too
slowly. Consequently the Ref2Fast signal is high, indicating
that the reference clock is too fast, i.e. that fVCO is too low. When
the charge injection takes place, VCTRL increases drastically,
speeding up the VCO, and thus the Fb2Fast signal changes to
high, indicating that the frequency of the signal coming from




















Figure 7: PLL response to a 3 pC charge injection at t = 900 ns onto
the node that holds VCTRL.
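The size of the injected disturbance can be put into perspective with a rough estimate. The mean current follows directly from the quoted charge and pulse length. The voltage step assumes that the charge ends up distributed over the full loop filter capacitance CSUM ≈ 10 pF, which is a simplification: the momentary excursion on VCTRL can be larger, because part of the capacitance sits behind a series resistor.

```latex
\bar{I} = \frac{Q}{\Delta t} = \frac{3\,\mathrm{pC}}{1.5\,\mathrm{ns}} = 2\,\mathrm{mA},
\qquad
\Delta V_{CTRL} \approx \frac{Q}{C_{SUM}} = \frac{3\,\mathrm{pC}}{10\,\mathrm{pF}} = 0.3\,\mathrm{V}
```

A step of this order is a sizeable fraction of the 1.2 V supply, which is consistent with the loss of lock seen in the simulation.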
From noise simulations the VCO phase noise is
−83.3 dBc/Hz @ 1MHz offset and the noise is dominated by
ﬂicker noise of the bias current sources. The phase noise can
be signiﬁcantly improved to −90.0 dBc/Hz @ 1MHz offset by
enlarging the area of the devices in these bias circuits. The
enlargement of these devices does not affect the total die area
consumption of the PLL core and will be incorporated in future
designs.
V. MEASUREMENT RESULTS
Figure 8 shows the PCB designed for the measurements of
the PLL demonstrator. The trim potentiometers on the right al-
low for a ﬂexible adjustment of bias currents and voltages. The
input reference clock is fed to the SMA connector at the bot-
tom. Next to the SMA connector on the right, jumpers can be
used to enable or disable the different outputs of the test chip.
The connection points for the probe heads are located at the top.
The demonstrator itself is bonded onto the PCB close to a cus-
tom made LVDS transceiver chip that is also bonded onto the












Figure 8: Test PCB designed for measurements on the PLL demonstra-
tor.
For all measurements, the input reference clock has been
supplied by an Agilent 81134A pulser with an rms jitter of 2 ps
according to the data sheet. The oscilloscope used in the measurements
is a Tektronix TDS5104B 5 GS/s, 1 GHz scope with
active differential probes of 1 GHz bandwidth. The equipment
used limits the measurement accuracy for signals with frequencies
higher than 160 MHz. However, it needs to be kept in
mind that the lower frequency clocks are internally generated
from the higher frequency clocks. Thus the encouraging results
for the lower frequency clocks indicate well-functioning
higher frequency clocks. As the output clock measurements are
performed on the PCB, these measurements always include the
performance characteristics of the LVDS drivers integrated into
the output pads of the test chip.
Table 1 summarizes the results obtained for the PLL demonstrator.
The results have been obtained by triggering the scope
on one edge and measuring the time jitter and the frequency jitter
on the consecutive edge (cycle-to-cycle jitter) with the built-in
measurement functions of the scope. The duty cycle has also
been acquired with the measurement functions of the scope.





Table 1: Results obtained for the PLL demonstrator.

Frequency [MHz]             40     40     80     160    320    640
Jitter pk-pk [ps]           44     82     74     94     70     106
σ-Frequency [kHz]           6.5    19     79     258    1710   8100
σ-Period [ps]               4.1    12     12     11     17     20
Duty Cycle Deviation [%]    x      0.24   0.33   0.10   x      x
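The σ-Period and σ-Frequency rows of Table 1 are not independent. For small jitter, a period spread σ_T at frequency f corresponds to a frequency spread of roughly f² · σ_T. The sketch below applies this first-order relation to the tabulated values; the function name is illustrative, and the relation is a standard approximation rather than the scope's documented algorithm.

```python
# First-order link between period jitter and frequency jitter:
#   f = 1/T  ->  sigma_f ~ f^2 * sigma_T   (valid for sigma_T << T)
# Rows: (frequency in MHz, sigma-period in ps, sigma-frequency in kHz),
# all copied from Table 1.
TABLE_1 = [
    (40, 4.1, 6.5),
    (40, 12, 19),
    (80, 12, 79),
    (160, 11, 258),
    (320, 17, 1710),
    (640, 20, 8100),
]


def sigma_f_khz(f_mhz: float, sigma_t_ps: float) -> float:
    """Estimate the frequency spread in kHz from the period spread."""
    f_hz = f_mhz * 1e6
    return f_hz ** 2 * sigma_t_ps * 1e-12 / 1e3


estimates_khz = [sigma_f_khz(f, s_t) for f, s_t, _ in TABLE_1]

for (f, s_t, measured), est in zip(TABLE_1, estimates_khz):
    print(f"{f:4d} MHz: estimated {est:8.1f} kHz, measured {measured} kHz")
```

The estimates agree with the measured σ-Frequency values to within roughly ten percent, which suggests that the two rows describe the same underlying period spread.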
Figure 9 shows the eye diagram of a 160Mbit/s data stream
sent out by the digital test block. The test logic uses the chip in-
ternal single-ended 80MHz clock output of the PLL core. The
shift of the crossing points indicates a deviation of the duty cycle
from the ideal 50%. The deviation is attributed to an asymmetry






Figure 9: 160Mbit/s serialized output data stream of the on-chip digital
test logic using the PLL 80MHz clock output.
The opening of the eye diagram is ≥6.0 ns on the time axis
and 284mV on the voltage axis. The reduction of signal level
on the voltage axis is not related to the PLL characteristics but
to signal overshoot due to off-chip impedance mismatch.
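The horizontal eye opening can be related to the bit period. At 160 Mbit/s one unit interval, written T_UI here as an illustrative symbol, lasts 6.25 ns, so the quoted opening corresponds to at least 96% of the bit period:

```latex
T_{UI} = \frac{1}{160\,\mathrm{Mbit/s}} = 6.25\,\mathrm{ns},
\qquad
\frac{6.0\,\mathrm{ns}}{6.25\,\mathrm{ns}} = 0.96
```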
The tracking range of the VCO is 336 MHz ≤ fVCO ≤
976 MHz. Outside of this range the Fb2Fast or the Ref2Fast
signal, respectively, goes permanently high.
VI. CONCLUSION
A new ATLAS Front-End chip FE-I4 is being developed in
a 130 nm standard CMOS technology for use at upgraded LHC
luminosities, both for the Insertable B-Layer project and Super-LHC.
FE-I4 is based on a low-power analog pixel array and
new digital architecture concepts. After a short introduction to
the new features of the FE-I4 chip, the focus is on the output
stages. In order to handle the expected hit rate, the front-end
will stream data out at 160Mbit/s. A type-II PLL has been
developed to generate the necessary clock signal with a well-
deﬁned duty cycle from the available 40MHz bunch crossing
reference clock. The PLL core draws a low current of 3.2mA
from a 1.2V supply and consumes a die area of only 255μm ×
225μm. The VCO of the PLL is based on a three-stage differen-
tial ring oscillator working at a nominal frequency of 640MHz.
The design trade-offs involved with the choice of a ring oscil-
lator in terms of area, noise and locking range are discussed.
Choosing an oscillation frequency higher than the output fre-
quency for the VCO guarantees a lower area consumption of
the LF capacitors and a well-deﬁned duty cycle handling at the
expense of slightly increased power consumption for the VCO
and the four-stage dividing chain. In the ATLAS experiment,
the PLL will be placed in a hostile radiation environment. In
case of single-event transients due to severe charge injections, a
short settling time to recover from a loss of lock is important.
The presented PLL recovers from any given upset in less than
1.5μs.
A stand-alone PLL test chip has been submitted for fabrication
early in 2009. Among its outputs are clock signals with 80MHz
for double edge data transfer and 160MHz for single edge data
stream out at 160Mbit/s. The differential clock output lines are
driven by integrated LVDS drivers. Simulation results as well as
performance measurements for this test chip are presented and
discussed.
The PLL is equipped with on-chip loss-of-lock detection cir-
cuits. Furthermore, the demonstrator includes a digital block
for 160Mbit/s double data rate output streaming, consisting of
an eight bit pseudo random binary sequence generator, an eight
bit to ten bit coder and a serializer. The integrity of the serial-
ized 160Mbit/s double data rate bit stream generated by the test
logic has been investigated and has been found acceptable. The
first prototype of the complete FE-I4 IC is scheduled for tape
out at the end of 2009.
REFERENCES
[1] M. Karagounis et al., Development of the ATLAS FE-
I4 pixel readout IC for b-layer Upgrade and Super-LHC,
TWEPP’08, 2008, pp.70-75.
[2] D. Arutinov et al., Digital Architecture and Interface of the
new ATLAS Pixel Front-End IC for Upgraded LHC Lu-
minosity, IEEE Transactions on Nuclear Science, Vol.56,
No.2, 2009
[3] B. Razavi, RF Microelectronics, Prentice Hall PTR, ISBN
0-13-887571-5, 1998.
[4] S. Cheng et al., Design and Analysis of an Ultrahigh-
Speed Glitch-Free Fully Differential Charge Pump With
Minimum Output Current Variation and Accurate Match-
ing, IEEE Transactions on Circuits and Systems-II: Ex-
press Briefs, Vol.53, No.9, 2006.
[5] L. Wang et al., An SEU-Tolerant Programmable Fre-
quency Divider, Proceedings of ISQED’07.
