Efficient Parallel Carrier Recovery for Ultrahigh Speed Coherent QAM Receivers with Application to Optical Channels by Gianni, Pablo et al.
Hindawi Publishing Corporation
Journal of Electrical and Computer Engineering
Volume 2013, Article ID 240814, 14 pages
http://dx.doi.org/10.1155/2013/240814
Research Article
Efficient Parallel Carrier Recovery for Ultrahigh Speed Coherent
QAM Receivers with Application to Optical Channels
Pablo Gianni, Laura Ferster, Graciela Corral-Briones, and Mario R. Hueda
Laboratorio de Comunicaciones Digitales, Universidad Nacional de Córdoba (CONICET), Avenida Vélez Sarsfield 1611,
X5016GCA Córdoba, Argentina
Correspondence should be addressed to Pablo Gianni; giannipablo@gmail.com
Received 10 December 2012; Accepted 28 March 2013
Academic Editor: Ashkan Ashrafi
Copyright © 2013 Pablo Gianni et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This work presents a new efficient parallel carrier recovery architecture suitable for ultrahigh speed intradyne coherent optical
receivers (e.g., ≥100Gb/s) with quadrature amplitude modulation (QAM). The proposed scheme combines a novel low-latency
parallel digital phase locked loop (DPLL) with a feedforward carrier phase recovery (CPR) algorithm. The new low-latency parallel
DPLL is designed to compensate not only carrier frequency offset but also frequency fluctuations such as those induced by
mechanical vibrations or power supply noise. Such carrier frequency fluctuations must be compensated since they lead to higher
phase error variance in traditional feedforward CPR techniques, significantly degrading the receiver performance. In order to
enable a parallel-processing implementation in multigigabit per second receivers, a new approximation to the DPLL computation
is introduced. The proposed technique reduces the latency within the feedback loop of the DPLL introduced by parallel processing,
while at the same time it provides a bandwidth and capture range close to those achieved by a serial DPLL. Simulation results
demonstrate that the effects caused by frequency deviations can be eliminated with the proposed low latency parallel carrier recovery
architecture.
1. Introduction
The recent emergence of the updated standards IEEE 802.3
for 40 and 100 gigabit per second (Gb/s) Ethernet and G.709
for 40 and 100Gb/s optical transport network (OTN), as
well as the first commercially available devices implementing
these data rates, reveals the rapid growth in
bandwidth demand over the last decade [1, 2].
The projected increase in bandwidth demand (e.g.,
≥100Gb/s) has set the basis for the next generation of
Ethernet and OTN and has therefore renewed interest
in coherent detection and spectrally efficient modulation
techniques such as 𝑀-ary phase-shift keying (𝑀-PSK) and
𝑀-ary quadrature amplitude modulation (𝑀-QAM). More
precisely, the combination of intradyne coherent detection,
polarization-division multiplexing (PDM), 16-QAM,
and electronic dispersion compensation (EDC) [3, 4] achieves
a good tradeoff among complexity, spectral efficiency, and
minimization of nonlinear distortions, and makes it possible
to compensate completely, with zero penalty, the main fiber
channel impairments [3] (i.e., polarization mode dispersion
(PMD) and chromatic dispersion (CD) [5]). In particular,
intradyne detection is preferred over the alternative heterodyne
or homodyne architectures because it replaces complex
optical phase-locked loops (PLLs) with more robust and
easier to implement digital carrier recovery (CR) techniques.
Taken together, these aspects result in
an improved receiver sensitivity in comparison to intensity
modulation direct detection (IM/DD) schemes [6, 7].
In this context, CPR fulfils a fundamental role in coher-
ent optical receivers [3, 8]. Feedforward phase estimation
schemes such as Viterbi-Viterbi (VV) [9] or blind phase
search (BPS) [10] algorithms have been proposed for optical
coherent receivers, because of their good laser linewidth
tolerance and feasibility for parallel implementation. More
specifically, significant amounts of CD lead to an enhance-
ment of the phase noise introduced by the local oscillator
and a lower tolerance with respect to carrier frequency
offsets. In these feedforward CPR schemes, a perfect com-
pensation of carrier frequency offset is assumed. However,
this condition may not be always satisfied in practice. In fact,
it has been shown that the phase error variance increases
with the frequency offset, degrading the performance of
the feedforward phase estimation stage [11]. Feedforward
techniques to estimate and compensate frequency offset
have been investigated in previous works [12–15]. Moreover,
parallel architectures of these techniques are feasible for
implementation in high-speed receivers. In particular, [15]
is a data-aided (DA) algorithm that uses
training sequences to extend the capture range up to nearly
1/(2𝑇), with 𝑇 being the symbol duration, whereas [12–14] are
non-data-aided (NDA) algorithms with a capture range close to
1/(8𝑇) for 16-QAM.
Although accurate frequency offset estimation and compensation
can be carried out by well-known techniques, a
static frequency offset has been assumed in all these proposals.
As recently demonstrated, transmitter or local
oscillator laser frequency instability caused by mechanical
vibrations significantly degrades the performance of feedforward
CPR algorithms [16]. Other effects such as power supply
noise may also introduce laser frequency fluctuations, which
can be modeled as a frequency modulation with a sinusoid
of large amplitude (e.g., ∼250 MHz) and low frequency (e.g.,
≤35 kHz) [16]. The effectiveness of frequency offset estimation
techniques, such as those mentioned earlier, is limited
by the large amplitude of the modulation signal (i.e., a
large laser frequency change rate). Recent publications have
proposed architectures for compensation of laser frequency
fluctuations when quadrature phase-shift keying modulation
(QPSK) is used [2, 17, 18]. For example, a two-stage carrier
recovery parallel architecture based on a low-latency parallel
DPLL and the feedforward VV CPR algorithm has been
proposed in [17]. This technique offers an excellent tradeoff
between complexity and performance for coherent QPSK
receivers in the presence of laser phase noise, sinusoidal
frequency jitter, and frequency offset. In this work, we gen-
eralize the technique introduced in [17] for application to
𝑀-QAM optical receivers.
As mentioned before, feedforward CPR blocks based on
the VV or BPS algorithms achieve good laser linewidth tol-
erance and overcome some of the latency-related limitations
[8]. We show here that traditional decision directed DPLLs
[19] offer advantages in some aspects of the operation of CPR,
for example, the tracking of large amplitude sinusoidal carrier
frequency jitter experienced by typical lasers. A traditional
PLL is often modeled as a linear filter, an assumption that
is useful to compute the small signal transfer function [19].
However, the PLL is actually a nonlinear filter, which precludes
the use of the unfolding techniques discussed
by Parhi in [20], since they are applicable only to strictly linear filters.
Therefore, a different approach to reduce the latency of the
PLL parallel implementation must be found.
In the present work we introduce a new parallel carrier
recovery algorithm which combines a novel low-latency
parallel DPLL with a traditional feedforward CPR algorithm.
The new low-latency parallel DPLL is used to compensate
not only frequency offset but also frequency fluctuations. The
proposed DPLL approach takes out of the feedback loop as
much processing as possible in order to simplify the loop and
Figure 1: Simplified block diagram of the coherent receiver with
equalization.
reduce its latency. Then, the bottleneck of the critical PLL
feedback path is broken by using a novel approximation to
the DPLL computation, which provides a capture range and
bandwidth close to those achieved by serial DPLLs [17, 21].
Computer simulations demonstrate that the degradations
caused by frequency offset and laser frequency fluctuations
can be eliminated with the proposed parallel carrier recovery
technique. Unlike the superscalar parallelization (SP) meth-
ods [22–25], the technique proposed here does not require
training symbols to avoid the acquisition problem. Moreover,
the buffers required by the SP scheme are completely avoided
in our approach.
The remainder of the paper is organized as follows.
Section 2 presents the system model and analyzes the effects
of the carrier frequency fluctuations on the receiver per-
formance. Section 3 describes the two-stage carrier recovery
technique. Section 4 introduces the new low-latency parallel
DPLL, while numerical results are shown and discussed in
Section 5. Finally, conclusions are drawn in Section 6.
2. System Model
Figure 1 shows a simplified block diagram of the coherent
receiver with electronic dispersion compensation. Then, the
sample at the equalizer output can be expressed as
$$r_n = a_n e^{j\alpha_n} + z_n, \tag{1}$$
where $a_n$ is the $n$th transmitted symbol and $\alpha_n$ is the total
phase noise. Component $z_n$ represents the amplified spontaneous
emission (ASE) noise sample, which is modeled as a
white complex Gaussian random variable with power $\sigma^2$ [3].
The equalized output signal (1) can be rewritten as
$$r_n = |r_n| e^{j\theta_n}, \tag{2}$$
where $|r_n|$ and $\theta_n$ are the magnitude and the phase of the
complex sample $r_n$, respectively. In $M$-PSK and $M$-QAM
systems, the symbol information is contained totally or
partially in the phase of $r_n$, respectively. The received phase
$\theta_n$ can be expressed as
$$\theta_n = \zeta_n + \Omega_c n + \Delta\Omega_n + \phi_n, \tag{3}$$
where $\zeta_n$ is the phase of the transmitted symbol $a_n$ and
$\Omega_c$ is the angular carrier frequency offset given by $\Omega_c = 2\pi T f_c$,
with $f_c$ and $T$ being the carrier frequency offset and
the symbol duration, respectively. Term $\Delta\Omega_n$ represents the
phase change generated by frequency fluctuations. In this
Figure 2: Simplified block diagram of CPR based on the BPS algorithm
with frequency offset compensation.
Figure 3: Block diagram of the Viterbi-Viterbi algorithm.
work we assume that the carrier is modulated by a sinusoidal
interfering signal; therefore
$$\Delta\Omega_n = \frac{A_p}{\Delta f_c} \sin(2\pi T \Delta f_c n), \tag{4}$$
where $A_p$ and $\Delta f_c$ are the amplitude and frequency of the
modulation tone.
Component $\phi_n$ is the total phase noise given by
$$\phi_n = \phi_n^{(\mathrm{laser})} + \phi_n^{(\mathrm{ASE})}, \tag{5}$$
where $\phi_n^{(\mathrm{laser})}$ and $\phi_n^{(\mathrm{ASE})}$ are the laser phase noise and the
ASE generated phase noise, respectively. Laser phase noise is
modeled as a Wiener process as follows:
$$\phi_n^{(\mathrm{laser})} = \sum_{k=-\infty}^{n} \eta_k, \tag{6}$$
where the $\eta_k$ are independent, identically distributed Gaussian
random variables with zero mean and variance $\sigma_\eta^2 = 2\pi T \Delta\nu$,
with $\Delta\nu$ being the laser linewidth [8].
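As an illustration of the phase model in (3)-(6), the following Python sketch generates the frequency offset, sinusoidal jitter, and laser phase noise terms symbol by symbol. The function and argument names are assumptions made for this example, the Wiener sum is started at zero rather than at $k = -\infty$, and the ASE contribution of (5) is left out.

```python
import math
import random

def received_phase_components(n_symbols, T, f_c, A_p, df_c, linewidth, seed=0):
    """Return a list of (offset, jitter, laser) phase terms, one tuple per symbol."""
    rng = random.Random(seed)
    sigma_eta = math.sqrt(2.0 * math.pi * T * linewidth)  # std of eta_k in (6)
    omega_c = 2.0 * math.pi * T * f_c                      # Omega_c in (3)
    laser = 0.0                                            # assumed initial laser phase
    out = []
    for n in range(n_symbols):
        laser += rng.gauss(0.0, sigma_eta)                 # Wiener increment, (6)
        jitter = (A_p / df_c) * math.sin(2.0 * math.pi * T * df_c * n)  # (4)
        out.append((omega_c * n, jitter, laser))
    return out
```

With 1/T = 32 Gs/s, A_p = 250 MHz, and a 35 kHz tone, the jitter term advances by about 2πT·A_p ≈ 0.049 rad per symbol near its zero crossings, which corresponds to a peak frequency deviation equal to A_p.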
2.1. Feedforward CPR. Typical carrier recovery techniques for
coherent optical receivers combine a frequency offset compensation
stage with a feedforward phase estimation
block based on the well-known VV or BPS algorithms (see
Figure 2) [13]. Once the frequency offset is removed, the VV
or BPS block estimates and compensates the phase noise.
Figure 3 shows a simplified block diagram of the VV
algorithm implementation. The VV block estimates the phase
noise based on the $M$th power of the received signal as
follows:
$$\varphi_n = \frac{1}{M}\, \mathcal{U}(\arg\{u_n\}), \tag{7}$$
Figure 4: Block diagram of the blind phase search algorithm.
where $\mathcal{U}$ is the unwrap function and $u_n$ is the output of the
VV estimator given by
$$u_n = \sum_{i=0}^{N-1} (\tilde{r}_{n-i})^M, \tag{8}$$
with $\tilde{r}_n$ being the frequency-compensated sample (see Figure 3) and
$N$ an odd integer which represents the
VV estimator length (see [8] for more details).
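The estimator in (7) and (8) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the averaging window is causal (the centering delay shown in Figure 3 is omitted), and the example symbols lie on the axes so that the $M$th power removes the modulation without leaving a constant offset.

```python
import cmath
import math

def unwrap(phases, period=2.0 * math.pi):
    """Remove jumps larger than half a period between consecutive samples."""
    out = []
    offset = 0.0
    for i, p in enumerate(phases):
        if i > 0:
            jump = p - phases[i - 1]
            offset -= period * round(jump / period)
        out.append(p + offset)
    return out

def viterbi_viterbi(samples, M=4, N=7):
    """Phase estimates following (7)-(8) with a causal window of up to N samples."""
    raw = []
    for n in range(len(samples)):
        window = samples[max(0, n - N + 1): n + 1]  # shorter at the start
        u_n = sum(r ** M for r in window)           # (8)
        raw.append(cmath.phase(u_n))
    return [p / M for p in unwrap(raw)]             # (7)

# Axis-aligned QPSK symbols rotated by a constant phase of 0.1 rad.
symbols = [1, 1j, -1, -1j] * 5
estimates = viterbi_viterbi([s * cmath.exp(1j * 0.1) for s in symbols])
```

For QPSK symbols placed on the diagonals, the same code would return the phase shifted by a constant, which has to be removed separately.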
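An alternative to the VV estimator is the so-called BPS
algorithm shown in Figure 4. The BPS block estimates the
phase noise as follows:
$$\varphi_n = \mathcal{U}\left\{ \arg\min_{\varphi_b} \left\{ g(\tilde{r}_n, \varphi_b) \right\} \right\}, \tag{9}$$
where $\tilde{r}_n$ is the frequency-compensated input sample (see Figure 4) and
$\varphi_b$ is the test phase defined as
$$\varphi_b = \frac{b}{B} \cdot \frac{\pi}{2}, \quad b \in \{0, 1, \ldots, B-1\}, \tag{10}$$
where $B$ is the number of phases to be tested; term $g(\tilde{r}_n, \varphi_b)$
is given by
$$g(\tilde{r}_n, \varphi_b) = s_{n,b} = \sum_{i=0}^{N-1} |d_{n-i,b}|^2 = \sum_{i=0}^{N-1} \left| \tilde{r}_{n-i} e^{-j\varphi_b} - \mathcal{Q}(\tilde{r}_{n-i} e^{-j\varphi_b}) \right|^2, \tag{11}$$
where $\mathcal{Q}(\cdot)$ is the slicer function and $N$ is, again, the estimator
length (see [10] for more details).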
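The search in (9)-(11) can be illustrated with the Python sketch below. It is only a sketch under assumptions: the names are hypothetical, the 16-QAM grid is unnormalised with levels ±1 and ±3, the window is causal, and the unwrap step of (9) is omitted.

```python
import cmath
import math

LEVELS = (-3.0, -1.0, 1.0, 3.0)

def slicer_16qam(x):
    """Q(.): nearest point of an unnormalised 16-QAM grid (assumed levels)."""
    re = min(LEVELS, key=lambda v: abs(x.real - v))
    im = min(LEVELS, key=lambda v: abs(x.imag - v))
    return complex(re, im)

def blind_phase_search(samples, B=32, N=21):
    """For each sample, return the test phase that minimises s_{n,b} in (11)."""
    test_phases = [b / B * math.pi / 2.0 for b in range(B)]   # (10)
    # Squared distance of every rotated sample to its slicer decision.
    dist = []
    for r in samples:
        row = []
        for phi_b in test_phases:
            rot = r * cmath.exp(-1j * phi_b)
            row.append(abs(rot - slicer_16qam(rot)) ** 2)
        dist.append(row)
    best = []
    for n in range(len(samples)):
        lo = max(0, n - N + 1)                                # causal window
        sums = [sum(dist[i][b] for i in range(lo, n + 1)) for b in range(B)]
        b_min = min(range(B), key=lambda b: sums[b])          # arg min in (9)
        best.append(test_phases[b_min])
    return best
```

The inner distance computations for the $B$ test phases are independent of each other, which is what makes the algorithm attractive for parallel hardware at the cost of $B$ slicers per sample.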
Both VV and BPS techniques efficiently compensate the
effects of the laser phase noise. Particularly, the VV architecture
is preferred for 𝑀-PSK modulation schemes because of
its uniform angular spacing and constant modulus between
Figure 5: Performance of VV CPR with QPSK modulation versus
the frequency deviation amplitude ($A_p$) for $\Delta f_c = 35$ kHz, $1/T = 32$ Gs/s, and $\Delta\nu = 250$ kHz.
Horizontal axis: $A_p$ (MHz); vertical axis: OSNR penalty at BER = $10^{-3}$ (dB); curves for VV with $N = 21$, $11$, and $7$.
symbols. Although there exist alternatives that enable the VV
to operate with 𝑀-QAM schemes [26], the BPS algorithm is
preferred because it performs better in the presence of laser
phase noise in spite of its greater computational complexity.
2.2. Effects of Frequency Fluctuations. Mechanical vibrations
cause small deformations of electronic components, such as
the laser cavity, leading to frequency fluctuations (see [16] and
references therein). As stated in the introduction, these
fluctuations can be described as a frequency modulation with
a sinusoidal signal of large amplitude (e.g., $A_p \sim 250$ MHz)
and low frequency (e.g., $\Delta f_c \le 35$ kHz). Without loss of generality,
we consider in this work differential QPSK and 16-QAM
differentially encoded in quadrant. Figures 5 and 6
show the optical signal-to-noise ratio (OSNR) penalty at a
bit-error-rate (BER) of $10^{-3}$ versus the tone amplitude $A_p$
for $\Delta f_c = 35$ kHz. We use the feedforward VV and BPS
CPR schemes depicted in Figures 3 and 4, respectively, with
$1/T = 32$ giga-samples per second (Gs/s), laser linewidth
$\Delta\nu = 250$ kHz, and several values of the estimator length,
$N$. Perfect estimation of the frequency offset is assumed. At
the selected symbol rate, and within the jitter tone amplitude
range of concern, QPSK does not show a significant penalty
when the averaging block length is properly chosen. On
the other hand, note that the performance in the 16-QAM
case deteriorates significantly with the amplitude of the
frequency modulation tone, which agrees with the results reported
in [16]. Notice also that the value of the estimator length that
minimizes the penalty depends on the tone amplitude. This
fact suggests the need for an automatic adjustment algorithm
for $N$.
Figure 6: Performance of BPS CPR with 16-QAM versus the frequency
deviation amplitude ($A_p$) for $\Delta f_c = 35$ kHz, $1/T = 32$ Gs/s,
$\Delta\nu = 250$ kHz, and $B = 32$.
Horizontal axis: $A_p$ (MHz); vertical axis: OSNR penalty at BER = $10^{-3}$ (dB); curves for BPS with $N = 21$, $11$, and $7$.
Figure 7: Simplified block diagram of the two-stage carrier recovery
technique.
3. Carrier Recovery with Compensation of
Frequency Fluctuations
Based on the results shown in Section 2.2, we conclude that
the tracking of frequency fluctuations becomes an essential
task in ultrahigh speed intradyne coherent optical receivers.
Towards this end, a two-stage carrier recovery algorithm
is proposed in this work (see Figure 7). The first CPR stage
is based on a low-latency parallel DPLL, which is used
to compensate not only frequency offset but also carrier
frequency fluctuations. The second CPR stage is based on the
well-known VV [9] or BPS [10] algorithm, which operates on
the signal demodulated by the DPLL. The second CPR stage
is mainly used to compensate the laser phase noise.
Parallel architectures for both stages must be provided
for multigigabit applications. Feedforward phase estimation
schemes such as VV or BPS are attractive for high-speed
coherent receivers owing to their good laser linewidth tolerance
and feasibility for parallel implementation. Nevertheless,
the low-latency parallel DPLL proposed in [17] was
designed for the QPSK format. In the following section, we
generalize the scheme introduced in [17] for application to
𝑀-QAM.
Figure 8: Phase domain digital PLL.
3.1. Phase Domain Digital PLL. We consider a phase domain
DPLL in order to reduce computational complexity. The domain
change replaces complex multipliers
with real adders, which increases the achievable processing
rate of the system, a fundamental aspect in multigigabit
communications where high processing rates are required.
In a decision directed carrier recovery loop (see Figure 8),
the symbol information is first removed [19]. In QPSK
receivers, this operation can be easily carried out in the phase
domain as follows:
$$\tilde{\phi}_n = (\theta_n)_{\pi/2}, \tag{12}$$
where $(\cdot)_H$ denotes modulus $H$. In the absence of phase
noise and frequency deviations (i.e., $\phi_n = 0$ for all $n$ and
$f_c = \Delta f_c = 0$), notice that $\tilde{\phi}_n = (\zeta_n)_{\pi/2} = \pi/4$ for all $n$. A
similar approach can be adopted for $M$-QAM. For example,
for 16-QAM the symbol phase $\zeta_n$ reduced to the first quadrant
results in $(\zeta_n)_{\pi/2} \in \{\arctan(1/3), \pi/4, \arctan(3)\}$. Figure 9
depicts the entire QPSK and 16-QAM constellations in the
complex plane, where the labels $i$ and $q$ stand for the real and
imaginary axes, respectively. Moreover, the shaded areas in
Figure 9 highlight the quadrant reduction given by (12).
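The quadrant reduction of (12) is easy to check numerically. The short Python sketch below reduces the phases of all 16-QAM points modulo π/2 and collects the distinct results; the function name and the unnormalised grid with levels ±1 and ±3 are assumptions made for the example.

```python
import math

def mod_half_pi(angle):
    """(.)_{pi/2}: reduce an angle in radians to the interval [0, pi/2)."""
    return angle % (math.pi / 2.0)

levels = (-3, -1, 1, 3)
# Distinct first-quadrant phases of the 16 constellation points,
# rounded so that floating-point noise does not create duplicates.
reduced = sorted({round(mod_half_pi(math.atan2(q, i)), 9)
                  for i in levels for q in levels})
```

Only three values remain, one for the diagonal points and two for the points of the middle ring, in agreement with the set given in the text.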
The phase at the numerically controlled oscillator (NCO)
output of a type II second-order DPLL (see Figure 8) can be
expressed as
$$\psi_n = \psi_{n-1} + K^{(p)} \epsilon_n + K^{(i)} \bar{\epsilon}_{n-1}, \tag{13}$$
where all addition operations in the following analysis are
modulus $2\pi$, and the constants $K^{(p)}$ and $K^{(i)}$ are the loop proportional
and integral gains, respectively; $\epsilon_n$ is the phase error
given by
$$\epsilon_n = (\tilde{\phi}_n - \psi_{n-1})_{\pi/2} - \rho_n, \tag{14}$$
where $\rho_n$ is the phase of the transmitted symbol reduced
to the first quadrant; that is, $\rho_n = (\zeta_n)_{\pi/2}$. Finally, term $\bar{\epsilon}_{n-1}$ in
(13) is the accumulated phase error given by
$$\bar{\epsilon}_{n-1} = \sum_{k=-\infty}^{n-1} \epsilon_k. \tag{15}$$
Since the symbol phase is not known a priori at the receiver,
we use a tentative decision of the transmitted symbol to
estimate the phase $\rho_n$ as follows:
$$\rho_n \approx f(|r_n|, \hat{\theta}_n), \tag{16}$$
where $\hat{\theta}_n$ is the phase of the demodulated received sample,
reduced to the first quadrant; that is,
$$\hat{\theta}_n = (\theta_n - \psi_{n-1})_{\pi/2} = (\tilde{\phi}_n - \psi_{n-1})_{\pi/2}. \tag{17}$$
Note that $(a+b)_H = ((a)_H + (b)_H)_H$; therefore, since $\tilde{\phi}_n = (\theta_n)_{\pi/2}$,
we can get (17). For example, for QPSK
$$\rho_n \approx f(|r_n|, \hat{\theta}_n) = \frac{\pi}{4} \quad \forall n, \tag{18}$$
while for 16-QAM,
$$\rho_n \approx f(|r_n|, \hat{\theta}_n) = \begin{cases} \pi/4 & \text{if } |r_n| \ge \rho_u \text{ or } |r_n| \le \rho_l, \\ \arctan(1/3) & \text{if } \rho_l < |r_n| < \rho_u \text{ and } \hat{\theta}_n \le \pi/4, \\ \arctan(3) & \text{if } \rho_l < |r_n| < \rho_u \text{ and } \hat{\theta}_n > \pi/4. \end{cases} \tag{19}$$
Figure 10 shows the 16-QAM constellation reduced to
the first quadrant of the complex plane and the decision
boundaries according to (19).
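The tentative decision of (19) maps directly onto a small function. The Python sketch below is illustrative only: the function name is hypothetical, and the two magnitude thresholds are assumed to lie midway between the ring radii √2, √10, and √18 of an unnormalised 16-QAM grid with levels ±1 and ±3, since the text does not state their values.

```python
import math

# Assumed thresholds: midpoints between the three ring radii.
RHO_L = (math.sqrt(2.0) + math.sqrt(10.0)) / 2.0
RHO_U = (math.sqrt(10.0) + math.sqrt(18.0)) / 2.0

def tentative_symbol_phase(magnitude, theta_hat):
    """Estimate the first-quadrant symbol phase from |r_n| and the
    demodulated first-quadrant phase, following the three cases of (19)."""
    if magnitude >= RHO_U or magnitude <= RHO_L:
        return math.pi / 4.0            # inner or outer ring: diagonal symbol
    if theta_hat <= math.pi / 4.0:
        return math.atan(1.0 / 3.0)     # middle ring, lower half of the quadrant
    return math.atan(3.0)               # middle ring, upper half of the quadrant
```

For QPSK the function degenerates to the constant π/4 of (18), so no magnitude or phase comparison is needed.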
3.2. Evaluation of DPLL for Tracking Frequency Fluctuations.
The effectiveness of the decision directed DPLL to track
frequency fluctuations is analyzed in this section.
In our carrier recovery scheme, the serial DPLL is used for
compensation of frequency offset and fluctuations, while a
feedforward CPR block based on the BPS algorithm is used
for phase noise estimation. This carrier recovery architecture
will be denoted as S-DPLL + BPS.
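To make the operation of the serial DPLL concrete, the following Python sketch implements the recursion of (12)-(15) for QPSK, where the symbol phase is the constant π/4 of (18). The function name is hypothetical, the loop starts from a zero NCO phase and a zero accumulator, and the default gains are the serial values listed in Table 1; the sketch is noise-free and is not the simulation setup used for the figures.

```python
import math

HALF_PI = math.pi / 2.0

def serial_dpll_qpsk(theta, k_p=0.12, k_i=0.001):
    """Serial phase-domain type II DPLL for QPSK; returns the phase errors."""
    psi = 0.0   # NCO phase of the previous symbol
    acc = 0.0   # accumulated phase error up to the previous symbol, (15)
    errors = []
    for theta_n in theta:
        phi_tilde = theta_n % HALF_PI                       # (12)
        eps = (phi_tilde - psi) % HALF_PI - math.pi / 4.0   # (14) with (18)
        psi = (psi + k_p * eps + k_i * acc) % (2.0 * math.pi)  # (13)
        acc += eps
        errors.append(eps)
    return errors

# QPSK phases with a 100 MHz frequency offset at 32 Gs/s.
omega = 2.0 * math.pi * 100e6 / 32e9
theta = [math.pi / 4.0 + (n % 4) * HALF_PI + omega * n for n in range(4000)]
errors = serial_dpll_qpsk(theta)
```

Because the loop is of type II, the phase error produced by a constant frequency offset decays towards zero once the integral path has absorbed the offset.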
Figure 11 shows the OSNR penalty versus the modulation
tone amplitude, $A_p$, for $\Delta f_c = 35$ kHz and $\Delta\nu = 250$ kHz.
The BPS filter length is $N = 21$, while the number of test phases
is $B = 32$. Note that the performance degradation caused by
the carrier frequency fluctuation is eliminated with the new
combined S-DPLL + BPS carrier recovery technique.
Figure 12 presents the tolerance of the BPS and S-DPLL + BPS
architectures to the laser phase noise in the presence of a frequency
modulation tone with $A_p = 140$ MHz and $\Delta f_c = 35$ kHz.
These schemes are compared with the BPS algorithm without
the influence of frequency fluctuations (i.e., $A_p = 0$), which
is used as a benchmark. It is interesting to
highlight the significant degradation caused by the frequency
fluctuations in the solution based solely on the BPS algorithm.
Again notice that the effects of the carrier frequency fluctuations
are mitigated by using the proposed S-DPLL + BPS
carrier recovery algorithm.
4. New Low Latency Parallel DPLL for M-QAM
The maximum clock frequency of complex digital signal processors
in state-of-the-art 28 nm CMOS technology is
limited to less than 1 GHz. Thus, the use of parallel processing
techniques for the implementation of multigigabit per second
receivers is mandatory. Unfortunately, the nonlinear filter
Figure 9: QPSK and 16-QAM square constellation. (a) QPSK. (b) 16-QAM.
Figure 10: 16-QAM square constellation after modulus 𝜋/2 operation,
with the lower and upper magnitude bounds and the 𝜋/4 phase bound marked.
nature of the DPLL impedes the use of the unfolding techniques
[20]. Since low latency is a key factor to track frequency
fluctuations, we develop a new approach to reduce
the latency in the parallel implementation of the DPLL.
4.1. Parallel Type II DPLL for M-QAM. From (13) it is possible
to show that
$$\psi_{n+1} = \psi_n + K^{(p)} \epsilon_{n+1} + K^{(i)} \bar{\epsilon}_n = \psi_{n-1} + K^{(p)} (\epsilon_n + \epsilon_{n+1}) + K^{(i)} (\bar{\epsilon}_{n-1} + \bar{\epsilon}_n), \tag{20}$$
Figure 11: Performance of the proposed S-DPLL + BPS carrier recovery
scheme versus the frequency deviation amplitude ($A_p$) for
$\Delta f_c = 35$ kHz, $1/T = 32$ Gs/s, $\Delta\nu = 250$ kHz, $N = 21$, and $B = 32$.
Horizontal axis: $A_p$ (MHz); vertical axis: OSNR penalty at BER = $10^{-3}$ (dB); curves for BPS and S-DPLL + BPS.
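where
$$\epsilon_{n+1} = (\tilde{\phi}_{n+1} - \psi_n)_{\pi/2} - \rho_{n+1} = (\tilde{\phi}_{n+1} - \psi_{n-1} - K^{(p)} \epsilon_n - K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - \rho_{n+1} \tag{21}$$
with $\rho_{n+1}$ given by (16) and
$$\hat{\theta}_{n+1} = (\tilde{\phi}_{n+1} - \psi_{n-1} - K^{(p)} \epsilon_n - K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2}. \tag{22}$$
For the type II second-order DPLL, the steady-state error
is zero (i.e., $\lim_{n\to\infty} \epsilon_n = 0$) [19]. Thus, assuming that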
Figure 12: Laser linewidth tolerance in the presence of vibration-induced
frequency fluctuations for $\Delta f_c = 35$ kHz, $A_p = 140$ MHz,
$1/T = 32$ Gs/s, $N = 21$, and $B = 32$.
Horizontal axis: $\Delta\nu T$; vertical axis: OSNR penalty at BER = $10^{-3}$ (dB); curves for BPS with $A_p = 0$ MHz, BPS with $A_p = 140$ MHz, and S-DPLL + BPS with $A_p = 140$ MHz.
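the bandwidth of the loop is low-to-moderate such that $K^{(p)} \ll 1$,
the contribution of the term $K^{(p)} \epsilon_n$ can be neglected; therefore
the phase error (21) results in
$$\epsilon_{n+1} \approx (\tilde{\phi}_{n+1} - \psi_{n-1} - K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - \rho_{n+1}, \tag{23}$$
where $\rho_{n+1}$ is given by
$$\rho_{n+1} = f(|r_{n+1}|, \hat{\theta}_{n+1}) \tag{24}$$
with
$$\hat{\theta}_{n+1} = (\tilde{\phi}_{n+1} - \psi_{n-1} - K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2}. \tag{25}$$
Furthermore, since the accumulated phase error varies slowly
with time (i.e., $\bar{\epsilon}_n \approx \bar{\epsilon}_{n-1}$), from (20) and (23), we can obtain
$$\psi_{n+1} \approx \psi_{n-1} + K^{(p)} \sum_{k=0}^{1} (\tilde{\phi}_{n+k} - \psi_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - K^{(p)} \sum_{k=0}^{1} \rho_{n+k} + 2 K^{(i)} \bar{\epsilon}_{n-1}, \tag{26}$$
where $\rho_{n+k}$ is given by
$$\rho_{n+k} = f(|r_{n+k}|, \hat{\theta}_{n+k}) \tag{27}$$
with
$$\hat{\theta}_{n+k} = (\tilde{\phi}_{n+k} - \psi_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2}. \tag{28}$$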
Figure 13: Modified phase domain digital PLL.
Figure 14: Implementation of the low latency parallel proportional
loop for 𝑀-QAM.
Let $P$ be the parallelization factor. Following a similar
analysis, it is possible to derive that
$$\psi_{n+m} \approx \psi_{n-1} + K^{(p)} \sum_{k=0}^{m} (\tilde{\phi}_{n+k} - \psi_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - K^{(p)} \sum_{k=0}^{m} \rho_{n+k} + (m+1) K^{(i)} \bar{\epsilon}_{n-1}, \quad m = 0, 1, \ldots, P-1. \tag{29}$$
A type II DPLL can be considered as two separate feedback
loops: the proportional and integral loops (see Figure 13).
Thus, the NCO output (29) can be rewritten as
$$\psi_{n+m} = \psi^{(p)}_{n+m} + \psi^{(i)}_{n+m}, \quad m = 0, 1, \ldots, P-1, \tag{30}$$
where $\psi^{(p)}_{n+m}$ and $\psi^{(i)}_{n+m}$ are the NCO components due to the
proportional and integral paths, respectively.
Figure 15: Implementation of the low latency parallel type II DPLL.
Figure 16: Block diagram of the implementation of (a) “$W_{P-1}$” module and (b) block “$F_k$” for 16-QAM.
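4.2. Proportional Loop. From (29), it is simple to show that
$$\psi^{(p)}_{n+m} \approx \psi^{(p)}_{n-1} + K^{(p)} \sum_{k=0}^{m} (\tilde{\phi}_{n+k} - \psi_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - K^{(p)} \sum_{k=0}^{m} \rho_{n+k}. \tag{31}$$
From (12) and (17), note that
$$(\tilde{\phi}_{n+k} - \psi_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} = (\tilde{\phi}_{n+k} - \psi^{(i)}_{n-1} - \psi^{(p)}_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} = ((\theta_{n+k} - \psi^{(i)}_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2} - \psi^{(p)}_{n-1})_{\pi/2}. \tag{32}$$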
Thus, expression (31) can be rewritten as
$$\psi^{(p)}_{n+m} \approx \psi^{(p)}_{n-1} + K^{(p)} \sum_{k=0}^{m} (\hat{\phi}_{n+k} - \psi^{(p)}_{n-1})_{\pi/2} - K^{(p)} \sum_{k=0}^{m} \rho_{n+k}, \tag{33}$$
where
$$\hat{\phi}_{n+k} = (\theta_{n+k} - \psi^{(i)}_{n-1} - k K^{(i)} \bar{\epsilon}_{n-1})_{\pi/2}. \tag{34}$$
From (32) and (34), note that (28) can be rewritten as
$$\hat{\theta}_{n+k} = (\hat{\phi}_{n+k} - \psi^{(p)}_{n-1})_{\pi/2}. \tag{35}$$
For example, from (18), the NCO output (33) for
QPSK reduces to [17]
$$\psi^{(p)}_{n+m} \approx \psi^{(p)}_{n-1} + K^{(p)} \sum_{k=0}^{m} (\hat{\phi}_{n+k} - \psi^{(p)}_{n-1})_{\pi/2} - K^{(p)} (m+1) \frac{\pi}{4}. \tag{36}$$
Unfortunately, for $M$-QAM, (33) is still too complex
to be implemented with digital signal processors in
state-of-the-art 28 nm CMOS technology, because of
the complexity required to carry out in one clock cycle the
computation of the function $\rho_{n+k} = f(|r_{n+k}|, \hat{\theta}_{n+k})$ and then
the last summation in (33). This problem can be mitigated
if terms $f(|r_{n+k}|, \hat{\theta}_{n+k})$ are precomputed by using the NCO
output of the previous clock cycle; that is,
$$\hat{\theta}_{n+k} \approx (\hat{\phi}_{n+k} - \psi^{(p)}_{n-1-P})_{\pi/2}. \tag{37}$$
As we shall show later, the performance degradation
caused by (37) is negligible in practical situations (e.g., 16-QAM
with $P \le 80$). For 16-QAM, this behavior can be understood
from the facts that (i) only the nondiagonal symbols use
$\hat{\theta}_{n+k}$ (see (19)) and (ii) laser frequency fluctuations are slow
compared to the baud rate.
A low-latency parallel implementation of the proportional
loop can be easily derived from (33)–(37). Figure 14
shows the architecture of the low-latency parallel type I
DPLL. Block “$F_k$” ($k = 0, 1, \ldots, P-1$) computes terms
$\rho_{n+k} = f(|r_{n+k}|, \hat{\theta}_{n+k})$ with $\hat{\theta}_{n+k}$ given by (37), while block
“$W_k$” evaluates the summations of (33). Block “$W_k$” ($k = 0, 1, \ldots, P-1$)
uses a fast adder (e.g., a Wallace tree and carry
save adder [20]) to quickly calculate the NCO output (33).
Furthermore, the gain $K^{(p)}$ is assumed to be a power of 2 (i.e.,
$K^{(p)} = 2^{-N_p}$ with $N_p$ being a positive integer). In this way,
multiplications by the proportional gain $K^{(p)}$ are reduced to
simple bit shift operations. Again note that all additions in
(33) are modulus $2\pi$.
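The structure of the parallel proportional loop is easiest to see in the QPSK case of (36). The Python sketch below computes the $P$ NCO outputs of one block from the single NCO value of the previous block. It is a software illustration under assumptions: the names are hypothetical, the input phases are taken to be already corrected by the integral path as in (34), and the running sum stands in for the adder trees of the “W” blocks.

```python
import math

HALF_PI = math.pi / 2.0

def parallel_proportional_block(phi_hat, psi_prev, n_p=6):
    """Proportional-path NCO outputs for one block of P symbols, QPSK case (36).

    phi_hat  -- the P first-quadrant phases of the block, as in (34)
    psi_prev -- proportional NCO output of the last symbol of the previous block
    n_p      -- exponent of the power-of-two gain, K = 2**-n_p
    """
    k_p = 2.0 ** (-n_p)
    outputs = []
    running = 0.0
    for phi in phi_hat:
        # Every term uses the same psi_prev, so the P terms are mutually
        # independent and can be evaluated in parallel in hardware.
        running += (phi - psi_prev) % HALF_PI - math.pi / 4.0
        outputs.append((psi_prev + k_p * running) % (2.0 * math.pi))
    return outputs
```

The key point is that no output of the current block feeds back into the computation of another output of the same block, which removes the symbol-by-symbol dependency of the serial recursion.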
4.3. Integral Loop. On the other hand, from (29) and
Figure 13, we can also derive the NCO component due to the
integral path as follows:
$$\psi^{(i)}_{n+m} \approx \psi^{(i)}_{n-1} + (m+1) K^{(i)} \bar{\epsilon}_{n-1}. \tag{38}$$
Figure 17: Frequency response of the serial and low-latency parallel
DPLL. Horizontal axis: frequency (Hz); vertical axis: magnitude (dB); curves for $P = 1$, $32$, $64$, and $80$.
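The accumulated phase error can be expressed as
$$\bar{\epsilon}_{n+m} = \sum_{k=-\infty}^{n+m} \epsilon_k = \bar{\epsilon}_{n-1} + \sum_{k=n}^{n+m} \epsilon_k. \tag{39}$$
Based on (12), (14), (30), (34), and (38), the phase error
terms in (39) can be evaluated as
$$\epsilon_k = (\tilde{\phi}_k - \psi^{(p)}_{k-1} - \psi^{(i)}_{k-1})_{\pi/2} - \rho_k = (\hat{\phi}_k - \psi^{(p)}_{k-1})_{\pi/2} - \rho_k. \tag{40}$$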
4.4. Parallel Architecture of the New DPLL. A parallel implementation
of the type II DPLL can be easily achieved as
depicted in Figure 15. Term $L = lP$, with $l$ being a positive
integer, represents the latency required to compute all
the operations of the integral path (e.g., the phase error
computation (PEC) defined in (40)). Since the latency in this
path is not as critical as in the proportional loop, its effect
on the DPLL performance will be negligible, as we will show
in the next section. Similarly to $K^{(p)}$, the integral gain $K^{(i)}$ is
assumed to be a power of 2 (i.e., $K^{(i)} = 2^{-N_i}$ with $N_i$ being a
positive integer).
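Restricting both loop gains to powers of two means that each gain multiplication becomes an arithmetic right shift of a fixed-point word. The Python sketch below illustrates the idea; the 16 fractional bits and the helper names are assumptions chosen for the example and are not taken from the implementation described here.

```python
FRAC_BITS = 16  # assumed number of fractional bits of the phase word

def to_fixed(x):
    """Quantise a phase in radians to a signed fixed-point integer."""
    return round(x * (1 << FRAC_BITS))

def scale_by_gain(error_fixed, shift):
    """Multiply a signed fixed-point error by 2**-shift with an arithmetic shift."""
    return error_fixed >> shift

scaled = scale_by_gain(to_fixed(0.25), 6)   # 0.25 rad times 2**-6
```

Note that the shift rounds towards minus infinity for negative values, so a hardware design may add a rounding constant before shifting if a symmetric error is required.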
Figure 16(a) shows a possible implementation of the block
“$W_{P-1}$”, and Figure 16(b) depicts an example of a tentative
implementation of the “$F_k$” block based on look-up tables for
16-QAM.
5. Numerical Results
In this section we evaluate the effectiveness of the proposed
two-stage CPR. We use 16-QAM differentially encoded in
quadrant on a nondispersive noisy channel with $1/T = 32$ Gs/s.
The OSNR at a given bit-error-rate (i.e., BER of $10^{-3}$)
Figure 18: Surface of performance of the treated CPR algorithms under the joint effect of laser phase noise and frequency fluctuation. (a)
BPS and (b) P-DPLL + BPS schemes with $\Delta f_c = 35$ kHz, $1/T = 32$ Gs/s, $P = 64$, $N = 21$, and $B = 32$.
Panels: (a) BPS 16-QAM; (b) P-DPLL + BPS 16-QAM with $P = 64$. Horizontal axes: $\Delta\nu T$ ($\times 10^{-4}$); vertical axes: $A_p$ (MHz).
Table 1: DPLL parameters.
Parallelism   $K^{(p)}$    $K^{(i)}$    Processing rate
1             0.12         0.001        32 GHz
32            $2^{-5}$     $2^{-10}$    1 GHz
64            $2^{-6}$     $2^{-10}$    500 MHz
80            $2^{-6}$     $2^{-10}$    400 MHz
is also used as a measure of the efficiency of the proposed CR
loop. Two different type II DPLLs were simulated for comparison
purposes: the already mentioned serial DPLL (S-DPLL)
and the proposed low-latency parallel DPLL (P-DPLL) shown
in Figure 15 with different parallelization factors. Moreover,
the BPS algorithm with filter length $N = 21$ and $B = 32$ test
phase values was considered.
The frequency responses of the DPLLs are depicted in
Figure 17. The loop filter gains were selected in order to obtain
maximum bandwidth with 0.5 dB maximum peaking (see
Table 1). For the optical system considered here, these values
of bandwidth and peaking provide a good tradeoff between
capture range and the residual phase noise power at the input
of the slicer (see Figure 1).
Frequency offset values in intradyne receivers (i.e., ±5 GHz; see [28]) exceed the maximum theoretical limit of 1/(8𝑇) [27] that can be reached by decision-directed algorithms at the considered symbol rate. Typical intradyne coherent optical receivers are therefore provided with a coarse carrier frequency recovery (CCFR) stage [2] that reduces this frequency offset, ideally to zero, so that it falls within the theoretical range. However, the residual frequency offset after CCFR can surpass the tolerance of CPR algorithms such as the VV and the one considered in this work, that is, BPS. The capture range of the proposed P-DPLL is ∼±4 GHz, which is close to the maximum theoretical frequency offset value for the given symbol rate (i.e., 1/(8𝑇) = 4 GHz). Gear shifting is applied to the proportional and integral gains during the capture period.
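As a quick check of the figures quoted above, with 1/𝑇 = 32 Gs/s the decision-directed limit evaluates to a value below the ±5 GHz offset range, which is why a coarse stage is needed ahead of the loop:

```latex
\frac{1}{8T} = \frac{32\ \text{GHz}}{8} = 4\ \text{GHz} \;<\; 5\ \text{GHz}
```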
Figure 18(a) shows the BPS CPR tolerance to the joint effect of the laser phase noise and the sinusoidal frequency tone amplitude, 𝐴𝑝. Figure 18(b) depicts the performance of the combined architecture P-DPLL + BPS with 𝑃 = 64 under the same conditions. Figure 18(b) shows a significant improvement in sinusoidal frequency tolerance for the combined architecture relative to the single-stage CPR based solely on BPS. This improvement is evidenced by the increase of the contour line slope: the contour lines become parallel to the 𝐴𝑝 axis, that is, the performance becomes independent of 𝐴𝑝.
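The two impairments swept in Figure 18 can be generated as phase trajectories. The sketch below assumes the usual Wiener model for laser phase noise (per-symbol variance 2πΔν𝑇) and a frequency fluctuation of the form 𝐴𝑝 sin(2πΔ𝑓𝑐 𝑡) integrated into phase. The function name, the chosen values Δν𝑇 = 10⁻⁴ and 𝐴𝑝 = 100 MHz, and the sample count are illustrative, not taken from the simulation setup of this paper.

```python
import numpy as np


def impairment_phase(n, symbol_rate, linewidth_t, a_p, f_c, rng):
    """Return (wiener_phase, sinusoidal_phase), each of length n, in radians."""
    t_sym = 1.0 / symbol_rate
    # Laser phase noise: Wiener process with per-symbol variance 2*pi*(dnu*T).
    steps = rng.normal(0.0, np.sqrt(2 * np.pi * linewidth_t), size=n)
    wiener = np.cumsum(steps)
    # Sinusoidal frequency fluctuation a_p*sin(2*pi*f_c*t), integrated
    # numerically (rectangular rule) into a phase trajectory.
    k = np.arange(n)
    delta_f = a_p * np.sin(2 * np.pi * f_c * k * t_sym)
    sinusoid = np.cumsum(2 * np.pi * delta_f * t_sym)
    return wiener, sinusoid


rng = np.random.default_rng(1)
wiener, sinusoid = impairment_phase(
    n=300_000, symbol_rate=32e9, linewidth_t=1e-4, a_p=100e6, f_c=35e3, rng=rng
)
# Largest symbol-to-symbol phase change caused by the frequency tone.
max_step = np.max(np.abs(np.diff(sinusoid)))
```

The largest per-symbol phase step of the tone is 2π𝐴𝑝𝑇, which grows linearly with 𝐴𝑝. A feedforward estimator that averages over many symbols has to absorb this drift within its window, which is consistent with the sensitivity of the BPS-only scheme to 𝐴𝑝 seen in Figure 18(a).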
Figure 19 complements the current study for several values of the parallelization factor under the same conditions detailed earlier. In particular, Figure 19 shows the performance of the two-stage CPR architecture, the DPLL in conjunction with the BPS algorithm, using the 16-QAM scheme. From the present study it is possible to derive Figure 20, where the efficiency of the proposed approximation for the parallelization of the DPLL is evidenced. Even though the 16-QAM format seems to be sensitive to the effect of the parallelization factor, the performance remains constant over a wide range of the parallelization axis, and the penalty increases only for large values of laser linewidth (i.e., Δν𝑇).
5.1. Impact of Decision Errors. The impact of the decision
errors in terms of the variance of the estimated phase is
analyzed for two different PLLs with the same bandwidth
against the modified Cramer-Rao bound (MCRB) [29]. The
Cramer-Rao lower bound (CRLB) can be considered as
a fundamental limit on the performance that a linearized
system can reach in the absence of decision errors [30].
[Figure 19: four contour plots; horizontal axis Δν𝑇 (×10⁻⁴), vertical axis 𝐴𝑝 (MHz). Panel (a): S-DPLL + BPS 16-QAM. Panel (b): P-DPLL + BPS 16-QAM with 𝑃 = 32. Panel (c): P-DPLL + BPS 16-QAM with 𝑃 = 64. Panel (d): P-DPLL + BPS 16-QAM with 𝑃 = 80.]
Figure 19: Surface of performance of the P-DPLL + BPS architecture under the joint effect of laser phase noise and frequency fluctuation for several parallelization factors with Δ𝑓𝑐 = 35 kHz, 1/𝑇 = 32 Gs/s, 𝑁 = 21, and 𝐵 = 32.
In other words, the optimum theoretical bound is achieved
under the simplifying assumption that the additive noise
does not affect the receiver decisions about the data symbols.
Simulation results for (i) the serial DPLL (S-DPLL) and (ii)
the parallel DPLL (P-DPLL) with a parallelization factor of
𝑃 = 80 are shown in Figure 21(a).
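For reference, the textbook form of the MCRB for carrier phase estimation (see, e.g., [29, 30]) is reproduced below. The symbols 𝐿 (number of observed symbols), 𝐸𝑠/𝑁₀ (symbol signal-to-noise ratio), and 𝐵𝐿 (one-sided loop noise bandwidth) are not defined in this section and are introduced here only for illustration; the mapping from OSNR to 𝐸𝑠/𝑁₀ is not shown.

```latex
\mathrm{MCRB}(\theta) = \frac{1}{2L}\,\frac{1}{E_s/N_0},
\qquad
L_{\mathrm{eq}} = \frac{1}{2 B_L T}
\;\Rightarrow\;
\sigma_\theta^2 \ge \frac{B_L T}{E_s/N_0}
```

Under this equivalence, two loops with the same noise bandwidth share the same bound, so any gap between the simulated curves and the MCRB reflects effects outside the linearized model, such as decision errors.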
At the OSNR regime of interest in the application considered in our work (i.e., PDM-16-QAM, 1/𝑇 = 32 Gs/s, BER < 10⁻² → OSNR > 18 dB), it can be observed that the phase noise variance in the proposed parallel DPLL is slightly higher than that experienced in a serial DPLL. Nevertheless, the impact of this phase variance increase on the bit-error rate (BER) is practically negligible (see Figure 21(b)). Finally, it is important to highlight that catastrophic errors caused by cycle slips are avoided in the proposed carrier recovery architecture by using differential 16-QAM [11].
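The effect of quadrant differential encoding on cycle slips can be illustrated with a small sketch. It assumes one common form of the scheme: two bits select a quadrant step that is accumulated modulo 4, and two bits select a first-quadrant point that is rotated into the current quadrant. The mapping of bits to points and all function names are hypothetical and not taken from [11].

```python
import numpy as np

# Hypothetical mapping of the two "inner" bits to first-quadrant 16-QAM points.
Q1_POINTS = np.array([1 + 1j, 1 + 3j, 3 + 1j, 3 + 3j])


def rot(q):
    """Complex rotation by q * 90 degrees."""
    return np.exp(1j * np.pi / 2 * np.asarray(q))


def encode(quad_steps, inner_idx):
    """Accumulate quadrant steps modulo 4 and rotate the inner point."""
    q = np.cumsum(quad_steps) % 4
    return Q1_POINTS[inner_idx] * rot(q)


def decode(symbols):
    """Recover quadrant steps (differentially) and inner point indices."""
    q_hat = np.floor(np.angle(symbols) / (np.pi / 2)).astype(int) % 4
    derot = symbols * rot(-q_hat)             # bring symbol back to quadrant 1
    inner_hat = np.argmin(np.abs(derot[:, None] - Q1_POINTS[None, :]), axis=1)
    steps_hat = np.diff(q_hat, prepend=0) % 4
    return steps_hat, inner_hat


rng = np.random.default_rng(2)
steps = rng.integers(0, 4, size=200)
inner = rng.integers(0, 4, size=200)
rx = encode(steps, inner)
rx[100:] *= 1j                                # 90-degree cycle slip at symbol 100

steps_hat, inner_hat = decode(rx)
step_errors = np.flatnonzero(steps_hat != steps)
inner_errors = np.flatnonzero(inner_hat != inner)
```

In this noiseless example the slip corrupts only the quadrant step at the symbol where it occurs. Without differential encoding, every symbol after the slip would be decoded in the wrong quadrant.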
[Figure 20: two line plots of OSNR penalty at BER 1e−3 (dB) versus parallelism. Panel (a): 𝐴𝑝 = 300 MHz, with curves for Δν𝑇 = 0, 8e−06, 1e−04, and 2e−04. Panel (b): Δν𝑇 = 2e−04, with curves for 𝐴𝑝 = 0, 100, 200, and 300 MHz.]
Figure 20: Two-stage CPR performance versus the parallelization factor. P-DPLL + BPS with Δ𝑓𝑐 = 35 kHz, 1/𝑇 = 32 Gs/s, 𝑁 = 21, and 𝐵 = 32 using 16-QAM scheme.

[Figure 21: two line plots versus OSNR (dB). Panel (a): phase variance (rad², ×10⁻³), with curves for S-DPLL (simulation), P-DPLL with 𝑃 = 80 (simulation), and MCRB. Panel (b): log₁₀(BER), with curves for S-DPLL (simulation), P-DPLL with 𝑃 = 80 (simulation), and theory.]
Figure 21: (a) Decision errors impact on the P-DPLL performance against the modified Cramer-Rao lower bound, 1/𝑇 = 32 Gs/s, using 16-QAM scheme. (b) Bit-error-rate of the DPLLs under analysis.

6. Conclusion
A new DPLL-based carrier recovery architecture for high speed optical coherent receivers has been introduced in this paper. The proposed parallel scheme builds upon a novel DPLL computation, which breaks the bottleneck of the feedback path. We have shown here a novel approach that leads to a simple parallel implementation. Furthermore, it has also been demonstrated that the new parallel DPLL can provide a bandwidth and capture range similar to those achieved by the serial DPLL.
The proposed two-stage carrier recovery architecture, based on a low-latency parallel DPLL and a feedforward phase estimator (BPS), offers a low-complexity, high-performance, integral solution to frequency and phase compensation in coherent optical systems. This solution outperforms previously proposed architectures when all optical channel impairments present in real applications, including laser phase noise, sinusoidal frequency jitter, and frequency offset, are accounted for in the modeling.
Acknowledgment
This paper has been supported in part by the ANPCyT (PICT2011-2527), MINCyT, Fundación Tarpuy, and Fundación Fulgor.
References
[1] P. Winzer, “Beyond 100G ethernet,” IEEE Communications
Magazine, vol. 48, no. 7, pp. 26–30, 2010.
[2] D. Crivelli, M. Hueda, H. Carrer et al., “A 40nm CMOS single-
chip 50Gb/s DP-QPSK/BPSK transceiver with electronic dis-
persion compensation for coherent optical channels,” in Pro-
ceedings of the IEEE International Solid-State Circuits Conference
Digest of Technical Papers (ISSCC ’12), pp. 328–330, February
2012.
[3] D. E. Crivelli, H. S. Carrer, and M. R. Hueda, “Adaptive digital
equalization in the presence of chromatic dispersion, PMD, and
phase noise in coherent fiber optic systems,” in Proceedings of the
IEEE Global Telecommunications Conference (GLOBECOM ’04),
vol. 4, pp. 2545–2551, December 2004.
[4] M. Kuschnerov, F. N. Hauske, K. Piyawanno et al., “DSP for
coherent single-carrier receivers,” Journal of Lightwave Technol-
ogy, vol. 27, no. 16, pp. 3614–3622, 2009.
[5] G. P. Agrawal, Fiber-Optic Communication Systems, Wiley-
Interscience, 2nd edition, 1997.
[6] O. E. Agazzi, M. R. Hueda, H. S. Carrer, and D. E. Crivelli,
“Maximum-likelihood sequence estimation in dispersive opti-
cal channels,” Journal of Lightwave Technology, vol. 23, no. 2, pp.
749–763, 2005.
[7] O. E. Agazzi, M. R. Hueda, D. E. Crivelli et al., “A 90 nm CMOS
DSP MLSD transceiver with integrated AFE for electronic
dispersion compensation of multimode optical fibers at 10
Gb/s,” IEEE Journal of Solid-State Circuits, vol. 43, no. 12, pp.
2937–2957, 2008.
[8] M. Taylor, “Phase estimation methods for optical coherent
detection using digital signal processing,” Journal of Lightwave
Technology, vol. 27, no. 7, pp. 901–914, 2009.
[9] A. Viterbi, “Nonlinear estimation of PSK-modulated carrier
phase with application to burst digital transmission,” IEEE
Transactions on Information Theory, vol. 29, no. 4, pp. 543–551,
1983.
[10] T. Pfau, S. Hoffmann, and R. Noé, “Hardware-efficient coherent
digital receiver concept with feedforward carrier recovery for
M-QAM constellations,” Journal of Lightwave Technology, vol.
27, no. 8, pp. 989–999, 2009.
[11] E. Ip and J. Kahn, “Feedforward carrier recovery for coherent
optical communications,” Journal of Lightwave Technology, vol.
25, no. 9, pp. 2675–2692, 2007.
[12] I. Fatadin and S. Savory, “Compensation of frequency offset for
16-QAM optical coherent systems using QPSK partitioning,”
IEEE Photonics Technology Letters, vol. 23, no. 17, pp. 1246–1248,
2011.
[13] H. Leng, S. Yu, X. Li et al., “Frequency offset estimation for
optical coherent m-QAM detection using chirp z-transform,”
IEEE Photonics Technology Letters, vol. 24, no. 9, pp. 787–789,
2012.
[14] S. Dris, I. Lazarou, P. Bakopoulos, and H. Avramopoulos, “Fre-
quency offset estimation in m-QAM coherent optical systems
using phase entropy,” in Proceedings of the Conference on Lasers
and Electro-Optics (CLEO ’12), pp. 1–2, May 2012.
[15] X. Zhou, X. Chen, and K. Long, “Wide-range frequency offset
estimation algorithm for optical coherent systems using train-
ing sequence,” IEEE Photonics Technology Letters, vol. 24, no. 1,
pp. 82–84, 2012.
[16] M. Kuschnerov, K. Piyawanno, M. S. Alfiad, B. Spinnler, A.
Napoli, and B. Lankl, “Impact of mechanical vibrations on laser
stability and carrier phase estimation in coherent receivers,”
IEEE Photonics Technology Letters, vol. 22, no. 15, pp. 1114–1116,
2010.
[17] P. Gianni, G. Corral-Briones, C. Rodriguez, H. Carrer, and
M. Hueda, “A new parallel carrier recovery architecture for
intradyne coherent optical receivers in the presence of laser
frequency fluctuations,” in Proceedings of the Global Telecommunications
Conference (GLOBECOM ’11), pp. 1–6, 2011.
[18] N. Stojanovic, Y. Zhao, B. Mao, C. Xie, F. N. Hauske, and M.
Chen, “Robust carrier recovery in polarization division multi-
plexed receivers,” in Proceedings of the Optical Fiber Com-
munication Conference, Technical Digest (Optical Society of
America), Los Angeles, Calif, USA, March 2012.
[19] E. A. Lee and D. G. Messerschmitt, Digital Communication,
KAP, 1st edition, 1992.
[20] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and
Implementation, Wiley-Interscience, 1999.
[21] P. Gianni, H. S. Carrer, G. Corral-Briones, and M. R. Hueda,
“A novel low-latency parallel architecture for digital PLL with
application to ultra-high speed carrier recovery systems,” in
Proceedings of the 7th Southern Conference on Programmable
Logic (SPL ’11), pp. 31–36, April 2011.
[22] K. Piyawanno, M. Kuschnerov, B. Spinnler, and B. Lankl, “Low
complexity carrier recovery for coherent QAM using super-
scalar parallelization,” in Proceedings of the 36th European Con-
ference and Exhibition on Optical Communication (ECOC ’10),
pp. 1–3, September 2010.
[23] X. Zhou and Y. Sun, “Low-complexity, blind phase recov-
ery for coherent receivers using QAM modulation,” in Pro-
ceedings of the Optical Fiber Communication Conference and
Exposition and the National Fiber Optic Engineers Conference
(OFC/NFOEC ’11), pp. 1–3, March 2011.
[24] Q. Zhuge, M. E. Mousa-Pasandi, X. Xu et al., “Linewidth-
tolerant low complexity pilot-aided carrier phase recovery for
m-QAM using superscalar parallelization,” in Proceedings of
the Optical Fiber Communication Conference, Technical Digest
(Optical Society of America), Los Angeles, Calif, USA, March
2012.
[25] Q. Zhuge, M. Morsy-Osman, X. Xu et al., “Pilot-aided carrier
phase recovery for m-QAM using superscalar parallelization
based PLL,” Optics Express, vol. 20, no. 17, pp. 599–519, 2012.
[26] I. Fatadin, D. Ives, and S. Savory, “Laser linewidth tolerance for
16-QAM coherent optical systems using QPSK partitioning,”
IEEE Photonics Technology Letters, vol. 22, no. 9, pp. 631–633,
2010.
[27] D. G. Messerschmitt, “Frequency detectors for PLL acquisition
in timing and carrier recovery,” IEEE Transactions on Commu-
nications Systems, vol. 27, no. 9, pp. 1288–1295, 1979.
[28] Z. Tao, L. Li, L. Liu et al., “Improvements to digital carrier phase
recovery algorithm for High-Performance optical coherent
receivers,” IEEE Journal of Selected Topics in Quantum Elec-
tronics, vol. 16, no. 5, pp. 1201–1209, 2010.
[29] A. N. D’Andrea, U. Mengali, and R. Reggiannini, “The modified
Cramer-Rao bound and its application to synchronization
problems,” IEEE Transactions on Communications, vol. 42, no. 2,
pp. 1391–1399, 1994.
[30] H. Meyr, M. Moeneclaey, and S. A. Fechtel, Digital Communication
Receivers, Synchronization, Channel Estimation, and Signal
Processing, Wiley-Interscience, 2nd edition, 1997.