Joint symbol and chip synchronization for a burst-mode-communication superregenerative MSK receiver by López Riera, Alexis et al.
1260 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS–I: REGULAR PAPERS, VOL. 64, NO. 5, MAY 2017
Joint Symbol and Chip Synchronization
for a Burst-Mode-Communication
Superregenerative MSK Receiver
Alexis López-Riera, Francisco del Águila-López, Pere Palá-Schönwälder, Member, IEEE,
Jordi Bonet-Dalmau, Member, IEEE, Rosa Giralt-Mas, and F. Xavier Moncunill-Geniz
Abstract— In this paper we describe a superregenerative (SR)
MSK receiver able to operate in a burst-mode framework where
synchronization is required for each packet. The receiver is based
on an SR oscillator which provides samples of the incoming
instantaneous phase trajectories. We develop a simple yet effec-
tive technique to achieve joint chip and symbol synchronization
within the time limits of a suitable preamble. We develop some
general results and focus on the case of the IEEE 802.15.4 MSK
physical layer. We provide details on a VHDL implementation on
an FPGA where the most complex digital processing block is an
accumulator. Simulation and experimental results are provided
to validate the described technique.
Index Terms— Low-power communication receivers, MSK
demodulation, RF receivers, superregenerative receiver,
synchronization.
I. INTRODUCTION
SUPERREGENERATIVE (SR) Receivers (SRR) are among the most energy-efficient detectors for wireless
data reception. They were introduced almost a century ago [1]
but were later relegated to undemanding applications due
to the superiority of superheterodyne receivers in terms of
selectivity. Recent years have seen a significant increase in
SR-related research trying to take advantage of the intrinsic
properties of SRR [2] [3], namely their low-complexity
structure that allows low-power [4] and/or low-cost implementations.
These advantages come at the expense of sub-optimum
performance, mainly due to an equivalent noise bandwidth
greater than the theoretical minimum for conventional
modulations [5]. Several of today’s wireless communication
needs may be served by SRR thanks to recent demonstrations
of the SR principle to PSK [6] [7] and FSK/MSK
modulations [8]. As is well known, the synchronization
problem (which appears at many levels) is a main issue in
receiver design, accounting for a significant fraction of receiver
cost in terms of area or consumption [9]. In the SR context,
very low-complexity synchronization approaches are to be
targeted to avoid spoiling the whole point of SR reception.
Manuscript received August 1, 2016; revised November 6, 2016; accepted
November 29, 2016. Date of publication January 5, 2017; date of current
version April 20, 2017. Work supported by Spanish Grant TEC2015-65748-R
(MINECO/FEDER). This paper was recommended by Associate Editor
C. Panazio.
The authors are with the Department of Mining, Industrial and ICT
Engineering (EMIT), Manresa School of Engineering (EPSEM), Universitat
Politécnica de Catalunya (UPC), Manresa 08242, Spain (e-mail:
alexis.lopez.riera@upc.edu).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSI.2016.2636022
Digital communications can be classified as burst-mode
or continuous transmissions, each placing different require-
ments on the synchronization scheme. In burst-mode, also
called packet communications, each data packet has a training
sequence (sometimes called preamble) to achieve carrier, chip
and symbol synchronization. In some cases, once acquired,
synchronization can be maintained until the end of the packet
thanks to the stability of the crystal oscillators.
The synchronization problem in SR receivers for ASK
modulations has received some attention, e.g., in [4]
or [10]. Recently [11], an asynchronous preamble based on
pseudo-noise sequences with amplitude-modulated pulses
with a suitable pulse shaping function, combined with a
fractional symbol period delay, is proposed. There, the SR
oscillator (SRO) signal is fed to an envelope detector and
chip and symbol synchronization is achieved thanks to a
suitable processing scheme.
MSK modulation [12] belongs to the class of continuous
phase modulations (CPM), and is equivalent to an offset
quadrature phase shift keying (OQPSK) modulation with half-
sine pulse shaping [13]. MSK exhibits low bandwidth, constant
envelope for efficient power amplification, and achieves QPSK
bit error rate (BER) for so called precoded-type MSK. These
properties make MSK the choice for power-efficient wireless
networks such as IEEE 802.15.4 [14].
The SRR has recently been shown to be able to detect MSK
signals: in the approach in [8], a SR front end is able to deliver
samples of the instantaneous received signal phase using 1-bit
ADC. In conventional MSK receivers, after preamplification
and IQ downconversion, the I and Q signals are digitized
and synchronization and data detection is achieved by digital
signal processing techniques. This is a flexible approach at the
expense of a non-negligible amount of hardware resources and
power consumption.
There is a wealth of literature describing different synchro-
nization approaches. Basic material is presented in [9], [13]
while more recent results may be easily found, for instance,
in [15] and references therein. Some of them require sampling
the phase trajectories with a sampling frequency higher than
the MSK symbol rate fx , with ×8 factors being common [16].
For instance, [17] describes a method that is well-suited for
digital implementation with a sampling rate that is only twice
the symbol rate. This is also the case of, e.g., [18]. On the
other hand, timing recovery techniques such as [19] or [20],
while operating at the symbol rate, do not achieve the higher
level synchronization (chip and symbol) required, for instance,
for IEEE 802.15.4.
Following [9], the possible approaches to synchroniza-
tion can be divided into two categories: ad-hoc and derived
structure. In the ad-hoc structure case, an a-priori proposed
hardware is thought to be able to address the problem. In the
derived structure case, the structure is obtained as the solution
to an optimization problem, with no a priori guesses.
In the case of the SR receiver, a sampling frequency higher
than fx would imply a wider receiver bandwidth, meaning
lower signal to noise ratio (SNR) [5]. Hence, here we develop a
technique that performs joint symbol and chip synchronization
after the observation of a known synchronization preamble
while operating at fx . It belongs to the data-aided, ad-hoc
structure family of methods. After acquisition, we rely on the
stability of the local clock to keep synchronization until the
end of the packet. In consonance with the low-cost, low-power
and low-complexity objectives of SR reception, the whole
synchronization procedure is translated into a simple digital
implementation where the most complex digital building block
is an accumulator. We present general ideas which may be used
for a variety of standards but, due to its specific interest, we
concentrate on an implementation targeting the physical level
of MSK IEEE 802.15.4 packet reception.
II. FRAMEWORK
In what follows we call chip the basic transmission unit
lasting Tx seconds, as this is the case with several standards
using Direct-Sequence Spread-Spectrum at the physical layer,
such as IEEE 802.15.4. However, the whole discussion is also
extensible to the cases where the basic transmission unit is
a bit. Also, for convenience, chip values will be assumed
to be ±1. First, we recall some results on SR reception of
narrowband FSK signals [8]. The response of an SRO
operating in the linear mode to a signal given by
$$x(t) = \sum_{n=-\infty}^{\infty} p_c(t - nT_x)\cos(\omega_n t + \phi_n) \tag{1}$$
may be written as [5], [8]
$$s(t) = K_{PG} \sum_{n=-\infty}^{\infty} |H(\omega_n)|\, p(t - nT_x) \cos\bigl(\omega_0 t + n(\omega_n - \omega_0)T_x + \phi_n + \angle H(\omega_n)\bigr), \tag{2}$$
assuming that the changes from stability to instability happen
at t = nTx and that we take one quench cycle per chip, i.e.,
Tq = Tx. The constant K_PG (pulse gain) corresponds to the
constant K in [5], [8] while K will have a different meaning
in this paper.
The expression given by (1) includes MSK by making pc(t)
a rectangular pulse occupying the whole chip length, and
selecting the instantaneous frequency of the modulated signal
as a function of the transmitted digital signal xn as
ωn = ωc + ωd xn, (3)
with xn = ±1, ωd = ωx/4, and choosing φn such that phase
continuity is ensured as corresponds to a continuous-phase
FSK modulation [21].
Fig. 1. Representation of the excess phase with respect to ωct as a function
of data: black squares correspond to −1, white squares to 1. Black and white
circles correspond to nominal or δ-advanced samples, respectively.
From the inspection of (2) we observe that the response
is a train of RF pulses, each oscillating at ω0. While on
the transmitter side each chip has the same amplitude, in
the received pulse train we observe that each pulse has an
amplitude proportional to |H (ωn)| which is dependent on the
deviation between ωn and ω0. The small modulation index
of MSK makes the amplitude variations induced by |H (ωn)|
insufficient for signal detection. These amplitude variations
are even further reduced when the SRO is operating near
saturation. Hence, suitable MSK detection schemes based on
the SR principle rely on the instantaneous phase trajectories
of s(t) [8].
A representation of the phase trajectories resulting from
an MSK signal at the SRO input is shown in Fig. 1. In the
synchronized state, we will sample at multiples of Tx (δ = 0)
and the differences between successive phase observations,
ϕn = ψn − ψn−1, will always be ±π/2.
However, when acquiring synchronization, there will be an
unknown δ added to the nominal sampling instants and ϕn will
depend on the transmitted data:
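$$\varphi_n(\delta) = \begin{cases} \pi/2 & \text{if } x_{n-1} = x_n = 1 \\ -\pi/2 & \text{if } x_{n-1} = x_n = -1 \\ \pi/2 - \delta\pi/T_x & \text{if } x_{n-1} = -1 \text{ and } x_n = 1 \\ -\pi/2 + \delta\pi/T_x & \text{if } x_{n-1} = 1 \text{ and } x_n = -1 \end{cases} \tag{4}$$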
We call chip synchronization the problem of estimating the
value of δ ∈ [0, Tx).
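The four cases of (4) can be collected into a single expression. The following minimal sketch assumes that the excess phase is piecewise linear, so that a sampling interval advanced by δ covers a fraction δ/Tx of the previous chip and 1 − δ/Tx of the current one. The function name and the normalized argument `delta_over_tx` are illustrative choices, not part of the original receiver.

```python
import math

def phase_increment(x_prev, x_cur, delta_over_tx):
    """Phase difference between successive samples, following (4).

    Assumption: the excess phase has slope x * pi / (2 * Tx) during each
    chip, and the sampling instants are advanced by delta.
    """
    d = delta_over_tx
    return (math.pi / 2) * ((1 - d) * x_cur + d * x_prev)

# The four cases of (4) for delta = Tx / 4.
same_pos = phase_increment(1, 1, 0.25)     # pi/2
same_neg = phase_increment(-1, -1, 0.25)   # -pi/2
rising = phase_increment(-1, 1, 0.25)      # pi/2 - pi/4
falling = phase_increment(1, -1, 0.25)     # -pi/2 + pi/4
```

Written this way, equal consecutive chips give ±π/2 regardless of δ, while transitions shrink linearly with δ.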
The second synchronization problem is symbol synchro-
nization. In burst-mode transmissions the training sequence
in each burst often includes a sequence of known symbols to
aid this process. The symbol synchronization problem consists
in estimating an integer delay K between the received and the
local training sequences.
In the next section, we will derive a technique to solve
both problems simultaneously (together with some comments
on carrier synchronization). Frame synchronization will be
addressed in Section IV.
III. SIMULTANEOUS SYNCHRONIZATION
Let’s define x as the vector of N chips in the preamble,
x = [x0, x1, . . . , xN−2, xN−1]T (5)
From (4), it follows that:
a) The positions n corresponding to two different consecutive
chips, i.e., where xn ≠ xn−1, give phase values that
have a linear dependence with δ. After being filtered,
these will be used to achieve chip synchronization.
b) The positions n corresponding to two equal consecutive
chips do not depend on δ. So, irrespective of the current
value of δ, these can be exploited to gain information
suitable for symbol synchronization.
In order to separate both information sources, x will be
decomposed in two orthogonal terms,
x = i + q (6)
where i is related to the chip synchronization and is given by
i = (x − xd)/2 (7)
and q is related to the symbol synchronization and is given by
q = (x + xd)/2 (8)
with xd = [x−1, x0, x1, . . . , xN−2]T . It follows that i has
its nonzero values whenever there are two different consec-
utive values in x, and q has its nonzero values whenever
there are two identical consecutive values in x. A suitable
synchronization preamble should have a significant number
of nonzero entries in both terms in order to provide similar
amounts of information for chip and symbol synchronization,
i.e., ||i|| and ||q|| should be similar. Note that, formally, the
first element of xd, x−1, does not belong to the preamble and
would be a random value if the preamble is preceded by noise.
In practice however, the transmitters perform some kind of
power ramping and this value is not random. For notational
convenience, in what follows we will assume that x−1 = xN−1.
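As an illustration of (6) to (8), the following sketch splits a chip vector into its two terms using the cyclic convention x−1 = xN−1. The 8-chip vector is a hypothetical example chosen only for brevity, and `decompose` is an illustrative name.

```python
def decompose(x):
    """Split chips x (values +1/-1) into i and q following (7) and (8).

    Uses the convention x[-1] = x[N-1], i.e. a cyclic one-chip delay.
    """
    xd = [x[-1]] + list(x[:-1])                  # delayed copy of x
    i = [(a - b) // 2 for a, b in zip(x, xd)]    # nonzero at chip transitions
    q = [(a + b) // 2 for a, b in zip(x, xd)]    # nonzero at repeated chips
    return i, q

x = [1, 1, -1, 1, 1, -1, -1, 1]   # illustrative chips, not a standard preamble
i, q = decompose(x)
```

Since i and q are never nonzero at the same position, their scalar product is zero and their element-wise sum returns x.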
A. Carrier Synchronization
When targeting a simple synchronizer suited for SR
receivers, the carrier synchronization problem does not need
to be addressed for many standards. We may reasonably
assume that the transmitter derives data and carrier clocks
from the same crystal. A relative frequency mismatch ε gives a phase
error
ϕ = 2π ε fc Tx. (9)
Hence, an IEEE 802.15.4 receiver operating at 2.45 GHz
at a chip rate of 2 Mchip/s will see phase errors of
0.441° per ppm of mismatch. With the current availability of low-cost
crystals with frequency stabilities of 10 ppm, in the worst case,
i.e., ε = 20 ppm, phase errors of 8.82° are obtained, which
have a limited effect on receiver performance. The effects of
frequency stability are further discussed in Section IV.
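The figures quoted above follow directly from (9). A minimal sketch of the arithmetic, with the relative mismatch expressed in ppm (the function name is illustrative):

```python
def phase_error_deg(mismatch_ppm, fc_hz, chip_rate_hz):
    """Phase error per chip, in degrees, from (9)."""
    tx = 1.0 / chip_rate_hz                       # chip period Tx
    return 360.0 * mismatch_ppm * 1e-6 * fc_hz * tx

per_ppm = phase_error_deg(1, 2.45e9, 2e6)         # about 0.441 degrees
worst_case = phase_error_deg(20, 2.45e9, 2e6)     # about 8.82 degrees
```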
B. Symbol and Chip Synchronization
Conceptually, the symbol synchronization process will con-
sist in the following. After each received phase sample (ϕn),
the vector ϕn = [ϕn−(N−1), . . . , ϕn−1, ϕn]T is built. Then, the
scalar product
$$\hat{q}_n = \frac{2}{\pi}\, \boldsymbol{\varphi}_n \cdot \mathbf{q}, \tag{10}$$
Fig. 2. Block diagram to compute iˆn and qˆn (see text). The shaded part is
the FIR filter part. The block labeled f (iˆK ) implements (15).
which measures the similarity with q, is computed (lower
signal path in Fig. 2). This is done for each new sample until
a maximum is detected. If qˆn exhibits a sharp peak with low
sidepeaks this maximum is easily located.
Formally, in the absence of noise, we are looking for the
displacement K that solves the equation
qˆK = ||q|| (11)
where || · || denotes the euclidean norm.
Once the displacement K is found, the information on δ is
found by averaging the positions n where xn ≠ xn−1, which,
from (4), exhibit the property
xnϕn(δ) = π/2 − δπ/Tx . (12)
Making use of the vector i defined in (7), this is achieved
through the scalar product
$$\hat{i}_n = \frac{2}{\pi}\, \boldsymbol{\varphi}_n \cdot \mathbf{i} \tag{13}$$
which, in the absence of noise, will give
$$\hat{i}_K = \|\mathbf{i}\| \left(1 - \frac{2\delta}{T_x}\right), \tag{14}$$
from which δ is directly obtained as
$$\delta = \frac{T_x}{2}\left(1 - \frac{\hat{i}_K}{\|\mathbf{i}\|}\right). \tag{15}$$
The upper part of Fig. 2 illustrates this procedure.
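The chain from (13) to (15) can be checked numerically in the noise-free case. The sketch below is an illustration under stated assumptions, not the authors' implementation: the chip vector is a hypothetical 8-chip example, the phase differences are generated from (4), and the window is assumed to be already aligned (K known). The normalization `peak` is taken as the value that the noise-free scalar product reaches at δ = 0, which plays the role of the factor written ||i|| in (14) and (15).

```python
import math

def estimate_delta(phases, i_vec, tx=1.0):
    """Estimate delta from aligned phase differences, as in (13) and (15)."""
    i_hat = (2 / math.pi) * sum(p * c for p, c in zip(phases, i_vec))
    peak = sum(c * c for c in i_vec)     # noise-free value of i_hat at delta = 0
    return (tx / 2) * (1 - i_hat / peak)

# Illustrative chips (not a standard preamble), with x[-1] = x[N-1].
x = [1, 1, -1, 1, 1, -1, -1, 1]
xd = [x[-1]] + x[:-1]
i_vec = [(a - b) // 2 for a, b in zip(x, xd)]

true_delta = 0.3                          # in units of Tx
# Noise-free phase differences following (4).
phases = [(math.pi / 2) * ((1 - true_delta) * a + true_delta * b)
          for a, b in zip(x, xd)]
delta_hat = estimate_delta(phases, i_vec)
```

In this noise-free setting the estimate coincides with the true offset; with noise, the averaging over all transitions is what reduces the variance.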
C. Possible Ambiguities
Fig. 3 shows the qualitative behavior of qˆn near the max-
imum for a range of current δ values. From this figure, it
follows that, if the current value of δ is already zero, there
is an ambiguity in the maximum. In the leftmost subfigure,
(15) gives the new estimation as δ = Tx . In the rightmost
subfigure, (15) gives the new estimation as δ = 0. Both results
have the same meaning. Hence, in both cases, we get the same
point.
In practice, (11) cannot be fulfilled exactly. Instead, we find
the index K such that
qˆK > α||q|| (16)
with 0 < α < 1 a threshold. A block diagram describing this
procedure is shown in the lower signal path depicted in Fig. 2.
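The search for K in (16) amounts to sliding the coefficient vector q over the incoming phase samples and stopping at the first threshold crossing. The following sketch uses hypothetical toy data: five idle samples followed by one copy of an 8-chip sequence received with δ = 0. The threshold 0.8 is arbitrary and suits this short sequence, whose largest sidelobe reaches 0.75 of the peak.

```python
import math

def first_crossing(phases, q_vec, alpha):
    """Return the first index n at which q_hat_n exceeds alpha times its
    noise-free peak, or None if the threshold is never crossed."""
    n_taps = len(q_vec)
    peak = sum(c * c for c in q_vec)
    for n in range(n_taps - 1, len(phases)):
        window = phases[n - n_taps + 1 : n + 1]
        q_hat = (2 / math.pi) * sum(p * c for p, c in zip(window, q_vec))
        if q_hat > alpha * peak:
            return n
    return None

# Toy data (hypothetical, not a standard preamble).
x = [1, 1, -1, 1, 1, -1, -1, 1]
q_vec = [1, 1, 0, 0, 1, 0, -1, 0]         # q of x, with x[-1] = x[N-1]
phases = [0.0] * 5 + [c * math.pi / 2 for c in x]
k = first_crossing(phases, q_vec, alpha=0.8)
```

Here the detector fires at the last sample of the sequence (index 12), where the window covers the whole toy preamble.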
Fig. 3. Qualitative behavior of qˆn/||q|| as a function of the current value of
δ. The shaded circle indicates the chosen maximum.
Fig. 4. Format of the Physical Protocol Data Unit (PPDU) as mandated
by [14].
As α < 1, there is the possibility that the threshold is
crossed by two consecutive samples. In the presence of noise,
the sample giving the maximum value of qˆn may be the wrong
one. In Section IV-C we come back to this with simulations
for a specific implementation.
The whole approach described up to here is valid for a wide
range of standards. Implementation details and the effects of
noise are strongly dependent on each particular case. In the
following section we focus on the 802.15.4 case.
IV. IMPLEMENTATION EXAMPLE: IEEE 802.15.4
In what follows, we focus on the particular case of the
IEEE 802.15.4 standard [14], where the features of SR recep-
tion may become attractive. This standard specifies an MSK
(or, equivalently, O-QPSK) modulation in some of the bands.
In the 2.45 GHz band, the standard specifies a PPDU format
as shown in Fig. 4, and each 4-bit data symbol is mapped into
a 32-chip PN sequence.
In particular, the synchronization header SHR is a five-octet
field composed of two subfields. The first one (preamble)
consists of eight repetitions of the zero symbol
s0 = [1, 1,−1, 1, 1,−1,−1, 1, 1, 1,−1,−1,−1,−1, 1, 1,
−1, 1,−1, 1,−1,−1, 1,−1,−1,−1, 1,−1, 1, 1, 1,−1],
(17)
giving a total preamble length of N = 256 chips. The
preamble is followed by the octet corresponding to the SFD
field, which has a value of 0x7A.
Chip and symbol synchronization should be complete before
the arrival of the SFD field, hence the available time is 256Tx .
From (7) and (8), we get the following q and i vectors
q = [0, 1, 0, 0, 1, 0, −1, 0, 1, 1, 0, −1, −1, −1, 0, 1,
0, 0, 0, 0, 0, −1, 0, 0, −1, −1, 0, 0, 0, 1, 1, 0] × NS (18)
i = [1, 0, −1, 1, 0, −1, 0, 1, 0, 0, −1, 0, 0, 0, 1, 0,
−1, 1, −1, 1, −1, 0, 1, −1, 0, 0, 1, −1, 1, 0, 0, −1] × NS, (19)
Fig. 5. Frequency deviation effect on synchronization. Increase in input
power to keep PER = 10⁻² vs. relative frequency mismatch, with α = 0.65. Simulation and experimental
results.
Fig. 6. qˆn/||q|| in the absence of noise (left) and a single realization
of qˆn with SNR = 3.3 dB (right). δ = 0.5 in both cases.
with the notation ×NS indicating NS vector repetitions. From
these we obtain ||i|| = (3/4) NS and ||q|| = (√7/4) NS, with NS
the number of symbols chosen to be used for synchronization
(NS ≤ 8). Note that the norms ||i|| and ||q|| are similar, which
makes the IEEE 802.15.4 preamble suitable for the proposed
synchronization scheme.
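The vectors in (18) and (19) can be reproduced from s0 in (17) with a few lines. This sketch applies (7) and (8) with a cyclic delay, which matches the repetition of s0 in the preamble. The variable names are illustrative.

```python
S0 = [1, 1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, -1, 1, 1,
      -1, 1, -1, 1, -1, -1, 1, -1, -1, -1, 1, -1, 1, 1, 1, -1]

xd = [S0[-1]] + S0[:-1]               # cyclic delay: the preamble repeats s0
i_vec = [(a - b) // 2 for a, b in zip(S0, xd)]
q_vec = [(a + b) // 2 for a, b in zip(S0, xd)]
n_i = sum(abs(c) for c in i_vec)      # nonzero entries of i per symbol
n_q = sum(abs(c) for c in q_vec)      # nonzero entries of q per symbol
```

Per symbol, i has 18 nonzero entries and q has 14, so the 32 chips are split in a reasonably balanced way between the two terms. The square roots of these counts are in the ratio 3 : √7.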
A. Carrier
For the sake of implementation simplicity, we have chosen
to ignore the carrier synchronization problem. To validate this
simplification, we have considered that data and carrier clocks
are derived from the same crystal and have simulated (Fig. 5)
the increase in input signal power required to keep a Packet
Error Rate PER = 10⁻² as a function of the relative frequency
mismatch introduced in (9). Under IEEE 802.15.4 conditions, the
loss in sensitivity with 10 ppm crystals in the worst case, i.e.,
a mismatch of 20 ppm, is less than 0.2 dB, and with 20 ppm crystals it is
less than 0.7 dB, which may be acceptable in most applications.
B. Computation of K
With the preamble defined by the IEEE 802.15.4 stan-
dard [14], qˆn exhibits eight very sharp peaks, corresponding
to the repetition of each zero symbol (Fig. 6).
A static choice of the threshold α (16) required for symbol
synchronization is not trivial. A high value of α, close to 1, will work
correctly for high SNR but will miss many packets as SNR
decreases while, with a low value of α, the probability of false
synchronizations increases.
To make a reasonable choice of the threshold value, we
may investigate what happens in the case of a noise level that
produces a given link quality. Specifically, the IEEE 802.15.4
Fig. 7. Simulated chip error rate, symbol error rate, bit error rate, and packet
error rate as a function of SNR.
Fig. 8. Probability density function of qˆn/||q|| for different values of SNR.
The inset depicts the PDF in absence of signal and with SNR = 3.3 dB.
standard specifies a minimum sensitivity to achieve
PER = 10⁻² with a PSDU length of 20 octets.
To simulate the effects of noise, we have considered
Gaussian noise added to the in-phase and quadrature compo-
nents of the input signal. Fig. 7 shows some simulation results
as a function of SNR considering perfect synchronization. This
figure shows curves for the raw chip error rate (CER), the
symbol error rate (SER), the BER and the PER. From this
graph, we conclude that the worst-case scenario, corresponding
to the sensitivity threshold given by PER = 10⁻², is
SNR = 3.3 dB.
We investigate the filter output qˆn after the full preamble
of Ns = 8 zero symbols. When the noise power is suffi-
ciently low, we may expect a Gaussian distribution around
the normalized maximum value of unity. An increase in noise
level increases the output variance proportionally. However,
as the noise level increases further the nonlinearity associated
to the phase rollover outside the −π · · ·π interval (which the
detector is unable to account for) comes into play: phase values
of ±(π +ε) are treated as ±(−π +ε). This effect is confirmed
by the simulation results shown in the main plot in Fig. 8. This
figure shows that, when the noise level increases, the variance
of qˆn increases as expected but, at the same time, the mean
value decreases due to the effect of the nonlinearity.
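The rollover nonlinearity can be written as a one-line mapping onto [−π, π). This is a minimal sketch of the effect described above, not the detector's actual circuitry, and `wrap_phase` is an illustrative name.

```python
import math

def wrap_phase(phi):
    """Map a phase in radians onto [-pi, pi), as a detector limited to
    that interval would report it."""
    return (phi + math.pi) % (2 * math.pi) - math.pi

eps = 0.1
above = wrap_phase(math.pi + eps)      # reported as -pi + eps
below = wrap_phase(-(math.pi + eps))   # reported as pi - eps
```

A noisy sample that overshoots ±π therefore contributes with the opposite sign, which pulls the mean of the filter output towards zero.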
The dependence of the median value and Interquartile
Range (IQR) of qˆn/||q|| with SNR is depicted in Fig. 9. From
the traces in Figs. 8 and 9 we see that the nonlinearity effect
discussed before becomes relevant for SNR < 4 dB.
To set a reasonable worst-case threshold value for α we
have simulated the PDF of qˆn/||q|| in the presence of the
highest noise level, i.e., SNR = 3.3 dB. The inset plot in
Fig. 8 depicts the PDF of qˆn with preamble plus noise and
Fig. 9. Median and IQR value of qˆn/||q|| vs. SNR.
Fig. 10. Contour plot showing the probability distribution of estimated δ/Tx
vs. true δ/Tx with SNR = 3.3 dB. Data are processed at the end of the 8th
symbol (without threshold). Contour levels are at [1, 3, 5, 7].
without signal (noise only). It follows that α may be assigned
a wide range of values. When α < 0.35 we start having a
significant number of false synchronizations in the absence
of preambles and with α > 0.65 we start missing preambles.
Instead of developing an adaptive threshold control scheme as,
for instance, in [22], which would require additional receiver
resources, we have chosen a fixed value of α = 0.65, which
allows synchronization for signal levels down to the sensitivity
level.
For low noise (or high signal) levels, even the highest
value α = 0.65 will make the comparator fire before the
full preamble has been processed. However, even in the
presence of noise, qˆn has very narrow peaks separated by
one symbol (32 chips) (Fig. 6). Hence, the comparator may
fire an integer multiple m of s0 symbols earlier, producing
K − 32m instead of K . This poses no problem, as the frame
synchronization algorithm will search for the SFD, skipping
eventual s0 symbols arising from an early synchronization.
C. Computation of δ
Once the comparator has fired, (15) is used to estimate the
value of δ. The effects of noise have been simulated. First
of all, we may consider Fig. 10, relating the true and the
estimated values of δ/Tx , considering that the comparator fires
at the end of the last symbol, i.e., without threshold. This figure
has been computed for the case SNR = 3.3 dB and shows
that the relation is not linear but becomes compressed. This
is a consequence of the fact that high levels of noise reduce
the range of iˆn due to nonlinearities in a similar way as has
been shown for qˆn . The reduction is dependent on the real
Fig. 11. Contour plot showing the probability distribution of estimated δ/Tx
vs. true δ/Tx with SNR = 3.3 dB, firing when the threshold α = 0.65 has
been crossed and with the correction (21). Contour levels are at [1, 3, 5, 7].
value of δ. For δ = Tx/2 the mean of iˆn is zero, regardless
of the noise level, and for δ = 0 and δ → Tx maximum
compression happens.
Also, in practice, we compute δ not at the end of the last
symbol but whenever the threshold is crossed, which will often
happen m symbols earlier than the last symbol. For δ = 0 and
δ → Tx this means that
iˆK < ||i|| (20)
and the range given by (15) will be limited, never reaching
the extremes {0, Tx} even in the absence of noise.
In an attempt to correct these effects, the estimated value
of δ can be obtained from
$$\delta = \frac{T_x}{2}\left(1 - \frac{\hat{i}_K}{\alpha\|\mathbf{i}\|}\right). \tag{21}$$
The idea behind (21) is to reduce the norm of i by the same
amount as the implicit reduction of q given by (16). This helps
to restore the [0..Tx ] range of δ.
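The effect of the correction can be seen by evaluating (15) and (21) on the same filter output. In the sketch below, `i_peak` stands for the normalization factor written ||i|| in the text. The clipping of the result to [0, Tx] is an assumption added here for illustration and is not described in the text.

```python
def delta_estimate(i_hat, i_peak, tx=1.0, alpha=1.0):
    """delta from (15) when alpha = 1, or from (21) when alpha < 1.

    The clipping to [0, Tx] is an addition of this sketch.
    """
    d = (tx / 2) * (1 - i_hat / (alpha * i_peak))
    return min(max(d, 0.0), tx)

# Filter output that has only reached 65 % of its full-preamble value.
uncorrected = delta_estimate(65.0, 100.0)             # 0.175 Tx
corrected = delta_estimate(65.0, 100.0, alpha=0.65)   # 0
```

With the correction, an output at the threshold level maps to the extreme of the range again, at the cost of amplifying any noise present in the filter output.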
Fig. 11 shows the effect of the correction given by (21)
on a numerical simulation taking the full synchronization
process into account. This figure shows that the variance of
the estimation of δ/Tx increases as a consequence of the
amplification required to restore its dynamic range. Among
the effects that are included in Figs. 10 and 11 there is the
possibility of the ambiguities in the location of the maximum,
as explained in Section III-C. As the noise level increases,
for values of δ near zero or Tx there is a probability that the
secondary peak, K −1, rises above the main one, K , (Fig. 3).
If the secondary peak is wrongly detected, the estimation of δ
given by (15) is made with
$$\hat{i}_{K-1} = \|\mathbf{i}\|\left(\frac{\delta}{T_x} - 1\right) \tag{22}$$
instead of iˆK . The wrong estimation of δ brings us from
−Tx − δ to −3δ/2 instead of to 0. As was mentioned before,
when δ = 0 the detection of a secondary peak gives no error.
When δ ≈ Tx/2 there is only one maximum. The most critical
point is around δ ≈ Tx/3 because (15) brings us to −Tx/2
where we easily get the previous chip instead of the current
one (Fig. 1) while the probability to detect a wrong peak is
still high.
Fig. 12. Increase in input signal power required to keep PER = 10⁻² vs.
the mismatch δ/Tx .
Fig. 13. Filter implementation.
Furthermore, we have investigated the effects of a mismatch
in δ on receiver performance. The simulated PER has been
computed for different values of δ/Tx and the signal power
has been increased to achieve PER = 10⁻². Fig. 12 shows
the increase in signal power needed to keep the link quality
constant as a function of the mismatch δ/Tx . It follows that
even errors as high as δ/Tx = 0.1 have a limited effect on the
overall performance.
D. SFD Detection
Once chip and symbol synchronization has been achieved,
frame synchronization consists in locating the SFD 7A. This
takes place at the bit level. As the threshold is crossed at an
unknown instant before the full preamble has been processed,
we may have to wait up to m symbols to find the SFD. If the
SFD is not detected within this time, the state machine con-
trolling synchronization restarts the synchronization process.
V. FPGA IMPLEMENTATION
The whole technique has been described in VHDL and
implemented on an FPGA. An MSK front-end, as described
in [8] provides a stream of phase samples ϕn every Tx rep-
resenting the interval [−π, π), with the 20 integer values
[−10, 9] coded as signed(4 downto 0) signals. These
phase samples are stored in a shift register to build the
vector ϕn (upper shaded block in Fig. 13).
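To make the number format concrete, the following sketch maps a phase in [−π, π) onto the 20 integer codes [−10, 9], which fit in a 5-bit signed word. The uniform floor-based mapping is an assumption made for illustration; the front end in [8] may assign codes differently.

```python
import math

def quantize_phase(phi, levels=20):
    """Map phi in [-pi, pi) to an integer code in [-levels/2, levels/2 - 1].

    Assumption: uniform steps of 2*pi/levels, rounded towards -infinity.
    """
    step = 2 * math.pi / levels
    code = math.floor(phi / step)
    # Guard against floating-point rounding at the interval edges.
    return max(-levels // 2, min(levels // 2 - 1, code))

codes = [quantize_phase(p) for p in (-math.pi, -0.1, 0.0, 0.5, 3.14)]
```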
A. Filter
The digital implementation of the filters described in Fig. 2
is extremely simple, taking into account that the coefficients
TABLE I
THRESHOLD CROSSING DETECTOR
TABLE II
QUENCH-PHASE GENERATING COUNTER
are either 0 or ±1. Both filter outputs are computed sequen-
tially by the same circuit. An accumulator with an enable
signal (for the 0 coefficients), adds the current phase sample
or its inverted version (center shaded block in Fig. 13).
To simplify the representation in Fig. 13, the same clock is fed
to all the registers that store ϕn . In the real implementation,
there are additional enable signals. These allow, for instance,
loading the phase samples every Tx and rotate them much
faster to compute the filter outputs between each Tx .
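A behavioral sketch of the accumulator may help to see why no multiplier is needed. It is written in Python rather than VHDL for brevity and only models the arithmetic, not the clocking or enable signals of the actual design; `accumulate` is an illustrative name.

```python
def accumulate(phases, coeffs):
    """Serial correlation with coefficients restricted to 0, +1 and -1."""
    acc = 0
    for phase, c in zip(phases, coeffs):
        if c == 0:
            continue                       # enable low: accumulator holds its value
        acc += phase if c > 0 else -phase  # add the sample or its inverted version
    return acc

total = accumulate([3, -2, 5, 1], [1, 0, -1, 1])   # 3 - 5 + 1
```

Running the same loop once with the coefficients of q and once with those of i gives both filter outputs from a single adder.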
In our implementation we have used only NS = 7 symbols
for synchronization. The last symbol of the preamble is used
to adjust the receiver so that the SFD is detected with the
correct synchronization. Vector ϕn has a length of 224 and to
compute qˆn and iˆn we need 224 × 2 clock cycles.
In Fig. 2, the signals qi and ii are the filter coefficients
of q and i, which are stored and rotated in 32-element shift
registers. They appear as qi (signed(1 downto 0)) and
ii (signed(1 downto 0)) in Fig. 13.
The previous values of qˆn and iˆn are also stored every Tx
to be able to implement the following step.
B. Thresholds and δ Estimation
The threshold crossing detector is implemented with a
digital comparator, as described in the code in Table I.
The value of δ is a function of the resulting iˆK
(signed(10 downto 0)), as shown in Fig. 2.
Equation (21) has been implemented with a lookup table
taking the 5 most significant bits of iˆK .
Regarding the estimation of δ it should be noted that
the receiver is able to generate a finite number of different
δ values, corresponding to different quench phases. This is
equivalent to saying that the δ values are quantized. The
end of the synchronization process should trigger the actual
estimation of δ. This is indicated with rst from which the
quench phase is derived as shown in Table II.
C. Data Receiver
Once synchronization is achieved, the phase ϕn correspond-
ing to the n-th chip is quantized with one bit (hard decision)
and stored in a 32 bit shift register. Then, these 32 chips are
Fig. 14. Schematic of the data detection circuit.
compared with the 16 possible 32-chip sequences correspond-
ing to the 16 symbols specified by the IEEE 802.15.4 standard.
At the end of each symbol, the number of coincidences sc is
compared with the previous number of coincidences sc−1 and
the index of the best symbol c∗ is stored. This may be done
serially with a counter with an enable signal. After cycling
through all 16 possibilities, the stored index gives the most
likely symbol and its corresponding 4 bits. This is described
schematically in Fig. 14 where clk2 is clk divided by 32.
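The coincidence-counting detector can be sketched as follows. The 4-entry, 8-chip codebook is a hypothetical toy example used only to keep the listing short; the actual receiver compares against the 16 sequences of 32 chips defined by the standard.

```python
def detect_symbol(chips, codebook):
    """Return the index of the codebook sequence with most coincidences."""
    best_index, best_score = 0, -1
    for index, sequence in enumerate(codebook):
        score = sum(1 for a, b in zip(chips, sequence) if a == b)
        if score > best_score:             # keep the best candidate so far
            best_index, best_score = index, score
    return best_index

# Toy codebook (hypothetical, not the IEEE 802.15.4 chip table).
codebook = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
]
received = [1, 1, -1, -1, 1, 1, -1, 1]     # entry 2 with its last chip flipped
symbol = detect_symbol(received, codebook)
```

The single chip error leaves 7 coincidences with the transmitted sequence, still more than with any other entry, so the decision is unaffected.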
D. Resources as a Function of Data Rates
Operating at 10 kchip/s the FIR filter may operate serially at
low clock speeds. As has been shown, to perform correlation
224 × 2 clock cycles (plus a few more for additional storage
and processing) are required per chip. This translates into a
clock frequency slightly over 4.48 MHz. The data detection
block can also operate serially. As each symbol required
32 × 16 clocks, it follows that the clock frequency could be
as low as 160 kHz.
Operating at 2 Mchip/s, the correlator should operate with
32 values of ϕ simultaneously. So, Ns clocks are required
for each chip. Taking Ns = 7 this means that the clock
frequency should be 28 MHz. As usual, clock frequency can be
traded for area/circuit complexity: a full serial implementation
would require a 896 MHz clock. The parallelization increases
the accumulator width but does not increase significantly the
number of registers, as the 224 × 5 bits of phase history are
required in any case. The data detection block can operate
serially with reasonable clocks: 32×16 clocks means 32 MHz
clock frequency.
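The clock figures in this subsection can be reproduced with the following arithmetic sketch. The function names and the `parallel` parameter are illustrative; the factor 2 accounts for computing both filter outputs sequentially, which is how the quoted values are obtained.

```python
def correlator_clock_hz(chip_rate_hz, n_symbols=7, chips_per_symbol=32, parallel=1):
    """Minimum clock to compute both filter outputs once per chip."""
    cycles_per_chip = 2 * n_symbols * chips_per_symbol // parallel
    return cycles_per_chip * chip_rate_hz

def detector_clock_hz(chip_rate_hz, chips_per_symbol=32, n_codes=16):
    """Minimum clock to compare one symbol against every code serially."""
    symbol_rate = chip_rate_hz / chips_per_symbol
    return chips_per_symbol * n_codes * symbol_rate

slow_serial = correlator_clock_hz(10e3)               # 4.48 MHz
fast_parallel = correlator_clock_hz(2e6, parallel=32) # 28 MHz
fast_serial = correlator_clock_hz(2e6)                # 896 MHz
slow_detector = detector_clock_hz(10e3)               # 160 kHz
fast_detector = detector_clock_hz(2e6)                # 32 MHz
```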
VI. EXPERIMENTAL TEST SETUP AND RESULTS
The experimental test setup is depicted in Fig. 15. It consists
of a data generator module, a transmitter part, an RF signal
generator, the SR receiver and a data analyzer module.
The data generator module consists of a Raspberry Pi [23]
single-board computer (TxPi). The TxPi runs an application
that talks at a high level to a transmitter FPGA (TxFPGA)
through an SPI link. The TxFPGA embeds the 802.15.4 MAC
layer [24] and the symbol to chip conversion. The TxFPGA
generates the chip and clock signals to be MSK modulated
with an Agilent E4431B RF generator.
Fig. 15. Experimental test setup.
Fig. 16. Detail of the SR Receiver and FPGA RX blocks.
Fig. 17. SY NC , SF D and RX (PER) versus input signal power with
a threshold level α = 0.65. The lower and upper RX traces are obtained,
respectively, with and without the correction given by (21).
The generator output is fed to the input of an SR receiver
consisting of an analog front-end with a 1-bit ADC, as
described in [8], and the digital parts, described in this
paper and in [8], which have been implemented on an FPGA
board (RxFPGA). The quench signal is generated on the same
RxFPGA as described in [25]. The RxFPGA also includes
the chip-to-symbol mapping circuitry and embeds a VHDL
implementation of the 802.15.4 MAC layer [24].
Both TxFPGA and RxFPGA have full transceiver capa-
bilities. They are Terasic DE0-Nano boards [26] containing
a Cyclone IV EP4CE22 FPGA, belonging to the lowest-
cost Altera product range. With no special efforts put into
optimizing the FPGA usage and including several features just
for testing, our implementation takes up 22% of the FPGA
resources: 9% devoted to the MAC layer implementation
and 13% to the transceiver.
Fig. 18. SYNC, SFD and RX (PER) versus threshold level α.
With this test setup, we have performed the following
measurements. Fig. 17 depicts the experimental results
corresponding to $10^4$ transmitted packets with random
waiting times between packets and a threshold α = 0.65,
operating at fx = 10 kchip/s. Initially, no correction of $\hat{i}_K$ has
been done, i.e., (15) has been used instead of (21). Specifically,
this figure depicts the ratios $SYNC = (N_{TX} - N_{SYNC})/N_{TX}$,
$SFD = (N_{TX} - N_{SFD})/N_{TX}$ and $RX = (N_{TX} - N_{RX})/N_{TX}$,
where $N_{TX}$ is the number of transmitted packets, $N_{SYNC}$ the
number of times the threshold has been crossed (and
synchronization detected), $N_{SFD}$ the number of SFD
delimiters detected, and $N_{RX}$ the number of correctly
received packets. Note that RX = PER. This figure shows
that the synchronization preamble error rate SYNC is less than
$10^{-2}$ for an input power of −119.2 dBm. It may also be
seen that PER = $10^{-2}$ is achieved with an input power level
of −118.2 dBm. This value is about 10 dB better than the bit-rate-
adjusted sensitivity mandated by the IEEE 802.15.4 standard
($-85 - 10\log_{10}(2\ \text{Mchip/s} / 10\ \text{kchip/s}) \approx -108$ dBm). The
effect of applying the correction given by (21) is also shown in
the same figure. It enhances the sensitivity
by only 0.3 dB. The effects of crystal frequency mismatch
have been measured and were shown in Fig. 5. For this
measurement, an FPGA board with external clock input was
used. It follows that the carrier synchronization problem can
be neglected for commonly available crystals, simplifying the
receiver design.
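The error ratios and the sensitivity comparison above reduce to simple arithmetic, sketched below. The −85 dBm reference at 2 Mchip/s and the measured −118.2 dBm at 10 kchip/s are taken from the text. The function names and the packet counts in the usage lines are hypothetical and are not measured data.

```python
import math


def error_ratio(n_tx, n_ok):
    """Generic (N_TX - N_ok) / N_TX ratio used for SYNC, SFD and RX (PER)."""
    return (n_tx - n_ok) / n_tx


def rate_adjusted_sensitivity_dbm(chip_rate_hz, ref_dbm=-85.0,
                                  ref_chip_rate_hz=2e6):
    """Scale the reference sensitivity by 10*log10 of the chip-rate ratio."""
    return ref_dbm - 10 * math.log10(ref_chip_rate_hz / chip_rate_hz)


# Hypothetical counts for illustration only (not measured data):
# 10 000 packets sent, 9 900 received correctly.
print(f"PER = {error_ratio(10_000, 9_900):.3f}")

required = rate_adjusted_sensitivity_dbm(10e3)
measured = -118.2  # input power for PER = 1e-2, as reported in the text
print(f"required {required:.1f} dBm, margin {required - measured:.1f} dB")
```

The rate-adjusted requirement evaluates to roughly −108.0 dBm, so the measured −118.2 dBm leaves a margin of about 10.2 dB, in agreement with the 10 dB quoted in the text.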
The effects of setting a different threshold level α on the
resulting PER have also been investigated experimentally; the
results are shown in Fig. 18. The figure shows that there is a wide range of
quasi-optimum threshold values between α = 0.54 and α = 0.7.
If the threshold is set higher we start missing packets. If the
threshold is set lower, synchronization is achieved too early,
which is equivalent to having a shorter correlating filter. This
means that the output is noisier and the estimation of δ is
worse. Also, as was shown in Fig. 8, a lower threshold means
that false synchronization may arise from noise only.
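The trade-off in the choice of α can be visualized with a toy model. The sketch below assumes an idealized, noise-free correlator output that grows linearly from 0 to a peak value as the preamble fills a 224-sample filter. This ramp is an assumption made for illustration and is not the measured correlator response. The helper names are hypothetical.

```python
FILTER_LEN = 224  # correlator length in samples, as quoted in the text


def ideal_ramp(peak=1.0, length=FILTER_LEN):
    """ASSUMPTION: noise-free output rising linearly as the filter fills."""
    return [peak * k / length for k in range(length + 1)]


def first_crossing(output, alpha):
    """Index of the first sample with output >= alpha, or None if never crossed."""
    for k, value in enumerate(output):
        if value >= alpha:
            return k
    return None


for alpha in (0.54, 0.65):
    k = first_crossing(ideal_ramp(), alpha)
    print(f"alpha = {alpha}: sync declared after {k} of {FILTER_LEN} samples")

# A degraded peak (here 0.6) never reaches a threshold of 0.65: packet missed.
print(first_crossing(ideal_ramp(peak=0.6), 0.65))
```

In this model α = 0.65 declares synchronization after 146 samples and α = 0.54 after 121, so a lower threshold uses a shorter effective correlation. A peak reduced to 0.6 never crosses α = 0.65, which corresponds to a missed packet.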
VII. CONCLUSIONS
In this paper we have presented a technique to enable the
reception of burst-mode MSK communications based on an
SR receiver core. We have presented a simple synchronization
technique that is able to operate under the conditions imposed
by the IEEE 802.15.4 standard [14]. The principle is extensible to
other simpler protocols based on MSK signaling with suitable
i and q PN sequences. Building on the MSK receiver core
described in [8], we describe a simple, all-digital synchronizer
which performs joint symbol and chip synchronization. This
is followed by a frame synchronizer.
First, we have investigated the performance of the synchro-
nizer in the ideal case. Then, emphasis has been put on an
implementation suitable to detect the preamble defined by [14]
and the performance of this implementation as a function
of different noise levels has been investigated. The effects
of the nonlinearity of the equivalent phase quantifier have
also been investigated. Furthermore, the main points of the
FPGA implementation of the proposed technique have been
described. In contrast to alternative architectures, the resulting
SR MSK receiver requires no complex or expensive blocks
such as multi-bit ADCs or digital multipliers. The
analog part is as simple as a preamplifier and an oscillator.
The digital part is extremely simple with the most complex
digital block being an accumulator. With the addition of
the corresponding chip-to-symbol mapping and a suitable
MAC layer protocol, a whole receiver has been implemented.
Experimental results validate the practical feasibility of the
described approach and yield an overall receiver sensitivity
10 dB better than is required by [14].
ACKNOWLEDGMENT
The authors would like to thank the Associate Editor and
the anonymous reviewers for their helpful comments as they
greatly improved this paper.
REFERENCES
[1] E. H. Armstrong, “Some recent developments of regenerative circuits,”
Proc. Inst. Radio Eng., vol. 10, no. 4, pp. 244–260, Aug. 1922.
[2] D. G. Lee and P. P. Mercier, “A 1.65 mW PLL-free PSK receiver employ-
ing super-regenerative phase sampling,” in Proc. Biomed. Circuits Syst.
Conf. (BioCAS), Oct. 2015, pp. 1–4.
[3] S. M. Fatemi, M. Sharifkhani, and A. Fotowat-Ahmady, “A unified
solution for super-regenerative systems with application to correlator-
based UWB transceivers,” IEEE Trans. Circuits Syst. I, Reg. Papers,
vol. 62, no. 4, pp. 1033–1041, Apr. 2015.
[4] Y. Zheng, Y. Zhu, C. W. Ang, Y. Gao, and C. H. Heng,
“A 3.54 nJ/bit-RX, 0.671 nJ/bit-TX burst mode super-regenerative UWB
transceiver in 0.18-μm CMOS,” IEEE Trans. Circuits Syst. I, Reg.
Papers, vol. 61, no. 8, pp. 2473–2481, Aug. 2014.
[5] F. X. Moncunill-Geniz, P. Pala-Schonwalder, and O. Mas-Casals,
“A generic approach to the theory of superregenerative reception,” IEEE
Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 1, pp. 54–70, Jan. 2005.
[6] P. Pala-Schonwalder, J. Bonet-Dalmau, F. X. Moncunill-Geniz,
F. D. Aguila-Lopez, and R. Giralt-Mas, “A superregenerative QPSK
receiver,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 1,
pp. 258–265, Jan. 2014.
[7] G. H. Ibrahim and A. N. Hafez, “An 8-PSK digital phase detection
technique for super-regenerative receivers,” in Proc. IEEE Int. Conf.
Electron., Circuits, Syst. (ICECS), Dec. 2015, pp. 240–243.
[8] P. Palà-Schönwälder et al., “Superregenerative reception of narrowband
FSK modulations,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 62,
no. 3, pp. 791–798, Mar. 2015.
[9] H. Meyr and G. Ascheid, Synchronization in Digital Communications.
Hoboken, NJ, USA: Wiley, 1990.
[10] F. X. Moncunill-Geniz and P. Pala-Schonwalder, “A DSSS superregen-
erative receiver with tau-dither loop,” in Proc. 7th Eur. Conf. Wireless
Technol., Oct. 2004, pp. 349–352.
[11] J. P. Nair, K. Bynam, Y. J. Hong, J. Kang, P. Dwarakanath, and
M. Choudhary, “Timing synchronization in super-regenerative receivers
with a single quench cycle per symbol,” in Proc. IEEE Int. Symp.
Circuits Syst. (ISCAS), Jun. 2014, pp. 738–741.
[12] S. Pasupathy, “Minimum shift keying: A spectrally efficient modulation,”
IEEE Commun. Mag., vol. 17, no. 4, pp. 14–22, Oct. 1979.
[13] F. Xiong, Digital Modulation Techniques. Norwood, MA, USA:
Artech House, 2006.
[14] IEEE Standard 802.15.4-2011 (Revision IEEE Std 802.15.4-2006), 2011,
pp. 1–314.
[15] E. Hosseini, “Synchronization techniques for burst-mode continu-
ous phase modulation,” Ph.D. dissertation, Univ. Kansas, Lawrence,
KS, USA, Feb. 2013. [Online]. Available: https://oatd.org/oatd/
record?record=handle%3A1808%2F12963
[16] D. A. Gudovskiy, L. Chu, and S. Lee, “A novel nondata-aided synchro-
nization algorithm for MSK-type-modulated signals,” IEEE Commun.
Lett., vol. 19, no. 9, pp. 1552–1555, Sep. 2015.
[17] A. N. D’Andrea, U. Mengali, and R. Reggiannini, “A digital approach
to clock recovery in generalized minimum shift keying,” IEEE Trans.
Veh. Technol., vol. 39, no. 3, pp. 227–234, Aug. 1990.
[18] A. A. D’Amico, A. N. D’Andrea, and U. Mengali, “Feedforward joint
phase and timing estimation with OQPSK modulation,” IEEE Trans.
Veh. Technol., vol. 48, no. 3, pp. 824–832, May 1999.
[19] K. Mueller and M. Müller, “Timing recovery in digital synchronous data
receivers,” IEEE Trans. Commun., vol. COM-24, no. 5, pp. 516–531,
May 1976.
[20] G. R. Danesfahani and T. G. Jeans, “Optimisation of modified Mueller
and Müller algorithm,” Electron. Lett., vol. 31, no. 13, pp. 1032–1033,
Jun. 1995.
[21] A. B. Carlson, Communication Systems: An Introduction to Signals and
Noise in Electrical Communication. New York, NY, USA: McGraw-Hill,
1986.
[22] E. Brigant and A. Mammela, “Adaptive threshold control scheme
for packet acquisition,” IEEE Trans. Commun., vol. 46, no. 12,
pp. 1580–1582, Dec. 1998.
[23] Raspberry-Pi. [Online]. Available: https://www.raspberrypi.org/products/
raspberry-pi-2-model-b/
[24] E. Costa-Molero, “Implementation of the data link layer for a super-
regenerative transceiver,” M.S. thesis, Jun. 2014.
[25] A. L. Riera et al., “A proof-of-concept superregenerative QPSK trans-
ceiver,” in Proc. 21st IEEE Int. Conf. Electron., Circuits Syst. (ICECS),
Dec. 2014, pp. 167–170.
[26] DE0-Nano Development and Education Board. [Online]. Available:
http://www.terasic.com.tw/cgi-bin/page/archive.pl?No=593
Alexis López-Riera received the Eng. Tec.
Telecommun. and Master in Electronic Eng.
degrees from the Universitat Politécnica de
Catalunya (UPC), Manresa and Barcelona,
Catalonia, Spain, in 2010 and 2013, respectively.
From 2010 to 2013, he was a part-time Assistant
Professor at the Department of Electronic System
Design and Programming, Manresa School of
Engineering, UPC, Manresa, Catalonia, Spain,
where he has been teaching Computer Sciences and
Digital Systems. He is currently working toward
the Ph.D. degree in the area of superregenerative receivers.
Francisco del Águila-López received the Eng.
Telecommun. and Ph.D. degrees from the Univer-
sitat Politécnica de Catalunya (UPC), Barcelona,
Catalonia, Spain, in 1996 and 2003, respectively.
He is currently a Lecturer at the Department of
Mining, Industrial and ICT Engineering, Manresa
School of Engineering, UPC, Manresa, Catalonia,
Spain, where he has been teaching circuit theory,
analog electronics, data transmission, and telematics
since 1997. He has also been involved in several
government and industry-funded research projects.
His research interests include switched circuits, nonlinear circuits, and radio-
frequency communication circuits design.
Pere Palá-Schönwälder (S’88–M’05) received
the Eng. Telecommun. and Ph.D. degrees from
the Universitat Politécnica de Catalunya (UPC),
Barcelona, Catalonia, Spain, in 1989 and 1994,
respectively. He is currently an Associate Professor
at the Department of Mining, Industrial and ICT
Engineering, Manresa School of Engineering, UPC,
Manresa, Catalonia, Spain, where he has been teach-
ing circuit theory, analog signal processing, com-
munications electronics, radio-frequency, and digital
electronics design since 1990. He is the contact
person of UPC’s Communication Circuits and Systems Research group and
has been the project leader of several government and industry-funded research
projects. His current research interests include computer-aided circuit design,
nonlinear circuits, and the design of low-power RF communication circuits.
Jordi Bonet-Dalmau (S’92–A’00–M’04) received
the Eng. Telecommun. and Ph.D. degrees from
the Universitat Politécnica de Catalunya (UPC),
Barcelona, Catalonia, Spain, in 1995 and 1999,
respectively. He is currently an Associate Professor
at the Department of Mining, Industrial and ICT
Engineering, Manresa School of Engineering, UPC,
Manresa, Catalonia, Spain. His current research
interests include steady-state and stability analysis
of nonlinear and distributed circuits.
Rosa Giralt-Mas received the Eng. Telecommun.
and Ph.D. degrees from the Universitat Politécnica
de Catalunya (UPC), Barcelona, Catalonia, Spain,
in 1995 and 2010, respectively. From 1993 to 1997,
she was a Telecommunications Consultant. She is
currently a Lecturer with the Department of Mining,
Industrial and ICT Engineering, Manresa School
of Engineering, UPC, Manresa, Catalonia, Spain,
where she has been teaching circuit theory, telecom-
munication systems engineering, and project man-
agement since 1997. She has also been involved
in several government and industry-funded research projects. Her current
research interests include the analysis and design of communication systems
and ICT project management.
F. Xavier Moncunill-Geniz received the Eng.
Telecommun. and Ph.D. degrees from the Universitat
Politécnica de Catalunya (UPC), Barcelona,
Catalonia, Spain, in 1992 and 2002, respectively.
He is currently an Associate Professor at the Depart-
ment of Mining, Industrial and ICT Engineering,
Manresa School of Engineering, UPC, Manresa,
Catalonia, Spain, where he has been teaching
circuit theory and analog electronics. His current
research interests include radio-frequency circuit
design, ultra-wideband communications, and
wireless sensor networks.
