A Mixed-Signal Demodulator for a Low-Complexity IR-UWB Receiver: Methodology, Simulation and Design by CREPALDI M et al.
05 August 2020
POLITECNICO DI TORINO
Repository ISTITUZIONALE
A Mixed-Signal Demodulator for a Low-Complexity IR-UWB Receiver: Methodology, Simulation and Design / CREPALDI
M; CASU M.R; GRAZIANO M.; ZAMBONI M. - In: INTEGRATION. - ISSN 0167-9260. - STAMPA. - 42(2009), pp. 47-60.
Publisher: Elsevier
DOI: 10.1016/j.vlsi.2008.07.005
Availability: This version is available at: 11583/1840495
Terms of use: openAccess. This article is made available under terms and conditions
as specified in the corresponding bibliographic description in the repository.
A Mixed-Signal Demodulator for a
Low-Complexity IR-UWB Receiver:
Methodology, Simulation and Design
Marco Crepaldi ∗, Mario R. Casu, Mariagrazia Graziano and
Maurizio Zamboni
Dipartimento di Elettronica, Politecnico di Torino,
Corso Duca degli Abruzzi 24, 10129, Torino, Italy
Abstract
This work presents an integrated 0.18µm CMOS 2-PPM demodulator based on a
switched capacitor network for an Energy Detection Impulse-Radio UWB receiver.
The circuit has been designed using a top-down methodology that allows the
designer to assess the impact of low-level non-idealities on system-level
performance. Through the use of a mixed-signal simulation environment,
performance figures have been obtained which helped evaluate the system-level
influence of the non-idealities of the most critical block. Results show that
the circuit can replace the ADC typically employed in Energy Detection
receivers and provides a virtually infinite equivalent quantization
resolution. The demodulator achieves 190 pJ/bit at 1.8V.
Key words: UWB Communications, Mixed-Signal Integrated Circuits, Design
Methodology, Energy Detection, 2-PPM Modulation.
∗ Corresponding author. Email address: marco.crepaldi@polito.it (Marco Crepaldi).
1 Introduction
The Impulse-Radio Ultrawideband (IR-UWB) technology is a promising solution
for short-range indoor applications. It is particularly suited for applications
aimed at connecting portable devices in Wireless Private Area Networks
(WPAN) and for low-power sensor networks with low computational demands,
reduced complexity transceivers and centralized control for multiple accesses.
Generally, transceivers are designed to have high bandwidth, low peak powers
at transmitters, low complexity and the flexibility of supporting different data
rates. The "carrier-less" transmission relies on short duration pulses which
satisfy FCC spectral requirements on both ultra-wide bandwidth occupation
and low power spectral density [1]. Thanks to these features, IR-UWB is
particularly suited to low-power applications in which extended battery life
is a fundamental requirement [2].
For this kind of application, low-complexity architectures must be employed
in UWB receivers. Typically two approaches are used: the coherent and the
non-coherent ones. Coherent receivers perform demodulation by correlating the
incoming UWB pulses with an internally generated waveform template.
Non-coherent receivers make no attempt to calculate the correlation of the
incoming pulse, and perform demodulation without any a-priori information
regarding the channel. Non-coherent receivers permit lower power consumption
and lower complexity than coherent receivers, with a slight penalty in the
Bit-Error-Rate (BER). This drawback is acceptable in short-range applications,
in which the energy saved at the receiver side outweighs the performance loss
and the power budget at the transmitter. For low data rates, coherent receivers
generally have an energy per bit 10 times higher than non-coherent receivers
[3]. Among the various non-coherent alternatives, the Energy Detection (ED)
approach is particularly interesting. Despite the 3 dB loss in BER performance
with respect to coherent receivers, the UWB modulated information is simply
recovered by evaluating the energy of the received pulses. Due to the nature of
ED schemes, receivers are insensitive to phase-dependent modulations, thus data
is transmitted according to time-based modulation schemes. Their low power
consumption is appealing in battery-powered short-range applications, in which
full CMOS integration plays a crucial role for the overall device cost.
In typical implementations of Energy Detection receivers, energy is calculated
after signal rectification by using Integrate-and-Dump (I&D) units. Such
units, realized as open-loop Gm−C integrators for achieving large bandwidths
[4], are typically followed by an Analog-to-Digital converter (ADC) and by a
digital back-end that permits demodulation. Performance is affected by the
ADC resolution and the features of the integrator unit; in addition to this,
the ADC represents one of the most power-hungry and silicon area-consuming
blocks [5]. Since IR-UWB communication systems operate by means of
time-domain modulations, it is possible to use ad-hoc solutions by reorganizing
the general architecture, thus eliminating the ADC and allowing low power
consumption. In the case of 2-PPM modulation, it is possible to replace the
Analog-to-Digital conversion stage with simpler blocks which compare the pulse
energies in the analog domain [6], [7], therefore avoiding any quantization
effect. With the aim of a low-power receiver and using a single comparator as
in [6], this work presents an integrated differential 2-PPM CMOS demodulator
formed by an open-loop Gm−C structure, called Bi-phase integrator. It
inherently provides Analog-to-Digital conversion without the use of any ADC,
features offset rejection and exhibits nearly the same error-rate performance
as an ideal Energy Detection receiver. The demodulator is composed of an
Operational Transconductance Amplifier (OTA) and a differential switched
capacitor network. The demodulator consumes 950µW and achieves 190 pJ/bit at
5Mbit/s.
The design and the simulation of the entire unit have been carried out with
a design methodology based on different abstraction levels, the development
of a proper VHDL-AMS simulation environment and the use of a mixed-signal
simulation tool, ADVanceMS (ADMS, Mentor Graphics). Description levels with
different degrees of accuracy allow the designer to assess the impact of each
abstraction refinement, and thus help trade off precision against simulation
time [8]. For the low-level descriptions, the circuit building blocks have been
designed in a mixed-mode UMC CMOS 0.18µm technology and simulated with SPICE
BSIM3 transistor models. The design methodology combined with the simulation
tool made it possible to uncover design weaknesses and thus to predict
performance accurately in the presence of block non-idealities.
The paper is organized in the following parts: Section 2 introduces the principle
of operation of the reduced complexity ED receiver which employs the Bi-
phase demodulator, and highlights its differences with respect to the ordinary
Energy Detection receivers. Section 3 introduces the successive refinement
step design methodology. Section 4 introduces the Bi-phase demodulator unit
and its main building blocks - the integrator and the comparator - explains
the typical design issues and clarifies the trade-off between performance and
low-power consumption. In addition, the section justifies the abstraction level
required for simulating each of the two units. Section 5 reports both functional
and system-level performance simulations. Finally, conclusions are drawn in
section 6.
2 Low-complexity energy detection receiver
IR-UWB transmission is based on the use of short duration baseband pulses,
on the order of one nanosecond, without the need of any carrier to provide
bandwidth shifts. The Power-Spectral-Density (PSD) is very low with respect
to narrowband and wideband modulations but the total transmitted power
can be considerably high because of the very high bandwidth. The baseband
pulses are phase-modulated, as in Binary Phase Shift Keying (BPSK), or
time-modulated, as in Pulse Position Modulation (PPM).
Typical Impulse Radio ED receiver front-ends include a Low-Noise-Amplifier
(LNA), a Squaring Unit ((·)²), an analog integrator and finally an A/D
converter, as figure 1 shows. Sophisticated operations other than simple
demodulation, like for instance synchronization, are typically done in the
digital domain by processing the raw data at the ADC output.

Fig. 1. Typical Energy Detection receiver. [Figure: block chain Antenna, LNA, (·)², I&D, A/D.]
A typical modulation scheme is the 2-PPM (Bi-phase Pulse-Position-Modulation)
in which the transmitted pulse is modulated according to its relative position
within a time frame, as shown in figure 2. If a '0' is sent, the pulse is
placed in the first half of a Pulse-Repetition-Interval (PRI), while in case of
a '1', the pulse is placed in the second one. At the receiver, the UWB signal,
rectified by the squarer, is integrated and dumped in the two halves of the PRI
by the I&D unit. The two analog values obtained represent the signal energies
associated with the two PPM phases. After A/D conversion, data is demodulated
by comparing the energies of the two PPM phases numerically.

Fig. 2. 2-PPM modulation. [Figure: the modulated signal m(t) and its square m(t)² over two PRIs carrying '0' and '1'; the integrals over the two PRI halves are compared to produce the 0/1 decision.]
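The decision rule described above can be illustrated with a minimal discrete-time sketch. It is not the receiver of this work: the pulse shape, the number of samples per PRI and the noise level are assumptions chosen only for illustration, and the names `make_frame` and `demodulate_2ppm` are hypothetical.

```python
import math
import random


def make_frame(bit, samples_per_half=100, pulse_len=10, amplitude=1.0,
               noise_std=0.05, rng=None):
    """Return one PRI of samples: pulse in the first half for 0, second half for 1."""
    if rng is None:
        rng = random.Random(0)
    frame = [rng.gauss(0.0, noise_std) for _ in range(2 * samples_per_half)]
    start = bit * samples_per_half + (samples_per_half - pulse_len) // 2
    for k in range(pulse_len):
        theta = 2.0 * math.pi * (k + 0.5) / pulse_len
        # Assumed pulse shape: one sine cycle under a raised-cosine window.
        window = 0.5 * (1.0 - math.cos(theta))
        frame[start + k] += amplitude * window * math.sin(theta)
    return frame


def demodulate_2ppm(frame):
    """Square, integrate each PRI half, and compare the two energies."""
    half = len(frame) // 2
    energy_0 = sum(x * x for x in frame[:half])   # energy in the '0' position
    energy_1 = sum(x * x for x in frame[half:])   # energy in the '1' position
    return 1 if energy_1 > energy_0 else 0


bits = [0, 1, 1, 0, 1, 0, 0, 1]
rng = random.Random(42)
decoded = [demodulate_2ppm(make_frame(b, rng=rng)) for b in bits]
print(decoded)
```

Only the sign of the energy difference matters for the decision, which is the property exploited later to replace the ADC with a single comparator.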
With the intent of reducing complexity and power consumption, in this work
we replace the ADC with a simple zero-threshold comparator by giving the
analog integrator the capability to provide a voltage whose sign determines
the information bit. We call this new block the Bi-phase demodulator; the
reason for this name will become clear shortly. The circuit is based upon
the charge redistribution principle.
The Bi-phase demodulator shown in figure 3 is composed of two parts, the
Bi-phase integrator and the comparator. The Bi-phase integrator recalls the
typical Gm-C integrator structures: it is composed of an open-loop
transconductor loaded with a capacitor contained in the integration network.
After this first conditioning part, which generates a modulation-dependent
analog voltage, the result is processed by the zero-threshold comparator, which
converts the demodulation voltage into a binary digital quantity.

Fig. 3. Energy Detection receiver with Bi-phase demodulator. [Figure: Antenna, LNA and (·)² followed by the Bi-phase demodulator, made of the Bi-phase integrator (OTA with offset Vos and currents Iout, Ios; integration network with switches Reset, Phase 0, Phase 1, Compare, Inhibit; voltages Vota, Vout) and the comparator (Save, Clamper, Latch) producing the 0/1 output.]
Unlike other works in which full receiver front-ends are presented, here the
contribution is focused on the design of a single unit of the demodulation
chain and on how its non-idealities affect system-level performance. With
respect to [7], in which baseband processing relies on a single-ended inverter,
here the demodulator is composed of a fully-differential Operational
Transconductance Amplifier inclusive of a common mode stabilization network.
The circuit, described in detail in section 4, is a mixed-signal device
because the analog voltage processing in the integration network of figure 3
is controlled by switching transistors. Therefore, to obtain the system-level
impact of a single block, a proper mixed-signal design methodology must
be employed in the various design stages.
3 Design methodology
Hardware description languages like VHDL and Verilog are typically employed
in digital design. With the introduction of VHDL-AMS, a superset of the
VHDL language, it became possible to employ both continuous-time and digital
descriptions in the same simulation, thus allowing true mixed-signal
simulations in which analog and digital parts are simulated concurrently. The
design and the simulation of the Bi-phase demodulator and of the receiver
front-end need not only a proper language but also a design tool through which
the description language can be "brought to life". An example is ADVanceMS
(ADMS, Mentor Graphics) which can simulate both VHDL-AMS and SPICE
descriptions. Interesting examples of the use of VHDL-AMS and the ADMS
tool are [9], [10]. The language combined with the tool allows the designer to
create an ad-hoc simulation environment in which the system functionalities
can be tested. In addition, the possibility of using VHDL-AMS and SPICE
descriptions in the same environment makes it possible to refine the block
descriptions from pure behavioral models down to circuit structural
descriptions (e.g. transistor-level). In other words, top-down methodologies
typical of digital design can also be applied to mixed-signal devices.

Fig. 4. Design flow phases using VHDL-AMS and ADMS. [Figure: Phase I (fully behavioural, P1), Phase II (physical blocks identification, P2) and Phase III (substitute-and-play, P3-S and P3-L) for the chain LNA, (·)², BP-I and comparator.]
We adopted the design methodology outlined in [11], where we proposed an
approach based on successive refinements. We took advantage of the inherent
partitioning properties of the VHDL-AMS language, that is, the possibility of
assigning to each part of the design the description of its interface with the
outer world, i.e. its entity, and the description of the reaction of its inner
parts to the signals listed in the entity as well as to its internal states,
i.e. its architecture.
Our approach consists of three steps: Phase I - Architectural description, Phase
II - Partitioning and Phase III - Substitute-and-play. In figure 4, the different
block shapes in Phase II and III identify the entity/architecture partitioning.
The block labeled as BP-I indicates the Bi-phase integrator.
During Phase I the general system architecture is conceived. This functional
description does not necessarily assign each block of the architecture a specific
task but rather considers the system as a whole. In addition, it is also possible
to verify consistency with models written in other high-level languages such
as Matlab [12].
During Phase II, the architecture is partitioned into single blocks, each with
a proper VHDL-AMS entity and architecture. It is possible to partition the
whole front-end into smaller units such as the LNA, the squarer and the
Bi-phase demodulator. At this level electrical compatibility among the units
must be provided. Each unit has its own terminals which identify inputs, outputs
and power supply nets. The units are still behaviorally modeled but the
description is detailed enough to let the designer include the first
macroscopic non-ideal parameters in the models, like saturation, slew rates,
input and output impedances, et cetera. Since the testbenches are inherited
from Phase I, it is possible to evaluate the effect of these non-idealities on
the system performance. It is now possible to investigate the macroscopic
front-end requirements for a single block (e.g. gain, linearity, bandwidth) and
to derive constraints for the successive circuit-level design phases. In
addition, it is possible to determine the most critical blocks through
simulated performance figures or electrical-level considerations.
During Phase III, the description of one or more blocks is refined. For
example, the Bi-phase integrator, which was described in Phase II with VHDL-AMS
equations, can be substituted with a transistor-level description. During this
phase, the component instantiations are replaced without changing the
upper-level VHDL-AMS source code. This "painless" substitution, which we call
Substitute-and-Play, is allowed by the partitioning done in Phase II since
electrical terminals do not change once defined (the block interfaces, that is,
the "entities", are not modified). The substitution can also be applied to a
subset of the blocks that the designer considers relevant, in order to
understand its effect on the system. Typically, one or two blocks are replaced
at a time. If the layout-extracted SPICE netlist is available, or the effect of
parasitics is relevant for the system-level performance evaluation, the
designer can import such a low-level description of a block into the simulation
environment. If the results obtained in Phase III do not differ from those
obtained during Phase II, it is possible to simulate the system by using the
simpler higher-level model (which saves simulation time) and to focus on the
refinement of the front-end units which have not been considered yet.
In the remainder of the paper we will use the following notation: P1 - Concep-
tual Phase I, P2 - Behavioral model - Phase II, P3-S - SPICE level - Phase
III, P3-L - Layout level - Phase III.
4 Integrated demodulator
4.1 Bi-phase integrator
This subsection introduces the Bi-phase integrator unit. It is organized in two
parts: The first one deals with the principle of operation whereas the second
one focuses on the transistor-level design of the device.
Principle of operation
Figure 5 details the Bi-phase integration network of figure 3. The integrator
includes the transconductor, which consists of an Operational
Transconductance Amplifier (OTA) [13], and a switched capacitor network [14].
The former transforms the input voltage variations Vin into current Iout, while
the latter controls current injection in two identical integration capacitors
CL0 and CL1 associated with the two PPM phases. The overall parasitic
capacitance due to the integration network and the OTA output stage can be
divided in two parts: the differential and the common mode capacitance. For
this preliminary analysis we will consider only the differential capacitor,
modeled as Cp, since the common mode parts have no effect. The Bi-phase
integrator operates with five control signals: Phase 0, Phase 1, Compare, Reset
and Inhibit. The first three control the integration of the incoming UWB
pulses, the fourth one resets the state of the integrating capacitors and
finally the fifth one shorts the differential output of the transconductance
amplifier. As explained later, the whole demodulation process is based on the
charge redistribution principle.

Fig. 5. Principle of the Bi-phase integrator. [Figure: OTA with output current Iout and output voltage Vota, parasitic capacitor Cp, integration capacitors CL0 and CL1, output Vout, and the switches Inhibit, Phase 0, Phase 1, Compare and Reset.]
When the device is idle, signal Inhibit forces the differential output terminals
of the transconductor to the same potential, thus shorting parasitic capacitor
Cp before starting the integration of the first PPM phase. When the first
integration starts, Inhibit is deactivated and signal Phase 0 is asserted,
forcing current Iout into the equivalent capacitance CL0+Cp. When the first
integration phase finishes, signal Phase 0 is deactivated and Inhibit is
asserted again. As a result, the charge in capacitor CL0, now isolated from the
rest of the circuit, remains (ideally) constant. When the second integration
starts, signal Inhibit is deactivated and Phase 1 is asserted, therefore
forcing current into CL1 + Cp. In the end, the charge in CL0 and in CL1 is
proportional to the input signal integrated during the respective phase.
After the deactivation of Phase 1 and the assertion of signal Inhibit, the final
demodulation is possible by activating signal Compare. The charge
redistribution principle is responsible for setting the sign of output voltage
Vout according to the information bit. If the charge in CL0 is higher than the
charge in CL1, the voltage Vout will be positive, otherwise Vout will be
negative. The comparator evaluates the information bit by comparing this
voltage to zero. Finally, signal Reset zeroes the charge in CL0 and CL1 and
another demodulation cycle is possible.
The device inherently provides offset rejection thanks to its fully balanced
structure. In fact, if only the offset voltage VOS is present at the OTA input,
the resulting offset current IOS is fed alternately into CL0 and CL1: after the
final charge redistribution the resulting output voltage is ideally zero.
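The cancellation of the offset term can be checked with a minimal numerical sketch of the ideal case (Cp reset by Inhibit before each phase, equal integration windows). The transconductance and the 200 fF integration capacitor are taken from the design described later; the parasitic capacitance, the offset current, the time step and the pulse are assumptions. The sketch uses the convention Vout = ½(V1 − V0) of the equations given later in this section; the actual polarity depends on how the capacitors are connected to the comparator.

```python
def integrate_phase(samples, dt, gm, i_os, c_total):
    """Voltage built up on CL + Cp while signal plus offset current is injected."""
    charge = sum((gm * v + i_os) * dt for v in samples)
    return charge / c_total


def biphase_output(phase0, phase1, dt, gm, i_os, c_l, c_p):
    """Ideal output after charge redistribution between two equal capacitors."""
    c_total = c_l + c_p
    v0 = integrate_phase(phase0, dt, gm, i_os, c_total)
    v1 = integrate_phase(phase1, dt, gm, i_os, c_total)
    return 0.5 * (v1 - v0)


DT = 0.1e-9      # assumed time step: 0.1 ns
GM = 280e-6      # OTA transconductance reported in this work
C_L = 200e-15    # integration capacitor reported in this work
C_P = 50e-15     # assumed differential parasitic capacitance
WINDOW = 300     # samples per phase, i.e. a 30 ns integration window

# Assumed squared pulse: 50 mV for 2 ns, located in the second PPM phase.
pulse = [0.0] * 140 + [0.05] * 20 + [0.0] * 140
empty = [0.0] * WINDOW

clean = biphase_output(empty, pulse, DT, GM, 0.0, C_L, C_P)
with_offset = biphase_output(empty, pulse, DT, GM, 1e-6, C_L, C_P)
print(clean, with_offset)
```

Because the same offset charge IOS·T is added to both capacitors, the two printed values agree up to rounding, provided the two integration windows have equal duration.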
The beneficial effect of the Inhibit switch can be better explained by analyzing
in detail the operation of the device. The top of figure 6 sketches the
activation and deactivation of the control signals. When the receiver acquires
modulated data, the demodulator operates in a "steady state" mode and the
integration control signals are asserted periodically. The figure shows
different conditions (from A to F) according to the current operation phase and
makes it possible to follow the voltage variations across Cp and the
integration capacitors CL0 and CL1. With this scheme it is possible to focus on
the initial and final conditions across Cp, CL0 and CL1 and to understand how
to combine them to obtain the mathematical expression representing the
influence of parasitics on Vout. Condition G represents the final charge
redistribution after the two demodulation phases have been completed.
We define Ti as the idle time between the deactivation of Phase 0 and the
activation of Phase 1 and vice-versa.
In addition, we define the voltages $V_{iph0}^{p,(n)}$, $V_{fph0}^{p,(n)}$,
$V_{iph1}^{p,(n)}$, $V_{fph1}^{p,(n)}$ across the capacitance Cp (superscript
p), where n indicates the current demodulation period.
The equations state how the initial and final conditions on the voltage across
Cp affect CL0 and CL1 at the beginning and at the end of each demodulation
phase. Subscripts i and f indicate the initial and final conditions across Cp
during the idle period, for Phase 0 (ph0) and Phase 1 (ph1) operation states,
respectively.
To take into consideration the charge transfer it is also necessary to define the
quantities $V_{iph0}^{(n)}$, $V_{fph0}^{(n)}$, $V_{iph1}^{(n)}$ and
$V_{fph1}^{(n)}$, which represent the voltages across both CL0 and CL1 during
the activation of the respective integration switches.
When signals Phase 0 and Phase 1 are asserted, the OTA integrates the UWB
signal, generating the information required for demodulation:

$$
V_0^{(n)} = \frac{1}{C_p + C_{L0}} \int_{T_0^{(n)}} G_m V_{in}(t)\,dt
\quad\text{and}\quad
V_1^{(n)} = \frac{1}{C_p + C_{L1}} \int_{T_1^{(n)}} G_m V_{in}(t)\,dt
$$

where T0 and T1 indicate the integration intervals, of equal duration, for
Phase 0 and Phase 1, respectively.

Fig. 6. Circuit operation scheme during demodulation. [Figure: timing of the squared UWB pulse Vin(t) and of the control signals Phase 0, Phase 1, Compare, Reset and Inhibit over periods (n−1) and (n), with integration intervals T0 and T1 separated by idle times Ti; conditions A to F show the voltages across Cp, CL0 and CL1, and condition G shows the final charge redistribution giving Vout(n).]

With these hypotheses it is possible to obtain equations 1-4, which model the
operation of the device for the first of the two PPM integrations at
demodulation period n in case the Inhibit signal is not used.
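$$
V_{fph0}^{p,(n)} = \frac{I_{OS} T_i}{C_p} + V_{iph0}^{p,(n)} + N_{fph0}^{p,(n)} \tag{1}
$$

$$
V_{iph0}^{(n)} = \frac{C_p}{C_p + C_{L0}}\, V_{fph0}^{p,(n)} \tag{2}
$$

$$
V_{fph0}^{(n)} = V_{iph0}^{(n)} + V_0^{(n)} \tag{3}
$$

$$
V_{iph1}^{p,(n)} = V_{fph0}^{(n)} \tag{4}
$$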
For the second PPM phase, it is possible to obtain a similar set of equations
(not reported for the sake of brevity). At the activation of signal Compare,
considering that the integration capacitors are equal (CL0 = CL1 = CL), the
output voltage after the final charge redistribution is given by
$V_{out}^{(n)} = \frac{1}{2}\{V_{fph1}^{(n)} - V_{fph0}^{(n)}\}$.
If we combine the two sets of equations, we can express $V_{out}^{(n)}$ as a
function of $V_{out}^{(n-1)}$ (equation 5).
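$$
\begin{aligned}
V_{out}^{(n)} = \frac{1}{2}\Big\{ & V_1^{(n)} - \frac{C_L}{C_p + C_L}\, V_0^{(n)} \\
& + \frac{C_p}{C_p + C_L}\left\{ N_{fph1}^{p,(n)} - N_{fph0}^{p,(n)} - V_1^{(n-1)} \right\} \\
& + \frac{C_p^2}{(C_p + C_L)^2}\left\{ V_{out}^{(n-1)} - N_{fph1}^{p,(n-1)} + N_{fph0}^{p,(n-1)} \right\} \Big\}
\end{aligned}
\tag{5}
$$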
In the case in which no reset switches across Cp are employed, the differential
output voltage for the n-th integration, $V_{out}^{(n)}$, depends on the
stochastic processes $N_{fph0}^{p}$ and $N_{fph1}^{p}$ both for the n-th and
the (n−1)-th demodulation phase. If Cp is zero, it is easy to demonstrate that
the demodulation voltage is ideal, that is

$$
V_{out,ideal}^{(n)} = \frac{1}{2}\left\{ V_1^{(n)} - V_0^{(n)} \right\}.
$$
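As a numerical illustration, equation 5 can be evaluated directly and compared with the ideal expression. The sketch below is a plain transcription of equation 5; the function and argument names are hypothetical, and the voltages and the parasitic capacitance used in the example are assumptions.

```python
def vout_eq5(v1, v0, c_p, c_l, n_f1=0.0, n_f0=0.0,
             v1_prev=0.0, vout_prev=0.0, n_f1_prev=0.0, n_f0_prev=0.0):
    """Output voltage of equation 5 (no reset of Cp between phases)."""
    ratio = c_p / (c_p + c_l)
    return 0.5 * (
        v1 - (c_l / (c_p + c_l)) * v0
        + ratio * (n_f1 - n_f0 - v1_prev)
        + ratio ** 2 * (vout_prev - n_f1_prev + n_f0_prev)
    )


def vout_ideal(v1, v0):
    """Ideal demodulation voltage, obtained from equation 5 with Cp = 0."""
    return 0.5 * (v1 - v0)


C_L = 200e-15   # integration capacitor used in this work
C_P = 50e-15    # assumed parasitic capacitance

# Assumed example: a '1' received in both the previous and the current period.
ideal = vout_ideal(0.1, 0.0)
no_parasitic = vout_eq5(0.1, 0.0, 0.0, C_L, v1_prev=0.1, vout_prev=0.05)
with_parasitic = vout_eq5(0.1, 0.0, C_P, C_L, v1_prev=0.1, vout_prev=0.05)
print(ideal, no_parasitic, with_parasitic)
```

With Cp = 0 the previous-period terms vanish and the result coincides with the ideal value, whereas a non-zero Cp makes the output depend on the history of the received data.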
To obtain a similar effect in the presence of parasitics, it is sufficient to
reset the charge accumulated in Cp with signal Inhibit before a new phase
starts. In this case, $V_{iph0}^{p,(n)} = V_{fph1}^{p,(n)} = 0$: that is, using
the reset switches across Cp, the obtained output voltage is ideal.
In both cases, the equations presented show how the symmetric structure of
the circuit completely eliminates the effect of the offset contribution VOS. In
fact, the final output voltage does not depend on current IOS. In summary, the
possibility of resetting the charge in Cp before starting a new demodulation
and the offset rejection capabilities allow almost ideal demodulation
performance to be obtained.
Design
Figure 7 presents the overall schematic of the Bi-phase integrator. It shows the
internal Operational Transconductance Amplifier structure and the integration
network. The circuit is fully differential and the OTA input stage consists
of a source-follower differential configuration. Current variations at the input
are mirrored in the output stage through a MOSFET configuration similar
to a current mirror. The technology employed here is a mixed-mode CMOS
0.18µm process. To enhance the over-drive of some of the transistors, the
Low-threshold Voltage (LV) process option has been employed.

Fig. 7. Schematic of the Bi-phase integrator. [Figure: OTA with transistors M1 to M10 (LV devices marked), bias voltages, nodes VA and VB, input Vin, output current Iout and outputs outp/outm; CMFB stage with output cmfbout; integration network with capacitors CL0 and CL1 and transmission-gate switches (MP, MN) driven by phase0, phase1, inhibit, reset and compare, producing Vout.]
The amplifier includes auto-biasing circuits and a simple Common Mode Feed-
back Network (CMFB) made up of a differential stage only. On the one hand,
as clarified in [15], the use of a common mode stabilization network is manda-
tory for integrators employing open-loop transconductors: Thus a slight in-
crease of the overall device power consumption is unavoidable. On the other
hand, no precise control of output voltages is necessary in this case because
demodulation is based on a relative voltage comparison. For the same reason,
temperature drifts, aging and voltage supply variations are reflected in the
two integrated voltages V0 and V1 in the same way: As a result, the OTA does
not require any transconductance tuning [16].
The supply voltage is 1.8V and the common mode input bias is 0.9V. The
dynamic input and output ranges were limited to 0V-260mV and ±600mV
respectively in order to limit power consumption as much as possible [15]. The
integrator has been simulated at different temperatures up to 90◦C and for
different corner conditions: in the worst conditions the equivalent Gm−C
integrator gain, composed of the OTA and a load capacitor, decreases by
approximately 5 dB with respect to the nominal value (about 20 dB at 30◦C,
typical process), and the load capacitors exhibit greater self-discharge.
Notwithstanding this, demodulation performance is only marginally affected: a
gain fluctuation of about 5 dB does not significantly corrupt the final
demodulation voltage because the process relies on a relative comparison.
An important point is the OTA differential transconductance. The transistor
aspect ratios are such that M2 = M4, M3 = M1, M7 = M8 and M9 = M10.
If we analyze the small-signal equivalent model, neglecting drain-source
resistances, it is possible to obtain a closed form of the equivalent OTA
differential transconductance Gm,

$$
G_m = \frac{g_{m2}}{2\left(1 + \dfrac{g_{m9}}{g_{m7}} + \dfrac{g_{mb7}}{g_{m7}}\right)} \tag{6}
$$

where gm2, gm7, gm9 are the equivalent transconductances of M2, M7 and M9,
respectively. The quantity gmb7 is the body-effect transconductance of M7. The
gain can be improved by increasing the aspect ratio of M2 and by reducing the
W/L of M9 with respect to M7. As the formula shows, the only body-effect
transconductance that affects the gain is that of M7. This is not surprising
because the source terminals of the other transistors (i.e. M2, M4, M9 and M10)
are not connected in a differential stage fashion. Although the input and
output equivalent circuits cannot be formally considered as differential
stages, voltages VA and VB shown in figure 7 do not vary with the differential
input signal. This results in a "balanced" simplification of such effect on
both left and right branches of the circuit. In this work Gm is 280µS at 30◦C,
with a resistive load of 10 kΩ at a frequency of 100 kHz.
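The dependence of Gm on the individual transconductances in equation 6 can be explored with a short sketch. The small-signal values below are hypothetical and are not the operating point of the designed OTA; they only illustrate the trends stated above.

```python
def ota_gm(gm2, gm7, gm9, gmb7):
    """Equivalent differential transconductance according to equation 6."""
    return gm2 / (2.0 * (1.0 + gm9 / gm7 + gmb7 / gm7))


# Hypothetical small-signal parameters, in siemens.
GM2, GM7, GM9, GMB7 = 1.0e-3, 1.0e-3, 0.5e-3, 0.2e-3

gm_nominal = ota_gm(GM2, GM7, GM9, GMB7)
gm_smaller_m9 = ota_gm(GM2, GM7, 0.5 * GM9, GMB7)   # M9 W/L reduced
gm_larger_m2 = ota_gm(2.0 * GM2, GM7, GM9, GMB7)    # M2 W/L increased
print(gm_nominal, gm_smaller_m9, gm_larger_m2)
```

Under these assumptions both a smaller gm9 and a larger gm2 raise Gm, in agreement with the sizing guidelines given above.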
By neglecting parasitic MOSFET capacitances and taking into account the
drain-source resistances of the output stage, it is also possible to calculate
the overall transfer function of the OTA, $I_{out}/V_{in}$, which depends on a
generic load impedance ZL,

$$
\frac{I_{out}}{V_{in}} = -\frac{g_{m2}}{2\left(1 + \dfrac{g_{m9}}{g_{m7}} + \dfrac{g_{mb7}}{g_{m7}}\right)\left(1 + \dfrac{Z_L}{r_{ds2}} + \dfrac{Z_L}{r_{ds1}}\right)} \tag{7}
$$

where rds2 and rds1 are the drain-source resistances of M2 and M1,
respectively. The load impedance ZL models the integration capacitances
inclusive of parasitics. It is evident that if $Z_L = \frac{1}{s(C_L + C_p)}$,
the output function $V_{out} = Z_L I_{out}$ presents a low-frequency pole
which, in the ideal case (that is, rds1, rds2 → ∞), should be at s = 0. The
equivalent output impedance of the OTA, which depends on the above drain-source
resistances, therefore sets the cut-off frequency of the resulting first-order
pole.
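Substituting $Z_L = 1/(s(C_L + C_p))$ in equation 7 gives $V_{out}/V_{in} = -G_m / (s(C_L + C_p) + 1/r_{ds1} + 1/r_{ds2})$, that is, a pole at $f_p = (1/r_{ds1} + 1/r_{ds2}) / (2\pi (C_L + C_p))$. The following sketch evaluates this expression and the resulting loss of a lossy integrator with respect to an ideal one for a constant input. The drain-source resistances and the parasitic capacitance are hypothetical values, and the constant-input case is an assumption used only to give an order of magnitude.

```python
import math


def pole_frequency(r_ds1, r_ds2, c_total):
    """First-order pole of Vout/Vin, in Hz, for a capacitive load c_total."""
    g_out = 1.0 / r_ds1 + 1.0 / r_ds2
    return g_out / (2.0 * math.pi * c_total)


def lossy_to_ideal_ratio(window, f_pole):
    """Lossy over ideal integrator output for a constant input lasting `window` s."""
    tau = 1.0 / (2.0 * math.pi * f_pole)
    x = window / tau
    return (1.0 - math.exp(-x)) / x


C_TOTAL = 200e-15 + 50e-15   # CL from this work plus an assumed Cp
R_DS = 141e3                 # hypothetical drain-source resistance of M1 and M2

f_p = pole_frequency(R_DS, R_DS, C_TOTAL)
ratio_5ns = lossy_to_ideal_ratio(5e-9, f_p)
ratio_30ns = lossy_to_ideal_ratio(30e-9, f_p)
print(f_p, ratio_5ns, ratio_30ns)
```

The ratio decreases as the integration window grows, which illustrates why a finite output resistance bounds the usable integration time.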
This pole, located at about 9MHz in the transfer function, limits the maximum
integration time. Simulations show that integration windows longer than about
30 ns would cause unacceptable losses in the computed energies. The aspect
ratios of the MOSFETs at the OTA output stage affect the gain and bandwidth of
the transconductance amplifier: an increase of W/L has the effect of increasing
the OTA gain and thus compensates for the various drops in the output
switching network, but at the same time increases the frequency of the above
mentioned pole. Moreover, the transistor sizing significantly influences other
device characteristics: an increase of W/L mitigates distortion and increases
gain but leads to a higher power consumption.
We have obtained the AC response of the OTA loaded by the integration
capacitors by means of simulations. The results show that the integration
behavior is guaranteed up to several GHz. A second pole located at such a high
frequency does not affect the demodulation performance because the bandwidth of
the useful signal at the squarer's output is within the integration frequency
range.
Fortunately, non-coherent Energy Detection receivers do not require extremely
precise pulse processing, thus allowing low W/L values which help keep power at
its lowest. Nonetheless, excessive distortion is certainly unacceptable. Table 1
reports the OTA Total-Harmonic-Distortion (THD) evaluated by varying the signal
amplitude as suggested in [17]. The fundamental frequency is 100MHz and
harmonics up to the 40-th order have been considered in the analysis, ranging
from 100MHz to 4GHz. Results show that for large input signal magnitudes
the THD increases up to -20 dB. Although the THD is worse than in [17], here
extremely low THD values are not required because the OTA processes a
signal which the squarer has already distorted "on purpose". Moreover, the
demodulation consists, as discussed before, in a relative energy comparison
that ends with a "hard" decision about the information bit, namely whether the
larger pulse energy is in the first or in the second half of the PRI. The
distortion of the squared signal produced by the OTA will not appreciably
degrade the results (unless of course the signal level is so high that the OTA
stages saturate). Simulation results validating this property are given in
section 5.
Table 1
Transconductor THD for different differential inputs
Vin (mVp) 50 100 150 200 250 300 350 400
THD (dB) -26 -24.5 -23 -22.5 -22 -21.5 -20.5 -20
The demodulation performance is influenced by any asymmetry of the
differential architecture and particularly of the output-to-ground parasitics
in the two switching branches. The integration network avoids this problem
by intrinsically providing the symmetry required to keep these contributions
balanced. Moreover, each of the two integration capacitors CL0 and CL1 is
implemented as a pair of anti-parallel MIM capacitors. The integration
capacitors are about 200 fF each.
The integration switches, as shown in figure 7, consist of transmission gates
that ensure a very low voltage drop when activated. Their aspect ratios (in this
work on the order of 20) must be chosen to minimize the effects on the
integration gain: As W/L increases, the drain-source resistance decreases but
the parasitics increase and may considerably reduce the integrator gain.
Table 2 summarizes the transistor aspect ratios for the proposed
solution. To minimize short-channel effects, lengths have been kept above the
minimum values, except for the LV transistors. The MP and MN aspect ratios
refer to the p-MOS and n-MOS of the transmission gates.
Table 2
Some of the aspect ratios of the MOSFET employed in the Bi-phase integrator.
MOSFET        M1         M2         M3         M4         M5         M6
W/L (µm/µm)   11.2/0.24  2.4/0.24   11.2/0.24  2.4/0.24   40/10      45/5
MOSFET        M7         M8         M9         M10        MP         MN
W/L (µm/µm)   7.2/0.24   7.2/0.24   2.4/0.24   2.4/0.24   4.8/0.24   4.8/0.24
One last remark is related to mismatch. Since switching transistors are sus-
ceptible to mismatch, and thus inject charge asymmetrically in the load ca-
pacitors, the output voltage can be affected by offsets.
4.2 Low-offset comparator
In order not to waste the robustness of the integrator against offset, the com-
parator which follows the Bi-Phase integrator must have excellent offset prop-
erties as well. Considering the wide literature about analog comparators and
the importance of this unit for this work, we will start this subsection by revis-
iting some of the most important design issues from the Bi-phase demodulator
point of view.
Preliminary analysis
Analog comparators are typically found in Analog-to-Digital converters, in
which low power, high sample rates and low offset represent the major design
challenges. Several circuit topologies can be found in the literature, but the
majority of them are based on two main building blocks, the preamplifier and
the latch [18]. The preamplifier, a transconductor [19], amplifies the input
signal for the subsequent regeneration. The latch (a negative resistance or
regenerative network) generates a full-swing digital signal through positive
feedback. Finally, some switches, whose position and number depend on the
circuit topology, activate and deactivate the regenerative network and the
preamplifier according to the device operation phases.
The two main sources of non-ideality in analog comparators are input-referred
offset and kickback noise. The former is due to the mismatch properties of MOSFET
transistors [20] and to imperfect routing in the interconnection lines
of the layout [21]. For this reason, in order not to neglect important geometri-
cal effects, in our case it is necessary to include post-layout extracted netlists
of the comparator in the simulation environment. The relevance of the offset
problem has been shown in [22]: For a 0.6µm CMOS process, input-referred
offset of a regenerative stage can vary from about -10 to 10mV. Instead, the
kickback noise is a phenomenon which depends on the capacitive coupling
between the latch output and the preamplifier input [18]. This phenomenon
is relevant for large equivalent output resistances of the preamplifier. The
challenge in the present design is to keep offset contribution low rather than
reducing kickback noise because the equivalent series resistance at the output
of the Bi-phase integrator is reasonably small.
Depending on the type of preamplifier and latch employed in the design, dif-
ferent classes of comparators can be identified: the Static or Class-A, the
Class-AB and the Dynamic Latch comparators [18]. The Class-A compara-
tors are composed of a linear preamplifier and regenerative latch in cascaded
configuration. Their power consumption is high because the preamplifier is al-
ways active during the whole comparison process. Typically, the regeneration
process is slow due to the presence of two poles in the transfer function.
Class-AB latched comparators are faster because the preamplifier differential stage
output is directly connected to the latch output formed by two cross-coupled
inverters. These circuits typically have a single pole in their transfer function
and the lowest power consumption with respect to the other two classes.
Various circuit level techniques are employed to eliminate offset. The Input-
Offset-Storage (IOS) and Output-Offset-Storage (OOS) techniques are the
most used [23], [24]. They consist of storing the offset voltage in a capaci-
tor through the use of dedicated switches. The capacitors emulate a voltage
source corresponding to the offset voltage at the input or at the output of the
preamplifier. OOS technique is used to store the input-referred offset of the
latch while the IOS one is used to store both the latch and the preamplifier
offsets. Other offset reduction techniques rely on digitally controlled circuitry
which makes use of Digital-to-Analog converters and programmable capacitive
loads [25]. These approaches compensate the offset of the regenerative network
through an unbalanced capacitive trimming at the two latch output nodes.
In this work, the offset mitigation is achieved by employing three techniques:
the use of an analog preamplifier, the use of a clamping switch in the
regenerative network and the use of decoupling switches to insulate the
differential input from the preamplifier. While the first technique is typically
used in every low-offset comparator, the second technique, derived from the
considerations in [21], helps decrease the switching voltage of the latch by
forcing the regenerative inverters to saturation. The third technique, which
works in combination with the second one, lets the circuit account for the
input-referred offset of the latch during the first operating phases. We did not
employ any IOS or OOS technique, in order not to increase the circuit
complexity and the number of control signals required for circuit operation.
Comparison time is not particularly relevant because the device operates slowly,
thus we have not considered specific techniques like the use of inductive loads
in the regenerative network [19]: The time between the comparison phase of the
Bi-phase integrator and the successive demodulation phase is about 50 ns.
Finally, since a single comparator is required in this design, it is possible to
relax the power consumption constraints and use a hybrid circuit topology which
makes it easy to apply the offset reduction techniques.
Principle of operation
The adopted circuit topology is a hybrid between a class-A and a class-AB
comparator (figure 8): The preamplifier, always active, and the latch are
decoupled as in a class-A comparator, and the regenerative network is composed
of two cross-coupled inverters as in a class-AB one. The preamplifier stage is
composed of a differential stage connected to the input terminals by two
insulation switches, M1 and M2. The M3 switch forces to zero the differential
input of the preamplifier, which is insulated from the latch through two series
switches, M14 and M15. The output voltage of the regenerative network is frozen
by the switch M20. Switches M16 and M17 activate and deactivate the latch
according to the device operation phase. The comparator operation is organized
in three steps: pre-comparison, pre-amplification and comparison. The
pre-comparison phase mitigates the offset of the preamplifier and of the
regenerative network, the pre-amplification phase amplifies the small input
voltage for the latch, and the comparison phase performs the full-swing signal
regeneration. To reuse the signals of the Bi-phase integrator, properly
combined together, the device uses three additional controls: save, latch
and clamper. These can be generated from signals inhibit and compare as
save = inhibit, latch = inhibit + compare and clamper = compare.
During pre-comparison, signal save is deactivated and the preamplifier is
frozen by M3. Signal clamper is activated to force the comparator output
to the switching voltage through M20, and signal latch is activated as well
to turn on the latch and force the inverter transistors M18, M19, M21 and M22
to saturation. During this phase the latch and the preamplifier are shorted
by transistors M14 and M15. The clamper switch helps lower the input-referred
offset of the latch due to unbalanced routing interconnections and to
mismatch. Ideally, the offset contribution is reduced if the initial voltage in
the latch approximates the switching voltage of the two inverters (see
footnote 1) [26]. Since M14 and M15 force the same voltage between the
preamplifier and the latch output, the transistor operating point is about the
switching voltage, thus part of the offset contribution of both is lowered.
During the pre-amplification phase, signals latch and save are deactivated and
the differential input signal across Inp and Inn is applied to the differential
stage input. The preamplifier, whose response depends on the time constant
associated with its pole, amplifies the signal. In the meantime, the output
nodes are forced to the same voltage through signal clamper and the latch is
turned off, so as not to corrupt the preamplifier output voltage once enabled
through M14 and M15. During the comparison phase, signal clamper is deactivated
and signal latch is asserted at the same time. The latch is thus activated and,
in the end, the full-swing regenerated output is provided across terminals
out_rp and out_rm.
Fig. 8. Schematic of the low-offset comparator (transistors M1 to M22; blocks: Preamplifier, CMFB, Regenerative network; inputs Inp and Inn; outputs out_rp and out_rm; control signals Save, Latch and Clamper; bias terminals b_1 and b_2; supply Vdd; LV label)
Design
The clamper switch in the regenerative network consists of low-threshold-voltage
(LV) transistors to ensure a very low equivalent resistance when activated.
In this case we did not employ transmission gates, in order not to increase the
capacitance of the output nodes, which would significantly reduce the circuit
speed. The other switches employ transistors with standard threshold voltage.
The preamplifier takes the bias voltages from the same biasing network of the
Bi-phase integrator through terminals b_1 and b_2. Its gain must be kept high
enough to counter the effects of process corners and temperature variations. To
guarantee sufficient amplification, it is necessary to track the differential
input for longer than the time constant of the preamplifier pole. In our case,
simulations show that 10 ns are enough to ensure a gain of about 20. Since the
preamplifier gives the highest offset contribution, the dimensions of the
matched transistors M6 and M7 have been kept large. Typically, offset voltage is
due to threshold voltage variations and to the position of the transistors on
the die. Since the first aspect is much more relevant, the offset can be
modeled as $\sigma_{V_{off}} \simeq A_{VT}/\sqrt{WL}$, where $A_{VT}$ is on the
order of 5mV·µm for a typical 0.18 µm process [27]. This figure also agrees with
the matching characterization reports of our technology. In this design, the
dimensions of transistors M6 and M7 lead to an offset of
$\sigma_{V_{off}} \simeq 2.5$mV. By estimating the other offset contributions we
can derive a total value of about 3mV.

Footnote 1: This quantity is defined as the voltage at which the pMOS and the
nMOS of the two inverters are perfectly saturated.

Fig. 9. Comparator layout
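The offset estimate can be checked numerically with the short sketch below. It takes W = 4 µm and L = 1 µm for M6 and M7, a reading of Table 3 that reproduces the quoted 2.5mV, and A_VT = 5mV·µm. The root-sum-square combination used to back out the remaining contributions is an assumption of this example (independent offset sources); the text only states the 3mV total.

```python
import math

A_VT_MV_UM = 5.0  # mV*um, order of magnitude quoted in the text


def sigma_offset_mv(width_um, length_um, a_vt=A_VT_MV_UM):
    """Threshold-mismatch offset of a matched pair: A_VT / sqrt(W * L)."""
    return a_vt / math.sqrt(width_um * length_um)


# M6/M7 dimensions as read from Table 3 (assumption: W = 4 um, L = 1 um)
sigma_pair_mv = sigma_offset_mv(4.0, 1.0)

# Assumption: independent contributions add in quadrature, so the
# remaining sources must account for the gap up to the ~3 mV total.
SIGMA_TOTAL_MV = 3.0
sigma_other_mv = math.sqrt(SIGMA_TOTAL_MV ** 2 - sigma_pair_mv ** 2)

print(f"input pair offset : {sigma_pair_mv:.2f} mV")
print(f"other sources     : {sigma_other_mv:.2f} mV (root-sum-square assumption)")
```

Under this assumption the input pair dominates the budget, which is consistent with the choice of enlarging M6 and M7 first.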
Since the operating point of the preamplifier can vary with temperature and
process corners, as for the Bi-phase integrator, we realized a very simple
Common Mode Feed-Back (CMFB) circuit to keep the preamplifier operating point
under control. As shown in figure 8, it consists of a differential stage which
acquires the differential voltage from the preamplifier output and derives a
control voltage from the sources of the transistor pairs. This analog voltage is
used to bias the active loads of the preamplifier and to adjust the common mode.
Transistors M12 and M13 are biased in the linear region, in order not to make
the network sensitive to mismatches. This solution is quite far from typical
common mode feedback circuits but represents a good compromise between
transistor count and robustness.

Table 3
Aspect ratios of the MOSFET employed in the comparator.
MOSFET        M1        M2        M3        M4        M5
W/L (µm/µm)   1.8/0.18  1.8/0.18  1.8/0.18  8/1       8/1
FN            5         5         5         4         4
MOSFET        M6        M7        M8        M9        M10/M11
W/L (µm/µm)   4/1       4/1       1/10      1/4       5/0.18
FN            3         3         4         4         5
MOSFET        M12       M13       M14       M15       M16
W/L (µm/µm)   10/0.18   10/0.18   1.2/0.24  1.2/0.24  1.8/0.18
FN            5         5         5         5         5
MOSFET        M17       M18       M19       M20       M21/M22
W/L (µm/µm)   1.8/0.18  1.3/1     1.3/1     1.5/0.24  3/1
FN            5         1         1         7         3
As said before, it is of utmost importance that the design non-idealities do not
affect the comparator performance. We have thus detailed the design of this
block down to the layout level and analyzed the parasitic effects on
system-level performance thanks to the substitute-and-play methodology previously
discussed. Table 3 summarizes the transistor aspect ratios and, considering
the layout realization, also the finger numbers (FN). Figure 9 shows the
comparator layout. It is possible to identify the common-mode feedback unit on
the bottom-right side, the preamplifier at the top and the regenerative latch
at the bottom of the figure. Here, for the sake of brevity, the biasing stage is
not shown, nor are the power supply and the Bi-phase integrator metal
interconnections. In this layout we took particular care with the MOSFET
placement in order to maximize matching: The matching properties of a transistor pair
depend mostly on their mutual position in the layout [28],[20]. Here, the
MOSFETs of each pair of differential pass transistors have been placed next to
each other in a parallel geometry (see footnote 2). All the transistors have
been fingered because this technique lowers mismatch sources by a factor
$\sqrt{n}$, where $n$ is the finger number [20]. We have chosen an odd finger
number in order to balance the parasitic interconnection resistance and
capacitance of each source-drain pair. In the latch stage (which is most
sensitive to mismatch and potentially the biggest source of offset), perfectly
balanced power supply routing is also necessary. In particular, the supply vias
have been placed in the center of the metal lines which connect the two
anti-symmetric inverters in order to balance the parasitic resistance in each
branch. In the layout, the save and the latch switches have been placed next to
each other to minimize mismatch. The efficiency of the switching transistors
also depends on the substrate resistance and on the number of substrate contacts
placed in their surroundings. This implies that the substrate contacts must be
placed as close to each switch as possible. Imperfect metal routing can lead to
input-referred offsets on the order of several hundreds of millivolts [21],
hence we placed the matched transistors in such a way that vertical symmetry is
ensured. The symmetry inherently provides the same environment for the matched
transistors at sources and drains, thus lowering mismatch. When a design
requires multiple comparators, the layout of a single device is also important
to minimize global die variations. For example, in flash Analog-to-Digital
Converters the comparator layout must be as regular as possible, in such a way
that the combination of multiple units leads to a common centroid [29]. Since
only one comparator is required here, this last aspect can be relaxed.

Footnote 2: We define as “parallel pass transistors” two MOSFETs which connect a
differential input to a differential output, i.e. M14-M15 or M1-M2.

5 Simulations
The simulation environment includes some features which make the simulation as
close to a realistic setting as possible. It includes Additive White Gaussian
Noise (AWGN) and multipath diversity, the latter in the form of an IEEE
802.15.4a waveform database of measured channel responses for an indoor
multipath environment [2]: This allows extensive BER simulations with the
possibility of varying the noise level. The bits are modulated with a Pulse
Repetition Interval (PRI) of 200 ns and an integration time of 30 ns is used for
both PPM phases. The most important entries of the system's link budget are
shown in table 4.

Table 4
Link budget
Parameter                                    Value
Geometric center frequency (f'c)             3.94 GHz
Average TX power (PT)                        −8.5 dBm
RX noise figure (NF)                         7 dB
Minimum Eb/N0 for BER = $10^{-3}$ (S)        17 dB
Implementation loss (I)                      1 dB
Link margin (@ dmax = 26 m)                  1.22 dB
Minimum RX Sensitivity level                 −87 dBm

Functional simulations
Before entering into the discussion of system-level performance we present,
through figure 10, the operation of the device employing the same notation as
figure 4. The figure shows the complete operation of the circuit including both
the integrator and the analog comparator. In these post-layout simulations we
used the schematic view of the Bi-phase integrator and the comparator netlist
extracted from the layout. All the signals are expressed with the notation V()
to indicate that the plotted waveforms are voltages extracted from a terminal.
For the sake of simplicity all control signals, albeit differential, are
presented as single ended.

Fig. 10. Transient simulation of the Bi-phase demodulator
The output of the ideal squarer is given by signal V(squarer). This signal is
composed of both UWB signals and AWGN with an equivalent bandwidth of 1.9GHz.
Signal V(out), defined as V(Inp)-V(Inn) where Inp and Inn are the input
terminals of the comparator, represents the Bi-phase integrator output. After
resetting the charge in the capacitors with signal reset, the first integration
is performed by asserting signal phase0 and by deactivating signal inhibit,
which enables the OTA output to follow the input signal variations. After this,
signal inhibit is asserted again. Then the same iteration is performed for the
phase1 signal, and afterwards the comparison is executed by raising signal
compare. In the meantime all comparator signals (save, clamper and latch) are
kept high. After the charge redistribution, the comparator is activated in the
same way as described in section 4-C. The cycle ends with the assertion of
signal reset, which prepares the device for the next demodulation process. It is
also possible to identify the CMFB operation by observing waveforms V(outp) and
V(outm). The common mode of the signal is about 1V during all device operation.
The two markers in the figure show two different demodulation conditions: The
marker on the left indicates a demodulation voltage of about -57mV, while the
one on the right identifies a demodulation voltage of about 60mV. The two cases
represent two different information bits, ‘1’ and ‘0’, respectively.
Correspondingly, the outputs of the comparator out_rp and out_rm switch
oppositely.
It is interesting to compare the functional simulations at schematic and layout
level. Figure 11 shows the analog comparator operation in the S3 and L3 cases
(i.e. the P3-S schematic and P3-L post-layout levels). The differences between
the two abstraction levels are not particularly significant.

Fig. 11. Comparison between the circuit and post-layout level of the comparator
unit
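The demodulation cycle described in the functional simulations (integrate the squared signal in phase 0, integrate it in phase 1, compare) can be sketched at a purely behavioral level as follows. This is a toy model and not the circuit simulated in this work: the sampling rate, the sine-burst pulse shape, the placement of each 30 ns window at the start of its half of the PRI and the sign convention of `v_out` (positive for bit 0, negative for bit 1, as suggested by the two markers) are all assumptions of this example.

```python
import numpy as np

PRI = 200e-9    # pulse repetition interval (from the text)
T_INT = 30e-9   # integration window per PPM phase (from the text)
FS = 10e9       # sampling rate of this toy model (assumption)
N = int(round(PRI * FS))    # samples per PRI
W = int(round(T_INT * FS))  # samples per integration window


def make_symbol(bit, rng, noise_rms, amp=1.0, f_burst=2e9, t_burst=5e-9):
    """One PRI of a toy 2-PPM waveform: a sine burst at the start of the
    first half (bit 0) or of the second half (bit 1), plus white noise."""
    x = rng.normal(0.0, noise_rms, N)
    m = int(round(t_burst * FS))
    start = 0 if bit == 0 else N // 2
    x[start:start + m] += amp * np.sin(2 * np.pi * f_burst * np.arange(m) / FS)
    return x


def demodulate(x):
    """Square, integrate over the two windows, decide on the sign."""
    e_ph0 = np.sum(x[:W] ** 2) / FS                # phase 0 energy
    e_ph1 = np.sum(x[N // 2:N // 2 + W] ** 2) / FS  # phase 1 energy
    v_out = e_ph0 - e_ph1  # stands for V(out), up to a gain factor
    return (0 if v_out > 0 else 1), v_out


rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 500)
errors = sum(demodulate(make_symbol(int(b), rng, noise_rms=0.2))[0] != int(b)
             for b in bits)
print("estimated BER:", errors / len(bits))
```

Only the sign of the energy difference is used, which is why a single comparator can replace the ADC: no absolute threshold has to be set.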
System-level performance simulations
Fig. 12. BER curves comparison at different abstraction levels (BER, from $10^{-3}$ to $10^{0}$, versus Eb/N0, from 5 to 17 dB). Curves: B(P3-S), C(P3-S), No Inhibit, VOS = 0V; B(P3-S), C(P3-S), Inhibit, VOS = 0V; B(P3-S), C(P3-S), Inhibit, VOS = 40mV; B(P3-S), C(P3-L), Inhibit, VOS = 0V; F(P1), Energy Detection curve (5 bit res.); Non-coherent OOK receiver; Non-coherent TR receiver; Energy Detection (∞ res.), Analytic.
Figure 12 shows a collection of BER curves in various conditions and for
various receiver types. The simulations address two aspects: the correctness
of the hypotheses used in the Bi-phase integrator design, and the performance
of the device at different abstraction levels. First, we show the system-level
performance degradation due to the absence of the inhibit switch in
the integrator unit. Then, by using the full Spice-level unit integrator with
the inhibit switch we analyze the second aspect, that is the impact of the
comparator low-level non-idealities on system-level performance. As an over-
all result, it is possible to infer important considerations about the required
models for the subsequent design stages. The analyses have been carried out
by comparing the obtained BER with the ideal performance
figures of an Energy Detection receiver. In the figure we use a symbolic no-
tation to denote the modeling levels for the receiver blocks. The front-end,
the comparator and the Bi-phase integrator are represented by symbols F, C
and B, while the values in brackets for these symbols, as defined in section
3, represent the abstraction level: P1 - Conceptual Phase I, P2 - Behavioral
Phase II, P3-S - Spice-level Phase III, P3-L - Post-layout level Phase III.
The “Analytic” curve refers to a theoretical Energy Detection receiver with
infinite quantization resolution [30]. The F(P1) curve refers to a quantization
resolution of 5 bits. The other cases refer to the reduced-complexity receiver
described by a B(P3-S) Bi-phase integrator for different abstraction levels of
the comparator. In all these cases, the demodulator input range is 160mV. If
no reset switches across Cp are considered, performance degrades significantly.
As previously shown in (5), demodulation is affected by the voltage
across Cp stored in each operation phase and by the output voltage of the
previous demodulation cycle. If no inhibit signal is used in the integrator, the
terms $N^{p,(n)}_{fph0}$ for the first integration phase and the corresponding
$N^{p,(n)}_{fph1}$ for the second one corrupt the final demodulation voltage:
Thus, demodulation performance decreases even with an ideal P2 comparator. In
fact, the figure shows that the equivalent quantization resolution is worse than
that of an ideal receiver with a 5-bit resolution A/D. On the other hand, thanks
to the Inhibit switches, the Bi-phase integrator can approximately reach the
performance of the theoretical Energy Detection receiver (the BER curves
overlap). The offset rejection feature of the device is also proven, because the
BER remains almost unchanged if an input-referred offset VOS of 40mV is included
in the simulations. These important results confirm the correctness of the
previously presented theory.
For completeness, figure 12 includes other Impulse-Radio receiver types,
Transmitted Reference (TR) and On-Off Keying (OOK) [31]. The curves show that ED
achieves better BER performance than TR, for which no threshold setting is
required for demodulation. In the case of OOK receivers, setting a threshold is
mandatory and affects performance. In this case OOK performs slightly worse
than Energy Detection even though the optimal threshold was chosen.
Note that the curves B(P3-S)-C(P3-L) and B(P3-S)-C(P3-S), which differ in the
comparator model, diverge slightly, especially at low SNR. Since the Bi-phase
integrator output voltage is low and comparable to the input-referred offset of
the comparator stage, for low Eb/N0 the imperfect routing in the comparator
interconnections generates an input-referred offset which slightly increases
the error rate. For higher Eb/N0 this unbalanced routing does not influence
system-level performance because the final demodulation voltage after charge
redistribution is higher.
Figure 13 shows the BER of the demodulator at Eb/N0=17 dB (which corresponds
to an acceptably low BER) as a function of the OTA input signal amplitude. For
input signals lower than the input range of the OTA (on the order of 250mV), the
BER does not change. If the input signal becomes larger, the BER performance
worsens. As anticipated in section 4, this distortion does not degrade the BER
much, because the OTA saturates only for very short times, corresponding to the
duration of the peaks of the UWB signal exceeding the input range. This leads
only to a slight degradation of the BER. For example, an increase of the input
amplitude from 250mV to 450mV leads to a BER degradation of $10^{-4}$. As the
input signal further increases, this effect becomes more relevant, especially at
very low signal-to-noise ratios.

Fig. 13. BER at Eb/N0=17 dB for various input amplitudes (BER, from 0.0005 to 0.002, versus OTA input amplitude, from 200 to 650 mVpp)
The most important non-ideal effect which degrades demodulator performance
is the input-referred offset of the comparator due to transistor mismatch.
Especially at low SNR, since the noise power level is comparable to the UWB
signal strength, the output voltage of the integrator is comparable to the
input-referred offset of the comparator. Simulations with input offset voltages
up to 10mV do not show BER degradation if the input signal level is sufficiently
high. This requires an overall front-end gain before the OTA of 40 dB, which is
compatible with other works [7].
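The reason why a comparator offset matters only when the demodulation voltage is small can be illustrated with a deliberately simplified model. The sketch below assumes that the voltage at the comparator input is Gaussian with mean plus or minus `v_demod_mv` (depending on the bit) and standard deviation `sigma_mv`, and that the offset simply shifts the decision threshold. This Gaussian model and all numerical values are assumptions made for illustration; they are not the statistics of the simulated receiver.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_with_offset(v_demod_mv, sigma_mv, v_os_mv):
    """Average error probability for equiprobable bits when the decision
    threshold is shifted from 0 to v_os_mv (hypothetical Gaussian model)."""
    p_closer = q_function((v_demod_mv - v_os_mv) / sigma_mv)
    p_farther = q_function((v_demod_mv + v_os_mv) / sigma_mv)
    return 0.5 * (p_closer + p_farther)


SIGMA_MV = 20.0  # hypothetical noise spread at the comparator input
for v_demod in (60.0, 30.0):  # large and small demodulation voltages
    for v_os in (0.0, 10.0):
        ber = ber_with_offset(v_demod, SIGMA_MV, v_os)
        print(f"Vdemod = {v_demod:4.0f} mV, Vos = {v_os:4.0f} mV -> BER = {ber:.2e}")
```

In this model the offset raises the error probability for one bit more than it lowers it for the other, and the absolute penalty grows as the demodulation voltage approaches the offset, which mirrors the qualitative behavior described above.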
The curves of figures 12-13 identify the limits below which the receiver works
almost ideally. If the input signal amplitude is below 250mV and the
input-referred offset is less than 40mV, the results are more or less the same
regardless of the abstraction level used for the various blocks, that is AMS,
schematic or layout level. Therefore, in order to save simulation time, it is
possible to design the remaining part and simulate the whole receiver front end
by using a P2 view of the already designed blocks. A summary of the obtained
results both at circuit and system level is shown in table 5. As far as
performance is concerned, there is no BER degradation with respect to the ideal
case using a P1 or P2 description, as physical effects are abstracted away. On
the contrary, schematic and layout level P3 descriptions make it possible to
include and evaluate such effects.
Table 5 also summarizes the power consumption figures that can be evaluated in
P3. When the device is idle and no signal is present at the input, the quiescent
power consumption PQ, inclusive of the biasing circuit, is 400µW. During normal
operation, the power consumption PD, averaged over 4ms of demodulation
activity, is 1mW in P3-S and 950µW in P3-L because the optimized layout has
smaller parasitics than the overestimated ones of the schematic level. With
respect to [7], in which the overall receiver consumes 2.5 nJ/bit at 16.7Mbit/s
and Val=0.65V, here the energy spent by the demodulator is 190 pJ/bit at
5Mbit/s, Eb/N0=15 dB and Val=1.8V.

Table 5
Results obtained during the refinement phases P1, P2 and P3
                     P1                                       P2
Circuit-level        None                                     Block identification
System-level (BER)   Ideal ED receiver                        Ideal ED receiver
                     P3-S                                     P3-L (Hybrid P3-S/P3-L)
Circuit-level        PQ=400µW, PD=1mW                         PQ=400µW, PD=950µW
System-level (BER)   Fig. 12, B(P3-S)-C(P3-S), almost ideal   Fig. 13, $10^{-4}$ loss at 450mV swing
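The energy-per-bit figure follows directly from the average power and the bit rate, as the short check below shows. It assumes one bit per 200 ns PRI (which gives the quoted 5Mbit/s) and uses the P3-L power PD = 950µW; the function name is introduced only for this example.

```python
def energy_per_bit_pj(power_w, bit_rate_bps):
    """Energy per bit in pJ for a given average power and bit rate."""
    return power_w / bit_rate_bps * 1e12


PRI_S = 200e-9             # pulse repetition interval from the text
bit_rate = 1.0 / PRI_S     # assumption: one bit per PRI, i.e. 5 Mbit/s

# Demodulator of this work, post-layout dynamic power PD = 950 uW
e_demod_pj = energy_per_bit_pj(950e-6, bit_rate)

# Whole receiver of [7]: 2.5 nJ/bit at 16.7 Mbit/s implies this average power
p_ref7_mw = 2.5e-9 * 16.7e6 * 1e3

print(f"bit rate             : {bit_rate / 1e6:.1f} Mbit/s")
print(f"demodulator energy   : {e_demod_pj:.0f} pJ/bit")
print(f"receiver power in [7]: {p_ref7_mw:.2f} mW")
```

The comparison with [7] is between a single block and a complete receiver, so the two figures indicate orders of magnitude rather than a like-for-like ranking.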
Typical UWB receiver implementations have a power consumption ranging from
1mW to 330mW. This huge diversity is due to the transmission technology
used (Multi Band-OFDM or Impulse-Radio) and to the technology process.
For Impulse-Radio PPM receivers the power consumption is in the range
30-40mW and the energy per bit varies between 1 and 3 nJ/bit. The highest power
contribution is due to the front-end units, which must provide a gain of about
40 dB [7]. Supposing the power gain of a single LNA cannot exceed 20 dB in the
best case, the required front-end gain can be obtained by cascading one LNA
with other gain stages. Typically, LNAs for Impulse-Radio consume
9 mW [32]. Since our demodulator consumes only 1mW and its area is much
smaller than that of the first RF front-end units, it accounts for only a small
portion of the power-area budget. Combining the obtained results with other data
from the literature, we can expect a peak power of 15-20mW for the overall
receiver. Moreover, it is understood that the wireless applications in which
IR-UWB is bound to be employed will work at extremely low duty cycles, which
allows the front-end blocks to be switched off during idle times, thus reducing
power to a few percent of the peak [33].
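The effect of duty cycling on the average power can be made concrete with a minimal calculation. The 2% duty cycle and the zero idle power used below are hypothetical values chosen for illustration; only the 15-20mW peak range comes from the text.

```python
def average_power_mw(peak_mw, duty_cycle, idle_mw=0.0):
    """Time-averaged power of a receiver that is on for a fraction of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * peak_mw + (1.0 - duty_cycle) * idle_mw


DUTY_CYCLE = 0.02  # hypothetical: receiver active 2% of the time
for peak_mw in (15.0, 20.0):  # expected peak power range from the text
    avg = average_power_mw(peak_mw, DUTY_CYCLE)
    print(f"peak {peak_mw:.0f} mW -> average {avg:.2f} mW")
```

Any residual idle consumption (for instance a bias network left on) adds to this figure through the `idle_mw` term.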
Another important consideration regarding the performance of Energy Detection
UWB receivers is narrow-band interference. The BER performance depends on the
strength, number and physical position of the transmitters [34].
Provided that the receiver front-end does not saturate, since the demodulator
has nearly ideal performance, the effect of interference is about the same as
for ideal Energy Detection receivers. Hence, for techniques that decrease
interference sensitivity we refer the reader to the specific literature. For
instance, in TR receivers attenuating in-band interference by using notch
filters, or increasing the Signal-to-Interference ratio (SIR) by using feedback
loop mechanisms, are standard techniques [35].
It is also essential that the receiver can recover synchronization with such
simplified hardware. We demonstrated in [36] how to achieve synchronization
with this simple Bi-phase demodulator. The synchronization functionality is
obtained using only the 1-bit output of the comparator and a
searchback algorithm applied on an ordered collection of differential energies:
These are obtained by delaying the Phase 0 and Phase 1 windows and by
shifting the integrations linearly within the PRI.
6 Conclusions
This paper presented a new CMOS 0.18µm 2-PPM demodulator architecture
based on a Gm − C open loop integrator, a switched capacitor network and
an analog comparator. The circuit has been designed and simulated using
an ad-hoc methodology based on a “substitute-and-play” approach which allows
the abstraction level to be changed in the simulations. Thanks to an ad-hoc
testbed and the simulation tool ADVanceMS, it is possible to change the block
descriptions and modify the simulation accuracy. The environment employs
a realistic channel model and offers the possibility to include both BSIM3
MOSFET and layout backannotated models with distributed RC parasitics.
BER results show that the device has about the same performance as an ideal
Energy Detection receiver employing infinite A/D resolution.
Once the circuit layout has been fully completed, mixed-signal insulation
techniques will be essential to prevent the other front-end devices from
compromising the operation of the transconductor and of the comparator.
Considering the results obtained in this work, final design steps such as the
layout of the Bi-phase integrator and silicon fabrication are worthy of
consideration, and we plan to focus on them in our forthcoming work.
References
[1] “Revision of part 15 of the commissions rules regarding ultra-wideband
transmission systems,” Report and order, adopted February 14, 2002, released
July 15, 2002, FCC.
[2] IEEE 802.15 WPAN low rate alternative PHY task group 4A (TG4A).
[Online]. Available: http://www.ieee802.org/15/pub/TG4a.html
[3] D. D. Wentzloff et al., “Energy efficient pulsed-UWB CMOS
circuits and systems,” in Proc. International Conference on UWB 2007,
Sept. 2007, pp. 282–287.
[4] L. Stoica et al., “An ultrawideband system architecture for tag-based wireless
sensor networks,” IEEE Trans. Veh. Technol., vol. 54, no. 5, pp. 1632–1645,
2005.
[5] F. S. Lee et al., “A 3.1 to 10.6 GHz 100 Mb/s pulse-based
ultra-wideband radio receiver chipset,” in Proc. International Conference on
UWB 2006, Jan. 2006, pp. 185–190.
[6] F. S. Lee and A. P. Chandrakasan, “A 2.5nJ/b 0.65V 3-to-5GHz subbanded
UWB receiver in 90nm CMOS,” in Proc. IEEE ISSCC, 2007, pp. 116–590.
[7] F. S. Lee and A. P. Chandrakasan, “A 2.5 nJ/bit 0.65V pulsed UWB receiver
in 90 nm CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 12, pp. 2851–2859, 2007.
[8] E. Christen and K. Bakalar, “VHDL-AMS–a hardware description language for
analog and mixed-signal applications,” IEEE Trans. Circuits Syst. II, vol. 46,
no. 10, pp. 1263–1272, 1999.
[9] L. Barrandon et al., “Behavioral modeling and simulation of mixed signal front-
end for software defined radio terminals,” in Proc. IEEE Int. Symp. on
Industrial Electronics 2004, May 2004.
[10] V. Nguyen et al., “VHDL-AMS behavioral modelling and simulation of
high-pass delta-sigma modulator,” in Proc. IEEE Int. Beh. Modeling and
Simul. Conf. (BMAS’05), San Jose, CA, Oct. 2005, pp. 106–111.
[11] M. Crepaldi et al., “An effective AMS top-down methodology applied to the
design of a mixed-signal UWB system-on-chip,” in Proc. DATE 2007, Nice,
France, Apr. 2007.
[12] ——, “UWB receiver design and Two-Way-Ranging simulation using VHDL-
AMS,” in Proc. PRIME 2006, Otranto, Italy, June 2006.
[13] E. Sanchez-Sinencio et al., “CMOS transconductance amplifiers, architectures
and active filters: a tutorial,” in IEE Proc.-Circuits Devices Syst., Feb. 2000.
[14] B. Razavi, Design of Analog CMOS Integrated Circuits. McGraw-Hill, Int.
Edition, 2001.
[15] J. Moreira et al., “Limits to the dynamic range of low-power continuous-time
integrators,” IEEE Trans. Circuits Syst. I, vol. 48, no. 7, pp. 805–817, 2001.
[16] J. Galan et al., “A low-power low-voltage OTA-C sinusoidal oscillator with a
large tuning range,” IEEE Trans. Circuits Syst. I, vol. 52, no. 2, pp. 283–291,
2005.
[17] S.-H. Yang et al., “A novel CMOS operational transconductance amplifier based
on a mobility compensation technique,” IEEE Trans. Circuits Syst. II, vol. 52,
no. 1, pp. 37–42, 2005.
[18] P. M. Figueiredo and J. C. Vital, “Kickback noise reduction techniques for
CMOS latched comparators,” IEEE Trans. Circuits Syst. I, vol. 53, no. 7, pp.
541–545, 2006.
[19] S. Park and M. P. Flynn, “A regenerative comparator structure with integrated
inductors,” IEEE Trans. Circuits Syst. I, vol. 53, no. 8, pp. 1704–1711, 2006.
[20] P. G. Drennan and C. C. McAndrew, “Understanding MOSFET mismatch for
analog design,” IEEE J. Solid-State Circuits, vol. 38, no. 3, pp. 450–456, 2003.
[21] A. Nikoozadeh and B. Murmann, “An analysis of latch comparator offset due
to load capacitor mismatch,” IEEE Trans. Circuits Syst. II, vol. 53, no. 12, pp.
1398–1402, 2006.
[22] K. Kotani et al., “CMOS charge-transfer preamplifier for offset-fluctuation
cancellation in low-power A/D converters,” IEEE J. Solid-State Circuits,
vol. 33, no. 5, pp. 762–769, 1998.
[23] B. Razavi and B. A. Wooley, “Design techniques for high-speed, high-resolution
comparators,” IEEE J. Solid-State Circuits, vol. 27, no. 12, pp. 1916–1926, 1992.
[24] B. P. Ginsburg and A. P. Chandrakasan, “Dual time-interleaved successive
approximation register ADCs for an ultra-wideband receiver,” IEEE J. Solid-
State Circuits, vol. 42, no. 2, pp. 247–257, 2007.
[25] J. H. Atherton and H. T. Simmonds, “An offset reduction technique for use with
CMOS integrated comparators and amplifiers,” IEEE J. Solid-State Circuits,
vol. 27, no. 8, pp. 1168–1175, 1992.
[26] M.-J. E. Lee et al., “Low-power area-efficient high-speed I/O circuit techniques,”
IEEE J. Solid-State Circuits, vol. 35, no. 11, pp. 1591–1599, 2000.
[27] N. Stefanou and S. R. Sonkusale, “An average low offset comparator for
1.25 Gsample/s ADC in 0.18 µm CMOS,” in Proc. IEEE International Conference on
Electronics Circuits and Systems, Dec. 2004, pp. 246–249.
[28] M. J. Pelgrom et al., “Matching properties of MOS transistors,” IEEE J. Solid-
State Circuits, vol. 24, no. 5, pp. 1433–1440, 1989.
[29] S. Bhattacharya et al., “Multi-level symmetry constraint generation for
retargeting large analog layouts,” IEEE Trans. Computer-Aided Design, vol. 25,
no. 6, pp. 945–960, 2006.
[30] C. Carbonelli and U. Mengali, “M-PPM noncoherent receivers for UWB
applications,” IEEE Trans. Wireless Commun., vol. 5, no. 8, pp. 2285–2294,
2006.
[31] S. Dubouloz et al., “Performance analysis of LDR UWB non-coherent receivers
in multipath environments,” in Proc. IEEE International Conference on UWB,
Sept. 2005, pp. 491–496.
[32] A. Bevilacqua and A. M. Niknejad, “An ultra-wideband CMOS LNA for 3.1 to
10.6 GHz wireless receiver,” in Proc. IEEE ISSCC Dig. Tech. Papers, 2004, pp.
382–383.
[33] I. D. O’Donnell and R. W. Brodersen, “A flexible, low power, DC-1 GHz
impulse-UWB transceiver front-end,” in Proc. International Conference on UWB 2006,
Jan. 2006, pp. 275–280.
[34] A. Rabbachin et al., “UWB energy detection in the presence of multiple
narrowband interferers,” in Proc. IEEE International Conference on UWB,
Sept. 2007, pp. 857–862.
[35] F. Dowla et al., “Interference mitigation in transmitted-reference ultra-
wideband (UWB) receivers,” in Proc. IEEE Antennas and Propagation Society
International Symposium, June 2004, pp. 1307–1310.
[36] M. Crepaldi et al., “A 1-bit synchronization algorithm for a reduced complexity
energy detection UWB receiver,” in International Conference on UWB, Sept.
2007.
