Synthesis strategies for sub-VT systems by Meinerzhagen, Pascal Andreas et al.
Synthesis Strategies for Sub-VT Systems
Pascal Meinerzhagen∗, Oskar Andersson†, Yasser Sherazi†, Andreas Burg∗, and Joachim Rodrigues†
∗Institute of Electrical Engineering, EPFL, Lausanne, VD, 1015 Switzerland
Email: pascal.meinerzhagen@epfl.ch, andreas.burg@epfl.ch
†Department of Electrical and Information Technology, Lund University, Lund, 22100 Sweden
Email: oskar.andersson@eit.lth.se, yasser.sherazi@eit.lth.se, joachim.rodrigues@eit.lth.se
Abstract—Various synthesis strategies relying on conventional
standard-cell libraries (SCLs) are evaluated in order to minimize
the energy dissipation per operation in sub-threshold (sub-VT)
systems. First, two sub-VT analysis methods are reviewed, both of
which make it possible to evaluate the energy dissipation and performance
in the sub-VT regime for designs which have been synthesized
using a 65-nm CMOS SCL, characterized at nominal supply
voltage. Both analysis methods are able to predict the energy
minimum supply voltage (EMV) of any given design. Next, the
results of a sub-VT synthesis at EMV using re-characterized SCLs
are compared to the initial synthesis results. Finally, the results
of timing-driven synthesis in both the above-VT and the sub-VT
domain are compared to the results of power-driven synthesis.
I. INTRODUCTION
Battery-powered devices such as hearing aids, medical
implants [1], and remote sensors impose severe constraints
on energy dissipation. Supply voltage scaling reduces both
active energy dissipation and leakage power. When applied
aggressively, voltage scaling leads to sub-threshold (sub-VT)
operation [2]. As an alternative to full-custom sub-VT circuit
design, [3–5] promote the design of sub-VT circuits based on
conventional standard-cell libraries (SCLs).
However, most commercial SCLs are designed for the
above-VT domain, meaning that a) they are mainly optimized
for performance, as performance has been the main concern
for above-VT circuit design over the last few decades, and
that b) physical models describing the timing and the power
consumption of the standard-cells are readily available only for
the nominal supply voltage. Instead of using commercial SCLs
optimized for above-VT operation, standard-cell based sub-
VT design would ideally rely on SCLs which are especially
optimized for sub-VT operation [6,7], meaning that more
emphasis is given to leakage reduction than to performance
while designing the standard-cells. If the development of a
dedicated sub-VT SCL is not economical (which corresponds
to the viewpoint adopted in this paper), a commercial SCL
optimized for above-VT operation can still be re-characterized
to at least generate the physical timing and power models
valid for sub-VT supply voltages. Besides SCLs, virtually all
logic synthesis tools as well as place-and-route (P&R) tools
have been developed for regular digital VLSI design in the
above-VT domain, and therefore use sophisticated timing-
driven optimization algorithms, whereas they offer hardly any
means to directly optimize a design for minimum energy dissipation
per operation, which is an important metric for energy-constrained
systems.
Contribution: This paper shows different synthesis and
analysis strategies for sub-VT system design using commercial
SCLs and commercial logic synthesis as well as P&R tools.
The focus is on energy-constrained sub-VT systems, which
are optimized to perform a given operation with the lowest
possible energy dissipation, assuming that the system might be
power-gated or turned off after task completion. Sec. II reviews
and compares two methods to analyze the sub-VT energy
dissipation and timing of designs which have previously been
synthesized in the above-VT domain, before focusing on a
direct sub-VT synthesis and analysis flow. Sec. III discusses
and compares various strategies to minimize the energy per
operation: above-VT-only synthesis with different power and
timing constraints, and above-VT synthesis (to determine
EMV) followed by an incremental sub-VT synthesis at EMV,
again with different constraints. Sec. IV concludes the paper.
II. SYNTHESIS AND ANALYSIS METHODS
A. Above-VT Synthesis with Sub-VT Analysis
Due to the predominance of SCLs and design tools devel-
oped for regular above-VT synthesis, it might be convenient
to synthesize different architectural variants of a system, with
different constraints on timing and power, in the above-VT
domain, and subsequently analyze and compare the energy
dissipation and throughput of the various resulting designs in
the sub-VT domain. In this section, two methods to analyze
the sub-VT behaviour of designs which have previously been
synthesized in the above-VT domain are presented and compared.
1) Analytical Sub-VT Model: As shown in Fig. 1(a), the
first method starts from a regular static timing analysis (STA)
and value change dump (VCD)-based power analysis of a
fully placed, routed, and back-annotated netlist in the above-
VT domain. An analytical model [8,9] is then used to scale
timing and power quantities to the sub-VT domain. A main
advantage of this analytical model is the ability to immediately
find the EMV.
Fig. 1. Sub-VT design and analysis flows: (a) Above-VT synthesis, STA, and
power analysis. Analytical sub-VT model. (b) Above-VT synthesis. Sub-VT
STA and power analysis. (c) Sub-VT synthesis, STA, and power analysis.
The analytical sub-VT frequency model in [8,9] makes the
assumption that the propagation delay(s) dcell of all standard-
cells slow down at the same pace as the propagation delay dinv
of a basic inverter when the supply voltage VDD is gradually
scaled down. In other words, the ratio dcell/dinv is assumed
to be independent of VDD. In order to study the accuracy of
this assumption, consider the propagation delay(s) of various
standard-cells, extracted from analog circuit simulation for
many different above-VT and sub-VT supply voltages. To
better visualize a potential change of cell delays compared
to the inverter delay, the normalized cell delay is defined as
$$d_{norm}(V_{DD}) = \frac{d_{cell}(V_{DD}) \,/\, d_{cell}(V_{DD}=V_{DD}^{(0)})}{d_{inv}(V_{DD}) \,/\, d_{inv}(V_{DD}=V_{DD}^{(0)})}, \qquad (1)$$
where $V_{DD}^{(0)}$ is the nominal supply voltage ($V_{DD}^{(0)} = 1.2$ V in
the current case).
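As a minimal sketch of how Eq. (1) could be evaluated for one timing arc, the following Python snippet computes the normalized delay from two delay-versus-voltage tables. The function name `normalized_delay`, the dictionary layout, and all delay values are assumptions made for illustration; they are not taken from the paper or from any characterization tool. Under the assumption of the analytical model, every returned value would be exactly one.

```python
def normalized_delay(d_cell, d_inv, v_nom=1.2):
    """Return {VDD: d_norm(VDD)} following Eq. (1).

    d_cell, d_inv: dicts mapping supply voltage [V] to propagation delay [s]
    for one cell timing arc and for the reference inverter, respectively.
    """
    cell_ref = d_cell[v_nom]  # cell delay at the nominal supply voltage
    inv_ref = d_inv[v_nom]    # inverter delay at the nominal supply voltage
    return {
        v: (d_cell[v] / cell_ref) / (d_inv[v] / inv_ref)
        for v in d_cell
        if v in d_inv
    }


# Made-up delays for illustration only (hypothetical, not measured data).
d_inv = {1.2: 20e-12, 0.4: 40e-9, 0.3: 400e-9}
d_nand = {1.2: 30e-12, 0.4: 90e-9, 0.3: 1200e-9}

d_norm = normalized_delay(d_nand, d_inv)
for vdd in sorted(d_norm, reverse=True):
    print(f"VDD = {vdd:.2f} V: d_norm = {d_norm[vdd]:.2f}")
```

With these invented numbers, the hypothetical cell slows down 1.5 times and 2 times more than the inverter at 0.4 V and 0.3 V, which is the kind of deviation from one that the normalized delay is meant to expose.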
Fig. 2 shows dnorm(VDD) for all timing arcs of all standard-
cells in a reference design [9]. In contrast to the basic
assumption of the sub-VT frequency model in [8,9], we find
that dnorm(VDD) can be significantly larger than one and
increases notably for most standard-cells when scaling down
VDD. Consequently, the analytical model [8,9] underestimates
the critical path delay in the sub-VT domain for the considered
65-nm CMOS SCL. A more time-consuming but more precise
(in terms of timing) sub-VT analysis method is discussed next.
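To make the direction of this error explicit, Eq. (1) can be rearranged for the actual cell delay. The following is only a restatement of the definition together with the model assumption described above; the hat notation for model estimates and the index $i$ over the cells of a path are introduced here for illustration and do not appear in [8,9]. Interconnect delay is neglected, and the same path is assumed to remain critical.

$$d_{cell}(V_{DD}) = d_{norm}(V_{DD}) \cdot \hat{d}_{cell}(V_{DD}), \qquad \hat{d}_{cell}(V_{DD}) = d_{cell}(V_{DD}^{(0)}) \, \frac{d_{inv}(V_{DD})}{d_{inv}(V_{DD}^{(0)})}$$

$$T_{crit}(V_{DD}) = \sum_i d_{norm,i}(V_{DD}) \, \hat{d}_{cell,i}(V_{DD}) \;\ge\; \sum_i \hat{d}_{cell,i}(V_{DD}) = \hat{T}_{crit}(V_{DD}) \quad \text{if } d_{norm,i}(V_{DD}) \ge 1 \text{ for all } i$$

Under these assumptions, the model estimate of a path delay is too small whenever the cells on that path have a normalized delay above one.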
2) Evaluation Using Sub-VT SCLs: The second method,
shown in Fig. 1(b), consists of re-characterizing the original
SCL for many different supply voltages in the sub-VT domain
(from 250 mV to 400 mV in steps of 10 mV in the current
case), and then repeating the STA and the VCD-based power
analysis using these re-characterized SCLs. The considered
low-power (LP) high threshold-voltage (HVT) nMOS and
pMOS transistors in a 65-nm CMOS technology have absolute
threshold-voltage values above 450 mV. For an accurate VCD-based
power analysis, the standard delay format (SDF) file
generation from the RC-annotated netlist, and the VCD dump
from the gate-level simulation must be repeated for each
supply voltage.
Fig. 2. Normalized delay of standard-cells. [Plot: dnorm(VDD) versus VDD [V], for VDD from 0.2 V to 1.2 V.]
Fig. 3. Comparison of two sub-VT analysis methods (analytical sub-VT
model and evaluation using sub-VT SCLs): energy dissipation for operation
at a constant frequency of 1 kHz. [Plot: energy/cycle [pJ], logarithmic axis, versus VDD [V] from 0.25 V to 0.4 V; curves: Sub-VT Model, Sub-VT SCL.]
3) Comparison of Sub-VT Analysis Methods: The results
of the two sub-VT analysis methods (analytical sub-VT model
and evaluation using re-characterized sub-VT SCLs) are com-
pared by applying them to a reference design [9] which has
previously been synthesized, placed, and routed at nominal
supply voltage using a 65-nm CMOS SCL.
Concerning the estimation of the energy dissipation per
clock cycle for operation at a constant clock frequency, both
sub-VT analysis methods coincide fairly well, as shown in
Fig. 3. This means that the sub-VT model [8,9] does accurately
predict the active energy and the leakage power.
The analytical sub-VT model is thus very convenient to
quickly and reasonably precisely estimate the leakage power
consumption and the active energy dissipation in the sub-VT
domain, and to quickly have a reasonable guess of EMV. For
a more precise maximum frequency and EMV estimation, it
is important to re-characterize the SCL and repeat the STA in
the sub-VT domain.
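The search for EMV from per-voltage analysis results can be sketched as follows. This is a minimal illustration, assuming that the energy per cycle at maximum speed is the active energy plus the leakage power integrated over one critical path delay; the function names, the data layout, and all numbers are hypothetical and are not results from the paper. In a real flow, the three quantities per supply voltage would come from the STA and power analysis runs.

```python
def energy_per_cycle(e_active, p_leak, t_cycle):
    """Active energy [J] plus leakage power [W] integrated over one clock period [s]."""
    return e_active + p_leak * t_cycle


def find_emv(corners):
    """Return (VDD, energy) with the lowest energy per cycle at maximum speed.

    corners: dict mapping VDD [V] to a tuple
             (active energy per cycle [J], leakage power [W], critical path delay [s]).
    """
    energies = {
        vdd: energy_per_cycle(e_act, p_leak, t_crit)
        for vdd, (e_act, p_leak, t_crit) in corners.items()
    }
    emv = min(energies, key=energies.get)
    return emv, energies[emv]


# Invented example values (hypothetical, for illustration only).
corners = {
    0.25: (1.0e-12, 10e-9, 400e-6),
    0.30: (1.5e-12, 15e-9, 100e-6),
    0.35: (2.2e-12, 25e-9, 40e-6),
    0.40: (3.0e-12, 40e-9, 20e-6),
}

emv, e_min = find_emv(corners)
print(f"EMV = {emv:.2f} V, energy per cycle = {e_min * 1e12:.2f} pJ")

# Energy per cycle at a constant 1 kHz clock instead of maximum speed.
e_1khz = {v: energy_per_cycle(e, p, 1.0 / 1e3) for v, (e, p, _) in corners.items()}
```

Because the clock period enters the leakage term directly, an underestimated critical path delay shifts the maximum-speed energy curve, which is why a more precise timing analysis matters for locating EMV.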
The more precise flow, shown in Fig. 1(b), using re-characterized
SCLs is used for all above-VT synthesis runs
with subsequent sub-VT analysis in the remainder of this paper.
B. Direct Sub-VT Synthesis
For voltage-constrained sub-VT systems, or if the approximate
EMV is already known from a previous above-VT
synthesis, it might be desirable to synthesize directly in the
sub-VT domain. This makes it possible to specify meaningful timing
constraints, and to directly obtain timing and power figures
for the considered supply voltage from STA and the power
engine, respectively. Fig. 1(c) shows a direct sub-VT synthesis
and analysis flow, which gives the energy dissipation and
timing metrics of the resulting design not only at the supply
voltage at which the logic synthesis and P&R are performed,
but for the entire sub-VT range, so that the true EMV can be found.
This flow is used for all sub-VT synthesis runs in the remainder
of this paper.
III. DESIGN STRATEGIES
In this section, various synthesis approaches, including
synthesis in the above-VT and in the sub-VT domain, both
with different constraints on timing and power, are compared
in terms of energy-efficiency.
A. Power-Driven Above-VT Synthesis
In a first approach, the logic synthesis and backend design
are performed at nominal supply voltage (VDD = 1.2V), using
commercial SCLs characterized at this voltage. Synthesis
constraints are set to minimize the leakage power (and area),
as leakage currents are expected to considerably contribute to
the total energy dissipation for sub-VT operation, while timing
is virtually unconstrained. This relaxed timing constraint
guarantees that mostly minimum-drive cells with minimum
leakage current are inferred during synthesis. In Fig. 4, the
circle (◦) markers show a) the maximum operating frequency,
b) the energy dissipation per clock cycle when operating at
this frequency, and c) the energy dissipation per clock cycle
when operating at 1 kHz, always as a function of VDD, of the
design resulting from this synthesis approach. As shown in
Fig. 4(b), there is an EMV at 320 mV for maximum-speed
operation.
B. Power-Driven Sub-VT Synthesis
With the knowledge of EMV of the initial design synthe-
sized in the above-VT domain with a very relaxed timing
constraint, an incremental synthesis directly in the sub-VT
domain at EMV is performed in order to see if the synthesizer
can leverage the accurate physical information of the standard-
cells (power consumption and timing), valid for EMV, to yield
a more energy-efficient design. Notice that EMV is a property
of a design and might vary for each new design synthesized at
a different supply voltage or with a different timing constraint.
However, for all synthesis conditions considered in this work,
EMV deviates by only 20 mV at maximum from the original
320 mV. Also, the energy at maximum speed starts to increase
only very slowly when moving away from the EMV of the
various designs, as shown in Fig. 4(b). Consequently, for the
endeavour of finding the most energy-efficient design without
the need of synthesizing at many different sub-VT supply
voltages, it is reasonable to synthesize at 320 mV. Following
the reasoning for the initial above-VT synthesis, the synthesis
at 320 mV is driven by a leakage power and an area constraint,
while there is a very relaxed timing constraint. The diamond
(◇) markers in Fig. 4 show that the design becomes slightly
faster, but also dissipates more energy per cycle compared
to the initial synthesis. Apparently, in a leakage- and area-
driven flow, sub-VT synthesis has no advantage over above-VT
synthesis.
Fig. 4. Comparison of above-VT synthesis at nominal supply voltage of
1.2 V and sub-VT synthesis at 320 mV, which corresponds to the EMV of
the initial design synthesized in the above-VT domain. For both above-VT
and sub-VT synthesis, a very relaxed and a very hard timing constraint are
chosen. (a) Maximum operating frequency, (b) energy dissipation per clock
cycle for maximum speed operation, and (c) energy dissipation for operation
at a constant frequency of 1 kHz. [Plots: (a) fmax [kHz], (b) and (c) energy/cycle [pJ], each versus VDD [V] from 0.25 V to 0.4 V; curves: Synt. @ 1.2V, relaxed; Synt. @ 1.2V, hard; Synt. @ 320mV, relaxed; Synt. @ 320mV, hard.]
C. Timing-Driven Above-VT Synthesis
The main metric in energy-constrained sub-VT systems
being the energy dissipation per operation/task, and assum-
ing that the system is power-gated or turned off after task
completion, a faster design might be more energy-efficient,
even though it might require more and/or stronger cells and
consequently exhibit higher leakage power. In fact, for higher
energy-efficiency, a shorter clock cycle over which leakage
power is integrated needs to overcompensate the increase in
leakage power and the potential increase in the total switched
capacitance/active energy dissipation. Energy reduction is
given priority over area-increase for the considered energy-
constrained sub-VT systems.
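The condition described above can be written as a simple inequality. This is a sketch under the assumption that the energy per cycle at maximum speed is the active energy plus the leakage power integrated over one clock period, and that both designs need the same number of cycles per task. The symbols $E_{act}$, $P_{leak}$, and $T_{clk}$, as well as the tilde marking the faster design, are notation introduced here for illustration.

$$\tilde{E}_{act} + \tilde{P}_{leak}\,\tilde{T}_{clk} \;<\; E_{act} + P_{leak}\,T_{clk} \quad\Longleftrightarrow\quad P_{leak}\,T_{clk} - \tilde{P}_{leak}\,\tilde{T}_{clk} \;>\; \tilde{E}_{act} - E_{act}$$

In words, the leakage energy saved per cycle by the shorter clock period has to exceed the additional active energy of the faster design, even though its leakage power $\tilde{P}_{leak}$ is itself higher.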
The initial above-VT synthesis is thus repeated with a very
hard timing constraint. The square (□) markers in Fig. 4 show
that the resulting design is indeed significantly faster than all
previous designs, which is clearly paid for by a higher energy
dissipation for operation at a constant frequency. Also for
operation at maximum speed, this design clearly exhibits more
energy per cycle compared to all previous designs, as shown
in Fig. 4(b), which means that the relatively small decrease
in the critical path delay does not sufficiently compensate for
the increase in leakage power and active energy dissipation.
D. Timing-Driven Sub-VT Synthesis
In order to investigate if the synthesizer can leverage the
knowledge of the power consumption valid for EMV, the
sub-VT synthesis at 320 mV is also repeated with a hard
timing constraint. The plus sign (+) markers in Fig. 4 indicate
that the timing-driven sub-VT synthesis results in a similar
speed enhancement as the timing-driven above-VT synthesis.
However, the timing-driven sub-VT synthesis clearly leads to
less energy dissipation per cycle than the timing-driven above-VT
synthesis, showing that the logic synthesizer can indeed
benefit from the knowledge of the physical information valid
for sub-VT supply voltages. For timing-critical designs, it is
thus beneficial to synthesize in the sub-VT domain, in order
to improve the energy-efficiency.
However, if timing is not critical, it is sufficient to
synthesize in the above-VT domain with commercial
SCLs characterized at the nominal supply voltage in order
to get the most energy-efficient sub-VT design, provided that the
synthesis is driven by a tight power and area constraint, while
the timing constraint is very relaxed.
For the timing-driven synthesis runs, the standard-cell area
increases by a similar factor as the energy in Fig. 4(c),
compared to the power-driven synthesis runs.
IV. CONCLUSIONS
Different synthesis strategies relying on commercial
standard-cell libraries (SCLs) and commercial synthesis tools
are compared for building an optimum energy-constrained
sub-VT system in the sense of lowest energy dissipation per
operation performed at maximum speed.
The most energy-efficient sub-VT system in the current
study is obtained by above-VT synthesis with a hard power
and area constraint, merely using commercial SCLs which
are readily available, without the need for synthesis in the
sub-VT domain and the associated effort to re-characterize a
commercial SCL for sub-VT supply voltages. However, if the
design is timing-critical, it is beneficial to perform a sub-VT
synthesis in order to improve the energy-efficiency, compared
to an above-VT synthesis.
A previously developed analytical model precisely predicts
the leakage power consumption and the active energy dissi-
pation in the sub-VT domain for a design which has been
synthesized in the above-VT domain, while for the estimation
of the maximum achievable operating frequency in the sub-
VT domain, re-characterizing the SCL and repeating the static
timing analysis in the sub-VT domain is more precise.
ACKNOWLEDGMENT
This work was kindly supported by the Swiss National
Science Foundation under the project number PP002-119057.
The project was conducted with financial support from the
Swedish VINNOVA Industrial Excellence Centre (SOS) and
Swedish Foundation for Strategic Research (SSF).
REFERENCES
[1] R. Sarpeshkar, “Ultra low power electronics for medicine,” in Proc. Inter-
national Workshop on Wearable and Implantable Body Sensor Networks,
April 2006, pp. 1–37.
[2] J.-J. Kim and K. Roy, “Double gate-MOSFET subthreshold circuit for
ultra low power applications,” in IEEE Trans. on Electron Devices,
vol. 51, no. 9, pp. 1468–1474, Sept. 2004.
[3] B. Calhoun, A. Wang, and A. Chandrakasan, “Device sizing for mini-
mum energy operation in subthreshold circuits,” in Proc. IEEE Custom
Integrated Circuits Conference, Oct. 2004, pp. 95–98.
[4] B. H. Calhoun, A. Wang, and A. Chandrakasan, “Modeling and sizing
for minimum energy operation in subthreshold circuits,” in IEEE J. of
Solid-State Circuits, vol. 40, no. 9, pp. 1778–1786, Sept. 2005.
[5] J. Rodrigues, O. C. Akgun, and V. Owall, “A <1 pJ Sub-VT cardiac event
detector in 65 nm LL-HVT CMOS,” in Proc. VLSI-SoC, June 2010.
[6] M. Alioto, “Impact of NMOS/PMOS imbalance in ultra-low voltage
CMOS standard cells,” in Proc. IEEE European Conference on Circuit
Theory and Design, Aug. 2011.
[7] S. Amarchinta, H. Kanitkar, and D. Kudithipudi, “Robust and high per-
formance subthreshold standard cell design,” in Proc. IEEE International
Midwest Symposium on Circuits and Systems, Aug. 2009, pp. 1183–1186.
[8] O. C. Akgun and Y. Leblebici, “Energy efficiency comparison of asyn-
chronous and synchronous circuits operating in the sub-threshold regime,”
in J. of Low Power Electronics, vol. 4, Oct. 2008.
[9] O. Akgun, J. Rodrigues, Y. Leblebici, and V. Owall, “High-level energy
estimation in the sub-VT domain: Simulation and measurement of a
cardiac event detector,” in IEEE Trans. on Biomedical Circuits and
Systems, 2011.
