NCFET-Aware Voltage Scaling by Salamin, Sami et al.
Sami Salamin∗, Martin Rapp∗, Hussam Amrouch∗, Girish Pahwa‡, Yogesh Chauhan‡, Jörg Henkel∗
∗Department of Computer Science, Karlsruhe Institute of Technology, Karlsruhe, Germany
‡Electrical Engineering Department, Indian Institute of Technology Kanpur, Kanpur, India
{sami.salamin, martin.rapp, amrouch, henkel}@kit.edu, {girish, chauhan}@iitk.ac.in
Abstract—Negative Capacitance Field-Effect Transistor
(NCFET) has recently attracted significant attention. In NCFET technology with a thick ferroelectric layer, reducing the voltage increases the leakage power rather than decreasing it, due to the negative Drain-Induced Barrier Lowering (DIBL) effect. This work is the first to demonstrate the far-reaching consequences of this inverse dependency for existing power management techniques. Moreover, this work is the first to demonstrate that state-of-the-art Dynamic Voltage Scaling (DVS) techniques are sub-optimal for NCFET. Our investigation reveals that the optimal voltage, at which the total power is minimized, is not necessarily the minimum voltage required to fulfill the performance constraint (as in traditional DVS). Hence, an NCFET-aware DVS is key for high energy efficiency. We therefore propose the first NCFET-aware DVS technique, which selects the optimal voltage to minimize the power while following the dynamics of workloads. Our experimental results for a multi-core system demonstrate that NCFET-aware DVS achieves energy savings of 20% on average and up to 27% compared to traditional NCFET-unaware DVS techniques, while still fulfilling the same performance constraint (i.e., without trade-offs).
I. INTRODUCTION
NCFET is an emerging technology that has great potential
to replace the CMOS technology in the near future, since it
exhibits a considerable improvement in the circuit’s perfor-
mance by overcoming the fundamental limit of sub-threshold
swing. The sub-threshold swing of a transistor determines the
minimum increase in voltage required to raise the transistor’s
“on” current by one order of magnitude. Hence, the sub-
threshold swing determines how fast the transistor can switch
from the “off” to the “on” state. In conventional FET tech-
nology, the sub-threshold swing is limited to 60mV/decade at
room temperature due to the so-called “Boltzmann tyranny”
(i.e., the Boltzmann distribution of charge carriers at the source
of the transistor) [1]. To overcome this limitation, NCFET
incorporates a ferroelectric layer within the gate stack of the
transistor. This layer behaves as a Negative Capacitance (NC), which
results in a voltage amplification and allows the sub-threshold
swing of the transistor to fall below 60 mV/decade [1],
[2]. This, in turn, has two key implications for
high-performance and low-power applications: (1) Compared
to conventional FET, NCFET-based circuits can be clocked at
higher frequencies without the need to increase the voltage. (2)
Compared to conventional FET, NCFET-based circuits can be
clocked at the same frequency but at a much lower voltage
[3]. Like any other emerging technology, NCFET must be compatible with the existing standard CMOS fabrication process to become a reality. In this regard, a breakthrough was recently achieved when Globalfoundries fabricated NCFET-based circuits using their 14nm FinFET technology [4]. Hence, it is the right time to investigate the implications that NCFET technology has on circuit and system levels.
[Figure 1 plot: "FET and NCFET Leakage Dependencies"; x-axis: Operating Voltage (Vdd) [V]; curves: Conventional FET, NCFET]
Fig. 1: Leakage current (Ioff) of a single NCFET transistor in comparison with a conventional FET transistor, for 7nm FinFET technology, over a wide range of voltages. NCFET exhibits an inverse dependency of leakage current on Vdd, unlike conventional FET. This leads to a novel trade-off between leakage and dynamic power.
[Figure 2 plot: x-axis: Operating Voltage of NCFET (Vdd) [V]]
Fig. 2: Total power and its components (i.e., leakage and dynamic) of the canneal master thread, starting from the minimum possible voltage under a performance constraint. At all Vdd values, the same frequency of 1 GHz is applied. The total power decreases when Vdd increases until it reaches a turning point of minimum power. Then it starts to increase, dominated by dynamic power. Note that the ability of NCFET to operate at such a low Vdd (0.2V) is due to the inherent voltage amplification provided by the integrated negative capacitance.
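As a brief aside on the 60 mV/decade limit mentioned above: the thermionic limit of the sub-threshold swing is commonly written as SS = ln(10)·kT/q. The following minimal Python sketch evaluates this expression at a given temperature. The function and constant names are ours and serve illustration only.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant k
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge q


def swing_limit_mv_per_decade(temperature_k: float) -> float:
    """Thermionic sub-threshold swing limit ln(10)*k*T/q in mV/decade."""
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return math.log(10.0) * thermal_voltage * 1e3


if __name__ == "__main__":
    for temp_k in (300.0, 350.0):
        limit = swing_limit_mv_per_decade(temp_k)
        print(f"{temp_k:.0f} K: {limit:.1f} mV/decade")
```

At room temperature the result is slightly below 60 mV/decade, and the limit grows linearly with temperature.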
NCFET and Voltage Dependency: In conventional FET, the leakage current (Ioff) decreases when the voltage Vdd decreases. Hence, DVS techniques always aim at operating the processor at the minimum Vdd to minimize power. However, this well-known voltage dependency becomes inverse with respect to leakage power in NCFET due to the negative DIBL effect, which is a typical characteristic of short-channel NCFET [5], [6]. Negative DIBL reduces the threshold voltage (Vth) of the transistor and thus increases Ioff when the voltage decreases. In practice, when Vdd is increased in the “off” state, the gate charge reduces due to the electric field from the drain to the ferroelectric [7]. As the ferroelectric layer in NCFET is biased in a negative capacitance state, a decrease in charge leads to an increase in the voltage drop across the ferroelectric layer. Consequently, the voltage reaching the internal transistor gate decreases, resulting in a rise in the energy barrier for the electrons coming from the source. Thus, Ioff decreases as Vdd increases, instead of increasing (as in conventional FET).
978-1-7281-2954-9/19/$31.00 ©2019 European Union
To demonstrate that, we performed simulations for the 7nm
FinFET technology node for both baseline (i.e., conventional
FET without a ferroelectric layer) and NCFET in which a
ferroelectric layer with a 4nm thickness is in use. Results
are extracted using the BSIM-CMG, which is the industrial
standard compact model for FinFET technologies [8]. We
have modified the BSIM-CMG to incorporate the state-of-
the-art physics-based model of negative capacitance [9]. As
demonstrated in Fig. 1, unlike conventional FET, where reduc-
ing Vdd results in lower Ioff and thus lower leakage power,
reducing Vdd in NCFET results in a noticeable increase in
Ioff . Note that this transistor-level analysis is solely used here
to demonstrate the leakage dependency. However, power and
timing analysis in the rest of this paper are accurately obtained
using signoff tools for a full SoC (see Section IV).
To further demonstrate the consequences of this inverse dependency at the system level, we show in Fig. 2 the total power and its components (leakage and dynamic power) of the core running the master thread in a multi-core (2×2) system when the canneal benchmark from the PARSEC benchmark suite [10] is executed. A detailed explanation of the employed NCFET modeling and our experimental setup is presented in Section IV. As shown, scaling down the voltage reduces the dynamic power but also increases the leakage power. As a result, when the voltage is scaled down, the total power consumption decreases only until a turning point, after which it starts to increase again. Therefore, the optimal voltage (around 0.35V for this particular scenario) is not the minimum possible operating voltage. We will demonstrate in Section II that this novel trade-off in NCFET breaks the concept of determining Voltage-frequency (V-f) pairs at design time, which is the typical approach in conventional FET-based processors [11]. In this work, we demonstrate that NCFET requires voltage scaling that is antagonistic to conventional FET voltage scaling.
Hence, DVS techniques for NCFET must be aware of this property, which offers a novel trade-off. We will demonstrate the scenarios in which an NCFET-aware DVS is required to minimize the power. These are when a CPU is not operated at the peak frequency, or when applications with low CPU utilization are executed, such as memory-bound applications or applications with high synchronization overhead between threads. While a small body of work studies the impact of NCFET on performance and power at the circuit, single-processor and many-core processor levels [3], [6], there is no work on NCFET-aware power management, i.e., work that takes the aforementioned inverse leakage dependency into consideration. Therefore, we present in this paper the first DVS technique of this kind.
Our novel contributions within this paper are as follows:
(1) We are the first to demonstrate that voltage scaling in NCFET leads to a novel trade-off between leakage and dynamic power, resulting in an optimal point at which total power is minimized.
(2) The inverse voltage dependency in NCFET leads to different optimal voltages for different workloads, based on the share of leakage power in the total power.
(3) The aforementioned two points give rise to the necessity of developing an NCFET-aware DVS for runtime power optimization. In this work, we present the first technique that dynamically selects the optimal voltage in this new scenario.
[Figure 3 plot: x-axis: Operating Voltage of NCFET (Vdd) [V]; annotation: Traditional DVS would select Vmin=0.2V]
Fig. 3: Total power consumption of different workloads running at the same frequency (1 GHz) over voltage, starting from Vmin selected by the traditional DVS to achieve the required frequency. Different workloads exhibit different Vopt at which the total power is minimized. Such Vopt does not coincide with Vmin, which would be the decision of traditional DVS.
II. WORKLOAD DEPENDENCY
In this section, we show that with NCFET, the voltage should be selected based on the running workload. General-purpose processors run a variety of applications, which typically exhibit different characteristics at runtime, resulting in different runtime power consumption. In the context of this work, the optimal voltage (Vopt) is defined as the operating voltage at which the processor’s total power is minimized. Total power in this work is the peak total power consumption when a core is active. In traditional DVS, which was designed for conventional FET, the minimum voltage (Vmin) under a performance constraint always equals Vopt. Importantly, the selected Vmin does not depend on the running workload and hence it can be obtained at design time. Consequently, a set of V-f pairs is typically determined at design time and later employed by the DVS technique at runtime. However, in the context of NCFET, Vopt does not necessarily occur at Vmin. In an NCFET-based processor, Vopt depends on the share of leakage power in the total power and hence varies at runtime with the workload. Hence, selecting Vopt becomes a runtime decision.
To demonstrate how different workloads may have different Vopt, we present in Fig. 3 the total power across a wide range of voltages for two different workloads (obtained from the PARSEC benchmark suite [10]). As can be observed, different workloads exhibit different Vopt at which the total power is minimized, and this Vopt is not equal to Vmin.
All in all, NCFET-aware DVS is necessary because the behavior of total power consumption under voltage scaling changes, which stems from the inverse dependency of leakage power on voltage. The new behavior results in a novel trade-off between leakage and dynamic power. Depending on the leakage share (which is workload dependent), Vopt differs from Vmin. Importantly, our work exploits the trade-off between dynamic and leakage power at runtime. Therefore, it is fundamentally different from existing design-time trade-offs, e.g., changing the threshold voltage Vth of the transistor [12].
III. OUR NCFET-AWARE DVS TECHNIQUE
In the following, we first present the developed power and
performance models for selecting Vopt at runtime. Then, we
present our novel NCFET-aware DVS technique. Power and
performance models are derived based on [6].
A. Design Time: Power and Performance Modeling
The maximum sustainable operating frequency $f_{max}(V)$ depends on the voltage $V$ through the minimum delay $d_{min}(V)$:
\[ f_{max}(V) = \frac{1}{d_{min}(V)}, \qquad d_{min}(V) = a_{del} \cdot V^{b_{del}} + c_{del} \tag{1} \]
$a_{del}>0$, $b_{del}<0$, $c_{del}\geq 0$ are constant fitting parameters obtained at design time. Operating at the peak frequency with maximum CPU activity results in the following leakage and peak dynamic power consumption:
\[ P_{leak}(V) = a_{leak} \cdot V^{b_{leak}} \tag{2} \]
\[ P^{peak}_{dyn}(V, d_{min}(V)) = a_{dyn} \cdot V^{b_{dyn}} + c_{dyn} \tag{3} \]
$a_{dyn}>0$, $b_{dyn}>1$, $c_{dyn}\geq 0$, $a_{leak}>0$, $b_{leak}<0$ are constant fitting parameters. Both $P^{peak}_{dyn}(V, d_{min}(V))$ and $P_{leak}(V)$ are convex in $V$. It is possible to operate the CPU at a lower frequency (higher delay) than the maximum sustainable frequency (minimum sustainable delay). Since leakage power is independent of CPU activity, it is not affected. Dynamic power decreases in inverse proportion to the delay:
\[ P^{peak}_{dyn}(V, d) = \frac{d_{min}(V)}{d} \cdot P^{peak}_{dyn}(V, d_{min}(V)) \tag{4} \]
$P^{peak}_{dyn}(V, d)$ is convex in $V$ (for constant $d$) if $b_{dyn}+b_{del}>1$.
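The design-time models can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the delay is assumed to follow the power law d_min(V) = a_del · V^b_del + c_del, and all coefficient values in `FitParams` are hypothetical placeholders chosen only so that the curves look plausible between 0.2 V and 0.7 V. They are not the fitted parameters of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitParams:
    """Hypothetical fitting parameters for illustration (not from the paper)."""
    a_del: float = 0.075   # delay scale, in ns
    b_del: float = -1.6    # delay exponent (< 0)
    c_del: float = 0.0     # delay offset, in ns (>= 0)
    a_leak: float = 0.02   # leakage scale, in W
    b_leak: float = -2.0   # leakage exponent (< 0): leakage falls as V rises
    a_dyn: float = 8.0     # dynamic power scale, in W
    b_dyn: float = 3.5     # dynamic power exponent (> 1)
    c_dyn: float = 0.0     # dynamic power offset, in W (>= 0)


def d_min(p: FitParams, v: float) -> float:
    """Minimum sustainable delay in ns (assumed power-law form of Eq. (1))."""
    return p.a_del * v ** p.b_del + p.c_del


def f_max_ghz(p: FitParams, v: float) -> float:
    """Maximum sustainable frequency in GHz, the reciprocal of d_min in ns."""
    return 1.0 / d_min(p, v)


def p_leak(p: FitParams, v: float) -> float:
    """Leakage power in W, Eq. (2)."""
    return p.a_leak * v ** p.b_leak


def p_dyn_peak(p: FitParams, v: float, d: float) -> float:
    """Peak dynamic power in W at delay d >= d_min(v), Eqs. (3) and (4)."""
    at_f_max = p.a_dyn * v ** p.b_dyn + p.c_dyn
    return d_min(p, v) / d * at_f_max


if __name__ == "__main__":
    params = FitParams()
    delay_ns = 1.0  # a delay of 1 ns corresponds to 1 GHz
    for v in (0.2, 0.3, 0.4, 0.5, 0.6, 0.7):
        print(f"V={v:.1f} V  f_max={f_max_ghz(params, v):.2f} GHz  "
              f"P_leak={p_leak(params, v):.3f} W  "
              f"P_dyn_peak={p_dyn_peak(params, v, delay_ns):.3f} W")
```

With these placeholder values, leakage falls and peak dynamic power rises as the voltage increases, which is the qualitative behavior the models are meant to capture.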
B. Runtime: Workload-Dependent Power Modeling
The workload (application) executed at runtime affects the dynamic power consumption $P_{dyn}(V, d)$, which is reduced by a factor $0 \leq r_{dyn} \leq 1$ from the peak dynamic power $P^{peak}_{dyn}(V, d)$:
\[ P_{dyn}(V, d) = r_{dyn} \cdot P^{peak}_{dyn}(V, d) \tag{5} \]
\[ P_{total}(V, d) = P_{dyn}(V, d) + P_{leak}(V) \tag{6} \]
$r_{dyn}$ represents the current workload activity and therefore is not constant. When measuring the total power consumption $P_{total}(V_c, d)$ at the current voltage $V_c$, $r_{dyn}$ can be calculated:
\[ r_{dyn} = \frac{P_{dyn}(V_c, d)}{P^{peak}_{dyn}(V_c, d)} = \frac{P_{total}(V_c, d) - P_{leak}(V_c)}{P^{peak}_{dyn}(V_c, d)} \tag{7} \]
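The runtime step can be illustrated as follows: estimate the activity factor from one power measurement at the current voltage, then reuse it to predict the total power at another voltage. This is a minimal sketch. The function names, the clamping to [0, 1], and the example numbers are our own assumptions and do not come from the paper.

```python
def estimate_r_dyn(p_total_measured: float, p_leak_vc: float,
                   p_dyn_peak_vc: float) -> float:
    """Eq. (7): activity factor from a total-power measurement at V_c."""
    r_dyn = (p_total_measured - p_leak_vc) / p_dyn_peak_vc
    # Clamping to [0, 1] is our own safeguard against measurement noise.
    return min(max(r_dyn, 0.0), 1.0)


def total_power(r_dyn: float, p_dyn_peak_v: float, p_leak_v: float) -> float:
    """Eqs. (5) and (6): predicted total power at a candidate voltage."""
    return r_dyn * p_dyn_peak_v + p_leak_v


if __name__ == "__main__":
    # Made-up numbers in W, for illustration only.
    r = estimate_r_dyn(p_total_measured=0.5, p_leak_vc=0.2, p_dyn_peak_vc=1.0)
    print(f"r_dyn = {r:.2f}")
    # Candidate voltage at which the models give
    # P_dyn_peak = 1.5 W and P_leak = 0.05 W.
    print(f"P_total = {total_power(r, 1.5, 0.05):.3f} W")
```

Only the total power needs to be measured at runtime. The leakage and peak dynamic terms come from the design-time models.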
Algorithm 1 Our NCFET-aware voltage scaling algorithm to select the optimal voltage ($V_{opt}$) at runtime
Input: Power and performance models $P^{peak}_{dyn}(V, d)$ and $P_{leak}(V)$, current supply voltage $V_c$ and delay $d$, current power consumption $P_{curr}$, min. voltage resolution $\varepsilon$
Output: Optimal supply voltage $V_{opt}$
$r_{dyn} \leftarrow (P_{curr} - P_{leak}(V_c)) / P^{peak}_{dyn}(V_c, d)$   ▷ Eq. (7)
$V_{opt} \leftarrow V_{min}(d)$   ▷ Eq. (8)
repeat
    $\Delta V_{opt} \leftarrow -P'_{total}(V_{opt}, d) / P''_{total}(V_{opt}, d)$
    $V_{opt} \leftarrow V_{opt} + \Delta V_{opt}$   ▷ iterative update
    if $V_{opt} < V_{min}(d)$ then return $V_{min}(d)$   ▷ out of bounds
    if $V_{opt} > V_{max}$ then return $V_{max}$   ▷ out of bounds
until $|\Delta V_{opt}| < \varepsilon$   ▷ termination criterion
return $V_{opt}$
Fig. 4: Comparison of Vdd selected by NCFET-unaware (Vmin)
and NCFET-aware (Vopt) DVS based on frequency and leakage
to total power ratio. NCFET-unaware DVS always selects
Vmin that sustains the required frequency. NCFET-aware DVS
selects higher voltages (Vopt>Vmin) for low frequencies or
high leakage to total power ratio to minimize total power.
C. Runtime: Optimal Voltage Computing
$V_{opt}$, which minimizes the total power, can then be obtained within the feasible voltage range. The lower bound of this range is the minimum voltage $V_{min}(d)$ that sustains the delay $d$:
\[ V_{min}(d) = \min \{ V \mid d_{min}(V) \leq d \} \tag{8} \]
\[ V_{opt}(d, r_{dyn}) = \arg\min_{V_{min}(d) \leq V \leq V_{max}} P_{total}(V, d) \tag{9} \]
In our implemented algorithm, we exploit that $P_{total}(V, d)$ is convex in $V$, because it is composed of convex functions. Convexity guarantees that $P_{total}(V, d)$ has exactly one minimum w.r.t. $V$ within the range $[V_{min}(d), V_{max}]$. Therefore, we can use any convex optimization algorithm to efficiently obtain $V_{opt}$. We choose Newton’s method to achieve fast convergence. Algorithm 1 summarizes our implemented DVS technique.
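The steps of Algorithm 1 can be sketched in Python as shown below. This is an illustrative sketch, not the authors' implementation. It assumes the power-law delay model d_min(V) = a_del · V^b_del + c_del (so that V_min(d) has a closed-form inverse), uses hypothetical coefficients in `FitParams`, and adds two safeguards of our own: clamping r_dyn to [0, 1] and an iteration cap `max_iter`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitParams:
    """Hypothetical fitting parameters, for illustration only."""
    a_del: float = 0.075
    b_del: float = -1.6
    c_del: float = 0.0
    a_leak: float = 0.02
    b_leak: float = -2.0
    a_dyn: float = 8.0
    b_dyn: float = 3.5
    c_dyn: float = 0.0


def power_law(a: float, b: float, c: float, v: float):
    """Value, first and second derivative of a * v**b + c w.r.t. v."""
    return a * v**b + c, a * b * v**(b - 1), a * b * (b - 1) * v**(b - 2)


def total_power(p: FitParams, v: float, d: float, r_dyn: float) -> float:
    """P_total(V, d) following Eqs. (2)-(6), with the delay d in ns."""
    d_min_v = power_law(p.a_del, p.b_del, p.c_del, v)[0]
    dyn_at_f_max = power_law(p.a_dyn, p.b_dyn, p.c_dyn, v)[0]
    leak = power_law(p.a_leak, p.b_leak, 0.0, v)[0]
    return r_dyn * d_min_v / d * dyn_at_f_max + leak


def total_power_derivs(p: FitParams, v: float, d: float, r_dyn: float):
    """First and second derivative of P_total w.r.t. V (product rule)."""
    u, du, ddu = power_law(p.a_del, p.b_del, p.c_del, v)   # d_min(V)
    w, dw, ddw = power_law(p.a_dyn, p.b_dyn, p.c_dyn, v)   # dynamic at f_max
    _, dl, ddl = power_law(p.a_leak, p.b_leak, 0.0, v)     # leakage
    first = r_dyn / d * (du * w + u * dw) + dl
    second = r_dyn / d * (ddu * w + 2.0 * du * dw + u * ddw) + ddl
    return first, second


def v_min(p: FitParams, d: float) -> float:
    """Smallest voltage that sustains delay d (inverse of the delay model)."""
    return ((d - p.c_del) / p.a_del) ** (1.0 / p.b_del)


def select_v_opt(p: FitParams, v_c: float, d: float, p_curr: float,
                 v_max: float = 0.7, eps: float = 1e-4,
                 max_iter: int = 50) -> float:
    """Newton iteration on P_total(V, d), starting from V_min(d)."""
    leak_c = power_law(p.a_leak, p.b_leak, 0.0, v_c)[0]
    peak_dyn_c = total_power(p, v_c, d, 1.0) - leak_c
    r_dyn = (p_curr - leak_c) / peak_dyn_c          # Eq. (7)
    r_dyn = min(max(r_dyn, 0.0), 1.0)               # our own safeguard
    lower = v_min(p, d)
    v = lower
    for _ in range(max_iter):                       # cap is our own safeguard
        first, second = total_power_derivs(p, v, d, r_dyn)
        step = -first / second                      # Newton step
        v += step
        if v < lower:
            return lower                            # out of bounds
        if v > v_max:
            return v_max                            # out of bounds
        if abs(step) < eps:
            break                                   # converged
    return v


if __name__ == "__main__":
    params = FitParams()
    delay_ns = 1.0  # 1 GHz
    measured = total_power(params, 0.3, delay_ns, r_dyn=0.5)  # synthetic reading
    v_opt = select_v_opt(params, v_c=0.3, d=delay_ns, p_curr=measured)
    print(f"V_min = {v_min(params, delay_ns):.3f} V, V_opt = {v_opt:.3f} V")
```

Starting at V_min(d) means that a negative first Newton step immediately identifies the case in which the minimum voltage is already optimal.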
D. NCFET-Aware and NCFET-Unaware DVS
Fig. 4 shows the design space with NCFET-aware (Vopt) and NCFET-unaware DVS (Vmin). NCFET-unaware DVS sets the minimum voltage that is needed to sustain the required frequency and therefore does not consider workload behavior. In contrast, NCFET-aware DVS does consider the workload, as its selection depends on the ratio of leakage to total power measured at Vmin. The explored design space reveals the following:
(a) Two trends can be observed: (1) The higher the frequency, the higher Vopt (i.e., the selected voltage) is. This is completely consistent with NCFET-unaware DVS. (2) The higher the leakage to total power ratio, the higher Vopt is. This is because leakage power becomes prominent and should therefore be reduced by selecting a higher Vdd.
(b) Two distinct regions exist: (1) For low leakage to total power ratio and for high frequencies, both techniques (NCFET-aware and traditional NCFET-unaware) select the same voltage (i.e., Vopt=Vmin). (2) For high ratios of leakage to total power or low frequencies, NCFET-aware DVS selects higher voltages than the minimum voltage to minimize the total power (Vopt>Vmin).
As shown in Fig. 4, the shapes of the operating voltage for the two techniques differ and they follow different trends (i.e., they are not just shifted or scaled). With NCFET-aware DVS, Vopt almost always differs from Vmin. Hence, it is indispensable to develop an NCFET-aware DVS to optimize the total power.
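The two regions can be made explicit under a simplifying assumption that is ours and not part of the models in Section III: if the delay follows the power law $d_{min}(V) = a_{del} \cdot V^{b_{del}}$ and the constant offsets are neglected ($c_{del} = c_{dyn} = 0$), the stationary point of the total power has a closed form.

```latex
\begin{align*}
P_{total}(V, d) &= \frac{r_{dyn}\, a_{del}\, a_{dyn}}{d}\, V^{\beta}
                   + a_{leak}\, V^{b_{leak}},
                   \qquad \beta = b_{dyn} + b_{del} \\
\frac{\partial P_{total}}{\partial V}
                &= \frac{r_{dyn}\, a_{del}\, a_{dyn}\, \beta}{d}\, V^{\beta - 1}
                   + a_{leak}\, b_{leak}\, V^{b_{leak} - 1} = 0 \\
V^{*}           &= \left( \frac{-\, a_{leak}\, b_{leak}\, d}
                   {r_{dyn}\, a_{del}\, a_{dyn}\, \beta} \right)^{\frac{1}{\beta - b_{leak}}} \\
V_{opt}         &= \min\bigl( \max\bigl( V^{*},\, V_{min}(d) \bigr),\, V_{max} \bigr)
\end{align*}
```

Since $b_{leak} < 0$ and $\beta > 1$, the unconstrained optimum $V^{*}$ grows with the delay $d$ (i.e., with lower frequency) and shrinks with $r_{dyn}$ (i.e., with a lower leakage share). Under this assumption, $V_{opt} > V_{min}$ is therefore expected for low frequencies or high leakage ratios, while $V_{opt} = V_{min}$ results once $V^{*}$ falls below $V_{min}(d)$.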
IV. EXPERIMENTAL EVALUATION
In the following, we present different evaluation phases to examine the effectiveness of our NCFET-aware DVS. We start with the experimental setup, which describes the employed tools and models. Then, we investigate the scenarios in which our algorithm is beneficial. Lastly, we show the energy savings obtained when employing our algorithm.
A. Experimental Setup
As summarized in Fig. 5, our experimental setup consists
of two main parts: (1) processor-level power and performance
modeling and (2) system-level simulation to evaluate the effi-
ciency of a multi-core system under the effects that traditional
(i.e., NCFET-unaware) DVS and our NCFET-aware DVS have.
Processor-Level Power and Performance Modeling: To
obtain detailed power and performance modeling of the pro-
cessor under the effects of NCFET, we implement a full tile
of the state-of-the-art OpenPiton SoC [13]. OpenPiton is an
open-source multi-core processor based on the OpenSPARC
T1 architecture. Using a physics-based NCFET model [9],
integrated within the industrial compact model of FinFET
(BSIM-CMG) [8], NCFET-aware cell libraries were created
[3], [14] based on the open-source 7nm FinFET PDK [15].
Our libraries are fully compatible with existing EDA tool flows
(Synopsys and Cadence). This allows us to directly deploy
them to perform a full chip design starting from logic synthesis
all the way to the GDSII level (i.e., full chip layout design).
To accurately estimate how the power and performance of the processor are affected by NCFET for a wide range of voltages, we created our cell libraries for different voltages at the 7nm FinFET technology node. Moreover, a room temperature of 25°C is assumed in all analyses. Since the optimal voltage Vopt depends on the application’s characteristics, which might be unknown a priori, we analyze the whole voltage range from 0.2 to 0.7V that is supported by the cell libraries. The thickness of the employed ferroelectric layer is selected to be 4nm because with larger thicknesses, a hysteresis-free operation of NCFET can no longer be ensured [3] in the employed 7nm FinFET transistor. To accurately estimate the resulting power and delay, industrial signoff tools (Cadence Voltus [16] and Tempus [17]) are employed. Afterwards, based on the obtained processor analysis (as in [6]), we develop analytical power and performance models as described in Section III. These models are later integrated within a system-level simulator for runtime voltage selection.
[Figure 5 flowchart: 1) Processor-level power and performance modelling (standard cell library characterization; power and performance modelling; power and frequency); 2) System-level simulation for DVS implementation (runtime power traces; NCFET-aware OR NCFET-unaware DVS; energy efficiency evaluation)]
Fig. 5: General overview of our design flow demonstrating the implementation at different abstraction levels to investigate the efficiency of a multi-core system when our NCFET-aware DVS versus traditional (NCFET-unaware) DVS is employed.
System-Level Simulation: We evaluate our proposed NCFET-aware DVS technique on a multi-core (2×2) system. Each core has private L1-I and L1-D caches of 32 KB each and a private L2 cache of 256 KB. The 8 MB L3 cache is shared among all four cores. We use the HotSniper tool-chain [18] to simulate our multi-core system. It combines the Sniper multi-core simulator [19] with a periodic invocation of McPAT [20] for runtime power estimation. We run tasks from the PARSEC benchmark suite [10], which is commonly used to evaluate multi-core systems. The benchmarks facesim and raytrace do not offer a simsmall input, and the benchmarks ferret, freqmine and vips have unresolvable instrumentation errors. Therefore, we had to skip these five benchmarks in our analysis. Since McPAT does not support the NCFET technology (investigated in this work), we estimate power at 45nm using McPAT and then scale dynamic and leakage power to 7nm NCFET. To this end, we additionally implemented the OpenPiton SoC at the 45nm technology node [21]. The frequency-dependent scaling factors are obtained by comparing the dynamic and leakage power consumption of both technology nodes based on our previous work in [6]. The frequencies are set between 1.0 GHz and 2.4 GHz. The upper limit of 2.4 GHz stems from our scaling-based approach, in which we first employ McPAT to estimate the power at 45nm. Vdd is set between 0.2 V and 0.7 V. It is important to note that a low Vdd (i.e., 0.2V) does not result in sub-threshold computing due to the inherent voltage amplification provided by the negative capacitance. For a fair comparison, we configured our simulator to use the same frequencies, voltage range, architecture, and benchmarks for both DVS techniques.
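The scaling step from 45nm estimates to 7nm NCFET can be pictured with a small Python sketch. It assumes that the frequency-dependent scaling factors are multiplicative and that they may be linearly interpolated between tabulated frequencies. Both points are our assumptions, and the numbers in `SCALING_TABLE` are hypothetical placeholders, not the factors derived in [6].

```python
from bisect import bisect_left

# Hypothetical table: frequency in GHz -> (dynamic factor, leakage factor).
# Placeholder numbers, NOT the scaling factors used in the paper or in [6].
SCALING_TABLE = [
    (1.0, 0.10, 0.30),
    (1.7, 0.12, 0.25),
    (2.4, 0.15, 0.20),
]


def scaling_factors(freq_ghz: float):
    """Linearly interpolate (k_dyn, k_leak) for a frequency within the table."""
    freqs = [row[0] for row in SCALING_TABLE]
    if not freqs[0] <= freq_ghz <= freqs[-1]:
        raise ValueError(f"frequency {freq_ghz} GHz is outside the table range")
    hi = bisect_left(freqs, freq_ghz)
    if freqs[hi] == freq_ghz:
        return SCALING_TABLE[hi][1], SCALING_TABLE[hi][2]
    lo = hi - 1
    t = (freq_ghz - freqs[lo]) / (freqs[hi] - freqs[lo])
    k_dyn = SCALING_TABLE[lo][1] + t * (SCALING_TABLE[hi][1] - SCALING_TABLE[lo][1])
    k_leak = SCALING_TABLE[lo][2] + t * (SCALING_TABLE[hi][2] - SCALING_TABLE[lo][2])
    return k_dyn, k_leak


def scale_to_ncfet(p_dyn_45nm: float, p_leak_45nm: float, freq_ghz: float):
    """Scale a 45nm (dynamic, leakage) power sample to the target node."""
    k_dyn, k_leak = scaling_factors(freq_ghz)
    return p_dyn_45nm * k_dyn, p_leak_45nm * k_leak


if __name__ == "__main__":
    # Made-up 45nm power sample in W at 1.7 GHz.
    dyn, leak = scale_to_ncfet(p_dyn_45nm=10.0, p_leak_45nm=2.0, freq_ghz=1.7)
    print(f"dynamic = {dyn:.2f} W, leakage = {leak:.2f} W")
```

Dynamic and leakage power are scaled separately because their ratio differs between the two technology nodes.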
Fig. 6: Traditional DVS only selects the optimal Vdd
(Vopt=Vmin) for some PARSEC benchmarks when operated
at very high frequency. In all other cases total power is
minimized at higher operating voltage (Vopt>Vmin). This
demonstrates the necessity for NCFET-aware DVS.
Fig. 7: This figure illustrates the total power consumption over
voltage of fluidanimate running at different frequencies. This
example emphasizes again that power is minimized not at
Vmin, but at a higher voltage.
B. When does NCFET-Aware DVS result in Power Saving?
In the following, we demonstrate the conditions in which
our presented NCFET-aware DVS technique results in total
power reduction and hence more energy savings compared
to the baseline (i.e., traditional DVS). As explained in Sec-
tion III-D, NCFET-aware DVS selects a higher Vdd than
traditional DVS for low frequency or high ratio of leakage to
total power. This area is highlighted in Fig. 6 (Vopt>Vmin).
Fig. 6 also shows the ratio of leakage power to total power
for a representative set of PARSEC benchmarks operating
at different frequencies. Different workloads exhibit different
ratios of leakage to total power. This ratio decreases with
increasing frequency because dynamic power consumption
increases more strongly than leakage. We do not show all
PARSEC benchmarks in this figure to maintain the readability.
It can be noticed that for almost all scenarios (i.e., running
a workload at a certain frequency), it is required to select
a higher operating voltage than Vmin to minimize the total
power. This experiment demonstrates that NCFET-aware DVS
is required not only in some corner cases, but in almost all
execution scenarios of workloads. In the case where Vmin=Vopt,
traditional DVS already selects the optimal operating voltage,
which is also selected by our proposed NCFET-aware DVS,
i.e., our proposed NCFET-aware DVS induces no power losses.
Furthermore, Fig. 7 illustrates the impact
of frequency on Vdd selection and the total power consumption
as well for fluidanimate. For low frequency Vopt>Vmin and
Fig. 8: (a) operating voltage Vdd and (b) total power consump-
tion of the canneal master thread with our NCFET-aware DVS
and NCFET-unaware DVS. NCFET-aware DVS overscales the
voltage based on the workload and thereby reduces the total
power by up to 67 % in phase-2 at the same CPU frequency
and results in total energy savings of 17 %. Traditional DVS
(using V-f pairs determined at design time) fails when it
comes to NCFET. As shown for point A and C, they have
the same frequency but different voltages. At point B, voltage
is antagonistically selected between the two DVS techniques.
C. Energy Savings with Our NCFET-Aware DVS
In this section, we evaluate the effectiveness of our NCFET-
aware DVS. First, we show how NCFET-aware DVS saves
power, then we show how total power saving varies at runtime.
Lastly, to demonstrate the effectiveness of NCFET-aware DVS
technique, we report the energy saving for different bench-
marks in comparison with NCFET-unaware DVS.
Fig. 8 presents an illustrative example. The master thread of PARSEC canneal shows distinct phases during execution. The total power consumption during phase-1 gradually decreases as shown in Fig. 8b. The frequency is set to 1.7 GHz. Traditional DVS sets Vdd to the minimum voltage (0.28 V) required to sustain this frequency. Thereby, dynamic power is minimized, but the leakage power is high. Our NCFET-aware DVS sets Vdd to a higher value (up to 0.37 V), which increases the dynamic power but decreases the leakage power more strongly, resulting in a lower total power compared to traditional DVS. As can be noticed, Vdd is not constant, but increases slightly over time. This is because dynamic power decreases and it therefore becomes more beneficial to decrease the leakage power. It is important to note that operating voltages in NCFET are lower than in traditional FET due to the inherent amplification in NCFET provided by the ferroelectric layer.
During phase-2, the master thread is idle and awaits the termination of the slave threads. The frequency is reduced to the minimum frequency (1.0 GHz). Here, leakage power dominates the total power. Traditional DVS would reduce Vdd down to 0.2 V due to the low required frequency. Operating at such a low voltage strongly increases the leakage power in NCFET. Our NCFET-aware DVS, instead of reducing Vdd, boosts the voltage to 0.53 V to minimize the leakage power. Thereby, the total power consumption during phase-2 is reduced by up to 67 % (see Fig. 8). After the slaves terminated, the master resumes operation in phase-3 and its frequency is increased again to 1.7 GHz. However, CPU activity here is very low due to the frequent memory accesses. Our NCFET-aware DVS is able to exploit this by using a higher Vdd than in phase-1, even though the frequency is the same. The total energy consumption of all phases is reduced by 17 %. It is important to note that our technique does not simply statically increase Vdd; in fact, it results in the opposite behavior compared to traditional DVS. Traditional DVS decreases Vdd in phase-2, whereas Vdd needs to be increased to minimize the total power, as shown at point B in Fig. 8a where Vdd is antagonistically selected. Furthermore, the V-f pairs model, which is used in traditional DVS, no longer holds with NCFET. As shown in Fig. 8a for points A and C, the CPU operates at the same frequency but with different selected Vdd.
Fig. 9: Energy results and energy savings of different benchmarks running at 1.7GHz using our NCFET-aware DVS compared to NCFET-unaware DVS. Our NCFET-aware DVS technique results in up to 27 % energy savings (20 % on average) while still providing the same performance.
Fig. 9 summarizes the energy savings for different PARSEC
benchmarks with simsmall inputs when active threads are op-
erated at 1.7 GHz and idle cores are throttled to 1.0 GHz. The
DVS techniques (i.e., NCFET-aware and NCFET-unaware) do
not affect performance since the frequency is the same with
both techniques. Therefore, the only difference is the selected
Vdd, which results in a different total power. Energy savings
range from 14 % for blackscholes up to 27 % for dedup.
Two factors affect the observed gains: (1) the CPU utilization, which affects the dynamic power consumption of active threads, and (2) the idle times of threads, which result from synchronization between threads. The higher the total power consumption of active threads, the lower the possible gains for these threads. This is the reason why, e.g., swaptions results in low gains. Long idle times of threads result in higher gains since the total power consumption during idle times mainly consists of leakage, which is reduced by our NCFET-aware DVS technique. This is one reason why fluidanimate has high savings. Overall, the average energy saving is 20 %.
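Because both DVS techniques run at the same frequencies, their power traces cover the same execution time, and the energy saving follows directly from the integrated power. The sketch below shows this calculation for equally spaced power samples. The function names and the trace values are made up for illustration and are not measurement results.

```python
def energy_joules(power_trace_w, interval_s: float) -> float:
    """Energy of a power trace sampled at a fixed interval."""
    return sum(power_trace_w) * interval_s


def energy_saving_percent(baseline_trace_w, aware_trace_w,
                          interval_s: float) -> float:
    """Relative energy saving of the NCFET-aware trace over the baseline."""
    if len(baseline_trace_w) != len(aware_trace_w):
        raise ValueError("traces must cover the same execution time")
    e_base = energy_joules(baseline_trace_w, interval_s)
    e_aware = energy_joules(aware_trace_w, interval_s)
    return 100.0 * (e_base - e_aware) / e_base


if __name__ == "__main__":
    # Made-up power samples in W (active phase, then idle phase, then active).
    unaware = [2.0, 1.0, 1.0]
    aware = [1.5, 0.5, 1.0]
    saving = energy_saving_percent(unaware, aware, interval_s=1e-3)
    print(f"energy saving: {saving:.1f} %")
```

In such a calculation, long idle phases weigh heavily because their power is mostly leakage, which the higher voltage reduces.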
V. SUMMARY AND CONCLUSIONS
NCFET is a promising emerging technology that has recently become compatible with standard CMOS technology. In this work, we presented the first NCFET-aware DVS. The necessity for a novel DVS technique stems from the inverse voltage dependency of leakage power in NCFET technology, which is caused by the negative DIBL effect. Our NCFET-aware DVS, implemented at the 7nm FinFET technology node, selects the optimal voltage at runtime, following the dynamics induced by running workloads. Compared to traditional (NCFET-unaware) DVS, our technique results in up to 27% energy saving because it considers the novel runtime trade-off between leakage and dynamic power that NCFET brings. NCFET-aware DVS is key to achieving the highest level of power efficiency in this emerging technology.
ACKNOWLEDGMENTS
This work (except the NCFET part) was supported in
parts by the Deutsche Forschungsgemeinschaft (DFG, German
Research Foundation) – Projektnummer 146371743 – TRR 89
“Invasive Computing”.
REFERENCES
[1] S. Salahuddin and S. Datta, “Use of Negative Capacitance to Provide
Voltage Amplification for Low Power Nanoscale Devices,” Nano Letters,
vol. 8, no. 2, 2008.
[2] M. Hoffmann, F. P. Fengler, M. Herzig et al., “Unveiling the Double-
Well Energy Landscape in a Ferroelectric Layer,” Nature, 2019.
[3] H. Amrouch, G. Pahwa, A. D. Gaidhane et al., “Negative Capacitance
Transistor to Address the Fundamental Limitations in Technology Scal-
ing: Processor Performance,” IEEE Access, vol. 6, 2018.
[4] Z. Krivokapic, U. Rana, R. Galatage et al., “14nm Ferroelectric FinFET
Technology with Steep Subthreshold Slope for Ultra Low Power Appli-
cations,” in IEEE Int. Electron Devices Meeting (IEDM), Dec 2017.
[5] G. Pahwa, T. Dutta, A. Agarwal et al., “Designing Energy Efficient and
Hysteresis Free Negative Capacitance FinFET with Negative DIBL and
3.5X ION Using Compact Modeling Approach,” in European Solid-State
Circuits Conference (ESSCIRC), 2016.
[6] M. Rapp, S. Salamin, H. Amrouch, G. Pahwa, Y. S. Chauhan, and
J. Henkel, “Performance, Power and Cooling Trade-Offs with
NCFET-based Many-Cores,” in Design Automation Conference
(DAC), 2019.
[7] G. Pahwa, A. Agarwal, and Y. S. Chauhan, “Numerical Investigation
of Short-Channel Effects in Negative Capacitance MFIS and MFMIS
Transistors: Subthreshold Behavior,” IEEE Transactions on Electron
Devices (TED), vol. 65, no. 11, 2018.
[8] “BSIM-CMG Technical Manual,” October 2019,
http://www-device.eecs.berkeley.edu/bsim/?page=BSIMCMG.
[9] G. Pahwa, T. Dutta, A. Agarwal et al., “Analysis and Compact Modeling
of Negative Capacitance Transistor with High ON-Current and Negative
Output Differential Resistance—Part II: Model Validation,” Transactions
on Electron Devices (TED), vol. 63, no. 12, 2016.
[10] C. Bienia, S. Kumar, J. P. Singh et al., “The PARSEC Benchmark
Suite: Characterization and Architectural Implications,” in Parallel
Architectures and Compilation Techniques (PACT), 2008.
[11] J. H. Choi, J. Murthy, and K. Roy, “The Effect
of Process Variation on Device Temperature in FinFET Circuits,” in
International Conference on Computer-aided Design (ICCAD), 2007.
[12] T. Kuroda, “Optimization and Control of VDD and VTH for Low-power,
High-speed CMOS Design,” in International Conference on Computer-
aided Design, ser. ICCAD. ACM, 2002.
[13] J. Balkind, M. McKeown, Y. Fu et al., “OpenPiton: An Open Source
Manycore Research Framework,” in Architectural Support for Program-
ming Languages and Operating Systems (ASPLOS), ser. ASPLOS, 2016.
[14] H. Amrouch, S. Salamin, G. Pahwa, A. Gaidhane, J. Henkel, and
Y. S. Chauhan, “Unveiling the Impact of IR-drop on Performance
Gain in NCFET-based Processors,” Transactions on Electron
Devices (TED), 2019.
[15] L. T. Clark, V. Vashishtha, L. Shifren et al., “ASAP7: A 7-nm FinFET
predictive process design kit,” Microelectronics Journal, vol. 53, 2016.
[16] “Voltus IC Power Integrity Solution,” https://www.cadence.com.
[17] “Tempus Timing Signoff Solution,” https://www.cadence.com.
[18] A. Pathania and J. Henkel, “HotSniper: Sniper-Based Toolchain for
Many-Core Thermal Simulations in Open Systems,” IEEE Embedded
Systems Letters (ESL), 2018.
[19] T. E. Carlson, W. Heirman, and L. Eeckhout, “Sniper: Exploring the
Level of Abstraction for Scalable and Accurate Parallel Multi-Core
Simulation,” in Int. Conf. for High Performance Computing, Networking,
Storage and Analysis (SC). ACM, 2011.
[20] S. Li, J. H. Ahn, R. D. Strong et al., “The McPAT Framework
for Multicore and Manycore Architectures: Simultaneously Modeling
Power, Area, and Timing,” TACO.
[21] NanGate, “Open Cell Library,” https://www.silvaco.com/.
