Energy-Efficient Circuit Design by Nugent, Michael
ENERGY-EFFICIENT CIRCUIT DESIGN
by
Michael Nugent
B.S. Computer Science and Mathematics, University of Pittsburgh,
2007
Submitted to the Graduate Faculty of
the Kenneth P. Dietrich School of Arts and Sciences in partial
fulfillment
of the requirements for the degree of
Doctor of Philosophy
University of Pittsburgh
2015
UNIVERSITY OF PITTSBURGH
DIETRICH SCHOOL OF ARTS AND SCIENCES
This dissertation was presented
by
Michael Nugent
It was defended on
April 29, 2015
and approved by
Kirk Pruhs, Department of Computer Science
Daniel Mossé, Department of Computer Science
Adam Lee, Department of Computer Science
Anupam Gupta, Department of Computer Science, Carnegie Mellon University
Dissertation Director: Kirk Pruhs, Department of Computer Science
ENERGY-EFFICIENT CIRCUIT DESIGN
Michael Nugent, PhD
University of Pittsburgh, 2015
We initiate the theoretical investigation of energy-efficient circuit design. We assume that
the circuit design specifies the circuit layout as well as the supply voltages for the gates. To
obtain maximum energy efficiency, the circuit design must balance the conflicting demands
of minimizing the energy used per gate and minimizing the number of gates in the circuit:
if the energy supplied to the gates is small, then functional failures are likely, necessitating
a circuit layout that is more fault-tolerant, and thus has more gates.
By leveraging previous work on fault-tolerant circuit design, we show general upper and
lower bounds on the amount of energy required by a circuit to compute a given relation.
We show that some circuits would be asymptotically more energy-efficient if heterogeneous
supply voltages were allowed, and show that for some circuits the most energy-efficient supply
voltages are homogeneous over all gates.
In the traditional approach to circuit design the supply voltages for each transistor/gate
are set sufficiently high so that with sufficiently high probability no transistor fails. We
show that if there is a better (in terms of worst-case relative error with respect to energy)
method than the traditional approach then P = NP, and thus there is a complexity-theoretic
obstacle to achieving energy savings with Near-Threshold computing.
We show that almost all Boolean functions require circuits that use exponential energy.
This is not an immediate consequence of Shannon’s classic result that most functions require
exponential sized circuits of faultless gates because, as we show, the same circuit layout can
compute many different functions, depending on the value of the supply voltage.
If the error bound must vanish as the number of inputs increases, we show that a natural
class of functions can be computed with asymptotically less energy using heterogeneous
supply voltages than is possible using homogeneous supply voltages. We also prove upper
bounds on the asymptotic energy savings achieved by using heterogeneous supply voltages
over homogeneous supply voltages for a class of functions, and also show a relation that can
bypass this bound.
TABLE OF CONTENTS
PREFACE
1.0 INTRODUCTION
1.1 Related Work
1.2 Our Contributions
1.2.1 General Bounds on Circuits With Constant Error
1.2.2 Introduction to Heterogeneity
1.2.3 Hardness Results
1.2.4 Almost All Functions Require Exponential Energy
1.2.5 The Power of Heterogeneity to Reduce Energy
2.0 MODEL, DEFINITIONS, AND NOTATION
3.0 GENERAL ENERGY UPPER AND LOWER BOUNDS
3.1 A General Energy Lower Bound
3.2 A General Energy Upper Bound
4.0 INTRODUCTION TO SUPPLY VOLTAGE HETEROGENEITY
4.1 Supply Voltage Heterogeneity May Not Help
4.2 Supply Voltage Heterogeneity Can Help
5.0 HARDNESS AND ALGORITHMIC RESULTS FOR CIRCUIT ENERGY PROBLEMS
5.1 Polynomial-Time Approximation of the Minimum Circuit Energy Problem
5.2 Hardness of Approximation for the Minimum Circuit Energy Problem
5.3 Hardness of Determining (ε, δ)-reliability on Fixed Inputs
5.4 Tree Circuits
5.5 Non-Monotonicity of δ in ε
6.0 ALMOST ALL FUNCTIONS REQUIRE EXPONENTIAL ENERGY
6.1 A Lower Bound on the Number of Functions Computable by a Circuit
6.1.1 Homogeneous Supply Voltages
6.1.2 Heterogeneous Supply Voltages
6.2 Almost all Functions Require Exponential Energy
6.2.1 Adaptation of Shannon’s Argument
6.2.2 Homogeneous Supply Voltages
6.2.3 Heterogeneous Supply Voltages
6.3 Relating Energy and the Number of Noisy Gates
7.0 THE POWER OF HETEROGENEITY TO REDUCE ENERGY
7.1 Lower Bound for Functions
7.2 Upper Bound for Functions
7.3 Lower Bound for Relations
7.4 Generalizing the Failure-to-Energy Function
8.0 CONCLUSION
8.1 Open Problems
8.1.1 Solving the Minimum Circuit Energy Problem for Restricted Classes of Circuits
8.1.2 Whether Heterogeneity Reduces Energy When δ is a Fixed Constant
8.1.3 Whether log n Energy Savings via Heterogeneity is the Maximum Possible When δ Vanishes
8.1.4 The Power of the Exact Failure Model
BIBLIOGRAPHY
LIST OF FIGURES
1 Semi-log plot of voltage-to-failure for an SRAM cell from [16].
2 Two SRAM circuits with the same functionality.
3 (a) $\Pr[r \text{ outputs } 1] \ge 1 - p$. (b) The path from b to g. The input gates $b_i$ receive input 0.
4 A subtree B. The solid edges denote the full ternary subtree T ∈ Γ. Note that T has 1’s as inputs on its leaves. The dashed edges denote the edges in B \ T. The gray nodes denote gates that failed.
5 The circuit $S_\varphi$ where $\varphi = (x_1 \lor \bar{x}_2 \lor x_4) \land (x_1 \lor x_2 \lor x_3) \land (\bar{x}_3 \lor x_6 \lor x_5) \land (x_3 \lor \bar{x}_5 \lor \bar{x}_6)$.
6 A simple circuit where $\delta^*(\epsilon)$ is not monotone in ε in the von Neumann failure model, consisting of two OR gates and one AND gate.
7 A simple circuit where $\delta^*(\epsilon)$ is not monotone in ε in the 0-default failure model, consisting of an AND gate and two NOT gates.
8 The circuit used in the proof of Theorem 53.
9 A tree of adders. The majority of $x_1, \ldots, x_n$ is a function of $y_1, \ldots, y_{\log n}$ that is computable by a circuit of size o(n).
10 A 1-bit full adder: logic block (left) and circuit realization (right).
PREFACE
The research within this dissertation was published in the following papers, and was a
collaboration with the listed coauthors:
• Energy-Efficient Circuit Design, published in the 5th conference on Innovations in The-
oretical Computer Science (ITCS), with Antonios Antoniadis, Neal Barcelo, Kirk Pruhs,
and Michele Scquizzato [6].
• Complexity-Theoretic Obstacles to Achieving Energy Savings with Near-Threshold Com-
puting, published in the 5th International Green Computing Conference (IGCC), with
Antonios Antoniadis, Neal Barcelo, Kirk Pruhs, and Michele Scquizzato [5].
• Almost All Functions Require Exponential Energy, to appear in the 40th International
Symposium on Mathematical Foundations of Computer Science (MFCS), with Neal
Barcelo, Kirk Pruhs, and Michele Scquizzato [8].
• The Power of Heterogeneity in Near-Threshold Computing, in submission to the 6th
International Green Computing Conference (IGCC), with Neal Barcelo, Kirk Pruhs, and
Michele Scquizzato [9].
We thank Rami Melhem for insightful discussions about Near-Threshold Computing.
1.0 INTRODUCTION
The number of transistors per unit volume on a chip continues to double about every two
years. However, about a decade ago chip makers hit a thermal wall as the cost of cooling
chips with these transistor densities became prohibitive. This has resulted in Moore’s gap,
namely that increased transistor density no longer directly translates into a similar increase
in performance, and in energy becoming the first-order design constraint in CMOS-based
technologies.
One possible technique to attain more energy-efficient circuits is Near-Threshold Com-
puting. The threshold voltage of a transistor is the minimum voltage at which the transis-
tor starts to conduct current, around 0.2-0.3V for modern processors. Of course, even for
identically-designed transistors, there can be variations in the actual threshold voltage due
to manufacturing variations; and even for the same transistor, the actual threshold voltage
will vary with environmental conditions. Further, actual supply voltages may differ from the
designed voltage due to manufacturing and environmental variance. Thus if the designed
supply voltage was exactly the ideal threshold voltage, some transistors would likely fail to
conduct current as designed. For example, for a typical 65 nm SRAM circuit, halving the
supply voltage from the nominal level to 0.5V typically increases the failure rate by about
5 orders of magnitude (see Figure 1 from [16]). Since the relationship between voltage and
the log of the failure rate is approximately linear, the error as a function of supply voltage v is
approximately of the form $\epsilon(v) = c^{-v}$, for some positive constant c. Using the fact that
the energy is proportional to the square of the supply voltage [10], the energy used by a
65nm SRAM with failure rate ε is $\Theta(\log^2(1/\epsilon))$.
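The relationship between failure rate and energy described above can be sketched numerically. The following is a minimal illustration, assuming the idealized model $\epsilon(v) = c^{-v}$ with c > 1 and energy proportional to $v^2$; the function names and the calibration constant C are hypothetical and chosen only so that a 0.5 V drop raises the failure rate by five orders of magnitude, in the spirit of the SRAM example.

```python
import math

def failure_rate(v, c):
    """Failure probability eps(v) = c**(-v) under the idealized exponential model."""
    return c ** (-v)

def voltage_for_failure_rate(eps, c):
    """Invert eps = c**(-v) to get v = log_c(1/eps)."""
    return math.log(1.0 / eps) / math.log(c)

def relative_energy(eps, c):
    """Energy up to a constant factor: proportional to v**2 = (log_c(1/eps))**2."""
    return voltage_for_failure_rate(eps, c) ** 2

# Hypothetical calibration (an assumption, not a measured value):
# a 0.5 V drop multiplies the failure rate by 1e5, so c**0.5 = 1e5, i.e. c = 1e10.
C = 1e10

if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-8):
        print(eps, voltage_for_failure_rate(eps, C), relative_energy(eps, C))
```

Under this model, squaring the target failure rate (for example going from 1e-2 to 1e-4) doubles the required voltage and therefore quadruples the energy, which is the $\log^2(1/\epsilon)$ behavior stated in the text.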
The traditional design approach to achieving fault tolerance is to set the supply voltage
to be sufficiently high so that with sufficiently high probability no transistor fails. Near-
Threshold Computing simply means that the supply voltages are designed to be closer to the
threshold voltage, which can potentially offer significant improvements in energy efficiency,
provided another, more energy-efficient, solution for the fault-tolerance issue can be found.
Figure 1: Semi-log plot of voltage-to-failure for an SRAM cell from [16].
One strategy to achieve fault-tolerance is to design fault-tolerant circuits, namely circuits
that correctly compute the desired output if the number of failures is not significantly higher
than the expected number of failures. The study of fault-tolerant circuits is not new. Starting
with the seminal paper by von Neumann [30], several papers [14, 15, 17, 18, 25–27] have
considered the question of how many faulty gates, each (independently) having a (small)
fixed probability of failure, are required to mimic the computation of an ideal circuit with
some desired probability of correctness. In general, as the probability of gate failure increases,
one would expect that more gates will be required to achieve a fixed probability of failure
for the circuit. As an example from [16], the circuit shown in Figure 2a is the traditional
6-transistor design for an SRAM cell, while the circuit shown in Figure 2b is a more fault-
tolerant, and thus more suited for Near-Threshold Computing, 10-transistor design for an
SRAM cell.
Figure 2: Two SRAM circuits with the same functionality. (a) Standard 6-transistor design. (b) A more fault-tolerant 10-transistor design from [11].
Our goal here is to initiate the theoretical study of the design of energy-efficient circuits.
We assume that the design of the circuit specifies both the circuit layout as well as the
supply voltages for the gates. To obtain maximum energy efficiency, the circuit design must
balance the conflicting demands of minimizing the energy used per gate and minimizing the
number of gates in the circuit: if the energy supplied to the gates is small, then functional
failures are likely, necessitating a circuit layout that is more fault-tolerant, and thus has more
gates. Thus the design should find a “sweet spot” for the supply voltages that balances the
competing demands of small circuit size and low per-gate energy.
Perhaps the most natural question that arises within this context is the following: Given
a function, what is the minimum energy required by a circuit to compute that function?
Answering this involves finding the optimal tradeoff between reduced supply voltage and
increased circuit size for that function, which is closely related to the problem of finding the
smallest circuit to compute that function. Due to this relationship, coupled with the fact
that the ability to determine the smallest circuit to compute an arbitrary function would
have vast implications throughout theoretical computer science (for example, proving that an
NP-complete problem requires circuits of super-polynomial size would show P ≠ NP), finding
the optimal energy to compute an arbitrary function seems to be untouchable at this time.
However, even conditional, approximate answers would provide useful insight, especially
for the simpler classes of functions that circuits in computers are typically composed of.
Additionally, one may be interested in optimizing the energy used by a specific circuit. In
this case, the question is: Given a circuit, what is the lowest possible supply voltage such
that the circuit still computes correctly (with sufficiently high probability)?
In order to begin to answer these questions, we formalize them as the following problems:
Definition 1. Minimum Energy Circuit Problem (MEC): Given a function (or re-
lation) F , and an error bound δ, output a circuit layout C and a setting v of the supply
voltage, such that C uses minimal energy, subject to the constraint that C computes F with
probability at least 1− δ.
Definition 2. Minimum Circuit Energy Problem (MCE): Given a circuit layout C,
and an error bound δ, output a setting v of the supply voltage, such that C uses minimal
energy, subject to the constraint that C computes correctly (what C would compute if there
were no errors) with probability at least 1− δ.
While it may not currently be practical, in principle the supply voltages need not be
homogeneous over all gates of a circuit, that is, different gates could be supplied with different
voltages. This naturally leads to the question of whether allowing heterogeneous supply
voltages might yield lower-energy circuits than is possible if the supply voltages are required
to be homogeneous. Intuitively, heterogeneous supply voltages should benefit a circuit where
certain parts of the computation are more sensitive to failure than others. This naturally
leads to the heterogeneous versions of the above two problems:
Definition 3. Heterogeneous Minimum Energy Circuit Problem (HMEC): Given
a function (or relation) F , and an error bound δ, output a circuit layout C and a setting vg
of the supply voltage for each gate g ∈ C, such that C uses minimal energy, subject to the
constraint that C computes F with probability at least 1− δ.
Definition 4. Heterogeneous Minimum Circuit Energy Problem (HMCE): Given
a circuit layout C, and an error bound δ, output a setting vg of the supply voltage for each
gate g ∈ C, such that C uses minimal energy, subject to the constraint that C computes
correctly (what C would compute if there were no errors) with probability at least 1− δ.
A variety of questions can be formulated around these problems. Perhaps most basically,
for which functions can we solve MEC and HMEC (exactly or approximately)? As discussed,
there are a number of functions for which solving this question seems to be beyond the reach
of even state-of-the-art mathematics. However, even if it is infeasible to solve MEC and
HMEC for most functions, it may be possible to say something about the minimum energy
required to compute a randomly chosen function: If a function is chosen uniformly at random,
how much energy does the solution to MEC or HMEC require (with high probability)? As
these questions are formulated for both MEC and HMEC, it is natural to consider how
much supply voltage heterogeneity affects the solutions: For which functions is the solution
to HMEC significantly less than the solution to MEC? And for a function chosen uniformly
at random, is the solution to HMEC significantly less than the solution to MEC? Switching
to the problems for determining the minimum energy required by a fixed circuit, we can ask
for which circuits can we solve MCE and HMCE (exactly or approximately)? In this case it
is clear that HMCE can have a significantly smaller solution than MCE, as a fixed circuit
may contain a large “useless” component that does not affect circuit computation, and thus
heterogeneous supply voltages may be employed to spend little or no energy on that part
of the circuit. However, it is still interesting to consider whether, for a very natural circuit
that computes some function or relation, the solution to HMCE is significantly smaller than
the solution to MCE.
1.1 RELATED WORK
As far as we are aware, this is the first work to consider energy-efficient circuit design from
a theoretical perspective; however, there has been a significant amount of related work in
the area of fault-tolerant circuit design, which we summarize here. Von Neumann [30] first
introduced the model where each gate of the circuit fails with some independent, fixed
probability ε. Von Neumann also informally argued that any function that can be computed
by a faultless circuit of size s can be computed by a faulty circuit of size O(s log s). Dobrushin
and Ortyukov [15] proved this result formally. Subsequently, Pippenger [25] improved this
result by giving explicit, rather than probabilistic, constructions of certain aspects of the
proof, resulting in an explicit construction of a fault-tolerant circuit for some fixed values of
ε. Finally, Gács [17] proved this result in full generality for arbitrary values of ε. Pippenger
[25] also proved that any function can be computed by a network of $O(2^n/n)$ faulty gates.
When combined with Shannon’s result that almost all functions require $\Omega(2^n/n)$ faultless
gates [28], this shows that almost all functions require only a constant factor increase in
circuit size when the gates are faulty.
The remaining related work is in developing lower bounds on faulty circuit size. Dobrushin
and Ortyukov [14] attempted to show that any function of sensitivity m requires at
least Ω(m log m) faulty gates to compute; however, their proof contains errors, which were
first pointed out by Pippenger et al. [26]. Gács and Gál [18] proved the lower bound correctly,
using a similar idea to [14] but proving new lemmas for the analysis. Reischuk and
Schmeltz [27] independently gave another proof of the lower bound using decision trees.
The general idea of trading accuracy of a hardware circuit and computing architecture
for energy savings dates back to at least [24]. The paper [16] gives an excellent survey on
Near-Threshold Computing. According to [16], the three main barriers to the widespread
use of Near-Threshold computing are:
1. Performance Loss: Circuits supplied a Near-Threshold voltage perform orders of mag-
nitude slower than circuits supplied the nominal voltage.
2. Increased Performance Variation: Circuits supplied the nominal voltage experience
a roughly 1.5X performance variation, while circuits supplied a Near-Threshold voltage
experience up to a 20X performance variation.
3. Increased Functional Failure: Circuits experience an increased sensitivity to process,
temperature, and voltage, resulting in an increased rate of functional failures.
While significant research has been performed to mitigate all of these barriers, the work
presented in this dissertation is meant to provide the theoretical basis for solving the third
problem, that of increased functional failure. From a system-design perspective, the problem
of increased functional failures has been most pronounced in SRAM, and thus much of
the research on this problem has been focused on making SRAM robust when supplied a
Near-Threshold voltage [16]. As previously mentioned, some research has been in designing
SRAM bitcells that are more fault tolerant, for example with 8 or 10 transistors, rather than
the standard 6 transistor design [11, 12]. Other work has been in designing SRAM cache
architectures using error-correcting codes, redundancy, and other methods to ensure cache
reliability, at the cost of increased cache size or latency [1, 3, 13, 21, 22, 31, 32].
As an example of a technology that has at least the spirit of Near-Threshold Computing,
IBM’s production POWER7 servers use a technique called Guardband to save energy by
dynamically lowering operating voltage [4].
1.2 OUR CONTRIBUTIONS
We discuss our main results in the following sections.
1.2.1 General Bounds on Circuits With Constant Error
We begin in Chapter 3 by showing general lower and upper bounds on the amount of energy
required by a circuit to compute a given relation, when the reliability parameter δ is a fixed
constant.
General lower bound on the energy to compute a function: We first show in
Section 3.1 the following lower bound on the amount of energy required to compute any
relation:
Theorem 5. Let δ < 1/4, and let C be a circuit that computes a relation h of sensitivity m
with probability at least 1 − δ. Then C requires $\Omega\left(m \log\left(\frac{m^{1-2\sqrt{\delta}}}{\delta}\right)\right)$ energy.
Gács and Gál [18], and independently Reischuk and Schmeltz [27], show that any Boolean
function f with sensitivity m (roughly the number of input bits which affect the output)
requires a circuit of size Ω(m log m) to be reliably computed when the gates of the circuit
fail independently with a fixed positive probability. We modify the techniques in [18] to
prove our lower bound on the energy required by any circuit that computes a relation with
sensitivity m. The proof consists of two main parts. The first part is to consider a failure
model that is equivalent in terms of the reliability of any part of the circuit, but where
failures occur, and energy is consumed, on the wires of the circuit as well as at the gates.
The second part considers the sensitive input bits to the circuit: If there are too few wires
emanating from these input bits, then the probability that failures will cause the circuit to
compute as if one of the sensitive bits were flipped is too large, and the output would be
incorrect with too high a probability. An additional technical hurdle to obtaining our result
is that supply voltages may be heterogeneous, so, unlike the setting of the previous work,
different gates of the same circuit may use different amounts of energy, and thus fail with
different probabilities.
General upper bound on the energy to compute a function: In Section 3.2, we
extend the classic upper bound on circuit size from the fault-tolerant circuit literature to
obtain the following upper bound on circuit energy consumption:
Theorem 6. Given a reliable circuit C of size s, and a fixed constant δ > 0, it is possible
to construct a circuit C′ with homogeneous voltage supplies that uses O(s log(s)) energy and
that computes the same function computed by C with probability at least 1− δ.
Von Neumann [30] showed that given a Boolean function f and a circuit of size s which
computes f , a circuit of size O(s log s) is sufficient for computing f correctly with high
probability when the gates of the circuit fail independently with a fixed positive probability.
Using techniques from [30] and from [17, 25], we show that a relation h that is computable
by a circuit of size s can, with probability at least 1− δ, be computed by a circuit of faulty
gates using O(s log(s)) energy. In our construction, the supply voltages are homogeneous.
The construction works by introducing a Θ(log s) factor of redundancy in the circuit, and
each gate of the circuit is replaced by a gadget. The input to each gadget is Θ(log s) wires
per original gate input, most of which, with high probability, carry the same input bit as if
the computation were being performed in the original, faultless circuit. The gadget contains
Θ(log s) copies of the original gate, as well as a component of size Θ(log s) that ensures that
the fraction of incorrect wires exiting the gadget is sufficiently low with high probability.
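To see why a Θ(log s) redundancy factor is the natural order of magnitude, the following sketch computes how many independent noisy copies are needed before a majority vote is wrong with probability at most δ/s. This is a deliberate simplification and not the gadget construction described above: it assumes a fault-free majority vote, whereas the actual construction builds its restoring component from faulty gates. The function names are hypothetical.

```python
import math

def majority_error(k, eps):
    """Probability that more than half of k independent copies fail,
    when each copy fails independently with probability eps (k odd)."""
    return sum(math.comb(k, i) * eps**i * (1 - eps)**(k - i)
               for i in range(k // 2 + 1, k + 1))

def copies_needed(s, delta, eps):
    """Smallest odd k such that majority_error(k, eps) <= delta / s.

    Assumes eps < 1/2, so the majority error decays exponentially in k
    and the loop terminates."""
    k = 1
    while majority_error(k, eps) > delta / s:
        k += 2
    return k

if __name__ == "__main__":
    # Growing s by a constant factor adds roughly a constant number of copies.
    for s in (10**2, 10**4, 10**6):
        print(s, copies_needed(s, delta=0.1, eps=0.1))
```

Because the majority error shrinks exponentially in k, driving it below δ/s takes a number of copies that grows only logarithmically in s, which matches the Θ(log s) redundancy used in the construction.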
1.2.2 Introduction to Heterogeneity
In Chapter 4, we consider simple cases in which allowing heterogeneous supply voltages both
does and does not yield asymptotic decreases in energy.
Settings where heterogeneous supply voltages are not beneficial: In Section 4.1
we observe that, when δ is restricted to a fixed constant, there are relations, namely the parity
function, for which allowing heterogeneous supply voltages will not allow one to achieve a
circuit design that uses asymptotically less energy than is achievable by a circuit design with
homogeneous supply voltages:
Theorem 7. Let δ < 1/4 be a fixed constant. The energy used by any circuit that computes
the parity function with probability 1 − δ is Ω(n log(n)), and this is achievable by a circuit
with homogeneous supply voltages.
Intuitively, the parity function has such high sensitivity that every gate in any reason-
able circuit will be of equal importance, so nothing can be gained by heterogeneous supply
voltages. This immediately implies that in the setting where δ is a fixed constant, hetero-
geneity doesn’t significantly benefit some functions. That is, for these functions the optimal
solution to MEC and the optimal solution to HMEC use asymptotically the same energy
(up to constants), which also implies that there are circuits for these functions such that the
optimal solution to MCE and the optimal solution to HMCE use asymptotically the same
energy. Formally, the proof is a corollary of our lower and upper bounds from Sections 3.1
and 3.2.
A setting where heterogeneous supply voltages are beneficial: In contrast, in
Section 4.2 we give a natural example where allowing heterogeneous supply voltages allows
one to use asymptotically less energy than would be achievable using homogeneous supply
voltages. In particular, we consider a natural super-majority relation called LSR, which
outputs the majority of the input bits if this majority is sufficiently large, and the most
natural circuit that computes this relation, a balanced tree of majority gates, and obtain the
following theorem:
Theorem 8. Let E1(δ) be the optimal energy consumption of the majority tree on n leaves
with homogeneous supply voltages that computes LSR with probability 1 − δ, and E2(δ) be
the optimal energy consumption of the same if supply voltages may be heterogeneous. Then,
for $\delta' = \frac{2}{\log_3 n}$, it holds that $\frac{E_1(\delta')}{E_2(\delta')} = \omega(1)$.
We show that for homogeneous supply voltages the energy required by this circuit to
compute LSR is $\Omega(n \log^2(\delta))$, where n is the number of input bits. We then show that if
supply voltages can be heterogeneous, this circuit can compute LSR using energy
$O(n + 3^{1/\delta} \log^2(\delta/10))$, which is asymptotically less than $n \log^2(\delta)$ if δ → 0 as n → ∞. This
implies that there are quite simple relations and circuits for which the optimal solution to
HMCE uses asymptotically less energy than the optimal solution to MCE. The heterogeneous
voltage setting is quite intuitive: Since we only care about a super-majority, the gates far
from the output gate can sustain a small but constant fraction of failures without affecting
the output, while the gates closer to the output, of which there are a comparatively small
number, affect the output much more dramatically, so we use a large amount of energy to
ensure they do not fail. For the proof, we lower bound the energy used by any homogeneous
setting by observing that in any homogeneous setting, gates, and in particular the output
gate, cannot fail with probability greater than δ. We give an explicit setting of voltages
for the heterogeneous upper bound, and use recurrence relations to bound the probabilities
of “bad” failure profiles, to show that, with that heterogeneous setting, the circuit outputs
correctly with probability at least 1− δ.
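As a worked check of how the two bounds combine, the following derivation substitutes $\delta' = 2/\log_3 n$ into the homogeneous lower bound $\Omega(n \log^2(\delta))$ and the heterogeneous upper bound $O(n + 3^{1/\delta} \log^2(\delta/10))$ stated above. It is a sketch of the arithmetic only, not a substitute for the proofs in Section 4.2.

```latex
\begin{align*}
3^{1/\delta'} &= 3^{(\log_3 n)/2} = \sqrt{n}, \\
E_2(\delta') &= O\!\left(n + \sqrt{n}\,\log^2(\delta'/10)\right)
              = O\!\left(n + \sqrt{n}\,(\log\log n)^2\right) = O(n), \\
E_1(\delta') &= \Omega\!\left(n \log^2(\delta')\right)
              = \Omega\!\left(n\,(\log\log n)^2\right), \\
\frac{E_1(\delta')}{E_2(\delta')} &= \Omega\!\left((\log\log n)^2\right) = \omega(1).
\end{align*}
```

The choice of δ′ makes the $3^{1/\delta}$ term sublinear in n, so the heterogeneous circuit uses only linear energy while the homogeneous one pays an extra $(\log\log n)^2$ factor.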
1.2.3 Hardness Results
Chapter 5 discusses complexity theoretic barriers to energy-efficient circuit design. Ideally,
a circuit designer would like to solve MEC, but, for many functions, the ability to even
approximately bound optimal circuit sizes is essentially at least as hard as the P vs. NP
question,¹ and is untouchable with current mathematical knowledge. In order to avoid
this mathematical barrier, we instead consider MCE. We show that this problem is NP-
hard. Thus if P ≠ NP then there is no efficient method for computing the optimal supply
voltage setting. The standard fallback approach for NP-hard optimization problems is to
seek algorithms that are guaranteed to produce solutions with optimal/good relative error
compared to the optimal solution. In our case, an algorithm A has approximation ratio c (or
equivalently worst-case relative error c − 1) if for all inputs, the energy used by the circuit
with the supply voltage setting given by A is at most c times the optimal minimum energy.
¹ If one could prove that your favorite NP-complete problem required super-polynomially many gates to
compute, this would prove P ≠ NP.
The approximation ratio of the traditional approach: We show in Section 5.1
that the approximation ratio of the traditional algorithm, which sets the supply voltages
such that the probability that even a single gate fails is at most δ, is $O(\log^2 s)$, where s is
the number of gates in the circuit:
Theorem 9. The traditional approach is an $O(\log^2 s)$-approximation for MCE.
Similar to the lower bound on the energy used by homogeneous voltage settings in Section 4.2,
we can upper bound the failure probability of any homogeneous setting by δ. The
result follows by noting that setting the failure probability to δ/s uses a $\Theta(\log^2 s)$ factor more energy,
and that with this failure probability, the probability that even one gate fails is at most δ.
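The argument can be written out as a short calculation. This is a sketch under two assumptions that are not stated in this paragraph: that the energy per gate at failure probability ε is $\Theta(\log^2(1/\epsilon))$, as in the SRAM example of the introduction, and that δ ≤ 1/2 so that $\log(1/\delta)$ is bounded below by a positive constant. The symbols $E_{\text{trad}}$ and $E_{\text{opt}}$ are introduced here only for illustration.

```latex
\begin{align*}
\Pr[\text{some gate fails}] &\le s \cdot \frac{\delta}{s} = \delta
  && \text{(union bound over the } s \text{ gates)}, \\
E_{\text{trad}} &= \Theta\!\left(s \log^2 \frac{s}{\delta}\right),
  \qquad
E_{\text{opt}} = \Omega\!\left(s \log^2 \frac{1}{\delta}\right), \\
\frac{E_{\text{trad}}}{E_{\text{opt}}}
  &= O\!\left(\left(\frac{\log s + \log(1/\delta)}{\log(1/\delta)}\right)^{2}\right)
   = O\!\left(\left(1 + \frac{\log s}{\log 2}\right)^{2}\right)
   = O\!\left(\log^2 s\right).
\end{align*}
```

The lower bound on $E_{\text{opt}}$ uses the observation that no homogeneous setting can let gates fail with probability greater than δ.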
Hardness of improving upon the traditional approach: In contrast, we show in
Section 5.2 that it is NP-hard to approximate MCE to within a factor polynomially less than
$O(\log^2 s)$:
Theorem 10. It is NP-hard to $O(\log^{2-\gamma} s)$-approximate MCE for any γ > 0.
The circuit used in the reduction is the natural circuit for a 3SAT instance. The input
bits represent the boolean settings of the variables, which are negated as appropriate and
fed into OR gates, representing the clauses. The outputs of these OR gates are then fed into
a tree of AND gates, and thus if the variable assignment satisfies the 3SAT formula, all of
the inputs of the AND tree are 1’s, and the circuit outputs 1; otherwise, the circuit outputs
0. Intuitively, the AND tree is very prone to failures, and is very likely to output 0 if even a
few gates of the AND tree fail, regardless of the input. Because of this, in the case that the
3SAT formula is satisfiable, the failure rate must be low enough such that every gate of the
AND tree works correctly with sufficiently high probability, and so the energy must be high.
On the other hand, if the 3SAT formula is not satisfiable, the circuit should output 0 on
every input, and thus a small number of failures in the AND tree does not significantly decrease
the probability that it outputs 0. We also observe here that, using essentially the same proof
as MCE, one can show HMCE is at least as hard in terms of approximation as MCE, as the
ability to supply the gates of the AND tree of the 3SAT circuit different voltages yields no
asymptotic benefit.
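The fault-free behaviour of the natural circuit described above can be sketched in a few lines of Python. This is an illustration only: the clause encoding (signed, 1-indexed integers) and the function names are assumptions made here, and gate failures are deliberately left out.

```python
from typing import List, Sequence

# A clause is a sequence of nonzero ints: +i is variable i, -i its negation (1-indexed).
Clause = Sequence[int]

def and_tree(bits: List[int]) -> int:
    """Combine wires pairwise with fan-in-2 AND gates until one wire remains."""
    while len(bits) > 1:
        nxt = [bits[i] & bits[i + 1] for i in range(0, len(bits) - 1, 2)]
        if len(bits) % 2:            # an unpaired wire passes through to the next level
            nxt.append(bits[-1])
        bits = nxt
    return bits[0]

def evaluate_3sat_circuit(clauses: Sequence[Clause], assignment: Sequence[int]) -> int:
    """Negate inputs as needed, OR each clause, then AND all clause outputs."""
    clause_outputs = []
    for clause in clauses:
        literals = [assignment[abs(l) - 1] ^ (l < 0) for l in clause]
        clause_outputs.append(int(any(literals)))
    return and_tree(clause_outputs)
```

For example, with clauses `[(1, 2, -3), (-1, 2, 3)]` the assignment `[1, 1, 0]` satisfies both clauses, so every input of the AND tree is 1.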
Generalization of hardness to other failure models: One might be concerned that
this hardness is the result of how we specifically model functional failures in circuits, rather
than due to the complexity of circuits. In order to provide evidence that the hardness does
indeed come from the complexity of circuits, we prove that the same result holds in another
model, the 0-default model, that is somewhat different and, perhaps, closer to how failures
occur in real circuits: In the 0-default model failures cause wires to carry a “default” 0 bit.
In fact, we even show in Section 5.3 that it is NP-hard in this model to even determine
whether a circuit computes correctly on a fixed input.
Theorem 11. In the 0-default model, it is NP-hard to $O(\log^{2-\gamma} s)$-approximate MCE for
any γ > 0. Additionally, it is NP-hard to determine if a circuit computes correctly on a
fixed input.
Bypassing hardness via specific families of circuits: Putting these results together,
we see that there is a complexity theoretic obstacle to achieving more energy efficient cir-
cuits by using lower supply voltages than one obtains by the traditional high supply voltage
approach. More precisely, if one could find a computationally efficient algorithm for setting
supply voltages that has better worst-case relative error than the traditional approach, then
P=NP. So, assuming P ≠ NP, any proposed algorithm would either not have worst-case relative
error better than the traditional approach, or would take super-polynomial time on
some circuits. But of course the standard caveat applies here: as NP-hardness is a worst-
case concept, this doesn’t mean that one cannot beat the energy used by the traditional
approach for particular circuits of interest. As a small step in the direction of showing that
for circuits of interest MCE may be approachable, we show in Section 5.4 that there is an
efficient algorithm to verify whether a particular setting of the supply voltage achieves the
desired error bound if the circuit is a tree.
Lemma 12. Let C be a circuit with a tree as the underlying graph, and suppose each gate g
of the circuit fails with probability $\epsilon_g$. Then there is a polynomial time algorithm to determine
if C computes correctly with probability at least 1 − δ.
This result hints at the hardness of MCE coming from “cycles” in the circuit. In Sec-
tion 5.5 we make the curious observation that there are circuits where the reliability of the
output is not monotone in the reliability of the gates. Understanding this non-monotonicity
seems to be the key to being able to solve MCE for circuits that are trees.
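For a single fixed input, the reason trees are tractable can be sketched as follows: the sub-circuits feeding a gate share no gates, so their outputs are independent, and the probability that each wire carries a 1 can be propagated from the leaves to the root. The Python sketch below assumes that a failure flips a gate's output; the `Gate` class and `prob_one` function are hypothetical names, and this illustrates the idea rather than reproducing the algorithm of Section 5.4.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Tuple, Union

@dataclass
class Gate:
    fn: Callable[..., int]          # fault-free Boolean function of the gate
    eps: float                      # probability that the gate flips its output
    children: Tuple["Node", ...]    # sub-circuits feeding the gate (disjoint in a tree)

Node = Union[int, Gate]             # an int leaf is an input bit fixed to 0 or 1

def prob_one(node: Node) -> float:
    """Probability that `node` outputs 1 for the fixed input stored in the leaves."""
    if isinstance(node, int):
        return float(node)
    child_probs = [prob_one(c) for c in node.children]
    p_ideal_one = 0.0
    # Children of a tree gate are independent, so enumerate their joint values
    # (a constant amount of work per gate when fan-in is bounded).
    for values in product((0, 1), repeat=len(child_probs)):
        weight = 1.0
        for v, p in zip(values, child_probs):
            weight *= p if v else 1.0 - p
        p_ideal_one += weight * node.fn(*values)
    # A failure flips whatever the gate would otherwise have produced.
    return (1 - node.eps) * p_ideal_one + node.eps * (1 - p_ideal_one)
```

Comparing the returned probability against 1 − δ (or δ, when the correct output is 0) answers the question for that one input; handling all inputs at once is the part that needs the argument of Section 5.4.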
1.2.4 Almost All Functions Require Exponential Energy
Chapter 6 discusses the minimum amount of energy required to compute almost all, i.e., a
1 − o(1) fraction of, functions. In principle, for every function f , MEC has some solution.
Though for many functions finding the solution to MEC may be untouchable due to the
previously discussed complexity-theoretic barriers, one might hope to determine the amount
of energy required by an average Boolean function. Pippenger showed that all Boolean
functions can be computed by circuit layouts with $O(2^n/n)$ noisy gates [25]. Using that
construction, it immediately follows that all Boolean functions can be computed by some
circuit that uses $O(2^n/n)$ energy assuming δ is a fixed constant. We show in Chapter 6 that
this result is tight for almost all functions, i.e.:
Theorem 13. For any 0 < δ < 1/2, almost all Boolean functions on n variables require
circuits that use $\Omega(2^n/n)$ energy.
To develop intuition about this result, it is necessary to consider more precisely the
relationship between the voltage supplied to a gate and the probability that it actually fails.
Let $\epsilon$ be the function mapping voltage to error probability. In the exact failure model, when
a gate is supplied a voltage v, it fails with probability exactly $\epsilon(v)$. On the other hand,
in the bounded failure model, the gate fails with probability at most $\epsilon(v)$. The bounded
failure model is arguably more realistic, in the sense that the circuit designer may not know
exactly the probability that a gate will fail, and the failure probability may vary with time
or environmental conditions. On the other hand, the exact failure model allows the circuit
designer to use component failures as a source of randomness, and thus perhaps perform
computations more efficiently. All of our results discussed thus far have held for both exact
and bounded failure models.
Circuits that can compute many functions: The main component of the proof
that almost all functions require exponential energy is to show that almost all functions
require circuit layouts with exponentially many gates. In the bounded failure model, this
directly follows from Shannon’s result that almost all functions require circuits of exponential
size [28], since in the bounded failure model a circuit must compute correctly even if no gates
fail. The exact failure model is less straightforward, as, by modifying the voltage supplied to
every gate, a single circuit layout may be able to compute a number of different functions.
In fact, we have the following theorem for circuits with homogeneous supply voltages:
Theorem 14. For any 0 < δ < 1/2 and n ∈ N, there exists a circuit with n inputs of size
O(n) that computes $\Omega\left(\frac{\log n}{\log\left(\frac{1}{\delta}\log n\right)}\right)$ different functions with probability at least 1 − δ.
The circuit is composed of trees of AND gates of varying size, and one can see how this
circuit computes multiple functions by observing that, as the failure rate increases, a tree of
AND gates will switch from computing the AND function on its input bits, to computing
the 0 function.
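The switching behaviour of a single AND tree can be illustrated with a short recurrence. The Python sketch below assumes a complete binary tree of fan-in-2 AND gates, an all-ones input, and gates that flip their output independently with probability `eps`; the function name is hypothetical and the sketch does not reproduce the circuit of Theorem 14.

```python
def and_tree_prob_one(depth: int, eps: float) -> float:
    """Pr[output = 1] for a complete binary AND tree of the given depth on the
    all-ones input, when every gate flips its output independently w.p. eps."""
    p = 1.0                       # each input wire carries a 1 with certainty
    for _ in range(depth):
        both_one = p * p          # the two subtrees are disjoint, hence independent
        p = (1 - eps) * both_one + eps * (1 - both_one)
    return p
```

At depth 5, a failure rate of 0.001 leaves the output at 1 with probability about 0.97, whereas a failure rate of 0.2 drives that probability down to roughly 0.25, so the same layout behaves very differently on the all-ones input as the failure rate grows.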
If supply voltages are allowed to be heterogeneous, we obtain the following theorem:
Theorem 15. For any 0 < δ < 1/2 and n ∈ N, there exists a circuit C with n inputs of size
$O(n^2)$ that computes $\Omega(3^n)$ different functions with probability at least 1 − δ.
The circuit in this proof is a modification of the natural circuit for a 3CNF formula:
Instead of connecting the literals in each clause directly to an input or its negation, for each
literal, the output of a series of AND trees (each of which may output 1 or 0, depending
on the voltage settings) specifies which input variable should be represented by that literal,
and whether or not it should be negated. Thus, by modifying supply voltages, the circuit is
able to compute any function on n inputs that can be represented by a 3CNF formula with
a fixed number of fixed-sized clauses.
Upper bounds on the number of functions a circuit can compute: Despite the
existence of circuits that can compute many functions, we are able to provide, both for
homogeneous and heterogeneous supply voltages, a sufficiently small upper bound on the
number of functions that a single circuit can compute:
Lemma 16. A circuit C on n inputs with s gates can compute at most $s2^n + 1$ functions if
supply voltages must be homogeneous, and at most $(8e2^n)^s$ functions if supply voltages may
be heterogeneous.
Though the proof for heterogeneous supply voltages is somewhat more complicated than
the one for homogeneous supply voltages, both follow primarily from two observations: (1)
The probability a fixed circuit outputs a 1 on a fixed input can be written as a polynomial
in the failure rate of each gate of the circuit, and (2) A circuit with a fixed setting of
supply voltages only computes a function when, on all inputs, it outputs either 1 or 0 with
probability at least 1−δ. Combining these two, we observe that a circuit with a set of supply
voltages only computes a function when, for each polynomial associated with the probability
of outputting a 1 on an input, that polynomial is above 1− δ or below δ. With this in hand,
we can apply results from calculus and geometry to obtain the upper bound on the number
of functions a single circuit can compute.
Equivalence of exponential energy and exponentially many noisy gates: These
results leave open the possibility that some Boolean functions that do not require circuits
with exponentially many gates still require exponential energy. For example it could be the
case that for some function the energy optimal circuit has sub-exponentially many gates,
with many of them requiring exponential energy. We show that this is not the case:
Lemma 17. A Boolean function f requires circuits that use exponential energy if and only
if it requires circuits that contain exponentially many gates.
The proof follows by noting that, for any circuit, setting supply voltages such that the
energy-per-gate is polynomial in the circuit size is sufficient for no gate in a circuit to fail,
and thus using a higher energy-per-gate can provide no additional benefit.
1.2.5 The Power of Heterogeneity to Reduce Energy
In Chapter 7, we explore the power of heterogeneous supply voltages to asymptotically save
energy over any circuit using homogeneous supply voltages when computing certain functions
or relations. In this chapter, we consider the case when the error parameter δ vanishes as
the number of inputs to the function increases. Previously, in Chapter 3, we focused
on the case when δ is a fixed constant. This intuitively makes sense, as a circuit designer
may calculate the requirements for δ, and design the circuit based on that. However, it is
also reasonable to consider the case when δ must vanish as the number of inputs to the
circuit increases: As the circuit size increases, circuit failures may become more expensive
to recover from. Additionally, a single circuit may be a component of a larger system, and
as this system grows, it will not function reliably enough if the failure probability of the
components it is made up of does not decrease.
Many functions benefit from heterogeneous supply voltages: Perhaps surpris-
ingly given the general upper and lower bounds in Chapter 3 when δ is a fixed constant,
which are tight for many natural functions (e.g., parity), we show that, if the circuit error
probability must vanish as the number of inputs increases, there is a natural class of func-
tions (that includes parity) which can be computed with asymptotically less energy if supply
voltages are allowed to be heterogeneous.
Theorem 18. For any function f with minimum circuit size s, for any constant c > 0, if
$\delta = 1/s^c$, then every circuit with homogeneous supply voltages computing f uses $\Omega(s \log^2 s)$
energy, and there exists a circuit with heterogeneous supply voltages using O(s log s) energy.
If we replace s by Θ(n), where n is the number of inputs to the function, then the above
theorem applies to all functions with circuits of size linear in the number of inputs. Thus,
for such functions, the solution to HMEC is asymptotically less than the solution to MEC
by a factor of Ω(log n) when δ is polynomial in 1/n. In order to show this, we first provide
an $\Omega(s \log^2 s)$ lower bound on the energy used by any circuit using homogeneous supply
voltages, which can be obtained by noting that the voltage setting must be such that the
output gate does not fail with probability more than δ. On the other hand, we provide a
general circuit construction and heterogeneous voltage setting for functions in this class that
uses O(s log s) energy. Intuitively, in a manner similar to the upper bound in Chapter 3, we
replace each gate of a faultless circuit with a fault-tolerant gadget, and supply low voltages
to these gates, with the result that the failure rate for each gate of each gadget is a constant.
We are left at the end with a (relatively small) set of wires, such that, with high probability,
the majority of these wires carry the correct output bit. We then use a majority circuit, with
voltages set sufficiently high so that with high probability no gate fails, and thus obtain the
correct output bit with high enough probability.
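The homogeneous lower bound can be traced in one line. The derivation below is a sketch that assumes the failure-to-energy function $E(\epsilon) = \Theta(\log^2(1/\epsilon))$ used throughout most of the dissertation, writes $\epsilon$ for the common failure probability of all gates, and uses only the observation that the output gate may fail with probability at most δ, together with the fact that any circuit for f has at least s gates.

```latex
\[
\epsilon \le \delta = s^{-c}
\;\Longrightarrow\;
E(\epsilon) \ge E(\delta) = \Omega\!\left(\log^2 \frac{1}{\delta}\right) = \Omega\!\left(c^2 \log^2 s\right)
\;\Longrightarrow\;
\sum_{g \in C} E(\epsilon) \;\ge\; s \cdot \Omega\!\left(\log^2 s\right) = \Omega\!\left(s \log^2 s\right).
\]
```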
Limit for many functions on the energy savings via heterogeneous supply
voltages: We then show that for functions that have circuits of linear size and obtain an
Ω(log n) energy savings via heterogeneous supply voltages, this energy savings is tight, i.e.,
the solution to HMEC is only a factor of Θ(log n) less than the solution to MEC, as long as
most input bits are non-degenerate. An input bit to a function is non-degenerate if, roughly,
there is some input to the function where the value of that bit matters.
Theorem 19. Let f be a function with b non-degenerate input bits. Then, for any 0 < δ <
1/2, any circuit C that computes f with error at most δ requires Ω(b log 1/δ) energy.
We show that for any function, each input bit that is non-degenerate essentially must
use Θ(log n) energy by itself, or else the output cannot possibly be correct with high enough
probability. If the number of non-degenerate bits of a function is a constant fraction of n,
then even circuits with heterogeneous supply voltages computing that function must use
Ω(n log n) energy when δ = 1/n.
Relations gaining greater energy savings via heterogeneous supply voltages:
In principle, a circuit may compute a relation, rather than a function, as, for example, the
proper functioning of the system may guarantee that the circuit does not receive certain
inputs. This raises the question of whether heterogeneous supply voltages can provide a
greater energy savings over homogeneous supply voltages when computing relations rather
than functions, i.e., whether a circuit with heterogeneous supply voltages that computes a re-
lation can save ω(log n) energy over any circuit with homogeneous supply voltages computing
that relation. We answer this question in the positive.
Theorem 20. Suppose $\delta = 1/n^c$ for some constant c > 0. Then there is a relation that
can be computed by a heterogeneous circuit using O(n) energy, but for homogeneous circuits
requires $\Omega(n \log^2 n)$ energy.
The relation that obtains the $\Theta(\log^2 n)$ energy savings by using heterogeneous supply
voltages is a natural supermajority relation. This $\Theta(\log^2 n)$ energy savings is the best savings
possible for computing any relation that requires a faultless circuit of size O(n) that must
access Θ(n) of the input bits. Our lower bound on the energy used by any circuit with a
homogeneous voltage setting is similar to our previous such lower bounds. The upper bound
using a heterogeneous voltage setting modifies a standard majority circuit, and benefits
from the fact that, in such a circuit, failures occurring closer to the inputs have only a small
effect on the output of the circuit. As we traverse down the circuit, failures become more
problematic, so we increase the redundancy linearly in order to make failures more rare, but
since the number of gates decreases exponentially, adding this redundancy only increases the
circuit size by a constant factor.
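The size accounting behind this last remark can be sketched as follows. The calculation assumes, purely for illustration, that level i of the circuit (counted from the inputs) has $n/2^i$ gates and that each of them is replicated $c \cdot i$ times for some constant c; the actual construction is given in Chapter 7 and may differ in its details.

```latex
\[
\sum_{i=1}^{\log_2 n} c\, i \cdot \frac{n}{2^i}
\;\le\; c\, n \sum_{i=1}^{\infty} \frac{i}{2^i}
\;=\; 2cn
\;=\; O(n).
\]
```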
2.0 MODEL, DEFINITIONS, AND NOTATION
We now make formal definitions. A Boolean relation h is a map from $\{0,1\}^n$ to {0, 1}, where
each input is mapped to 0, 1, or both 0 and 1. If $x \in \{0,1\}^n$ is mapped to both 0 and 1,
this can be thought of as “don’t care” (for example because the input x should not occur
in a correctly functioning system). A Boolean function f is a Boolean relation where each
input is uniquely mapped to either 0 or 1. For any input $x \in \{0,1\}^n$, denote by $x^\ell$ the input
that has the same bits as x, except for the $\ell$-th bit, which is flipped. A Boolean relation h
is sensitive on the $\ell$-th bit of x if neither h(x) nor $h(x^\ell)$ is mapped to both 0 and 1, and
$h(x) \ne h(x^\ell)$. The sensitivity of h on x is the number of bits of x that h is sensitive on.
The sensitivity of h is the maximum over all x of the sensitivity of h on x.
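These definitions translate directly into a brute-force computation. In the Python sketch below, a relation is modelled as a function returning 0, 1, or `None`, where `None` is an encoding chosen here for "mapped to both 0 and 1"; the function names are hypothetical, and the running time is exponential in n, so the sketch is only meant for small examples.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Bits = Tuple[int, ...]
Relation = Callable[[Bits], Optional[int]]   # None encodes "both 0 and 1" (don't care)

def sensitivity_on(h: Relation, x: Bits) -> int:
    """Number of positions l such that h is sensitive on the l-th bit of x."""
    hx = h(x)
    if hx is None:
        return 0
    count = 0
    for l in range(len(x)):
        flipped = x[:l] + (1 - x[l],) + x[l + 1:]
        hf = h(flipped)
        if hf is not None and hf != hx:
            count += 1
    return count

def sensitivity(h: Relation, n: int) -> int:
    """Maximum over all n-bit inputs x of the sensitivity of h on x."""
    return max(sensitivity_on(h, x) for x in product((0, 1), repeat=n))
```

For instance, parity on n bits has sensitivity n (every bit matters on every input), and so does the n-bit AND, which is sensitive on all bits of the all-ones input.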
A gate is a function $g : \{0,1\}^{n_g} \to \{0,1\}$, where $n_g$ is the number of inputs (i.e., the
fan-in) of the gate. We assume that the maximum fan-in is at most a constant. A Boolean
circuit C with n inputs is a directed acyclic graph in which there are n nodes with no
incoming edges that each output one of the input bits, and every other node is a gate. The
size of a circuit, denoted by s, is the number of gates it contains. For any $I \in \{0,1\}^n$, we
denote by C(I) the output of the Boolean function computed by Boolean circuit layout C.
In this paper we consider circuits $(C, \bar{v})$ that consist of both a traditional circuit layout C
as well as a vector of supply voltages $\bar{v}$, one for each gate of C. Every gate g is supplied with
a voltage $v_g$. We say that the supply voltages (and, as shorthand, circuit) are homogeneous
when every gate of the circuit is supplied with the same voltage, and heterogeneous otherwise.
We say that a gate fails when it produces an incorrect output, that is, when given an input
x it produces an output other than g(x).
The benefit of allowing the circuit designer to control the voltage depends both on the
rate at which failures decrease as voltages increase and on whether or not failure rates are
known in advance. If the circuit designer knows exactly the probability that a component
will fail when supplied a specific voltage, then this may be used as a source of randomness,
which could in theory allow for more efficient computation. Because of this, we consider two
different failure models. In the exact failure model, each non-input gate g fails independently
with probability exactly $\epsilon(v_g)$. In contrast, in the bounded failure model, every non-input gate
g fails with probability at most $\epsilon(v_g)$ (we specify the failure model only for results that do
not hold for both). While the bounded failure model perhaps more closely models reality, in
the sense that the actual error rate of a circuit component may be unknown or may vary with
time or environment, the exact failure model gives the circuit designer more power. In both
models we assume $\epsilon : \mathbb{R}^+ \to (0, 1/2)$ is a decreasing function. The voltage supplied to a gate
determines both its energy usage and its failure probability, thus we define $\epsilon_g := \epsilon(v_g)$ and
drop all future formal reference to supply voltages. Finally we assume there is a decreasing,
nonnegative failure-to-energy function $E(\epsilon)$ that maps the failure probability $\epsilon$ to the energy
used by a gate. The energy required by a circuit C is simply the aggregate energy used
by the gates, $\sum_{g \in C} E(\epsilon_g)$ in our notation. Throughout the majority of this dissertation we
assume $E(\epsilon) = \Theta(\log^2(1/\epsilon))$. Throughout, in the appropriate locations, we discuss how to
generalize our results to other failure-to-energy functions.
A gate that never fails is said to be reliable or faultless. Given a value δ ∈ (0, 1/2)
(δ may not be constant), a circuit $(C, \bar{\epsilon})$ that computes a Boolean relation h is said to be
(1 − δ)-reliable if for every input I on which h(I) is not both 0 and 1, C(I) equals h(I) with
probability at least 1 − δ. We say that C can compute $\ell$ different functions (1 − δ)-reliably if
there exist $\bar{\epsilon}_1, \bar{\epsilon}_2, \ldots, \bar{\epsilon}_\ell \in (0, 1/2)^{|C|}$ and different functions $f_1, f_2, \ldots, f_\ell$ such that $(C, \bar{\epsilon}_i)$ is
(1 − δ)-reliable for function $f_i$. We say that a circuit is reliable or faultless if it is 1-reliable
(for example, because all its gates are reliable). We say that the circuit is $(\epsilon, \delta)$-reliable if it
is (1 − δ)-reliable when gates fail with probability exactly $\epsilon$.
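For very small circuits, (1 − δ)-reliability in the exact failure model and the total energy can be computed by brute force. The Python sketch below is an illustration under stated assumptions: a circuit is a topologically ordered list of `(function, source wire indices)` pairs, wires 0..n−1 are the inputs, the last gate is the output, a failure flips a gate's output, a relation returns `None` for "both 0 and 1", and the energy function is taken to be log²(1/ε) with constants dropped. All names are hypothetical, and the enumeration is exponential in both the number of gates and the number of inputs.

```python
import math
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

GateSpec = Tuple[Callable[..., int], Tuple[int, ...]]   # (gate function, feeding wire indices)

def run(gates: Sequence[GateSpec], inputs: Sequence[int], failed: Sequence[int]) -> int:
    """Evaluate the circuit when exactly the gates flagged in `failed` flip their output."""
    wires = list(inputs)
    for (fn, srcs), fails in zip(gates, failed):
        wires.append(fn(*(wires[i] for i in srcs)) ^ fails)
    return wires[-1]

def prob_output(gates, eps: Sequence[float], inputs, target: int) -> float:
    """Exact Pr[C(inputs) = target] with independent gate failure probabilities eps."""
    total = 0.0
    for failed in product((0, 1), repeat=len(gates)):
        weight = 1.0
        for f, e in zip(failed, eps):
            weight *= e if f else 1.0 - e
        if run(gates, inputs, failed) == target:
            total += weight
    return total

def is_reliable(gates, eps, h: Callable[[tuple], Optional[int]], n: int, delta: float) -> bool:
    """Check (1 - delta)-reliability for relation h over all n-bit inputs."""
    for x in product((0, 1), repeat=n):
        want = h(x)
        if want is not None and prob_output(gates, eps, x, want) < 1 - delta:
            return False
    return True

def energy(eps: Sequence[float]) -> float:
    """Total energy, assuming E(eps) = log^2(1/eps) per gate."""
    return sum(math.log2(1.0 / e) ** 2 for e in eps)
```

A single AND gate with failure probability 0.1 is, for example, 0.85-reliable for the AND function but not 0.95-reliable.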
3.0 GENERAL ENERGY UPPER AND LOWER BOUNDS
In this chapter, we prove a general lower bound, in terms of sensitivity, and a general upper
bound, in terms of circuit size, on the amount of energy required to compute a function.
3.1 A GENERAL ENERGY LOWER BOUND
Our main goal in this section is to prove Theorem 21, which roughly states that Ω(m log m)
energy is necessary to compute a relation with sensitivity m.
Theorem 21. Let δ < 1/4, and let C be a circuit that (1 − δ)-reliably computes a relation
h of sensitivity m. If each gate g of C fails independently with probability $\epsilon_g$, and incurs
an energy consumption of $E(\epsilon_g)$, with E being a proper failure-to-energy function, then C
requires
\[
\Omega\left( m \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right) \right)
\]
energy in order to (1 − δ)-reliably compute h.
The outline of this section is as follows. First, we define proper failure-to-energy functions
(Definition 22), and discuss why proper functions are natural. Then, similarly to [18], we
show how to translate our problem to an equivalent problem where the failures occur not
only on gates, but on wires as well. This is formalized in Statement 23, which is implied by
the proof of Lemma 3.1 in [14], and is also used in [18]. Lemma 26 then gives a lower bound
on the energy necessary for (1− δ)-reliable circuits within this new model with wire failures.
The proof is based on the proof of Theorem 3.1 in [18], and uses a series of inequalities
that relate the probability of an input being incorrectly transmitted to the probability of the
circuit being incorrect. Using this, we can write the problem as a single-variable optimization
problem and use standard techniques to give the desired lower bound. Finally, to prove
Theorem 21 we show that given a (1 − δ)-reliable circuit C in our original model without
wire failures, we can create a (1− δ)-reliable circuit C ′ in the new model with wire failures,
where the energy consumptions of C and C ′ differ only by a constant.
Definition 22. A failure-to-energy function E is called proper when it satisfies the four
following properties:
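1. E is nonincreasing,
2. $\lim_{\epsilon \to 0^+} E(\epsilon)/\log(1/\epsilon) > 0$,
3. $\lim_{\epsilon \to 1/2^-} E(\epsilon) > 0$,
4. $E(\epsilon_1) + E(\epsilon_2) \ge 2E(\sqrt{\epsilon_1 \epsilon_2})$ for all $\epsilon_1, \epsilon_2 \in (0, 1/2)$.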
The first and third restrictions are natural, since they just require that the energy used
decreases, but never becomes zero, as the probability of failure of a gate increases. The
second property states that the energy must increase “quickly enough” as the probability of
a gate’s failure tends to 0, which is necessary in order to have any energy saving over gates
that never (or almost never) fail. The last property provides a convexity constraint on the
function E. We point out that failure-to-energy functions typically observed in real gates
fall within this class of proper failure-to-energy functions [16, 20].
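As a sanity check, the failure-to-energy function $E(\epsilon) = \log^2(1/\epsilon)$ assumed in most of this dissertation can be verified against the fourth property directly. The short derivation below is a sketch added for illustration; it writes $a = \log(1/\epsilon_1)$ and $b = \log(1/\epsilon_2)$, so that $\log(1/\sqrt{\epsilon_1 \epsilon_2}) = (a+b)/2$.

```latex
\[
E(\epsilon_1) + E(\epsilon_2) - 2E\!\left(\sqrt{\epsilon_1 \epsilon_2}\right)
= a^2 + b^2 - 2\left(\frac{a+b}{2}\right)^2
= \frac{(a-b)^2}{2} \;\ge\; 0 .
\]
```

The same substitution shows that $E(\epsilon) = \log(1/\epsilon)$ satisfies the fourth property with equality, so it sits exactly on the boundary of the convexity constraint.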
Statement 23 ([14]). Let g be a gate with fan-in $n_g$, in a circuit C where both gates and
wires may fail. Furthermore, let $\epsilon \in (0, 1/2)$, $\zeta_g \in [0, \epsilon/n_g]$ and let g(t) be the output of gate
g assuming that its input-wires receive input t, and both g and g’s input-wires are reliable.
Then there exists a unique value $\eta_g(y, \zeta_g) \in [0, 1]$ such that if
• the input wires of g fail independently with probability $\zeta_g$, and
• gate g fails with probability $\eta_g(y, \zeta_g)$ when the gate receives input y,
then the probability that g does not output g(t) is equal to $\epsilon$.
Note that in Statement 23, since we can now have failures on wires, the input y received
by a gate g may be different than the input t received by the corresponding wires.
We need the following definition and technical lemma.
Definition 24. Given $x_{1,1}, x_{1,2}, \ldots, x_{1,n} \in \mathbb{R}$, we recursively define a sequence of numbers
as follows. Let $m_j^u = \arg\max_i x_{j,i}$ and $m_j^l = \arg\min_i x_{j,i}$. Then, for all $i \notin \{m_j^u, m_j^l\}$, let
$x_{(j+1),i} = x_{j,i}$, and let $x_{(j+1),m_j^u} = x_{(j+1),m_j^l} = \sqrt{x_{j,m_j^u} \, x_{j,m_j^l}}$.
Lemma 25. Let $a_1, a_2, \ldots$ be a sequence of numbers such that $a_j = x_{j,m_j^u} - x_{j,m_j^l}$, with the
terms $x_{j,m_j^u}$ and $x_{j,m_j^l}$ as defined above. Then,
\[
\lim_{j \to \infty} a_j = 0.
\]
Proof. First note that within the (n+1)-th recursive step of the above construction there must
be some index $i^*$ that has been chosen as $m_j^l$ twice. Let $k_1$ and $k_2$ denote the recursive step
during which $i^*$ is chosen for the first and second time, respectively. More formally, consider
$S_l = \bigcup_{j \le l} \{m_j^l\}$. Then, $k_2$ is the minimum index $j \le n + 1$ such that $\{m_j^l\} \cap S_{j-1} \ne \emptyset$, and
$k_1$ is the index $j < k_2$ such that $\{m_{k_2}^l\} \cap S_j \ne \emptyset$.
For notational convenience, we denote $x_{k_1,m_{k_1}^u}$ with $x_h$ and $x_{k_1,m_{k_1}^l}$ with $x_l$. Note that the
sequence formed by $x_{j,m_j^u}$ is monotonically decreasing in j, and similarly $x_{j,m_j^l}$ is monotonically
increasing in j. Then,
\[
a_{k_2} = x_{k_2,m_{k_2}^u} - x_{k_2,m_{k_2}^l} = x_{k_2,m_{k_2}^u} - \sqrt{x_h x_l} \le x_h - \sqrt{x_h x_l}.
\]
Therefore,
\[
\frac{a_{k_2}}{a_{k_1}} \le \frac{x_h - \sqrt{x_h x_l}}{x_h - x_l}
= \frac{\sqrt{x_h}\left(\sqrt{x_h} - \sqrt{x_l}\right)}{x_h - x_l}
= \frac{\sqrt{x_h}}{\sqrt{x_h} + \sqrt{x_l}}.
\]
Rewriting this, we obtain $a_{k_2} \le a_{k_1}/(1 + \sqrt{x_l/x_h})$. Then, by observing that the sequence
of $a_i$’s is monotonically decreasing, because the sequence formed by $x_{j,m_j^u}$ is monotonically
decreasing and $x_{j,m_j^l}$ is monotonically increasing in j, we conclude that
\[
a_{k_2} \le \frac{a_1}{1 + \sqrt{x_{m_1^l}/x_{m_1^u}}}.
\]
It follows that, for any positive integer x, $a_{xn+1} \le a_1/\left(1 + \sqrt{x_{m_1^l}/x_{m_1^u}}\right)^x$, and therefore
$\lim_{j \to \infty} a_j = 0$.
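The process of Definition 24 is easy to simulate, which gives a quick numerical illustration of Lemma 25. The Python sketch below assumes strictly positive starting values (so that the geometric mean is well defined); the function names are hypothetical.

```python
import math
from typing import List

def averaging_step(xs: List[float]) -> List[float]:
    """One step of Definition 24: replace the current maximum and minimum
    by their geometric mean, leaving every other entry unchanged."""
    hi = max(range(len(xs)), key=xs.__getitem__)
    lo = min(range(len(xs)), key=xs.__getitem__)
    g = math.sqrt(xs[hi] * xs[lo])
    out = list(xs)
    out[hi] = out[lo] = g
    return out

def gap(xs: List[float]) -> float:
    """The quantity a_j of Lemma 25: current maximum minus current minimum."""
    return max(xs) - min(xs)

def iterate(xs: List[float], steps: int) -> List[float]:
    for _ in range(steps):
        xs = averaging_step(xs)
    return xs
```

Each step leaves the product of the entries unchanged, since the two replaced entries have the same product as the two geometric means; this is the reason the product constraint in the proof of Lemma 26 stays feasible while the failure probabilities are equalized.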
Lemma 26. Let E be a proper failure-to-energy function, and let C be a circuit that (1 − δ)-reliably
computes a relation h of sensitivity m. If (i) each gate g of C fails independently
with probability $\eta_g(y, \zeta_g)$ when receiving input y, (ii) g incurs an energy consumption of zero,
and (iii) each wire i entering g fails independently with probability $\zeta_g \in (0, 1/4)$ and incurs
an energy usage of $E(\zeta_g)$, then C requires
\[
\Omega\left( m \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right) \right)
\]
energy in order to (1 − δ)-reliably compute h.
Proof. We start by rephrasing our problem after borrowing a constraint on the number of
wires and some notation from [18]. Specifically, let z be an input such that h has maximum
sensitivity on z. Let $S \subset \{1, 2, \ldots, n\}$ be the set of indexes so that $\ell \in S$ if and only if h is
sensitive to the $\ell$-th bit on input z. Then |S| = m, where m is the sensitivity of h. For each
$\ell \in S$ denote by $B_\ell$ the set of all wires originating from the $\ell$-th input of the circuit. Let
$w_\ell = |B_\ell|$. For any set $\beta \subset B_\ell$, let H(β) be the event that the wires belonging to β fail and
the other wires of $B_\ell$ are correct. Denote by $\beta_\ell$ the subset of $B_\ell$ where
\[
\max_{\beta \subset B_\ell} \Pr[C(z^\ell) = h(z^\ell) \text{ s.t. } H(\beta)]
\]
is obtained, where $C(z^\ell)$ is a random variable for the output of the circuit given input $z^\ell$.
Finally, let $H_\ell = H(B_\ell \setminus \beta_\ell)$. Note that since wires can now fail with different probabilities,
we have that
\[
\Pr[H_\ell] = \prod_{i \in \beta_\ell} (1 - \zeta_i) \prod_{i \notin \beta_\ell} \zeta_i \ge \prod_{i \in B_\ell} \zeta_i.
\]
It follows from Inequalities (5) and (6) of [18] that
\[
\frac{\delta}{1 - 2\sqrt{\delta}} \ge \sum_{\ell \in S} \prod_{i \in B_\ell} \zeta_i
\]
and as in [18], using the inequality of arithmetic and geometric means, we have
\[
\frac{\delta}{1 - 2\sqrt{\delta}} \ge m \left( \prod_{\ell \in S,\, i \in B_\ell} \zeta_i \right)^{1/m}.
\]
Rewriting this to isolate the product term, we have
\[
\prod_{\ell \in S,\, i \in B_\ell} \zeta_i \le \left( \frac{\delta}{m(1 - 2\sqrt{\delta})} \right)^m.
\]
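Therefore, minimizing the energy consumption is equivalent to the following optimization
problem:
\[
\begin{aligned}
\text{minimize} \quad & \sum_{\ell \in S,\, i \in B_\ell} E(\zeta_i) \\
\text{subject to} \quad & \prod_{\ell \in S,\, i \in B_\ell} \zeta_i \le \left( \frac{\delta}{m(1 - 2\sqrt{\delta})} \right)^m.
\end{aligned}
\]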
Now, take some feasible solution $\zeta^*$ to the above optimization problem. Let $\zeta_1^*$ and
$\zeta_2^*$ denote the minimum and maximum $\zeta_i^*$ respectively, and M denote the total number of
wires, i.e., $M = \sum_{\ell \in S} w_\ell$. Note that since we assume that $E(p_1) + E(p_2) \ge 2E(\sqrt{p_1 p_2})$ for all
$p_1, p_2 \in (0, 1/2)$, we can set $\zeta_1^* = \zeta_2^* = \sqrt{\zeta_1^* \zeta_2^*}$, without increasing the value of the objective,
and further the constraint remains feasible. By Lemma 25 this process, if repeated, will
converge to a solution where all $\zeta_i$ are equal. Therefore, we can rewrite the optimization
problem as
\[
\begin{aligned}
\text{minimize} \quad & M E(x) \\
\text{subject to} \quad & x^M \le \left( \frac{\delta}{m(1 - 2\sqrt{\delta})} \right)^m.
\end{aligned}
\]
Isolating M in the constraint above, the problem is equivalent to that of minimizing
\[
\left( \frac{m}{\log 1/x} \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right) \right) E(x).
\]
Since the function satisfies properties 1, 2, and 3 of Definition 22, the above expression will
be minimized either at some constant x ∈ (0, 1/4), in which case E(x)/ log(1/x) > 0, or in
the limit as x approaches 0, in which case
\[
\lim_{x \to 0^+} E(x)/\log(1/x) > 0,
\]
or in the limit as x approaches 1/4, in which case
\[
E(x)/\log(1/x) > 0.
\]
The lemma follows.
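The step of isolating M can be spelled out as follows. This is a sketch of the intermediate algebra only, using that 0 < x < 1 so that log(1/x) > 0; it adds nothing beyond what the proof already uses.

```latex
\[
x^M \le \left( \frac{\delta}{m(1 - 2\sqrt{\delta})} \right)^m
\;\Longleftrightarrow\;
M \log\frac{1}{x} \ge m \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right)
\;\Longleftrightarrow\;
M \ge \frac{m}{\log(1/x)} \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right),
\]
\[
\text{so that}\quad
M E(x) \;\ge\; \frac{E(x)}{\log(1/x)} \cdot m \log\left( m \, \frac{1 - 2\sqrt{\delta}}{\delta} \right).
\]
```

Since the ratio $E(x)/\log(1/x)$ is bounded below by a positive constant in each of the three cases, the stated Ω bound follows.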
We are now ready to prove Theorem 21.
Proof of Theorem 21. We start by constructing a new circuit C′ for computing h, which is identical
to C except that both wires and gates may fail, wires of C′ incur some non-zero energy
consumption (as a function of their probability of failure), and the gates in C′ do not consume
energy. First we argue that this can be done such that C′ is (1 − δ)-reliable. Observe that
if for each wire i entering gate g we set its probability of failure to $\zeta_g = \epsilon_g/n_g$, we can apply
Statement 23 and set the failure probability on gate g when receiving input y to $\eta_g(y, \zeta_g)$.
The result is that when the input wires of gate g in C′ receive input t, the probability that
g does not output g(t) is $\epsilon_g$ (the same as the probability of failure of g in the original circuit
C). Thus by setting these failure probabilities for each gate and wire in C′ we have that, for
any input x, C and C′ output h(x) with the same probability, and so C′ is (1 − δ)-reliable.
Now we set the energy consumption of the wires such that the energy of C′ is at most
the energy of C. First observe that if for each gate g we set the failure-to-energy function
of the wires that are inputs to g to be $\tilde{E}_g(\zeta) = E(n_g \cdot \zeta)/n_g$, then since $\zeta_g = \epsilon_g/n_g$, the
total energy of the wires entering g would be $n_g \tilde{E}_g(\zeta_g) = E(\epsilon_g)$ and the energy of C and C′
would be equal. However, to apply Lemma 26, all wires must have the same failure-to-energy
function. Therefore, let $n_g^*$ be the maximum fan-in of any gate of C, i.e., $n_g^* = \max_{g \in C} n_g$.
We set the failure-to-energy function of all wires in C′ to be
\[
\tilde{E}(\zeta) =
\begin{cases}
E(n_g^* \cdot \zeta)/n_g^* & \text{if } \zeta < \frac{1}{2n_g^*}, \\
\lim_{\epsilon \to 1/2^-} E(\epsilon)/n_g^* & \text{if } \zeta \ge \frac{1}{2n_g^*}.
\end{cases}
\]
First observe that $\tilde{E}_g(\zeta) \ge \tilde{E}(\zeta)$ for all ζ ∈ (0, 1/2) since E is nonincreasing so $E(n_g \zeta) \ge E(n_g^* \zeta)$.
This implies that the energy of C′ is at most the energy of C.
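In order to apply Lemma 26, we need to verify that $\tilde{E}$ is a proper failure-to-energy
function. The first property follows directly from the definition of $\tilde{E}$. For the second property,
observe that
\[
\lim_{\zeta \to 0^+} \frac{\tilde{E}(\zeta)}{\log\left(\frac{1}{\zeta}\right)}
= \frac{1}{n_g^*} \lim_{\zeta \to 0^+} \frac{E(n_g^* \zeta)}{\log\left(\frac{1}{n_g^* \zeta}\right)}
\cdot \lim_{\zeta \to 0^+} \frac{\log\left(\frac{1}{n_g^* \zeta}\right)}{\log\left(\frac{1}{\zeta}\right)}
= \frac{1}{n_g^*} \lim_{\zeta \to 0^+} \frac{E(n_g^* \zeta)}{\log\left(\frac{1}{n_g^* \zeta}\right)} > 0,
\]
where the last equality holds because the second limit equals 1.
The third property follows from the fact that
\[
\lim_{\zeta \to 1/2^-} \tilde{E}(\zeta) = \lim_{\epsilon \to 1/2^-} E(\epsilon)/n_g^* > 0,
\]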
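where we exploited the definition of $\tilde{E}$ and the fact that, by hypothesis, E is a proper
failure-to-energy function. For the fourth property, let $\zeta_1, \zeta_2 \in (0, 1/2)$, and, w.l.o.g., $\zeta_1 < \zeta_2$. There
are four cases, depending on the relationship between $\zeta_1$, $\zeta_2$, and $n_g^*$. When $\zeta_1 < \zeta_2 < 1/(2n_g^*)$,
by applying the definition of $\tilde{E}$ and since E by hypothesis is a proper failure-to-energy
function, we have
\[
\tilde{E}(\zeta_1) + \tilde{E}(\zeta_2) = \frac{E(n_g^* \zeta_1)}{n_g^*} + \frac{E(n_g^* \zeta_2)}{n_g^*}
\ge \frac{2E\left(\sqrt{n_g^* \zeta_1 \, n_g^* \zeta_2}\right)}{n_g^*} = 2\tilde{E}\left(\sqrt{\zeta_1 \zeta_2}\right).
\]
When $\zeta_1 < \zeta_2 = 1/(2n_g^*)$, by the previous case we have that
\[
\lim_{\zeta_2 \to (1/(2n_g^*))^-} \left( \tilde{E}(\zeta_1) + \tilde{E}(\zeta_2) - 2\tilde{E}\left(\sqrt{\zeta_1 \zeta_2}\right) \right) \ge 0,
\]
and so in this case the property holds. When $\zeta_1 < 1/(2n_g^*) < \zeta_2$, we have that
\[
\tilde{E}(\zeta_1) + \tilde{E}(\zeta_2) = \tilde{E}(\zeta_1) + \tilde{E}\left(\frac{1}{2n_g^*}\right)
\ge 2\tilde{E}\left(\sqrt{\frac{\zeta_1}{2n_g^*}}\right) \ge 2\tilde{E}\left(\sqrt{\zeta_1 \zeta_2}\right),
\]
where the first equality holds by definition of $\tilde{E}$, the first inequality follows by the preceding
case, and the second inequality holds since $\tilde{E}$ is nonincreasing and since, in this case,
$\sqrt{\zeta_1/(2n_g^*)} \le \sqrt{\zeta_1 \zeta_2}$. Finally, when $1/(2n_g^*) \le \zeta_1 < \zeta_2$, $\sqrt{\zeta_1 \zeta_2} > \zeta_1$ and thus, by definition,
$\tilde{E}(\zeta_1) = \tilde{E}(\zeta_2) = \tilde{E}\left(\sqrt{\zeta_1 \zeta_2}\right)$. We conclude that $\tilde{E}$ is a proper failure-to-energy function. The
theorem then directly follows by applying Lemma 26 to C′ and $\tilde{E}$.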
3.2 A GENERAL ENERGY UPPER BOUND
Our main goal in this section is to prove Theorem 27, which roughly states that O(s log s)
energy is sufficient to simulate a circuit of size s.
Theorem 27. Given a reliable circuit C of size s, a non-trivial failure-to-energy function,
and a fixed constant δ > 0, it is possible to construct a circuit C ′ with homogeneous voltage
supplies that uses O(s log(s/δ)) units of energy and that (1− δ)-reliably computes the same
function computed by C.
To prove Theorem 27 we use an upper bound on the number of gates for fault-tolerant
circuits originally stated by Pippenger [25] and later proved in full generality by Gács [17].
This upper bound is stated in Theorem 28. The energy upper bound follows by choosing
the voltage supply that minimizes the product of the total number of gates in the circuit
constructed in Theorem 28, and the energy used by each gate. More specifically, we want
to set the gate failure probability ε so as to minimize E(ε)/(log(1/ε) − r0) for some constant
r0. As long as E is non-trivial, i.e., if for any p* ∈ (0, 1/2) it holds that E(p*) < +∞, one
can find an ε such that E(ε) = O(1). This setting of ε then implies the upper bound of
O(s log(s/δ)) on the energy used by this construction using homogeneous supply voltages.
Theorem 28 ([17]). There are constants R0, ε0, r0 > 0 such that for all ε < ε0 and δ ≥ 3ε,
for every reliable circuit C of size s there is a circuit of size
\[
R_0 \, \frac{s \log(s/\delta)}{\log(1/\epsilon^*) - r_0}
\]
that computes the same result as C with probability at least 1 − δ if gates fail independently
with probability at most ε, where ε* = max{ε, δ/s}.
Proof. The main idea of this construction is to replace each gate of the reliable circuit with
a gadget in the fault-tolerant circuit. A constant k is chosen as the level of redundancy for
the circuit, meaning that each gadget has k outputs and each input to a gate in the reliable
circuit is replaced by k inputs to a gadget. Another constant θ ∈ (0, 1) is chosen such that,
with high probability, θk wires exiting each gadget carry the same value as the corresponding
gate in the reliable circuit. Each gadget contains k copies of the corresponding gate from
the reliable circuit, as well as an additional circuit that ensures that at least a θ fraction of
the wires exiting the gadget are correct.
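The energy bound of Theorem 27 is the product of the gate count in Theorem 28 and the per-gate energy E(ε). The following is a minimal sketch of that product, not part of the original argument. The constants `R0` and `r0` are placeholders (the theorem only asserts that suitable constants exist), and `log_energy` is a hypothetical failure-to-energy function chosen for illustration.

```python
import math


def fault_tolerant_energy(s, delta, eps, energy, R0=1.0, r0=0.0):
    """Gate-count bound of Theorem 28 multiplied by the per-gate energy E(eps).

    R0 and r0 are illustrative placeholders, not the true constants.
    """
    eps_star = max(eps, delta / s)
    denom = math.log(1.0 / eps_star) - r0
    if denom <= 0:
        raise ValueError("eps is too large for the chosen r0")
    gates = R0 * s * math.log(s / delta) / denom
    return gates * energy(eps)


def log_energy(eps):
    # Hypothetical failure-to-energy function E(eps) = log(1/eps).
    return math.log(1.0 / eps)


# With a constant eps, the total grows like s * log(s / delta).
small = fault_tolerant_energy(1_000, 0.05, 0.01, log_energy)
large = fault_tolerant_energy(2_000, 0.05, 0.01, log_energy)
```

With these placeholder constants and a fixed ε, doubling s slightly more than doubles the bound, which matches the s log(s/δ) growth stated in Theorem 27.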
4.0 INTRODUCTION TO SUPPLY VOLTAGE HETEROGENEITY
In this chapter, we show situations where allowing supply voltages to be heterogeneous
rather than homogeneous both does and does not allow for asymptotic decreases in energy
consumption. In Section 4.1, we show that, when δ is a fixed constant, there are functions
which do not use asymptotically less energy when supply voltages may be heterogeneous.
In contrast, in Section 4.2, we show that for the natural circuit computing a supermajority
relation, allowing supply voltages to be heterogeneous allows the relation to be computed
with asymptotically less energy than if the supply voltages must be homogeneous.
4.1 SUPPLY VOLTAGE HETEROGENEITY MAY NOT HELP
In this section we observe in Theorem 29 that there are relations, namely the parity function,
where heterogeneous supply voltages do not allow for an asymptotic reduction in energy.
Theorem 29. Let δ < 1/4 be a fixed constant. The energy used by any circuit to (1 − δ)-
reliably compute the parity function is Ω(n log(n/δ)), and this is achievable by a circuit with
homogeneous supply voltages.
Proof. The parity function can be reliably computed by a perfect binary tree of 2n − 1
reliable XOR gates. Thus, by Theorem 27 there exists a (1 − δ)-reliable circuit for the
parity function that uses homogeneous voltage supplies and that incurs O(n log(n/δ)) energy
consumption. Since the sensitivity of the parity function is n, by Theorem 21 this is the best
possible to within a constant factor.
4.2 SUPPLY VOLTAGE HETEROGENEITY CAN HELP
The goal of this section is to prove Theorem 33, which roughly states that heterogeneous
supply voltages allow the natural majority circuit to compute a super-majority with asymp-
totically less energy than is possible with homogeneous supply voltages.
We start by defining the circuit and the logarithmic supermajority relation (LSR).
Lemma 37 shows that Ω(n · E(δ)) energy is required to (1 − δ)-reliably compute the LSR
with homogeneous voltage supplies. The intuition behind the proof is that the output gate
of any (1 − δ)-reliable circuit cannot have a probability of failure greater than δ and, since
the voltage supplies are homogeneous, neither can any other gate. Then in Lemma 38 we
show that there is a heterogeneous setting of the supply voltages so that this circuit
(1 − δ)-reliably computes LSR with energy $O(n + 3^{1/\delta} E(\delta/10))$. Intuitively, we split the circuit into
an "upper" part consisting of gates close to the output gate, and a "lower" part consisting
of gates close to the input gates: Each gate in the lower part has a constant probability of
failure and thus a small energy consumption which results in non-constant savings compared
to the optimal homogeneous setting. With the help of technical Lemmas 34 and 35, we are
able to show that although there exist gates in the lower part of the circuit that fail with a
probability higher than δ, no such gate fails with a probability o(1). This preserves enough
information for the upper part of the circuit to still (1 − δ)-reliably compute LSR. In other
words, in the upper part of the circuit we use a much smaller probability of failure for each
gate in order to ensure that the circuit will "autocorrect" itself and output the correct result
(see Lemma 36).
Definition 30. The Logarithmic Supermajority Relation (LSR) is the following Boolean
relation:
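\[
\mathrm{LSR}(x) =
\begin{cases}
0 & \text{if the number of 0's in } x \text{ is at least } n - \tfrac{1}{2}\log_3 n, \\
1 & \text{if the number of 1's in } x \text{ is at least } n - \tfrac{1}{2}\log_3 n, \text{ and} \\
0 \text{ and } 1 & \text{otherwise,}
\end{cases}
\]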
where x is the input and |x| = n.
This relation outputs 1 when the input contains at least n − (1/2) log_3 n ones, 0 when
the input contains at least n − (1/2) log_3 n zeros, and otherwise we "don't care".
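As an illustration of Definition 30, the relation can be written as a function that returns the set of acceptable outputs. This is a minimal sketch; the function name `lsr_allowed_outputs` and the set-valued return type are choices made here, not notation from the text.

```python
import math


def lsr_allowed_outputs(x):
    """Return the set of outputs the LSR relation accepts on the bit list x."""
    n = len(x)
    threshold = n - 0.5 * math.log(n, 3)
    ones = sum(x)
    zeros = n - ones
    if zeros >= threshold:
        return {0}
    if ones >= threshold:
        return {1}
    # "Don't care" region: either output is acceptable.
    return {0, 1}


# n = 27, so the threshold is 27 - 1.5 = 25.5.
mostly_ones = lsr_allowed_outputs([1] * 26 + [0])
undecided = lsr_allowed_outputs([1] * 25 + [0] * 2)
```

The two conditions cannot hold at once, since that would require log_3 n ≥ n.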
Definition 31. A majority tree is a Boolean circuit where the gates form a perfect ternary
tree in which the leaves represent the inputs, and each internal gate, called a majority gate,
outputs the majority of its three children.
Definition 32. A failure-to-energy function E is called easy-going when the following hold:
• $\lim_{x\to 0} E(x) = +\infty$.
• There exists a constant c > 0 such that $E(x/10)/E(x) \le c$ for all x ∈ (0, 1/2).
Note that this class of failure-to-energy functions contains many natural functions. For
example, $E(x) = 1/x^{\alpha}$ and $E(x) = (\log(1/x))^{\alpha}$ are both easy-going.
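The second property can be checked directly for both examples. The short derivation below is a sketch that assumes α > 0, which the first property already requires.

```latex
% Assumption: \alpha > 0 and x \in (0, 1/2), so that \log(1/x) > \log 2.
\[
E(x) = \frac{1}{x^{\alpha}}:
\qquad \frac{E(x/10)}{E(x)} = 10^{\alpha}.
\]
\[
E(x) = \bigl(\log(1/x)\bigr)^{\alpha}:
\qquad \frac{E(x/10)}{E(x)}
= \Bigl(1 + \frac{\log 10}{\log(1/x)}\Bigr)^{\alpha}
\le \Bigl(1 + \frac{\log 10}{\log 2}\Bigr)^{\alpha}.
\]
```

In both cases the ratio is bounded by a constant that depends only on α.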
Theorem 33. Let E be an easy-going failure-to-energy function, and let c ∈ (0, 1) be a
constant. Furthermore, let E1(δ) be the optimal energy consumption of the (1 − δ)-reliable
majority tree on n leaves where all the gates must have the same failure probability, and
E2(δ) be the optimal energy consumption of the same (1−δ)-reliable majority tree when each
gate can have an arbitrary failure probability. Then, for $\delta' = \frac{1}{(1-c)\log_3 n}$, there holds
\[
\frac{E_1(\delta')}{E_2(\delta')} = \omega(1).
\]
Let $p_i$ be the probability that a gate of height i in a majority tree outputs 1. Notice that
\[
p_{i+1} = p_i^3(1-\epsilon) + 3p_i^2(1-p_i)(1-\epsilon) + 3p_i(1-p_i)^2\epsilon + (1-p_i)^3\epsilon
        = (3p_i^2 - 2p_i^3)(1-2\epsilon) + \epsilon.
\]
Also, let $R(p_i) := p_{i+1}$, and let $\ell^*(\epsilon)$ be the largest real number such that $R(\ell^*(\epsilon)) = \ell^*(\epsilon)$.
Note that $\ell^*(\epsilon)$ only exists when ε < 1/6. Therefore, in the following we assume that ε < 1/6.
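Lemma 34. It holds that $\ell^*(\epsilon) = \frac{1}{2} + \frac{1}{2}\sqrt{\frac{1-6\epsilon}{1-2\epsilon}}$. Furthermore, if $1/2 \le p_i \le \ell^*(\epsilon)$ then
$p_i \le p_{i+1} \le \ell^*(\epsilon)$, and if $p_i \ge \ell^*(\epsilon)$ then $p_i \ge p_{i+1} \ge \ell^*(\epsilon)$, for all $i \in \{1, 2, \dots, \log_3 n\}$.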
Proof. For any fixed point r of R, we have that R(r) = r, which implies that the fixed points
are the zeros of the third-order polynomial R(r) − r. Observe that these zeros are
$\frac{1}{2} - \frac{1}{2}\sqrt{\frac{1-6\epsilon}{1-2\epsilon}}$, $\frac{1}{2}$, and $\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1-6\epsilon}{1-2\epsilon}}$.
For the second statement of the lemma, assume first that $p_i \ge \ell^*(\epsilon)$, and note that $R(p_i)$
is increasing because $R'(p_i) = (12\epsilon - 6)(p_i^2 - p_i) > 0$ for ε < 1/2. Therefore we have
\[
\ell^*(\epsilon) = R(\ell^*(\epsilon)) \le R(p_i) = p_{i+1}.
\]
Now we have to show that $p_{i+1} \le p_i$. It is easy to see that $R(p_i) - p_i$ is decreasing when
$p_i = \ell^*(\epsilon)$ and ε < 1/6. Furthermore, $R(p_i) - p_i$ is a concave function when $p_i \ge 1/2$,
since $(R(p_i) - p_i)'' = R''(p_i) = (12\epsilon - 6)(2p_i - 1) < 0$, for $p_i \ge 1/2$ and ε < 1/2. Since
$R(\ell^*(\epsilon)) - \ell^*(\epsilon) = 0$, it follows that for $p_i \in (\ell^*(\epsilon), 1)$, $R(p_i) - p_i \le 0$, and thus
$R(p_i) = p_{i+1} \le p_i$.
The case $p_i \le \ell^*(\epsilon)$ follows from the fact that $R(p_i) - p_i$ is decreasing when $p_i = \ell^*(\epsilon)$,
$R(p_i) - p_i$ cannot be zero in the interval $(1/2, \ell^*(\epsilon))$, and $R(\ell^*(\epsilon)) - \ell^*(\epsilon) = 0$.
Note that the above technical lemma implies that $\ell^*(\epsilon)$ is a stable fixed point. The next
two lemmas will be useful for setting the failure probabilities and analyzing the upper part
of the tree.
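The closed form for the fixed point in Lemma 34 can be checked numerically. The sketch below iterates the recurrence for ε = 0.12, the value used later for the lower part of the tree; the function names `R` and `ell_star` mirror the notation above, while the iteration count of 30 is an arbitrary choice made here.

```python
import math


def R(p, eps):
    """One level of the majority-tree recurrence: p_{i+1} = R(p_i)."""
    return (3 * p**2 - 2 * p**3) * (1 - 2 * eps) + eps


def ell_star(eps):
    """Largest fixed point of R, as given in Lemma 34 (requires eps < 1/6)."""
    if not 0 <= eps < 1 / 6:
        raise ValueError("the fixed point formula needs 0 <= eps < 1/6")
    return 0.5 + 0.5 * math.sqrt((1 - 6 * eps) / (1 - 2 * eps))


eps = 0.12
fixed = ell_star(eps)

# Start above the fixed point (all-ones inputs give p = 1 - eps at the first
# level) and iterate; the sequence should decrease toward ell_star(eps).
trajectory = [1 - eps]
for _ in range(30):
    trajectory.append(R(trajectory[-1], eps))
```

Starting from 0.88, the values decrease monotonically and stay above the fixed point, which is slightly larger than 0.8.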
Lemma 35. Let G be a majority gate with input gates g1, g2 and g3 which output 1 with
probability $q_1 > 1/2$, $q_2 > 1/2$, and $q_3 > 1/2$, respectively. Furthermore let $q_G$ be the
probability that G outputs 1 (for the given probabilities of the inputs to output 1). If we alter
g1, g2, and g3 to have probabilities $q'_1 > q_1$, $q'_2 > q_2$, and $q'_3 > q_3$ of outputting 1, then for the
new probability $q'_G$ of G outputting 1 it holds that $q'_G \ge q_G$.
Proof. We have:
\begin{align*}
q'_G &= q'_1 q'_2 q'_3 (1-\epsilon) + q'_1 q'_2 (1-q'_3)(1-\epsilon) + q'_1 q'_3 (1-q'_2)(1-\epsilon) \\
     &\quad + q'_2 q'_3 (1-q'_1)(1-\epsilon) + q'_1 (1-q'_2)(1-q'_3)\epsilon + q'_2 (1-q'_1)(1-q'_3)\epsilon \\
     &\quad + q'_3 (1-q'_1)(1-q'_2)\epsilon + (1-q'_1)(1-q'_2)(1-q'_3)\epsilon \\
     &= (1-2\epsilon)(q'_1 q'_2 + q'_1 q'_3 + q'_2 q'_3 - 2 q'_1 q'_2 q'_3) + \epsilon \\
     &\ge (1-2\epsilon)(q_1 q_2 + q_1 q_3 + q_2 q_3 - 2 q_1 q_2 q_3) + \epsilon = q_G.
\end{align*}
The inequality holds because the partial derivatives of the right-hand side with respect
to q1, q2 and q3 are all always nonnegative.
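The closed form used in the proof of Lemma 35 can be compared against a brute-force enumeration of the eight input combinations. This sketch assumes the three inputs are independent, as the product form in the proof does; the function names are illustrative.

```python
from itertools import product


def majority_output_prob(q, eps):
    """Pr[gate outputs 1], enumerating all inputs; q = (q1, q2, q3)."""
    total = 0.0
    for bits in product((0, 1), repeat=3):
        pr = 1.0
        for bit, qi in zip(bits, q):
            pr *= qi if bit else 1 - qi
        majority_is_one = sum(bits) >= 2
        # The gate reports the majority unless it fails (probability eps).
        total += pr * ((1 - eps) if majority_is_one else eps)
    return total


def closed_form(q, eps):
    """The simplified expression from the proof of Lemma 35."""
    q1, q2, q3 = q
    return (1 - 2 * eps) * (q1 * q2 + q1 * q3 + q2 * q3 - 2 * q1 * q2 * q3) + eps


before = closed_form((0.6, 0.7, 0.8), 0.1)
after = closed_form((0.7, 0.8, 0.9), 0.1)
```

Raising every input probability raises the output probability, in line with the lemma.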
Lemma 36. Consider a majority tree T of height ⌊1/δ⌋ (for δ small enough) where each
input of T is 1 with probability at least 0.79, and suppose that each gate of T has a failure
probability of δ/10. Then T outputs 1 with probability at least 1− δ.
Proof. By Lemma 35, the probability that T outputs 1 is minimized when all of the inputs
are 1 with probability exactly 0.79.
Let $f(p_i) = p_{i+1} - p_i = p_i^3(1-\epsilon) + 3p_i^2(1-p_i)(1-\epsilon) + 3p_i(1-p_i)^2\epsilon + (1-p_i)^3\epsilon - p_i$. Since
by assumption ε < 1/2 and $p_i > 1/2$, we have that $f''(p_i) = (12p_i - 6)(2\epsilon - 1)$ is negative
and therefore f is concave. Furthermore, by setting ε = δ/10, and for δ small enough we
may omit higher order terms and obtain
\[
f(1-\delta) \ge \tfrac{4}{5}\delta.
\]
It can be verified that for δ small enough f(0.79) ≥ (4/5)δ, and f(0.79) ≥ f(1 − δ).
By the concavity of f, f(x) ≥ (4/5)δ for x ∈ (0.79, 1 − δ). This means that there exists a
0 < k < 1/δ, so that each gate of height $\lfloor \log_3 n - 1/\delta + k \rfloor$ outputs 1 with probability at
least 1 − δ. It suffices to show that each gate of height greater than $\lfloor \log_3 n - 1/\delta + k \rfloor$ has
a probability of outputting 1 that is not lower than the probability of the gates one level
below to output 1. This follows by Lemma 34: For δ small enough, $1 - \delta < \ell^*(\delta/10)$, and
$p_i \le p_{i+1}$. This completes the proof of the lemma.
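The statement of Lemma 36 can be explored numerically by iterating the recurrence from 0.79 for ⌊1/δ⌋ levels with ε = δ/10. The sketch below is only a sanity check for a few sample values of δ chosen here; it does not replace the proof, which holds for all sufficiently small δ.

```python
import math


def R(p, eps):
    """One level of the majority-tree recurrence: p_{i+1} = R(p_i)."""
    return (3 * p**2 - 2 * p**3) * (1 - 2 * eps) + eps


def upper_tree_output_prob(delta, p_in=0.79):
    """Worst-case Pr[root outputs 1] for a tree of height floor(1/delta)."""
    eps = delta / 10
    p = p_in
    for _ in range(math.floor(1 / delta)):
        p = R(p, eps)
    return p


# Sample values of delta (an arbitrary selection for illustration).
results = {d: upper_tree_output_prob(d) for d in (0.2, 0.1, 0.05, 0.01)}
```

For each sampled δ the resulting probability is at least 1 − δ.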
With the help of the above lemmas, we are now ready to bound E1 and E2.
Lemma 37. It holds that E1(δ) = Ω(n · E(δ)).
Proof. In order to lower bound E1, we will consider the case where each input bit is 1. Note
that the root cannot have a probability of failure greater than δ, since then even with all
its inputs being correct it would not give the right output with the desired probability, and
by Lemma 35 this probability can only decrease as the probability that the input gates are
correct decreases. Because all gates must have the same failure probability, we have that each of
the Θ(n) gates has an energy consumption of at least E(δ), and the lemma follows.
Figure 3: (a) Pr[r outputs 1] ≥ 1 − p. (b) The path from b to g. The input gates $b_i$ receive
input 0.
Lemma 38. It holds that $E_2(\delta) = O\bigl(n + 3^{1/\delta} E(\delta/10)\bigr)$.
Proof. Assume without loss of generality that the input contains at least n − (1/2) log_3 n 1's,
so that the desired output is 1. We assign a failure probability of δ/10 to each gate located
at a height of at least log_3 n − 1/δ, and a failure probability of 0.12 to each gate at a height
strictly less than log_3 n − 1/δ. By Lemma 36, it suffices to show that each gate at height
$\lfloor \log_3 n - 1/\delta \rfloor$ outputs 1 independently with probability at least 0.79.
Let p = 0.36. Consider a majority tree where each gate has a failure probability of 0.12.
Then, by Lemma 34, and since for this tree $p_0 = 0.88 > \ell^*(0.12)$, we have that the root of
the tree outputs 1 with probability at least $\ell^*(0.12) > 0.8$. Thus, a reliable majority gate
whose inputs are one 0 and the outputs of two arbitrarily sized majority trees whose inputs
are all 1's outputs 0 with probability at most $1 - 0.8^2 = p$. See Figure 3a.
Consider a majority tree of height h rooted at gate g, and fix an input to this tree that
contains exactly d zeros, with 0 < d < h. Let b be any of the input gates of the
tree that was assigned a zero for this input. We first show that the probability that the path
from b to gate g contains only 0's after each gate has computed is at most $p^{h-d}$. Let $b_i$ for
i = 1, 2, ..., d − 1 be the other input gates that were assigned 0's. We may assume that the
path from each $b_i$ to g intersects the path from b to g at a distinct gate $x_i$. Furthermore we
may assume that each such $x_i$ outputs a 0. See Figure 3b for an example. The probability of
such a path from b to g to contain only 0's is equal to the probability that the h − d − 1 non-$x_i$
gates on the path from b to g output a 0. Note that these non-$x_i$ gates either receive a 0
and two inputs from majority subtrees whose inputs are all 1, or three inputs from majority
subtrees whose inputs are all 1. Therefore, by the above observation about p and Lemma 35,
the probability of such a path of all 0's is at most $p^{h-d}$.
Let T be any full (but not necessarily complete) majority tree of some height $h_T$. For
any $h_A \ge h_T$, we can "complete" a copy of tree T by adding extra gates in order to obtain
a perfect majority tree A of height $h_A$. We associate each gate in T with the corresponding
gate in A. We claim that if the input at the leaves of both T and A consists of only 1's, then
each gate of T is at least as likely to output 1 as the corresponding gate in A. We prove
this claim by induction over the heights of gates in T. The base case, i.e., if a gate in T is a
leaf, is straightforward. Assume now that each gate of T up to some height h′ is at least as
likely to output 1 as its corresponding gate in A, and consider a non-leaf gate g′
of T at height h′ + 1. Since T is a full tree, and g′ is not a leaf, g′ must have three children.
The inductive step now directly follows from Lemma 35.
Next, consider any subtree B of our original majority tree that is rooted at a gate g of
height $\lfloor \log_3 n - 1/\delta \rfloor$. Since we assumed that the input to the original tree contains at most
(1/2) log_3 n 0's, clearly this holds for B as well. See Figure 4 for an example of a tree B.
Now we want to lower bound the probability that g outputs a 1 when there is no path of
all 0's from a leaf to g. We note that if there is no path of all 0's from a leaf to g then there
exists a full subtree T′ of B that is also rooted at g and whose inputs can be assumed to be
all 1's. The subtree T′ can be constructed by truncating each leaf-to-root path in B at the
first node that outputs 1. The existence of T′ follows from the fact that there is no path of
all 0's from a leaf in B to g, but the structure of T′ depends on the random events occurring
at each gate in B \ T′ and the leaves of T′. Conditioning on those random events, we have
that B and T′ output a 1 with the same probability.
Figure 4: A subtree B. The solid edges denote the full ternary subtree T ∈ Γ. Note that T
has 1's as inputs on its leaves. The dashed edges denote the edges in B \ T. The gray nodes
denote gates that failed.
Let Γ be the set of all full ternary trees of height at most $\lfloor \log_3 n - 1/\delta \rfloor$, and for T ∈ Γ
let $X_T$ be the event that the truncated tree described above is T. We have that:
\begin{align*}
\Pr[B \text{ outputs } 1]
 &\ge \Pr[B \text{ outputs } 1 \mid \nexists \text{ path of all 0's}] \, \Pr[\nexists \text{ path of all 0's}] \\
 &\ge \Pr[\nexists \text{ path of all 0's}] \, \Pr[A \text{ outputs } 1].
\end{align*}
The second inequality follows because
\begin{align*}
\Pr[B \text{ outputs } 1 \mid \nexists \text{ path of all 0's}]
 &= \sum_{T \in \Gamma} \Pr[X_T] \, \Pr[T \text{ outputs 1 when given only 1's as input}] \\
 &\ge \sum_{T \in \Gamma} \Pr[X_T] \, \Pr[A \text{ outputs } 1] = \Pr[A \text{ outputs } 1].
\end{align*}
It follows by the union bound over all possible leaf-to-root paths of all 0's that g, and
therefore every gate of height $\lfloor \log_3 n - 1/\delta \rfloor$, outputs 1 independently with probability at
least $\ell^*(\epsilon) \cdot \bigl(1 - \tfrac{1}{2}(\log_3 n)\, p^{\frac{1}{2}\log_3 n - \frac{1}{\delta}}\bigr)$. For n large enough this is at least 0.79. By Lemma 36,
the upper part of the majority tree outputs 1 with probability at least 1 − δ. The total
energy of the circuit is at most $E(0.12)\, n + 3^{1/\delta} E(\delta/10)$.
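Proof of Theorem 33. By Lemmas 37 and 38, we have that for $\delta' = \frac{1}{(1-c)\log_3 n}$,
\[
\frac{E_1(\delta')}{E_2(\delta')}
= \Omega\!\left( \frac{n \cdot E(\delta')}{n + 3^{1/\delta'} E(\delta'/10)} \right)
= \Omega\!\left( \frac{n \cdot E\bigl(\frac{1}{(1-c)\log_3 n}\bigr)}{n + n^{1-c} E\bigl(\frac{1}{(1-c)\log_3 n}\bigr)} \right)
= \omega(1),
\]
where the second equality follows by the second property of easy-going functions (together with
$3^{1/\delta'} = n^{1-c}$), and the third equality by the first property of easy-going functions when
taking n large enough.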
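To see the gap of Theorem 33 grow with n, one can evaluate the ratio of the two bounds from Lemmas 37 and 38 for a concrete easy-going function. The sketch below uses E(x) = 1/x and ignores the hidden constants in the Ω and O bounds, so the numbers are only indicative; the function names and the sample values of n and c are choices made here.

```python
import math


def energy_ratio_bound(n, c, energy):
    """Ratio n*E(d) / (E(0.12)*n + 3^(1/d)*E(d/10)) at d = 1/((1-c) log_3 n)."""
    delta = 1.0 / ((1 - c) * math.log(n, 3))
    homogeneous = n * energy(delta)
    heterogeneous = energy(0.12) * n + 3 ** (1 / delta) * energy(delta / 10)
    return homogeneous / heterogeneous


def inverse_energy(x):
    # Easy-going example from the text with alpha = 1: E(x) = 1/x.
    return 1.0 / x


ratio_small = energy_ratio_bound(3**10, 0.5, inverse_energy)
ratio_large = energy_ratio_bound(3**40, 0.5, inverse_energy)
```

The ratio increases from below 1 at n = 3^10 to above 2 at n = 3^40, consistent with an ω(1) separation once n is large enough.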
We note that there are more trivial examples where heterogeneous supply voltages help.
For example, consider a circuit that is a balanced binary tree of gates that each output the
first bit, and the relation that outputs the first input bit. As most gates in this circuit are
irrelevant to computing the desired relation, one can get an asymptotic energy saving by
setting the supply voltages of the irrelevant gates to zero. Our example is more natural as
one cannot simply power off most of the gates. One might still argue that our example
is not fully satisfactory, since a more energy-efficient way to compute the super-majority
relation is to use the majority circuit from [29] to compute the majority of the first log n bits
with the supply voltages on each gate set so that the probability that any gate fails is at most
δ. So a natural question is, "For every relation, is there an asymptotically energy-optimal
circuit for computing this relation that uses homogeneous supply voltages?"
5.0 HARDNESS AND ALGORITHMIC RESULTS FOR CIRCUIT
ENERGY PROBLEMS
In this chapter, we evaluate the traditional solution to MCE, and show complexity-theoretic
barriers to obtaining a better solution to MCE in general. The model of circuit and gate
failure described in Chapter 2 is called the von Neumann failure model. In this chapter,
partially to provide evidence that our results also hold for general models, we also prove
results for the 0-default failure model. In the 0-default failure model, gates are always
faultless, but each input wire to a gate g is associated with a probability of failure ε, and
when a wire fails it sends the default value of 0 (e.g., the wire by default carries a low voltage).
More formally, for a given input $I = (b_1, \dots, b_{n_g}) \in \{0, 1\}^{n_g}$, the ith input wire carries bit
$b_i$. If $b_i = 0$ then with probability 1 g receives 0 as the ith input bit. If $b_i = 1$, then with
probability ε the wire fails and g receives 0 as the ith input bit, and with probability 1 − ε g
receives 1 as the ith input bit (note that a failure can only change a wire from carrying a 1
to carrying a 0). In this case, there is a voltage-to-energy function P(v) mapping the supply
voltage to the energy used by a wire with that supply voltage. The energy required by a
circuit C is simply the aggregate energy used by the wires, $\sum_{w \in C} P(v)$. For convenience, we
define a failure-to-energy function $E(q) := P(\epsilon^{-1}(q))$, where $\epsilon^{-1}$ denotes the inverse of the
function ε. Thus the energy of a circuit C can be rewritten as $\sum_{w \in C} E(\epsilon(v))$. Since the two
quantities we are most interested in are failure probability and energy, and the
failure-to-energy function describes a direct relationship between the two, henceforth we drop all
reference to the supply voltage (e.g., we denote ε(v) by ε). (ε, δ)-reliability for the 0-default
model is defined in the natural (analogous) way.
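The 0-default failure model described above can be simulated per wire. The following is a minimal sketch; the function name `zero_default_wires` and the use of a seeded `random.Random` instance are choices made here for reproducibility, not part of the model.

```python
import random


def zero_default_wires(bits, eps, rng):
    """Apply 0-default wire failures: a 1 becomes 0 with probability eps.

    A wire carrying 0 is never changed, so failures can only turn 1s into 0s.
    """
    received = []
    for bit in bits:
        if bit == 1 and rng.random() < eps:
            received.append(0)  # the wire failed and fell back to the default
        else:
            received.append(bit)
    return received


rng = random.Random(0)
sent = [1, 0, 1, 1, 0, 1, 1, 1]
received = zero_default_wires(sent, 0.3, rng)
```

Whatever the random draws are, every received bit is at most the bit that was sent.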
We consider bi-criteria approximations on energy and circuit failure.
Definition 39. For any circuit C and δ ∈ (0, 1), let $\epsilon^*_{C,\delta}$ be the solution to MCE(C, δ). An
algorithm is a (c, d)-approximation for MCE if on any input (C, δ) it outputs a value ε such
that C is (ε, dδ)-reliable and $E(\epsilon) \le c \cdot E(\epsilon^*_{C,\delta})$.
Note that a (c, 1)-approximation for MCE means that the approximation is only on
energy, i.e., the algorithm outputs an ε such that the circuit is (ε, δ)-reliable and the circuit
uses at most c times the energy of the circuit with the optimal choice of ε. Throughout
this chapter we generalize our failure-to-energy function to be $E(\epsilon) = \Theta(\log^{\alpha}(1/\epsilon))$ for some
α > 0.
5.1 POLYNOMIAL-TIME APPROXIMATION OF THE MINIMUM
CIRCUIT ENERGY PROBLEM
In this section we show in Theorem 40 that the approximation ratio of the traditional
algorithm, which sets ε = δ/s, is $O(\log^{\alpha} s)$. We can actually prove a slightly more general
bi-criteria approximation bound, in Theorem 41, that shows the trade-off on approximation
between energy and reliability for a generalization of the traditional approach. For the
0-default failure model, we require that the circuit is non-trivial in the sense that there is at
least one input that causes the output to be 0, and at least one input that causes the output
to be 1.
Theorem 40. In both the von Neumann and 0-default failure models, the traditional
approach is an $(O(\log^{\alpha} s), 1)$-approximation for the MCE problem on non-trivial circuits.
Theorem 41. Let w denote the total number of wires of the circuit C, that is, $w = \sum_{g \in C} n_g$,
and let ϕ denote the fan-in of the output gate of the circuit. In the 0-default failure model,
setting ε = δ/(βw), for any β ≥ 1, yields a $\bigl((2\phi^2/\log 2)^{\alpha} \log^{\alpha}(\beta w),\, 3/(2\beta)\bigr)$-approximate
solution for the MCE problem on non-trivial circuits. In the von Neumann failure model,
setting ε = δ/(βs), for any β ≥ 1, yields a $\bigl((2/\log 2)^{\alpha} \log^{\alpha}(\beta s),\, 3/(2\beta)\bigr)$-approximate solution
for the MCE problem.
Proof. We first prove Theorem 41 for the 0-default failure model. First, we will choose the
greatest value of ε for which we can prove that the desired bound on the error of the circuit
(that is, 3δ/(2β)) is satisfied. Since the probability that no wire in the circuit C fails is
$(1-\epsilon)^w$, it is sufficient to set ε such that
\[
(1-\epsilon)^w \ge 1 - \frac{3}{2\beta}\delta,
\]
that is
\[
\log(1-\epsilon) \ge \frac{\log\bigl(1 - \frac{3}{2\beta}\delta\bigr)}{w}. \tag{5.1}
\]
From standard calculus we know that
\[
\log(1-x) > -\tfrac{3}{2}x \quad \text{for } 0 < x \le 0.5828
\]
and
\[
\log(1-x) < -x \quad \text{for } x < 1 \text{ and } x \ne 0,
\]
and thus Inequality 5.1 is satisfied by setting ε = δ/(βw), since
\[
\log(1-\epsilon) = \log\Bigl(1 - \frac{\delta}{\beta w}\Bigr) > -\frac{3}{2}\,\frac{\delta}{\beta w} > \frac{\log\bigl(1 - \frac{3}{2\beta}\delta\bigr)}{w}.
\]
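The choice ε = δ/(βw) can be sanity-checked numerically against the target $(1-\epsilon)^w \ge 1 - 3\delta/(2\beta)$. The sketch below is illustrative only; the function name and the sampled parameter values are assumptions made here.

```python
def meets_reliability_target(delta, beta, w):
    """Check (1 - eps)^w >= 1 - 3*delta/(2*beta) for eps = delta/(beta*w)."""
    eps = delta / (beta * w)
    no_wire_fails = (1 - eps) ** w
    return no_wire_fails >= 1 - 1.5 * delta / beta


# A small grid of sample parameters (arbitrary values for illustration).
checks = [
    meets_reliability_target(delta, beta, w)
    for delta in (0.01, 0.1, 0.4)
    for beta in (1.0, 1.5, 4.0)
    for w in (1, 10, 1_000, 100_000)
]
```

Every combination in the grid satisfies the target, as the inequality chain above predicts.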
Then, we have to show that with this choice of ε the energy E used by the circuit is at
most a factor of $(2\phi^2/\log 2)^{\alpha} \log^{\alpha}(\beta w)$ of the energy E* used in an optimal solution. As for
the preceding theorem, to do this we determine an upper bound to the optimal solution ε*,
that is, the maximum value of ε for which the circuit is (ε, δ)-reliable, from which a lower
bound for the energy used in an optimal solution follows. We have two cases, depending on
whether the last gate $g_o$ of the circuit outputs 0 or 1 on input (0, 0, ..., 0). Consider the first
case, that is, $g_o(0, 0, \dots, 0) = 0$. The other case is symmetric. Since by hypothesis the
circuit is non-trivial, the circuit does not represent the constant function f′ = 0. Hence,
there must be at least one input I to the circuit C for which C(I) = 1. Let q denote the
probability that all the ϕ wires entering the output gate $g_o$ receive value 0 when the input
to the circuit is I. If we denote with p the probability that the circuit outputs the correct
bit when each of its wires fails with probability ε, then it holds that
\begin{align*}
1 - p &= \Pr[\text{circuit } C \text{ outputs the wrong bit}] \\
      &\ge \Pr[\text{circuit } C \text{ outputs the wrong bit on input } I] \\
      &= \Pr[\text{circuit } C \text{ outputs 0 on input } I] \\
      &= q \cdot 1 + (1-q) \cdot \Pr[g_o \text{ receives an input } x \text{ s.t. } g_o(x) = 0] \\
      &\ge q + (1-q) \cdot \Pr[g_o \text{ receives input } x = (0, 0, \dots, 0)] \\
      &\ge q + (1-q) \Pr[\text{all the } \phi \text{ input wires of gate } g_o \text{ fail}] \\
      &= q + (1-q)\epsilon^{\phi} \\
      &\ge \epsilon^{\phi},
\end{align*}
and therefore,
\[
p \le 1 - \epsilon^{\phi}.
\]
In an optimal solution it must be that
\[
p \ge 1 - \delta,
\]
and thus, combining the two previous inequalities, it must hold that
\[
1 - \delta \le 1 - (\epsilon^*)^{\phi},
\]
that is
\[
\epsilon^* \le \delta^{1/\phi}.
\]
This implies a lower bound of $s \log^{\alpha}(1/\delta^{1/\phi})$ for the optimal energy consumption E*.
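For the same reason, the energy consumption E of our approximate solution is $s \log^{\alpha}(\beta w/\delta)$.
Since δ < 1/2, β ≥ 1, we have,
\begin{align*}
E &= s \log^{\alpha}\Bigl(\frac{\beta w}{\delta}\Bigr) \\
  &= s \Bigl(\log(\beta w) + \log\frac{1}{\delta}\Bigr)^{\alpha} \\
  &\le s\, 2^{\alpha-1} \Bigl(\log^{\alpha}(\beta w) + \log^{\alpha}\frac{1}{\delta}\Bigr) \\
  &= s\, 2^{\alpha-1} \Bigl(\log^{\alpha}(\beta w) + \phi^{\alpha} \log^{\alpha}\frac{1}{\delta^{1/\phi}}\Bigr) \\
  &\le s\, 2^{\alpha-1} \Biggl(\log^{\alpha}(\beta w)\, \frac{\phi^{\alpha} \log^{\alpha}\frac{1}{\delta^{1/\phi}}}{\log^{\alpha} 2^{1/\phi}}
      + \log^{\alpha}(\beta w)\, \frac{\phi^{\alpha} \log^{\alpha}\frac{1}{\delta^{1/\phi}}}{\log^{\alpha} 2^{1/\phi}}\Biggr) \\
  &= s \Bigl(\frac{2\phi^2}{\log 2}\Bigr)^{\alpha} \log^{\alpha}(\beta w)\, \log^{\alpha}\frac{1}{\delta^{1/\phi}} \\
  &\le \Bigl(\frac{2\phi^2}{\log 2}\Bigr)^{\alpha} \log^{\alpha}(\beta w) \cdot E^*,
\end{align*}
where the first inequality follows from Jensen's inequality.
The proof for the von Neumann failure model is similar.
We can then prove Theorem 40, showing that the traditional approach is an
$(O(\log^{\alpha} s), 1)$-approximation, by using the same analysis with β = 3/2.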
5.2 HARDNESS OF APPROXIMATION FOR THE MINIMUM CIRCUIT
ENERGY PROBLEM
In this section we prove that it is NP-hard to obtain a significantly better approximation
than the $O(\log^{\alpha} s)$ obtained from the traditional approach.
Theorem 42. In both the von Neumann and 0-default failure models, it is NP-hard to
$(\log^{\alpha-\gamma} s, 1)$-approximate the MCE problem for any γ > 0.
Proof. The main idea of the proof is to show that for a satisfiable formula and an unsatisfiable
formula there is a large gap between the probabilities that the corresponding circuits compute
their output correctly. In particular, in the case of a satisfiable formula, we show that it is very
unlikely for the output of the circuit to be a 1. For technical reasons we restrict γ to γ ∈ (0, α). It is clear that
the problem is only computationally harder as γ increases. The proof makes use of some
technical facts stated after this proof.
Assume by contradiction that there exists a $(\log^{\alpha-\gamma} s, 1)$-approximate algorithm A. For
notational convenience, let $c = \log^{\alpha-\gamma} s$. Furthermore, let φ be an arbitrary 3SAT formula
with n variables and m clauses. Let $S_\phi$ be the natural circuit for φ that uses at most 3m NOT
gates to represent the negated variables, m OR gates of fan-in 3 to represent the clauses,
and a tree of m − 1 AND gates of fan-in 2 that computes the conjunction of all clauses. See
Figure 5 for an example.
Figure 5: The circuit $S_\phi$ where φ = (x1 ∨ ¬x2 ∨ x4) ∧ (x1 ∨ x2 ∨ x3) ∧ (¬x3 ∨ x6 ∨ x5) ∧ (x3 ∨ ¬x5 ∨ ¬x6).
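The gate count of the natural circuit can be computed directly from a clause list. The sketch below assumes one NOT gate per negated literal occurrence (the text only says "at most 3m NOT gates") and encodes literals as signed integers, which is a convention chosen here.

```python
def natural_circuit_size(clauses):
    """Gate count of the natural 3SAT circuit: NOT gates + OR gates + AND gates.

    Assumption: each negated literal occurrence gets its own NOT gate.
    """
    m = len(clauses)
    not_gates = sum(1 for clause in clauses for literal in clause if literal < 0)
    or_gates = m        # one fan-in-3 OR gate per clause
    and_gates = m - 1   # tree of fan-in-2 AND gates over the m clauses
    return not_gates + or_gates + and_gates


# The formula of Figure 5; a negative integer denotes a negated variable.
phi = [(1, -2, 4), (1, 2, 3), (-3, 6, 5), (3, -5, -6)]
m = len(phi)
size = natural_circuit_size(phi)
```

For this formula the count is 4 NOT gates, 4 OR gates, and 3 AND gates, which lies between 2m − 1 and 5m − 1.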
Let $s = |S_\phi|$, and note that 2m − 1 ≤ s ≤ 5m − 1. We choose ε such that
\[
\bigl(1 - \epsilon^{\sqrt[\alpha]{c}}\bigr)^{m+1} + 4\epsilon^{\sqrt[\alpha]{c}} < 1 - 8\epsilon \tag{5.2}
\]
and let δ = 8ε. We will show later that such an ε must exist. Now, consider the output of
A, $\epsilon_A$, on $S_\phi$ with input δ. We claim that φ is satisfiable if and only if $E(\epsilon_A) > c \log^{\alpha}(1/\epsilon)$. In
the first case, assume φ is satisfiable. Consider $S_\phi$ where each gate fails independently with
probability $\epsilon_A$ (or wire in the 0-default model), and the input x such that φ(x) = 1. Let $E_0$
be the event that each of the OR gates receives at least 1 positive input, $E_1$ be the event
that all of the OR gates output a 1 and $E_2$ be the event that $S_\phi$ outputs a 1. By Lemma 45,
we know that
\[
\Pr[E_2] \le (1 - \epsilon_A)^{m+1} + 4\epsilon_A.
\]
Further by (5.2), if $\epsilon_A = \epsilon^{\sqrt[\alpha]{c}}$ then we have
\[
\Pr[E_2] \le \bigl(1 - \epsilon^{\sqrt[\alpha]{c}}\bigr)^{m+1} + 4\epsilon^{\sqrt[\alpha]{c}} < 1 - 8\epsilon = 1 - \delta. \tag{5.3}
\]
Note that for $\epsilon_A \in [\epsilon^{\sqrt[\alpha]{c}}, 1/2)$, the quantity $(1 - \epsilon_A)^{m+1} + 4\epsilon_A$ is maximized at $\epsilon_A = \epsilon^{\sqrt[\alpha]{c}}$.
Therefore, we must have $\epsilon_A < \epsilon^{\sqrt[\alpha]{c}}$; otherwise, by (5.3), the probability that $S_\phi$ is correct
would not be at least 1 − δ, contradicting that A is a $(\log^{\alpha-\gamma}(s), 1)$-approximation. Further,
since E is decreasing, $E(\epsilon_A) > E(\epsilon^{\sqrt[\alpha]{c}}) = c \log^{\alpha}(1/\epsilon)$.
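Now, assume that φ is unsatisfiable. Consider $S_\phi$ with an arbitrary input x where each
gate fails independently with probability $\epsilon_A$ (or wire in the 0-default model). Note that since φ is
unsatisfiable, φ(x) = 0. Let the events $E_0$, $E_1$ and $E_2$ be defined in the same way as before.
Using the bounds on $\Pr[E_2 \mid E_1]$ and $\Pr[E_2 \mid \neg E_1]$ from the proof of Lemma 45 we have,
\begin{align*}
\Pr[E_2] &= \Pr[E_2 \mid E_1] \Pr[E_1] + \Pr[E_2 \mid \neg E_1] \Pr[\neg E_1] \\
         &= (1 - \epsilon_A) \Pr[E_1] + (4\epsilon_A) \Pr[\neg E_1] \\
         &= (1 - \epsilon_A)\bigl(\Pr[E_1 \mid E_0] \Pr[E_0] + \Pr[E_1 \mid \neg E_0] \Pr[\neg E_0]\bigr) + (4\epsilon_A) \Pr[\neg E_1] \\
         &\le (1 - \epsilon_A)\bigl((1 - \epsilon_A)^m (3\epsilon_A) + (1 - \epsilon_A)^{m-1} (\epsilon_A) \cdot 1\bigr) + (4\epsilon_A) \Pr[\neg E_1] \\
         &\le 8\epsilon_A.
\end{align*}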
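Therefore for all inputs, the probability that $S_\phi$ is correct is at least $1 - 8\epsilon_A$. So note
that if $\epsilon_A = \epsilon$, the probability $S_\phi$ is correct is at least 1 − δ. This shows that ε* ≥ ε. By
the definition of A being (c, 1)-approximate this means that $E(\epsilon_A) \le cE(\epsilon) = c \log^{\alpha}(1/\epsilon)$.
This shows that we can determine the satisfiability of φ using A. If $E(\epsilon_A) > c \log^{\alpha}(1/\epsilon)$, φ is
satisfiable, and otherwise if $E(\epsilon_A) \le c \log^{\alpha}(1/\epsilon)$, φ is not satisfiable. The last thing to do is
show the existence of an ε satisfying (5.2). Consider $\epsilon = \bigl(\frac{1}{m+1}\bigr)^{1/\sqrt[\alpha]{c}}$. Then,
\[
\bigl(1 - \epsilon^{\sqrt[\alpha]{c}}\bigr)^{m+1} + 4\epsilon^{\sqrt[\alpha]{c}}
\le e^{-\epsilon^{\sqrt[\alpha]{c}}(m+1)} + 4\epsilon^{\sqrt[\alpha]{c}}
= e^{-1} + \frac{4}{m+1}.
\]
Also, $1 - 8\epsilon = 1 - 8\bigl(\frac{1}{m+1}\bigr)^{1/\sqrt[\alpha]{c}}$. Note that, for m large enough,
$e^{-1} + \frac{4}{m+1} < 1 - 8\bigl(\frac{1}{m+1}\bigr)^{1/\sqrt[\alpha]{c}}$ since
\[
\lim_{m\to\infty} 8\Bigl(\frac{1}{m+1}\Bigr)^{1/\sqrt[\alpha]{c}}
\le \lim_{m\to\infty} 8\Bigl(\frac{1}{m+1}\Bigr)^{1/\log^{1-\gamma/\alpha} m} = 0.
\]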
We now state and prove some technical lemmas used in the above proof.
Lemma 43. The recurrence $p_i = p_{i-1}^2(1-\epsilon) + \epsilon(1 - p_{i-1}^2)$, $p_0 = 1$, satisfies $p_i \le p_{i-1}$ for all i.
Proof. By taking the derivative of $p_i$ with respect to $p_{i-1}$, we see it is increasing in $p_{i-1}$
and therefore $p_i \le p_{i-1}$ implies $p_{i+1} \le p_i$. Since $p_1 < p_0$, combining these facts gives that
$p_i \le p_{i-1}$ for all i.
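The monotonicity claimed in Lemma 43 is easy to observe by iterating the recurrence. The sketch below uses ε = 0.01 and 12 levels, both arbitrary values picked here for illustration.

```python
def and_tree_probs(eps, levels):
    """Iterate p_i = p_{i-1}^2 (1 - eps) + eps (1 - p_{i-1}^2) from p_0 = 1."""
    probs = [1.0]
    for _ in range(levels):
        prev_sq = probs[-1] ** 2
        probs.append(prev_sq * (1 - eps) + eps * (1 - prev_sq))
    return probs


probs = and_tree_probs(0.01, 12)
```

The sequence starts at 1, drops to 1 − ε after one level, and keeps decreasing toward a value close to ε.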
Lemma 44. Let $\epsilon = (1/(m+1))^{1/\sqrt[\alpha]{\log^{\alpha-\gamma}(s)}}$ and let $p_i = p_{i-1}^2(1-\epsilon) + \epsilon(1 - p_{i-1}^2)$, with
$p_0 = 1$. Then, for m bigger than some constant $M_0$,
\[
p_{\log_2 m} \le 3\epsilon.
\]
Proof. We first show that $p_{\log_2(1/\epsilon)} \le 1/\sqrt{2}$. Note that for $p_{i-1} \ge 1/\sqrt{2}$,
$p_i = p_{i-1}^2(1-2\epsilon) + \epsilon = p_{i-1}^2 - \epsilon(2p_{i-1}^2 - 1) \le p_{i-1}^2$. Therefore, since $p_1 = (1-\epsilon)$ it follows that for $p_i \ge 1/\sqrt{2}$,
$p_i \le (1-\epsilon)^{2^i}$. Note for $i = \log_2(1/\epsilon)$ we have $p_i \le (1-\epsilon)^{1/\epsilon} \to 1/e$ as ε → 0. It follows
that for some constant $M_0$ and $m \ge M_0$, $p_{\log_2(1/\epsilon)} \le 1/\sqrt{2}$. Further it is easy to see that
$p_{\log_2(1/\epsilon)+3} \le 1/8$ by expanding the recurrence and letting ε → 0.
The last thing to show is that in $\log_2 \log_2(1/\epsilon)$ additional steps we can go from 1/8 to 3ε.
Let $k = \log_2(1/\epsilon) + 3$. Then, this follows by noting that $p_{k+j} \le p_k^{2^j} + 2\epsilon$ for any j ≥ 0. To
see this note that it clearly holds for j = 0, and by induction if this holds for an arbitrary
j > 0, then
\[
p_{k+j+1} = p_{k+j}^2(1-2\epsilon) + \epsilon \le \bigl(p_k^{2^j} + 2\epsilon\bigr)^2 + \epsilon \le p_k^{2^{j+1}} + 3\epsilon.
\]
By solving $p_k^{2^j} = \epsilon$ we obtain that $p_{\log_2(1/\epsilon)+3+\log_2\log_2(1/\epsilon)} \le 3\epsilon$. Lastly, note that
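\begin{align*}
\log_2(1/\epsilon) + \log_2\log_2(1/\epsilon) + 3
 &\le 3\log_2(1/\epsilon) \\
 &= \frac{3\log_2(m+1)}{\sqrt[\alpha]{\log^{\alpha-\gamma} s}} \\
 &\le \frac{3\log_2(m+1)\,\sqrt[\alpha]{\log_2^{\alpha-\gamma} e}}{\sqrt[\alpha]{\log_2^{\alpha-\gamma} m}} \\
 &\le \frac{9\log_2^{\gamma/\alpha}(m)}{\sqrt[\alpha]{\log^{\alpha-\gamma} 2}}
\end{align*}
and so $p_{\log_2 m} \le 3\epsilon$, since by Lemma 43 $p_{\log_2 m} \le p_{9\log_2^{\gamma/\alpha}(m)/\sqrt[\alpha]{\log^{\alpha-\gamma} 2}}$.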
Lemma 45. Let φ be some satisfiable 3SAT formula with n variables and x be the input such
that φ(x) = 1. Then, in both the von Neumann model and the 0-default model, the probability
that $S_\phi$ outputs a 1 is bounded above by $(1-\epsilon)^{m+1} + 4\epsilon$, where $\epsilon = (1/(m+1))^{1/\sqrt[\alpha]{\log^{\alpha-\gamma}(s)}}$.
Proof. We first show this holds in the von Neumann failure model. Let $g_o$ be the output
gate of C (the root of the tree of AND gates). Further, let $E_0$ be the event that each of
the OR gates receives at least 1 positive input, $E_1$ be the event that all of the OR gates
output a 1 and $E_2$ be the event that $g_o$ outputs a 1. We first calculate $\Pr[E_2 \mid E_1]$. This is
the probability that the tree of m − 1 AND gates outputs a 1 when all the inputs to the
leaves are 1. Let $p_i$ be the probability that a gate on the ith level outputs a 1. We define
the input to the leaves to be at level 0. Note that $p_0 = 1$, and for i > 0, we can write $p_i$ as
a recurrence in the form,
\[
p_i = p_{i-1}^2(1-\epsilon) + \epsilon(1 - p_{i-1}^2).
\]
Further, since $p_1 = (1-\epsilon)$, and by Lemma 43, the sequence $p_i$ is decreasing as i → ∞, we
have that $\Pr[E_2 \mid E_1] \le (1-\epsilon)$. Next, we bound $\Pr[E_2 \mid \neg E_1]$. Let A denote the event that
$g_o$ receives two 1's as input. We have,
\begin{align*}
\Pr[E_2 \mid \neg E_1] &= \Pr[E_2 \mid \neg E_1 \wedge A] \Pr[A \mid \neg E_1] + \Pr[E_2 \mid \neg E_1 \wedge \neg A] \Pr[\neg A \mid \neg E_1] \\
                       &\le (1-\epsilon) \Pr[A \mid \neg E_1] + \epsilon.
\end{align*}
The last thing to do is bound Pr[A|¬E1]. Informally, we first argue that the probability
of getting a 1 to the root of the tree is only increased if E1 occurs, that is, if all leaves have
value 1. After that, we can use the recurrence to show that for sufficiently large trees this
probability is O(ε). More formally, for some fixed gate g′, let p_L be the probability the left
input is 1 and p_R be the probability the right input is 1. Then, if p_{g′} denotes the probability
g′ outputs a 1, we have
\[
p_{g'} = (p_L p_R)(1-\varepsilon) + \varepsilon(1 - p_L p_R).
\]
Taking the partial derivative with respect to p_L or p_R shows that p_{g′} will increase as p_L
or p_R increase. This implies that Pr[A|¬E1] ≤ Pr[A|E1], since for every leaf, the probability
of having a 1 will not decrease, and therefore by induction on the levels of the tree, every
gate will have an increased probability of outputting a 1. Let h be the height of the tree.
Then, note that Pr[A|E1] = p_h^2 ≤ p_h as defined by the recurrence in Lemma 43. However,
since h = log_2 m, by Lemma 44 p_{log_2 m} ≤ 3ε and therefore Pr[A|¬E1] ≤ 3ε and further,
Pr[E2|¬E1] ≤ 4ε. We are now ready to calculate the probability that Sφ(x) outputs a 1. We
have,
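\[
\begin{aligned}
\Pr[E_2] &= \Pr[E_2 \mid E_1]\Pr[E_1] + \Pr[E_2 \mid \neg E_1]\Pr[\neg E_1] \\
&= (1-\varepsilon)\Pr[E_1] + 4\varepsilon\Pr[\neg E_1] \\
&= (1-\varepsilon)\bigl(\Pr[E_1 \mid E_0]\Pr[E_0] + \Pr[E_1 \mid \neg E_0]\Pr[\neg E_0]\bigr) + 4\varepsilon\Pr[\neg E_1] \\
&\le (1-\varepsilon)\bigl((1-\varepsilon)^m \cdot 1 + (1-\varepsilon)^{m-1}(\varepsilon)\cdot 1\bigr) + 4\varepsilon \\
&\le (1-\varepsilon)^{m+1} + 4\varepsilon.
\end{aligned}
\]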
To see that this holds in the 0-default model, note that Pr[E2|¬E1] = 0 ≤ 4ε since a 0
wire will never flip to a 1. Using this we can make an identical calculation to the above to
get that Pr[E2] ≤ (1 − ε)^{m+1} + 4ε.
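The recurrence for the AND tree used in this proof can be checked numerically. The following Python sketch is illustrative only: it assumes that in the von Neumann model each gate independently outputs the wrong bit with probability ε, and the helper names `recurrence_p`, `sample_gate`, and `simulate_and_tree` are hypothetical, introduced here for the example. It compares the recurrence with a direct simulation of a perfect AND tree whose leaves are all 1.

```python
import random

def recurrence_p(levels, eps):
    """Iterate p_i = p_{i-1}^2 (1 - eps) + eps (1 - p_{i-1}^2), starting from p_0 = 1."""
    p = 1.0
    history = [p]
    for _ in range(levels):
        both_one = p * p  # the two inputs come from disjoint subtrees, hence independent
        p = both_one * (1 - eps) + eps * (1 - both_one)
        history.append(p)
    return history

def sample_gate(level, eps, rng):
    """Sample the output of one AND gate at the given level when every leaf is 1."""
    if level == 0:
        return 1  # level 0 is the all-ones input
    left = sample_gate(level - 1, eps, rng)
    right = sample_gate(level - 1, eps, rng)
    out = left & right
    if rng.random() < eps:  # assumed failure mode: the gate flips its output
        out ^= 1
    return out

def simulate_and_tree(levels, eps, trials, seed=0):
    """Estimate the probability that the root of the tree outputs 1."""
    rng = random.Random(seed)
    ones = sum(sample_gate(levels, eps, rng) for _ in range(trials))
    return ones / trials
```

For example, `recurrence_p(4, 0.1)[-1]` and `simulate_and_tree(4, 0.1, 20000)` should agree up to sampling error, and the list returned by `recurrence_p` is decreasing, as the proof requires.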
We end by noting that a slight modification of the proof of Theorem 42 can be used to
prove the following more general theorem.
Theorem 46. It is NP-hard to (c, d)-approximate the MCE problem in both the von Neumann
and 0-default failure models for all c > 1 and d such that
\[
8d\left(\frac{1}{m+1}\right)^{1/\sqrt[\alpha]{c}} \to 0 \quad\text{as } m \to \infty.
\]
5.3 HARDNESS OF DETERMINING (ε, δ)-RELIABILITY ON FIXED
INPUTS
In this section we prove the following theorem.
Theorem 47. In the 0-default failure model, given ε, δ, C, and I, it is NP-Hard to determine
if C is (ε, δ)-reliable on I.
The section proceeds as follows. Our reduction is from the gap-3SAT problem, which is
known to be NP-Hard for certain parameters, so we begin by formally defining this problem.
We then bound the probability that the natural 3SAT circuit, Sφ, outputs a 1 when given a
random input both when φ is satisfiable, and when at most a 15/16 fraction of the clauses of
φ are satisfiable. Finally, we introduce a circuit Nk that, in the presence of failures, can be
used to randomize our input.
First we must introduce the gap-3SAT[α, β] problem (with α ≤ β), as the NP-hardness
reduction will be from this problem. The problem is as follows: Given a 3SAT instance,
output “YES” if at least a β fraction of the clauses are satisfiable, “NO” if at most an α
fraction of the clauses are satisfiable, and either “YES” or “NO” otherwise (i.e., such inputs
are not given). The hardness of this problem for certain values of α and β follows from the
PCP Theorem [7], and in particular, Håstad proved the following theorem, giving the best
possible values for α and β.
Theorem 48 (Håstad [19]). Gap-3SAT[7/8 + ε, 1] is NP-Hard for all ε > 0.
The reduction is from the hardness of gap-3SAT[15/16,1]. We use as our main circuit
the standard 3SAT circuit Sφ used elsewhere in this paper (see Figure 5 and the related
discussion). As we have seen, if the tree of AND gates does not receive all 1’s, then with
probability 1 the output is 0. Thus, intuitively, if we could give Sφ a random input, then
(i) if φ is satisfiable, on the satisfying input Sφ is much more likely to output a 1 than on
any other input, and (ii) if φ is not satisfiable, then any assignment satisfies a fraction of at
most 15/16 of the clauses, so a large number (for example, at least a 1/16 fraction) of wires
would have to fail for Sφ to be likely to output a 1. We first bound the probability that Sφ
outputs a 1 when receiving an almost random input in the two cases when there exists a
satisfying assignment and when at most a 15/16 fraction of the clauses can be satisfied. We
then show that it is possible with a polynomially sized circuit to create an almost random
input from a fixed input, and use this to complete the reduction.
Lemma 49. Let φ be a 3SAT formula and Sφ be the circuit for φ, where each wire fails
independently with probability ε. Suppose that each input to Sφ is a 1 with probability at least
1/2 − γ and at most 1/2 + γ. Then, in the 0-default failure model:
1. If φ is satisfiable, then
\[
\Pr[S_\phi \text{ outputs a 1}] \ge \left(\tfrac{1}{2} - \gamma\right)^{n} (1-\varepsilon)^{5m}.
\]
2. If at most a 15/16 fraction of the clauses of φ are satisfiable, then
\[
\Pr[S_\phi \text{ outputs a 1}] \le (3\varepsilon)^{m/16}.
\]
Proof. Let O be the random output of the circuit Sφ and A be the random event that the
tree of AND gates of Sφ receives all 1’s as input. Then clearly O = 1 if A occurs and none
of the wires within the tree of AND gates fail, and O = 0 otherwise. Therefore,
\[
\Pr[O = 1] = (1-\varepsilon)^{2m-1}\,\Pr[A].
\]
1. φ is satisfiable. Let E be the event that Sφ receives a satisfying assignment as input.
The probability that E occurs is at least (1/2 − γ)^n, since this is a lower bound on the
probability of Sφ receiving any fixed input. Further, if none of the wires entering the OR
gates fail (the wires entering NOT gates in the clauses can only fail and output 1, which only
increases the probability that O = 1), then A occurs, so
\[
\Pr[A \mid E \wedge \phi \text{ is satisfiable}] \ge (1-\varepsilon)^{3m}.
\]
Clearly, the probability that A occurs if Sφ does not receive a satisfying assignment as
input is at least 0, so the first statement of the lemma follows since
\[
\begin{aligned}
\Pr[O = 1 \mid \phi \text{ is satisfiable}]
&\ge (1-\varepsilon)^{2m-1}\Pr[A \mid \phi \text{ is satisfiable}] \\
&\ge (1-\varepsilon)^{2m-1}\Pr[E \mid \phi \text{ is satisfiable}]\cdot\Pr[A \mid E \wedge \phi \text{ is satisfiable}] \\
&\ge \left(\tfrac{1}{2} - \gamma\right)^{n}(1-\varepsilon)^{5m-1}.
\end{aligned}
\]
2. At most a 15/16 fraction of the clauses of φ are satisfiable. For this case, every
assignment satisfies at most a 15/16 fraction of the clauses. Thus an upper bound on the
probability that A occurs is the probability that at least one of the wires associated with
NOT gates in every clause that is not satisfied fails (if a wire entering an OR gate fails the
gate will output 0), and all other gates do not fail. Thus we have that
\[
\Pr[A \mid \phi \text{ is not satisfiable}] \le (3\varepsilon)^{m/16},
\]
and therefore
\[
\Pr[O = 1 \mid \phi \text{ is not satisfiable}] \le (3\varepsilon)^{m/16}.
\]
The following circuit will be useful in the reduction.
Definition 50. Nk is the circuit consisting of one input bit connected to a single line of k
NOT gates, i.e., the output of the ith NOT gate is the input to the (i + 1)st NOT gate, for
i ∈ [k − 1].
If no gate in Nk fails, the output on input bit b is (b + k) mod 2. However, if each of
these gates fails independently with probability ε, then the output is random and, for k large
enough, will be b with probability very close to 1/2. Consider the Markov chain M with two
states that correspond to the output bit after a certain number of NOT gates, and transitions
with probabilities based on whether or not the wire entering the current NOT gate fails. If
we label one state “1” and the other “0”, then the output of Nk is identical to the output
of starting M in state b and running for k steps. The transition from the 0 state to the 1
state happens with probability 1, since the wire cannot fail in this case. On the other hand,
the transition from the 1 state to the 0 state only happens with probability 1 − ε, and the
chain stays in the 1 state with probability ε. It is easy to verify that this chain is irreducible,
aperiodic, and reversible. The transition matrix is
\[
M = \begin{pmatrix} 0 & 1-\varepsilon \\ 1 & \varepsilon \end{pmatrix}.
\]
The eigenvalues of M are 1 and ε − 1, and the stationary distribution of M is (1 − ε)/(2 − ε)
in state 0, and 1/(2 − ε) in state 1, so the number of steps k(ρ) until we are ρ away from the
stationary distribution is
\[
k(\rho) \le \frac{1}{\varepsilon}\log\left(\frac{2-\varepsilon}{\rho(1-\varepsilon)}\right).
\]
For a more in-depth discussion of Markov chains and mixing times, see, e.g., [23]. By setting
ρ = 0.05, we obtain the following observation.
Observation 51. Suppose each wire of Nk fails independently with probability ε < 1/10.
Then in the 0-default failure model if k ≥ log(44)/ε, we have that 0.4 ≤ Pr[Nk(b) = b] ≤ 0.6.
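Observation 51 can be sanity-checked by iterating the two-state chain directly. The Python sketch below is a minimal illustration under stated assumptions: the logarithm in the bound is taken to be the natural logarithm, and the function names `nk_output_distribution` and `steps_for_observation` are hypothetical helpers introduced for this example.

```python
import math

def nk_output_distribution(b, k, eps):
    """Return (Pr[output = 0], Pr[output = 1]) for the chain of k NOT gates on input b."""
    p0, p1 = (1.0, 0.0) if b == 0 else (0.0, 1.0)
    for _ in range(k):
        # From state 0 the chain always moves to 1; from state 1 it moves
        # to 0 with probability 1 - eps and stays in 1 with probability eps.
        p0, p1 = (1 - eps) * p1, p0 + eps * p1
    return p0, p1

def steps_for_observation(eps):
    """Smallest integer k with k >= log(44) / eps (natural log assumed)."""
    return math.ceil(math.log(44) / eps)
```

With, say, ε = 0.05 the chain needs `steps_for_observation(0.05)` steps, after which both starting states give an output distribution close to the stationary one, ((1 − ε)/(2 − ε), 1/(2 − ε)).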
We can now finish the reduction.
Proof of Theorem 47. The reduction is from gap-3SAT[15/16,1]. Let φ be a 3SAT formula
that is either satisfiable or at most a 15/16 fraction of the clauses can be satisfied. Without
loss of generality, we can assume the assignments of all 1’s and all 0’s do not satisfy φ, and
that there are at least n clauses in φ. We set ε = 1.4 × 10^{−7} (a constant). Construct a circuit
S′φ that is Sφ except that each input first passes through an Nk circuit, where k = ⌈log(44)/ε⌉,
and thus S′φ is polynomial in size and logarithmic in depth. We fix the input to this circuit
to be the input of all 1’s, so the correct output of S′φ is 0. By Observation 51, the output of
each Nk circuit is 1 with probability at least 1/2 − γ and at most 1/2 + γ for γ = 0.1. We set
δ = (3ε)^{m/16}. By Lemma 49 (since S′φ is incorrect if it outputs 1), if we show that
\[
(3\varepsilon)^{m/16} < (0.4)^{n}(1-\varepsilon)^{5m} \tag{5.4}
\]
then it is NP-Hard to determine whether or not S′φ outputs correctly with probability at
least 1 − δ. Raising both sides to the power 16/m and noting that n ≤ m, we obtain that
\[
3\varepsilon < (0.4)^{16}(1-\varepsilon)^{80}
\]
implies that (5.4) holds. It is easy to verify that the choice of ε satisfies this inequality.
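The final arithmetic step can be verified directly. The short Python sketch below is only a numerical check of the two inequalities for the stated constant ε = 1.4 × 10^{−7}; the function names are hypothetical and introduced for this example.

```python
import math

EPS = 1.4e-7  # the constant chosen in the proof

def simplified_inequality_holds(eps):
    """Check 3*eps < 0.4^16 * (1 - eps)^80."""
    return 3 * eps < 0.4 ** 16 * (1 - eps) ** 80

def inequality_5_4_holds(n, m, eps):
    """Check (3 eps)^(m/16) < 0.4^n (1 - eps)^(5m), compared in log space to avoid underflow."""
    lhs = (m / 16) * math.log(3 * eps)
    rhs = n * math.log(0.4) + 5 * m * math.log(1 - eps)
    return lhs < rhs
```

The margin is small: 3ε = 4.2 × 10^{−7}, while (0.4)^{16} is roughly 4.29 × 10^{−7}, so a slightly larger constant such as 1.5 × 10^{−7} would already violate the simplified inequality.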
In the von Neumann failure model, we were unable to prove that determining if a circuit
is (ε, δ)-reliable is NP-Hard. The difficulty with following this same proof structure stems
from two conflicting constraints. The first constraint is that the probability that S′φ outputs
a 1 when at most a 15/16 fraction of the clauses of φ are satisfiable must be smaller than
the probability S′φ outputs a 1 when φ is satisfiable. In the 0-default failure model, if the
tree of AND gates in S′φ received anything but all 1’s as input, the circuit would output 0.
In the von Neumann failure model this is not the case, since any of the AND gates (e.g., the
output gate) can fail and incorrectly output a 1 instead of a 0. Thus in the von Neumann
failure model, the tree of AND gates has a higher probability of outputting a 1 if 15/16
of its inputs are 1’s than if very few of its inputs are 1’s, and this difference in probability
is polynomial in ε. Since in the case when φ is satisfiable we can only guarantee that the
output of the Nk circuits is the satisfying assignment with probability approximately 2^{−n},
we need to require ε to be exponentially small in order to guarantee that S′φ has a higher
probability of outputting a 1 when it is satisfiable than when at most a 15/16 fraction of
the clauses are satisfiable. The second constraint is that we need k ≥ 1/ε in order for Nk to
output a random bit. Thus if ε is exponentially small, the circuit S′φ will be exponentially
large, and so the reduction will not be polynomial time.
5.4 TREE CIRCUITS
There are classes of circuits for which the problems discussed in this paper are much easier,
namely circuits whose graph representation is a tree. The hardness results in this paper stem
from the fact that, in general, the undirected version of the DAG representing a circuit C
may contain cycles. When this is not the case, then the probability that a gate g outputs
a 1 or a 0 is dependent only on the outcomes of the immediate predecessors of g in C, and
thus the situation is much simpler. Given a circuit C that is a tree and where each gate has
bounded fan-in, we describe below how to, in both the von Neumann and 0-default failure
models, answer the question of whether C is (ε, δ)-reliable in time polynomial in the size of
C (it can be seen that polynomial complexity can be achieved also in slightly more general
settings, e.g., when the circuit’s structure is “close to” a tree).
The algorithm is as follows: Each gate g stores four probabilities:
1. The highest probability that g is correct given that its correct output is 1.
2. The lowest probability that g is correct given that its correct output is 1.
3. The highest probability that g is correct given that its correct output is 0.
4. The lowest probability that g is correct given that its correct output is 0.
Let ϕ be the fan-in of g, and let g_1, . . . , g_ϕ be the parents of g. By choosing one of the
stored probabilities from each of g’s parents, we can in O(2^ϕ) steps calculate the probability
that g outputs a 1 in that case, and the correct output of g in that case can be computed
from the correct outputs for the probabilities chosen from g’s parents. Since there are 4^ϕ
ways to choose one stored probability from each of g’s parents, we calculate all of these
probabilities. Of those where the correct output of g is a 1, we find and store the highest
and lowest probabilities that g does output a 1, and do the same for those where the correct
output of g is a 0. At the output gate, we find the minimum of the lowest probability that
g is correct given that its correct output is 1, and the lowest probability that g is correct
given that its correct output is 0. This value determines the minimum value for δ given that
functional failures occur with probability ε. It is straightforward to see how this algorithm
could be modified slightly to find the input to the circuit that minimizes the probability of
correctness when functional failures occur with probability ε.
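The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation, and it rests on assumptions not fixed by the text: only AND, OR, and NOT gates are supported, the von Neumann model is taken to mean that each gate independently emits the wrong bit with probability ε, input bits are always correct, and the tuple-based circuit encoding and the names `analyze` and `min_delta` are hypothetical.

```python
from itertools import product

LEAF = "x"  # an input bit: always correct, and may be set to 0 or 1

def gate_value(op, vals):
    """Fault-free output of a gate on the given parent values."""
    if op == "AND":
        return int(all(vals))
    if op == "OR":
        return int(any(vals))
    if op == "NOT":
        return 1 - vals[0]
    raise ValueError(op)

def prob_output_one(op, q, eps):
    """Pr[gate outputs 1] when parent i outputs 1 independently with probability q[i]."""
    total = 0.0
    for vals in product((0, 1), repeat=len(q)):  # the O(2^fan-in) enumeration
        weight = 1.0
        for v, qi in zip(vals, q):
            weight *= qi if v == 1 else 1 - qi
        ideal = gate_value(op, vals)
        # assumed failure mode: the gate emits the wrong bit with probability eps
        total += weight * ((1 - eps) if ideal == 1 else eps)
    return total

def analyze(node, eps):
    """Map each achievable correct output v to (highest, lowest) Pr[node is correct]."""
    if node == LEAF:
        return {0: (1.0, 1.0), 1: (1.0, 1.0)}
    op, parents = node
    options = []
    for table in (analyze(p, eps) for p in parents):
        opts = set()
        for v, (hi, lo) in table.items():
            for p_correct in (hi, lo):
                # turn "probability of being correct" into "probability of outputting 1"
                opts.add((v, p_correct if v == 1 else 1 - p_correct))
        options.append(sorted(opts))
    result = {}
    for combo in product(*options):  # at most 4^fan-in combinations
        v = gate_value(op, [c[0] for c in combo])
        p_one = prob_output_one(op, [c[1] for c in combo], eps)
        p_correct = p_one if v == 1 else 1 - p_one
        hi, lo = result.get(v, (p_correct, p_correct))
        result[v] = (max(hi, p_correct), min(lo, p_correct))
    return result

def min_delta(circuit, eps):
    """Smallest delta for which the tree is (eps, delta)-reliable, under the assumptions above."""
    return 1 - min(lo for _, lo in analyze(circuit, eps).values())
```

As a usage example, the tree `("AND", [("OR", [LEAF, LEAF]), ("OR", [LEAF, LEAF])])` with ε = 0.1 gives `min_delta` equal to 3ε(1 − ε)^2 + ε^2(1 − ε) = 0.252, the worst case being the inputs on which both OR gates should output 1.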
To see why this algorithm is correct, consider the situation where all but the ith parent,
gi, of some gate g output a 1 with fixed probability. In this case, the probability that g
outputs a 1 is linear in the probability that gi outputs a 1, and thus this probability is
monotonically increasing, monotonically decreasing, or constant, as the probability that gi
outputs a 1 increases. Further, since the circuit is a tree, changing the input to the subtree
rooted at gi does not affect the probability that any of the other parents of g output a 1.
Thus we can compute the highest and lowest probabilities that g will output a 1 by some
combination of the highest and lowest probabilities that its parents will output a 1. Since
we do not know what the correct output for g should be on the input that causes the circuit
to be incorrect with highest probability, we store these probabilities in the cases when the
correct output of g is either 1 or 0.
5.5 NON-MONOTONICITY OF δ IN ε
For any circuit C, let δ∗(ε) be the smallest value such that C is (ε, δ∗(ε))-reliable. A question
that one might ask is whether δ∗(ε) is, in general, a non-decreasing function of ε. In both
the von Neumann and 0-default failure models, this is not the case. The circuit depicted in
Figure 6 provides an example for the von Neumann failure model. For this circuit, we obtain
different bounds on δ∗(ε) depending on the input to the circuit. The three cases are based
on how many of the OR gates receive a 1 as input.
\[
\begin{aligned}
\delta^*(\varepsilon) &\ge \Pr[y \ne (x_1 \vee x_2) \wedge (x_3 \vee x_4) \mid (x_1 \vee x_2) = 1 \text{ and } (x_3 \vee x_4) = 1] \\
&= 3\varepsilon(1-\varepsilon)^2 + \varepsilon^2(1-\varepsilon), \\
\delta^*(\varepsilon) &\ge \Pr[y \ne (x_1 \vee x_2) \wedge (x_3 \vee x_4) \mid (x_1 \vee x_2) = 1 \text{ and } (x_3 \vee x_4) = 0, \text{ or } (x_1 \vee x_2) = 0 \text{ and } (x_3 \vee x_4) = 1] \\
&= 2\varepsilon(1-\varepsilon)^2 + \varepsilon^2(1-\varepsilon) + \varepsilon^3, \\
\delta^*(\varepsilon) &\ge \Pr[y \ne (x_1 \vee x_2) \wedge (x_3 \vee x_4) \mid (x_1 \vee x_2) = 0 \text{ and } (x_3 \vee x_4) = 0] \\
&= \varepsilon(1-\varepsilon)^2 + 3\varepsilon^2(1-\varepsilon).
\end{aligned}
\]
Since δ∗(ε) is the maximum of the previous three bounds, it is easy to see that for ε ≤ 1/2,
\[
\delta^*(\varepsilon) = 3\varepsilon(1-\varepsilon)^2 + \varepsilon^2(1-\varepsilon),
\]
which is strictly decreasing on (a, 1), where a = (5 − √7)/6 ≈ 0.39. Intuitively, this happens
because in such a circuit, when ε increases, it is more likely that the errors occurring at the
two gates cancel each other out.
[Figure 6 (circuit diagram): inputs x1, x2, x3, x4; output y = (x1 ∨ x2) ∧ (x3 ∨ x4).]
Figure 6: A simple circuit where δ∗(ε) is not monotone in ε in the von Neumann failure
model, consisting of two OR gates and one AND gate.
[Figure 7 (circuit diagram): inputs x1, x2; output y = ¬(¬(x1 ∧ x2)).]
Figure 7: A simple circuit where δ∗(ε) is not monotone in ε in the 0-default failure model,
consisting of an AND gate and two NOT gates.
Figure 7 depicts an example where δ∗(ε) is not monotone in the 0-default failure model.
Here there are only two cases to bound δ∗(ε), since if the inputs are not both 1, the output
of the AND gate is a 0 with probability 1.
\[
\begin{aligned}
\delta^*(\varepsilon) &\ge \Pr[y \ne x_1 \wedge x_2 \mid x_1 = 1 \text{ and } x_2 = 1] \\
&= (1 - (1-\varepsilon)^2)(1-\varepsilon) + \varepsilon(1-\varepsilon)^3 \\
&= 1 - \varepsilon - (1-\varepsilon)^4, \\
\delta^*(\varepsilon) &\ge \Pr[y \ne x_1 \wedge x_2 \mid x_1 = 0 \text{ or } x_2 = 0] \\
&= \varepsilon.
\end{aligned}
\]
Since δ∗(ε) is the maximum of the previous two bounds, we have that for ε ≤ 0.45,
\[
\delta^*(\varepsilon) = 1 - \varepsilon - (1-\varepsilon)^4,
\]
which is strictly decreasing on (b, 1), where b = 1 − 4^{−1/3} ≈ 0.37.
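The closed forms in this section are easy to evaluate numerically. The Python sketch below simply encodes the case bounds stated above for the two example circuits (the function names `delta_star_fig6` and `delta_star_fig7` are hypothetical) and the two turning points a and b.

```python
def delta_star_fig6(eps):
    """Maximum of the three case bounds for y = (x1 OR x2) AND (x3 OR x4), von Neumann model."""
    both_or_one = 3 * eps * (1 - eps) ** 2 + eps ** 2 * (1 - eps)
    one_or_one = 2 * eps * (1 - eps) ** 2 + eps ** 2 * (1 - eps) + eps ** 3
    no_or_one = eps * (1 - eps) ** 2 + 3 * eps ** 2 * (1 - eps)
    return max(both_or_one, one_or_one, no_or_one)

def delta_star_fig7(eps):
    """Maximum of the two case bounds for y = NOT(NOT(x1 AND x2)), 0-default model."""
    both_inputs_one = 1 - eps - (1 - eps) ** 4
    some_input_zero = eps
    return max(both_inputs_one, some_input_zero)

A = (5 - 7 ** 0.5) / 6    # turning point for Figure 6, about 0.39
B = 1 - 4 ** (-1 / 3)     # turning point for Figure 7, about 0.37
```

Evaluating either function slightly above its turning point gives a smaller value than at the turning point itself, which is the non-monotonicity claimed; for instance `delta_star_fig7(0.45)` is below `delta_star_fig7(B)`.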
6.0 ALMOST ALL FUNCTIONS REQUIRE EXPONENTIAL ENERGY
In this chapter, we show that almost all functions require circuits using exponential energy
in the exact failure model. We first show that directly applying Shannon’s argument that
almost all functions require circuits with exponentially many gates is not sufficient, as in the
exact failure model, there are circuits that can compute a logarithmic number of functions
using homogeneous voltage supplies, and an exponential number of functions if allowed
heterogeneous voltage supplies. We then prove sufficiently small upper bounds on the number
of functions a single circuit can compute, both with homogeneous and heterogeneous voltage
supplies, and thus show that almost all functions require exponential energy.
6.1 A LOWER BOUND ON THE NUMBER OF FUNCTIONS
COMPUTABLE BY A CIRCUIT
In this section we show that in the exact failure model, in both the homogeneous case and the
heterogeneous case, a single circuit can reliably compute many different functions. Both of
these lower bounds demonstrate that Shannon’s counting argument will not be sufficient to
show that almost all functions require exponential energy. The lower bound on the number
of such functions is much stronger in the heterogeneous case, and thus also demonstrates the
power that heterogeneity affords the circuit designer.
6.1.1 Homogeneous Supply Voltages
We start with the homogeneous case, giving an explicit construction of a circuit that computes
approximately log n different functions in the exact failure model. The key concept used
throughout is that for a large enough perfect binary tree of AND gates (referred to as an
AND tree) there is some ε such that, regardless of the input, the tree will output 0 with high
probability. By combining such trees of different sizes into a single circuit we can essentially
ignore different parts of the input depending on ε. The statement and proof are formalized
below, after the statement of a technical lemma that we need in the proof of our result.
Lemma 52. Let ε ≤ 1/10 and p_0 = 1, and let p_i = p_{i−1}^2 (1 − 2ε) + ε. Then, for i ≥
log(1/ε) + log log(1/ε) + 5, p_i ≤ 3ε.
Proof. It is straightforward to show that since p_0 > ε, we have p_i ≥ p_{i+1} for all i. We first
show that p_{log(1/ε)+1} ≤ 1/√2. Note that, for p_{i−1} ≥ 1/√2, we have
\[
p_i = p_{i-1}^2(1 - 2\varepsilon) + \varepsilon = p_{i-1}^2 - \varepsilon(2p_{i-1}^2 - 1) \le p_{i-1}^2.
\]
Therefore, since p_1 = (1 − ε) it follows that, for p_i ≥ 1/√2, it holds that
p_{i+1} ≤ (1 − ε)^{2^{i−1}}. Note that for i = log(1/ε) + 1 we have p_i ≤ 1/√2, as otherwise
we would reach a contradiction since we would have p_i ≤ (1 − ε)^{1/ε} ≤ 1/e ≤ 1/√2. It is easy
to see that p_{log(1/ε)+5} ≤ 1/8.
We now show that p_{log(1/ε)+5+log log(1/ε)} ≤ 3ε. Let k = log(1/ε) + 5. We show that
p_{k+j} ≤ p_k^{2^j} + 2ε for any j ≥ 0. To see this note that it clearly holds for j = 0, and by
induction if this holds for an arbitrary j > 0, then
\[
p_{k+j+1} = p_{k+j}^2(1 - 2\varepsilon) + \varepsilon \le \left(p_k^{2^j} + 2\varepsilon\right)^2 + \varepsilon \le p_k^{2^{j+1}} + 2\varepsilon,
\]
since ε ≤ 1/10 and p_k ≤ 1/8. By solving p_k^{2^j} = ε we obtain that p_{k+log log(1/ε)} ≤ 3ε.
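Lemma 52 can be checked numerically by iterating the recurrence. The Python sketch below is illustrative only; it assumes the logarithms are base 2 (consistent with tree heights), and the helper names `lemma52_sequence` and `lemma52_threshold` are hypothetical.

```python
import math

def lemma52_sequence(eps, steps):
    """Return [p_0, ..., p_steps] for p_i = p_{i-1}^2 (1 - 2 eps) + eps with p_0 = 1."""
    p = 1.0
    seq = [p]
    for _ in range(steps):
        p = p * p * (1 - 2 * eps) + eps
        seq.append(p)
    return seq

def lemma52_threshold(eps):
    """Smallest integer i with i >= log2(1/eps) + log2(log2(1/eps)) + 5."""
    log_inv = math.log2(1 / eps)
    return math.ceil(log_inv + math.log2(log_inv) + 5)
```

For ε = 0.01 the threshold is 15 levels, and the sequence is already below 3ε well before that level, so the lemma's bound is comfortable rather than tight in this instance.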
Theorem 53. In the exact failure model, for any δ ∈ (0, 1/2) and n ∈ N, there exists
a circuit C with n inputs and size O(n) that computes
\[
\Omega\left(\frac{\log n}{\log\left(\frac{1}{\delta}\log n\right)}\right)
\]
different functions (1 − δ)-reliably.
Proof. The circuit, which we indicate with C, consists of k perfect binary trees of AND
gates, which we refer to as AND_1, . . . , AND_k, and of a complete binary tree of OR gates,
denoted OR_1. The size of AND_i, which will be determined later but decreases exponentially
as i increases, is denoted by s_i, and the size of OR_1 is k − 1. Each AND tree receives its
own set of input bits. The outputs of these k trees are fed into the tree of OR gates, and
the output of the latter tree is the output of the circuit (see Figure 8). Thus, when ε = 0,
the circuit C computes OR(AND_1, . . . , AND_k).
[Figure 8 (circuit diagram): the AND trees AND_1, . . . , AND_{k−1}, AND_k feed the OR tree OR_1, whose output is f.]
Figure 8: The circuit used in the proof of Theorem 53.
The high level approach is to show that as ε grows larger, the larger AND trees switch
from computing the AND function to computing the 0 function. In other words, the result
is completely determined by the remaining functional AND trees. By choosing the sizes
s_i to be sufficiently different, we can show that each AND_i will switch to computing the 0
function at a different ε, and further, when this switch occurs all of the smaller trees will
still be functioning correctly with high probability.
Before diving into the technical details, we show the two main bounds we need to
hold and how with these we can prove the desired theorem. We will use ε_i to denote the
minimum gate failure probability at which AND_i begins to compute the 0 function with
high probability. This value along with the size of each AND_i will be given later. Let
f_i(I) = OR(AND_{i+1}(I), . . . , AND_k(I)), i.e., the function we are trying to compute. The
first result we will need is an upper bound on the probability that the largest AND trees,
AND_1, . . . , AND_i, output a 1. Let E_1^j be the event that AND_j outputs a 1 on any input
when gates are failing with probability ε_i. For all j, 1 ≤ j ≤ i, we will show that
\[
\Pr[E_1^j] \le \frac{\delta}{2k}. \tag{6.1}
\]
The other inequality we will need bounds the probability that any gate in AND_{i+1}, . . . ,
AND_k fails. More specifically, let E_2^j be the event that some gate in AND_j fails. We will
show that for all j, i < j ≤ k,
\[
\Pr[E_2^j] \le \frac{\delta}{2k+2}. \tag{6.2}
\]
Let E_3 be the event that any gate in OR_1 fails. We will show that Pr[E_3] ≤ Pr[E_2^k].
The result then follows by applying the union bound over all of the trees and combining the
two previous inequalities. That is,
\[
\begin{aligned}
\Pr[C(I) = f_i(I)] &\ge \Pr[\neg(E_1^1 \vee \dots \vee E_1^i) \wedge \neg(E_2^{i+1} \vee \dots \vee E_2^k \vee E_3)] \\
&\ge (1 - \delta/2)(1 - \delta/2) \\
&> 1 - \delta.
\end{aligned}
\]
To complete the proof we will show that for specific values of ε_i and s_i, Inequality 6.1
and Inequality 6.2 both hold. Let s_i = (6k/δ)^{2(k−i+1)} and let ε_i = (δ/6k)^{2(k−i)+1}. (For
the sake of simplicity we assume that s_i is a power of two, minus one.) Fix i,
and consider when each gate in C fails independently with probability ε_i. We first prove
Inequality 6.2. The basic idea is that 1/ε_i is much larger than s_{i+1}, and thus the probability
that any gate will fail is quite small. More formally, note that by the union bound, since
each gate fails with probability ε_i and there are s_j gates,
\[
\begin{aligned}
\Pr[E_2^j] &\le \varepsilon_i s_j \\
&= \left(\frac{\delta}{6k}\right)^{2(k-i)+1}\left(\frac{\delta}{6k}\right)^{-2(k-j+1)} \\
&\le \left(\frac{\delta}{6k}\right)^{2(k-i)+1}\left(\frac{\delta}{6k}\right)^{-2(k-(i+1)+1)} \\
&\le \frac{\delta}{2k+2}.
\end{aligned}
\]
The fact that Pr[E_3] ≤ Pr[E_2^k] follows from noting that OR_1 has k − 1 gates, which is less
than s_k = (6k/δ)^2.
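The choice of s_i and ε_i can be checked with exact rational arithmetic. The Python sketch below is a minimal illustration: it verifies Inequality 6.2 through the union bound ε_i s_j and the final step 3ε_i ≤ δ/(2k) used for Inequality 6.1, but it does not check the tree-height condition needed to invoke Lemma 52. The function names are hypothetical, and the sizes are used as defined, without rounding to a power of two minus one.

```python
from fractions import Fraction

def tree_parameters(k, delta):
    """Return (s, eps) with s[i-1] = (6k/delta)^(2(k-i+1)) and eps[i-1] = (delta/6k)^(2(k-i)+1)."""
    base = Fraction(6 * k) / Fraction(delta)
    s = [base ** (2 * (k - i + 1)) for i in range(1, k + 1)]
    eps = [(1 / base) ** (2 * (k - i) + 1) for i in range(1, k + 1)]
    return s, eps

def check_bounds(k, delta):
    """Check 3 eps_i <= delta/(2k) and eps_i * s_j <= delta/(2k+2) for all i < j <= k."""
    delta = Fraction(delta)
    s, eps = tree_parameters(k, delta)
    for i in range(1, k + 1):
        if 3 * eps[i - 1] > delta / (2 * k):
            return False
        for j in range(i + 1, k + 1):
            if eps[i - 1] * s[j - 1] > delta / (2 * k + 2):
                return False
    return True
```

The tightest case of the union bound is j = i + 1, where ε_i s_{i+1} equals δ/(6k) exactly, which is at most δ/(2k + 2) for every k ≥ 1.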
The last piece is to show that, for these values of s_i and ε_i, Inequality 6.1 holds. The key
to this argument revolves around a recurrence that describes the probability a gate at some
level ℓ in an AND tree outputs a 1 when gates are failing with probability ε_i. Intuitively this
probability should be decreasing as we go deeper into the tree, so the goal is to show that
the tree is large enough so that the root outputs a 0 with sufficiently high probability. Let p_ℓ
be the probability that a gate at level ℓ outputs a 1 (where the leaves are level 0) assuming
that all inputs are 1. Then, we have that p_0 = 1, and
\[
p_\ell = (1 - \varepsilon_i)(p_{\ell-1})^2 + 2\varepsilon_i(1 - p_{\ell-1})p_{\ell-1} + \varepsilon_i(1 - p_{\ell-1})^2 = p_{\ell-1}^2(1 - 2\varepsilon_i) + \varepsilon_i.
\]
By Lemma 52, we have that for ℓ ≥ log(1/ε_i) + log log(1/ε_i) + 5, p_ℓ ≤ 3ε_i. Using this,
we are now ready to prove Inequality 6.1. Since the height of AND_j is log s_j, we have, for k
larger than some constant,
\[
\begin{aligned}
\log s_j &\ge \log s_i \\
&= \log\left(\frac{6k}{\delta}\right)^{2(k-i+1)} \\
&\ge \log\left(\frac{6k}{\delta}\right)^{2(k-i)+1} + \log\log\left(\frac{6k}{\delta}\right)^{2(k-i)+1} + 5 \\
&= \log\left(\frac{1}{\varepsilon_i}\right) + \log\log\left(\frac{1}{\varepsilon_i}\right) + 5
\end{aligned}
\]
and thus by Lemma 52, Pr[E_1^j] ≤ 3ε_i ≤ δ/(2k).
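Up until this point we have constructed a circuit that computes k different functions,
but we have not yet compared k to n. Note that the size of C, which is simply the sum of
the s_i’s along with the size of OR_1 (which is of size k − 1), is Θ(n). Therefore, we have that
\[
n = \Theta\left(k - 1 + \sum_{i=1}^{k} s_i\right)
= \Theta\left(k - 1 + \frac{\left(\frac{6k}{\delta}\right)^{2k+2} - \left(\frac{6k}{\delta}\right)^{2}}{\left(\frac{6k}{\delta}\right)^{2} - 1}\right).
\]
Thus,
\[
\frac{\log n}{\log\left(\frac{1}{\delta}\log n\right)}
= \Theta\left(\frac{(2k+3)\log\frac{6k}{\delta}}{2\log\left(\frac{2k}{\delta}\log\frac{6k}{\delta}\right)}\right)
= \Theta(k),
\]
and the proof is complete.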
6.1.2 Heterogeneous Supply Voltages
We now show that with heterogeneous voltage settings in the exact failure model, we can
construct a circuit that computes exponentially many functions (1−δ)-reliably. We leverage
the power of heterogeneity to ensure that certain parts of the circuit compute correctly with
high probability, while other parts can fail with high probability. In particular, we build a
circuit for a CNF formula where the literals of the formula can be determined dynamically
by forcing certain gates to fail while preserving the correctness of the CNF calculation. This
allows a single circuit to compute all possible functions representable by CNF formulas with
n inputs and a fixed number of fixed-length clauses.
Theorem 54. In the exact failure model, for any constant δ ∈ (0, 1/2) and n ∈ N, there
exists a circuit C with n inputs of size O(n^2) that computes Ω(3^n) different functions
(1 − δ)-reliably.
Proof. We give a circuit that computes at least 3^n different functions. We delay the discussion
of voltages and correctness until we have completely described the circuit. Consider a 3CNF
formula Φ with n variables and m clauses, i.e., Φ(x) is 1 if x satisfies all the clauses and
0 otherwise. To build a circuit that computes Φ, for each clause (ℓ_1 ∨ ℓ_2 ∨ ℓ_3) we have a
single OR gate the inputs of which are variables ℓ_1, ℓ_2, ℓ_3 (note these need not be different
and we are ignoring negations here). The output of each such OR gate is fed into an AND
tree which outputs the conjunction of all such clauses. As such, this circuit computes the
function
\[
f_\Phi(x) = \begin{cases} 1 & \text{if } \Phi(x) = 1, \\ 0 & \text{if } \Phi(x) = 0. \end{cases}
\]
We now give the construction of the circuit C. Consider a generic 3CNF formula Φ =
(ℓ_1 ∨ ℓ_2 ∨ ℓ_3) ∧ · · · ∧ (ℓ_{3m−2} ∨ ℓ_{3m−1} ∨ ℓ_{3m}), and the corresponding series of OR and AND gates
as described above, however with the input wires coming into each ℓ_i removed. We will use a
selection circuit to dynamically connect each ℓ_i to some x_j depending on voltages.
We define the selection circuit for ℓ_i, S_i, as follows. This circuit takes as input log 2n
bits as selectors as well as the 2n bits (x_1, ¬x_1, . . . , x_n, ¬x_n). The output of S_i is the bit
corresponding to the location determined by the first log 2n bits. Note that Pippenger
provides such a circuit of size O(n) in [25]. Hence for all possible Φ, by appropriately setting
the log 2n bits of each selection circuit, this circuit computes the function fΦ.
The last piece necessary to define C is describing how the log 2n input bits b_k of each
selection circuit are set. For each such b_k, we have a tree of AND gates with
⌈log(12m log 2n / δ)⌉ = |I| inputs, the output of which is fed into b_k. The inputs to these
AND gates are constant 0’s that go through a single NOT gate. For each such selection
circuit S_i and each such input bit b_k we refer to this circuit as I_{i,k}.
We now have a complete description of C and proceed to proving there exist voltage
settings such that this circuit correctly computes 3^n different functions. We break this into
two parts. We first show that for any fixed Φ there are voltage settings such that, with
probability at least 1 − δ, C(x) = fΦ(x). We then give a bound on the number of unique
functions fΦ(x).
Fix Φ, and consider the following voltage setting. For each gate in the 3CNF circuit set
ε_S = 1 − (1 − δ/2)^{1/(3m)}. For each gate in S_i, set ε_{S_i} = 1 − (1 − δ/2)^{1/(6mN)} where N is the size
of S_i. Finally we need to set the voltages for I_{i,k} for all i and k. Consider some i, 1 ≤ i ≤ 3m
and some 1 ≤ k ≤ log 2n. Let x_j be the input (ignoring negations) ℓ_i is to be connected to
in Φ and let b = b_1 b_2 . . . b_{log 2n} be the binary representation of 2j − 1. There are two cases.
Case 1: b_k = 0. In this case, set the ε of all NOT gates to 1/2 and set all other ε in I_{i,k}
to 1 − (1 − (1/2)^{|I|})^{1/(2|I|)}, where |I| is the size of the input to I_{i,k}.
Case 2: b_k = 1. In this case, set the ε of all gates in I_{i,k} to 1 − (1 − δ/2)^{1/(6m log(2n) N)} where
N is the size of I_{i,k}.
We are now ready to bound the probability that C(x) = fΦ(x). Let E1 be the event that
no gate fails in the 3CNF circuit. Since there are at most 3m gates in the 3CNF circuit, we have
Pr[E1] ≥ (1 − ε_S)^{3m} = (1 − δ/2).
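The per-gate failure probability is chosen so that the whole block survives with probability exactly 1 − δ/2. A small Python sketch of that calculation follows; the function names `per_gate_failure` and `prob_no_failure` are hypothetical and introduced only for this example.

```python
def per_gate_failure(delta, gates):
    """Failure probability eps satisfying (1 - eps)^gates = 1 - delta/2."""
    return 1 - (1 - delta / 2) ** (1 / gates)

def prob_no_failure(eps, gates):
    """Probability that none of `gates` independent gates fails."""
    return (1 - eps) ** gates
```

For instance, with m = 10 clauses and δ = 0.2, `per_gate_failure(0.2, 30)` is about 0.0035; a block with fewer than 3m gates only does better than the 1 − δ/2 target.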
Next we bound Pr[S_i(x) = x_j], where x_j is the input ℓ_i is to connect to in Φ. Again let
b = b_1 b_2 . . . b_{log 2n} be the binary representation of 2j − 1, and let E2 be the event that no gate
in S_i(x) fails. We have
\[
\begin{aligned}
\Pr[S_i(x) = x_j] &\ge \Pr[I_{i,1} = b_1 \wedge \dots \wedge I_{i,\log 2n} = b_{\log 2n} \wedge E_2] \\
&= \Pr[I_{i,1} = b_1] \cdots \Pr[I_{i,\log 2n} = b_{\log 2n}] \cdot \Pr[E_2].
\end{aligned}
\]
By construction we have Pr[E2] ≥ (1 − ε_{S_i})^{|S_i|} = (1 − δ/2)^{1/(6m)}. To show a similar bound for
the product of the first log 2n terms we have two cases. Let c = 6m log 2n.
Case 1: b_k = 0. Let E3 be the event that all NOT gates do not fail and E4 be the event
that no gate in the AND tree of I_{i,k} fails. Then we have that, when n and m are larger than
some constant over δ,
\[
\begin{aligned}
\Pr[I_{i,k} = 0] &= 1 - \Pr[I_{i,k} = 1] \\
&= 1 - \bigl(\Pr[I_{i,k} = 1 \mid E_3]\Pr[E_3] + \Pr[I_{i,k} = 1 \mid \neg E_3]\Pr[\neg E_3]\bigr) \\
&\ge 1 - \bigl(\Pr[E_3] + \Pr[I_{i,k} = 1 \mid \neg E_3]\bigr) \\
&\ge 1 - \bigl((1/2)^{|I|} + (1 - \Pr[I_{i,k} = 0 \mid \neg E_3])\bigr) \\
&\ge 1 - \bigl((1/2)^{|I|} + (1 - \Pr[E_4])\bigr) \\
&\ge 1 - (1/2)^{|I| - 1} \\
&= 1 - (1/2)^{\log(2c/\delta) - 1} \\
&\ge 1 - (1/2)^{\log(c\delta) + 8/\delta c} \\
&\ge 1 - (1/2)^{-\log(1/c\delta) - \log(1 - 4/\delta c)} \\
&\ge 1 - (1/2)^{-\log(1/c\delta - 4/c^2\delta^2)} \\
&\ge 1 - (1/2)^{-\log(1 - e^{-2/\delta c})} \\
&\ge 1 - (1/2)^{-\log(1 - (1 - \delta/2)^{1/c})} \\
&= (1 - \delta/2)^{1/c}.
\end{aligned}
\]
Some of the above inequalities follow from the Taylor series expansions of e^x and log(1 − x).
Case 2: b_k = 1. Note that in this case, Pr[I_{i,k} = 1] ≥ Pr[No gate in I_{i,k} fails] =
(1 − δ/2)^{1/(6m log 2n)} = (1 − δ/2)^{1/c}.
Combining these, we have that Pr[S_i(x) = ℓ_i] ≥ (1 − δ/2)^{1/(3m)}. We are now ready to
bound the probability that C correctly computes fΦ(x).
\[
\begin{aligned}
\Pr[C(x) = f_\Phi(x)] &\ge \Pr[S_1(x) = \ell_1 \wedge \dots \wedge S_{3m}(x) = \ell_{3m} \wedge E_1] \\
&= \Pr[S_i(x) = \ell_i]^{3m} \cdot \Pr[E_1] \\
&\ge (1 - \delta/2)(1 - \delta/2) \\
&> 1 - \delta.
\end{aligned}
\]
Consider the case where m = n. We now compute the size of C. The size of the 3CNF
circuit is at most 3n. For each of the 3n literals, there is a circuit of size O(n) that uses
log 2n bits to map an input or its negation to that literal. Each of the O(n log n) bits is
created by a tree of size O(log(n log(n)/δ)). Thus C has size O(n^2 + n log(n) log(1/δ)).
The last step is to show that there are Ω(3^n) unique functions fΦ(x) with m clauses.
Consider some subset S = {s_1, . . . , s_{|S|}} ⊆ [n] and some setting x = (x_{s_1}, x_{s_2}, . . . , x_{s_{|S|}}) for
the variables x_i such that i ∈ S. Then, for each such x_i, if x_i = 1 create the clause (x_i ∨ x_i ∨ x_i)
and if x_j = 0 create the clause (¬x_j ∨ ¬x_j ∨ ¬x_j). Create n − |S| additional clauses that
are a duplicate of one of these clauses. Note that the resulting formula Φ returns 1 exactly
when the input bits S are set to x, regardless of the value of the rest of the input bits, and
0 otherwise. Thus for each unique setting of x and each unique S we obtain a new function.
Since there are \binom{n}{|S|} ways to choose S and 2^{|S|} settings of x, by summing over 0 ≤ |S| ≤ n
we get the desired result.
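The counting step relies on the identity that the sum over j of \binom{n}{j} 2^j equals 3^n, and on the functions being pairwise distinct. The Python sketch below checks both for small n by enumerating truth tables. It is illustrative only: the function name `subcube_functions` is hypothetical, and the empty subset (the constant-1 function) is included to match the sum over 0 ≤ |S| ≤ n, even though the clause construction above needs at least one clause to duplicate.

```python
from itertools import combinations, product
from math import comb

def subcube_functions(n):
    """Truth tables of all functions 'the input agrees with a fixed setting on a fixed subset S'."""
    inputs = list(product((0, 1), repeat=n))
    tables = set()
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            for setting in product((0, 1), repeat=size):
                table = tuple(
                    int(all(x[i] == v for i, v in zip(subset, setting)))
                    for x in inputs
                )
                tables.add(table)  # a set keeps only distinct truth tables
    return tables
```

For n = 3 this yields 27 distinct truth tables, matching 3^3.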
6.2 ALMOST ALL FUNCTIONS REQUIRE EXPONENTIAL ENERGY
In this section we show that, despite the ability of a single circuit to compute multiple
functions in the exact failure model, an upper bound on the number of such functions and
an adaptation of Shannon’s argument allows us to show that almost all functions require
exponential energy, both in the homogeneous and heterogeneous case. In some sense, this is
evidence that the advantages heterogeneity provides are somewhat limited, as even though
some heterogeneous circuits can compute many more functions than any homogeneous circuit
of the same size, this advantage is not sufficient to reduce the minimal circuit size by more
than a constant for almost all functions.
6.2.1 Adaptation of Shannon’s Argument
Inspired by Shannon’s counting argument that almost all functions require exponentially-
sized circuits, we show first that, in circuit models where circuits can compute multiple
functions, as long as the number of functions a single circuit can compute is not too many,
almost all functions still require exponentially-sized circuits. We will combine this with upper
bounds on the number of functions homogeneous and heterogeneous circuits can compute
to obtain our main results. Note that the following lemma assumes gates have fan-in at
most two, and thus all our results assume gate fan-in is at most two; it is straightforward to
generalize this lemma and our results to any setting where the fan-in of gates is a constant.
Lemma 55. Suppose a circuit of size $s$ can compute at most $f(s)$ functions in some circuit
model where gates have fan-in at most two. If there exists some constant $c > 0$ such that
$s^{4s} f(s) = o\left(2^{2^n}\right)$ for $s = 2^n/(cn)$, then almost all functions require $\Omega(2^n/n)$ gates in that model.
Proof. Consider the set of circuits with at most $s$ gates. A standard counting argument
shows that any circuit in this set can be represented with $4s \log s$ bits, and therefore there
are at most $2^{4s \log s} = s^{4s}$ circuits of size at most $s$. Together these circuits compute at most
$s^{4s} f(s)$ functions. Thus, if for some $c > 0$ and $s = 2^n/(cn)$
it holds that $s^{4s} f(s) = o\left(2^{2^n}\right)$, then almost all functions require circuits of size at least
$2^n/(cn) = \Omega(2^n/n)$.
6.2.2 Homogeneous Supply Voltages
In this subsection we show that almost all functions require exponential-energy homogeneous
circuits in the exact failure model. In some sense, this result is a corollary of the later result
that almost all functions require exponential-energy heterogeneous circuits; however, we
include this result as it illustrates how homogeneous circuits are simpler than heterogeneous
circuits, and we are able to obtain a slightly stronger lower bound on the energy used by
almost all functions. Our proof aims to bound the number of functions a circuit of size s
can compute, which is necessary, since, as we showed in the previous section, a single circuit
can compute many functions.
Lemma 56. For any circuit $C$ on $n$ inputs with $s$ gates, and any $\delta > 0$, let $\mathcal{F}$ be the set of
all functions $f$ for which there exists some $\varepsilon$ such that $(C, \varepsilon)$ is $(1 - \delta)$-reliable for $f$. Then
$|\mathcal{F}| \le s 2^n + 1$.
Proof. Fix some circuit $C$ and input $I$, and let $C_I(\varepsilon)$ be the probability that $C$ outputs a 1
on input $I$ with $\varepsilon$-faulty gates. Note that by definition, for $C$ to compute some function $f$
with $\varepsilon$-faulty gates, we must have that for all inputs $I$, either $C_I(\varepsilon) \ge 1 - \delta$ or $C_I(\varepsilon) \le \delta$. Fix
some input $I$ and consider how the output of $C$ changes as we vary $\varepsilon$. The above
observation implies that $C$ will only switch the function it is computing due to input $I$ if
$C_I(\varepsilon) = 1 - \delta$ and $C_I(\varepsilon)$ is decreasing, or $C_I(\varepsilon) = \delta$ and $C_I(\varepsilon)$ is increasing. However, note
that $C_I(\varepsilon)$ is a polynomial in $\varepsilon$ of degree $s$ (see footnote 2), and therefore there are at most $s$ such points,
since between any two of them the function must change at least once from increasing to
decreasing or vice versa. This means that each input $I$ can cause $C$ to switch the function
it is computing at most $s$ times. Since there are $2^n$ distinct inputs, this means that $C$ can
switch functions at most $s 2^n$ times, and therefore it is able to compute at most $s 2^n + 1$
different functions.
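To make the polynomial $C_I(\varepsilon)$ concrete, here is a small sketch that evaluates it by summing over all failure patterns, as the footnote to this proof describes. It rests on assumptions that are not taken from the text: a failed gate is modelled as outputting the complement of its correct value, and the circuit encoding (a list of `(op, a, b)` triples whose last gate is the output) is hypothetical.

```python
from itertools import product

# Two-input gate operations for the illustrative circuit encoding.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def output_one_probability(gates, inputs, eps):
    """Evaluate C_I(eps): the probability that the last gate outputs 1.

    gates:  list of (op, a, b); wire indices below len(inputs) refer to input
            bits, larger indices refer to outputs of earlier gates.
    inputs: tuple of input bits (the fixed input I).
    eps:    failure probability shared by every gate (homogeneous case).
    Assumption: a failed gate outputs the complement of its correct value.
    """
    total = 0.0
    for failures in product((0, 1), repeat=len(gates)):
        wires = list(inputs)
        for (op, a, b), failed in zip(gates, failures):
            wires.append(OPS[op](wires[a], wires[b]) ^ failed)
        if wires[-1] == 1:
            q = sum(failures)  # number of gates that fail in this pattern
            total += eps ** q * (1 - eps) ** (len(gates) - q)
    return total


if __name__ == "__main__":
    circuit = [("XOR", 0, 1), ("AND", 2, 0)]
    for eps in (0.0, 0.1, 0.25, 0.5):
        print(eps, output_one_probability(circuit, (1, 0), eps))
```

Every term is a product of $s$ factors, each either $\varepsilon$ or $1 - \varepsilon$, so the sum is a polynomial of degree at most $s$ in $\varepsilon$.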
Since $E(\varepsilon) = \Omega(1)$ for $\varepsilon < 1/2$, we need only show that almost all functions require
exponentially many gates in this model to show that almost all functions require exponential
energy. However, the following lemma will allow us to strengthen our theorem statement,
and will be helpful later to show that heterogeneous circuits can asymptotically save energy
over homogeneous circuits.
Lemma 57. Let $C$ be a homogeneous circuit, with gate failure probability $\varepsilon$, that is $(1 - \delta)$-reliable. Then $\varepsilon \le \delta$.
Proof. Let $f$ be the function $C$ is trying to compute, and fix some input $I$. It suffices to show
that the output gate, $g_o$, must fail with probability at most $\delta$. Let $p$ be the probability that
$g_o$ receives an input $I'$ such that $g_o(I') = f(I)$. Then, note that
\begin{align*}
\Pr[g_o(I') = f(I)] &= p(1 - \varepsilon) + (1 - p)\varepsilon \\
&= p(1 - 2\varepsilon) + \varepsilon \\
&\le (1 - 2\varepsilon) + \varepsilon \\
&= 1 - \varepsilon.
\end{align*}
Since by hypothesis $\Pr[C(I) = f(I)] \ge 1 - \delta$, it follows that $\varepsilon \le \delta$.
Footnote 2: If we fix which gates fail, then the output of $C$ on $I$ is fixed to either 1 or 0. A fixed set of $q$ gates fails
with probability $\varepsilon^q (1 - \varepsilon)^{s-q}$, a polynomial of degree $s$ in $\varepsilon$. $C_I(\varepsilon)$ can be viewed as the sum, over the sets of
gates that, when failing, cause $C$ to output 1 on $I$, of the probability of that set failing.
With this in hand, we can now prove the desired theorem.
Theorem 58. For any $\delta \in (0, 1/2)$, almost all Boolean functions on $n$ variables require
homogeneous circuits using $\Omega\left(\log^2(1/\delta)\, 2^n/n\right)$ energy.
Proof. From Lemma 56 we know that each circuit of size $s$ computes at most $s 2^n + 1$ different
functions. We now show that for $s = 2^n/(4n)$, the quantity $s^{4s}(s 2^n + 1)$ is asymptotically
smaller than $2^{2^n}$, the number of functions on $n$ inputs. Plugging in and simplifying we have
\[ \left(\frac{2^n}{4n}\right)^{4 \cdot \frac{2^n}{4n}} \left(\frac{2^n}{4n} 2^n + 1\right) \le \frac{2^{2^n}}{n^{2^n/n}}\, 2^{2n} \ll 2^{2^n}. \]
Hence, Lemma 55 implies that almost all functions require homogeneous circuits with $\Omega(2^n/n)$ gates. By
Lemma 57, we have $\varepsilon \le \delta$, so each gate uses at least $E(\delta)$ energy.
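The counting inequality in this proof is easiest to inspect in logarithms. The sketch below is an illustration rather than part of the proof: it computes $2^n - \log_2\left(s^{4s}(s 2^n + 1)\right)$ for $s = 2^n/(4n)$. The function name is hypothetical, and floating-point arithmetic restricts it to moderate $n$.

```python
import math


def log2_gap_homogeneous(n):
    """Return 2^n - log2(s^(4s) * (s * 2^n + 1)) for s = 2^n / (4n).

    A positive and growing gap means the number of (circuit, function) pairs
    of size at most s is a vanishing fraction of the 2^(2^n) Boolean functions.
    """
    s = 2 ** n / (4 * n)
    log2_count = 4 * s * math.log2(s) + math.log2(s * 2 ** n + 1)
    return 2 ** n - log2_count


if __name__ == "__main__":
    for n in (8, 12, 16, 20, 24):
        print(n, log2_gap_homogeneous(n))
```

The ratio of the two quantities tends to 0 exactly when this gap tends to infinity, which is what the displayed bound asserts.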
6.2.3 Heterogeneous Supply Voltages
In this section we show that almost all functions require exponential energy in the exact
failure model, even when heterogeneous circuits are allowed. The approach is similar to the one
for the homogeneous case; however, the bound on the number of functions a heterogeneous
circuit can compute requires some technical results from semi-algebraic geometry.
Lemma 59. For any circuit $C$ on $n$ inputs with $s$ gates, and any $\delta > 0$, let $\mathcal{F}$ be the set of
all functions $f$ for which there exists some $\bar\varepsilon \in (0, 1/2)^{|C|}$ such that $(C, \bar\varepsilon)$ is $(1 - \delta)$-reliable
for $f$. Then $|\mathcal{F}| \le (8e2^n)^s$.
Proof. Let $\mathcal{P} \subset \mathbb{R}[X_1, \ldots, X_k]$ be a finite set of $p$ polynomials with degree at most $d$. A sign
condition on $\mathcal{P}$ is an element of $\{0, 1, -1\}^p$. The realization of the sign condition $\sigma$ in $\mathbb{R}^k$ is
the semi-algebraic set
\[ R(\sigma) = \left\{ x \in \mathbb{R}^k : \bigwedge_{P \in \mathcal{P}} \operatorname{sign}(P(x)) = \sigma(P) \right\}. \]
Let $N(p, d, k)$ be the number of realizable sign conditions, i.e., the cardinality of the set
$\{\sigma : R(\sigma) \ne \emptyset\}$. The following theorem is due to Alon.
Theorem 60 ([2]).
\[ N(p, d, k) < \left(\frac{8edp}{k}\right)^k. \]
Let $I \in \{0, 1\}^n$ be some input to $C$, and let $P_I(\varepsilon_1, \ldots, \varepsilon_s)$ be the probability that $C$
outputs 1 on $I$, when gate $i$ fails with probability $\varepsilon_i$. Observe that $P_I \in \mathbb{R}[\varepsilon_1, \ldots, \varepsilon_s]$ and
that $P_I$ has degree at most $s$, since we can compute $P_I$ by summing, over all possible subsets
of gates that could fail and cause $C$ to output a 1, the probability that exactly those gates
fail and no others (which is a polynomial in $\varepsilon_1, \ldots, \varepsilon_s$, where each $\varepsilon_i$ has exponent 1).
Let $\mathcal{P} = \{P_I - (1 - \delta) \mid I \in \{0, 1\}^n\}$. Clearly, the cardinality of $\mathcal{P}$ is at most $2^n$. Observe
that every different function $f$ that $C$ calculates must correspond to a unique realizable sign
condition of $\mathcal{P}$, in the sense that there is some setting of $\bar\varepsilon = (\varepsilon_1, \ldots, \varepsilon_s)$ such that
1. $P_I(\bar\varepsilon) - (1 - \delta) > 0$ on inputs $I$ such that $f(I) = 1$, and
2. $P_I(\bar\varepsilon) - (1 - \delta) < 0$ on inputs $I$ such that $f(I) = 0$ (in fact, we need $P_I(\bar\varepsilon) - \delta < 0$, an
even stronger condition).
By Theorem 60 with $p = 2^n$, $d = s$, and $k = s$, the number of realizable sign conditions of $\mathcal{P}$ is at most
$(8e \cdot s \cdot 2^n / s)^s = (8e2^n)^s$, which
is thus an upper bound on the number of different functions $C$ can compute.
We can now prove the main theorem of this section.
Theorem 61. For any $\delta \in (0, 1/2)$, almost all Boolean functions on $n$ variables require
heterogeneous circuits using $\Omega(2^n/n)$ energy.
Proof. From Lemma 59 we know that each circuit of size $s$ computes at most $(8e2^n)^s$ different
functions. We now show that for $s = 2^n/(8n)$, the quantity $s^{4s}(8e2^n)^s$ is asymptotically smaller
than $2^{2^n}$, the number of functions on $n$ inputs. Plugging in and simplifying we have
\[ \left(\frac{2^n}{8n}\right)^{4 \cdot \frac{2^n}{8n}} (8e2^n)^{\frac{2^n}{8n}} \le 2^{\frac{2^n}{2}} \cdot 2^{\frac{2^n (3 + 2 - 12 - 4\log n)}{8n}} \cdot 2^{\frac{2^n}{8}} \le 2^{\frac{5 \cdot 2^n}{8}} \ll 2^{2^n}. \]
Hence, Lemma 55 implies that almost all functions require heterogeneous circuits with $\Omega(2^n/n)$ gates. The
theorem follows since $E(\delta) = \Omega(1)$ and $E$ is decreasing on the interval $(0, 1/2)$.
6.3 RELATING ENERGY AND THE NUMBER OF NOISY GATES
In this section, we show that the Boolean functions that require exponential energy are
exactly the Boolean functions that require exponentially many noisy gates. Before formalizing
this notion we introduce some additional notation. For any Boolean function $f$ on
$n$ variables and any reliability parameter $\delta$, let $NG(f, \delta)$ denote the minimum size of any
(heterogeneous) circuit that $(1 - \delta)$-reliably computes $f$, and $\widetilde{NG}(f, \delta)$ denote the minimum
size of any homogeneous circuit that $(1 - \delta)$-reliably computes $f$. Similarly define $E(f, \delta)$ to
be the minimum energy used by any (heterogeneous) circuit that $(1 - \delta)$-reliably computes
$f$, and $\tilde{E}(f, \delta)$ the minimum energy used by any homogeneous circuit that $(1 - \delta)$-reliably
computes $f$. We are now ready to state the main result of this section.
Lemma 62. For all Boolean functions $f$, and for all $\delta < 1/2$,
\[ E(1/2)\, NG(f, \delta) \le E(f, \delta) \le \tilde{E}(f, \delta) \le E\left(\frac{\delta}{\widetilde{NG}(f, \delta)}\right) \widetilde{NG}(f, \delta). \]
Proof. First observe that $E(f, \delta) \le \tilde{E}(f, \delta)$. We now prove the leftmost inequality. Let $(C, \bar\varepsilon)$
be the circuit achieving $E(f, \delta)$ and note that by definition $E(f, \delta) = \sum_{g \in C} E(\varepsilon_g)$. Since
$E$ is decreasing, it follows that $E(\varepsilon_g) \ge E(1/2)$ for all $g \in C$. Additionally, by definition,
$|C| \ge NG(f, \delta)$, and the result follows.
To show the rightmost inequality, fix some Boolean function $f$, and some $\delta$. Let $C$ be a
circuit of size $s = \widetilde{NG}(f, \delta)$, and $\varepsilon$ the failure probability, such that $(C, \varepsilon)$ is $(1 - \delta)$-reliable
on $f$. If $\varepsilon \ge \delta/s$, we are done, since $E$ is decreasing. Note that for a circuit of size $s$, if
gates fail with probability at most $\delta/s$, then by the union bound, the probability that any
gate fails is at most $\delta$. Thus, if $\varepsilon < \delta/s$, the probability that any gate fails is at most $\delta$.
However, this implies that $(C, \delta/s)$ is $(1 - \delta)$-reliable on $f$ as well, and thus can use energy
\[ E\left(\frac{\delta}{\widetilde{NG}(f, \delta)}\right) \widetilde{NG}(f, \delta). \]
If $E(1/2)$ is $\Omega(1)$, and $E\left(\frac{\delta}{\widetilde{NG}(f, \delta)}\right)$ is bounded above by a polynomial in $\widetilde{NG}(f, \delta)$ and
$1/\delta$ (recall that in current CMOS technologies $E(\varepsilon) = \Theta\left(\log^2(1/\varepsilon)\right)$), this implies that any
function that requires exponential energy requires exponential circuit size and vice versa.
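The sandwich in Lemma 62 can be tabulated for concrete numbers. The sketch below assumes $E(\varepsilon) = \log_2^2(1/\varepsilon)$ with constant factor 1, which is one instance of the $\Theta(\log^2(1/\varepsilon))$ form quoted above. The gate counts passed in are placeholders, since computing $NG(f, \delta)$ for a real function is not attempted here, and all function names are hypothetical.

```python
import math


def energy(eps):
    """Assumed failure-to-energy function: E(eps) = log2(1/eps)^2."""
    return math.log2(1 / eps) ** 2


def lemma62_bounds(noisy_gates, homogeneous_noisy_gates, delta):
    """Return the outer lower and upper bounds of Lemma 62.

    noisy_gates:             stand-in for NG(f, delta)
    homogeneous_noisy_gates: stand-in for the homogeneous minimum size
    """
    lower = energy(0.5) * noisy_gates
    upper = energy(delta / homogeneous_noisy_gates) * homogeneous_noisy_gates
    return lower, upper


def some_gate_fails_probability(size, eps):
    """Exact probability that at least one of `size` independent gates fails."""
    return 1 - (1 - eps) ** size


if __name__ == "__main__":
    s, delta = 100, 0.01
    print(lemma62_bounds(s, s, delta))
    # Union-bound step of the proof: with eps = delta / s the chance that any
    # gate fails stays below delta.
    print(some_gate_fails_probability(s, delta / s), "<=", delta)
```

Under this assumed $E$, the two bounds differ by a factor of $\log_2^2(s/\delta)$, which is polynomial in $\log s$ and $\log(1/\delta)$.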
7.0 THE POWER OF HETEROGENEITY TO REDUCE ENERGY
In the previous chapter we studied the power of heterogeneity at a very coarse granularity,
asking how many functions require exponential energy, paying little attention to the specific
functions themselves. In this chapter we turn our focus to individual functions asking the
question, for a given function, how much energy does the best heterogeneous circuit save
over the best homogeneous circuit? We are able to show that for a wide class of natural
functions, heterogeneity allows us to save a $\log n$ factor in energy when $\delta$ is polynomial in $1/n$.
Conversely, we show that for this same class of functions we cannot hope to do any better. That
is, we show an upper bound of $\log n$ on the energy savings possible due to heterogeneity.
We conclude the chapter by showing that if we extend the setting to include relations, an
energy savings of $\log^2 n$ is possible. This continues the theme of demonstrating that while
heterogeneity offers the circuit designer tangible benefits, there are proven limitations to
these advantages.
7.1 LOWER BOUND FOR FUNCTIONS
We start by showing a lower bound on the energy savings possible with heterogeneity. In
particular, we show that, when δ is a polynomial function of the minimum circuit size s, it
is possible to obtain an Ω(log s) energy savings using heterogeneous voltages in the bounded
failure model. The result is that many natural Boolean functions can be computed with
asymptotically less energy using heterogeneous circuits. When both $s$ and the number of
non-degenerate inputs (see Definition 65) are $\Theta(n)$, this also holds in the exact failure model.
Theorem 63. For any function $f$ with minimum circuit size $s$, for any constant $c > 0$,
if $\delta = 1/s^c$, in the bounded failure model every homogeneous circuit for $f$ uses $\Omega\left(s \log^2 s\right)$
energy, and there exists a heterogeneous circuit using $O(s \log s)$ energy.
Proof. The first task is to give a lower bound on the energy used by any homogeneous circuit
that $(1 - \delta)$-reliably computes $f$. By assumption, at least $s$ gates are required even when there are
no failures. Because the circuit is homogeneous, Lemma 57 implies that every gate (in particular, the output gate)
fails with probability $\varepsilon \le \delta = 1/s^c$. Since
$E(1/s^c) = \Theta\left(\log^2 s\right)$ and $E$ is decreasing, $\Omega\left(s \log^2 s\right)$ energy is required.
The upper bound requires significantly more work, although it is still a somewhat
straightforward use of techniques from [17], which proves the following as part of the proof
of Theorem 69:
Lemma 64. Let the maximum fan-in of any gate be a constant. There is a constant $\varepsilon_1 > 0$
and $\theta > 1/2$ such that for any $\varepsilon \le \varepsilon_1$, there is a $\rho = \rho(\varepsilon) < 1$ such that any gate $g$ of fan-in
$\ell$ can be replaced by a gadget with
1. $k$ input wires for each input to $g$,
2. $k$ output wires, and
3. $\Theta(k)$ gates,
with the property that if, for all $i$, at least a $\theta$ fraction of the $i$-th set of input wires carries bit
$b_i$, then the probability that fewer than a $\theta$ fraction of the output wires carries $g(b_1, \ldots, b_\ell)$
is at most $\rho^k$.
In a manner similar to the proof of Theorem 69, we use Lemma 64 to replace each gate
in the original circuit with a gadget whose input and output is $\Theta(\log s)$ wires, and set the
failure probability of this section of the circuit to $\varepsilon_1$, with the result that the probability that
fewer than a $\theta$ fraction of the wires carry the correct output (i.e., the output if there were no
failures) is at most $1/s^{c+2}$. Since the failure rate is set to be constant, the first part of the
circuit uses energy $\Theta(s \log s)$. The probability that any gadget's output does not carry at
least a $\theta$ fraction of the correct bits is at most $1/s^{c+1}$.
At the end of the circuit, we use the standard majority circuit (see Figure 9) of size
$\Theta(\log s)$ to obtain the output, and set the failure probability of this section of the circuit to be $1/s^{c+2}$;
thus this section of the circuit uses energy $\Theta\left(\log^3 s\right)$ and the probability that any gate in
this section of the circuit fails is at most $1/s^{c+1}$.
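The two bounds in Theorem 63 can be compared numerically. This is only a sketch under stated assumptions: it takes $E(\varepsilon) = \log_2^2(1/\varepsilon)$ with constant factor 1, picks an arbitrary constant failure rate `eps1` for the gadget part, and sets every hidden constant in the $\Theta(\cdot)$ terms to 1. None of these values come from the text, and the function names are hypothetical.

```python
import math


def energy(eps):
    """Assumed failure-to-energy function: E(eps) = log2(1/eps)^2."""
    return math.log2(1 / eps) ** 2


def homogeneous_lower_bound(s, c):
    """s gates, each forced to fail with probability at most delta = 1/s^c."""
    return s * energy(1 / s ** c)


def heterogeneous_upper_bound(s, c, eps1=0.01):
    """Gadget part at constant failure rate plus a final majority circuit.

    eps1 and the unit constants are illustrative placeholders.
    """
    gadget_part = s * math.log2(s) * energy(eps1)            # Theta(s log s)
    majority_part = math.log2(s) * energy(1 / s ** (c + 2))  # Theta(log^3 s)
    return gadget_part + majority_part


def savings_ratio(s, c):
    """Homogeneous lower bound divided by heterogeneous upper bound."""
    return homogeneous_lower_bound(s, c) / heterogeneous_upper_bound(s, c)


if __name__ == "__main__":
    for exponent in (10, 20, 40, 80):
        print(exponent, savings_ratio(2 ** exponent, c=1))
```

With these placeholder constants the ratio grows roughly linearly in $\log s$, matching the $\Omega(\log s)$ savings stated in the theorem; the absolute values depend entirely on the assumed constants.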
7.2 UPPER BOUND FOR FUNCTIONS
We now show that for a large class of natural functions this log n savings is the best we can
hope to do, in both the exact and bounded failure models. We start with a definition of
non-degenerate input bits and then give the main theorem of this section.
Definition 65 (non-degenerate input bit). We say a function $f$ has a non-degenerate input
bit $b_i$ if there exists some input $I$ such that $f(I) \ne f(I^{b_i})$, where $I^{b_i}$ denotes $I$ with bit $b_i$ flipped.
Theorem 66. Let $f$ be a function with $b$ non-degenerate input bits. Then, for any $\delta \in
(0, 1/2)$, any circuit $C$ that $(1 - \delta)$-reliably computes $f$ requires $\Omega(b \log(1/\delta))$ energy.
Proof. This proof is quite similar to the proofs of Theorem 1 and Lemma 6 from [6], which
in turn use ideas from [18] and [14]. It is reproduced in full here for completeness.
We start by considering the case where we have homogeneous failures and these failures
occur on wires, with the same failure-to-energy function used throughout the paper. We
show that every non-degenerate input must have $\log(1/\delta)/\log(1/\varepsilon)$ wires emanating from it,
even if heterogeneous failures are allowed. This, combined with the assumptions that there
are $b$ non-degenerate inputs, that all gates have constant fan-in, and that $P(1/2) > 0$, yields
the desired result. We then show how the result when failures occur on wires implies the
result when failures occur on gates.
Consider some non-degenerate input bit $b_i$ and let $z$ be some input such that $f(z) \ne
f(z^i)$, where $z^i$ denotes $z$ with bit $b_i$ flipped. Let $B$ be the set of all wires originating from $b_i$ and $|B| = m$ be the number of such
wires. For all $\beta \subseteq B$, let $H(\beta)$ be the event that all wires in $\beta$ fail and the other wires of $B$
are correct. Finally, let $\beta_i$ denote the subset attaining
\[ \max_{\beta \subseteq B} \Pr\left[C(z^i) = f(z^i) \mid H(\beta)\right]. \]
Note that since we assume $C$ is $(1 - \delta)$-reliable we must have $\Pr[C(z^i) = f(z^i)] \ge 1 - \delta$ and
therefore $\Pr[C(z^i) = f(z^i) \mid H(\beta_i)] \ge 1 - \delta$. Let $H_i$ be the event corresponding to $H(B \setminus \beta_i)$. So, for
example, if $\beta_i$ is the empty set then $H_i$ is the event where all wires fail. Also, note that since
$b_i$ is non-degenerate,
\[ \Pr[C(z) \ne f(z) \mid H_i] = \Pr[C(z^i) = f(z^i) \mid H(\beta_i)] \ge 1 - \delta. \]
Finally, since $\delta \ge \Pr[C(z) \ne f(z)] \ge \Pr[C(z) \ne f(z) \mid H_i] \Pr[H_i] \ge (1 - \delta) \Pr[H_i]$, we get
that $\Pr[H_i] \le \delta/(1 - \delta)$. Combining this with the fact that $\Pr[H_i] \ge \varepsilon^m$ and using simple
algebra yields $m \ge \log(1/\delta)/\log(1/\varepsilon)$ as desired.
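The "simple algebra" in the last step amounts to solving $\varepsilon^m \le \delta/(1 - \delta)$ for $m$. The sketch below computes the smallest integer $m$ that satisfies this constraint; the function name is hypothetical, and the exact threshold $\log((1 - \delta)/\delta)/\log(1/\varepsilon)$ is used in place of the asymptotically equivalent $\log(1/\delta)/\log(1/\varepsilon)$ stated in the text.

```python
import math


def min_wires(delta, eps):
    """Smallest integer m with eps**m <= delta / (1 - delta).

    delta: allowed failure probability of the circuit, 0 < delta < 1/2
    eps:   failure probability of each wire, 0 < eps < 1/2
    """
    return math.ceil(math.log((1 - delta) / delta) / math.log(1 / eps))


if __name__ == "__main__":
    for delta, eps in ((0.01, 0.1), (0.001, 0.25), (1e-6, 0.4)):
        m = min_wires(delta, eps)
        print(delta, eps, m, eps ** m <= delta / (1 - delta))
```

More reliable wires (smaller $\varepsilon$) need fewer copies, but each copy costs more energy; the optimization discussed next weighs these two effects.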
Note that in the heterogeneous case, when these wires fail with probabilities $\varepsilon_1, \ldots, \varepsilon_m$,
for the purposes of this lower bound we have the following optimization problem: we want
to minimize $\sum_{i=1}^{m} P(1/\varepsilon_i)$ subject to the constraint that $\prod_{i=1}^{m} \varepsilon_i \le \delta/(1 - \delta)$. Since for any
$\varepsilon_1, \varepsilon_2 \in (0, 1/2)$, it holds that
\[ P(1/\varepsilon_1) + P(1/\varepsilon_2) \ge 2P\left(1/\sqrt{\varepsilon_1 \varepsilon_2}\right), \]
this will be minimized when all $\varepsilon_i$ are equal. Therefore, for the purposes of this lower bound,
we can assume wire failures are homogeneous.
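As a sanity check on the pairwise inequality above, the following worked derivation assumes that a wire with failure probability $\varepsilon$ costs $\log^2(1/\varepsilon)$ energy, the CMOS form quoted elsewhere in the text. The shorthand $a_i = \log(1/\varepsilon_i)$ is introduced here purely for illustration.

```latex
% Assumption: the energy of a wire with failure probability \varepsilon is \log^2(1/\varepsilon).
% Shorthand (illustrative): a_i := \log(1/\varepsilon_i), so a_i > 0.
\begin{align*}
\log\frac{1}{\sqrt{\varepsilon_1 \varepsilon_2}} &= \frac{a_1 + a_2}{2}, \\
a_1^2 + a_2^2 - 2\left(\frac{a_1 + a_2}{2}\right)^2
  &= \frac{2a_1^2 + 2a_2^2 - (a_1 + a_2)^2}{2}
   = \frac{(a_1 - a_2)^2}{2} \ge 0.
\end{align*}
```

Equality holds exactly when $\varepsilon_1 = \varepsilon_2$, which is consistent with the claim that, for a fixed product of failure probabilities, equal probabilities minimize the total energy.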
We have that $m \ge \log(1/\delta)/\log(1/\varepsilon)$, which implies the energy used by wires coming
from each non-degenerate input bit is at least
\[ P(\varepsilon) \log(1/\delta)/\log(1/\varepsilon) = \Omega(\log(1/\delta)), \]
and thus the energy used by the circuit is at least $\Omega(b \log(1/\delta))$.
We now show why the energy lower bound when failures occur on wires implies the
energy lower bound when failures occur at gates. Consider any circuit C that (1−δ)-reliably
computes f . We construct a new circuit C ′ for computing f , which is identical to C except
that both wires and gates may fail, wires of C ′ incur some non-zero energy consumption (as
a function of their probability of failure), and the gates in C ′ do not consume energy. First
we argue that this can be done such that C ′ is (1− δ)-reliable. The following statement was
proved in [14].
Statement 67 ([14]). Let $g$ be a gate with fan-in $n_g$, in a circuit $C$ where both gates and
wires may fail. Furthermore, let $\varepsilon \in (0, 1/2)$, $\zeta_g \in [0, \varepsilon/n_g]$, and let $g(t)$ be the output of gate
$g$ assuming that its input wires receive input $t$, and both $g$ and $g$'s input wires are faultless.
Then there exists a unique value $\eta_g(y, \zeta_g) \in [0, 1]$ such that if
• the input wires of $g$ fail independently with probability $\zeta_g$, and
• gate $g$ fails with probability $\eta_g(y, \zeta_g)$ when the gate receives input $y$,
then the probability that $g$ does not output $g(t)$ is equal to $\varepsilon$.
Observe that if for each wire $i$ entering gate $g$ we set its probability of failure to $\zeta_g = \varepsilon_g/n_g$,
we can apply Statement 67 and set the failure probability on gate $g$ when receiving input
$y$ to $\eta_g(y, \zeta_g)$. The result is that when the input wires of gate $g$ in $C'$ receive input $t$, the
probability that $g$ does not output $g(t)$ is $\varepsilon_g$ (the same as the probability of failure of $g$ in
the original circuit $C$). Thus by setting these failure probabilities for each gate and wire in
$C'$ we have that, for any input $x$, $C$ and $C'$ output $f(x)$ with the same probability, and so
$C'$ is $(1 - \delta)$-reliable.
Since the fan-in of any gate is a constant, for any gate $n_g P(\varepsilon/n_g) = \Theta(P(\varepsilon))$. Thus the
energy used by $C'$ is within a constant factor of the energy used by $C$.
7.3 LOWER BOUND FOR RELATIONS
In this section we prove that, in contrast with the previous section, there are relations where
heterogeneous circuits can obtain an $\omega(\log n)$ energy savings over homogeneous circuits. In
fact, we show that a natural supermajority relation obtains a $\Theta\left(\log^2 n\right)$ energy savings in
both the bounded and exact failure models, which is asymptotically the maximum possible
savings for any relation that does not require circuits of superlinear size. Formally, we have
the following theorem.
Theorem 68. Suppose $\delta = 1/n^c$ for some constant $c > 0$. Then there is a relation that
can be computed by a heterogeneous circuit using $O(n)$ energy, but for homogeneous circuits
requires $\Omega\left(n \log^2 n\right)$ energy.
We cite the following general theorem proved by Pippenger in [25] and formalized by
Gacs in [17] that will be useful in our construction in this section of the paper.
Theorem 69. There is an $\varepsilon_0 > 0$ such that for any $\varepsilon < \varepsilon_0$, $\delta \ge 3\varepsilon$, and any function $f$
computable by a (faultless) circuit of size $s$, there is a $(1 - \delta)$-reliable circuit computing $f$
of size $O(s \log(s/\delta))$ when gates fail with probability at most $\varepsilon$.
The following relation is quite natural. The relation outputs the majority if at least 3/4
of the bits are the majority, and otherwise we do not care about the output.
Definition 70. The Supermajority Relation (SR) is the following Boolean relation:
\[ SR(x) = \begin{cases}
0, & \text{if the number of 0's in } x \text{ is at least } 3n/4, \\
1, & \text{if the number of 1's in } x \text{ is at least } 3n/4, \text{ and} \\
0 \text{ and } 1 & \text{otherwise,}
\end{cases} \]
where $x$ is the input and $|x| = n$.
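Because SR is a relation rather than a function, it is natural to represent its value as the set of acceptable outputs. The following minimal sketch does exactly that; the function name and the set-valued encoding are illustrative choices, not notation from the text.

```python
def supermajority_allowed_outputs(x):
    """Return the set of outputs that SR accepts on the bit sequence x."""
    n = len(x)
    ones = sum(x)
    zeros = n - ones
    # Integer comparisons avoid floating-point issues with 3n/4.
    if 4 * zeros >= 3 * n:
        return {0}
    if 4 * ones >= 3 * n:
        return {1}
    return {0, 1}  # no supermajority: either output is acceptable


if __name__ == "__main__":
    print(supermajority_allowed_outputs([1, 1, 1, 0]))
    print(supermajority_allowed_outputs([1, 1, 0, 0]))
```

A circuit computes SR correctly on $x$ whenever its output lies in the returned set, which is why inputs without a 3/4 supermajority impose no constraint.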
Lemma 71. When $\delta = 1/n^c$, for some constant $c > 0$, SR can be computed by a circuit
with heterogeneous voltages using $O(n)$ energy.
Proof. For simplicity, we assume $n = 2^k - 1$ for some positive integer $k$. Throughout this
proof, we consider only the case when the input has $m \ge 3n/4$ 1's, as the case when it has
at least $3n/4$ 0's is symmetric. Let $\varepsilon = \min\{\varepsilon_0, 1/488\}$. The high level idea is that, in a way
similar to Pippenger's technique in Theorem 4.1 of [25], we add increasing redundancy to a
standard majority circuit so that failures become increasingly rare as we traverse down the
circuit, and close to the end of the circuit we switch to the standard majority circuit, and
set the failure rate to be sufficiently low. The majority circuit we modify is a tree of 1-bit
full adders; see Figure 9. The majority circuit is composed of $\log n$ levels, where level 1 is
the level that takes the input bits as input. When $n = 2^k - 1$ for some positive integer $k$,
then $y_{\log n}$ is the output bit of the circuit (in general, the output will be some function of
$y_1, \ldots, y_{\log n}$ that is computable by a circuit of size $o(n)$). As an adder is composed of five
gates (see Figure 10), for simplicity we think of an adder as a component that fails with
probability at most $5\varepsilon$ (i.e., it fails when at least one of its gates fails).
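The five-gate adder mentioned above can be written out directly. The sketch below uses the standard two-XOR, two-AND, one-OR realization of a 1-bit full adder, which matches the gate list in Figure 10; the intermediate variable names are illustrative.

```python
def full_adder(x, y, c_in):
    """1-bit full adder built from five two-input gates.

    Returns (s, c_out) with x + y + c_in == s + 2 * c_out.
    """
    t = x ^ y            # XOR gate 1
    s = t ^ c_in         # XOR gate 2: sum bit
    a1 = x & y           # AND gate 1
    a2 = t & c_in        # AND gate 2
    c_out = a1 | a2      # OR gate: carry bit
    return s, c_out


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            for c_in in (0, 1):
                print(x, y, c_in, full_adder(x, y, c_in))
```

Since the adder has five gates, a union bound over them gives the failure probability of at most $5\varepsilon$ used in the analysis.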
[Figure 9 diagram: full adders (FA) arranged in levels over the inputs $x_1, \ldots, x_n$, with the levels producing the outputs $y_1, y_2, \ldots, y_{\log n}$.]
Figure 9: A tree of adders. The majority of $x_1, \ldots, x_n$ is a function of $y_1, \ldots, y_{\log n}$ that is
computable by a circuit of size $o(n)$.
We now describe the modified circuit, which consists of two distinct parts: one for levels
at most $\log(n)/5$, and the other for the remainder of the circuit (see footnote 1). We describe and analyze
the first part here, and describe and analyze the second part after that. The condition
needed by the second part of the circuit is that, with probability at least $1 - 1/(2n^c)$, the majority
of adder modules on level $\log(n)/5$ output a majority of 1's on their carry wires.
We replace each adder on level $k \le \log(n)/5$ with a level $k$ adder module. This module
has $2k - 1$ inputs for each wire in the original circuit coming from the previous level, as
well as $2k + 1$ inputs from the sum bit of the adder to its left, and outputs $2k + 1$ wires for
both the sum bit and carry bit output. The adder module consists of $2k + 1$ copies of the
following: to an adder, supply the fault-tolerant majority of the $2k - 1$ wires for each of
the two inputs from the previous level, and the fault-tolerant majority of the $2k + 1$ wires
of the sum bit input, and output the sum and carry bits. We say that an adder module
fails if the majority of its output wires do not contain the correct sum and carry bits based
on the majority of its input wires. Since each majority circuit can be done with $O(k)$ gates
without failures, by Theorem 69 there exist majority circuits of size $O(k \log(k/\varepsilon))$ that are
incorrect with probability at most $3\varepsilon$ when gates fail with probability $\varepsilon$. Since the
adder fails with probability at most $5\varepsilon$, each copy fails with probability at most $3 \cdot 3\varepsilon + 5\varepsilon = 14\varepsilon$
by the union bound, and the probability of failure of an adder module is at
most $p_k = 2^{2k+1}(14\varepsilon)^{k+1} = (56\varepsilon)^{k+1}/2 < 1/2^{3k+1}$, by our choice of $\varepsilon$.
Footnote 1: Another possible construction is to replace all adders with modules, and add a majority circuit at the
end that fails with very low probability. Although this alternate circuit construction could be considered
simpler, the analysis appears slightly more complicated.
[Figure 10 diagram: the full adder (FA) logic block with inputs $x$, $y$, $c_{in}$ and outputs $s$, $c_{out}$, realized with two XOR gates, two AND gates, and one OR gate.]
Figure 10: A 1-bit full adder: logic block (left) and circuit realization (right).
Each adder module thus consists of $O(k^2 \log(k/\varepsilon))$ gates, and so the total number of
gates in the first part of the circuit is
\[ O\left(\sum_{k=1}^{\log(n)/5} \frac{n}{2^k} k^2 \log(k/\varepsilon)\right) = O(n), \]
since $\varepsilon$ is a constant. This also implies that the energy used by the first part of the circuit
is $O(n)$, since $\varepsilon$ is a constant.
It remains to show that the condition needed for the second part of the circuit holds,
namely that, with probability at least $1 - 1/(2n^c)$, the majority of adder modules on level $\log(n)/5$
output a majority of 1's on their carry wires. Let $m_k$ be the number of adder modules at
level $k$ that output a majority of 1's on their carry wires. We show that at level $k$, with high
probability,
\[ m_k \ge \frac{n}{2^k}\left(\frac{3}{4} - \frac{1}{4}\sum_{i=1}^{k}\frac{1}{2^i}\right). \]
We have the following observation.
Observation 72. For any fixed input to level $k$, suppose some subset $S$ of the adder modules
fail in any arbitrary way. Let $\tilde{m}_k$ be the number of adder modules at level $k$ that output a
majority of 1's on their carry wires, if every module in $S$ failed such that both the sum wires
and the carry wires had a majority of 0's. Then
1. $\tilde{m}_k \le m_k$.
2. $\tilde{m}_k \ge \lfloor (m_{k-1} - 3|S|)/2 \rfloor$.
Suppose there are $\ell_k$ failures on level $k$. The above observation allows us to conclude that
$m_k \ge \lfloor (m_{k-1} - 3\ell_k)/2 \rfloor$, and so the result follows by induction if we can show $\ell_k \le \frac{n}{2^k} \cdot \frac{1}{3 \cdot 2^{k+1}} - 1$
with sufficiently high probability.
The base case, level 0 (i.e., the carry wires are the input bits), is obvious. For the
inductive step, we will use a Chernoff bound to show that $\ell_k$ is sufficiently small with high
probability. The expected number of adder modules that fail is $\mu_k \le p_k n/2^k \le
n/2^{4k+1}$. By a standard Chernoff bound, since modules fail independently, the probability
that at least $n/2^{4k} \le \frac{n}{2^k} \cdot \frac{1}{3 \cdot 2^{k+1}} - 1$ modules fail is at most
\[ \Pr[\ell_k \ge 2\mu_k] \le \exp\left(-\frac{1}{3} \cdot \frac{n}{2^{4k+1}}\right) \le \exp\left(-\frac{n^{1/5}}{6}\right) \]
since $k \le \log(n)/5$. By the union bound, the probability that $m_{\log(n)/5} < n^{4/5}/2$ is at most
$\log(n) \exp\left(-n^{1/5}/6\right)/5 < 1/(2n^c)$ for $n$ large enough.
It remains to describe and analyze the second part of the circuit. From the first part, we
receive $n^{4/5}$ sets of $2\log(n)/5 + 1$ wires, and, with probability at least $1 - 1/(2n^c)$, at least half of
these sets carry a majority of 1's. The main idea is to set the failure rate low enough so that,
with sufficiently high probability, no gate fails, to take the majority of each set of input wires,
and then to compute the majority of the values obtained. For each set of wires, we first compute
the majority of them, using the standard majority circuit. Since there are $2\log(n)/5 + 1$
wires, this can be done with a circuit of size $O(\log n)$, so in total these majority circuits
use $O\left(n^{4/5} \log n\right)$ gates. Let $z_1, \ldots, z_{n^{4/5}}$ be the results of this. We then use the standard
majority circuit to compute the majority of $z_1, \ldots, z_{n^{4/5}}$, which uses $O\left(n^{4/5}\right)$ gates. Thus
the total number of gates in this part of the circuit is $O\left(n^{4/5} \log n\right) = o(n)$. We set the failure rate
in this part of the circuit to be $1/(2n^{c+2})$, so since there are only $o(n)$ gates, the probability
that even one gate fails is at most $1/(2n^c)$. The energy used by this part of the circuit is
$O\left(n^{4/5} \log^3 n\right) = o(n)$.
We can now prove our main theorem, which is straightforward given the previous lemma.
Proof of Theorem 68. By Lemma 71, SR can be computed by a heterogeneous circuit that
uses $O(n)$ energy. It remains to show that any homogeneous circuit computing SR uses
$\Omega\left(n \log^2 n\right)$ energy. Note that by Lemma 57, gates in any homogeneous circuit computing
SR cannot fail with probability more than $\delta$, and since $\delta = 1/n^c$, the energy used by each gate
must be at least $\Omega\left(\log^2 n\right)$. Additionally, it is obvious that any circuit correctly computing
SR must have gates connected to at least half the inputs, and so any circuit computing SR
using gates of constant fan-in must have $\Omega(n)$ gates. Therefore, any homogeneous circuit
computing SR must use $\Omega\left(n \log^2 n\right)$ energy.
7.4 GENERALIZING THE FAILURE-TO-ENERGY FUNCTION
Throughout this dissertation we have assumed the failure-to-energy function is $E(\varepsilon) =
\Theta\left(\log^2(1/\varepsilon)\right)$, based on what appears to be the case in the current technology. However,
since this may not be the exact function, or the technology may change, it is important to
note that our results hold for more general classes of failure-to-energy functions. In this
section, for each result in this chapter, we describe the classes of failure-to-energy functions
for which the result holds as it is written in the paper (it is also likely that some results,
using different proofs, hold for larger classes of functions than those described here). We
begin with a definition of one general class of failure-to-energy functions.
Definition 73. A failure-to-energy function $E$ is called non-vanishing if it is nonincreasing
and $\lim_{\varepsilon \to 1/2^-} E(\varepsilon) > 0$.
For Theorem 63, in order for the given heterogeneous construction to save energy over
the given homogeneous lower bound, we need $E$ to be non-vanishing, $E(\varepsilon) = \omega(\log(1/\varepsilon))$,
and for some constant $c > 0$, $\log(n) E\left(1/n^{c+2}\right) = o\left(n E(1/n^c)\right)$.
For Theorem 66, we require $E$ to be non-vanishing, $E(\varepsilon) = \Omega(\log(1/\varepsilon))$, and $E(\varepsilon_1) +
E(\varepsilon_2) \ge 2E\left(\sqrt{\varepsilon_1 \varepsilon_2}\right)$.
For Theorem 68, in order for the given heterogeneous construction to save the most
possible over the given homogeneous lower bound, we need $E$ to be non-vanishing, and for
some constant $c > 0$, $E\left(1/(2n^{c+2})\right) = O\left(n^{1/5}/\log(n)\right)$.
8.0 CONCLUSION
We have initiated the theoretical study of energy-efficient circuits, and provided the first
results in this area. Leveraging the previous work on fault-tolerant circuit design, we first
showed general upper bounds, in terms of circuit size, and lower bounds, in terms of sensitivity,
on the energy required to compute a function, when the reliability parameter δ is a
fixed constant. Using these results, we showed that when δ is a fixed constant, there is a natural
class of functions for which the optimal circuit with heterogeneous supply voltages is not
asymptotically more energy-efficient than the optimal circuit with homogeneous supply voltages,
indicating that in this setting, allowing heterogeneous supply voltages does not
always asymptotically save energy over homogeneous supply voltages. However, we showed
that for a specific supermajority relation, and a natural circuit for that relation, a heterogeneous
setting of the supply voltages does yield an asymptotic decrease over the energy used
by the best homogeneous voltage setting of that circuit, indicating that for fixed, natural
circuits, heterogeneous supply voltages may provide an energy savings.
We also considered the complexity of minimizing the energy used by a fixed circuit. We
showed that the traditional approach of increasing the voltage of a circuit until, with
sufficiently high probability, no gate in the circuit fails is a $\log^2 n$ approximation algorithm for
this problem. We also showed that it is NP-Hard to approximate the minimum energy of a
circuit to within a factor significantly less than this, indicating that there is a complexity-
theoretic barrier to reducing circuit energy beyond the traditional approach in general. We
additionally proved this hardness in a second failure model, indicating that these results are
not model specific. We showed that for tree circuits, it is possible to determine in polynomial
time the probability that a circuit will output correctly, providing some evidence that it may
be possible to bypass these hardness results by considering specific families of circuits.
Our next results considered the amount of energy required to compute an average Boolean
function. In the bounded failure model, it is straightforward that almost all functions require
exponential energy, as this is simply a corollary of Shannon’s classic result that almost all
functions require exponential circuit size. In the exact failure model, we found that a single
circuit with homogeneous supply voltages can compute a logarithmic number of functions,
and a single circuit with heterogeneous supply voltages can compute an exponential number
of functions, showing that, in this model, directly applying Shannon’s technique will not
work as a single circuit no longer computes a single function. Despite this, we were able
to show a sufficiently small upper bound on the number of functions a single circuit can
compute using both homogeneous and heterogeneous supply voltages. With this in hand,
we showed that almost all functions require exponentially many gates, and thus almost all
functions require exponential energy.
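A rough sketch of the counting argument summarized above, in the style of Shannon's bound, may help make the role of the per-circuit function count concrete. The symbols b (number of gate types in the basis), s (number of fan-in-2 gates), c (a constant), and F(s) (an upper bound on the number of functions a single s-gate circuit can compute over all supply voltage settings) are notation assumed here for illustration and are not taken from Chapter 6.

```latex
% Number of distinct circuits with at most s gates on n inputs
% (smaller circuits are padded with unused gates; each gate picks
% one of b types and two predecessors among the s gates and n inputs):
\[
  N(s) \;\le\; \bigl(b\,(s+n)^{2}\bigr)^{s}.
\]
% If each circuit computes at most F(s) functions, the fraction of the
% 2^{2^n} Boolean functions on n inputs computable with s gates is at most
\[
  \frac{F(s)\,N(s)}{2^{2^{n}}}
  \;\le\; F(s)\cdot 2^{\,s\,(\log_2 b + 2\log_2(s+n)) \,-\, 2^{n}}.
\]
% For s = 2^n/(cn) with c >= 3 and n large, we have s + n <= 2^n, so
\[
  s\,\bigl(\log_2 b + 2\log_2(s+n)\bigr)
  \;\le\; \frac{2^{n}}{cn}\,\bigl(2n + \log_2 b\bigr)
  \;=\; \frac{2}{c}\,2^{n}\,\bigl(1+o(1)\bigr),
\]
% and the fraction tends to 0 whenever \log_2 F(s) = o(2^n).
```

Setting F(s) = 1 recovers the classical situation in which a single circuit computes a single function. Under these assumptions, the argument goes through whenever the number of functions per circuit grows slowly enough that log_2 F(s) = o(2^n).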
We also considered the minimum energy to compute functions and relations when δ must
vanish as the number of inputs to the function to be computed increases. We showed that
the minimum energy required to compute many functions is a factor of log n less when het-
erogeneous supply voltages are allowed, and thus heterogeneous supply voltages can provide
significant energy savings over homogeneous supply voltages. We then showed that this energy
savings is tight for functions with small circuits that do not have degenerate input bits. We
additionally showed that a natural supermajority relation can bypass this bound, i.e., the
minimum energy circuit with heterogeneous supply voltages uses a factor of log^2 n less energy than the minimum
energy circuit with homogeneous supply voltages, and thus relations may potentially obtain
greater energy savings via heterogeneous supply voltages than functions.
8.1 OPEN PROBLEMS
There are a number of different, interesting research lines in this area. Here we present those
we think are most interesting.
8.1.1 Solving the Minimum Circuit Energy Problem for Restricted Classes of
Circuits
Recall that in Chapter 5, we showed that for an arbitrary circuit, it is NP-Hard to approximate
the solution to MCE within a factor of log^{2−γ} n for any γ > 0. However, we were
able to give a polynomial time algorithm for determining whether or not a tree circuit is
(ε, δ)-reliable. This seems to indicate that solving or approximating MCE may be tractable
on subclasses of circuits.
A starting point for this would be tree circuits. The main challenge for tree circuits is
that, as we showed in Chapter 5, the probability that a circuit is correct does not monotonically
decrease with ε, and thus one cannot simply binary search over ε and use the algorithm
we provided for determining if the circuit is (ε, δ)-correct. Understanding the relationship
between ε and the extreme points of the function mapping ε to circuit reliability seems
to be the key to solving MCE on trees. Trees seem like the easiest case, and are thus a
natural starting point. Other, less restrictive, classes of underlying graphs may also make
MCE easier, for example circuits whose underlying graph is series-parallel, or has bounded
treewidth.
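As a minimal sketch of why tree circuits are a tractable starting point, the following Python code computes, for a fixed input, the probability that a tree circuit outputs the correct value. It assumes a failure model in which every gate independently flips its output with probability exactly eps; this model, the tuple encoding of circuits, and all function names are assumptions made for illustration. Because subtrees of a tree share no gates, their outputs are independent and the probability can be propagated bottom-up in time linear in the circuit size. The brute-force minimum over all inputs is exponential in the number of inputs and is not the polynomial time algorithm from Chapter 5.

```python
from itertools import product

# Hypothetical encoding: a node is ("in", i) for input bit i,
# or (op, child, ...) with op in {"not", "and", "or"}.

def ideal(node, x):
    """Output of the node on input x when no gate fails."""
    kind = node[0]
    if kind == "in":
        return x[node[1]]
    vals = [ideal(child, x) for child in node[1:]]
    if kind == "not":
        return 1 - vals[0]
    if kind == "and":
        return int(all(vals))
    if kind == "or":
        return int(any(vals))
    raise ValueError(f"unknown gate type: {kind}")

def prob_one(node, x, eps):
    """Pr[node outputs 1] when each gate flips independently w.p. eps."""
    kind = node[0]
    if kind == "in":
        return float(x[node[1]])  # inputs are assumed noiseless
    ps = [prob_one(child, x, eps) for child in node[1:]]
    if kind == "not":
        q = 1.0 - ps[0]
    elif kind == "and":
        q = 1.0
        for p in ps:
            q *= p  # subtrees share no gates, so they are independent
    elif kind == "or":
        q = 1.0
        for p in ps:
            q *= 1.0 - p
        q = 1.0 - q
    else:
        raise ValueError(f"unknown gate type: {kind}")
    # q is Pr[the fault-free gate would output 1]; the gate then flips w.p. eps.
    return q * (1.0 - eps) + (1.0 - q) * eps

def correctness(node, x, eps):
    """Probability that the circuit outputs the fault-free value on x."""
    p1 = prob_one(node, x, eps)
    return p1 if ideal(node, x) == 1 else 1.0 - p1

def worst_case_correctness(node, n, eps):
    """Minimum correctness over all 2^n inputs (brute force, illustration only)."""
    return min(correctness(node, x, eps) for x in product((0, 1), repeat=n))
```

For the toy chain `("not", ("not", ("in", 0)))`, `correctness` evaluates to (1 − eps)^2 + eps^2, which decreases until eps = 1/2 and increases afterwards. This is only a small illustration of how correctness can fail to be monotone in the gate failure probability under the assumed model, not the example used in Chapter 5.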
8.1.2 Whether Heterogeneity Reduces Energy When δ is a Fixed Constant
We showed in Chapter 4 that when δ is a fixed constant, some functions, in particular the
parity function, and in fact any function with sensitivity Θ(n) that has circuits of size Θ(n),
do not benefit from heterogeneous supply voltages by more than a constant. It remains open
whether or not there exists any function or relation that obtains a non-constant benefit from
heterogeneous supply voltages when δ is a fixed constant.
If heterogeneous supply voltages can provide a super-constant energy savings when δ is a
constant, it is clear that our techniques from Chapter 7 will not apply, as the energy savings
obtained there was in terms of the energy required for a gate to fail with probability at
most δ, which is Θ(1) if δ is Θ(1). On the other hand, if reduced energy cannot be obtained
via heterogeneous supply voltages, the lower bound techniques from Chapter 3 do not seem
very promising, as there does not seem to be a way to apply them to obtain a lower bound
greater than O(n log n), and thus they could not be used on functions that require circuits
of polynomial size in general. The most promising direction for solving this problem seems
to be local replacement, i.e., take an arbitrary heterogeneous circuit computing a function,
and replace each gate in the circuit with a gadget using homogeneous supply voltages, such
that the result still computes the function with only constant increase in energy.
8.1.3 Whether log n Energy Savings via Heterogeneity is the Maximum Possible
When δ Vanishes
We showed in Chapter 7 that when δ is polynomial in 1/n, a log n energy savings by using
circuits with heterogeneous supply voltages is the maximum possible for functions with
circuits of size Θ(n) that have Θ(n) non-degenerate input bits. Is it possible this log n
energy savings is the maximum possible for all functions? It may be that the answer to
the previously stated open question, in which δ is a fixed constant, provides enough insight to
answer this question.
8.1.4 The Power of the Exact Failure Model
One of the difficulties in proving our main result in Chapter 6 that almost all functions
require exponential energy was that, in the exact failure model, a single circuit could compute
multiple functions. In fact, we were able to show circuits that could compute a logarithmic
number of functions if allowed homogeneous voltage supplies, and an exponential number of
functions if allowed heterogeneous voltage supplies. However, we do not yet have an example
where the circuit designer can use the failures to his advantage to compute functions using
less energy. More precisely, are there functions that can be computed with less energy,
perhaps even asymptotically, in the exact failure model than in the bounded failure model?
Proving the existence of such functions would likely have far-reaching effects in theoretical
computer science, as determining whether or not randomness can reduce circuit size remains
open, and the problem of reducing the energy needed for computation is closely related; thus
this problem may be untouchable given the current state of mathematical knowledge. Still,
whether or not it is possible to obtain reduced energy usage in the exact failure model would
be quite interesting, so we present two approaches to possibly answering this question.
First, it may be that the exact failure model allows randomization to be introduced into
the circuit, thereby allowing the circuit to be smaller. Gaining an asymptotic decrease in
circuit size this way seems difficult given the current state of knowledge of circuits, and in
particular the fact that proving a superlinear lower bound on the size of circuits for any
function is a longstanding open problem. A constant decrease in circuit size, and therefore
energy, may be more attainable, though it is still likely very difficult.
Another approach to this problem that is subtly different would be to find a function
where instead of using randomization to create a smaller circuit, the circuit computes the
function when there are no failures, and also when it has a fixed, high failure rate, but if
the failure rate is set adversarially, the circuit does not compute the function (i.e., there is
some “intermediate” setting of the voltages that causes the circuit to fail, due to the non-
monotonic relationship between voltage and circuit correctness). It is not clear whether or
not this approach is more tractable than the approach above.
87
BIBLIOGRAPHY
[1] Jaume Abella, Javier Carretero, Pedro Chaparro, Xavier Vera, and Antonio Gonza´lez.
Low vccmin fault-tolerant cache with highly predictable performance. In Proceedings
of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pages
111–121. ACM, 2009.
[2] Noga Alon. Tools from higher algebra. In R. L. Graham, M. Gro¨tschel, and L. Lova´sz,
editors, Handbook of Combinatorics, volume 2, pages 1749–1783. MIT Press, 1995.
[3] Amin Ansari, Shantanu Gupta, Shuguang Feng, and Scott Mahlke. Zerehcache: Ar-
moring cache architectures in high defect density technologies. In Microarchitecture,
2009. MICRO-42. 42nd Annual IEEE/ACM International Symposium on, pages 100–
110. IEEE, 2009.
[4] Gary Anthes. Inexact design: beyond fault-tolerance. Communications of the ACM, 56
(4):18–20, 2013. ISSN 0001-0782. doi: 10.1145/2436256.2436262. URL http://doi.
acm.org/10.1145/2436256.2436262.
[5] Antonios Antoniadis, Neal Barcelo, Michael Nugent, Kirk Pruhs, and Michele Scquiz-
zato. Complexity-theoretic obstacles to achieving energy savings with near-threshold
computing. In 5th International Green Computing Conference. IEEE, 2014.
[6] Antonios Antoniadis, Neal Barcelo, Michael Nugent, Kirk Pruhs, and Michele Scquiz-
zato. Energy-efficient circuit design. In Proceedings of the 5th conference on Innovations
in Theoretical Computer Science (ITCS), pages 303–312. ACM, 2014.
[7] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy.
Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–
555, 1998.
[8] Neal Barcelo, Michael Nugent, Kirk Pruhs, and Michele Scquizzato. Almost all func-
tions require exponential energy. In 40th International Symposium on Mathematical
Foundations of Computer Science, 2015, in submission.
[9] Neal Barcelo, Michael Nugent, Kirk Pruhs, and Michele Scquizzato. The power of hetero-
geneity in near-threshold computing. In 6th International Green Computing Conference.
IEEE, 2015, to be submitted.
88
[10] J.A. Butts and G.S. Sohi. A static power model for architects. In Proceedings of
the 33rd annual ACM/IEEE International Symposium on Microarchitecture (MICRO),
pages 191–201, 2000.
[11] B.H. Calhoun and A.P. Chandrakasan. A 256-kb 65-nm sub-threshold SRAM design for
ultra-low-voltage operation. IEEE Journal of Solid-State Circuits, 42(3):680–688, 2007.
[12] Leland Chang, Yutaka Nakamura, Robert K Montoye, Jun Sawada, Andrew K Martin,
Kiyofumi Kinoshita, Fadi H Gebara, Kanak B Agarwal, Dhruva J Acharyya, Wilfried
Haensch, et al. A 5.3 ghz 8t-sram with operation down to 0.41 v in 65nm cmos. In
VLSI Circuits, 2007 IEEE Symposium on, pages 252–253. IEEE, 2007.
[13] Zeshan Chishti, Alaa R Alameldeen, Chris Wilkerson, Wei Wu, and Shih-Lien Lu. Im-
proving cache lifetime reliability at ultra-low voltages. In Proceedings of the 42nd Annual
IEEE/ACM International Symposium on Microarchitecture, pages 89–99. ACM, 2009.
[14] R. L. Dobrushin and S. I. Ortyukov. Lower bound for the redundancy of self-correcting
arrangements of unreliable functional elements. Problems of Information Transmission,
13:59–65, 1977.
[15] R. L. Dobrushin and S. I. Ortyukov. Upper bound for the redundancy of self-correcting
arrangements of unreliable functional elements. Problems of Information Transmission,
13:203–218, 1977.
[16] Ronald G. Dreslinski, Michael Wieckowski, David Blaauw, Dennis Sylvester, and
Trevor N. Mudge. Near-threshold computing: Reclaiming Moore’s law through energy
efficient integrated circuits. Proceedings of the IEEE, 98(2):253–266, 2010.
[17] Pe´ter Ga´cs. Algorithms in Informatics, volume 2, chapter Reliable Computation. ELTE
Eo¨tvo¨s Kiado´, Budapest, 2005. Electronic version also in English: http://www.cs.bu.
edu/faculty/gacs/papers/iv-eng.pdf.
[18] Pe´ter Ga´cs and Anna Ga´l. Lower bounds for the complexity of reliable boolean circuits
with noisy gates. IEEE Transactions on Information Theory, 40(2):579–583, 1994.
[19] Johan H˚astad. Some optimal inapproximability results. Journal of the ACM, 48(4):
798–859, 2001. ISSN 0004-5411. doi: 10.1145/502090.502098. URL http://doi.acm.
org/10.1145/502090.502098.
[20] Walid Ibrahim and Valeriu Beiu. Reliability of NAND-2 CMOS gates from threshold
voltage variations. In Proceedings of the International Conference on Innovations in
Information Technology (IIT), pages 310–314, 2009.
[21] Jangwoo Kim, Nikos Hardavellas, Ken Mai, Babak Falsafi, and James Hoe. Multi-
bit error tolerant caches using two-dimensional error coding. In Proceedings of the
40th Annual IEEE/ACM International Symposium on Microarchitecture, pages 197–
209. IEEE Computer Society, 2007.
89
[22] Timothy N Miller, Renji Thomas, James Dinan, Bruce Adcock, and Radu Teodorescu.
Parichute: Generalized turbocode-based error correction for near-threshold caches. In
Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microar-
chitecture, pages 351–362. IEEE Computer Society, 2010.
[23] Ravi Montenegro and Prasad Tetali. Mathematical aspects of mixing times in markov
chains. Foundations and Trends in Theoretical Computer Science, 1(3), 2005.
[24] Krishna V. Palem. Energy aware computing through probabilistic switching: A study
of limits. IEEE Trans. Computers, 54(9):1123–1137, 2005.
[25] Nicholas Pippenger. On networks of noisy gates. In Proceedings of the 26th Annual
Symposium on Foundations of Computer Science (FOCS), pages 30–38, 1985.
[26] Nicholas Pippenger, George D. Stamoulis, and John N. Tsitsiklis. On a lower bound for
the redundancy of reliable networks with noisy gates. IEEE Transactions on Information
Theory, 37(3):639–643, 1991.
[27] Ru¨diger Reischuk and Bernd Schmeltz. Reliable computation with noisy circuits and
decision trees–A general n log n lower bound. In Proceedings of the 32nd Annual Sym-
posium on Foundations of Computer Science (FOCS), pages 602–611, 1991.
[28] Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell Systems
Technical Journal, 28:59–98, 1949.
[29] Leslie G. Valiant. Short monotone formulae for the majority function. Journal of
Algorithms, 5(3):363–366, 1984.
[30] John von Neumann. Probabilistic logics and the synthesis of reliable organisms from
unreliable components. In C. E. Shannon and J. McCarthy, editors, Automata Studies,
pages 329–378. Princeton University Press, 1956.
[31] Chris Wilkerson, Honglliiang Gao, Alaa R Alameldeen, Zeshan Chishti, Muhammad M
Khellah, and Shiih-Liien Lu. Trading off cache capacity for reliability to enable low volt-
age operation. In Computer Architecture, 2008. ISCA’08. 35th International Symposium
on, pages 203–214. IEEE, 2008.
[32] Gulay Yalcin, Azam Seyedi, Osman S Unsal, and Adrian Cristal. Flexicache: Highly
reliable and low power cache under supply voltage scaling. In High Performance Com-
puting, pages 173–190. Springer, 2014.
90
