Satisfiability and Derandomization for Small Polynomial Threshold Circuits by Kabanets, Valentine & Lu, Zhenjian
Satisfiability and Derandomization for Small
Polynomial Threshold Circuits
Valentine Kabanets
School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
kabanets@sfu.ca
Zhenjian Lu
School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
zla54@sfu.ca
Abstract
A polynomial threshold function (PTF) is defined as the sign of a polynomial p : {0, 1}^n → R.
A PTF circuit is a Boolean circuit whose gates are PTFs. We study the problems of exact and
(promise) approximate counting for PTF circuits of constant depth.
Satisfiability (#SAT). We give the first zero-error randomized algorithm faster than ex-
haustive search that counts the number of satisfying assignments of a given constant-depth
circuit with a super-linear number of wires whose gates are s-sparse PTFs, for s almost
quadratic in the input size of the circuit; here a PTF is called s-sparse if its underlying
polynomial has at most s monomials. More specifically, we show that, for any large enough
constant c, given a depth-d circuit with (n^{2−1/c})-sparse PTF gates that has at most n^{1+ε_d}
wires, where ε_d depends only on c and d, the number of satisfying assignments of the circuit
can be computed in randomized time 2^{n−n^{ε_d}} with zero error. This generalizes the result by
Chen, Santhanam and Srinivasan (CCC, 2016) who gave a SAT algorithm for constant-depth
circuits of super-linear wire complexity with linear threshold function (LTF) gates only.
Quantified derandomization. The quantified derandomization problem, introduced by
Goldreich and Wigderson (STOC, 2014), asks to compute the majority value of a given
Boolean circuit, under the promise that the minority-value inputs to the circuit are very
few. We give a quantified derandomization algorithm for constant-depth PTF circuits with a
super-linear number of wires that runs in quasi-polynomial time. More specifically, we show
that for any sufficiently large constant c, there is an algorithm that, given a degree-∆ PTF
circuit C of depth d with n^{1+1/c^d} wires such that C has at most 2^{n^{1−1/c}} minority-value inputs,
runs in quasi-polynomial time 2^{(log n)^{O(∆^2)}} and determines the majority value of C.
(We obtain a similar quantified derandomization result for PTF circuits with n∆-sparse PTF
gates.) This extends the recent result of Tell (STOC, 2018) for constant-depth LTF circuits
of super-linear wire complexity.
Pseudorandom generators. We show how the classical Nisan-Wigderson (NW) generator
(JCSS, 1994) yields a nontrivial pseudorandom generator for PTF circuits (of unrestricted
depth) with sub-linearly many gates. As a corollary, we get a PRG for degree-∆ PTFs with





2012 ACM Subject Classification Theory of computation → Circuit complexity
Keywords and phrases constant-depth circuits, polynomial threshold functions, circuit analysis
algorithms, SAT, derandomization, quantified derandomization, pseudorandom generators.
Digital Object Identifier 10.4230/LIPIcs.APPROX-RANDOM.2018.46
Related Version A full version of this paper is available at https://eccc.weizmann.ac.il/
report/2018/115 as ECCC TR18-115.
Acknowledgements We thank Suguru Tamaki for suggesting to us to consider sparse PTFs in
the case of satisfiability, and for clarifying the MAX-k-SAT algorithm in [20] for us.
© Valentine Kabanets and Zhenjian Lu;
licensed under Creative Commons License CC-BY
Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
(APPROX/RANDOM 2018).
Editors: Eric Blais, Klaus Jansen, José D. P. Rolim, and David Steurer; Article No. 46; pp. 46:1–46:19
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
1 Introduction
Satisfiability and derandomization are famous examples of “circuit analysis” problems that,
apart from being important algorithmic problems in their own right, are also intimately
related to the notoriously difficult problem of proving circuit lower bounds. In this paper,
we give several algorithmic results for these problems for the class of Boolean circuits with
polynomial-threshold functions (PTFs) as gates.
Circuit-SAT. Circuit-SAT asks to determine whether a given Boolean circuit has a satisfying
assignment. As a canonical NP-complete problem, it is not believed to have a polynomial-time
(or subexponential-time) algorithm. However, it is still very interesting to look for nontrivial
algorithms for Circuit-SAT running faster than naive exhaustive search. More specifically,
given a circuit of polynomial size on n variables, is there a satisfiability algorithm that runs
in time at most 2^n/n^{ω(1)}?
It turns out that this task is challenging even for very restricted classes of circuits. The
difficulty of obtaining such a SAT algorithm can be partially explained by the work of
Williams [26, 27] showing that a Circuit-SAT algorithm faster than exhaustive search for
a given class of circuits can often be used to prove nontrivial circuit lower bounds against
that same class of circuits (given that the class of circuits satisfies some mild conditions).
In fact, Williams designed such a Circuit-SAT algorithm for ACC circuits (constant-depth
circuits with AND, OR, NOT, and modular counting gates) that runs in time 2^{n−n^{1/exp(d)}}
(recently improved to 2^{n−n^{1/poly(d)}} by [6]), where d is the depth of the circuit, and then used
this algorithm to show that NEXP contains a language that is not computable by any family
of polynomial-size constant-depth ACC circuits, a breakthrough result in circuit complexity.
Given the connections between nontrivial Circuit-SAT algorithms and circuit lower bounds,
one of the next big goals in circuit complexity is to design such an algorithm for the class of
TC0 circuits, constant-depth circuits with majority gates. Proving lower bounds against the class
of polynomial-size TC0 circuits is currently one of the most important open problems in
complexity.
Derandomization. A central problem in derandomization is to give an efficient deterministic
algorithm for computing the majority value of a given Boolean circuit, under the promise
that the fraction of minority-value inputs to the circuit is at most 1/3. That is, given a
circuit that outputs some unknown value b ∈ {0, 1} on all but at most 1/3 fraction of inputs,
we need to determine this majority value b efficiently and deterministically.
As for Circuit-SAT, it is also known that a “faster-than-brute-force” algorithm solving
the aforementioned derandomization problem for a circuit class C (satisfying some mild
conditions) implies lower bounds against that class C [26].
Black-Box Derandomization: Pseudorandom generators. One way to solve the deran-
domization problem for a class C of circuits is to construct a pseudorandom generator (PRG)
for C. A PRG for a class C of n-input Boolean circuits is an efficiently deterministically
computable function G mapping short binary strings (seeds) to longer binary strings so
that every C ∈ C accepts G’s output on a uniformly random seed with about the same
probability as that for an actual uniformly random string. More precisely, we say that
a generator G : {0, 1}^r → {0, 1}^n is ε-fooling for a class C of Boolean circuits if for every
C : {0, 1}^n → {0, 1} from C, |Pr[C(G(x)) = 1] − Pr[C(y) = 1]| ≤ ε, for uniformly random
x ∈ {0, 1}^r and y ∈ {0, 1}^n. The parameter r is called the seed length of the PRG. Then
given a PRG that fools C, for every C ∈ C, we can estimate the fraction of accepted inputs to
within an additive error ε, by trying all possible seeds. This gives a deterministic algorithm
solving the derandomization problem in time approximately 2^r.
Note that a PRG yields black-box derandomization in the sense that we do not need to
be given as input a circuit C ∈ C in order to decide the set of 2^r query points for C; the set
of 2^r query points is the same for all circuits in class C.
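The ε-fooling condition and the seed-enumeration argument above can be checked by brute force on tiny instances. The following is a minimal sketch only: the generator `toy_generator` and the two test circuits are hypothetical illustrations, not constructions from this paper, and the enumeration takes time 2^r + 2^n.

```python
from itertools import product

def acceptance_probability(circuit, n):
    """Exact Pr_y[circuit(y) = 1] for uniform y in {0,1}^n, by enumeration."""
    return sum(circuit(y) for y in product((0, 1), repeat=n)) / 2 ** n

def seed_average(circuit, generator, r):
    """Pr_x[circuit(G(x)) = 1] for uniform x in {0,1}^r, trying all 2^r seeds."""
    return sum(circuit(generator(x)) for x in product((0, 1), repeat=r)) / 2 ** r

def fooling_error(circuit, generator, r, n):
    """|Pr[C(G(x)) = 1] - Pr[C(y) = 1]|; G is eps-fooling for C iff this is <= eps."""
    return abs(seed_average(circuit, generator, r) - acceptance_probability(circuit, n))

def toy_generator(x):
    """Hypothetical generator {0,1}^r -> {0,1}^(r+1): append the parity of the seed."""
    return tuple(x) + (sum(x) % 2,)

def first_bit(y):
    """A circuit that only reads its first input bit."""
    return y[0]

def parity(y):
    """A circuit computing the parity of all input bits."""
    return sum(y) % 2

if __name__ == "__main__":
    r, n = 3, 4
    print(fooling_error(first_bit, toy_generator, r, n))  # 0.0
    print(fooling_error(parity, toy_generator, r, n))     # 0.5
```

The toy generator fools the single-bit circuit perfectly but fails against parity, since every output string has even parity. This illustrates why a PRG is always stated with respect to a specific class of circuits.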
Quantified derandomization. As standard derandomization appears difficult even for weak
circuit classes, one considers relaxations. One relaxation is to assume that a given n-input
circuit C outputs an unknown value b ∈ {0, 1} on all but “very few” inputs, e.g., 2^n/n^{ω(1)}
inputs rather than 2^n/3 in the case of standard derandomization. Goldreich and Wigderson [8]
named this a quantified derandomization problem. More formally, for a class C of circuits, and
a function B : N → N, the (C, B)-quantified derandomization problem is the following: given
a circuit C ∈ C such that C has at most B(n) minority-value inputs in {0, 1}^n, determine
the majority value b ∈ {0, 1} for C.
It was immediately observed by [8] that for “sufficiently powerful” circuit classes (e.g.,
AC0[⊕], polynomial-size constant-depth circuits with unbounded fan-in AND, OR, parity
gates, and negation gates), quantified derandomization is equivalent to standard derandom-
ization, as one can perform efficient pseudo-random sampling (via randomness extractors)
within the same circuit class. Thus, quantified derandomization may be possible to achieve
(given our current knowledge) only for “very weak” circuit classes. [8] gave quantified deran-
domization algorithms for AC0 (later strengthened by [22]) and some other classes. Recently,
Tell [24] showed that quantified derandomization is also possible for constant-depth LTF
circuits of small super-linear wire complexity (and that improving this to slightly higher
super-linear wire complexity is as hard as getting nontrivial standard derandomization for
the circuit class TC0, which in turn would imply TC0 circuit lower bounds).
PTF circuits. The focus of the present paper is on circuits whose gates are polynomial
threshold functions. An n-variate polynomial threshold function (PTF) is defined as the
sign sgn(p) of a multi-linear polynomial p : {0, 1}^n → R. Here, for v ∈ R, we define the
sign function sgn(v) to be 1 on v > 0, and 0 on v < 0. There are two common complexity
measures for PTFs: degree, which is the degree of p, and sparsity, which is the number of
monomials in p, where a monomial is of the form ∏_{i∈S} (x_i ⊕ b_i), where S ⊆ [n] and b_i ∈ {0, 1}
for each i ∈ S. We call the PTF s-sparse if p(x_1, . . . , x_n) is the sum of at most s monomials.
PTFs of degree 1 are called linear threshold functions (LTFs). Thus an s-sparse PTF can be
equivalently defined as an LTF of at most s terms, where each term is an AND of literals
(variables and their negations).
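To make the view of an s-sparse PTF as an LTF of ANDs of literals concrete, here is a minimal sketch. The representation (a list of weighted monomials, each monomial a list of (index, b) pairs) and all function names are assumptions made for illustration. The text leaves sgn undefined at 0, so the sketch raises an error in that case.

```python
from typing import List, Sequence, Tuple

# A monomial is a list of (i, b) pairs and stands for prod_{i in S} (x_i XOR b_i).
# With b = 0 the factor is the literal x_i; with b = 1 it is the negation of x_i.
Monomial = List[Tuple[int, int]]
Polynomial = List[Tuple[int, Monomial]]  # (weight, monomial) pairs

def eval_monomial(mono: Monomial, x: Sequence[int]) -> int:
    """AND of literals: 1 iff every factor (x_i XOR b_i) equals 1."""
    return int(all((x[i] ^ b) == 1 for i, b in mono))

def eval_polynomial(terms: Polynomial, x: Sequence[int]) -> int:
    """Weighted sum of the monomials, i.e. the value p(x)."""
    return sum(w * eval_monomial(mono, x) for w, mono in terms)

def sparse_ptf(terms: Polynomial, x: Sequence[int]) -> int:
    """sgn(p(x)): 1 if p(x) > 0 and 0 if p(x) < 0."""
    value = eval_polynomial(terms, x)
    if value == 0:
        raise ValueError("sgn is not defined at 0 in this sketch")
    return 1 if value > 0 else 0

def sparsity(terms: Polynomial) -> int:
    return len(terms)

def degree(terms: Polynomial) -> int:
    return max((len(mono) for _, mono in terms), default=0)

# Example: p(x) = 2*x0*x1 + 2*(NOT x2) - 3, a 3-sparse, degree-2 polynomial.
# sgn(p) equals x0 AND x1 AND (NOT x2).
EXAMPLE: Polynomial = [(2, [(0, 0), (1, 0)]), (2, [(2, 1)]), (-3, [])]

if __name__ == "__main__":
    print(sparse_ptf(EXAMPLE, (1, 1, 0)))  # 1
    print(sparse_ptf(EXAMPLE, (1, 1, 1)))  # 0
```

The empty monomial plays the role of the constant term, so the threshold of the LTF is folded into the list of terms.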
Polynomial threshold circuits are circuits whose gates are PTFs. We will study both
circuits with low-degree PTF gates and circuits with sparse PTF gates. We call a circuit a
degree-∆ PTF circuit if its gates are degree-∆ PTFs. Similarly, a circuit is called an s-sparse
PTF circuit if its gates are s-sparse PTFs. We note that when discussing circuits, the word
“sparse” is often used to describe circuits with a small number of wires (recall that the number
of wires is the sum of fan-ins over all gates of the circuit). To avoid ambiguity, we clarify
that in this paper the word “sparse” always refers to PTFs. For example, a sub-quadratically
sparse PTF circuit means a circuit with gates that are sub-quadratically sparse PTFs (i.e.,
PTFs that have a sub-quadratic number of monomials).
1.1 Our results
Circuit-SAT for sub-quadratically sparse PTF circuits with n^{1+ε} wires. PTFs are very
powerful even for small sparsity. For example, s-sparse PTFs can encode MAX-SAT with
s clauses and exponential weights, a problem for which nontrivial algorithms are known only for a
sub-quadratic number of clauses. Therefore, a nontrivial SAT algorithm for PTFs of quadratic
sparsity would break the current barrier of solving MAX-SAT with exponential weights. In
fact, since a polynomial of degree 2 has at most a quadratic number of monomials, such an
algorithm would also give a nontrivial SAT algorithm for degree-2 PTFs, which is currently
unknown (see footnote 1).
We give the first nontrivial #SAT algorithms (counting the number of satisfying assign-
ments of a given circuit) for the class of constant-depth circuits with PTF gates, where the
PTF circuit has small super-linear wire complexity (defined as the sum of fan-ins over all
gates of the circuit) and each PTF gate has sub-quadratic sparsity. Our main result is the
following.
Theorem 1 (#SAT algorithm for sub-quadratically sparse PTF circuits). There is a constant
b_1 > 1 such that, for every c ≥ b_1 and d > 0, there is a zero-error randomized algorithm that
counts the number of satisfying assignments of any given depth-d, n-variate circuit with
(n^{2−1/c})-sparse PTF gates, and
at most n^{1+ε_d} wires.
The running time of this #SAT algorithm is at most 2^{n−n^{ε_d}}, where ε_d = c^{−3^d}.
We also get an algorithm with better parameters if we further assume that the sparse PTF
gates in the circuit have low degree. Let G_{∆,c} denote the class of Boolean functions where
each function can be computed as an LTF of at most n^{2−1/(c·∆^2)} arbitrary ∆-variate Boolean
functions.
Theorem 2 (#SAT algorithm for sub-quadratically sparse PTF circuits with low degree).
There exists a constant b_2 > 1 such that, for every d, ∆ > 0 and c ≥ b_2, there is a zero-error
randomized algorithm that counts the number of satisfying assignments of any given depth-d,
n-variate circuit with
gates from G_{∆,c}, and
at most n^{1+ε_{d,∆}} wires.
The running time of this #SAT algorithm is at most 2n−n




Quantified derandomization for PTF circuits with n1+ε wires in quasi-polynomial time.
Theorem 3 (Quantified derandomization for low-degree (or sparse) PTF circuits). For any
constant c ≥ 122 and any ∆, d > 0 such that ∆ ≪ √(log n/(c^d · log log n)), let C = C(n, d, ∆, c)
be the class of n-variate, depth-d PTF circuits with
degree-∆ PTF gates (or n∆/cd-sparse PTF gates), and







-quantified derandomization problem is solvable in time 2^{(log n)^{O(∆^2)}}.
1 Sakai, Seto, Tamaki and Teruyama [20] recently reported a faster-than-brute-force algorithm for MAX-
k-SAT for any constant k with arbitrary weights (which implies a satisfiability algorithm for degree-k
PTFs). However, their algorithm is conditional in that it relies on an assumption that one can efficiently
reduce the weights of a given n-variate LTF to integral weights of magnitude at most 2^{O(n log n)}. While
it is known that such small weights exist for every LTF [17], it is currently not known how to find them
efficiently.
PRG for PTF circuits with few gates.
Theorem 4 (PRG for PTF Circuits). There exists a constant E > 0 such that the following
holds. For any positive integers α and ∆, let C = C(n, α,∆) be the class of degree-∆ PTF




E · 5α·∆ · log2(n) · log(n/ε)
)
gates. There exists
a poly(n)-time computable PRG G : {0, 1}^r → {0, 1}^n ε-fooling C, where the seed length is
r = n^{2/(α+1)}.
We get the following PRG for a single PTF (by setting α appropriately).
Corollary 5 (PRG for PTFs). There exists a PRG G : {0, 1}^r → {0, 1}^n, computable in









1.2 Our techniques
A common way to analyze constant-depth circuits is to apply (random) restrictions, getting
some depth reduction, and iterate, until the resulting circuit becomes very simple. Our #SAT
algorithms and quantified derandomization algorithms for constant-depth PTF circuits also
follow this approach, mainly relying on the ideas of [5] for depth reduction, and [11] for
(pseudo-) random restrictions for PTFs. Our PRG is based on the celebrated Nisan-Wigderson
“hardness-based” generator (NW PRG) [19]. We give more details in the following sections
as we discuss each of the results.
1.3 Related work and comparison
Circuit Satisfiability. Impagliazzo, Paturi and Schneider [10] gave a Circuit-SAT algorithm
for depth-2 LTF circuits with few wires; this result was improved by Chen and Santhanam [4].
Recently, Alman, Chan and Williams [1] and Tamaki [21] both gave Circuit-SAT algorithms
for depth-2 LTF circuits with an almost quadratic number of gates.
The most closely related previous work is by Chen, Santhanam and Srinivasan [5] who
gave a Circuit-SAT algorithm for circuits with a super-linear number of wires whose gates are
LTFs. In particular, they show that the satisfiability of a depth-d, n-variate circuit with LTF
gates and at most n1+εd wires can be solved by a zero-error randomized algorithm in time
2n−nεd , where εd = c−d for some constant c. Our results extend their algorithm to the more
general case of circuits with sparse PTF gates. In particular, our algorithm in Theorem 2
for ∆ = 1 subsumes the Circuit-SAT algorithm for LTFs in [5]. Also note that the sparsity
of the PTF gates in our model is almost quadratic in n, which is the input size of the circuit.
“Opening up” the PTF gates in the circuit and expressing them as LTFs of terms will result
in a (constant-depth) LTF circuit that can have an almost quadratic number of wires, and
such a circuit cannot be analyzed by the result in [5].
Quantified derandomization. The quantified derandomization problem was first introduced
by Goldreich and Wigderson in [8], where they obtained a polynomial time algorithm that
finds the majority output of a given AC0 circuit that has at most 2^{n^{0.999}} minority-value inputs.
The key tool in their algorithm is a derandomized version of Håstad’s switching lemma [9]
with logarithmic seed length. In addition, they obtain quantified derandomization results for
log-space algorithms and arithmetic circuits. The quantified derandomization algorithm for
AC0 was generalized by Tell [22] to handle AC0 circuits with at most 2^{Ω(n/log^{d−2} n)}
minority-value inputs, where d is the depth, with an increase of the running time to 2^{Õ(log^3 n)}. As
mentioned above, Tell [24] has recently obtained a quantified derandomization algorithm
for depth-d LTF circuits with n^{1+1/exp(d)} wires and at most 2^{n^{1−1/(5d)}} minority-value inputs,
running in time n^{(log log n)^2}. Our result extends this to low-degree PTF circuits and sparse
PTF circuits, at the expense of increasing the running time to quasi-polynomial (for constant
degree and polynomial sparsity). For the results on reducing standard derandomization to
quantified derandomization, see [8, 22, 23, 24].
PRGs. There has been a long sequence of works on constructing PRGs (of varying strength)
for various sub-classes of P/poly. Among these known PRG constructions, some are NW-
style “hardness-based” generators, while others are ad hoc constructions (often using such
standard pseudorandomness tools as hashing, limited-wise independence, expander graphs,
etc.) The previous PRGs for PTFs due to [16, 13] are of the latter kind. The construction
uses hashing and limited-wise independence. The analysis is quite involved, and depends
on a number of analytic tools for polynomials (concentration and anti-concentration results,
the invariance principle, hypercontractivity, regularization, etc.). In contrast, our PRG for
PTFs (of Corollary 5) is the NW-style construction, whose analysis is simple, assuming an
average-case lower bound for an appropriate class of functions.
For constant degree PTFs and constant error ε, the PRG of [16, 13] has exponential
stretch (mapping a seed of length O(logn) to an n-bit string fooling n-input PTFs). However,
these PRGs have polynomial dependence on 1/ε and cannot handle small error. Our
PRG cannot achieve such exponentially long stretch for constant error, but it can achieve
even exponentially small error ε with a nontrivial (sub-linear) seed size, which is impossible
for the PRGs of [16, 13].
In their work studying correlation bounds for AC0 circuits with few symmetric gates [14],
Lovett and Srinivasan obtained an average-case hard function for constant depth poly-size
AC0 circuits with few LTF gates and used it to construct a PRG fooling such circuits with
polynomial stretch and exponentially small error, also based on the generic construction of
Nisan and Wigderson. Since a PTF can be viewed as a depth-2 circuit computing an LTF of
ANDs, such a PRG also fools small PTF circuits. While the PRG in [14] can fool a more
general model, which is constant-depth AC0 circuits augmented with LTF gates, it can have
only polynomial seed stretch and the circuit can have only constant depth. Our work here
focuses on circuits with only PTF gates. Our PRG can have sub-polynomial seed length and
it can fool PTF circuits regardless of the depth as long as the number of gates is small. In
particular, our PRG for a single PTF with sub-polynomial seed length (Corollary 5) can be
used to construct a PRG for degree-2 PTFs with a seed length that is logarithmic in the
input size and sub-polynomial in the error (see [12]).
Threshold circuits. It is well known that the class of constant-depth polynomial-size TC0
circuits is equivalent to the class of constant-depth polynomial-size circuits with LTF gates [7].
LTF circuits have been intensively studied in complexity theory. PTF circuits have been
previously studied for lower bounds [18, 11]. Threshold circuits are also studied as a model
of artificial neural networks [15] (see also [2]), where a threshold gate is also called a neuron.
Remainder of the paper. We give the necessary background in Section 2. We prove our
satisfiability algorithm (Theorem 1) in Section 3. We describe our quantified derandomization result for
low-degree PTF circuits in Section 4, and our PRG result in Section 5. We conclude with
some open problems in Section 6. For lack of space, we omit many proofs in this extended
abstract and refer the reader to the full version for those proofs.
2 Preliminaries
2.1 Notation
For a positive integer n, let [n] denote the set {1, 2, . . . , n}. For a Boolean function
f : {0, 1}^n → {0, 1}, we define the majority value of f to be the bit value b ∈ {0, 1}
that maximizes the quantity Pr_{x∼{0,1}^n}[f(x) = b], and we call 1 − b the minority value. We
say that two Boolean functions f and g are δ-close if Pr_x[f(x) ≠ g(x)] ≤ δ. We say that a
function f is δ-close to an explicit constant if f is δ-close to some constant function and such
a constant function can be efficiently determined from f .
We will often view an s-sparse PTF as an LTF of at most s AND gates. It is well known
that every LTF on m variables has a canonical representation, where the coefficients are
integers of magnitude at most 2^{O(m log m)} [17]. Therefore, every s-sparse PTF is equivalent to
some s-sparse PTF whose coefficients are integers of magnitude at most 2^{O(s log s)}. Without
loss of generality, for a circuit with s-sparse PTF gates, we assume the coefficients of all
gates have bit complexity poly(s).
2.2 Random restrictions
A random restriction is a process that randomly fixes the values of a subset of variables. We
will often view a random restriction as a two-step process: the first step is selecting (in some
random manner) a subset of unrestricted variables and the second step is fixing (in some
random manner) the values of all the other variables. Depending on different contexts, we
will consider different types of random restrictions based on how the unrestricted variables
are picked and how the restricted variables are fixed.
Truly random restriction. The first type is the (truly) r-random restriction, for a parameter
0 < r < 1. It is the process that leaves each variable, independently, free with probability r,
and otherwise assigns it 0 or 1 uniformly at random.
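The two-step view of a truly r-random restriction can be sketched as follows. This is an illustrative sketch only: the encoding of a restriction as a list with `None` for free variables, and the helper names, are assumptions rather than notation from the paper.

```python
import random

def random_restriction(n, r, rng):
    """Sample a truly r-random restriction on n variables.

    Each variable is left free (None) independently with probability r,
    and is otherwise fixed to a uniformly random bit.
    """
    rho = []
    for _ in range(n):
        if rng.random() < r:
            rho.append(None)              # variable stays unrestricted
        else:
            rho.append(rng.randrange(2))  # variable fixed to 0 or 1
    return rho

def restrict(f, rho):
    """Return the restricted function f_rho and the list of free indices.

    f_rho takes one bit per free variable, in increasing index order.
    """
    free = [i for i, v in enumerate(rho) if v is None]

    def f_rho(y):
        x = list(rho)
        for i, bit in zip(free, y):
            x[i] = bit
        return f(tuple(x))

    return f_rho, free

if __name__ == "__main__":
    rng = random.Random(0)
    rho = random_restriction(8, 0.25, rng)
    f_rho, free = restrict(lambda x: sum(x) % 2, rho)
    print(rho, free)
```

In expectation r · n variables stay free, so a small r shrinks the input dramatically while each gate only sees a random sub-cube of its original domain.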
Pseudorandom restriction using limited-wise independence. We can also pick and fix
variables in a pseudorandom manner. One way to do this is to use a limited-wise independent
distribution. For integers n, m > 0, a distribution X on [m]^n is called k-wise independent if
any k coordinates of X are uniformly distributed. That is, for any 1 ≤ i_1, . . . , i_k ≤ n and
every b_1, . . . , b_k ∈ [m], we have Pr[X_{i_1} = b_1, . . . , X_{i_k} = b_k] = m^{−k}. A k-wise independent
distribution over [m]^n can be constructed using k · log n random bits for m ≤ n (see, e.g., [25]).
For a random restriction ρ, we say that ρ picks the unrestricted variables k-wise independ-
ently, each with probability r, if each variable is set to be unrestricted by ρ with probability
r, and any k of the variables are independent. Note that this process can be done using a
k-wise independent distribution over [1/r]^n, where n is the number of variables. Also, we say
that ρ fixes the variables k-wise independently if each variable is assigned 0 or 1 uniformly
at random by ρ and any k of the variables are independent.
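One standard way to obtain a k-wise independent distribution is to evaluate a random polynomial of degree less than k over a prime field at n distinct points. The sketch below uses this textbook construction under simplifying assumptions that are not taken from the paper: the alphabet is {0, ..., p−1} for a prime p ≥ n, and the seed consists of k field elements (about k · log p random bits). The checker enumerates all seeds, so it is meant for tiny parameters only.

```python
from collections import Counter
from itertools import combinations, product

def kwise_sample(coeffs, n, p):
    """Evaluate the polynomial sum_j coeffs[j] * t^j (mod p) at t = 0, ..., n-1.

    Assumes p is prime and n <= p. For uniformly random coeffs of length k,
    the output is k-wise independent over {0, ..., p-1}^n.
    """
    assert n <= p, "need n distinct evaluation points in the field"
    out = []
    for point in range(n):
        value = 0
        for c in reversed(coeffs):  # Horner's rule
            value = (value * point + c) % p
        out.append(value)
    return tuple(out)

def check_kwise(n, p, k):
    """Verify k-wise independence exactly by enumerating all p^k seeds."""
    samples = [kwise_sample(c, n, p) for c in product(range(p), repeat=k)]
    for coords in combinations(range(n), k):
        counts = Counter(tuple(s[i] for i in coords) for s in samples)
        # Uniform on these k coordinates: every pattern occurs equally often.
        if len(counts) != p ** k or len(set(counts.values())) != 1:
            return False
    return True

if __name__ == "__main__":
    print(check_kwise(n=5, p=5, k=2))  # True
```

The check succeeds because, by polynomial interpolation, any k values at k distinct points are attained by exactly one polynomial of degree less than k.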
Random block restriction. A random block restriction picks the set of unrestricted variables
by picking a block from some arbitrary predetermined partition of variables. More formally,
an m-block random restriction for a function is the following process: given an arbitrary
partitioning of input variables into m disjoint blocks, a random m-block restriction picks a
uniformly random block ℓ ∈ [m] and fixes all variables outside the chosen block ℓ to 0 or 1
according to some distribution. Note that we can use a random block restriction to simulate
the first two types of (pseudo-)random restrictions above. For example, to simulate a truly
r-random restriction, we first randomly partition the variables into m = 1/r disjoint
blocks, where each variable is assigned to block i ∈ [m], independently, with probability 1/m.
Then we apply an m-block random restriction based on the partition in the previous step,
by fixing the variables outside the selected block uniformly at random. Similarly, to simulate
a pseudorandom restriction using limited-wise independence, we can partition (hash) the
variables into disjoint blocks limited-wise independently, and apply a random block restriction
where we fix the variables also using a limited-wise independent distribution.
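The simulation of a truly r-random restriction by a random block restriction can be sketched in a few lines. The function names and the encoding of a restriction (a list with `None` marking unrestricted variables, blocks numbered from 0) are assumptions made for illustration.

```python
import random

def random_partition(n, m, rng):
    """Assign each of n variables independently and uniformly to one of m blocks."""
    return [rng.randrange(m) for _ in range(n)]

def block_restriction(partition, m, rng):
    """Sample an m-block random restriction for a given partition.

    A uniformly random block ell stays unrestricted (None); every variable
    outside block ell is fixed to a uniformly random bit.
    """
    ell = rng.randrange(m)
    rho = [None if block == ell else rng.randrange(2) for block in partition]
    return ell, rho

def simulate_r_random_restriction(n, r, rng):
    """Simulate a truly r-random restriction with m = 1/r blocks (1/r an integer)."""
    m = round(1 / r)
    partition = random_partition(n, m, rng)
    return block_restriction(partition, m, rng)[1]

if __name__ == "__main__":
    rng = random.Random(1)
    print(simulate_r_random_restriction(12, 0.25, rng))
```

Each variable lands in the chosen block with probability 1/m = r, independently of the others, which matches the distribution of the truly r-random restriction.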
2.3 Useful tools for analyzing PTFs
Definition 6 (δ-concentrated PTFs). Let p : {0, 1}^n → R be a degree-∆ multi-linear
polynomial and f = sgn(p). For parameters 0 < δ ≤ 1/2 and λ ≥ 1, we call p (and f)




, where Exp and Var denote
the expectation and the variance, respectively, under the uniform distribution over {0, 1}^n.
We refer to (δ, 1)-concentrated polynomials as δ-concentrated.
A useful property of concentrated PTFs is that they are close to an explicit constant.
Lemma 7 (Concentrated implies close to constant). For any 0 < δ ≤ 1/2, if a PTF
f = sgn(p) is δ-concentrated, then f is δ-close to the constant function sgn (Exp[p]).
For a multi-linear polynomial p : {0, 1}^n → R, it is easy to see that the constant function
sgn(Exp[p]) from Lemma 7 is efficiently computable for a given polynomial p.
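One way to see this: under the uniform distribution, a monomial ∏_{i∈S} (x_i ⊕ b_i) over distinct variables has expectation 2^{−|S|}, so Exp[p] follows from the monomial list by linearity of expectation. The sketch below assumes a hypothetical representation of p as (integer weight, monomial) pairs, where a monomial is a list of (index, b) pairs with distinct indices. It computes only the candidate constant and does not check whether p is concentrated.

```python
from fractions import Fraction

def expectation(terms):
    """Exp[p] under the uniform distribution on {0,1}^n.

    terms is a list of (weight, monomial) pairs; a monomial is a list of
    (index, b) pairs with distinct indices, standing for
    prod_{i in S} (x_i XOR b_i), whose expectation is 2^(-|S|).
    """
    return sum((Fraction(w, 2 ** len(mono)) for w, mono in terms), Fraction(0))

def explicit_constant(terms):
    """The candidate constant sgn(Exp[p]) from Lemma 7."""
    e = expectation(terms)
    if e == 0:
        raise ValueError("sgn is not defined at 0 in this sketch")
    return 1 if e > 0 else 0

# Example: p(x) = 2*x0*x1 + 2*(NOT x2) - 3, so Exp[p] = 2/4 + 2/2 - 3 = -3/2.
EXAMPLE_TERMS = [(2, [(0, 0), (1, 0)]), (2, [(2, 1)]), (-3, [])]

if __name__ == "__main__":
    print(expectation(EXAMPLE_TERMS))        # -3/2
    print(explicit_constant(EXAMPLE_TERMS))  # 0
```

Exact rational arithmetic is used so that the sign is not affected by rounding. The running time is linear in the number of monomials.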
The following is a random restriction lemma for PTFs which says that a low-degree PTF
is likely to become concentrated under a (truly) random block restriction.
Lemma 8 (Random block restriction lemma [11]). For any 0 < δ < 1 and any positive
integers m, λ, let B_m be an m-block random restriction that fixes variables uniformly at random.
Then for any degree-∆ PTF f whose variables are partitioned into m blocks, we have
Pr_{ρ∼B_m}[f_ρ is not (δ, λ)-concentrated] ≤ m^{−1/2} · (log m · log(1/δ))^{O(λ·∆^2)}.
There is also a derandomized version of the above random block restriction lemma.
Lemma 9 (Pseudorandom block restriction lemma [11]). For any 0 < δ, γ < 1 and any
positive integers m, λ, there is a polynomial-time algorithm for sampling an m-block random
restriction B′_m, that uses at most m^γ · log n random bits, so that the following holds. For any
n-variate degree-∆ PTF f whose variables are partitioned into m blocks, we have
Pr_{ρ∼B′_m}[f_ρ is not (δ, λ)-concentrated] ≤ m^{−1/2} · (log m · log(1/δ))^{O(λ·∆^2/γ)}.
Moreover, B′_m fixes the variables (192 · ∆ · log(1/δ))-wise independently.
3 #SAT algorithm for PTF circuits
To get our Circuit-SAT algorithm for circuits with sparse PTF gates, we generalize the
analysis of the Circuit-SAT algorithm for small LTF circuits in [5]. An oversimplified
description is as follows. We show that for a depth-d circuit with a slightly super-linear
number of wires, whose gates are sparse PTFs, there exists a shallow decision tree such that,
for most of the leaves, the circuit restricted to that leaf can be “approximated” by some
depth-(d− 1) circuit. Then we recursively apply a Circuit-SAT algorithm to depth-(d− 1)
circuits. However, to actually implement this idea, we need three ingredients, which we
describe in detail below.
Satisfiability for conjunctions of sparse PTFs. First, we need a base-case algorithm. In
the case of LTF circuits in [5], the base case is a conjunction of LTFs, and there is a known
algorithm by Williams [28] for such circuits. In contrast, in our case, the base case is a
conjunction of sparse PTFs. Using the polynomial method in circuit complexity, we are able
to design a Circuit-SAT algorithm for such circuits. More specifically, the algorithm is based
on the framework for designing satisfiability algorithms developed by Williams [27, 28]. The
idea is to transform a given constant-depth circuit into a low-degree probabilistic polynomial
and solve satisfiability by evaluating the polynomial on all points in a faster-than-brute-force
manner. Applying this idea naively, we get a randomized SAT algorithm that may err.
Such a base-case algorithm would result in a final SAT algorithm for PTF circuits that
may also err. However, using some derandomization ideas similar to those in [3, 21],
we are able to obtain a deterministic base-case algorithm that can count the number of
satisfying assignments. This allows us to make our final SAT algorithm for PTF circuits a
zero-error randomized algorithm that counts the number of satisfying assignments.
Lemma 10. There exists a deterministic algorithm that counts the number of satisfying
assignments of every n-variate circuit C that is a conjunction of k s-sparse PTF gates, where







Depth reduction for sparse PTF circuits with few wires. Secondly, to construct the
aforementioned decision tree, we need a random restriction lemma showing that, under
a (truly) random restriction, a gate in the circuit is likely to be close to constant. More
specifically, Chen et al. [5] showed that using such a random restriction lemma, one can get
a shallow decision tree such that, restricted to most of the leaves, a circuit with few wires
will have many of its bottom-layer gates becoming close to constant, so that we can replace
them with actual constants, and the depth decreases by one if we further remove the rest of
the few bottom-layer gates that are not close to constant. In our case, we need a similar
restriction lemma for sparse PTFs. It is easy to see that a sparse PTF is likely to become
a low-degree PTF under a mild random restriction. Combining this observation with the





-sparse PTF f, Pr_{ρ∼R_r}[f_ρ is not δ-close to an explicit constant] ≤ r^{Ω(1)},
for some very small δ, where R_r denotes the truly r-random restriction. Using this structural
result for sparse PTFs and the idea in [5] (see Section 4.1.1 of [5]), we get the following.
Lemma 11. For any integer d ≥ 2 and any (log n)^{−1} ≪ ε < 1, let





C be any depth-d, n-variate, s-sparse PTF circuit with at most w = n^{1+ε} wires.
Then there exists a decision tree T of depth n − n^{1−β} such that, for a random leaf σ of T,
with probability at least 1 − exp(−n^ε), we have the following: C_σ is a depth-d circuit of wire
complexity at most w such that its bottom layer has at most n gates that are δ-close to an
explicit constant and at most n^β gates that are not δ-close to an explicit constant. Moreover,





Enumerating minority outputs of sparse PTFs. Finally, in our main algorithm, we will
need to apply the depth reduction lemma (Lemma 11) to the circuit to conclude that many
of the gates at the bottom layer will become close to constant so that we can replace them
with actual constants. This changes the function of the circuit and we need to deal with the
inputs where these gates do not evaluate to their majority values. This issue can be handled
if given a sparse PTF we can find the set of all inputs where it evaluates to its minority
value, in a relatively efficient way. As shown in [5], there is an efficient way to do this for
functions whose satisfiability can be decided in polynomial time, such as LTFs. However, we
cannot apply this for sparse PTFs since there is no known polynomial-time SAT algorithm for
sparse PTFs. We overcome this issue for sparse PTFs by reducing to the case of LTFs, using
the following observation from Chen and Santhanam [4], which says that for a collection
of sub-quadratically many monomials, there exists a decision tree with not too many leaves such
that, under each leaf, each of the monomials becomes a single literal. As a result, we get the
following way to enumerate the set of minority-value inputs of a sub-quadratically sparse
PTF in non-trivial time. (See Section A for the proof.)
▶ Lemma 12. Let $f : \{0,1\}^n \to \{0,1\}$ be an s-sparse PTF with coefficients of bit complexity
poly(n), where s ≥ n, and let S be the set of inputs on which f evaluates to 0 (or 1). Then S










and βd = E · εd, where E is a sufficiently
large constant. We can show the following.
▶ Theorem 13. For any integer d ≥ 1, the number of satisfying assignments of a depth-d,
n-variate circuit with $(n^{2-10\beta_2})$-sparse PTF gates and at most $n^{1+\varepsilon_d}$ wires can be computed
by a zero-error randomized algorithm in time $\mathrm{poly}(n) \cdot 2^{n - n^{\Omega(\beta_d^3)}}$.
▶ Definition 14 (Skew Circuits). We say that a circuit C is (d, n, t, s)-skew if it is an n-variate
circuit that can be expressed as a conjunction of some circuit C′ and at most t s-sparse
PTFs, where C′ is a depth-d circuit with s-sparse PTF gates and has at most $w = n^{1+\varepsilon_d}$
wires. We call C′ the skew subcircuit of C.
Let T(d, n, t, s) denote the supremum, over all (d, n, t, s)-skew circuits C, of the randomized
running time of counting the number of satisfying assignments of C.
Using the depth reduction lemma (Lemma 11) and the enumeration lemma (Lemma 12),
we can obtain the following lemma which says that we can reduce the task of counting
satisfying assignments of depth-d circuits to that of depth-(d− 1) circuits.
▶ Lemma 15. If $s \le n^{2-5\beta_d}$, then




$d-1,\ n^{1-2\beta_d},\ t+2n,\ s) + \mathrm{poly}(n) \cdot 2^{n - n^{\Omega(\beta_d^3)}}$.
The proof of the above lemma is similar to that in [5]. We give a detailed proof in Section B.
Given the recursion in Lemma 15, it is not difficult to prove Theorem 13 by solving the
recursion and using the SAT algorithm for conjunctions of sparse PTFs (Lemma 10) as the
base case. We refer the reader to the full version for the proof.
4 Quantified derandomization for PTF circuits
At a high level, our quantified derandomization algorithm follows the approach of [8]. Given
a circuit C with at most B minority-value inputs, we find a restriction ρ such that ρ leaves
a large number, say n′, of variables unrestricted, and such that $C_\rho$ is very close to some simple
function C̃ (say they agree on all but at most a 1/6 fraction of inputs). Then the number of
minority-value inputs for C̃ is at most $B + 2^{n'}/6$. If B is also at most $2^{n'}/6$, then we can
determine the required majority value for C by finding the majority value of C̃, which is a
simple function. This approach is also used by Tell [24] to get a quantified derandomization
algorithm for LTF circuits with a slightly super-linear number of wires.
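The counting step behind this approach can be written out explicitly. The following is only a sketch: v denotes the majority value of C (a symbol introduced here for illustration), and the 1/6 disagreement bound is the illustrative constant used above.

```latex
\begin{align*}
\#\{x : \tilde{C}(x) \neq v\}
  &\le \#\{x : C_\rho(x) \neq v\} + \#\{x : \tilde{C}(x) \neq C_\rho(x)\} \\
  &\le B + 2^{n'}/6 \\
  &\le 2^{n'}/6 + 2^{n'}/6 \;=\; 2^{n'}/3 \;<\; 2^{n'}/2 .
\end{align*}
```

Hence, under these assumptions, v is also the strict majority value of C̃ over its n′ unrestricted inputs.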
Let’s first consider a depth-2 LTF circuit with few wires. In [5], Chen, Santhanam and
Srinivasan proved a random restriction lemma for LTFs, which says that under a random
restriction, an LTF is likely to become very close to an explicit constant. Using this result,
one gets that under such a random restriction, many of the gates in the bottom layer of the
circuit are expected to become close to constants. Since the circuit has only a few wires,
one can further fix a small number of variables so that only those gates that are close to
constants are left. Finally, by replacing these gates with their majority values, we obtain a
single LTF that is close to the original depth-2 circuit.
Such a random restriction lemma was extended to low-degree PTFs in [11], so we can
conclude the same for low-degree PTF circuits. One important issue, though, is that the above
“depth reduction” argument only holds for random restrictions (but with high probability).
So to get quantified derandomization, one will need to consider all possible restrictions. To
handle this issue, Tell [24] derandomized the random restriction lemma for LTFs mentioned
above so that such a restriction can be sampled using few random bits. As a result, one only
needs to consider a much smaller sample space of restrictions.
Pseudorandom restrictions for PTFs. The pseudorandom restriction lemma for LTFs
in [24] is obtained using a PRG for LTFs. One way to extend it to PTFs is to use a PRG for
PTFs. However, unlike LTFs, for which a PRG with a very short seed is known, all known
PRGs for PTFs have a large seed length (for small error, which is needed for the argument).
In fact, the only PRG that we can use in this case is the one in Corollary 5, and it would give







quasi-polynomial running time, we use a powerful pseudorandom block restriction lemma for
PTFs (Lemma 9), which needs only a poly-logarithmic number of random bits, and convert
it into the following pseudorandom restriction lemma. (The proof is in Section C.)
▶ Lemma 16 (Pseudorandom restriction lemma for low-degree PTFs). For any constant c > 0,
any α < 1 and any positive integer ∆ such that $\Delta \ll \sqrt{\alpha \cdot \log n / \log\log n}$, there is a random
restriction R such that the following holds:
R picks the unrestricted variables (log n)-wise independently, each with probability $n^{-\alpha}$.
R fixes the variables (600 · c · ∆ · log n)-wise independently.
R can be sampled in polynomial time using $(\log n)^{O(\Delta^2)}$ random bits.
For any degree-∆ PTF f on n variables, $\Pr_{\rho \sim R}[f_\rho \text{ is not } (n^{-c}, 3)\text{-concentrated}] \le n^{-\alpha/3}$.
Bias preservation for PTFs. Now we need to apply the above idea to a depth-d circuit
C. It seems that all we need to do is to apply the pseudorandom restriction d − 1 times.
While this is true, the analysis is much more subtle. For example, after applying the first
pseudorandom restriction $\rho_1$, we get a new circuit C̃ of depth (d − 1) on some n′ variables
so that it agrees with $C_{\rho_1}$ on all but at most, say, $2^{n'}/6$ inputs. Now consider a subsequent
restriction ρ′. Note that the final number of unrestricted variables n′′ after ρ′ is much smaller
than n′. Therefore, $(C_{\rho_1})_{\rho'}$ and $\tilde{C}_{\rho'}$ can disagree on all the inputs (since $2^{n'}/6 \gg 2^{n''}$), so $\tilde{C}_{\rho'}$
cannot be used to determine the correct output of $(C_{\rho_1})_{\rho'}$, which is also the correct output
of C. This issue can be handled if those bottom-layer gates that become close to constant
after applying one step of pseudorandom restriction remain close to the same constant
under subsequent pseudorandom restrictions. Such a “bias preservation lemma” for LTFs is also
proved in [24], again using a PRG for LTFs. For PTFs, we use an observation in [11], which
says that a concentrated PTF is likely to remain concentrated under any random restriction
that fixes variables limited-wise independently.
▶ Lemma 17 (see Lemma 4.2 and Claim 7.7 of [11]). Let f = sgn(p) be any degree-∆ PTF
that is (δ, λ + 1)-concentrated. Let ρ be a random restriction that fixes any subset of variables
according to some (192 · ∆ · log(1/δ))-wise independent distribution. Then with probability at
least 1 − δ we have: (1) $f_\rho$ is (δ, λ)-concentrated, and (2) $\mathrm{sgn}(\mathrm{Exp}(p_\rho)) = \mathrm{sgn}(\mathrm{Exp}(p))$.
The above lemma means that if a PTF is (δ, 2)-concentrated, and hence close to some constant,
then the restricted PTF is likely to remain close to the same constant.
Quantified derandomization for low-degree PTF circuits. Let G be a class of Boolean
functions. We say that a circuit C is an (n, d, w, ∆, G)-low-degree PTF circuit if: (1) C is an
n-variate circuit of depth d with at most w wires, and (2) C has degree-∆ PTFs as its gates
except for the top gate, which is a function from G.
For a class of Boolean functions G, we denote by $\mathrm{Appr}_{n,\varepsilon}(G)$ the running time, given an
n-variate function g from G, of approximating the acceptance probability of g to within an
additive error ε. We can show the following.
▶ Theorem 18. For any constant E ≥ 11 and any positive integers ∆ and d such that
$\Delta \ll \sqrt{\varepsilon_d \cdot \log n / \log\log n}$, where $\varepsilon_d = E^{-2(d-1)}$, let C be the class of $(n, d, n^{1+\varepsilon_d}, \Delta, G)$-low-








Theorem 18 implies Theorem 3 for low-degree PTF circuits since we can always add a dummy
gate (e.g., AND) to the top of a PTF circuit; this only increases the depth by 1.
Theorem 18 is obtained by iteratively applying the pseudorandom restriction lemma
(Lemma 16) to reduce the depth of the circuit until the circuit has depth 1. The following
lemma shows how to do this in one step.
▶ Lemma 19. For any constants E ≥ 11, c > 0, any ε ≤ 1/(7E), and any positive integer
∆ such that $\Delta \ll \sqrt{E \cdot \varepsilon \cdot \log n / \log\log n}$, there is a polynomial-time algorithm that, given an
$(n, d, n^{1+\varepsilon}, \Delta, G)$-low-degree PTF circuit C and a random seed of length $(\log n)^{O(\Delta^2)}$, outputs
the following with probability at least 1− nε:
A restriction $\rho \in \{0, 1, *\}^n$ that leaves $n' = n^{1-3E\cdot\varepsilon}$ variables unrestricted and such that the
restricted variables are fixed (600 · c · ∆ · log n)-wise independently.
An $(n', d-1, (n')^{1+7E\cdot\varepsilon}, \Delta, G)$-sparse PTF circuit C̃ such that for every subsequent random
restriction ρ′ that fixes the variables in a (600 · c · ∆ · log n)-wise independent manner,
with probability $1 - n^{-c}$ over ρ′, it holds that $\tilde{C}_{\rho'}$ is $n^{-c}$-close to $(C_\rho)_{\rho'}$.
The proof of Lemma 19 is similar to that in [24], which is based on the argument in [5], but
requires some critical modifications. A proof sketch is given in Section D. For the proof of
Theorem 18, we refer the reader to the full version.
5 PRG for PTF circuits
In this section, we give a high-level description of our PRG for small PTF circuits. The
detailed proof is presented in Section E.
Our PRG is based on the Nisan-Wigderson generator (NW PRG) [19]. To fool a class C
of Boolean functions f , the NW PRG construction requires a “hard function” h that cannot
be computed correctly on significantly more than a half of all possible inputs by any Boolean
function g in a related class C̃ of “slightly more powerful” functions than those from C. Thus,
sufficiently strong average-case lower bounds against the class C̃ can be used to build a PRG
fooling the class C. In our case, the class C contains all those n-variate Boolean functions
that are computable by constant depth-d circuits with at most s n PTF gates of degree-∆.
Our main observation is that the corresponding class C̃ (for which we require average-case
lower bounds) is the class of Boolean functions computable by constant depth-d circuits with
at most s PTF gates of degree ∆′ = α ·∆, for some parameter α ≥ 1 that we can control
(and which will determine the seed size of our PRG). That is, the class C̃ is the same as C,
except for a somewhat higher degree ∆′ of the allowed PTF gates.
To illustrate the idea of our analysis of the NW PRG for PTF circuits, we consider the
special case of a single n-variate PTF f of degree ∆. That is, $f = \mathrm{sgn}(p(x_1, \dots, x_n))$ for
some degree-∆ multi-linear polynomial $p : \{0,1\}^n \to \mathbb{R}$. Suppose that the NW generator
based on some “hard” Boolean function h failed to ε-fool this PTF f. First, the standard NW
analysis shows that the function h(z) can be computed, with probability at least 1/2 + ε/n,
by (possibly the negation of) the function
$g(z) = f(h_1(z), h_2(z), \dots, h_i(z), b_{i+1}, \dots, b_n)$, (1)
for some 1 ≤ i ≤ n, fixed bits $b_{i+1}, \dots, b_n$, and Boolean functions $h_1, \dots, h_i$, where each
$h_j(z)$ depends on at most α bits of z, for a parameter α ≥ 1 coming from the NW
construction (the maximum overlap between pairs of sets in the NW design; see Section E
for details). It is well known that every Boolean function on α inputs can be written as a
multi-linear polynomial of degree at most α over the reals. Plugging in these polynomials for the
functions $h_j$ in Equation (1), we get that g(z) is a PTF of degree at most ∆′ = α · ∆. Hence,
to ensure that this NW generator based on h is indeed ε-fooling for degree-∆ PTFs, we just
need h to be such that no PTF of degree α · ∆ can compute h(z) on more than a 1/2 + ε/n
fraction of inputs z. Such hard functions h turn out to be easy to construct. For example, we use the
average-case hard function for low-degree PTF circuits due to Nisan [18].
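The fact that every Boolean function on α inputs is a multi-linear polynomial of degree at most α can be checked directly on small examples. The sketch below is illustrative only: the function names `multilinear_coeffs`, `evaluate` and `degree` are hypothetical, and the coefficients are obtained by standard Möbius inversion over subsets, which is one textbook way to compute them and not a procedure taken from the paper.

```python
from itertools import combinations


def multilinear_coeffs(h, k):
    """Return {S: c_S} with h(x) = sum_S c_S * prod_{i in S} x_i on {0,1}^k.

    Moebius inversion over subsets: c_S = sum_{T subset of S} (-1)^(|S|-|T|) * h(1_T),
    where 1_T is the indicator vector of T.
    """
    coeffs = {}
    for size in range(k + 1):
        for S in combinations(range(k), size):
            c = 0
            for tsize in range(size + 1):
                for T in combinations(S, tsize):
                    x = tuple(1 if i in T else 0 for i in range(k))
                    c += (-1) ** (size - tsize) * h(x)
            if c != 0:
                coeffs[frozenset(S)] = c
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the multi-linear polynomial at a 0/1 point x."""
    return sum(c for S, c in coeffs.items() if all(x[i] for i in S))


def degree(coeffs):
    """Largest monomial size; never exceeds the number of inputs k."""
    return max((len(S) for S in coeffs), default=0)


# Example: 3-bit parity, which needs the full degree 3.
parity3 = multilinear_coeffs(lambda x: x[0] ^ x[1] ^ x[2], 3)
```

Substituting such degree-α polynomials for the inputs of a degree-∆ polynomial multiplies at most ∆ of them together in each monomial, which is where the bound α · ∆ on the degree of g comes from.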
The parameters of our PRG G : {0, 1}r → {0, 1}n (its error ε and seed length r) depend
on the strength of the average-case lower bound for the hard function h. To get a short seed
r, one needs to maximize the aforementioned parameter α, ideally setting α = logn (as is
the case for a standard application of the NW construction). However, we also need to prove
(average-case) lower bounds against PTFs of degree α ·∆, where virtually nothing is known
for the degree log n. Thus we are forced to set $\alpha \ll \log n$, which limits the stretch of our
PRG to be at most only super-polynomial. On the other hand, for such a small α, our hard
function h has exponentially small correlation with degree-(α ·∆) PTFs, thereby allowing
our PRG to have an exponentially small error ε.
6 Open problems
An important open problem is to get a nontrivial Circuit-SAT algorithm for circuits with
degree-2 PTF gates. Our algorithm only works for the case where the PTF gates have a
sub-quadratic number of monomials, so it does not work for the degree-2 case in general. Such
an algorithm is not known even for a single degree-2 PTF. Another interesting open problem
is to derandomize our zero-error randomized algorithms to get deterministic #Circuit-SAT
algorithms of similar time complexity. Can we get any nontrivial standard derandomization
for constant-depth PTF (LTF) circuits of small wire complexity? For PRGs, can we get a
nontrivial PRG for depth-2 LTF circuits with a super-linear number of gates?
References
1 Josh Alman, Timothy M. Chan, and R. Ryan Williams. Polynomial representations of
threshold functions and algorithmic applications. In FOCS, pages 467–476, 2016.
2 Martin Anthony. Discrete Mathematics of Neural Networks: Selected Topics. SIAM monographs
on discrete mathematics and applications. Society for Industrial and Applied Mathematics,
Philadelphia, PA, 2001. doi:10.1137/1.9780898718539.
3 Timothy M. Chan and Ryan Williams. Deterministic APSP, orthogonal vectors, and more:
Quickly derandomizing Razborov-Smolensky. In SODA, pages 1246–1255, 2016.
4 Ruiwen Chen and Rahul Santhanam. Improved algorithms for sparse MAX-SAT and MAX-
k-CSP. In SAT, pages 33–45, 2015.
5 Ruiwen Chen, Rahul Santhanam, and Srikanth Srinivasan. Average-case lower bounds and
satisfiability algorithms for small threshold circuits. In CCC, pages 1:1–1:35, 2016.
6 Shiteng Chen and Periklis A. Papakonstantinou. Depth-reduction for composites. In FOCS,
pages 99–108, 2016.
7 Mikael Goldmann, Johan Håstad, and Alexander A. Razborov. Majority gates vs. general
weighted threshold gates. Computational Complexity, 2:277–300, 1992.
8 Oded Goldreich and Avi Wigderson. On derandomizing algorithms that err extremely rarely.
In STOC, pages 109–118, 2014.
9 Johan Håstad. Almost optimal lower bounds for small depth circuits. In S. Micali, editor,
Randomness and Computation, pages 143–170, Greenwich, Connecticut, 1989. Advances in
Computing Research, vol. 5, JAI Press.
10 Russell Impagliazzo, Ramamohan Paturi, and Stefan Schneider. A satisfiability algorithm
for sparse depth two threshold circuits. In FOCS, pages 479–488, 2013.
11 Valentine Kabanets, Daniel M. Kane, and Zhenjian Lu. A polynomial restriction lemma
with applications. In STOC, pages 615–628, 2017.
12 Daniel Kane and Sankeerth Rao. A PRG for Boolean PTF of degree 2 with seed length
subpolynomial in ε and logarithmic in n. In CCC, 2018.
13 Daniel M. Kane. A structure theorem for poorly anticoncentrated gaussian chaoses and
applications to the study of polynomial threshold functions. In FOCS, pages 91–100, 2012.
14 Shachar Lovett and Srikanth Srinivasan. Correlation bounds for poly-size AC0 circuits with
n1−o(1) symmetric gates. In APPROX/RANDOM, pages 640–651, 2011.
15 Warren S. McCulloch and Walter Pitts. A logical calculus of the ideas immanent in nervous
activity. Bulletin of Mathematical Biophysics, 5(4):115–133, 1943.
16 Raghu Meka and David Zuckerman. Pseudorandom generators for polynomial threshold
functions. SIAM J. Comput., 42(3):1275–1301, 2013.
17 Saburo Muroga, Iwao Toda, and Satoru Takasu. Theory of majority decision elements.
Journal of the Franklin Institute, 271:376–418, 1961.
18 Noam Nisan. The communication complexity of threshold gates. In Proceedings of
Combinatorics, Paul Erdős is Eighty, pages 301–315, 1994.
19 Noam Nisan and Avi Wigderson. Hardness vs randomness. J. Comput. Syst. Sci.,
49(2):149–167, 1994.
20 Takayuki Sakai, Kazuhisa Seto, Suguru Tamaki, and Junichi Teruyama. Bounded depth
circuits with weighted symmetric gates: Satisfiability, lower bounds and compression. In
MFCS, pages 82:1–82:16, 2016.
21 Suguru Tamaki. A satisfiability algorithm for depth two circuits with a sub-quadratic
number of symmetric and threshold gates. Electronic Colloquium on Computational Complexity
(ECCC), 23:100, 2016.
22 Roei Tell. Improved bounds for quantified derandomization of constant-depth circuits and
polynomials. In CCC, pages 13:1–13:48, 2017.
23 Roei Tell. A note on the limitations of two black-box techniques in quantified
derandomization. Electronic Colloquium on Computational Complexity (ECCC), 24:187, 2017.
24 Roei Tell. Quantified derandomization of linear threshold circuits. In STOC, 2018.
25 Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer
Science, 7(1-3):1–336, 2012.
26 Ryan Williams. Improving exhaustive search implies superpolynomial lower bounds. In
STOC, pages 231–240, 2010.
27 Ryan Williams. Non-uniform ACC circuit lower bounds. In CCC, pages 115–125, 2011.
28 Ryan Williams. New algorithms and lower bounds for circuits with linear threshold gates.
In STOC, pages 194–202, 2014.
A Enumerating minority-value inputs: proof of Lemma 12
We first need the following.
▶ Proposition 20 (see Section 4.1 of [4]). Let $\varphi_1, \dots, \varphi_s$ be a sequence of terms whose literals
are from a set of n variables, where s ≥ n. There exists a decision tree with at most $2^{n - \Omega(n^2/s)}$
leaves such that, restricted to each leaf of the tree, $\varphi_i$ contains at most 1 literal, for all i ∈ [s].
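To make the statement concrete, the following sketch builds a decision tree whose leaves leave at most one literal in every term. It is only an illustration of the object in Proposition 20: the greedy rule (branch on the variable occurring in the most terms that still have two or more literals) and the names `restrict` and `single_literal_leaves` are assumptions made here, not the procedure of [4], and no claim is made that this rule achieves the stated bound on the number of leaves.

```python
from collections import Counter


def restrict(term, assignment):
    """Restrict a term (dict: variable -> required bit) by a partial assignment.

    Returns None if the term is falsified, otherwise the dict of literals
    whose variables are still unassigned (empty dict = constant 1).
    """
    rest = {}
    for var, val in term.items():
        if var in assignment:
            if assignment[var] != val:
                return None
        else:
            rest[var] = val
    return rest


def single_literal_leaves(terms, assignment=None):
    """Yield partial assignments under which every term has <= 1 literal left."""
    assignment = assignment or {}
    counts = Counter()
    for term in terms:
        rest = restrict(term, assignment)
        if rest is not None and len(rest) >= 2:
            counts.update(rest.keys())
    if not counts:
        # Every term is constant or a single literal: this is a leaf.
        yield dict(assignment)
        return
    var = counts.most_common(1)[0][0]  # greedy choice (an assumption)
    for bit in (0, 1):
        yield from single_literal_leaves(terms, {**assignment, var: bit})


# Example: three terms over variables 0..3, e.g. x0 AND x1, x1 AND (NOT x2), ...
example_terms = [{0: 1, 1: 1}, {1: 1, 2: 0}, {0: 0, 2: 1, 3: 1}]
example_leaves = list(single_literal_leaves(example_terms))
```

Since the leaves come from a decision tree, the subcubes they describe partition {0, 1}^n, which is what allows the per-leaf counts in the proof below to be added up.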
Proof of Lemma 12. We view f as an LTF of s AND gates. By Proposition 20, there exists
a decision tree for f with at most $2^{n - \Omega(n^2/s)}$ leaves such that f restricted to each leaf is an
LTF. We then go through each leaf σ and enumerate the set of inputs on which $f_\sigma$ evaluates
to 0. Let $S_\sigma$ be the size of this set. For an LTF, this enumeration takes time $S_\sigma \cdot \mathrm{poly}(n)$
(see, e.g., Proposition 5.2 of [5]). The total running time is the time for going through the
leaves of the decision tree, which is at most $2^{n - \Omega(n^2/s)}$, plus the time to enumerate the set of
inputs evaluating to 0, which is at most $\sum_\sigma S_\sigma \cdot \mathrm{poly}(n) \le |S| \cdot \mathrm{poly}(n)$. ◀
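The per-leaf enumeration step for an LTF can be sketched as a pruned depth-first search. This is a minimal illustration under stated assumptions, not the algorithm of [5]: the LTF is taken to output 1 exactly when the weighted sum is at least the threshold, and the names `enumerate_zeros` and `brute_force_zeros` are hypothetical. A branch is explored only if some completion still evaluates to 0, so every explored branch leads to at least one output.

```python
from itertools import product


def enumerate_zeros(weights, theta):
    """Yield all x in {0,1}^n with sum(w_i * x_i) < theta (the LTF outputs 0)."""
    n = len(weights)
    # min_suffix[i] = smallest value the variables i..n-1 can contribute.
    min_suffix = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        min_suffix[i] = min_suffix[i + 1] + min(weights[i], 0)

    def walk(i, partial, prefix):
        if partial + min_suffix[i] >= theta:
            return  # no completion of this prefix gives output 0: prune
        if i == n:
            yield tuple(prefix)
            return
        for bit in (0, 1):
            prefix.append(bit)
            yield from walk(i + 1, partial + bit * weights[i], prefix)
            prefix.pop()

    yield from walk(0, 0, [])


def brute_force_zeros(weights, theta):
    """Reference implementation by exhaustive search, for cross-checking only."""
    return [x for x in product((0, 1), repeat=len(weights))
            if sum(w * b for w, b in zip(weights, x)) < theta]
```

Each reported input is reached through n non-pruned nodes, and each such node has at most two children, so the search visits on the order of n times the number of outputs (plus a constant when there are none), in line with the $S_\sigma \cdot \mathrm{poly}(n)$ bound used above.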
B Recursion for depth-d circuits: proof of Lemma 15
Let C be any (d, n, t, s)-skew circuit, where its skew subcircuit is C′. To count the number of
satisfying assignments of C, we first apply Lemma 11 to C′ to get a decision tree with the
claimed property. We then count the number of satisfying assignments at each leaf. For
those “bad” leaves for which the conditions in Lemma 11 are not satisfied, we will simply do
brute force on all $n^{1-2\beta_d}$ variables. The time to perform this is
2n−n




Next, consider a “good” leaf σ that satisfies the conditions in Lemma 11. We now describe
how to count the number of satisfying assignments of $C_\sigma$. We call a gate imbalanced if it
is δ-close to an explicit constant and balanced otherwise. Let $(g_1, \dots, g_\ell)$, with ℓ ≤ n, be the set of
imbalanced gates and $(a_1, \dots, a_\ell)$ be their majority values. Let $(h_1, \dots, h_t)$, with $t \le n^{\beta_d}$, be the set
of balanced gates.
We first count the number of satisfying assignments of $C_\sigma$ in the following subset of
inputs $S = \{x : \exists\, i \in [\ell] \text{ for which } g_i(x) \ne a_i\}$. To do so, for each of the imbalanced gates,
we enumerate the set of inputs on which it evaluates to its minority value, and keep those































so Equation (3) is at most
$\mathrm{poly}(n) \cdot 2^{n^{1-2\beta_d} - n^{\Omega(\beta_d^3)}}$. (4)
We do this for every imbalanced gate and obtain a set of satisfying inputs. In the end we
simply take the union of these sets to get the satisfying assignments in S.
Next, we count the number of satisfying assignments in $T = \{0,1\}^n - S$. Let $C'_{\sigma,a}$ be
the circuit with those imbalanced gates in $C'_\sigma$ replaced with their majority values (i.e., the
values given by $(a_1, \dots, a_\ell)$). Instead of counting the number of satisfying assignments for
the original circuit $C_\sigma$, we consider the following circuit:







It is easy to see that D(x) = 0 for every x ∈ S and $D(x) = C_\sigma(x)$ for every x ∈ T. We now
need to count the number of satisfying assignments of D. We first partition T into $2^t$ subsets,
each of which is indexed by some $b = (b_1, \dots, b_t) \in \{0,1\}^t$, where the subset $T_b$ given by the
index b is
$T_b = \{x : x \in T,\ h_1(x) = b_1, \dots, h_t(x) = b_t\}$.
To count the number of satisfying assignments of D in $T_b$, we consider the following circuit:







where $D_b$ is the circuit D with the balanced gates replaced by the values $b_1, \dots, b_t \in \{0,1\}$.
Again, we have $E_b(x) = 0$ for every $x \in \{0,1\}^n - T_b$ and $E_b(x) = D(x)$ for every $x \in T_b$. Now our
task is reduced to counting the number of satisfying assignments of $E_b$ for each $b \in \{0,1\}^t$.
But note that each $E_b$ is a conjunction of some depth-(d − 1) circuit (i.e., the skew subcircuit
of $E_b$) and k s-sparse PTFs, where $k = t + n + n^{\beta} \le t + 2n$. Also, the skew subcircuit has at






Therefore, each $E_b$ is a $(d-1, n^{1-2\beta_d}, t+2n, s)$-skew circuit, and its number of satisfying
assignments can be computed in time $T(d-1, n^{1-2\beta_d}, t+2n, s)$. Then the total time for
counting the number of satisfying assignments of the original circuit $C_\sigma$ in the subset T is
$2^t \cdot T(d-1, n^{1-2\beta_d}, t+2n, s)$. (5)
Therefore, by Equation (4) and Equation (5), counting the number of satisfying assign-





$d-1,\ n^{1-2\beta_d},\ t+2n,\ s)$. (6)
There are at most $L = 2^{n - n^{1-2\beta_d}}$ such leaves. Multiplying L by the running time in
Equation (6) and combining with Equation (2) yields the desired running time.
C Pseudorandom restriction lemma for PTFs: proof of Lemma 16
We define R by describing the following process of sampling a random restriction from R:





-block pseudorandom restriction from Lemma 9 for degree-∆ PTFs, with
parameters
$\delta = n^{-c}$ and λ = 3.
$\gamma = (c_1 \cdot \Delta^2 \cdot \log\log n)/(\alpha \cdot \log n)$, where $c_1$ is a sufficiently large constant (note that
γ < 1 for $\Delta \ll \sqrt{\alpha \cdot \log n / \log\log n}$).
We now argue that the random restriction R has the desired properties. For the first item,
it is easy to see from the above that R picks the set of unrestricted variables (logn)-wise
independently, each with probability 1/n−α/2. The second item follows from Lemma 9 that
the pseudorandom block restriction fixes the variables (600 · c ·∆ · logn)-wise independently.
For the third item, note that to sample from R, we need polylog(n) random bits for its first
step, and the number of random bits for its second step is
$n^{\alpha \cdot \gamma} \cdot \log n \le (\log n)^{O(\Delta^2)}$.
Finally, for the last item, note that in the above process of sampling R, for any partition
into $n^{\alpha}$ blocks generated in the first step, by Lemma 9, the probability over the restrictions
in the second step that the restricted PTF is not $(n^{-c}, 3)$-concentrated is at most
$n^{-\alpha/2} \cdot (\log n)^{O(\Delta^2/\gamma)}$. Thus,
$\Pr_\rho[f_\rho \text{ is not } (n^{-c}, 3)\text{-concentrated}] \le n^{-\alpha/2} \cdot (\log n)^{O(\Delta^2/\gamma)} \le n^{-\alpha/2} \cdot n^{\alpha/6} \le n^{-\alpha/3}$.
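The two estimates used above follow by substituting the choice of γ. The derivation below is a sketch that assumes logarithms are taken base 2 and that the constant $c_1$ is chosen large relative to the constant hidden in the O(·) of Lemma 9.

```latex
\begin{align*}
n^{\alpha\gamma}
  &= 2^{\alpha\gamma \log n}
   = 2^{c_1 \Delta^2 \log\log n}
   = (\log n)^{c_1 \Delta^2},
  &&\text{so } n^{\alpha\gamma} \cdot \log n \le (\log n)^{O(\Delta^2)}, \\
\frac{\Delta^2}{\gamma}
  &= \frac{\alpha \log n}{c_1 \log\log n},
  &&\text{so } (\log n)^{O(\Delta^2/\gamma)}
     = 2^{O(\alpha \log n / c_1)}
     = n^{O(\alpha/c_1)} \le n^{\alpha/6}.
\end{align*}
```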
D Depth reduction via pseudorandom restrictions: proof of
Lemma 19
Let β = E · ε and $p = n^{-\beta}$. The restriction ρ consists of three sub-restrictions.
$\rho_1$: Preprocessing. Fix each of the variables with fan-out greater than $2n^{\varepsilon}$ using a
(600 · c · ∆ · log n)-wise independent distribution. Since the number of wires is at most $n^{1+\varepsilon}$, it can
be easily seen that the number of variables that need to be fixed is at most $n^{1+\varepsilon}/(2n^{\varepsilon}) = n/2$.
$\rho_2$: Pseudorandom restriction to simplify PTFs. Let $\rho_2$ be a random restriction from
Lemma 16 with parameters $\alpha_2 = \beta$ and $c_2 = 2c$. Note that $\rho_2$ fixes the variables
(600 · c · ∆ · log n)-wise independently. Now by Lemma 16, after $\rho_2$, we expect all but at most a fraction
of $n^{-\beta/5}$ of the gates in the bottom layer to become $(n^{-2c}, 3)$-concentrated. Moreover, since
the set of unrestricted variables is picked in a (log n)-wise independent manner, by
a Chernoff-type concentration bound (for k-wise independence), the fan-in of each of the
non-concentrated gates (there are only about a fraction of $n^{-\beta/5}$ of such gates) will shrink by
a factor of p with high probability, assuming they have large fan-ins. Then we can expect to
eliminate all those non-concentrated gates by fixing a small number of variables. As for the
gates with small fan-ins, using a simple graph-theoretic argument along with the condition
given by the preprocessing step, we can also eliminate those gates by fixing a few variables.





the random restriction $\rho_2$, the following holds: there is a set T of variables such that all the
bottom-layer gates that are not $(n^{-2c}, 3)$-concentrated can be replaced by constants after
fixing the variables in T. The number of unrestricted variables after applying $\rho_2$ and fixing
T is at least $n^{1-3E\cdot\varepsilon}$.
$\rho_3$: Eliminate non-concentrated gates. We will use a (600 · c · ∆ · log n)-wise independent
distribution to fix the variables in the set T described above. Note that the number of
unrestricted variables is at least $n' = n^{1-3E\cdot\varepsilon}$. We may further fix additional variables so
that the number of unrestricted variables is exactly n′. Although this restriction eliminates
all non-concentrated gates in the bottom layer, it may also cause some concentrated gates to
become non-concentrated. However, by Lemma 17, the probability that each of these gates is
not $(n^{-2c}, 2)$-concentrated is at most $n^{-2c}$. By the union bound, except with probability
$n^{-2c} \cdot n^{1+\varepsilon} \le n^{-c}$, all these gates remain $(n^{-2c}, 2)$-concentrated.




, we have a restriction
ρ such that all the bottom-layer gates of $C_\rho$ are $(n^{-2c}, 2)$-concentrated and hence close to
some associated constants. Let’s call these constants V. C̃ is the circuit obtained from
$C_\rho$ by replacing those concentrated gates in the bottom layer with the constants V. Let’s argue
that $\tilde{C}_{\rho'}$ and $(C_\rho)_{\rho'}$ are $n^{-c}$-close to each other for any subsequent random restriction ρ′
that fixes the variables (600 · c · ∆ · log n)-wise independently. Consider such a subsequent
random restriction ρ′ and the restricted circuit $(C_\rho)_{\rho'}$. By Lemma 17, except with probability
$n^{-2c}$, the bottom-layer gates of $(C_\rho)_{\rho'}$, which are just the bottom-layer gates of $C_\rho$, are still
$(n^{-2c})$-concentrated. Moreover, they are close to the same constants V. Now by replacing
these gates in $(C_\rho)_{\rho'}$ with the constants V, we obtain a circuit C′. By a union bound, C′
and $(C_\rho)_{\rho'}$ are $(n^{-c})$-close to each other. On the other hand, consider the circuit C̃, which is
obtained by replacing the concentrated gates in the bottom layer of $C_\rho$ with the constants V. Note
that $\tilde{C}_{\rho'} = C'$. Thus, $\tilde{C}_{\rho'}$ and $(C_\rho)_{\rho'}$ are $n^{-c}$-close to each other. Finally, we need to show
that C̃ is an $(n', d-1, (n')^{1+4E\cdot\varepsilon}, (n')^{\Delta\cdot E\cdot\varepsilon}, G)$-sparse PTF circuit. As for the number of wires
in C̃, note that
$(n')^{1+7E\cdot\varepsilon} = n^{(1-3E\cdot\varepsilon)\cdot(1+7E\cdot\varepsilon)} \ge n^{1+\varepsilon}$.
Also, we have $(n')^{2\Delta\cdot\varepsilon} = n^{(1-3E\cdot\varepsilon)\cdot 2\Delta\cdot\varepsilon} \ge n^{\Delta\cdot\varepsilon}$.
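The first exponent inequality can be verified directly from the hypotheses ε ≤ 1/(7E) and E ≥ 11 of Lemma 19. The short derivation below is a sketch of that check:

```latex
\begin{align*}
(1 - 3E\varepsilon)(1 + 7E\varepsilon)
  &= 1 + 4E\varepsilon - 21E^2\varepsilon^2 \\
  &= 1 + 4E\varepsilon - 3E\varepsilon \cdot (7E\varepsilon) \\
  &\ge 1 + 4E\varepsilon - 3E\varepsilon
     && \text{since } 7E\varepsilon \le 1 \\
  &= 1 + E\varepsilon \;\ge\; 1 + \varepsilon
     && \text{since } E \ge 1 .
\end{align*}
```

The second inequality holds for the same reason: $1 - 3E\varepsilon \ge 4/7 \ge 1/2$, so $(1 - 3E\varepsilon) \cdot 2\Delta\varepsilon \ge \Delta\varepsilon$.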
E PRG for PTF circuits: proof of Theorem 4
In this section, we present our NW-style PRG for low-degree PTF circuits with few gates.
▶ Theorem 21. There exists a constant E > 0 such that for any positive integers α, ∆ and any




$E \cdot 5^{\alpha\cdot\Delta} \cdot \log^2(n) \cdot \log(n/\varepsilon))^{-1}$
gates, there exists a poly(n)-time computable PRG $G : \{0,1\}^r \to \{0,1\}^n$ ε-fooling C, with the
seed length $r = n^{2/(\alpha+1)}$.
We first need an (average-case) hard function for such circuits.
▶ Theorem 22 ([18]). There exists a constant E > 0 such that for any degree ∆ ≥ 1, there
exists a polynomial-time computable function $f : \{0,1\}^n \to \{0,1\}$ such that for any error
parameter ε and any n-variate degree-∆ PTF circuit C with at most
$n \cdot (E \cdot 5^{\Delta} \cdot \log^2(n) \cdot \log(1/\varepsilon))^{-1}$ gates, we have
$\Pr_{x \sim \{0,1\}^n}[C(x) = f(x)] \le \frac{1}{2} + \varepsilon$.
Next we apply the Nisan-Wigderson construction to the hard function of Theorem 22.
We will use the following (standard) combinatorial designs.
▶ Claim 23 (NW Designs [19]). For any positive integers n, α, there exists an efficiently
computable family of sets $S_1, \dots, S_n$ such that
$S_i \subset [r]$, ∀i ∈ [n], where $r = n^{2/(\alpha+1)}$,
$|S_i| = \ell = n^{1/(\alpha+1)}$, ∀i ∈ [n], and
$|S_i \cap S_j| \le \alpha$, ∀i, j ∈ [n] such that i ≠ j.
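One standard way to obtain sets with exactly these parameters is the polynomial-based design over a prime field. The sketch below is an illustration under assumptions made here, since the text only cites [19] for the claim: q is a prime playing the role of ℓ, the universe is identified with the q² pairs (a, b) encoded as a·q + b, and there is one set per polynomial of degree at most α, giving n = q^(α+1) sets. The names `nw_design` and `max_overlap` are hypothetical.

```python
from itertools import combinations, product


def nw_design(q, alpha):
    """Return the sets S_p = {a*q + p(a) : a in F_q}, one for each polynomial p
    of degree <= alpha over F_q (q is assumed to be prime).

    There are q**(alpha+1) sets, each of size q, inside a universe of size q*q.
    """
    sets = []
    for coeffs in product(range(q), repeat=alpha + 1):
        s = set()
        for a in range(q):
            value = 0
            for c in reversed(coeffs):  # Horner evaluation of p(a) mod q
                value = (value * a + c) % q
            s.add(a * q + value)
        sets.append(frozenset(s))
    return sets


def max_overlap(sets):
    """Largest pairwise intersection size in the family."""
    return max(len(a & b) for a, b in combinations(sets, 2))


# Example: q = 5, alpha = 2 gives n = 125 sets of size 5 in a universe of size 25.
design = nw_design(5, 2)
```

Two distinct polynomials of degree at most α agree on at most α points of the field, which is what bounds the pairwise intersections by α in this construction.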
Proof of Theorem 21. For $\ell = n^{1/(\alpha+1)}$, let $f : \{0,1\}^\ell \to \{0,1\}$ be the hard function for
degree-(α · ∆) PTF circuits from Theorem 22. By Theorem 22 and assuming E is a sufficiently large
constant, we have that for any degree-(α · ∆) PTF circuit D on ℓ variables of size at most s,
$\Pr_{z \sim \{0,1\}^\ell}[D(z) = f(z)] \le \frac{1}{2} + \varepsilon/n$. (7)
Let $S_1, \dots, S_n$ be the sets from Claim 23. Define the generator $G_{\alpha,\Delta} : \{0,1\}^r \to \{0,1\}^n$
as follows:
$G_{\alpha,\Delta}(y) = f(y|_{S_1}), \dots, f(y|_{S_n})$,
where, for i ∈ [n], $y|_{S_i}$ denotes the substring of y indexed by the set $S_i$.
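The definition of the generator translates almost verbatim into code. The sketch below only illustrates how the output bits are formed from overlapping pieces of the seed: the name `nw_generator` is hypothetical, the three sets are a toy family chosen by hand, and parity is used as a stand-in for f purely for illustration (it is not the hard function of Theorem 22).

```python
def nw_generator(f, sets, y):
    """Return (f(y|S_1), ..., f(y|S_n)), where y|S lists the bits of the seed y
    at the positions in S, taken in increasing order of position."""
    return tuple(f(tuple(y[j] for j in sorted(S))) for S in sets)


def parity(bits):
    """Placeholder for the hard function; NOT average-case hard for PTFs."""
    return sum(bits) % 2


# Toy example: seed of length 3, three sets of size 2 with pairwise overlap 1.
toy_sets = [{0, 1}, {1, 2}, {0, 2}]
toy_output = nw_generator(parity, toy_sets, (1, 0, 1))
```

Because any two sets share at most α positions, fixing the seed bits outside one set leaves every other output bit depending on at most α bits, which is the property the proof uses next.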
Toward a contradiction, suppose
$|\Pr_{x \sim \{0,1\}^n}[C(x) = 1] - \Pr_{y \sim \{0,1\}^r}[C(G_{\alpha,\Delta}(y)) = 1]| > \varepsilon$. (8)
By a standard argument via “reduction from distinguishing to predicting” as in [19],
Equation (8) implies that there exist an i ∈ [n] and bits $b_{i+1}, \dots, b_n \in \{0,1\}$ such that
$\Pr_{z \sim \{0,1\}^\ell}[C'(h_1(z), \dots, h_i(z), b_{i+1}, \dots, b_n) = f(z)] > 1/2 + \varepsilon/n$, (9)
where
C′ = C or C′ = ¬C, and
$h_1, \dots, h_i$ are Boolean functions such that each depends on at most α bits of its input z.
First, note that each gate in C′ is always a PTF of degree at most ∆. Next, observe
that every Boolean function that depends on at most α variables can be computed by a
multi-linear polynomial of degree at most α over the reals. Replacing our functions $h_1, \dots, h_i$
with such degree-α polynomials $p_1, \dots, p_i$ inside C′, we get
$C'(p_1(z), \dots, p_i(z), b_{i+1}, \dots, b_n)$.
Now we can merge the polynomials $p_j$ into every PTF gate in the circuit that reads
from them. This yields a new circuit with exactly the same number of gates, and of degree at
most α · ∆. Denote this new circuit by C′′. Note that C′′ is a degree-(α · ∆) PTF circuit on
ℓ variables of size at most s. By Equation (9), this PTF circuit C′′ computes the function f
with probability greater than 1/2 + ε/n, contradicting Equation (7). ◀
Then the PRG in Corollary 5 for PTFs can be obtained from the result in Theorem 21
by picking
$\alpha = \frac{2 \log n}{L \cdot (\sqrt{\Delta \cdot \log n} + 2 \log\log(1/\varepsilon))}$





where L is a sufficiently large constant. For this value of α, we get that the PRG in
Theorem 21 fools any degree-∆ PTF circuit of size at least 1, and has seed length at most
$\exp(O(\sqrt{\Delta \cdot \log n})) \cdot \log^2(1/\varepsilon)$.
