For Boolean functions computed by read-once, depth-$D$ circuits with unbounded fan-in over the de Morgan basis, we present an explicit pseudorandom generator with seed length $\tilde{O}(\log^{D+1} n)$. The previous best seed length known for this model was $\tilde{O}(\log^{D+4} n)$, obtained by Trevisan and Xue (CCC '13) for all of AC$^0$ (not just read-once). Our work makes use of Fourier analytic techniques for pseudorandomness introduced by Reingold, Steinke, and Vadhan (RANDOM '13) to show that the generator of Gopalan et al. (FOCS '12) fools read-once AC$^0$. To this end, we prove a new Fourier growth bound for read-once circuits, namely that for every $F : \{0,1\}^n \to \{0,1\}$ computed by a read-once, depth-$D$ circuit,
Introduction

Pseudorandomness for Constant-Depth Circuits
A central question in pseudorandomness is whether every decision problem solvable in randomized polynomial time can also be solved in deterministic polynomial time (P $\stackrel{?}{=}$ BPP). To resolve this in the affirmative, it suffices to show that there exist logarithmic-seed-length pseudorandom generators that fool polynomial-size circuits.$^1$ Such generators were constructed by Impagliazzo and Wigderson [16] under the assumption that there are exponential-time decision problems that require circuits of exponential size.
To obtain unconditional results in pseudorandomness, however, it becomes necessary to restrict the class of "distinguishers" that a generator should fool. Ajtai and Wigderson [1] were the first to consider the problem of constructing generators specifically for AC$^0$, i.e. constant-depth circuits with unbounded fan-in over the de Morgan basis (AND, OR, and NOT gates), and in their pioneering work they achieved seed length $O(n^{\varepsilon})$ for any constant $\varepsilon > 0$.
Nisan [19] then improved this seed length to $\mathrm{polylog}(n)$ using hardness of parity for AC$^0$. Subsequent works [4, 7, 9, 22] have used bounded independence or small-bias spaces [20] to fool AC$^0$ circuits. Most recently, Trevisan and Xue [28] used the insight that pseudorandom restrictions simplify circuits to decision trees, as in Håstad's switching lemma, to improve the seed length for depth-$D$ circuits to $\tilde{O}(\log^{D+4} n)$, which remains the best-known generator for AC$^0$.
$^1$ A generator $G : \{0,1\}^m \to \{0,1\}^n$ is said to $\varepsilon$-fool a function $F : \{0,1\}^n \to \{0,1\}$ if $\left|\mathbb{E}[F(G(U_m))] - \mathbb{E}[F(U_n)]\right| \le \varepsilon$, where $U_m$ and $U_n$ denote the uniform distributions on $\{0,1\}^m$ and $\{0,1\}^n$.
For the further restricted class of read-once depth-2 circuits (i.e. CNF or DNF formulas in which every variable appears at most once), Gopalan et al. [10] constructed a pseudorandom generator with seed length $\tilde{O}(\log n)$.
In this paper, we restrict our attention to read-once AC$^0$, that is, constant-depth formulas over the de Morgan basis with unbounded fan-in. We continue the approach initiated by Ajtai and Wigderson [1], namely that of applying pseudorandom restrictions to the circuit to be fooled, and incorporate more recent techniques [10, 23, 26] into the analysis.
Our Results
Our main result is an improvement upon Trevisan and Xue's $\tilde{O}(\log^{D+4} n)$ seed length [28] for AC$^0$ in the special case of read-once AC$^0$ circuits:
There is an explicit pseudorandom generator $G : \{0,1\}^{\tilde{O}(\log^{D+1} n)} \to \{0,1\}^n$ fooling read-once AC$^0$ circuits of depth $D$ on $n$ inputs.
In contrast, the probabilistic method implies the existence of an inefficient pseudorandom generator for AC$^0$ with seed length $O(\log(n/\varepsilon))$, and it is conjectured that efficient generators with matching seed length exist. However, an efficient pseudorandom generator with seed length $o(\log^{D}(n/\varepsilon))$ would imply stronger circuit lower bounds for AC$^0$ than are currently known [12]. This presents a serious barrier to the construction of pseudorandom generators, and our results show that we can match this barrier up to one $\tilde{O}(\log(n/\varepsilon))$ factor in the read-once setting.
Our Techniques
Our pseudorandom generator is that of Gopalan et al. [10] , which is also used by Reingold et al. [23] and Steinke et al. [26] . Roughly speaking, the generator fixes a carefully chosen fraction of the input bits of a given circuit in a way that approximately preserves the acceptance probability on average. This is applied recursively to fool the circuit using few random bits.
The key to the analysis is discrete Fourier analysis. Fourier analysis has proven highly effective in studying functions on the Boolean hypercube [21], finding applications not just in pseudorandomness but also in arithmetic combinatorics, circuit complexity, communication complexity, learning theory, and quantum computing. The basic principle is to study a function $F : \{0,1\}^n \to \mathbb{R}$ by expressing it in the Fourier basis, namely
$$F(x) = \sum_{s \in \{0,1\}^n} \hat{F}[s]\, \chi_s(x),$$
where $\chi_s(x) = (-1)^{s \cdot x}$ for $s, x \in \{0,1\}^n$. Of particular relevance to pseudorandomness is the fact that the Fourier coefficients $\hat{F}$ can be used to measure the "complexity" of $F$. For example, if $\sum_{s \in \{0,1\}^n} |\hat{F}[s]| \le B$, then $F$ can be $\varepsilon$-fooled by an efficient small-bias generator [20] with seed length $O(\log(nB/\varepsilon))$.
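The principle behind the last claim can be checked numerically on a small example. The following is a minimal sketch, not part of the construction: it assumes the normalization $\hat{F}[s] = \mathbb{E}_x[F(x)\chi_s(x)]$ over uniform $x$, and it uses a product of biased coins as a stand-in for a distribution whose characters all have small bias. The function names and the choice of 3-bit majority are illustrative assumptions.

```python
from itertools import product

def chi(s, x):
    """Character chi_s(x) = (-1)^(s . x)."""
    return -1 if sum(a & b for a, b in zip(s, x)) % 2 else 1

def fourier_transform(F, n):
    """Return {s: F_hat[s]} with F_hat[s] = E_x[F(x) * chi_s(x)] over uniform x."""
    cube = list(product([0, 1], repeat=n))
    return {s: sum(F(x) * chi(s, x) for x in cube) / len(cube) for s in cube}

def expectation_biased(F, n, q):
    """E[F(X)] when each bit of X is 1 independently with probability q."""
    total = 0.0
    for x in product([0, 1], repeat=n):
        weight = 1.0
        for bit in x:
            weight *= q if bit else 1 - q
        total += weight * F(x)
    return total

def majority3(x):
    return int(sum(x) >= 2)

n, q = 3, 0.4
F_hat = fourier_transform(majority3, n)
# Sum of |F_hat[s]| over nonzero s (the quantity bounded by B in the text).
mass = sum(abs(c) for s, c in F_hat.items() if any(s))
# Every nonzero character has |E[chi_s(X)]| = |1 - 2q|^|s| <= delta.
delta = abs(1 - 2 * q)
# F_hat[0] is the expectation of F under the uniform distribution.
gap = abs(expectation_biased(majority3, n, q) - F_hat[(0, 0, 0)])
assert gap <= delta * mass + 1e-12
```

The final assertion is the inequality $|\mathbb{E}[F(X)] - \mathbb{E}[F(U)]| \le \delta \cdot \sum_{s \neq 0} |\hat{F}[s]|$, which is what makes a bound on the Fourier mass useful against small-bias distributions.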
Reingold et al. [23] showed that to be fooled by the pseudorandom generator of Gopalan et al. [10], it suffices to satisfy a weaker condition on the Fourier coefficients: we only need to bound the Fourier growth, that is, we must show that for all $k \in \{1, 2, \dots, n\}$, $\sum_{s \in \{0,1\}^n : |s| = k} |\hat{F}[s]| \le B \cdot c^k$ for a "small" value of $c$ (e.g. $c = \mathrm{polylog}(n)$). By bounding the Fourier growth of read-once "permutation" branching programs, Reingold et al. proved that this generator fools such branching programs; Steinke et al. [26] then showed a similar bound for all read-once branching programs of width three.
The main contribution of this work is to prove such a Fourier growth bound for the case of read-once AC$^0$. There are known Fourier growth bounds for AC$^0$ (of a different nature than those we require) due to Linial et al. [17] and Impagliazzo and Kabanets [14] (with implications for the sensitivity and learnability of formulas), and a Fourier concentration result of Mansour [18] was used by De et al. [9] to show that small-bias spaces fool depth-2 circuits. To our knowledge, however, this work is the first to apply Fourier growth bounds to the problem of pseudorandomness against AC$^0$.
To prove our Fourier growth bound, we induct on depth to show that the Fourier mass at any node of $F$ is either polynomially small or can be bounded in terms of both the acceptance and rejection probabilities at that node. Theorem 1.2 together with the analysis of Steinke et al. [26] gives a generator with seed length $\tilde{O}(\log^{D+1} n)$. Roughly speaking, Theorem 1.2 implies that we can restrict an $\Omega(1/\log^{D-1} n)$ fraction of inputs via a small-bias space and approximately preserve the acceptance probability (on average). Doing this $O(\log^{D-1} n) \cdot O(\log n)$ times sets all the input bits. Each restriction uses $O(\log n)$ random bits, whence we obtain a pseudorandom generator with seed length $\tilde{O}(\log^{D+1} n)$.
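The round count in the last paragraph can be illustrated with a back-of-the-envelope calculation. The sketch below is only an idealized model and not the generator itself: it assumes each round fixes exactly a $p$ fraction of the bits that are still free, sets every hidden constant to 1, and ignores the polylogarithmic factors hidden by $\tilde{O}$. The function names are hypothetical.

```python
import math

def rounds_to_fix_all(n: int, p: float) -> int:
    """Smallest r with n * (1 - p)**r < 1, i.e. fewer than one free bit remains in expectation."""
    rounds = 0
    free = float(n)
    while free >= 1.0:
        free *= 1.0 - p
        rounds += 1
    return rounds

def seed_length_estimate(n: int, depth: int) -> float:
    """(number of rounds) * (bits per round), with all constants set to 1 (an assumption)."""
    log_n = math.log2(n)
    p = 1.0 / log_n ** (depth - 1)      # fraction restricted per round
    return rounds_to_fix_all(n, p) * log_n

# Since (1 - p)^r <= exp(-p r), about ln(n) / p rounds always suffice.
n, depth = 1024, 2
p = 1.0 / math.log2(n) ** (depth - 1)
assert rounds_to_fix_all(n, p) <= math.ceil(math.log(n) / p) + 1
```

With $p = 1/\log^{D-1} n$ this gives roughly $\log^{D-1} n \cdot \ln n$ rounds of $\log n$ bits each, matching the $\log^{D+1} n$ shape of the seed length.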
Organization
In Section 2, we introduce preliminary definitions and technical tools to be used in our analysis. In Section 3, we prove our Fourier growth bound. In Section 4, we verify that the analysis in [26] of their pseudorandom restriction generator for branching programs applies to our setting of read-once AC$^0$, and we use the results of the preceding sections to prove that it indeed fools read-once AC$^0$ circuits.
Preliminaries
represented by a tree of depth D with n leaves whose nodes either compute the AND or OR of the values computed by their child nodes or the NOT of the value computed by a single child node, and whose output is the value computed by the root of the tree. For a node f of F , we say that f is of height d if it is the parent of a node of height d − 1, and of height 0 if it is a leaf (i.e. an input node). By standard techniques, all the NOT gates can be pushed to the inputs.
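As an illustration of this tree representation, the following minimal sketch models a read-once formula with NOT gates already pushed to the inputs. The class and function names are hypothetical, and the height of an internal node is taken to be one more than the maximum height of its children, which is one reading of the definition above. Because every variable appears at most once, the children of a gate depend on disjoint inputs, so the acceptance probability under uniform inputs can be computed exactly by multiplying probabilities.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Union

@dataclass
class Leaf:
    var: int                 # index of the input variable read by this leaf
    negated: bool = False    # True if a NOT gate sits directly above the input

@dataclass
class Gate:
    op: str                  # "AND" or "OR"
    children: List["Node"] = field(default_factory=list)

Node = Union[Leaf, Gate]

def evaluate(node, x):
    """Value computed at `node` on the input assignment x (a tuple of 0/1 bits)."""
    if isinstance(node, Leaf):
        return 1 - x[node.var] if node.negated else x[node.var]
    values = [evaluate(child, x) for child in node.children]
    return int(all(values)) if node.op == "AND" else int(any(values))

def height(node):
    """0 for a leaf, otherwise one more than the tallest child (an assumed convention)."""
    if isinstance(node, Leaf):
        return 0
    return 1 + max(height(child) for child in node.children)

def accept_prob(node):
    """Exact Pr[node = 1] under uniform inputs, using independence of read-once children."""
    if isinstance(node, Leaf):
        return 0.5
    probs = [accept_prob(child) for child in node.children]
    result = 1.0
    if node.op == "AND":
        for q in probs:
            result *= q
        return result
    for q in probs:
        result *= 1.0 - q
    return 1.0 - result

# A depth-2 example: (x0 AND x1) OR (x2 AND NOT x3).
example = Gate("OR", [Gate("AND", [Leaf(0), Leaf(1)]),
                      Gate("AND", [Leaf(2), Leaf(3, negated=True)])])
```

For `example`, `accept_prob` returns $1 - (3/4)^2 = 7/16$, which agrees with brute-force enumeration over all 16 inputs.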
Fourier Analysis
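Recall the following basic definitions in Fourier analysis:
Definition 2.2. Define the characters of $\{0,1\}^n$ to be the maps $\chi_s(x) = (-1)^{x \cdot s}$ for $s \in \{0,1\}^n$, where $x \cdot s$ denotes the bitwise dot product.
For any function $F : \{0,1\}^n \to \mathbb{R}$, the (discrete) Fourier transform of $F$ is the function $\hat{F} : \{0,1\}^n \to \mathbb{R}$ given by
$$\hat{F}[s] = \mathbb{E}_{x \sim U}\left[F(x)\chi_s(x)\right] = 2^{-n} \sum_{x \in \{0,1\}^n} F(x)\chi_s(x).$$
We call $\hat{F}[s]$ the $s$th Fourier coefficient of $F$, and its order is defined to be $|s|$, the number of nonzero bits in $s$. The characters form an orthonormal basis for the space of all $F : \{0,1\}^n \to \mathbb{R}$. In particular, the Fourier expansion of $F$ is
$$F(x) = \sum_{s \in \{0,1\}^n} \hat{F}[s]\, \chi_s(x).$$
The expectation of $F$ under any distribution $X$ can then be written as
$$\mathbb{E}_{x \sim X}[F(x)] = \sum_{s \in \{0,1\}^n} \hat{F}[s]\, \mathbb{E}_{x \sim X}[\chi_s(x)].$$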
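We can now define notions of "Fourier growth": for $F : \{0,1\}^n \to \mathbb{R}$ and $0 \le k \le n$, let
$$L^k(F) = \sum_{s \in \{0,1\}^n : |s| = k} |\hat{F}[s]|,$$
where for $k < 0$ and $k > n$, we say that $L^k(F) = 0$. The Fourier mass of $F$ is merely $\sum_{k \ge 1} L^k(F)$. We also define $L^{\ge k}(F) = \sum_{k' \ge k} L^{k'}(F)$. For any $p \in [0, 1]$, the $p$-damped Fourier mass is the quantity
$$L_p(F) = \sum_{k \ge 1} p^k L^k(F).$$
The motivation for working with $L_p$ is that a bound on $L_p$ yields bounds on each $L^k$: every term of the sum is nonnegative, so $L^k(F) \le p^{-k} L_p(F)$ for every $k \ge 1$ and $p > 0$.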
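These quantities are easy to compute by brute force for small $n$. The sketch below is illustrative only: it assumes the normalization $\hat{F}[s] = \mathbb{E}_x[F(x)\chi_s(x)]$ over uniform $x$, and the function names are hypothetical. It computes the level-$k$ mass and the $p$-damped mass of the AND of three bits and checks that the damped mass bounds each level.

```python
from itertools import product

def fourier_transform(F, n):
    """Return {s: F_hat[s]} with F_hat[s] = E_x[F(x) * (-1)^(s . x)] over uniform x."""
    cube = list(product([0, 1], repeat=n))

    def chi(s, x):
        return -1 if sum(a & b for a, b in zip(s, x)) % 2 else 1

    return {s: sum(F(x) * chi(s, x) for x in cube) / len(cube) for s in cube}

def level_mass(F_hat, k):
    """Sum of |F_hat[s]| over all s with exactly k nonzero bits."""
    return sum(abs(c) for s, c in F_hat.items() if sum(s) == k)

def damped_mass(F_hat, p):
    """Sum over nonzero s of p^|s| * |F_hat[s]|."""
    return sum(abs(c) * p ** sum(s) for s, c in F_hat.items() if any(s))

def and3(x):
    return int(all(x))

F_hat = fourier_transform(and3, 3)
p = 0.5
levels = [level_mass(F_hat, k) for k in range(4)]

# A bound on the damped mass bounds every level: level k is at most damped / p^k.
for k in range(1, 4):
    assert levels[k] <= damped_mass(F_hat, p) / p ** k + 1e-12
```

For the AND of three bits every coefficient has absolute value $1/8$, so the level-$k$ mass is $\binom{3}{k}/8$ and the damped mass is $((1+p)^3 - 1)/8$.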
A Fourier Growth Bound
To prove Theorem 1.2, we will show that for any function $F$ computed by a read-once AC$^0$ circuit, $L_p(F)$ can be bounded in terms of the size, depth, and both $\hat{F}[0]$ and $1 - \hat{F}[0]$.
for all $\varepsilon \le 1/n$ and $p \le 1/(9 \log(4^D n/\varepsilon))^{D}$.
We will prove the theorem by induction on the depth D. The following propositions will allow us to analyze the Fourier growth of formula F in terms of its immediate subformulas (which are at smaller depth).
where in the penultimate equality we use the fact that
Proof. We will prove this for the case of m = 2; the proof for general m is entirely analogous.
and we get the desired result because $L^0(F) = \hat{F}[0]$ for all $\{0,1\}$-valued functions $F$.
We are now ready to prove our Fourier growth bound.
Proof of Theorem 3.1. Base case ($D = 0$): $F$ is a constant, the identity, or the negation of the identity. If $F$ is a constant, then $L_p(F) = 0$. If $F$ is the identity or its negation, then the Fourier
Now consider any $F$ computed by a read-once AC$^0$ circuit of depth $D$ on $n$ inputs. Because both sides of (1) are invariant under negation of $F$, we can assume without loss of generality that $F$ is the AND of functions $F_1, \dots, F_k$ computed by circuits of depth $D - 1$ on $n_1, \dots, n_k$ inputs, respectively; we call these functions the children of $F$.
Let $\varepsilon_i = n_i \varepsilon/(4n)$ so that $4^{D-1} n_i/\varepsilon_i = 4^D n/\varepsilon$ and $\sum_i \varepsilon_i = \varepsilon/4$. We inductively know that (1) holds for every $F_i$ and $\varepsilon_i$, so that
For the inductive step, roughly, we will show that either the ratio $L_p(F)/\min(\hat{F}[0], 1 - \hat{F}[0])$ is small, or $L_p(F) < \varepsilon$. Our analysis will be divided into the following three cases: 1) some child of $F$ has very low acceptance probability, 2) the expected number of children $F_i$ of $F$ which output zero under a uniformly random assignment to the inputs of $F$ is at most logarithmic, or 3) the expected number of children which output zero is large. In case 1, $\hat{F}_i[0]$ being low for some $i$ inductively implies that $L_p(F_i)$ is low enough that $L_p(F) < \varepsilon$. In case 2, we reduce bounding $L_p(F)$ to bounding
, and we again use the inductive hypothesis to argue that this is small. In case 3, we show that $L_p(F)$ is inversely exponential in the expected number of children which output zero and thus that $L_p(F) < \varepsilon$.
For all j ∈ [k], by (2), we have that
We can rewrite L p (F ) as
Now we must simply upper bound
where the penultimate inequality follows from the hypotheses of Case 2. Applying the inequality
). By (5) and (4), we have
as desired, where in the latter inequality we used the fact thatF
Then by (4), we can rewrite (5) as
By (2),
Proof. As before, say that $F$ is the AND of some $F_1, \dots, F_k$. We apply Theorem 3.1 to each $F_i$ with $p = 1/(9 \log(4^{D-1} n/\varepsilon))^{D-1}$ to get
Therefore, by Proposition 3.3, $L_p(F) \le (1 + \varepsilon)^k$. In particular, for $D = O(1)$ and $\varepsilon = 1/n$,
Note that the proof of our Fourier growth bound amounts to inductively showing in Theorem 3.1 that for fixed $p = 1/(9 \log(4^D n/\varepsilon))^{D-1}$, (1) holds for every descendant of the root, and then concluding in the proof of the above corollary that at the root, $L_p(F)$ is small because $L_p(F_i)$ is small for all children $F_i$.
The reason the analysis for the root of $F$ differs from that for its descendants is that we cannot strengthen Theorem 3.1 to show
for all $p \le 1/O(\log(n/\varepsilon))^{D-1}$. For example, when $D = 1$, this would say that for all sufficiently small $p$, we have
. Furthermore, as discussed in [23], Fourier growth bounds are related to the Coin Theorem of Brody and Verbin [8]. They proved that for a read-once, width-$(D + 1)$ branching program $F$ to distinguish the distribution $X$ on $\{0,1\}^n$ of $n$ independent samples from a coin with bias $p \in [-1, 1]$ from the uniform distribution, $|p|$ must be at least $\Omega(\log^{1-D} n)$. Specifically, they show that for
which is simply $L_p$ without absolute values. Read-once AC$^0$ circuits of depth $D$ can be simulated by read-once, width-$(D + 1)$ branching programs, and just as Brody and Verbin show that (6) is small for $p = 1/O(\log^{D-1} n)$ for read-once branching programs, Corollary 3.4 shows that $L_p$ is small for this setting of $p$ for read-once AC$^0$ circuits. Moreover, by using the recursive tribes formula, Brody and Verbin show that their bound is essentially tight in the choice of $p$, implying that our bound is tight as well.
The Pseudorandom Generator
In this section, we will show that the pseudorandom restriction generator of [26] can be used to fool read-once AC$^0$ circuits. Their result deals with fooling families of branching programs, so before recalling this result, we will define the relevant terminology.
We will think of $B$ as having a fixed start state and accept state, both of which for convenience we will denote by the index 1. A branching program reads a single bit of the input at a time (rather than reading $x$ all at once) and only keeps track of the state in $[w]$ at each step. We enforce this by requiring the program to be composed of smaller programs as follows.
A length-$n$, width-$w$ ordered branching program can also be regarded as a directed acyclic graph. The vertices are arranged into $n + 1$ layers, each of size $w$. The edges connect vertices in adjacent layers; in particular, for each layer $i$, each vertex $u$ in layer $i$, and each $b \in \{0, 1\}$, there is an edge labeled $b$ from $u$ to vertex $B_i[b](u)$ in layer $i + 1$.
Branching Programs
We use the following notational conventions when referring to layers of a length-$n$ branching program. There is a distinction between layers of edges and layers of vertices: the former are the length-1 subprograms $B_i$ defined above and are numbered from 1 to $n$, while the latter are the states between the $B_i$ and are numbered from 0 to $n$. The edges in $B_i$ go from vertices in layer $i - 1$ to vertices in layer $i$.
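The layered view above translates directly into code. The following minimal sketch is an illustration under stated assumptions rather than the formal definition: each edge layer $B_i$ is stored as a pair of transition tuples, one per input bit, states are numbered from 0 rather than 1, and the names `Layer`, `run`, and `computes` are hypothetical.

```python
from itertools import product
from typing import Sequence, Tuple

# One edge layer: (next state on bit 0, next state on bit 1), each indexed by the current state.
Layer = Tuple[Tuple[int, ...], Tuple[int, ...]]

def run(layers: Sequence[Layer], x: Sequence[int], start: int = 0) -> int:
    """Follow the edge labeled x[i] out of the current vertex in each layer; return the final state."""
    state = start
    for layer, bit in zip(layers, x):
        state = layer[bit][state]
    return state

def computes(layers: Sequence[Layer], x: Sequence[int], start: int = 0, accept: int = 0) -> int:
    """1 if the program ends in the accept state on input x, else 0."""
    return int(run(layers, x, start) == accept)

# Width-2 program for the AND of 3 bits: state 0 is both start and accept, state 1 rejects.
# On bit 1 the state is unchanged; on bit 0 the program moves to (or stays in) the reject state.
and_layer: Layer = ((1, 1), (0, 1))
and_program = [and_layer] * 3
```

Here `computes(and_program, x)` equals the AND of the three bits of `x`; the program only ever stores one of two states, which is the sense in which it "only keeps track of the state" at each step.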
Lastly, as mentioned in the introduction, the pseudorandom generator we will use makes use of pseudorandom restrictions. We formalize the notion of restrictions to Boolean functions.
Definition 4.4. For $t, x \in \{0,1\}^n$ and $F : \{0,1\}^n \to \{0,1\}$, the restriction of $F$ to $t$ using $x$, denoted $F|_{t \leftarrow x}$, is the function obtained by setting the inputs indexed by the zero bits of $t$ to the corresponding bits of $x$ and leaving the inputs indexed by the nonzero bits of $t$ free. Formally, for $y \in \{0,1\}^n$,
$$F|_{t \leftarrow x}(y) = F(z), \quad \text{where } z_i = y_i \text{ if } t_i = 1 \text{ and } z_i = x_i \text{ if } t_i = 0.$$
We can define restrictions $B|_{t \leftarrow x}$ of branching programs $B : \{0,1\}^n \times [w] \to [w]$ analogously.
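The definition of a restriction can be made concrete with a short sketch. The function name `restrict` and the example function are illustrative assumptions; the code simply implements the verbal description in Definition 4.4, keeping the restricted function on $n$ inputs and ignoring the coordinates of $y$ where $t$ is zero.

```python
from itertools import product

def restrict(F, t, x):
    """Return F|_{t<-x}: inputs where t_i = 0 are fixed to x_i, inputs where t_i = 1 stay free."""
    def restricted(y):
        z = tuple(y_i if t_i else x_i for t_i, x_i, y_i in zip(t, x, y))
        return F(z)
    return restricted

def example_formula(z):
    """Read-once formula z0 AND (z1 OR z2)."""
    return z[0] & (z[1] | z[2])

# Fix the middle input to 1 and leave inputs 0 and 2 free: the OR gate is satisfied,
# so the restricted function collapses to the single free input y0.
G = restrict(example_formula, t=(1, 0, 1), x=(0, 1, 0))
```

In this example the restriction simplifies the formula, which is the effect the pseudorandom restrictions in the generator are meant to achieve while approximately preserving the acceptance probability on average.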
Closure Under Restrictions, Subprograms, and Permutations
We now state the result of [26] on pseudorandomness for branching programs and show that it can be applied to our setting. 
Then for $\varepsilon > 0$, there exists a pseudorandom generator $G_{a,b,n,\varepsilon} : \{0,1\}^{s_{a,b,n,\varepsilon}} \to \{0,1\}^n$ with seed length $s_{a,b,n,\varepsilon} = O\left(b \cdot \log(b) \cdot \log(n) \cdot \log(abw^2 n/\varepsilon)\right)$ such that, for any $F$ computed by some $B \in \mathcal{C}$,
Moreover, $G_{a,b,n,\varepsilon}$ can be computed in space $O(s_{a,b,n,\varepsilon})$.
Note that the statement above differs slightly from the statement in [26]; in particular, the seed length $s_{a,b,n,\varepsilon}$ above is related to their seed length $t_{a,b,n,\varepsilon}$ by $s_{a,b,n,\varepsilon} = t_{wa,b,n,\varepsilon}$. The reason is that in the theorem stated in [26], the hypothesis was that $L^k(B) \le ab^k$, where $L^k(B)$ is defined in terms of the matrix-valued Fourier transform and the subordinate $L_2$ matrix norm $\|\cdot\|_2$. In general, if $M$ is a $w \times w$ matrix whose entries are each bounded in absolute value by $C$, then $\|M\|_2 \le wC$.
To apply their construction to our setting, we need to show that every function $F$ computed by a read-once AC$^0$ circuit is computed by some branching program $B$ whose restrictions and subprograms can be simulated by read-once AC$^0$ circuits.
Firstly, given a branching program B, vertex layers i, j ∈ [n], and states
Now define the class $\mathcal{C}$ to be the set of ordered, length-$n$, width-$(D + 1)$ branching programs $B$ on variable sets $V(B) \subseteq [n]$ such that for all $i, j \in V(B)$ and $d_1$,
i···j is computed by a read-once AC$^0$ formula of depth $D$.
Proposition 4.6. If $F : \{0,1\}^n \to \{0,1\}$ is computed by a read-once, depth-$D$ AC$^0$ circuit, then $F$ is also computed by an ordered, length-$n$, width-$(D + 1)$ branching program $B \in \mathcal{C}$.
Proof. We will induct on depth. The claim is trivially true for $D = 0$, in which case $F$ can only be a constant, the identity, or the negation of the identity. Now consider any $F$ computed by a read-once AC$^0$ circuit of depth $D$ on $n$ inputs. Assume without loss of generality that $F$ is the AND of functions $F_1, \dots, F_k$ computed by circuits of depth $D - 1$ on $n_1, \dots, n_k$ inputs respectively (the argument for the case where $F$ is an OR of functions is completely analogous).
Inductively, we have ordered branching programs $B_1, \dots, B_k \in \mathcal{C}$ of width $D$ on $n_1, \dots, n_k$ inputs which compute $F_1, \dots, F_k$ respectively. To construct the desired branching program $B$ for $F$, we essentially concatenate $B_1, \dots, B_k$ and, for each $i \in [k - 1]$, connect the accept state in the last layer of $B_i$ to the start state in the first layer of $B_{i+1}$ and connect the non-accept states in the last layer of $B_i$ to a non-accept state in the last layer of $B_k$.
Formally, for each $B_i$ define $B_i'$ to be the width-$(D + 1)$ program given by introducing an extra reject state to each layer of vertices and rearranging the edges in the last layer that do not lead to the accept state to lead to the reject state instead. Specifically, define
It just remains to check that every $B^{d_1,d_2}_{i \cdots j}$ can also be computed by a read-once AC$^0$ circuit. Let $S$ denote such a subprogram. If the first and last layers of $S$ both lie in a single $B_m$, then we're done by the inductive hypothesis on $F_m$. Otherwise, suppose $S$ starts at state $d_1$ of the $i_1$th layer of $B_{j_1}$ and ends at state $d_2$ of the $i_2$th layer of $B_{j_2}$. By the inductive hypothesis on $F_{j_1}$ and $F_{j_2}$, the subprograms $(B_{j_1})^{d_1,1}$
are computed by read-once AC 0 circuits of depth D − 1, call them G and H. Then the function that the subprogram S computes is also computed by the depth-D circuit
It is fairly immediate that $\mathcal{C}$ is closed under taking restrictions, taking subprograms, and permuting layers. Certainly if $B \in \mathcal{C}$, then $B_{i \cdots j} \in \mathcal{C}$. Furthermore, if each $B^{d_1,d_2}_{i \cdots j}$ is computed by a read-once AC$^0$ circuit F d 1 respectively.
We can now take the family of ordered branching programs in the statement of Theorem 4.5 to be this family $\mathcal{C}$. By our Fourier growth bound in Corollary 3.4, we obtain a pseudorandom generator for read-once AC$^0$.
Corollary 4.7. For every $n \in \mathbb{N}$ and $\varepsilon > 0$, there exists a pseudorandom generator $G : \{0,1\}^{s_{n,\varepsilon}} \to \{0,1\}^n$ for $s_{n,\varepsilon} = \tilde{O}(\log^{D} n \cdot \log(n/\varepsilon))$ that $\varepsilon$-fools any function $F$ computed by a read-once AC$^0$ circuit of depth $D$ on $n$ inputs.
Future Work
Motivated by the analysis of [10] in the case of read-once CNFs $F$, we see two directions for improvement upon the current seed length of $\tilde{O}(\log^{D+1} n)$.
Firstly, we could try relaxing our notion of Fourier growth: rather than bounding $L^k(F)$, it suffices to bound $L^k(G)$ where $G$ approximates $F$:
The functions $F_+$ and $F_-$ are called $\delta$-sandwiching approximators for $F$. Gopalan et al. [10] used the results of [9] to construct sandwiching approximators with low $L_1$-norm for read-once CNFs, and these approximators allowed them to set a constant fraction of the bits at each level of recursion ($p = \Omega(1)$), whereas the generator we use only sets a $1/O(\log n)$ fraction at each level (when $D = 2$). We would thus like to similarly exploit sandwiching approximators for arbitrary read-once AC$^0$ circuits to improve the seed length of the generator.
Additionally, Gopalan et al. [10] showed that after each round of pseudorandomly restricting a constant fraction of the input bits, $F$ shrinks from $m$ to $m^{1-\Omega(1)}$ clauses, so after only $O(\log \log n)$ (rather than $O(\log n)$) steps, the resulting CNF is, with high probability, small enough to be fooled directly by a small-bias space.$^2$
We would also like to argue that arbitrary read-once AC$^0$ circuits shrink well under pseudorandom restrictions. At least in the case of truly random restrictions, as we show in Appendix A, it is true that read-once AC$^0$ circuits with all but $1/\mathrm{polylog}(n)$ of the input bits restricted will shrink with high probability to size $\mathrm{polylog}(n)$, which gives hope that our seed length can be reduced at least to $\tilde{O}(\log^{D} n)$. That said, it is not immediately clear to the authors how to modify the argument to handle pseudorandom restrictions.
[26] T. Steinke, S. P. Vadhan, and A. Wan. Pseudorandomness and Fourier growth bounds for width 3 branching programs. CoRR, abs/1405.7028, 2014.
[27] A. Tal. Shrinkage of de Morgan formulas from quantum query complexity. Electronic Colloquium on Computational Complexity, 21(48), 2014.
[28] L. Trevisan and T. Xue. A derandomized switching lemma and an improved derandomization of AC0. In Proceedings of the Twenty-Eighth Annual IEEE Conference on Computational Complexity, pages 242-247, 2013.
A Random Restrictions Simplify Circuits
We prove that any read-once AC 0 circuit is approximated by read-once AC 0 circuits which shrink to polylogarithmic size with high probability under a truly random restriction of sufficiently many bits. First, we make precise the distribution from which we are sampling our restrictions.
The restrictions F | t←x we will be considering are such that t ∼ T and x ∼ U for T a p-regular distribution and U the uniform distribution.
Theorem A.2. For $\varepsilon = 1/\mathrm{poly}(n)$, let $F : \{0,1\}^n \to \{0,1\}$ be computed by a read-once, depth-$D$ circuit. Let $T$ be a $p$-regular distribution for $p = 1/O(\log^{D-1} n)$ and $U$ the uniform distribution on $\{0,1\}^n$.
Then $F$ has $O(n\sqrt{\varepsilon})$-sandwiching approximators $F_\ell$ and $F_u$ computed by read-once AC$^0$ circuits of depth $D$ such that $F_\ell|_{t \leftarrow x}$ and $F_u|_{t \leftarrow x}$ are of size at most $\tilde{O}(\log^{D} n)$ with probability at least $1 - 2\varepsilon$ over the choice of $x \sim U$, $t \sim T$.
For the rest of this section, we will assume without loss of generality that the circuits we are dealing with consist solely of NAND gates, potentially with some NOT gates over the inputs. Indeed, any AND gate can be replaced with a negated NAND gate, and any OR of nodes can be replaced with the NAND of the negations of those nodes. By standard techniques, all the negations can be moved to lie directly above the inputs.
A.1 Collapse Probability
To prove Theorem A.2, we will first use Theorem 3.1 to prove that the probability that a read-once AC$^0$ circuit does not collapse to a constant under a $p$-regular restriction is small relative to its acceptance and rejection probabilities. This lemma will then allow us to prove Theorem A.2 in the last subsection by generalizing the arguments of [
Proof. Without loss of generality, we can assume that $F$ is monotone: if we have another $F'$ given by adding NOT gates above some of the inputs of $F$, then because each bit is set to 0 or 1 with equal probability, $F|_{t \leftarrow x}$ and $F'|_{t \leftarrow x}$ have the same probability of remaining nonconstant. By monotonicity, $F|_{t \leftarrow x}$ is nonconstant if and only if $(F|_{t \leftarrow x})(\mathbf{0}) \neq (F|_{t \leftarrow x})(\mathbf{1})$, where $\mathbf{0}$ and $\mathbf{1}$ denote the strings of $n$ repeated 0's and repeated 1's respectively. But
where X and Y are the distributions of n independent samples from a coin with bias p and −p, respectively. By (6) and the triangle inequality,
so we're done by Theorem 3.1.
A.2 Concentrated Shrinkage
Lemma A.4. For $\varepsilon = 1/\mathrm{poly}(n)$, let $F : \{0,1\}^n \to \{0,1\}$ be computed by a read-once, depth-$D$ circuit such that for each node $f$, $1 - \hat{f}[0] \ge \varepsilon$. If $T$ is a $p$-regular distribution and $U$ is the uniform distribution, then $F|_{t \leftarrow x}$ is of size $\tilde{O}(\log^{D} n)$ with probability at least $1 - \varepsilon$ over the choice of $x \sim U$, $t \sim T$.
Proof. Our claim is that each remaining node in $F|_{t \leftarrow x}$ fails to have fan-in at most $\tilde{O}(\log n)$ with probability at most $\varepsilon/(nD)$, so that by the union bound, $F|_{t \leftarrow x}$ fails to have the desired size with probability at most $\varepsilon$.
Fix some node $f$ of $F$, and partition its children into chunks $C_0, \dots, C_m$ where $C_i$ is the set of all children $c$ for which $2^i \le (1 - \hat{c}[0])/\varepsilon \le 2^{i+1}$. Note that $m \le O(\log n)$ because $\varepsilon = 1/\mathrm{poly}(n)$. Let
Denote the nodes of $C_i$ by $c^i_1, \dots, c^i_{|C_i|}$, and let $Y^i_j$ be the indicator variable equal to 1 if $c^i_j$ survives in $F|_{t \leftarrow x}$ (i.e. does not collapse to a constant), and 0 otherwise. Note that
where the penultimate inequality follows by Lemma A.3. We want to show that for each $i$, $\sum_j Y^i_j$ is small with high probability.
Let $M \in \mathbb{Z}$ and $k < M$ be some parameters which we will determine later, and let $S_k(Y^i_1, \dots, Y^i_{|C_i|})$ denote the $k$th symmetric polynomial in the variables $Y^i_j$.$^3$ It follows that
where the former inequality holds by noting that if more than $M$ of the $Y^i_j$ are 1, then there are at least $\binom{M}{k}$ terms equal to 1 in $S_k(Y^i_1, \dots, Y^i_{|C_i|})$, and the latter inequality holds by (8) and independence. Stirling's approximation and (7) give that
Now if ε/(mnD) > 1/n c for some constant c, take M to be 3e log log(n) c log(1/ε i ) and k to be log log(n) c for a large enough constant c that (log log(n) c ) log log(n) c > n c and Pr j Y i j > M < ε/(mnD). A union bound over the m choices of i and the at most nD choices of node f gives the desired bound on probability that fan-in at f is at most i 3e log log(n) c log(1/ε i ) = O log log(n) c i log(1/ε i ) ≤Õ(log n), where the last equality follows because i ε i ≥ ε = 1/poly(n).
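The counting step used in the proof above can be checked directly. The sketch below assumes that $S_k$ denotes the elementary symmetric polynomial, i.e. the sum over all $k$-element subsets of the product of the chosen variables; the function name is hypothetical. For 0/1-valued indicators, each product is 1 exactly when all $k$ chosen indicators are 1, so $S_k$ counts the $k$-subsets of surviving children.

```python
from itertools import combinations
from math import comb

def elementary_symmetric(ys, k):
    """Sum over all k-element subsets of ys of the product of the chosen entries."""
    total = 0
    for subset in combinations(ys, k):
        term = 1
        for y in subset:
            term *= y
        total += term
    return total

# With indicator values, S_k equals C(number of ones, k).
indicators = [1, 0, 1, 1, 0, 1]
k = 2
assert elementary_symmetric(indicators, k) == comb(sum(indicators), k)
```

Consequently, if more than $M$ indicators equal 1 then $S_k \ge \binom{M}{k}$, and Markov's inequality applied to $S_k$ gives a tail bound of the form $\mathbb{E}[S_k]/\binom{M}{k}$ on the probability of that event.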
We now drop the assumption that rejection probability is not too small in order to prove Theorem A.2.
Proof of Theorem A.2. If $F$ has the property that $1 - \hat{f}[0] \ge \varepsilon$ for every node $f$, then by Lemma A.4, we can take $F_\ell$ and $F_u$ to be $F$ itself. Otherwise, we will show how to modify $F$ to obtain sandwiching formulas with this property.
Let L(G) denote the number of leaves of a formula G. We inductively show that each node f Define f (resp. f u ) to be f but with each child c of f replaced by c u (resp. c ). Then
The same analysis tells us ( 
). If f u ≥ ε, take f u to be f u ; otherwise, take f u to be the constant 1 function, in which case
It follows that f and f u are Define f u to be the constant 1 function. Define f to be f but with each child c of f replaced by c u . If f ≥ ε, take f to be f .
Otherwise, we note that it's possible to prune from f enough children to get f such that ε ≤ 1 −f [0] ≤ √ ε. Assume to the contrary. Order the children c u in any way {c 1 , ..., c k } and define q j = k i=j 1 − (1 −ĉ i [0]). Then q 1 > √ ε and q k < ε. Then either there is some j for which ε ≤ q j ≤ √ ε, or there is some j for which ε ≤ 1 −ĉ j [0] ≤ √ ε, a contradiction. By construction, f and f u are sandwiching formulas for f which satisfy ii). It remains to verify i). f u is constant. For f , 1 −f [0] ≥ ε by construction. If f = f , then because 1 −f [0] ≤ 1 − ε for the same reason as in Case 1. Otherwise, we know by construction
