The classical lemma of Ore-DeMillo-Lipton-Schwartz-Zippel states that any nonzero polynomial $f(x_1, \ldots, x_n)$ of degree at most $s$ will evaluate to a nonzero value at some point on a grid $S^n \subseteq \mathbb{F}^n$ with $|S| > s$. Thus, there is a deterministic polynomial identity test (PIT) for all degree-$s$ size-$s$ algebraic circuits in $n$ variables that runs in time $\mathrm{poly}(s) \cdot (s+1)^n$. In a surprising recent result, Agrawal, Ghosh and Saxena (STOC 2018) showed that any deterministic blackbox PIT algorithm for degree-$s$, size-$s$, $n$-variate circuits with running time as bad as $s^{n^{0.5-\delta}} \cdot \mathrm{HUGE}(n)$, where $\delta > 0$ and $\mathrm{HUGE}(n)$ is an arbitrary function, can be used to construct blackbox PIT algorithms for degree-$s$ size-$s$ circuits with running time $s^{\exp \circ \exp(O(\log^* s))}$.
Introduction
Multivariate polynomials are the primary protagonists in the field of algebraic complexity and algebraic circuits form a natural robust model of computation for multivariate polynomials. For completeness, an algebraic circuit is defined via a directed acyclic graph with internal gates labeled by + (addition) and × (multiplication) and with leaves labeled by either variables or field constants; computation flows in the natural way.
In the field of algebraic complexity, much of the focus has been restricted to studying n-variate polynomials whose degree is bounded by a polynomial function in n, and such polynomials are called low-degree polynomials. This restriction has several a-priori and a-posteriori motivations, and excellent discussions of this can be seen in the thesis of Forbes [For14, Section 3.2] and Grochow's answer [Gro] on cstheory.SE. The central question in algebraic complexity is to find a family of low-degree polynomials that requires large algebraic circuits to compute it. Despite having made substantial progress in various subclasses of algebraic circuits (cf. surveys [SY10, Sap15] ), the current best lower bound for general algebraic circuits is merely an Ω(n log d) lower bound of Baur and Strassen [BS83] .
An interesting approach towards proving lower bounds for algebraic circuits is via showing good upper bounds for the algorithmic task called polynomial identity testing. Our results deal with this approach and we elaborate on this now.
Polynomial Identity Testing
Polynomial identity testing (PIT) is the algorithmic task of checking if a given algebraic circuit $C$ of size $s$ computes the identically zero polynomial. As discussed earlier, although a circuit of size $s$ can compute a polynomial of degree $2^s$, this question typically deals only with circuits whose formal degree is bounded by the size of the circuit.
This algorithmic question has two flavours: whitebox PIT and blackbox PIT. Whitebox polynomial identity tests consist of algorithms that can inspect the circuit (that is, look at the underlying gate connections etc.) to decide whether the circuit computes the zero polynomial or not. A stronger algorithm is a blackbox polynomial identity test, where the algorithm is only provided basic parameters of the circuit (such as its size, the number of variables, a bound on the formal degree) and only has evaluation access to the circuit $C$. Hence, a blackbox polynomial identity test for a class $\mathcal{C}$ of circuits is just a list of evaluation points $H \subseteq \mathbb{F}^n$ such that every nonzero circuit $C \in \mathcal{C}$ is guaranteed to have some $\mathbf{a} \in H$ such that $C(\mathbf{a}) \neq 0$. Such sets of points are also called hitting sets for $\mathcal{C}$. Therefore, the running time of a blackbox PIT algorithm is essentially given by the size of the hitting set and the time taken to generate it given the parameters of the circuit.
The classical Ore-DeMillo-Lipton-Schwartz-Zippel Lemma [Ore22, DL78, Zip79, Sch80] states that any nonzero polynomial $f(x_1, \ldots, x_n)$ of degree at most $d$ will evaluate to a nonzero value at a randomly chosen point from a grid $S^n \subseteq \mathbb{F}^n$ with probability at least $1 - \frac{d}{|S|}$. Therefore, this automatically yields a randomized polynomial time blackbox PIT algorithm, and also a deterministic $(d+1)^n \cdot \mathrm{poly}(s)$ blackbox PIT algorithm, for the class of size $s$ and formal-degree $d$ circuits. Furthermore, a simple counting/dimension argument also says that there exist (non-explicit) $\mathrm{poly}(s)$ sized hitting sets for the class of polynomials computed by size $s$ algebraic circuits. The major open question is to find a better deterministic algorithm for this problem.
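To make the trivial deterministic test concrete, here is a minimal Python sketch of the grid-based blackbox test. The function name `is_nonzero_on_grid`, the choice $S = \{0, 1, \ldots, d\}$ and the two toy blackboxes are illustrative assumptions, and the sketch works over the integers rather than a general field.

```python
from itertools import product

def is_nonzero_on_grid(blackbox, n, d):
    """Evaluate the blackbox on the grid S^n with S = {0, ..., d}.

    A nonzero polynomial of degree at most d cannot vanish on all of S^n
    when |S| > d, so a False answer certifies the zero polynomial
    (assuming the claimed degree bound d is correct).
    """
    grid_side = range(d + 1)                 # |S| = d + 1 > d
    for point in product(grid_side, repeat=n):
        if blackbox(*point) != 0:
            return True                      # witness of nonzeroness found
    return False

# Hypothetical blackboxes, for illustration only.
zero_in_disguise = lambda x, y: (x + y) ** 2 - x * x - 2 * x * y - y * y
nonzero = lambda x, y: x * y * (x - 1) * (y - 1)   # vanishes on {0, 1}^2

print(is_nonzero_on_grid(zero_in_disguise, n=2, d=2))  # False
print(is_nonzero_on_grid(nonzero, n=2, d=4))           # True
```

The loop visits all $(d+1)^n$ grid points, which is exactly the exponential dependence on $n$ that the bootstrapping results aim to improve.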
PIT is an important algorithmic question in its own right, and many classical results such as the primality testing algorithm [AKS04], IP = PSPACE [LFKN90, Sha90], and algorithms for graph matching [MVV87, FGT16, ST17] all have a polynomial identity test at their core. Yet another reason why PIT is an important algorithmic question is its intimate connection with the question of proving explicit lower bounds for algebraic circuits.
Heintz and Schnorr [HS80], and Agrawal [Agr05] observed that given an explicit hitting set for size $s$ circuits, any nonzero polynomial that is designed to vanish on every point of the hitting set cannot be computable by size $s$ circuits. By tailoring the number of variables and degree of the polynomial in this observation, they showed that polynomial time blackbox PITs yield an E-computable family $\{f_n\}$ of $n$-variate multilinear polynomials that require $2^{\Omega(n)}$ sized circuits. This connection between PIT and lower bounds was strengthened further by Kabanets and Impagliazzo [KI04], who showed that explicit families of hard functions can be used to give non-trivial derandomizations for PIT. Thus, the question of proving explicit lower bounds and the task of finding upper bounds for PIT are essentially two sides of the same coin.
Bootstrapping
A recent result of Agrawal, Ghosh and Saxena [AGS18] showed, among other things, the following surprising result: blackbox PIT algorithms for size $s$ and $n$-variate circuits with running time as bad as $s^{n^{0.5-\delta}} \cdot \mathrm{HUGE}(n)$, where $\delta > 0$ and $\mathrm{HUGE}$ is an arbitrary function of $n$, can be used to construct blackbox PIT algorithms for size $s$ circuits with running time $s^{\exp \circ \exp(O(\log^* s))}$. Note that $\log^* n$ refers to the smallest $i$ such that the $i$-th iterated logarithm log 
Proof overview
The basic intuition for the proofs in this paper, and as per our understanding also for the proofs 
there is a nonzero polynomial on n variables and individual degree d that vanishes on the hitting set H(n, d, s), and hence cannot be computed by a circuit of size s.
In a nutshell, given an explicit hitting set, we can obtain hard polynomials. In fact, playing around with the parameters d and k ≤ n, we can get a hard polynomial on k variables, degree kd
We now state a result of Kabanets and Impagliazzo [KI04] that shows that hardness can lead to derandomization.
Theorem 1.3 (Informal, Kabanets and Impagliazzo [KI04]).
A superpolynomial lower bound for algebraic circuits for an explicit family of polynomials implies a deterministic blackbox PIT algorithm for all algebraic circuits in $n$ variables and degree $d$ of size $\mathrm{poly}(n)$ that runs in time $\mathrm{poly}(d)^{n^{\varepsilon}}$ for every $\varepsilon > 0$.
Now, we move on to the main ideas in our proof. Suppose we have hitting sets of size $s^{o(n)}$ for size $s$, degree $d \le s$ circuits on $n$ variables. The goal is to obtain a blackbox PIT for circuits of size $s$, degree $s$ on $s$ variables with a much better dependence on the number of variables.
Observe that if the number of variables were much smaller than $s$, say at most a constant, then the hitting set in the hypothesis would have a polynomial dependence on $s$, and we would be done. With this in mind, the hitting sets for $s$-variate circuits in the conclusion of Theorem 1.1 are designed iteratively, starting from hitting sets for circuits with very few variables. In each iteration, we start with a hitting set for size $s$, degree $d \le s$ circuits on $n$ variables with some dependence on $n$ and obtain a hitting set for size $s$, degree $d \le s$ circuits on $m = 2^{n^{\delta}}$ variables (for some $\delta > 0$) that has a much better dependence on $m$. Then, we repeat this process till the number of variables increases up to $s$, which takes $O(\log^* s)$ iterations. We now briefly outline the steps in each such iteration.
• Obtaining a family of hard polynomials: The first step is to obtain a family of explicit hard polynomials from the given hitting sets. This step is done via Theorem 1.2, which simply uses interpolation to find a nonzero polynomial $Q$ on $k$ variables and degree $d$ that vanishes on the hitting set for size $s'$, degree $d'$ circuits on $n$ variables, for some $s', d'$ to be chosen appropriately.
• Variable reduction using $Q$: Next, we take a Nisan-Wigderson design (see Definition 2.1) $\{S_1, S_2, \ldots, S_m\}$, where each $S_i$ is a subset of size $k$ of a universe of size $k^2$, and
Consider the map Γ :
. As Kabanets and Impagliazzo show in the proof of Theorem 1.3, $\Gamma$ preserves the nonzeroness of all algebraic circuits of size $s$ on $m$ variables, provided $Q$ is hard enough, i.e. $s' = s^a$ for a sufficiently large $a$.
• Blackbox PIT for $m$-variate circuits of size $s$ and degree $s$: We now take the hitting set given by the hypothesis for the circuit $\Gamma(C)$ (invoked with appropriate size and degree parameters) and evaluate $\Gamma(C)$ on this set. From the discussion so far, we know that if $C$ is nonzero, then $\Gamma(C)$ cannot be identically zero, and hence it must evaluate to a nonzero value at some point on this set. The number of variables in $\Gamma(C)$ is at most $k^2 = \log^2 m$, whereas its size turns out to be not too much larger than $s$. Hence, the size of the hitting set for $C$ obtained via this argument turns out to have a better dependence on the number of variables $m$ than the hitting set in the hypothesis.
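The variable-reduction step can be pictured as a composition of blackboxes. The Python sketch below assumes the standard Kabanets-Impagliazzo style substitution, in which the $i$-th variable of the circuit is replaced by $Q$ evaluated on the coordinates indexed by $S_i$; the helper name `compose_with_design`, the tiny design and the toy polynomials are hypothetical and chosen only so that the arithmetic can be followed by hand (in particular, this $Q$ is not hard in any sense).

```python
def compose_with_design(P, Q, design):
    """Return a blackbox for y -> P(Q(y|S_1), ..., Q(y|S_m)).

    `design` is a list of index tuples S_i into the point y; the i-th
    input of P is fed Q restricted to the coordinates in S_i.
    """
    def reduced(y):
        return P(*[Q(*[y[j] for j in S]) for S in design])
    return reduced

# Toy instance: m = 3 sets of size k = 2 over a universe of size 3,
# pairwise intersecting in exactly one element.
design = [(0, 1), (0, 2), (1, 2)]
Q = lambda z1, z2: z1 * z2 + z1          # illustrative only, not a hard polynomial
P = lambda x1, x2, x3: x1 - x2 + x3      # the 3-variate "circuit"

reduced_P = compose_with_design(P, Q, design)
print(reduced_P((1, 2, 3)))              # Q(1,2) - Q(1,3) + Q(2,3) = 3 - 4 + 8 = 7
```

A hitting set for the composed blackbox, which lives on the smaller universe, then yields a hitting set for the original circuit by pushing each point through the same substitution.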
Similarities and differences with the proof of Agrawal et al. [AGS18].
The high level outline of our proof is essentially the same as that of Agrawal et al. [AGS18]. However, there are some quantitative differences in the argument that make our final arguments shorter and simpler than those of Agrawal et al. and lead to a stronger and near-optimal bootstrapping statement in Theorem 1.1.
The primary differences between our proof and that of Agrawal et al. are rather technical but we try to briefly describe them. The first difference is in the choice of Nisan-Wigderson designs.
The designs used in this paper are based on the standard Reed-Solomon code and they yield larger set families than the designs used by Agrawal et al. (footnote 3). The second difference is the evolution of parameters in the inductive argument. We believe the primary difference is in the main inductive hypothesis [AGS18, Lemma 18] that assumes a non-trivial hitting set for some $n$-variate circuits for $n$ smaller than some large constant $n_0$ (which they then use to bootstrap). However, this hypothesis does not degrade gracefully for $n$-variate circuits when $n$ is a little larger than $n_0$ and appears to force them to use more stringent parameters. Also, their proof is quite involved and we are unsure if there are other constraints in their proof that force such choices of parameters.
Our proof, though along almost exactly the same lines, appears to be more transparent and more malleable with respect to the choice of parameters.
The strength of the hypothesis. The hypothesis of Theorem 1.1 and also those of the results in the work of Agrawal et al. [AGS18] is that we have a non-trivial explicit hitting set for algebraic circuits of size s, degree d on n variables where d and s could be arbitrarily large as a function of n.
This seems like an extremely strong assumption, and also slightly non-standard in the following sense. In a typical setting in algebraic complexity, we are interested in PIT for size s, degree d
Footnote 3. However, even without these improved design parameters, our proof can be used to provide the same conclusion when starting off with a hitting set of size $s^{n^{1-\delta}} \cdot \mathrm{HUGE}(n)$, instead of the hypothesis of Theorem 1.
Remark. Throughout the paper, we shall assume that there are suitable $\lfloor \cdot \rfloor$'s or $\lceil \cdot \rceil$'s if necessary so that certain parameters chosen are integers. We avoid writing this purely for the sake of readability.
Furthermore, we make absolutely no attempt to optimise constants. Several of the inequalities used are weak and tightening them makes little qualitative difference to the final theorem statements. ♦
Preliminaries

Notation
• For a positive integer n, we use [n] to denote the set {1, 2, . . . , n}.
• We use boldface letters such as $\mathbf{x}_{[n]}$ to denote a set $\{x_1, \ldots, x_n\}$. We drop the subscript whenever the number of elements is clear or irrelevant in the context.
• We use $\mathcal{C}(n, d, s)$ to denote the class of $n$-variate polynomials of formal degree at most $d$ that are computable by algebraic circuits of size at most $s$. This class may also include polynomials that actually depend on fewer variables but are masquerading as $n$-variate polynomials.
• For a polynomial $f(x_1, \ldots, x_n)$, we shall say its individual degree is at most $k$ to mean that the exponent of any of the $x_i$'s in any monomial is at most $k$.
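Some basic definitions and lemmas
Definition 2.1 (Nisan-Wigderson designs [NW94]). A family of sets $S_1, \ldots, S_m \subseteq [\ell]$ is said to be an $(\ell, k, r)$-design if
• $|S_i| = k$ for each $i \in [m]$, and
• $|S_i \cap S_j| < r$ for any $i \neq j$.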
The following is a standard construction of such designs based on the Reed-Solomon code.
Lemma 2.2 (Construction of NW designs).
There is an algorithm that, given parameters $\ell, k, r$ satisfying $\ell = k^2$ and $r \le k$ with $k$ being a power of 2, outputs an $(\ell, k, r)$-design $\{S_1, \ldots, S_m\}$ for $m \le k^r$ in time $\mathrm{poly}(m)$.
Proof. Since $k$ is a power of 2, we can identify $[k]$ with the field $\mathbb{F}_k$ of $k$ elements and $[\ell]$ with $\mathbb{F}_k \times \mathbb{F}_k$. For each univariate polynomial $p(x) \in \mathbb{F}_k[x]$ of degree less than $r$, define the set $S_p$ as
$S_p = \{(i, p(i)) : i \in \mathbb{F}_k\}$.
Since there are $k^r$ such polynomials, we get $k^r$ subsets of $\mathbb{F}_k \times \mathbb{F}_k$ of size $k$ each. Furthermore, since any two distinct univariate polynomials of degree less than $r$ cannot agree at $r$ or more places, it follows that $|S_p \cap S_q| < r$ for $p \neq q$.
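The Reed-Solomon based construction is easy to experiment with. The sketch below is a hedged illustration rather than the lemma's exact setting: it assumes $k$ is a prime (so that arithmetic modulo $k$ is a field) instead of a power of 2, takes each set to be the graph $\{(a, p(a))\}$ of a polynomial $p$ of degree less than $r$, and the name `rs_design` is hypothetical.

```python
from itertools import combinations, product

def rs_design(k, r):
    """Return the graphs {(a, p(a)) : a in F_k} of all p of degree < r.

    Assumption: k is prime, so the integers modulo k form the field F_k.
    """
    sets = []
    for coeffs in product(range(k), repeat=r):   # k^r coefficient vectors
        graph = set()
        for a in range(k):
            value = 0
            for c in reversed(coeffs):           # Horner's rule modulo k
                value = (value * a + c) % k
            graph.add((a, value))
        sets.append(frozenset(graph))
    return sets

design = rs_design(k=5, r=2)
print(len(design))                                                    # 25 sets
print(all(len(S) == 5 for S in design))                               # each of size k
print(all(len(S & T) < 2 for S, T in combinations(design, 2)))        # intersections < r
```

Each set lives in a universe of size $k^2$, and the pairwise intersection bound is exactly the statement that two distinct polynomials of degree less than $r$ agree on fewer than $r$ points.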
Hardness-randomness connection
For a fixed $(\ell, k, r)$-design $S_1, \ldots, S_m$ and a polynomial $Q(z_1, \ldots, z_k) \in \mathbb{F}[\mathbf{z}]$ we shall use the notation $Q[\ell, k, r]_{\mathrm{NW}}$ to denote the vector of polynomials
Proof. This is achieved by finding a nonzero $k$-variate polynomial $Q_k$, for $k \le n$, of individual degree smaller than $|H|^{1/k}$ that vanishes on the hitting set $H$ for $\mathcal{C}(n, d, s)$. The degree of $Q_k$ is at most $k \cdot |H|^{1/k} \le d$ from the hypothesis. Such a $Q_k$ can be found by solving a system of linear equations in time $\mathrm{poly}(|H|)$. By the definition of the hitting set, we must have that $Q_k(z_1, \ldots, z_k)$ cannot be an element of $\mathcal{C}(n, d, s)$ and therefore $Q_k$ cannot be computed by algebraic circuits of size $s$. However, note that $Q_k$ is a sum of at most $|H|$ monomials over $k$ variables and thus has an algebraic circuit of size at most $k + 1 + |H|$.
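The interpolation step in this proof amounts to finding a nonzero vector in the kernel of an evaluation matrix. The following Python sketch illustrates it under assumptions that are not in the text: the field is a prime field $\mathbb{F}_p$, the individual degree bound $D$ is taken as the smallest integer with $D^k > |H|$ (so that there are more unknown coefficients than constraints), and the names `vanishing_polynomial` and `evaluate` are hypothetical.

```python
from itertools import product

def evaluate(coeffs, point, p):
    """Evaluate a polynomial given as {exponent tuple: coefficient} mod p."""
    total = 0
    for exps, c in coeffs.items():
        term = c
        for x, e in zip(point, exps):
            term = term * pow(x, e, p) % p
        total = (total + term) % p
    return total

def vanishing_polynomial(points, k, p):
    """Find a nonzero k-variate polynomial over F_p vanishing on `points`."""
    D = 1
    while D ** k <= len(points):         # need more monomials than points
        D += 1
    monomials = list(product(range(D), repeat=k))
    rows = [[evaluate({m: 1}, pt, p) for m in monomials] for pt in points]
    pivots, rank = {}, 0                 # pivot column -> row index
    for col in range(len(monomials)):    # reduced row echelon form mod p
        pr = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pr is None:
            continue
        rows[rank], rows[pr] = rows[pr], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        pivots[col] = rank
        rank += 1
    # A free column exists because there are more columns than rows.
    free = next(c for c in range(len(monomials)) if c not in pivots)
    coeffs = {monomials[free]: 1}        # set one free unknown to 1, the rest to 0
    for col, row in pivots.items():
        c = (-rows[row][free]) % p
        if c:
            coeffs[monomials[col]] = c
    return coeffs

hitting_set = [(0, 0), (1, 2), (3, 4), (5, 6), (2, 2)]   # toy point set in F_7^2
Q = vanishing_polynomial(hitting_set, k=2, p=7)
print(all(evaluate(Q, pt, 7) == 0 for pt in hitting_set))  # True
```

Since the returned polynomial has individual degree below $D$ and at least one coefficient equal to 1, it is nonzero as a formal polynomial, which is what the hardness argument needs.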
Lemma 2.3 (Generators from hard polynomials [KI04]).
Bootstrapping Hitting Sets
In this section, we give a simple proof of the main result of Agrawal et al. [AGS18] along the same lines as the original proof, albeit with different parameters. This proof, besides being a more transparent exposition, also makes clear any constraints that arise while setting the various parameters.
such that, for all large enough values of $s$, there is an explicit hitting set of size $s^{g(n_0)}$ for $\mathcal{C}(n_0, s, s)$.
Then there is an explicit hitting set for $\mathcal{C}(s, s, s)$ of size $s^{\exp \circ \exp(O(\log^* s))}$.
The following lemma describes the main inductive statement using which Theorem 3.1 follows readily.
Lemma 3.2. Let $n_0$ be a large enough (footnote 4) power of 2, and let $s$ be a growing parameter. Suppose $g : \mathbb{N} \to \mathbb{N}$ is a non-decreasing function with $30 \cdot g(n_0) < n_0^{1/4}$ such that, for all large enough $s$, there is an explicit hitting set of size $s^{g(n_0)}$ for degree-$s$ size-$s$ circuits over $n_0$ variables.
Then for $n_1 = 2^{n_0^{1/4}} > n_0$ and $h : \mathbb{N} \to \mathbb{N}$ given by $h(n) = 30 \cdot (g((\log n)^4))^2$, there is an explicit hitting set of size $s^{h(n_1)}$ for degree-$s$ size-$s$ circuits on $n_1$ variables. Furthermore, $h(n_1)$ also satisfies $30 \cdot h(n_1) < n_1^{1/4}$.
We will defer the proof of this lemma and finish the proof of Theorem 3.1.
Proof of Theorem 3.1. The hypothesis and conclusion of Lemma 3.2 admit repeated applications of the lemma to get hitting sets for polynomials depending on larger sets of variables. The natural strategy is therefore to apply the lemma repeatedly to obtain a hitting set for the class C(s, s, s). We now set up some basic notation to facilitate this analysis.
We start with an explicit hitting set of size $s^{g(n_0)}$ for $\mathcal{C}(n_0, s, s)$ and say that after $i$ applications of Lemma 3.2 we have an explicit hitting set for the class $\mathcal{C}(n_i, s, s)$ of size $s^{t_i}$. We wish to track the evolution of $n_i$ and $t_i$. Recall that $n_i = 2^{n_{i-1}^{1/4}}$ after one iteration of Lemma 3.2. Let $\{m_i\}_i$ be such that $m_0 = \log n_0$ and, for every $i > 0$, let $m_i = 2^{m_{i-1}/4}$ so that $m_i = \log n_i$. Similarly, to keep track of the complexity of the hitting set, if $s^{t_i}$ is the size of the hitting set for $\mathcal{C}(n_i, s, s)$, then by Lemma 3.2 we have $t_0 = g(n_0)$ and $t_i = 30 \cdot t_{i-1}^2$ for all $i \ge 1$. The following facts are easy to verify.
• $m_i \ge \log s$ for $i = O(\log^* s)$,
• for all $j$, we have $t_j = 30^{2^j - 1} \cdot t_0^{2^j}$,
• the exponent of $s$ in the complexity of the final hitting set is $t_{O(\log^* s)} = \exp \circ \exp(O(\log^* s))$.
Therefore we have an $s^{\exp \circ \exp(O(\log^* s))}$ sized explicit hitting set for $\mathcal{C}(s, s, s)$.
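The evolution of the two sequences can be traced numerically. The sketch below iterates $m_i = 2^{m_{i-1}/4}$ and $t_i = 30 \cdot t_{i-1}^2$ (the latter in log scale to avoid huge numbers) until $m_i \ge \log s$; it assumes logarithms to base 2, and the function names and the sample inputs are illustrative choices, not values from the text.

```python
import math

def log_star(x):
    """Number of times log2 must be applied before the value is at most 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

def bootstrap_trace(log_n0, t0, log_s):
    """Return [(i, m_i, log2 t_i)] until the variable count reaches s.

    Requires log_n0 > 16 (that is, n_0 > 2^16) so that m_i strictly grows.
    """
    if log_n0 <= 16:
        raise ValueError("need n_0 > 2^16 for the number of variables to grow")
    m, log_t = float(log_n0), math.log2(t0)
    trace = [(0, m, log_t)]
    while m < log_s:
        m = 2.0 ** (m / 4)                    # m_i = 2^(m_{i-1} / 4)
        log_t = math.log2(30) + 2 * log_t     # t_i = 30 * t_{i-1}^2
        trace.append((len(trace), m, log_t))
    return trace

# Illustrative inputs: n_0 = 2^20, t_0 = g(n_0) = 1, s = 2^(10^6).
for i, m, log_t in bootstrap_trace(log_n0=20, t0=1, log_s=10**6):
    print(i, m, round(log_t, 3))
```

With these sample inputs the loop stops after three rounds, and the logged exponent matches the closed form $\log t_j = (2^j - 1)\log 30 + 2^j \log t_0$, reflecting the tower-like growth of $m_i$ against the merely doubly exponential growth of $t_i$ in the number of rounds.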
Proof of Lemma 3.2. We need to fix some parameters: $k = n_0^{1/2}$, $\ell = n_0$ and $r = n_0^{1/4}$.
Constructing a hard polynomial:
The first step is to construct a polynomial $Q_k(z_1, \ldots, z_k)$ that cannot be computed by $n_0$-variate size $s^{15}$ circuits. This can be done by using Lemma 2.4.
The polynomial $Q_k(\mathbf{z})$ will therefore have the following properties.
• Individual degree $\le s^{15 g(n_0)/k} \le s$, and degree $\le s^{15 g(n_0)/k} \cdot s \le s^2$.
• $Q_k$ is not computable by circuits of size $s^{15}$.
• $Q_k$ has an algebraic circuit of size $\le 2s^{15 g(n_0)}$.
Building the NW design: Using Lemma 2.2 (footnote 5), construct an $(\ell, k, r)$-design $\{S_1, \ldots, S_{n_1}\}$ for $2^r = n_1$, which is bigger than $n_0$ since $n_0$ is large enough (footnote 4).
Footnote 4. This is to ensure that $2^{n^{1/4}} > n$ for all $n > n_0$, and this is true for any $n_0 > 2^{16}$.
Variable reduction using $Q_k$: Let $0 \not\equiv P(x_1, \ldots, x_m) \in \mathcal{C}(m, s, s)$. Suppose $P(Q_k[\ell, k, r]_{\mathrm{NW}}) \equiv 0$; then Lemma 2.3 forces $Q_k$ to have an algebraic circuit of size bounded by
which we know is false by our construction of $Q_k$. Therefore, $P(Q_k[\ell, k, r]_{\mathrm{NW}})$ is a nonzero polynomial.
Since $P$ has a circuit of size $s$ and $Q_k$ has a circuit of size at most $2s^{15 g(n_0)}$, it follows that the polynomial $P(Q_k[\ell, k, r]_{\mathrm{NW}})$ has a circuit of size at most $s + k \cdot 2s^{15 g(n_0)} \le s^{30 g(n_0)} =: s'$. Furthermore, the degree of the polynomial
Hitting set for $\mathcal{C}(n_1, s, s)$: From the above discussion, the polynomial $P(\mathbf{x}) \equiv 0$ if and only if $P' = P(Q_k[\ell, k, r]_{\mathrm{NW}}) \equiv 0$. We also know that $P' \in \mathcal{C}(\ell, s', s')$ where $s' = s^{30 g(n_0)}$. Therefore, by composing the hitting set for $\mathcal{C}(\ell, s', s')$ with $Q_k[\ell, k, r]_{\mathrm{NW}}$, we obtain a hitting set for $\mathcal{C}(n_1, s, s)$. The size of the hitting set is
Re-establishing the invariant:
where the last inequality uses the fact that $(\log n)^2 < 30 n^{1/4}$ for all $n \ge 1$.
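The elementary inequality quoted here, and the growth condition $2^{n^{1/4}} > n$ for $n > 2^{16}$ used to make $n_1 > n_0$, can be sanity-checked numerically. The sketch below assumes logarithms to base 2 and works with $x = \log_2 n$, so that the first inequality reads $x^2 < 30 \cdot 2^{x/4}$ and the second reads $2^{x/4} > x$; the scan ranges and names are arbitrary illustrative choices, and a numerical scan is of course not a proof.

```python
def log_sq_over_fourth_root(x):
    """Return (log2 n)^2 / n^(1/4), written in terms of x = log2 n."""
    return x * x / 2.0 ** (x / 4)

# Scan x = log2 n over [0, 200] in steps of 0.01; the ratio decays beyond that.
worst_ratio = max(log_sq_over_fourth_root(i / 100) for i in range(20001))
print(worst_ratio < 30)

# 2^(n^(1/4)) > n is equivalent to 2^(x/4) > x; equality holds at x = 16.
grows_past_16 = all(2.0 ** (x / 4) > x for x in range(17, 200))
print(grows_past_16)
```

The ratio $(\log n)^2 / n^{1/4}$ peaks near $\log_2 n = 8/\ln 2$ at a value of roughly 18, comfortably below the constant 30.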
It is clear that the entire construction is in polynomial time in the size of the hitting set of the conclusion and the running time of the hitting set construction in the hypothesis.
Near-optimal bootstrapping
To finish the proof of the main theorem (Theorem 1.1), we show how we can go from the hypothesis of Theorem 1.1 to the hypothesis of Theorem 3.1. This is again along the same lines as the proof of Lemma 3.2 but with a different choice of parameters.
Footnote 5. The lemma can provide more sets but this weaker version is chosen to just make some calculations easier.
Again, we will defer the proof of this lemma as Theorem 1.1 follows readily from this lemma.
Proof of Theorem 1.1. Since this conclusion satisfies the hypothesis of Theorem 3.1, we can infer that there is an explicit hitting set for $\mathcal{C}(s, s, s)$ of size $s^{\exp \circ \exp(O(\log^* s))}$.
Proof of Lemma 3.3. The strategy is exactly along the lines of Lemma 3.2 but we would have to work with slightly different parameters. Let n be the smallest integer satisfying the following constraints:
• n is at least 7 and is a power of 2,
Fix $n_0 := n^2$. Note that for every $s \ge \max_{m \le n_0}(\mathrm{HUGE}(m))$ and every $m \le n_0$, we have an explicit hitting set of size at most $s^{2m/f(m)}$ for the class $\mathcal{C}(m, s, s)$. Fix the parameters $\ell := n_0$, $k := \sqrt{n_0} = n$ and $r := f(n)$.
Constructing a suitably hard polynomial: We will construct a polynomial $Q_n(z_1, \ldots, z_n)$ that is not computable by algebraic circuits of size $s^{15}$. We will again invoke Lemma 2.4 to do this.
The polynomial $Q_n(\mathbf{z})$ has the following properties.
• Individual degree $\le s^{30/f(n)} \le s$, and degree $\le s^{30/f(n)} \cdot s \le s^2$.
• $Q_n$ is not computable by circuits of size $s^{15}$.
• $Q_n$ has an algebraic circuit of size $\le 2s^{30n/f(n)}$.
Building the NW design:
We will use Lemma 2.2 to construct an $(\ell, n, r)$-design $\{S_1, \ldots, S_{m_0}\}$ with $m_0 := n^r = n^{\sqrt{f(n)}}$.
Variable reduction: Let $P(x_1, \ldots, x_{m_0})$ be a nonzero circuit from $\mathcal{C}(m_0, s, s)$. Like in Lemma 3.2, since
and $Q_n$ is hard for circuits of size $s^{15}$ by construction, we have that $P(Q_n[\ell, n, r]_{\mathrm{NW}})$ must be nonzero.
Note that $P$ has a circuit of size $s$ and $Q_n$ is trivially computable by a circuit of size $2s^{30n/f(n)}$.
Therefore $P(Q_n[\ell, n, r]_{\mathrm{NW}})$ has a circuit of size $s + m_0 \cdot 2s^{30n/f(n)} \le s^{60n/f(n)} =: s'$. Also, the degree of the polynomial computed by $P(Q_n[\ell, n, r]_{\mathrm{NW}})$ is at most $s \cdot n s^{30/f(n)} \le s^3 \le s^{60n/f(n)}$.
Hitting set for $\mathcal{C}(m_0, s, s)$: Starting with a nonzero circuit from $\mathcal{C}(m_0, s, s)$, we have now obtained a nonzero circuit in the class $\mathcal{C}(n_0, s', s')$. We now apply the hypothesis to the circuit on $n_0$ variables, thereby obtaining an explicit hitting set for $\mathcal{C}(m_0, s, s)$ of size
It is easy to verify that the entire construction runs in time that is polynomial in (1) the time required for the construction of the hitting set from the hypothesis and (2) the size of the hitting set in the conclusion ($s^{h(m)}$).
Conclusions
The main results show that it suffices to construct hitting sets of size $s^{o(n)}$, which are barely better than the trivial hitting set of size $s^n$, to obtain an almost complete derandomisation. A natural question in the spirit of the results in this paper, and those in Agrawal et al. [AGS18], seems to be the following: can we hope to bootstrap lower bounds? In particular, can we hope to start from a mildly non-trivial lower bound for general arithmetic circuits (e.g. superlinear or just superpolynomial), and amplify it to get a stronger lower bound (superpolynomial or truly exponential, respectively)? In the context of non-commutative algebraic circuits, Carmosino et al. [CILM18] recently showed such results, but no such result appears to be known for commutative algebraic circuits.
