We give the first sub-exponential time deterministic polynomial identity testing algorithm for depth-4 multilinear circuits with a small top fan-in. More accurately, our algorithm works for depth-4 circuits with a plus gate at the top (also known as ΣΠΣΠ circuits) and has a running time of exp(poly(log(n), log(s), k)) where n is the number of variables, s is the size of the circuit and k is the fan-in of the top gate. In particular, when the circuit is of polynomial (or quasi-polynomial) size, our algorithm runs in quasi-polynomial time. In [AV08], it was shown that derandomizing polynomial identity testing for general ΣΠΣΠ circuits implies a derandomization of polynomial identity testing in general arithmetic circuits. Prior to this work sub-exponential time deterministic algorithms were known for depth-3 circuits with small top fan-in and for very restricted versions of depth-4 circuits.
INTRODUCTION
Polynomial Identity Testing (PIT) is one of the central problems in algebraic complexity theory: Given an arithmetic circuit C over a field F with input variables x1, x2, · · · , xn, can we check efficiently whether C computes the identically zero polynomial in the polynomial ring F[x1, x2, · · · , xn]? The same question can be asked in the black-box model too. In the black-box model, C is accessed by a black-box where we are allowed to substitute field elements ai ∈ F for xi and the black-box returns the value of C(a1, a2, · · · , an).
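To make the black-box model concrete, the following is a minimal sketch of a randomized tester that only queries evaluations of C. It assumes a prime field of size `field_size` and a known degree bound, and it relies on the standard Schwartz-Zippel bound (a non-zero polynomial of total degree d vanishes at a uniformly random point with probability at most d/|F|). The names `randomized_pit`, `zero_circuit` and `nonzero_circuit` are illustrative and do not come from the paper.

```python
import random

def randomized_pit(black_box, n, degree_bound, field_size, trials=20, seed=0):
    """Return True if every queried point evaluates to zero, False otherwise.

    A False answer is always correct. A True answer is wrong with
    probability at most (degree_bound / field_size) ** trials.
    """
    assert field_size > degree_bound
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randrange(field_size) for _ in range(n)]
        if black_box(point) % field_size != 0:
            return False
    return True

PRIME = 2**31 - 1  # field size used for the illustration

def zero_circuit(a):
    # (x + y)(x - y) - x^2 + y^2 is identically zero
    x, y, _ = a
    return ((x + y) * (x - y) - x * x + y * y) % PRIME

def nonzero_circuit(a):
    # x*y + z is a non-zero multilinear polynomial
    x, y, z = a
    return (x * y + z) % PRIME

zero_answer = randomized_pit(zero_circuit, 3, 2, PRIME)
nonzero_answer = randomized_pit(nonzero_circuit, 3, 2, PRIME)
```

A deterministic black-box algorithm replaces the random points by a fixed, explicitly constructed set of points.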
A randomized polynomial-time algorithm (more precisely, a coRP algorithm) for this problem follows from the Schwartz-Zippel Lemma [Sch80, Zip79]. Over the years PIT has played a significant role in our understanding of important complexity-theoretic and algorithmic problems. Well-known examples are the randomized NC algorithms for the matching problem in graphs [Lov79, MVV87] and the AKS primality test [AKS04]. The PIT problem has also played an indirect role in important complexity results such as IP = PSPACE [LFKN92, Sha90] and the old proof of the PCP theorem [ALM+98]. The main open problem is to come up with a deterministic polynomial-time (or at least subexponential-time) algorithm for PIT. In 2003, Kabanets and Impagliazzo [KI03] showed that a deterministic polynomial-time (or even subexponential-time) identity testing algorithm implies that either NEXP ⊄ P/poly or the integer Permanent has no polynomial-size arithmetic circuit. Considering the black-box derandomization of PIT, Agrawal further strengthened the connection between PIT and circuit lower bounds [Agr05]. More precisely, he showed that the black-box derandomization of PIT implies that an explicit multilinear polynomial has no subexponential-size arithmetic circuit.
The results of [KI03] and [Agr05] have triggered a large amount of research on PIT derandomization. So far, most of the derandomization results are known for depth-3 ΣΠΣ(k, d) circuits when the top Σ gate is of bounded fan-in k (d is the fan-in of the Π gates, which can be unbounded) [DS06, KS07, KS08, SS09, KS09]. In an important discovery, Agrawal and Vinay [AV08] justified the lack of progress beyond depth-3. What they show is that the black-box derandomization of PIT for only depth-4 ΣΠΣΠ circuits is almost as hard as that for general arithmetic circuits. Their result is based on a depth reduction technique [VSBR83, AJMV98] that converts any arithmetic circuit C to a depth-4 circuit C′ such that C and C′ compute the same polynomial. Thus, their reduction is suitable for black-box PIT derandomization. This connection makes the problem of black-box derandomization of PIT for depth-4 circuits an intriguing open problem.
So far all the black-box derandomization algorithms for depth-3 ΣΠΣ(k, d) circuits [DS06, KS08, SS09, KS09] exploit one common theme: The subspace spanned by the linear forms of an identically zero ΣΠΣ circuit (viewing each linear form as a vector in $\mathbb{F}^n$) is of low dimension. More precisely, over a finite field, the current best known bound for the dimension is $O(k^3 \log d)$ [SS09] and over the field Q of rational numbers, the bound is $2^{O(k \log k)}$, which is still a constant for a constant k [KS09]. For multilinear ΣΠΣ(k, d) circuits, over any characteristic, the dimension is bounded by $O(k^3 \log k)$ [DS06, SS09]. Yet, the algorithm with the best running time [SV09] was obtained using a different approach. For depth-4 circuits, the situation is very different. It seems unlikely that the method of black-box derandomization for ΣΠΣ circuits can be adapted or extended to depth-4 circuits. The main difficulty is that there seems to be no notion of a linear space, spanned by the circuit components, that can be used.
In this paper, we study the black-box PIT problem for multilinear depth-4 circuits with bounded fan-in at the top Σ gate. We give new techniques and come up with an efficient black-box algorithm which runs in time quasi-polynomial in the input size. We first formally define a depth-4 circuit. A depth-4 ΣΠΣΠ circuit has four layers of alternating Σ and Π gates. The top gate is a Σ gate (at level one).¹ The circuit computes a polynomial of the form

$$C(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{k} \prod_{j=1}^{d_i} P_{ij},$$

where k is the fan-in of the top Σ gate and the $d_i$ are the fan-ins of the Π gates at the second level. The $P_{ij}$-s are the polynomials computed at the third level of the circuit (each of which is a ΣΠ component). It is clear that the number of monomials in each $P_{ij}$ is bounded by the fan-in of the Σ gates at the third level, and in particular, bounded by the circuit size s. In the rest of the paper, we refer to the polynomials $P_{ij}$ as s-sparse polynomials, where the sparsity should be understood as a parameter of the circuit size. Also, for notational convenience, we denote depth-4 circuits whose top Σ fan-in is at most k by ΣΠΣΠ(k) circuits.
We consider the identity testing problem for ΣΠΣΠ(k) circuits when each multiplication gate $F_i = \prod_{j=1}^{d_i} P_{ij}$ computes a multilinear polynomial and the fan-in of the top Σ gate is a constant k. We call such circuits depth-4 multilinear ΣΠΣΠ circuits. We give a deterministic black-box PIT algorithm for this model with running time exp(poly(log(n), log(s), k)) (where s is the size of the circuit), which is quasi-polynomial in the input size. More formally, we prove the following theorem.

Theorem 1. Let k, n, s be integers. There is an explicit set H of size $n^{O(k^6 \log(k) \log^2 s)}$, that can be constructed in time $n^{O(k^6 \log(k) \log^2 s)}$, such that the following holds. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a multilinear ΣΠΣΠ(k) circuit of size s on n variables. Then there is some $\bar{\alpha} \in H$ such that $P(\bar{\alpha}) \neq 0$.

To the best of our knowledge, prior to our work, efficient deterministic algorithms were known only in the non-black-box setting for very restricted classes of depth-4 circuits [AM07, Sax08, SV09]. For higher depths, efficient black-box algorithms are known only for read-once formulas [SV08, SV09]. Designing efficient PIT algorithms for multilinear circuits is an important problem and our result makes the first step for the important class of depth-4 circuits.

¹ One can consider ΠΣΠΣ circuits; however, for identity testing purposes the interesting case is the ΣΠΣΠ model.
Overview of Our Algorithm and Proof Technique
To start with, we briefly define the notions of generators and hitting sets for arithmetic circuits, which are important for our algorithm. Intuitively, a generator G for a class of polynomials M is a function that stretches q independent variables into n >> q dependent variables that can be plugged into any non-zero polynomial P ∈ M without making it vanish. A set $H \subseteq \mathbb{F}^n$ is a hitting set for a class of polynomials M if for every non-zero polynomial P ∈ M, there exists $\bar{a} \in H$ such that $P(\bar{a}) \neq 0$. In identity testing, the roles of generators and hitting sets are equivalent. The image of a generator for a class of circuits always contains a hitting set for the same class of circuits. Conversely, given a hitting set for a class of arithmetic circuits, it is fairly easy to construct a generator.
In our algorithm, we use a recursive technique (on the fan-in k of the top Σ gate) to find a generator for ΣΠΣΠ(k) circuits, and in every stage of the recursion we also construct a hitting set. Recall that the sparsity of the polynomials $P_{ij}$ is bounded by the circuit size s. For k = 1, we need to build a generator for a product of s-sparse polynomials. It is easy to see that a generator for a single s-sparse polynomial is also a generator for a product of s-sparse polynomials, and the construction of a generator for an s-sparse polynomial is well known [KS01].
For k > 1, we construct the generator via the following procedure: Let P be a non-zero n-variate polynomial computed by a ΣΠΣΠ(k) circuit C of size s and let $G_{k-1}$ be a generator for ΣΠΣΠ(k − 1) circuits of size s. We prove that there exists a set U ⊆ [n] of size poly(log s) such that a substitution of the generator $G_{k-1}$ to the variables (indexed by) [n] \ U leads to a non-zero polynomial. By going over all possible choices for U, we can produce a small hitting set for ΣΠΣΠ(k) circuits, which in turn is transformed into a generator using the techniques of [SV09]. Notice that the number of choices for U is bounded by $n^{\mathrm{poly}(\log s)}$, which is quasi-polynomial in s. Now we justify the existence of U, which is enough to justify the correctness of our algorithm. We describe the construction of U in two different cases.
Case I: Assume that there exists some large constant r (that depends on k) such that for each i, j, the polynomial $P_{ij}$ depends on at most n/r variables. We show that there exists a subset of the variables V ⊆ [n] of size roughly r/k such that every $P_{ij}$ has at most one variable $x_\ell$ with $\ell \in V$. Now, let $G_{k-1}$ be a generator for ΣΠΣΠ(k − 1) circuits. Then by suitably fixing the variables whose indices are in [n] \ V from the image set of $G_{k-1}$, we obtain a multilinear depth-3 ΣΠΣ(k) circuit. Using a structural theorem for identically zero depth-3 circuits from [DS06, SS09], our fixing ensures that the resulting depth-3 circuit computes a non-zero polynomial.
Case II: We prove that for any polynomial P that can be computed by a ΣΠΣΠ(k) circuit, there exists a set W ⊆ [n] of size poly(log s) (recall that s is the size of the given circuit) such that the following property holds: For a set S ⊆ [n], let m(S) be the multilinear monomial $\prod_{i \in S} x_i$. Express the polynomial P as $P = \sum_{S \subseteq W} m(S) \cdot P_S$, where each polynomial $P_S$ is over $x_{[n] \setminus W}$. We prove that there exists a subset S such that $P_S$ can be computed by a ΣΠΣΠ(k) circuit C′ and each polynomial $P'_{ij}$ computed in the third level of C′ depends only on a small fraction of the total number of variables (which gives a reduction to the first stage). We now explain how to find the set W. Fix r suitably. Let C be the given ΣΠΣΠ(k) circuit. Write C as $C = \sum_{i=1}^{k} N_i \cdot A_i$, where $N_i = \prod_j P_{ij}$ is the product of those $P_{ij}$ under the i-th Π gate that depend on at most n/r variables. Similarly, let $A_i$ be the product of the rest of the polynomials under the i-th Π gate. So by definition, each $P_{ij}$ in $A_i$ depends on at least n/r variables. Hence, by multilinearity, each $A_i$ is a product of at most r many $P_{ij}$-s. We eliminate the $A_i$-s by the following process: Consider a variable appearing in some $A_i$. By either setting the variable to zero or taking a partial derivative with respect to it, we can get rid of at least half of the monomials in $A_i$. Moreover, we show that such a variable exists with the additional property that both choices (either setting to zero or taking a partial derivative) result in a non-zero polynomial. Repeating this process at most O(log s) times, we can eliminate all such $A_i$.

The final set U that our algorithm considers is defined as $U \triangleq W \cup V$ (the union of the sets found at the first and second stages).
Organization
We start by giving the required definitions in Sections 2 and 3. We prove our main theorem (Theorem 4.11) in Section 4, showing a construction of a generator for ΣΠΣΠ(k) circuits. In Section 4.5 we give as an easy corollary a hitting set for ΣΠΣΠ(k) circuits.
PRELIMINARIES
For a positive integer n denote [n] = {1, . . . , n}. Let $\mathbb{F}$ be a field and $\bar{\mathbb{F}}$ be its algebraic closure. For a polynomial $P(x_1, \ldots, x_n)$, a variable $x_i$, and $\alpha \in \mathbb{F}$, $P|_{x_i=\alpha}$ is the polynomial resulting after substituting α for the variable $x_i$. The following definitions are for polynomials $P, Q \in \mathbb{F}[x_1, x_2, \ldots, x_n]$. We say that P depends on $x_i$ if there exist $\bar{a} \in \bar{\mathbb{F}}^n$ and $b \in \bar{\mathbb{F}}$ such that:

$$P(a_1, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_n) \neq P(a_1, \ldots, a_{i-1}, b, a_{i+1}, \ldots, a_n).$$

We denote $\mathrm{var}(P) \triangleq \{x_i \mid P \text{ depends on } x_i\}$. Intuitively, P depends on $x_i$ if $x_i$ appears when P is listed as a sum of monomials. Given a subset I ⊆ [n] and an assignment $\bar{a} \in \mathbb{F}^n$ we define $P|_{x_I=\bar{a}_I}$ to be the polynomial resulting from substituting $a_i$ for the variable $x_i$ for every i ∈ I. Let P, Q be two non-constant polynomials. We say that P and Q are similar, and denote P ∼ Q, if there exist $\alpha, \beta \in \mathbb{F} \setminus \{0\}$ such that α · P = β · Q. Let $D_i(P, Q)$ be the polynomial defined as follows:

$$D_i(P, Q) \triangleq P \cdot Q|_{x_i=0} - P|_{x_i=0} \cdot Q,$$

which is again a polynomial over $\mathbb{F}$. The following is an easy observation.
Observation 2.1. Let P, Q ∈ F[x1, x2, . . . , xn] be two multilinear polynomials such that xi ∈ var(P ) ∩ var(Q). If P ∼ Q then Di(P, Q) ≡ 0.
Mappings and Generators for arithmetic circuits
In this section, we formally define the notions of generators and hitting sets for polynomials and describe a few useful properties. A mapping $G = (G^1, \ldots, G^n) : \mathbb{F}^q \to \mathbb{F}^n$ is a generator for the circuit class M if for every non-zero n-variate polynomial P ∈ M, it holds that $P(G) \not\equiv 0$. The image of the map G is denoted $\mathrm{Im}(G) = G(\mathbb{F}^q)$. Ideally, q should be very small compared to n. A set $H \subseteq \mathbb{F}^n$ is a hitting set for a circuit class M if for every non-zero polynomial P ∈ M, there exists $\bar{a} \in H$ such that $P(\bar{a}) \neq 0$. A generator can also be viewed as a mapping containing a hitting set for M in its image. That is, for every non-zero P ∈ M there exists $\bar{a} \in \mathrm{Im}(G)$ such that $P(\bar{a}) \neq 0$. In [SV09] an efficient method of constructing a generator from a hitting set, for a (relatively) small q, is given.

Lemma 2.2 (Lemma 4.8 in [SV09]). Let $|\mathbb{F}| > n$. Given a hitting set $H \subseteq \mathbb{F}^n$ for a circuit class M, there is an algorithm that in time poly(|H|, n) constructs a mapping $L(\bar{y}) : \mathbb{F}^q \to \mathbb{F}^n$ which is a generator for M with $q \triangleq \lceil \log_n |H| \rceil$, and the individual degrees of the $L^i$ are bounded by n − 1.
The following is an immediate and important property of a generator:
Observation 2.3. Let $P = P_1 \cdot P_2 \cdots P_k$ be a product of non-zero polynomials $P_i$ ∈ M and let G be a generator for M. Then $P(G) \not\equiv 0$.

At times we would like to apply a generator to only some of the variables of a polynomial. Given a subset I ⊆ [n] we define the mapping $G^I$ as $(G^I)^i = G^i$ when $i \in I$ and $(G^I)^i = x_i$ when $i \notin I$. In addition, we define $P|_{x_I = G^I}$ to be the polynomial resulting from substituting the function $G^i$ for the variable $x_i$ for each i ∈ I. The following is an immediate observation:

Observation 2.4. Let M be a class of polynomials and let G be a generator for n-variate polynomials in M. Let I ⊆ [n] and let P ∈ M be a non-zero polynomial. Then $P|_{x_I = G^I} \not\equiv 0$. Moreover, there exists $\bar{a} \in \mathrm{Im}(G^I)$ such that $P(\bar{a}) \neq 0$.
Partial Derivatives
Discrete partial derivatives will play an important role in the analysis of our algorithms.
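Definition 2.5. Let P be an n-variate polynomial over a field $\mathbb{F}$. We define the discrete partial derivative of P with respect to $x_i$ as

$$\frac{\partial P}{\partial x_i} \triangleq P|_{x_i=1} - P|_{x_i=0}.$$

For a non-empty subset I ⊆ [n], $I = \{i_1, \ldots, i_{|I|}\}$, we define the iterated partial derivative with respect to I in the following way:

$$\partial_I P \triangleq \frac{\partial^{|I|} P}{\partial x_{i_1} \partial x_{i_2} \cdots \partial x_{i_{|I|}}}.$$

Notice that if P is a multilinear polynomial then this definition coincides with the "analytical" one when $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$. We now state some facts about discrete partial derivatives that can be easily verified. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a multilinear polynomial. Then, P depends on $x_i$ if and only if $\frac{\partial P}{\partial x_i} \not\equiv 0$. For every i and j,

$$\frac{\partial^2 P}{\partial x_i \partial x_j} = \frac{\partial^2 P}{\partial x_j \partial x_i}.$$

For two different variables $x_i$, $x_j$, derivative and substitution commute: for every $a \in \mathbb{F}$,

$$\left.\frac{\partial P}{\partial x_i}\right|_{x_j=a} = \frac{\partial}{\partial x_i}\left(P|_{x_j=a}\right).$$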
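The facts about discrete partial derivatives can be checked on a small example. The sketch below assumes the definition ∂P/∂x_i = P|_{x_i=1} − P|_{x_i=0} and represents a multilinear polynomial as a dictionary from monomials (frozensets of variable indices) to coefficients. This representation and the helper names `substitute`, `subtract` and `derivative` are illustrative choices, not part of the paper.

```python
def substitute(poly, i, value):
    """Return P|_{x_i = value} for a multilinear polynomial {monomial: coeff}."""
    out = {}
    for mon, c in poly.items():
        if i in mon:
            mon, c = mon - {i}, c * value
        out[mon] = out.get(mon, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def subtract(p, q):
    """Return p - q, dropping monomials whose coefficient cancels."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) - c
    return {m: c for m, c in out.items() if c != 0}

def derivative(poly, i):
    """Discrete partial derivative: P|_{x_i=1} - P|_{x_i=0}."""
    return subtract(substitute(poly, i, 1), substitute(poly, i, 0))

# P = 3*x1*x2 + 5*x2*x3 + 7
P = {frozenset({1, 2}): 3, frozenset({2, 3}): 5, frozenset(): 7}

d1 = derivative(P, 1)                    # 3*x2
d4 = derivative(P, 4)                    # zero: P does not depend on x4
d12 = derivative(derivative(P, 1), 2)    # derivatives commute
d21 = derivative(derivative(P, 2), 1)
lhs = substitute(derivative(P, 1), 2, 4)  # derivative, then x2 = 4
rhs = derivative(substitute(P, 2, 4), 1)  # x2 = 4, then derivative
```

For a multilinear polynomial the derivative keeps exactly the monomials containing x_i (with x_i removed), while the substitution x_i = 0 keeps exactly the monomials without x_i.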
Known Results
In this section, we recall some known results about sparse polynomials and depth-3 ΣΠΣ circuits which play an important role in the design of our algorithm. An m-sparse polynomial is a polynomial with at most m non-zero monomials. Equivalently, it is a polynomial computed by a depth-2 circuit with top fan-in m. Using a result of [KS01], we can construct an efficient generator for sparse polynomials.

Lemma 2.6 (Theorem 10 of [KS01]). In time polynomial in m, n, d and $\log |\mathbb{F}|$, one can output a hitting set H of cardinality |H| = poly(n, m, d) for n-variate m-sparse polynomials of degree d over a field $\mathbb{F}$. If $\mathbb{F} = \mathbb{R}$ then each element of each vector in the set has bit-length at most O(log(nd)). If $\mathbb{F}$ is a finite field with fewer than $(nd)^6$ elements, then the elements of the vectors lie in the smallest extension of $\mathbb{F}$ with at least $(nd)^6$ elements; otherwise, the vectors contain just elements of $\mathbb{F}$.
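Using Lemma 2.2, we can construct a generator from the hitting set output by the above result.

Lemma 2.7. For all integers n and m there is a mapping $S_m(\bar{w}) : \mathbb{F}^{q(n,m)} \to \mathbb{F}^n$ that is a generator for n-variate multilinear m-sparse polynomials, where q(n, m) = O(log nm).

Proof. Since the degree of a multilinear polynomial is bounded by n, we apply Lemma 2.2 to the hitting set H output by Lemma 2.6 with d = n. As |H| = poly(n, m), we obtain that q(n, m) = O(log nm).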
The proof technique of our main result involves a reduction from identity testing of a class of depth-4 circuits to depth-3 circuits. Here, we define depth-3 circuits formally and recall some of their relevant properties. A depth-3 ΣΠΣ(k, d) circuit C computes a polynomial of the form

$$C(\bar{x}) = \sum_{i=1}^{k} F_i(\bar{x}) = \sum_{i=1}^{k} \prod_{j=1}^{d_i} L_{ij}(\bar{x}),$$

where the $L_{ij}(\bar{x})$-s are linear functions $L_{ij}(\bar{x}) = \sum_{t=1}^{n} a_{ij}^{t} x_t + a_{ij}^{0}$ with $a_{ij}^{t} \in \mathbb{F}$, and $d_i \leq d$. We refer to the $F_i$-s as the multiplication gates of the circuit. A subcircuit of C is defined as a sum of a subset of the multiplication gates in C. Let $\gcd(C) \triangleq \gcd(F_1, F_2, \ldots, F_k)$. We say that a circuit is simple if gcd(C) = 1. We say that a circuit is minimal if no proper subcircuit of C computes the zero polynomial. Define the rank of C, denoted by rank(C), as the rank of its linear functions, viewed as (n + 1)-dimensional vectors over $\mathbb{F}$. A multilinear ΣΠΣ(k, d) circuit has the additional requirement that each $F_i$ is a multilinear polynomial. We require an important structural theorem regarding the rank of an identically zero multilinear ΣΠΣ(k, d) circuit.

Theorem 2.8 ([DS06, SS09]). There exists an increasing integer function R(k), upper bounded by $O(k^3 \log k)$, with the following property: Let C be an n-variate multilinear, simple and minimal ΣΠΣ(k, d) circuit computing the zero polynomial. Then rank(C) < R(k).
We conclude this section with a well-known lemma concerning polynomials, giving a trivial (yet possibly large) hitting set. A proof can be found in [Alo99] .
Lemma 2.9. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a polynomial. Suppose that for every i ∈ [n] the individual degree of $x_i$ is bounded by $d_i$, and let $S_i \subseteq \mathbb{F}$ be such that $|S_i| > d_i$. Denote $S = S_1 \times S_2 \times \cdots \times S_n$. Then $P \equiv 0$ if and only if $P|_S \equiv 0$.
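The grid test of Lemma 2.9 can be sketched as follows, working over the integers for simplicity. The function name `is_zero_on_grid` and the example polynomials are illustrative assumptions; the polynomial is accessed only through evaluations, as in the black-box model.

```python
from itertools import product

def is_zero_on_grid(poly, degree_bounds, field_elements):
    """Evaluate poly on S_1 x ... x S_n with |S_i| = d_i + 1.

    By Lemma 2.9, poly is identically zero iff this returns True,
    provided degree_bounds really bound the individual degrees.
    """
    grids = [field_elements[: d + 1] for d in degree_bounds]
    assert all(len(g) == d + 1 for g, d in zip(grids, degree_bounds))
    return all(poly(point) == 0 for point in product(*grids))

def p(point):
    # x*(x-1)*y: individual degrees (2, 1), non-zero polynomial
    x, y = point
    return x * (x - 1) * y

def z(point):
    # (x+y)^2 - x^2 - 2xy - y^2: identically zero
    x, y = point
    return (x + y) ** 2 - x * x - 2 * x * y - y * y

elements = list(range(10))
p_correct_bounds = is_zero_on_grid(p, (2, 1), elements)  # grid is large enough
p_small_grid = is_zero_on_grid(p, (1, 1), elements)      # grid too small: fooled
z_result = is_zero_on_grid(z, (2, 2), elements)
```

The second call shows why the condition |S_i| > d_i matters: p vanishes on {0, 1} × {0, 1} although it is not the zero polynomial. The size of the grid is the product of the (d_i + 1), which is why this hitting set is trivial yet possibly large.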
DEPTH-4 MULTILINEAR CIRCUITS
In this section, we recall the model of depth-4 multilinear circuits and present a simple structural property of such circuits which is useful for our main result.
Definition 3.1. A multilinear depth-4 ΣΠΣΠ(k) circuit C has four layers of alternating Σ and Π gates (the top Σ gate is at level one) and it computes a polynomial of the form

$$C(\bar{x}) = \sum_{i=1}^{k} F_i(\bar{x}) = \sum_{i=1}^{k} \prod_{j=1}^{d_i} P_{ij}(\bar{x}),$$

where the $P_{ij}(\bar{x})$-s are multilinear polynomials computed by the last two layers of ΣΠ gates of the circuit and are the inputs to the Π gates at the second level. Each multiplication gate $F_i$ computes a multilinear polynomial.

Note that the requirement that the $F_i$-s compute multilinear polynomials implies that for each i the polynomials $\{P_{ij}\}_{j \in [d_i]}$ are variable-disjoint. It is clear that if the circuit size is s, then the number of monomials in $P_{ij}$ (i.e. its sparsity) is bounded by s. In this paper, we often refer to the polynomials $P_{ij}$ as s-sparse, where the sparsity should be understood in terms of the circuit size s. Similar to the case of depth-3 circuits, a (proper) subcircuit of C is defined as the sum of a (proper) subset of the multiplication gates of C. Also, a ΣΠΣΠ(k) circuit is simple when no $P_{ij}$ appears in all the multiplication gates at the second level. Namely, $\gcd(C) \triangleq \gcd(F_1, \ldots, F_k) = 1$. When C is not simple we define its simplification to be

$$\mathrm{sim}(C) \triangleq \frac{C}{\gcd(C)} = \sum_{i=1}^{k} \frac{F_i}{\gcd(C)}.$$

Note that sim(C) is a simple ΣΠΣΠ(k) circuit.
Our identity testing algorithm builds on a reduction from identity testing of multilinear ΣΠΣΠ(k) circuits to identity testing of a special type of such circuits where for every i, j, $|\mathrm{var}(P_{ij})| \leq n/r$. We call such circuits r-compressed circuits. Now we prove an easy structural property of r-compressed circuits which is useful for our algorithm design.

Lemma 3.2. Let C be an r-compressed multilinear ΣΠΣΠ(k) circuit on n variables. Then there exists a set V ⊆ [n] of size at least r/k such that $|V \cap \mathrm{var}(P_{ij})| \leq 1$ for every i, j.

Proof. The first element of V can be arbitrarily set to 1 (the index of $x_1$). Let $T_1 \subseteq [n]$ be the set of variables that appear in some $P_{ij}$ along with $x_1$. Since the $P_{ij}$-s under each multiplication gate are variable-disjoint, $x_1$ appears in at most k of them, and as $|\mathrm{var}(P_{ij})| \leq n/r$, we get that $|T_1| \leq k \cdot (\frac{n}{r} - 1)$. Hence, the set $W = [n] \setminus (T_1 \cup \{1\})$ is non-empty. We pick one index arbitrarily from W (say, the lowest one) and construct for it a set analogous to $T_1$. Due to the size restriction on $T_1$ (and the other T's), we can continue this process at least r/k times. The set V is the set of these (at least) r/k chosen indices.
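The greedy selection described in this proof can be sketched as follows. The sketch assumes the circuit is given only through the variable sets var(P_ij), one list of sets per multiplication gate; the function name `pick_spread_variables` and the example circuit are hypothetical.

```python
def pick_spread_variables(n, gates):
    """Greedily choose indices so that every support contains at most one.

    gates[i] is the list of variable sets var(P_ij) under the i-th
    multiplication gate (variable-disjoint within a gate).
    """
    available = set(range(1, n + 1))
    chosen = []
    while available:
        v = min(available)  # lowest available index
        chosen.append(v)
        blocked = {v}
        for gate in gates:
            for support in gate:
                if v in support:
                    blocked |= support  # variables sharing a P_ij with x_v
        available -= blocked
    return chosen

# n = 8 variables, k = 2 gates, r = 4: every support has at most n/r = 2 variables
n, k, r = 8, 2, 4
gates = [
    [{1, 2}, {3, 4}, {5, 6}, {7, 8}],
    [{1, 3}, {2, 5}, {4, 7}, {6, 8}],
]
chosen = pick_spread_variables(n, gates)
```

On this example the procedure picks the indices 1, 4, 5, 8, which is at least r/k = 2 indices, and no support contains two of them.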
BLACK-BOX PIT
In this section we give an efficient black-box PIT algorithm for multilinear ΣΠΣΠ(k) circuits. We construct a generator for such circuits, which gives us a small hitting set. We start by describing the construction of a polynomial map which we eventually use as our generator.
The Construction and Some Easy Properties
In this section we construct a map from $\mathbb{F}^{2t}$ to $\mathbb{F}^n$ with the following property: Its image contains all vectors $\bar{a} \in \mathbb{F}^n$ with at most t non-zero entries. This map will later be put to use in the construction of the generator for depth-4 circuits. We assume that $|\mathbb{F}| > n$, as we are allowed to use elements from an appropriate extension field. Throughout the entire section we fix a set $A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subseteq \mathbb{F}$ of n distinct elements.

Definition 4.1. For every i ∈ [n] let $u_i(w) : \mathbb{F} \to \mathbb{F}$ be the i-th Lagrange interpolation polynomial for the set A. That is, each $u_i(w)$ is a polynomial of degree n − 1 satisfying $u_i(\alpha_j) = 1$ if i = j and zero otherwise. For every i ∈ [n] and t ≥ 1 we define $G_t^i(y_1, \ldots, y_t, z_1, \ldots, z_t) : \mathbb{F}^{2t} \to \mathbb{F}$ as

$$G_t^i(y_1, \ldots, y_t, z_1, \ldots, z_t) \triangleq \sum_{j=1}^{t} u_i(y_j) \cdot z_j.$$

Finally, let $G_t(y_1, \ldots, y_t, z_1, \ldots, z_t) : \mathbb{F}^{2t} \to \mathbb{F}^n$ be defined as

$$G_t(y_1, \ldots, y_t, z_1, \ldots, z_t) \triangleq \left(G_t^1, G_t^2, \ldots, G_t^n\right).$$
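The following sketch illustrates why such a map can hit every vector with at most t non-zero entries. It assumes the i-th coordinate of G_t is the sum over j of u_i(y_j) · z_j, with u_i the Lagrange basis polynomials for A, and it works over the rationals with exact arithmetic. The function names and the choice A = {1, ..., 5} are illustrative.

```python
from fractions import Fraction

def lagrange_basis(alphas, i, w):
    """u_i(w): equals 1 at alphas[i] and 0 at every other node of alphas."""
    value = Fraction(1)
    for j, a in enumerate(alphas):
        if j != i:
            value *= Fraction(w - a, alphas[i] - a)
    return value

def generator(alphas, ys, zs):
    """G_t(y, z): the i-th coordinate is sum_j u_i(y_j) * z_j."""
    return [
        sum(lagrange_basis(alphas, i, y) * z for y, z in zip(ys, zs))
        for i in range(len(alphas))
    ]

alphas = [1, 2, 3, 4, 5]  # the set A, n = 5

# Target: the 2-sparse vector (0, 7, 0, -3, 0). Choosing y_j = alpha_m places
# z_j in coordinate m, so t = 2 pairs (y_j, z_j) are enough.
sparse_point = generator(alphas, ys=(alphas[1], alphas[3]), zs=(7, -3))

# Setting all z_j to zero gives the zero vector regardless of the y_j.
zero_point = generator(alphas, ys=(10, 11), zs=(0, 0))
```

The map uses only 2t input variables, and each coordinate has degree at most n − 1 in each y_j.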
We will use the following immediate observations:
Observation 4.2. For every t ≥ 1, it holds Gt(ȳ,0) ≡ 0.
Observation 4.3. Denote withēi ∈ {0, 1} n the vector that has 1 in the i-th coordinate and 0 elsewhere. Then
Hence, for every t ≥ 1 and αm ∈ A we have that
We now state a simple but crucial property of the generator G that follows from the above observations. (Recall the notation above Observation 2.4).
Observation 4.4. Let $\ell, t \in \mathbb{N}$, I ⊆ [n] and |I| ≤ t. Then, it holds that
A Restricted Case: r-compressed ΣΠΣΠ(k) Circuits
In this section we consider a restricted class of ΣΠΣΠ(k) circuits: For a fixed r, we assume that the polynomial is computed by a simple r-compressed ΣΠΣΠ(k) circuit of size s. Using a generator that works for sparse polynomials as well as for ΣΠΣΠ(k − 1) circuits of size s, we construct a generator for r-compressed ΣΠΣΠ(k) circuits of size s. To do so, the key idea is to use the set V ⊆ [n] that we obtain from Lemma 3.2. Recall that V has the following property: The size of V is at least r/k and for every $P_{ij}$ in C, $|V \cap \mathrm{var}(P_{ij})| \leq 1$. Let $G_{k-1}$ be a generator for ΣΠΣΠ(k − 1) circuits and for sparse polynomials of suitable sparsity. In the following lemma we show that if r = R(k) · k, then when we restrict the variables in [n] \ V to $G_{k-1}$, we obtain a non-zero polynomial.
Lemma 4.5. Let k ≥ 2 and let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a simple, multilinear, k · R(k)-compressed ΣΠΣΠ(k) circuit of size s. In addition, let $G_{k-1}$ be a generator for ΣΠΣΠ(k − 1) circuits of size s and $(2s^2)$-sparse polynomials. Then there exists a subset V ⊆ [n] such that $P|_{x_{[n] \setminus V} = G_{k-1}^{[n] \setminus V}} \not\equiv 0$.

Proof. Let $C = \sum_{i=1}^{k} \prod_{j=1}^{d_i} P_{ij}(\bar{x})$ be a simple, multilinear, and (k · R(k))-compressed ΣΠΣΠ(k) circuit of size s, computing P. If C is not minimal, then P can be computed by a ΣΠΣΠ(k − 1) circuit of size s and we are done (set V = ∅). We may therefore assume that C is minimal. Let V be a set promised by Lemma 3.2. We can assume w.l.o.g. that |V| = R(k); denote $T \triangleq [n] \setminus V$.
We now describe a way to find an assignment for $x_T$ such that the resulting polynomial is non-zero. We do so via a reduction to depth-3 circuits. Let $C_1, \ldots, C_{2^k-2}$ be the proper subcircuits of C (excluding the empty circuit). Clearly they are all ΣΠΣΠ(k − 1) circuits of size s. For any $P_{i_1 j_1}$ and $P_{i_2 j_2}$ appearing in C, and a variable $x_\ell$ such that $\mathrm{var}(P_{i_1 j_1}) \cap \mathrm{var}(P_{i_2 j_2}) \cap V = \{x_\ell\}$, define the polynomial Q as $D_\ell(P_{i_1 j_1}, P_{i_2 j_2})$ (recall $D_\ell$ and its property from Observation 2.1). Let $\mathcal{Q}$ be the set of all non-zero such Q's. The following lemma gives a sufficient condition for a given partial assignment to $x_T$ to result in a simple, minimal and non-zero depth-3 circuit.
Lemma 4.6. Let $\varphi \triangleq \prod_{i=1}^{2^k-2} C_i \cdot \prod_{Q \in \mathcal{Q}} Q$. Let $\bar{a} \in \bar{\mathbb{F}}^n$ be such that $\varphi|_{x_T=\bar{a}_T} \not\equiv 0$. Then $C|_{x_T=\bar{a}_T}$ is a simple, minimal multilinear ΣΠΣ(k) circuit.

Proof. The minimality of $C|_{x_T=\bar{a}_T}$ is clear since all of the proper subcircuits of C are factors of φ: if one of them is reduced to zero, then so is $\varphi|_{x_T=\bar{a}_T}$. For the same reason, no $P_{ij}$ is reduced to zero. In order to prove that $C|_{x_T=\bar{a}_T}$ is simple, notice the following two simple facts. First, by the definition of V, for every i, j it holds that $|\mathrm{var}(P_{ij}|_{x_T=\bar{a}_T})| \leq 1$. Second, every $Q = D_\ell(P_{i_1 j_1}, P_{i_2 j_2}) \in \mathcal{Q}$ is a factor of φ and so

$$D_\ell\left(P_{i_1 j_1}|_{x_T=\bar{a}_T},\; P_{i_2 j_2}|_{x_T=\bar{a}_T}\right) = Q|_{x_T=\bar{a}_T} \not\equiv 0.$$

Hence, by Observation 2.1, $P_{i_1 j_1}|_{x_T=\bar{a}_T} \not\sim P_{i_2 j_2}|_{x_T=\bar{a}_T}$. Since C is itself a simple circuit, the claim follows from those two facts.

Now we return to the proof of Lemma 4.5. The polynomial φ is a product of $(2s^2)$-sparse polynomials and ΣΠΣΠ(k − 1) circuits of size s. By Observations 2.3 and 2.4 we get that $\varphi|_{x_T = G_{k-1}^T} \not\equiv 0$. It follows that there exists some $\bar{a} \in \mathrm{Im}(G_{k-1})$ for which $C|_{x_T=\bar{a}_T}$ is a simple, minimal, and multilinear ΣΠΣ(k) circuit. Notice now that $C|_{x_T=\bar{a}_T}$ contains R(k) variables (the previous proof shows that all the variables in V 'survived') and any linear function appearing in it contains only one variable. Hence, the rank of $C|_{x_T=\bar{a}_T}$ is R(k). By the definition of R(k) (Theorem 2.8) it cannot be a zero circuit. We thus proved that $P|_{x_T = G_{k-1}^T} \not\equiv 0$.
In this section we prove a structural theorem for multilinear ΣΠΣΠ(k) circuits. This theorem enables us to reduce the identity testing of multilinear ΣΠΣΠ(k) circuits to the identity testing of r-compressed multilinear ΣΠΣΠ(k) circuits for any r > 0. Roughly, the theorem says that there exists a small set of variables W with the following property. Let $P = \sum_{T \subseteq W} m(T) \cdot F_T$, where the $F_T$ are polynomials defined over the variables [n] \ W and $m(T) = \prod_{i \in T} x_i$. Then there exists T such that $F_T$ can be computed by an r-compressed ΣΠΣΠ(k) circuit of size s. Now we state the theorem formally.
Theorem 4.7. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a multilinear ΣΠΣΠ(k) circuit of size s. Let r > 0 be a parameter. Then there exists a set W of size $|W| \leq 2 \log n \cdot \log s \cdot kr$ for which the following holds: Write

$$P = \sum_{T \subseteq W} m(T) \cdot F_T,$$

where the $F_T$'s are non-zero polynomials independent of the variables in W. Then there exists at least one set T ⊆ W for which $F_T = Q \cdot H$, where Q is a product of s-sparse polynomials and H is computable by a simple, r-compressed ΣΠΣΠ(k) circuit of size s.
An alternative view of the theorem states that there exist two sets I, J with the following properties: If we set the variables of J to zero and take a partial derivative w.r.t. I (i.e. compute $\partial_I P|_{x_J=\bar{0}_J}$) then we get an r-compressed ΣΠΣΠ(k) circuit of size s, multiplied by s-sparse polynomials. The set I corresponds to the variables in the monomial (i.e. T) and J to the variables of W outside the monomial (i.e. W \ T). We find this alternative view more convenient for the purpose of proving the theorem.
Lemma 4.8. Let n, s, r, k > 1 be integers. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a multilinear ΣΠΣΠ(k) circuit C of size s. Then there exist disjoint subsets I, J ⊆ [n] such that |I| + |J| ≤ 2kr log(s) and $\partial_I P|_{x_J=\bar{0}_J}$ is a non-zero polynomial computed by a ΣΠΣΠ(k) circuit $C_{IJ}$ with at least one of the following properties:
• $|\mathrm{var}(\mathrm{sim}(C_{IJ}))| < |\mathrm{var}(\mathrm{sim}(C))| / 2$ (recall the definition of sim(C) from Section 3).
• $\mathrm{sim}(C_{IJ})$ is an r-compressed ΣΠΣΠ(k) circuit of size s.
Proof. Assume that C itself does not meet any of the needed conditions. Let $\mathrm{sim}(C) = \sum_{j=1}^{k} M_j$, where each $M_j$ is a multiplication gate of a ΣΠΣΠ(k) circuit of size s. Write $M_j = N_j \cdot A_j$, where $N_j$ is a product of s-sparse polynomials which are defined on at most $|\mathrm{var}(\mathrm{sim}(C))| / 2r$ variables and $A_j = M_j / N_j$. Clearly, $A_j$ is a product of s-sparse polynomials, each defined on at least $|\mathrm{var}(\mathrm{sim}(C))| / 2r$ variables. Hence, due to the multilinearity of $M_j$, $A_j$ is a product of at most 2r such polynomials and must be $s^{2r}$-sparse. Let mon(A) denote the number of monomials in a polynomial A. Let

$$\Phi(C) \triangleq \sum_{j=1}^{k} \log\left(\mathrm{mon}(A_j)\right)$$

be a potential function that will aid us during the proof. We assume w.l.o.g. that Φ(C) is minimal w.r.t. all possible ΣΠΣΠ(k) circuits of size s computing P. Notice that Φ(C) ≤ 2kr log(s). Let $I_0 = J_0 = \emptyset$, $P_0 = P$ and $A_{j,0} = A_j$ for each j ∈ [k]. We now describe an algorithm that produces the required sets I, J. The algorithm is composed of enumerated steps, starting from 1. At each step, we add a single element either to I or to J. Denote by $I_\ell$ and $J_\ell$ the sets at the end of step ℓ. Correspondingly, define $P_\ell \triangleq \partial_{I_\ell} P|_{x_{J_\ell}=\bar{0}_{J_\ell}}$. Also, write

$$C_\ell = \gcd(C_\ell) \cdot \sum_{j=1}^{k} N_{j,\ell} \cdot A_{j,\ell},$$

where the $N_{j,\ell}$'s are products of s-sparse polynomials that depend on at most $|\mathrm{var}(\mathrm{sim}(C))| / 2r$ variables (notice that we used C and not $C_\ell$) and $C_\ell \equiv P_\ell$. Let

$$\Phi_C(C_\ell) \triangleq \sum_{j=1}^{k} \log\left(\mathrm{mon}(A_{j,\ell})\right).$$

Define $C_\ell$ as the circuit achieving the minimal $\Phi_C$ among all size s ΣΠΣΠ(k) circuits computing $P_\ell$.
We now describe the process of adding a single variable to I or J. The idea is to take a variable appearing in some $A_{j,\ell}$ and add it to the set that results in a maximal reduction in the number of monomials of $A_{j,\ell}$. One of the choices must reduce the number of monomials by a factor of 2 and thus reduce the $\Phi_C$ function by at least 1. This is since adding the variable to I means keeping only the monomials in which it appears, and adding it to J means keeping only the monomials in which it does not appear. The problem is to ensure that the resulting circuit computes a non-zero polynomial. The following lemma guarantees the existence of a variable for which neither action results in the zero polynomial.
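The halving argument can be illustrated on the monomial set of a single multilinear polynomial. The sketch below is a simplification: it tracks only the monomials of one factor and assumes the chosen variable appears in some but not all of them, so both options are non-empty. It does not check that the whole circuit stays non-zero, which is what the lemma is needed for. The names `split_by_variable` and `best_step` are illustrative.

```python
import math

def split_by_variable(monomials, i):
    """Split multilinear monomials (frozensets of indices) by the variable x_i."""
    with_i = {m - {i} for m in monomials if i in m}      # kept by the derivative
    without_i = {m for m in monomials if i not in m}     # kept by setting x_i = 0
    return with_i, without_i

def best_step(monomials, i):
    """Add i to I (derivative) or to J (set to zero), whichever keeps fewer monomials."""
    with_i, without_i = split_by_variable(monomials, i)
    if len(with_i) <= len(without_i):
        return "I", with_i
    return "J", without_i

# A = x1*x2 + x1*x3 + x2*x3 + x3 + x1*x4 (coefficients are irrelevant here)
A = {frozenset(m) for m in ({1, 2}, {1, 3}, {2, 3}, {3}, {1, 4})}

action, remaining = best_step(A, 1)
potential_drop = math.log2(len(A)) - math.log2(len(remaining))
```

Since the two parts together contain all the monomials, the smaller part has at most half of them, so the logarithm of the monomial count drops by at least 1 at every step.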
Lemma 4.9. Let ℓ ≥ 0. Assume that $P_\ell \not\equiv 0$ and that $C_\ell$ does not meet the conditions of Lemma 4.8. Then there exist some i ∈ [n] and j ∈ [k] such that $A_{j,\ell}$ and $P_\ell$ depend on $x_i$ and $x_i$ is not a factor of $P_\ell$.

Proof. Assume that the claim is false. We have one of the following cases:
case 1: For some i, j, $A_{j,\ell}$ depends on $x_i$ and $P_\ell$ does not. We can replace $A_{j,\ell}$ with $A_{j,\ell}|_{x_i=0}$ and obtain a circuit $C'_\ell$ computing the same polynomial $P_\ell$ with $\Phi_C(C'_\ell) < \Phi_C(C_\ell)$. This is a contradiction to the minimality of $C_\ell$.
case 2: All the $A_{j,\ell}$'s are constant. That is: $C_\ell = \gcd(C_\ell) \cdot \sum_{j=1}^{k} N_{j,\ell}$. In this case, either $|\mathrm{var}(\mathrm{sim}(C_\ell))| < |\mathrm{var}(\mathrm{sim}(C))| / 2$ or $\mathrm{sim}(C_\ell)$ is an r-compressed ΣΠΣΠ(k) circuit of size s. Either way this is a contradiction.
case 3: There exists a variable $x_i$ in $A_{j,\ell}$ that divides $P_\ell$. This is a contradiction to the minimality of $C_\ell$ w.r.t. $\Phi_C$.
We return to the proof of Lemma 4.8. Due to the lemma, there exists some $x_i$ that appears in some $A_{j,\ell}$ where both $\frac{\partial P_\ell}{\partial x_i}$ and $P_\ell|_{x_i=0}$ are non-zero polynomials. Clearly, one of these choices for $P_{\ell+1}$ results in a non-zero polynomial for which $\Phi_C(C_{\ell+1}) \leq \Phi_C(C_\ell) - 1$. Since $\Phi_C$ is always non-negative, after at most 2kr log(s) steps (the initial value of Φ), we get the required circuit.

By repeating Lemma 4.8 at most log n times, we get Theorem 4.7 (indeed, in each step, if we do not have the conclusion of Theorem 4.7 then $|\mathrm{var}(\mathrm{sim}(C))|$ is reduced by a factor of 2).
Rounding the Components Together
In the previous sections we found that there exist a small set of variables W and an additional, disjoint small set of variables V with the following property: When picking the variables outside of W and V from a generator for ΣΠΣΠ(k − 1) circuits and for $(2s^2)$-sparse polynomials, we get a non-zero polynomial. The following is the formal claim:
Lemma 4.10. Let $k \ge 2$ and let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a multilinear ΣΠΣΠ(k) circuit of size $s$. In addition, let $G_{k-1}$ be a generator for ΣΠΣΠ(k−1) circuits of size $s$ and $(2s^2)$-sparse polynomials. Then there exists a subset $U \subseteq [n]$, depending only on $P$ (i.e. $U$ does not depend on the generator), of size |U| ≤ 3k
Proof. Let $T_0$ and $W$ be the sets guaranteed by Theorem 4.7. Namely, when writing $P = \sum_{T \subseteq W} m(T) F_T$, we have that
As Q is a product of s-sparse polynomials we get, by Observations 2.3 and 2.4, that
It follows that under the restriction $x_{[n]\setminus U} = G_{k-1}^{[n]\setminus U}$, $F_{T_0}$ is a non-zero polynomial. As we did not substitute anything for the variables in $W$, the claim clearly follows.
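The decomposition of P as a sum over subsets T ⊆ W used in the proof above can be made concrete for a multilinear polynomial. The sketch below assumes that m(T) denotes the monomial formed by the variables of T and that each F_T involves only variables outside W; the dictionary representation and the names `decompose_by_subset` and `recompose` are illustrative and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, FrozenSet

# Illustrative representation: monomial (set of variable names) -> coefficient.
Poly = Dict[FrozenSet[str], int]


def decompose_by_subset(p: Poly, W: FrozenSet[str]) -> Dict[FrozenSet[str], Poly]:
    """Group the monomials of p by T = (monomial ∩ W).

    The result maps each T ⊆ W that occurs to F_T, a polynomial in the
    variables outside W, so that p = sum over T of m(T) * F_T.
    """
    parts: Dict[FrozenSet[str], Poly] = defaultdict(dict)
    for mono, coeff in p.items():
        T = mono & W              # part of the monomial inside W
        parts[T][mono - W] = coeff  # remaining part lives outside W
    return dict(parts)


def recompose(parts: Dict[FrozenSet[str], Poly]) -> Poly:
    """Multiply each F_T back by m(T) and sum, recovering p."""
    p: Poly = {}
    for T, F_T in parts.items():
        for mono, coeff in F_T.items():
            p[T | mono] = coeff
    return p


# p = x1*x3 + x1*x4 + x2*x3 + x3*x4 + 5, with W = {x1, x2}
p: Poly = {
    frozenset({"x1", "x3"}): 1,
    frozenset({"x1", "x4"}): 1,
    frozenset({"x2", "x3"}): 1,
    frozenset({"x3", "x4"}): 1,
    frozenset(): 5,
}
W = frozenset({"x1", "x2"})
parts = decompose_by_subset(p, W)
```

In this example the subset T = {x1} receives F_T = x3 + x4, the subset T = {x2} receives F_T = x3, and the empty subset receives F_T = x3*x4 + 5.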
We now establish the generator for ΣΠΣΠ(k) circuits. This is our main theorem and it guarantees that we get the required black-box algorithm. In particular Theorem 1 is an immediate corollary.
Theorem 4.11 (Main). Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a non-zero polynomial computed by a multilinear ΣΠΣΠ(k) circuit of size $s$. Then for every $\ell \ge 3k^3 R(k)\log(s)\log(n)$ it holds that $P(G_\ell(\bar{y},\bar{z}) + S_{2s^2}(\bar{w})) \not\equiv 0$, where $\bar{y}$, $\bar{z}$ and $\bar{w}$ are new sets of variables.
Proof. We prove the claim by induction on $k$. For $k = 1$ we note that $P$ is a product of $(2s^2)$-sparse polynomials. By the definition of $S_{2s^2}$ (recall Lemma 2.7), and Observations 2.3 and 4.2, we get that $P(G_\ell + S_{2s^2}) \not\equiv 0$ and the claim follows. Assume that $k \ge 2$. Let $U \subseteq [n]$ be the subset guaranteed by Lemma 4.10. By the induction hypothesis, we get that for $v = \lceil 3(k-1)^3 R(k-1)\log(s)\log(n) \rceil$ the mapping $G_v(\bar{y},\bar{z}) + S_{2s^2}(\bar{w})$ is a generator for both ΣΠΣΠ(k−1) circuits and for $(2s^2)$-sparse polynomials. From Lemma 4.10 and Observation 2.4 it follows that Im(G
contains a point $\bar{a}$ for which
and thus $\bar{a} \in \mathrm{Im}(G_\ell(\bar{y},\bar{z}) + S_{2s^2}(\bar{w}))$ and the claim holds.
An explicit hitting set
The hitting set for ΣΠΣΠ(k) circuits is an immediate corollary of Theorem 4.11. Basically, as $G_\ell(\bar{y},\bar{z}) + S_{2s^2}(\bar{w})$ are (relatively) low-degree polynomials defined on $m = O(k^3 R(k)\log(s)\log(n))$ variables, we can simply evaluate $P \circ (G_\ell(\bar{y},\bar{z}) + S_{2s^2}(\bar{w}))$ on all inputs from $E^m$, where $E \subseteq \mathbb{F}$ is a set of size poly(n). Algorithm 1 follows exactly this intuition and produces the hitting set.
Input: $n, k, s \in \mathbb{N}$.
Output: A set $H$.
  Let $W \subseteq \mathbb{F}$ be of size $|W| = n^2$;
  Let $\ell \triangleq \lceil 3k^3 R(k)\log(s)\log(n) \rceil$, where $R(k)$ is defined in Theorem 2.8;
  Let $q \triangleq q(n, 2s^2)$ as defined in Lemma 2.7;
  Initialize $H = \emptyset$;
  foreach $\bar{a}, \bar{b} \in W^{\ell}$ and $\bar{c} \in W^{q}$ do
    Evaluate $G_\ell(\bar{a},\bar{b}) + S_{2s^2}(\bar{c})$ and add it to $H$.
  end
Algorithm 1: Construction of a hitting set for ΣΠΣΠ(k) circuits

Theorem 4.12. Let $n, s, k > 0$. Algorithm 1, given $n, s, k$ as input, runs in $n^{O(k^3 R(k)\log^2 s)} = n^{O(k^6 \log(k)\log^2 s)}$ time. The set $H$ it produces is of size $n^{O(k^3 R(k)\log^2 s)} = n^{O(k^6 \log(k)\log^2 s)}$ and is a hitting set for $n$-variate polynomials that can be computed by a ΣΠΣΠ(k) circuit of size $s$.
Proof. Let $P \in \mathbb{F}[x_1, x_2, \ldots, x_n]$ be a polynomial computed by a multilinear ΣΠΣΠ(k) circuit of size $s$. Let $H$ be the set given by Algorithm 1. We claim that $P \equiv 0$ if and only if $P|_H \equiv 0$. If $P \equiv 0$ then the claim is trivial. If $P \not\equiv 0$, by Theorem 4.11 we get that $P(G_\ell + S_{2s^2}) \not\equiv 0$. According to their definition, the degrees of all the output variables of $G_\ell$ and $S_{2s^2}$ are at most $n - 1$. Therefore, the degrees of the variables in $P(G_\ell + S_{2s^2})$ are bounded by $(n-1)n < n^2$. Since $P(G_\ell + S_{2s^2}) \not\equiv 0$, Lemma 2.9 implies that $P|_H \not\equiv 0$.
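The last step of this argument relies on a grid of evaluation points being large enough relative to the individual degrees. The sketch below assumes that Lemma 2.9 is the standard fact that a non-zero polynomial whose individual degrees are all smaller than |W| cannot vanish on the whole grid W^m; the representation (exponent tuples mapped to coefficients, over the integers) and the names `evaluate` and `find_nonzero_point` are illustrative only.

```python
from itertools import product
from typing import Dict, Optional, Sequence, Tuple

# Illustrative representation: exponent tuple -> coefficient, over the integers.
Poly = Dict[Tuple[int, ...], int]


def evaluate(poly: Poly, point: Sequence[int]) -> int:
    """Evaluate the polynomial at a point, term by term."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for a, e in zip(point, exps):
            term *= a ** e
        total += term
    return total


def find_nonzero_point(poly: Poly, W: Sequence[int], num_vars: int) -> Optional[Tuple[int, ...]]:
    """Brute-force search of the grid W^num_vars for a non-root of poly."""
    for point in product(W, repeat=num_vars):
        if evaluate(poly, point) != 0:
            return point
    return None


# f(x, y) = x^2*y - x*y^2 is non-zero and has individual degree 2.
f: Poly = {(2, 1): 1, (1, 2): -1}

too_small = find_nonzero_point(f, [0, 1], 2)      # |W| = 2, not larger than the degree
large_enough = find_nonzero_point(f, [0, 1, 2], 2)  # |W| = 3 > 2
```

With two grid values per variable the polynomial f vanishes everywhere on the grid, while with three values a non-root must exist. This mirrors the choice |W| = n^2 against individual degrees below n^2 in the proof.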
We now bound the size of $H$ and the time required to construct it. From their definition, $G_\ell$ depends on $2\ell$ variables and $S_{2s^2}$ depends on $O(\log_n s)$ variables. Hence, $|H| \le n^{4\ell + 2q(n,2s^2)} = n^{O(k^3 R(k)\log s \log n \cdot \log_n s)} = n^{O(k^3 R(k)\log^2 s)}$.
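Two elementary identities used in this bound can be spelled out as follows. Here $\ell$ denotes the parameter $\lceil 3k^3 R(k)\log(s)\log(n) \rceil$ set in Algorithm 1 and $q(n, 2s^2)$ is the quantity from Lemma 2.7. The first line only uses $|W| = n^2$ and the number of loop variables in Algorithm 1, and the second is the change-of-base rule for logarithms.

```latex
\begin{align*}
|H| &\le |W|^{\,2\ell + q(n,2s^2)} = \left(n^2\right)^{2\ell + q(n,2s^2)} = n^{\,4\ell + 2q(n,2s^2)}, \\
\log n \cdot \log_n s &= \log n \cdot \frac{\log s}{\log n} = \log s .
\end{align*}
```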
The time required to construct $S_{2s^2}$ and $G_\ell$ is polynomial in $n, s$. The time to evaluate $(G_\ell + S_{2s^2})$ on a point from $W^{2\ell + q(n,2s^2)}$ is polynomial in $n$. Hence, the time to construct $H$ is $|H| \cdot (ns)^{O(1)} = n^{O(k^3 R(k)\log^2 s)}$.
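The enumeration performed by Algorithm 1 can be sketched as follows. This is a hedged illustration of the loop structure only: the actual maps G and S are defined earlier in the paper and are not reproduced here, so the generator is passed in as a callable, and `toy_generator` is a hypothetical placeholder with no relation to the paper's construction. The names `build_hitting_set`, `ell` and `q` are likewise assumptions.

```python
from itertools import product
from typing import Callable, Sequence, Set, Tuple

Point = Tuple[int, ...]


def build_hitting_set(
    W: Sequence[int],
    ell: int,
    q: int,
    generator: Callable[[Point, Point, Point], Point],
) -> Set[Point]:
    """Evaluate the generator on every (a, b, c) in W^ell x W^ell x W^q.

    The returned set has at most |W|^(2*ell + q) points, since each
    input triple contributes one point (duplicates are merged).
    """
    H: Set[Point] = set()
    for a in product(W, repeat=ell):
        for b in product(W, repeat=ell):
            for c in product(W, repeat=q):
                H.add(tuple(generator(a, b, c)))
    return H


def toy_generator(a: Point, b: Point, c: Point) -> Point:
    """Hypothetical stand-in mapping the seed to a point with n = 3 coordinates.

    It is NOT the paper's generator; it only gives the loop something to call.
    """
    return (a[0] + c[0], b[0] + c[0], a[0] * b[0])


W = [0, 1, 2]
ell, q = 1, 1
H = build_hitting_set(W, ell, q, toy_generator)
```

Passing the generator as a parameter keeps the enumeration separate from the algebraic construction, which matches how the size bound in the proof counts seed points rather than distinct outputs.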
CONCLUSION
Derandomizing the Polynomial Identity Testing problem for depth-4 arithmetic circuits is an outstanding open problem in complexity theory [AV08]. Any efficient deterministic algorithm for depth-4 circuits would imply strong lower bounds [KI03, Agr05]. So far, progress on depth-4 identity testing has been very limited [AM07, Sax08, SV09]. In this paper, we improve the situation by giving a quasi-polynomial time black-box identity testing algorithm for depth-4 multilinear circuits with bounded top fan-in. Our algorithm is based on new structural theorems about such circuits.
In identity testing and explicit lower bound proofs, multilinear circuits have already received significant attention from the community [DS06, KS08, SV08, SV09, Raz04a, Raz04b, RSY08, RY08]. In [Raz04a], Raz asked whether one could design efficient identity testing algorithms for multilinear formulas. The best algorithms today are for sums of read-once formulas [SV09] and for set-multilinear depth-3 formulas (non-black-box) [RS05]. For depth-4 multilinear circuits with bounded top fan-in, our result gives the first efficient identity testing algorithm.
It would be very interesting to generalize our result to non-multilinear circuits with bounded top fan-in. Another open problem is to give a deterministic polynomial-time identity testing algorithm for such circuits in the non-black-box model.
ACKNOWLEDGMENTS

