Abstract. We study the notion of "cancellation-free" circuits. This is a restriction of linear Boolean circuits (XOR circuits), but can be considered as being equivalent to previously studied models of computation. The notion was coined by Boyar and Peralta in a study of heuristics for a particular circuit minimization problem. They asked how large a gap there can be between the smallest cancellation-free circuit and the smallest linear circuit. We show that the difference can be a factor $\Omega(n/\log^2 n)$. This improves on a recent result by Sergeev and Gashkov, who have studied a similar problem. Furthermore, our proof holds for circuits of constant depth. We also study the complexity of computing the Sierpinski matrix using cancellation-free circuits and give a tight $\Omega(n \log n)$ lower bound.
Introduction
Let $\mathbb{F}_2$ be the field of order 2, and let $\mathbb{F}_2^n$ be the $n$-dimensional vector space over $\mathbb{F}_2$. A Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is said to be linear if there exists a Boolean $m \times n$ matrix $A$ such that $f(x) = Ax$ for every $x \in \mathbb{F}_2^n$. This is equivalent to saying that $f$ can be computed using only XOR gates.
A linear circuit (or XOR circuit) $C$ is a directed acyclic graph. There are $n$ nodes with in-degree 0, called the inputs. All other nodes have in-degree 2 and are called gates. There are $m$ nodes which are called the outputs; these are labeled $y_1, \ldots, y_m$. The value of a gate is the sum of its two children (addition in $\mathbb{F}_2$, denoted $\oplus$). The circuit $C$, with inputs $x = (x_1, \ldots, x_n)$, computes the $m \times n$ matrix $A$ if the output vector computed by $C$, $y = (y_1, \ldots, y_m)$, satisfies $y = Ax$. In other words, output $y_i$ is defined by the $i$th row of the matrix. The size of a circuit $C$ is the number of gates in $C$. For simplicity, we let $m = n$ unless explicitly stated otherwise. For a matrix $A$, let $|A|$ be the number of nonzero entries in $A$.
Our contributions: In this paper we deal with a restriction of linear circuits called cancellation-free circuits, coined in [3], where the authors noticed that many heuristics for finding small linear circuits would always produce cancellation-free linear circuits. They asked how large a separation there can be between these two models. Recently, a separation of $\Omega\left(\frac{n}{\log^6 n \log\log n}\right)$ due to Gashkov and Sergeev was given in [9]. We improve on this result by giving a slightly stronger separation, namely $\Omega\left(\frac{n}{\log^2 n}\right)$. Furthermore, our proof gives a similar separation for the case of linear circuits with constant depth. We conclude that many heuristics for finding linear circuits do not approximate better than a factor of $\Theta\left(\frac{n}{\log^2 n}\right)$ of the optimal. We also study the complexity of computing the Sierpinski matrix using cancellation-free circuits. We show that the complexity is exactly $\frac{1}{2} n \log n$. Furthermore, our proof holds for OR circuits.
Cancellation-Free Linear Circuits
For linear circuits, the value computed by every gate is the parity of some subset of the $n$ variables. That is, the output of every gate $u$ can be considered as a vector $\kappa(u)$ in the vector space $\mathbb{F}_2^n$, where $\kappa(u)_i = 1$ if and only if $x_i$ is a term in the parity function computed by the gate $u$. We call $\kappa(u)$ the value vector of $u$, and for input variables define $\kappa(x_i) = e^{(i)}$, the unit vector having the $i$th coordinate 1 and all others 0. It is clear by definition that if a gate $u$ has the two children $w, t$, then $\kappa(u) = \kappa(w) \oplus \kappa(t)$, where $\oplus$ denotes coordinate-wise addition in $\mathbb{F}_2$. We say that a linear circuit is cancellation-free if for every pair of gates $u, w$ where $u$ is an ancestor of $w$, we have $\kappa(u) \ge \kappa(w)$, where $\ge$ denotes the usual coordinate-wise partial order. If this is satisfied, the circuit never exploits the fact that in $\mathbb{F}_2$, $a \oplus a = 0$, so things do not "cancel out" in the circuit.
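The definition of value vectors can be illustrated with a minimal sketch. The representation below is an assumption made only for this example: nodes $0, \ldots, n-1$ are the inputs, gate number `g` is node `n + g`, and each value vector is stored as a bitmask. The function names are hypothetical. Inputs are treated like gates in the check, and since $\ge$ is transitive it suffices to compare each gate with its two children.

```python
def value_vectors(n, gates):
    """Return kappa for every node as a bitmask (bit i set iff x_{i+1} is a term).

    Hypothetical encoding: nodes 0..n-1 are inputs, gate g is node n+g,
    and gates[g] = (a, b) names the two earlier nodes that gate g XORs.
    """
    kappa = [1 << i for i in range(n)]          # kappa(x_i) is the unit vector
    for a, b in gates:
        kappa.append(kappa[a] ^ kappa[b])       # coordinate-wise addition in F_2
    return kappa


def is_cancellation_free(n, gates):
    """Check kappa(u) >= kappa(child) for every gate u and both of its children."""
    kappa = value_vectors(n, gates)
    for g, (a, b) in enumerate(gates):
        u = kappa[n + g]
        # A child violates the order iff it has a 1 where u has a 0.
        if (kappa[a] & ~u) or (kappa[b] & ~u):
            return False
    return True
```

For instance, the circuit computing $x_1 \oplus x_2$ and then adding $x_1$ again yields $x_2$, and the check rejects it because the term $x_1$ cancels.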
Although it is not hard to see that the model is equivalent to addition chains [21, 23] and "ensemble computations" [8], in this paper we will stick to the term "cancellation-free", since we will think of it as a special case of linear circuits. A different but closely related kind of circuit is the OR circuit. The definition is exactly the same as for linear circuits, but with $\vee$ instead of $\oplus$; see [19, 13, 8]. In particular, every cancellation-free circuit gives an OR circuit for the same matrix, so lower bounds in the OR model carry over to lower bounds on cancellation-free circuits. However, the converse does not hold in general [7].
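For a matrix $A$, we let $C_\oplus(A)$, $C_{CF}(A)$ and $C_\vee(A)$ denote the sizes of the smallest linear circuit, the smallest cancellation-free circuit and the smallest OR circuit computing the matrix $A$, respectively. By the discussion above, the following is immediate:
Remark 1. For every matrix $A$, $C_\oplus(A) \le C_{CF}(A)$ and $C_\vee(A) \le C_{CF}(A)$.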
Every matrix admits a cancellation-free circuit of size at most $n(n-1)$. This can be obtained simply by computing each row independently. It was shown by Nechiporuk [19] and Pippenger [21] (see also [13]) that this upper bound can be improved to $(1 + o(1)) \frac{n^2}{2 \log n}$. The proof due to Nechiporuk is for the OR model, but the proof also holds for cancellation-free circuits.
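The trivial upper bound obtained by computing each row independently can be sketched as follows. The node numbering (inputs are nodes $0, \ldots, n-1$, gate `g` is node `n + g`) and the function name are assumptions for illustration only. Each gate adds one fresh input to a partial sum, so the supports of its two children are disjoint and the circuit is cancellation-free.

```python
def rowwise_circuit(A):
    """Compute every row of the 0/1 matrix A by its own chain of XOR gates.

    Returns (gates, outputs); outputs[i] is the node computing row i, or
    None for an all-zero row. Hypothetical encoding: gate g is node n+g.
    """
    n = len(A[0])
    gates, outputs = [], []
    for row in A:
        support = [j for j, bit in enumerate(row) if bit]
        if not support:
            outputs.append(None)                 # nothing to compute
            continue
        node = support[0]                        # start from the first input in the row
        for j in support[1:]:
            gates.append((node, j))              # add one fresh input per gate
            node = n + len(gates) - 1
        outputs.append(node)
    return gates, outputs
```

A row with $w \ge 1$ ones costs $w - 1$ gates, so the all-ones $n \times n$ matrix attains the bound $n(n-1)$ under this construction.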
A Shannon-style counting argument gives that this is tight up to low order terms. A proof of this can be found in [21]. Combining these results, we get that for most matrices, cancellation does not help much.
Theorem 1. For every $\varepsilon > 0$, almost every $n \times n$ matrix has
For m × n matrices, the following upper bound also holds.
Theorem 2 (Lupanov [16]). Any $m \times n$ matrix admits a cancellation-free linear circuit of size $O\left(\min\left\{\frac{mn}{\log n}, \frac{mn}{\log m}\right\} + n + m\right)$.
The theorem follows directly from Lupanov's result and the application of the "transposition principle" (see e.g. [13] ).
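A matrix $A$ is $k$-free if it does not have an all-one submatrix of size $(k+1) \times (k+1)$. The following lemma due to Mehlhorn [17] and Pippenger [22] will be used later.
Lemma 1 (Mehlhorn, Pippenger). For a $k$-free matrix $A$, $C_\vee(A) \in \Omega\left(\frac{|A|}{k^2}\right)$.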
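The notion of a $k$-free matrix can be checked directly on small examples. The following brute-force sketch (the function name is hypothetical, and the running time is exponential in $k$) tries every set of $k+1$ rows and counts the columns in which all of them have a 1.

```python
from itertools import combinations


def is_k_free(A, k):
    """True iff the 0/1 matrix A has no all-one (k+1) x (k+1) submatrix."""
    # Encode each row as a bitmask so that common columns are a bitwise AND.
    rows = [sum(bit << j for j, bit in enumerate(row)) for row in A]
    for chosen in combinations(rows, k + 1):
        common = chosen[0]
        for r in chosen[1:]:
            common &= r
        if bin(common).count("1") >= k + 1:      # k+1 shared columns found
            return False
    return True
```

For example, the identity matrix is 1-free, while the $2 \times 2$ all-ones matrix is 2-free but not 1-free.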
Relationship Between Cancellation-Free Linear Circuits and General Linear Circuits
In [3], Boyar and Peralta exhibited an infinite family of matrices where the sizes of cancellation-free circuits computing them are at least $\frac{3}{2} - o(1)$ times larger than the optimum. We call this ratio the cancellation ratio, $\rho(n)$, defined as
$$\rho(n) = \max_{A \in \mathbb{F}_2^{n \times n}} \frac{C_{CF}(A)}{C_\oplus(A)}.$$
The following proposition on the Boolean Sylvester-Hadamard matrix was pointed out by Edward Hirsch and Olga Melanich [10] . The n × n Boolean Sylvester-Hadamard matrix H n , is defined recursively:
It is known that C ⊕ (H n ) ∈ O(n), but that in depth 2 it requires circuits of size Ω(n log n) [2] .
Proposition 1. The n × n Boolean Sylvester-Hadamard matrix requires cancellation-free circuits of size C CF (H n ) ∈ Ω(n log n).
It is based on the following theorem due to Morgenstern, ( [18] , see also [4, Thm. 13.14]).
Theorem 3 (Morgenstern). For a Boolean matrix $M$, $C_{CF}(M) \ge \log |\det(M)|$.
The statement holds more generally, namely for circuits with addition over the complex numbers and scalar multiplication of any constant c ∈ C with |c| ≤ 2. Cancellation-free can be seen as a special case of this.
The determinant of $H_n$ is $n^{\Omega(n)}$, so the lower bound follows. This demonstrates that $\rho(n) \in \Omega(\log n)$. It should be noted that no $n \times n$ Boolean matrix can have determinant larger than $n!$ (since the permanent is at most $n!$), so this technique cannot give a lower bound on $\rho(n)$ stronger than $\Omega(\log n)$.
In [9], Gashkov and Sergeev studied the ratio
$$\lambda(n) = \max_{A \in \mathbb{F}_2^{n \times n}} \frac{C_\vee(A)}{C_\oplus(A)}.$$
They showed that $\lambda(n) \in \Omega\left(\frac{n}{\log^6 n \log\log n}\right)$. We improve this by showing
Theorem 4. $\lambda(n) \in \Omega\left(\frac{n}{\log^2 n}\right)$.
Alternative proofs of this are given in more recent work [7, 11]. We include the proof below because both the construction and the technique we use to analyze the matrix are different. More concretely, we use communication complexity for the analysis in a way that might be of independent interest. Also, our construction gives a similar separation for circuits of constant depth (see Section 5).
The proof uses the probabilistic method. We construct randomly two matrices, and let A be their product. In order to use Lemma 1 on A, we need a technical lemma stated below, which will be useful in showing that with high probability, our matrix will be 3 log n-free.
In the following, for a matrix $M$, we let $M_i$ ($M^i$) denote the $i$th row (column). For $I \subseteq [n]$, we let $M_I$ ($M^I$) denote the submatrix consisting of the rows (columns) with indices in $I$.
Lemma 2 might seem somewhat technical. However, there is a very simple intuition behind it: Suppose M is obtained at random as in the statement of the lemma. Informally we want to say that the entries do not "depend" too much on each other. More formally we show that given all but one entry in M it is not possible to guess the last entry with significant advantage over random guessing. The proof idea is to transform any good guess into a deterministic communication protocol for computation of the inner product, and to use a well known limitation on how well this can be done [6, 15] .
We will say that two (partially) defined matrices are consistent if they agree on all their defined entries.
Lemma 2. Let $M$ be an $m \times m$ partially defined matrix, where all entries except $M_p^q$ are defined. Let $B$, $C$ be matrices over $\mathbb{F}_2$ with dimensions $m \times 8m$ and $8m \times m$, respectively, chosen uniformly at random among all possible pairs $(B, C)$ such that $BC$ is consistent with $M$. Then for sufficiently large $m$, the conditional probability that $M_p^q$ is 1, given all other entries, is contained in the interval $\left(\frac{1}{2} - \frac{1}{m}, \frac{1}{2} + \frac{1}{m}\right)$.
Before proving the lemma, we first recall a fact from communication complexity, due to Chor and Goldreich [6], see also [15].
Theorem 5 (Chor, Goldreich). Let $x$ and $y$ be independent and uniformly random vectors, each of $n$ bits. Suppose a deterministic communication protocol is used to compute the inner product of $x$ and $y$, and the protocol is correct with probability at least $\frac{1}{2} + p$. Then on some inputs, the protocol uses at least $\frac{n}{2} - \log\frac{1}{p}$ bits of communication.
Proof (of Lemma 2). Suppose for the sake of contradiction that there exists a partially defined matrix $M$ such that, when all entries but one are revealed, the conditional probability of the last entry being $a \in \{0, 1\}$ is at least $\frac{1}{2} + \frac{1}{m}$. Assuming this, we will first present a randomized communication protocol computing the inner product of two independent and uniformly random $8m$-bit vectors $x$ and $y$ that always uses $m$ bits of communication and is correct with probability at least $\frac{1}{2} + \frac{2^{-2m}}{4m}$. We will then argue that this protocol can be completely derandomized. This results in a deterministic communication protocol that violates Theorem 5. From this we conclude that such a partially defined matrix, with this large probability of the last entry being $a$, does not exist.
Let Alice and Bob have as input vectors $x$ and $y$, respectively, each of length $8m$. Before getting their inputs, they use their shared random bits to agree on a random choice of the two matrices $B$ and $C$, distributed as stated in the lemma. To compute the inner product of $x$ and $y$, Alice replaces the row $B_p$ with $x$ and Bob replaces the column $C^q$ with $y$; let the resulting matrices be $B'$ and $C'$, and let $M' = B'C'$. In order for $M'$ and $M$ to be consistent, it is only necessary that the $m - 1$ defined entries in row $p$ and the $m - 1$ defined entries in column $q$ be equal in the two matrices, since $B'$ and $C'$ were defined such that all other entries are equal. This occurs with probability at least $2^{-2m-2}$. In this case, the value Alice and Bob want to compute is exactly the only unknown entry $M'^q_p$. By assumption, this last entry is $a$ with probability at least $\frac{1}{2} + \frac{1}{m}$, so Bob outputs $a$. If the known entries in $M'$ are not consistent with the known entries in $M$, Bob outputs 0. This will be the correct value for the inner product on at least half of all inputs. Thus, the probability of this protocol being correct is at least
$$\frac{1}{2} + \frac{2^{-2m-2}}{m} = \frac{1}{2} + \frac{2^{-2m}}{4m}.$$
So when the inputs are uniformly distributed, the randomized protocol computes the inner product of two $8m$-bit vectors with $m$ bits of communication, and it is correct with probability at least $\frac{1}{2} + \frac{2^{-2m}}{4m}$. The only use of randomness was in choosing the matrix pair $(B, C)$. By an averaging argument it follows that instead of choosing $(B, C)$ randomly there is a fixed choice of $(B, C)$ that gives at least the same success probability. Hence we arrive at a deterministic communication protocol violating Theorem 5.
⊓ ⊔
We now use this to prove Theorem 4.
Proof (of Theorem 4). We will probabilistically construct two matrices $B$, $C$ of dimensions $n \times 24 \log n$ and $24 \log n \times n$, respectively. Each entry in $B$ and $C$ will be chosen independently and uniformly at random from $\mathbb{F}_2$. We let $A = BC$. First notice that it follows directly from Theorem 2 that $B$ and $C$ can be computed with linear circuits, both of size $O(n)$. Now we can let the outputs of the circuit computing $C$ be the inputs of the circuit computing $B$. The resulting circuit computes the matrix $A$ and has size $O(n)$. We will argue that with nonzero probability this matrix will not have a $3 \log n \times 3 \log n$ all-one submatrix, while $|A| \in \Omega(n^2)$. By Lemma 1 the result follows. We show that with nonzero probability neither of the following two events will happen:
2. $BC$ has an all-one submatrix of dimension $3 \log n \times 3 \log n$.
1.) The entry in row $i$ and column $j$ of $BC$ is computed as the inner product of row $i$ in $B$ with column $j$ in $C$. The probability that column $j$ in $C$ is all zeros is $2^{-24 \log n}$. If it is not zero, the probability of the inner product having the result one is exactly $\frac{1}{2}$ (for any 1 in $C^j$, the probability of it being added into the sum in the inner product is exactly $\frac{1}{2}$). Applying Markov's inequality to the non-negative random variable $n^2 - |BC|$ then bounds the probability of the first event.
2.) Fix a submatrix $M$ of $BC$ with dimensions $3 \log n \times 3 \log n$, that is, some subset $I$ of the rows of $B$ and a subset $J$ of the columns of $C$ such that $M = B_I C^J$. We now want to show that the probability of this matrix having only 1's is so small that a union bound over all choices of $3 \log n \times 3 \log n$ submatrices gives that the probability that there exists such a submatrix is less than $\frac{1}{4}$. Notice that this would be easy if all the entries in $M$ were mutually independent and uniformly distributed.
Though this is not the case, Lemma 2 for $m = 3 \log n$ states that this is almost the case. More precisely, the conditional probability that a given entry is 1 is at most $\frac{1}{2} + \frac{1}{3 \log n}$. We can now use the union bound to estimate the probability that $A$ has an all-one submatrix of dimension $3 \log n \times 3 \log n$:
$$\binom{n}{3 \log n}^2 \left(\frac{1}{2} + \frac{1}{3 \log n}\right)^{9 \log^2 n} \le \frac{n^{6 \log n}}{(3 \log n)!} \left(\frac{1 + \frac{2}{3 \log n}}{2}\right)^{9 \log^2 n} \le \frac{\left(1 + \frac{2}{3 \log n}\right)^{9 \log^2 n}}{(3 \log n)! \, 2^{3 \log^2 n}}.$$
This tends to 0, so for sufficiently large $n$ this probability is strictly smaller than $\frac{1}{4}$. By the union bound we conclude that with nonzero probability, neither of the two events occurs. Thus, with nonzero probability, $A$ is $3 \log n$-free with $|A| \in \Omega(n^2)$ and $C_\oplus(A) \in O(n)$. By Lemma 1 this proves the theorem.
⊓ ⊔
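The random construction in the proof of Theorem 4 is easy to experiment with. The sketch below is only an illustration under stated assumptions: $n$ is a power of two, logarithms are base 2, rows of $B$ and columns of $C$ are stored as integers of $24 \log n$ random bits, and the function names are hypothetical. The second function evaluates the base-2 logarithm of the final expression in the union bound, with `math.lgamma` supplying the factorial.

```python
import math
import random


def parity(x):
    """Parity of the number of 1 bits in the non-negative integer x."""
    return bin(x).count("1") & 1


def random_product(n, seed=0):
    """Sample B (n x k) and C (k x n) uniformly over F_2 and return (B, C, A = BC).

    Assumes n is a power of two; k = 24 * log2(n) is the inner dimension.
    """
    rng = random.Random(seed)
    k = 24 * int(math.log2(n))
    B_rows = [rng.getrandbits(k) for _ in range(n)]   # row i of B as a bitmask
    C_cols = [rng.getrandbits(k) for _ in range(n)]   # column j of C as a bitmask
    # Entry (i, j) of BC is the inner product of row i of B and column j of C.
    A = [[parity(b & c) for c in C_cols] for b in B_rows]
    return B_rows, C_cols, A


def log2_union_bound(n):
    """log2 of (1 + 2/(3L))^(9 L^2) / ((3L)! * 2^(3 L^2)) with L = log2(n)."""
    L = math.log2(n)
    log2_factorial = math.lgamma(3 * L + 1) / math.log(2)
    return 9 * L * L * math.log2(1 + 2 / (3 * L)) - log2_factorial - 3 * L * L
```

On such samples the fraction of ones in $A$ is typically close to $\frac{1}{2}$, in line with $|A| \in \Omega(n^2)$, and `log2_union_bound(1024)` is far below $\log_2 \frac{1}{4} = -2$.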
Remark: Originally, this result was slightly weaker, obtaining a separation of $\Omega\left(\frac{n}{\log^{2+\varepsilon} n}\right)$. Motivated by this result, the $\Omega\left(\frac{n}{\log^2 n}\right)$ separation was obtained in [7]. Referring to that result, Stasys Jukna and Igor Sergeev pointed out that Theorem 4 can be proved using a slightly different method [11]. Roughly speaking, it is shown that certain "t-Ramsey" graphs admit small circuits. A bipartite graph is said to be $t$-Ramsey if it contains neither a (bi)clique with $t$ vertices in each vertex set nor an independent set with $t$ vertices in each vertex set. Though the proofs and constructions are different, they are along the same general lines. The proof of Theorem 4 is mainly concerned with the asymptotic values. Therefore, we only showed that with high probability, the matrix is $3 \log n$-free. However, it is not hard to see that in fact it is also $2 \log n$-free with high probability. Applying the same argument to estimate the probability of the existence of a $2 \log n \times 2 \log n$ all-zero submatrix, we conclude that with high probability $A$ is $2 \log n$-Ramsey.
Corollary 1. With high probability, the bipartite graph with adjacency matrix $A$ from Theorem 4 is $t$-Ramsey for $t = 2 \log n$.
Notice that by Theorem 2, the obtained separation is at most a factor of $O(\log n)$ from being optimal. Also, except for lower bounds based on counting, all strong lower bounds we know of are essentially based on Lemma 1. Following that line of thought, one might hope to improve the separation above by coming up with a better choice of $A$ that does not have an $O(\log^{1-\varepsilon} n) \times O(\log^{1-\varepsilon} n)$ all-one submatrix, to get a stronger lower bound on $C_\vee(A)$, or perhaps hope that a tighter analysis than the above would give a stronger separation. However, this direction does not seem promising. To make this precise, we will use the following result on the "Zarankiewicz problem" [14], see also [12].
Theorem 6 (Kovári, Sós, Turán). Let $M$ be an $n \times n$ matrix without an $a \times a$ submatrix of all ones. Then the number of ones in $M$ is at most $(a - 1)^{1/a} n^{2 - 1/a} + (a - 1)n$.
This implies that for a matrix without a $\log^{1-\varepsilon}(n) \times \log^{1-\varepsilon}(n)$ all-one submatrix, the lower bound obtained using Lemma 1 would be $o\left(\frac{n^2}{\log^2 n}\right)$.
Smallest linear circuit problem
As mentioned earlier, the notion cancellation-free was introduced by Boyar and Peralta in [3] . The paper concerns shortest straight line programs for computing linear forms, which is equivalent to the model studied in this paper. In [8] , it is shown that the Ensemble Computation Problem (recall that this is equivalent to cancellation-free) is NP-complete. For general linear circuits, the problem remains NP-complete ( [3] ). It was observed in [3] that several researchers have used heuristics that will always produce cancellation-free circuits, see [5, 20, 24] . By definition, any heuristic which only produces cancellation-free circuits cannot achieve an approximation ratio better than ρ(n). By Remark 1, ρ(n) ≥ λ(n). By Theorem 4, we get that techniques which only produce cancellation-free circuits are not guaranteed to be very close to optimal.
Corollary 2. The algorithms in [5, 20, 24] do not guarantee approximation ratios better than $\Theta\left(\frac{n}{\log^2 n}\right)$.
Constant Depth
For unbounded depth, no (polynomial time computable) family of matrices is known to require linear circuits of superlinear size. However, if one puts a restriction on the depth, superlinear lower bounds are known [13]. In this case, we allow each gate to have unbounded fan-in, and instead of counting the number of gates we count the number of wires in the circuit. In particular, the circuit model where the depth is bounded to be at most 2 is well studied (see e.g. [13]). As before, we say that a circuit $C$ is linear if every gate computes the XOR of its inputs.
When considering matrices computed by linear circuits, the general situation in the two circuit models is very similar. The upper bound comes from Lupanov [16], and the lower bound is folklore. See also [13].
Theorem 7 (Lupanov). For every $n \times n$ matrix $A$, there exists a depth 2 cancellation-free circuit with at most $O\left(\frac{n^2}{\log n}\right)$ wires computing $A$. Furthermore, almost every such matrix requires $\Omega\left(\frac{n^2}{\log n}\right)$ wires.
Let λ d (n) denote λ(n) for circuits restricted to depth d (recall that now size is defined as the number of wires). Neither of the separations in [9] and [7] seem to carry over to bounded depth circuits in any obvious way. By inspecting the proof of Theorem 4, the upper bound on the size of the linear circuit worked as follows: First construct a circuit to compute C, and then construct a circuit for B with the outputs of C as inputs, that is, a circuit for B that comes topologically after C. To get to an upper bound of O(n), we used Theorem 2. By using Theorem 7 twice, we get a depth 4 circuit of that size.
For depths 2 and 3, we can construct a depth 1 circuit for each of B and C. This results in a depth 2 circuit for A with O(n log n) wires.
Computing the Sierpinski Matrix
In this section we prove that the $n \times n$ Sierpinski matrix, $S_n$, needs $\frac{1}{2} n \log n$ gates when computed by a cancellation-free circuit, and that this suffices. The proof strategy is surprisingly simple: it is essentially gate elimination where more than one gate is eliminated in each step. Neither Theorem 3 nor Lemma 1 gives anything nontrivial in this case.
As mentioned previously, there is no known (polynomial time computable) family of matrices requiring linear circuits of superlinear size. However, there are simple matrices that are conjectured to require circuits of size $\Omega(n \log n)$. One such matrix is the Sierpinski matrix (Aaronson, personal communication, and [1]). The $n \times n$ Sierpinski (also called set disjointness) matrix, $S_n$, is defined inductively:
$$S_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad S_{2n} = \begin{pmatrix} S_n & 0 \\ S_n & S_n \end{pmatrix}.$$
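The inductive definition can be turned into a short sketch. It assumes the recursion $S_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $S_{2n} = \begin{pmatrix} S_n & 0 \\ S_n & S_n \end{pmatrix}$, which is the block structure the lower bound proof relies on; the function name is hypothetical.

```python
def sierpinski(n):
    """Return the n x n Sierpinski matrix (n a power of two, n >= 2) as 0/1 lists."""
    if n == 2:
        return [[1, 0], [1, 1]]
    half = sierpinski(n // 2)
    zero = [0] * (n // 2)
    top = [row + zero for row in half]       # block row (S_n, 0)
    bottom = [row + row for row in half]     # block row (S_n, S_n)
    return top + bottom
```

With 0-based indices, entry $(i, j)$ of the result is 1 exactly when every bit set in $j$ is also set in $i$, so the matrix has $3^{\log n}$ ones.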
Independently of this conjecture, Jukna [11] has very recently asked if the "set intersection matrix" (also called the Kneser matrix in [11]), $K_n$, has $C_\oplus(K_n) \in \omega(n)$. The motivation for this is that $C_\vee(K_n) \in O(n)$, so if true this would give a counterpart to Theorem 4. The $n \times n$ set intersection matrix $K_n$ can be defined by associating each row and column with a subset of $[\log n]$, and letting an entry be 1 if and only if the corresponding row and column sets have non-empty intersection. One can also define $K_n$ inductively:
$$K_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad K_{2n} = \begin{pmatrix} K_n & K_n \\ K_n & J \end{pmatrix},$$
where $J$ is the $n \times n$ matrix with 1 in each entry. It is easy to see that the complement of $K_n$ contains exactly the same rows as $S_n$. Thus, $C_\oplus(K_n)$ is superlinear if and only if $C_\oplus(S_n)$ is, since either matrix can be computed from the other with at most $2n - 1$ extra XOR gates, using cancellation heavily. To see that the set intersection matrix can be computed with OR circuits of linear size, observe that over the Boolean semiring, $K_n$ decomposes into $K_n = B \cdot B^T$, where the $i$th row in $B$ is the binary representation of $i$. Now apply Theorem 2 to the $n \times \log n$ matrix $B$ and its transpose.
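Both claims about $K_n$ can be checked on small instances with the following sketch. Subsets of $[\log n]$ are encoded as the integers $0, \ldots, n-1$ (an assumption of this example), the Sierpinski matrix is taken in the closed form "entry $(i, j)$ is 1 iff the bits of $j$ are a subset of the bits of $i$", and all function names are hypothetical.

```python
def set_intersection_matrix(n):
    """K_n: entry (i, j) is 1 iff the subsets encoded by i and j intersect."""
    return [[1 if i & j else 0 for j in range(n)] for i in range(n)]


def boolean_product_BBt(n):
    """B * B^T over the Boolean semiring; row i of B is the binary representation of i."""
    bits = n.bit_length() - 1                 # log n for n a power of two
    B = [[(i >> t) & 1 for t in range(bits)] for i in range(n)]
    return [[int(any(B[i][t] and B[j][t] for t in range(bits))) for j in range(n)]
            for i in range(n)]


def complement_rows_match_sierpinski(n):
    """Compare the row sets of the complement of K_n and of S_n."""
    K = set_intersection_matrix(n)
    complement_rows = {tuple(1 - bit for bit in row) for row in K}
    # Assumed closed form of S_n: entry (i, j) is 1 iff (i & j) == j.
    sierpinski_rows = {tuple(1 if (i & j) == j else 0 for j in range(n))
                       for i in range(n)}
    return complement_rows == sierpinski_rows
```

Row $i$ of the complement of $K_n$ coincides with row $n - 1 - i$ of $S_n$ in this encoding, which is why the two row sets agree.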
Any lower bound against linear circuits must hold for cancellation-free circuits, so a first step in proving superlinear lower bounds for the set intersection matrix is to prove superlinear cancellation-free lower bounds for the Sierpinski matrix. Our technique also holds for OR circuits. This provides a simple example of a matrix family where the complements are significantly easier to compute with OR circuits than the matrices themselves.
Gate Elimination Suppose some subset of the input variables are restricted to the value 0. Now look at the resulting circuit. Some of the gates will now compute the value z = 0 ⊕ w. In this case, we say that the gate is eliminated since it no longer does any computation. The situation can be more extreme, some gate might "compute" z = 0 ⊕ 0. In both cases, we can remove the gate from the circuit, and forward the input if necessary (if z is an output gate, w now outputs the result). In the second case, the parent of z will get eliminated, so the effect might cascade. For any subset of the variables, there is a unique set of gates that become eliminated when setting these variables to 0.
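The cascading effect of gate elimination can be made concrete with a small sketch. The encoding (inputs are nodes $0, \ldots, n-1$, gate `g` is node `n + g`, gates listed in topological order) and the function name are assumptions for this example. A gate is eliminated when at least one child is constantly 0, and it is itself constantly 0 only when both children are.

```python
def eliminated_gates(n, gates, zeroed_inputs):
    """Return the indices of the gates eliminated when zeroed_inputs are set to 0.

    Hypothetical encoding: gates[g] = (a, b) with a, b earlier nodes,
    inputs are nodes 0..n-1 and gate g is node n+g.
    """
    is_zero = [i in zeroed_inputs for i in range(n)]   # node is constantly 0
    eliminated = set()
    for g, (a, b) in enumerate(gates):
        if is_zero[a] or is_zero[b]:
            eliminated.add(g)                 # computes 0 xor w, or 0 xor 0
        # Only the case 0 xor 0 makes the gate itself constantly 0,
        # which is what lets the elimination cascade to its parents.
        is_zero.append(is_zero[a] and is_zero[b])
    return eliminated
```

For the circuit $(x_1 \oplus x_2) \oplus (x_3 \oplus x_4)$, setting $x_1 = x_2 = 0$ eliminates the first gate and, by cascading, the top gate, while setting only $x_1 = 0$ eliminates the first gate alone.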
In all of the following let n be a power of 2, and let S n be the n × n Sierpinski matrix. The following proposition is easily established.
Proposition 2. For every n, the Sierpinski matrix has full rank, over both R and F 2 .
We now proceed to the proof of the lower bound of the Sierpinski matrix for cancellation-free circuits. It is our hope that this might be a step towards proving a ω(n) lower bound for linear circuits.
Theorem 10. For every $n \ge 2$, any cancellation-free circuit that computes the $n \times n$ Sierpinski matrix has size at least $\frac{1}{2} n \log n$.
Proof. The proof is by induction on $n$. For the base case, look at the $2 \times 2$ matrix $S_2$. This clearly needs at least $\frac{1}{2} \cdot 2 \log 2 = 1$ gate. (Figure 1 illustrates the situation.) Suppose the statement is true for some $n$ and consider the $2n \times 2n$ matrix $S_{2n}$. Denote the output gates $y_1, \ldots, y_{2n}$ and the inputs $x_1, \ldots, x_{2n}$. Partition the gates of $C$ into three disjoint sets, $C_1$, $C_2$ and $C_3$, defined as follows:
- $C_1$: The gates having only inputs from $x_1, \ldots, x_n$ and $C_1$. Equivalently, the gates not reachable from inputs $x_{n+1}, \ldots, x_{2n}$.
- $C_2$: The gates in $C - C_1$ that are not eliminated when inputs $x_1, \ldots, x_n$ are set to 0.
- $C_3$: $C - (C_1 \cup C_2)$. That is, the gates in $C - C_1$ that do become eliminated when inputs $x_1, \ldots, x_n$ are set to 0.
Obviously |C| = |C 1 | + |C 2 | + |C 3 |. We will now give lower bounds on the sizes of C 1 , C 2 , and C 3 .
$C_1$: Since the circuit is cancellation-free, the outputs $y_1, \ldots, y_n$ and all their predecessors are in $C_1$. By the induction hypothesis, $|C_1| \ge \frac{1}{2} n \log n$.
$C_2$: Since the gates in $C_2$ are not eliminated, they compute $S_n$ on the inputs $x_{n+1}, \ldots, x_{2n}$. By the induction hypothesis, $|C_2| \ge \frac{1}{2} n \log n$.
$C_3$: The goal is to prove that this set has size at least $n$. Let $\delta(C_1)$ be the set of wires from $C_1 \cup \{x_1, \ldots, x_n\}$ to $C_2 \cup C_3$. We first prove that $|C_3| \ge |\delta(C_1)|$. By definition, all gates in $C_1$ attain the value 0 when $x_1, \ldots, x_n$ are set to 0. Let $(v, w) \in \delta(C_1)$ be arbitrary. Since $v \in C_1 \cup \{x_1, \ldots, x_n\}$, $w$ becomes eliminated, so $w \in C_3$. By definition, every $u \in C_3$ can only have one child in $C_1$. So $|C_3| \ge |\delta(C_1)|$. We now show that $|\delta(C_1)| \ge n$. Let the endpoints of $\delta(C_1)$ in $C_1$ be $e_1, \ldots, e_p$ and let their corresponding value vectors be $v_1, \ldots, v_p$.
The circuit is cancellation-free, so coordinate-wise addition corresponds to addition in $\mathbb{R}$. Now look at the value vectors of the output gates $y_{n+1}, \ldots, y_{2n}$. For each of these, the vector consisting of the first $n$ coordinates must be in $\mathrm{span}_{\mathbb{R}}(v_1, \ldots, v_p)$, but the rank of $S_n$ is $n$ (Proposition 2), so $p \ge n$. We have that $|C_3| \ge |\delta(C_1)| \ge n$, so
$$|C| = |C_1| + |C_2| + |C_3| \ge \frac{1}{2} n \log n + \frac{1}{2} n \log n + n = \frac{1}{2} (2n) \log(2n).$$
⊓ ⊔

This is tight:
Proposition 3. The Sierpinski matrix can be computed by a cancellation-free circuit using $\frac{1}{2} n \log n$ gates.
Proof. This is clearly true for $S_2$. Assume that $S_n$ can be computed using $\frac{1}{2} n \log n$ gates. Consider the matrix $S_{2n}$. Construct the circuit in a divide-and-conquer manner by constructing recursively on the variables $x_1, \ldots, x_n$ and $x_{n+1}, \ldots, x_{2n}$. This gives outputs $y_1, \ldots, y_n$. After this, use $n$ operations to finish the outputs $y_{n+1}, \ldots, y_{2n}$. This adds up to exactly $2 \cdot \frac{1}{2} n \log n + n = \frac{1}{2} (2n) \log 2n$ gates.
⊓ ⊔
Circuits With Cancellation. In the proof of Theorem 10, we used the cancellation-free property when estimating the sizes of both $C_1$ and $C_3$. However, since $S_n$ has full rank over $\mathbb{F}_2$, a dimensionality argument similar to that used when estimating $C_3$ holds even if the circuits use cancellation. Therefore we might replace the cancellation-free assumption with the assumption that for the $2n \times 2n$ Sierpinski matrix, there is no path from $x_{n+i}$ to $y_j$ for $i \ge 1$, $j \le n$. We have not been able to show whether or not this is the case for minimum-sized circuits, although we have experimentally verified that even for circuits where cancellation is allowed, the matrices $S_2$, $S_4$, $S_8$ do not admit circuits smaller than the lower bound from Theorem 10.
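The divide-and-conquer construction from the proof of Proposition 3 can be sketched as follows. This is an illustration under assumptions, not the code used for the experiments mentioned above: it takes the recursion $S_{2n} = \begin{pmatrix} S_n & 0 \\ S_n & S_n \end{pmatrix}$, numbers inputs $0, \ldots, n-1$ and gate `g` as node `n + g`, and uses a hypothetical function name.

```python
def sierpinski_circuit(n):
    """Build a cancellation-free XOR circuit for S_n (n a power of two).

    Returns (gates, outputs): gates[g] = (a, b) is node n+g, and outputs[i]
    is the node computing row i of S_n applied to the inputs 0..n-1.
    """
    gates = []

    def build(inputs):
        m = len(inputs)
        if m == 1:
            return list(inputs)               # S_1 x = x, no gate needed
        low = build(inputs[: m // 2])         # S_{m/2} on the first half
        high = build(inputs[m // 2:])         # S_{m/2} on the second half
        combined = []
        for a, b in zip(low, high):           # m/2 gates finish the lower outputs
            gates.append((a, b))
            combined.append(n + len(gates) - 1)
        return low + combined

    outputs = build(list(range(n)))
    return gates, outputs
```

The gate count satisfies $T(2n) = 2T(n) + n$ with $T(1) = 0$, which gives $\frac{1}{2} n \log n$. Each gate combines sums over disjoint halves of the variables, so no term ever cancels.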
OR circuits In the proof of Theorem 10, the estimates for C 1 and C 2 hold for OR circuits too, but when estimating C 3 , it does not suffice to appeal to rank over F 2 or R. However, it is not hard to see that any set of row vectors that "spans" S n (with the operation being coordinate-wise OR) must have size at least n.
Theorem 11. Theorem 10 holds for OR circuits as well.
Since $C_\vee(K_n) \in O(n)$ and $K_n$ contains the same rows as $\bar{S}_n$, the complement of $S_n$, the Sierpinski matrix is harder to compute than its complement.
Corollary 3. $C_\vee(S_n) = \Theta(\log n) \cdot C_\vee(\bar{S}_n)$.
This proof strategy for Theorem 10 has recently been used by Sergeev to prove similar lower bounds for another family of Boolean matrices in the OR model [25] .
Conclusions and Open Problems
For circuits of unbounded depth, we show the existence of matrices for which OR circuits and cancellation-free linear circuits are both a factor of $\Omega\left(\frac{n}{\log^2 n}\right)$ larger than the smallest linear circuit. For circuits of constant depth we give a separation of $\Omega\left(\frac{n}{\log^D n}\right)$, where $D = 3$ for circuits of depth 2 or 3 and $D = 2$ for any larger depth.
This means that when designing linear (sub)circuits, it can be important that the methods employed can produce circuits which have cancellation.
If a cancellation-free or an OR circuit computes the Sierpinski matrix correctly, it has size at least Ω(n log n). For this particular family of matrices, it is not obvious to what extent cancellation can help. It would be very interesting to determine this, since it would automatically also solve Jukna's conjecture concerning set intersection matrices.
