Abstract. Recently, Gennaro, Gentry, Parno and Raykova [GGPR12] proposed an efficient non-interactive zero knowledge argument for Circuit-SAT, based on non-standard notions like conscientious and quadratic span programs. We propose a new non-interactive zero knowledge argument, based on a simple combination of standard span programs (that verify the correctness of every individual gate) and high-distance linear error-correcting codes (that check the consistency of wire assignments). We simplify all steps of the argument. As one of the corollaries, we design an (optimal) wire checker, based on systematic Reed-Solomon codes, of size 8n and degree 4n, while the wire checker from [GGPR12] has size 24n and degree 76n, where n is the circuit size. Importantly, the verifier's computation in the new argument is constant.
adaptively sound by using universal circuits [Val76], and the adaptively sound argument has CRS length $\Theta(n \log n)$, prover's computation $\Theta(n \log^3 n)$, and verifier's computation $\Theta(n)$.
Briefly, [GGPR12] first constructs span programs (which satisfy a non-standard conscientiousness property) that verify the correct evaluation of every individual gate. Conscientiousness means that the span program accepts only if all inputs to the span program were actually used (in the case of a Circuit-SAT argument, this means that the prover has assigned some value to every input and output wire of the gate, and that exactly this value can be uniquely extracted from the argument). The gate checkers are aggregated to obtain a single large conscientious span program that verifies every individual gate's operation in parallel. Second, [GGPR12] constructs a weak wire checker that verifies consistency, i.e., that all individual gate checkers work on an unequivocally defined set of wire values. (The weak wire checker guarantees consistency only when all gate checkers are conscientious.) They define quadratic span programs, construct a quadratic span program that implements both the aggregate gate checker and the weak wire checker, and then construct an efficient NIZK argument that verifies (given a vector commitment to all coefficients) the quadratic span program.
Our Contributions. We improve the construction of [GGPR12] in several aspects. Some of our improvements are conceptual (e.g., we provide clearer definitions, which result in better constructions) and some are technical (with special emphasis on concrete efficiency). We outline our construction below, and briefly sketch the differences compared to [GGPR12].
To verify whether the circuit $C$ accepts an input, we first use a constant-size standard (i.e., not necessarily conscientious) span program to verify every gate separately. Then, by using the standard "AND composition" of span programs [Amb10], we construct a single large span program that verifies the computation of every gate in parallel.
Unfortunately, simple AND composition of the gate checkers is not secure, because it allows "double assignments". More precisely, different adjacent gate checkers contain vectors that correspond to the variable of the same wire. While every individual checker might be locally correct, one checker could work with value 0 assigned to this wire while another checker could work with value 1 assigned to the same wire. Clearly, such bad cases should be detected.
We solve this issue as follows. Let Code be an arbitrary high-distance linear $[N, K, D]$ error-correcting code that satisfies $D > N/2$. For a concrete wire $\eta$, consider all vectors from adjacent gate checkers that correspond to the claimed value $x_\eta$ of this wire. Some of those vectors (say $v_i$) are labelled by the positive literal $x_\eta$ and some (say $w_i$) by the negative literal $\bar{x}_\eta$. The individual gate checkers' acceptance "fixes" certain coefficients $a_i$ (that are used with $v_i$) and $b_i$ (that are used with $w_i$) for all adjacent gate checkers. Roughly speaking, consistency requires that either all values $a_i$ are zero (then unequivocally $x_\eta = 0$), or all values $b_i$ are zero (then unequivocally $x_\eta = 1$). We verify that this is the case by applying an efficient high-distance linear error-correcting code separately to the vectors $a$ and $b$. The high-distance property of the linear error-correcting code guarantees that if $a$ and $b$ are not consistent, then there exists a coordinate $i$ such that $\mathrm{Code}(a)_i \cdot \mathrm{Code}(b)_i \neq 0$. We use the systematic Reed-Solomon code, since it is a maximum distance separable (MDS) code with optimal support (that is, it has the minimal possible number of non-zero elements in its generating matrix).
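The role of the condition $D > N/2$ can be made explicit by a short counting argument. The following is only a sketch in the notation of this paragraph; $\operatorname{supp}(c)$, the set of non-zero coordinates of a codeword $c$, is a symbol introduced here for illustration, and the encoding map is assumed to be injective.

```latex
% Assume a \neq 0 and b \neq 0, i.e., an inconsistent assignment.
% Code is linear and injective, so Code(a) and Code(b) are non-zero codewords
% and each has at least D non-zero coordinates among the N available ones.
|\operatorname{supp}(\mathrm{Code}(a)) \cap \operatorname{supp}(\mathrm{Code}(b))|
  \;\ge\; |\operatorname{supp}(\mathrm{Code}(a))| + |\operatorname{supp}(\mathrm{Code}(b))| - N
  \;\ge\; 2D - N \;>\; 0 .
% Hence there is a coordinate i with Code(a)_i \cdot Code(b)_i \neq 0.
```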
Motivated by this construction, we redefine quadratic span programs [GGPR12] as follows. A quadratic span program (that consists of two target vectors $t_v$ and $t_w$ and two matrices $V$ and $W$) accepts an input only if, for some vectors $a$ and $b$ that are consistent with this input, $(V \cdot a - t_v) \bullet (W \cdot b - t_w) = 0$. Here, $\bullet$ denotes the pointwise (Hadamard) product of two vectors. Clearly, the above construction, based on linear error-correcting codes, implements a quadratic span program, with $V$ and $W$ basically being the generating matrices of the code. (No connection to error-correcting codes was made in [GGPR12].) We also construct an aggregate wire checker by applying an AND composition rule to the individual wire checkers, and then construct a single quadratic span program (the circuit checker) that implements both the aggregate gate checker and the aggregate wire checker.
To summarize, the new circuit checker consists of two elements. First, an aggregate gate checker (a standard span program) that verifies that every individual gate is executed correctly on its local variables. Second, an aggregate wire checker (a quadratic span program, based on a high-distance linear error-correcting code) that verifies that individual gates are executed on consistent assignments to the variables. Importantly, the circuit checker is a composition of small (quadratic) span programs, and in total has only a [...] the argument from [GGPR12] (e.g., the use of techniques from [BCCT13]) are also applicable to the new argument.
Other Applications. We hope that, by using our techniques, one can construct efficient NIZK arguments for other languages. As an example, the techniques of [Lip12] were used in [CLZ12] to construct an efficient range argument, and in [LZ12] to construct an efficient shuffle. Quadratic span programs have more applications than just in the NIZK construction (or more generally, in the construction of the wire checker). We only mention that one can construct a related zap [DN00] and a related (public or designated-verifier) succinct non-interactive argument of knowledge (SNARK, see [Mic94, DL08, GW11, BCCT12]) by using the techniques of [BCCT12, GGPR12], and implement verifiable computation [GGP10]. In fact, applying our techniques to verifiable computation is extremely natural: instead of gates, one can talk about small (but possibly much larger than gates) computational units, and instead of wires, about the values transferred between the small computational units. Since here one deals with much larger span programs than in the case of the Circuit-SAT argument, it is especially beneficial that one can use standard (non-conscientious) span programs.
We leave it as an open question whether the non-cryptographic part of the new construction (splitting the verification of a computation into small steps and then using high-distance linear error-correcting codes to verify the consistency of individual steps) has some non-cryptographic applications.
Preliminaries: Span Programs
We assume that F is a finite field of size q 2, where q is a prime. However, most of the results can be generalized to arbitrary fields. By default, vectors like v denote row vectors. For a matrix V, let v i be its ith row vector. For an
Let $x_\iota$ be formal variables. We denote the positive literals $x_\iota$ by $x_\iota^1$ and the negative literals $\bar{x}_\iota$ by $x_\iota^0$.
s rows by one of $2n$ literals or by $\bot$. Let $V_u$ be the submatrix of $V$ consisting of those rows whose labels are satisfied by the assignment $u$, that is, by $\{x_\iota^{u_\iota} : \iota \in [n_0]\} \cup \{\bot\}$. The span program computes a function $f$ if, for all $u \in \{0, 1\}^{n_0}$: $t \in \operatorname{span} V_u$ if and only if $f(u) = 1$.
We define
to be the set of rows whose labels are satisfied by the assignment u. The size, size P , of the span program is m. The dimension sdim P is equal to d. We say that the span program P has support supp P , if all vectors v ∈ V have altogether supp P non-zero elements. Clearly, t can be replaced by an arbitrary non-zero vector; one obtains the corresponding new span program (of the same size and dimension, but possibly different support) by applying a basis change matrix. Since linear algebra can be implemented in log-space uniform-NC2 [BGP95,BW03], polynomial-sized span programs can implement only languages in the complexity class NC2.
, be the maximum number of vectors that have the same label (ι, j) with j ∈ {0, 1}. This parameter is needed later when we construct wire checkers.
Span programs were defined in [KW93], originally to help prove various lower bounds (see, for example, [Gál01]). Later, they have been used to design quantum algorithms [RS08] (see also [Rei11, Bel12b, Bel12a], or the survey [Amb10]), linear secret sharing schemes (as already shown in [KW93], see for example [CF02]), and non-interactive zero knowledge (NIZK) arguments [GGPR12]. See [Juk12] for a general exposition of span programs.
One commonly constructs more complex span programs by using simple span programs and their composition rules, see, e.g., [Amb10] . Span programs for AND, OR, XOR, and equality of two variables x and y are as follows:
Given span programs $P_0 = SP(f_0)$ and $P_1 = SP(f_1)$ for functions $f_0$ and $f_1$, it is well known how to build span programs $SP(f_0 \wedge f_1)$ and $SP(f_0 \vee f_1)$. Both compositions assume that the target vector is in a specific form: $t = (1, \dots, 1)$ in the first case and $t = (0, \dots, 0, 1)$ in the second case. Thus, in a circuit that consists of both AND and OR gates, one has to implement a basis change to transform $t$ to the correct form.
Assume that the size-$m_i$ and dimension-$d_i$ span program $SP(f_i)$ has the target vector $t_i = (1, \dots, 1)$, with the $j$th row vector $v_{ij}$ being labelled by $x_{ij}$. The span program $SP(f_0 \wedge f_1)$ has size $m_0 + m_1$ and dimension $d_0 + d_1$. In $SP(f_0 \wedge f_1)$, $t$ is a concatenation of $t_0$ and $t_1$, the first $m_0$ vectors are equal to $(v_{0j}, 0_{d_1})$ (and labelled by $x_{0j}$), and the last $m_1$ vectors are equal to $(0_{d_0}, v_{1j})$ (and labelled by $x_{1j}$). The following span program $P = SP(f_0 \vee f_1)$ has size $m_0 + m_1$ and dimension $d_0 + d_1 - 1$. Write every vector $v_i$ of $P_k$, $k \in \{0, 1\}$, as $(v_{i,-d_k}, v_{i d_k})$, where $v_{i d_k}$ is its last coordinate. Let the target vector of $P_k$ be $(0_{d_k - 1}, 1)$. The target vector of $P$ is $(0_{d_0 + d_1 - 2}, 1)$. For each $v_i$ from $P_0$, we add the vector $(v_{i,-d_0}, 0_{d_1 - 1}, v_{i d_0})$ to $P$. For each $v_i$ from $P_1$, we include the vector $(0_{d_0 - 1}, v_{i,-d_1}, v_{i d_1})$.
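The two composition rules can be exercised on toy inputs. The following Python sketch is an illustration only, not the paper's construction: the modulus `P = 101`, the tuple representation `(target, rows)`, and the helper names `in_span`, `literal`, `and_compose`, `or_compose` and `accepts` are all assumptions made here. Acceptance is decided by Gaussian elimination over Z_P.

```python
P = 101  # small prime standing in for the field size q (illustrative assumption)

def in_span(rows, target, p=P):
    """Return True iff target is a linear combination of rows over Z_p."""
    d, k = len(target), len(rows)
    # Augmented system: column i holds rows[i], the last column holds the target.
    m = [[rows[i][j] % p for i in range(k)] + [target[j] % p] for j in range(d)]
    r = 0
    for c in range(k):
        piv = next((i for i in range(r, d) if m[i][c]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        inv = pow(m[r][c], -1, p)
        m[r] = [x * inv % p for x in m[r]]
        for i in range(d):
            if i != r and m[i][c]:
                f = m[i][c]
                m[i] = [(x - f * y) % p for x, y in zip(m[i], m[r])]
        r += 1
    # The system is solvable iff no equation reads "0 = non-zero".
    return all(any(eq[:k]) or eq[k] == 0 for eq in m)

def literal(name):
    """Span program for one positive literal: target (1), a single row (1)."""
    return ([1], [([1], (name, 1))])

def and_compose(p0, p1):
    """Block-diagonal composition; targets are assumed to be all-ones vectors."""
    (t0, r0), (t1, r1) = p0, p1
    d0, d1 = len(t0), len(t1)
    rows = [(v + [0] * d1, lab) for v, lab in r0]
    rows += [([0] * d0 + v, lab) for v, lab in r1]
    return (t0 + t1, rows)

def or_compose(p0, p1):
    """Share the last coordinate; targets are assumed to be (0, ..., 0, 1)."""
    (t0, r0), (t1, r1) = p0, p1
    d0, d1 = len(t0), len(t1)
    rows = [(v[:-1] + [0] * (d1 - 1) + [v[-1]], lab) for v, lab in r0]
    rows += [([0] * (d0 - 1) + v[:-1] + [v[-1]], lab) for v, lab in r1]
    return ([0] * (d0 + d1 - 2) + [1], rows)

def accepts(sp, u):
    """u maps variable names to bits; only rows with satisfied labels are usable."""
    target, rows = sp
    usable = [v for v, lab in rows if lab is None or u[lab[0]] == lab[1]]
    return in_span(usable, target)

# (x OR y) AND z: size 1 + 1 + 1 = 3, dimension (1 + 1 - 1) + 1 = 2.
sp = and_compose(or_compose(literal("x"), literal("y")), literal("z"))
```

In this toy case the OR of two one-dimensional programs again has target (1), so it can be fed to the AND rule directly; in general a basis change is needed between the two target forms, as noted above.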
Efficient Gate Checkers
A gate checker for a gate function $f : \{0, 1\}^{n_0} \to \{0, 1\}^{n_1}$ is a function $c_f : \{0, 1\}^{n_0 + n_1} \to \{0, 1\}$, such that $c_f(x, y) = 1$ iff $f(x) = y$. We are mainly interested in unary and binary Boolean functions $f$.
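The Boolean function NAND $\bar\wedge$ is defined as $\bar\wedge(x, y) = x \mathbin{\bar\wedge} y = \neg(x \wedge y)$. The NAND-checker $c_{\bar\wedge} : \{0, 1\}^3 \to \{0, 1\}$ outputs 1 iff $z = x \mathbin{\bar\wedge} y$. We now propose an efficient span program for $c_{\bar\wedge}$.
Lemma 1. Fig. 1 depicts a span program for $c_{\bar\wedge}$. It has size 6, dimension 3, and support 7.
Proof. We obtained $SP(c_{\bar\wedge})$ by using simple AND and OR compositions from the observation that $c_{\bar\wedge}(x, \bar{x}, y, \bar{y}, z, \bar{z}) = (x \vee z) \wedge (y \vee z) \wedge (\bar{x} \vee \bar{y} \vee \bar{z})$. One can use a simple case analysis to see that $SP(c_{\bar\wedge})$ computes $c_{\bar\wedge}$: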
- $a_1 = a_2 = a_3 = 0$ (i.e., $x = y = z = 0$) does not give a solution,
- $a_1 = a_2 = a_6 = 0$ (i.e., $x = y = 0$ and $z = 1$) gives a solution with $a_3 = 1$, $a_4 \in \mathbb{Z}_q$, and $a_5 = 1 - a_4$,
- $a_1 = a_5 = a_3 = 0$ (i.e., $x = z = 0$ and $y = 1$) does not give a solution,
- $a_1 = a_5 = a_6 = 0$ (i.e., $x = 0$ and $y = z = 1$) gives a solution with $a_3 = 1$, $a_2 = -2$, $a_4 = 1$,
- $a_4 = a_2 = a_3 = 0$ (i.e., $x = 1$ and $y = z = 0$) does not give a solution,
- $a_4 = a_2 = a_6 = 0$ (i.e., $x = z = 1$ and $y = 0$) gives a solution with $a_1 = -2$, $a_3 = 1$, $a_5 = 1$,
- $a_4 = a_5 = a_3 = 0$ (i.e., $x = y = 1$ and $z = 0$) gives a solution with $a_1 = 1$, $a_2 = 1$, $a_5 = 1$,
- $a_4 = a_5 = a_6 = 0$ (i.e., $x = y = z = 1$) does not give a solution.
The claim about the size, dimension, and support is straightforward.
As seen from the proof, given an accepting assignment $(x, y, z)$, one can efficiently find small values $a_i \in [-2, 1]$ such that $\sum_i a_i v_i = t$. However, a satisfying input to $SP(c_{\bar\wedge})$ does not fix the values $a_i$ unequivocally. Namely, if $(x, y, z) = (0, 0, 1)$ (that is, $a_1 = a_2 = a_6 = 0$), then one can choose an arbitrary $a_4$ and set $a_5 \leftarrow 1 - a_4$. Since one can set $a_4 = 0$ (and $a_5 = 1$), $SP(c_{\bar\wedge})$ is not conscientious.
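Both the acceptance behaviour and the lack of conscientiousness can be checked by brute force on a small example. The Python sketch below uses a hypothetical 6-row, 3-dimensional span program obtained directly from the three clauses of the CNF (one coordinate per clause, target (1, 1, 1)); it is an assumption made here for illustration and is not claimed to coincide with the matrix of Fig. 1, although it also has size 6, dimension 3 and support 7. The modulus `Q = 5` is likewise an illustrative choice.

```python
from itertools import product

Q = 5  # small prime standing in for q (illustrative assumption)

# Hypothetical rows: (vector, (variable, bit that makes the row usable)).
# Coordinates correspond to the clauses (x or z), (y or z), (not x or not y or not z).
ROWS = [
    ((1, 0, 0), ("x", 1)),
    ((0, 1, 0), ("y", 1)),
    ((1, 1, 0), ("z", 1)),
    ((0, 0, 1), ("x", 0)),
    ((0, 0, 1), ("y", 0)),
    ((0, 0, 1), ("z", 0)),
]
TARGET = (1, 1, 1)

def solutions(u):
    """All coefficient tuples over Z_Q, on the rows usable under u, that reach TARGET."""
    usable = [v for v, (var, bit) in ROWS if u[var] == bit]
    found = []
    for coeffs in product(range(Q), repeat=len(usable)):
        combo = tuple(
            sum(c * v[j] for c, v in zip(coeffs, usable)) % Q for j in range(3)
        )
        if combo == TARGET:
            found.append(coeffs)
    return found

def nand_checker(x, y, z):
    """The function c(x, y, z) = [z = NAND(x, y)]."""
    return z == 1 - (x & y)

support = sum(1 for v, _ in ROWS for entry in v if entry)
```

For the input (0, 0, 1) the usable rows are those labelled by the positive literal of z and the negative literals of x and y, and `solutions` returns Q different coefficient tuples, including one in which the row for the negative literal of x is not used at all. This mirrors the non-conscientiousness observed above.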
Given $SP(c_{\bar\wedge})$, one can construct a size 6 and dimension 3 span program for the AND-checker function $c_\wedge(x, y, z) := (x \wedge y) \oplus \bar{z}$ by interchanging the rows labelled by $z$ and $\bar{z}$ in $SP(c_{\bar\wedge})$. Similarly, one can construct a size 6 and dimension 3 span program for the OR-checker function $c_\vee(x, y, z) := (\bar{x} \wedge \bar{y}) \oplus z$ by interchanging the rows labelled by $x$ and $\bar{x}$, and the rows labelled by $y$ and $\bar{y}$, in $SP(c_{\bar\wedge})$.
The NOT-checker $[x \neq y] = x \oplus y$ is just the XOR function, and thus one can construct a size 4 and dimension 2 span program for the NOT-checker function. (See Sect. 2.)
We need a fork gate that computes $y_1 \leftarrow x$, $y_2 \leftarrow x$. That is, $c_Y(x, y_1, y_2) = (x \wedge y_1 \wedge y_2) \vee (\bar{x} \wedge \bar{y}_1 \wedge \bar{y}_2)$. We write $c_Y$ in CNF, $c_Y(x, y_1, y_2) = (\bar{x} \vee y_2) \wedge (x \vee \bar{y}_1) \wedge (y_1 \vee \bar{y}_2)$. Since every literal is mentioned only once in the CNF, we can use AND and OR compositions to derive the span program in Fig. 2. Thus, $c_Y$ has a span program of size 6, dimension 3, and support 6.
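The equivalence between the two forms of the fork checker is a cyclic chain of implications (x implies y_2, y_2 implies y_1, y_1 implies x), which forces all three values to be equal. A minimal truth-table check in Python, with the function names `fork_dnf` and `fork_cnf` introduced here for illustration:

```python
from itertools import product

def fork_dnf(x, y1, y2):
    """Fork checker as stated: all three bits are 1, or all three are 0."""
    return bool((x and y1 and y2) or ((not x) and (not y1) and (not y2)))

def fork_cnf(x, y1, y2):
    """CNF in which each of the six literals occurs exactly once."""
    return bool(((not x) or y2) and (x or (not y1)) and (y1 or (not y2)))

agree = all(fork_dnf(*u) == fork_cnf(*u) for u in product((0, 1), repeat=3))
```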
We also need a 1-to-t fork-checker which has 1 input x and t outputs y ι , with y ι = x. The t-fork checker is then c
. From this we construct a span program exactly as in the case t = 2. The span program has size 2(t + 1) and dimension t + 1. It has only one vector labelled with every x ι /y or its negation, thus D(x) = D(y ι ) = 1 for all ι. To compute the support, we note that SP * (c t Y ) has two 1-entries in every column, and one in every row. Thus, it has support supp(SP (c
Aggregate Gate Checker
Given a circuit that consists only of NAND, AND, OR, XOR, and NOT gates, we combine the individual gate checkers by using the span program AND composition rule from Sect. 2. In addition, to make the wire checker of Sect. 6.1 (and thus also the final NIZK argument) more efficient, all gates of the circuit $C$ need to have a small fan-out. In [GGPR12], the authors designed a circuit of size $3 \cdot |C|$ that implements the functionality of the circuit $C$ but only has fan-out 2, except for a specially introduced dummy input. The GGPR aggregate gate checker has size $36 \cdot |C|$ and dimension $27 \cdot |C|$. By using the techniques of [HKP84] (which replace every high fan-out gate with an inverse binary tree of fork gates, and then give a precise estimate of the resulting circuit size), we prove a much more precise result. Unlike [GGPR12], we also do not introduce dummy gates at every input, or the dummy input.
Let C be a circuit. For a gate i of C, let deg + (i) be its fan-out, and let deg
The aggregate gate checker function agc of a circuit C is a function
|C| . I If c i is the gate checker of the ith gate and dim
n0 → {0, 1} be a function implemented by a fan-in $\le 2$ circuit $C$ with $n = |C|$ NAND, AND, OR, XOR, and NOT gates. There exists a fan-in $\le 2$ and fan-out $\le t$ circuit $C_{bnd}$ for $f$ which has the same $n$ gates as $C$ and up to $(n - 2n_0)/(t - 1)$ additional $t$-fork gates. Denote $t^* := 1/(t - 1)$. The aggregate gate checker $agc = agc(C_{bnd})$ for $f$ has a span program $P$ with $\mathrm{size}\, P \le (8 + 4t^*) n - (10 + 8t^*) n_0$, $\mathrm{sdim}\, P \le (4 + 2t^*) n - (5 + 4t^*) n_0$, and $\mathrm{supp}\, P \le (9 + 4t^*) n - (11 + 8t^*) n_0$. If $t = 3$, then $\mathrm{size}\, P \le 10n - 14n_0$, $\mathrm{sdim}\, P \le 5n - 7n_0$, and $\mathrm{supp}\, P \le 11n - 15n_0$.
The proof of this theorem is given in App. B.
We emphasize that the optimal choice of $t$ depends on the parameter that we want to optimize. The aggregate gate checker has optimal size, dimension and support when $t$ is large (ideally, when the fan-out bounding procedure of Thm. 1 is not applied at all). The support of the aggregate wire checker (see Sect. 6.2) is minimized when $t = 2$. To somewhat balance the parameters, we concentrate on the case $t = 3$.
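The dependence of the Thm. 1 bounds on $t$ can be tabulated with exact rational arithmetic. The following Python sketch only evaluates the coefficients stated in Thm. 1; the function name `gate_checker_bounds` and the dictionary layout are assumptions made here. Each entry is a pair (coefficient of n, coefficient of n_0) in a bound of the form c_n · n − c_{n_0} · n_0.

```python
from fractions import Fraction

def gate_checker_bounds(t):
    """Coefficients of n and n_0 in the Thm. 1 bounds for a fan-out bound t >= 2."""
    ts = Fraction(1, t - 1)  # t* := 1/(t - 1)
    return {
        "size": (8 + 4 * ts, 10 + 8 * ts),
        "sdim": (4 + 2 * ts, 5 + 4 * ts),
        "supp": (9 + 4 * ts, 11 + 8 * ts),
    }

bounds_t2 = gate_checker_bounds(2)
bounds_t3 = gate_checker_bounds(3)
```

For $t = 3$ this gives the pairs (10, 14), (5, 7) and (11, 15) quoted in Thm. 1, and for $t = 2$ the larger pairs (12, 18), (6, 9) and (13, 19), in line with the remark that the aggregate gate checker improves as $t$ grows.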
Quadratic Span Programs
A quadratic span program is an extension of span programs, motivated by what one can actually do by using bilinear maps. We will first give a definition of quadratic span programs by using the language of linear algebra. After that, in Sect. 8, we will provide a polynomial redefinition of quadratic span programs and show that the result is equivalent to the definition given in [GGPR12]. Definition 1. A quadratic span program P = (t v , t w , V, W, ) over a field F consists of two (possibly allzero) target vectors t v , t w ∈ F d , two m × d matrices V and W, and a common labelling :
, where x • y denotes the pointwise (Hadamard) product of x and y. The quadratic span program computes a function f if for all u ∈ {0, 1} n0 :
The size, size P , of the quadratic span program is m. The dimension sdim P is equal to d. The support supp P of a quadratic span program P is equal to the sum of the supports (that is, the number of non-zero elements) of all vectors v i and w i .
Clearly
b i w ij = t wj , which can be seen as an element-wise OR of two span programs. This can be compared to the element-wise AND of two span programs that accepts only if for all
This AND composition accepts exactly if two span programs accept simultaneously, that is,
On the other hand, one cannot implement an element-wise OR composition (quadratic span program) as a span program. Quadratic span programs add an element-wise OR to an element-wise AND, and thus it is not surprising that they increase the expressiveness of span programs.
Clearly, one can compose quadratic span programs by using the AND and OR composition rules of span programs. One has to take care to apply the same transformation to both V and W simultaneously.
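The element-wise OR behaviour of the Hadamard condition can be seen on a two-dimensional toy instance. The Python sketch below computes the coordinate-wise product of the two residuals for given coefficient vectors; the function name `qsp_residual`, the row-vector convention (a combination of the rows of V with coefficients a) and the modulus are assumptions made here, and the labelling (which rows may be used for a given input) is deliberately ignored.

```python
def qsp_residual(tv, tw, V, W, a, b, p):
    """Coordinate-wise product of (sum_i a_i v_i - t_v) and (sum_i b_i w_i - t_w) mod p."""
    m, d = len(V), len(tv)
    left = [(sum(a[i] * V[i][j] for i in range(m)) - tv[j]) % p for j in range(d)]
    right = [(sum(b[i] * W[i][j] for i in range(m)) - tw[j]) % p for j in range(d)]
    return [l * r % p for l, r in zip(left, right)]

p = 101
I2 = [[1, 0], [0, 1]]
t = [1, 1]

# Each coordinate is fixed by exactly one of the two sides: the product vanishes,
# although neither side reaches its target on its own.
mixed = qsp_residual(t, t, I2, I2, [1, 0], [0, 1], p)

# Both sides miss the target in the second coordinate: the product is non-zero there.
both_miss = qsp_residual(t, t, I2, I2, [1, 0], [1, 0], p)
```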
6 Wire Checker and Aggregate Wire Checker
Wire Checker
Gate checkers verify that every individual gate is evaluated correctly, that is, that its output wire obtains a value which is consistent with its input wires. On top of that, one also requires inter-gate (wire) consistency that ensures that adjacent gate checkers do not make double assignments to any of the wires. Following [GGPR12], we construct a wire checker to verify such inter-gate consistency. We first construct a wire checker for every single wire (that verifies that the variables involved in the span programs of the vertices that are adjacent to this concrete wire do not get inconsistent assignments), and then aggregate them by using AND composition of quadratic span programs.
We will need the following notation. Let $G = (V, E) = G(C)$ be the hypergraph of the circuit $C$. A hyperedge $\eta$ connects the input gate of some wire to (potentially many) output gates of the same wire. In $C$, an edge $\eta$ (except input edges, which have $t$ adjacent vertices) has $t + 1$ adjacent vertices, where $t$ is the fan-out of $\eta$'s designated input gate. Every vertex of $G$ can only be the starting gate of one hyperedge and the final gate of two hyperedges (since we only consider unary and binary gate operations). Clearly, $|E(G)| \le 2(|V(G)| - n_0)$, where $n_0$ is the number of inputs to the circuit, i.e., the number of the sources of $G$. We denote the set of gates of $C$ by $V(C)$ and the set of wires of $C$ by $E(C)$.
Every wire η ∈ E(C) corresponds to a formal variable x η in a natural way. This variable obtains an assignment, computed from the input assignment u. Let N (η) be the set of η's adjacent gates. For every i ∈ N (η), let P i = (t i , V i , i ) be the corresponding gate checker. For every i ∈ N (η), one of the input or output variables of P i (that we denote by x i:η ) corresponds to x η . Recall that for a local variable y of a span
We define the ηth wire checker between the rows of adjacent gates i ∈ N (η) that are labelled either by the local variable x i:η or its negationx i:η , i.e., between the rows {i : ∃k ∈ N (η) s.t.
where k is defined as in the previous paragraph. Let ψ be the natural labelling of the wire checkers, with ψ(i) = x j η iff k (i) = x j k:η for some k ∈ N (η). After possible re-enumerating of rows, assume that
. We first give a definition of wire checkers for the case where there is only one wire η and thus only one variable x η . In Sect. 6.2, we will give a definition and a construction for the aggregate case.
. Fix a wire η, and 
We informally define the degree sdeg P of a (quadratic) span program P as the degree of the interpolating polynomial that obtains the value v ij at point j. See Sect. 8 for a formal definition.
Theorem 2. $Q_{wc}$ is a wire checker of size $2D$, degree $D$, dimension $D^* = 2D - 1$, and support $4D^2$.
Proof. It is easy to see that if a and b indicate a consistent bit assignment, then the new wire checker accepts. For example, if
. Now, assume that a and b indicate an inconsistent bit assignment, that is,
Since RS D is the generator matrix of the systematic Reed-Solomon code, the vector (a
wc does not accept. The claim about the size, the dimension, the degree, and the support is straightforward.
Intuitively, we use a Reed-Solomon code since it is a maximum distance separable (MDS) code and thus minimizes the number of columns in $RS_D$. It also naturally minimizes the degree of the wire checker. Moreover, $RS_D$ has $D^2$ non-zero elements. Clearly (and this is the reason we use a systematic code), $D^2$ is also the smallest support the generator matrix of an $[n = 2D - 1, k = D, d = D]_q$ code can have, since every row of $RS_D$ is a codeword and thus must have at least $d$ non-zero entries, and thus $RS_D$ must have at least $dD \ge D^2$ non-zero entries, where the last inequality is due to the Singleton bound [Rot06]. We note that a wire checker with $V = W = RS_D$ satisfies the even stronger security requirement that either $a = 0$ or $b = 0$. One could hope to pair up literals corresponding to $x_\eta$ in the $V$ part and literals corresponding to $\bar{x}_\eta$ in the $W$ part. This is impossible in our application, since when we aggregate the wire checkers, we have to use vectors labelled with both negative and positive literals in the same part, $V$ or $W$, and we cannot pair up columns from $V$ and $W$ that have different indices.
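These properties can be checked numerically for a small instance. The Python sketch below builds a systematic Reed-Solomon generator matrix of shape D × (2D − 1) by Lagrange interpolation; the choice of evaluation points 0, ..., 2D − 2, the modulus `p = 7`, and the function names are assumptions made here for illustration and need not match the paper's exact definition of $RS_D$.

```python
from itertools import product

def rs_systematic_generator(D, p):
    """Rows are the Lagrange basis polynomials for points 0..D-1, evaluated at 0..2D-2."""
    N = 2 * D - 1
    assert p >= N, "need N distinct evaluation points in Z_p"
    G = []
    for i in range(D):
        row = []
        for x in range(N):
            num, den = 1, 1
            for k in range(D):
                if k != i:
                    num = num * (x - k) % p
                    den = den * (i - k) % p
            row.append(num * pow(den, -1, p) % p)
        G.append(row)
    return G

def encode(msg, G, p):
    """Codeword msg * G over Z_p."""
    return [sum(m * g for m, g in zip(msg, col)) % p for col in zip(*G)]

D, p = 3, 7
G = rs_systematic_generator(D, p)

# Number of non-zero entries of the generator matrix.
support = sum(1 for row in G for entry in row if entry)

# Minimum Hamming weight over all non-zero messages (brute force, 7^3 - 1 messages).
min_weight = min(
    sum(1 for c in encode(m, G, p) if c)
    for m in product(range(p), repeat=D)
    if any(m)
)

# Two non-zero messages with disjoint systematic parts still collide in a parity position.
ca, cb = encode((1, 0, 0), G, p), encode((0, 0, 1), G, p)
hadamard = [x * y % p for x, y in zip(ca, cb)]
```

With D = 3 the matrix has the identity in its first three columns and a fully non-zero 3 × 2 parity part, so its support is D^2 = 9, and every non-zero codeword has weight at least D = 3 out of 2D − 1 = 5 positions.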
For labelling ψ, we define the dual labelling
η . Let $W = V^{dual}$ be the same matrix as $V$, except that it has rows from $\psi^{-1}(x_\eta)$ and $\psi^{-1}(\bar{x}_\eta)$ switched, for every $\eta$. To simplify the notation, we will not mention the dual labelling $\psi^{dual}$ unless absolutely necessary, and we will assume implicitly that $W$ has been constructed as in the current paragraph. Now, [GGPR12] constructed a weak wire checker that guarantees consistency when all individual gate checkers are conscientious. The new wire checker is both more efficient and more secure.
Aggregate Wire Checker
Let P = (0, 0, V, W, ψ), with two m × d matrices V and W = V dual , be a quadratic span program. P is an aggregate wire checker, if (V · a − t v ) • (W · b − t w ) = 0 if and only if a and b indicate consistent bit assignments in the following sense: for each η ∈ E(C), 
We construct an aggregate wire checker by AND-composing wire checkers for the individual wires. The aggregate wire checker, see Prot. 1, first resets all vectors $v_i$ and $w_i$ to 0, and precomputes $RS_{D_\eta}$ for all possible values $D_\eta$ (clearly, $D_\eta \le t + 1$). After that, for every wire $\eta$, it sets the entries in rows labelled by either $x_\eta$ or $\bar{x}_\eta$, and columns corresponding to wire $\eta$, according to the $\eta$th wire checker. The variables $D_\eta$ and $D_{cum}$ are defined as in Sect. 6.1.
We recall from Sect. 6.1 that for the wire checker of some wire to work, it must be the case that the vectors in V and W of this wire checker have different (but consistent) orderings. To keep notation simple, we will not mention this in what follows.
Theorem 3. Let $t \ge 2$. Assume that $C_{bnd}$ is the circuit obtained by the transformation described in Thm. 1. For any $\eta \in E(C_{bnd})$, denote $D^*_\eta = 2D_\eta - 1$. We obtain an aggregate wire checker $Q_{awc}$, see Prot. 1, by merging wire checkers for the individual indices $\eta \in E(C_{bnd})$ as in Prot. 1 from the span program $S$ that computes the aggregate gate checker function agc for $C_{bnd}$.
Proof. Let m be the size of the aggregate wire checker (computed in Thm. 4). If a, b indicate consistent assignments, then they indicate consistent assignments of the ηth bit for i restricted to ψ −1 (x η ) ∪ ψ −1 (x η ). For every η ∈ E(C bnd ), the wire checker for wire η guarantees that (
iff the bit assignments of the ηth wire are consistent. Thus, (
iff the bit assignments of all wires are consistent.
Theorem 4. Let t * := 1/(t − 1). Assume C implements some f : {0, 1} n0 → {0, 1}, and n = |C|. The aggregate wire checker Q awc has size Q awc ≤ (6+4t
(The proof of this theorem is given in App. C.) Clearly, all parameters other than the support are minimized when $t$ is large. If the support is not important, then one can skip the fan-out bounding step, and get size $2n$, dimension $2n$ and degree $n$.
Gennaro et al. [GGPR12] only defined a weak aggregate wire checker that guarantees the required "no double assignments" property only when the individual gate checkers are conscientious. The new aggregate wire checker does not have this restriction. The size of the GGPR weak aggregate wire checker is $24n$ and its degree is $76n$. Unlike [GGPR12], we gave the description of our aggregate wire checker by using the non-polynomial interpretation of quadratic span programs.
Circuit Checker
Next, we combine the aggregate gate and wire checkers to perform the verification of a Circuit-SAT instance. We will give two different descriptions of the resulting circuit checker, based on wire checkers.
Combined Circuit Checker. We construct the combined circuit checker for C as follows: let P w = (0, 0, V w , W w , ψ) be an aggregate wire checker for C bnd . Let P g = (t, V g , ) be an aggregate gate checker for C bnd . Here, and ψ are related as in Sect. 6.2. Let m = size P w = size P g . Assume that the vectors V w = {v
Definition 3. The combined circuit checker c Λ (C) for C consists of P g and P w . It accepts u (that is, c Λ (C)(u) = 1) if there exist two vectors a and b, such that a i = b i = 0 for i ∈ ψ −1 u , which make both P g and P w simultaneously accept, in the sense that the following holds true:
We note that the instantiation of $P_g$ used in conjunction with vector $b$ differs from the instantiation used in conjunction with vector $a$: as explained in Sect. 6.1, the two instantiations have a different ordering of the vectors $v^g_i$. To ease notation, we will not make this explicit.
Theorem 5. Assume that P w is an aggregate wire checker. C(u) = 1 iff c Λ (C)(u) = 1.
Proof. First, assume C(u) = 1. By the construction of the aggregate gate checker, there exists a, with
Since a and b indicate bit assignments of wires in the evaluation of C(u), the aggregate wire checker accepts.
Second, assume that there exist vectors a and b, such that c Λ (C) accepts with those vectors. Since P w accepts, there are no double assignments. That means, that for some (possibly non-unique) bit u η ∈ {0, 1} and all i ∈ ψ −1 (xū η η ), a i = 0. Dually, b i = 0 for all i ∈ ψ −1 (xū η η ) (u η clearly has to be the same in both cases). Since this holds for every wire, we get that there exists an assignment u of wire values, such that for
Pure circuit checker. The previous construction of c Λ (C) consists of two span programs and one quadratic span program. Following the ideas of [GGPR12], one can represent everything as one (slightly larger) quadratic span program. Namely, for d g = sdim P g , consider the quadratic span program
(1)
Here, $V = (v_0, \dots, v_m)$, $W = (w_0, \dots, w_m)$, and $1 = (1, \dots, 1)$. Clearly, $(\sum_i a_i v_{ij} - v_{0j})(\sum_i b_i w_{ij} - w_{0j}) = 0$ for $j \in [1, 2 \cdot \mathrm{sdim}\, P_g + \mathrm{sdim}\, P_w]$ iff the following three things hold:
Thus, this quadratic span program accepts iff the combined circuit checker accepts. Thus, c Λ (C) is a circuit checker for C. However, it also increases the dimension of the final quadratic span program.
Clearly, sdeg c Λ (C) ≤ 2 · sdim P g + sdeg V w . Let n = |C|. From Thm. 1 and Thm. 4, sdeg c Λ (C) ≤ (11 + 6t * )n − 12(1 + t * )n 0 − 2. This decreases when t increases, obtaining the value ≤ 11n − 12n 0 − 2 when one does not apply Thm. 1 at all, or 14n − 18n 0 − 2 when t = 3. Analogously, size c Λ (C) = size P w + size P g ≤ 2(7 + 4t * )n − 2(7 + 8t * )n 0 − 4. This decreases when t increases, obtaining the value ≤ 14n − 14n 0 − 4 when one does not apply Thm. 1 at all, or 18n − 22n 0 − 4 when t = 3.
One can similarly compute the dimension and the support $\mathrm{supp}\, c_\Lambda(C) = 2(\mathrm{supp}\, P_g + \mathrm{supp}\, P_w) = (50 + 8t(3 + t) + 40t^*)n - 2(5 + t(27 + 8t))t^* n_0 - 8(1 + t)^2$ of the circuit checker. The support is upper-bounded by $214n - 158n_0 - 128$ when $t = 3$. We note that the degree of the circuit checker from [GGPR12] is $130n$ and its size is $36n$. Thus, even when $t = 3$, we have improved the efficiency of their construction more than 9 times degree-wise and 2 times size-wise.
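The concrete numbers for $t = 3$ follow from the general formulas by substituting $t^* = 1/2$. The Python sketch below evaluates the stated degree, size and support bounds with exact rational arithmetic; the function name `circuit_checker_bounds` and the triple layout (coefficient of n, coefficient of n_0, constant), for a bound of the form c_n · n − c_{n_0} · n_0 − c, are assumptions made here.

```python
from fractions import Fraction

def circuit_checker_bounds(t):
    """Evaluate the stated circuit checker bounds for a fan-out bound t >= 2."""
    ts = Fraction(1, t - 1)  # t* := 1/(t - 1)
    return {
        "sdeg": (11 + 6 * ts, 12 * (1 + ts), 2),
        "size": (2 * (7 + 4 * ts), 2 * (7 + 8 * ts), 4),
        "supp": (
            50 + 8 * t * (3 + t) + 40 * ts,
            2 * (5 + t * (27 + 8 * t)) * ts,
            8 * (1 + t) ** 2,
        ),
    }

b3 = circuit_checker_bounds(3)

# Ratios of leading coefficients against the GGPR figures quoted in the text.
degree_gain = Fraction(130) / b3["sdeg"][0]
size_gain = Fraction(36) / b3["size"][0]
```

The ratios compare only the coefficients of $n$: 130/14 is slightly above 9, and 36/18 equals 2.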
Polynomial Span Programs and Quadratic Span Programs
One can build a linear-communication NIZK argument on top of the circuit checker by using well-known techniques. However, since we are interested in succinct arguments, we need to be able to somehow compress the witness vectors a and b. As in [GGPR12], we will do it by using polynomial interpolation.
For large prime q, let F = Z q . Instead of considering the target and row vectors as being members of the vector space The requirement that t is in the span of the vectors that belong to −1 u is equivalent to the requirement that t = i∈ −1 u a i v i for some a i ∈ F. In the polynomial notation, the latter translates to the requirement that z(X) divides v(X) := v 0 (X) + i∈ n : there exists a ∈ F m such that z(X) | (v 0 (X)+ v∈Vu a i v(X)) (P accepts) iff f (u) = 1.
Alternatively, P accepts u ∈ {0, 1} n iff there exists a vector a ∈ F m , with
The size of the span program is m and the degree of P is deg z(X). We now give exactly the same definition of quadratic span programs as it was given in [GGPR12].
Definition 5. A polynomial quadratic span program P over a field F consists of a target polynomial z(X) ∈ F[X], two sets V = {v 0 (X), v 1 (X), . . . , v m (X)} and W = {w 0 (X), w 1 (X), . . . , w m (X)} of polynomials from F[X], and a labelling : [m] → {x ι ,x ι : ι ∈ [n]} ∪ {⊥}. P accepts an input u ∈ {0, 1} n iff there exist two vectors a and b from F m , with
n → {0, 1} if it accepts exactly those inputs u where f (u) = 1. The size of the polynomial quadratic span program is m and the degree of P is deg z(X). Now, keeping in mind the reinterpretation of span programs, Def. 5 is clearly equivalent to Def. 1. (Also here, W = V dual , with the dual operation defined appropriately.)
To get from the non-polynomial interpretation to polynomial interpretation, one has to do the following. Assume that the dimension of the quadratic span program is d and that the size is m. 
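The correspondence between the two interpretations can be illustrated on a toy instance: coordinate-wise vanishing of a Hadamard product is the same as divisibility of the product of the interpolated polynomials by the target polynomial. The Python sketch below is an assumption-laden illustration, not the paper's procedure: it interpolates at the points 1, ..., d, takes $z(X) = \prod_j (X - j)$, lets $v_0$ and $w_0$ interpolate the negated target vectors, and works over Z_101.

```python
P = 101  # small prime standing in for q (illustrative assumption)

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out

def poly_add(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [(a + b) % P for a, b in zip(f, g)]

def poly_rem(f, z):
    """Remainder of f modulo a monic polynomial z."""
    f = f[:]
    while len(f) >= len(z):
        c, shift = f[-1], len(f) - len(z)
        for k, zk in enumerate(z):
            f[shift + k] = (f[shift + k] - c * zk) % P
        f.pop()
    return f

def interpolate(values):
    """Polynomial of degree < d that takes the value values[j] at the point j + 1."""
    d, result = len(values), [0]
    for j, y in enumerate(values):
        basis, den = [1], 1
        for k in range(d):
            if k != j:
                basis = poly_mul(basis, [-(k + 1) % P, 1])
                den = den * ((j - k) % P) % P
        scale = y * pow(den, -1, P) % P
        result = poly_add(result, [scale * c % P for c in basis])
    return result

def divisible(tv, tw, V, W, a, b):
    """True iff z(X) divides (v_0 + sum a_i v_i)(X) * (w_0 + sum b_i w_i)(X)."""
    z = [1]
    for j in range(1, len(tv) + 1):
        z = poly_mul(z, [-j % P, 1])
    v = interpolate([-x for x in tv])  # v_0(X) encodes -t_v
    w = interpolate([-x for x in tw])  # w_0(X) encodes -t_w
    for ai, bi, vi, wi in zip(a, b, V, W):
        v = poly_add(v, [ai * c % P for c in interpolate(vi)])
        w = poly_add(w, [bi * c % P for c in interpolate(wi)])
    return not any(poly_rem(poly_mul(v, w), z))

I2 = [[1, 0], [0, 1]]
t = [1, 1]
ok = divisible(t, t, I2, I2, [1, 0], [0, 1])   # Hadamard product is (0, 0)
bad = divisible(t, t, I2, I2, [1, 0], [1, 0])  # second coordinate is non-zero
```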
New NIZK Argument
In this section, we propose the new Circuit-SAT NIZK argument. We start with definitions.
Definitions. Let $\kappa$ be the security parameter. We abbreviate probabilistic polynomial-time by PPT. Let $\mathrm{poly}(\kappa) := \kappa^{O(1)}$ and $\mathrm{negl}(\kappa) := \kappa^{-\omega(1)}$. Let $R = \{(C, w)\}$ be an efficiently computable binary relation with $|w| = \mathrm{poly}(|C|)$. Here, $C$ is a statement, and $w$ is a witness. Let $L = \{C : \exists w, (C, w) \in R\}$ be an NP-language. Let $n = |C|$ be the input length. For fixed $n$, we have a relation $R_n$ and a language $L_n$. A non-interactive argument $\Pi$ for $R$ consists of the following PPT algorithms: a common reference string (CRS) generator $G$, a prover $P$, and a verifier $V$. For $crs \leftarrow G(1^\kappa, n)$, $P(crs; C, w)$ produces an argument $\pi$. The verifier $V(crs; C, \pi)$ outputs either 1 (accept) or 0 (reject). $\Pi$ is perfectly complete, if for all $n = \mathrm{poly}(\kappa)$, $\Pr[crs \leftarrow G(1^\kappa, n), (C, w) \leftarrow R_n : V(crs; C, P(crs; C, w)) = 1] = 1$.
Π is perfectly zero-knowledge, if there exists a PPT simulator $S = (S_1, S_2)$, such that for all stateful non-uniform PPT adversaries A and $n = \mathrm{poly}(\kappa)$ (with td being the simulation trapdoor),
Π is adaptively computationally sound, if for all non-uniform PPT A and all n = poly(κ),
For algorithms $A$ and $E_A$, we write $(y; z) \gets (A \| E_A)(x)$ if $A$ on input x outputs y, and $E_A$ on the same input (including the random tape of $A$) outputs z. A non-interactive argument is a non-interactive argument of knowledge, if for any non-uniform PPT prover $A$ there exists an extractor $E_A$ such that for $n = \mathrm{poly}(\kappa)$ and any auxiliary information $z \in \{0, 1\}^\kappa$,
Construction. Gennaro et al. [GGPR12] constructed a quadratic span program-based NIZK for Circuit-SAT. Their NIZK argument is non-adaptive, i.e., it incorporates a function-specific CRS crs(C). As mentioned in [GGPR12], their argument can be made adaptive by using universal circuits [Val76]. We will give a direct construction through universal circuits [Val76]. That is, we assume that $UC_n$ is a universal circuit of size $\Theta(n \log n)$ that accepts an input (C, u) iff C is an n-gate circuit that accepts the input u. More precisely, assuming that the original circuit has fan-out 2, Valiant's universal circuit consists of $19 n \log n$ controlled crossbar gates (omitting lower-order terms), and $\Theta(n)$ gates for the universal function from $\{0, 1\}^2$ to $\{0, 1\}$. To enable a better comparison with [GGPR12], we just assume that $UC_n$ can be implemented by using $\Theta(n \log n)$ unary or binary gates.
In the new Circuit-SAT NIZK argument, polynomials (e.g., $v_i(X)$) are represented by encodings of these polynomials evaluated at some secret point $\sigma$ (e.g., by $[[v_i(\sigma)]]$). Next, we will describe the three constituent algorithms of the new NIZK argument.

Briefly, the prover uses the circuit checker to create the necessary coefficients $a_i$ and $b_i$. He encodes $v(X) = \sum_i a_i v_i(X)$ and $w(X) = \sum_i b_i w_i(X)$ as $[[v(\sigma)]]$ and $[[w(\sigma)]]$. He also encodes $[[h(\sigma)]]$, where $h(X) z(X) = v(X) w(X)$. He then creates an argument π that convinces the verifier that $h(X)$ satisfies this condition. To achieve zero-knowledge, the prover additionally masks the argument accordingly. The verifier verifies that the argument is created correctly. She also verifies that $h(X)$, $v(X)$, $w(X)$ and $z(X)$ satisfy $h(\sigma) z(\sigma) = v(\sigma) w(\sigma)$. We prove in Thm. 8 that this, together with the PDH and PKE assumptions, implies that $h(X) z(X) = v(X) w(X)$ and thus $z(X) \mid v(X) w(X)$. Computational soundness then follows from the properties of the circuit checker. The actual proof is significantly more complicated.

Since the verifier will require less information than the prover, we define the verifier's CRS separately as vcrs. As in [GGPR12], the secret elements $\beta_v$, $\beta_w$ and $\gamma$ (and corresponding public elements like $[[\beta_v z(\sigma)]]$ and $\pi_y$) are required for us to be able to reduce the soundness to the PKE and PDH assumptions. This part of the proof also uses multilinear universal hash functions.

CRS generation $G(1^\kappa, n)$:
1 Let $UC_n$ be the universal circuit for circuits of size $|C_{bnd}|$, where $|C| = n$;
2 Let $Q := c_\Lambda(UC_n) = (z(X), V, W, \psi)$ be the pure polynomial circuit checker for $UC_n$ with m = size c Λ (
The trapdoor is $(\sigma, \alpha, \beta_v, \beta_w)$;
Prove P(crs; C, w):
Verify V(vcrs; C, π):
1 V confirms that the terms are in the support of validly encoded elements;
2 V confirms that the following equations hold:
Protocol 2: The simulator $S_2$(crs, C, td)
We base computational soundness and the argument of knowledge property on two assumptions, the (d 1 , d 2 )-power Diffie-Hellman ((d 1 , d 2 )-PDH) assumption and the d-power knowledge of exponent (d-PKE) assumption. Variants of these assumptions are well known, see [Gro10, Lip12, GGPR12] , where their security was proven in the generic group model.
Gennaro et al. [GGPR12] use the $(\lambda + 1, 2\lambda)$-PDH assumption for some $\lambda \approx 2d$, while we use the $(d + 1, 2d + 3)$-PDH assumption for d. In both cases, d is the degree of the underlying circuit checker. The corresponding security definition in [Gro10, Lip12] is somewhat weaker, since there the adversary was required to return the secret key $\sigma$.
Let $d = \mathrm{poly}(\kappa)$. The d-power knowledge of exponent (d-PKE) assumption [Gro10] holds for the encoding $[[\cdot]]$ if for any non-uniform PPT adversary $A$ there exists a non-uniform PPT extractor $E_A$, s.t. for any auxiliary information $z \in \{0, 1\}^{\mathrm{poly}(\kappa)}$ which is generated independently of $\alpha$,
Theorem 8. Fix the circuit size n. Let the pure circuit checker c Λ (UC n ) = (z(X), V, W, ψ) for UC n have degree d. If the (d + 1, 2d + 3)-PDH and d-PKE assumptions hold, then the NIZK argument of this section is an adaptively sound argument of knowledge.
We postpone the soundness proof to Sect. 10.

Efficiency. Efficiency-wise, the new NIZK argument behaves similarly to the adaptive variant of the Circuit-SAT argument from [GGPR12], once we recall that using universal circuits results in a logarithmic increase of most of the complexity parameters. Like the latter argument (see footnote 3), the new argument has CRS of size $\Theta(n \log n)$, prover's computational complexity $\Theta(n \log^3 n)$, and constant communication complexity. The main difference is that the new argument has significantly smaller constants. (As always, we assume that the complexity measures are in appropriate units like the number of group elements in the case of the CRS and argument length.)
Given $a$, $b$, $(v_i)$ and $(w_i)$, both $v = \sum_{i=1}^{m} a_i v_i$ and $w = \sum_{i=1}^{m} b_i w_i$ can be computed in $\Theta(\mathrm{supp}\, c_\Lambda(UC_n)) = \Theta(|UC_n|)$ time due to the sparsity of the vectors $v_i$ and $w_i$. The only superlinear (in $|UC_n| = \Theta(d) = \Theta(n \log n)$) part of the prover's computation is the computation of the degree-d polynomial h. As explained in [GGPR12], this can be done by using multipoint evaluation and polynomial interpolation in time $\Theta(d \log^2 d)$.
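As an illustration of the relation $h(X) z(X) = v(X) w(X)$ that the prover has to satisfy, the following minimal sketch computes the quotient h over a toy prime field with schoolbook multiplication and long division. It is not the quasi-linear procedure discussed above: the modulus `P`, the helper names and the sample polynomials are assumptions chosen only for this example, and the running time is quadratic in the degree.

```python
# Toy sketch: find h(X) with h(X) * z(X) = v(X) * w(X) over a small prime field.
# P, the helper names and the sample polynomials are illustrative assumptions.
P = 101  # toy prime; a real instantiation uses a cryptographically large field


def poly_mul(a, b):
    """Schoolbook product of coefficient lists (lowest degree first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out


def poly_divmod(num, den):
    """Polynomial long division mod P; returns (quotient, remainder)."""
    num = num[:]
    inv_lead = pow(den[-1], -1, P)  # inverse of the leading coefficient
    quot = [0] * max(len(num) - len(den) + 1, 1)
    for k in range(len(num) - len(den), -1, -1):
        c = (num[k + len(den) - 1] * inv_lead) % P
        quot[k] = c
        for j, dj in enumerate(den):
            num[k + j] = (num[k + j] - c * dj) % P
    return quot, num[: len(den) - 1]


def from_roots(roots):
    """Monic polynomial that vanishes exactly at the given points."""
    poly = [1]
    for r in roots:
        poly = poly_mul(poly, [(-r) % P, 1])
    return poly


# Target polynomial vanishing at the (hypothetical) interpolation points 1, 2, 3.
z = from_roots([1, 2, 3])
# v vanishes at 1 and 2, w vanishes at 3, so v * w vanishes at every root of z.
v = poly_mul(from_roots([1, 2]), [5, 1])
w = poly_mul(from_roots([3]), [7, 1])

h, remainder = poly_divmod(poly_mul(v, w), z)
```

Here the remainder is zero exactly because $v(r) w(r) = 0$ at every root r of z; for a pair (v, w) that violates this at some root, the division leaves a non-zero remainder and no valid h exists.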
We note that under the mild assumption that d is a power of 2, h(x) ← v(x)w(x)/z(x) can be computed in time Θ(d log d) by using a polynomial multiplication followed by a polynomial division. This can be further optimized by letting c(
is the reversal of z(x), to be a part of the CRS. Then the prover essentially has to execute only two multiplications, first to compute a(x) = v(x)w(x), and then to compute h(
The cryptographic part of the prover's computation is dominated by 8 Θ(d)-wide multiexponentiations. One can use Pippenger's multiexponentiation algorithm [Pip80] 
where $\alpha$ is the largest exponent. We expect that the cryptographic part will dominate the prover's computation when $d = \Omega(2^{\sqrt{\log_2 q}})$, and in practice even sooner.
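To make the notion of a wide multiexponentiation concrete, the sketch below evaluates $\prod_i g_i^{e_i}$ with a windowed bucket method in the spirit of Pippenger's algorithm. It is only an illustration under stated assumptions: the group is the multiplicative group modulo a toy modulus rather than a pairing-friendly group, the function name and the `window` parameter are hypothetical, and the exact variant analysed in [Pip80] may differ.

```python
# Bucket-method multi-exponentiation sketch (Pippenger-style).
# The group (integers modulo `modulus`), the function name and the default
# window width are assumptions made for illustration only.


def multiexp(bases, exps, modulus, window=4):
    """Return prod(bases[i] ** exps[i]) mod modulus for non-negative exponents."""
    max_bits = max((e.bit_length() for e in exps), default=0)
    num_windows = (max_bits + window - 1) // window
    mask = (1 << window) - 1
    result = 1
    # Process exponent digits from the most significant window downwards.
    for win in range(num_windows - 1, -1, -1):
        for _ in range(window):
            result = result * result % modulus
        # buckets[d] collects the product of all bases whose current digit is d.
        buckets = [1] * (1 << window)
        for g, e in zip(bases, exps):
            digit = (e >> (win * window)) & mask
            if digit:
                buckets[digit] = buckets[digit] * g % modulus
        # Running products turn the buckets into prod(buckets[d] ** d)
        # using only multiplications.
        running, acc = 1, 1
        for digit in range(mask, 0, -1):
            running = running * buckets[digit] % modulus
            acc = acc * running % modulus
        result = result * acc % modulus
    return result


modulus = 2**61 - 1
bases = [3, 5, 7, 11, 13]
exps = [123456789, 987654321, 42, 0, 2**40 + 17]
fast = multiexp(bases, exps, modulus)
```

The point of the bucket step is that each base is touched once per window regardless of the digit value, which is where the savings over computing every power separately come from.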
3 We refer to [GGPR12] for a detailed analysis of the computational complexity issues that surround polynomial interpolation.

On the other hand, in the construction of [GGPR12], the verifier's computational complexity is linear in the statement length. Because of the use of universal circuits, in the adaptively sound case, the statement length is linear in the circuit size $n = |C|$. In the new argument, the verifier's computational complexity is dominated by 11 invocations of the bilinear pairings. The main reason behind this difference is that in the argument of [GGPR12], the verifier has to compute the coefficients $a_i$ for all $i \in [n_0]$. In the new argument, this is not necessary due to the use of "non-weak" wire checkers and the extraction technique of Lem. 2 that does not require the gate checkers to be conscientious.

Further Optimizations. We can use the result of Bitansky et al. [BCCT13] that says that any SNARK with preprocessing can be transformed into a SNARK with no preprocessing. A related result holds for NIZK arguments. We refer to [GGPR12] for a description of a number of other optimizations that all apply also to the new argument.
Proof of Theorem 8
Before the soundness proof we prove two technical lemmas. The first lemma basically says that the honest verifier will be unconditionally convinced if the prover sends her the actual vectors $v = \sum_{i \in [m]} a_i v_i$ and $w = \sum_{i \in [m]} b_i w_i$ such that the (pure) circuit checker will accept. Moreover, one can extract the whole witness from this proof. A similar lemma was proven in [GGPR12] only in the case where all gate checkers are conscientious. This allowed one to extract the value of every wire uniquely. Our proof, which does not assume this property, extracts the values recursively (and possibly non-deterministically).
Lemma 2. Let C be an n-gate circuit. Let c Λ (C) = (0, 0, V, W, ψ), with V = (v 0 , . . . , v m ) and W = (w 0 , . . . , w m ), be a pure circuit checker for the circuit C. Let π = (v, w) be such that
Then, π implies unconditionally that there exists a witness u ∈ {0, 1} n such that C(u) = 1. Such u can be extracted from π.
Proof. Let $a$, $b$ be such that $v = \sum_{i \in [m]} a_i v_i$ and $w = \sum_{i \in [m]} b_i w_i$. We now show how to construct a witness u such that C(u) = 1. The construction is bottom-up recursive on the circuit. We note that if the three requirements hold, then the wire checker implies that no wire $\eta$ gets a double assignment.
First, let $\eta$ be one of the wires starting from some input gate $\iota$. Since the output gate of $\eta$ can have a non-conscientious gate checker, $\eta$ might have no assigned value. However, recall that the wire checker checks the consistency of $\eta$ with all other wires that start from $\iota$. Since this wire checker accepts, all those wires have been assigned either $u_\eta$ (for an unambiguous bit $u_\eta \in \{0, 1\}$) or no value; no wire has got the assignment $\bar{u}_\eta$. If some output wire of $\iota$ got the assignment $u_\eta$, we assign $u_\eta$ to all output wires of $\iota$. Otherwise, we pick some value $u_\eta \in \{0, 1\}$, and assign this $u_\eta$ to all output wires of $\iota$. Since the wire was originally unassigned, the value of the output gate of the unassigned wire $\eta$ does not depend on the particular assignment, and thus the given assignment $u_\eta$ is consistent with the gate checkers of the output gates.
Consider now some internal gate $\iota$. Assume that it has $t_0$ incoming edges $\eta_j$ that start from some gates $\iota_j$. By recursive construction, the gate checkers of $\iota_j$ have assigned a value to the wires $\eta_j$. (This is true since every gate implements a function, and therefore the corresponding gate checker must only accept for one possible value of the output wire. In the case this is the input wire, the assignment was done in the previous paragraph.) However, since the gate checker for $\iota$ might not be conscientious, the gate checker $P = P(\iota)$ of $\iota$ might not have assigned any value to these wires (that is, the corresponding coefficients $a_i$ and $b_i$ are zero). In this case, given the values of all other wires that have assignments, the output of $\iota$ does not depend on the values of the unassigned input wires. We can then assign arbitrary values to the unset coefficients $a_i$ and $b_i$, and in particular we can assign values that are consistent with the output values of all $\iota_j$. This in particular also assigns unequivocal values $u_{\eta_j}$ to all wires $\eta_j$.
The total witness is defined as the concatenation of u η for all wires.
Clearly, this lemma can be restated in the language of polynomial circuit checkers. For a set P of polynomials, let span P be their span (that is, the set of $\mathbb{F}$-linear combinations). In particular, $v$ is in the span of vectors $v_i$, $v = \sum_i a_i v_i$, iff the corresponding interpolated polynomial $v(X)$ is in the span of polynomials $v_i(X)$, that is, $v(X) = \sum_i a_i v_i(X)$. Thus, in Lem. 2, the corresponding requirements will be (i)
For the next lemma and the main theorem (Thm. 8) we need the standard multilinear universal hash function family [GMS74,CW79,WC81] ML :
Lemma 3. Let m and d be two positive integers. For ML :
Then for any x 1 ∈ F d+2 \ span P, x 2 ∈ F d+2 \ span(P ∪ {x 1 }), and any y 1 , y 2 ∈ F,
Proof. The key $k$ is drawn completely at random, except that it has to satisfy $k \cdot v = 0$ for $v \in P$. The only other thing which is known about $k$ is the value $k \cdot x_1$. Since $x_2 \notin \mathrm{span}(P \cup \{x_1\})$, the inner product $k \cdot x_2$ looks completely random.
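The argument in this proof can be checked exhaustively on a tiny instance. The sketch below assumes that the hash is the inner product $k \cdot x$ used in the proof, and it reads the lemma as saying that $k \cdot x_2$ is uniform even conditioned on $k \cdot x_1$; the field size, the dimension and the concrete vectors are assumptions picked only to keep the enumeration small.

```python
# Exhaustive check of the conditional-uniformity argument on a toy instance.
# Field size, dimension and all concrete vectors are illustrative assumptions.
from itertools import product

Q = 3    # toy prime field F_3
DIM = 3  # vectors live in F_3^3


def dot(a, b):
    """Inner product over F_Q."""
    return sum(x * y for x, y in zip(a, b)) % Q


P_SET = [(1, 0, 0)]   # the key must be orthogonal to every vector in P_SET
x1 = (0, 1, 0)        # not in span(P_SET)
x2 = (1, 1, 1)        # not in span(P_SET + [x1])
x_bad = (1, 2, 0)     # lies inside span(P_SET + [x1])

# All keys k with k . p = 0 for every p in P_SET.
keys = [k for k in product(range(Q), repeat=DIM)
        if all(dot(k, p) == 0 for p in P_SET)]


def conditional_counts(x_a, x_b):
    """For each y1, histogram of k . x_b over the keys with k . x_a = y1."""
    table = {}
    for y1 in range(Q):
        hist = [0] * Q
        for k in keys:
            if dot(k, x_a) == y1:
                hist[dot(k, x_b)] += 1
        table[y1] = hist
    return table


uniform = conditional_counts(x1, x2)       # every histogram is flat
dependent = conditional_counts(x1, x_bad)  # value is determined by k . x1
```

For the vector outside the span every conditional histogram is flat, while for the vector inside the span the hash value is fully determined by $k \cdot x_1$, which is the dichotomy the proof relies on.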
We are now ready to prove the soundness of the new NIZK argument.
Proof (Of Thm. 8).
Only within this proof, we will implicitly use the canonical isomorphism between polynomials $g(X) = \sum_{j=0}^{d+1} g_j X^j$ and their coefficient vectors $g = (g_0, \dots, g_{d+1})$ from $\mathbb{F}^{d+2}$. (In most of the current paper, $v(X)$ denotes the interpolated polynomial obtained from $v(r_i) = v_i$. This is not the case in this proof.) For a polynomial $g(X)$, let $\mathrm{cf}(g(X))$ be the coefficient of $X^{d+1}$ in $g(X)$.
Soundness: assume that there exists an adversary A that succeeds in breaking the soundness of the argument from Sect. 9. We show how to construct an adversary B, which interacts with A and breaks the (d + 1, 2d + 3)-PDH assumption.
Let $UC_n : \{0, 1\}^n \to \{0, 1\}$ be the universal circuit for circuits of size $|C_{bnd}|$, where $|C| = n$, which has a polynomial circuit checker $c_\Lambda(UC_n)$ of degree d. Suppose that B receives a $(d + 1, 2d + 3)$-PDH challenge
B computes UC n and associated parameters. He generates a random α ← F. He generates β v , β w , and γ indirectly in terms of their representations over the power basis {σ j }, so that he can generate the CRS despite only knowing these values implicitly.
To generate β v , B generates a random key for ML − ,
(Thus, $\beta_v$ depends on $c_\Lambda(UC_n)$.) Note that B cannot create an encoding of $\beta_v$. We will deal with this issue later.
The generation of β w is analogous. Let 
, since these two terms have a zero coefficient for σ d+1 and, due to deg σ
∈ ch, B can compute the rest of the crs and vcrs as in Sect. 9. B provides crs and vcrs to A.
Assume that A(crs) generates an argument π of a false statement C that passes the verification. From the verification equations and the fact that the image of the encoding is verifiable, the argument must have . Similarly, he obtains degree-d polynomials w * (X) (with w * (X) = w(X) + r w z(X) if the prover is honest) and h * (X) (with h * (X) = h(X) + r v · (w 0 (X) + w(X)) + r w · (v 0 (X) + v(X)) + r v r w z(X) if the prover is honest). Since π verifies, we have that
, and -the last term of the proof properly encodes β v v(σ) + β w w(σ).
Since π is an argument for a false statement, Lem. 2 (more precisely, its polynomial reinterpretation) implies that at least one of the following two cases must hold:
We recall that z(X) is a mapping of the all-zero vector (0, . . . , 0), and thus when applying Lem. 2, we can omit mentioning z(X). We now show that in either case, B can solve the (d + 1, 2d + 3)-PDH problem. Suppose that Case 1 holds. Then
is a non-zero polynomial of degree ≤ 2d having σ as a root. B uses an efficient polynomial factorization algorithm to find ≤ 2d roots σ * i of f (X) over F, and then finds by exhaustive search an index i such that
. Thus, he has broken the (d + 1, 2d + 3)-PDH assumption.
5 We remark that [GGPR12] used a different proof technique here that did not require the use of polynomial factorization, but resulted in the (λ + 1, 2λ)-PDH assumption for λ ≥ 2d − 1. We could use the same technique, but we think that weakening the assumption is worth the extra step in reduction.
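The Case 1 step, in which B finds the roots of $f(X)$ and then identifies $\sigma$ among them, can be sketched on toy parameters as follows. Several things here are assumptions rather than the construction of the paper: the encoding is modelled as $x \mapsto g^x$ in a small prime-order subgroup, exhaustive evaluation stands in for an efficient factorization algorithm, and a candidate root is recognized by comparing its encoding with an encoding of $\sigma$, since the exact test is not spelled out in the text above.

```python
# Toy sketch of the Case 1 reduction step: recover sigma from a non-zero
# polynomial f with f(sigma) = 0. All parameters and helper names are
# illustrative assumptions.
Q_MOD = 23   # toy group modulus
P_ORD = 11   # prime order of the subgroup generated by G; exponents are in F_11
G = 2        # 2 has order 11 modulo 23


def encode(x):
    """Model of the encoding [[x]] as G^x in the toy group."""
    return pow(G, x % P_ORD, Q_MOD)


def roots(f):
    """All roots of f (coefficients lowest degree first) in F_11.

    Exhaustive evaluation replaces polynomial factorization in this toy field.
    """
    return [x for x in range(P_ORD)
            if sum(c * pow(x, i, P_ORD) for i, c in enumerate(f)) % P_ORD == 0]


def recover_sigma(f, enc_sigma):
    """Return the root of f whose encoding matches enc_sigma, if any."""
    for r in roots(f):
        if encode(r) == enc_sigma:
            return r
    return None


secret_sigma = 7                  # known only to the challenger
challenge = encode(secret_sigma)  # what the adversary B gets to see
f = [10, 1, 1]                    # X^2 + X + 10 = (X - 3)(X - 7) over F_11

sigma = recover_sigma(f, challenge)
d = 2
missing_power = encode(pow(sigma, d + 1, P_ORD))  # encoding of sigma^(d+1)
```

Once $\sigma$ is known in the clear, B can compute an encoding of any power of $\sigma$, including the one withheld in the PDH challenge.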
Suppose that Case 2 holds. W.l.o.g., suppose that v * (X) cannot be expressed as a linear combination of
(We also note that $z(X)$ is an interpolation of the all-zero vector.) The only information that $E_A$ has about $K_v$ is (i) that $K_v \in V(P_v)$ and thus $ML^-_{K_v}(p) = 0$ for $p(X) \in P_v$, and (ii) the value $ML^-_{K_v}(\sigma^*) = \beta_v$.
By Lemma 3, since $v^*(X) \notin \mathrm{span}\, P_v$, $v^*(X) \neq \sigma^*(X)$, and $K_v \in V(P_v)$, the value ML
) is uniformly random. Thus, the coefficient of $\sigma^{d+1}$ in $\beta_v v^*(\sigma) + \beta_w w^*(\sigma)$ is uniformly random, regardless of the choice of $w^*(X)$. With probability $1 - 1/|\mathbb{F}|$, this coefficient is non-zero. Assume now that this is the case. Then, due to Eq. (2) (and the choice of $\beta_w$), $\pi_y$ encodes an element
, and, with probability 1 − (d − 1)/q, y(σ) has a non-zero coefficient for σ d+1 . Since all coefficients of y are known to B (for example,
, and thus B can compute the coefficients of β v v * (σ) from K v and v * ), he can subtract off encodings of multiples of the other powers of σ (given in the (d + 1, 2d + 3)-PDH instance) to obtain an encoding of a non-zero multiple of σ d+1 , from which B can obtain an encoding of σ d+1 , solving the (d + 1, 2d + 3)-PDH problem.
Argument of knowledge:
The argument of knowledge property follows from the extraction of the polynomials $v^*(X)$, $w^*(X)$ and $h^*(X)$, as described above, and Lemma 2.

1) $- t^* \le t^* \cdot (t_0 \cdot (n - n_0) - (n - n_1)) - t^* = (t_0 - 1) t^* \cdot n + (n_1 - 1 - t_0 \cdot n_0) t^*$.
Thus, this operation adds $(t_0 - 1) t^* \cdot n + (n_1 - 1 - t_0 \cdot n_0) t^* = t^* \cdot n - 2 t^* n_0$ gates to the circuit. (The last equality is true only when $t_0 = 2$ and $n_1 = 1$, but we assume that the circuit has fan-in ≤ 2 anyway.) Thus, the total number of the gates in $C_{bnd}$ will be $(1 + t^*) \cdot n - 2 t^* n_0$. Recall from Sect. · (n − 2n 0 )t * , where $m^* = 6$ is an upper bound on the size of the span programs for the individual gates (NAND, AND, OR, XOR, NOT, and fork), thus at most $6 \cdot (n - n_0) + 2(t + 1) \cdot (n - 2 n_0) t^* = (8 + 4 t^*) \cdot n - (10 + 8 t^*) \cdot n_0$ by using SP (c∧) and SP (c t Y ) from Sect. 3. Analogously, the dimension of the span program is at most $3 \cdot (n - n_0) + (t + 1) \cdot (n - 2 n_0) t^* = (4 + 2 t^*) \cdot n - (5 + 4 t^*) \cdot n_0$.

Clearly, each vector in the span program for agc (aside from the target vector) has only a small constant number of non-zero coefficients, since the vectors in the span program for agc are inherited from the small span programs for the individual gates of $C_{bnd}$. More precisely, the number of non-zero entries of the aggregate gate checker is supp(SP (c∧)) $\cdot (n - n_0)$ + supp(SP (c t Y )) $\cdot (n - 2 n_0) t^* = 7 \cdot (n - n_0) + (2t + 2) \cdot (n - 2 n_0) t^* = (9 + 4 t^*) \cdot n - (11 + 8 t^*) \cdot n_0$.
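The three closed forms above (size, dimension and number of non-zero entries) all rest on the identity $(t + 1) t^* = 1 + 2 t^*$ for $t^* = 1/(t - 1)$. The following sketch re-checks them with exact rational arithmetic over a small grid of parameters; the function names and the parameter ranges are assumptions made only for this check.

```python
# Sanity check of the aggregate gate checker bounds with exact arithmetic.
# Function names and the tested parameter ranges are illustrative assumptions.
from fractions import Fraction


def expanded_forms(n, n0, t):
    """Size, dimension and support as written before simplification."""
    t_star = Fraction(1, t - 1)
    extra = (n - 2 * n0) * t_star
    size = 6 * (n - n0) + 2 * (t + 1) * extra
    dim = 3 * (n - n0) + (t + 1) * extra
    support = 7 * (n - n0) + (2 * t + 2) * extra
    return size, dim, support


def closed_forms(n, n0, t):
    """The simplified expressions in n and n0."""
    t_star = Fraction(1, t - 1)
    size = (8 + 4 * t_star) * n - (10 + 8 * t_star) * n0
    dim = (4 + 2 * t_star) * n - (5 + 4 * t_star) * n0
    support = (9 + 4 * t_star) * n - (11 + 8 * t_star) * n0
    return size, dim, support


all_match = all(
    expanded_forms(n, n0, t) == closed_forms(n, n0, t)
    for t in range(2, 8)
    for n in range(10, 60, 7)
    for n0 in range(1, 5)
)

# One concrete data point: n = 100 gates, n0 = 10 inputs, t = 3.
example = expanded_forms(100, 10, 3)
```

For $t = 3$ the coefficients become $10 n - 14 n_0$, $5 n - 7 n_0$ and $11 n - 15 n_0$, respectively.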
C Proof of Thm. 4 (Parameters of Aggregate Wire Checker)
Proof. Let $C_{bnd}$ be the circuit after we have applied Thm. 1 to it. Let again $t^* = 1/(t - 1)$. First, note that in the original circuit, for every gate $i \in [n - 1]$, the wire checker corresponding to its output wire has $D(\eta) = \deg^+(i) + 1$. If $\deg^+(i) > t$, Thm. 1 builds a t-ary inverse tree on top of the ith node. This tree adds $(\deg^+(i) - 1) t^* - 1$ new vertices. Let $\Gamma(i)$ be the set of vertices induced by the original vertex i (including i itself); then $|\Gamma(i)| = (\deg^+(i) - 1) t^*$. Every node of $\Gamma(i)$ corresponds to one wire checker which is induced by the vertex i in circuit $C_{bnd}$.
To compute size and dimension of the part of the aggregate wire checker of the gates induced by a concrete gate i, we note that the corresponding D values have in total 
1 − 1 =t * · (2(n − n 0 ) − (n − 1) − 1) = t * n − 2t * n 0 .
Thus,
D j ≤ 3n − 2n 0 − 1 + 2(t * n − 2t * n 0 ) = (3 + 2t * )n − 2(1 + 2t * )n 0 − 1 .
The size of the aggregate wire checker is upper-bounded by
$$2 \sum_{i=1}^{n-1} \sum_{j \in \Gamma(i)} D_j = (6 + 4 t^*) n - (4 + 8 t^*) \cdot n_0 - 4 .$$
This value is minimal for large t, and maximal when t is small. If t = 3, then size $Q_{awc} \le 8n - 8 n_0 - 4$. Similarly,
$$\mathrm{sdim}\, Q_{awc} = \sum_{i=1}^{n-1} \sum_{j \in \Gamma(i)} (2 D_j - 1) \le (6 + 4 t^*) n - (4 + 8 t^*) \cdot n_0 - 4 .$$
(Here, as in the case of degree we have omitted lower order terms.) This value is minimal when t is large. If t = 3, then sdim Q awc ≤ 8n − 8n 0 − 4. Degree:
$D_j \le (3 + 2 t^*) n - (2 + 4 t^*) n_0 - 1$.

of degree less than n at n points in R can be performed using at most $(\frac{11}{2} M(n) + O(n)) \log n$ or $O(M(n) \log n)$ operations in R.

Fact 5 (Fast interpolation, Cor. 10.12 from [GG03]) Polynomial interpolation over a (commutative) ring R can be solved by using at most $(\frac{13}{2} M(n) + O(n)) \log n$ or $O(M(n) \log n)$ operations in R.
