Abstract. In this paper we study the role of cliquewidth in succinct representation of Boolean functions. Our main statement is the following: Let $Z$ be a Boolean circuit having cliquewidth $k$. Then there is another circuit $Z^*$ computing the same function as $Z$, having treewidth at most $18k+2$ and at most $4|Z|$ gates, where $|Z|$ is the number of gates of $Z$. In this sense, cliquewidth is not more 'powerful' than treewidth for the purpose of representation of Boolean functions. We believe this is quite a surprising fact because it contrasts with the situation for graphs, where an upper bound on the treewidth implies an upper bound on the cliquewidth but not vice versa. We demonstrate the usefulness of the new theorem for knowledge compilation. In particular, we show that a circuit $Z$ of cliquewidth $k$ can be compiled into a Decomposable Negation Normal Form (dnnf) of size $O(9^{18k} k^2 |Z|)$ within the same runtime. To the best of our knowledge, this is the first result on efficient knowledge compilation parameterized by cliquewidth of a Boolean circuit.
Introduction
Cliquewidth is a graph parameter, probably best known for its role in the design of fixed-parameter algorithms for graph-theoretic problems [2]. In this context the most interesting property of cliquewidth is that it is 'stronger' than treewidth in the following sense: if all graphs in some (infinite) class have treewidth bounded by some constant $c$, then the cliquewidth of the graphs of this class is also bounded by a constant $O(2^c)$. However, the opposite is not true. Consider, for example, the class of all complete graphs. The treewidth of this class is unbounded while the cliquewidth of any complete graph is at most 2.
In this paper we show that, roughly speaking, the cliquewidth of a Boolean function is not a stronger parameter than its treewidth. In particular, given a Boolean circuit $Z$, we define its cliquewidth as the cliquewidth of the DAG of this circuit and its treewidth as the treewidth of the undirected graph underlying this DAG. The main theorem of this paper states that for any circuit $Z$ of cliquewidth $k$ there is another circuit $Z^*$ computing the same function whose treewidth is at most $18k + 2$ and whose number of gates is at most 4 times the number of gates of $Z$. Moreover, if $Z$ is accompanied by the respective clique decomposition then such a circuit $Z^*$ (and a tree decomposition of width $18k + 2$) can be obtained in time $O(k^2 n)$. The definition of circuit treewidth is taken from [14] and the definition of circuit cliquewidth naturally follows from the treewidth definition. In fact, the relationship between circuit treewidth and cliquewidth is posed in [14] as an open question.
We demonstrate that the main theorem is useful for knowledge compilation, that is, compact representation of Boolean functions that allows certain queries about the considered function to be answered efficiently. In particular, we show that any circuit $Z$ of cliquewidth $k$ can be compiled into decomposable negation normal form (dnnf) [3] of size $O(9^{18k} k^2 |Z|)$ (where $|Z|$ is the number of gates) by an algorithm with the same runtime. To the best of our knowledge, this is the first result on space-efficient knowledge compilation parameterized by cliquewidth. We believe this result is interesting because parameterization by cliquewidth, compared to treewidth, captures a wider class of inputs, including circuits whose underlying graphs are dense.
This bound is obtained as an immediate corollary of the main theorem and the $O(9^t t^2 |Z|)$ bound on the dnnf size for a given circuit $Z$, where $t$ is the treewidth of $Z$. The intermediate step for the latter result is an $O(3^p (|C| + n))$ bound on the dnnf size of a given cnf, where $|C|$ and $n$ are, respectively, the number of clauses and variables of this cnf and $p$ is the treewidth of its incidence graph. All three bounds significantly extend the existing bound $O(2^r n)$ of [3], where $r$ is the treewidth of the primal graph of the given cnf. For example, if the given cnf has large clauses (and hence a large treewidth of the primal graph) then the $O(2^r n)$ bound becomes practically infeasible, while the $O(3^p (|C| + n))$ bound may still be feasible provided that the treewidth of the incidence graph is small and the number of clauses depends polynomially on $n$.
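To illustrate the gap between the two parameters, consider as a hypothetical example (an assumption made here for illustration, not taken from the cited works) a cnf consisting of a single clause over all $n$ variables. Its primal graph is the complete graph on $n$ vertices, so $r = n - 1$, while its incidence graph is a star, so $p = 1$ and $|C| = 1$. Ignoring the constants hidden in the $O$-notation, the two bounds compare as follows:

```latex
\[
2^{r}\, n \;=\; 2^{\,n-1}\, n
\qquad\text{versus}\qquad
3^{p}\,(|C| + n) \;=\; 3\,(n + 1).
\]
```

For $n = 20$ the left-hand side is $10\,485\,760$ while the right-hand side is $63$.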
Related Work
The algorithmic power of cliquewidth stems from the meta-theorem of [2] stating that any problem definable in Monadic Second Order Logic (MSO$_1$) can be solved in linear time for a class of graphs of fixed cliquewidth $k$. The cliquewidth of a given graph is NP-hard to compute [8] and computing it is not known to be FPT. On the other hand, cliquewidth is FPT approximable by an FPT computable parameter called rankwidth [13, 11]. As said above, there are classes of graphs with unbounded treewidth and bounded cliquewidth. However, it has been shown in [10] that the only reason for treewidth to be much larger than cliquewidth is the presence of a large complete bipartite graph (biclique) in the considered graph. In fact, we prove the main theorem of this paper by applying a transformation that eliminates all bicliques from the DAG of the given circuit.
dnnfs have been introduced as a knowledge compilation formalism in [3], where it has been shown that any cnf on $n$ variables whose primal graph has treewidth $t$ can be compiled into a dnnf of size $O(2^t t n)$ with the same runtime. A detailed analysis of special cases of dnnf has been provided in [6]. In particular, it has been shown that Free Binary Decision Diagrams (fbdd) and hence Ordered Binary Decision Diagrams (obdd) can be seen as special cases of dnnf. In fact, there is a separation between dnnf and fbdd [4]. This additional expressive power of dnnf has its disadvantages: a number of queries that can be answered in polynomial time (polytime) for fbdd and obdd are NP-complete for dnnf [6]. This trade-off led to the investigation of subclasses of dnnf that, on the one hand, retain the succinctness of dnnf for cnfs of small treewidth and, on the other hand, support a larger set of queries that can be answered in polytime. Probably the most notable result obtained in this direction is Sentential Decision Diagrams (sdd) [5], which, on the one hand, can answer the equivalence query in polytime (the possibility to answer this query in polytime for obdds is probably the main reason why this formalism is very popular in the area of verification) and, on the other hand, retain the same upper bound dependence on treewidth as dnnf.
In fact, the size of an obdd can also be efficiently parameterized by the treewidth of the initial representation of the considered function. Indeed, there is an obdd of size $O(n 2^p)$ where $p$ is the pathwidth of the primal graph of the given cnf, and of size $n^{O(t)}$ where $t$ is the treewidth of this graph, see e.g. [9]. It is shown in [14] that a similar pattern holds if we consider the pathwidth and treewidth of a circuit, but in the former case $p$ is replaced by an exponential function of $p$ and in the latter case $t$ is replaced by a double exponential function of $t$.
Preliminaries
A labeled graph $G = (V, E, S)$, in addition to the usual set $V(G)$ of vertices and set $E(G)$ of edges, contains a component $S(G)$, which is a partition of $V(G)$. Each class of the partition is called a label. A simplified clique decomposition (scd) is a pair $(T, G)$ where $T$ is a rooted tree and $G$ is a family of labeled graphs. Each node $t$ of $T$ is associated with a graph $G(t)$, which is defined as follows. If $t$ is a leaf node, then $G(t) = (\{v\}, \emptyset, \{\{v\}\})$. Assume that $t$ has two children $t_1$ and $t_2$ and let $G_1 = G(t_1)$ and $G_2 = G(t_2)$. Then $V(G_1) \cap V(G_2) = \emptyset$ and $G(t) = (V(G_1) \cup V(G_2), E(G_1) \cup E(G_2), S(G_1) \cup S(G_2))$. Finally, assume that $t$ has only one child $t_1$ and let $G_1 = G(t_1)$. Graph $G(t)$ can be obtained from $G_1$ by one of the following three operations:
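-Adding a new vertex. There is $v \notin V(G_1)$ such that $G(t) = (V(G_1) \cup \{v\}, E(G_1), S(G_1) \cup \{\{v\}\})$.
-Union. There are $S_1, S_2 \in S(G_1)$ such that $G(t) = (V(G_1), E(G_1), (S(G_1) \setminus \{S_1, S_2\}) \cup \{S\})$ where $S = S_1 \cup S_2$. We say that $S_1$ and $S_2$ are children of $S$.
-New adjacency. There are $S_1, S_2 \in S(G_1)$ such that $G(t) = (V(G_1), E(G_1) \cup \{\{u, v\} \mid u \in S_1, v \in S_2\}, S(G_1))$. We say that $S_1$ and $S_2$ are adjacent.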
The width of a node $t$ of $T$ is $|S(G(t))|$. The width of $(T, G)$ is the largest width of a node of $T$. Let $r$ be the root of $T$. Then we say that $(T, G)$ is an scd of $G(r)$ and of $(V(G(r)), E(G(r)))$ (the unlabeled version of $G(r)$). The simplified cliquewidth (scw) of a graph $G$ is the smallest width among all scds of $G$. The definition of scd is closely related to the standard notion of clique decomposition. In fact, the scw of a graph $G$ is at most twice the cliquewidth of $G$. The details of the comparison are postponed to the appendix.
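The scd operations can be made concrete in code. The following Python sketch is an illustration only: the class `LabeledGraph` and the function `complete_graph_scd` are hypothetical names, and the sketch assumes that the union operation replaces two labels by their union and that new adjacency adds all edges between two labels. It builds a complete graph using only the unary operations and records the largest number of labels seen, which is the width of the resulting (path-shaped) scd.

```python
from itertools import product


class LabeledGraph:
    """A labeled graph (V, E, S): vertices, undirected edges, partition S of V."""

    def __init__(self):
        self.V = set()
        self.E = set()      # edges stored as frozenset({u, v})
        self.S = set()      # labels stored as frozensets of vertices
        self.width = 0      # largest number of labels seen so far

    def add_vertex(self, v):
        # Adding a new vertex: v becomes a new singleton label.
        assert v not in self.V
        self.V.add(v)
        self.S.add(frozenset([v]))
        self.width = max(self.width, len(self.S))

    def new_adjacency(self, s1, s2):
        # New adjacency: add every edge between label s1 and label s2.
        assert s1 in self.S and s2 in self.S and s1 != s2
        self.E |= {frozenset([u, v]) for u, v in product(s1, s2)}

    def union(self, s1, s2):
        # Union: s1 and s2 become the children of the new label s1 | s2.
        assert s1 in self.S and s2 in self.S and s1 != s2
        self.S -= {s1, s2}
        self.S.add(s1 | s2)
        return s1 | s2


def complete_graph_scd(n):
    """Build the complete graph on vertices 0..n-1 with at most 2 labels at any time."""
    g = LabeledGraph()
    g.add_vertex(0)
    big = frozenset([0])
    for v in range(1, n):
        g.add_vertex(v)
        new = frozenset([v])
        g.new_adjacency(big, new)   # connect v to all earlier vertices
        big = g.union(big, new)     # merge back into a single label
    return g
```

Under these assumptions, `complete_graph_scd(n)` never holds more than two labels at once, which matches the remark in the introduction that complete graphs have small cliquewidth.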
Clique decomposition and scd are easily extended to the directed case. In fact, the notion of cliquewidth was initially proposed for the directed case, as noted in [7]. The only change is that the new adjacency operation adds to $G(t)$ all possible directed arcs from label $S_1$ to label $S_2$ instead of undirected edges. In this case we say that there is an arc from $S_1$ to $S_2$.
We denote $\bigcup_{t \in V(T)} S(G(t))$ by $\mathcal{S} = \mathcal{S}(T, G)$ and call it the set of labels of $(T, G)$.
A tree decomposition of a graph $G$ is a pair $(T, B)$ where $T$ is a tree and the elements of $B$ are subsets of vertices called bags. There is a mapping between the nodes of $T$ and the elements of $B$. We say that a vertex $v$ of $G$ is contained in a node $t$ of $T$ if $v$ belongs to the bag $B(t)$ of $t$. The three properties of a tree decomposition are connectedness (all the nodes containing a given vertex $v$ form a subtree of $T$), adjacency (each edge $\{u, v\}$ is a subset of some bag), and union (the union of all bags is $V(G)$). The width of a tree decomposition is the largest bag size minus one, and the treewidth of $G$ is the smallest width of a tree decomposition of $G$. In this paper we consider the treewidth of a directed graph as the treewidth of the underlying undirected graph.
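The three properties are easy to check mechanically. The Python sketch below is illustrative only; the function names and the input encoding (a dictionary from tree nodes to bags and a list of tree edges) are assumptions made here. It also uses the standard convention that the width is the largest bag size minus one.

```python
def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check union, adjacency and connectedness.

    bags: dict mapping each tree node to a set of graph vertices.
    tree_edges: list of (node, node) pairs; assumed to form a tree on the nodes.
    """
    # Union: every vertex of the graph appears in some bag.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Adjacency: every edge is a subset of some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Connectedness: the nodes containing a vertex induce a connected subtree.
    neighbours = {t: set() for t in bags}
    for a, b in tree_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for s in neighbours[t] & nodes:
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1
```

For instance, the path a-b-c-d with bags {a, b}, {b, c}, {c, d} arranged along a path of three tree nodes passes all three checks and has width 1.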
Boolean circuits considered in this paper are over the basis $\{\vee, \wedge, \neg\}$. In such a circuit there are input gates (having only output wires) corresponding to variables and the constants true and false. The output of each gate of a circuit $Z$ computes a function on the set of input variables. We denote by $functions(Z)$ the set of all functions computed by the gates of $Z$. The number of gates of $Z$ is denoted by $|Z|$.
A clique or tree decomposition of a circuit $Z$ is the respective decomposition of the DAG of $Z$. In our discussion, we often identify the vertices of the DAG with the respective gates. De Morgan circuits are a subclass of circuits where the inputs of all the not gates are variables (i.e. the outputs of not gates serve as negative literals). For a gate $g$ of $Z$, denote by $Var(g)$ the set of variables having a path to $g$ in the DAG of $Z$. A circuit $Z$ has the decomposability property if for any two in-neighbors $g_1$ and $g_2$ of an and gate $g$, $Var(g_1) \cap Var(g_2) = \emptyset$. A dnnf is a decomposable De Morgan circuit. When we consider a general circuit $Z$, we assume that it does not have constant input gates, since constants can be propagated by removing some gates of $Z$, which increases neither the cliquewidth nor the treewidth of the circuit. However, for convenience of reasoning, we may use constant input gates when we describe the construction of a dnnf. If the given circuit $Z$ is a cnf then its variable-clause relation can be represented by the incidence graph, a bipartite graph with parts corresponding to variables and clauses, in which a variable-clause edge represents the occurrence of the variable in the clause.
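The definitions of $Var(g)$, decomposability and De Morgan circuits translate directly into code. The following Python sketch is illustrative only; the circuit encoding (a dictionary from gate names to a pair of kind and inputs) and all function names are assumptions made here.

```python
def var_sets(gates):
    """Compute Var(g) for every gate.

    gates: dict name -> (kind, inputs) with kind in {'var', 'and', 'or', 'not'}.
    """
    memo = {}

    def var(g):
        if g not in memo:
            kind, inputs = gates[g]
            if kind == 'var':
                memo[g] = frozenset([g])
            else:
                memo[g] = frozenset().union(*(var(h) for h in inputs))
        return memo[g]

    return {g: var(g) for g in gates}


def is_decomposable(gates):
    """Any two in-neighbors of an and gate must depend on disjoint variable sets."""
    vs = var_sets(gates)
    for kind, inputs in gates.values():
        if kind != 'and':
            continue
        for i in range(len(inputs)):
            for j in range(i + 1, len(inputs)):
                if vs[inputs[i]] & vs[inputs[j]]:
                    return False
    return True


def is_de_morgan(gates):
    """The input of every not gate must be a variable."""
    return all(gates[h][0] == 'var'
               for kind, inputs in gates.values() if kind == 'not'
               for h in inputs)


def is_dnnf(gates):
    return is_de_morgan(gates) and is_decomposable(gates)
```

For example, the circuit computing $x \wedge y$ is a dnnf, while $x \wedge \neg x$ is a De Morgan circuit that violates decomposability.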
From small cliquewidth to small treewidth
The central result of this section is the following theorem.
Theorem 1. Let $F$ be a circuit of cliquewidth $k$ over $n$ variables. Then there is a circuit $F^*$ of treewidth at most $18k+2$ with $|F^*| \le 4|F|$ such that $functions(F) \subseteq functions(F^*)$. Moreover, given $F$ and a clique decomposition of $F$ of width $k$, there is an $O(k^2 n)$ algorithm constructing $F^*$ and a tree decomposition of $F^*$ of width at most $18k + 2$ having at most $2|F|$ bags.
The rest of this section is the proof of Theorem 1. The main idea of the proof is to replace 'parts' of the given circuit forming large bicliques by circuits computing equivalent functions where such bicliques do not occur. As an example consider a cnf of 3 clauses
The graph of this circuit contains a biclique of order 3 formed by $C_1, C_2, C_3$ on one side and $a_1, a_2, a_3$ on the other. This biclique can be eliminated by introducing an additional or gate $C_4$ having inputs $a_1, a_2, a_3$ and output $C_4$ so that the clauses
, respectively. It is not hard to see that the new circuit computes the same function as the original one. This is the main idea behind the construction of the circuit $F^*$. The formal description of the construction is given below.
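The following Python sketch illustrates the same idea on hypothetical clauses $C_i = a_1 \vee a_2 \vee a_3 \vee b_i$ for $i = 1, 2, 3$. The variables $b_i$, the function names and the wire counts are assumptions made for this illustration. The sketch checks by brute force that routing $a_1, a_2, a_3$ through one additional or gate $C_4$ preserves the function, and compares the number of wires entering the clause level.

```python
from itertools import product


def original(a, b):
    # C_i = a1 or a2 or a3 or b_i; the cnf is the conjunction of the C_i.
    return all(a[0] or a[1] or a[2] or b[i] for i in range(3))


def rewritten(a, b):
    # C_4 = a1 or a2 or a3 is computed once; each clause becomes C_4 or b_i.
    c4 = a[0] or a[1] or a[2]
    return all(c4 or b[i] for i in range(3))


# Wires in the original circuit: each a_j feeds each C_i (the biclique),
# plus one wire b_i -> C_i per clause.
original_wires = 3 * 3 + 3
# Wires after the rewriting: a_j -> C_4, C_4 -> C_i, and b_i -> C_i.
rewritten_wires = 3 + 3 + 3

equivalent = all(
    original(a, b) == rewritten(a, b)
    for a in product([False, True], repeat=3)
    for b in product([False, True], repeat=3)
)
```

With $p$ clauses sharing $q$ variables, the same rewriting replaces the $pq$ wires of the biclique by $p + q$ wires.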
For the purpose of constructing $F^*$ we consider a type respecting scd $(T, G)$ of $F$, in which each non-singleton label is one of the following:
-A unary label containing input gates and negation gates.
-An and label containing and gates.
-An or label containing or gates.
The following lemma essentially follows from splitting each label of the given clique decomposition into three type respecting labels.
Lemma 1. Let $k$ be the cliquewidth of $F$ and let $k^*$ be the smallest width of an scd of $F$ that respects types. Then $k^* \le 6k$.
Proof. Let $(T^*, G^*)$ be an scd of $F$ having width at most $2k$ (it exists since the cliquewidth is $k$). In each graph $G' \in G^*$ split each label into at most 3 subsets so that each subset contains one type of gates as specified above. Clearly, the resulting number of labels is at most 3 times larger than the original one, that is, at most $6k$. The resulting structure is not necessarily an scd. In particular, the graph associated with a node may be the same as the graph associated with its parent because the union operation in the parent has been undone by the splitting. Also, the new adjacency operation may now apply to more than one pair of labels. However, a legal scd is easy to recover: the 'redundant' parent nodes can be removed (since they are unary, this does not cause problems with the structure of the binary tree) and each node with a multiple adjacency operation can be replaced by a sequence of nodes applying these operations one by one.
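The splitting step of the proof can be sketched in a few lines of Python. This is an illustration only: the function names and the encoding of gate kinds are assumptions, and the sketch covers only the splitting of the labels of one graph, not the recovery of a legal scd.

```python
def gate_class(kind):
    """Map a gate kind to one of the three label types: 'unary', 'and', 'or'."""
    return 'unary' if kind in ('var', 'not') else kind


def split_labels(labels, kind_of):
    """Split every label into at most three type respecting labels.

    labels: iterable of frozensets of gates.
    kind_of: dict gate -> 'var' | 'not' | 'and' | 'or'.
    """
    result = []
    for label in labels:
        parts = {}
        for g in label:
            parts.setdefault(gate_class(kind_of[g]), set()).add(g)
        result.extend(frozenset(p) for p in parts.values())
    return result
```

Each label yields at most three parts, so a graph with at most $2k$ labels ends up with at most $6k$ labels.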
Given a type respecting scd $(T, G)$, let us construct the circuit $F^*$. In the first stage, we associate each label $S \in \mathcal{S}$ with a set of gates as follows:
-If $S$ is non-singleton then it is associated with an and gate denoted by $oand(S)$ and an or gate denoted by $oor(S)$.
-If $S$ is non-singleton and does not contain input gates then it is associated with an additional gate called $in(S)$ whose type is determined as follows. If $S$ is an and label or an or label then $in(S)$ is an and gate or an or gate, respectively. If $S$ is a unary label then $in(S)$ is a circuit (perceived as a single atomic gate) consisting of two not gates, where the output of one is the input of the other. So, the input of the former and the output of the latter are, respectively, the input and output of $in(S)$.
-Each singleton label $\{g\}$ is associated with the gate $g$ of $F$.
We call the gates associated with singleton labels original gates because they are the gates of $F^*$ appearing in $F$. For the sake of uniformity, for each original gate $g$ associated with label $S$, we put $g = oand(S) = oor(S) = in(S)$.
The wires of $F^*$ are described below. When we say that there is a wire from gate $g_1$ to gate $g_2$, we mean that the wire is from the output of $g_1$ to the input of $g_2$.
-Child-parent wires. Let $S_1$ and $S_2$ be labels of $(T, G)$ such that $S_1$ is a child of $S_2$. Then there is a wire from $oand(S_1)$ to $oand(S_2)$ and a wire from $oor(S_1)$ to $oor(S_2)$.
-Parent-child wires. Let $S_1$ and $S_2$ be as above and assume that $S_2$ does not contain input gates. Then there is a wire from $in(S_2)$ to $in(S_1)$. That is, the direction of parent-child wires is opposite to the direction of child-parent wires.
-Adjacency wires. Assume that in $(T, G)$ there is an arc from $S_1$ to $S_2$ (established by the new adjacency node). Then the following cases apply:
• If $S_2$ is an and label then put a wire from $oand(S_1)$ to $in(S_2)$.
• If $S_2$ is an or label then put a wire from $oor(S_1)$ to $in(S_2)$.
• If $S_2$ is a unary label consisting of negation gates only then put a wire from an arbitrary one of $oand(S_1)$ or $oor(S_1)$ to $in(S_2)$.
Finally, we remove the $in(S)$ gates that have no inputs. This removal may be iterative, since removing one gate may leave another gate without inputs.
It is not hard to see by construction that $F$ and $F^*$ have the same input gates. This allows us to state the following theorem, whose proof is given in Section 4.1.
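The construction can be tried out on a small example. The Python sketch below is an illustration under several assumptions, not a definitive implementation: the scd is given only by its list of union operations (children listed before parents) and its adjacency arcs, labels are frozensets of gate names, the doubled not gate $in(S)$ of a unary label is modeled as a single identity gate of kind `'id'`, and all function names and the example circuit (clauses $C_i = a_1 \vee a_2 \vee a_3 \vee b_i$) are hypothetical.

```python
from itertools import product


def build_f_star(kind, unions, arcs):
    """kind: original gate -> 'var' | 'not' | 'and' | 'or'.
    unions: list of (S1, S2) label pairs merged into S1 | S2, children first.
    arcs: list of (S1, S2) adjacency arcs from label S1 to label S2.
    Returns a dict gate -> (kind, list of input gates)."""
    def label_type(S):
        k = kind[next(iter(S))]
        return 'unary' if k in ('var', 'not') else k

    def gate(prefix, S):
        # For a singleton label, oand, oor and in all coincide with the gate itself.
        return next(iter(S)) if len(S) == 1 else (prefix, S)

    gates = {g: (k, []) for g, k in kind.items()}
    for S1, S2 in unions:
        S = S1 | S2
        # Child-parent wires into oand(S) and oor(S).
        gates[gate('oand', S)] = ('and', [gate('oand', S1), gate('oand', S2)])
        gates[gate('oor', S)] = ('or', [gate('oor', S1), gate('oor', S2)])
        if all(kind[g] != 'var' for g in S):
            t = label_type(S)
            gates[gate('in', S)] = ('id' if t == 'unary' else t, [])
            for child in (S1, S2):          # parent-child wires
                gates[gate('in', child)][1].append(gate('in', S))
    for S1, S2 in arcs:                     # adjacency wires
        src = gate('oand' if label_type(S2) == 'and' else 'oor', S1)
        gates[gate('in', S2)][1].append(src)
    changed = True                          # iteratively drop in gates without inputs
    while changed:
        changed = False
        for g in list(gates):
            if isinstance(g, tuple) and g[0] == 'in' and not gates[g][1]:
                del gates[g]
                for _, inputs in gates.values():
                    if g in inputs:
                        inputs.remove(g)
                changed = True
    return gates


def evaluate(gates, assignment):
    """Evaluate every gate under an assignment of the variables."""
    memo = dict(assignment)

    def val(g):
        if g not in memo:
            k, inputs = gates[g]
            vals = [val(h) for h in inputs]
            if k == 'and':
                memo[g] = all(vals)
            elif k == 'or':
                memo[g] = any(vals)
            elif k == 'not':
                memo[g] = not vals[0]
            else:                           # 'id': two chained not gates
                memo[g] = vals[0]
        return memo[g]

    return {g: val(g) for g in gates}


fs = frozenset
kind = {v: 'var' for v in ('a1', 'a2', 'a3', 'b1', 'b2', 'b3')}
kind.update({c: 'or' for c in ('C1', 'C2', 'C3')})
A12, C12 = fs(['a1', 'a2']), fs(['C1', 'C2'])
A, C = A12 | {'a3'}, C12 | {'C3'}
unions = [(fs(['a1']), fs(['a2'])), (A12, fs(['a3'])),
          (fs(['C1']), fs(['C2'])), (C12, fs(['C3']))]
arcs = [(fs(['b%d' % i]), fs(['C%d' % i])) for i in (1, 2, 3)] + [(A, C)]
f_star = build_f_star(kind, unions, arcs)


def same_functions():
    """Check that each original gate C_i computes a1 or a2 or a3 or b_i in F*."""
    names = ['a1', 'a2', 'a3', 'b1', 'b2', 'b3']
    for bits in product([False, True], repeat=6):
        asg = dict(zip(names, bits))
        out = evaluate(f_star, asg)
        for i in (1, 2, 3):
            expected = asg['a1'] or asg['a2'] or asg['a3'] or asg['b%d' % i]
            if out['C%d' % i] != expected:
                return False
    return True
```

In this example the label $\{a_1, a_2, a_3\}$ reaches the clause gates only through $oor$ and $in$ gates, so the biclique between the $a_j$ and the $C_i$ disappears, and the resulting circuit has 19 gates, within the $4|F| = 36$ allowed by the gate count.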
Theorem 2. $F^*$ is a well formed circuit. The output of each original gate $g$ of $F^*$ computes exactly the same function (in terms of input gates) as in $F$.
In Section 4.2, we prove that the treewidth of $F^*$ is not much larger than the width of $(T, G)$.
Theorem 3. There is a tree decomposition of $F^*$ with at most $2|F|$ bags having width at most $3k + 2$, where $k$ is the width of $(T, G)$.
Now we are ready to prove Theorem 1.
Proof of Theorem 1. Due to Theorem 2, $functions(F) \subseteq functions(F^*)$. If we take $(T, G)$ to be of the smallest possible type respecting width then the treewidth of $F^*$ is at most $18k + 2$ by combination of Theorem 3 and Lemma 1. To compute the number of gates of $F^*$, let $n$ be the number of gates of $F$, which is also the number of singleton labels of $(T, G)$. Since each non-singleton label has two children (i.e. in the respective tree of labels each non-leaf node is binary), the number of non-singleton labels is at most $n - 1$. By construction, $F^*$ has one gate per singleton label plus at most 3 gates per non-singleton label, which adds up to at most $4n$.
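The two counts used in the proof can be written out explicitly. In the lines below, $k^*$ is the smallest type respecting width from Lemma 1, $n = |F|$, and $\mathrm{tw}$ is shorthand for treewidth introduced here for brevity:

```latex
\begin{align*}
\mathrm{tw}(F^*) &\le 3k^* + 2 \le 3 \cdot 6k + 2 = 18k + 2, \\
|F^*| &\le n + 3(n - 1) = 4n - 3 \le 4n .
\end{align*}
```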
The technical details of the runtime derivation are postponed to the appendix.
Proof of Theorem 2
We start by establishing simple combinatorial properties of $F^*$ (Lemmas 2, 3, 4, 5). A path in a circuit is a sequence of gates such that the output of every gate (except the last one) is connected by a wire to the input of its successor. Let us call a path a connecting path if it contains exactly one adjacency wire.
Lemma 2.
-Any path $P$ of $F^*$ starting at an original gate and not containing adjacency wires contains child-parent wires only.
-Any path $P$ of $F^*$ ending at an original gate and not containing adjacency wires contains parent-child wires only.
Proof. The only possible wire to leave the original gate is a child-parent wire. Any path starting from an original gate and containing child-parent wires only ends up in an oand or oor gate. This means that the next wire (if not an adjacency one) can be only another child-parent wire. Thus the correctness of the lemma for all the paths of length i implies its correctness for all such paths of length i + 1, confirming the first statement.
For the second statement, we start from an original gate and go backwards against the direction of the wires. Reasoning similar to that of the previous paragraph applies, with the in gates of non-singleton labels replacing the oor and oand ones.
Lemma 3. Let $g_1$ and $g_2$ be gates of $F$ such that $g_2$ is an and or an or gate. Then there is a wire from $g_1$ to $g_2$ in $F$ if and only if $F^*$ has a connecting path from $g_1$ to $g_2$ such that all the gates of this path except possibly $g_1$ are of the same type as $g_2$.
Proof. We prove only the case where g 2 is an and gate, the other case is symmetric. Let P be a connecting path of F * from g 1 to g 2 of the specified kind. Let g In particular, there is a wire from g 1 to g 2 in F . Conversely, assume that there is a wire from g 1 to g 2 in F . Then there are labels S 1 and S 2 containing g 1 and g 2 , respectively, such that (T, G) introduces an adjacency arc from S 1 to S 2 . By construction of F * there is a gate g It remains to be shown that the prefix and suffix do not intersect. However, this is impossible due to the disjointness of S 1 and S 2 .
Lemma 4. Let g 1 and g 2 be the gates of F such that g 2 is a not gate. Then F has a wire from g 1 to g 2 if and only if there is a connecting path P in F * from g 1 to g 2 with the adjacency wire (g Proof. Let P be a connecting path of F * of the specified form. Then either g ′ 2 = g 2 or g ′ 2 corresponds to a label containing g 2 . In both cases this means that F has a wire from g 1 to g 2 .
Conversely, assume that $F$ has a wire from $g_1$ to $g_2$. Then there are labels $S_1$ and $S_2$ containing $g_1$ and $g_2$, respectively, such that $(T, G)$ sets an adjacency arc from $S_1$ to $S_2$. Observe that $S_1$ cannot contain more than one element because otherwise $g_2$, a not gate, would have two inputs. Furthermore, either $S_2$ contains $g_2$ only or $S_2$ is a unary label containing negation gates only (because input gates do not have input wires). In the latter case, the desired suffix from the head of the adjacency arc to $g_2$ follows by construction.
Lemma 5. Any path of $F^*$ between two original gates that does not involve other original gates is a connecting path.
Proof. First of all, let us show that any path of $F^*$ between original gates involves at least one adjacency wire. Indeed, by Lemma 2, any path leaving an original gate and not having adjacency wires has only child-parent wires. Such wires lead only to larger and larger labels and cannot end at an original gate. It follows that at least one adjacency wire is needed.
Let us show that additional adjacency wires cannot occur without original gates as intermediate vertices. Indeed, the head of the first adjacency wire is the in gate of some label $S$. Unless $S$ is a singleton, the only wires leaving $in(S)$ are parent-child wires to the in gates of the children of $S$. Applying this argument iteratively, we observe that no wires other than parent-child wires are possible until the path meets the in gate of a singleton label. However, this is an original gate, which cannot be an intermediate node of our path. It follows that a path between two original gates without other original gates cannot involve 2 adjacency wires. Combining this with the previous paragraph, any such path involves exactly one adjacency wire, i.e. it is a connecting path.
Using the lemmas above, it can be shown that any cycle in $F^*$ involves at least one original gate and that this implies that $F$ contains a cycle as well, a contradiction showing that $F^*$ is acyclic. The technical details of this derivation are provided in the lemma below. By construction, each wire connects an output to an input and there are no gates (except the input gates, of course) having no input. It follows that $F^*$ is a well formed circuit.
Lemma 6. $F^*$ has no cycles.
Proof. Observe first that if $F^*$ has a cycle involving at least two original gates then $F$ has a cycle as well, which gives the desired contradiction. Indeed, let $g_1, \dots, g_r$ be all the original gates of the cycle in the order of their appearance. Then, according to Lemma 5, there is a connecting path between any two consecutive original gates and also between $g_r$ and $g_1$. Applying Lemma 3 or Lemma 4, depending on the type of the specific gate, we observe that in $F$ there is a wire from each $g_i$ to $g_{i+1}$ (treating $r + 1$ as $1$), that is, $F$ has a cycle, a contradiction.
Furthermore, let us observe that the existence of one original gate in a cycle implies the existence of another one. Indeed, following the argumentation in the proof of Lemma 5, we observe that to get from an original gate to an original gate (even to itself) one has to go through an adjacency wire. However, the label at the head of the adjacency wire is disjoint from the label at its tail, and thus when we start to descend through $in(S)$ gates we eventually (without closing the cycle before that, since we have not yet arrived at the initial original gate) encounter another original gate, different from the starting one. A similar argument shows that any in gate in a cycle implies the presence of an original gate in this cycle. This rules out adjacency and parent-child wires from a potential cycle and leaves us only with child-parent wires, but these are acyclic by construction since they go from a smaller label to a larger one.
In the rest of the discussion we implicitly assume that $F^*$ is well formed without explicit reference to Lemma 6.
For each gate $g$ of $F^*$ denote by $f(g, F^*)$ the function computed by the subcircuit of $F^*$ rooted at $g$. We establish properties of these functions from which Theorem 2 will follow by induction. In the following we sometimes refer to $f(g, F^*)$ as the function of $g$.
Lemma 7. Let $g$ be a not gate of $F$ and let $g'$ be its input in $F$. Then $f(g, F^*)$ is the negation of $f(g', F^*)$.
Proof. According to Lemma 4, $F^*$ has a path from $g'$ to $g$ where all gates except the first one are not gates. Since all of them but the last one are doubled, the number of such not gates is odd. Each not gate has a single input, hence the function of each gate of the path (except the first one) is the negation of the function of its predecessor. Hence these functions are, alternately, the negation of the function of $g'$ and the function of $g'$. Since the number of not gates in the path is odd, the function of $g$ is the negation of the function of $g'$, as required.
In order to establish a similar statement regarding and and or gates we need two auxiliary lemmas.
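Lemma 8. For each label $S$, $f(oand(S), F^*)$ is the conjunction of the functions of all the original gates contained in $S$, and $f(oor(S), F^*)$ is the disjunction of the functions of such gates.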
Proof. We prove the lemma only for the oand gates; for the oor gates the proof is symmetric. The proof is by induction on the size of the label. For an original gate the function is a conjunction of a single element, namely the function of the gate itself, and this is clear by construction. For a larger label $S$, it follows by construction that $f(oand(S), F^*) = f(oand(S_1), F^*) \wedge f(oand(S_2), F^*)$, where $S_1$ and $S_2$ are the children of $S$. For $S_1$ and $S_2$ the statement holds by the induction assumption. Hence, $f(oand(S), F^*)$ is the conjunction of the functions of all the original gates in the union of $S_1$ and $S_2$, that is, of all the original gates contained in $S$, as required.
Let us call a path of $F^*$ semi-connecting if it starts with an adjacency wire and the rest of its wires are parent-child ones.
Lemma 9. Let $S$ be an and label. Then $f(in(S), F^*)$ is the conjunction of the functions of all gates from which there is a semi-connecting path to $in(S)$. For an or label the statement is analogous with conjunction replaced by disjunction.
Proof. We provide the proof only for the and label; for the or label the proof is analogous with the corresponding replacements of and by or and of conjunctions by disjunctions.
The proof is by induction on decreasing size of labels. For the largest and label $S$, all the input wires of $in(S)$ are adjacency wires. Clearly, the considered function is the conjunction of the functions of the gates at the tails of these adjacency wires. It remains to check that no other gates reach $in(S)$ by semi-connecting paths. Any such gate, after passing through the adjacency wire, must meet the in gate of an ancestor of $S$ and, by the maximality assumption, $S$ has no ancestors.
The same reasoning is valid for any label $S$ without ancestors. If $S$ has ancestors, then $f(in(S), F^*)$ is the conjunction of the functions of the gates at the tails of the adjacency wires incident to $in(S)$ and the function of the in gate of the parent of $S$. By the induction assumption, this is in fact the conjunction of the functions of the gates at the tails of the adjacency wires incident to $in(S)$ together with those connected to $in(S)$ by semi-connecting paths through the parent. Since any semi-connecting path either hits $in(S)$ directly at the head of an adjacency wire or approaches it through the parent, the statement is proven.
Lemma 10. The function of any original and gate $g$ of $F^*$ is the conjunction of the functions of the original gates whose outputs are the inputs of $g$ in $F$. The same holds for an or gate with disjunction in place of conjunction.
Proof. As before, we prove the statement for the and gate; for the or gate it is analogous with the respective substitutions. By construction and Lemma 9, $f(g, F^*)$ is the conjunction of the functions of all oand gates (since there are no other ones) connected to $g$ by semi-connecting paths. Let us call the labels of these oand gates the critical labels. Combining this with Lemma 8, we see that $f(g, F^*)$ is in fact the conjunction of the functions of all original gates contained in the critical labels. It remains to show that these gates are exactly the in-neighbors of $g$ in $F$. Let us take a particular in-neighbor $g'$. By Lemma 3, there is a connecting path from $g'$ to $g$ and, by Lemma 8, the tail of the adjacency wire of this path is the oand gate of a critical label, so $g'$ is in the required set. Conversely, assume that $g'$ is a gate in the required set. Fix a critical label $S$ that $g'$ belongs to. Clearly, there is a child-parent path from $g'$ to $oand(S)$ which, together with a semi-connecting path from $oand(S)$ to $g$, makes a connecting path. The latter means that in $F$ there is a wire from $g'$ to $g$ according to Lemma 3, as required.
Proof of Theorem 2. Let us order the gates topologically and proceed by induction on the topological order. The first gate is an input gate and the function of an input gate is just the corresponding variable both in $F$ and in $F^*$. Otherwise, the gate $g$ is an and, an or, or a not gate. In the former two cases, according to Lemma 10, the function of $g$ in $F^*$ is the conjunction (or disjunction, in case of or) of the functions of its inputs in $F$, the same relation as in $F$. The theorem holds for the inputs by the induction assumption, hence the function of $g$ in $F^*$ is the same as in $F$. For a not gate, the argument is analogous, employing Lemma 7.
Proof of Theorem 3
Let us define the undirected graph $H = H(T, G)$, called the representation graph of $(T, G)$, as follows. The vertices of this graph are the labels of $(T, G)$, and two vertices $S_1$ and $S_2$ are adjacent if and only if either $S_1$ is a child of $S_2$ (or vice versa) or $S_1$ and $S_2$ are adjacent in $(T, G)$ (meaning that the new adjacency operation is applied to $S_1$ and $S_2$). We call the first type of edges child-parent edges and the second type adjacency edges.
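The representation graph is straightforward to build once the labels are known. The Python sketch below is an illustration only; the function name and the input encoding (singleton labels, a list of union operations and a list of adjacency pairs, all as frozensets) are assumptions made here.

```python
def representation_graph(singletons, unions, arcs):
    """Build the representation graph H of an scd.

    singletons: iterable of singleton labels (frozensets).
    unions: list of (S1, S2) label pairs merged into S1 | S2.
    arcs: list of (S1, S2) label pairs joined by a new adjacency operation.
    Returns (vertices, child_parent_edges, adjacency_edges).
    """
    vertices = set(singletons)
    child_parent, adjacency = set(), set()
    for s1, s2 in unions:
        s = s1 | s2
        vertices.add(s)
        # Each union creates a parent label with two child-parent edges.
        child_parent |= {frozenset([s1, s]), frozenset([s2, s])}
    for s1, s2 in arcs:
        adjacency.add(frozenset([s1, s2]))
    return vertices, child_parent, adjacency
```

For three gates a, b, c with one union of {a} and {b} and one adjacency between {a, b} and {c}, the graph has 4 vertices, 2 child-parent edges and 1 adjacency edge.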
Lemma 11. Let $t$ be the treewidth of $H$. Then the treewidth of $F^*$ is at most $3t + 2$.
Proof (Sketch). Observe that if we contract the gates of $F^*$ associated with each label into a single vertex, eliminate directions and remove multiple occurrences of edges, we obtain a graph isomorphic to $H$. The desired tree decomposition is obtained from a tree decomposition of $H$ by replacing the occurrence of each vertex of $H$ in a bag by the gates corresponding to this vertex, of which there are at most 3. Thus, there is a tree decomposition of $F^*$ with at most $3(t + 1)$ elements in each bag, that is, the treewidth of $F^*$ is at most $3t + 2$.
Lemma 12. The treewidth of H is at most k, where k is the width of (T, G).
Proof. For each node t of T , let S(t) be the set of labels of the graph associated with t. Consider the structure (T, B) where B is a family of subsets of H associating with each node t a set B(t) consisting of vertices of H corresponding to the elements of S(t). We are going to show that (T, B) is a tree decomposition of graph H ′ obtained from H by removal of all child-parent edges. First of all, observe that for each v ∈ V (H), the subgraph T v of T consisting of all nodes containing v is a subtree of T . Let us consider T as a rooted tree with the root t being the same as in (T, G). Let t 1 and t 2 be two nodes containing v. Then one of them is an ancestor of the other. Indeed, otherwise t 1 and t 2 are nodes of two disjoint subtrees T 1 and T 2 whose roots t 
Since any label is a subset of the set of vertices of the graph it belongs to, $S(t_1)$ and $S(t_2)$ cannot have a common label and hence $B(t_1)$ and $B(t_2)$ cannot have a common vertex. Furthermore, it is not hard to observe that if $t_1$ is an ancestor of $t_2$ and $S \in S(t_1) \cap S(t_2)$ then $S$ belongs to $S(t')$ for all nodes $t'$ on the path between $t_1$ and $t_2$. Of course, the same is true regarding the vertex of $H$ corresponding to $S$. Thus we have shown that if $t_1$ and $t_2$ contain $v$ they cannot belong to different connected components of $T_v$, confirming the connectedness of $T_v$.
Next, we observe that if $v_1$ and $v_2$ are incident to an adjacency edge then there is a node $t$ containing both $v_1$ and $v_2$. Indeed, let $S_1$ and $S_2$ be the labels corresponding to $v_1$ and $v_2$, respectively. Let $t$ be the node where the adjacency operation regarding $S_1$ and $S_2$ is applied. Then both $S_1$ and $S_2$ belong to $S(t)$ and, consequently, $t$ contains both $v_1$ and $v_2$. Finally, by construction, each vertex of $H$ is contained in some node.
To obtain the desired tree decomposition of H, we are going to modify (T, B) to acquire two properties: that the number of nodes of the resulting tree is at most 2|F| and that each parent-child pair u, v is contained in some node t. For the former, just iteratively remove all nodes whose operations are new adjacency. If the node t being removed is not the root, then make the parent of t the parent of the only child of t (since t has only one child, the tree remains binary). The latter property can be established by adding at most one vertex to each bag of the resulting structure (T′, B′). Indeed, for each non-singleton label S, let t(S) be the node where this label is created by the union operation. Then both children of S belong to the only child of t(S). Let (T′, B*) be obtained from (T′, B′) as follows. For each non-singleton label S, add the vertex corresponding to S to the bag of the child of t(S). Since at most one new label is created per node of T′, at most one vertex is added to each bag. It is not hard to see that both modifications preserve the properties stated in the previous paragraphs and achieve the desired properties regarding the child-parent edges. Since each bag of (T′, B*) contains at most k + 1 elements, we conclude that the treewidth of H is at most k. Since the number of bags is at most the number of labels, we conclude that the number of bags is at most 2|F|.

Proof of Theorem 3. Immediately follows from the combination of Lemmas 11 and 12.
Application to knowledge compilation
In this section we demonstrate an application of Theorem 1 to knowledge compilation by showing existence of an algorithm compiling the given circuit Z into dnnf. Both the time complexity of the algorithm and the space complexity of the resulting dnnf are fixed-parameter linear parameterized by the cliquewidth of Z. More precisely, the statement is the following:
Theorem 4. Given a single-output circuit Z of cliquewidth k, there is a dnnf of Z having size O(9^{18k} k^2 |Z|). Moreover, given a clique decomposition of Z of width k, there is an O(9^{18k} k^2 |Z|) algorithm constructing such a dnnf.
Theorem 4 is an immediate corollary of Theorem 1 and the following one:
Theorem 5. Given a single-output circuit Z of treewidth p, there is a dnnf of Z having size O(9^p p^2 |Z|). Moreover, such a dnnf can be constructed by an algorithm of the same runtime that gets as input the circuit Z and a tree decomposition of Z of width p having O(|Z|) bags.
The rest of this section is a proof of Theorem 5. Our first step is the Tseitin transformation from the circuit Z into a cnf F′. For this purpose we assume that Z does not have paths of 2 or more not gates. This assumption is not restrictive: depending on whether such a path is of odd or even length, it can be replaced by a single not gate or by a wire, without treewidth increase. The variables y_1, ..., y_m of F′ are the variables of Z and the outputs of the and and or gates of Z. Under the above assumption, it is not hard to see that the inputs of each gate are literals of y_1, ..., y_m. In particular, the output x of Z is either y_i or ¬y_i for some i. Let us call x the output literal.
The cnf F ′ is a conjunction of the singleton clause containing the output literal and the cnfs associated with each and and or gate. Let C be an and gate with inputs t 1 , . . . , t r and output z. Then the resulting cnf is (t 1 ∨ ¬z) ∧ . . . ∧ (t r ∨ ¬z) ∧ (¬t 1 ∨ . . . ∨ ¬t r ∨ z). If C is an or gate then the resulting cnf is (¬t 1 ∨ z) ∧ . . . ∧ (¬t r ∨ z) ∧ (t 1 ∨ . . . ∨ t r ∨ ¬z). We call the last clause of the cnf of C the carrying clause w.r.t. C and the rest are auxiliary ones w.r.t. C and the corresponding input.
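The clauses produced for a single gate can be sketched as follows. This is only an illustration of the definition above: the function name `gate_clauses` and the encoding of literals as signed integers (v for a variable, -v for its negation) are assumptions made for this example.

```python
def gate_clauses(kind, inputs, z):
    """Return (auxiliary_clauses, carrying_clause) for one and/or gate.

    Literals are nonzero integers: v stands for variable v, -v for its negation.
    `inputs` are the input literals t_1, ..., t_r and `z` is the output variable.
    """
    if kind == "and":
        # (t_i or not z) for every input, then (not t_1 or ... or not t_r or z)
        auxiliary = [[t, -z] for t in inputs]
        carrying = [-t for t in inputs] + [z]
    elif kind == "or":
        # (not t_i or z) for every input, then (t_1 or ... or t_r or not z)
        auxiliary = [[-t, z] for t in inputs]
        carrying = list(inputs) + [-z]
    else:
        raise ValueError("only and/or gates receive clauses")
    return auxiliary, carrying
```

Each auxiliary clause is paired with exactly one input, which is what later allows it to be placed in its own leaf of the tree decomposition.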
To formulate the property of Tseitin transformation that we need for our transformation, let us extend the notation. We consider sets of literals that do not contain a variable and its negation. For a set S of literals, V ar(S) is the set of variables of S. The projection P r(S, V ′ ) of S to a set V ′ of variables is the subset S ′ of S obtained by the removal of variables that are not in V ′ . Let S be a family of sets of literals over a set V of variables. Then the projection P r(S,
Denote by V ar(Z) and V ar(F ′ ) the sets of variables of Z and F ′ , respectively. Let us say that a set S of literals with V ar(S) = V ar(Z) is a satisfying assignment of Z if Z is true on the truth assignment on V ar(Z) that assigns all the literals of S to true. For a cnf, the definition is analogous. The well known property of Tseitin transformation is the following:
Lemma 13. Let S 1 and S 2 be the sets of satisfying assignments of F ′ and Z, respectively. Then P r(S 1 , V ar(Z)) = S 2 .
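Lemma 13 can be checked by brute force on a small instance. The sketch below is an illustration only: the helper names, the signed-integer encoding of literals and the assumption that gates are listed in topological order are all choices made for this example.

```python
from itertools import product

def tseitin(gates, output_literal):
    """CNF of a circuit given as a list of (kind, input_literals, output_variable)."""
    clauses = [[output_literal]]
    for kind, inputs, z in gates:
        if kind == "and":
            clauses += [[t, -z] for t in inputs]
            clauses.append([-t for t in inputs] + [z])
        else:  # "or"
            clauses += [[-t, z] for t in inputs]
            clauses.append(list(inputs) + [-z])
    return clauses

def cnf_models(clauses, variables):
    """All satisfying assignments of a CNF, each as a frozenset of literals."""
    result = set()
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses):
            result.add(frozenset(v if value[v] else -v for v in variables))
    return result

def circuit_models(gates, output_literal, inputs):
    """All satisfying assignments of the circuit, evaluated gate by gate."""
    result = set()
    for bits in product([False, True], repeat=len(inputs)):
        value = dict(zip(inputs, bits))
        for kind, ins, z in gates:  # assumed to be in topological order
            vals = [value[abs(t)] == (t > 0) for t in ins]
            value[z] = all(vals) if kind == "and" else any(vals)
        if value[abs(output_literal)] == (output_literal > 0):
            result.add(frozenset(v if value[v] else -v for v in inputs))
    return result

def project(assignments, keep):
    """Pr(S, V') applied to every assignment of a family."""
    return {frozenset(l for l in s if abs(l) in keep) for s in assignments}

# Z = (x1 and not x2) or x3, with gate outputs y4 and y5.
gates = [("and", [1, -2], 4), ("or", [4, 3], 5)]
s1 = cnf_models(tseitin(gates, 5), [1, 2, 3, 4, 5])
s2 = circuit_models(gates, 5, [1, 2, 3])
```

For this circuit, projecting `s1` to the variables 1, 2, 3 gives exactly `s2`, as the lemma states.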
Lemma 13 is useful because of the following nice property of dnnf.

Thus it follows from Lemmas 13 and 14 that, having compiled F′ into a dnnf D′, a dnnf D of Z can be obtained by replacing the variables of Var(F′) \ Var(Z) with the true constant. Clearly, this does not incur any additional gates. In order to obtain a dnnf of F′, we observe that the treewidth of the incidence graph of F′ is not much larger than the treewidth of Z.

Proof. Let F′′ be the cnf obtained from F′ by removal of all the clauses but the carrying ones and let G′′ be the respective incidence graph. Transform (T, B) into (T, B′′) as follows:
-Replace each occurrence of an and or or gate X with the respective carrying clause and the variable corresponding to the output of X.
-Replace each occurrence of a not gate with the variable corresponding to the input of the gate (it may either be an input variable of Z or the output variable of some and or or gate).
Let us show that (T, B′′) is indeed a tree decomposition of G′′ of width 2p + 1. Each element of a bag of B is replaced by at most 2 elements, hence the size of a bag is at most twice the maximal size of a bag of B, i.e. at most 2(p + 1). Consequently, the width of (T, B′′) is at most 2p + 1. Let us verify the connectedness property. An original variable x of Z is contained in a node t of (T, B′′) if and only if in (T, B) t contains either x or the not gate Y with input x. By the connectedness property, both the nodes of (T, B) containing x and those containing Y form subtrees and, by the adjacency property, these subtrees have at least one common node. It follows that their union forms a subtree of T. Each new variable y corresponding to a gate C of Z is contained in exactly those nodes of (T, B′′) that contain C or the not gate whose input is C in (T, B). Exactly the same argument as in the previous case ensures connectedness regarding y. Finally, each carrying clause C is contained in exactly those nodes of (T, B′′) that contain the corresponding gate in (T, B). So, the connectedness regarding C follows from the connectedness property of (T, B). Thus we have established the connectedness of (T, B′′).

To establish the adjacency property, let C be a carrying clause corresponding to a gate X of Z and let v be a variable occurring in C. If v corresponds to the output of X then the adjacency follows by construction because v is explicitly put in those nodes where X appears. So, assume that v corresponds to an input of X. If v is an original variable then (T, B) has a node t containing a literal of v and X. By construction, in (T, B′′), t contains v and C. So, assume that v is the output variable of some gate X′ and let C′ be the corresponding carrying clause of X′. It follows that in (T, B) there is a node t containing both X and X′. Consequently, in (T, B′′) the same node t contains C, C′, v.
So, the adjacency property has been established and we conclude that (T, B′′) is indeed a tree decomposition of G′′ of width 2p + 1. Next, we observe that for each and or or gate X of Z, for each variable u of F′ corresponding to an input of X, and for the variable y of F′ corresponding to the output of X, there is a node t of (T, B′′) containing both y and u. Indeed, let C be the carrying clause corresponding to X. By construction, whenever t contains C, t also contains y. By the adjacency property, there is at least one t containing C and u. Since this last t also contains y, this is a desired node. Pick one node with the specified property and denote it by t(y, u). Add to T a new node t′ with t(y, u) being its only neighbor. The bag of t′ will contain y, u, and C(y, u), the auxiliary clause of X corresponding to the input u. Do so for all the auxiliary clauses. Finally, let y be the variable occurring in the output clause (the clause containing the output literal). Specify a node t containing this variable. Add a new node t′ for which t is the only neighbor and add y and the output clause to the bag of t′. Let (T*, B*) be the resulting structure. Clearly, the connectedness is preserved and the adjacency property is established for the clauses of F′ that are not included in F′′. It follows that (T*, B*) is a tree decomposition of G′. By construction, its width does not exceed the width of (T, B′′), i.e. it is at most 2p + 1, and the additional O(p^2 |Z|) nodes (their number is bounded by the number of wires of Z plus 1 for the output clause) are leaves. The desired runtime of the transformation from (T, B) to (T*, B*) clearly follows from the above description.
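The bag replacement step at the start of the proof above can be sketched as follows. The dictionary-based representation, the function name `transform_bags` and the tagging of incidence-graph vertices as `("var", name)` or `("clause", name)` are assumptions made for this illustration; the sketch covers only the carrying clauses, not the leaves added later for the auxiliary and output clauses.

```python
def transform_bags(bags, kind_of, not_input):
    """Turn bags over circuit elements into bags over the incidence graph of F''.

    bags:      dict node -> set of circuit element names
    kind_of:   dict name -> "var", "and", "or" or "not"
    not_input: dict not-gate name -> name of its input (a variable or an and/or gate)
    """
    new_bags = {}
    for node, bag in bags.items():
        new_bag = set()
        for element in bag:
            kind = kind_of[element]
            if kind in ("and", "or"):
                # carrying clause of the gate plus the variable of its output
                new_bag.add(("clause", element))
                new_bag.add(("var", element))
            elif kind == "not":
                # a not gate is replaced by the variable of its input
                new_bag.add(("var", not_input[element]))
            else:
                new_bag.add(("var", element))
        new_bags[node] = new_bag
    return new_bags
```

Every element contributes at most two vertices, so a bag of size at most p + 1 becomes a bag of size at most 2(p + 1).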
It remains to show that a space-efficient dnnf can be created parameterized by the treewidth of the incidence graph.
Theorem 6. Let F be a cnf and let (T ′ , B ′ ) be a tree decomposition of the incidence graph of F . Then F has a dnnf of size O(3 t |T ′ |) where t is the width of (T ′ , B ′ ). Moreover, given F and (T ′ , B ′ ) such a dnnf can be constructed by an algorithm having the same runtime.
The proof of Theorem 6 is provided in Section 5.1.

Proof of Theorem 5. The construction of a dnnf for Z consists of 4 stages: transform Z into F′ by the Tseitin transformation; transform the tree decomposition of Z into a tree decomposition of the incidence graph of F′; obtain a dnnf of F′ as specified by Theorem 6; and obtain a dnnf of Z as specified in Lemma 14. The correctness of this procedure follows from the above discussion. The time and space complexities easily follow from the combination of the complexities of the intermediate stages.
Proof of Theorem 6
The proof of Theorem 6 is based on the same idea as the proof that a CNF whose primal graph has treewidth at most p has a dnnf of size O(2^p n) [3]. The difference is that we have to take into account that the bags of the tree decomposition contain clauses as well as variables. Let us introduce notation. Let F be the CNF whose dnnf we are going to construct, G be the incidence graph of F and (T, B) be a tree decomposition of G. In what follows we identify the vertices of G with the respective variables and clauses. For each node t of T, we denote the bag of t by B(t). Recall that for an element a ∈ B(t) (either a variable or a clause), we say that t contains a. We assume that (T, B) is a minimal tree decomposition in the sense that removal of any element from a bag violates a tree decomposition property. This assumption is not constraining because such a tree decomposition is easy to obtain by iterative removal of elements from the bags until no further removal is possible.
We pick an arbitrary node tr of T and let it be the root; in what follows we consider T to be a rooted tree. We assume w.l.o.g. that each node of T has at most 2 children. Indeed, otherwise, if some node t has children t_1, ..., t_r for r > 2, we introduce additional nodes t′_2, ..., t′_r and make the sequence t, t′_2, ..., t′_r go from the parent to a child; t_1 remains a child of t and, for each t′_i, node t_i becomes the additional child. The bags of t′_2, ..., t′_r are made identical to the bag of t. Such a transformation at most doubles the number of nodes and hence proving the theorem for the transformed tree preserves the desired asymptotic.
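The transformation to a tree with at most two children per node can be sketched as follows. The representation of the rooted tree as a dictionary of child lists and the naming of the copies t′_i as tuples `(t, i)` are assumptions made for this illustration (it presumes no original node already has such a name).

```python
def binarize(children, bags):
    """Return (new_children, new_bags) where every node has at most 2 children.

    children: dict node -> list of children (leaves map to an empty list)
    bags:     dict node -> set of bag elements
    """
    new_children = {}
    new_bags = dict(bags)
    for t, kids in children.items():
        if len(kids) <= 2:
            new_children[t] = list(kids)
            continue
        # copies t'_2, ..., t'_r of t form a chain below t
        copies = [(t, i) for i in range(2, len(kids) + 1)]
        new_children[t] = [kids[0], copies[0]]  # t_1 stays a child of t
        for idx, copy in enumerate(copies):
            new_bags[copy] = set(bags[t])       # same bag as t
            next_copy = copies[idx + 1:idx + 2]  # empty for the last copy
            new_children[copy] = [kids[idx + 1]] + next_copy
    return new_children, new_bags
```

A node with r > 2 children receives r - 1 copies, so the total number of added nodes is smaller than the number of edges of the original tree.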
We consider only sets of literals with at most one literal per variable. For a set S of literals, let Var(S) be the set of variables whose literals occur in S. We denote by Cl(F) and Var(F) the sets of clauses and variables of F. For a node t of T we denote by Cl(t) and Var(t) the sets of clauses and variables contained in t. For a subtree T′ of T, Cl(T′) and Var(T′) denote the sets of clauses and variables contained in the nodes of T′. For a clause C and a set V of variables, we denote by Pr(C, V) the projection of C to V, i.e. the clause obtained by the removal from C of the occurrences of all the variables that are not in V. Recall that for a set S of literals, we use Pr(S, V) with the analogous meaning. For a CNF F′, we denote by Pr(F′, V) the CNF obtained from F′ by projecting all of its clauses to V. For a subtree T′ of T, we denote Pr(F, Var(T′)) by F(T′).
Let us call two circuits (formulas, including CNFs, are regarded as special cases of circuits) equivalent if they have the same set of variables and the same set of satisfying assignments. One way to create a formula equivalent to the given CNF is Shannon expansion. Let F be a CNF and let x be a variable of F. Then F|x denotes the CNF obtained from F by removal of all the clauses containing x and removal of all the occurrences of ¬x from the remaining clauses. It is known that ((F|x) ∧ x) ∨ ((F|¬x) ∧ ¬x) is equivalent to F. Applying this transformation over a set V of variables works as follows. Let S be a set of literals such that Var(S) = V. Analogously to F|x, F|S is the CNF obtained from F by removal of all the clauses containing literals of S and removal of the occurrences of the opposite literals from the remaining clauses. Let us call the disjunction ⋁_{Var(S)=V} ((F|S) ∧ S) the generalized Shannon expansion of F w.r.t. V. Applying the Shannon expansion inductively, it is not hard to show that the generalized Shannon expansion of F w.r.t. V is equivalent to F.
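The conditioning F|S and the generalized Shannon expansion can be illustrated with a small brute-force check. The function names and the encoding of literals as signed integers are assumptions made for this example; the check enumerates all assignments and is meant only for tiny formulas.

```python
from itertools import product

def condition(cnf, literals):
    """F|S: drop clauses containing a literal of S, delete the opposite literals."""
    s = set(literals)
    return [[l for l in c if -l not in s] for c in cnf if not s & set(c)]

def satisfies(cnf, assignment):
    """True if the set of literals `assignment` satisfies every clause."""
    return all(any(l in assignment for l in c) for c in cnf)

def expansion_holds(cnf, variables, split):
    """Check that F agrees with the disjunction over S of (F|S) and S, Var(S) = split."""
    for bits in product([False, True], repeat=len(variables)):
        a = {v if b else -v for v, b in zip(variables, bits)}
        expanded = False
        for signs in product([1, -1], repeat=len(split)):
            s = [sign * v for sign, v in zip(signs, split)]
            # the disjunct (F|S) and S holds iff a agrees with S and satisfies F|S
            if set(s) <= a and satisfies(condition(cnf, s), a):
                expanded = True
        if expanded != satisfies(cnf, a):
            return False
    return True
```

Note that a clause emptied by the conditioning is kept as an empty clause, which makes the corresponding disjunct unsatisfiable.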
Let us extend our notation. We denote by F \ C ′ the set of clauses obtained from F by removal of all the clauses of C ′ . Let t ′ be the root of T ′ and let C ′ ⊆ Cl(t ′ ) and let S be a set of literals assigning a set of variables V ′ ⊆ V ar(t ′ ). We denote (P r(F \C ′ , V ar(T ′ )))|S by F (T ′ , C ′ , S) and call it a residual of F (T ′ ) (induced by C ′ and S if the context requires mentioning it). When S or C ′ is empty, we can use F (T ′ , C ′ ) and F (T ′ , S) with the obvious meaning. If S assigns all the variables contained in t ′ , we say that
Lemma 16. Any residual or extended residual of F (T ′ ) is equivalent to a disjunction of EBRs of F (T ′ ).
Lemma 17. Let t 1 be a child of t ′ and let T 1 be the subtree rooted by t 1 . Let C be a clause of F containing an occurrence of a variable x ∈ V ar(T 1 ) \ V ar(t ′ ). Then C is contained in a node of T 1 .
Proof. By the adjacency property, there must be a node t′′ of T containing both C and x. This node cannot be t′ by definition. This node cannot be any node outside T_1 because otherwise, by the connectedness property, it would be required that x is contained in t′, in contradiction to our assumption. It remains to conclude that t′′ is a node of T_1.
Lemma 18. Let t_1 and t_2 be two children of t′ and let T_1 and T_2 be the subtrees of T rooted by them. Let C ∈ Cl(t_1) ∩ Cl(t_2). Then C contains occurrences of Var(T_1) \ Var(t′) and of Var(T_2) \ Var(t′).
Proof. Assume that C does not contain occurrences of, say, V ar(T 1 ) \ V ar(t ′ ). We claim that the occurrences of C can be removed from all the nodes of T 1 in contradiction to the minimality of (T, B). This removal clearly does not violate the connectedness property because the path between any two nodes outside of T 1 does not go through T 1 . As for adjacency property, let x be any variable contained together with C in a node of T 1 . If C and x are adjacent then x ∈ V ar(T 1 ) ∩ V ar(t ′ ) and therefore, their adjacency is witnessed by the bag of
be the union of C ′ and the set of clauses of t ′ satisfied by S. We call the set P r((Cl(
Lemma 19. Let t_1 and t_2 be the children of t′ rooting respective subtrees T_1 and
. Moreover, the set of clauses containing occurrences of both Var(T_1) \ Var(T_2) and Var(T_2) \ Var(T_1) is precisely the branching set of F(T′, C′, S).
. For the second statement, let C be a clause of F (T ′ , C ′ , S) containing entries of both V ar(T 1 ) \ V ar(T 2 ) and V ar(T 2 ) \ V ar(T 1 ). This means that there is a clause C or of F such that C = P r(C or , V ar(T ′ ))|S. According to Lemma 17,
)|S in particular contains Pr(C_or, Var(T′))|S = C. Conversely, assume that C belongs to the branching set of F(T′, C′, S). It follows that there is a clause C_or ∈ Cl(t_1) ∩ Cl(t_2) \ C′′ related to C as defined above. By the connectedness property, Cl(t_1) ∩ Cl(t_2) ⊆ Cl(t′), that is, C_or is contained in t′. Consequently, since C_or ∉ C′′, we conclude that C_or is not satisfied by S. It follows that C = Pr(C_or, Var(T′))|S is a clause of F(T′, C′, S). According to Lemma 18, C_or contains occurrences of x ∈ Var(T_1) \ Var(t′) and y ∈ Var(T_2) \ Var(t′). By definition, both x and y belong to Var(T′) \ Var(S) and hence they are preserved in C. By the connectedness property, x ∈ Var(T_1) \ Var(T_2) and y ∈ Var(T_2) \ Var(T_1), hence the opposite direction holds.
Another method of equivalence preserving transformation is clausal expansion. Let C be a clause of a CNF F and let I_1, I_2 be a partition of C. Then, it follows from De Morgan laws that ((F \ {C}) ∧ I_1) ∨ ((F \ {C}) ∧ I_2) is equivalent to F. We extend this to the generalized clausal expansion. Let C* be a set of clauses of F. For each C ∈ C*, define a partition I_1(C), I_2(C). Let I be the set of all CNFs I whose set of clauses consists of exactly one of I_1(C), I_2(C) for each C ∈ C*. Then ⋁_{I∈I} ((F \ C*) ∧ I) is equivalent to F and is called a generalized clausal expansion of F w.r.t. I.
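The generalized clausal expansion can be sketched in the same brute-force style. The function names, the indexing of the expanded clauses by their position in the CNF and the signed-integer literals are assumptions made for this illustration.

```python
from itertools import product

def satisfies(cnf, assignment):
    """True if the set of literals `assignment` satisfies every clause."""
    return all(any(l in assignment for l in c) for c in cnf)

def clausal_expansion(cnf, partitions):
    """Disjuncts of the generalized clausal expansion.

    partitions: dict clause index -> (I1, I2), a partition of that clause's literals.
    Returns one CNF per way of choosing I1(C) or I2(C) for every expanded clause C.
    """
    rest = [c for i, c in enumerate(cnf) if i not in partitions]
    choices = [partitions[i] for i in sorted(partitions)]
    return [rest + [list(part) for part in pick] for pick in product(*choices)]

def expansion_equivalent(cnf, partitions, variables):
    """Check on every assignment that F agrees with the disjunction of the disjuncts."""
    disjuncts = clausal_expansion(cnf, partitions)
    for bits in product([False, True], repeat=len(variables)):
        a = {v if b else -v for v, b in zip(variables, bits)}
        if any(satisfies(d, a) for d in disjuncts) != satisfies(cnf, a):
            return False
    return True
```

Expanding m clauses yields 2^m disjuncts, which is why the number of clauses in the branching set matters for the size bound.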
Lemma 20. Let T ′ be a subtree of T with root t ′ and assume that t ′ has two children t 1 and t 2 and let T 1 and T 2 be the subtrees of T rooted by t 1 and t 2 , respectively. Then each basic residual
is either unsatisfiable or can be represented (for some r) as a (F 1 ∨ . . . ∨ F r ) where each p−p1−p2 . It follows that the number of required EBRs is at most 2 p , each of them formed as the conjunction of the respective BR, available as one of outputs of D and the set of at most p literals requiring O(p) gates for their computation. Thus we conclude that O(2 p p) gates will be enough for computing of all the required EBRs of F (T 1 ) and F (T 2 ). Summing up numbers of gates considered throughout the proof, we conclude that O(3 p ) additional gates will be sufficient for our purpose.
Proof of Theorem 6. We order the nodes of T so that every child appears before its parent. By induction on this order relation, we prove that it is possible to construct a DNNF of size O(3^p |T|) whose outputs compute all BRs of F(T′) for all the subtrees T′ of T. To make Lemma 21 applicable to the case where a non-leaf node has only one child, we extend T so that such nodes have an additional child that is a leaf node with the empty bag.
Let T′ be a subtree of T consisting of a single node being a leaf. The only BRs of such a node are the constant true and false functions. Thus the number of BRs over all leaf nodes is O(1), so regarding these nodes the inductive claim holds. Applying Lemma 21 inductively, for each non-singleton subtree T′, we observe that computing the basic residuals of F(T′) requires O(3^p) additional gates, so the claim stands for each non-singleton subtree T′ as well, and for T in particular. It remains to compute F(T). Applying the generalized Shannon expansion, we observe that F(T) is a disjunction of at most 2^p EBRs of F(T); the additional O(2^p) gates preserve the asymptotic. The runtime of this construction is discussed in detail in the appendix.
Discussion
In this paper we presented a theorem showing that a circuit of cliquewidth k can be transformed into, roughly speaking, an equivalent circuit of treewidth at most 18k + 2 with at most 4 times as many gates. A consequence of this statement is that any space-efficient knowledge compilation parameterized by the treewidth of the input circuit can be transformed into a space-efficient knowledge compilation parameterized by the cliquewidth of the input circuit. We elaborated this consequence on the example of dnnf. As a result we obtained a theoretically efficient but formidable looking space complexity of O(9^{18k} k^2 n). Therefore, the first natural question is how likely it is that this huge exponent base can be reduced.
The next question for further investigation is to check if the proposed upper bound can be applied to sdd [5], which is more practical than dnnf in the sense that it allows a larger set of queries to be efficiently handled. To answer this question positively, it will be sufficient to extend Theorem 6 to the case of sdd; the 'upper' levels of the reasoning can then be applied analogously to the case of dnnf.
It is important to note that rankwidth is a better parameter for capturing dense graphs than cliquewidth in the sense that rankwidth of a graph does not exceed its treewidth plus one [12] as well as cliquewidth [13] , while cliquewidth can be exponentially larger than treewidth (and hence rankwidth) [1] . Also, computing of rankwidth, unlike cliquewidth, is known to be FPT [11] . Therefore, it is interesting to investigate the relationship between rankwidth and treewidth of a Boolean function. For this purpose rankwidth has to be extended to directed graphs [15] . It is worth saying that if the question is answered negatively, i.e. that treewidth of a circuit can be exponentially larger than its rankwidth, it would be an interesting circuit complexity result.
Finally, recall that all the upper bounds on the dnnf size obtained in this paper are polynomial in the size of the circuit that can be much larger than the number of variables. On the other hand, the upper bound on the dnnf size parameterized by the treewidth of the primal graph of the given cnf is polynomial in the number of variables [3] . Can we do the same in the circuit case?
A Cliquewidth vs. simplified cliquewidth
To define cliquewidth, we introduce a graph G = (V, E, L) where L is a function from V to the set of natural numbers. To distinguish it from the labeled graph in the scd, we call such a graph numerically labeled.
We introduce the following operations on graphs.
-Let i be a number that L(u) = i for all u ∈ V (G) and let v / ∈ V (G). The operation i(v) adds a new vertex v to the graph and extends L so that the corresponding number of v is i.
-Let i, j be two numbers having non-empty preimages in L. Then η_{i,j} adds all possible edges between vertices labeled with i and vertices labeled with j.
-The operation ρ_{i,j} changes to j the labels of all vertices having label i.
be two graphs with disjoint sets of vertices. Then the result of disjoint union
A clique decomposition is a binary rooted tree T every node t of which is associated with a numerically labeled graph G(t) so that the following rules are observed.
-Each leaf node is associated with a single vertex graph.
-Let t be a node having the only child t_1. Then G(t) is obtained from G(t_1) by one of the first 3 operations in the above list.
-If t is a binary node with children t_1 and t_2 then G(t) = G(t_1) ⊕ G(t_2).
The width of the given clique decomposition is the smallest k such that the images of all vertices of all graphs G(t) are members of [1, . . . , k]. The cliquewidth of the given graph G is the smallest k such that there is a clique decomposition T with the root r such that G = (V (G(r) ), E(G(r))), i.e. the function L(G(r)) may be arbitrary.
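The four operations can be sketched on a simple triple representation (vertices, edges, labels). This is an illustration under stated assumptions: the function names are hypothetical, edges are treated as undirected for simplicity, and the side condition on i in the vertex-adding operation is not enforced.

```python
def new_vertex(graph, v, i):
    """i(v): add a new vertex v carrying the number i."""
    vertices, edges, label = graph
    assert v not in vertices
    return vertices | {v}, set(edges), {**label, v: i}

def add_edges(graph, i, j):
    """eta_{i,j}: add all edges between vertices labeled i and vertices labeled j."""
    assert i != j
    vertices, edges, label = graph
    new = {frozenset((u, w)) for u in vertices for w in vertices
           if label[u] == i and label[w] == j}
    return set(vertices), edges | new, dict(label)

def relabel(graph, i, j):
    """rho_{i,j}: change to j the label of every vertex labeled i."""
    vertices, edges, label = graph
    return set(vertices), set(edges), {v: (j if l == i else l) for v, l in label.items()}

def disjoint_union(g1, g2):
    """Disjoint union of two numerically labeled graphs."""
    assert not g1[0] & g2[0]
    return g1[0] | g2[0], g1[1] | g2[1], {**g1[2], **g2[2]}

def complete_graph(n):
    """Build the complete graph on n vertices using only the numbers 1 and 2."""
    g = new_vertex((set(), set(), {}), 0, 1)
    for v in range(1, n):
        g = new_vertex(g, v, 2)   # the newcomer is the only vertex labeled 2
        g = add_edges(g, 1, 2)    # join it to all earlier vertices
        g = relabel(g, 2, 1)      # merge it into label 1
    return g
```

The helper `complete_graph` reproduces the standard observation that complete graphs need only two numbers, whatever their size.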
For the rest of the discussion we need to choose suitable terminology. First, abusing the notation, we associate the decompositions with their trees, especially as the function G(t) allows us to obtain the graph associated with a particular node. Let G be a numerically labeled graph. Then G′ = Lb(G) is a labeled graph such that the elements of S(G′) are sets of vertices assigned the same number by L. Let us call the number of images of the elements of the numerically labeled graph G the width of G. Finally, for a rooted tree T we denote by r(T) the root of T.
Lemma 22. For any clique decomposition T there is an scd T_s such that Lb(G(r(T))) = G(r(T_s)) and the width of T_s is at most twice the width of T.
Proof. The proof is by induction on the height of T. If T is a leaf with the only node associated with a graph G then we create a single-node scd associated with Lb(G). Otherwise, assume that r(T) has the only child r_1 and let T_1 be the subtree (and the respective clique decomposition) rooted by r_1. By the induction assumption, there is T
, that is, we even do not add a new vertex. A direct inspection shows that the lemma holds in all the considered cases.
Assume now that r(T) is a binary node and let r_1 and r_2 be the children of r(T) and let T_1 and T_2 be the subtrees of T rooted by r_1 and r_2, respectively. By the induction assumption, there are trees (and the respective scds) T Clearly, the width of G′ is at most twice larger than the width of G(r(T)) and the width of the rest of the additional nodes of T_s is smaller than the treewidth of G′′. Finally, it is not hard to see that G(z x) = G′′. Thus the lemma holds for the considered case.
That the scw of a graph G is at most twice the cliquewidth of G immediately follows from Lemma 22.

B Runtime for Theorem 1
B.1 Data structure for clique decomposition
The above approach to define the scd is convenient for our reasoning, however the explicit representation (T, G) is too time consuming as input for an algorithm. Instead, each node of the tree can be associated with the respective operation with pointers to labels required to perform the operation, thus requiring a constant memory per node of T .
It is not hard to see by an inductive argument that any two elements in S are either disjoint or one is a subset of the other. The labels are naturally organized into a binary tree according to the child-parent relation with the singleton nodes being leaves. It is thus not hard to see that the number of labels is at most 2n − 1.
We are going to show that the number of nodes of T is O(kn), where k is the width of (T, G). Let S ∈ S(G(t)) for some node t of T. Then we say that t contains S.
For each binary node t, let us identify one of the subtrees rooted by a child of this node as the left subtree and the other one as the right subtree. Then define a DAG D on the labels of (T, G) as follows. The pair (S_1, S_2) is an arc of D if one of the following conditions holds.
-S_1 is contained in the node where S_2 is created as a result of a union of labels or an adding a new vertex operation.
-Both S_1 and S_2 are contained in a binary node, so that S_1 is contained in the left subtree, while S_2 is contained in the right subtree.
Lemma 23. Labels S_1 and S_2 are contained in the same node of T if and only if either (S_1, S_2) or (S_2, S_1) is an arc of D.
Proof. By induction on the height of the node of T. For a leaf this is obvious. Consider a non-leaf node t. If this node satisfies one of the two conditions above, we are done. Otherwise, if t is a unary node then both S_1 and S_2 are contained in the only child of t, so the statement holds by the induction assumption. If none of the above happens then t is a binary node and both S_1 and S_2 are contained in a node of either the left subtree or the right subtree. In any case both S_1 and S_2 are contained in a node of a smaller height and again the induction assumption applies.
By definition of graph D, the in-degree of each vertex is at most k − 1. Since there are O(n) labels, it follows that the number of arcs of D is O(kn). It follows from Lemma 23 that the number of pairs of labels contained in the same node is O(kn). Consequently, the number of new adjacency nodes is O(kn). Since the number of the rest of the nodes is O(n), we conclude that the number of nodes of T is O(kn).
B.2 The procedure
We are going to demonstrate an O(kn) time procedure that constructs a mixed (having both directed and undirected edges) graph H* whose nodes correspond to the labels, where two labels are connected by either directed child-parent arcs (going from the child to the parent) or undirected adjacency arcs. The size of this graph (the number of vertices plus the number of arcs) will be O(kn). Also, each label will be associated with a type (and, or, or unary). F* can be straightforwardly obtained from H* by simply substituting labels with suitable gates and the arcs with suitable wires as specified by the description of F*, implying the O(kn) construction time for F*.

Recall that for algorithmic purposes the scd is represented as a tree whose nodes are associated with operations with pointers to the labels. In the resulting graph the vertices will be associated with labels. Each label will be supplied with the adjacency list specifying the parent, the children and the labels connected by the adjacency arcs (if any). Each label and each arc are the result of some operation. Therefore, exploring the tree in a topological order from the leaves to the root, we will be able to reconstruct H*. We start from the empty graph. If the currently considered operation is adding a new vertex (gate) of F then give the corresponding singleton label the type of this gate (and, or, or unary). If the operation is the union of two labels S_1 and S_2 then introduce the child-parent arcs from S_1 and S_2 to S_1 ∪ S_2. Technically this means following the pointer to the label S_1 ∪ S_2 and adding pointers to S_1 and S_2 to the adjacency list, marking them as children, and, similarly, adding S_1 ∪ S_2 as the parent to the adjacency lists of S_1 and S_2. Also, the type of S_1 ∪ S_2 is the same as the type of S_1 (or of S_2; they are the same by definition of a type respecting clique decomposition). Accordingly, the adjacency operation results in adding the adjacency arc between the respective labels.
Notice that the binary node of the clique decomposition tree does not introduce any changes: it requires union of two disjoint graphs but at the time of exploration of the node, the union has been already performed because all the modifications specified above are done on the same graph, whose nodes are the set of labels of the scd.
It is not hard to see from the description that the above procedure takes O(1) time per node of T . Since the number of nodes of the tree is O(kn), the desired bounds follow.
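The procedure described above can be sketched as a single pass over the operations. The encoding of operations as tuples, the operation names and the function name `build_h_star` are assumptions made for this illustration; the operations are presumed to arrive in a leaves-to-root order.

```python
def build_h_star(operations):
    """Reconstruct H* from scd operations listed from the leaves to the root.

    Each operation is one of (hypothetical encoding):
      ("new", label, gate_type)        a singleton label for a new gate
      ("union", s1, s2, merged)        merged = s1 united with s2
      ("adjacency", s1, s2)            new adjacency between two labels
      ("join",)                        binary node of the decomposition tree
    """
    gate_type, parent, children, adjacency = {}, {}, {}, set()
    for op in operations:
        if op[0] == "new":
            _, label, kind = op
            gate_type[label] = kind
            children[label] = []
        elif op[0] == "union":
            _, s1, s2, merged = op
            children[merged] = [s1, s2]          # child-parent arcs
            parent[s1] = parent[s2] = merged
            gate_type[merged] = gate_type[s1]    # both children share the type
        elif op[0] == "adjacency":
            _, s1, s2 = op
            adjacency.add(frozenset((s1, s2)))   # undirected adjacency arc
        # "join": nothing to do, all labels already live in one graph
    return gate_type, parent, children, adjacency
```

Each operation is handled with a constant number of dictionary or set updates, in line with the O(1) time per node claimed above.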
It is not hard to observe that the graph H defined in Section 4.2 is isomorphic to H* except that H* assigns types to nodes and directions to edges. We will establish the tree decomposition of H as specified in the proof of Lemma 12 following a post-order exploration of T (children before the parents). The elements of bags will be represented by pointers to the corresponding labels. The only element contained in a leaf node t is the vertex corresponding to the singleton label of t. Assume that t is not a leaf node. If the operation of t is new adjacency then remove t and make the parent of t (if any) the parent of the only child of t. Otherwise, copy to the bag of t all the elements from the bags of the children of t. Then, if the operation of t is the union of labels S_1 and S_2, replace the vertices corresponding to S_1 and S_2 by the vertex corresponding to S_1 ∪ S_2. It only remains to replace each label by the respective gates of F*. This algorithm spends O(k) time per node of T. It follows that the overall time is O(k^2 n).
C Runtime calculation for Theorem 6
The desired DNNF is constructed inductively from the leaves to the root. First, BRs for the subtrees rooted by leaves are constructed. Construction of BRs for a non-singleton subtree (having constructed BRs for the immediate subtrees) is done in 2 stages. First, the required residuals of the children are produced. Then the BRs of the considered subtree are produced as disjunctions of conjunctions of residuals of the children as specified in Lemma 20. Having constructed all the basic residuals, the desired output F is constructed as a residual F(T, ∅, ∅). The difficulty of this construction is finding pointers to the in-neighbors of the gate currently being constructed. If implemented straightforwardly, the whole array of the currently existing gates may have to be searched, making the construction runtime quadratic in the size of the dnnf being constructed. We propose a more sophisticated procedure based on amortized analysis that makes the runtime asymptotically the same as the size of the resulting dnnf. The description of the procedure provided below is divided into 3 subsections specifying the data structures, computation of residuals of the given subtree having computed all the basic residuals (including also computation of the root), and computation of the basic residuals. The final calculation of the runtime is given in the last subsection.
C.1 Data structures
The circuit is maintained in the form of an adjacency list. Put differently, there is a record corresponding to each gate. Each record contains the gate and pointers to the records of the gates that are its in-neighbors and out-neighbors. The records are not located in a homogeneous array but are rather grouped around the nodes of the tree decomposition $(T, B)$. Let us see how to do that.
An important subset of the gates are those whose outputs are BRs. Slightly abusing the notation, we call these gates BRs as well. The pointer to each BR $F(T', C', S')$ is contained in the record associated with the root of $T'$. At the time of construction of the circuit, it is important to find very efficiently the record associated with each gate of the DNNF being constructed. For this purpose, each BR $F(T', C', S')$ is associated with the elements of $C'$ and $S'$. In particular, such BRs are kept in an array, let us call it $BR(T')$. The sets of variables and clauses of the input CNF are linearly ordered. This linear order is naturally projected to the set of clauses and variables contained in the root $r'$ of $T'$. Denote by $Cl(r')$ and $Var(r')$ the set of clauses and the set of variables, respectively. Then the BRs of $F(T')$ are put in correspondence with binary vectors indexed by $Cl(r') \cup Var(r')$. The order of the respective coordinates is exactly the order of the corresponding elements in the above-mentioned order of variables and clauses. Let $C$ be a clause contained in $r'$ and let $x$ be the vector corresponding to $F(T', C', S')$. Then the coordinate of $C$ is 0 if and only if $C \notin C'$, i.e. $C$ is not removed. If $Y$ is a variable contained in $r'$, then the coordinate of $Y$ in $x$ is 1 if $Y \in S'$ and 0 otherwise, i.e. if $\neg Y \in S'$. The vectors $x$, considered as binary numbers, serve as array indices. This means that $BR(T')[x]$ contains the pointer to the gate whose output is $F(T', C', S')$. Consequently, given $x$, this pointer can be obtained in $O(1)$. We call $x$ the characteristic vector of $F(T', C', S')$. Assume that $r'$ is not the root and let $r^*$ be the parent of $r'$. Then the storage of $T'$ also maintains an array $RR(T')$ of pointers to the residuals of $T'$. The vectors of $RR(T')$ are enumerated analogously to $BR(T')$ but indexed by the elements of $(Cl(r') \cap Cl(r^*)) \cup (Var(r') \cap Var(r^*))$, ordered according to the above-mentioned order of variables and clauses of $F$.
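As an illustration of how a characteristic vector can serve as an array index, consider the following minimal sketch. The function name `characteristic_index`, the encoding of root elements as `(kind, name)` pairs, and the sample clause and variable names are assumptions made for the example only.

```python
def characteristic_index(ordered_items, removed_clauses, true_vars):
    """Read the characteristic vector as a binary number.

    ordered_items lists the clauses and variables of the root r' in the
    global linear order, as ("clause", name) or ("var", name) pairs.
    A clause coordinate is 0 iff the clause is not in C' (not removed);
    a variable coordinate is 1 iff the variable occurs positively in S'.
    """
    index = 0
    for kind, name in ordered_items:
        if kind == "clause":
            bit = 1 if name in removed_clauses else 0
        else:
            bit = 1 if name in true_vars else 0
        index = (index << 1) | bit
    return index


# Hypothetical root containing two clauses and one variable.
root_items = [("clause", "C1"), ("var", "Y1"), ("clause", "C2")]
br_array = [None] * (2 ** len(root_items))   # one slot per characteristic vector
x = characteristic_index(root_items, removed_clauses={"C2"}, true_vars={"Y1"})
br_array[x] = "pointer to the gate of F(T', {C2}, {Y1})"
```

Computing the index from the sets takes time linear in the number of coordinates, whereas the lookup `br_array[x]` for a given `x` is a single array access.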
C.2 Construction of residuals given basic residuals
It follows from Lemma 16 that each residual is a disjunction of EBRs. We are going to show how to construct the circuit computing the residuals, provided the gates computing the BRs have already been constructed. The first step is simple. We go along the array $RR(T')$ and specify in the record of each corresponding gate that this gate is a disjunction. Now we create the rest of the circuit. First, we create a binary vector $Pattern$ indexed by $Cl(r') \cup Var(r')$, with exactly the same order of coordinates as for $BR(T')$, whose 1 entries correspond precisely to the elements of $I$. We also need a binary vector $ClVar$, indexed in the same way, whose element equals one if and only if the corresponding element of $Cl(r') \cup Var(r')$ is a clause. Both of these vectors can be prepared in time polynomial in $k$. Since the whole time of the computation of $RR(T')$ is exponential in $k$, this runtime can be ignored, so we do not elaborate on it any further. Next, we process each element of $BR(T')$. The processing of a given element $BR(T')[x]$ consists of 3 stages.
- Redundancy testing. At this stage the algorithm tests whether the given BR is needed at all for forming the array $RR(T')$. This operation can be performed in $O(k)$ (we may safely assume that each coordinate is accompanied by pointers to both of the variables). The output of the obtained conjunction is the EBR $S'' \wedge F(T', C', S')$.
- Connecting the EBR to the input of the corresponding residual. We create a new vector $y$ and copy there the elements of $x$ on the coordinates $i$ where $Pattern[i] = 1$. Clearly, $y$ can be created in $O(k)$. Then we connect the output of the conjunction created in the previous item to the input of the residual pointed to by $RR(T')[y]$. This can be done in $O(1)$.
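The last stage, projecting $x$ to $y$ and wiring the EBR into the residual, can be sketched as below. The `Gate` class and the function names are hypothetical. The sketch assumes that $y$ keeps only the selected coordinates (so that it indexes a shorter array), and it leaves out the redundancy test and the literals $S''$ of each conjunction.

```python
class Gate:
    """Hypothetical gate record: a kind, an optional label and in-neighbors."""
    def __init__(self, kind, label=None):
        self.kind = kind
        self.label = label
        self.inputs = []


def project(x, pattern):
    """Copy the coordinates of x where pattern has a 1 (assumed compact form)."""
    return tuple(bit for bit, keep in zip(x, pattern) if keep == 1)


def to_index(bits):
    """Read a tuple of bits as a binary number, most significant bit first."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


def connect_ebrs(br, pattern):
    """Create OR gates for the residuals and attach one conjunction per BR."""
    num_coords = len(pattern)
    rr = [Gate("OR") for _ in range(2 ** sum(pattern))]
    for idx, br_gate in enumerate(br):
        # Recover the characteristic vector x from the array index.
        x = tuple((idx >> (num_coords - 1 - i)) & 1 for i in range(num_coords))
        ebr = Gate("AND")            # conjunction standing for the EBR
        ebr.inputs.append(br_gate)   # literals of S'' omitted in this sketch
        rr[to_index(project(x, pattern))].inputs.append(ebr)
    return rr
```

Each BR is visited once and the residual is located by a direct array access, so no search through the existing gates is needed.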
It follows from Lemma 16 and by construction that each gate pointed to by an element of $RR(T')$ is indeed a residual of $T'$ as specified. Notice also that we have not applied the reuse of conjunctions of literals specified in the proof of Lemma 21. However, this increases neither the asymptotic space of $O(2^k)$ nor the runtime of $O(2^k k)$ spent on the construction of the residuals of $T'$. In the case of the root $r$, we need the residual $F(T, \emptyset, \emptyset)$, whose output is the function of the cnf $F$. This can be done according to the same scheme. That is, we explore the array $BR(T)$, extracting the elements $F(T, \emptyset, S)$ and forming the disjunction of all $S \wedge F(T, \emptyset, S)$.
C.3 Construction of basic residuals
We are now going to describe the construction of the rest of the DNNF, including the gates whose outputs are BRs, their in-neighbours and the rest of the incident arcs. Let $T'$ be a subtree of $T$ having only one node, that is, its root is a leaf of $T$. In order to construct $BR(T')$, we explore all the characteristic vectors $x$ of the basic residuals of $F(T')$. For the given $x$, $BR(T')[x]$ points to the true constant if the corresponding BR is the true constant and to the false constant otherwise (i.e. when the corresponding BR is the false constant). In order to keep the complexity of this step within the desired bound, it is essential that the CNF be represented in the form of an adjacency matrix that allows $O(1)$ testing of whether a given literal belongs to a particular clause. In this case, it is not hard to see that the complexity of this step is $O(2^k k^2)$. Assume that $T'$ contains more than one node. We assume w.l.o.g. that the root $r'$ has two children $r_1$ and $r_2$ (the reasoning for one child is a restricted version of the reasoning for the case of two children). Again, we explore all the characteristic vectors of $r'$. Let $x$ be such a vector and let $F(T', C', S')$ be the corresponding BR. The first step is to see whether there is $C \in Cl(r') \setminus (C' \cup Cl(r_1) \cup Cl(r_2))$ such that $C$ is not satisfied by any literal of $S'$. If such a $C$ is found, then the respective BR is unsatisfiable and all the algorithm has to do is to record the pointer to the false constant in $BR(T')[x]$. Otherwise, $F(T', C', S')$ is represented as the disjunction of conjunctions of pairs of residuals of $T_1$ and $T_2$. The number of such conjunctions over all the characteristic vectors is $3^k$, hence it would not be difficult to design a procedure whose runtime is proportional to $3^k$ multiplied by a polynomial of $k$. However, we want to get rid of the polynomial factor, and hence the procedure is trickier in order to enable the amortised analysis.
A standard data structure for amortised analysis is the binary counter. Consider a binary vector of $k$ elements and let us compute the runtime of $2^k$ consecutive increments. Although the runtime of one particular increment can be as large as $O(k)$ due to the carry, the overall runtime is $O(2^k)$, i.e. $O(1)$ per increment. In our construction, we use a refined version of the binary counter which we call a selective counter. In this counter, a number of digits are fixed and the increment is performed only on the digits that are not fixed. There are a few ways to keep information about the non-fixed digits so that the next one can be found in $O(1)$. For example, there may be a pointer to the rightmost non-fixed digit, each non-fixed digit can contain a pointer to the next one, and the last non-fixed digit records a bit telling the algorithm that it is the last. Let $k_1$ be the number of non-fixed digits. Then it is not hard to see that $2^{k_1}$ increments can be performed in $O(2^{k_1})$. It is important to notice that if we use decrement instead of increment, then we have the same upper bound on the runtime.
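A minimal sketch of such a selective counter is given below. The class name `SelectiveCounter` and the field `flips` are assumptions made for the example. In place of the chain of pointers described above, the sketch keeps a precomputed list of the non-fixed positions, which likewise gives constant-time access to the next non-fixed digit.

```python
class SelectiveCounter:
    """Binary counter in which only the listed positions may change."""

    def __init__(self, bits, free_positions):
        self.bits = list(bits)
        # Non-fixed positions, least significant first.
        self.free = list(free_positions)
        self.flips = 0   # number of digit changes, used to check the bound

    def increment(self):
        """Add one on the non-fixed digits; return False on wrap-around."""
        for pos in self.free:
            self.flips += 1
            if self.bits[pos] == 0:
                self.bits[pos] = 1
                return True
            self.bits[pos] = 0   # carry moves on to the next non-fixed digit
        return False

    def decrement(self):
        """Subtract one on the non-fixed digits; return False on wrap-around."""
        for pos in self.free:
            self.flips += 1
            if self.bits[pos] == 1:
                self.bits[pos] = 0
                return True
            self.bits[pos] = 1   # borrow moves on to the next non-fixed digit
        return False
```

The least significant non-fixed digit changes on every increment, the next one on every second increment, and so on, so a full cycle of $2^{k_1}$ increments changes fewer than $2 \cdot 2^{k_1}$ digits in total.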
Back to the DNNF construction: given $x$, we introduce two vectors $x_1$ and $x_2$. The coordinates of $x_1$ correspond to $(Cl(r') \cap Cl(r_1)) \cup (Var(r') \cap Var(r_1))$. The element of $x_1$ whose coordinate corresponds to a variable $v$ equals 1 if and only if $v \in S'$. Otherwise (i.e. if $\neg v \in S'$) the element equals 0. The elements corresponding to clauses can be partitioned into the following 3 sets.
- Elements whose coordinates correspond to clauses of $C'$ equal 0.
- Elements whose coordinates correspond to clauses of $Cl(r_1) \setminus (Cl(r_2) \cup C')$ are 1.
- Elements whose coordinates correspond to clauses of $(Cl(r_1) \cap Cl(r_2)) \setminus C'$ are 0.
The structure of the vector $x_2$ is symmetric, with the roles of $r_2$ and $r_1$ exchanged. The only difference is that the elements whose coordinates are as in the last item of the above list are 1.
We treat the vectors $x_1$ and $x_2$ as selective binary counters, with $(Cl(r_1) \cap Cl(r_2)) \setminus C'$ being the coordinates of the non-fixed digits, the increment operation applied to $x_1$ and the decrement operation applied to $x_2$. Then the algorithm proceeds as follows.
- We set the gate that $BR(T')[x]$ points to as an OR-gate.
- We create a data structure with two items whose initial value is $(x_1, x_2)$, perceived as binary vectors as defined above. The only operation of this data structure is the modification applying increment to the first item and decrement to the second one. Let $k'$ be the number of non-fixed digits in the above selective vectors. Then this data structure can be in $2^{k'}$ states.

The runtime spent on the construction of $BR(T')$ can be calculated as follows. Checking whether the given BR is the false constant takes polynomial time per $x$, so the total time is $2^k$ multiplied by a polynomial of $k$. The same can be said regarding the creation of the data structure in the above list. Exploration of the states of the data structure over all the vectors $x$ takes $O(3^k)$. This follows from Lemma 20 and from the fact that, by construction, the algorithm spends $O(1)$ per such state.
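The two-item data structure can be illustrated by the following sketch, which enumerates its states. The generator name `paired_states` and the argument layout are assumptions made for the example: `free1[i]` and `free2[i]` are taken to be the positions of the same shared clause in $x_1$ and $x_2$, listed least significant first. The sketch relies on the initial values described above, where the non-fixed digits of $x_1$ are all 0 and those of $x_2$ are all 1, so an increment of $x_1$ and a decrement of $x_2$ touch the same positions and can be carried out in lock-step.

```python
def paired_states(x1, x2, free1, free2):
    """Yield every state (x1, x2) of the two-item structure.

    Precondition (from the text): the non-fixed digits of x1 start at 0
    and those of x2 start at 1, so they remain complementary throughout.
    """
    x1, x2 = list(x1), list(x2)
    while True:
        yield tuple(x1), tuple(x2)
        # Increment x1 and decrement x2 on the non-fixed digits together.
        for p1, p2 in zip(free1, free2):
            if x1[p1] == 0:
                x1[p1], x2[p2] = 1, 0
                break
            x1[p1], x2[p2] = 0, 1   # carry in x1 mirrors borrow in x2
        else:
            return                  # wrapped around: all states were visited
```

Under this reading, each yielded pair would supply the indices of one residual of $T_1$ and one residual of $T_2$ whose conjunction feeds the OR-gate; how exactly the conjunctions are wired is not covered by the sketch.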
The description of the procedure for the creation of the dnnf is now complete. Summarising the runtime calculations, we see that it takes $O(3^k)$ time per node of the tree decomposition.
