    Kernelization of Constraint Satisfaction Problems: A Study through Universal Algebra

    A kernelization algorithm for a computational problem is a procedure which compresses an instance into an equivalent instance whose size is bounded with respect to a complexity parameter. For the Boolean satisfiability problem (SAT) and the constraint satisfaction problem (CSP), there exist many results concerning upper and lower bounds for kernelizability of specific problems, but it is safe to say that we lack general methods to determine whether a given SAT problem admits a kernel of a particular size. This can be contrasted with the currently flourishing research program of determining the classical complexity of finite-domain CSP problems, where almost all non-trivial tractable classes have been identified with the help of algebraic properties. In this paper, we take an algebraic approach to the problem of characterizing the kernelization limits of NP-hard SAT and CSP problems, parameterized by the number of variables. Our main focus is on problems admitting linear kernels, which, somewhat surprisingly, have previously been shown to exist. We show that a CSP problem has a kernel with O(n) constraints if it can be embedded (via a domain extension) into a CSP problem which is preserved by a Maltsev operation. We also study extensions of this towards SAT and CSP problems with kernels with O(n^c) constraints, c > 1, based on embeddings into CSP problems preserved by a k-edge operation, k > c. These results follow via a variant of the celebrated few subpowers algorithm. In the complementary direction, we give an indication that the Maltsev condition might be a complete characterization of SAT problems with linear kernels, by showing that an algebraic condition that is shared by all problems with a Maltsev embedding is also necessary for the existence of a linear kernel unless NP is included in co-NP/poly.
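    As a small illustration of what "preserved by a Maltsev operation" means, the following sketch checks preservation by brute force. It assumes the Boolean domain and the specific Maltsev operation x - y + z (mod 2); the function names and the two example relations are hypothetical choices made for illustration, and the check says nothing about embeddings via domain extension, which the abstract also allows.

```python
from itertools import product


def maltsev_xor(x, y, z):
    """The Boolean Maltsev operation x - y + z (mod 2).

    It satisfies m(x, y, y) = x and m(x, x, y) = y.
    """
    return (x - y + z) % 2


def is_preserved(relation, op):
    """Return True if applying the ternary `op` componentwise to any
    three tuples of `relation` always yields a tuple of `relation`."""
    rel = set(relation)
    for s, t, u in product(rel, repeat=3):
        image = tuple(op(a, b, c) for a, b, c in zip(s, t, u))
        if image not in rel:
            return False
    return True


# Affine relation: x + y + z = 1 (mod 2).
odd_parity = {t for t in product((0, 1), repeat=3) if sum(t) % 2 == 1}

# 1-in-3 relation: exactly one of the three variables is 1.
one_in_three = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

print(is_preserved(odd_parity, maltsev_xor))
print(is_preserved(one_in_three, maltsev_xor))
```

    The parity relation is closed under this operation, while the 1-in-3 relation is not closed under it on the Boolean domain itself; whether a relation admits an embedding into a larger domain with a Maltsev polymorphism is a separate question that this check does not address.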

    Satisfiability in multi-valued circuits

    Satisfiability of Boolean circuits is among the best-known and most important problems in theoretical computer science. This problem is NP-complete in general but becomes solvable in polynomial time when restricted either to monotone gates or to linear gates. We go outside the Boolean realm and consider circuits built of any fixed set of gates on an arbitrarily large finite domain. From the complexity point of view this is closely connected with the problems of solving equations (or systems of equations) over finite algebras. The research reported in this work was motivated by a desire to know for which finite algebras $\mathbf{A}$ there is a polynomial time algorithm that decides if an equation over $\mathbf{A}$ has a solution. We are also looking for polynomial time algorithms that decide if two circuits over a finite algebra compute the same function. Although we have not managed to solve these problems in the most general setting, we have obtained such a characterization for a very broad class of algebras from congruence modular varieties. This class includes most known and well-studied algebras such as groups, rings, modules (and their generalizations like quasigroups, loops, near-rings, nonassociative rings, Lie algebras), lattices (and their extensions like Boolean algebras, Heyting algebras or other algebras connected with multi-valued logics including MV-algebras). This paper seems to be the first systematic study of the computational complexity of satisfiability of non-Boolean circuits and solving equations over finite algebras. The characterization results provided by the paper are given in terms of nice structural properties of algebras for which the problems are solvable in polynomial time. Comment: 50 pages

    The Impact of Symmetry Handling for the Stable Set Problem via Schreier-Sims Cuts

    Symmetry handling inequalities (SHIs) are an appealing and popular tool for handling symmetries in integer programming. Despite their practical application, little is known about their interaction with optimization problems. This article focuses on Schreier-Sims (SST) cuts, a recently introduced family of SHIs, and investigates their impact on the computational and polyhedral complexity of optimization problems. Given that SST cuts are not unique, a crucial question is to understand how different constructions of SST cuts influence the solving process. First, we observe that SST cuts do not increase the computational complexity of solving a linear optimization problem over any polytope $P$. However, separating the integer hull of $P$ enriched by SST cuts can be NP-hard, even if $P$ is integral and has a compact formulation. We study this phenomenon in more depth for the stable set problem, particularly for subclasses of perfect graphs. For bipartite graphs, we give a complete characterization of the integer hull after adding SST cuts based on odd-cycle inequalities. For trivially perfect graphs, we observe that the separation problem is still NP-hard after adding a generic set of SST cuts. Our main contribution is to identify a specific class of SST cuts, called stringent SST cuts, that keeps the separation problem polynomial, and a complete set of inequalities, namely SST clique cuts, that yield a complete linear description. We complement these results by giving presolving techniques based on SST cuts and provide a computational study to compare the different approaches. In particular, our newly identified stringent SST cuts dominate other approaches.

    Best-case and worst-case sparsifiability of Boolean CSPs

    We continue the investigation of polynomial-time sparsification for NP-complete Boolean Constraint Satisfaction Problems (CSPs). The goal in sparsification is to reduce the number of constraints in a problem instance without changing the answer, such that a bound on the number of resulting constraints can be given in terms of the number of variables n. We investigate how the worst-case sparsification size depends on the types of constraints allowed in the problem formulation (the constraint language). Two algorithmic results are presented. The first result essentially shows that for any arity k, the only constraint type for which no nontrivial sparsification is possible has exactly one falsifying assignment, and corresponds to logical OR (up to negations). Our second result concerns linear sparsification, that is, a reduction to an equivalent instance with O(n) constraints. Using linear algebra over rings of integers modulo prime powers, we give an elegant necessary and sufficient condition for a constraint type to be captured by a degree-1 polynomial over such a ring, which yields linear sparsifications. The combination of these algorithmic results allows us to prove two characterizations that capture the optimal sparsification sizes for a range of Boolean CSPs. For NP-complete Boolean CSPs whose constraints are symmetric (the satisfaction depends only on the number of 1 values in the assignment, not on their positions), we give a complete characterization of which constraint languages allow for a linear sparsification. For Boolean CSPs in which every constraint has arity at most three, we characterize the optimal size of sparsifications in terms of the largest OR that can be expressed by the constraint language.
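    To illustrate why constraints captured by degree-1 polynomials lend themselves to linear sparsification, here is a minimal sketch for the simplest case only: parity constraints over the integers modulo 2, rather than the general prime-power rings the abstract refers to. The function name `sparsify_gf2` and the input encoding are assumptions made for this example. The idea is to keep a maximal linearly independent subset of the augmented rows (coefficients plus right-hand side); every dropped constraint is a sum of kept ones and is therefore implied, and at most n + 1 constraints survive.

```python
def sparsify_gf2(constraints, n):
    """Keep a maximal linearly independent subset of parity constraints.

    Each constraint is a pair (coeffs, rhs), where coeffs is a list of n
    bits and the constraint reads sum(coeffs[i] * x[i]) = rhs (mod 2).
    At most n + 1 constraints are returned.
    """
    basis = {}  # highest set bit -> reduced augmented row (as an int)
    kept = []
    for coeffs, rhs in constraints:
        # Encode the augmented row as a bitmask: bit 0 is the right-hand
        # side, bit i + 1 is the coefficient of variable i.
        row = rhs % 2
        for i in range(n):
            if coeffs[i] % 2:
                row |= 1 << (i + 1)
        # Reduce against the current basis (XOR is addition mod 2).
        while row:
            pivot = row.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = row  # independent of the kept rows
                kept.append((coeffs, rhs))
                break
            row ^= basis[pivot]
        # If row reached 0, the constraint is implied by the kept ones.
    return kept


example = [
    ([1, 1, 0], 1),  # x0 + x1 = 1
    ([0, 1, 1], 0),  # x1 + x2 = 0
    ([1, 0, 1], 1),  # x0 + x2 = 1, the sum of the two constraints above
    ([1, 1, 0], 1),  # duplicate of the first constraint
]
print(sparsify_gf2(example, 3))
```

    Because the kept rows span the same space as all rows, an inconsistent system stays inconsistent after sparsification, so the answer to the instance is unchanged.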

    A study of discrepancy results in partially ordered sets

    In 2001, Fishburn, Tanenbaum, and Trenk published a pair of papers that introduced the notions of linear and weak discrepancy of a partially ordered set or poset. Linear discrepancy for a poset is the least k such that some linear extension of the poset places every pair of incomparable points at most distance k apart in the ordering. Weak discrepancy is similar to linear discrepancy except that the distance is observed over weak labelings (i.e., two points can have the same label if they are incomparable, but order is still preserved). My thesis gives a variety of results pertaining to these properties and other forms of discrepancy in posets. The first chapter of my thesis partially answers a question of Fishburn, Tanenbaum, and Trenk, which was to characterize those posets with linear discrepancy two. It makes the characterization for those posets with width two and references the paper where the full characterization is given. The second chapter introduces the notion of t-discrepancy, which is similar to weak discrepancy except that only the weak labelings with at most t copies of any label are considered. This chapter shows that determining a poset's t-discrepancy is NP-complete. It also gives the t-discrepancy for the disjoint sum of chains and provides a polynomial time algorithm for determining t-discrepancy of semiorders. The third chapter presents another notion of discrepancy, namely total discrepancy, which minimizes the average distance between incomparable elements. This chapter proves that finding this value can be done in polynomial time, unlike linear discrepancy and t-discrepancy. The final chapter answers another question of Fishburn, Tanenbaum, and Trenk that asked to characterize those posets that have equal linear and weak discrepancies. Though determining whether the weak discrepancy and linear discrepancy of a poset are equal is an NP-complete problem, the set of minimal posets that have this property is given. 
At the end of the thesis I discuss two other open problems not mentioned in the previous chapters that relate to linear discrepancy. The first asks if there is a link between a poset's dimension and its linear discrepancy. The second refers to approximating linear discrepancy and possible ways to do it. Ph.D. Committee Chair: Trotter, William T.; Committee Member: Dieci, Luca; Committee Member: Duke, Richard; Committee Member: Randall, Dana; Committee Member: Tetali, Prasa
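    The notion of linear discrepancy described in the thesis abstract above can be made concrete with a brute-force sketch for very small posets. This is only an illustration under stated assumptions, not an algorithm from the thesis: the function name and input encoding are hypothetical, the order relation is assumed to be given as a transitively closed set of pairs, and the running time is exponential in the number of points.

```python
from itertools import permutations


def linear_discrepancy(elements, less_than):
    """Brute-force linear discrepancy of a small poset.

    `less_than` is a set of pairs (a, b) meaning a < b, assumed to be
    transitively closed. Returns the least k such that some linear
    extension keeps every incomparable pair at distance at most k.
    """
    elements = list(elements)
    incomparable = [
        (a, b)
        for i, a in enumerate(elements)
        for b in elements[i + 1:]
        if (a, b) not in less_than and (b, a) not in less_than
    ]
    best = None
    for order in permutations(elements):
        pos = {e: i for i, e in enumerate(order)}
        # Skip orderings that are not linear extensions of the poset.
        if any(pos[a] > pos[b] for a, b in less_than):
            continue
        worst = max((abs(pos[a] - pos[b]) for a, b in incomparable), default=0)
        if best is None or worst < best:
            best = worst
    return best


# A three-element antichain: the first and last points of any ordering
# are incomparable and two positions apart.
print(linear_discrepancy("abc", set()))

# A two-element chain a < b plus an isolated point c: placing c between
# a and b keeps every incomparable pair adjacent.
print(linear_discrepancy("abc", {("a", "b")}))
```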

    Robustness Verification of Tree-based Models

    We study the robustness verification problem for tree-based models, including decision trees, random forests (RFs) and gradient boosted decision trees (GBDTs). Formal robustness verification of decision tree ensembles involves finding the exact minimal adversarial perturbation or a guaranteed lower bound on it. Existing approaches find the minimal adversarial perturbation by solving a mixed integer linear programming (MILP) problem, which takes exponential time and is therefore impractical for large ensembles. Although this verification problem is NP-complete in general, we give a more precise complexity characterization. We show that there is a simple linear time algorithm for verifying a single tree, and for tree ensembles, the verification problem can be cast as a max-clique problem on a multi-partite graph with bounded boxicity. For low dimensional problems, when boxicity can be viewed as constant, this reformulation leads to a polynomial time algorithm. For general problems, by exploiting the boxicity of the graph, we develop an efficient multi-level verification algorithm that can give tight lower bounds on the robustness of decision tree ensembles, while allowing iterative improvement and any-time termination. On RF/GBDT models trained on 10 datasets, our algorithm is hundreds of times faster than the previous approach that requires solving MILPs, and is able to give tight robustness verification bounds on large GBDTs with hundreds of deep trees. Comment: Hongge Chen and Huan Zhang contributed equally
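    One plausible way to verify a single tree in time linear in its size is sketched below; it is an illustration under assumptions and is not claimed to be the paper's algorithm. The sketch assumes the L-infinity norm, a hypothetical tree encoding in which an internal node is a tuple (feature, threshold, left, right) that routes x to the left child when x[feature] <= threshold, and leaves that are plain labels. Each leaf corresponds to an axis-aligned box, and because boxes only shrink along a root-to-leaf path, the distance from x to the box can be maintained as a running maximum.

```python
import math


def min_adversarial_linf(tree, x, label):
    """Infimum L-infinity distance from x to any leaf region whose label
    differs from `label` (math.inf if no such region exists)."""
    lo = [-math.inf] * len(x)  # exclusive lower bounds of the current box
    hi = [math.inf] * len(x)   # inclusive upper bounds of the current box

    def visit(node, dist):
        if not isinstance(node, tuple):  # leaf: node is a class label
            return dist if node != label else math.inf
        f, t, left, right = node
        best = math.inf
        old_lo, old_hi = lo[f], hi[f]
        # Left branch: x[f] <= t. Skip it if the box would be empty.
        if old_lo < min(old_hi, t):
            hi[f] = min(old_hi, t)
            best = visit(left, max(dist, x[f] - hi[f]))
            hi[f] = old_hi
        # Right branch: x[f] > t. Skip it if the box would be empty.
        if max(old_lo, t) < old_hi:
            lo[f] = max(old_lo, t)
            best = min(best, visit(right, max(dist, lo[f] - x[f])))
            lo[f] = old_lo
        return best

    return visit(tree, 0.0)


# Hypothetical two-feature tree: predicts "B" only when x0 > 0.5 and x1 > 2.0.
tree = (0, 0.5, "A", (1, 2.0, "A", "B"))
print(min_adversarial_linf(tree, [0.25, 3.0], "A"))
print(min_adversarial_linf(tree, [1.0, 3.0], "B"))
```

    For right branches the bound is strict, so the returned value is an infimum that a perturbation must slightly exceed; each node is visited once with constant work per visit.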
