196 research outputs found

    Understanding and Improving SAT Solvers via Proof Complexity and Reinforcement Learning

    Despite the fact that the Boolean satisfiability (SAT) problem is NP-complete and believed to be intractable, SAT solvers are routinely used by practitioners to solve hard problems in a wide variety of fields such as software engineering, formal methods, security, and AI. This gap between theory and practice has motivated an entire line of research whose primary goals are twofold: first, to develop a better theoretical framework aimed at accurately capturing solver behavior and thus prove tighter complexity bounds; and second, to further experimentally verify the soundness of the theory thus developed via rigorous empirical analysis and design theory-inspired techniques to improve solver performance. This interplay between theory and practice is at the heart of the work presented here. More precisely, this thesis contains a collection of results which attempt to resolve the above-described discrepancy between theory and practice. The first two sets of results are centered around the restart problem. Restarts are classes of methods which aim at erasing part of the progress a solver may have made at run time, in order to help solvers escape from the "bad parts" of the search space. We provide a detailed theoretical analysis of the power of restarts used in modern Conflict-Driven Clause Learning (CDCL) SAT solvers, where we prove a series of equivalence and separation results for various configurations of solvers with and without restarts. From the intuition developed via this theoretical analysis, we design and implement a machine learning based reset policy, where resets are variants of restarts that erase activity scores in addition to the parts of the solver state erased by restarts. We perform extensive experimental work to show that our reset policy outperforms both baseline and state-of-the-art solvers over a class of cryptographic instances derived from Bitcoin mining problems.
In a different direction, we propose the concept of hierarchical community structure (HCS) for Boolean formulas. We first theoretically show that formulas with "good" HCS parameter values have short CDCL proofs. Then we construct an Empirical Hardness Model using the HCS parameters. These HCS parameters exhibit a robust correlation with solver run time, leading to the development of a classifier capable of accurately distinguishing between easily solvable industrial instances and challenging random/crafted instances. We also present scaling studies of formulas with HCS structures to further support our theoretical analysis. In the latter part of the thesis, the focus shifts to satisfaction-driven clause-learning (SDCL) solvers, known to be exponentially more powerful than CDCL solvers. Despite the theoretical strength of SDCL, it remains a challenge to automate and determinize such solvers. To address this, we again leverage machine learning techniques to strategically decide when to invoke an SDCL subroutine, with the goal of minimizing the associated overhead. The resulting SDCL solver, enhanced with MaxSAT techniques and conflict analysis, outperforms existing solvers on combinatorial benchmarks, particularly demonstrating superior efficacy on Mutilated Chess Board (MCB) problems.

    Understanding the Relative Strength of QBF CDCL Solvers and QBF Resolution

    QBF solvers implementing the QCDCL paradigm are powerful algorithms that successfully tackle many computationally complex applications. However, our theoretical understanding of the strength and limitations of these QCDCL solvers is very limited. In this paper we propose to formally model QCDCL solvers as proof systems. We define different policies that can be used for decision heuristics and unit propagation and that give rise to a number of sound and complete QBF proof systems (and hence new QCDCL algorithms). With respect to the standard policies used in practical QCDCL solving, we show that the corresponding QCDCL proof system is incomparable (via exponential separations) to Q-resolution, the classical QBF resolution system used in the literature. This is in stark contrast to the propositional setting, where CDCL and resolution are known to be p-equivalent. This raises the question of which formulas are hard for standard QCDCL, since Q-resolution lower bounds do not necessarily apply to QCDCL, as we show here. In answer to this question we prove several lower bounds for QCDCL, including exponential lower bounds for a large class of random QBFs. We also introduce a strengthening of the decision heuristic used in classical QCDCL, which does not necessarily decide variables in order of the prefix, but still allows asserting clauses to be learned. We show that with this decision policy, QCDCL can be exponentially faster on some formulas. We further exhibit a QCDCL proof system that is p-equivalent to Q-resolution. In comparison to classical QCDCL, this new QCDCL version adapts both decision and unit propagation policies.

    Towards a Theoretical Understanding of the Power of Restart in SAT solvers

    Restart policies are a widely used class of techniques integral to the efficiency of conflict-driven clause-learning (CDCL) SAT solvers. While the utility of such policies has been well established, to date we still lack a deep theoretical understanding of why restart policies are crucial to the power of CDCL SAT solvers. In this paper, we provide a series of results that theoretically establish the power of restarts for various models of Boolean SAT solvers. More precisely, we make the following contributions. First, we show that certain models of CDCL solvers with restarts are no more powerful from a proof-complexity theoretic point of view than the same configurations without restarts. Second, we define decision depth for DPLL proofs of an unsatisfiable formula φ, and then we relate the decision depth of φ and the size of DPLL proofs (or the running time of a DPLL-based solver) for φ. Third, we introduce a new class of satisfiable instances called Ladder_n, then we use decision depth as a tool to prove that a drunk-style DPLL solver with restarts can solve Ladder_n in polynomial time with high probability, while solvers with the same configuration without restarts have exponential run time with high probability. Finally, the crucial insight that drives this line of research is the fact that restarts add proof-theoretic or algorithmic power to solver configurations by compensating for the weaknesses of some other important heuristics like branching or value selection or clause learning.

    Limits of CDCL Learning via Merge Resolution

    In their seminal work, Atserias et al. and independently Pipatsrisawat and Darwiche in 2009 showed that CDCL solvers can simulate resolution proofs with polynomial overhead. However, previous work does not address the tightness of the simulation, i.e., the question of how large this overhead needs to be. In this paper, we address this question by focusing on an important property of proofs generated by CDCL solvers that employ standard learning schemes, namely that the derivation of a learned clause has at least one inference where a literal appears in both premises (a.k.a. a merge literal). Specifically, we show that proofs of this kind can simulate resolution proofs with at most a linear overhead, but there also exist formulas where such overhead is necessary or, more precisely, that there exist formulas with resolution proofs of linear length that require quadratic CDCL proofs.
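
The notion of a merge literal in the abstract above can be made concrete with a minimal sketch. It assumes clauses are sets of DIMACS-style integer literals (`3` for x3, `-3` for its negation); the function name `resolve` and this representation are illustrative choices, not taken from the paper. The sketch performs one resolution step and reports which literals occur in both premises, and it does not check for tautological resolvents.

```python
from typing import FrozenSet, Tuple

# Assumption: a clause is a frozenset of non-zero ints (DIMACS-style literals).
Clause = FrozenSet[int]


def resolve(c1: Clause, c2: Clause, pivot: int) -> Tuple[Clause, FrozenSet[int]]:
    """Resolve c1 (containing pivot) with c2 (containing -pivot).

    Returns the resolvent and the set of merge literals, i.e. the
    literals other than the pivot that appear in both premises.
    """
    if pivot not in c1 or -pivot not in c2:
        raise ValueError("pivot must occur positively in c1 and negatively in c2")
    rest1 = c1 - {pivot}
    rest2 = c2 - {-pivot}
    resolvent = rest1 | rest2      # union drops the duplicated literals
    merge_literals = rest1 & rest2  # non-empty means this step is a merge
    return resolvent, merge_literals


if __name__ == "__main__":
    # (x1 or x2 or x3) resolved with (not x1 or x2 or x4) on x1: x2 is merged.
    res, merges = resolve(frozenset({1, 2, 3}), frozenset({-1, 2, 4}), 1)
    print(sorted(res), sorted(merges))
```

A step with a non-empty set of merge literals is a merge in the sense used above; when the set is empty, the resolvent simply concatenates the two premises minus the pivot.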

    Hardness measures and resolution lower bounds

    Various "hardness" measures have been studied for resolution, providing theoretical insight into the proof complexity of resolution and its fragments, as well as explanations for the hardness of instances in SAT solving. In this report we aim at a unified view of a number of hardness measures, including different measures of width, space and size of resolution proofs. We also extend these measures to all clause-sets (possibly satisfiable). Comment: 43 pages, preliminary version (yet the application part is only sketched, with proofs missing)

    Understanding and Enhancing CDCL-based SAT Solvers

    Modern conflict-driven clause-learning (CDCL) Boolean satisfiability (SAT) solvers routinely solve formulas from industrial domains with millions of variables and clauses, despite the Boolean satisfiability problem being NP-complete and widely regarded as intractable in general. At the same time, very small crafted or randomly generated formulas are often infeasible for CDCL solvers. A commonly proposed explanation is that these solvers somehow exploit the underlying structure inherent in industrial instances. A better understanding of the structure of Boolean formulas not only enables improvements to modern SAT solvers, but also lends insight as to why solvers perform well or poorly on certain types of instances. Even further, examining solvers through the lens of these underlying structures can help to distinguish the behavior of different solving heuristics, both in theory and practice. The first issue we address relates to the representation of SAT formulas. A given Boolean satisfiability problem can be represented in arbitrarily many ways, and the type of encoding can have significant effects on SAT solver performance. Further, in some cases, a direct encoding to SAT may not be the best choice. We introduce a new system that integrates SAT solving with computer algebra systems (CAS) to address representation issues for several graph-theoretic problems. We use this system to improve the bounds on several finitely-verified conjectures related to graph-theoretic problems. We demonstrate how our approach is more appropriate for these problems than other off-the-shelf SAT-based tools. For more typical SAT formulas, a better understanding of their underlying structural properties, and how they relate to SAT solving, can deepen our understanding of SAT. We perform a large-scale evaluation of many of the popular structural measures of formulas, such as community structure, treewidth, and backdoors.
We investigate how these parameters correlate with CDCL solving time, and whether they can effectively be used to distinguish formulas from different domains. We demonstrate how these measures can be used as a means to understand the behavior of solvers during search. A common theme is that the solver exhibits locality during search through the lens of these underlying structures, and that the choice of solving heuristic can greatly influence this locality. We posit that this local behavior of modern SAT solvers is crucial to their performance. The remaining contributions dive deeper into two new measures of SAT formulas. We first consider a simple measure, denoted “mergeability,” which characterizes the proportion of input clause pairs that can resolve and merge. We develop a formula generator that takes as input a seed formula, and creates a sequence of increasingly mergeable formulas, while maintaining many of the properties of the original formula. Experiments over randomly-generated industrial-like instances suggest that mergeability strongly negatively correlates with CDCL solving time, i.e., as the mergeability of formulas increases, the solving time decreases, particularly for unsatisfiable instances. Our final contribution considers whether one of the aforementioned measures, namely backdoor size, is influenced by solver heuristics in theory. Starting from the notion of learning-sensitive (LS) backdoors, we consider various extensions of LS backdoors by incorporating different branching heuristics and restart policies. We introduce learning-sensitive with restarts (LSR) backdoors and show that, when backjumping is disallowed, LSR backdoors may be exponentially smaller than LS backdoors. We further demonstrate that the size of LSR backdoors is dependent on the learning scheme used during search.
Finally, we present new algorithms to compute upper bounds on LSR backdoors that intrinsically rely upon restarts, and can be computed with a single run of a SAT solver. We empirically demonstrate that this can often produce smaller backdoors than previous approaches to computing LS backdoors.

    A game characterisation of tree-like Q-Resolution size

    We provide a characterisation for the size of proofs in tree-like Q-Resolution and tree-like QU-Resolution by a Prover–Delayer game, which is inspired by a similar characterisation for the proof size in classical tree-like Resolution. This gives one of the first successful transfers of one of the lower bound techniques for classical proof systems to QBF proof systems. We apply our technique to show the hardness of three classes of formulas for tree-like Q-Resolution. In particular, we give a proof of the hardness of the parity formulas from Beyersdorff et al. (2015) for tree-like Q-Resolution and of the formulas of Kleine Büning et al. (1995) for tree-like QU-Resolution.