    Scale-Free Random SAT Instances

    We focus on the random generation of SAT instances that have properties similar to real-world instances. It is known that many industrial instances, even with a great number of variables, can be solved by a clever solver in a reasonable amount of time. This is not possible, in general, with classical randomly generated instances. We provide a different generation model of SAT instances, called scale-free random SAT instances. It is based on the use of a non-uniform probability distribution $P(i)\sim i^{-\beta}$ to select variable $i$, where $\beta$ is a parameter of the model. This results in formulas where the number of occurrences $k$ of variables follows a power-law distribution $P(k)\sim k^{-\delta}$, where $\delta = 1 + 1/\beta$. This property has been observed in most real-world SAT instances. For $\beta=0$, our model extends classical random SAT instances. We prove the existence of a SAT-UNSAT phase transition phenomenon for scale-free random 2-SAT instances with $\beta<1/2$ when the clause/variable ratio is $m/n=\frac{1-2\beta}{(1-\beta)^2}$. We also prove that scale-free random $k$-SAT instances are unsatisfiable with high probability when the number of clauses exceeds $\omega(n^{(1-\beta)k})$. The proof of this result suggests that, when $\beta>1-1/k$, the unsatisfiability of most formulas may be due to small cores of clauses. Finally, we show how this model will allow us to generate random instances similar to industrial instances, of interest for testing purposes.
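
The generation model described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' generator: the function names, the rejection of clauses with repeated variables, and the uniformly random literal signs are assumptions made for illustration. Only the selection weight $i^{-\beta}$ and the 2-SAT ratio $(1-2\beta)/(1-\beta)^2$ are taken from the abstract.

```python
import random
from itertools import accumulate


def scale_free_ksat(n, m, k, beta, seed=None):
    """Return m clauses of k literals over variables 1..n.

    Variable i is drawn with probability proportional to i**(-beta),
    as in the abstract. Hypothetical helper, not the authors' code.
    """
    if k > n:
        raise ValueError("need at least k variables")
    rng = random.Random(seed)
    # Cumulative weights for the non-uniform distribution P(i) ~ i^(-beta).
    cum = list(accumulate(i ** (-beta) for i in range(1, n + 1)))
    variables = range(1, n + 1)
    clauses = []
    while len(clauses) < m:
        chosen = rng.choices(variables, cum_weights=cum, k=k)
        if len(set(chosen)) < k:
            continue  # assumption: discard clauses with a repeated variable
        # Assumption: each literal is negated with probability 1/2.
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return clauses


def two_sat_threshold(beta):
    """Clause/variable ratio (1 - 2*beta) / (1 - beta)**2, stated for beta < 1/2."""
    if not 0 <= beta < 0.5:
        raise ValueError("the abstract states the threshold only for beta < 1/2")
    return (1 - 2 * beta) / (1 - beta) ** 2


if __name__ == "__main__":
    formula = scale_free_ksat(n=50, m=100, k=3, beta=0.8, seed=1)
    print(formula[:3])
    print(two_sat_threshold(0.25))
```

With `beta = 0` every variable has the same weight, so the sketch falls back to uniform variable selection, and `two_sat_threshold(0)` gives the ratio 1.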

    Community structure in industrial SAT instances

    Modern SAT solvers have made remarkable progress on solving industrial instances. It is believed that most of these successful techniques exploit the underlying structure of industrial instances. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, we study the community structure, or modularity, of industrial SAT instances. In a graph with clear community structure, or high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. Representing SAT instances as graphs, we show that most application benchmarks are characterized by a high modularity. In contrast, random SAT instances are closer to the classical Erdős-Rényi random graph model, where no structure can be observed. We also analyze how this structure evolves under the execution of a CDCL SAT solver, and observe that new clauses learned by the solver during the search contribute to destroying the original structure of the formula. Motivated by this observation, we finally present an application that exploits the community structure to detect relevant learned clauses, and we show that detecting these clauses improves the performance of the SAT solver. Empirically, we observe that this improves the performance of several SAT solvers on industrial SAT formulas, especially on satisfiable instances. Peer Reviewed. Postprint (published version)

    Community Structure in Industrial SAT Instances

    Modern SAT solvers have made remarkable progress on solving industrial instances. Most of the techniques have been developed after an intensive experimental process. It is believed that these techniques exploit the underlying structure of industrial instances. However, few works try to characterize exactly the main features of this structure. The research community on complex networks has developed analysis techniques and algorithms to study real-world graphs that can be used by the SAT community. Recently, there have been some attempts to analyze the structure of industrial SAT instances in terms of complex networks, with the aim of explaining the success of SAT solving techniques, and possibly improving them. In this paper, inspired by the results on complex networks, we study the community structure, or modularity, of industrial SAT instances. In a graph with clear community structure, or high modularity, we can find a partition of its nodes into communities such that most edges connect variables of the same community. In our analysis, we represent SAT instances as graphs, and we show that most application benchmarks are characterized by a high modularity. In contrast, random SAT instances are closer to the classical Erdős-Rényi random graph model, where no structure can be observed. We also analyze how this structure evolves under the execution of a CDCL SAT solver. In particular, we use the community structure to detect that new clauses learned by the solver during the search contribute to destroying the original structure of the formula. That is, learned clauses tend to contain variables of distinct communities.
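
The notion of modularity used in this abstract can be made concrete with a short Python sketch. The abstract does not specify the graph construction, so the version below assumes a variable incidence graph in which each clause spreads a total weight of 1 over the pairs of variables it contains, and it scores a given partition with the standard weighted modularity formula. The function names are hypothetical, and finding a good partition (community detection) is out of scope here.

```python
from collections import defaultdict
from itertools import combinations


def variable_incidence_graph(clauses):
    """Map each variable pair (u, v), u < v, to an edge weight.

    Assumption: every clause contributes a total weight of 1,
    split evenly over its variable pairs.
    """
    weights = defaultdict(float)
    for clause in clauses:
        variables = sorted({abs(lit) for lit in clause})
        pairs = list(combinations(variables, 2))
        for pair in pairs:
            weights[pair] += 1.0 / len(pairs)
    return dict(weights)


def modularity(edge_weights, community):
    """Weighted modularity Q of a partition given as {variable: community id}."""
    total = sum(edge_weights.values())
    if total == 0:
        return 0.0
    internal = defaultdict(float)  # weight of edges inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for (u, v), w in edge_weights.items():
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)


if __name__ == "__main__":
    # Two triangles of binary clauses joined by the single clause (3, 4).
    clauses = [(1, 2), (2, -3), (1, 3), (4, -5), (5, 6), (-4, 6), (3, 4)]
    graph = variable_incidence_graph(clauses)
    split = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
    print(modularity(graph, split))
```

In the toy formula, placing each triangle in its own community gives a positive score, while putting all variables in one community gives 0, which matches the intuition that high modularity means most edges stay inside communities.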

    Characterizing the Temperature of SAT Formulas

    The remarkable advances in SAT solving achieved in recent years have made it possible to use this technology to solve many real-world applications, such as planning, formal verification and cryptography, among others. Interestingly, these industrial SAT problems are commonly believed to be easier than classical random SAT formulas, but estimating their actual hardness is still a very challenging question, which in some cases even requires solving them. In this context, realistic pseudo-industrial random SAT generators have emerged with the aim of reproducing the main features of these application problems to better understand the success of those SAT solving techniques on them. In this work, we present a model to estimate the temperature of real-world SAT instances. This temperature represents the degree of distortion of the expected structure of the formula, from highly structured benchmarks (more similar to real-world SAT instances) to the complete absence of structure (observed in the classical random SAT model). Our solution is based on the popularity–similarity random model for SAT, which has been recently presented to reproduce two crucial features of application SAT benchmarks: scale-free and community structures. This model is able to control the hardness of the generated formula by introducing some randomization in the expected structure. Using our regression model, we observe that the estimated temperature of the application benchmarks used in the last SAT Competitions correlates with their hardness in most cases. Juan de la Cierva program, fellowship IJC2019-040489-I, funded by MCIN and AE

    Scale-Free Random SAT Instances

    We focus on the random generation of SAT instances that have properties similar to real-world instances. It is known that many industrial instances, even with a great number of variables, can be solved by a clever solver in a reasonable amount of time. This is not possible, in general, with classical randomly generated instances. We provide a different generation model of SAT instances, called scale-free random SAT instances. This is based on the use of a non-uniform probability distribution $P(i)\sim i^{-\beta}$ to select variable $i$, where $\beta$ is a parameter of the model. This results in formulas where the number of occurrences $k$ of variables follows a power-law distribution $P(k)\sim k^{-\delta}$, where $\delta = 1 + 1/\beta$. This property has been observed in most real-world SAT instances. For $\beta=0$, our model extends classical random SAT instances. We prove the existence of a SAT-UNSAT phase transition phenomenon for scale-free random 2-SAT instances with $\beta<1/2$ when the clause/variable ratio is $m/n=\frac{1-2\beta}{(1-\beta)^2}$. We also prove that scale-free random $k$-SAT instances are unsatisfiable with a high probability when the number of clauses exceeds $\omega(n^{(1-\beta)k})$. The proof of this result suggests that, when $\beta>1-1/k$, the unsatisfiability of most formulas may be due to small cores of clauses. Finally, we show how this model will allow us to generate random instances similar to industrial instances, of interest for testing purposes. This research was supported by the project PROOFS, Grant PID2019-109137GB-C21 funded by MCIN/AEI/10.13039/501100011033

    On the Hardness of SAT with Community Structure

    Recent attempts to explain the effectiveness of Boolean satisfiability (SAT) solvers based on conflict-driven clause learning (CDCL) on large industrial benchmarks have focused on the concept of community structure. Specifically, industrial benchmarks have been empirically found to have good community structure, and experiments seem to show a correlation between such structure and the efficiency of CDCL. However, in this paper we establish hardness results suggesting that community structure is not sufficient to explain the success of CDCL in practice. First, we formally characterize a property shared by a wide class of metrics capturing community structure, including "modularity". Next, we show that the SAT instances with good community structure according to any metric with this property are still NP-hard. Finally, we study a class of random instances generated from the "pseudo-industrial" community attachment model of Giráldez-Cru and Levy. We prove that, with high probability, instances from this model that have relatively few communities but are still highly modular require exponentially long resolution proofs and so are hard for CDCL. We also present experimental evidence that our result continues to hold for instances with many more communities. This indicates that actual industrial instances easily solved by CDCL may have some other relevant structure not captured by the community attachment model. Comment: 23 pages. Full version of a SAT 2016 paper

    Machine Learning for SAT: Restricted Heuristics and New Graph Representations

    Boolean satisfiability (SAT) is a fundamental NP-complete problem with many applications, including automated planning and scheduling. To solve large instances, SAT solvers have to rely on heuristics, e.g., choosing a branching variable in DPLL and CDCL solvers. Such heuristics can be improved with machine learning (ML) models; they can reduce the number of steps but usually increase the running time because useful models are relatively large and slow. We suggest the strategy of making a few initial steps with a trained ML model and then releasing control to classical heuristics; this simplifies cold start for SAT solving and can decrease both the number of steps and overall runtime, but requires a separate decision of when to release control to the solver. Moreover, we introduce a modification of Graph-Q-SAT tailored to SAT problems converted from other domains, e.g., open shop scheduling problems. We validate the feasibility of our approach with random and industrial SAT problems.

    Beyond the structure of SAT formulas

    Nowadays, many real-world problems are encoded into SAT instances and efficiently solved by modern SAT solvers. These solvers, usually known as Conflict-Driven Clause Learning (CDCL) SAT solvers, include a variety of sophisticated techniques, such as clause learning, lazy data structures, conflict-based adaptive branching heuristics, or random restarts, among others. However, the reasons for their efficiency in solving real-world, or industrial, SAT instances are still unknown. The common wisdom in the SAT community is that these techniques exploit some hidden structure of real-world problems. In this thesis, we first characterize some important features of the underlying structure of industrial SAT instances, namely the community structure and the self-similar structure. We observe that most industrial SAT formulas, viewed as graphs, have these two properties. This means that (i) in a graph with a clear community structure, i.e., having high modularity, we can find a partition of its nodes into communities such that most edges connect nodes of the same community; and (ii) in a graph with a self-similar pattern, i.e., being fractal, its shape is kept after re-scalings, i.e., grouping sets of nodes into a single node. We also analyze how these structures are affected by CDCL techniques during the search. Using the previous structural studies, we propose three applications. First, we address the problem of generating pseudo-industrial random SAT instances using the notion of modularity. Our model generates instances similar to (classical) random SAT formulas when the modularity is low, but when this value is high, our model is also adequate to model realistic pseudo-industrial problems. Second, we propose a method based on the community structure of the instance to detect relevant learnt clauses. Our technique augments the original instance with this set of relevant clauses, and this results in an overall improvement of the efficiency of several state-of-the-art CDCL SAT solvers. Finally, we analyze the classification of industrial SAT instances into families using the previously analyzed structure features, and we compare them to other classifiers commonly used in portfolio SAT approaches. In summary, this dissertation extends our understanding of the structure of SAT instances, with the aim of better explaining the success of CDCL techniques and possibly improving them, and proposes a number of applications based on this analysis of the underlying structure of SAT formulas.