
    Using Restarts in Constraint Programming over Finite Domains - An Experimental Evaluation

    The use of restart techniques in complete Satisfiability (SAT) algorithms has made solving hard real-world instances possible; without restarts, such algorithms could not solve those instances in practice. State-of-the-art SAT algorithms use restart techniques, conflict clause recording (nogoods), and heuristics based on variable activity in conflict clauses, among others. Algorithms for SAT and constraint problems share many techniques; however, restarts are not as widely used in constraint programming with finite domains (CP(FD)) as they are in SAT. We believe that the use of restarts in CP(FD) algorithms could also be the key to efficiently solving hard combinatorial problems. In this PhD thesis we study restarts and associated techniques in CP(FD) solvers. In particular, we propose to include restarts, nogoods, and nogood-based heuristics in a CP(FD) solver, as this should improve the search algorithms and, consequently, allow hard combinatorial problems to be solved efficiently. We thus intend to: a) implement restart techniques (successfully used in SAT) to solve constraint problems with finite domains; b) implement nogoods (learning) and heuristics based on nogoods, already in use in SAT and associated with restarts; and c) evaluate the use of restarts and their interplay with the other implemented techniques. We have conducted the study in the context of domain-splitting backtrack search algorithms with restarts. We have defined domain-splitting nogoods that are extracted from the last branch of the search algorithm before the restart, and, inspired by SAT solvers, we were able to use information within those nogoods to successfully guide the variable selection heuristics. A frequent restart strategy is also necessary, since our approach learns from restarts.
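
    The sketch below (in Python) illustrates the kind of machinery the thesis builds on: a domain-splitting backtrack search driven by a frequent, geometrically growing restart cutoff, with an activity-style variable heuristic fed by the last branch explored before each restart. It is a toy illustration under assumed names and data structures, not the thesis implementation, and it omits the actual domain-splitting nogood recording the abstract describes.

def run(domains, check, activity, budget):
    """One budgeted search run. Returns (solution_or_None, last_branch, failures)."""
    failures = 0
    last_branch = []

    def rec(doms, decisions):
        nonlocal failures, last_branch
        if failures > budget:                     # failure budget exhausted: cut off this run
            return None
        unfixed = [v for v, d in doms.items() if len(d) > 1]
        if not unfixed:                           # all domains are singletons: a leaf
            assignment = {v: d[0] for v, d in doms.items()}
            if check(assignment):
                return assignment
            failures += 1
            return None
        # heuristic: branch on the variable most active on previously failed branches
        var = max(unfixed, key=lambda v: (activity[v], -len(doms[v])))
        mid = len(doms[var]) // 2
        for half in (doms[var][:mid], doms[var][mid:]):   # domain splitting
            child = dict(doms)
            child[var] = half
            last_branch = decisions + [var]
            found = rec(child, decisions + [var])
            if found is not None:
                return found
        return None

    return rec(dict(domains), []), last_branch, failures


def solve(domains, check, start_cutoff=8, factor=1.3):
    """Frequent restarts with a slowly growing failure cutoff; variables seen on
    the last branch of a cut-off run get their activity bumped before restarting."""
    activity = {v: 0 for v in domains}
    cutoff = start_cutoff
    while True:
        solution, branch, failures = run(domains, check, activity, int(cutoff))
        if solution is not None:
            return solution
        if failures <= int(cutoff):               # run finished within budget: infeasible
            return None
        for var in branch:                        # learn (crudely) from the restarted run
            activity[var] += 1
        cutoff *= factor                          # geometric restart strategy


if __name__ == "__main__":
    # toy CSP: x, y, z in 1..20 with x + y == z and x * y == 12
    doms = {v: list(range(1, 21)) for v in "xyz"}
    print(solve(doms, lambda a: a["x"] + a["y"] == a["z"] and a["x"] * a["y"] == 12))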

    Approaches to grid-based SAT solving

    In this work we develop techniques for using distributed computing resources to efficiently solve instances of the propositional satisfiability problem (SAT). The computing resources considered in this work are assumed to be geographically distributed and connected by a non-dedicated network; such systems are typically referred to as computational grid environments. The time a modern SAT solver consumes while solving an instance varies according to a random distribution. Unlike many other methods for distributed SAT solving, this work identifies that random distribution as a valuable resource for reducing solving time. Methods which exploit randomness in the run times of a search algorithm, such as the ones discussed in this work, are examples of multi-search. The main contribution of this work is in developing and analyzing the multi-search approach in SAT solving and showing its efficiency with several experiments. For the purpose of the analysis, the work introduces a grid simulation model which captures several properties of a grid environment that are not observed in more traditional parallel computing systems. The work develops two algorithmic frameworks for multi-search in SAT. The first, SDSAT, is based on using properties of the distribution of the solving time so that the expected time required to solve an instance is reduced. Based on the analysis of SDSAT, the work proposes an algorithm for efficiently using a large number of computing resources simultaneously to solve collections of SAT instances. The analysis of SDSAT also motivates the second algorithmic framework, CL-SDSAT. This framework is used to efficiently solve many industrial SAT instances by carefully combining information learned in the distributed SAT solvers. All methods described in the work are directly applicable in a wide range of grid environments and can be used together with virtually unmodified state-of-the-art SAT solvers. The methods are experimentally verified using standard benchmark SAT instances in a production-level grid environment. The experiments show that, using the relatively simple methods developed in this work, SAT instances which cannot be solved efficiently in sequential settings can now be solved in a grid environment.
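
    The following small Python simulation (an illustration of the principle, not SDSAT or CL-SDSAT themselves) shows why multi-search helps: when a randomized solver's runtime follows a heavy-tailed distribution, racing several independent randomized runs and taking the first one to finish sharply reduces the expected solving time. The Pareto runtime model and the parameter values are assumptions made for the example.

import random
import statistics

def sample_runtime():
    """Hypothetical heavy-tailed solver runtime (Pareto-like), in seconds."""
    return random.paretovariate(1.2)

def multi_search_time(k):
    """Wall-clock time when k independent randomized runs race each other."""
    return min(sample_runtime() for _ in range(k))

if __name__ == "__main__":
    random.seed(0)
    trials = 5000
    for k in (1, 4, 16, 64):
        mean = statistics.mean(multi_search_time(k) for _ in range(trials))
        print(f"{k:3d} parallel runs -> mean solving time {mean:8.2f}s")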

    Text Similarity Between Concepts Extracted from Source Code and Documentation

    Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has often helped in the past with the aim of recovering links between code and documentation when the two fall out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If vastly different, the difference between the two sets might indicate a considerable ageing of the documentation and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to a set of key terms, each containing the concepts of one of the systems sampled. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative power, depending on the pre-training of the machine learning model. In particular, the SpaCy and FastText embeddings offer similarity scores of up to 80% and 90%, respectively. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy for one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
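
    A minimal Python sketch of the two comparison measures named in the abstract: the Jaccard index over the raw key-term sets, and a cosine similarity over term vectors. The paper's cosine comparison uses SpaCy and FastText embeddings; here a simple bag-of-words vector stands in so the example stays self-contained, and the example terms are invented.

import math
from collections import Counter

def jaccard(code_terms, doc_terms):
    """Jaccard index of the two key-term sets: intersection size over union size."""
    a, b = set(code_terms), set(doc_terms)
    return len(a & b) / len(a | b) if a | b else 1.0

def cosine(code_terms, doc_terms):
    """Cosine similarity of term-frequency vectors (a stand-in for embeddings)."""
    a, b = Counter(code_terms), Counter(doc_terms)
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

if __name__ == "__main__":
    code_terms = ["parser", "token", "config", "cache", "request"]
    doc_terms = ["parser", "token", "configuration", "request", "endpoint"]
    print("Jaccard:", round(jaccard(code_terms, doc_terms), 2))
    print("Cosine :", round(cosine(code_terms, doc_terms), 2))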