7,603 research outputs found
Tarmo: A Framework for Parallelized Bounded Model Checking
This paper investigates approaches to parallelizing Bounded Model Checking
(BMC) for shared memory environments as well as for clusters of workstations.
We present a generic framework for parallelized BMC named Tarmo. Our framework
can be used with any incremental SAT encoding for BMC but for the results in
this paper we use only the current state-of-the-art encoding for full PLTL.
Using this encoding allows us to check both safety and liveness properties,
contrary to an earlier work on distributing BMC that is limited to safety
properties only.
Despite our focus on BMC after it has been translated to SAT, existing
distributed SAT solvers are not well suited for our application. This is
because solving a BMC problem is not solving a set of independent SAT instances
but rather involves solving multiple related SAT instances, encoded
incrementally, where the satisfiability of each instance corresponds to the
existence of a counterexample of a specific length. Our framework includes a
generic architecture for a shared clause database that allows easy clause
sharing between SAT solver threads solving various such instances.
We present extensive experimental results obtained with multiple variants of
our Tarmo implementation. Our shared memory variants have a significantly
better performance than conventional single threaded approaches, which is a
result that many users can benefit from as multi-core and multi-processor
technology is widely available. Furthermore we demonstrate that our framework
can be deployed in a typical cluster of workstations, where several multi-core
machines are connected by a network
Parallel local search for solving Constraint Problems on the Cell Broadband Engine (Preliminary Results)
We explore the use of the Cell Broadband Engine (Cell/BE for short) for
combinatorial optimization applications: we present a parallel version of a
constraint-based local search algorithm that has been implemented on a
multiprocessor BladeCenter machine with twin Cell/BE processors (total of 16
SPUs per blade). This algorithm was chosen because it fits very well the
Cell/BE architecture and requires neither shared memory nor communication
between processors, while retaining a compact memory footprint. We study the
performance on several large optimization benchmarks and show that this
achieves mostly linear time speedups, even sometimes super-linear. This is
possible because the parallel implementation might explore simultaneously
different parts of the search space and therefore converge faster towards the
best sub-space and thus towards a solution. Besides getting speedups, the
resulting times exhibit a much smaller variance, which benefits applications
where a timely reply is critical
An adaptive prefix-assignment technique for symmetry reduction
This paper presents a technique for symmetry reduction that adaptively
assigns a prefix of variables in a system of constraints so that the generated
prefix-assignments are pairwise nonisomorphic under the action of the symmetry
group of the system. The technique is based on McKay's canonical extension
framework [J.~Algorithms 26 (1998), no.~2, 306--324]. Among key features of the
technique are (i) adaptability---the prefix sequence can be user-prescribed and
truncated for compatibility with the group of symmetries; (ii)
parallelizability---prefix-assignments can be processed in parallel
independently of each other; (iii) versatility---the method is applicable
whenever the group of symmetries can be concisely represented as the
automorphism group of a vertex-colored graph; and (iv) implementability---the
method can be implemented relying on a canonical labeling map for
vertex-colored graphs as the only nontrivial subroutine. To demonstrate the
practical applicability of our technique, we have prepared an experimental
open-source implementation of the technique and carry out a set of experiments
that demonstrate ability to reduce symmetry on hard instances. Furthermore, we
demonstrate that the implementation effectively parallelizes to compute
clusters with multiple nodes via a message-passing interface.Comment: Updated manuscript submitted for revie
Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates
The optimization of algorithm (hyper-)parameters is crucial for achieving
peak performance across a wide range of domains, ranging from deep neural
networks to solvers for hard combinatorial problems. The resulting algorithm
configuration (AC) problem has attracted much attention from the machine
learning community. However, the proper evaluation of new AC procedures is
hindered by two key hurdles. First, AC benchmarks are hard to set up. Second
and even more significantly, they are computationally expensive: a single run
of an AC procedure involves many costly runs of the target algorithm whose
performance is to be optimized in a given AC benchmark scenario. One common
workaround is to optimize cheap-to-evaluate artificial benchmark functions
(e.g., Branin) instead of actual algorithms; however, these have different
properties than realistic AC problems. Here, we propose an alternative
benchmarking approach that is similarly cheap to evaluate but much closer to
the original AC problem: replacing expensive benchmarks by surrogate benchmarks
constructed from AC benchmarks. These surrogate benchmarks approximate the
response surface corresponding to true target algorithm performance using a
regression model, and the original and surrogate benchmark share the same
(hyper-)parameter space. In our experiments, we construct and evaluate
surrogate benchmarks for hyperparameter optimization as well as for AC problems
that involve performance optimization of solvers for hard combinatorial
problems, drawing training data from the runs of existing AC procedures. We
show that our surrogate benchmarks capture overall important characteristics of
the AC scenarios, such as high- and low-performing regions, from which they
were derived, while being much easier to use and orders of magnitude cheaper to
evaluate
Estimating parallel runtimes for randomized algorithms in constraint solving
International audienceThis paper presents a detailed analysis of the scalability and par-allelization of Local Search algorithms for constraint-based and SAT (Boolean satisfiability) solvers. We propose a framework to estimate the parallel performance of a given algorithm by analyzing the runtime behavior of its sequential version. Indeed, by approximating the runtime distribution of the sequential process with statistical methods, the runtime behavior of the parallel process can be predicted by a model based on order statistics. We apply this approach to study the parallel performance of a Constraint-Based Local Search solver (Adaptive Search), two SAT Local Search solvers (namely Sparrow and CCASAT), and a propagation-based constraint solver (Gecode, with a random labeling heuristic). We compare the performance predicted by our model to actual parallel implementations of those methods using up to 384 processes. We show that the model is accurate and predicts performance close to the empirical data. Moreover, as we study different types of problems, we observe that the experimented solvers exhibit different behaviors and that their runtime distributions can be approximated by two types of distributions: exponential (shifted and non-shifted) and lognormal. Our results show that the proposed framework estimates the runtime of the parallel algorithm with an average discrepancy of 21% w.r.t. the empirical data across all the experiments with the maximum allowed number of processors for each technique
- …