Balancing Scalability and Uniformity in SAT Witness Generator
Constrained-random simulation is the predominant approach used in the
industry for functional verification of complex digital designs. The
effectiveness of this approach depends on two key factors: the quality of
constraints used to generate test vectors, and the randomness of solutions
generated from a given set of constraints. In this paper, we focus on the
second problem, and present an algorithm that significantly improves the
state-of-the-art of (almost-)uniform generation of solutions of large Boolean
constraints. Our algorithm provides strong theoretical guarantees on the
uniformity of generated solutions and scales to problems involving hundreds of
thousands of variables. Comment: This is a full version of DAC 2014 paper
Sampling Techniques for Boolean Satisfiability
Boolean satisfiability (SAT) has played a key role in diverse areas
spanning testing, formal verification, planning, optimization, inferencing and
the like. Apart from the classical problem of checking Boolean satisfiability,
the problems of generating satisfying assignments uniformly at random, and of counting the
total number of satisfying assignments have also attracted significant
theoretical and practical interest over the years. Prior work offered heuristic
approaches with very weak or no guarantee of performance, and theoretical
approaches with proven guarantees, but poor performance in practice.
We propose a novel approach based on limited-independence hashing that allows
us to design algorithms for both problems, with strong theoretical guarantees
and scalability extending to thousands of variables. Based on this approach, we
present two practical algorithms, UniformWitness, a near-uniform generator,
and approxMC, the first scalable approximate model counter, along with
reference implementations. Our algorithms work by issuing polynomially many calls to
a SAT solver. We demonstrate scalability of our algorithms over a large set of
benchmarks arising from different application domains. Comment: MS Thesis submitted to Rice University
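The hashing idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes random XOR (parity) constraints as the hash family, and it replaces the SAT solver with brute-force enumeration so that the example stays self-contained. All function names and parameters (`solutions`, `random_xor`, `sample_via_hashing`, `max_cell`) are hypothetical.

```python
import itertools
import random

def solutions(clauses, n_vars):
    """Brute-force stand-in for a SAT solver: list all satisfying assignments."""
    result = []
    for bits in itertools.product([False, True], repeat=n_vars):
        # Literal k > 0 means variable k is true; k < 0 means variable |k| is false.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            result.append(bits)
    return result

def random_xor(n_vars, rng):
    """Random parity constraint: each variable joins with probability 1/2."""
    subset = [i for i in range(n_vars) if rng.random() < 0.5]
    return subset, rng.random() < 0.5

def in_cell(bits, xors):
    """True if the assignment satisfies every parity constraint."""
    return all((sum(bits[i] for i in subset) % 2 == 1) == parity
               for subset, parity in xors)

def sample_via_hashing(clauses, n_vars, n_xors, max_cell, rng, max_tries=1000):
    """Cut the solution space into cells with random XORs, then pick
    uniformly from one small, non-empty cell (assumed acceptance rule)."""
    all_sols = solutions(clauses, n_vars)
    for _ in range(max_tries):
        xors = [random_xor(n_vars, rng) for _ in range(n_xors)]
        cell = [s for s in all_sols if in_cell(s, xors)]
        if 1 <= len(cell) <= max_cell:
            return rng.choice(cell)
    return None  # no acceptable cell found within max_tries

# Toy formula: (x1 or x2) and (not x1 or x3)
formula = [[1, 2], [-1, 3]]
witness = sample_via_hashing(formula, n_vars=3, n_xors=1, max_cell=4,
                             rng=random.Random(0))
```

In a scalable implementation the cell would be enumerated by a SAT solver working on the formula conjoined with the XOR constraints, rather than by filtering a precomputed list of solutions.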
Uniform and scalable sampling of highly configurable systems
Many analyses on configurable software systems are intractable when confronted with
colossal and highly-constrained configuration spaces. These analyses could instead use
statistical inference, where a tractable sample accurately predicts results for the entire
space. To do so, the laws of statistical inference require each member of the population
to be equally likely to be included in the sample, i.e., the sampling process needs to be
“uniform”. SAT-samplers have been developed to generate uniform random samples at a
reasonable computational cost. However, there is a lack of experimental validation over
colossal spaces to show whether the samplers indeed produce uniform samples or not. This
paper (i) proposes a new sampler named BDDSampler, (ii) presents a new statistical test
to verify sampler uniformity, and (iii) reports the evaluation of BDDSampler and five
other state-of-the-art samplers: KUS, QuickSampler, Smarch, Spur, and Unigen2. Our
experimental results show only BDDSampler satisfies both scalability and uniformity.
Universidad Nacional de Educación a Distancia (UNED) OPTIVAC 096-034091 2021V/PUNED/008; Ministerio de Ciencia, Innovación y Universidades RTI2018-101204-B-C22 (OPHELIA); Comunidad Autónoma de Madrid ROBOCITY2030-DIH-CM S2018/NMT-4331; Agencia Estatal de Investigación TIN2017-90644-RED
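To make the notion of checking sampler uniformity concrete, here is a minimal sketch using the textbook Pearson chi-square goodness-of-fit statistic. This is a swapped-in technique, not the new test proposed in the paper above, and it assumes a population small enough to enumerate, which is exactly what colossal configuration spaces rule out. The function name `chi_square_uniformity` is hypothetical.

```python
from collections import Counter

def chi_square_uniformity(samples, population):
    """Pearson chi-square statistic against the uniform distribution.

    Returns (statistic, degrees_of_freedom). The caller compares the
    statistic with a chi-square critical value for the chosen significance
    level; a large statistic is evidence against uniformity.
    """
    counts = Counter(samples)
    expected = len(samples) / len(population)  # same expected count for every member
    statistic = sum((counts.get(item, 0) - expected) ** 2 / expected
                    for item in population)
    return statistic, len(population) - 1

population = ["a", "b", "c", "d"]
balanced = population * 25   # every member drawn exactly 25 times
skewed = ["a"] * 100         # sampler that always returns the same member

stat_balanced, dof = chi_square_uniformity(balanced, population)
stat_skewed, _ = chi_square_uniformity(skewed, population)
```

With 3 degrees of freedom, the 5% critical value is about 7.81, so the balanced sample is consistent with uniformity and the skewed one is not.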
Flexible constrained sampling with guarantees for pattern mining
Pattern sampling has been proposed as a potential solution to the infamous
pattern explosion. Instead of enumerating all patterns that satisfy the
constraints, individual patterns are sampled proportional to a given quality
measure. Several sampling algorithms have been proposed, but each of them has
its limitations when it comes to 1) flexibility in terms of quality measures
and constraints that can be used, and/or 2) guarantees with respect to sampling
accuracy. We therefore present Flexics, the first flexible pattern sampler that
supports a broad class of quality measures and constraints, while providing
strong guarantees regarding sampling accuracy. To achieve this, we leverage the
perspective on pattern mining as a constraint satisfaction problem and build
upon the latest advances in sampling solutions in SAT as well as existing
pattern mining algorithms. Furthermore, the proposed algorithm is applicable to
a variety of pattern languages, which allows us to introduce and tackle the
novel task of sampling sets of patterns. We introduce and empirically evaluate
two variants of Flexics: 1) a generic variant that addresses the well-known
itemset sampling task and the novel pattern set sampling task as well as a wide
range of expressive constraints within these tasks, and 2) a specialized
variant that exploits existing frequent itemset techniques to achieve
substantial speed-ups. Experiments show that Flexics is both accurate and
efficient, making it a useful tool for pattern-based data exploration. Comment: Accepted for publication in Data Mining & Knowledge Discovery journal
(ECML/PKDD 2017 journal track)
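The task described above, sampling patterns in proportion to a quality measure subject to constraints, can be sketched on a toy dataset. The sketch assumes support (frequency) as the quality measure and a minimum-support constraint, and it enumerates every itemset explicitly, which is precisely the step a sampler such as Flexics is designed to avoid. It is therefore only an illustration of the target distribution, not of the Flexics algorithm. The names `support` and `sample_patterns` are hypothetical.

```python
import itertools
import random

def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def sample_patterns(transactions, k, rng, min_support=1):
    """Draw k itemsets with probability proportional to their support,
    restricted to itemsets that meet the minimum-support constraint."""
    items = sorted(set().union(*transactions))
    patterns = [frozenset(c)
                for r in range(1, len(items) + 1)
                for c in itertools.combinations(items, r)]
    weighted = [(p, support(p, transactions)) for p in patterns]
    weighted = [(p, w) for p, w in weighted if w >= min_support]
    if not weighted:
        return []  # no pattern satisfies the constraint
    pats, weights = zip(*weighted)
    return rng.choices(pats, weights=weights, k=k)

transactions = [{"a", "b"}, {"a"}, {"a", "c"}]
draws = sample_patterns(transactions, k=5, rng=random.Random(1), min_support=2)
```

In this toy dataset only the itemset {a} reaches a support of 2, so every draw returns it; lowering `min_support` to 1 lets the less frequent itemsets appear with proportionally smaller probability.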
Learning what matters - Sampling interesting patterns
In the field of exploratory data mining, local structure in data can be
described by patterns and discovered by mining algorithms. Although many
solutions have been proposed to address the redundancy problems in pattern
mining, most of them either provide succinct pattern sets or take the interests
of the user into account, but not both. Consequently, the analyst has to invest
substantial effort in identifying those patterns that are relevant to her
specific interests and goals. To address this problem, we propose a novel
approach that combines pattern sampling with interactive data mining. In
particular, we introduce the LetSIP algorithm, which builds upon recent
advances in 1) weighted sampling in SAT and 2) learning to rank in interactive
pattern mining. Specifically, it exploits user feedback to directly learn the
parameters of the sampling distribution that represents the user's interests.
We compare the performance of the proposed algorithm to the state-of-the-art in
interactive pattern mining by emulating the interests of a user. The resulting
system allows efficient and interleaved learning and sampling, and thus
user-specific anytime data exploration. Finally, LetSIP demonstrates favourable
trade-offs concerning both quality-diversity and exploitation-exploration when
compared to existing methods. Comment: PAKDD 2017, extended version