669 research outputs found
An Infinite Needle in a Finite Haystack: Finding Infinite Counter-Models in Deductive Verification
First-order logic, and quantifiers in particular, are widely used in
deductive verification. Quantifiers are essential for describing systems with
unbounded domains, but prove difficult for automated solvers. Significant
effort has been dedicated to finding quantifier instantiations that establish
unsatisfiability, thus ensuring validity of a system's verification conditions.
However, in many cases the formulas are satisfiable: this is often the case in
intermediate steps of the verification process. For such cases, existing tools
are limited to finding finite models as counterexamples. Yet, some quantified
formulas are satisfiable but only have infinite models. Such infinite
counter-models are especially typical when first-order logic is used to
approximate inductive definitions such as linked lists or the natural numbers.
The inability of solvers to find infinite models makes them diverge in these
cases. In this paper, we tackle the problem of finding such infinite models.
These models allow the user to identify and fix bugs in the modeling of the
system and its properties. Our approach consists of three parts. First, we
introduce symbolic structures as a way to represent certain infinite models.
Second, we describe an effective model finding procedure that symbolically
explores a given family of symbolic structures. Finally, we identify a new
decidable fragment of first-order logic that extends and subsumes the
many-sorted variant of EPR, where satisfiable formulas always have a model
representable by a symbolic structure within a known family. We evaluate our
approach on examples from the domains of distributed consensus protocols and of
heap-manipulating programs. Our implementation quickly finds infinite
counter-models that demonstrate the source of verification failures in a simple
way, while SMT solvers and theorem provers such as Z3, cvc5, and Vampire
diverge
Testing probability distributions underlying aggregated data
In this paper, we analyze and study a hybrid model for testing and learning
probability distributions. Here, in addition to samples, the testing algorithm
is provided with one of two different types of oracles to the unknown
distribution over . More precisely, we define both the dual and
cumulative dual access models, in which the algorithm can both sample from
and respectively, for any ,
- query the probability mass (query access); or
- get the total mass of , i.e. (cumulative
access)
These two models, by generalizing the previously studied sampling and query
oracle models, allow us to bypass the strong lower bounds established for a
number of problems in these settings, while capturing several interesting
aspects of these problems -- and providing new insight on the limitations of
the models. Finally, we show that while the testing algorithms can be in most
cases strictly more efficient, some tasks remain hard even with this additional
power
The role of Walsh structure and ordinal linkage in the optimisation of pseudo-Boolean functions under monotonicity invariance.
Optimisation heuristics rely on implicit or explicit assumptions about the structure of the black-box fitness function they optimise. A review of the literature shows that understanding of structure and linkage is helpful to the design and analysis of heuristics. The aim of this thesis is to investigate the role that problem structure plays in heuristic optimisation. Many heuristics use ordinal operators; which are those that are invariant under monotonic transformations of the fitness function. In this thesis we develop a classification of pseudo-Boolean functions based on rank-invariance. This approach classifies functions which are monotonic transformations of one another as equivalent, and so partitions an infinite set of functions into a finite set of classes. Reasoning about heuristics composed of ordinal operators is, by construction, invariant over these classes. We perform a complete analysis of 2-bit and 3-bit pseudo-Boolean functions. We use Walsh analysis to define concepts of necessary, unnecessary, and conditionally necessary interactions, and of Walsh families. This helps to make precise some existing ideas in the literature such as benign interactions. Many algorithms are invariant under the classes we define, which allows us to examine the difficulty of pseudo-Boolean functions in terms of function classes. We analyse a range of ordinal selection operators for an EDA. Using a concept of directed ordinal linkage, we define precedence networks and precedence profiles to represent key algorithmic steps and their interdependency in terms of problem structure. The precedence profiles provide a measure of problem difficulty. This corresponds to problem difficulty and algorithmic steps for optimisation. This work develops insight into the relationship between function structure and problem difficulty for optimisation, which may be used to direct the development of novel algorithms. Concepts of structure are also used to construct easy and hard problems for a hill-climber
Unsupervised ensemble minority clustering
Cluster a alysis lies at the core of most unsupervised learning tasks. However, the majority of clustering algorithms depend on the all-in assumption, in which all objects belong to some cluster, and perform poorly on minority clustering tasks, in which a small fraction of signal data stands against a majority of noise.
The approaches proposed so far for minority clustering are supervised: they require the
number and distribution of the foreground and background clusters. In supervised learning and all-in clustering, combination methods have been successfully applied to obtain distribution-free learners, even from the output of weak individual algorithms.
In this report, we present a novel ensemble minority clustering algorithm, Ewocs, suitable for weak clustering combination, and provide a theoretical proof of its properties under a loose set of constraints. The validity of the assumptions used in the proof is empirically assessed using a collection of synthetic datasets.Preprin
- …