Learning Invariants using Decision Trees
The problem of inferring an inductive invariant for verifying program safety
can be formulated in terms of binary classification. This is a standard problem
in machine learning: given a sample of good and bad points, one is asked to
find a classifier that generalizes from the sample and separates the two sets.
Here, the good points are the reachable states of the program, and the bad
points are those that reach a safety property violation. Thus, a learned
classifier is a candidate invariant. In this paper, we propose a new algorithm
that uses decision trees to learn candidate invariants in the form of arbitrary
Boolean combinations of numerical inequalities. We have used our algorithm to
verify C programs taken from the literature. The algorithm is able to infer
safe invariants for a range of challenging benchmarks and compares favorably to
other ML-based invariant inference techniques. In particular, it scales well to
large sample sets. Comment: 15 pages, 2 figures
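As a rough illustration of this classification view, the sketch below fits a
decision tree to a handful of made-up good and bad states and reads each "good"
leaf's root-to-leaf path off as a conjunction of inequalities (assuming
scikit-learn is available; the sample states and the path-extraction helper are
invented for illustration, not taken from the paper):

```python
# Sketch: learn a candidate invariant as a Boolean combination of
# numerical inequalities by fitting a decision tree to labeled states.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Program states as feature vectors (x, y); label 1 = reachable ("good"),
# label 0 = leads to a safety violation ("bad"). Made-up sample.
good = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
bad = np.array([[0, 5], [1, 7], [2, 9]])
X = np.vstack([good, bad])
y = np.array([1] * len(good) + [0] * len(bad))

tree = DecisionTreeClassifier().fit(X, y)

def paths_to_invariant(tree, names):
    """Read each root-to-leaf path ending in a 'good' leaf as a
    conjunction of inequalities; the invariant is their disjunction."""
    t = tree.tree_
    clauses = []
    def walk(node, conds):
        if t.children_left[node] == -1:          # leaf node
            if np.argmax(t.value[node]) == 1:    # majority class "good"
                clauses.append(" and ".join(conds) or "true")
            return
        f, thr = names[t.feature[node]], t.threshold[node]
        walk(t.children_left[node], conds + [f"{f} <= {thr:.2f}"])
        walk(t.children_right[node], conds + [f"{f} > {thr:.2f}"])
    walk(0, [])
    return " or ".join(f"({c})" for c in clauses)

print(paths_to_invariant(tree, ["x", "y"]))      # e.g. "(y <= 4.00)"
```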
On Learning to Prove
In this paper, we consider the problem of learning a first-order theorem
prover that uses a representation of beliefs in mathematical claims to
construct proofs. The inspiration for doing so comes from the practices of
human mathematicians where "plausible reasoning" is applied in addition to
deductive reasoning to find proofs.
Towards this end, we introduce a representation of beliefs that assigns
probabilities to the exhaustive and mutually exclusive first-order
possibilities found in Hintikka's theory of distributive normal forms. The
representation supports Bayesian update, induces a distribution on statements
that does not enforce that logically equivalent statements are assigned the
same probability, and suggests an embedding of statements into an associated
Hilbert space.
We then examine conjecturing as model selection and an alternating-turn game
of determining consistency. The game is amenable (in principle) to self-play
training to learn beliefs and derive a prover that is complete when logical
omniscience is attained and sound when beliefs are reasonable. The
representation has super-exponential space requirements as a function of
quantifier depth so the ideas in this paper should be taken as theoretical. We
will comment on how abstractions can be used to control the space requirements
at the cost of completeness. Comment: Preprint
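A toy rendering of the belief representation, with two atomic claims standing
in for Hintikka-style constituents; the possibilities, statements, and update
routine below are illustrative, not the paper's construction:

```python
# Toy sketch: probabilities over an exhaustive, mutually exclusive set
# of "possibilities", with Bayesian update on evidence. Belief in a
# statement is the total mass of possibilities where it holds.
from itertools import product

# Each possibility fixes the truth of two atomic claims p and q.
possibilities = list(product([False, True], repeat=2))
belief = {w: 0.25 for w in possibilities}   # uniform prior

def prob(stmt):
    """Belief in a statement = mass of possibilities where it holds."""
    return sum(pr for w, pr in belief.items() if stmt(*w))

def update(evidence):
    """Condition the belief on observed evidence (Bayesian update)."""
    z = prob(evidence)
    for w in belief:
        belief[w] = belief[w] * evidence(*w) / z

print(prob(lambda p, q: p or q))   # 0.75 under the uniform prior
update(lambda p, q: p)             # observe that p holds
print(prob(lambda p, q: p or q))   # 1.0 after conditioning on p
```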
Refining Existential Properties in Separation Logic Analyses
In separation logic program analyses, tractability is generally achieved by
restricting invariants to a finite abstract domain. As this domain cannot vary,
loss of information can cause failure even when verification is possible in the
underlying logic. In this paper, we propose a CEGAR-like method for detecting
spurious failures and avoiding them by refining the abstract domain. Our
approach is geared towards discovering existential properties, e.g. "list
contains value x". To diagnose failures, we use abduction, a technique for
inferring command preconditions. Our method works backwards from an error,
identifying necessary information lost by abstraction, and refining the forward
analysis to avoid the error. We define domains for several classes of
existential properties, and show their effectiveness on case studies adapted
from Redis, Azureus and FreeRTOS.
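The refinement loop might be pictured roughly as below; the toy commands, the
fact-set domain, and the one-shot abduction step are invented stand-ins for the
paper's separation-logic analysis over symbolic heaps:

```python
# Skeleton of the CEGAR-like loop: run a forward analysis over a finite
# abstract domain, and on failure use an abduction-style step to
# recover the lost fact and refine the domain.

# Each command maps the facts known before it to the facts it
# guarantees after it; the analysis keeps only facts the domain tracks.
COMMANDS = [
    ("lst = new_list()", lambda facts: {"list(lst)"}),
    ("lst.append(5)",    lambda facts: facts | {"contains(lst, 5)"}),
    ("lst.sort()",       lambda facts: facts | {"sorted(lst)"}),
]
GOAL = "contains(lst, 5)"                    # existential property

def analyze(domain):
    """Forward analysis restricted to the finite abstract domain."""
    facts = set()
    for _name, step in COMMANDS:
        facts = step(facts) & domain         # abstraction loses facts
    return facts

def abduce():
    """Diagnose the failure: replay the commands unrestricted to find
    which discarded facts are needed (a stand-in for real abduction)."""
    facts = set()
    for _name, step in COMMANDS:
        facts = step(facts)
    return facts

domain = {"list(lst)", "sorted(lst)"}        # initial abstract domain
while GOAL not in analyze(domain):
    domain |= abduce()                       # refine the domain, retry
print("verified with refined domain:", sorted(domain))
```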
Entity Abstraction in Visual Model-Based Reinforcement Learning
This paper tests the hypothesis that modeling a scene in terms of entities
and their local interactions, as opposed to modeling the scene globally,
provides a significant benefit in generalizing to physical tasks in a
combinatorial space the learner has not encountered before. We present
object-centric perception, prediction, and planning (OP3), which to the best of
our knowledge is the first fully probabilistic entity-centric dynamic latent
variable framework for model-based reinforcement learning that acquires entity
representations from raw visual observations without supervision and uses them
to predict and plan. OP3 enforces entity-abstraction -- symmetric processing of
each entity representation with the same locally-scoped function -- which
enables it to scale to model different numbers and configurations of objects
from those in training. Our approach to solving the key technical challenge of
grounding these entity representations to actual objects in the environment is
to frame this variable binding problem as an inference problem, and we develop
an interactive inference algorithm that uses temporal continuity and
interactive feedback to bind information about object properties to the entity
variables. On block-stacking tasks, OP3 generalizes to novel block
configurations and more objects than observed during training, outperforming an
oracle model that assumes access to object supervision and achieving two to
three times better accuracy than a state-of-the-art video prediction model that
does not exhibit entity abstraction. Comment: Accepted at CoRL 2019
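A minimal sketch of the entity-abstraction constraint, with untrained linear
maps standing in for OP3's learned networks; the shapes and names are
illustrative only:

```python
# Sketch of entity abstraction: the same locally-scoped dynamics
# function is applied symmetrically to every entity latent, with
# pairwise interaction effects summed over the other entities, so the
# model is indifferent to the number of entities.
import numpy as np

rng = np.random.default_rng(0)
D = 8                                    # entity latent dimension
W_self = rng.normal(size=(D, D)) * 0.1   # placeholder "network"
W_pair = rng.normal(size=(2 * D, D)) * 0.1

def step(entities):
    """entities: (K, D) array; K may differ from training time."""
    K = len(entities)
    out = np.empty_like(entities)
    for i in range(K):
        # sum pairwise effects of every other entity on entity i
        pair = sum(
            np.concatenate([entities[i], entities[j]]) @ W_pair
            for j in range(K) if j != i
        )
        out[i] = entities[i] @ W_self + pair   # same function for all i
    return out

print(step(rng.normal(size=(3, D))).shape)   # works for 3 entities
print(step(rng.normal(size=(5, D))).shape)   # ...and, unchanged, for 5
```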
Abstracting Probabilistic Models: A Logical Perspective
Abstraction is a powerful idea, widely used in science to model, reason about,
and explain the behavior of systems in a more tractable search space by omitting
irrelevant details. While notions of abstraction have matured for deterministic
systems, the case for abstracting probabilistic models is not yet fully
understood.
In this paper, we provide a semantical framework for analyzing such
abstractions from first principles. We develop the framework in a general way,
allowing for expressive languages, including logic-based ones that admit
relational and hierarchical constructs with stochastic primitives. We motivate
a definition of consistency between a high-level model and its low-level
counterpart, but also treat the case when the high-level model is missing
critical information present in the low-level model. We prove properties of
abstractions, both at the level of the parameter as well as the structure of
the models. We conclude with some observations about how abstractions can be
derived automatically. Comment: In AAAI Workshop: Statistical Relational
Artificial Intelligence, 2020. (This is the extended version.)
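One way to picture the consistency requirement is as agreement under the
abstraction map, as in this made-up two-variable example (the paper's framework
is far more general, covering relational and hierarchical languages):

```python
# Toy illustration of one consistency notion between a low-level
# probabilistic model and a high-level abstraction: the high-level
# probabilities should match the low-level ones pushed through the
# abstraction map. The distributions here are made up.
import math
from collections import defaultdict

# Low-level model over (weather, traffic); abstraction omits traffic.
low = {
    ("rain", "jam"): 0.25, ("rain", "clear"): 0.15,
    ("sun",  "jam"): 0.10, ("sun",  "clear"): 0.50,
}
alpha = lambda state: state[0]          # abstraction map: keep weather
high = {"rain": 0.40, "sun": 0.60}      # candidate high-level model

def pushforward(low, alpha):
    out = defaultdict(float)
    for state, p in low.items():
        out[alpha(state)] += p          # marginalize the omitted detail
    return dict(out)

marg = pushforward(low, alpha)
print(marg)                                              # rain 0.4, sun 0.6
print(all(math.isclose(marg[k], high[k]) for k in high)) # consistent: True
```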
'Say EM' for Selecting Probabilistic Models for Logical Sequences
Many real-world sequences, such as protein secondary structures or shell logs,
exhibit rich internal structure. Traditional probabilistic models of
sequences, however, consider sequences of flat symbols only. Logical hidden
Markov models have been proposed as one solution. They deal with logical
sequences, i.e., sequences over an alphabet of logical atoms. This comes at the
expense of a more complex model selection problem. Indeed, different
abstraction levels have to be explored. In this paper, we propose a novel
method, called SAGEM, for selecting logical hidden Markov models from data. SAGEM
combines generalized expectation maximization, which optimizes parameters, with
structure search for model selection using inductive logic programming
refinement operators. We provide convergence and experimental results that show
SAGEM's effectiveness. Comment: Appears in Proceedings of the Twenty-First
Conference on Uncertainty in Artificial Intelligence (UAI2005)
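A rough sketch of the fit-then-refine loop, using scikit-learn's
GaussianMixture (assumed available) and component counts as stand-ins for
logical hidden Markov models and ILP refinement operators; this mirrors only
the shape of the search, not SAGEM's actual model space:

```python
# Sketch of "fit parameters with (generalized) EM, search structures
# via a refinement operator, keep the best-scoring model".
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)])
X = X.reshape(-1, 1)                     # toy bimodal data

def refine(structure):
    """Refinement operator: propose neighboring structures."""
    return [k for k in (structure - 1, structure + 1) if k >= 1]

best_k, best_bic = 1, GaussianMixture(1).fit(X).bic(X)
frontier = [best_k]
while frontier:
    candidates = {k for s in frontier for k in refine(s) if k != best_k}
    # EM fits the parameters of each candidate; BIC scores the structure
    bic, k = min((GaussianMixture(k).fit(X).bic(X), k) for k in candidates)
    if bic >= best_bic:                  # no refinement improves the score
        break
    best_bic, best_k, frontier = bic, k, [k]

print("selected structure (components):", best_k)   # expect 2
```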
Instruction-Level Abstraction (ILA): A Uniform Specification for System-on-Chip (SoC) Verification
Modern Systems-on-Chip (SoC) designs are increasingly heterogeneous and
contain specialized semi-programmable accelerators in addition to programmable
processors. In contrast to the pre-accelerator era, when the ISA played an
important role in verification by enabling a clean separation of concerns
between software and hardware, verification of these "accelerator-rich" SoCs
presents new challenges. From the perspective of hardware designers, there is a
lack of a common framework for the formal functional specification of
accelerator behavior. From the perspective of software developers, there exists
no unified framework for reasoning about software/hardware interactions of
programs that interact with accelerators. This paper addresses these challenges
by providing a formal specification and high-level abstraction for accelerator
functional behavior. It formalizes the concept of an Instruction Level
Abstraction (ILA), developed informally in our previous work, and shows its
application in modeling and verification of accelerators. This formal ILA
extends the familiar notion of instructions to accelerators and provides a
uniform, modular, and hierarchical abstraction for modeling software-visible
behavior of both accelerators and programmable processors. We demonstrate the
applicability of the ILA through several case studies of accelerators (for
image processing, machine learning, and cryptography), and a general-purpose
processor (RISC-V). We show how the ILA model facilitates equivalence checking
between two ILAs, and between an ILA and its hardware finite-state machine
(FSM) implementation. Further, this equivalence checking supports accelerator
upgrades using the notion of ILA compatibility, similar to processor upgrades
using ISA compatibility. Comment: 24 pages, 3 figures, 3 tables
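Loosely, an ILA pairs each instruction with a decode condition and an update of
the software-visible state; the toy multiply-accumulate "accelerator" and the
bounded equivalence check below are illustrative, not the paper's formalism:

```python
# Sketch of an ILA-style model: each instruction has a decode condition
# and a state-update function over the software-visible state, and an
# implementation is checked against it instruction by instruction.
from itertools import product

# ILA: the software-visible state is one accumulator register "acc".
ILA = {
    "CLEAR": (lambda cmd: cmd == 0,
              lambda st, arg: {"acc": 0}),
    "MAC":   (lambda cmd: cmd == 1,
              lambda st, arg: {"acc": st["acc"] + 2 * arg}),
}

def impl_step(st, cmd, arg):
    """FSM implementation under test (here: a correct one)."""
    return {"acc": 0} if cmd == 0 else {"acc": st["acc"] + 2 * arg}

# Bounded equivalence check over a small slice of the state space.
for acc, cmd, arg in product(range(4), (0, 1), range(4)):
    st = {"acc": acc}
    for name, (decodes, update) in ILA.items():
        if decodes(cmd):
            assert impl_step(st, cmd, arg) == update(st, arg), name
print("implementation matches the ILA on all checked states")
```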
Disjunctive Interpolants for Horn-Clause Verification (Extended Technical Report)
One of the main challenges in software verification is efficient and precise
compositional analysis of programs with procedures and loops. Interpolation
methods remain one of the most promising techniques for such verification, and
are closely related to solving Horn clause constraints. We introduce a new
notion of interpolation, disjunctive interpolation, which solves a more general
class of problems in one step than previous notions of interpolants,
such as tree interpolants or inductive sequences of interpolants. We present
algorithms and complexity results for the construction of disjunctive interpolants, as well
as their use within an abstraction-refinement loop. We have implemented Horn
clause verification algorithms that use disjunctive interpolants and evaluate
them on benchmarks expressed as Horn clauses over the theory of integer linear
arithmetic.
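For intuition, the snippet below checks (rather than constructs) the two
defining properties of an ordinary interpolant on a toy pair of formulas,
assuming the z3-solver Python package is installed; disjunctive interpolation
generalizes this picture to one-step solving of richer constraint shapes:

```python
# A candidate interpolant I for formulas A, B with A & B unsatisfiable
# must satisfy: A implies I, and I & B is unsatisfiable, with I written
# over their shared symbols. This validates a hand-written candidate
# with the z3 SMT solver; it does not construct interpolants.
from z3 import Int, Solver, And, Not, Implies, unsat

x, y, z = Int("x"), Int("y"), Int("z")
A = x == 2 * y          # forces x even
B = x == 2 * z + 1      # forces x odd
I = x % 2 == 0          # candidate interpolant over shared symbol x

def valid(formula):
    s = Solver()
    s.add(Not(formula))
    return s.check() == unsat

print(valid(Implies(A, I)))    # True: A entails I
print(valid(Not(And(I, B))))   # True: I & B is unsatisfiable
```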
Inferring Inductive Invariants from Phase Structures
Infinite-state systems such as distributed protocols are challenging to
verify using interactive theorem provers or automatic verification tools. Of
these techniques, deductive verification is highly expressive but requires the
user to annotate the system with inductive invariants. To relieve the user from
this labor-intensive and challenging task, invariant inference aims to find
inductive invariants automatically. Unfortunately, when applied to
infinite-state systems such as distributed protocols, existing inference
techniques often diverge, which limits their applicability.
This paper proposes user-guided invariant inference based on phase
invariants, which capture the different logical phases of the protocol. Users
convey their intuition by specifying a phase structure, an automaton with
edges labeled by program transitions; the tool automatically infers assertions
that hold in the automaton's states, resulting in a full safety proof. The
additional structure from phases guides the inference procedure towards finding
an invariant.
Our results show that user guidance by phase structures facilitates
successful inference beyond the state of the art. We find that phase structures
are pleasantly well matched to the intuitive reasoning routinely used by domain
experts to understand why distributed protocols are correct, so that providing
a phase structure reuses this existing intuition.
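A toy phase structure might look as follows; the mutex protocol and the
hand-written assertions are illustrative, and the point of the paper is that
the per-phase assertions are inferred automatically rather than supplied:

```python
# Sketch of a phase structure: an automaton whose edges are labeled by
# protocol transitions, plus per-phase assertions checked for
# inductiveness by brute force over a small state space.

# Protocol state: number of processes holding the lock.
TRANSITIONS = {
    "acquire": lambda n: n + 1 if n == 0 else None,   # guard: lock free
    "release": lambda n: n - 1 if n == 1 else None,   # guard: lock held
}

# Phase automaton: edges labeled by the transitions that change phase.
PHASES = {("idle", "acquire"): "held", ("held", "release"): "idle"}
ASSERTIONS = {"idle": lambda n: n == 0, "held": lambda n: n == 1}

def inductive():
    """Every enabled transition from a state satisfying its phase's
    assertion lands in the successor phase with its assertion holding."""
    for (phase, label), target in PHASES.items():
        for n in range(5):                    # small bounded check
            if ASSERTIONS[phase](n):
                m = TRANSITIONS[label](n)
                if m is not None and not ASSERTIONS[target](m):
                    return False
    return True

print(inductive())   # True: the phase annotations form a safety proof
```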
Program Synthesis using Abstraction Refinement
We present a new approach to example-guided program synthesis based on
counterexample-guided abstraction refinement. Our method uses the abstract
semantics of the underlying DSL to find a program whose abstract behavior
satisfies the examples. However, since the program may be spurious with respect
to the concrete semantics, our approach iteratively refines the abstraction
until we either find a program that satisfies the examples or prove that no
such DSL program exists. Because many programs have the same input-output
behavior in terms of their abstract semantics, this synthesis methodology
significantly reduces the search space compared to existing techniques that use
purely concrete semantics. While synthesis using abstraction refinement
(SYNGAR) could be implemented in different settings, we propose a
refinement-based synthesis algorithm that uses abstract finite tree automata
(AFTA). Our technique uses a coarse initial program abstraction to construct an
initial AFTA, which is iteratively refined by constructing a proof of
incorrectness of any spurious program. In addition to ruling out the spurious
program accepted by the previous AFTA, proofs of incorrectness are also useful
for ruling out many other spurious programs. We implement these ideas in a
framework called BLAZE, which we have used to build synthesizers
for string and matrix transformations, and we compare BLAZE with existing
techniques. Our results for the string domain show that BLAZE compares
favorably with FlashFill, a domain-specific synthesizer that is now deployed in
Microsoft PowerShell. In the context of matrix manipulations, we compare BLAZE
against Prose, a state-of-the-art general-purpose VSA-based synthesizer, and
show that BLAZE results in a 90x speed-up over Prose.
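The high-level loop can be caricatured as below, with a fixed parity
abstraction used only for pruning over a tiny made-up numeric DSL; the
refinement of the abstraction from proofs of incorrectness, and the AFTA
machinery, are elided:

```python
# Sketch of abstraction-guided, example-driven synthesis: enumerate
# short DSL programs, prune with a coarse abstract (parity) semantics,
# and validate survivors against the concrete semantics.
from functools import reduce
from itertools import product

#          concrete semantics   abstract (parity) semantics
DSL = {
    "inc":    (lambda x: x + 1, lambda p: {"even": "odd", "odd": "even"}[p]),
    "double": (lambda x: 2 * x, lambda p: "even"),
    "neg":    (lambda x: -x,    lambda p: p),
}
EXAMPLES = [(3, 8), (5, 12)]                 # input -> output pairs
parity = lambda n: "even" if n % 2 == 0 else "odd"

def abstract_ok(prog):
    """Cheap over-approximate pruning: the program's abstract output
    must match the parity of every expected output."""
    return all(
        reduce(lambda p, op: DSL[op][1](p), prog, parity(x)) == parity(want)
        for x, want in EXAMPLES
    )

def concrete_ok(prog):
    return all(
        reduce(lambda v, op: DSL[op][0](v), prog, x) == want
        for x, want in EXAMPLES
    )

def synthesize(max_len=3):
    for n in range(1, max_len + 1):
        for prog in product(DSL, repeat=n):
            if abstract_ok(prog) and concrete_ok(prog):
                return prog
    return None

print(synthesize())   # ('inc', 'double'): the program 2 * (x + 1)
```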