7,899 research outputs found
Satisfiability and Synthesis Modulo Oracles
In classic program synthesis algorithms, such as counterexample-guided
inductive synthesis (CEGIS), the algorithms alternate between a synthesis phase
and an oracle (verification) phase. Many synthesis algorithms use a white-box
oracle based on satisfiability modulo theory (SMT) solvers to provide
counterexamples. But what if a white-box oracle is either not available or not
easy to work with? We present a framework for solving a general class of
oracle-guided synthesis problems which we term synthesis modulo oracles. In
this setting, oracles may be black boxes with a query-response interface
defined by the synthesis problem. As a necessary component of this framework,
we also formalize the problem of satisfiability modulo theories and oracles,
and present an algorithm for solving this problem. We implement a prototype
solver for satisfiability and synthesis modulo oracles and demonstrate that, by
using oracles that execute functions not easily modeled in SMT-constraints,
such as recursive functions or oracles that incorporate compilation and
execution of code, SMTO and SyMO are able to solve problems beyond the
abilities of standard SMT and synthesis solvers.Comment: 12 pages, 8 Figure
A Theory of Formal Synthesis via Inductive Learning
Formal synthesis is the process of generating a program satisfying a
high-level formal specification. In recent times, effective formal synthesis
methods have been proposed based on the use of inductive learning. We refer to
this class of methods that learn programs from examples as formal inductive
synthesis. In this paper, we present a theoretical framework for formal
inductive synthesis. We discuss how formal inductive synthesis differs from
traditional machine learning. We then describe oracle-guided inductive
synthesis (OGIS), a framework that captures a family of synthesizers that
operate by iteratively querying an oracle. An instance of OGIS that has had
much practical impact is counterexample-guided inductive synthesis (CEGIS). We
present a theoretical characterization of CEGIS for learning any program that
computes a recursive language. In particular, we analyze the relative power of
CEGIS variants where the types of counterexamples generated by the oracle
varies. We also consider the impact of bounded versus unbounded memory
available to the learning algorithm. In the special case where the universe of
candidate programs is finite, we relate the speed of convergence to the notion
of teaching dimension studied in machine learning theory. Altogether, the
results of the paper take a first step towards a theoretical foundation for the
emerging field of formal inductive synthesis
Overfitting in Synthesis: Theory and Practice (Extended Version)
In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically
generate a program belonging to a grammar of possible implementations that
meets a logical specification. We investigate a common limitation across
state-of-the-art SyGuS tools that perform counterexample-guided inductive
synthesis (CEGIS). We empirically observe that as the expressiveness of the
provided grammar increases, the performance of these tools degrades
significantly.
We claim that this degradation is not only due to a larger search space, but
also due to overfitting. We formally define this phenomenon and prove
no-free-lunch theorems for SyGuS, which reveal a fundamental tradeoff between
synthesizer performance and grammar expressiveness.
A standard approach to mitigate overfitting in machine learning is to run
multiple learners with varying expressiveness in parallel. We demonstrate that
this insight can immediately benefit existing SyGuS tools. We also propose a
novel single-threaded technique called hybrid enumeration that interleaves
different grammars and outperforms the winner of the 2018 SyGuS competition
(Inv track), solving more problems and achieving a mean speedup.Comment: 24 pages (5 pages of appendices), 7 figures, includes proofs of
theorem
Sciduction: Combining Induction, Deduction, and Structure for Verification and Synthesis
Even with impressive advances in automated formal methods, certain problems
in system verification and synthesis remain challenging. Examples include the
verification of quantitative properties of software involving constraints on
timing and energy consumption, and the automatic synthesis of systems from
specifications. The major challenges include environment modeling,
incompleteness in specifications, and the complexity of underlying decision
problems.
This position paper proposes sciduction, an approach to tackle these
challenges by integrating inductive inference, deductive reasoning, and
structure hypotheses. Deductive reasoning, which leads from general rules or
concepts to conclusions about specific problem instances, includes techniques
such as logical inference and constraint solving. Inductive inference, which
generalizes from specific instances to yield a concept, includes algorithmic
learning from examples. Structure hypotheses are used to define the class of
artifacts, such as invariants or program fragments, generated during
verification or synthesis. Sciduction constrains inductive and deductive
reasoning using structure hypotheses, and actively combines inductive and
deductive reasoning: for instance, deductive techniques generate examples for
learning, and inductive reasoning is used to guide the deductive engines.
We illustrate this approach with three applications: (i) timing analysis of
software; (ii) synthesis of loop-free programs, and (iii) controller synthesis
for hybrid systems. Some future applications are also discussed
Learning a Static Analyzer from Data
To be practically useful, modern static analyzers must precisely model the
effect of both, statements in the programming language as well as frameworks
used by the program under analysis. While important, manually addressing these
challenges is difficult for at least two reasons: (i) the effects on the
overall analysis can be non-trivial, and (ii) as the size and complexity of
modern libraries increase, so is the number of cases the analysis must handle.
In this paper we present a new, automated approach for creating static
analyzers: instead of manually providing the various inference rules of the
analyzer, the key idea is to learn these rules from a dataset of programs. Our
method consists of two ingredients: (i) a synthesis algorithm capable of
learning a candidate analyzer from a given dataset, and (ii) a counter-example
guided learning procedure which generates new programs beyond those in the
initial dataset, critical for discovering corner cases and ensuring the learned
analysis generalizes to unseen programs.
We implemented and instantiated our approach to the task of learning
JavaScript static analysis rules for a subset of points-to analysis and for
allocation sites analysis. These are challenging yet important problems that
have received significant research attention. We show that our approach is
effective: our system automatically discovered practical and useful inference
rules for many cases that are tricky to manually identify and are missed by
state-of-the-art, manually tuned analyzers
Automatic Repair of Buggy If Conditions and Missing Preconditions with SMT
We present Nopol, an approach for automatically repairing buggy if conditions
and missing preconditions. As input, it takes a program and a test suite which
contains passing test cases modeling the expected behavior of the program and
at least one failing test case embodying the bug to be repaired. It consists of
collecting data from multiple instrumented test suite executions, transforming
this data into a Satisfiability Modulo Theory (SMT) problem, and translating
the SMT result -- if there exists one -- into a source code patch. Nopol
repairs object oriented code and allows the patches to contain nullness checks
as well as specific method calls.Comment: CSTVA'2014, India (2014
Learning to Infer Graphics Programs from Hand-Drawn Images
We introduce a model that learns to convert simple hand drawings into
graphics programs written in a subset of \LaTeX. The model combines techniques
from deep learning and program synthesis. We learn a convolutional neural
network that proposes plausible drawing primitives that explain an image. These
drawing primitives are like a trace of the set of primitive commands issued by
a graphics program. We learn a model that uses program synthesis techniques to
recover a graphics program from that trace. These programs have constructs like
variable bindings, iterative loops, or simple kinds of conditionals. With a
graphics program in hand, we can correct errors made by the deep network,
measure similarity between drawings by use of similar high-level geometric
structures, and extrapolate drawings. Taken together these results are a step
towards agents that induce useful, human-readable programs from perceptual
input
- …