160 research outputs found
Inductive Program Synthesis via Iterative Forward-Backward Abstract Interpretation
A key challenge in example-based program synthesis is the gigantic search
space of programs. To address this challenge, prior work has proposed using
abstract interpretation to prune the search space. However, most existing
approaches have focused only on forward abstract interpretation, and thus
cannot fully exploit the power of abstract interpretation. In this paper, we
propose a novel approach to inductive program synthesis via iterative
forward-backward abstract interpretation. The forward abstract interpretation
computes possible outputs of a program given inputs, while the backward
abstract interpretation computes possible inputs of a program given outputs. By
iteratively performing the two abstract interpretations in an alternating
fashion, we can effectively determine whether any completion of a candidate
partial program can satisfy the input-output examples. We apply our approach to
a standard formulation, syntax-guided synthesis (SyGuS), thereby supporting a
wide range of inductive synthesis tasks. We have implemented our approach and
evaluated it on a set of benchmarks from prior work. The experimental
results show that our approach significantly outperforms the state-of-the-art
approaches, thanks to its sophisticated abstract interpretation techniques.
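The alternating forward/backward pruning described above can be illustrated with a minimal sketch over an interval abstract domain. All names and the toy grammar (holes filled by constants in [0, 10]) are assumptions for illustration, not the paper's actual algorithm:

```python
# Minimal sketch: pruning a candidate partial program "x + HOLE" with
# iterative forward-backward interval abstract interpretation.
# The grammar allowing only constants in [0, 10] for HOLE is assumed.

def fwd_add(a, b):
    # forward: interval of possible outputs of a + b
    return (a[0] + b[0], a[1] + b[1])

def bwd_add(out, a, b):
    # backward: refine operand intervals given the required output
    a2 = (max(a[0], out[0] - b[1]), min(a[1], out[1] - b[0]))
    b2 = (max(b[0], out[0] - a[1]), min(b[1], out[1] - a[0]))
    return a2, b2

def feasible(iv):
    return iv[0] <= iv[1]

def can_satisfy(x_val, y_val, hole_iv=(0, 10)):
    x_iv = (x_val, x_val)
    out_iv = fwd_add(x_iv, hole_iv)                 # forward pass
    # intersect with the required output from the example
    out_iv = (max(out_iv[0], y_val), min(out_iv[1], y_val))
    if not feasible(out_iv):
        return False                                # prune: no completion works
    _, hole_iv = bwd_add(out_iv, x_iv, hole_iv)     # backward pass
    return feasible(hole_iv)

print(can_satisfy(3, 8))    # True: some constant c in [0,10] gives 3 + c == 8
print(can_satisfy(3, 100))  # False: 3 + c can never reach 100, so prune
```

In the full approach, the forward and backward passes repeat until a fixpoint, refining every hole's abstract value; a candidate is discarded as soon as any abstract value becomes empty.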
ImageEye: Batch Image Processing Using Program Synthesis
This paper presents a new synthesis-based approach for batch image
processing. Unlike existing tools that can only apply global edits to the
entire image, our method can apply fine-grained edits to individual objects
within the image. For example, our method can selectively blur or crop specific
objects that have a certain property. To facilitate such fine-grained image
editing tasks, we propose a neuro-symbolic domain-specific language (DSL) that
combines pre-trained neural networks for image classification with other
language constructs that enable symbolic reasoning. Our method can
automatically learn programs in this DSL from user demonstrations by utilizing
a novel synthesis algorithm. We have implemented the proposed technique in a
tool called ImageEye and evaluated it on 50 image editing tasks. Our evaluation
shows that ImageEye is able to automate 96% of these tasks.
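The neuro-symbolic DSL idea can be sketched as follows. The constructs (`Is`, `Apply`) and the stubbed classifier are hypothetical stand-ins, not ImageEye's actual language:

```python
# Hypothetical mini-DSL in the spirit of a neuro-symbolic image-editing
# language: neural predicates classify detected objects (stubbed here),
# while symbolic combinators select objects and attach edits.
from dataclasses import dataclass, field

@dataclass
class Obj:
    label: str                          # would come from a pre-trained classifier
    box: tuple                          # bounding box (x, y, w, h)
    edits: list = field(default_factory=list)

def Is(label):
    # "neural" predicate: in a real system this wraps a classifier
    return lambda o: o.label == label

def Apply(edit, pred, objects):
    # symbolic construct: apply an edit to every matching object
    for o in objects:
        if pred(o):
            o.edits.append(edit)
    return objects

image = [Obj("face", (10, 10, 5, 5)), Obj("car", (40, 8, 9, 6))]
Apply("blur", Is("face"), image)
print([(o.label, o.edits) for o in image])
# the face is marked for blurring; the car is untouched
```

A synthesizer in this setting searches for the predicate/edit combination consistent with the user's demonstrated edits.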
Better Together: Unifying Datalog and Equality Saturation
We present egglog, a fixpoint reasoning system that unifies Datalog and
equality saturation (EqSat). Like Datalog, it supports efficient incremental
execution, cooperating analyses, and lattice-based reasoning. Like EqSat, it
supports term rewriting, efficient congruence closure, and extraction of
optimized terms.
We identify two recent applications--a unification-based pointer analysis in
Datalog and an EqSat-based floating-point term rewriter--that have been
hampered by features missing from Datalog but found in EqSat or vice-versa. We
evaluate egglog by reimplementing those projects in egglog. The resulting
systems in egglog are faster, simpler, and fix bugs found in the original
systems.
Comment: PLDI 202
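The EqSat side that egglog unifies with Datalog can be sketched in miniature: a union-find with congruence closure, iterated to a fixpoint just as Datalog rules are. This is a toy illustration, not egglog's e-graph implementation:

```python
# Toy sketch of congruence closure, run to a fixpoint Datalog-style:
# asserting a = b must also merge f(a) and f(b).
parent = {}

def find(x):
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

terms = [("f", "a"), ("f", "b")]        # the terms f(a) and f(b)
union("a", "b")                         # assert a = b

changed = True
while changed:                          # fixpoint iteration, as in Datalog
    changed = False
    for t1 in terms:
        for t2 in terms:
            if (t1[0] == t2[0] and find(t1[1]) == find(t2[1])
                    and find(t1) != find(t2)):
                union(t1, t2)           # congruence: equal args => equal terms
                changed = True

print(find(("f", "a")) == find(("f", "b")))  # True
```

Rewrite rules and lattice-based analyses then become further rules feeding the same fixpoint loop, which is what lets one system host both paradigms.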
Programming by Example Made Easy
Programming by example (PBE) is an emerging programming paradigm that
automatically synthesizes programs specified by user-provided input-output
examples. Despite the convenience for end-users, implementing PBE tools often
requires strong expertise in programming language and synthesis algorithms.
Such a level of knowledge is uncommon among software developers. It greatly
limits the broad adoption of PBE by the industry. To facilitate the adoption of
PBE techniques, we propose a PBE framework called Bee, which leverages an
"entity-action" model based on relational tables to ease PBE development for a
wide yet restricted range of domains. Implementing PBE tools with Bee only
requires adapting domain-specific data entities and user actions to tables,
with no need to design a domain-specific language or an efficient synthesis
algorithm. The synthesis algorithm of Bee exploits bidirectional searching and
constraint-solving techniques to address the challenge of value computation
nested in table transformation. We evaluated Bee's effectiveness on 64 PBE
tasks from three different domains and usability with a human study of 12
participants. Evaluation results show that Bee is easier to learn and use than
the state-of-the-art PBE framework, and the bidirectional algorithm achieves
comparable performance to domain-specifically optimized synthesizers.
Comment: Accepted by ACM Transactions on Software Engineering and Methodology
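The "entity-action" reduction can be sketched as follows: domain entities become table rows, and PBE reduces to searching for a table transformation that reproduces the user's example. The enumeration below is an illustrative assumption, far simpler than Bee's bidirectional algorithm:

```python
# Illustrative sketch of the entity-action model: synthesis searches a
# space of table transformations (here, just column projections) for one
# consistent with the demonstrated output.
import itertools

def project(table, cols):
    return [tuple(row[c] for c in cols) for row in table]

def synthesize(in_table, out_rows, n_cols):
    # enumerate projections until one reproduces the example output
    for k in range(1, n_cols + 1):
        for cols in itertools.permutations(range(n_cols), k):
            if project(in_table, cols) == out_rows:
                return cols
    return None

entities = [("alice", 3, "admin"), ("bob", 1, "user")]
# The user demonstrated extracting (role, name) pairs:
prog = synthesize(entities, [("admin", "alice"), ("user", "bob")], 3)
print(prog)  # (2, 0): project the role column, then the name column
```

Because the tool author only maps entities and actions onto tables, the same generic search serves every domain without a bespoke DSL.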
Verifying Data Constraint Equivalence in FinTech Systems
Data constraints are widely used in FinTech systems for monitoring data
consistency and diagnosing anomalous data manipulations. However, many
equivalent data constraints are created redundantly during the development
cycle, slowing down the FinTech systems and causing unnecessary alerts. We
present EqDAC, an efficient decision procedure to determine the data constraint
equivalence. We first propose the symbolic representation for semantic encoding
and then introduce two lightweight analyses to refute and prove
equivalence, respectively, both of which are proven to run in polynomial time. We
evaluate EqDAC upon 30,801 data constraints in a FinTech system. It is shown
that EqDAC detects 11,538 equivalent data constraints in three hours. It also
supports efficient equivalence searching with an average time cost of 1.22
seconds, enabling the system to check new data constraints upon submission.
Comment: 14 pages, 11 figures, accepted by ICSE 202
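The refute-then-prove structure can be sketched with two deliberately cheap analyses: random testing to refute, and normalization to a canonical form to prove. Both phases below are illustrative stand-ins, not EqDAC's actual symbolic procedures:

```python
# Sketch of a two-phase equivalence check for data constraints,
# modeled here as Python predicates over one record.
import random

def refute(c1, c2, trials=1000):
    # cheap refutation: search for a counterexample input
    rng = random.Random(0)
    for _ in range(trials):
        row = {"amount": rng.randint(-100, 100)}
        if c1(row) != c2(row):
            return row                  # witness of inequivalence
    return None

def canonical(conjuncts):
    # toy proof step: sort conjuncts so syntactic variants coincide
    return tuple(sorted(conjuncts))

c1 = lambda r: 0 < r["amount"] < 50
c2 = lambda r: r["amount"] < 50 and r["amount"] > 0
c3 = lambda r: r["amount"] > 0

print(refute(c1, c2))                   # None: refutation fails
print(canonical(["amount>0", "amount<50"]) ==
      canonical(["amount<50", "amount>0"]))   # True: same normal form
print(refute(c1, c3) is not None)       # True: a witness is found
```

Running the cheap refuter first means most inequivalent pairs never reach the more expensive proving phase, which is what makes large-scale deduplication tractable.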
Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks (Extended Version)
Verifying real-world programs often requires inferring loop invariants with
nonlinear constraints. This is especially true in programs that perform many
numerical operations, such as control systems for avionics or industrial
plants. Recently, data-driven methods for loop invariant inference have shown
promise, especially on linear invariants. However, applying data-driven
inference to nonlinear loop invariants is challenging due to the large number
and magnitude of high-order terms, the potential for overfitting on a small
number of samples, and the large space of possible inequality bounds.
In this paper, we introduce a new neural architecture for general SMT
learning, the Gated Continuous Logic Network (G-CLN), and apply it to nonlinear
loop invariant learning. G-CLNs extend the Continuous Logic Network (CLN)
architecture with gating units and dropout, which allow the model to robustly
learn general invariants over large numbers of terms. To address overfitting
that arises from finite program sampling, we introduce fractional sampling---a
sound relaxation of loop semantics to continuous functions that facilitates
unbounded sampling on the real domain. We additionally design a new CLN activation
function, the Piecewise Biased Quadratic Unit (PBQU), for naturally learning
tight inequality bounds.
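The gating idea can be sketched concretely: in a continuous logic network, conjunction is a product t-norm, and a gate in [0, 1] lets training smoothly include or drop a clause. The formula below is an illustrative gating scheme, not necessarily the paper's exact parameterization:

```python
# Sketch of gated conjunction in a continuous logic network: each clause
# has a truth value t in [0, 1] and a learned gate g in [0, 1].
# With g = 0 the clause contributes 1 (ignored); with g = 1 it
# contributes t (fully active). Illustrative formula, assumed.
def gated_and(truth_values, gates):
    out = 1.0
    for t, g in zip(truth_values, gates):
        out *= (1.0 - g) + g * t        # interpolate between "ignore" and t
    return out

# Two candidate clauses; the second is noise the model learns to gate off.
print(gated_and([0.9, 0.1], [1.0, 0.0]))  # 0.9: noise clause ignored
print(gated_and([0.9, 0.1], [1.0, 1.0]))  # ~0.09: noise clause drags score down
```

During training the gates are continuous parameters adjusted by gradient descent (with dropout over clauses); after convergence, clauses with near-zero gates are pruned, leaving the symbolic invariant.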
We incorporate these methods into a nonlinear loop invariant inference system
that can learn general nonlinear loop invariants. We evaluate our system on a
benchmark of nonlinear loop invariants and show it solves 26 out of 27
problems, 3 more than prior work, with an average runtime of 53.3 seconds. We
further demonstrate the generic learning ability of G-CLNs by solving all 124
problems in the linear Code2Inv benchmark. We also perform a quantitative
stability evaluation and show that G-CLNs converge more reliably on
quadratic problems than CLN models.