Proving Non-Termination via Loop Acceleration
We present the first approach to prove non-termination of integer programs
that is based on loop acceleration. If our technique cannot show
non-termination of a loop, it tries to accelerate it instead in order to find
paths to other non-terminating loops automatically. The prerequisites for our
novel loop acceleration technique generalize a simple yet effective
non-termination criterion. Thus, we can use the same program transformations to
facilitate both non-termination proving and loop acceleration. In particular,
we present a novel invariant inference technique that is tailored to our
approach. An extensive evaluation of our fully automated tool LoAT shows that
it is competitive with the state of the art.
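The simple non-termination criterion generalized here can be illustrated on a single integer loop: if the loop guard is preserved by the update, the guard is a recurrent set and the loop cannot terminate from any state that satisfies it. A minimal sketch of that check, written against the Z3 SMT solver rather than the authors' LoAT implementation (the loop and variable names are purely illustrative):

    # Loop under analysis:  while x >= y:  x, y = x + 1, y - 1
    # If the guard still holds after one update, it is a recurrent set and the
    # loop is non-terminating from every state that satisfies the guard.
    from z3 import Ints, Implies, Not, Solver, unsat

    x, y = Ints("x y")
    guard = x >= y
    guard_after_update = (x + 1) >= (y - 1)   # guard with the update substituted in

    s = Solver()
    # guard => guard-after-update is valid iff its negation is unsatisfiable.
    s.add(Not(Implies(guard, guard_after_update)))
    if s.check() == unsat:
        print("guard is invariant under the update: non-terminating for all x >= y")
    else:
        print("simple criterion inconclusive; accelerate the loop and search further")

When the check is inconclusive, the strategy described in the abstract is to accelerate the loop instead and look for paths to other non-terminating loops.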
Learning Fitness Functions for Machine Programming
The problem of automatic software generation is known as Machine Programming.
In this work, we propose a framework based on genetic algorithms to solve this
problem. Although genetic algorithms have been used successfully for many
problems, one criticism is that hand-crafting their fitness function, the test
meant to guide their evolution effectively, can be notably challenging. Our
framework presents a novel approach to learn the fitness function using neural
networks to predict values of ideal fitness functions. We also augment the
evolutionary process with a minimally intrusive search heuristic. This
heuristic improves the framework's ability to discover correct programs from
ones that are approximately correct and does so with negligible computational
overhead. We compare our approach with several state-of-the-art program
synthesis methods and demonstrate that it finds more correct programs with
fewer candidate program generations.
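A rough sketch of the underlying idea, independent of the authors' system: a genetic-algorithm loop scores candidates with a learned predictor instead of a hand-written test. Here the neural model is stubbed out by a hypothetical learned_fitness function and candidate programs are abstracted to integer vectors:

    import random

    # Hypothetical stand-in for a trained neural model that predicts how close a
    # candidate is to an ideal fitness value (0 means a correct program).
    def learned_fitness(candidate):
        target = [3, 1, 4, 1, 5]
        return -sum(abs(a - b) for a, b in zip(candidate, target))

    def mutate(candidate):
        child = list(candidate)
        child[random.randrange(len(child))] += random.choice([-1, 1])
        return child

    def crossover(a, b):
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def evolve(pop_size=50, generations=200, genome_len=5):
        population = [[random.randint(0, 9) for _ in range(genome_len)]
                      for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=learned_fitness, reverse=True)
            if learned_fitness(population[0]) == 0:   # correct program found
                break
            survivors = population[:pop_size // 2]
            population = survivors + [
                mutate(crossover(random.choice(survivors), random.choice(survivors)))
                for _ in range(pop_size - len(survivors))]
        return max(population, key=learned_fitness)

    print(evolve())

In the paper's setting the predictor is a trained neural network and the individuals are candidate programs; the stub only shows where such a model plugs into the evolutionary loop.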
FlashProfile: A Framework for Synthesizing Data Profiles
We address the problem of learning a syntactic profile for a collection of
strings, i.e. a set of regex-like patterns that succinctly describe the
syntactic variations in the strings. Real-world datasets, typically curated
from multiple sources, often contain data in various syntactic formats. Thus,
any data processing task is preceded by the critical step of data format
identification. However, manual inspection of data to identify the different
formats is infeasible in standard big-data scenarios.
Prior techniques are restricted to a small set of pre-defined patterns (e.g.
digits, letters, words, etc.), and provide no control over granularity of
profiles. We define syntactic profiling as a problem of clustering strings
based on syntactic similarity, followed by identifying patterns that succinctly
describe each cluster. We present a technique for synthesizing such profiles
over a given language of patterns, that also allows for interactive refinement
by requesting a desired number of clusters.
Using a state-of-the-art inductive synthesis framework, PROSE, we have
implemented our technique as FlashProfile. Across tasks over large
real datasets, we observe a median profiling time of only s.
Furthermore, we show that access to syntactic profiles may allow for more
accurate synthesis of programs, i.e. using fewer examples, in
programming-by-example (PBE) workflows such as FlashFill.
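The profiling-as-clustering formulation can be pictured with a much cruder stand-in than FlashProfile/PROSE: map each string to a sequence of token classes and group strings whose sequences coincide, emitting one pattern per group. The shape function and its token classes below are assumptions for illustration, not the paper's pattern language:

    import re
    from collections import defaultdict

    # Map each string to a crude "shape": a sequence of token classes.
    def shape(s):
        tokens = re.findall(r"[0-9]+|[A-Za-z]+|.", s)
        return tuple("Digits" if t.isdigit() else "Letters" if t.isalpha() else t
                     for t in tokens)

    # Cluster strings with identical shapes; the shape serves as the cluster's pattern.
    def profile(strings):
        clusters = defaultdict(list)
        for s in strings:
            clusters[shape(s)].append(s)
        return clusters

    data = ["2024-01-15", "1999-12-31", "N/A", "n/a", "15 Jan 2024"]
    for pattern, members in profile(data).items():
        print(" ".join(pattern), "->", members)
    # e.g. "Digits - Digits - Digits" describes the ISO-style dates

FlashProfile instead clusters strings by a syntactic dissimilarity and synthesizes patterns over a given pattern language, which is what enables control over profile granularity and interactive refinement.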
Termination of Rewriting with and Automated Synthesis of Forbidden Patterns
We introduce a modified version of the well-known dependency pair framework
that is suitable for the termination analysis of rewriting under forbidden
pattern restrictions. By attaching contexts to dependency pairs that represent
the calling contexts of the corresponding recursive function calls, it is
possible to incorporate the forbidden pattern restrictions in the (adapted)
notion of dependency pair chains, thus yielding a sound and complete approach
to termination analysis. Building upon this contextual dependency pair
framework we introduce a dependency pair processor that simplifies problems by
analyzing the contextual information of the dependency pairs. Moreover, we show
how this processor can be used to synthesize forbidden patterns suitable for a
given term rewriting system on-the-fly during the termination analysis.
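For readers unfamiliar with forbidden patterns, a standard motivating example from this line of work is the rule from(x) -> cons(x, from(s(x))): unrestricted rewriting never terminates, but rewriting terminates once reductions at or below the second argument of cons are forbidden. A small sketch of that restricted rewrite relation (illustrative only; this is not the contextual dependency pair framework itself):

    # Rule: from(x) -> cons(x, from(s(x))); terms are (symbol, [arguments]) pairs.
    # Forbidden pattern: never rewrite at or below the second argument of cons.
    def rewrite_step(term, inside_cons_tail=False):
        """Return the term after one allowed step, or None if none applies."""
        if isinstance(term, str):                 # constants: no redex
            return None
        head, args = term
        if head == "from" and not inside_cons_tail:
            x = args[0]
            return ("cons", [x, ("from", [("s", [x])])])
        for i, arg in enumerate(args):
            forbidden = inside_cons_tail or (head == "cons" and i == 1)
            reduced = rewrite_step(arg, forbidden)
            if reduced is not None:
                return (head, args[:i] + [reduced] + args[i + 1:])
        return None

    term, steps = ("from", ["0"]), 0
    while True:
        nxt = rewrite_step(term)
        if nxt is None:                           # normal form of the restricted relation
            break
        term, steps = nxt, steps + 1
    print(term)                                   # cons(0, from(s(0))), reached in 1 step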
Synthesizing Certified Code
Code certification is a lightweight approach for formally demonstrating software quality. Its basic idea is to require code producers to provide formal proofs that their code satisfies certain quality properties. These proofs serve as certificates that can be checked independently. Since code certification uses the same underlying technology as program verification, it requires detailed annotations (e.g., loop invariants) to make the proofs possible. However, manually adding annotations to the code is time-consuming and error-prone. We address this problem by combining code certification with automatic program synthesis. Given a high-level specification, our approach simultaneously generates code and all annotations required to certify the generated code. We describe a certification extension of AutoBayes, a synthesis tool for automatically generating data analysis programs. Based on built-in domain knowledge, proof annotations are added and used to generate proof obligations that are discharged by the automated theorem prover E-SETHEO. We demonstrate our approach by certifying operator- and memory-safety on a data-classification program. For this program, our approach was faster and more precise than PolySpace, a commercial static analysis tool.
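The shape of the output can be sketched as follows: the synthesizer emits code together with the annotations a certifier needs. In this hypothetical Python rendering the certificates are runtime-checkable assertions; in the paper they are logical annotations generated by AutoBayes and discharged statically by E-SETHEO:

    # Hypothetical synthesized routine plus the annotations needed to certify it.
    def synthesized_mean(xs):
        n = len(xs)
        assert n > 0                          # precondition from the specification
        total, i = 0.0, 0
        while i < n:
            assert 0 <= i < n                 # memory safety: index stays in bounds
            assert total == sum(xs[:i])       # loop invariant
            total += xs[i]
            i += 1
        assert total == sum(xs)               # postcondition follows from the invariant
        return total / n

    print(synthesized_mean([1.0, 2.0, 3.0]))  # 2.0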
Network-wide Configuration Synthesis
Computer networks are hard to manage. Given a set of high-level requirements
(e.g., reachability, security), operators have to manually figure out the
individual configuration of potentially hundreds of devices running complex
distributed protocols so that they, collectively, compute a compatible
forwarding state. Not surprisingly, operators often make mistakes which lead to
downtimes. To address this problem, we present a novel synthesis approach that
automatically computes correct network configurations that comply with the
operator's requirements. We capture the behavior of existing routers along with
the distributed protocols they run in stratified Datalog. Our key insight is to
reduce the problem of finding correct input configurations to the task of
synthesizing inputs for a stratified Datalog program. To solve this synthesis
task, we introduce a new algorithm that synthesizes inputs for stratified
Datalog programs. This algorithm is applicable beyond the domain of networks.
We leverage our synthesis algorithm to construct the first network-wide
configuration synthesis system, called SyNET, that supports multiple interacting
routing protocols (OSPF and BGP) and static routes. We show that our system is
practical and can infer correct input configurations, in a reasonable amount of
time, for networks of realistic size (> 50 routers) that forward packets for
multiple traffic classes.
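The key reduction can be illustrated at toy scale (this is not SyNET; the router names and requirement are made up): the protocol semantics is a two-rule stratified Datalog program for reachability, and configuration synthesis becomes a search for an input relation link/2 under which the requirement holds:

    from itertools import combinations

    # Two-rule stratified Datalog program:  reach(X,Y) :- link(X,Y).
    #                                       reach(X,Z) :- reach(X,Y), link(Y,Z).
    def reachable(links):
        reach = set(links)
        changed = True
        while changed:                       # naive bottom-up fixpoint
            changed = False
            for (x, y) in list(reach):
                for (y2, z) in links:
                    if y == y2 and (x, z) not in reach:
                        reach.add((x, z))
                        changed = True
        return reach

    routers = ["A", "B", "C", "D"]
    candidate_links = [(u, v) for u in routers for v in routers if u != v]
    requirement = ("A", "C")                 # high-level requirement: A must reach C
    direct_link_banned = ("A", "C")          # but no direct A-C link may be configured

    # Naive input synthesis: smallest link relation under which the requirement holds.
    for k in range(len(candidate_links) + 1):
        solution = next((set(ls) for ls in combinations(candidate_links, k)
                         if direct_link_banned not in ls
                         and requirement in reachable(set(ls))), None)
        if solution is not None:
            print("synthesized configuration:", sorted(solution))
            break

SyNET replaces this brute-force enumeration with a dedicated synthesis algorithm for stratified Datalog inputs and models OSPF, BGP, and static routes rather than plain reachability.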
Spatial Aggregation: Theory and Applications
Visual thinking plays an important role in scientific reasoning. Based on the
research in automating diverse reasoning tasks about dynamical systems,
nonlinear controllers, kinematic mechanisms, and fluid motion, we have
identified a style of visual thinking, imagistic reasoning. Imagistic reasoning
organizes computations around image-like, analogue representations so that
perceptual and symbolic operations can be brought to bear to infer structure
and behavior. Programs incorporating imagistic reasoning have been shown to
perform at an expert level in domains that defy current analytic or numerical
methods. We have developed a computational paradigm, spatial aggregation, to
unify the description of a class of imagistic problem solvers. A program
written in this paradigm has the following properties. It takes a continuous
field and optional objective functions as input, and produces high-level
descriptions of structure, behavior, or control actions. It computes a
multi-layer hierarchy of intermediate representations, called spatial aggregates, by
forming equivalence classes and adjacency relations. It employs a small set of
generic operators such as aggregation, classification, and localization to
perform bidirectional mapping between the information-rich field and
successively more abstract spatial aggregates. It uses a data structure, the
neighborhood graph, as a common interface to modularize computations. To
illustrate our theory, we describe the computational structure of three
implemented problem solvers -- KAM, MAPS, and HIPAIR -- in terms of the
spatial aggregation generic operators by mixing and matching a library of
commonly used routines.
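A minimal sketch of the generic operators on a concrete field (illustrative only; KAM, MAPS, and HIPAIR operate on far richer fields and aggregates): classify the points of a sampled scalar field, use grid adjacency as the neighborhood graph, and aggregate neighboring points of the same class into spatial aggregates:

    # Classify points of a sampled scalar field, link neighbors in a grid
    # neighborhood graph, and aggregate connected points of the same class.
    def aggregate(field):
        rows, cols = len(field), len(field[0])
        classify = lambda v: "positive" if v > 0 else "non-positive"
        seen, aggregates = set(), []
        for r in range(rows):
            for c in range(cols):
                if (r, c) in seen:
                    continue
                label, stack, component = classify(field[r][c]), [(r, c)], []
                while stack:                  # flood fill over the neighborhood graph
                    pr, pc = stack.pop()
                    if (pr, pc) in seen:
                        continue
                    seen.add((pr, pc))
                    component.append((pr, pc))
                    for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and classify(field[nr][nc]) == label):
                            stack.append((nr, nc))
                aggregates.append((label, component))
        return aggregates

    field = [[ 1,  2, -1],
             [ 3, -2, -1],
             [-1, -1,  4]]
    for label, cells in aggregate(field):
        print(label, cells)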