42,435 research outputs found
FlashProfile: A Framework for Synthesizing Data Profiles
We address the problem of learning a syntactic profile for a collection of
strings, i.e. a set of regex-like patterns that succinctly describe the
syntactic variations in the strings. Real-world datasets, typically curated
from multiple sources, often contain data in various syntactic formats. Thus,
any data processing task is preceded by the critical step of data format
identification. However, manual inspection of data to identify the different
formats is infeasible in standard big-data scenarios.
Prior techniques are restricted to a small set of pre-defined patterns (e.g.
digits, letters, words, etc.), and provide no control over granularity of
profiles. We define syntactic profiling as a problem of clustering strings
based on syntactic similarity, followed by identifying patterns that succinctly
describe each cluster. We present a technique for synthesizing such profiles
over a given language of patterns, that also allows for interactive refinement
by requesting a desired number of clusters.
Using a state-of-the-art inductive synthesis framework, PROSE, we have
implemented our technique as FlashProfile. Across tasks over large
real datasets, we observe a median profiling time of only s.
Furthermore, we show that access to syntactic profiles may allow for more
accurate synthesis of programs, i.e. using fewer examples, in
programming-by-example (PBE) workflows such as FlashFill.Comment: 28 pages, SPLASH (OOPSLA) 201
Stepping Stones to Inductive Synthesis of Low-Level Looping Programs
Inductive program synthesis, from input/output examples, can provide an
opportunity to automatically create programs from scratch without presupposing
the algorithmic form of the solution. For induction of general programs with
loops (as opposed to loop-free programs, or synthesis for domain-specific
languages), the state of the art is at the level of introductory programming
assignments. Most problems that require algorithmic subtlety, such as fast
sorting, have remained out of reach without the benefit of significant
problem-specific background knowledge. A key challenge is to identify cues that
are available to guide search towards correct looping programs. We present
MAKESPEARE, a simple delayed-acceptance hillclimbing method that synthesizes
low-level looping programs from input/output examples. During search, delayed
acceptance bypasses small gains to identify significantly-improved stepping
stone programs that tend to generalize and enable further progress. The method
performs well on a set of established benchmarks, and succeeds on the
previously unsolved "Collatz Numbers" program synthesis problem. Additional
benchmarks include the problem of rapidly sorting integer arrays, in which we
observe the emergence of comb sort (a Shell sort variant that is empirically
fast). MAKESPEARE has also synthesized a record-setting program on one of the
puzzles from the TIS-100 assembly language programming game.Comment: AAAI 201
Program Synthesis using Natural Language
Interacting with computers is a ubiquitous activity for millions of people.
Repetitive or specialized tasks often require creation of small, often one-off,
programs. End-users struggle with learning and using the myriad of
domain-specific languages (DSLs) to effectively accomplish these tasks.
We present a general framework for constructing program synthesizers that
take natural language (NL) inputs and produce expressions in a target DSL. The
framework takes as input a DSL definition and training data consisting of
NL/DSL pairs. From these it constructs a synthesizer by learning optimal
weights and classifiers (using NLP features) that rank the outputs of a
keyword-programming based translation. We applied our framework to three
domains: repetitive text editing, an intelligent tutoring system, and flight
information queries. On 1200+ English descriptions, the respective synthesizers
rank the desired program as the top-1 and top-3 for 80% and 90% descriptions
respectively
Invariant Synthesis for Incomplete Verification Engines
We propose a framework for synthesizing inductive invariants for incomplete
verification engines, which soundly reduce logical problems in undecidable
theories to decidable theories. Our framework is based on the counter-example
guided inductive synthesis principle (CEGIS) and allows verification engines to
communicate non-provability information to guide invariant synthesis. We show
precisely how the verification engine can compute such non-provability
information and how to build effective learning algorithms when invariants are
expressed as Boolean combinations of a fixed set of predicates. Moreover, we
evaluate our framework in two verification settings, one in which verification
engines need to handle quantified formulas and one in which verification
engines have to reason about heap properties expressed in an expressive but
undecidable separation logic. Our experiments show that our invariant synthesis
framework based on non-provability information can both effectively synthesize
inductive invariants and adequately strengthen contracts across a large suite
of programs
Verification and Synthesis of Symmetric Uni-Rings for Leads-To Properties
This paper investigates the verification and synthesis of parameterized
protocols that satisfy leadsto properties on symmetric
unidirectional rings (a.k.a. uni-rings) of deterministic and constant-space
processes under no fairness and interleaving semantics, where and are
global state predicates. First, we show that verifying for
parameterized protocols on symmetric uni-rings is undecidable, even for
deterministic and constant-space processes, and conjunctive state predicates.
Then, we show that surprisingly synthesizing symmetric uni-ring protocols that
satisfy is actually decidable. We identify necessary and
sufficient conditions for the decidability of synthesis based on which we
devise a sound and complete polynomial-time algorithm that takes the predicates
and , and automatically generates a parameterized protocol that
satisfies for unbounded (but finite) ring sizes. Moreover, we
present some decidability results for cases where leadsto is required from
multiple distinct predicates to different predicates. To demonstrate
the practicality of our synthesis method, we synthesize some parameterized
protocols, including agreement and parity protocols
- …