4,344 research outputs found
A Symbolic Execution Algorithm for Constraint-Based Testing of Database Programs
In so-called constraint-based testing, symbolic execution is a common
technique used as a part of the process to generate test data for imperative
programs. Databases are ubiquitous in software and testing of programs
manipulating databases is thus essential to enhance the reliability of
software. This work proposes and evaluates experimentally a symbolic ex-
ecution algorithm for constraint-based testing of database programs. First, we
describe SimpleDB, a formal language which offers a minimal and well-defined
syntax and seman- tics, to model common interaction scenarios between pro-
grams and databases. Secondly, we detail the proposed al- gorithm for symbolic
execution of SimpleDB models. This algorithm considers a SimpleDB program as a
sequence of operations over a set of relational variables, modeling both the
database tables and the program variables. By inte- grating this relational
model of the program with classical static symbolic execution, the algorithm
can generate a set of path constraints for any finite path to test in the
control- flow graph of the program. Solutions of these constraints are test
inputs for the program, including an initial content for the database. When the
program is executed with respect to these inputs, it is guaranteed to follow
the path with re- spect to which the constraints were generated. Finally, the
algorithm is evaluated experimentally using representative SimpleDB models.Comment: 12 pages - preliminary wor
Recommended from our members
Automated synthesis of data extraction and transformation programs
Due to the abundance of data in today’s data-rich world, end-users increasingly need to perform various data extraction and transformation tasks. While many of these tedious tasks can be performed in a programmatic way, most end-users lack the required programming expertise to automate them and end up spending their valuable time in manually performing various data- related tasks. The field of program synthesis aims to overcome this problem by automatically generating programs from informal specifications, such as input-output examples or natural language.
This dissertation focuses on the design and implementation of new systems for automating important classes of data transformation and extraction tasks. It introduces solutions for automating data manipulation tasks on fully- structured data formats like relational tables, or on semi-structured formats such as XML and JSON documents.
First, we describe a novel algorithm for synthesizing hierarchical data transformations from input-output examples. A key novelty of our approach is that it reduces the synthesis of tree transformations to the simpler problem of synthesizing transformations over the paths of the tree. We also describe a new and effective algorithm for learning path transformations that combines logical SMT-based reasoning with machine learning techniques based on decision trees.
Next, we present a new methodology for learning programs that migrate tree-structured documents to relational table representations from input-output examples. Our approach achieves its goal by decomposing the synthesis task to two subproblems of (A) learning the column extraction logic, and (B) learning the row extraction logic. We propose a technique for learning column extraction programs using deterministic finite automata, and a new algorithm for predicate learning which combines integer linear programing and logic minimization.
Finally, we address the problem of automating data extraction tasks from natural language. Specifically, we focus on data retrieval from relational databases and describe a novel approach for learning SQL queries from English descriptions. The method we describe is fully automatic and database-agnostic
(i.e., does not require customization for each database). Our method combines semantic parsing techniques from the NLP community with novel programming languages ideas involving probabilistic type inhabitation and automated sketch repair.Computer Science
- …