Learning Program Specifications from Sample Runs

Abstract

With science fiction of yore being reality recently with self-driving cars, wearable computers and autonomous robots, software reliability is growing increasingly important. A critical pre-requisite to ensure the software that controls such systems is correct is the availability of precise specifications that describe a program\u27s intended behaviors. Generating these specifications manually is a challenging, often unsuccessful, exercise; unfortunately, existing static analysis techniques often produce poor quality specifications that are ineffective in aiding program verification tasks. In this dissertation, we present a recent line of work on automated synthesis of specifications that overcome many of the deficiencies that plague existing specification inference methods. Our main contribution is a formulation of the problem as a sample driven one, in which specifications, represented as terms in a decidable refinement type representation, are discovered from observing a program\u27s sample runs in terms of either program execution paths or input-output values, and automatically verified through the use of expressive refinement type systems. Our approach is realized as a series of inductive synthesis frameworks, which use various logic-based or classification-based learning algorithms to provide sound and precise machine-checked specifications. Experimental results indicate that the learning algorithms are both efficient and effective, capable of automatically producing sophisticated specifications in nontrivial hypothesis domains over a range of complex real-world programs, going well beyond the capabilities of existing solutions

    Similar works