A Semantics for Approximate Program Transformations
An approximate program transformation is a transformation that can change the
semantics of a program within a specified empirical error bound. Such
transformations have wide applications: they can decrease computation time,
power consumption, and memory usage, and can, in some cases, allow
implementations of incomputable operations. Correctness proofs of approximate
program transformations are by definition quantitative. Unfortunately, unlike
with standard program transformations, there is as of yet no modular way to
prove correctness of an approximate transformation itself. Error bounds must be
proved for each transformed program individually, and must be re-proved each
time a program is modified or a different set of approximations are applied. In
this paper, we give a semantics that enables quantitative reasoning about a
large class of approximate program transformations in a local, composable way.
Our semantics is based on a notion of distance between programs that defines
what it means for an approximate transformation to be correct up to an error
bound. The key insight is that distances between programs cannot in general be
formulated in terms of metric spaces and real numbers. Instead, our semantics
admits natural notions of distance for each type construct; for example,
numbers are used as distances for numerical data, functions are used as
distances for functional data, and polymorphic lambda-terms are used as
distances for polymorphic data. We then show how our semantics applies to two
example approximations: replacing reals with floating-point numbers, and loop
perforation.
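The type-indexed view of distance described above can be illustrated with a small sketch. Everything here is hypothetical toy code, not the paper's formal semantics: distances between numbers are numbers, while distances between functions are themselves functions.

```python
# Hedged sketch of type-indexed distances between programs; the names below
# are illustrative assumptions, not the paper's actual constructions.

def dist_num(x, y):
    # distances between numerical values are numbers
    return abs(x - y)

def dist_fun(f, g, dist_out):
    # distances between functions are themselves functions: at each input,
    # they give the distance between the two outputs
    return lambda x: dist_out(f(x), g(x))

# Example: compare an exact function with a hypothetical approximation
exact = lambda x: x * x
approx = lambda x: round(x) * round(x)
d = dist_fun(exact, approx, dist_num)   # d(x) is the error at input x
```

Because `d` is a function rather than a single real number, it can report that the approximation is exact at integer inputs while bounding the error elsewhere.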
Randomized accuracy-aware program transformations for efficient approximate computations
Although approximate computations have come to dominate many areas of computer science, the field of program transformations has focused almost exclusively on traditional semantics-preserving transformations. Such transformations do not exploit the opportunity, available in many computations, to acceptably trade off accuracy for benefits such as increased performance and reduced resource consumption.
We present a model of computation for approximate computations and an algorithm for optimizing these computations. The algorithm works with two classes of transformations: substitution transformations (which select one of a number of available implementations for a given function, with each implementation offering a different combination of accuracy and resource consumption) and sampling transformations (which randomly discard some of the inputs to a given reduction). The algorithm produces a (1+ε) randomized approximation to the optimal randomized computation (which minimizes resource consumption subject to a probabilistic accuracy specification in the form of a maximum expected error or maximum error variance).
National Science Foundation (U.S.) (Grant numbers CCF-0811397, CCF-0905244, CCF-0843915, CCF-1036241, IIS-0835652); United States Dept. of Energy (Grant number DE-SC0005288); Alfred P. Sloan Foundation Fellowship.
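A sampling transformation of the kind described above can be sketched as follows. This is an illustrative assumption of the general idea, not the paper's algorithm: inputs to a sum reduction are randomly discarded, and the result is rescaled so the estimate is unbiased in expectation.

```python
import random

# Hedged sketch of a sampling transformation on a sum reduction: discard each
# input with probability 1 - keep_prob, then rescale. `perforated_sum` is a
# hypothetical name, not from the paper.

def perforated_sum(xs, keep_prob, rng=None):
    rng = rng or random.Random(0)
    kept = [x for x in xs if rng.random() < keep_prob]
    if not kept:
        return 0.0
    return sum(kept) / keep_prob   # rescaling keeps the estimate unbiased
```

Lower `keep_prob` reduces work at the cost of higher variance, which is exactly the accuracy/resource trade-off the optimization algorithm navigates.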
Reasoning about Relaxed Programs
A number of approximate program transformations have recently emerged that enable transformed programs to trade accuracy of their results for increased performance by dynamically and nondeterministically modifying variables that control program execution. We call such transformed programs relaxed programs -- they have been extended with additional nondeterminism to relax their semantics and offer greater execution flexibility. We present programming language constructs for developing relaxed programs and proof rules for reasoning about properties of relaxed programs. Our proof rules enable programmers to directly specify and verify acceptability properties that characterize the desired correctness relationships between the values of variables in a program's original semantics (before transformation) and its relaxed semantics. Our proof rules also support the verification of safety properties (which characterize desirable properties involving values in individual executions). The rules are designed to support a reasoning approach in which the majority of the reasoning effort uses the original semantics. This effort is then reused to establish the desired properties of the program under the relaxed semantics. We have formalized the dynamic semantics of our target programming language and the proof rules in Coq, and verified that the proof rules are sound with respect to the dynamic semantics. Our Coq implementation enables developers to obtain fully machine-checked verifications of their relaxed programs.
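The relationship between an original and a relaxed execution can be pictured with a minimal sketch. The functions below are hypothetical toy examples, not the paper's language constructs: the relaxed version nondeterministically perturbs its result, and an acceptability property relates the two semantics.

```python
import random

# Hedged toy sketch: a relaxed execution versus its original semantics.

def original_step(x):
    return 2 * x                         # original, deterministic semantics

def relaxed_step(x, rng):
    # relaxed semantics: nondeterministically perturb the result
    return 2 * x + rng.choice([-1, 0, 1])

def acceptable(orig_result, relaxed_result, bound=1):
    # acceptability property: the relaxed result stays within `bound`
    # of the result under the original semantics
    return abs(orig_result - relaxed_result) <= bound
```

The proof rules described above let one establish `acceptable` once over the original semantics and reuse that effort for every relaxed execution, rather than reasoning about each nondeterministic choice separately.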
Learning a Static Analyzer from Data
To be practically useful, modern static analyzers must precisely model the
effect of both statements in the programming language and frameworks used by
the program under analysis. While important, manually addressing these
challenges is difficult for at least two reasons: (i) the effects on the
overall analysis can be non-trivial, and (ii) as the size and complexity of
modern libraries increase, so does the number of cases the analysis must handle.
In this paper we present a new, automated approach for creating static
analyzers: instead of manually providing the various inference rules of the
analyzer, the key idea is to learn these rules from a dataset of programs. Our
method consists of two ingredients: (i) a synthesis algorithm capable of
learning a candidate analyzer from a given dataset, and (ii) a counter-example
guided learning procedure which generates new programs beyond those in the
initial dataset, critical for discovering corner cases and ensuring the learned
analysis generalizes to unseen programs.
We implemented and instantiated our approach to the task of learning
JavaScript static analysis rules for a subset of points-to analysis and for
allocation sites analysis. These are challenging yet important problems that
have received significant research attention. We show that our approach is
effective: our system automatically discovered practical and useful inference
rules for many cases that are tricky to manually identify and are missed by
state-of-the-art, manually tuned analyzers.
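The counter-example guided learning loop described above can be sketched generically. The loop below is an illustrative assumption of the overall structure, not the paper's synthesis algorithm; `synthesize`, `check`, and the threshold toy instantiation are hypothetical.

```python
# Hedged sketch of counter-example guided learning: fit a candidate analyzer
# to a dataset, search for a program where it misbehaves, and learn from it.

def cegis_learn(synthesize, check, initial_dataset, max_iters=20):
    dataset = list(initial_dataset)
    for _ in range(max_iters):
        candidate = synthesize(dataset)    # fit a candidate to the data
        counterexample = check(candidate)  # generate a program that breaks it
        if counterexample is None:
            return candidate               # no disagreement found
        dataset.append(counterexample)     # corner case: add it and retry
    return None

# Toy instantiation: learn the threshold t such that "x is safe iff x < t".
TRUE_BOUND = 7

def synthesize(dataset):
    unsafe = [x for x in dataset if x >= TRUE_BOUND]
    return min(unsafe) if unsafe else 100  # most permissive consistent t

def check(t):
    for x in range(20):                    # programs beyond the dataset
        if (x < t) != (x < TRUE_BOUND):
            return x                       # counterexample found
    return None
```

Starting from a dataset that contains no unsafe example, the first candidate overgeneralizes; the checker supplies the missing corner case and the loop converges.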
Program development using abstract interpretation (and the Ciao system preprocessor)
The technique of Abstract Interpretation has allowed the development of very sophisticated global program analyses which are at the same time provably correct and practical. We present in a tutorial fashion a novel program development framework which uses abstract interpretation
as a fundamental tool. The framework uses modular, incremental abstract interpretation to obtain information about the program. This information is used to validate programs, to detect bugs with respect to partial specifications written using assertions (in the program itself and/or in system libraries), to generate and simplify run-time tests, and to perform high-level program transformations such as multiple abstract specialization, parallelization, and resource usage control, all in a provably correct way. In the case of validation and debugging, the assertions can refer to a variety of program points such as procedure entry, procedure exit, points within procedures, or global computations. The system can reason with much richer information than, for example, traditional types. This includes data structure shape (including pointer sharing), bounds on data structure sizes, and other operational variable instantiation properties, as well as procedure-level properties such as determinacy, termination, non-failure, and bounds on resource consumption (time or space cost). CiaoPP, the preprocessor of the Ciao multi-paradigm programming system, which implements the described functionality, will be used to illustrate the fundamental ideas.
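The assertion-driven workflow can be pictured with a small sketch in Python rather than Ciao's actual assertion language: entry/exit assertions on a procedure become run-time tests, which a preprocessor like CiaoPP would statically discharge where analysis can prove them. The decorator and example procedure below are hypothetical.

```python
# Hedged Python sketch (not Ciao syntax) of assertions as run-time tests:
# checks the analysis cannot discharge statically remain as dynamic checks.

def pred(entry, exit_):
    def wrap(f):
        def checked(*args):
            assert entry(*args), "entry assertion violated"
            result = f(*args)
            assert exit_(result), "exit assertion violated"
            return result
        return checked
    return wrap

@pred(entry=lambda xs: all(isinstance(x, int) for x in xs),
      exit_=lambda r: isinstance(r, int))
def total(xs):
    # hypothetical procedure with a partial specification
    return sum(xs)
```

In the framework described above, a successful static analysis would simplify away both checks, leaving the original procedure with a machine-verified specification.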
Approximation with Error Bounds in Spark
We introduce a sampling framework to support approximate computing with
estimated error bounds in Spark. Our framework allows sampling to be performed
at the beginning of a sequence of multiple transformations ending in an
aggregation operation. The framework constructs a data provenance tree as the
computation proceeds, then combines the tree with multi-stage sampling and
population estimation theories to compute error bounds for the aggregation.
When information about output keys is available early, the framework can also
use adaptive stratified reservoir sampling to avoid (or reduce) key losses in
the final output and to achieve more consistent error bounds across popular and
rare keys. Finally, the framework includes an algorithm to dynamically choose
sampling rates to meet user specified constraints on the CDF of error bounds in
the outputs. We have implemented a prototype of our framework called
ApproxSpark, and used it to implement five approximate applications from
different domains. Evaluation results show that ApproxSpark can (a)
significantly reduce execution time if users can tolerate small amounts of
uncertainties and, in many cases, loss of rare keys, and (b) automatically find
sampling rates to meet user specified constraints on error bounds. We also
extensively explore and discuss the trade-offs between sampling rates,
execution time, accuracy, and key loss.
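The core idea of sampling with an estimated error bound can be sketched in a single-stage form. This is an illustrative assumption, not ApproxSpark's multi-stage estimator: a simple random sample of the data yields an estimated sum together with a normal-approximation confidence interval.

```python
import math
import random

# Hedged one-stage sketch of sampling with an error bound; ApproxSpark
# combines multi-stage sampling with a provenance tree, which this omits.
# `z` is a hypothetical confidence parameter (1.96 ~ 95% confidence).

def sample_mean_with_bound(data, rate, z=1.96, rng=None):
    rng = rng or random.Random(0)
    n = max(2, int(len(data) * rate))
    sample = rng.sample(data, n)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    est_sum = mean * len(data)                            # scaled-up estimate
    half_width = z * len(data) * math.sqrt(var / n)       # +/- bound on sum
    return est_sum, half_width
```

Raising the sampling rate shrinks `half_width` at the cost of more work, which is the knob the framework tunes to meet user-specified constraints on the error-bound CDF.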