Symbolic Execution for Randomized Programs
We propose a symbolic execution method for programs that can draw random
samples. In contrast to existing work, our method can verify randomized
programs with unknown inputs and can prove probabilistic properties that
universally quantify over all possible inputs. Our technique augments standard
symbolic execution with a new class of \emph{probabilistic symbolic variables},
which represent the results of random draws, and computes symbolic expressions
representing the probability of taking individual paths. We implement our
method on top of the \textsc{KLEE} symbolic execution engine alongside multiple
optimizations and use it to prove properties about probabilities and expected
values for a range of challenging case studies written in C++, including
Freivalds' algorithm, randomized quicksort, and a randomized property-testing
algorithm for monotonicity. We evaluate our method against \textsc{Psi}, an
exact probabilistic symbolic inference engine, and \textsc{Storm}, a
probabilistic model checker, and show that our method significantly outperforms
both tools.

Comment: 47 pages, 9 figures, to appear at OOPSLA 202
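The core idea — treating each random draw as a probabilistic symbolic variable and summing the probabilities of the paths on which a property holds — can be illustrated with a small sketch. This is in Python rather than the paper's KLEE/C++ setting, and it enumerates concrete coin-flip outcomes instead of keeping symbolic expressions over unknown inputs; all names are illustrative:

```python
from fractions import Fraction
from itertools import product

def path_probabilities(program, n_flips, bias=Fraction(1, 2)):
    """Enumerate every execution path of `program`, which reads up to
    `n_flips` coin-flip outcomes, and weight each path's result by the
    probability of the random draws taken along it."""
    dist = {}
    for outcomes in product([0, 1], repeat=n_flips):
        p = Fraction(1)
        for o in outcomes:
            p *= bias if o == 1 else 1 - bias
        result = program(list(outcomes))
        dist[result] = dist.get(result, Fraction(0)) + p
    return dist

# Toy program: succeed iff at least one of two fair coin flips is heads.
def at_least_one_heads(flips):
    return int(flips[0] == 1 or flips[1] == 1)

dist = path_probabilities(at_least_one_heads, 2)
print(dist[1])  # probability of success: 3/4
```

Using exact rationals rather than floats mirrors the exact path probabilities the method computes; the paper's technique additionally handles unknown (non-random) inputs symbolically, which this sketch does not attempt.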
Incremental Static Analysis of Probabilistic Programs
Probabilistic models are used successfully in a wide range of fields, including machine
learning, data mining, pattern recognition, and robotics. Probabilistic programming
languages are designed to express probabilistic models in high-level programming
languages and to conduct automatic inference to compute posterior distributions.
A key obstacle to the wider adoption of probabilistic programming languages
in practice is that general-purpose efficient inference is computationally difficult.
This thesis aims to improve the efficiency of inference through incremental analysis,
while preserving precision when a probabilistic program undergoes small changes.
For small changes to probabilistic knowledge (i.e., prior probability distributions
and observations), the probabilistic model represented by a probabilistic
program evolves. In this thesis, we first present Icpp, a new data-flow-based
incremental inference approach. By capturing the probabilistic dependence of
each data-flow fact and sparsely updating only the changed probabilities, Icpp
reuses previously computed results to incrementally compute new posterior
distributions.
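The dependence-tracking idea behind this kind of incremental inference can be sketched as follows. This is a minimal, hypothetical cache, not Icpp's actual data-flow analysis; every name here is illustrative:

```python
from fractions import Fraction

class IncrementalInference:
    """Cache each derived probability together with the priors it
    depends on; when a prior changes, invalidate only its dependents
    so unaffected cached results are reused."""

    def __init__(self, priors):
        self.priors = dict(priors)   # prior name -> probability
        self.defs = {}               # fact -> (function, prior names)
        self.deps = {}               # prior name -> facts depending on it
        self.cache = {}              # fact -> cached value

    def define(self, fact, fn, *prior_names):
        self.defs[fact] = (fn, prior_names)
        for name in prior_names:
            self.deps.setdefault(name, set()).add(fact)

    def query(self, fact):
        if fact not in self.cache:   # reuse a previous result if still valid
            fn, names = self.defs[fact]
            self.cache[fact] = fn(*(self.priors[n] for n in names))
        return self.cache[fact]

    def update_prior(self, name, value):
        self.priors[name] = value
        for fact in self.deps.get(name, ()):  # sparse invalidation
            self.cache.pop(fact, None)

# P(wet) under a changing prior P(rain): only "wet" is invalidated when
# the prior changes; facts not depending on "rain" would stay cached.
inf = IncrementalInference({"rain": Fraction(3, 10)})
inf.define("wet",
           lambda r: r * Fraction(9, 10) + (1 - r) * Fraction(1, 10),
           "rain")
print(inf.query("wet"))            # 17/50
inf.update_prior("rain", Fraction(1, 2))
print(inf.query("wet"))            # 1/2
```

The sketch captures only the reuse pattern (invalidate dependents, keep the rest); the actual approach works over data-flow facts of a probabilistic program rather than explicitly declared formulas.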
For small changes to the observed array data on which a probabilistic model is
conditioned, the model itself remains unchanged. In this thesis, we also
present ISymb, a novel incremental symbolic inference framework. By conducting
an intra-procedural path-sensitive analysis, except for a "meets-over-all-paths" analysis within each iteration of a loop (conditioned on some
observed array data), ISymb captures the probability distribution of each
path and recomputes the distributions only for the affected paths.
Further, ISymb enables a precision-preserving incremental symbolic inference
to run significantly faster than its non-incremental counterparts.
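The per-path reuse underlying this style of incremental symbolic inference can be illustrated with a deliberately simple sketch (hypothetical code, not ISymb's analysis): cache the contribution of each observed array element so that changing one element recomputes one factor rather than the whole product.

```python
from fractions import Fraction

class IncrementalLikelihood:
    """Likelihood of i.i.d. coin flips under a fixed bias `theta`,
    cached per observation: updating one observed element recomputes
    only that element's factor."""

    def __init__(self, data, theta):
        self.theta = theta
        self.factors = [self._factor(x) for x in data]

    def _factor(self, x):
        return self.theta if x == 1 else 1 - self.theta

    def update(self, i, x):
        # Only the affected factor is recomputed; all others are reused.
        self.factors[i] = self._factor(x)

    def likelihood(self):
        total = Fraction(1)
        for f in self.factors:
            total *= f
        return total

lik = IncrementalLikelihood([1, 0, 1], theta=Fraction(1, 3))
print(lik.likelihood())  # 2/27
lik.update(1, 1)         # one observed datum changes
print(lik.likelihood())  # 1/27
```

In the real framework the cached unit is a program path's probability distribution rather than a single likelihood factor, but the payoff is the same: work proportional to what changed, not to the whole model.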
In this thesis, we evaluate both Icpp and ISymb against the state-of-the-art
data-flow-based inference and symbolic inference, respectively. The results demonstrate
that both Icpp and ISymb meet their design goals. For example, Icpp
succeeds in making data-flow-based incremental inference possible in probabilistic
programs when some probabilistic knowledge undergoes small yet frequent changes.
Additionally, ISymb enables symbolic inference to perform one or two orders of
magnitude faster than non-incremental inference when some observed array data undergoes small changes.