
    Symbolic Execution for Randomized Programs

    We propose a symbolic execution method for programs that can draw random samples. In contrast to existing work, our method can verify randomized programs with unknown inputs and can prove probabilistic properties that universally quantify over all possible inputs. Our technique augments standard symbolic execution with a new class of probabilistic symbolic variables, which represent the results of random draws, and computes symbolic expressions representing the probability of taking individual paths. We implement our method on top of the KLEE symbolic execution engine, together with several optimizations, and use it to prove properties about probabilities and expected values for a range of challenging case studies written in C++, including Freivalds' algorithm, randomized quicksort, and a randomized property-testing algorithm for monotonicity. We evaluate our method against Psi, an exact probabilistic symbolic inference engine, and Storm, a probabilistic model checker, and show that our method significantly outperforms both tools.
    Comment: 47 pages, 9 figures, to appear at OOPSLA 202
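    The core idea — tracking an exact probability alongside each execution path — can be illustrated with a small sketch. This is an illustrative toy, not the paper's KLEE-based implementation: it enumerates every outcome of the probabilistic variables of a tiny randomized program and sums exact path probabilities.

```python
from fractions import Fraction
from itertools import product

# Toy randomized program: two fair coin flips, returning whether at
# least one came up heads. Each tuple of coin outcomes fixes one
# execution path through the program.
def program(coins):
    a, b = coins
    return a or b

# Enumerate all outcomes of the random draws; each fair coin
# contributes a factor of 1/2 to its path's probability.
def path_probabilities(prog, n_coins):
    dist = {}
    for coins in product([False, True], repeat=n_coins):
        p = Fraction(1, 2) ** n_coins
        result = prog(coins)
        dist[result] = dist.get(result, Fraction(0)) + p
    return dist

print(path_probabilities(program, 2)[True])  # 3/4
```

    In the paper's setting the coin outcomes are symbolic rather than enumerated concretely, which is what allows properties to be proved for all inputs at once; the sketch only conveys how path probabilities compose multiplicatively along a path and additively across paths.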

    Incremental Static Analysis of Probabilistic Programs

    Probabilistic models are used successfully in a wide range of fields, including machine learning, data mining, pattern recognition, and robotics. Probabilistic programming languages are designed to express probabilistic models in high-level programming languages and to conduct automatic inference to compute posterior distributions. A key obstacle to the wider adoption of probabilistic programming languages in practice is that general-purpose efficient inference is computationally difficult. This thesis aims to improve the efficiency of inference through incremental analysis that preserves precision when a probabilistic program undergoes small changes. For small changes to probabilistic knowledge (i.e., prior probability distributions and observations), the probabilistic model represented by a probabilistic program evolves. For this setting, we first present Icpp, a new data-flow-based incremental inference approach. By capturing the probabilistic dependence of each data-flow fact and sparsely updating changed probabilities, Icpp incrementally computes new posterior distributions, allowing previously computed results to be reused. For small changes to the observed array data on which a probabilistic model is conditioned, the model itself remains unchanged. For this setting, we present ISymb, a novel incremental symbolic inference framework. By conducting an intra-procedurally path-sensitive analysis, except for a "meets-over-all-paths" analysis within each loop iteration (conditioned on some observed array data), ISymb captures the probability distribution of each path and recomputes the distributions only for the affected paths. As a result, ISymb enables precision-preserving incremental symbolic inference that runs significantly faster than its non-incremental counterpart.
In this thesis, we evaluate Icpp and ISymb against state-of-the-art data-flow-based inference and symbolic inference, respectively. The results demonstrate that both Icpp and ISymb meet their design goals. For example, Icpp makes data-flow-based incremental inference possible in probabilistic programs when some probabilistic knowledge undergoes small yet frequent changes. Additionally, ISymb enables symbolic inference to run one to two orders of magnitude faster than non-incremental inference when some observed array data undergoes small yet frequent changes.
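The incremental-reuse idea behind ISymb can be sketched as follows. This is a hypothetical illustration (the names and structure are not ISymb's actual architecture): each path's probability is cached keyed by the observed data it reads, so changing one observed element forces recomputation only of the paths that depend on it.

```python
from fractions import Fraction

# Count how many times a path's probability is actually recomputed,
# to make the cache hits visible.
calls = {"n": 0}

def path0(obs):            # depends only on obs[0]
    calls["n"] += 1
    return Fraction(1, 2) if obs[0] else Fraction(0)

def path1(obs):            # depends only on obs[1]
    calls["n"] += 1
    return Fraction(1, 4) if obs[1] else Fraction(1, 8)

class IncrementalInference:
    def __init__(self, paths):
        # paths: {path_id: (set of observed indices read, prob_fn)}
        self.paths = paths
        self.cache = {}

    def posterior(self, observed):
        # Recompute a path only if the observations it reads changed;
        # otherwise reuse the cached probability.
        total = Fraction(0)
        for pid, (deps, prob_fn) in self.paths.items():
            key = (pid, tuple(observed[i] for i in sorted(deps)))
            if key not in self.cache:
                self.cache[key] = prob_fn(observed)
            total += self.cache[key]
        return total

inf = IncrementalInference({0: ({0}, path0), 1: ({1}, path1)})
print(inf.posterior([True, True]))    # 3/4  (both paths evaluated)
print(inf.posterior([True, False]))   # 5/8  (only path1 re-evaluated)
print(calls["n"])                     # 3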