Statistical Symbolic Execution with Informed Sampling

Author: Filieri Antonio
Geldenhuys Jaco
Pasareanu Corina S.
Visser Willem
Publication date: 16/11/2014

Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space, and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths, leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that informed sampling obtains more precise results and converges faster than a purely statistical analysis, and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
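The Monte Carlo and Bayesian estimation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use Symbolic PathFinder: it samples concrete inputs rather than symbolic paths, and `toy_program`, `uniform_input`, and the uniform Beta(1, 1) prior are assumptions chosen only for illustration.

```python
import random


def estimate_hit_probability(program, sample_input, n_samples, seed=0):
    """Monte Carlo sampling followed by a Beta-Bernoulli posterior update."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if program(sample_input(rng)))
    # Assumed uniform Beta(1, 1) prior, updated with observed hits and misses.
    alpha = 1 + hits
    beta = 1 + (n_samples - hits)
    mean = alpha / (alpha + beta)
    variance = (alpha * beta) / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, variance


def toy_program(x):
    # Hypothetical target event: stands in for an assert violation.
    return x > 90


def uniform_input(rng):
    # Hypothetical input profile: integers 1..100, uniformly distributed.
    return rng.randint(1, 100)


mean, variance = estimate_hit_probability(toy_program, uniform_input, 10_000)
print(f"posterior mean = {mean:.4f}, posterior variance = {variance:.2e}")
```

The posterior variance shrinks as more samples are drawn, which is the convergence behaviour that the abstract says informed sampling is meant to speed up.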

NASA Technical Reports Server

Spiral - Imperial College Digital Repository

Delayed Sampling and Automatic Rao-Blackwellization of Probabilistic Programs

Author: Broman David
Kudlicka Jan
Lundén Daniel
Murray Lawrence M.
Schön Thomas B.
Publication date: 01/01/2018

We introduce a dynamic mechanism for the solution of analytically-tractable substructure in probabilistic programs, using conjugate priors and affine transformations to reduce variance in Monte Carlo estimators. For inference with Sequential Monte Carlo, this automatically yields improvements such as locally-optimal proposals and Rao-Blackwellization. The mechanism maintains a directed graph alongside the running program that evolves dynamically as operations are triggered upon it. Nodes of the graph represent random variables, edges the analytically-tractable relationships between them. Random variables remain in the graph for as long as possible, to be sampled only when they are used by the program in a way that cannot be resolved analytically. In the meantime, they are conditioned on as many observations as possible. We demonstrate the mechanism with a few pedagogical examples, as well as a linear-nonlinear state-space model with simulated data, and an epidemiological model with real data of a dengue outbreak in Micronesia. In all cases one or more variables are automatically marginalized out to significantly reduce variance in estimates of the marginal likelihood, in the final case facilitating a random-weight or pseudo-marginal-type importance sampler for parameter estimation. We have implemented the approach in Anglican and a new probabilistic programming language called Birch.

Comment: 13 pages, 4 figures
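A small hand-worked case may help show why marginalizing a variable out reduces variance. The sketch below does not use Anglican or Birch and does not build the dynamic graph the abstract describes; it assumes a hypothetical model x ~ N(0, 1), y | x ~ N(x, 1), and compares an estimator that samples x eagerly against one that integrates x out analytically.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-((y - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def naive_estimate(y, n, rng):
    # Eager sampling: draw x ~ N(0, 1), then average the likelihood p(y | x).
    return sum(normal_pdf(y, rng.gauss(0.0, 1.0), 1.0) for _ in range(n)) / n


def marginalized_estimate(y):
    # x is integrated out by Gaussian conjugacy: y ~ N(0, 1 + 1).
    return normal_pdf(y, 0.0, 2.0)


rng = random.Random(1)
print("naive Monte Carlo:", naive_estimate(1.0, 20_000, rng))
print("marginalized     :", marginalized_estimate(1.0))
```

In this toy case the marginalized estimator is exact and has zero variance, while the naive one only approaches the same value as the sample count grows.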

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Compositional Solution Space Quantification for Probabilistic Software Analysis

Author: Adje A.
Antonio Filieri
Corina S. Păsăreanu
Haggarty R.
Kroese D. P.
Marcelo d'Amorim
Mateus Borges
Robert C. P.
Tillmann N.
Willem Visser
Publication date: 09/06/2014

Probabilistic software analysis aims at quantifying how likely a target event is to occur during program execution. Current approaches rely on symbolic execution to identify the conditions to reach the target event and try to quantify the fraction of the input domain satisfying these conditions. Precise quantification is usually limited to linear constraints, while only approximate solutions can be provided in general through statistical approaches. However, statistical approaches may fail to converge to an acceptable accuracy within a reasonable time. We present a compositional statistical approach for the efficient quantification of solution spaces for arbitrarily complex constraints over bounded floating-point domains. The approach leverages interval constraint propagation to improve the accuracy of the estimation by focusing the sampling on the regions of the input domain containing the sought solutions. Preliminary experiments show significant improvement over previous approaches in both the accuracy of the results and the analysis time.
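The idea of focusing samples on the regions that may contain solutions can be sketched as follows. This is not the authors' tool: a hand-written interval check on a fixed grid stands in for interval constraint propagation, and the constraint x*x + y*y <= 1 over [-1, 1] x [-1, 1], the grid size, and the per-box sample count are all assumptions made for illustration.

```python
import math
import random


def square_range(lo, hi):
    """Return the (min, max) of t*t for t in the interval [lo, hi]."""
    low = 0.0 if lo <= 0.0 <= hi else min(lo * lo, hi * hi)
    return low, max(lo * lo, hi * hi)


def estimate_fraction(grid=8, samples_per_box=200, seed=0):
    """Estimate the fraction of [-1, 1]^2 satisfying x*x + y*y <= 1."""
    rng = random.Random(seed)
    step = 2.0 / grid
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            x0, y0 = -1.0 + i * step, -1.0 + j * step
            x1, y1 = x0 + step, y0 + step
            x_lo, x_hi = square_range(x0, x1)
            y_lo, y_hi = square_range(y0, y1)
            if x_hi + y_hi <= 1.0:
                frac = 1.0  # box lies entirely inside: no sampling needed
            elif x_lo + y_lo > 1.0:
                frac = 0.0  # box lies entirely outside: no sampling needed
            else:
                # Undecided box: spend samples only here.
                hits = sum(
                    1
                    for _ in range(samples_per_box)
                    if rng.uniform(x0, x1) ** 2 + rng.uniform(y0, y1) ** 2 <= 1.0
                )
                frac = hits / samples_per_box
            total += frac / (grid * grid)  # weight by the box's share of the domain
    return total


estimate = estimate_fraction()
print(f"estimate = {estimate:.4f}, exact = {math.pi / 4:.4f}")
```

Only boxes crossing the constraint boundary contribute sampling error, so the overall estimate is tighter than uniform sampling over the whole domain with the same budget.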

Crossref

NASA Technical Reports Server

Spiral - Imperial College Digital Repository

Fairness Testing: Testing Software for Discrimination

Author: Angwin Julia
Brun Yuriy
Ferral Katelyn
Gonzalez Jesus A.
Guglielmo Luigi Di
Ingold David
Letzter Rafi
Mattioli Dana
Meliou Alexandra
Nadella Satya
Olson Parmy
Shahani Aarti
Soper Spencer
von Rhein Alexander
Zafar Muhammad Bilal
Zemel Richard
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/09/2017

This paper defines software fairness and discrimination and develops a testing-based method for measuring if and how much software discriminates, focusing on causality in discriminatory behavior. Evidence of software discrimination has been found in modern software systems that recommend criminal sentences, grant access to financial products, and determine who is allowed to participate in promotions. Our approach, Themis, generates efficient test suites to measure discrimination. Given a schema describing valid system inputs, Themis generates discrimination tests automatically and does not require an oracle. We evaluate Themis on 20 software systems, 12 of which come from prior work with explicit focus on avoiding discrimination. We find that (1) Themis is effective at discovering software discrimination, (2) state-of-the-art techniques for removing discrimination from algorithms fail in many situations, at times discriminating against as much as 98% of an input subdomain, (3) Themis optimizations are effective at producing efficient test suites for measuring discrimination, and (4) Themis is more efficient on systems that exhibit more discrimination. We thus demonstrate that fairness testing is a critical aspect of the software development cycle in domains with possible discrimination and provide initial tools for measuring software discrimination.

Comment: Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. 2017. Fairness Testing: Testing Software for Discrimination. In Proceedings of 2017 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), Paderborn, Germany, September 4-8, 2017 (ESEC/FSE'17). https://doi.org/10.1145/3106237.3106277, ESEC/FSE, 201
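One way to read "focusing on causality" is to ask how often changing only a protected attribute changes the system's output; no oracle is needed because outputs are compared with each other. The sketch below illustrates that reading and is not the Themis API: `causal_discrimination_score`, the schema format, and `toy_loan_decision` are hypothetical, and the schema is enumerated exhaustively instead of being sampled.

```python
from itertools import product


def causal_discrimination_score(decide, schema, protected):
    """Fraction of valid inputs whose decision changes when only the
    protected attribute is altered."""
    names = list(schema)
    changed = 0
    total = 0
    for values in product(*(schema[name] for name in names)):
        record = dict(zip(names, values))
        baseline = decide(record)
        # Try every alternative value of the protected attribute.
        flips = any(
            decide({**record, protected: alt}) != baseline
            for alt in schema[protected]
            if alt != record[protected]
        )
        changed += flips
        total += 1
    return changed / total


def toy_loan_decision(record):
    # Hypothetical system under test: the threshold depends on the group.
    threshold = 50 if record["group"] == "A" else 60
    return record["income"] > threshold


schema = {"income": range(100), "group": ["A", "B"]}
score = causal_discrimination_score(toy_loan_decision, schema, "group")
print(f"causal discrimination score = {score:.2f}")
```

For larger schemas exhaustive enumeration is impractical, which is where generating an efficient test suite, as the abstract describes, becomes necessary.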

arXiv.org e-Print Archive

Crossref

A Probabilistic Linear Genetic Programming with Stochastic Context-Free Grammar for solving Symbolic Regression problems

Author: Bosman P. A. N.
Poli R.
Shan Y.
Wong P. K.
Yanai K.
Publication date: 03/04/2017

Traditional Linear Genetic Programming (LGP) algorithms are based only on the selection mechanism to guide the search. Genetic operators combine or mutate random portions of the individuals, without knowing if the result will lead to a fitter individual. Probabilistic Model Building Genetic Programming (PMB-GP) methods were proposed to overcome this issue through a probability model that captures the structure of the fit individuals and uses it to sample new individuals. This work proposes the use of LGP with a Stochastic Context-Free Grammar (SCFG) whose probability distribution is updated according to the selected individuals. We propose a method for adapting the grammar to the linear representation of LGP. Tests performed with the proposed probabilistic method, and with two hybrid approaches, on several symbolic regression benchmark problems show that the results are statistically better than those obtained by traditional LGP.

Comment: Genetic and Evolutionary Computation Conference (GECCO) 2017, Berlin, Germany
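The loop of sampling individuals from a stochastic grammar and then updating its rule probabilities from the selected individuals can be sketched as below. This is a generic illustration under stated assumptions, not the paper's method: the tiny grammar, the depth limit, the expression-length stand-in for fitness, and the smoothed frequency update are all hypothetical, and the adaptation to LGP's linear representation is not shown.

```python
import random
from collections import Counter

# Hypothetical grammar: one nonterminal "E" with four production rules.
GRAMMAR = {"E": [("E", "+", "E"), ("E", "*", "E"), ("x",), ("1",)]}


def sample_expression(probs, rng, symbol="E", depth=0, max_depth=4, used=None):
    """Sample a token list from the grammar, recording the rules used."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        # At the depth limit, allow only rules made of terminals.
        choices = [i for i, r in enumerate(rules) if all(s not in GRAMMAR for s in r)]
    else:
        choices = list(range(len(rules)))
    index = rng.choices(choices, weights=[probs[symbol][i] for i in choices])[0]
    if used is not None:
        used[(symbol, index)] += 1
    tokens = []
    for part in rules[index]:
        tokens += sample_expression(probs, rng, part, depth + 1, max_depth, used)
    return tokens


def update_probabilities(counts, smoothing=1.0):
    """Re-estimate rule probabilities from rule-usage counts (with smoothing)."""
    new_probs = {}
    for symbol, rules in GRAMMAR.items():
        raw = [counts[(symbol, i)] + smoothing for i in range(len(rules))]
        total = sum(raw)
        new_probs[symbol] = [value / total for value in raw]
    return new_probs


rng = random.Random(0)
probs = {"E": [0.25, 0.25, 0.25, 0.25]}
population = []
for _ in range(50):
    used = Counter()
    population.append((sample_expression(probs, rng, used=used), used))
population.sort(key=lambda individual: len(individual[0]))  # placeholder fitness
selected_counts = sum((used for _, used in population[:10]), Counter())
probs = update_probabilities(selected_counts)
print(probs)
```

In a real symbolic regression setting the placeholder fitness would be replaced by an error measure on the benchmark data, and the update would typically blend old and new probabilities rather than replace them outright.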

arXiv.org e-Print Archive

Crossref

Towards concolic testing for hybrid systems

Author: A Fehnker
A Filieri
A Leon-Garcia
A Platzer
B Barbot
BM Gyori
C Jegourel
D Chistikov
EM Hahn
G Orosz
H Lebesgue
J Gordon
K Sen
KE Iverson
M Abramowitz
N Kamide
P Godefroid
S Gao
S Jha
TA Henzinger
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/08/2016

Hybrid systems exhibit both continuous and discrete behavior. Analyzing hybrid systems is known to be hard. Inspired by the idea of concolic testing (of programs), we investigate whether we can combine random sampling and symbolic execution in order to effectively verify hybrid systems. We identify a sufficient condition under which such a combination is more effective than random sampling. Furthermore, we analyze different strategies of combining random sampling and symbolic execution and propose an algorithm which allows us to dynamically switch between them so as to reduce the overall cost. Our method has been implemented as a web-based checker named HYCHECKER. HYCHECKER has been evaluated with benchmark hybrid systems and a water treatment system in order to test its effectiveness.

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University