81 research outputs found

Overfitting in Synthesis: Theory and Practice (Extended Version)

Author: Millstein Todd
Nori Aditya
Padhi Saswat
Sharma Rahul
Publication date: 01/01/2019

In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically generate a program belonging to a grammar of possible implementations that meets a logical specification. We investigate a common limitation across state-of-the-art SyGuS tools that perform counterexample-guided inductive synthesis (CEGIS). We empirically observe that as the expressiveness of the provided grammar increases, the performance of these tools degrades significantly. We claim that this degradation is not only due to a larger search space, but also due to overfitting. We formally define this phenomenon and prove no-free-lunch theorems for SyGuS, which reveal a fundamental tradeoff between synthesizer performance and grammar expressiveness. A standard approach to mitigate overfitting in machine learning is to run multiple learners with varying expressiveness in parallel. We demonstrate that this insight can immediately benefit existing SyGuS tools. We also propose a novel single-threaded technique called hybrid enumeration that interleaves different grammars and outperforms the winner of the 2018 SyGuS competition (Inv track), solving more problems and achieving a $5\times$ mean speedup.

Comment: 24 pages (5 pages of appendices), 7 figures, includes proofs of theorem

arXiv.org e-Print Archive

eScholarship - University of California

A Provably Correct Sampler for Probabilistic Programs

Author: Hur Chung-Kil
Nori Aditya V.
Rajamani Sriram K.
Samuel Selva
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015)
Publication date: 01/01/2015

Dagstuhl Research Online Publication Server

Scenic: A Language for Scenario Specification and Scene Generation

Author: Dosovitskiy Alexey
Fremont Daniel J.
Gupta Ankush
Jiang Chenfanfu
Kulkarni Tejas
Liebelt Joerg
Milch Brian
Naveh Yehuda
Nori Aditya V
Ritchie Daniel
Ros Germán
Russell Stuart
Saheb-Djahromi Nasser
Sutton Michael
Wood Frank
Wu Bichen
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/06/2019

We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning. Specifically, we consider the problems of training a perception system to handle rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs and sampling these to generate specialized training and test sets. More generally, such languages can be used for cyber-physical systems and robotics to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment is a "scene", a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing "scenarios" that are distributions over scenes. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.

Comment: 41 pages, 36 figures. Full version of a PLDI 2019 paper (extending UC Berkeley EECS Department Tech Report No. UCB/EECS-2018-8

arXiv.org e-Print Archive

Crossref

Targeted Greybox Fuzzing with Static Lookahead Analysis

Author: Anand Saswat
Brent Lexi
Cadar Cristian
Chowdhury Animesh Basak
Clarke Edmund M.
Csallner Christoph
Czech Mike
Feist Josselin
Fähndrich Manuel
Hajdu Ákos
Kalra Sukrit
Li Yuekang
Ma Lei
Marinescu Paul Dan
Mossberg Mark
Nori Aditya V.
Raval Siraj
Sen Koushik
Swan Melanie
Tillmann Nikolai
Vargha András
Wang Yuepeng
Wood Gavin
Wüstholz Valentin
Publication date: 17/05/2019

Automatic test generation typically aims to generate inputs that explore new paths in the program under test in order to find bugs. Existing work has, therefore, focused on guiding the exploration toward program parts that are more likely to contain bugs by using an offline static analysis. In this paper, we introduce a novel technique for targeted greybox fuzzing using an online static analysis that guides the fuzzer toward a set of target locations, for instance, located in recently modified parts of the program. This is achieved by first semantically analyzing each program path that is explored by an input in the fuzzer's test suite. The results of this analysis are then used to control the fuzzer's specialized power schedule, which determines how often to fuzz inputs from the test suite. We implemented our technique by extending a state-of-the-art, industrial fuzzer for Ethereum smart contracts and evaluated its effectiveness on 27 real-world benchmarks. Using an online analysis is particularly suitable for the domain of smart contracts since it does not require any code instrumentation; instrumenting contracts changes their semantics. Our experiments show that targeted fuzzing significantly outperforms standard greybox fuzzing in reaching 83% of the challenging target locations, with a median speed-up of up to 14x.

arXiv.org e-Print Archive

Crossref

MPG.PuRe