Enhancing Reuse of Constraint Solutions to Improve Symbolic Execution
Constraint solution reuse is an effective approach to saving the time of
constraint solving in symbolic execution. Most existing reuse approaches
are based on the syntactic or semantic equivalence of constraints; e.g., the Green
framework is able to reuse constraints that have different representations but
are semantically equivalent, by canonizing constraints into syntactically
equivalent normal forms. However, syntactic/semantic equivalence is not a
necessary condition for reuse: some constraints are neither syntactically nor
semantically equivalent, yet their solutions still have potential for reuse.
Existing approaches are unable to recognize and reuse such constraints.
In this paper, we present GreenTrie, an extension to the Green framework,
which supports constraint reuse based on the logical implication relations
among constraints. GreenTrie provides a component, called L-Trie, which stores
constraints and solutions into tries, indexed by an implication partial order
graph of constraints. L-Trie is able to carry out logical reduction and logical
subset and superset querying for given constraints, to check for reuse of
previously solved constraints. We report the results of an experimental
assessment of GreenTrie against the original Green framework, which shows that
our extension achieves better reuse of constraint-solving results and saves
significant symbolic execution time.
Comment: this paper has been submitted to conference ISSTA 201
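To make the reuse logic concrete, here is a minimal Python sketch of the subset/superset queries, under simplifying assumptions: constraints are conjunctions modeled as sets of atomic-constraint strings, and implication is approximated by syntactic set inclusion. The real L-Trie instead indexes atoms in an implication partial-order graph and stores entries in tries; the class and method names below are illustrative, not GreenTrie's API.

```python
# Sketch of L-Trie-style solution reuse. Assumption: a constraint is a
# conjunction, modeled as a frozenset of atomic-constraint strings, and
# implication is approximated by set inclusion over identical atoms.

class ReuseStore:
    def __init__(self):
        self.unsat = []  # stored unsatisfiable conjunct sets
        self.sat = []    # (conjunct set, solution) pairs

    def record(self, conjuncts, solution):
        entry = frozenset(conjuncts)
        if solution is None:
            self.unsat.append(entry)
        else:
            self.sat.append((entry, solution))

    def lookup(self, conjuncts):
        query = frozenset(conjuncts)
        # Subset query: a stored UNSAT set S with S <= query means the
        # query implies S, so the query is UNSAT as well.
        for s in self.unsat:
            if s <= query:
                return ("UNSAT", None)
        # Superset query: a stored SAT set S with S >= query means S
        # implies the query, so S's solution also satisfies the query.
        for s, solution in self.sat:
            if s >= query:
                return ("SAT", solution)
        return None  # no reuse possible; fall back to the solver

store = ReuseStore()
store.record({"x>5", "y<2"}, {"x": 6, "y": 1})   # solved: SAT
store.record({"x>5", "x<3"}, None)               # solved: UNSAT
print(store.lookup({"x>5"}))                     # reuses the SAT solution
print(store.lookup({"x>5", "x<3", "y==0"}))      # reuses the UNSAT result
```

Both checks are sound because adding conjuncts only strengthens a constraint: a query containing a stored unsatisfiable set is itself unsatisfiable, and the solution of a satisfiable stored superset also satisfies the weaker query.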
Targeted Greybox Fuzzing with Static Lookahead Analysis
Automatic test generation typically aims to generate inputs that explore new
paths in the program under test in order to find bugs. Existing work has,
therefore, focused on guiding the exploration toward program parts that are
more likely to contain bugs by using an offline static analysis.
In this paper, we introduce a novel technique for targeted greybox fuzzing
using an online static analysis that guides the fuzzer toward a set of target
locations, for instance in recently modified parts of the program.
This is achieved by first semantically analyzing each program path that is
explored by an input in the fuzzer's test suite. The results of this analysis
are then used to control the fuzzer's specialized power schedule, which
determines how often to fuzz inputs from the test suite. We implemented our
technique by extending a state-of-the-art, industrial fuzzer for Ethereum smart
contracts and evaluated its effectiveness on 27 real-world benchmarks. Using an
online analysis is particularly suitable for the domain of smart contracts
since it does not require any code instrumentation; instrumenting a contract
would change its semantics. Our experiments show that targeted fuzzing
significantly outperforms standard greybox fuzzing in reaching 83% of the
challenging target locations, with median speed-ups of up to 14x.
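As a rough illustration of how such an analysis can drive a power schedule, the Python sketch below assigns energy to a seed from an assumed per-path distance to the nearest target location. The Seed type, the path_distance map, and all constants are hypothetical stand-ins; the paper's schedule is fed by a semantic lookahead analysis, not by a precomputed branch-distance table.

```python
# Sketch of a distance-guided power schedule. Assumption: an earlier
# analysis produced path_distance, mapping a path id to the minimal
# number of branches separating that path from a target location.
import math
from dataclasses import dataclass

@dataclass
class Seed:
    data: bytes
    path_id: int  # identifier of the program path this input exercised

def energy(seed, path_distance, base=8, max_factor=16):
    """Return how many times to fuzz `seed` in the current round."""
    d = path_distance.get(seed.path_id, math.inf)
    if math.isinf(d):
        return 1  # no target reachable from this path: minimal energy
    # Paths closer to a target get exponentially more energy, capped.
    return base * min(max_factor, 2 ** max(0, 8 - d))

dist = {0: 2, 1: 7}  # path 0 is 2 branches from a target, path 1 is 7
print(energy(Seed(b"a", 0), dist))  # near a target: high energy (128)
print(energy(Seed(b"b", 1), dist))  # farther away: less energy (16)
print(energy(Seed(b"c", 9), dist))  # target unreachable: energy 1
```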
Identifying Program Entropy Characteristics with Symbolic Execution
The security infrastructure underpinning our society relies on encryption, which in turn relies on the correct generation and use of pseudorandom data. Unfortunately, random data is deceptively hard to generate. Implementation problems in PRNGs and the incorrect use of generated random data in cryptographic algorithms have led to many issues, including the infamous Debian OpenSSL bug, which exposed millions of systems on the internet to potential compromise due to a mistake that limited the source of randomness during key generation to 2^15 different seeds (i.e., 15 bits of entropy). It is therefore important to automatically identify whether a given program applies a certain cryptographic algorithm or uses its random data correctly. This paper tackles the very first step of this problem by extracting an understanding of how a binary program generates or uses randomness. Specifically, we pose the following problem: given a program (or a specific function), can we estimate bounds on the amount of randomness present in the program's or function's output by determining bounds on the entropy of this output data? Our technique estimates upper bounds on the entropy of program output through a process of expression reinterpretation and stochastic probability estimation, related to abstract interpretation and model counting.
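The counting intuition behind such bounds can be shown with a toy example: an output's entropy can never exceed log2 of the number of distinct values it can take, so counting the models of the output expression yields an upper bound. The Python sketch below brute-forces that count for a hypothetical generator seeded with only 15 bits, echoing the Debian bug above; the paper's technique reasons about binaries via expression reinterpretation and stochastic estimation rather than enumeration.

```python
# Toy entropy bound via model counting. Assumption: toy_keygen is an
# invented stand-in for key generation whose only randomness is a
# 15-bit seed, as in the Debian OpenSSL bug.
import math

def toy_keygen(seed):
    # Deterministic function of the seed: a 32-bit multiplicative hash.
    return (seed * 2654435761) & 0xFFFFFFFF

# Enumerate the set of reachable outputs (the "models").
outputs = {toy_keygen(s) for s in range(2 ** 15)}
print(f"entropy <= {math.log2(len(outputs)):.1f} bits")  # at most 15.0
```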
Benchmarking Symbolic Execution Using Constraint Problems -- Initial Results
Symbolic execution is a powerful technique for bug finding and program
testing. It is successful in finding bugs in real-world code. The core
reasoning techniques use constraint solving, path exploration, and search,
the same techniques used in solving combinatorial problems,
e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP
instances as more challenging benchmarks to evaluate the effectiveness of the
core techniques in symbolic execution. We transform CSP benchmarks into C
programs suitable for testing the reasoning capabilities of symbolic execution
tools. From a single CSP P, we generate different C programs depending on the
choice of transformation. Preliminary testing with the KLEE, Tracer-X, and
LLBMC tools shows substantial runtime differences depending on the
transformation and solver choices. Our C benchmarks are effective in exposing
the limitations of existing symbolic execution tools. Our motivation is the
belief that benchmarks of this form can spur the development and engineering
of improved core reasoning in symbolic execution engines.
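One plausible shape for such a transformation, sketched below in Python, encodes a finite-domain CSP as a C harness for KLEE: each variable becomes a symbolic int constrained to its domain with klee_assume, and an assertion fails exactly when all constraints hold, so any "bug" the tool reports is a solution of the CSP. The helper name, encoding, and example instance are illustrative; the paper evaluates several transformation variants, not necessarily this one.

```python
# Sketch of a CSP-to-C transformation targeting KLEE. Assumption: the
# CSP is given as integer variables with inclusive domains plus a list
# of constraints written as C boolean expressions.

def csp_to_c(variables, constraints):
    """variables: {name: (lo, hi)}; constraints: list of C expressions."""
    lines = ["#include <klee/klee.h>", "", "int main(void) {"]
    for v, (lo, hi) in variables.items():
        lines.append(f"  int {v};")
        lines.append(f'  klee_make_symbolic(&{v}, sizeof({v}), "{v}");')
        lines.append(f"  klee_assume({lo} <= {v} && {v} <= {hi});")
    cond = " && ".join(f"({c})" for c in constraints)
    lines.append(f"  if ({cond}) klee_assert(0);  /* a CSP solution */")
    lines.append("  return 0;")
    lines.append("}")
    return "\n".join(lines)

# Example: a tiny two-variable instance.
print(csp_to_c({"x": (1, 3), "y": (1, 3)}, ["x != y", "x + y == 4"]))
```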
Proof-Producing Symbolic Execution for Binary Code Verification
We propose a proof-producing symbolic execution approach for the verification
of machine-level programs. The analysis is based on a set of core inference rules
that are designed to give control over the tradeoff between preservation of
precision and the introduction of overapproximation, so as to make the
application to real-world code useful and tractable. We integrate our symbolic execution into a
binary analysis platform that features a low-level intermediate language
enabling the application of analyses to many different processor architectures.
The overall framework is implemented in the HOL4 theorem prover in order to
obtain highly trustworthy verification results. We demonstrate our approach by
establishing sound execution-time bounds for a control-loop program implemented
for an ARM Cortex-M0 processor.
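The flavor of proof-producing execution can be conveyed by a small sketch in which every step returns both the new symbolic state and a judgment naming the rule that justifies it, so the result carries a checkable derivation; an explicit havoc-style rule marks where precision is traded for overapproximation. The toy language, rule names, and data types are invented for illustration; the actual framework derives HOL4 theorems over a low-level intermediate language.

```python
# Sketch of proof-producing symbolic execution over a toy register
# language. Assumption: states map register names to symbolic terms
# (strings), and each rule application is recorded as a Judgment.
from dataclasses import dataclass

@dataclass
class Judgment:
    rule: str   # name of the inference rule applied
    pre: dict   # symbolic state before the step
    post: dict  # symbolic state after the step

def exec_assign(state, var, expr):
    # Precision-preserving rule: record the assigned term exactly.
    post = dict(state)
    post[var] = expr
    return post, Judgment("ASSIGN", state, post)

def exec_havoc(state, var):
    # Overapproximating rule: forget var's value (sound, less precise).
    post = dict(state)
    post[var] = f"{var}_fresh"
    return post, Judgment("HAVOC", state, post)

# Execute "r0 := r0 + 1; r1 := r0" symbolically, keeping the derivation.
state = {"r0": "r0_init", "r1": "r1_init"}
trace = []
state, j = exec_assign(state, "r0", f"({state['r0']} + 1)"); trace.append(j)
state, j = exec_assign(state, "r1", state["r0"]); trace.append(j)
for j in trace:
    print(j.rule, "=>", j.post)
```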
How software engineering research aligns with design science: A review
Background: Assessing and communicating software engineering research can be
challenging. Design science is recognized as an appropriate research paradigm
for applied research but is seldom referred to in software engineering.
Applying the design science lens to software engineering research may improve
the assessment and communication of research contributions. Aim: The aim of
this study is 1) to understand whether the design science lens helps summarize
and assess software engineering research contributions, and 2) to characterize
different types of design science contributions in the software engineering
literature. Method: In previous research, we developed a visual abstract
template, summarizing the core constructs of the design science paradigm. In
this study, we use this template in a review of a set of 38 top software
engineering publications to extract and analyze their design science
contributions. Results: We identified five clusters of papers, classifying them
according to their alignment with the design science paradigm. Conclusions: The
design science lens helps emphasize the theoretical contribution of research
output, in terms of technological rules, and reflect on the practical
relevance, novelty, and rigor of the rules proposed by the research.
Comment: 32 pages, 10 figure