Symbolic Crosschecking of Data-Parallel Floating Point Code
In this thesis we present a symbolic execution-based technique for cross-checking programs
accelerated using SIMD or OpenCL against an unaccelerated version, as well as a technique for
detecting data races in OpenCL programs. Our techniques are implemented in KLEE-CL, a
symbolic execution engine based on KLEE that supports symbolic reasoning on the equivalence
between expressions involving both integer and floating-point operations.
While the current generation of constraint solvers provides good support for integer arithmetic,
there is little support available for floating-point arithmetic, due to the complexity inherent
in such computations. The key insight behind our approach is that floating-point values are
only reliably equal if they are essentially built by the same operations. This allows us to use
an algorithm based on symbolic expression matching augmented with canonicalisation rules to
determine path equivalence.
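The matching idea can be illustrated with a toy sketch (illustrative only, not KLEE-CL's actual representation): floating-point expressions are trees, canonicalisation rules normalise them (here, only sorting the operands of commutative operations), and two expressions are judged equivalent only if their canonical forms are structurally identical.

```python
# Toy sketch of equivalence checking by symbolic expression matching.
# Expressions are tuples: (op, operand, operand); leaves are symbolic
# inputs or constants. Names are illustrative, not KLEE-CL's API.

COMMUTATIVE = {"fadd", "fmul"}

def canonicalise(expr):
    """Recursively canonicalise an expression tree."""
    if not isinstance(expr, tuple):
        return expr  # leaf: symbolic input or constant
    op, *args = expr
    args = [canonicalise(a) for a in args]
    if op in COMMUTATIVE:
        # One canonicalisation rule: fix the order of commutative operands.
        args = sorted(args, key=repr)
    return (op, *args)

def equivalent(e1, e2):
    """Two FP expressions match iff their canonical forms are identical."""
    return canonicalise(e1) == canonicalise(e2)

# (fadd a b) matches (fadd b a), but not (fadd a (fadd b 0.0)):
# rewriting x + 0.0 to x is unsound for floats (consider x = -0.0).
print(equivalent(("fadd", "a", "b"), ("fadd", "b", "a")))                     # True
print(equivalent(("fadd", "a", "b"), ("fadd", "a", ("fadd", "b", 0.0))))      # False
```

Note how the scheme deliberately refuses algebraic rewrites that are valid over the reals but not over floats; only structure-preserving canonicalisations are applied.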
Under symbolic execution, we have to verify equivalence along every feasible control-flow
path. We reduce the branching factor of this process by merging conditionals, if-converting branches into select operations via an aggressive phi-node folding transformation.
To support the Intel Streaming SIMD Extension (SSE) instruction set, we lower SSE instructions
to equivalent generic vector operations, which in turn are interpreted in terms of primitive
integer and floating-point operations.
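A minimal sketch of the if-conversion step (illustrative, not KLEE-CL's actual transformation) shows why it helps: a two-armed branch is folded into a single select operation, so the condition becomes data and symbolic execution sees one path instead of two.

```python
# If-conversion sketch: fold a branch into a select operation.

def select(cond, then_val, else_val):
    """Analogue of LLVM's `select` instruction."""
    return then_val if cond else else_val

def abs_branchy(x):
    # Two control-flow paths: a symbolic x forks execution here.
    if x < 0:
        return -x
    return x

def abs_folded(x):
    # One path: the branch condition is consumed as data by select.
    return select(x < 0, -x, x)

assert all(abs_branchy(x) == abs_folded(x) for x in (-3, 0, 7))
```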
To support OpenCL programs, we symbolically model the OpenCL environment using an
OpenCL runtime library targeted to symbolic execution. We detect data races by keeping
track of all memory accesses using a memory log, and reporting a race whenever we detect that
two accesses conflict. By representing the memory log symbolically, we are also able to detect
races associated with symbolically indexed accesses of memory objects.
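A toy version of the memory-log scheme (names and structure are illustrative, not KLEE-CL's implementation) records each access as a (thread, address, is-write) triple and reports a race whenever two accesses from different threads touch the same address and at least one is a write. In the symbolic setting described above, the address comparison becomes a solver query over symbolic indices rather than the concrete equality used here.

```python
# Toy race detection over a concrete memory log.

def find_races(log):
    """Report conflicting access pairs: same address, different threads,
    at least one write."""
    races = []
    for i, (t1, addr1, w1) in enumerate(log):
        for t2, addr2, w2 in log[i + 1:]:
            if t1 != t2 and addr1 == addr2 and (w1 or w2):
                races.append((addr1, t1, t2))
    return races

# Two work-items write the same buffer address: one write-write race.
log = [(0, 0x10, True), (1, 0x10, True), (1, 0x14, False)]
print(find_races(log))
```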
We used KLEE-CL to find a number of issues in a variety of open source projects that use SSE
and OpenCL, including mismatches between implementations, memory errors, race conditions
and compiler bugs.
CTGEN - a Unit Test Generator for C
We present a new unit test generator for C code, CTGEN. It generates test
data for C1 structural coverage and functional coverage based on
pre-/post-condition specifications or internal assertions. The generator
supports automated stub generation, and data to be returned by the stub to the
unit under test (UUT) may be specified by means of constraints. The typical
application field for CTGEN is embedded systems testing; therefore the tool can
cope with the typical aliasing problems present in low-level C, including
pointer arithmetic, structures, and unions. CTGEN creates complete test
procedures which are ready to be compiled and run against the UUT. In this
paper we describe the main features of CTGEN, their technical realisation, and
we elaborate on its performance in comparison to a list of competing test
generation tools. Since 2011, CTGEN has been used in industrial-scale test campaigns
for embedded systems code in the automotive domain. (In Proceedings SSV 2012, arXiv:1211.587)
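The constrained-stub idea can be sketched as follows (an illustrative sketch, not CTGEN itself, which works on C code and uses a constraint solver): the value a stub returns to the unit under test is chosen to satisfy a user-supplied constraint, so that a particular branch of the UUT is exercised for C1 coverage.

```python
# Sketch of constraint-driven stub data. The "solver" here is a brute-force
# search over a finite domain; a real generator would query an SMT solver.

def make_stub(constraint, domain):
    """Return a stub whose return value satisfies `constraint`."""
    for v in domain:
        if constraint(v):
            return lambda *args: v
    raise ValueError("constraint unsatisfiable over the given domain")

# Unit under test calls read_sensor(); we stub it so that the alarm
# branch is covered.
def check_limit(read_sensor):
    return "ALARM" if read_sensor() > 100 else "OK"

stub = make_stub(lambda v: v > 100, range(0, 200))
print(check_limit(stub))  # ALARM
```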
Structured parallelism discovery with hybrid static-dynamic analysis and evaluation technique
Parallel computer architectures have dominated the computing landscape for the
past two decades, a trend that is only expected to continue and intensify with increasing specialization and heterogeneity. This creates huge pressure across the software
stack to produce programming languages, libraries, frameworks and tools which will
efficiently exploit the capabilities of parallel computers, not only for new software but
also for revitalizing existing sequential code. Automatic parallelization, despite decades of
research, has had limited success in transforming sequential software to take advantage
of efficient parallel execution. This thesis investigates three approaches that use commutativity analysis as the enabler for parallelization. This has the potential to overcome
limitations of traditional techniques.
We introduce the concept of liveness-based commutativity for sequential loops.
We examine the use of a practical analysis utilizing liveness-based commutativity in a
symbolic execution framework. Symbolic execution represents input values as groups
of constraints, consequently deriving the output as a function of the input and enabling
the identification of further program properties. We employ this feature to develop an
analysis and discern commutativity properties between loop iterations. We study the
application of this approach on loops taken from real-world programs in the OLDEN
and NAS Parallel Benchmark (NPB) suites, and identify its limitations and related
overheads.
Informed by these findings, we develop Dynamic Commutativity Analysis (DCA), a
new technique that leverages profiling information from program execution with specific
input sets. Using profiling information, we track liveness information and detect loop
commutativity by examining the code’s live-out values. We evaluate DCA against almost
1400 loops of the NPB suite, identifying 86% of them as parallelizable. Comparing
our results against dependence-based methods, we match the detection efficacy of two
dynamic approaches and outperform three static ones. Additionally, DCA is
able to automatically detect parallelism in loops which iterate over Pointer-Linked
Data Structures (PLDSs), taken from a wide range of benchmarks used in the literature,
where all other techniques we considered failed. Parallelizing the discovered loops, our
methodology achieves an average speedup of 3.6× across NPB (and up to 55×) and up
to 36.9× for the PLDS-based loops on a 72-core host. We also demonstrate that our
methodology, despite relying on specific input values for profiling each program, is able
to correctly identify parallelism that is valid for all potential input sets.
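The dynamic intuition behind checking whether loop iterations commute can be illustrated with a toy sketch (illustrative only: actual DCA tracks liveness on profiled executions rather than exhaustively permuting iterations): run the loop body over its iterations in every order for a given input and compare the live-out values.

```python
# Toy commutativity check: iterations commute on an input if every
# execution order yields the same live-out state.

import itertools

def commutes_on_input(body, iterations, init_state):
    """Check whether iteration order affects the live-out state."""
    def run(order):
        state = dict(init_state)
        for i in order:
            body(state, i)
        return state
    reference = run(list(iterations))
    return all(run(p) == reference
               for p in itertools.permutations(iterations))

data = [3, 1, 4, 1, 5]

# A reduction (acc += data[i]) commutes over its iterations...
def reduce_body(state, i):
    state["acc"] += data[i]
print(commutes_on_input(reduce_body, range(len(data)), {"acc": 0}))  # True

# ...while an order-dependent recurrence does not.
def recur_body(state, i):
    state["acc"] = 2 * state["acc"] + data[i]
print(commutes_on_input(recur_body, range(len(data)), {"acc": 0}))  # False
```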
Lastly, we develop a methodology to utilize liveness-based commutativity, as implemented in DCA, to detect latent loop parallelism in the shape of patterns. Our approach
applies a series of transformations which subsequently enable multiple applications
of DCA over the generated multi-loop code section and match its loop commutativity
outcomes against the expected criteria for each pattern. Applying our methodology on
sets of sequential loops, we are able to identify well-known parallel patterns (i.e., maps,
reduction and scans). This extends the scope of parallelism detection to loops, such
as those performing scan operations, which cannot be determined as parallelizable by
simply evaluating liveness-based commutativity conditions on their original form.
Automated Testcase Generation for Numerical Support Functions in Embedded Systems
We present a tool for the automatic generation of test stimuli for small numerical support functions, e.g., code for trigonometric functions, quaternions, filters, or table lookup. Our tool is based on KLEE and produces a set of test stimuli for full path coverage. We use a method of iterative deepening over abstractions to deal with floating-point values. During actual testing, the stimuli exercise the code against a reference implementation. We illustrate our approach with results of experiments with low-level trigonometric functions, interpolation routines, and mathematical support functions from an open source UAS autopilot.
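The final testing step can be sketched as follows (a simplified sketch, not the tool itself; the polynomial sine stands in for a unit under test, and the fixed stimulus set stands in for the generated path-coverage stimuli): each stimulus drives both the code under test and a trusted reference, and outputs are compared within a floating-point tolerance.

```python
# Sketch of back-to-back testing against a reference implementation.

import math

def sin_poly(x):
    """Unit under test: a truncated Taylor series for sin near 0."""
    return x - x**3 / 6.0 + x**5 / 120.0

def run_stimuli(uut, reference, stimuli, tol=1e-4):
    """Return the stimuli on which the UUT deviates beyond tolerance."""
    return [x for x in stimuli if abs(uut(x) - reference(x)) > tol]

# Stimuli would come from the path-coverage generator; here a fixed set.
stimuli = [i / 10.0 for i in range(-5, 6)]
print(run_stimuli(sin_poly, math.sin, stimuli))  # [] -> all pass
```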
Doctor of Philosophy dissertation
Graphics processing units (GPUs) are highly parallel processors that are now commonly used to accelerate a wide range of computationally intensive tasks. GPU programs often suffer from data races and deadlocks, necessitating systematic testing. Conventional GPU debuggers are ineffective at finding and root-causing races, since they detect errors only with respect to a specific platform, specific inputs, and specific thread schedules. Recent tools based on formal and semiformal analysis have improved the situation considerably, but problems remain. Our research goal is to apply scalable formal analysis that is free of platform constraints and covers all relevant inputs and thread schedules for GPU programs. To achieve this objective, we create a novel symbolic analysis, testing, and test-case generation framework tailored for C++ GPU programs, consisting of three stages: GKLEE, GKLEEp, and SESA. This thesis not only shows that our framework can effectively uncover many concurrency errors in real-world CUDA programs, such as the latest CUDA SDK kernels and the Parboil and LoneStarGPU benchmarks, but also demonstrates that a high degree of test automation is achievable in the space of GPU programs through SMT-based symbolic execution, picking representative executions through thread abstraction, and combined static and dynamic analysis.
Building Better Bit-Blasting for Floating-Point Problems
An effective approach to handling the theory of floating-point is to reduce it to the theory of bit-vectors. Implementing the required encodings is complex, error-prone, and requires a deep understanding of floating-point hardware. This paper presents SymFPU, a library of encodings that can be included in solvers. It also includes a verification argument for its correctness, and experimental results showing that its use in CVC4 out-performs all previous tools. Beyond significantly improved performance and correctness, it is hoped this will give solver developers a simple route to adding support for the theory of floating-point.
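The bit-vector view of floating-point that such encodings build on can be illustrated concretely (a small illustration, not SymFPU itself): an IEEE-754 double is reinterpreted as a 64-bit vector, and the sign, exponent, and significand fields are sliced out of it; the bit-blasted encodings of FP operations are then defined over these fields.

```python
# Reinterpret an IEEE-754 binary64 value as its bit-vector fields.

import struct

def fp_to_bv(x):
    """Return (sign, biased exponent, significand) of a double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))  # reinterpret bits
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, bias 1023
    significand = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, significand

print(fp_to_bv(1.0))   # (0, 1023, 0)
print(fp_to_bv(-2.0))  # (1, 1024, 0)
```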
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
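The core idea can be sketched in miniature (a toy sketch; real engines collect path conditions while interpreting the program and delegate the query to an SMT solver rather than enumerating a finite domain): each branch contributes a predicate to the path condition, and a solver is asked for a concrete input that drives execution down the path reaching the violation.

```python
# Toy symbolic exploration: enumerate paths as predicate/outcome pairs,
# then find a concrete input satisfying a path's condition.

def explore(branches):
    """Enumerate all paths as lists of (predicate, taken) pairs."""
    paths = [[]]
    for pred in branches:
        paths = [p + [(pred, taken)] for p in paths for taken in (True, False)]
    return paths

def solve(path, domain):
    """Brute-force stand-in for an SMT solver over a finite domain."""
    for x in domain:
        if all(pred(x) == taken for pred, taken in path):
            return x
    return None

# Program under test: `if x*x - 12*x + 35 == 0: backdoor()`.
# Random testing is unlikely to hit the backdoor; path solving finds it.
branches = [lambda x: x * x - 12 * x + 35 == 0]
for path in explore(branches):
    print(path[0][1], "->", solve(path, range(100)))  # True -> 5, False -> 0
```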
The present survey has been accepted for publication at ACM Computing
Surveys. This is the authors' pre-print copy. If you are considering citing
this survey, we would appreciate if you could use the following BibTeX entry:
http://goo.gl/Hf5Fvc