3,361 research outputs found
Goal-oriented test data generation for pointer programs
Goal-oriented test data generation; Constraint Logic Programming; Static Single Assignment formInternational audienceAutomatic test data generation leads to the identification of input values on which a selected path or a selected branch is executed within a program (path-oriented vs goal-oriented methods). In both cases, several approaches based on constraint solving exist, but in the presence of pointer variables only path-oriented methods have been proposed. Pointers are responsible for the existence of conditional aliasing problems that usually provoke the failure of the goal-oriented test data generation process. In this paper, we propose an overall constraint-based method that exploits the results of an intraprocedural points-to analysis and provides two specific constraint combinators for automatically generating goal-oriented test data. This approach correctly handles multi-levels stack-directed pointers that are mainly used in C programs. The method has been fully implemented in the test data generation tool INKA and first experiences in applying it to a variety of existing programs are presented
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing
this survey, we would appreciate if you could use the following BibTeX entry:
http://goo.gl/Hf5Fv
Proving memory safety of floating-point computations by combining static and dynamic program analysis
Whitebox fuzzing is a novel form of security testing based on runtime symbolic execution and constraint solving. Over the last couple of years, whitebox fuzzers have found dozens of new security vulnerabilities (buffer overflows) in Windows and Linux applications, including codecs, image viewers and media players. Those types of applications tend to use floating-point instructions available on modern processors, yet existing whitebox fuzzers and SMT constraint solvers do not handle floating-point arithmetic. Are there new security vulnerabilities lurking in floating-point code? A naive solution would be to extend symbolic execu-tion to floating-point (FP) instructions (months of work), ex-tend SMT solvers to reason about FP constraints (months of work), and then face more complex constraints and an even worse path explosion problem. Instead, we propose an alternative approach, based on the rough intuition that FP code should only perform memory-safe data-processing of the “payload ” of an image or video file, while the non-FP part of the application should deal with buffer alloca-tions and memory address computations, with only the lat-ter being prone to buffer overflows and other security critical bugs. Our approach combines (1) a lightweight local path-insensitive “may ” static analysis of FP instructions with (2) a high-precision whole-program path-sensitive “must ” dy-namic analysis of non-FP instructions. The aim of this com-bination is to prove memory safety of the FP part and a form of non-interference between the FP part and the non-FP part with respect to memory address computations. We have implemented our approach using two existing tools for, respectively, static and dynamic x86 binary analysis. We present preliminary results of experiments with standard JPEG, GIF and ANI Windows parsers. For a given test suite of diverse input files, our mixed static/dynamic analysis is able to prove memory safety of FP code in those parsers for a small upfront static analysis cost and a marginal runtime expense compared to regular runtime symbolic execution
Separation logic for high-level synthesis
High-level synthesis (HLS) promises a significant shortening of the digital hardware design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. However, applications using dynamic, pointer-based data structures remain difficult to implement well, yet such constructs are widely used in software. Automated optimisations that leverage the memory bandwidth of dedicated hardware implementations by distributing the application data over separate on-chip memories and parallelise the implementation are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis that disambiguates pointer-based memory accesses. This thesis takes a step towards closing this gap. We explore recent advances in separation logic, a rigorous mathematical framework that enables formal reasoning about the memory access of heap-manipulating programs. We develop a static analysis that automatically splits heap-allocated data structures into provably disjoint regions. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable loop parallelisation and physical memory partitioning by off-the-shelf HLS tools.
We then extend the scope of our technique to pointer-based memory-intensive implementations that require access to an off-chip memory. The extended HLS design aid generates parallel on-chip multi-cache architectures. It uses the disjointness property of memory accesses to support non-overlapping memory regions by private caches. It also identifies regions which are shared after parallelisation and which are supported by parallel caches with a coherency mechanism and synchronisation, resulting in automatically specialised memory systems. We show up to 15x acceleration from heap partitioning, parallelisation and the insertion of the custom cache system in demonstrably practical applications.Open Acces
WCET-Directed Dynamic Scratchpad Memory Allocation of Data
Many embedded systems feature processors coupled with a small and fast scratchpad memory. To the difference with caches, allocation of data to scratchpad memory must be handled by software. The major gain is to enhance the pre-dictability of memory accesses latencies. A compile-time dy-namic allocation approach enables eviction and placement of data to the scratchpad memory at runtime. Previous dynamic scratchpad memory allocation ap-proaches aimed to reduce average-case program execution time or the energy consumption due to memory accesses. For real-time systems, worst-case execution time is the main metric to optimize. In this paper, we propose a WCET-directed algorithm to dynamically allocate static data and stack data of a pro-gram to scratchpad memory. The granularity of placement of memory transfers (e.g. on function, basic block bound-aries) is discussed from the perspective of its computation complexity and the quality of allocation. 1
Formal Derivation of Concurrent Garbage Collectors
Concurrent garbage collectors are notoriously difficult to implement
correctly. Previous approaches to the issue of producing correct collectors
have mainly been based on posit-and-prove verification or on the application of
domain-specific templates and transformations. We show how to derive the upper
reaches of a family of concurrent garbage collectors by refinement from a
formal specification, emphasizing the application of domain-independent design
theories and transformations. A key contribution is an extension to the
classical lattice-theoretic fixpoint theorems to account for the dynamics of
concurrent mutation and collection.Comment: 38 pages, 21 figures. The short version of this paper appeared in the
Proceedings of MPC 201
Recommended from our members
Duplo: A framework for OCaml post-link optimisation
We present a novel framework,
Duplo
, for the low-level post-link optimisation of OCaml programs, achieving a speedup of 7% and a reduction of at least 15% of the code size of widely-used OCaml applications. Unlike existing post-link optimisers, which typically operate on target-specific machine code, our framework operates on a Low-Level Intermediate Representation (LLIR) capable of representing both the OCaml programs and any C dependencies they invoke through the foreign-function interface (FFI). LLIR is analysed, transformed and lowered to machine code by our post-link optimiser, LLIR-OPT. Most importantly, LLIR allows the optimiser to cross the OCaml-C language boundary, mitigating the overhead incurred by the FFI and enabling analyses and transformations in a previously unavailable context. The optimised IR is then lowered to amd64 machine code through the existing target-specific code generator of LLVM, modified to handle garbage collection just as effectively as the native OCaml backend. We equip our optimiser with a suite of SSA-based transformations and points-to analyses capable of capturing the semantics and representing the memory models of both languages, along with a cross-language inliner to embed C methods into OCaml callers. We evaluate the gains of our framework, which can be attributed to both our optimiser and the more sophisticated amd64 backend of LLVM, on a wide-range of widely-used OCaml applications, as well as an existing suite of micro- and macro-benchmarks used to track the performance of the OCaml compiler.
EPSRC EP/P020011/1, Cambridge Trust
- …