182,941 research outputs found
Thread-Modular Static Analysis for Relaxed Memory Models
We propose a memory-model-aware static program analysis method for accurately
analyzing the behavior of concurrent software running on processors with weak
consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of
our method is a unified framework for deciding the feasibility of inter-thread
interferences to avoid propagating spurious data flows during static analysis
and thus boost the performance of the static analyzer. We formulate the
checking of interference feasibility as a set of Datalog rules which are both
efficiently solvable and general enough to capture a range of hardware-level
memory models. Compared to existing techniques, our method can significantly
reduce the number of bogus alarms as well as unsound proofs. We implemented the
method and evaluated it on a large set of multithreaded C programs. Our
experiments showthe method significantly outperforms state-of-the-art
techniques in terms of accuracy with only moderate run-time overhead.Comment: revised version of the ESEC/FSE 2017 pape
Separating computation from communication: a design approach for concurrent program verification
We describe an approach to design static analysis and verification tools for concurrent programs that separates intra-thread computation from inter-thread communication by means of a shared memory abstraction (SMA). We formally characterize the concept of thread-asynchronous transition systems that underpins our approach and that allows us to design tools as two independent components, the intra-thread analysis, which can be optimized separately, and the implementation of the SMA itself, which can be exchanged easily (e.g., from the SC to the TSO memory model). We describe the SMA’s API and show that several concurrent verification techniques from the literature can easily be recast in our setting and thus be extended to weak memory models. We give SMA implementations for the SC, TSO, and PSO memory models that are based on the idea of individual memory unwindings. We instantiate our approach by developing a new, efficient BMC-based bug finding tool for multi-threaded C programs under SC, TSO, or PSO based on these SMAs, and show experimentally that it is competitive to existing tools
Legacy fortran software: applying syntactic metrics to global climate models
It is di cult to maintain legacy Fortran programs that use outdated programming constructs, especially when this maintenance requires a detailed understanding of the code (e.g., for parallelization).\nInitially, we want to gauge the prevalence of such constructs by applying straightforward syntactic metrics to some well-known global climate models. Detailed information regarding les, subroutines, and loops has been collected from each model by applying a lightweight source code static analysis based on ASTs (Abstract Syntax Tree) for a posterior analysis. Modernizing Fortran Legacy programs is still a challenge. Our objective has been to collect relevant information on these programs to help us approach parallelizing legacy scienti c programs in a shared memory environment (e.g. using multi-core processors). The data we collected indicate that old Fortran features are still being used on these models in these days. Furthermore, we propose some metrics to be used as a guide to determine how many changes a program needs in order to be modernized, optimized, and eventually, parallelized.Eje: Workshop IngenierÃa de software (WIS
Expressive and Efficient Memory Representation for Bounded Model Checking of C programs
Ensuring memory safety in programs has been an important yet difficult topic of research. Most static analysis approaches rely on the theory of arrays to model memory
access. The limitation of the theory of arrays in terms of scalability and compatibility
with SAT/SMT solvers is well-known, and there has been many attempts at optimizing
either the theory itself or memory encodings based on theory of arrays.
In this thesis, we demonstrate that existing arrays-based memory encodings miss potential optimization opportunities by omitting language specific properties such as alignment and pointer arithmetic in C. We present SeaM, a new memory representation for C programs built around a more expressive First-order Theory: the Theory of Memory.
We show that by preserving more C language specific rules and properties, the Theory
of Memory allows for more thorough optimization methods during eager rewriting of sequences of stores. We introduce two such optimization methods in this thesis. First, we over-approximate pointer comparison with an abstract interpretation-like approach called AddressRangeMap. Second, we compress sequences of stores with Store-Map for faster
address offset look-ups.
The new memory representation is implemented in SeaBmc, a new BMC tool for
LLVM. We evaluate our approach on real-world bounded model checking tasks from the
aws-c-common library and Sv-Comp benchmarks and compare it against two existing
memory representations in SeaBmc. Our results show that SeaM outperforms the theory
of array based representation and is comparable with the λ based representation
Valuation: Developer support for by-references to by-value type conversion.
Modern object oriented languages like C# and JAVA enable developers to build complex application in less time. These languages are based on selecting heap allocated pass-by-reference objects for user defined data structures. This simplifies programming by automatically managing memory allocation and deallocation in conjunction with automated garbage collection. This simplification of programming comes at the cost of performance. Using pass-by-reference
objects instead of lighter weight pass-by value structs can have memory impact in some cases. These costs can be critical when these application runs on limited resource environments such as mobile devices and cloud computing systems. We explore the problem by using the simple and uniform memory model to improve the performance. In this work we address this problem by providing an automated and sounds static conversion analysis which identifies if a by
reference type can be safely converted to a by value type where the conversion may result in performance improvements. This works focus on C# programs. Our approach is based on a combination of syntactic and semantic checks to identify classes that are safe to convert. We evaluate the effectiveness of our work in identifying convertible types and impact of this transformation. The result shows that the transformation of reference type to value type can have substantial performance impact in practice. In our case studies we optimize the performance in Barnes-Hut program which shows total memory allocation decreased by 93% and execution time also reduced by 15%
Bounded Model Checking of Industrial Code
Abstract:
Bounded Model Checking(BMC) is an effective and precise static analysis technique that reduces program verification to satisfiability (SAT) solving. However, with a few exceptions, BMC is not actively used in software industry, especially, when compared to dynamic analysis techniques such as fuzzing, or light-weight formal static analysis. This thesis describes our experience of applying BMC to industrial code using a novel BMC tool SEABMC.
We present three contributions:
First, a case study of (re)verifying the aws-c-common library from AWS using SEABMC and KLEE. This study explores the methodology from the perspective of three research questions: (a) can proof artifacts be used across verification tools; (b) are there bugs in verified code; and (c) can specifications be improved. To study these questions, we port the verification tasks for aws-c-common library to SEAHORN and KLEE. We show the benefits of using compiler semantics and cross-checking specifications with different verification techniques, and call for standardizing proof library extensions to increase specification reuse.
Second, a description of SEABMC - a novel BMC engine for SEAHORN. We start with a custom IR (called SEA-IR) that explicitly purifies all memory operations by explicating dependencies between them. We then run program transformations and allow for generating many different styles of verification conditions. To support memory safety checking, we extend our base approach with fat pointers and shadow bits of memory to keep track of metadata, such as the size of a pointed-to object. To evaluate SEABMC, we use the aws-c-common library from AWS as a benchmark and compare with CBMC, SMACK, and KLEE. We show that SEABMC is capable of providing an order of magnitude improvement compared with state-of-the-art.
Third, a case study of extending SEABMC to work with Rust - a young systems programming language. We ask three research questions: (a) can SEABMC be used to verify Rust programs easily; (b) can the specification style of aws-c-common be applied successfully to Rust programs; and (c) can verification become more efficient when using higher level language information. We answer these questions by verifying aspects of the Rust standard library using SEAURCHIN, an extension of SEABMC for Rust
Heap Abstractions for Static Analysis
Heap data is potentially unbounded and seemingly arbitrary. As a consequence,
unlike stack and static memory, heap memory cannot be abstracted directly in
terms of a fixed set of source variable names appearing in the program being
analysed. This makes it an interesting topic of study and there is an abundance
of literature employing heap abstractions. Although most studies have addressed
similar concerns, their formulations and formalisms often seem dissimilar and
some times even unrelated. Thus, the insights gained in one description of heap
abstraction may not directly carry over to some other description. This survey
is a result of our quest for a unifying theme in the existing descriptions of
heap abstractions. In particular, our interest lies in the abstractions and not
in the algorithms that construct them.
In our search of a unified theme, we view a heap abstraction as consisting of
two features: a heap model to represent the heap memory and a summarization
technique for bounding the heap representation. We classify the models as
storeless, store based, and hybrid. We describe various summarization
techniques based on k-limiting, allocation sites, patterns, variables, other
generic instrumentation predicates, and higher-order logics. This approach
allows us to compare the insights of a large number of seemingly dissimilar
heap abstractions and also paves way for creating new abstractions by
mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure
Static Analysis of Run-Time Errors in Embedded Real-Time Parallel C Programs
We present a static analysis by Abstract Interpretation to check for run-time
errors in parallel and multi-threaded C programs. Following our work on
Astr\'ee, we focus on embedded critical programs without recursion nor dynamic
memory allocation, but extend the analysis to a static set of threads
communicating implicitly through a shared memory and explicitly using a finite
set of mutual exclusion locks, and scheduled according to a real-time
scheduling policy and fixed priorities. Our method is thread-modular. It is
based on a slightly modified non-parallel analysis that, when analyzing a
thread, applies and enriches an abstract set of thread interferences. An
iterator then re-analyzes each thread in turn until interferences stabilize. We
prove the soundness of our method with respect to the sequential consistency
semantics, but also with respect to a reasonable weakly consistent memory
semantics. We also show how to take into account mutual exclusion and thread
priorities through a partitioning over an abstraction of the scheduler state.
We present preliminary experimental results analyzing an industrial program
with our prototype, Th\'es\'ee, and demonstrate the scalability of our
approach
- …