30,402 research outputs found
Heap Abstractions for Static Analysis
Heap data is potentially unbounded and seemingly arbitrary. As a consequence,
unlike stack and static memory, heap memory cannot be abstracted directly in
terms of a fixed set of source variable names appearing in the program being
analysed. This makes it an interesting topic of study and there is an abundance
of literature employing heap abstractions. Although most studies have addressed
similar concerns, their formulations and formalisms often seem dissimilar and
some times even unrelated. Thus, the insights gained in one description of heap
abstraction may not directly carry over to some other description. This survey
is a result of our quest for a unifying theme in the existing descriptions of
heap abstractions. In particular, our interest lies in the abstractions and not
in the algorithms that construct them.
In our search of a unified theme, we view a heap abstraction as consisting of
two features: a heap model to represent the heap memory and a summarization
technique for bounding the heap representation. We classify the models as
storeless, store based, and hybrid. We describe various summarization
techniques based on k-limiting, allocation sites, patterns, variables, other
generic instrumentation predicates, and higher-order logics. This approach
allows us to compare the insights of a large number of seemingly dissimilar
heap abstractions and also paves way for creating new abstractions by
mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure
Sound and Precise Malware Analysis for Android via Pushdown Reachability and Entry-Point Saturation
We present Anadroid, a static malware analysis framework for Android apps.
Anadroid exploits two techniques to soundly raise precision: (1) it uses a
pushdown system to precisely model dynamically dispatched interprocedural and
exception-driven control-flow; (2) it uses Entry-Point Saturation (EPS) to
soundly approximate all possible interleavings of asynchronous entry points in
Android applications. (It also integrates static taint-flow analysis and least
permissions analysis to expand the class of malicious behaviors which it can
catch.) Anadroid provides rich user interface support for human analysts which
must ultimately rule on the "maliciousness" of a behavior.
To demonstrate the effectiveness of Anadroid's malware analysis, we had teams
of analysts analyze a challenge suite of 52 Android applications released as
part of the Auto- mated Program Analysis for Cybersecurity (APAC) DARPA
program. The first team analyzed the apps using a ver- sion of Anadroid that
uses traditional (finite-state-machine-based) control-flow-analysis found in
existing malware analysis tools; the second team analyzed the apps using a
version of Anadroid that uses our enhanced pushdown-based
control-flow-analysis. We measured machine analysis time, human analyst time,
and their accuracy in flagging malicious applications. With pushdown analysis,
we found statistically significant (p < 0.05) decreases in time: from 85
minutes per app to 35 minutes per app in human plus machine analysis time; and
statistically significant (p < 0.05) increases in accuracy with the
pushdown-driven analyzer: from 71% correct identification to 95% correct
identification.Comment: Appears in 3rd Annual ACM CCS workshop on Security and Privacy in
SmartPhones and Mobile Devices (SPSM'13), Berlin, Germany, 201
Static Safety for an Actor Dedicated Process Calculus by Abstract Interpretation
The actor model eases the definition of concurrent programs with non uniform
behaviors. Static analysis of such a model was previously done in a data-flow
oriented way, with type systems. This approach was based on constraint set
resolution and was not able to deal with precise properties for communications
of behaviors. We present here a new approach, control-flow oriented, based on
the abstract interpretation framework, able to deal with communication of
behaviors. Within our new analyses, we are able to verify most of the previous
properties we observed as well as new ones, principally based on occurrence
counting
An efficient, parametric fixpoint algorithm for analysis of java bytecode
Abstract interpretation has been widely used for the analysis of object-oriented languages and, in particular, Java source and bytecode. However, while most existing work deals with the problem of flnding expressive abstract domains that track accurately the characteristics of a particular concrete property, the underlying flxpoint algorithms have received comparatively less attention. In fact, many existing (abstract interpretation based—) flxpoint algorithms rely on relatively inefHcient techniques for solving inter-procedural caligraphs or are speciflc and tied to particular analyses. We also argüe that the design of an efficient fixpoint algorithm is pivotal to supporting the analysis of large programs. In this paper we introduce a novel algorithm for analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations. The algorithm is parametric -in the sense that it is independent of the abstract domain used and it can be applied to different domains as "plug-ins"-, multivariant, and flow-sensitive. Also, is based on a program transformation, prior to the analysis, that results in a highly uniform representation of all the features in the language and therefore simplifies analysis. Detailed descriptions of decompilation solutions are given and discussed with an example. We also provide some performance data from a preliminary implementation of the analysis
What Does Aspect-Oriented Programming Mean for Functional Programmers?
Aspect-Oriented Programming (AOP) aims at modularising crosscutting concerns that show up in software. The success of AOP has been almost viral and nearly all areas in Software Engineering and Programming Languages have become "infected" by the AOP bug in one way or another. Interestingly the functional programming community (and, in particular, the pure functional programming community) seems to be resistant to the pandemic. The goal of this paper is to debate the possible causes of the functional programming community's resistance and to raise awareness and interest by showcasing the benefits that could be gained from having a functional AOP language. At the same time, we identify the main challenges and explore the possible design-space
Call Graphs for Languages with Parametric Polymorphism
The performance of contemporary object oriented languages depends on optimizations such as devirtualization, inlining, and specialization, and these in turn depend on precise call graph analysis. Existing call graph analyses do not take advantage of the information provided by the rich type systems of contemporary languages, in particular generic type arguments. Many existing approaches analyze Java bytecode, in which generic types have been erased. This paper shows that this discarded information is actually very useful as the context in a context-sensitive analysis, where it significantly improves precision and keeps the running time small. Specifically, we propose and evaluate call graph construction algorithms in which the contexts of a method are (i) the type arguments passed to its type parameters, and (ii) the static types of the arguments passed to its term parameters. The use of static types from the caller as context is effective because it allows more precise dispatch of call sites inside the callee. Our evaluation indicates that the average number of contexts required per method is small. We implement the analysis in the Dotty compiler for Scala, and evaluate it on programs that use the type-parametric Scala collections library and on the Dotty compiler itself. The context-sensitive analysis runs 1.4x faster than a context-insensitive one and discovers 20\% more monomorphic call sites at the same time. When applied to method specialization, the imprecision in a context-insensitive call graph would require the average method to be cloned 22 times, whereas the context-sensitive call graph indicates a much more practical 1.00 to 1.50 clones per method
Interprocedural Data Flow Analysis in Soot using Value Contexts
An interprocedural analysis is precise if it is flow sensitive and fully
context-sensitive even in the presence of recursion. Many methods of
interprocedural analysis sacrifice precision for scalability while some are
precise but limited to only a certain class of problems.
Soot currently supports interprocedural analysis of Java programs using graph
reachability. However, this approach is restricted to IFDS/IDE problems, and is
not suitable for general data flow frameworks such as heap reference analysis
and points-to analysis which have non-distributive flow functions.
We describe a general-purpose interprocedural analysis framework for Soot
using data flow values for context-sensitivity. This framework is not
restricted to problems with distributive flow functions, although the lattice
must be finite. It combines the key ideas of the tabulation method of the
functional approach and the technique of value-based termination of call string
construction.
The efficiency and precision of interprocedural analyses is heavily affected
by the precision of the underlying call graph. This is especially important for
object-oriented languages like Java where virtual method invocations cause an
explosion of spurious call edges if the call graph is constructed naively. We
have instantiated our framework with a flow and context-sensitive points-to
analysis in Soot, which enables the construction of call graphs that are far
more precise than those constructed by Soot's SPARK engine.Comment: SOAP 2013 Final Versio
Simple and Effective Type Check Removal through Lazy Basic Block Versioning
Dynamically typed programming languages such as JavaScript and Python defer
type checking to run time. In order to maximize performance, dynamic language
VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between
compilation time and resulting precision. This has lead to the creation of
increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation
technique which effectively removes redundant type checks from critical code
paths. This novel approach lazily generates type-specialized versions of basic
blocks on-the-fly while propagating context-dependent type information. This
does not require the use of costly program analyses, is not restricted by the
precision limitations of traditional type analyses and avoids the
implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a
JavaScript JIT compiler. This approach is compared with a classical flow-based
type analysis. Lazy basic block versioning performs as well or better on all
benchmarks. On average, 71% of type tests are eliminated, yielding speedups of
up to 50%. We also show that our implementation generates more efficient
machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on
several benchmarks. The combination of implementation simplicity, low
algorithmic complexity and good run time performance makes basic block
versioning attractive for baseline JIT compilers
- …