27,300 research outputs found
BCFA: Bespoke Control Flow Analysis for CFA at Scale
Many data-driven software engineering tasks such as discovering programming
patterns, mining API specifications, etc., perform source code analysis over
control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be
expensive and performance of the analysis heavily depends on the underlying CFG
traversal strategy. State-of-the-art analysis frameworks use a fixed traversal
strategy. We argue that a single traversal strategy does not fit all kinds of
analyses and CFGs and propose bespoke control flow analysis (BCFA). Given a
control flow analysis (CFA) and a large number of CFGs, BCFA selects the most
efficient traversal strategy for each CFG. BCFA extracts a set of properties of
the CFA by analyzing the code of the CFA and combines it with properties of the
CFG, such as branching factor and cyclicity, for selecting the optimal
traversal strategy. We have implemented BCFA in Boa, and evaluated BCFA using a
set of representative static analyses that mainly involve traversing CFGs and
two large datasets containing 287 thousand and 162 million CFGs. Our results
show that BCFA can speedup the large scale analyses by 1%-28%. Further, BCFA
has low overheads; less than 0.2%, and low misprediction rate; less than 0.01%.Comment: 12 page
Liveness-Based Garbage Collection for Lazy Languages
We consider the problem of reducing the memory required to run lazy
first-order functional programs. Our approach is to analyze programs for
liveness of heap-allocated data. The result of the analysis is used to preserve
only live data---a subset of reachable data---during garbage collection. The
result is an increase in the garbage reclaimed and a reduction in the peak
memory requirement of programs. While this technique has already been shown to
yield benefits for eager first-order languages, the lack of a statically
determinable execution order and the presence of closures pose new challenges
for lazy languages. These require changes both in the liveness analysis itself
and in the design of the garbage collector.
To show the effectiveness of our method, we implemented a copying collector
that uses the results of the liveness analysis to preserve live objects, both
evaluated (i.e., in WHNF) and closures. Our experiments confirm that for
programs running with a liveness-based garbage collector, there is a
significant decrease in peak memory requirements. In addition, a sizable
reduction in the number of collections ensures that in spite of using a more
complex garbage collector, the execution times of programs running with
liveness and reachability-based collectors remain comparable
GraphMaps: Browsing Large Graphs as Interactive Maps
Algorithms for laying out large graphs have seen significant progress in the
past decade. However, browsing large graphs remains a challenge. Rendering
thousands of graphical elements at once often results in a cluttered image, and
navigating these elements naively can cause disorientation. To address this
challenge we propose a method called GraphMaps, mimicking the browsing
experience of online geographic maps.
GraphMaps creates a sequence of layers, where each layer refines the previous
one. During graph browsing, GraphMaps chooses the layer corresponding to the
zoom level, and renders only those entities of the layer that intersect the
current viewport. The result is that, regardless of the graph size, the number
of entities rendered at each view does not exceed a predefined threshold, yet
all graph elements can be explored by the standard zoom and pan operations.
GraphMaps preprocesses a graph in such a way that during browsing, the
geometry of the entities is stable, and the viewer is responsive. Our case
studies indicate that GraphMaps is useful in gaining an overview of a large
graph, and also in exploring a graph on a finer level of detail.Comment: submitted to GD 201
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing
this survey, we would appreciate if you could use the following BibTeX entry:
http://goo.gl/Hf5Fv
Invariant Generation through Strategy Iteration in Succinctly Represented Control Flow Graphs
We consider the problem of computing numerical invariants of programs, for
instance bounds on the values of numerical program variables. More
specifically, we study the problem of performing static analysis by abstract
interpretation using template linear constraint domains. Such invariants can be
obtained by Kleene iterations that are, in order to guarantee termination,
accelerated by widening operators. In many cases, however, applying this form
of extrapolation leads to invariants that are weaker than the strongest
inductive invariant that can be expressed within the abstract domain in use.
Another well-known source of imprecision of traditional abstract interpretation
techniques stems from their use of join operators at merge nodes in the control
flow graph. The mentioned weaknesses may prevent these methods from proving
safety properties. The technique we develop in this article addresses both of
these issues: contrary to Kleene iterations accelerated by widening operators,
it is guaranteed to yield the strongest inductive invariant that can be
expressed within the template linear constraint domain in use. It also eschews
join operators by distinguishing all paths of loop-free code segments. Formally
speaking, our technique computes the least fixpoint within a given template
linear constraint domain of a transition relation that is succinctly expressed
as an existentially quantified linear real arithmetic formula. In contrast to
previously published techniques that rely on quantifier elimination, our
algorithm is proved to have optimal complexity: we prove that the decision
problem associated with our fixpoint problem is in the second level of the
polynomial-time hierarchy.Comment: 35 pages, conference version published at ESOP 2011, this version is
a CoRR version of our submission to Logical Methods in Computer Scienc
- …