27,300 research outputs found

    BCFA: Bespoke Control Flow Analysis for CFA at Scale

    Full text link
    Many data-driven software engineering tasks such as discovering programming patterns, mining API specifications, etc., perform source code analysis over control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be expensive and performance of the analysis heavily depends on the underlying CFG traversal strategy. State-of-the-art analysis frameworks use a fixed traversal strategy. We argue that a single traversal strategy does not fit all kinds of analyses and CFGs and propose bespoke control flow analysis (BCFA). Given a control flow analysis (CFA) and a large number of CFGs, BCFA selects the most efficient traversal strategy for each CFG. BCFA extracts a set of properties of the CFA by analyzing the code of the CFA and combines it with properties of the CFG, such as branching factor and cyclicity, for selecting the optimal traversal strategy. We have implemented BCFA in Boa, and evaluated BCFA using a set of representative static analyses that mainly involve traversing CFGs and two large datasets containing 287 thousand and 162 million CFGs. Our results show that BCFA can speedup the large scale analyses by 1%-28%. Further, BCFA has low overheads; less than 0.2%, and low misprediction rate; less than 0.01%.Comment: 12 page

    Liveness-Based Garbage Collection for Lazy Languages

    Full text link
    We consider the problem of reducing the memory required to run lazy first-order functional programs. Our approach is to analyze programs for liveness of heap-allocated data. The result of the analysis is used to preserve only live data---a subset of reachable data---during garbage collection. The result is an increase in the garbage reclaimed and a reduction in the peak memory requirement of programs. While this technique has already been shown to yield benefits for eager first-order languages, the lack of a statically determinable execution order and the presence of closures pose new challenges for lazy languages. These require changes both in the liveness analysis itself and in the design of the garbage collector. To show the effectiveness of our method, we implemented a copying collector that uses the results of the liveness analysis to preserve live objects, both evaluated (i.e., in WHNF) and closures. Our experiments confirm that for programs running with a liveness-based garbage collector, there is a significant decrease in peak memory requirements. In addition, a sizable reduction in the number of collections ensures that in spite of using a more complex garbage collector, the execution times of programs running with liveness and reachability-based collectors remain comparable

    GraphMaps: Browsing Large Graphs as Interactive Maps

    Full text link
    Algorithms for laying out large graphs have seen significant progress in the past decade. However, browsing large graphs remains a challenge. Rendering thousands of graphical elements at once often results in a cluttered image, and navigating these elements naively can cause disorientation. To address this challenge we propose a method called GraphMaps, mimicking the browsing experience of online geographic maps. GraphMaps creates a sequence of layers, where each layer refines the previous one. During graph browsing, GraphMaps chooses the layer corresponding to the zoom level, and renders only those entities of the layer that intersect the current viewport. The result is that, regardless of the graph size, the number of entities rendered at each view does not exceed a predefined threshold, yet all graph elements can be explored by the standard zoom and pan operations. GraphMaps preprocesses a graph in such a way that during browsing, the geometry of the entities is stable, and the viewer is responsive. Our case studies indicate that GraphMaps is useful in gaining an overview of a large graph, and also in exploring a graph on a finer level of detail.Comment: submitted to GD 201

    A Survey of Symbolic Execution Techniques

    Get PDF
    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv

    Invariant Generation through Strategy Iteration in Succinctly Represented Control Flow Graphs

    Full text link
    We consider the problem of computing numerical invariants of programs, for instance bounds on the values of numerical program variables. More specifically, we study the problem of performing static analysis by abstract interpretation using template linear constraint domains. Such invariants can be obtained by Kleene iterations that are, in order to guarantee termination, accelerated by widening operators. In many cases, however, applying this form of extrapolation leads to invariants that are weaker than the strongest inductive invariant that can be expressed within the abstract domain in use. Another well-known source of imprecision of traditional abstract interpretation techniques stems from their use of join operators at merge nodes in the control flow graph. The mentioned weaknesses may prevent these methods from proving safety properties. The technique we develop in this article addresses both of these issues: contrary to Kleene iterations accelerated by widening operators, it is guaranteed to yield the strongest inductive invariant that can be expressed within the template linear constraint domain in use. It also eschews join operators by distinguishing all paths of loop-free code segments. Formally speaking, our technique computes the least fixpoint within a given template linear constraint domain of a transition relation that is succinctly expressed as an existentially quantified linear real arithmetic formula. In contrast to previously published techniques that rely on quantifier elimination, our algorithm is proved to have optimal complexity: we prove that the decision problem associated with our fixpoint problem is in the second level of the polynomial-time hierarchy.Comment: 35 pages, conference version published at ESOP 2011, this version is a CoRR version of our submission to Logical Methods in Computer Scienc
    • …
    corecore