
    Transformation of Attributed Structures with Cloning (Long Version)

    Copying, or cloning, is a basic operation used in the specification of many applications in computer science. However, when dealing with complex structures such as graphs, cloning is not a straightforward operation, since a copy of a single vertex may involve (implicitly) copying many edges. Therefore, most graph transformation approaches rule out cloning. We tackle this problem by providing a framework for graph transformations with cloning. We use attributed graphs and allow rules to change attributes. Together, these two features (cloning and changing attributes) give rise to a powerful formal specification approach. In order to handle different kinds of graphs and attributes, we first define the notion of attributed structures in an abstract way. We then generalise the sesqui-pushout approach to graph transformation within the proposed general framework and give appropriate conditions under which attributed structures can be transformed. Finally, we instantiate the general framework with different examples, showing that many structures can be handled and that the framework allows one to specify complex operations in a natural way.
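
    The abstract's motivating point, that cloning one vertex implicitly copies many edges, is easy to see concretely. Below is a minimal Python sketch of vertex cloning on a simple attributed graph; the dictionary representation and function name are illustrative, not the paper's categorical formalism, and self-loops would need extra care (precisely the kind of case the sesqui-pushout semantics settles).

        import copy

        def clone_vertex(graph, v, new_v):
            """Clone vertex v as new_v: copy its attributes and every incident edge."""
            graph['attrs'][new_v] = copy.deepcopy(graph['attrs'][v])
            added = []
            for (src, dst) in graph['edges']:
                if src == v:
                    added.append((new_v, dst))   # duplicate outgoing edge
                if dst == v:
                    added.append((src, new_v))   # duplicate incoming edge
            graph['edges'].extend(added)

        g = {'attrs': {'a': {'color': 'red'}, 'b': {}},
             'edges': [('a', 'b'), ('b', 'a')]}
        clone_vertex(g, 'a', 'a2')
        print(g['edges'])   # [('a','b'), ('b','a'), ('a2','b'), ('b','a2')]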

    Simple and Effective Type Check Removal through Lazy Basic Block Versioning

    Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures. This paper introduces lazy basic block versioning, a simple JIT compilation technique that effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on the fly while propagating context-dependent type information. It does not require costly program analyses, is not restricted by the precision limitations of traditional type analyses, and avoids the implementation complexity of speculative optimization techniques. We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler and compared it with a classical flow-based type analysis. Lazy basic block versioning performs as well or better on all benchmarks: on average, 71% of type tests are eliminated, yielding speedups of up to 50%. We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks. The combination of implementation simplicity, low algorithmic complexity, and good run-time performance makes basic block versioning attractive for baseline JIT compilers.
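
    To make the mechanism concrete, here is a hedged Python sketch of the core idea: block versions are specialized lazily, keyed by the type context in which they are entered, and a type test whose outcome is already known in that context is simply dropped. The instruction set and data structures are invented for illustration and are not the paper's implementation.

        from collections import namedtuple

        Instr = namedtuple('Instr', 'op arg')

        # One toy block containing two redundant type tests on the same variable.
        blocks = {'entry': [Instr('is_int', 'x'), Instr('add', 'x'),
                            Instr('is_int', 'x'), Instr('add', 'x')]}

        versions = {}   # (block id, frozen type context) -> specialized code

        def specialize(block_id, type_ctx):
            out = []
            for ins in blocks[block_id]:
                if ins.op == 'is_int':
                    if type_ctx.get(ins.arg) == 'int':
                        continue                               # type already proven: drop the check
                    type_ctx = {**type_ctx, ins.arg: 'int'}    # check passed: learn the type
                out.append(ins)
            return out

        def get_version(block_id, type_ctx):
            key = (block_id, tuple(sorted(type_ctx.items())))
            if key not in versions:                  # generated lazily, on first entry
                versions[key] = specialize(block_id, dict(type_ctx))
            return versions[key]

        print(get_version('entry', {}))   # first is_int kept, second eliminated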

    Value-Flow-Based Demand-Driven Pointer Analysis for C and C++

    We present SUPA, a value-flow-based demand-driven flow- and context-sensitive pointer analysis with strong updates for C and C++ programs. SUPA enables computing points-to information via value-flow refinement in environments with small time and memory budgets. We formulate SUPA by solving a graph-reachability problem on an inter-procedural value-flow graph representing a program's def-use chains, which are pre-computed efficiently but over-approximately. To answer a client query (a request for a variable's points-to set), SUPA reasons about the flow of values along the pre-computed def-use chains sparsely (rather than across all program points), performing only the work necessary for the query (rather than analyzing the whole program). In particular, strong updates are performed to filter out spurious def-use chains through value-flow refinement, as long as the total budget is not exhausted.
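
    The demand-driven flavor is the interesting part: nothing is resolved until a client asks. A minimal sketch of such a query, under invented names and a drastically simplified def-use representation (the real SUPA also handles flow- and context-sensitivity and strong updates), might look like this:

        from collections import namedtuple

        Def = namedtuple('Def', 'kind src obj')   # kind: 'alloc' or 'copy'

        def points_to(var, def_use, budget=100):
            """Resolve var's points-to set by walking def-use chains backward."""
            pts, seen, work = set(), set(), [var]
            while work and budget > 0:
                budget -= 1
                v = work.pop()
                if v in seen:
                    continue
                seen.add(v)
                for d in def_use.get(v, ()):
                    if d.kind == 'alloc':
                        pts.add(d.obj)        # reached an allocation site
                    else:
                        work.append(d.src)    # follow the copy one step back
            return pts                        # sound, but coarser if the budget ran out

        # p = &o1; q = p;  =>  q points to {o1}
        chains = {'q': [Def('copy', 'p', None)], 'p': [Def('alloc', None, 'o1')]}
        print(points_to('q', chains))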

    Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis

    Static analysis is used to automatically detect bugs and security breaches, and it aids compiler optimization. Whole-program analysis (WPA) can yield high precision; however, it causes long analysis times and thus does not match common software-development workflows, making it often impractical for large, real-world applications. This paper therefore presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims to accelerate whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to, and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA. We implemented ModAlyzer as an extension to LLVM and PhASAR and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall; surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation, on average, takes about 3.67 times as long as WPA.
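
    The workflow benefit comes from persisting summaries so shared library code is never re-analyzed. The following Python sketch shows only that caching skeleton; the JSON format, file layout, and the trivial summarize stand-in are assumptions for illustration, not ModAlyzer's actual summary encoding.

        import json, os

        def summarize(fn_ir):
            # Stand-in for the real callgraph/points-to/data-flow summarization.
            return {'callees': sorted(set(fn_ir.get('calls', [])))}

        def get_summary(fn_name, fn_ir, cache_dir='summaries'):
            os.makedirs(cache_dir, exist_ok=True)
            path = os.path.join(cache_dir, fn_name + '.json')
            if os.path.exists(path):          # reuse the persisted summary: no re-analysis
                with open(path) as f:
                    return json.load(f)
            summary = summarize(fn_ir)        # analyze the function once
            with open(path, 'w') as f:
                json.dump(summary, f)
            return summary

        lib_fn = {'calls': ['memcpy', 'malloc']}
        print(get_summary('copy_buf', lib_fn))   # computed once, read from disk thereafter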

    Efficient Set Sharing Using ZBDDs

    Set sharing is an abstract domain in which each concrete object is represented by the set of local variables from which it might be reachable. It is a useful abstraction for detecting parallelism opportunities, since it contains definite information about which variables do not share in memory, i.e., about when the memory regions reachable from those variables are disjoint. Set sharing is a more precise alternative to pair sharing, in which each domain element is a set of all pairs of local variables from which a common object may be reachable. However, the exponential complexity of some set sharing operations has limited its wider application. This work introduces an efficient implementation of the set sharing domain using Zero-suppressed Binary Decision Diagrams (ZBDDs). Because ZBDDs were designed to represent sets of combinations (i.e., sets of sets), they naturally represent elements of the set sharing domain. We show how to synthesize the operations needed in the set sharing transfer functions from basic ZBDD operations. For some of the operations, we devise custom ZBDD algorithms that perform better in practice. We also compare our implementation of the abstract domain with an efficient, compact, bit-set-based alternative, and show that the ZBDD version scales better in terms of both memory usage and running time.
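
    Why ZBDDs fit so naturally: a sharing group is a set of variables, and an abstract value is a set of such groups, which is exactly the "set of combinations" a ZBDD encodes, with common substructure shared. The toy Python below implements just enough (hash-consed nodes, the zero-suppression rule, and union) to show the encoding; it is an educational sketch, not the paper's optimized implementation.

        EMPTY, BASE = 0, 1     # 0 encodes {} ; 1 encodes { {} }
        _table, _nodes = {}, {}

        def mk(var, lo, hi):
            if hi == EMPTY:                # zero-suppression rule of ZBDDs
                return lo
            key = (var, lo, hi)
            if key not in _table:
                nid = len(_nodes) + 2
                _table[key], _nodes[nid] = nid, key
            return _table[key]

        def singleton(group):              # ZBDD for one sharing group (one set)
            z = BASE
            for v in sorted(group, reverse=True):
                z = mk(v, EMPTY, z)
            return z

        def union(a, b):                   # set-of-sets union
            if a == EMPTY: return b
            if b == EMPTY: return a
            if a == b:     return a
            if a == BASE:  a, b = b, a     # make a an internal node
            va, la, ha = _nodes[a]
            if b == BASE:
                return mk(va, union(la, BASE), ha)
            vb, lb, hb = _nodes[b]
            if va == vb: return mk(va, union(la, lb), union(ha, hb))
            if va < vb:  return mk(va, union(la, b), ha)
            return union(b, a)

        # {{x,y}} U {{y,z}}: both groups live in one shared diagram
        s = union(singleton({'x', 'y'}), singleton({'y', 'z'}))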

    A Sparse Program Dependence Graph For Object Oriented Programming Languages

    The Program Dependence Graph (PDG) has achieved widespread acceptance as a useful tool for software engineering, program analysis, and automated compiler optimizations. This thesis presents the Sparse Object Oriented Program Dependence Graph (SOOPDG), a formalism that contains elements of traditional PDGs adapted to compactly represent programs written in object-oriented languages such as Java. This formalism is called sparse because, in contrast to other OO and Java-specific adaptations of PDGs, it introduces few node types and no new edge types beyond those used in traditional dependence-based representations. This results in correct program representations using smaller graph structures and simpler semantics when compared to other OO formalisms. We introduce the Single Flow to Use (SFU) property, which requires that exactly one definition of each variable be available for each use. We demonstrate that the SOOPDG, with its support for the SFU property coupled with a higher-order rewriting semantics, is sufficient to represent static Java-like programs and dynamic program behavior. We present algorithms for creating SOOPDG representations from program text and describe graph rewriting semantics. We also present algorithms for common static analysis techniques such as program slicing, inheritance analysis, and call chain analysis. We contrast the SOOPDG with two previously published OO graph structures, the Java System Dependence Graph and the Java Software Dependence Graph. The SOOPDG results in comparatively smaller static representations of programs, cleaner graph semantics, and potentially more accurate program analysis. Finally, we introduce the Simulation Dependence Graph (SDG), a related representation developed specifically to represent simulation systems but extensible to more general component-based software design paradigms. The SDG allows formal reasoning about issues such as component composition, a property critical to the creation and analysis of complex simulation systems and component-based design systems.
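
    As one example of the analyses mentioned, program slicing over any dependence graph reduces to backward reachability along dependence edges. A minimal, representation-agnostic Python sketch follows; the node names and adjacency encoding are illustrative, not the SOOPDG's actual structure.

        def backward_slice(pdg, criterion):
            """pdg maps each node to the nodes it depends on."""
            sl, work = set(), [criterion]
            while work:
                n = work.pop()
                if n in sl:
                    continue
                sl.add(n)
                work.extend(pdg.get(n, ()))   # follow dependences backward
            return sl

        # z depends on y, y depends on x: slicing on z pulls in all three
        g = {'z': ['y'], 'y': ['x']}
        print(sorted(backward_slice(g, 'z')))   # ['x', 'y', 'z']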

    Sound and Precise Malware Analysis for Android via Pushdown Reachability and Entry-Point Saturation

    We present Anadroid, a static malware analysis framework for Android apps. Anadroid exploits two techniques to soundly raise precision: (1) it uses a pushdown system to precisely model dynamically dispatched interprocedural and exception-driven control flow; (2) it uses Entry-Point Saturation (EPS) to soundly approximate all possible interleavings of asynchronous entry points in Android applications. (It also integrates static taint-flow analysis and least-permissions analysis to expand the class of malicious behaviors it can catch.) Anadroid provides rich user-interface support for the human analysts who must ultimately rule on the "maliciousness" of a behavior. To demonstrate the effectiveness of Anadroid's malware analysis, we had teams of analysts analyze a challenge suite of 52 Android applications released as part of the Automated Program Analysis for Cybersecurity (APAC) DARPA program. The first team analyzed the apps using a version of Anadroid with the traditional (finite-state-machine-based) control-flow analysis found in existing malware analysis tools; the second team analyzed the apps using a version of Anadroid with our enhanced pushdown-based control-flow analysis. We measured machine analysis time, human analyst time, and accuracy in flagging malicious applications. With pushdown analysis, we found statistically significant (p < 0.05) decreases in time, from 85 minutes per app to 35 minutes per app in combined human and machine analysis time, and statistically significant (p < 0.05) increases in accuracy, from 71% correct identification to 95%. (Appears in the 3rd Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices (SPSM'13), Berlin, Germany, 2013.)
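
    Entry-point saturation can be pictured as a fixed-point loop: analyze each asynchronous entry point against the current abstract state, join the result back in, and repeat until nothing changes, so every interleaving is covered by the final state. The Python below is a speculative, much-simplified rendering of that loop; the fact representation and entry-point names are invented, and the real Anadroid works over a pushdown abstraction rather than flat fact sets.

        def saturate(entry_points, analyze, state):
            changed = True
            while changed:
                changed = False
                for ep in entry_points:
                    new = state | analyze(ep, state)   # join this entry point's effect
                    if new != state:
                        state, changed = new, True
            return state                               # sound for any interleaving order

        def analyze(ep, state):
            facts = {('reached', ep)}
            if ep == 'onClick' and ('reached', 'onCreate') in state:
                facts.add(('flow', 'onCreate -> onClick'))
            return facts

        print(sorted(saturate(['onClick', 'onCreate'], analyze, set())))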

    Algebraic optimization of recursive queries

    Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface and allow the formulation of complex recursive queries. However, they do not offer appropriate update facilities and do not support existing applications. To overcome these problems, an SQL-like interface is required besides the logic-based one.

    In the PRISMA project we have developed a tightly coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component, the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations.

    The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems.
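
    The two optimizations the abstract names, rewriting regular recursion into a transitive closure and pushing selections into that closure, can be sketched together in a few lines of Python: restricting the closure to the selected source vertices means irrelevant paths are never computed. The relation encoding is illustrative, not PRISMA's eXtended Relational Algebra.

        def closure_from(edges, sources):
            """Transitive closure restricted to paths starting in `sources`."""
            reach = {(s, t) for (s, t) in edges if s in sources}
            frontier = set(reach)
            while frontier:                            # semi-naive iteration
                step = {(s, t2) for (s, t1) in frontier
                                for (u, t2) in edges if u == t1} - reach
                reach |= step
                frontier = step
            return reach

        e = {('a', 'b'), ('b', 'c'), ('c', 'd'), ('x', 'y')}
        print(sorted(closure_from(e, {'a'})))
        # [('a','b'), ('a','c'), ('a','d')] -- paths from 'x' are never explored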