117 research outputs found

    Heap Abstractions for Static Analysis

    Full text link
    Heap data is potentially unbounded and seemingly arbitrary. As a consequence, unlike stack and static memory, heap memory cannot be abstracted directly in terms of a fixed set of source variable names appearing in the program being analysed. This makes it an interesting topic of study and there is an abundance of literature employing heap abstractions. Although most studies have addressed similar concerns, their formulations and formalisms often seem dissimilar and some times even unrelated. Thus, the insights gained in one description of heap abstraction may not directly carry over to some other description. This survey is a result of our quest for a unifying theme in the existing descriptions of heap abstractions. In particular, our interest lies in the abstractions and not in the algorithms that construct them. In our search of a unified theme, we view a heap abstraction as consisting of two features: a heap model to represent the heap memory and a summarization technique for bounding the heap representation. We classify the models as storeless, store based, and hybrid. We describe various summarization techniques based on k-limiting, allocation sites, patterns, variables, other generic instrumentation predicates, and higher-order logics. This approach allows us to compare the insights of a large number of seemingly dissimilar heap abstractions and also paves way for creating new abstractions by mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure

    Beyond reachability: Shape abstraction in the presence of pointer arithmetic

    Get PDF
    Abstract. Previous shape analysis algorithms use a memory model where the heap is composed of discrete nodes that can be accessed only via access paths built from variables and field names, an assumption that is violated by pointer arithmetic. In this paper we show how this assumption can be removed, and pointer arithmetic embraced, by using an analysis based on separation logic. We describe an abstract domain whose elements are certain separation logic formulae, and an abstraction mechanism that automatically transits between a low-level RAM view of memory and a higher, fictional, view that abstracts from the representation of nodes and multiword linked-lists as certain configurations of the RAM. A widening operator is used to accelerate the analysis. We report experimental results obtained from running our analysis on a number of classic algorithms for dynamic memory management.

    Precise static analysis of untrusted driver binaries

    Get PDF
    Most closed source drivers installed on desktop systems today have never been exposed to formal analysis. Without vendor support, the only way to make these often hastily written, yet critical programs accessible to static analysis is to directly work at the binary level. In this paper, we describe a full architecture to perform static analysis on binaries that does not rely on unsound external components such as disassemblers. To precisely calculate data and function pointers without any type information, we introduce Bounded Address Tracking, an abstract domain that is tailored towards machine code and is path sensitive up to a tunable bound assuring termination. We implemented Bounded Address Tracking in our binary analysis platform Jakstab and used it to verify API specifications on several Windows device drivers. Even without assumptions about executable layout and procedures as made by state of the art approaches, we achieve more precise results on a set of drivers from the Windows DDK. Since our technique does not require us to compile drivers ourselves, we also present results from analyzing over 300 closed source drivers

    Static Type Analysis by Abstract Interpretation of Python Programs

    Get PDF

    An abstract domain for objects in dynamic programming languages

    Get PDF
    Dynamic languages, such as JavaScript, PHP, Python or Ruby, provide a memory model for objects data structures allowing programmers to dynamically create, manipulate, and delete objects’ properties. Moreover, in dynamic languages it is possible to access and update properties by using strings: this represents a hard challenge for static analysis. In this paper, we exploit the finite state automata abstract domain, approximating strings, in order to define a novel abstract domain for objects. We design an abstract interpreter useful to analyze objects in a toy language, inspired by real-word dynamic programming languages. We then show, by means of minimal yet expressive examples, the precision of the proposed abstract domain

    A Sound Flow-Sensitive Heap Abstraction for the Static Analysis of Android Applications

    Get PDF
    The present paper proposes the first static analysis for Android applications which is both flow-sensitive on the heap abstraction and provably sound with respect to a rich formal model of the Android platform. We formulate the analysis as a set of Horn clauses defining a sound over-approximation of the semantics of the Android application to analyse, borrowing ideas from recency abstraction and extending them to our concurrent setting. Moreover, we implement the analysis in HornDroid, a state-of-the-art information flow analyser for Android applications. Our extension allows HornDroid to perform strong updates on heap-allocated data structures, thus significantly increasing its precision, without sacrificing its soundness guarantees. We test our implementation on DroidBench, a popular benchmark of Android applications developed by the research community, and we show that our changes to HornDroid lead to an improvement in the precision of the tool, while having only a moderate cost in terms of efficiency. Finally, we assess the scalability of our tool to the analysis of real applications

    Revisiting Recency Abstraction for JavaScript Towards an Intuitive, Compositional, and Efficient Heap Abstraction

    Get PDF
    International audienceJavaScript is one of the most widely used programming languages. To understand the behaviors of JavaScript programs and to detect possible errors in them, researchers have developed several static analyzers based on the abstract interpretation framework. However, JavaScript provides various language features that are difficult to analyze statically and precisely such as dynamic addition and removal of object properties, first-class property names, and higher-order functions. To alleviate the problem, JavaScript static analyzers often use recency abstraction, which refines address abstraction by distinguishing recent objects from summaries of old objects. We observed that while recency abstraction enables more precise analysis results by allowing strong updates on recent objects, it is not monotone in the sense that it does not preserve the precision relationship between the underlying address abstraction techniques: for an address abstraction A and a more precise abstraction B, recency abstraction on B may not be more precise than recency abstraction on A. Such an unintuitive semantics of recency abstraction makes its composition with various analysis sensitivity techniques also unintuitive. In this paper, we propose a new sin-gleton abstraction technique, which distinguishes singleton objects to allow strong updates on them without changing a given address abstraction. We formally define recency and singleton abstractions, and explain the unintuitive behaviors of recency abstraction. Our preliminary experiments show promising results for singleton abstraction

    Shape Analysis in the Absence of Pointers and Structure

    Get PDF
    discover properties of dynamic and/or mutable structures. We ask, “Is there an equivalent to shape analysis for purely functional programs, and if so, what ‘shapes ’ does it discover? ” By treating binding environments as dynamically allocated structures, by treating bindings as addresses, and by treating value environments as heaps, we argue that we can analyze the “shape ” of higher-order functions. To demonstrate this, we enrich an abstract-interpretive control-flow analysis with principles from shape analysis. In particular, we promote “anodization ” as a way to generalize both singleton abstraction and the notion of focusing, and we promote “binding invariants ” as the analog of shape predicates. Our analysis enables two optimizations known to be beyond the reach of control-flow analysis (globalization and super-ÎČ inlining) and one previously unknown optimization (higher-order rematerialization).
    • 

    corecore