94 research outputs found

    Field-sensitive unreachability and non-cyclicity analysis

    Field-sensitive static analyses of object-oriented code use approximations of the computational states where fields are taken into account, for better precision. This article presents a novel and sound definite analysis of Java bytecode that approximates two strictly related properties: field-sensitive unreachability between program variables and field-sensitive non-cyclicity of program variables. The latter exploits the former for better precision. We build a data-flow analysis based on constraint graphs, whose nodes are program points and whose arcs propagate information according to the semantics of each bytecode instruction. We follow abstract interpretation both to approximate the concrete semantics and to prove our results formally correct. Our analysis has been designed with the goal of improving client analyses such as termination analysis, asserting the non-cyclicity of variables with respect to specific fields
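    To make the field-sensitive notion of non-cyclicity concrete, the following is a minimal runtime sketch, not the static analysis described in the abstract. It assumes a hypothetical `ListNode` class with `next` and `prev` fields; all names are illustrative. A doubly linked list is cyclic when both fields are followed, yet acyclic along `next` alone, which is the kind of fact a termination analysis needs for a loop such as `while (n != null) n = n.next`.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical node type with two reference fields.
final class ListNode {
    ListNode next, prev;
}

final class CyclicitySketch {
    // True if a cycle is reachable from start when following only the selected fields.
    static boolean cyclic(ListNode start, boolean followNext, boolean followPrev) {
        Set<ListNode> onPath = Collections.newSetFromMap(new IdentityHashMap<>());
        Set<ListNode> done = Collections.newSetFromMap(new IdentityHashMap<>());
        return dfs(start, followNext, followPrev, onPath, done);
    }

    private static boolean dfs(ListNode n, boolean fn, boolean fp,
                               Set<ListNode> onPath, Set<ListNode> done) {
        if (n == null || done.contains(n)) return false;
        if (!onPath.add(n)) return true; // n is already on the current path: cycle
        boolean found = (fn && dfs(n.next, fn, fp, onPath, done))
                     || (fp && dfs(n.prev, fn, fp, onPath, done));
        onPath.remove(n);
        done.add(n);
        return found;
    }

    // Builds a doubly linked list with the given number of nodes (at least one).
    static ListNode doublyLinked(int length) {
        ListNode head = new ListNode(), cur = head;
        for (int i = 1; i < length; i++) {
            ListNode n = new ListNode();
            cur.next = n;
            n.prev = cur;
            cur = n;
        }
        return head;
    }

    public static void main(String[] args) {
        ListNode head = doublyLinked(3);
        System.out.println("along next only: " + cyclic(head, true, false));
        System.out.println("along next and prev: " + cyclic(head, true, true));
    }
}
```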

    A General Framework for Constraint-Based Static Analyses of Java Bytecode Programs

    The present thesis introduces a generic parameterized framework for static analysis of Java bytecode programs, based on constraint generation and solving. The framework handles exceptional flows inside the program and the side effects induced by calls to non-pure methods. It is generic in the sense that different instantiations of its parameters give rise to different static analyses, each capturing memory-related properties of program variables at each program point.
Different properties of interest are represented as abstract domains, and therefore the static analyses defined inside the framework are abstract interpretation-based. The framework can be used to generate possible (may) approximations of the property of interest, as well as definite (must) approximations of that property. In the former case, the result of the static analysis is an over-approximation of what might be true at a given program point; in the latter, it is an under-approximation. This thesis provides a set of conditions that different instantiations of the framework's parameters must satisfy in order to yield a sound static analysis. When an instantiation satisfies these conditions, the framework guarantees that the corresponding static analysis is sound. This means that the designer of a novel static analysis only needs to show that the instantiated parameters satisfy the conditions provided by the framework. In this way the framework simplifies soundness proofs: instead of showing that the overall analysis is sound, it is enough to show that the instantiation describing the analysis satisfies the conditions mentioned above. This is a very important feature of the present approach. The thesis then introduces two novel static analyses dealing with memory-related properties: the Possible Reachability Analysis Between Program Variables and the Definite Expression Aliasing Analysis. The former is an example of a possible analysis. It determines, for each program point p, the ordered pairs (v, w) of variables available at p such that v might reach w at p, i.e., such that starting from v it is possible to follow a path of memory locations that leads to the object bound to w.
The latter analysis is an example of a definite analysis. It determines, for each program point p and each variable v available at that point, a set of expressions that are always aliased to v at p. Both analyses have been formalized and proved sound by using the theoretical results of the framework. They have also been implemented inside the Julia tool (www.juliasoft.com), a static analyzer for Java and Android. Experimental evaluation on real-life benchmarks shows that the precision of Julia's principal checkers (the nullness and termination checkers) increased compared to the previous version of Julia, in which these two analyses were not implemented. Moreover, the evaluation showed that the reachability analysis actually decreased the total run-time of Julia. The aliasing analysis, on the other hand, takes more time, but the number of possible warnings produced by the principal checkers drastically decreased

    Inferring Complete Initialization of Arrays

    We define an automaton-based abstract interpretation of a trace semantics which identifies loops that definitely initialize all elements of an array to values satisfying a given property, a useful piece of information for the static analysis of Java-like languages. This results in a completely automatic and efficient analysis, that does not use manual code annotations. We give a formal proof of correctness that considers aspects such as side-effects of method calls. We show how the identification of those loops can be lifted to global invariants about the contents of elements of fields of array type, that hold everywhere in the code where those elements are accessed. This makes our work more significant and useful for the static analysis of real programs. The implementation of our analysis inside the Julia analyzer is both efficient and precise
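    As an illustration of the property being inferred (not of the automaton-based analysis itself), the sketch below contrasts a loop that definitely writes every element of an array with one that does not. The class and method names are hypothetical, and the property chosen here, non-nullness, is only one example of "a given property".

```java
final class ArrayInitSketch {
    // Every index 0..length-1 is written, so all elements end up non-null.
    static String[] fillAll(int n) {
        String[] a = new String[n];
        for (int i = 0; i < a.length; i++) {
            a[i] = "item" + i;
        }
        return a;
    }

    // Only even indices are written, so odd elements keep their default null.
    static String[] fillEven(int n) {
        String[] a = new String[n];
        for (int i = 0; i < a.length; i += 2) {
            a[i] = "item" + i;
        }
        return a;
    }

    // Runtime check of the property that a static analysis would establish for fillAll.
    static boolean allNonNull(String[] a) {
        for (String s : a) {
            if (s == null) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("fillAll:  " + allNonNull(fillAll(5)));
        System.out.println("fillEven: " + allNonNull(fillEven(5)));
    }
}
```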

    Deriving object typestates in the presence of inter-object references

    Efficient Set Sharing Using ZBDDs

    Set sharing is an abstract domain in which each concrete object is represented by the set of local variables from which it might be reachable. It is a useful abstraction to detect parallelism opportunities, since it contains definite information about which variables do not share in memory, i.e., about when the memory regions reachable from those variables are disjoint. Set sharing is a more precise alternative to pair sharing, in which each domain element is a set of all pairs of local variables from which a common object may be reachable. However, the exponential complexity of some set sharing operations has limited its wider application. This work introduces an efficient implementation of the set sharing domain using Zero-suppressed Binary Decision Diagrams (ZBDDs). Because ZBDDs were designed to represent sets of combinations (i.e., sets of sets), they naturally represent elements of the set sharing domain. We show how to synthesize the operations needed in the set sharing transfer functions from basic ZBDD operations. For some of the operations, we devise custom ZBDD algorithms that perform better in practice. We also compare our implementation of the abstract domain with an efficient, compact, bit set-based alternative, and show that the ZBDD version scales better in terms of both memory usage and running time
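    The difference in precision between set sharing and pair sharing can be sketched on a toy heap. The code below is an assumption-laden illustration: the heap is given as a map from variable names to the ids of the objects reachable from them, the names `SharingSketch`, `setSharing`, and `pairSharing` are hypothetical, and plain Java sets are used rather than ZBDDs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class SharingSketch {
    // Set sharing: each object is represented by the set of variables that reach it.
    static Set<Set<String>> setSharing(Map<String, Set<Integer>> reach) {
        Map<Integer, Set<String>> byObject = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : reach.entrySet()) {
            for (Integer obj : e.getValue()) {
                byObject.computeIfAbsent(obj, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        return new HashSet<>(byObject.values());
    }

    // Pair sharing: all pairs of distinct variables that reach a common object.
    static Set<Set<String>> pairSharing(Map<String, Set<Integer>> reach) {
        Set<Set<String>> pairs = new HashSet<>();
        for (Set<String> group : setSharing(reach)) {
            for (String a : group) {
                for (String b : group) {
                    if (a.compareTo(b) < 0) pairs.add(new TreeSet<>(List.of(a, b)));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Heap A: one object reachable from x, y and z at once.
        Map<String, Set<Integer>> a = Map.of("x", Set.of(1), "y", Set.of(1), "z", Set.of(1));
        // Heap B: three objects, each shared by exactly two variables.
        Map<String, Set<Integer>> b = Map.of("x", Set.of(1, 3), "y", Set.of(1, 2), "z", Set.of(2, 3));
        System.out.println("same pair sharing: " + pairSharing(a).equals(pairSharing(b)));
        System.out.println("same set sharing:  " + setSharing(a).equals(setSharing(b)));
    }
}
```

    The two heaps are indistinguishable under pair sharing but have different set sharing elements, which is the extra precision the abstract refers to. The explicit set-of-sets representation is also where the exponential cost comes from.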

    General Declarative Must-Alias Analysis

    Most published pointer analysis algorithms are may-analyses: they over-approximate aliasing or points-to relations. Must-alias analyses are more rarely studied but offer attractive benefits for optimization and program understanding. In this thesis we give a declarative model of a rich family of must-alias analyses. Although other specifications of must-alias algorithms exist in the literature, our emphasis is on modeling and exposing the key points where the algorithm can adjust its trade-off between inference power and scalability. Furthermore, we show that our model can be easily extended to also incorporate a must-point-to analysis. Our model is executable, in the Datalog language, and forms the basis for a full-fledged must-alias analysis of Java bytecode. We discuss insights on configuring a must-alias analysis and quantify the impact of design decisions on large Java benchmarks

    Locking Discipline Inference and Checking

    Concurrency is a requirement for much modern software, but the implementation of multithreaded algorithms comes at the risk of errors such as data races. Programmers can prevent data races by documenting and obeying a locking discipline, which indicates which locks must be held in order to access which data. This paper introduces a formal semantics for locking specifications that gives a guarantee of race freedom. The paper also provides two implementations of the formal semantics for the Java language: one based on abstract interpretation and one based on type theory. To the best of our knowledge, these are the first tools that can soundly infer and check a locking discipline for Java. Our experiments compare the implementations with one another and with annotations written by programmers

    Semantics for Locking Specifications

    Lock-based synchronization disciplines, like Java's @GuardedBy, are widely used to prevent concurrency errors. However, their semantics is often expressed informally and is consequently ambiguous. This article highlights such ambiguities and overcomes them by formalizing two possible semantics of @GuardedBy, using a reference operational semantics for a core calculus of a concurrent Java-like language. It also identifies when such annotations are actual guarantees against data races. Our work aids in understanding the annotations and supports the development of sound tools that verify or infer them
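    One ambiguity that an informal reading leaves open can be sketched as follows: does the annotation protect uses of the field's name, or the object the field refers to? This example is an illustration under stated assumptions and is not taken from the article. The `GuardedBy` annotation is declared locally as a hypothetical stand-in so the code compiles without any external library, and `Registry` is an invented class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a @GuardedBy annotation; it has no checker behind it.
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.FIELD)
@interface GuardedBy {
    String value();
}

final class Registry {
    @GuardedBy("this")
    private final List<String> items = new ArrayList<>();

    synchronized void add(String s) { items.add(s); }

    synchronized int size() { return items.size(); }

    // The name `items` is used only while the lock is held, yet the list itself escapes.
    synchronized List<String> view() { return items; }
}

final class GuardedBySketch {
    public static void main(String[] args) {
        Registry r = new Registry();
        r.add("a");
        List<String> alias = r.view();
        alias.add("b"); // mutates the guarded list without holding r's lock
        System.out.println(r.size());
    }
}
```

    Every syntactic use of `items` happens under the lock, so a reading that protects the name is satisfied, while a reading that protects the referenced object is violated by the unsynchronized `alias.add`.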