70 research outputs found

    An efficient, parametric fixpoint algorithm for analysis of java bytecode

    Get PDF
    Abstract interpretation has been widely used for the analysis of object-oriented languages and, in particular, Java source and bytecode. However, while most existing work deals with the problem of flnding expressive abstract domains that track accurately the characteristics of a particular concrete property, the underlying flxpoint algorithms have received comparatively less attention. In fact, many existing (abstract interpretation based—) flxpoint algorithms rely on relatively inefHcient techniques for solving inter-procedural caligraphs or are speciflc and tied to particular analyses. We also argüe that the design of an efficient fixpoint algorithm is pivotal to supporting the analysis of large programs. In this paper we introduce a novel algorithm for analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations. The algorithm is parametric -in the sense that it is independent of the abstract domain used and it can be applied to different domains as "plug-ins"-, multivariant, and flow-sensitive. Also, is based on a program transformation, prior to the analysis, that results in a highly uniform representation of all the features in the language and therefore simplifies analysis. Detailed descriptions of decompilation solutions are given and discussed with an example. We also provide some performance data from a preliminary implementation of the analysis

    A practical logic framework for verifying safety properties of executables

    Get PDF
    ManuscriptWe present a novel program logic, Lf , which is designed on top of a Hoare logic, but is simpler, more flexible and more scalable. Based on Lf , we develop a framework for automatically verifying safety properties of executables. It utilizes a whole-program interprocedural abstract interpretation to automatically discover the specifications needed by Lf to prove a program judgment. We implemented Lf and the framework in the HOL theorem prover

    PIPS Is not (just) Polyhedral Software Adding GPU Code Generation in PIPS

    No full text
    6 pagesInternational audienceParallel and heterogeneous computing are growing in audience thanks to the increased performance brought by ubiquitous manycores and GPUs. However, available programming models, like OPENCL or CUDA, are far from being straightforward to use. As a consequence, several automated or semi-automated approaches have been proposed to automatically generate hardware-level codes from high-level sequential sources. Polyhedral models are becoming more popular because of their combination of expressiveness, compactness, and accurate abstraction of the data-parallel behaviour of programs. These models provide automatic or semi-automatic parallelization and code transformation capabilities that target such modern parallel architectures. PIPS is a quarter-century old source-to-source transformation framework that initially targeted parallel machines but then evolved to include other targets. PIPS uses abstract interpretation on an integer polyhedral lattice to represent program code, allowing linear relation analysis on integer variables in an interprocedural way. The same representation is used for the dependence test and the convex array region analysis. The polyhedral model is also more classically used to schedule code from linear constraints. In this paper, we illustrate the features of this compiler infrastructure on an hypothetical input code, demonstrating the combination of polyhedral and non polyhedral transformations. PIPS interprocedural polyhedral analyses are used to generate data transfers and are combined with non-polyhedral transformations to achieve efficient CUDA code generation

    Software similarity and classification

    Full text link
    This thesis analyses software programs in the context of their similarity to other software programs. Applications proposed and implemented include detecting malicious software and discovering security vulnerabilities

    An efficient, parametric fixpoint algorithm for incremental analysis of java bytecode

    Get PDF
    Abstract interpretation has been widely used for the analysis of object-oriented languages and, more precisely, Java source and bytecode. However, while most of the existing work deals with the problem of finding expressive abstract domains that track accurately the characteristics of a particular concrete property, the underlying fixpoint algorithms have received comparatively less attention. In fact, many existing (abstract interpretation based) fixpoint algorithms rely on relatively inefficient techniques to solve inter-procedural call graphs or are specific and tied to particular analyses. We argue that the design of an efficient fixpoint algorithm is pivotal to support the analysis of large programs. In this paper we introduce a novel algorithm for analysis of Java bytecode which includes a number of optimizations in order to reduce the number of iterations. Also, the algorithm is parametric in the sense that it is independent of the abstract domain used and it can be applied to different domains as "plug-ins". It is also incremental in the sense that, if desired, analysis data can be saved so that only a reduced amount of reanalysis is needed after a small program change, which can be instrumental for large programs. The algorithm is also multivariant and flowsensitive. Finally, another interesting characteristic of the algorithm is that it is based on a program transformation, prior to the analysis, that results in a highly uniform representation of all the features in the language and therefore simplifies analysis. Detailed descriptions of decompilation solutions are provided and discussed with an example

    Data and Process Abstraction in PIPS Internal Representation

    No full text
    7 pagesInternational audiencePIPS, a state-of-the-art, source-to-source compilation and optimization platform, has been under development at MINES Paris-Tech since 1988, and its development is still running strong. Initially designed to perform automatic interprocedural parallelization of Fortran 77 programs, PIPS has been extended over the years to compile HPF (High Performance Fortran), C and Fortran 95 programs. Written in C, the PIPS framework has shown to be surprisingly resilient, and its analysis and transformation phases have been reused, adapted and extended to new targets, such as generating code for special purpose hardware accelerators, without requiring significant re-engineering of its core structure. We suggest that one of the key features that explain this adaptability is the PIPS internal representation (IR) which stores an abstract syntax tree. Although fit for source-to-source processing, PIPS IR emphasized from its origins the use of maximum abstraction over target languages' specificities and generic data structure manipulation services via the Newgen Domain Specific Language, which provides key features such as type building, automatic serialization and powerful iterators. The state of software technology has significantly advanced over the last 20 years and many of the pioneering features introduced by Newgen are nowadays present in modern programming frameworks. However, we believe that the methodology used to design PIPS IR, and presented in this paper, remains relevant today and could be put to good use in future compilation platform development projects

    Get rid of inline assembly through verification-oriented lifting

    Full text link
    Formal methods for software development have made great strides in the last two decades, to the point that their application in safety-critical embedded software is an undeniable success. Their extension to non-critical software is one of the notable forthcoming challenges. For example, C programmers regularly use inline assembly for low-level optimizations and system primitives. This usually results in driving state-of-the-art formal analyzers developed for C ineffective. We thus propose TInA, an automated, generic, trustable and verification-oriented lifting technique turning inline assembly into semantically equivalent C code, in order to take advantage of existing C analyzers. Extensive experiments on real-world C code with inline assembly (including GMP and ffmpeg) show the feasibility and benefits of TInA
    • …
    corecore