39 research outputs found

    Using abstract interpretation to add type checking for interfaces in Java bytecode verification

    Java interface types support multiple inheritance. Because of this, the standard bytecode verifier ignores them: it cannot model the class hierarchy as a lattice, so type checks on interfaces are deferred to run time. We propose a verification methodology that removes the need for these run-time checks. The methodology consists of: (1) an augmented verifier, very similar to the standard one, that can also check interface types in most cases; and (2) for the remaining cases, a set of additional, simpler verifiers, each one specialized for a single interface type. We obtain these verifiers in a systematic way by using abstract interpretation techniques. Finally, we describe an implementation of the methodology and evaluate it on a large set of benchmarks.
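
    To make the lattice problem concrete, here is a minimal, hypothetical Java sketch (the class and interface names are ours, not the paper's): two unrelated classes implement the same two interfaces, so the control-flow merge below has no unique least upper bound among reference types. The standard verifier therefore types the merged value as java/lang/Object and relies on a run-time check at the invokeinterface instruction.

```java
// Illustrative only: why interface types break the lattice model.
interface Movable  { void move(); }
interface Drawable { void draw(); }

class Circle implements Movable, Drawable {
    public void move() { System.out.println("Circle.move"); }
    public void draw() { System.out.println("Circle.draw"); }
}

class Square implements Movable, Drawable {
    public void move() { System.out.println("Square.move"); }
    public void draw() { System.out.println("Square.draw"); }
}

public class MergeDemo {
    public static void main(String[] args) {
        // At this merge the value is either a Circle or a Square. Both
        // Movable and Drawable are equally good types for it, so there is
        // no single least upper bound: with interfaces, the hierarchy is
        // not a lattice.
        Movable m = args.length > 0 ? new Circle() : new Square();
        // The standard verifier types the merged slot as Object and lets
        // invokeinterface verify at run time that the receiver really
        // implements Movable; the paper's augmented verifier checks this
        // statically in most cases.
        m.move();
    }
}
```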

    Converting Java programs to use generic libraries

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (leaves 121-127). By Alan A. A. Donovan. Java 1.5 will include a type system (called JSR-14) that supports parametric polymorphism, or generic classes. This will bring many benefits to Java programmers, not least because current Java practice makes heavy use of logically generic classes, including container classes. Translation of Java source code into semantically equivalent JSR-14 source code requires two steps: parameterisation (adding type parameters to class definitions) and instantiation (adding the type arguments at each use of a parameterised class). Parameterisation need be done only once per class, whereas instantiation must be performed for each client, of which there are potentially many more. Therefore, this work focuses on the instantiation problem. We present a technique to determine sound and precise JSR-14 types at each use of a class for which a generic type specification is available. Our approach uses a precise and context-sensitive pointer analysis to determine possible types at allocation sites, and a set-constraint-based analysis (incorporating guarded, or conditional, constraints) to choose consistent types for both allocation and declaration sites. The technique safely handles all features of the JSR-14 type system, notably raw types (which provide backward compatibility) and 'unchecked' operations on them. We have implemented our analysis in a tool that automatically inserts type arguments into Java code, and we report its performance when applied to a number of real-world Java programs.
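
    A hypothetical before/after sketch of the instantiation step the thesis automates (the variable names are illustrative): type arguments are inferred and inserted at the allocation and declaration sites of raw-typed client code, making the downstream casts redundant.

```java
import java.util.ArrayList;
import java.util.List;

public class InstantiationDemo {
    public static void main(String[] args) {
        // Raw (pre-generics) client code: legal, but every retrieval
        // needs an unchecked cast.
        List raw = new ArrayList();          // allocation site, raw type
        raw.add("hello");
        String s = (String) raw.get(0);      // cast unchecked by the type system

        // The instantiation such a tool would infer and insert: explicit
        // type arguments at both the declaration and the allocation site,
        // after which the cast is no longer needed.
        List<String> typed = new ArrayList<String>();
        typed.add("hello");
        String t = typed.get(0);             // statically typed, no cast

        System.out.println(s + " " + t);
    }
}
```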

    Computing stack maps with interfaces

    Lightweight bytecode verification uses stack maps to annotate Java bytecode programs with type information, reducing verification to type checking. This paper describes an improved bytecode analyser together with algorithms for optimizing the generated stack maps. The analyser is simplified in its treatment of base values (keeping only the information necessary to ensure memory safety) and enriched in its representation of interface types, using the Dedekind-MacNeille completion technique. The computed interface information makes it possible to remove the dynamic checks at interface method invocations. We prove the memory-safety property guaranteed by the bytecode verifier using an operational semantics whose distinguishing feature is the use of untagged 32-bit values. For bytecode typable without sets of types, we show how to prune the fixpoint to obtain a stack map that can be checked without computing with sets of interfaces; i.e., lightweight verification is not made more complex or costly. Experiments on three substantial test suites show that stack maps can be computed and correctly pruned by an optimized (but incomplete) pruning algorithm.
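
    A small Java illustration of why the enriched representation helps (the types here are our choice, not the paper's examples): String and Integer share no informative common superclass, yet both implement Comparable and Serializable. A Dedekind-MacNeille-style completion lets the analyser record that set of interfaces for the merged value instead of collapsing it to Object, which would force a dynamic check before any interface call.

```java
public class SetTypeDemo {
    // A merge whose best reference type is not a single class: the least
    // common superclass of String and Integer is Object, yet both types
    // implement Comparable and Serializable.
    static Comparable<?> pick(boolean flag) {
        // Source-level Java forces us to pick one interface as the static
        // type; at the bytecode level, the enriched analyser described in
        // the paper would record the set {Comparable, Serializable} for
        // the merged value, justifying invokeinterface on either interface
        // without a dynamic check.
        return flag ? "abc" : Integer.valueOf(42);
    }

    public static void main(String[] args) {
        System.out.println(pick(args.length > 0));
    }
}
```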

    Analysis and optimization of task granularity on the Java virtual machine

    Task granularity, i.e., the amount of work performed by parallel tasks, is a key performance attribute of parallel applications. On the one hand, fine-grained tasks (i.e., small tasks carrying out few computations) may introduce considerable parallelization overheads. On the other hand, coarse-grained tasks (i.e., large tasks performing substantial computations) may not fully utilize the available CPU cores, leading to missed parallelization opportunities. We focus on task-parallel applications running in a single Java Virtual Machine on a shared-memory multicore. Although their performance may depend considerably on the granularity of their tasks, this topic has received little attention in the literature. Our work fills this gap, analyzing and optimizing the task granularity of such applications. In this dissertation, we present a new methodology to accurately and efficiently collect the granularity of each executed task, implemented in a novel profiler. Our profiler collects carefully selected metrics from the whole system stack with low overhead. Our tool helps developers locate performance and scalability problems, and identifies the classes and methods where optimizations related to task granularity are needed, guiding developers towards useful optimizations. Moreover, we introduce a novel technique that drastically reduces the overhead of task-granularity profiling by reifying the class hierarchy of the target application within a separate instrumentation process. Our approach allows the instrumentation process to instrument only the classes representing tasks, inserting more efficient instrumentation code that decreases the overhead of task detection. Our technique significantly speeds up task-granularity profiling and thus enables the collection of accurate metrics with low overhead. We use our novel techniques to analyze task granularity in the DaCapo, ScalaBench, and Spark Perf benchmark suites. We reveal inefficiencies related to fine-grained and coarse-grained tasks in several workloads. We demonstrate that the collected task-granularity profiles are actionable by optimizing task granularity in numerous benchmarks, performing optimizations in the classes and methods indicated by our tool. Our optimizations result in significant speedups (up to a factor of 5.90x) in numerous workloads suffering from fine- and coarse-grained tasks in different environments. Our results highlight the importance of analyzing and optimizing task granularity on the Java Virtual Machine.
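
    The trade-off can be sketched with a standard fork/join parallel sum (our example, not the dissertation's profiler): the THRESHOLD constant below is the granularity knob. Set it too low and the program spawns many fine-grained tasks with high overhead; set it too high and it creates too few coarse-grained tasks to keep all cores busy.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class GranularityDemo extends RecursiveTask<Long> {
    static final int THRESHOLD = 10_000;   // tuning knob: work per task

    final long[] data;
    final int lo, hi;

    GranularityDemo(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override protected Long compute() {
        if (hi - lo <= THRESHOLD) {        // coarse enough: compute directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;         // otherwise split into two subtasks
        GranularityDemo left  = new GranularityDemo(data, lo, mid);
        GranularityDemo right = new GranularityDemo(data, mid, hi);
        left.fork();                       // run left half asynchronously
        return right.compute() + left.join();
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        Arrays.fill(data, 1);
        long sum = ForkJoinPool.commonPool()
                .invoke(new GranularityDemo(data, 0, data.length));
        System.out.println(sum);           // prints 1000000
    }
}
```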

    An analysis of type checks in the Java Virtual Machine

    A program written in the Java language is executed by a virtual machine called the Java Virtual Machine. The development of this language over the years, especially in the Web domain, has imposed strict security models, one of which is the verification of the program's bytecode. The Interface construct of the Java language raises problems for bytecode verification, since it offers the programmer an alternative to multiple inheritance, which Java does not permit. Some of the type checks to be performed on interfaces cannot be carried out by the verifier and must be deferred to execution time, where they are performed by the Java interpreter. This solution, however, causes some slowdown in the interpreter's performance. The purpose of this thesis is to identify the run-time checks performed on interfaces, eliminate them, and run a series of tests to quantify the performance gain that can thereby be obtained.
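
    A rough microbenchmark sketch of the effect the thesis sets out to quantify (our code, not the thesis's test suite): comparing calls dispatched through an interface type, which compile to invokeinterface and carry the run-time interface check, against calls dispatched through a class type, which compile to invokevirtual.

```java
public class DispatchBench {
    interface Op { int apply(int x); }
    static class Inc implements Op {
        public int apply(int x) { return x + 1; }
    }

    public static void main(String[] args) {
        Op viaInterface = new Inc();   // calls compile to invokeinterface
        Inc viaClass = new Inc();      // calls compile to invokevirtual
        int a = 0, b = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < 50_000_000; i++) a = viaInterface.apply(a);
        long t1 = System.nanoTime();
        for (int i = 0; i < 50_000_000; i++) b = viaClass.apply(b);
        long t2 = System.nanoTime();

        // Caution: a JIT compiler will often erase the difference entirely;
        // a meaningful measurement needs a proper harness (e.g., JMH) and,
        // for the interpreter-level checks the thesis targets, an
        // interpreter-only run.
        System.out.printf("interface: %d ms, virtual: %d ms (%d, %d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a, b);
    }
}
```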

    Increasing the Performance and Predictability of the Code Execution on an Embedded Java Platform

    This thesis explores the execution of object-oriented code on an embedded Java platform. It presents established approaches, and derives new ones, for the implementation of high-level object-oriented functionality and commonly expected system services. The goal of the developed techniques is to provide the architectural base for efficient and predictable code execution. The research vehicle of this thesis is the Java-programmed SHAP platform, which consists of a platform tool chain and the highly customizable SHAP bytecode processor. SHAP offers a fully operational embedded CLDC environment, in which the proposed techniques have been implemented, verified, and evaluated. Two strands are followed to achieve the goal of this thesis. First, the sequential execution of bytecode is optimized through the joint effort of an optimizing offline linker and an on-chip application loader. Additionally, SHAP pioneers a reference-coloring mechanism that enables constant-time interface method dispatch without the backing of a large, sparse dispatch table. Second, this thesis explores the implementation of essential system services within designated concurrent hardware modules. This effort is necessary to decouple the computational progress of the user application from the interference induced by time-sharing software implementations of these services. The concrete contributions comprise a spill-free on-chip stack, a predictable method cache, and a concurrent garbage collector. Each approach is described and evaluated after the relevant state of the art has been reviewed. This review is not limited to preceding small embedded approaches but also includes techniques that have proven successful on larger-scale platforms; conversely, the chances that those platforms may benefit from the techniques developed for SHAP are also discussed.
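
    A software model of the coloring idea (a conceptual sketch only: SHAP realizes this in hardware, and the slot layout here is invented for illustration): interfaces implemented together by some class receive distinct colors, so each class keeps a small dense table indexed by color, and dispatch becomes a single indexed load rather than a search through a large, sparse global table.

```java
public class ColoringModel {
    // Colors chosen by a (here hard-coded) coloring pass: interfaces that
    // some class implements together must receive different colors.
    static final int MOVABLE = 0, DRAWABLE = 1;

    interface MethodTable { void invoke(String method); }

    // Per-class itable: slot i holds the method table for whichever
    // interface of color i the class implements (dense, not sparse).
    static class ClassInfo {
        final MethodTable[] byColor;
        ClassInfo(int colors) { byColor = new MethodTable[colors]; }
    }

    public static void main(String[] args) {
        ClassInfo circle = new ClassInfo(2);
        circle.byColor[MOVABLE]  = m -> System.out.println("Circle." + m);
        circle.byColor[DRAWABLE] = m -> System.out.println("Circle." + m);

        // Constant-time dispatch: one indexed load per call, no search.
        circle.byColor[MOVABLE].invoke("move");
        circle.byColor[DRAWABLE].invoke("draw");
    }
}
```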

    Security analyses for detecting deserialisation vulnerabilities: a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

    An important task in software security is to identify potential vulnerabilities. Attackers exploit security vulnerabilities in systems to obtain confidential information, to breach system integrity, and to make systems unavailable to legitimate users. In recent years, particularly in 2012, there has been a rise in reported Java vulnerabilities. One type of vulnerability involves (de)serialisation, a commonly used feature for storing objects or data structures in an external format and restoring them. In 2015, a deserialisation vulnerability was reported involving Apache Commons Collections, a popular Java library, which affected numerous Java applications. Another major deserialisation-related vulnerability, affecting 55% of Android devices, was reported in the same year. Both of these vulnerabilities allowed malicious users to execute arbitrary code on vulnerable systems, a serious risk, and they came as a call for the Java community to issue patches fixing serialisation-related vulnerabilities in both the Java Development Kit and libraries. Despite attention to coding guidelines and defensive strategies, deserialisation remains a risky feature and a potential weakness in object-oriented applications. In fact, deserialisation-related vulnerabilities (both denial-of-service and remote code execution) continue to be reported for Java applications. Further, deserialisation is a case of parsing, where external data is parsed from its external representation into a program's internal data structures; hence, potentially similar vulnerabilities can be present in parsers for file formats and serialisation languages. The problem is, given a software package, to detect either injection or denial-of-service vulnerabilities and to propose strategies to prevent attacks that exploit them. The research reported in this thesis casts the detection of deserialisation-related vulnerabilities as a program-analysis task. The goal is to automatically discover this class of vulnerabilities using program-analysis techniques, and to experimentally evaluate the efficiency and effectiveness of the proposed methods on real-world software. We use multiple techniques to detect reachability to sensitive methods, and taint analysis to detect whether untrusted user input can result in security violations. Challenges in using program analysis for detecting deserialisation vulnerabilities include addressing soundness issues in analysing dynamic features in Java (e.g., native code). Another hurdle is that available techniques mostly target the analysis of applications rather than library code. In this thesis, we develop techniques to address soundness issues related to analysing Java code that uses serialisation, and we adapt dynamic techniques such as fuzzing to address precision issues in the results of our analysis. We also use the results of our analysis to study libraries in other languages and check whether they are vulnerable to deserialisation-type attacks. We then discuss mitigation measures engineers can take to protect their software against such vulnerabilities. In our experiments, we show that we can find unreported vulnerabilities in Java code, and that these vulnerabilities are also present in widely used serialisers for popular languages such as JavaScript, PHP, and Rust. In our study, we discovered previously unknown denial-of-service security bugs in applications and libraries that parse external data formats such as YAML, PDF, and SVG.
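
    One concrete mitigation in the direction the thesis discusses, assuming Java 9 or later (the whitelisted package name below is illustrative): serialisation filters reject unexpected classes in the stream before their deserialisation logic can run, cutting off gadget-chain attacks of the Apache Commons Collections kind.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class SafeDeserialise {
    static Object readTrusted(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            // Whitelist filter (JEP 290): allow java.lang.* and one
            // application package (com.example.model is a placeholder),
            // reject every other class before it is instantiated.
            ois.setObjectInputFilter(ObjectInputFilter.Config.createFilter(
                    "java.lang.*;com.example.model.*;!*"));
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Round-trip a harmless String to show the filter admitting it;
        // an object of any non-whitelisted class would be rejected with
        // an InvalidClassException instead.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject("ok");
        }
        System.out.println(readTrusted(bos.toByteArray()));
    }
}
```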

    The construction of high-performance virtual machines for dynamic languages

    Dynamic languages, such as Python and Ruby, have become more widely used over the past decade. Despite this, the standard virtual machines for these languages have disappointing performance. These virtual machines are slow not because methods for achieving better performance are unknown, but because their implementation is hard. What makes the implementation of high-performance virtual machines difficult is not that they are large pieces of software, but that there are fundamental and complex interdependencies between their components. In order to work together correctly, the interpreter, just-in-time compiler, garbage collector, and library must all conform to the same precise low-level protocols. In this dissertation I describe a method for constructing virtual machines for dynamic languages, and explain how to design a virtual machine toolkit by building it around an abstract machine. The design and implementation of such a toolkit, the Glasgow Virtual Machine Toolkit (GVMT), is described. The GVMT automatically generates a just-in-time compiler, integrates precise garbage collection into the virtual machine, and automatically manages the complex interdependencies between all the virtual machine components. Two different virtual machines have been constructed using the GVMT. The first is a minimal implementation of Scheme, built in under three weeks to demonstrate that toolkits like the GVMT enable the easy construction of virtual machines. The second, the HotPy VM for Python, is a high-performance virtual machine; it demonstrates that a virtual machine built with a toolkit can be fast and that the use of a toolkit does not overly constrain the high-level design. Evaluation shows that HotPy outperforms the standard Python interpreter, CPython, by a large margin, and performs on a par with PyPy, the fastest Python VM currently available.
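
    The flavour of the problem shows even in a toy interpreter loop (our sketch; GVMT itself generates interpreters, a JIT compiler, and GC integration from a single description): every VM component must agree on conventions such as the operand-stack layout used here, which is exactly the kind of low-level protocol the dissertation says makes hand-built VMs hard.

```java
// A toy stack-based bytecode interpreter, illustrative only.
public class ToyVM {
    static final int PUSH = 0, ADD = 1, PRINT = 2, HALT = 3;

    static void run(int[] code) {
        int[] stack = new int[64];   // operand stack: its layout is a
        int sp = 0;                  // protocol a JIT and GC must share
        int pc = 0;
        while (true) {
            switch (code[pc++]) {
                case PUSH:  stack[sp++] = code[pc++]; break;       // push literal
                case ADD:   stack[sp - 2] += stack[sp - 1]; sp--; break;
                case PRINT: System.out.println(stack[--sp]); break;
                case HALT:  return;
                default: throw new IllegalStateException("bad opcode");
            }
        }
    }

    public static void main(String[] args) {
        run(new int[] { PUSH, 2, PUSH, 40, ADD, PRINT, HALT });  // prints 42
    }
}
```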

    Software engineering perspectives on physiological computing

    Physiological computing is an interesting and promising concept for widening the communication channel between (human) users and computers, thus increasing software systems' contextual awareness and rendering them smarter than they are today. Using physiological inputs in pervasive computing systems allows re-balancing the information asymmetry between the human user and the computer system: while pervasive computing systems are well able to flood the user with information and sensory input (such as sounds, lights, and visual animations), users have only a very narrow input channel to computing systems, most of the time restricted to keyboards, mice, touchscreens, accelerometers, and GPS receivers (through smartphone usage, for example). Interestingly, this information asymmetry often forces users to submit to the quirks of the computing system to achieve their goals; for example, users may have to provide information the software system demands through a narrow, time-consuming input mode even though the system could sense it implicitly from the human body. Physiological computing is a way to circumvent these limitations; however, systematic means for developing and moulding physiological computing applications into software are still unknown. This thesis proposes a methodological approach to the creation of physiological computing applications that makes use of component-based software engineering. Components help impose a clear structure on software systems in general, and can thus be used for physiological computing systems as well. As an additional benefit, using components allows physiological computing systems to leverage reconfiguration as a means to control and adapt their own behaviour. This adaptation can be used to adjust the behaviour both to the human and to the available computing environment in terms of resources and devices, an activity that is crucial for complex physiological computing systems. With the help of components and reconfigurations, it is possible to structure the functionality of physiological computing applications in a way that makes them manageable and extensible, allowing a stepwise and systematic extension of a system's intelligence. Using reconfigurations entails a larger issue, however: understanding and fully capturing the behaviour of a system under reconfiguration is challenging, as the system may change its structure in ways that are difficult to fully predict. Therefore, this thesis also introduces a means for the formal verification of reconfigurations based on assume-guarantee contracts. With the proposed assume-guarantee contract framework, it is possible to prove that a given system design (including component behaviours and reconfiguration specifications) satisfies real-time properties expressed as assume-guarantee contracts, using a variant of real-time linear temporal logic introduced in this thesis: metric interval temporal logic for reconfigurable systems. Finally, this thesis embeds both the practical approach to the realisation of physiological computing systems and the formal verification of reconfigurations into Scrum, a modern and agile software development methodology. The surrounding methodological approach is intended to provide a frame for the systematic development of physiological computing systems, from initial physiological findings to a working software system satisfying both functional and software-quality requirements.
    By integrating practical and theoretical aspects of software engineering into a self-contained development methodology, this thesis proposes a roadmap and guidelines for the creation of new physiological computing applications.
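
    The assume-guarantee principle underlying the verification approach has a standard shape, sketched below in common notation (the thesis refines this idea to real-time properties in its metric interval temporal logic for reconfigurable systems; this is not the thesis's exact definition):

```latex
% A component M satisfies the contract (A, G) iff every behaviour of M
% that meets the assumption A also meets the guarantee G.
\[
  M \models (A, G) \quad\iff\quad
  \forall \sigma \in \mathrm{Beh}(M) :\;
  \sigma \models A \implies \sigma \models G
\]
```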

    JOLTS : checkpointing and coordination in grid systems

    The need for increased computational power is growing faster than our ability to produce faster computers. Researchers are already proposing systems that require petaflop-capable supercomputers, a far cry from what is currently available. Meeting such high computational requirements will require networks of computers. While it is possible to network computers together to accomplish a single task, the promise of grid computing is to make such a network flexible enough to handle a multitude of different tasks. Grid systems are slowly appearing that are designed to run many independent tasks and that allow programs to migrate between machines before completion; however, these systems lack coordination capabilities. Other grid systems and environments allow multiple tasks to communicate and coordinate with each other under various paradigms, but do not provide migration capabilities. This thesis proposes a system, called JOLTS, that attempts to fill the gap by providing both checkpointing and coordination capabilities. The coordination model offered by JOLTS is based on the Objective Linda coordination language, with some additions. This thesis shows that the object-space model is an effective form of coordination and communication, and that it can be effectively combined with checkpointing capabilities inside the same grid system.
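
    A minimal object-space sketch in the spirit of Objective Linda (our illustration; the JOLTS model adds typed matching, multiplicities, timeouts, and checkpoint integration): out() deposits an object into the shared space, and in() blocks until an object matching a predicate can be removed, which is the coordination point between otherwise independent tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ObjectSpace {
    private final List<Object> space = new ArrayList<>();

    // Deposit an object and wake any consumers blocked in in().
    public synchronized void out(Object o) {
        space.add(o);
        notifyAll();
    }

    // Remove and return the first object matching the predicate,
    // blocking until one is available.
    public synchronized Object in(Predicate<Object> matches) throws InterruptedException {
        while (true) {
            for (int i = 0; i < space.size(); i++) {
                if (matches.test(space.get(i))) return space.remove(i);
            }
            wait();  // nothing matched: block until the next out()
        }
    }

    public static void main(String[] args) throws Exception {
        ObjectSpace s = new ObjectSpace();
        new Thread(() -> s.out("result:42")).start();   // producer task
        System.out.println(s.in(o -> o instanceof String));  // consumer blocks, then prints
    }
}
```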