Search CORE

68 research outputs found

Towards an embedded real-time Java virtual machine

Author: Ive Anders
Publication venue: Department of Computer Science, Lund University
Publication date: 01/01/2003
Field of study

Most computers today are embedded, i.e. they are built into some products or system that is not perceived as a computer. It is highly desirable to use modern safe object-oriented software techniques for a rapid development of reliable systems. However, languages and run-time platforms for embedded systems have not kept up with the front line of language development. Reasons include complex and, in some cases, contradictory requirements on timing, concurrency, predictability, safety, and flexibility. A carefully tailored Java virtual machine (called IVM) is proposed as an approach to overcome these difficulties. In particular, real-time garbage collection has been considered an essential part. The set of bytecodes has been revised to require less memory and to facilitate predictable execution. To further reduce the memory footprint, the class loader can be located outside the embedded processor. Since the accomplished concurrency is crucial for the function of many embedded applications, the scheduling can be defined on the application level in Java. Finally considering future needs for flexibility and on-line configuration of embedded system, the IVM has a unique structure with which, for instance, methods being objects that can be replaced and GCed. The approach has been experimentally verified by a full prototype implementation of such a virtual machine. By making the prototype available for development of real products, this in turn has confronted the solutions with real industrial demands. It was found that the IVM can be easily integrated in typical systems today and the mentioned requirements are fulfilled. Based on experiences from more than 10 projects utilising the novel Java-oriented techniques, there are reasons to believe that the proposed approach is very promising for future flexible embedded systems

Lund University Publications

Dynamic optimization through the use of automatic runtime specialization

Author: Whaley John (John Craig), 1975-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1999
Field of study

Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (leaves 99-115).by John Whaley.S.B.and M.Eng

DSpace@MIT

Post-Mortem Dynamic analysis for software debugging

Author: WANG TAO
Publication venue
Publication date: 30/03/2008
Field of study

Ph.DDOCTOR OF PHILOSOPH

ScholarBank@NUS

Recommended from our members

JavaFlow : a Java DataFlow Machine

Author: Ascott Robert John
Publication venue
Publication date: 10/02/2015
Field of study

textThe JavaFlow, a Java DataFlow Machine is a machine design concept implementing a Java Virtual Machine aimed at addressing technology roadmap issues along with the ability to effectively utilize and manage very large numbers of processing cores. Specific design challenges addressed include: design complexity through a common set of repeatable structures; low power by featuring unused circuits and ability to power off sections of the chip; clock propagation and wire limits by using locality to bring data to processing elements and a Globally Asynchronous Locally Synchronous (GALS) design; and reliability by allowing portions of the design to be bypassed in case of failures. A Data Flow Architecture is used with multiple heterogeneous networks to connect processing elements capable of executing a single Java ByteCode instruction. Whole methods are cached in this DataFlow fabric, and the networks plus distributed intelligence are used for their management and execution. A mesh network is used for the DataFlow transfers; two ordered networks are used for management and control flow mapping; and multiple high speed rings are used to access the storage subsystem and a controlling General Purpose Processor (GPP). Analysis of benchmarks demonstrates the potential for this design concept. The design process was initiated by analyzing SPEC JVM benchmarks which identified a small number methods contributing to a significant percentage of the overall ByteCode operations. Additional analysis established static instruction mixes to prioritize the types of processing elements used in the DataFlow Fabric. The overall objective of the machine is to provide multi-threading performance for Java Methods deployed to this DataFlow fabric. With advances in technology it is envisioned that from 1,000 to 10,000 cores/instructions could be deployed and managed using this structure. This size of DataFlow fabric would allow all the key methods from the SPEC benchmarks to be resident. A baseline configuration is defined with a compressed dataflow structure and then compared to multiple configurations of instruction assignments and clock relationships. Using a series of methods from the SPEC benchmark running independently, IPC (Instructions per Cycle) performance of the sparsely populated heterogeneous structure is 40% of the baseline. The average ratio of instructions to required nodes is 3.5. Innovative solutions to the loading and management of Java methods along with the translation from control flow to DataFlow structure are demonstrated.Electrical and Computer Engineerin

Texas ScholarWorks

Using program behaviour to exploit heterogeneous multi-core processors

Author: McIlroy Ross
Publication venue
Publication date: 01/01/2010
Field of study

Multi-core CPU architectures have become prevalent in recent years. A number of multi-core CPUs consist of not only multiple processing cores, but multiple different types of processing cores, each with different capabilities and specialisations. These heterogeneous multi-core architectures (HMAs) can deliver exceptional performance; however, they are notoriously difficult to program effectively. This dissertation investigates the feasibility of ameliorating many of the difficulties encountered in application development on HMA processors, by employing a behaviour aware runtime system. This runtime system provides applications with the illusion of executing on a homogeneous architecture, by presenting a homogeneous virtual machine interface. The runtime system uses knowledge of a program's execution behaviour, gained through explicit code annotations, static analysis or runtime monitoring, to inform its resource allocation and scheduling decisions, such that the application makes best use of the HMA's heterogeneous processing cores. The goal of this runtime system is to enable non-specialist application developers to write applications that can exploit an HMA, without the developer requiring in-depth knowledge of the HMA's design. This dissertation describes the development of a Java runtime system, called Hera-JVM, aimed at investigating this premise. Hera-JVM supports the execution of unmodified Java applications on both processing core types of the heterogeneous IBM Cell processor. An application's threads of execution can be transparently migrated between the Cell's different core types by Hera-JVM, without requiring the application's involvement. A number of real-world Java benchmarks are executed across both of the Cell's core types, to evaluate the efficacy of abstracting a heterogeneous architecture behind a homogeneous virtual machine. By characterising the performance of each of the Cell processor's core types under different program behaviours, a set of influential program behaviour characteristics is uncovered. A set of code annotations are presented, which enable program code to be tagged with these behaviour characteristics, enabling a runtime system to track a program's behaviour throughout its execution. This information is fed into a cost function, which Hera-JVM uses to automatically estimate whether the executing program's threads of execution would benefit from being migrated to a different core type, given their current behaviour characteristics. The use of history, hysteresis and trend tracking, by this cost function, is explored as a means of increasing its stability and limiting detrimental thread migrations. The effectiveness of a number of different migration strategies is also investigated under real-world Java benchmarks, with the most effective found to be a strategy that can target code, such that a thread is migrated whenever it executes this code. This dissertation also investigates the use of runtime monitoring to enable a runtime system to automatically infer a program's behaviour characteristics, without the need for explicit code annotations. A lightweight runtime behaviour monitoring system is developed, and its effectiveness at choosing the most appropriate core type on which to execute a set of real-world Java benchmarks is examined. Combining explicit behaviour characteristic annotations with those characteristics which are monitored at runtime is also explored. Finally, an initial investigation is performed into the use of behaviour characteristics to improve application performance under a different type of heterogeneous architecture, specifically, a non-uniform memory access (NUMA) architecture. Thread teams are proposed as a method of automatically clustering communicating threads onto the same NUMA node, thereby reducing data access overheads. Evaluation of this approach shows that it is effective at improving application performance, if the application's threads can be partitioned across the available NUMA nodes of a system. The findings of this work demonstrate that a runtime system with a homogeneous virtual machine interface can reduce the challenge of application development for HMA processors, whilst still being able to exploit such a processor by taking program behaviour into account

Glasgow Theses Service

CiteSeerX

OpenGrey Repository

Towards future method hotness prediction for Virtual Machines

Author: Namjoshi Manjiri Arun
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2009
Field of study

Application profiling is a popular technique that attempts to understand program behavior to improve its performance. Offline profiling, although beneficial for several applications, fails in cases where prior program runs may not be feasible, or if changes in input cause the profile to not match the behavior of the actual program run. Managed languages, like Java and C#, provide a unique opportunity to overcome the drawbacks of offline profiling by generating the profile information online during the current program run. Indeed, online profiling is extensively used in current VMs, especially during selective compilation to improve program startup performance, as well as during other feedback-directed optimizations. In this thesis we illustrate the drawbacks of the current reactive mechanism of online profiling during selective compilation. Current VM profiling mechanisms are slow -- thereby delaying associated transformations, and estimate future behavior based on the program's immediate past -- leading to potential misspeculation that limit the benefits of compilation. We show that these drawbacks produce an average performance loss of over 14.5\% on our set of benchmark programs, over an ideal offline approach that accurately compiles the hot methods early. We then propose and evaluate the potential of a novel strategy to achieve similar performance benefits with an online profiling approach. Our new online profiling strategy uses early determination of loop iteration bounds to predict future method hotness. We explore and present promising results on the potential, feasibility, and other issues involved for the successful implementation of this approach

KU ScholarWorks

Increasing the Performance and Predictability of the Code Execution on an Embedded Java Platform

Author: Preußer Thomas
Publication venue
Publication date: 12/10/2011
Field of study

This thesis explores the execution of object-oriented code on an embedded Java platform. It presents established and derives new approaches for the implementation of high-level object-oriented functionality and commonly expected system services. The goal of the developed techniques is the provision of the architectural base for an efficient and predictable code execution. The research vehicle of this thesis is the Java-programmed SHAP platform. It consists of its platform tool chain and the highly-customizable SHAP bytecode processor. SHAP offers a fully operational embedded CLDC environment, in which the proposed techniques have been implemented, verified, and evaluated. Two strands are followed to achieve the goal of this thesis. First of all, the sequential execution of bytecode is optimized through a joint effort of an optimizing offline linker and an on-chip application loader. Additionally, SHAP pioneers a reference coloring mechanism, which enables a constant-time interface method dispatch that need not be backed a large sparse dispatch table. Secondly, this thesis explores the implementation of essential system services within designated concurrent hardware modules. This effort is necessary to decouple the computational progress of the user application from the interference induced by time-sharing software implementations of these services. The concrete contributions comprise a spill-free, on-chip stack; a predictable method cache; and a concurrent garbage collection. Each approached means is described and evaluated after the relevant state of the art has been reviewed. This review is not limited to preceding small embedded approaches but also includes techniques that have proven successful on larger-scale platforms. The other way around, the chances that these platforms may benefit from the techniques developed for SHAP are discussed

Technische Universität Dresden: Qucosa

Infrastructures and Compilation Strategies for the Performance of Computing Systems

Author: Rohou Erven
Publication venue: HAL CCSD
Publication date: 02/11/2015
Field of study

This document presents our main contributions to the field of compilation, and more generally to the quest of performance ofcomputing systems.It is structured by type of execution environment, from static compilation (execution of native code), to JIT compilation, and purelydynamic optimization. We also consider interpreters. In each chapter, we give a focus on the most relevant contributions.Chapter 2 describes our work about static compilation. It covers a long time frame (from PhD work 1995--1998 to recent work on real-timesystems and worst-case execution times at Inria in 2015) and various positions, both in academia and in the industry.My research on JIT compilers started in the mid-2000s at STMicroelectronics, and is still ongoing. Chapter 3 covers the results we obtained on various aspects of JIT compilers: split-compilation, interaction with real-time systems, and obfuscation.Chapter 4 reports on dynamic binary optimization, a research effort started more recently, in 2012. This considers the optimization of a native binary (without source code), while it runs. It incurs significant challenges but also opportunities.Interpreters represent an alternative way to execute code. Instead of native code generation, an interpreter executes an infinite loop thatcontinuously reads a instruction, decodes it and executes its semantics. Interpreters are much easier to develop than compilers,they are also much more portable, often requiring a simple recompilation. The price to pay is the reduced performance. Chapter 5presents some of our work related to interpreters.All this research often required significant software infrastructures for validation, from early prototypes to robust quasi products, andfrom open-source to proprietary. We detail them in Chapter 6.The last chapter concludes and gives some perspectives

HAL-CentraleSupelec

Thèses en Ligne

INRIA a CCSD electronic archive server

HAL-Rennes 1

Ph.D. Thesis Proprosal: Transportable Agents

Author: Gray Robert S
Publication venue: Dartmouth Digital Commons
Publication date: 19/05/1995
Field of study

One of the paradigms that has been suggested for allowing efficient access to remote resources is transportable agents. A transportable agent is a named program that can migrate from machine to machine in a heterogeneous network. The program chooses when and where to migrate. It can suspend its execution at an arbitrary point, transport to another machine and resume execution on the new machine. Transportable agents have several advantages over the traditional client/server model. Transportable agents consume less network bandwidth and do not require a connection between communicating machines -- this is attractive in all networks and particularly attractive in wireless networks. Transportable agents are a convenient paradigm for distributed computing since they hide the communication channels but not the location of the computation. Transportable agents allow clients and servers to program each other. However transportable agents pose numerous challenges such as security, privacy and efficiency. Existing transportable agent systems do not meet all of these challenges. In addition there has been no formal characterization of the performance of transportable agents. This thesis addresses these weakness. The thesis has two parts -- (1) formally characterize the performance of transportable agents through mathematical analysis and network simulation and (2) implement a complete transportable agent system

Dartmouth Digital Commons (Dartmouth College)

Distributed information extraction from large-scale wireless sensor networks

Author: Brusey James
Daniel T.
Gaura Elena
Halloran John
Publication venue: International Frequency Sensor Association (IFSA)
Publication date: 01/05/2012
Field of study

Coventry University Pure Portal