
4 research outputs found

Repositioning Tiered HotSpot Execution Performance Relative to the Interpreter

Authors: Casey Kevin, Lambert Jonathan, Monahan Rosemary
Publication date: 13/04/2023

Although the advantages of just-in-time compilation over traditional interpretive execution are widely recognised, there needs to be more current research investigating and repositioning the performance differences between these two execution models relative to contemporary workloads. Specifically, there is a need to examine the performance differences between Java Runtime Environment (JRE) Java Virtual Machine (JVM) tiered execution and JRE JVM interpretive execution relative to modern multicore architectures and modern concurrent and parallel benchmark workloads. This article aims to fill this research gap by presenting the results of a study that compares the performance of these two execution models under load from the Renaissance Benchmark Suite. This research is relevant to anyone interested in understanding the performance differences between just-in-time compiled code and interpretive execution. It provides a contemporary assessment of the interpretive JVM core, the entry and starting point for bytecode execution, relative to just-in-time tiered execution. The study considers factors such as the JRE version, the GNU GCC version used in the JRE build toolchain, and the garbage collector algorithm specified at runtime, and their impact on the performance difference envelope between interpretive and tiered execution. Our findings indicate that tiered execution is considerably more efficient than interpretive execution, and the performance gap has increased, ranging from 4 to 37 times more efficient. On average, tiered execution is approximately 15 times more efficient than interpretive execution. Additionally, the performance differences between interpretive and tiered execution are influenced by workload category, with narrower performance differences observed for web-based workloads and more significant differences for Functional and Scala-type workloads.

Comment: 17 pages

arXiv.org e-Print Archive

High performance annotation-aware JVM for Java cards

Authors: Alex Veidenbaum, Alexandru Nicolau, Ana Azevedo, Arun Kejariwal
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2005

Early applications of smart cards have focused on the area of personal security. Recently, there has been an increasing demand for networked, multi-application cards. In this new scenario, enhanced application-specific on-card Java applets and complex cryptographic services are executed through the smart card Java Virtual Machine (JVM). In order to support such computation-intensive applications, contemporary smart cards are designed with built-in microprocessors and memory. As smart cards are highly area-constrained environments with memory, CPU and peripherals competing for a very small die space, the VM execution engine of choice is often a small, slow interpreter. In addition, support for multiple applications and cryptographic services demands a high-performance VM execution engine. The above necessitates the optimization of the JVM for Java Cards.

CiteSeerX

Crossref

Effective inline-threaded interpretation of Java bytecode using preparation sequences

Authors: Etienne Gagnon, Laurie Hendren
Publication venue: Springer
Publication date: 01/01/2003

Abstract. Inline-threaded interpretation is a recent technique that improves performance by eliminating dispatch overhead within basic blocks for interpreters written in C [11]. The dynamic class loading, lazy class initialization, and multi-threading features of Java reduce the effectiveness of a straightforward implementation of this technique within Java interpreters. In this paper, we introduce preparation sequences, a new technique that solves the particular challenge of effectively inline-threading Java. We have implemented our technique in the SableVM Java virtual machine, and our experimental results show that, on a set of benchmarks, inline-threaded interpretation of Java using our technique achieves a speedup ranging from 1.20 to 2.41 over switch-based interpretation, and a speedup ranging from 1.15 to 2.14 over direct-threaded interpretation.

CiteSeerX

Crossref

Increasing the Performance and Predictability of the Code Execution on an Embedded Java Platform

Author: Preußer Thomas
Publication date: 12/10/2011

This thesis explores the execution of object-oriented code on an embedded Java platform. It presents established approaches and derives new ones for the implementation of high-level object-oriented functionality and commonly expected system services. The goal of the developed techniques is to provide the architectural base for efficient and predictable code execution. The research vehicle of this thesis is the Java-programmed SHAP platform. It consists of its platform tool chain and the highly customizable SHAP bytecode processor. SHAP offers a fully operational embedded CLDC environment, in which the proposed techniques have been implemented, verified, and evaluated. Two strands are followed to achieve the goal of this thesis. First, the sequential execution of bytecode is optimized through a joint effort of an optimizing offline linker and an on-chip application loader. Additionally, SHAP pioneers a reference coloring mechanism, which enables a constant-time interface method dispatch that need not be backed by a large sparse dispatch table. Second, this thesis explores the implementation of essential system services within designated concurrent hardware modules. This effort is necessary to decouple the computational progress of the user application from the interference induced by time-sharing software implementations of these services. The concrete contributions comprise a spill-free, on-chip stack; a predictable method cache; and concurrent garbage collection. Each approach is described and evaluated after the relevant state of the art has been reviewed. This review is not limited to preceding small embedded approaches but also includes techniques that have proven successful on larger-scale platforms. Conversely, the chances that these platforms may benefit from the techniques developed for SHAP are discussed.

Technische Universität Dresden: Qucosa