
    Hybrid Java Compilation Using Just-in-Time and Ahead-of-Time Compilation for Embedded Systems

    Doctoral dissertation (Ph.D.), Department of Electrical and Computer Engineering, Seoul National University Graduate School, August 2015. Advisor: Soo-Mook Moon.
    Many embedded Java software platforms execute two types of Java classes: those installed statically on the client device and those downloaded dynamically from service providers at runtime. For higher performance, it would be desirable to compile static Java classes with an ahead-of-time compiler (AOTC) and to handle dynamically downloaded classes with a just-in-time compiler (JITC), providing a hybrid compilation environment. We propose a hybrid Java compilation approach and perform an initial case study with a hybrid environment, constructed simply by merging an existing AOTC and JITC for the same JVM. Contrary to our expectations, the hybrid environment does not deliver performance in between that of a full-JITC and a full-AOTC environment. In fact, its performance is even lower than the full-JITC's for many benchmarks. We analyzed the results and found that a naive merge of a JITC and an AOTC may result in inefficiencies, especially due to calls between JITC methods and AOTC methods. We also observed that the distribution of JITC methods and AOTC methods is important, and experimented with various distributions to understand when a hybrid environment can deliver the desired performance. Android Java is executed by the Dalvik virtual machine (VM), which is quite different from traditional Java VMs such as Oracle's HotSpot VM. Dalvik employs register-based bytecode while HotSpot employs stack-based bytecode, requiring a different way of interpretation. Also, Dalvik uses trace-based just-in-time compilation (JITC), while HotSpot uses method-based JITC. It is therefore an open question how the Dalvik VM performs compared with the HotSpot VM. Unfortunately, there has been little comparative evaluation of the two VMs, so the performance of the Dalvik VM is not well understood. More importantly, it is also not well understood how the performance of the Dalvik VM affects the overall performance of Android applications (apps).
We attempt to evaluate the Dalvik VM. We installed both VMs on the same board and compared their performance using the EEMBC benchmark. In the JITC mode, Dalvik is slower than HotSpot by more than 2.9 times, and its generated code size is no smaller than HotSpot's due to its worse code quality and trace-chaining code. We also investigated how real Android apps differ from Java benchmarks, to understand why the slow Dalvik VM does not seriously affect the performance of Android apps. We propose a bytecode-to-C ahead-of-time compilation (AOTC) for the DVM to accelerate pre-installed apps. We translated the bytecode of some of the hot methods used by these apps into C code, which is then compiled together with the DVM source code. AOTC-generated code works with the existing Android zygote mechanism, with correct garbage collection and exception handling. Due to off-line, method-based compilation using an existing compiler with full optimizations and Java-specific optimizations, AOTC can generate quality code while obviating runtime compilation overhead. For benchmarks, AOTC improves performance by 65%. When compared with the recently introduced ART, which also performs ahead-of-time compilation, our AOTC performs better. We cannot apply AOTC to all middleware and framework methods on DTV and Android devices for hybrid compilation. Through a case study on DTV, we found that we need to AOTC enough methods while reducing method-call overhead. We propose an AOTC method-selection heuristic based on method call chains: we select hot methods and the methods on their call chains using profile data.
Our heuristic based on method call chains achieves better performance than other heuristics.
Contents:
Chapter 1 Introduction
  1.1 The Need for Hybrid Compilation
  1.2 Outline of the Dissertation
Chapter 2 Hybrid Compilation for the Java Virtual Machine
  2.1 The Approach of Hybrid Compilation
  2.2 The JITC and AOTC
    2.2.1 JVM and the Interpreter
    2.2.2 The JITC
    2.2.3 The AOTC
  2.3 Hybrid Compilation Environment
  2.4 Analysis of the Hybrid Environment
    2.4.1 Call Behavior of Benchmarks
    2.4.2 Call Overhead
    2.4.3 Application Methods and Library Methods
    2.4.4 Improving Hybrid Performance
      2.4.4.1 Reducing the JITC-to-AOTC Call Overhead
      2.4.4.2 Performance Impact of the Distribution of JITC Methods and AOTC Methods
Chapter 3 Evaluation of the Dalvik Virtual Machine
  3.1 Android Platform
  3.2 Java VM and Dalvik VM
    3.2.1 Bytecode ISA
    3.2.3 Just-in-Time Compilation (JITC)
  3.3 Experimental Results
    3.3.1 Experimental Environment
    3.3.2 Interpreter Performance
    3.3.3 JITC Performance
    3.3.4 Trace Extension
  3.4 Behavior of Real Android Apps
Chapter 4 Ahead-of-Time Compilation for the Dalvik Virtual Machine
  4.1 Android and Dalvik VM Execution
    4.1.1 Android Execution Model
    4.1.2 Dalvik VM
    4.1.3 Dexopt and JITC in the Dalvik VM
  4.2 AOTC Architecture
  4.3 Design and Implementation of AOTC
    4.3.1 Dexopt and Code Generation
    4.3.2 C Code Generation
    4.3.3 AOTC Method Call
    4.3.4 Garbage Collection
    4.3.5 Exception Handling
    4.3.6 AOTC Method Linking
  4.4 AOTC Code Optimization
    4.4.1 Method Inlining
    4.4.2 Spill Optimization
    4.4.3 Elimination of Redundant Code
  4.5 Experimental Result
    4.5.1 Experimental Environment
    4.5.2 AOTC Target Methods
    4.5.3 Performance Impact of AOTC
    4.5.4 DVM AOTC vs. ART
Chapter 5 Selecting Ahead-of-Time Compilation Target Methods for Hybrid Compilation
  5.1 Hybrid Compilation on DTV
  5.2 Hybrid Compilation on Android Device
  5.3 AOTC for Hybrid Compilation
    5.3.1 AOTC Target Methods
    5.3.2 Case Study: Selecting on DTV
  5.4 Method Selection Using Call Chain
  5.5 Experimental Result
    5.5.1 Experimental Environment
    5.5.2 Performance Impact
Chapter 6 Related Works
Chapter 7 Conclusion
Bibliography
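The call-chain method-selection heuristic summarized in this abstract can be sketched roughly as follows. The profile format, the `hot_threshold` parameter, and the method names are illustrative assumptions, not the dissertation's actual data structures:

```python
# Sketch of an AOTC target-selection heuristic based on method call chains.
# Hot methods are selected from profile counts, then the methods on their
# call chains are added so that expensive JITC-to-AOTC transitions on the
# hot path are reduced. Profile format and threshold are illustrative.

def select_aotc_targets(profile, callers, hot_threshold):
    """profile: {method: invocation_count}
    callers: {method: set of methods that call it}
    Returns the set of methods to compile ahead of time."""
    hot = {m for m, count in profile.items() if count >= hot_threshold}
    targets = set(hot)
    # Walk up the call chains of hot methods so their callers are
    # AOTC'd too, keeping whole chains on the compiled side.
    work = list(hot)
    while work:
        m = work.pop()
        for caller in callers.get(m, ()):
            if caller not in targets:
                targets.add(caller)
                work.append(caller)
    return targets

profile = {"render": 10_000, "layout": 9_000, "log": 3}
callers = {"render": {"drawFrame"}, "layout": {"drawFrame"},
           "drawFrame": {"main"}}
print(sorted(select_aotc_targets(profile, callers, hot_threshold=1000)))
```

Selecting whole chains rather than isolated hot methods is what distinguishes this heuristic from a plain hotness cutoff: a lukewarm caller is still compiled if it sits on the path to a hot callee.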

    Clojure on Android: Challenges and Solutions

    Mobile operating systems are rapidly expanding into new areas and the importance of mobile apps is rising with them. As the most popular mobile operating system, Android is at the forefront of this development. However, while other mobile operating systems have introduced newer, officially-supported languages for app development, the only supported language for Android app development is an older dialect of Java. Android developers are unable to take advantage of the features and styles available in alternative and more modern languages. The Clojure language compiles to Android-compatible bytecode and is a promising language to fill this gap. However, the development of Android apps with Clojure is hindered by performance concerns. One recognized problem is the slow startup time of Clojure on Android apps. Alternative "lean" Clojure compiler projects promise to improve Clojure performance, including startup time. However, the performance of Clojure on Android and the lean compiler projects has not been systematically analyzed and evaluated. We benchmarked and analyzed the startup and run time performance of Android apps written in Clojure and compiled using both the standard Clojure compiler and experimental lean Clojure implementations. In our experiments the run time performance of Clojure on Android is similar to that of Clojure on the desktop. However, Clojure on Android apps take a significant amount of time to start, even on relatively new hardware and the latest Android versions. Long startup times scale upwards quickly with larger apps, and the problem is closely tied to the Clojure compiler implementation. We also found that while the Skummet lean Clojure compiler project significantly reduces Clojure on Android startup times, more changes are necessary to make Clojure practical for general Android app development.

    ShareJIT: JIT Code Cache Sharing across Processes and Its Practical Implementation

    Just-in-time (JIT) compilation coupled with code caching is widely used to improve performance in dynamic programming language implementations. These code caches, along with the associated profiling data for the hot code, however, consume significant amounts of memory. Furthermore, they incur extra JIT compilation time for their creation. On Android, the current standard JIT compiler and its code caches are not shared among processes---that is, the runtime system maintains a private code cache, and its associated data, for each runtime process. However, applications running on the same platform tend to share multiple libraries in common. Sharing cached code across multiple applications and multiple processes can lead to a reduction in memory use. It can directly reduce compile time. It can also reduce the cumulative amount of time spent interpreting code. All three of these effects can improve actual runtime performance. In this paper, we describe ShareJIT, a global code cache for JITs that can share code across multiple applications and multiple processes. We implemented ShareJIT in the context of the Android Runtime (ART), a widely used, state-of-the-art system. To increase sharing, our implementation constrains the amount of context that the JIT compiler can use to optimize the code. This exposes a fundamental tradeoff: increased specialization to a single process' context decreases the extent to which the compiled code can be shared. In ShareJIT, we limit some optimization to increase shareability. To evaluate ShareJIT, we tested 8 popular Android apps in a total of 30 experiments. ShareJIT improved overall performance by 9% on average, while decreasing memory consumption by 16% on average and JIT compilation time by 37% on average.
    Comment: OOPSLA 201
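The core idea of a shared, context-independent code cache can be illustrated with a minimal sketch. Keying compiled code by a hash of the method's bytecode is an assumption made here for illustration; the actual ShareJIT keying and data structures are not specified in this abstract:

```python
# Sketch of a cross-process shared JIT code cache in the spirit of ShareJIT.
# Compiled code is keyed by a context-independent method identity (here, a
# hash of the method's bytecode) rather than by process-local state, so code
# compiled once can be reused by another process. All names are illustrative.

import hashlib

class SharedCodeCache:
    def __init__(self):
        self._cache = {}   # bytecode digest -> compiled code
        self.compiles = 0  # number of actual JIT compilations performed

    def _key(self, bytecode: bytes) -> str:
        return hashlib.sha256(bytecode).hexdigest()

    def get_or_compile(self, bytecode: bytes, compile_fn):
        key = self._key(bytecode)
        if key not in self._cache:
            # Cache miss: compile once; later lookups (from any process
            # sharing this cache) reuse the result.
            self._cache[key] = compile_fn(bytecode)
            self.compiles += 1
        return self._cache[key]

cache = SharedCodeCache()
method = b"\x12\x34"  # the same library method seen by two "processes"
a = cache.get_or_compile(method, lambda bc: f"native<{bc.hex()}>")
b = cache.get_or_compile(method, lambda bc: f"native<{bc.hex()}>")
print(cache.compiles)  # the second lookup reuses the shared entry
```

The tradeoff the abstract describes shows up directly in this scheme: any optimization that bakes process-specific context into the compiled code would have to become part of the key, shrinking the set of processes that can share each entry.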

    Applications of information sharing for code generation in process virtual machines

    As the backbone of many computing environments today, it is important that process virtual machines be both performant and robust in mobile, personal desktop, and enterprise applications. This thesis focusses on code generation within these virtual machines, particularly addressing situations where redundant work is being performed. The goal is to exploit information sharing in order to improve the performance and robustness of virtual machines that are accelerated by native code generation. First, the thesis investigates the potential to share generated code between multiple threads in a dynamic binary translator used to perform instruction set simulation. This is done through a code generation design that allows native code to be executed by any simulated core and adding a mechanism to share native code regions between threads. This is shown to improve the average performance of multi-threaded benchmarks by 1.4x when simulating 128 cores on a quad-core host machine. Secondly, the ahead-of-time code generation system used for executing Android applications is improved through the use of profiling. The thesis investigates the potential for profiles produced by individual users of applications to be shared and merged together to produce a generic profile that still provides a lot of benefit for a new user who is then able to skip the expensive profiling phase. These profiles can not only be used for selective compilation to reduce code-size and installation time, but can also be used for focussed optimisation on vital code regions of an application in order to improve overall performance. With selective compilation applied to a set of popular Android applications, code-size can be reduced by 49.9% on average, while installation time can be reduced by 31.8%, with only an average 8.5% increase in the amount of sequential runtime required to execute the collected profiles. 
The thesis also shows that, among the tested users, the use of a crowd-sourced and merged profile does not significantly affect their estimated performance loss from selective compilation (0.90x-0.92x) in comparison to when they perform selective compilation with their own unique profile (0.93x). Furthermore, by proposing a new, more powerful code generator for Android’s virtual machine, these same profiles can be used to perform focussed optimisation, which preliminary results show to increase runtime performance across a set of common Android benchmarks by 1.46x-10.83x. Finally, in such a situation where a new code generator is being added to a virtual machine, it is also important to test the code generator for correctness and robustness. The methods of execution of a virtual machine, such as interpreters and code generators, must share a set of semantics about how programs must be executed, and this can be exploited in order to improve testing. This is done through the application of domain-aware binary fuzzing and differential testing within Android’s virtual machine. The thesis highlights a series of actual code generation and verification bugs that were found in Android’s virtual machine using this testing methodology, as well as comparing the proposed approach to other state-of-the-art fuzzing techniques.
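The crowd-sourced profile merging described above might be sketched as follows, assuming profiles are simple sets of hot-method names and a vote-fraction threshold; both representations are illustrative assumptions, not the thesis's actual format:

```python
# Sketch of merging per-user execution profiles into one generic profile
# for selective ahead-of-time compilation. A method survives the merge if
# enough users marked it hot, so a new user can skip the profiling phase.
# The set representation and the vote fraction are illustrative.

from collections import Counter

def merge_profiles(user_profiles, min_user_fraction=0.5):
    """user_profiles: list of sets of hot-method names, one per user.
    Returns the crowd-sourced set of methods worth compiling."""
    votes = Counter()
    for hot_methods in user_profiles:
        votes.update(hot_methods)
    needed = len(user_profiles) * min_user_fraction
    return {m for m, v in votes.items() if v >= needed}

users = [
    {"onDraw", "decodeBitmap", "parseJson"},
    {"onDraw", "decodeBitmap"},
    {"onDraw", "rareCallback"},
]
print(sorted(merge_profiles(users)))
```

A threshold like this is one way to get the effect the thesis measures: methods hot for most users stay in the merged profile (enabling selective compilation and smaller code size), while methods unique to one user drop out.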

    Energy efficient adaptation engines for android applications

    Context: The energy consumption of mobile devices is increasing due to the improvement in their components (e.g., better processors, larger screens). Although the hardware consumes the energy, the software is responsible for managing hardware resources such as the camera software and its functionality, and therefore affects the energy consumption. Energy consumption depends not only on the installed code, but also on the execution context (environment, device status) and how the user interacts with the application.
    Objective: In order to reduce the energy consumption based on user behavior, it is necessary to dynamically adapt the application. However, the adaptation mechanism itself also consumes a certain amount of energy, which may lead to an important increase in the energy expenditure of the application in comparison with the benefits of the adaptation. Therefore, this footprint must be measured and compared with the benefit obtained.
    Method: In this paper, we (1) determine the benefits, in terms of energy consumption, of dynamically adapting mobile applications based on user behavior; and (2) advocate the most energy-efficient adaptation mechanism. We provide four different implementations of a proposed adaptation model and measure their energy consumption.
    Results: The proposed adaptation engines do not increase the energy consumption when compared to the benefits of the adaptation, which can reduce the energy consumption by up to 20%.
    Conclusion: The adaptation engines proposed in this paper can decrease the energy consumption of mobile devices based on user behavior. The overhead introduced by the adaptation engines is negligible in comparison with the benefits obtained by the adaptation.
    Funding: Junta de Andalucía MAGIC P12-TIC1814; Ministerio de Economía y Competitividad TIN2015-64841-R; Ministerio de Ciencia, Innovación y Universidades TIN2017-90644-REDT; Ministerio de Ciencia, Innovación y Universidades RTI2018-099213-B-I00; Universidad de Málaga LEIA UMA18-FEDERJA-15

    Optimizing Memory Management in Virtual Machines

    Doctoral dissertation (Ph.D.), Department of Electrical and Computer Engineering, Seoul National University Graduate School, February 2014. Advisor: Soo-Mook Moon.
    Memory management is one of the key components of a virtual machine and affects the overall performance of the virtual machine itself. Modern programming languages for virtual machines, such as Java, use dynamic memory allocation, and objects are allocated in the heap at a high rate. These allocated objects are reclaimed later, when they are no longer used, to secure free room in the heap for future object allocation. Many virtual machines adopt garbage collection to reclaim dead objects in the heap. The heap can also be expanded to allocate more objects instead. Overall memory management performance is therefore determined by the object allocation technique, the garbage collector, and the heap management technique. In this dissertation, three optimization techniques are proposed to improve the overall performance of memory management in virtual machines. First, a lazy-worst-fit object allocator is suggested to allocate small objects with little overhead in a virtual machine that has a garbage collector. Then a biased allocator is proposed to improve the performance of the garbage collector itself by reducing its extra overhead. Finally, an ahead-of-time heap expansion technique is suggested to improve user responsiveness as well as the overall performance of memory management by suppressing invocations of garbage collection. The proposed optimizations are evaluated on various devices, including desktop, embedded, and mobile, with different virtual machines, including a Java virtual machine for the Java runtime and the Dalvik virtual machine for the Android platform. The lazy-worst-fit allocator outperforms other allocators, including first-fit, and shows fragmentation as low as that of the first-fit allocator, which is known to have the lowest fragmentation. The biased allocator reduces the pause time caused by garbage collection by 4.1% on average.
Ahead-of-time heap expansion reduces both the number of garbage collections and their total pause time. GC pause time is reduced by up to 31% in the default applications of the Android platform.
Contents:
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 The Need for Optimizing Memory Management
  1.2 Outline of the Dissertation
Chapter 2 Backgrounds
  2.1 Virtual Machine
  2.2 Memory Management in Virtual Machine
Chapter 3 Lazy Worst Fit Allocator
  3.1 Introduction
  3.2 Allocation with Fits
  3.3 Lazy Fits
    3.3.1 Lazy Worst Fit
  3.4 Experimental Results
    3.4.1 LWF Implementation in the LaTTe Java Virtual Machine
    3.4.2 Experimental Environment
    3.4.3 Performance of LWF
    3.4.4 Fragmentation of LWF
  3.5 Summary
Chapter 4 Biased Allocator
  4.1 Introduction
  4.2 Motivation
  4.3 Biased Allocator
    4.3.1 When to Choose an Allocator
    4.3.2 How to Choose an Allocator
  4.4 Analyses and Implementation
  4.5 Evaluation
    4.5.1 Total Pause Time of Garbage Collections
    4.5.2 Effect of Each Analysis
    4.5.3 Pause Time of Each Garbage Collection
  4.6 Summary
Chapter 5 Ahead-of-time Heap Management
  5.1 Introduction
  5.2 Motivation
  5.3 Android
    5.3.1 Garbage Collection
    5.3.2 Heap Expansion Heuristic
  5.4 Ahead-of-time Heap Expansion
    5.4.1 Spatial Heap Expansion
    5.4.2 Temporal Heap Expansion
    5.4.3 Launch-time Heap Expansion
  5.5 Evaluation
    5.5.1 Spatial Heap Expansion
    5.5.2 Comparison of Spatial Heap Expansion
    5.5.3 Temporal Heap Expansion
    5.5.4 Launch-time Heap Expansion
  5.6 Summary
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)
Acknowledgements
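The lazy-worst-fit allocation scheme this dissertation evaluates can be sketched roughly as follows: bump-pointer allocation inside a current free block, with the free list searched (worst fit, i.e. largest block first) only when the object does not fit. The free-list representation and sizes are illustrative assumptions, not the dissertation's implementation:

```python
# Sketch of lazy-worst-fit allocation: pointer-bumping inside the current
# free block, falling back to the largest (worst-fit) free block only when
# the object does not fit. Simplified: a partially used allocation block's
# remainder is discarded on fallback, and exhaustion returns None (where a
# real VM would trigger GC or heap expansion).

class LazyWorstFitAllocator:
    def __init__(self, free_blocks):
        # free_blocks: list of (start, size) free regions in the heap
        self.free_blocks = list(free_blocks)
        self.cursor, self.limit = self._take_worst()

    def _take_worst(self):
        # Worst fit: grab the largest free block as the allocation block.
        idx = max(range(len(self.free_blocks)),
                  key=lambda i: self.free_blocks[i][1])
        start, size = self.free_blocks.pop(idx)
        return start, start + size

    def alloc(self, size):
        if self.cursor + size > self.limit:
            if not self.free_blocks:
                return None
            # Lazy fallback: only now search the free list.
            self.cursor, self.limit = self._take_worst()
            if self.cursor + size > self.limit:
                return None
        addr = self.cursor          # common case: just bump the pointer
        self.cursor += size
        return addr

heap = LazyWorstFitAllocator([(0, 32), (100, 64)])
print(heap.alloc(16), heap.alloc(40), heap.alloc(8))
```

The point of the "lazy" part is that the common case is as cheap as a bump-pointer allocator, while picking the worst-fit (largest) block on fallback keeps subsequent bump allocations running longest before the next free-list search.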