218 research outputs found

    Optimization Coaching for JavaScript

    The performance of dynamic object-oriented programming languages such as JavaScript depends heavily on highly optimizing just-in-time compilers. Such compilers, like all compilers, can silently fall back to generating conservative, low-performance code during optimization. As a result, programmers may inadvertently cause performance issues on users' systems by making seemingly inoffensive changes to programs. This paper shows how to solve the problem of silent optimization failures. It specifically explains how to create a so-called optimization coach for an object-oriented just-in-time-compiled programming language. The development and evaluation build on the SpiderMonkey JavaScript engine, but the results should generalize to a variety of similar platforms.
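The core mechanism an optimization coach relies on can be sketched as follows. This is a minimal, hypothetical model (not SpiderMonkey's actual interface): a profile records the object "shapes" observed at each property-access site, and the coach flags sites whose type diversity likely defeats inline caching, causing the silent fallback the abstract describes. The threshold and site names are invented for illustration.

```python
# Hypothetical sketch of an optimization-coach profile: flag property-access
# sites where too many object shapes were seen, so the JIT's inline caches
# would likely give up and fall back to a slow generic path.
from collections import defaultdict

MEGAMORPHIC_THRESHOLD = 4  # assumed cutoff; real engines vary

class CoachProfile:
    def __init__(self):
        self.site_types = defaultdict(set)  # site id -> observed shapes

    def observe(self, site, shape):
        self.site_types[site].add(shape)

    def report(self):
        # Sites whose shape diversity exceeds the inline-cache capacity.
        return [site for site, shapes in self.site_types.items()
                if len(shapes) > MEGAMORPHIC_THRESHOLD]

profile = CoachProfile()
for shape in ["A", "B", "C", "D", "E"]:
    profile.observe("obj.x@line12", shape)   # megamorphic site
profile.observe("obj.y@line30", "A")         # monomorphic site
print(profile.report())  # only the megamorphic site is flagged
```

A real coach would attach source locations and suggested fixes to each flagged site; the point here is only the shape-counting idea.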

    Approaches to Interpreter Composition

    In this paper, we compose six different Python and Prolog VMs into four pairwise compositions: one using C interpreters; one running on the JVM; one using meta-tracing interpreters; and one using a C interpreter and a meta-tracing interpreter. We show that programs that cross the language barrier frequently execute faster in a meta-tracing composition, and that meta-tracing imposes a significantly lower overhead on composed programs relative to mono-language programs. (Comment: 33 pages, 1 figure, 9 tables.)
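The "language barrier" the abstract measures can be illustrated with a toy composition. This is an invented sketch, not the paper's implementation: a host interpreter evaluates arithmetic trees and delegates any node tagged as foreign to a guest interpreter, which is the kind of crossing whose overhead the paper quantifies.

```python
# Toy interpreter composition (illustrative only): a host arithmetic
# interpreter delegates "foreign" subterms to a separate guest interpreter.
def guest_eval(expr):
    # Guest interpreter: understands a single operation, 'double'.
    op, arg = expr
    assert op == "double"
    return 2 * arg

def host_eval(node):
    # Host interpreter: integer literals and binary '+', plus a
    # cross-language call for nodes tagged "foreign".
    if isinstance(node, int):
        return node
    op, *args = node
    if op == "+":
        return host_eval(args[0]) + host_eval(args[1])
    if op == "foreign":
        return guest_eval(args[0])  # the language-barrier crossing
    raise ValueError(op)

print(host_eval(("+", 1, ("foreign", ("double", 20)))))  # 41
```

In a C-interpreter composition each such crossing pays conversion and dispatch costs; a meta-tracing JIT can trace straight through `guest_eval` and compile the crossing away, which is why composed programs can run faster there.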

    Quantifying and Predicting the Influence of Execution Platform on Software Component Performance

    The performance of software components depends on several factors, including the execution platform on which the software components run. To simplify cross-platform performance prediction in relocation and sizing scenarios, a novel approach is introduced in this thesis which separates the application performance profile from the platform performance profile. The approach is evaluated using transparent instrumentation of Java applications and with automated benchmarks for Java Virtual Machines.
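The separation idea can be made concrete with a small sketch. All numbers and operation categories below are invented: the application profile counts abstract operations per unit of work, a platform profile prices each operation, and crossing the two predicts performance on a platform the application never ran on.

```python
# Illustrative sketch of separating application and platform profiles.
# The application profile is measured once; each platform profile comes
# from automated benchmarks. Figures are invented for the example.
app_profile = {"alloc": 1_000, "arith": 50_000, "io": 10}    # ops/request
platform_a  = {"alloc": 80e-9, "arith": 1e-9,   "io": 2e-3}  # seconds/op
platform_b  = {"alloc": 120e-9, "arith": 0.5e-9, "io": 1e-3} # faster-IO box

def predict(app, platform):
    # Predicted time = sum over operation kinds of count * unit cost.
    return sum(count * platform[op] for op, count in app.items())

ta = predict(app_profile, platform_a)
tb = predict(app_profile, platform_b)
print(f"A: {ta:.6f}s  B: {tb:.6f}s")
```

The payoff is that a relocation decision ("will this component be faster on platform B?") needs only the cheap benchmark-derived platform profile, not a full re-measurement of the application.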

    Collaborative compilation

    Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 105-110). Modern optimizing compilers use many heuristic solutions that must be tuned empirically. This tuning is usually done "at the factory" using standard benchmarks. However, applications that are not in the benchmark suite will not achieve the best possible performance, because they are not considered when tuning the compiler. Collaborative compilation alleviates this problem by using local profiling information for at-the-factory style training, allowing users to tune their compilers based on the applications that they use most. It takes advantage of the repeated compilations performed in Java virtual machines to gather performance information from the programs that the user runs. For a single user, this approach may cause undue overhead; for this reason, collaborative compilation allows the sharing of profile information and publishing of the results of tuning. Thus, users see no performance degradation from profiling, only performance improvement due to tuning. This document describes the challenges of implementing collaborative compilation and the solutions we have developed. We present experiments showing that collaborative compilation can be used to gain performance improvements on several compilation problems. In addition, we relate collaborative compilation to previous research and describe directions for future work. By Benjamin R. Wagner. M.Eng. and S.B.
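The sharing step at the heart of collaborative compilation can be sketched in a few lines. This is a hypothetical aggregation, not the thesis's system: users publish (parameter, measured speedup) samples from their own runs, and the community aggregate selects the best-performing setting, so no single user pays the full profiling cost.

```python
# Hypothetical sketch of collaborative tuning: aggregate published
# (setting, speedup) samples and adopt the setting with the best mean.
from collections import defaultdict
from statistics import mean

published = [  # (inline_threshold, speedup vs. default); invented data
    (25, 1.02), (50, 1.08), (50, 1.10), (75, 1.05), (25, 0.99),
]

by_setting = defaultdict(list)
for setting, speedup in published:
    by_setting[setting].append(speedup)

best = max(by_setting, key=lambda s: mean(by_setting[s]))
print(best)  # the setting with the highest mean contributed speedup
```

A production system would also weight samples by workload similarity and guard against noisy or adversarial contributions; the sketch shows only the aggregation.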

    Automating the construction of compiler heuristics using machine learning

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 153-162). Compiler writers are expected to create effective and inexpensive solutions to NP-hard problems such as instruction scheduling and register allocation. To make matters worse, separate optimization phases have strong interactions and competing resource constraints. Compiler writers deal with system complexity by dividing the problem into multiple phases and devising approximate heuristics for each phase. However, to achieve satisfactory performance, developers are forced to manually tweak their heuristics with trial-and-error experimentation. In this dissertation I present meta optimization, a methodology for automatically constructing high quality compiler heuristics using machine learning techniques. This thesis describes machine-learned heuristics for three important compiler optimizations: hyperblock formation, register allocation, and loop unrolling. The machine-learned heuristics outperform (by as much as 3x in some cases) their state-of-the-art hand-crafted counterparts. By automatically collecting data and systematically analyzing them, my techniques discover subtle interactions that even experienced engineers would likely overlook. In addition to improving performance, my techniques can significantly reduce the human effort involved in compiler design. Machine learning algorithms can design critical portions of compiler heuristics, thereby freeing the human designer to focus on compiler correctness. The progression of experiments I conduct in this thesis leads to collaborative compilation, an approach which enables ordinary users to transparently train compiler heuristics by running their applications as they normally would. The collaborative system automatically adapts itself to the applications in which a community of users is interested. By Mark W. Stephenson. Ph.D.
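The meta-optimization idea, searching over candidate heuristics rather than hand-tweaking one, can be sketched with a toy loop-unrolling example. Everything here is invented for illustration (the cost model, the loop set, the rule family); the thesis uses real machine-learning techniques and measured performance instead.

```python
# Toy meta optimization: search a family of unrolling rules
# ("unroll by 4 when body_size < threshold") against a synthetic benefit
# model, keeping the rule that scores best. All numbers are invented.
loops = [  # (trip_count, body_size)
    (1000, 4), (8, 40), (200, 12), (3, 6),
]

def benefit(unroll, trip, size):
    # Invented model: unrolling helps hot loops but bloats large bodies.
    if unroll == 1:
        return 0.0
    return trip * 0.001 - size * unroll * 0.01

def heuristic_score(threshold):
    # Total benefit of applying the rule across the loop set.
    return sum(benefit(4 if size < threshold else 1, trip, size)
               for trip, size in loops)

best = max(range(2, 64), key=heuristic_score)
print(best, round(heuristic_score(best), 2))
```

The search discovers that only the hot, small-bodied loop is worth unrolling, which is exactly the kind of subtle trade-off the abstract says engineers tend to overlook.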

    Optimizations for Accelerating Application Start-up in Programming Language Runtimes

    Thesis (Ph.D.)--Seoul National University, Dept. of Electrical and Computer Engineering, 2015. 8. Soo-Mook Moon. Runtime environments for programming languages such as Java and JavaScript are widely used as embedded software platforms because of the portability they give applications. Java applications are distributed as bytecode and run on digital TVs and the Android platform, while JavaScript is executed as source code on the web platform. Portability through a language runtime, however, carries an inherent performance cost, because the application's bytecode or source code is executed by software, such as an interpreter, rather than by hardware. To recover performance, runtimes therefore employ just-in-time (JIT) compilers, which translate bytecode or source code into machine code during execution, and optimizations specialized for repeated behavior, such as inline caching. However, Java applications on embedded systems and the JavaScript executed while loading a web page exhibit start-up behavior, marked by rapid change, rather than steady-state behavior: such executions are comparatively short, repeat the same operations less often, and rarely contain hot spots that dominate execution time. JIT compilers and repetition-oriented optimizations, which are effective on hot spots, therefore struggle to improve the performance of application start-up.
    This thesis proposes a hot spot detection scheme that acts on execution-time estimates more precise than those of previous approaches, improving the performance delivered by the Java JIT compiler when hot spots are unclear. As a result, the first-run time of benchmark programs exhibiting start-up behavior improved by about 10% over the hot spot detection of the HotSpot Java virtual machine, and the start-up time of a real application, an Xlet distributed over digital broadcast, improved by about 7%. To reduce the memory consumed by the machine code a JavaScript JIT compiler generates, the thesis also proposes emitting code optimized for a reduced instruction set, shrinking code size by about 29%; this can be even more effective for the large amount of JavaScript executed while starting a web page. Furthermore, in an environment that executes JavaScript with a JIT compiler alone, web page JavaScript start-up was found to slow down; to address this, interpreter-based execution with selective compilation minimizes the slowdown caused by the JIT compiler. Finally, an analysis of web page JavaScript start-up behavior leads to bytecode-level optimizations that accelerate accesses to frequently occurring objects: while adding a JIT compiler on top of the interpreter did not improve web page JavaScript start-up, the proposed bytecode-level optimizations sped execution by about 3%, proving more effective for web page JavaScript start-up.
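The baseline this thesis improves on, counter-based hot spot detection, can be sketched briefly. This is an illustrative model of the simple heuristic, not the thesis's flow-sensitive runtime estimation: each method carries an invocation counter, and the JIT is triggered once the counter crosses a threshold. The threshold here is invented; real VMs use much larger values.

```python
# Illustrative counter-based hot spot detection: compile a method after it
# has been invoked JIT_THRESHOLD times. Threshold is invented for the demo.
JIT_THRESHOLD = 10

class Method:
    def __init__(self, name):
        self.name, self.counter, self.compiled = name, 0, False

    def invoke(self):
        if not self.compiled:
            self.counter += 1
            if self.counter >= JIT_THRESHOLD:
                self.compiled = True  # hand off to the JIT compiler
        return self.compiled          # True once running compiled code

m = Method("render")
results = [m.invoke() for _ in range(12)]
print(results.count(True))  # compiled from the 10th call onward
```

The weakness during start-up is visible even in the sketch: a method called nine times does all its work interpreted. Estimating actual time spent, rather than counting invocations, is what lets the thesis trigger compilation more precisely when no clear hot spot exists.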

    Vapor SIMD: Auto-Vectorize Once, Run Everywhere

    Just-in-Time (JIT) compiler technology offers portability while facilitating target- and context-specific specialization. Single-Instruction-Multiple-Data (SIMD) hardware is ubiquitous and markedly diverse, but can be difficult for JIT compilers to efficiently target due to resource and budget constraints. We present our design for a synergistic auto-vectorizing compilation scheme. The scheme is composed of an aggressive, generic offline stage coupled with a lightweight, target-specific online stage. Our method leverages the optimized intermediate results provided by the first stage across disparate SIMD architectures from different vendors, having distinct characteristics ranging from different vector sizes, memory alignment and access constraints, to special computational idioms. We demonstrate the effectiveness of our design using a set of kernels that exercise innermost loop, outer loop, as well as straight-line code vectorization, all automatically extracted by the common offline compilation stage. This results in performance comparable to that provided by specialized monolithic offline compilers. Our framework is implemented using open-source tools and standards, thereby promoting interoperability and extendibility.
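The offline/online split can be sketched abstractly. This is a hypothetical model, not the paper's framework: the offline stage emits a width-agnostic vector plan once, and the online (JIT) stage lowers that same plan to whatever native SIMD width the target supports, handling the remainder lanes with scalar code.

```python
# Hypothetical split-compilation sketch: vectorize once offline into
# width-agnostic ops, then lower online to the target's SIMD width.
def offline_vectorize(n):
    # Offline stage: a width-agnostic plan to add two n-element arrays.
    return [("vadd", i) for i in range(n)]

def online_lower(plan, simd_width):
    # Online stage: group lanes into native-width chunks; leftover lanes
    # fall back to scalar adds.
    full, rem = divmod(len(plan), simd_width)
    return ["vadd.%d" % simd_width] * full + ["scalar_add"] * rem

plan = offline_vectorize(10)
print(online_lower(plan, 4))  # two 4-wide ops plus two scalar remainders
print(online_lower(plan, 8))  # one 8-wide op plus two scalar remainders
```

The same offline plan serves a 128-bit and a 256-bit target without re-running the expensive vectorization analysis, which is the "auto-vectorize once, run everywhere" claim in miniature.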