218 research outputs found

    Optimization Coaching for JavaScript

    The performance of dynamic object-oriented programming languages such as JavaScript depends heavily on highly optimizing just-in-time compilers. Such compilers, like all compilers, can silently fall back to generating conservative, low-performance code during optimization. As a result, programmers may inadvertently cause performance issues on users' systems by making seemingly inoffensive changes to programs. This paper shows how to solve the problem of silent optimization failures. It specifically explains how to create a so-called optimization coach for an object-oriented just-in-time-compiled programming language. The development and evaluation build on the SpiderMonkey JavaScript engine, but the results should generalize to a variety of similar platforms.
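The core mechanism an optimization coach relies on can be sketched as follows. This is a minimal, hypothetical model (not SpiderMonkey's actual interface): a profile records the object "shapes" observed at each property-access site, and the coach flags sites whose type diversity likely defeats inline caching, causing the silent fallback the abstract describes. The threshold and site names are invented for illustration.

```python
# Hypothetical sketch of an optimization-coach profile: flag property-access
# sites where too many object shapes were seen, so the JIT's inline caches
# would likely give up and fall back to a slow generic path.
from collections import defaultdict

MEGAMORPHIC_THRESHOLD = 4  # assumed cutoff; real engines vary

class CoachProfile:
    def __init__(self):
        self.site_types = defaultdict(set)  # site id -> observed shapes

    def observe(self, site, shape):
        self.site_types[site].add(shape)

    def report(self):
        # Sites whose shape diversity exceeds the inline-cache capacity.
        return [site for site, shapes in self.site_types.items()
                if len(shapes) > MEGAMORPHIC_THRESHOLD]

profile = CoachProfile()
for shape in ["A", "B", "C", "D", "E"]:
    profile.observe("obj.x@line12", shape)   # megamorphic site
profile.observe("obj.y@line30", "A")         # monomorphic site
print(profile.report())  # only the megamorphic site is flagged
```

A real coach would attach source locations and suggested fixes to each flagged site; the point here is only the shape-counting idea.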

    Approaches to Interpreter Composition

    In this paper, we compose six different Python and Prolog VMs into four pairwise compositions: one using C interpreters; one running on the JVM; one using meta-tracing interpreters; and one using a C interpreter and a meta-tracing interpreter. We show that programs that cross the language barrier frequently execute faster in a meta-tracing composition, and that meta-tracing imposes a significantly lower overhead on composed programs relative to mono-language programs. (Comment: 33 pages, 1 figure, 9 tables.)
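The "language barrier" the abstract measures can be illustrated with a toy composition. This is an invented sketch, not the paper's implementation: a host interpreter evaluates arithmetic trees and delegates any node tagged as foreign to a guest interpreter, which is the kind of crossing whose overhead the paper quantifies.

```python
# Toy interpreter composition (illustrative only): a host arithmetic
# interpreter delegates "foreign" subterms to a separate guest interpreter.
def guest_eval(expr):
    # Guest interpreter: understands a single operation, 'double'.
    op, arg = expr
    assert op == "double"
    return 2 * arg

def host_eval(node):
    # Host interpreter: integer literals and binary '+', plus a
    # cross-language call for nodes tagged "foreign".
    if isinstance(node, int):
        return node
    op, *args = node
    if op == "+":
        return host_eval(args[0]) + host_eval(args[1])
    if op == "foreign":
        return guest_eval(args[0])  # the language-barrier crossing
    raise ValueError(op)

print(host_eval(("+", 1, ("foreign", ("double", 20)))))  # 41
```

In a C-interpreter composition each such crossing pays conversion and dispatch costs; a meta-tracing JIT can trace straight through `guest_eval` and compile the crossing away, which is why composed programs can run faster there.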

    Quantifying and Predicting the Influence of Execution Platform on Software Component Performance

    The performance of software components depends on several factors, including the execution platform on which the software components run. To simplify cross-platform performance prediction in relocation and sizing scenarios, a novel approach is introduced in this thesis which separates the application performance profile from the platform performance profile. The approach is evaluated using transparent instrumentation of Java applications and with automated benchmarks for Java Virtual Machines.
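The separation idea can be made concrete with a small sketch. All numbers and operation categories below are invented: the application profile counts abstract operations per unit of work, a platform profile prices each operation, and crossing the two predicts performance on a platform the application never ran on.

```python
# Illustrative sketch of separating application and platform profiles.
# The application profile is measured once; each platform profile comes
# from automated benchmarks. Figures are invented for the example.
app_profile = {"alloc": 1_000, "arith": 50_000, "io": 10}    # ops/request
platform_a  = {"alloc": 80e-9, "arith": 1e-9,   "io": 2e-3}  # seconds/op
platform_b  = {"alloc": 120e-9, "arith": 0.5e-9, "io": 1e-3} # faster-IO box

def predict(app, platform):
    # Predicted time = sum over operation kinds of count * unit cost.
    return sum(count * platform[op] for op, count in app.items())

ta = predict(app_profile, platform_a)
tb = predict(app_profile, platform_b)
print(f"A: {ta:.6f}s  B: {tb:.6f}s")
```

The payoff is that a relocation decision ("will this component be faster on platform B?") needs only the cheap benchmark-derived platform profile, not a full re-measurement of the application.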

    Collaborative compilation

    Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 105-110). Modern optimizing compilers use many heuristic solutions that must be tuned empirically. This tuning is usually done "at the factory" using standard benchmarks. However, applications that are not in the benchmark suite will not achieve the best possible performance, because they are not considered when tuning the compiler. Collaborative compilation alleviates this problem by using local profiling information for at-the-factory style training, allowing users to tune their compilers based on the applications that they use most. It takes advantage of the repeated compilations performed in Java virtual machines to gather performance information from the programs that the user runs. For a single user, this approach may cause undue overhead; for this reason, collaborative compilation allows the sharing of profile information and publishing of the results of tuning. Thus, users see no performance degradation from profiling, only performance improvement due to tuning. This document describes the challenges of implementing collaborative compilation and the solutions we have developed. We present experiments showing that collaborative compilation can be used to gain performance improvements on several compilation problems. In addition, we relate collaborative compilation to previous research and describe directions for future work. By Benjamin R. Wagner. M.Eng. and S.B.
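The sharing step at the heart of collaborative compilation can be sketched in a few lines. This is a hypothetical aggregation, not the thesis's system: users publish (parameter, measured speedup) samples from their own runs, and the community aggregate selects the best-performing setting, so no single user pays the full profiling cost.

```python
# Hypothetical sketch of collaborative tuning: aggregate published
# (setting, speedup) samples and adopt the setting with the best mean.
from collections import defaultdict
from statistics import mean

published = [  # (inline_threshold, speedup vs. default); invented data
    (25, 1.02), (50, 1.08), (50, 1.10), (75, 1.05), (25, 0.99),
]

by_setting = defaultdict(list)
for setting, speedup in published:
    by_setting[setting].append(speedup)

best = max(by_setting, key=lambda s: mean(by_setting[s]))
print(best)  # the setting with the highest mean contributed speedup
```

A production system would also weight samples by workload similarity and guard against noisy or adversarial contributions; the sketch shows only the aggregation.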

    Automating the construction of compiler heuristics using machine learning

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 153-162). Compiler writers are expected to create effective and inexpensive solutions to NP-hard problems such as instruction scheduling and register allocation. To make matters worse, separate optimization phases have strong interactions and competing resource constraints. Compiler writers deal with system complexity by dividing the problem into multiple phases and devising approximate heuristics for each phase. However, to achieve satisfactory performance, developers are forced to manually tweak their heuristics with trial-and-error experimentation. In this dissertation I present meta optimization, a methodology for automatically constructing high quality compiler heuristics using machine learning techniques. This thesis describes machine-learned heuristics for three important compiler optimizations: hyperblock formation, register allocation, and loop unrolling. The machine-learned heuristics outperform (by as much as 3x in some cases) their state-of-the-art hand-crafted counterparts. By automatically collecting data and systematically analyzing them, my techniques discover subtle interactions that even experienced engineers would likely overlook. In addition to improving performance, my techniques can significantly reduce the human effort involved in compiler design. Machine learning algorithms can design critical portions of compiler heuristics, thereby freeing the human designer to focus on compiler correctness. The progression of experiments I conduct in this thesis leads to collaborative compilation, an approach which enables ordinary users to transparently train compiler heuristics by running their applications as they normally would. The collaborative system automatically adapts itself to the applications in which a community of users is interested. By Mark W. Stephenson. Ph.D.
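The meta-optimization idea, searching over candidate heuristics rather than hand-tweaking one, can be sketched with a toy loop-unrolling example. Everything here is invented for illustration (the cost model, the loop set, the rule family); the thesis uses real machine-learning techniques and measured performance instead.

```python
# Toy meta optimization: search a family of unrolling rules
# ("unroll by 4 when body_size < threshold") against a synthetic benefit
# model, keeping the rule that scores best. All numbers are invented.
loops = [  # (trip_count, body_size)
    (1000, 4), (8, 40), (200, 12), (3, 6),
]

def benefit(unroll, trip, size):
    # Invented model: unrolling helps hot loops but bloats large bodies.
    if unroll == 1:
        return 0.0
    return trip * 0.001 - size * unroll * 0.01

def heuristic_score(threshold):
    # Total benefit of applying the rule across the loop set.
    return sum(benefit(4 if size < threshold else 1, trip, size)
               for trip, size in loops)

best = max(range(2, 64), key=heuristic_score)
print(best, round(heuristic_score(best), 2))
```

The search discovers that only the hot, small-bodied loop is worth unrolling, which is exactly the kind of subtle trade-off the abstract says engineers tend to overlook.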

    Optimizations for Accelerating Application Start-up in Programming Language Runtimes

    Thesis (Ph.D.)--Seoul National University, Dept. of Electrical and Computer Engineering, 2015. 8. Soo-Mook Moon. Runtime environments for programming languages such as Java and JavaScript are widely used as embedded software platforms because of the portability they give applications. Java applications are distributed as bytecode and run on digital TVs and the Android platform, while JavaScript is executed as source code on the web platform. Portability through a language runtime, however, carries an inherent performance cost, because the application's bytecode or source code is executed by software, such as an interpreter, rather than by hardware. To recover performance, runtimes therefore employ just-in-time (JIT) compilers, which translate bytecode or source code into machine code during execution, and optimizations specialized for repeated behavior, such as inline caching. However, Java applications on embedded systems and the JavaScript executed while loading a web page exhibit start-up behavior, marked by rapid change, rather than steady-state behavior: such executions are comparatively short, repeat the same operations less often, and rarely contain hot spots that dominate execution time. JIT compilers and repetition-oriented optimizations, which are effective on hot spots, therefore struggle to improve the performance of application start-up.
    This thesis proposes a hot spot detection scheme that acts on execution-time estimates more precise than those of previous approaches, improving the performance delivered by the Java JIT compiler when hot spots are unclear. As a result, the first-run time of benchmark programs exhibiting start-up behavior improved by about 10% over the hot spot detection of the HotSpot Java virtual machine, and the start-up time of a real application, an Xlet distributed over digital broadcast, improved by about 7%. To reduce the memory consumed by the machine code a JavaScript JIT compiler generates, the thesis also proposes emitting code optimized for a reduced instruction set, shrinking code size by about 29%; this can be even more effective for the large amount of JavaScript executed while starting a web page. Furthermore, in an environment that executes JavaScript with a JIT compiler alone, web page JavaScript start-up was found to slow down; to address this, interpreter-based execution with selective compilation minimizes the slowdown caused by the JIT compiler. Finally, an analysis of web page JavaScript start-up behavior leads to bytecode-level optimizations that accelerate accesses to frequently occurring objects: while adding a JIT compiler on top of the interpreter did not improve web page JavaScript start-up, the proposed bytecode-level optimizations sped execution by about 3%, proving more effective for web page JavaScript start-up.
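The baseline this thesis improves on, counter-based hot spot detection, can be sketched briefly. This is an illustrative model of the simple heuristic, not the thesis's flow-sensitive runtime estimation: each method carries an invocation counter, and the JIT is triggered once the counter crosses a threshold. The threshold here is invented; real VMs use much larger values.

```python
# Illustrative counter-based hot spot detection: compile a method after it
# has been invoked JIT_THRESHOLD times. Threshold is invented for the demo.
JIT_THRESHOLD = 10

class Method:
    def __init__(self, name):
        self.name, self.counter, self.compiled = name, 0, False

    def invoke(self):
        if not self.compiled:
            self.counter += 1
            if self.counter >= JIT_THRESHOLD:
                self.compiled = True  # hand off to the JIT compiler
        return self.compiled          # True once running compiled code

m = Method("render")
results = [m.invoke() for _ in range(12)]
print(results.count(True))  # compiled from the 10th call onward
```

The weakness during start-up is visible even in the sketch: a method called nine times does all its work interpreted. Estimating actual time spent, rather than counting invocations, is what lets the thesis trigger compilation more precisely when no clear hot spot exists.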

    Vapor SIMD: Auto-Vectorize Once, Run Everywhere

    Just-in-Time (JIT) compiler technology offers portability while facilitating target- and context-specific specialization. Single-Instruction-Multiple-Data (SIMD) hardware is ubiquitous and markedly diverse, but can be difficult for JIT compilers to efficiently target due to resource and budget constraints. We present our design for a synergistic auto-vectorizing compilation scheme. The scheme is composed of an aggressive, generic offline stage coupled with a lightweight, target-specific online stage. Our method leverages the optimized intermediate results provided by the first stage across disparate SIMD architectures from different vendors, having distinct characteristics ranging from different vector sizes, memory alignment and access constraints, to special computational idioms. We demonstrate the effectiveness of our design using a set of kernels that exercise innermost loop, outer loop, as well as straight-line code vectorization, all automatically extracted by the common offline compilation stage. This results in performance comparable to that provided by specialized monolithic offline compilers. Our framework is implemented using open-source tools and standards, thereby promoting interoperability and extendibility.
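The offline/online split can be sketched abstractly. This is a hypothetical model, not the paper's framework: the offline stage emits a width-agnostic vector plan once, and the online (JIT) stage lowers that same plan to whatever native SIMD width the target supports, handling the remainder lanes with scalar code.

```python
# Hypothetical split-compilation sketch: vectorize once offline into
# width-agnostic ops, then lower online to the target's SIMD width.
def offline_vectorize(n):
    # Offline stage: a width-agnostic plan to add two n-element arrays.
    return [("vadd", i) for i in range(n)]

def online_lower(plan, simd_width):
    # Online stage: group lanes into native-width chunks; leftover lanes
    # fall back to scalar adds.
    full, rem = divmod(len(plan), simd_width)
    return ["vadd.%d" % simd_width] * full + ["scalar_add"] * rem

plan = offline_vectorize(10)
print(online_lower(plan, 4))  # two 4-wide ops plus two scalar remainders
print(online_lower(plan, 8))  # one 8-wide op plus two scalar remainders
```

The same offline plan serves a 128-bit and a 256-bit target without re-running the expensive vectorization analysis, which is the "auto-vectorize once, run everywhere" claim in miniature.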