11 research outputs found

    A Multilevel Introspective Dynamic Optimization System For Holistic Power-Aware Computing

    Get PDF
    Power consumption is rapidly becoming the dominant limiting factor for further improvements in computer design. Curiously, this applies both at the "high end" of workstations and servers and at the "low end" of handheld devices and embedded computers. At the high end, the challenge lies in dealing with exponentially growing power densities. At the low end, there is demand to make mobile devices more powerful and longer lasting, but battery technology is not improving at the rate at which power consumption is rising. Traditional power-management research is fragmented: techniques are developed at specific levels without fully exploring their synergy with other levels. Most software techniques target either operating systems or compilers but do not explore the interaction between the two layers, and they have not fully explored the potential of virtual machines for power management. In contrast, we are developing a system that integrates information from multiple levels of software and hardware, connecting these levels through a communication channel. At the heart of this system are a virtual machine that compiles and dynamically profiles code, and an optimizer that reoptimizes all code, including that of applications and the virtual machine itself. We believe this introspective, holistic approach enables more informed power-management decisions.
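The feedback loop the abstract describes — profile dynamically, then let an optimizer decide how aggressively to recompile — can be sketched roughly as follows. All names, metrics, and thresholds here are illustrative assumptions, not the paper's actual system.

```python
# Hypothetical sketch of an introspective power-aware optimization loop:
# a profiler reports per-method energy estimates and invocation counts,
# and the optimizer picks a recompilation level from them.

def choose_opt_level(energy_mj, invocations, levels=(0, 1, 2)):
    """Pick an optimization level from a crude 'energy hotness' metric."""
    cost = energy_mj * invocations  # cumulative energy attributed to the method
    if cost > 1000:
        return levels[2]   # hot and costly: aggressive reoptimization
    if cost > 100:
        return levels[1]   # warm: baseline optimizing compile
    return levels[0]       # cold: leave unoptimized

# Profile data (method -> (energy per call in mJ, invocation count)) is assumed.
profile = {"render": (0.8, 5000), "parse_cfg": (0.5, 40)}
plan = {m: choose_opt_level(e, n) for m, (e, n) in profile.items()}
```

A real system would feed such decisions back through the communication channel between layers; the point of the sketch is only that the decision consumes profile data from more than one level.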

    Crystal gazer : profile-driven write-rationing garbage collection for hybrid memories

    Get PDF
    Non-volatile memories (NVM) offer greater capacity than DRAM but suffer from high latency and low write endurance. Hybrid memories combine DRAM and NVM to form scalable memory systems with the promise of high capacity, low energy consumption, and high endurance. Automatically managing hybrid NVM-DRAM memories to achieve their promise without changing user applications or their programming models remains an open question. This paper uses garbage collection in managed languages to exploit NVM capacity while preventing NVM wear-out in hybrid memories, with no changes to the programming model. We introduce profile-driven write-rationing garbage collection. Allocation sites that produce frequently written objects are predicted based on previous program executions. Objects are initially allocated in a DRAM nursery space. The collector copies surviving nursery objects from highly written sites to a mature DRAM space and read-mostly objects to a mature NVM space. Write-intensity prediction for 15 Java benchmarks accurately places objects in the correct space, eliminating the expensive object monitoring of prior write-rationing garbage collectors. Furthermore, our technique exposes a Pareto tradeoff between DRAM usage and NVM lifetime, unlike prior work. Experimental results on NUMA hardware that emulates hybrid NVM-DRAM memory demonstrate that profile-driven write-rationing garbage collection reduces the number of writes to NVM compared to prior work to extend its lifetime, maximizes the use of NVM for its capacity, and achieves good performance.
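The placement policy described above — survivors from write-hot allocation sites go to mature DRAM, the rest to mature NVM — can be sketched as follows. This is an illustrative sketch, not the paper's collector; the site names and classification set are hypothetical, standing in for the profile gathered in a previous run.

```python
# Illustrative sketch of profile-driven write-rationing placement:
# allocation sites classified as write-hot in an earlier profiling run
# steer surviving nursery objects to mature DRAM, while objects from
# read-mostly sites go to mature NVM to exploit its capacity.

WRITE_HOT_SITES = {"Cache.put", "Buffer.append"}  # from a prior run's profile

def promote(survivors):
    """Choose a mature space for each nursery survivor by its allocation site."""
    placement = {}
    for obj_id, alloc_site in survivors:
        space = "DRAM" if alloc_site in WRITE_HOT_SITES else "NVM"
        placement[obj_id] = space
    return placement

# Survivors of a nursery collection: (object id, allocation site).
survivors = [(1, "Cache.put"), (2, "Parser.token"), (3, "Buffer.append")]
placement = promote(survivors)
```

Because the classification is made per allocation site from an earlier execution, no per-object write monitoring is needed at run time, which is the cost the paper says it eliminates.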

    Phase-based adaptive recompilation in a JVM

    Full text link

    Just-In-Time Compilation: History, Architecture, Principles, and Systems

    Get PDF
    Many implementations of high-level languages focus on systems built around just-in-time (JIT) compilation. This mechanism has the appeal of improving the performance of such languages while preserving portability, but at the price of adding compilation time to total execution time. Research in the area has therefore focused on balancing compilation cost against execution efficiency. The first just-in-time compilation systems employed static strategies to select and optimize the code regions likely to yield good performance. More sophisticated systems refined these strategies in order to apply optimizations more judiciously. This tutorial presents the principles underlying just-in-time compilation and its evolution over the years, as well as the approaches several systems use to balance cost and efficiency. Although it is difficult to single out the best approach, recent work suggests that strict strategies for code detection and optimization, together with the parallelism offered by multi-core architectures, will form the basis of future just-in-time compilation systems.
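The "static strategies to select code regions" that early JIT systems used are classically counter-based: interpret a method until its invocation counter crosses a threshold, then compile it. A minimal sketch of that idea, with an artificially small threshold for illustration (real VMs use much larger, tuned values):

```python
# Minimal sketch (illustrative, not from the tutorial) of counter-based
# hot-spot detection: a method runs interpreted until its invocation
# counter reaches a threshold, after which it runs as machine code.

COMPILE_THRESHOLD = 3  # deliberately tiny; production VMs use thousands

class TinyJit:
    def __init__(self):
        self.counters = {}
        self.compiled = set()

    def invoke(self, method):
        """Return how this invocation of `method` would be executed."""
        if method in self.compiled:
            return "machine code"
        self.counters[method] = self.counters.get(method, 0) + 1
        if self.counters[method] >= COMPILE_THRESHOLD:
            self.compiled.add(method)  # promote the hot method for next time
        return "interpreted"

jit = TinyJit()
modes = [jit.invoke("hot_loop") for _ in range(5)]
```

The balancing act the tutorial describes lives in that threshold: too low and compilation cost dominates, too high and hot code spends too long in the interpreter.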

    ์–ดํ”Œ๋ฆฌ์ผ€์ด์…˜ ๋‹ค์šด๋กœ๋”ฉ ์‹œ์Šคํ…œ์„ ์œ„ํ•œ ํด๋ผ์ด์–ธํŠธ ์„ ํ–‰ ์ปดํŒŒ์ผ๋Ÿฌ

    Get PDF
    Ph.D. dissertation, Seoul National University, Department of Electrical and Computer Engineering, August 2014. Advisor: Soo-Mook Moon.
    App-downloading systems such as DTVs and smartphones are in widespread use, and virtual machines are the mainstream execution environment on these systems. One critical problem of app-downloading systems is performance, because apps are executed by an interpreter. A popular solution is the just-in-time compiler (JITC), which translates bytecode to machine code at runtime and therefore suffers from runtime compilation overhead. We propose a client ahead-of-time compiler (c-AOTC) that improves performance by removing this overhead. c-AOTC saves the machine code that the JITC generates for a method in persistent storage on the device and reuses it in later runs: when the method is invoked again in a later run of the program, the cached machine code is loaded and executed directly, without any translation overhead. One major issue in c-AOTC is relocation, because some of the address constants embedded in the cached machine code are no longer correct when the code is loaded and used in a different run; those addresses must be corrected before they are used. Constant pool resolution complicates the relocation problem, and we propose solutions for it. The persistent-storage overhead of saving the relocation information is also an issue, so we propose a technique to encode the relocation information and compress the machine code efficiently. We implemented c-AOTC on Oracle's CVM, the reference implementation of the CDC virtual machine, and evaluation results indicate that c-AOTC improves benchmark performance by an average of 12%. We also ported the c-AOTC approach to a commercial DTV platform and tested it on real xlet applications from commercial broadcasting stations, where it achieved an average 33% performance improvement. Google's V8 JavaScript VM uses no interpreter; apps are executed only by the JITC. We applied c-AOTC to V8 but obtained no meaningful performance improvement, a consequence of V8's heavy use of internal objects. Internal objects are created by the compiler and used during compilation, and they are also accessed by the running JavaScript program; most V8 components are created as internal objects, far more than in other kinds of VMs, and the machine code V8 generates addresses internal objects directly. Because these objects differ between runs and are needed every time an application executes, c-AOTC must recreate them on every run. Since most of V8's compilation overhead is internal-object creation, c-AOTC cannot obtain sufficient improvement in this environment.
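The relocation step the abstract describes — cached machine code whose embedded address constants must be patched to the VM's current values before reuse — can be sketched as below. The byte layout, file structure, and names are illustrative assumptions, not the thesis's actual .aotc format.

```python
# Hypothetical sketch of c-AOTC-style relocation: machine code is cached
# together with a list of (offset, symbol) entries; on a later run, each
# recorded offset is patched with that run's address for the symbol.

import struct

def save_code(machine_code, reloc_offsets):
    """Bundle machine code with its relocation entries (the .aotc idea)."""
    return {"code": bytearray(machine_code), "relocs": reloc_offsets}

def load_and_relocate(cached, symbol_addresses):
    """Patch each recorded offset with the current run's symbol address."""
    code = bytearray(cached["code"])
    for offset, symbol in cached["relocs"]:
        # Overwrite the stale 32-bit little-endian address constant in place.
        struct.pack_into("<I", code, offset, symbol_addresses[symbol])
    return bytes(code)

# First run: the address of 'constant_pool' (0x1000) was embedded at
# offsets 0 and 8 of the generated code; 'AAAA' stands for other bytes.
cached = save_code(b"\x00\x10\x00\x00AAAA\x00\x10\x00\x00",
                   [(0, "constant_pool"), (8, "constant_pool")])
# Later run: the pool now lives at 0x2000, so both slots are patched.
patched = load_and_relocate(cached, {"constant_pool": 0x2000})
```

The thesis's encoding and compression work then amounts to storing that relocation list compactly, since keeping it verbatim inflates the cached files.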

    Coupling On-Line and Off-Line Profile Information to Improve Program Performance

    No full text
    In this paper, we describe a novel execution environment for Java programs that substantially improves execution performance by incorporating both on-line and off-line profile information to guide dynamic optimization. By using both types of profile collection techniques, we are able to exploit the strengths of each constituent approach: profile accuracy and low overhead. Such coupling also reduces the negative impact of these approaches when each is used in isolation. On-line profiling introduces overhead for dynamic instrumentation, measurement, and decision making. Off-line profile information can be inaccurate when program inputs for execution and optimization differ from those used for profiling. To combat these drawbacks and to achieve the benefits of both on-line and off-line profiling, we developed a dynamic compilation system (based on JikesRVM) that makes use of both. As a result, we are able to improve Java program performance by 9% on average for the programs studied.
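One simple way to couple the two profile sources is to treat the off-line profile as a prior and blend in cheap on-line samples, so decisions adapt when the current input behaves differently from the training inputs. The sketch below is illustrative only — the function, weights, and method names are hypothetical, not the paper's mechanism.

```python
# Illustrative sketch of coupling off-line and on-line profiles:
# start from an off-line hotness profile gathered on training inputs,
# then blend in on-line samples from the current run.

def blend(offline, online, online_weight=0.5):
    """Blend two normalized hotness profiles into one estimate."""
    methods = set(offline) | set(online)
    return {m: (1 - online_weight) * offline.get(m, 0.0)
               + online_weight * online.get(m, 0.0)
            for m in methods}

offline = {"sort": 0.7, "parse": 0.3}   # off-line: from training runs
online = {"sort": 0.2, "render": 0.8}   # on-line: sampled in this run
hotness = blend(offline, online)
hottest = max(hotness, key=hotness.get) # candidate to optimize first
```

The off-line prior keeps overhead low (little instrumentation needed up front), while the on-line term corrects it when the inputs diverge — the two failure modes the abstract identifies.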

    ENERGY-AWARE OPTIMIZATION FOR EMBEDDED SYSTEMS WITH CHIP MULTIPROCESSOR AND PHASE-CHANGE MEMORY

    Get PDF
    Over the last two decades, the functions of embedded systems have evolved from simple real-time control and monitoring to more complicated services. Embedded systems equipped with powerful chips can provide the performance that computationally demanding information-processing applications need. However, because of power constraints, the easy way of gaining performance by scaling up chip frequencies is no longer feasible, and low-power architecture design has recently become the main trend in embedded systems. In this dissertation, we present our approaches to the energy-related issues in embedded system design: thermal issues in 3D chip multiprocessors (CMPs), the endurance issue in phase-change memory (PCM), the battery issue, the impact of inaccurate information, and the use of cloud computing to move workloads to remote facilities. We propose a real-time constrained task scheduling method to reduce peak temperature on a 3D CMP, comprising an online 3D CMP temperature prediction model and a set of algorithms for scheduling tasks to different cores so as to minimize the peak on-chip temperature. To address the challenges of applying PCM in embedded systems, we propose a PCM main-memory optimization mechanism that utilizes scratchpad memory (SPM), and furthermore an MLC/SLC configuration optimization algorithm to enhance the efficiency of hybrid DRAM + PCM memory. We also propose an energy-aware task scheduling algorithm for parallel computing in battery-powered mobile systems. Because scheduling decisions in embedded systems rest on information such as the estimated execution time of tasks, we design a method for evaluating the impact of inaccurate information on resource allocation.
Finally, to move workload from embedded systems to remote cloud computing facilities, we present a resource optimization mechanism for heterogeneous federated multi-cloud systems, together with two online dynamic algorithms for resource allocation and task scheduling that take resource contention into account.
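The peak-temperature-aware scheduling idea can be illustrated with a toy greedy placement: assign each task to the core whose predicted temperature stays lowest. This is a deliberately simplified sketch under an additive heat model, not the dissertation's prediction model or algorithms.

```python
# Minimal sketch (illustrative only) of thermal-aware task assignment on
# a multicore: place the hottest tasks first, each on the core with the
# lowest predicted temperature, to keep the peak down.

def assign_tasks(task_heat, n_cores):
    """Greedy placement minimizing the predicted peak core temperature."""
    core_temp = [0.0] * n_cores                   # simplistic additive model
    placement = []
    for heat in sorted(task_heat, reverse=True):  # hottest tasks first
        core = min(range(n_cores), key=lambda c: core_temp[c])
        core_temp[core] += heat
        placement.append((heat, core))
    return placement, max(core_temp)

placement, peak = assign_tasks([5.0, 3.0, 2.0, 2.0], n_cores=2)
```

A real 3D CMP scheduler must also honor real-time deadlines and model vertical heat coupling between stacked layers, which is what distinguishes the dissertation's prediction model from this flat sketch.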