3 research outputs found

    Run-time resource allocation for embedded Multiprocessor System-on-Chip using tree-based design space exploration

    Application workloads in modern MPSoC-based embedded systems are becoming increasingly dynamic. To cope with this dynamism at run time and to improve the efficiency of the underlying system architecture, this paper presents a novel run-time resource allocation algorithm for multimedia applications, with the objective of minimizing energy consumption under predefined deadlines. The algorithm is based on a novel tree-based design space exploration (DSE) method that operates in two phases: design time and run time. At design time, application clustering is combined with the tree-based DSE; at run time, feature extraction and application classification are performed using well-known machine learning techniques. We evaluated the algorithm on a heterogeneous MPSoC system with several applications that have different communication and computation behaviors. Our experimental results show that, at run time, more than 91% of the applications were classified correctly by the proposed algorithm to select the best resources for allocation, which clearly confirms that the algorithm is effective.
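The abstract does not spell out which features or classifier are used beyond "well-known machine learning techniques", so the following is only a minimal sketch of the design-time/run-time split it describes: design-time clusters each carry a pre-explored allocation, and a run-time nearest-centroid classification of an application's profile selects one of them. All names, feature choices, centroids, and allocation values below are hypothetical placeholders.

```python
# Sketch only: run-time classification of an application against design-time clusters,
# each cluster holding an energy-minimizing allocation found by the tree-based DSE.
import math

# Hypothetical design-time artifacts in a (computation share, communication share) feature space.
CLUSTERS = {
    "compute_heavy": {"centroid": (0.9, 0.2), "allocation": {"cores": 4, "freq_mhz": 800}},
    "comm_heavy":    {"centroid": (0.3, 0.8), "allocation": {"cores": 2, "freq_mhz": 600}},
    "balanced":      {"centroid": (0.5, 0.5), "allocation": {"cores": 3, "freq_mhz": 700}},
}

def extract_features(profile):
    """Reduce a run-time execution profile to the feature vector used for classification."""
    total = profile["compute_cycles"] + profile["comm_cycles"]
    return (profile["compute_cycles"] / total, profile["comm_cycles"] / total)

def classify_and_allocate(profile):
    """Nearest-centroid classification; returns the stored allocation of the closest cluster."""
    features = extract_features(profile)
    name, data = min(CLUSTERS.items(),
                     key=lambda kv: math.dist(features, kv[1]["centroid"]))
    return name, data["allocation"]

if __name__ == "__main__":
    # A newly started, mostly compute-bound multimedia application.
    print(classify_and_allocate({"compute_cycles": 8_000_000, "comm_cycles": 1_500_000}))
```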

    Adaptive Mapping for Multiple Applications on Parallel Architectures

    We propose a novel adaptive approach capable of handling the dynamism of a set of applications on a network-on-chip. The applications are subject to throughput or energy consumption constraints. For each application, a set of non-dominated (Pareto) schedules is computed at design time in the (energy, period, processors) space for different core topologies. Then, upon the start or end of an application, a lightweight adaptive run-time scheduler reconfigures the mapping of the live applications according to the available resources (i.e., the available cores of the network-on-chip). This run-time scheduler selects the best topology for each application and maps the applications onto the network-on-chip using the Tetris algorithm. The approach is adaptive: it changes the mapping of applications during their execution and thus delivers just enough power to meet the application constraints.
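As a rough illustration of the run-time step described above, the sketch below reselects one design-time Pareto point (energy, period, processors) per live application whenever an application starts or ends, greedily fitting the choices into the cores that remain free. This is an assumption-laden simplification: the actual topology selection and the Tetris placement onto the NoC are not modelled, and all application names and numbers are invented.

```python
# Sketch only: per-application Pareto-point selection under a core budget.
from typing import NamedTuple, Optional

class ParetoPoint(NamedTuple):
    energy: float      # energy per iteration (arbitrary units)
    period: float      # achieved period (ms)
    processors: int    # cores required by this schedule

def pick_point(points, period_limit, free_cores) -> Optional[ParetoPoint]:
    """Lowest-energy point that meets the period constraint and fits in the free cores."""
    feasible = [p for p in points if p.period <= period_limit and p.processors <= free_cores]
    return min(feasible, key=lambda p: p.energy) if feasible else None

def reconfigure(live_apps, total_cores):
    """Called when an application starts or ends: reselect a point for every live app."""
    free = total_cores
    mapping = {}
    for name, (points, period_limit) in live_apps.items():
        choice = pick_point(points, period_limit, free)
        if choice is None:
            raise RuntimeError(f"no feasible schedule for {name}")
        mapping[name] = choice
        free -= choice.processors
    return mapping

if __name__ == "__main__":
    apps = {
        "video": ([ParetoPoint(10.0, 30.0, 4), ParetoPoint(14.0, 20.0, 6)], 25.0),
        "audio": ([ParetoPoint(2.0, 10.0, 1), ParetoPoint(1.5, 12.0, 2)], 12.0),
    }
    print(reconfigure(apps, total_cores=8))
```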

    ๋งค๋‹ˆ์ฝ”์–ด ๊ฐ€์†๊ธฐ์˜ ๊ฒฐํ•จ์„ ๊ณ ๋ คํ•œ ํƒœ์Šคํฌ ๋งคํ•‘ ๋ฐ ์ž์› ๊ด€๋ฆฌ ๊ธฐ๋ฒ•

    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2014. 8. ํ•˜์ˆœํšŒ.๊ธฐ์ˆ ์ด ๋ฐœ์ „ํ•จ์— ๋”ฐ๋ผ ํ•˜๋‚˜์˜ ์นฉ ์•ˆ์— ์ง‘์ ๋˜๋Š” ํ”„๋กœ์„ธ์„œ์˜ ๊ฐฏ์ˆ˜๊ฐ€ ์ ์  ์ฆ๊ฐ€ํ•˜๊ฒŒ ๋˜์—ˆ๋‹ค. ๋˜ํ•œ, ์‘์šฉ๋“ค์˜ ๋ณด๋‹ค ๋†’์€ ์—ฐ์‚ฐ ๋Šฅ๋ ฅ์— ๋Œ€ํ•œ ์š”๊ตฌ๋กœ ์ธํ•ด ๋งค๋‹ˆ์ฝ”์–ด ๊ฐ€์†๊ธฐ๋Š” ์‹œ์Šคํ…œ-์˜จ-์นฉ์—์„œ ์ค‘์š”ํ•œ ์—ฐ์‚ฐ ์žฅ์น˜๊ฐ€ ๋˜์—ˆ๋‹ค. ์‹œ์Šคํ…œ์˜ ์ƒํƒœ๊ฐ€ ์—ฌ๋Ÿฌ๊ฐ€์ง€ ์š”์ธ์— ์˜ํ•ด ๋™์ ์œผ๋กœ ๋ณ€ํ•˜๊ธฐ ๋•Œ๋ฌธ์—, ์‹œ์Šคํ…œ ์ˆ˜ํ–‰์ค‘์— ๊ทธ๋Ÿฌํ•œ ๊ฐ€์†๊ธฐ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ๋‹ค๋ฃจ๋Š” ๊ฒƒ์€ ๋งค์šฐ ์–ด๋ ค์šด ๋ฌธ์ œ์ด๋‹ค. ์‹œ์Šคํ…œ ์ˆ˜์ค€์—์„œ๋Š” ์‘์šฉ๋“ค์ด ์‚ฌ์šฉ์ž์˜ ์š”๊ตฌ์— ๋”ฐ๋ผ ์‹œ์ž‘ ๋˜๋Š” ์ข…๋ฃŒ๊ฐ€ ๋˜๊ณ , ์‘์šฉ ๋ ˆ๋ฒจ์—์„œ๋Š” ์‘์šฉ ์ž์ฒด์˜ ๋™์ž‘์ด ์ž…๋ ฅ ๋ฐ์ดํƒ€๋‚˜ ์ˆ˜ํ–‰๋ชจ๋“œ์— ๋”ฐ๋ผ ๋™์ ์œผ๋กœ ๋ณ€ํ•˜๊ฒŒ ๋œ๋‹ค. ์•„ํ‚คํ…์ฒ˜ ์ˆ˜์ค€์—์„œ๋Š” ํ”„๋กœ์„ธ์„œ์˜ ์˜๊ตฌ ๊ณ ์žฅ์œผ๋กœ ์ธํ•ด ํ•˜๋“œ์›จ์–ด ์ปดํฌ๋„ŒํŠธ์˜ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ์ƒํ™ฉ์ด ๋ณ€ํ•˜๊ฒŒ ๋œ๋‹ค. ๋ณธ ํ•™์œ„๋…ผ๋ฌธ์—์„œ๋Š” ๊ฐ€์†๊ธฐ๋ฅผ ๋‹ค๋ฃจ๋Š”๋ฐ ์žˆ์–ด์„œ์˜ ์œ„์™€ ๊ฐ™์€ ์–ด๋ ค์›€๋“ค์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ์„ธ๊ฐ€์ง€ ๊ธฐ๋ฒ•์„ ์ œ์‹œํ•˜์˜€๋‹ค. ์ฒซ๋ฒˆ์งธ ๊ธฐ๋ฒ•์€ ํ”„๋กœ์„ธ์„œ์˜ ์˜๊ตฌ ๊ณ ์žฅ์ด ๋ฐœ์ƒํ•˜์˜€์„ ๋•Œ, ์ „์ฒด ์‘์šฉ๋“ค์„ ์‹œ๊ฐ„ ์ œ์•ฝ ํ•˜์— ์ฒ˜๋ฆฌ๋Ÿ‰์˜ ์ €ํ•˜๋ฅผ ์ตœ์†Œํ™”ํ•˜๋ฉฐ ์žฌ์Šค์ผ€์ฅด์„ ํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์ตœ์ ์˜ ์žฌ์Šค์ผ€์ฅด ๊ฒฐ๊ณผ๋“ค์€ ์ง„ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ด์šฉํ•˜์—ฌ ์ปดํŒŒ์ผ ์‹œ์—, ๊ฐ๊ฐ์˜ ํ”„๋กœ์„ธ์„œ ๊ณ ์žฅ ์ƒํ™ฉ์— ๋”ฐ๋ผ ์ค€๋น„๊ฐ€ ๋œ๋‹ค. ์ˆ˜ํ–‰ ์‹œ๊ฐ„์— ํ”„๋กœ์„ธ์„œ ๊ณ ์žฅ์ด ๊ฐ์ง€๋˜๋ฉด, ์ •์ƒ์ ์œผ๋กœ ๋™์ž‘ํ•˜๋Š” ํ”„๋กœ์„ธ์„œ๋“ค์ด ์ €์žฅ๋œ ์Šค์ผ€์ฅด์„ ๊ฐ€์ง€๊ณ  ํƒœ์Šคํฌ ์ด์ฃผ๋ฅผ ์ˆ˜ํ–‰ํ•œ ํ›„ ํƒœ์Šคํฌ๋“ค์˜ ๋‚˜๋จธ์ง€ ์ˆ˜ํ–‰์„ ์ง€์†ํ•œ๋‹ค. ์ด ๊ธฐ๋ฒ•์—์„œ๋Š” ๋˜ํ•œ ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ์–ป๊ธฐ ์œ„ํ•ด, ์„ ์ , ๋น„์„ ์  ๋ฐ ์œตํ•ฉ ์ด์ฃผ ์ •์ฑ…์ด ์ œ์•ˆ๋˜์—ˆ๋‹ค. ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•์˜ ๊ฐ€๋Šฅ์„ฑ์€ ์‹ค์ œ ๋””์ง€ํ„ธ ์‹ ํ˜ธ์ฒ˜๋ฆฌ ์‘์šฉ๋“ค๊ณผ ์ž„์˜๋กœ ์ƒ์„ฑ๋œ ์‘์šฉ๋“ค์— ๋Œ€ํ•ด ์‹œ๊ฐ„์ œ์•ฝ๊ณผ ๋‹ค์–‘ํ•œ ํ”„๋กœ์„ธ์„œ ๊ณ ์žฅ ์ƒํ™ฉ์— ๋Œ€ํ•ด ๊ฒ€์ฆ๋˜์—ˆ๋‹ค. ๋‘ ๋ฒˆ์งธ๋กœ ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•์€ ๋ณตํ•ฉ ์ž์› ๊ด€๋ฆฌ ๊ธฐ๋ฒ•์œผ๋กœ, ์ฒซ๋ฒˆ์งธ ๊ธฐ๋ฒ•์—์„œ ๋‹ค๋ฃฌ ํ”„๋กœ์„ธ์„œ ์˜๊ตฌ๊ณ ์žฅ ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ, ๋™๊ธฐํ™” ๋ฐ์ดํƒ€-ํ๋ฆ„ ๊ทธ๋ž˜ํ”„๋กœ ๊ธฐ์ˆ ๋œ ์—ฌ๋Ÿฌ ์‘์šฉ๋“ค๊ณผ ์‘์šฉ๋“ค์˜ ๋™์  ์–‘์ƒ์„ ๋‹ค๋ฃจ๋Š” ๊ฒƒ๊นŒ์ง€๋กœ ํ™•์žฅ์ด ๋œ ๊ฒƒ์ด๋‹ค. ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•์—์„œ๋Š”, ์šฐ์„  ์„ค๊ณ„ ์ˆ˜์ค€์—์„œ ํ• ๋‹น๋˜๋Š” ํ”„๋กœ์„ธ์„œ์˜ ๊ฐฏ์ˆ˜๋ฅผ ๋ณ€ํ™”์‹œ์ผœ๊ฐ€๋ฉด์„œ ๋™๊ธฐํ™”๋œ ๋ฐ์ดํƒ€-ํ๋ฆ„ ๊ทธ๋ž˜ํ”„๋“ค์˜ ์ฒ˜๋ฆฌ๋Ÿ‰์ด ์ตœ๋Œ€๋กœ ์–ป์–ด์ง€๋Š” ๋งคํ•‘ ๊ฒฐ๊ณผ๋“ค์„ ์–ป๋Š”๋‹ค. ๊ทธ๋ฆฌ๊ณ ๋‚˜์„œ ์ˆ˜ํ–‰ ์‹œ๊ฐ„์—๋Š” ๋ฏธ๋ฆฌ ๊ณ„์‚ฐ๋œ ๋งคํ•‘ ์ •๋ณด๋“ค์„ ๊ฐ€์ง€๊ณ  ์ˆ˜ํ–‰์ค‘์ธ ์‘์šฉ๋“ค์˜ ๋งคํ•‘์„, ๋™์ ์ธ ์‹œ์Šคํ…œ ๋ณ€ํ™”๊ฐ€ ๋ฐœ์ƒํ•  ๋•Œ๋งˆ๋‹ค ์ ์šฉํ•˜๊ฒŒ ๋œ๋‹ค. ์ œ์•ˆ๋œ ์ž์› ๊ด€๋ฆฌ ๊ธฐ๋ฒ•์€ Noxim์ด๋ผ๋Š” ๋„คํŠธ์›Œํฌ-์˜จ-์นฉ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ ์œ„์—์„œ ๊ตฌํ˜„์ด ๋˜์—ˆ์œผ๋ฉฐ, ์‹คํ—˜ ๊ฒฐ๊ณผ๋“ค์€ ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•์ด ์ตœ์‹ ์˜ ๋‹ค๋ฅธ ๊ธฐ๋ฒ•๋“ค๊ณผ ๋น„๊ตํ•˜์—ฌ ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ๋Š”, ์‹œ์Šคํ…œ์˜ ์„ฑ๋Šฅ์„ ์‹œ์Šคํ…œ-์˜จ-์นฉ ์ œ์ž‘ ์ด์ „์— ๋ณด๋‹ค ์ •ํ™•ํ•˜๊ฒŒ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด์„œ, ๋‘ ๋ฒˆ์งธ ๊ธฐ๋ฒ•์„ ๊ตฌํ˜„ํ•œ ์†Œํ”„ํŠธ์›จ์–ด ํ”Œ๋žซํผ์ด ๋งค๋‹ˆ์ฝ”์–ด ์•„ํ‚คํ…์ฒ˜๋ฅผ ๋Œ€์ƒ์œผ๋กœ ์ œ์•ˆ๋˜์—ˆ๋‹ค. ๊ธฐ์กด์˜ ๋งค๋‹ˆ์ฝ”์–ด ์•„ํ‚คํ…์ฒ˜๋ฅผ ๋Œ€์ƒ์œผ๋กœ ํ•œ ์—ฐ๊ตฌ๋“ค์€ ์ฃผ๋กœ ์ƒ์œ„ ์ˆ˜์ค€์˜ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์„ฑ๋Šฅ์„ ์ธก์ •ํ•˜์˜€๊ธฐ ๋•Œ๋ฌธ์—, ์‹ค์ œ ์„ฑ๋Šฅ๊ณผ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ์„ฑ๋Šฅ์ด ์–ผ๋งˆ๋‚˜ ์ฐจ์ด๊ฐ€ ๋‚ ์ง€๋ฅผ ์ •ํ™•ํ•˜๊ฒŒ ์•Œ ์ˆ˜๊ฐ€ ์—†์—ˆ๋‹ค. 
์ด๋Ÿฌํ•œ ํ•œ๊ณ„๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•˜์—ฌ ์†Œํ”„ํŠธ์›จ์–ด ํ”Œ๋žซํผ๊ณผ, ๊ฐ€์ƒ ํ”„๋กœํ† ํƒ€์ดํ•‘ ์‹œ์Šคํ…œ ๋ฐ ์ œ์˜จ ์—๋ฎฌ๋ ˆ์ด์…˜ ์‹œ์Šคํ…œ์—์„œ์˜ ํ”Œ๋žซํผ ๊ตฌํ˜„ ๋ฐฉ๋ฒ•์ด ์ œ์•ˆ์ด ๋˜์—ˆ๋‹ค. ์ด๋Ÿฌํ•œ ์‹ค์ œ ์‹œ์Šคํ…œ ๊ตฌํ˜„์„ ํ†ตํ•˜์—ฌ ์ œ์•ˆ๋œ ๋ณตํ•ฉ ์ž์› ๊ด€๋ฆฌ ๊ธฐ๋ฒ•์—์„œ์˜ ๋‹ค์–‘ํ•œ ๋™์  ๋น„์šฉ๋“ค์ด ์ •ํ™•ํ•˜๊ฒŒ ์ถ”์‚ฐ์ด ๋  ์ˆ˜ ์žˆ์—ˆ๋‹ค. ์‹คํ—˜์—์„œ๋Š” ์ œ์•ˆ๋œ ์†Œํ”„ํŠธ์›จ์–ด ๊ธฐ๋ฒ•์ด ํƒœ์Šคํฌ๋“ค์˜ ๋™์  ๋งคํ•‘๊ณผ ์ฒดํฌ-ํฌ์ธํŒ…์„ ํ†ตํ•œ ํ”„๋กœ์„ธ์„œ ์˜๊ตฌ ๊ณ ์žฅ์„ ํšจ๊ณผ์ ์œผ๋กœ ๊ฐ๋‚ดํ•  ์ˆ˜ ์žˆ์Œ์„ ๋ณด์˜€๋‹ค.Owing to the incessant technology improvement, the number of processors integrated into a single chip increases consistently, integrating more and more applications. Also, demand for higher computing capability for applications makes a many-core accelerator become an important computing resource in a system-on-chip. Efficient handling of the accelerator at run-time, however, is very challenging because the system status is subject to change dynamically by various factors. At the system level, the set of applications running concurrently may change according to user request. At the application level, the application behavior may change dynamically depending on input data or operation mode. At the architecture level, hardware resource availability may vary since hardware components may experience transient or permanent failures. In this thesis, to resolve the difficulties in handling many-core accelerator, three techniques are proposed. The first technique is the re-scheduling of the entire application to minimize throughput degradation under a latency constraint when a permanent processor failure occurs. Sub-optimal re-scheduling results using a genetic algorithm for each scenario of processor failures are obtained at compile-time. If a failure is detected at run-time, the live processors obtain the saved schedule, perform task transfer, and execute the remaining tasks of the current iteration. In this technique, preemptive and non-preemptive migration policies and a hybrid policy are proposed to obtain better performance. The viability of the proposed technique with real-life DSP applications as well as randomly generated graphs under timing constraints and random fault scenarios are shown through experiments. The second technique is a hybrid resource management scheme, expanded version of the first technique that also handles multi-applications specified as SDF graph and their relevant dynamisms such as application/task arrivals/ends as well as processor permanent failures. In the proposed technique, at design-time, throughput-maximized mappings of each SDF graph by varying the number of allocated processors are determined. Then, at run-time, the pre-computed mapping information is exploited to adjust the mapping of active applications to the processors without user intervention on the system status change. The proposed resource management is evaluated through intensive experiments with an in-house simulator built on top of Noxim, a Network-on-Chip simulator. Experimental results show the enhanced adaptability to dynamic system status change compared to other state-of-the-art approaches. Finally, the software platform for a homogeneous many-core architecture that implements the second technique is proposed to evaluate the system performance more accurately before SoC fabrication. 
    Existing approaches usually use a high-level simulation model to estimate performance, without knowing how much the actual performance deviates from the estimate. To overcome this limitation, the software platform is proposed, and implementation details on a virtual prototyping system and on an emulation system realized with an Intel Xeon Phi coprocessor are presented. The actual implementation makes it possible to investigate in detail the overheads involved in the hybrid resource management technique, which was not possible with high-level simulation. Experimental results confirm that the proposed software platform adapts to dynamic workload variation effectively through dynamic mapping of tasks and tolerates unexpected core failures through check-pointing.
    Contents: Chapter 1 Introduction (1.1 Motivation; 1.2 Contribution; 1.3 Thesis Organization); Chapter 2 Preliminaries (2.1 Application Model; 2.2 Architecture Model; 2.3 Fault Model; 2.4 Thesis Overview); Chapter 3 Fault-aware Task Mapping (3.1 Introduction; 3.2 Related Work: 3.2.1 Static Approach, 3.2.2 Dynamic Approach; 3.3 Proposed Task Remapping/Rescheduling Technique: 3.3.1 Remapping Technique, 3.3.2 Rescheduling Technique; 3.4 Experiments: 3.4.1 Remapping Results, 3.4.2 Rescheduling Results); Chapter 4 Fault-aware Resource Management (4.1 Introduction; 4.2 Related Work: 4.2.1 Static Approach, 4.2.2 Dynamic Approach, 4.2.3 Hybrid Approach, 4.2.4 Summary; 4.3 Background: 4.3.1 Energy Model, 4.3.2 Notation; 4.4 Proposed Resource Management Technique: 4.4.1 Motivational Example, 4.4.2 Overall Procedure, 4.4.3 Design-time Analysis, 4.4.4 Run-time Mapping; 4.5 Experiments: 4.5.1 Setup, 4.5.2 Analysis of Run-time Overheads, 4.5.3 Comparison with Other Approaches); Chapter 5 Software Platform for Resource Management (5.1 Introduction; 5.2 Related Work; 5.3 Overall Structure; 5.4 Components of Software Platform: 5.4.1 Application API Layer, 5.4.2 Communication Interface Module, 5.4.3 Host Interface Layer, 5.4.4 Memory Management Module, 5.4.5 Design-time Analysis, 5.4.6 Slave Manager; 5.5 Software Platform Implementation: 5.5.1 Scheduling Information, 5.5.2 Function Migration and Execution, 5.5.3 Function Migration and Execution; 5.6 Virtual Prototyping System; 5.7 Xeon Emulation System; 5.8 Experiments: 5.8.1 Setup, 5.8.2 Experiments on the Virtual Prototyping System, 5.8.3 Experiments on the Xeon Emulation System); Chapter 6 Conclusion; Bibliography; Abstract in Korean.
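To make the hybrid design-time/run-time idea in this abstract concrete, the following sketch assumes a design-time table giving the maximum throughput of each SDF application for every processor count, and greedily redistributes the cores that are still alive whenever an application starts or ends or a permanent core failure is detected. It is not the thesis implementation; application names and throughput values are hypothetical.

```python
# Sketch only: run-time core redistribution driven by a design-time throughput table.
THROUGHPUT_TABLE = {
    # app -> {number of processors: throughput (iterations/s)}
    "h263_decoder": {1: 12.0, 2: 22.0, 3: 28.0, 4: 31.0},
    "mp3_decoder":  {1: 30.0, 2: 45.0, 3: 50.0, 4: 52.0},
}

def remap(active_apps, alive_cores):
    """Greedy redistribution: each spare core goes to the application that gains the most."""
    alloc = {app: 1 for app in active_apps}        # every active app needs at least one core
    spare = alive_cores - len(active_apps)
    if spare < 0:
        raise RuntimeError("not enough alive cores for all active applications")
    for _ in range(spare):
        def gain(app):
            table = THROUGHPUT_TABLE[app]
            cur = alloc[app]
            return table.get(cur + 1, table[max(table)]) - table[cur]
        best = max(active_apps, key=gain)
        if gain(best) <= 0:
            break                                   # no application benefits from more cores
        alloc[best] += 1
    return alloc

if __name__ == "__main__":
    print(remap(["h263_decoder", "mp3_decoder"], alive_cores=6))  # all cores healthy
    print(remap(["h263_decoder", "mp3_decoder"], alive_cores=4))  # after two core failures
```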