42 research outputs found

    Towards a field configurable non-homogeneous multiprocessors architecture

    Get PDF
    Standard microprocessors are generally designed to deal efficiently with different types of tasks; their general purpose architecture can lead to misuse of resources, creating a large gap between the computational efficiency of microprocessors and custom silicon. The ever increasing complexity of Field Programmable Logic devices is driving the industry to look for innovative System on a Chip solutions; using programmable logic, the whole design can be tuned to the application requirements. In this paper, under the acronym MPOC (Multiprocessors On a Chip) we propose some applicable ideas on multiprocessing embedded configurable architectures, targeting System on a Programmable Chip (SOPC) cost-effective designs. Using heterogeneous medium or low performance soft-core processors instead of a single high performance processor, and some standardized communication schemes to link these multiple processors, the โ€œbestโ€ core can be chosen for each subtask using a computational efficiency criteria, and therefore improving silicon usage. System-level designย is also considered: models of tasks and links, parameterized soft-core processors, and the use of a standard HDL for system description can lead to automatic generation of the final design

    Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems

    Get PDF
    To exploit the power of modern heterogeneous multiprocessor embedded platforms on partitioned applications, the designer usually needs to efficiently map and schedule all the tasks and the communications of the application, respecting the constraints imposed by the target architecture. Since the problem is heavily constrained, common methods used to explore such design space usually fail, obtaining low-quality solutions. In this paper, we propose an ant colony optimization (ACO) heuristic that, given a model of the target architecture and the application, efficiently executes both scheduling and mapping to optimize the application performance. We compare our approach with several other heuristics, including simulated annealing, tabu search, and genetic algorithms, on the performance to reach the optimum value and on the potential to explore the design space. We show that our approach obtains better results than other heuristics by at least 16% on average, despite an overhead in execution time. Finally, we validate the approach by scheduling and mapping a JPEG encoder on a realistic target architecture

    Hierarchical Transactions for Hardware/Software Cosynthesis

    Get PDF
    Modern heterogeneous devices provide of a variety of computationally diverse components holding tremendous performance and power capability. Hardware-software cosynthesis offers system-level synthesis and optimization opportunities to realize the potential of these evolving architectures. Efficiently coordinating high-throughput data to make use of available computational resources requires a myriad of distributed local memories, caching structures, and data motion resources. In fact, storage, caching, and data transfer components comprise the majority of silicon real estate. Conventional automated approaches, unfortunately, do not effectively represent applications in a way that captures data motion and state management which dictate dominant system costs. Consequently, existing cosynthesis methods suffer from poor utility of computational resources. Automated cosynthesis tailored towards memory-centric optimizations can address the challenge, adapting partitioning, scheduling, mapping, and binding techniques to maximize overall system utility.This research presents a novel hierarchical transaction model that formalizes state and control management through an abstract data/control encapsulation semantic. It is designed from the ground-up to enable efficient synthesis across heterogeneous system components, with an emphasis on memory capacity constraints. It intrinsically encourages a high degree of concurrency and latency tolerance, and provides verification tools to ensure correctness. A unique data/execution hierarchical encapsulation framework guarantees scalable analysis, supporting a novel concept of state and control mobility. A front-end language allows concise expression of designer intent, and is structured with synthesis in mind. Designers express families of valid executions in a minimal format through high-level dependencies, type systems, and computational relationships, allowing synthesis tools to manage lower-level details. This dissertation introduces and exercises the model, discussing language construction, demonstrating control and data-dominated applications, and presenting a synthesis path that exhibits near-linear scalability with problem size

    The Potential for a GPU-Like Overlay Architecture for FPGAs

    Get PDF
    We propose a soft processor programming model and architecture inspired by graphics processing units (GPUs) that are well-matched to the strengths of FPGAs, namely, highly parallel and pipelinable computation. In particular, our soft processor architecture exploits multithreading, vector operations, and predication to supply a floating-point pipeline of 64 stages via hardware support for up to 256 concurrent thread contexts. The key new contributions of our architecture are mechanisms for managing threads and register files that maximize data-level and instruction-level parallelism while overcoming the challenges of port limitations of FPGA block memories as well as memory and pipeline latency. Through simulation of a system that (i) is programmable via NVIDIA's high-level Cg language, (ii) supports AMD's CTM r5xx GPU ISA, and (iii) is realizable on an XtremeData XD1000 FPGA-based accelerator system, we demonstrate the potential for such a system to achieve 100% utilization of a deeply pipelined floating-point datapath

    ์‹ค์‹œ๊ฐ„ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์„ ์œ„ํ•œ ๋™์  ํ–‰์œ„ ๋ช…์„ธ ๋ฐ ์„ค๊ณ„ ๊ณต๊ฐ„ ํƒ์ƒ‰ ๊ธฐ๋ฒ•

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2016. 8. ํ•˜์ˆœํšŒ.ํ•˜๋‚˜์˜ ์นฉ์— ์ง‘์ ๋˜๋Š” ํ”„๋กœ์„ธ์„œ์˜ ๊ฐœ์ˆ˜๊ฐ€ ๋งŽ์•„์ง€๊ณ , ๋งŽ์€ ๊ธฐ๋Šฅ๋“ค์ด ํ†ตํ•ฉ๋จ์— ๋”ฐ๋ผ, ์—ฐ์‚ฐ์–‘์˜ ๋ณ€ํ™”, ์„œ๋น„์Šค์˜ ํ’ˆ์งˆ, ์˜ˆ์ƒ์น˜ ๋ชปํ•œ ์‹œ์Šคํ…œ ์š”์†Œ์˜ ๊ณ ์žฅ ๋“ฑ๊ณผ ๊ฐ™์€ ๋‹ค์–‘ํ•œ ์š”์†Œ๋“ค์— ์˜ํ•ด ์‹œ์Šคํ…œ์˜ ์ƒํƒœ๊ฐ€ ๋™์ ์œผ๋กœ ๋ณ€ํ™”ํ•˜๊ฒŒ ๋œ๋‹ค. ๋ฐ˜๋ฉด์—, ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ฃผ๋œ ๊ด€์‹ฌ์‚ฌ๋ฅผ ๊ฐ€์ง€๋Š” ์Šค๋งˆํŠธ ํฐ ์žฅ์น˜์—์„œ ์ฃผ๋กœ ์‚ฌ์šฉ๋˜๋Š” ๋น„๋””์˜ค, ๊ทธ๋ž˜ํ”ฝ ์‘์šฉ๋“ค์˜ ๊ฒฝ์šฐ, ๊ณ„์‚ฐ ๋ณต์žก๋„๊ฐ€ ์ง€์†์ ์œผ๋กœ ์ฆ๊ฐ€ํ•˜๊ณ  ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ, ์ด๋ ‡๊ฒŒ ๋™์ ์œผ๋กœ ๋ณ€ํ•˜๋Š” ํ–‰์œ„๋ฅผ ๊ฐ€์ง€๋ฉด์„œ๋„ ๋ณ‘๋ ฌ์„ฑ์„ ๋‚ด์ œํ•œ ๊ณ„์‚ฐ ์ง‘์•ฝ์ ์ธ ์—ฐ์‚ฐ์„ ํฌํ•จํ•˜๋Š” ๋ณต์žกํ•œ ์‹œ์Šคํ…œ์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์ฒด๊ณ„์ ์ธ ์„ค๊ณ„ ๋ฐฉ๋ฒ•๋ก ์ด ๊ณ ๋„๋กœ ์š”๊ตฌ๋œ๋‹ค. ๋ชจ๋ธ ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•๋ก ์€ ๋ณ‘๋ ฌ ์ž„๋ฒ ๋””๋“œ ์†Œํ”„ํŠธ์›จ์–ด ๊ฐœ๋ฐœ์„ ์œ„ํ•œ ๋Œ€ํ‘œ์ ์ธ ๋ฐฉ๋ฒ• ์ค‘ ํ•˜๋‚˜์ด๋‹ค. ํŠนํžˆ, ์‹œ์Šคํ…œ ๋ช…์„ธ, ์ •์  ์„ฑ๋Šฅ ๋ถ„์„, ์„ค๊ณ„ ๊ณต๊ฐ„ ํƒ์ƒ‰, ๊ทธ๋ฆฌ๊ณ  ์ž๋™ ์ฝ”๋“œ ์ƒ์„ฑ๊นŒ์ง€์˜ ๋ชจ๋“  ์„ค๊ณ„ ๋‹จ๊ณ„๋ฅผ ์ง€์›ํ•˜๋Š” ๋ณ‘๋ ฌ ์ž„๋ฒ ๋””๋“œ ์†Œํ”„ํŠธ์›จ์–ด ์„ค๊ณ„ ํ™˜๊ฒฝ์œผ๋กœ์„œ, HOPES ํ”„๋ ˆ์ž„์›Œํฌ๊ฐ€ ์ œ์‹œ๋˜์—ˆ๋‹ค. ๋‹ค๋ฅธ ์„ค๊ณ„ ํ™˜๊ฒฝ๋“ค๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ, ์ด๊ธฐ์ข… ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์„œ ์•„ํ‚คํ…์ฒ˜์—์„œ์˜ ์ผ๋ฐ˜์ ์ธ ์ˆ˜ํ–‰ ๋ชจ๋ธ๋กœ์„œ, ๊ณตํ†ต ์ค‘๊ฐ„ ์ฝ”๋“œ (CIC) ๋ผ๊ณ  ๋ถ€๋ฅด๋Š” ํ”„๋กœ๊ทธ๋ž˜๋ฐ ํ”Œ๋žซํผ์ด๋ผ๋Š” ์ƒˆ๋กœ์šด ๊ฐœ๋…์„ ์†Œ๊ฐœํ•˜์˜€๋‹ค. CIC ํƒœ์Šคํฌ ๋ชจ๋ธ์€ ํ”„๋กœ์„ธ์Šค ๋„คํŠธ์›Œํฌ ๋ชจ๋ธ์— ๊ธฐ๋ฐ˜ํ•˜๊ณ  ์žˆ์ง€๋งŒ, SDF ๋ชจ๋ธ๋กœ ๊ตฌ์ฒดํ™”๋  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์—, ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์ •์  ๋ถ„์„์ด ์šฉ์ดํ•˜๋‹ค๋Š” ์žฅ์ ์„ ๊ฐ€์ง„๋‹ค. ํ•˜์ง€๋งŒ, SDF ๋ชจ๋ธ์€ ์‘์šฉ์˜ ๋™์ ์ธ ํ–‰์œ„๋ฅผ ๋ช…์„ธํ•  ์ˆ˜ ์—†๋‹ค๋Š” ํ‘œํ˜„์ƒ์˜ ์ œ์•ฝ์„ ๊ฐ€์ง„๋‹ค. ์ด๋Ÿฌํ•œ ์ œ์•ฝ์„ ๊ทน๋ณตํ•˜๊ณ , ์‹œ์Šคํ…œ์˜ ๋™์  ํ–‰์œ„๋ฅผ ์‘์šฉ ์™ธ๋ถ€์™€ ๋‚ด๋ถ€๋กœ ๊ตฌ๋ถ„ํ•˜์—ฌ ๋ช…์„ธํ•˜๊ธฐ ์œ„ํ•ด, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ์™€ ์œ ํ•œ์ƒํƒœ๊ธฐ (FSM) ๋ชจ๋ธ์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ ํ™•์žฅ๋œ CIC ํƒœ์Šคํฌ ๋ชจ๋ธ์„ ์ œ์•ˆํ•œ๋‹ค. ์ƒ์œ„ ์ˆ˜์ค€์—์„œ๋Š”, ๊ฐ ์‘์šฉ์€ ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ํƒœ์Šคํฌ๋กœ ๋ช…์„ธ ๋˜๋ฉฐ, ๋™์  ํ–‰์œ„๋Š” ์‘์šฉ๋“ค์˜ ์ˆ˜ํ–‰์„ ๊ฐ๋…ํ•˜๋Š” ์ œ์–ด ํƒœ์Šคํฌ๋กœ ๋ชจ๋ธ ๋œ๋‹ค. ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ํƒœ์Šคํฌ ๋‚ด๋ถ€๋Š”, ์œ ํ•œ์ƒํƒœ๊ธฐ ๊ธฐ๋ฐ˜์˜ SADF ๋ชจ๋ธ๊ณผ ์œ ์‚ฌํ•œ ํ˜•ํƒœ๋กœ ๋™์  ํ–‰์œ„๊ฐ€ ๋ช…์„ธ ๋œ๋‹คSDF ํƒœ์Šคํฌ๋Š” ๋ณต์ˆ˜๊ฐœ์˜ ํ–‰์œ„๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ๋ชจ๋“œ ์ „ํ™˜๊ธฐ (MTM)์ด๋ผ๊ณ  ๋ถˆ๋ฆฌ๋Š” ์œ ํ•œ ์ƒํƒœ๊ธฐ์˜ ํ…Œ์ด๋ธ” ํ˜•ํƒœ์˜ ๋ช…์„ธ๋ฅผ ํ†ตํ•ด SDF ๊ทธ๋ž˜ํ”„์˜ ๋ชจ๋“œ ์ „ํ™˜ ๊ทœ์น™์„ ๋ช…์„ธ ํ•œ๋‹ค. ์ด๋ฅผ MTM-SDF ๊ทธ๋ž˜ํ”„๋ผ๊ณ  ๋ถ€๋ฅด๋ฉฐ, ๋ณต์ˆ˜ ๋ชจ๋“œ ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ๋ชจ๋ธ ์ค‘ ํ•˜๋‚˜๋ผ ๊ตฌ๋ถ„๋œ๋‹ค. ์‘์šฉ์€ ์œ ํ•œํ•œ ํ–‰์œ„ (๋˜๋Š” ๋ชจ๋“œ)๋ฅผ ๊ฐ€์ง€๋ฉฐ, ๊ฐ ํ–‰์œ„ (๋ชจ๋“œ)๋Š” SDF ๊ทธ๋ž˜ํ”„๋กœ ํ‘œํ˜„๋˜๋Š” ๊ฒƒ์„ ๊ฐ€์ •ํ•œ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ๋‹ค์–‘ํ•œ ํ”„๋กœ์„ธ์„œ ๊ฐœ์ˆ˜์— ๋Œ€ํ•ด ๋‹จ์œ„์‹œ๊ฐ„๋‹น ์ฒ˜๋ฆฌ๋Ÿ‰์„ ์ตœ๋Œ€ํ™”ํ•˜๋Š” ์ปดํŒŒ์ผ-์‹œ๊ฐ„ ์Šค์ผ€์ค„๋ง์„ ์ˆ˜ํ–‰ํ•˜๊ณ , ์Šค์ผ€์ค„ ๊ฒฐ๊ณผ๋ฅผ ์ €์žฅํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ๋˜ํ•œ, ๋ณต์ˆ˜ ๋ชจ๋“œ ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ๊ทธ๋ž˜ํ”„๋ฅผ ์œ„ํ•œ ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์„œ ์Šค์ผ€์ค„๋ง ๊ธฐ๋ฒ•์„ ์ œ์‹œํ•œ๋‹ค. ๋ณต์ˆ˜ ๋ชจ๋“œ ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ๊ทธ๋ž˜ํ”„๋ฅผ ์œ„ํ•œ ๋ช‡๋ช‡ ์Šค์ผ€์ค„๋ง ๊ธฐ๋ฒ•๋“ค์ด ์กด์žฌํ•˜์ง€๋งŒ, ๋ชจ๋“œ ์‚ฌ์ด์— ํƒœ์Šคํฌ ์ด์ฃผ๋ฅผ ํ—ˆ์šฉํ•œ ๊ธฐ๋ฒ•๋“ค์€ ์กด์žฌํ•˜์ง€ ์•Š๋Š”๋‹ค. ํ•˜์ง€๋งŒ ํƒœ์Šคํฌ ์ด์ฃผ๋ฅผ ํ—ˆ์šฉํ•˜๊ฒŒ ๋˜๋ฉด ์ž์› ์š”๊ตฌ๋Ÿ‰์„ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค๋Š” ๋ฐœ๊ฒฌ์„ ํ†ตํ•ด, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋ชจ๋“œ ์‚ฌ์ด์˜ ํƒœ์Šคํฌ ์ด์ฃผ๋ฅผ ํ—ˆ์šฉํ•˜๋Š” ๋ณต์ˆ˜ ๋ชจ๋“œ ๋ฐ์ดํ„ฐ ํ”Œ๋กœ์šฐ ๊ทธ๋ž˜ํ”„๋ฅผ ์œ„ํ•œ ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์„œ ์Šค์ผ€์ค„๋ง ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์œ ์ „ ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ, ์ œ์•ˆํ•˜๋Š” ๊ธฐ๋ฒ•์€ ์ž์› ์š”๊ตฌ๋Ÿ‰์„ ์ตœ์†Œํ™”ํ•˜๊ธฐ ์œ„ํ•ด ๊ฐ ๋ชจ๋“œ์— ํ•ด๋‹นํ•˜๋Š” ๋ชจ๋“  SDF ๊ทธ๋ž˜ํ”„๋ฅผ ๋™์‹œ์— ์Šค์ผ€์ค„ ํ•œ๋‹ค. ์ฃผ์–ด์ง„ ๋‹จ์œ„ ์‹œ๊ฐ„๋‹น ์ฒ˜๋ฆฌ๋Ÿ‰ ์ œ์•ฝ์„ ๋งŒ์กฑ์‹œํ‚ค๊ธฐ ์œ„ํ•ด, ์ œ์•ˆํ•˜๋Š” ๊ธฐ๋ฒ•์€ ๊ฐ ๋ชจ๋“œ ๋ณ„๋กœ ์‹ค์ œ ์ฒ˜๋ฆฌ๋Ÿ‰ ์š”๊ตฌ๋Ÿ‰์„ ๊ณ„์‚ฐํ•˜๋ฉฐ, ์ฒ˜๋ฆฌ๋Ÿ‰์˜ ๋ถˆ๊ทœ์น™์„ฑ์„ ์™„ํ™”ํ•˜๊ธฐ ์œ„ํ•œ ์ถœ๋ ฅ ๋ฒ„ํผ์˜ ํฌ๊ธฐ๋ฅผ ๊ณ„์‚ฐํ•œ๋‹ค. ๋ช…์„ธ๋œ ํƒœ์Šคํฌ ๊ทธ๋ž˜ํ”„์™€ ์Šค์ผ€์ค„ ๊ฒฐ๊ณผ๋กœ๋ถ€ํ„ฐ, HOPES ํ”„๋ ˆ์ž„์›Œํฌ๋Š” ๋Œ€์ƒ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์œ„ํ•œ ์ž๋™ ์ฝ”๋“œ ์ƒ์„ฑ์„ ์ง€์›ํ•œ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด ์ž๋™ ์ฝ”๋“œ ์ƒ์„ฑ๊ธฐ๋Š” CIC ํƒœ์Šคํฌ ๋ชจ๋ธ์˜ ํ™•์žฅ๋œ ํŠน์ง•๋“ค์„ ์ง€์›ํ•˜๋„๋ก ํ™•์žฅ๋˜์—ˆ๋‹ค. ์‘์šฉ ์ˆ˜์ค€์—์„œ๋Š” MTM-SDF ๊ทธ๋ž˜ํ”„๋ฅผ ์ฃผ์–ด์ง„ ์ •์  ์Šค์ผ€์ค„๋ง ๊ฒฐ๊ณผ๋ฅผ ๋”ฐ๋ฅด๋Š” ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์„œ ์ฝ”๋“œ๋ฅผ ์ƒ์„ฑํ•˜๋„๋ก ํ™•์žฅ๋˜์—ˆ๋‹ค. ๋˜ํ•œ, ๋„ค ๊ฐ€์ง€ ์„œ๋กœ ๋‹ค๋ฅธ ์Šค์ผ€์ค„๋ง ์ •์ฑ… (fully-static, self-timed, static-assignment, fully-dynamic)์— ๋Œ€ํ•œ ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์„œ ์ฝ”๋“œ ์ƒ์„ฑ์„ ์ง€์›ํ•œ๋‹ค. ์‹œ์Šคํ…œ ์ˆ˜์ค€์—์„œ๋Š” ์ง€์›ํ•˜๋Š” ์‹œ์Šคํ…œ ์š”์ฒญ API์— ๋Œ€ํ•œ ์‹ค์ œ ๊ตฌํ˜„ ์ฝ”๋“œ๋ฅผ ์ƒ์„ฑํ•˜๋ฉฐ, ์ •์  ์Šค์ผ€์ค„ ๊ฒฐ๊ณผ์™€ ํƒœ์Šคํฌ๋“ค์˜ ์ œ์–ด ๊ฐ€๋Šฅํ•œ ์†์„ฑ๋“ค์— ๋Œ€ํ•œ ์ž๋ฃŒ ๊ตฌ์กฐ ์ฝ”๋“œ๋ฅผ ์ƒ์„ฑํ•œ๋‹ค. ๋ณต์ˆ˜ ๋ชจ๋“œ ๋ฉ€ํ‹ฐ๋ฏธ๋””์–ด ํ„ฐ๋ฏธ๋„ ์˜ˆ์ œ๋ฅผ ํ†ตํ•œ ๊ธฐ์ดˆ์ ์ธ ์‹คํ—˜๋“ค์„ ํ†ตํ•ด, ์ œ์•ˆํ•˜๋Š” ๋ฐฉ๋ฒ•๋ก ์˜ ํƒ€๋‹น์„ฑ์„ ๋ณด์ธ๋‹ค.As the number of processors in a chip increases, and more functions are integrated, the system status will change dynamically due to various factors such as the workload variation, QoS requirement, and unexpected component failure. On the other hand, computation-complexity of user applications is also steadily increasingvideo and graphics applications are two major driving forces in smart mobile devices, which define the main application domain of interest in this dissertation. So, a systematic design methodology is highly required to implement such complex systems which contain dynamically changed behavior as well as computation-intensive workload that can be parallelized. A model-based approach is one of representative approaches for parallel embedded software development. Especially, HOPES framework is proposed which is a design environment for parallel embedded software supporting the overall design steps: system specification, performance estimation, design space exploration, and automatic code generation. Distinguished from other design environments, it introduces a novel concept of programming platform, called CIC (Common Intermediate Code) that can be understood as a generic execution model of heterogeneous multiprocessor architecture. The CIC task model is based on a process network model, but it can be refined to the SDF (Synchronous Data Flow) model, since it has a very desirable features for static analyzability as well as parallel processing. However, the SDF model has a typical weakness of expression capability, especially for the system-level specification and dynamically changed behavior of an application. To overcome this weakness, in this dissertation, we propose an extended CIC task model based on dataflow and FSM models to specify the dynamic behavior of the system distinguishing inter- and intra-application dynamism. At the top-level, each application is specified by a dataflow task and the dynamic behavior is modeled as a control task that supervises the execution of applications. Inside a dataflow task, it specifies the dynamic behavior using a similar way as FSM-based SADFan SDF task may have multiple behaviors and a tabular specification of an FSM, called MTM (Mode Transition Machine), describes the mode transition rules for the SDF graph. We call it to MTM-SDF model which is classified as multi-mode dataflow models in the dissertation. It assumes that an application has a finite number of behaviors (or modes) and each behavior (mode) is represented by an SDF graph. It enables us to perform compile-time scheduling of each graph to maximize the throughput varying the number of allocated processors, and store the scheduling information. Also, a multiprocessor scheduling technique is proposed for a multi-mode dataflow graph. While there exist several scheduling techniques for multi-mode dataflow models, no one allows task migration between modes. By observing that the resource requirement can be additionally reduced if task migration is allowed, we propose a multiprocessor scheduling technique of a multi-mode dataflow graph considering task migration between modes. Based on a genetic algorithm, the proposed technique schedules all SDF graphs in all modes simultaneously to minimize the resource requirement. To satisfy the throughput constraint, the proposed technique calculates the actual throughput requirement of each mode and the output buffer size for tolerating throughput jitter. For the specified task graph and scheduling results, the CIC translator generates parallelized code for the target architecture. Therefore the CIC translator is extended to support extended features of the CIC task model. In application-level, it is extended to support multiprocessor code generation for an MTM-SDF graph considering the given static scheduling results. Also, multiprocessor code generation of four different scheduling policies are supported for an MTM-SDF graph: fully-static, self-timed, static-assignment, and fully-dynamic. In system-level, the CIC translator is extended to support code generation for implementation of system request APIs and data structures for the static scheduling results and configurable task parameters. Through preliminary experiments with a multi-mode multimedia terminal example, the viability of the proposed methodology is verified.Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Contribution 7 1.3 Dissertation organization 9 Chapter 2 Background 10 2.1 Related work 10 2.1.1 Compiler-based approach 10 2.1.2 Language-based approach 11 2.1.3 Model-based approach 15 2.2 HOPES framework 19 2.3 Common Intermediate Code (CIC) Model 21 Chapter 3 Dynamic Behavior Specification 26 3.1 Problem definition 26 3.1.1 System-level dynamic behavior 26 3.1.2 Application-level dynamic behavior 27 3.2 Related work 28 3.3 Motivational example 31 3.4 Control task specification for system-level dynamism 33 3.4.1 Internal specification 33 3.4.2 Action scripts 38 3.5 MTM-SDF specification for application-level dynamism 44 3.5.1 MTM specification 44 3.5.2 Task graph specification 45 3.5.3 Execution semantic of an MTM-SDF graph 46 Chapter 4 Multiprocessor Scheduling of an Multi-mode Dataflow Graph 50 4.1 Related work 51 4.2 Motivational example 56 4.2.1 Throughput requirement calculation considering mode transition delay 56 4.2.2 Task migration between mode transition 58 4.3 Problem definition 61 4.4 Throughput requirement analysis 65 4.4.1 Mode transition delay 66 4.4.2 Arrival curves of the output buffer 70 4.4.3 Buffer size determination 71 4.4.4 Throughput requirement analysis 73 4.5 Proposed MMDF scheduling framework 75 4.5.1 Optimization problem 75 4.5.2 GA configuration 76 4.5.3 Fitness function 78 4.5.4 Local optimization technique 79 4.6 Experimental results 81 4.6.1 MMDF scheduling technique 83 4.6.2 Scalability of the Proposed Framework 88 Chapter 5 Multiprocessor Code Generation for the Extended CIC Model 89 5.1 CIC translator 89 5.2 Code generation for application-level dynamism 91 5.2.1 Function call-style code generation (fully-static, self-timed) 94 5.2.2 Thread-style code generation (static-assignment, fully-dynamic) 98 5.3 Code generation for system-level dynamism 101 5.4 Experimental results 105 Chapter 6 Conclusion and Future Work 107 Bibliography 109 ์ดˆ๋ก 125Docto

    MULTI-OBJECTIVE DESIGN AUTOMATION FOR RECONFIGURABLE MULTI-PROCESSOR SYSTEMS

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH
    corecore