1,414 research outputs found
Scheduling Optimisations for SPIN to Minimise Buffer Requirements in Synchronous Data Flow
Synchronous Data Flow (SDF) graphs have a simple
and elegant semantics (essentially linear algebra), which makes
SDF graphs eminently suitable as a vehicle for studying scheduling
optimisations. We extend related work on using SPIN to
experiment with scheduling optimisations aimed at minimising
buffer requirements. We show that, for a benchmark of commonly
used case studies, the performance of our SPIN-based scheduler
is comparable to that of state-of-the-art research tools. The
key to success is using the semantics of SDF to prove when using
(even unsound and/or incomplete) optimisations is justified. The
main benefit of our approach lies in gaining deep insight into the
optimisations at relatively low cost.
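To make the "essentially linear algebra" remark concrete, here is a minimal Python sketch of the standard SDF balance equations and of how buffer requirements depend on the chosen schedule. It is not the SPIN model from the paper. The three-actor graph, the function names `repetition_vector` and `buffer_peaks`, and the assumption of zero initial tokens are all illustrative. It needs Python 3.9 or later for the multi-argument `math.gcd` and `math.lcm`.

```python
from fractions import Fraction
from math import gcd, lcm


def repetition_vector(actors, edges):
    """Solve the balance equations q[src] * produced == q[dst] * consumed.

    edges is a list of (src, dst, produced, consumed) tuples.
    Returns the smallest positive integer firing counts per actor.
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, propagate along edges
    changed = True
    while changed:
        changed = False
        for src, dst, produced, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: no periodic schedule")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    scale = lcm(*(r.denominator for r in rate.values()))
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = gcd(*counts.values())
    return {a: n // common for a, n in counts.items()}


def buffer_peaks(schedule, edges):
    """Fire actors in the given order and record the peak tokens per edge.

    Assumes every edge starts empty (no initial tokens).
    """
    tokens = [0] * len(edges)
    peaks = [0] * len(edges)
    for actor in schedule:
        for i, (_, dst, _, consumed) in enumerate(edges):
            if dst == actor:
                if tokens[i] < consumed:
                    raise ValueError(f"{actor} fired without enough tokens")
                tokens[i] -= consumed
        for i, (src, _, produced, _) in enumerate(edges):
            if src == actor:
                tokens[i] += produced
                peaks[i] = max(peaks[i], tokens[i])
    return peaks


# Hypothetical graph: A produces 2 tokens per firing, B consumes 3 and
# produces 1, C consumes 2.
EDGES = [("A", "B", 2, 3), ("B", "C", 1, 2)]
Q = repetition_vector(["A", "B", "C"], EDGES)
FLAT = buffer_peaks("AAABBC", EDGES)         # all firings of A first
INTERLEAVED = buffer_peaks("AABABC", EDGES)  # B fires as soon as it can
```

Both schedules fire each actor the number of times given by `Q`, but the interleaved order needs less buffer space on the first edge. That gap is what a buffer-minimising scheduler searches for.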
A survey of behavioral-level partitioning systems
Many approaches have been developed to partition a system's behavioral description before a structural implementation is synthesized. We highlight the foundations and motivations for behavioral partitioning. We survey behavioral partitioning approaches, discussing abstraction levels, goals, major steps, and key assumptions in each.
SDF/L Graph Scheduling Technique for Heterogeneous Multi-core Processors
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2021.8. Ha Soonhoi.
Although dataflow models are known to excel at exploiting the task-level parallelism of an application, it is difficult to exploit data-level parallelism with them. Data-level parallelism can be represented well with loop structures, but these structures are not explicitly specified in most existing dataflow models. The SDF/L model was introduced to overcome this shortcoming by specifying the loop structures explicitly in a hierarchical fashion. To the best of our knowledge, however, scheduling of SDF/L graphs onto heterogeneous processors has not been considered in any previous work.
In this dissertation, we introduce a technique for scheduling an application represented by the SDF/L model onto heterogeneous processors. In the proposed method, we explore the mapping of tasks using an evolutionary meta-heuristic and schedule hierarchically in a bottom-up fashion, creating parallel loop schedules at lower levels first and then re-using them when constructing the schedule at a higher level. To verify the efficiency of the proposed scheduling methodology, we apply it to benchmark examples and randomly generated SDF/L graphs.
Chapter 1 Introduction
Chapter 2 Related Work
2.1 SDF Scheduling with Data-level Parallelism
2.2 Hierarchical Scheduling
Chapter 3 Problem and Challenges
3.1 Notations and Problem Description
3.2 Challenges
Chapter 4 Proposed Methodology
4.1 Mapping Exploration
4.2 Priority Assignment and List Scheduling Heuristic
4.3 Hierarchical Scheduling
4.4 Complexity
Chapter 5 Experiments
5.1 Benchmarks
5.2 Randomly Generated Graphs
Chapter 6 Conclusions
Bibliography
Summary
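The thesis outline above names priority assignment and a list scheduling heuristic applied after a task-to-processor mapping has been chosen. As a rough illustration of that general idea only, the following Python sketch list-schedules a small task graph onto heterogeneous processors for one fixed mapping. It is not the dissertation's algorithm: the function `list_schedule`, the task graph, the execution times and the priorities are all hypothetical, communication costs are ignored, and the loop hierarchy of SDF/L is not modelled.

```python
from collections import defaultdict


def list_schedule(exec_time, deps, mapping, priority):
    """Greedy list scheduling for a fixed task-to-processor mapping.

    exec_time[task][proc] is the run time of task on proc,
    deps[task] is the set of tasks that must finish first,
    mapping[task] is the processor chosen for the task,
    priority[task] ranks ready tasks (larger value is scheduled first).
    Returns (schedule, makespan); schedule holds (task, proc, start, end).
    """
    pending = {task: set(deps.get(task, ())) for task in exec_time}
    finish = {}
    proc_free = defaultdict(int)  # time at which each processor becomes idle
    schedule = []
    while pending:
        ready = [t for t, preds in pending.items()
                 if all(p in finish for p in preds)]
        if not ready:
            raise ValueError("dependency cycle in task graph")
        task = max(ready, key=lambda t: priority[t])
        proc = mapping[task]
        # Start once the processor is idle and all predecessors are done.
        start = max([proc_free[proc]] + [finish[p] for p in pending[task]])
        finish[task] = start + exec_time[task][proc]
        proc_free[proc] = finish[task]
        schedule.append((task, proc, start, finish[task]))
        del pending[task]
    return schedule, max(finish.values(), default=0)


# Hypothetical diamond-shaped task graph on one CPU and one GPU.
EXEC_TIME = {
    "A": {"cpu": 2, "gpu": 1},
    "B": {"cpu": 3, "gpu": 1},
    "C": {"cpu": 2, "gpu": 4},
    "D": {"cpu": 1, "gpu": 1},
}
DEPS = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
MAPPING = {"A": "cpu", "B": "gpu", "C": "cpu", "D": "cpu"}
PRIORITY = {"A": 4, "B": 3, "C": 2, "D": 1}

SCHEDULE, MAKESPAN = list_schedule(EXEC_TIME, DEPS, MAPPING, PRIORITY)
```

In a mapping-exploration setting, a search procedure such as an evolutionary meta-heuristic would call a routine like this once per candidate `MAPPING` and use the resulting makespan as the fitness value.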
Heterogeneity-aware scheduling and data partitioning for system performance acceleration
Over the past decade, heterogeneous processors and accelerators have become increasingly prevalent in modern computing systems. Compared with previous homogeneous parallel machines, the hardware heterogeneity in modern systems provides new opportunities and challenges for performance acceleration. Classic operating systems optimisation problems such as task scheduling, and application-specific optimisation techniques such as the adaptive data partitioning of parallel algorithms, are both required to work together to address hardware heterogeneity.
Significant effort has been invested in this problem, but existing work either focuses on a specific type of heterogeneous system or algorithm, or offers a high-level framework without insight into how heterogeneity differs between types of system. A general software framework is required, which can not only be adapted to multiple types of systems and workloads, but is also equipped with techniques to address a variety of hardware heterogeneity.
This thesis presents approaches to designing general heterogeneity-aware software frameworks for system performance acceleration. It covers a wide variety of systems, including an OS scheduler targeting on-chip asymmetric multi-core processors (AMPs) on mobile devices, a hierarchical many-core supercomputer, and multi-FPGA systems for high performance computing (HPC) centers. Considering sources of heterogeneity in on-chip AMPs, such as thread criticality, core sensitivity, and relative fairness, it suggests a collaborative approach to co-designing the task selector and core allocator of the OS scheduler. Considering the typical sources of heterogeneity in HPC systems, such as the memory hierarchy, bandwidth limitations and asymmetric physical connections, it proposes an application-specific automatic data partitioning method for a modern supercomputer, and a topological-ranking heuristic-based schedule for a multi-FPGA-based reconfigurable cluster.
Experiments on both a full-system simulator (GEM5) and real systems (the Sunway TaihuLight supercomputer and Xilinx multi-FPGA-based clusters) demonstrate the significant advantages of the suggested approaches compared against the state of the art on a variety of workloads.
"This work is supported by St Leonards 7th Century Scholarship and Computer Science PhD funding from University of St Andrews; by UK EPSRC grant Discovery: Pattern Discovery and Program Shaping for Manycore Systems (EP/P020631/1)." -- Acknowledgement