322 research outputs found

    Time-Constrained Scheduling of Weighted Packets on Trees and Meshes

    Hop-Constrained Oblivious Routing

    We prove the existence of an oblivious routing scheme that is $\mathrm{poly}(\log n)$-competitive in terms of (congestion + dilation), thus resolving a well-known question in oblivious routing. Concretely, consider an undirected network and a set of packets, each with its own source and destination. The objective is to choose a path for each packet, from its source to its destination, so as to minimize (congestion + dilation), defined as follows: the dilation is the maximum path hop-length, and the congestion is the maximum number of paths that include any single edge. The routing scheme obliviously and randomly selects a path for each packet independent of (the existence of) the other packets. Despite this obliviousness, the selected paths have (congestion + dilation) within a $\mathrm{poly}(\log n)$ factor of the best possible value. More precisely, for any integer hop-bound $h$, this oblivious routing scheme selects paths of length at most $h \cdot \mathrm{poly}(\log n)$ and is $\mathrm{poly}(\log n)$-competitive in terms of congestion in comparison to the best possible congestion achievable via paths of length at most $h$ hops. These paths can be sampled in polynomial time. This result can be viewed as an analogue of the celebrated oblivious routing results of Räcke [FOCS 2002, STOC 2008], which are $O(\log n)$-competitive in terms of congestion, but are not competitive in terms of dilation.
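The two quantities defined in this abstract can be computed directly for any fixed set of paths. The following is a minimal sketch, not the paper's routing scheme: the function name `congestion_and_dilation` and the toy paths are illustrative assumptions, and each path is assumed to be given as a list of node labels in an undirected graph.

```python
from collections import Counter


def congestion_and_dilation(paths):
    """Return (congestion, dilation) for a list of node paths.

    Hypothetical helper for illustration only. Edges are treated as
    unordered pairs because the network is undirected.
    """
    edge_load = Counter()
    dilation = 0
    for path in paths:
        # Dilation: the largest hop count over all paths.
        dilation = max(dilation, len(path) - 1)
        # Congestion counts paths per edge, so an edge that a path
        # happens to traverse twice is still counted once for that path.
        edges = {frozenset(pair) for pair in zip(path, path[1:])}
        edge_load.update(edges)
    congestion = max(edge_load.values(), default=0)
    return congestion, dilation


# Toy instance (made-up data): three packets sharing some edges.
paths = [["a", "b", "c"], ["d", "b", "c"], ["a", "b", "e", "f"]]
congestion, dilation = congestion_and_dilation(paths)
print(congestion + dilation)
```

An oblivious scheme would pick each of these paths without looking at the others; the helper only evaluates the resulting objective after the fact.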

    Network-on-Chip

    Addresses the challenges associated with system-on-chip integration. Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design.
This text comprises 12 chapters and covers: the evolution of NoC from SoC, with its research and developmental challenges; NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces; the router design strategies followed in NoCs; the evaluation mechanism of NoC architectures; the application mapping strategies followed in NoCs; low-power design techniques specifically followed in NoCs; the signal integrity and reliability issues of NoC; the details of NoC testing strategies reported so far; the problem of synthesizing application-specific NoCs; reconfigurable NoC design issues; and the direction of future research and development in the field of NoC. Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, researchers, and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems.

    Designing multihop wireless backhaul networks with delay guarantees

    As wireless access technologies improve in data rates, the problem focus is shifting towards providing adequate backhaul from the wireless access points to the Internet. Existing wired backhaul technologies such as copper wires running at DSL, T1, or T3 speeds can be expensive to install or lease, and are becoming a performance bottleneck as wireless access speeds increase. Long-haul, non-line-of-sight wireless technologies such as WiMAX (802.16d) hold the promise of enabling a high speed wireless backhaul as a cost-effective alternative. However, the biggest challenge in building a wireless backhaul is achieving the guaranteed performance (throughput and delay) that is typically provided by a wired backhaul. This paper explores the problem of efficiently designing a multihop wireless backhaul to connect multiple wireless access points to a wired gateway. In particular, we provide a generalized link activation framework for scheduling packets over this wireless backhaul, such that any existing wireline scheduling policy can be implemented locally at each node of the wireless backhaul. We also present techniques for determining good interference-free routes within our scheduling framework, given the link rates and cross-link interference information. When a multihop wireline scheduler with worst case delay bounds (such as WFQ or Coordinated EDF) is implemented over the wireless backhaul, we show that our scheduling and routing framework guarantees approximately twice the delay of the corresponding wireline topology. Finally, we present simulation results to demonstrate the low delays achieved using our framework.

    On-Chip Network Design: Mapping, Management, and Routing

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2016. Kiyoung Choi.
    For decades, advances in semiconductor technology have led us to the era of many-core systems. Today's desktop computers already have multi-core processors, and chips with more than a hundred cores are commercially available. As a communication medium for such a large number of cores, the network-on-chip (NoC) has emerged and is now used by many researchers and companies. Adopting an NoC for a many-core system raises many problems, and this thesis tries to solve some of them. The second chapter of this thesis is on mapping and scheduling of tasks on NoC-based CMP architectures. Although many papers on NoC mapping have been published, our work shows that selecting the communication type, either shared memory or message passing, can help improve performance and energy efficiency.
Additionally, our framework supports scheduling applications containing backward dependencies with the help of modified modulo scheduling. Evolving SoCs through 3D stacking raises a number of new problems, one of which is the thermal problem caused by increased power density. In the third chapter of this thesis, we try to mitigate the hotspot problem using DVFS techniques. Assuming that all routers as well as cores can control their voltage and frequency individually, we find voltage-frequency pairs for all cores and routers that yield the best performance within the given thermal constraint. The fourth and fifth chapters of this thesis address a different aspect. In 3D stacking, inter-layer interconnections are implemented using through-silicon vias (TSVs). TSVs usually take much more area than normal wires. Furthermore, they consume silicon area as well as metal area. For this reason, designers would want to limit the number of TSVs used in their network. To limit the TSV count, there are two options: the first is to reduce the width of each vertical link, and the other is to use fewer vertical links, which results in a partially connected network. We present one routing methodology for each case. For the network with reduced-bandwidth vertical links, we propose using deflection routing to mitigate the long latency of vertical links. By balancing the vertical traffic properly, the algorithm provides improved latency. Also, a large reduction in area and energy can be obtained by removing the router buffers. For partially connected networks, we introduce a set of routing rules for selecting the vertical links. At the expense of some routing freedom, the proposed algorithm removes the virtual channel requirement for avoiding deadlock.
As a result, performance can be improved or energy consumption reduced, at the designer's choice.
Contents:
Chapter 1 Introduction: 1.1 Task Mapping and Scheduling; 1.2 Thermal Management; 1.3 Routing for 3D Networks
Chapter 2 Mapping and Scheduling: 2.1 Introduction; 2.2 Motivation; 2.3 Background; 2.4 Related Work; 2.5 Platform Description (2.5.1 Architecture Description; 2.5.2 Energy Model; 2.5.3 Communication Delay Model); 2.6 Problem Formulation; 2.7 Proposed Solution (2.7.1 Task and Communication Mapping; 2.7.2 Communication Type Optimization; 2.7.3 Design Space Pruning via Pre-evaluation; 2.7.4 Scheduling); 2.8 Experimental Results (2.8.1 Experiments with Coarse-grained Iterative Modulo Scheduling; 2.8.2 Comparison with Different Mapping Algorithms; 2.8.3 Experiments with Overall Algorithms; 2.8.4 Experiments with Various Local Memory Sizes; 2.8.5 Experiments with Various Placements of Shared Memory)
Chapter 3 Thermal Management: 3.1 Introduction; 3.2 Background (3.2.1 Thermal Modeling; 3.2.2 Heterogeneity in Thermal Propagation); 3.3 Motivation and Problem Definition; 3.4 Related Work; 3.5 Orchestrated Voltage-Frequency Assignment (3.5.1 Individual PI Control Method; 3.5.2 PI Controlled Weighted-Power Budgeting; 3.5.3 Performance/Power Estimation; 3.5.4 Frequency Assignment; 3.5.5 Algorithm Overview; 3.5.6 Stability Conditions for PI Controller); 3.6 Experimental Result (3.6.1 Experimental Setup; 3.6.2 Overall Algorithm Performance; 3.6.3 Accuracy of the Estimation Model; 3.6.4 Performance of the Frequency Assignment Algorithm)
Chapter 4 Routing for Limited Bandwidth 3D NoC: 4.1 Introduction; 4.2 Motivation; 4.3 Background; 4.4 Related Work; 4.5 3D Deflection Routing (4.5.1 Serialized TSV Model; 4.5.2 TSV Link Injection/ejection Scheme; 4.5.3 Deadlock Avoidance; 4.5.4 Livelock Avoidance; 4.5.5 Router Architecture: Putting It All Together; 4.5.6 System Level Consideration); 4.6 Experimental Results (4.6.1 Experimental Setup; 4.6.2 Results on Synthetic Traffic Patterns; 4.6.3 Results on Realistic Traffic Patterns; 4.6.4 Results on Real Application Benchmarks; 4.6.5 Fairness Issue; 4.6.6 Area Cost Comparison)
Chapter 5 Routing for Partially Connected 3D NoC: 5.1 Introduction; 5.2 Background; 5.3 Related Work; 5.4 Proposed Algorithm (5.4.1 Preliminary; 5.4.2 Routing Algorithm for 3-D Stacked Meshes with Regular Partial Vertical Connections; 5.4.3 Routing Algorithm for 3-D Stacked Meshes with Irregular Partial Vertical Connections; 5.4.4 Extension to Heterogeneous Mesh Layers); 5.5 Experimental Results (5.5.1 Experimental Setup; 5.5.2 Experiments on Synthetic Traffics; 5.5.3 Experiments on Application Benchmarks; 5.5.4 Comparison with Reduced Bandwidth Mesh)
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)

    Broadcasting on Large Scale Heterogeneous Platforms under the Bounded Multi-Port Model

    We consider the problem of broadcasting a large message in a large scale distributed platform. The message must be sent from a source node, with the help of the receiving peers, which may forward the message to other peers. In this context, we are interested in maximizing the throughput (i.e., the maximum streaming rate, once steady state has been reached). The platform model does not assume that the topology of the platform is known in advance: we consider an Internet-like network, with complete potential connectivity. Furthermore, the model associates local properties (incoming and outgoing bandwidth) with each node, and the goal is to build an overlay which will be used to perform the broadcast operation. We model contention using the bounded multi-port model: a processor can be involved simultaneously in several communications, provided that its incoming and outgoing bandwidths are not exceeded. For the sake of realism, it is also necessary to bound the number of simultaneous connections that can be opened at a given node (i.e., its outdegree). We prove that, unfortunately, this additional constraint makes the problem of maximizing the overall throughput NP-complete. On the other hand, we also propose a polynomial time algorithm to solve this problem, based on a slight resource augmentation on the outdegree of the nodes.
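The constraints of the bounded multi-port model described in this abstract can be written as a simple feasibility check on a candidate overlay. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `overlay_is_feasible`, the dictionary-based representation, and the example numbers are all hypothetical.

```python
def overlay_is_feasible(rates, bw_out, bw_in, max_outdegree, eps=1e-9):
    """Check a candidate overlay against bounded multi-port constraints.

    rates maps (sender, receiver) to the rate assigned to that connection.
    bw_out, bw_in and max_outdegree map each node to its limits.
    All names here are illustrative assumptions.
    """
    out_sum, in_sum, out_deg = {}, {}, {}
    for (sender, receiver), rate in rates.items():
        if rate <= 0:
            continue  # an unused connection does not count as open
        out_sum[sender] = out_sum.get(sender, 0.0) + rate
        in_sum[receiver] = in_sum.get(receiver, 0.0) + rate
        out_deg[sender] = out_deg.get(sender, 0) + 1
    # Outgoing bandwidth, incoming bandwidth, and outdegree must all hold.
    return (
        all(total <= bw_out.get(node, 0.0) + eps for node, total in out_sum.items())
        and all(total <= bw_in.get(node, 0.0) + eps for node, total in in_sum.items())
        and all(deg <= max_outdegree.get(node, 0) for node, deg in out_deg.items())
    )


# Made-up example: source "s" feeds "a" and "b", and "a" forwards to "b".
rates = {("s", "a"): 4.0, ("s", "b"): 4.0, ("a", "b"): 2.0}
bw_out = {"s": 8.0, "a": 2.0, "b": 0.0}
bw_in = {"s": 0.0, "a": 4.0, "b": 6.0}
print(overlay_is_feasible(rates, bw_out, bw_in, {"s": 2, "a": 1, "b": 0}))
```

Checking a given overlay is easy; per the abstract, the hard part is choosing the overlay that maximizes throughput once the outdegree bound is imposed.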

    Automatic synthesis and optimization of chip multiprocessors

    Microprocessor technology has experienced enormous growth during the last decades. Rapid downscaling of CMOS technology has led to higher operating frequencies and performance densities, facing the fundamental issue of power dissipation. Chip Multiprocessors (CMPs) have become the latest paradigm to improve the power-performance efficiency of computing systems by exploiting the parallelism inherent in applications. Industrial and prototype implementations have already demonstrated the benefits achieved by CMPs with hundreds of cores.
    CMP architects are challenged to make many complex design decisions. Only a few of them are:
    - What should be the ratio between the core and cache areas on a chip?
    - Which core architectures to select?
    - How many cache levels should the memory subsystem have?
    - Which interconnect topologies provide efficient on-chip communication?
    These and many other aspects create a complex multidimensional space for architectural exploration. Design automation tools become essential to make the architectural exploration feasible under hard time-to-market constraints. The exploration methods have to be efficient and scalable to handle future generation on-chip architectures with hundreds or thousands of cores.
    Furthermore, once a CMP has been fabricated, the need for efficient deployment of the many-core processor arises. Intelligent techniques for task mapping and scheduling onto CMPs are necessary to guarantee full use of the benefits brought by many-core technology. These techniques have to consider the peculiarities of modern architectures, such as the availability of enhanced power saving techniques and the presence of complex memory hierarchies.
    This thesis has several objectives. The first objective is to elaborate methods for efficient analytical modeling and architectural design space exploration of CMPs. The efficiency is achieved by using analytical models instead of simulation, and replacing exhaustive exploration with an intelligent search strategy. Additionally, these methods incorporate high-level models for physical planning. The related contributions are described in Chapters 3, 4 and 5 of the document.
    The second objective of this work is to propose a scalable task mapping algorithm onto general-purpose CMPs with power management techniques, for efficient deployment of many-core systems. This contribution is explained in Chapter 6 of this document.
    Finally, the third objective of this thesis is to address the issues of on-chip interconnect design and exploration, by developing a model for simultaneous topology customization and deadlock-free routing in Networks-on-Chip. The developed methodology can be applied to various classes of on-chip systems, ranging from general-purpose chip multiprocessors to application-specific solutions. Chapter 7 describes the proposed model.
    The presented methods have been thoroughly tested experimentally and the results are described in this dissertation. At the end of the document, several possible directions for future research are proposed.