2 research outputs found

    On-Chip Network Design: Mapping, Management, and Routing (온 칩 네트워크 설계: 매핑, 관리, 라우팅)

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Information Engineering, February 2016. Kiyoung Choi (최기영).

    For decades, advances in semiconductor technology have led us to the era of many-core systems. Today's desktop computers already have multi-core processors, and chips with more than a hundred cores are commercially available. As a communication medium for such a large number of cores, the network-on-chip (NoC) has emerged and is now used by many researchers and companies. Adopting an NoC for a many-core system raises many problems, and this thesis tries to solve some of them.

    The second chapter of this thesis covers mapping and scheduling of tasks on NoC-based CMP architectures. Although many papers on NoC mapping have been published, our work shows that choosing the communication type, shared memory or message passing, can help improve performance and energy efficiency. Additionally, our framework supports scheduling applications that contain backward dependencies with the help of modified modulo scheduling.

    Extending SoCs with 3D stacking raises a number of new problems, one of which is the thermal problem caused by increased power density. In the third chapter of this thesis, we try to mitigate the hotspot problem using DVFS techniques. Assuming that every router and every core can control its voltage and frequency individually, we find the voltage-frequency pairs for all cores and routers that yield the best performance within the given thermal constraint.

    The fourth and fifth chapters of this thesis address a different aspect. In 3D stacking, inter-layer interconnections are implemented using through-silicon vias (TSVs). TSVs usually take much more area than normal wires, and they consume silicon area as well as metal area. For this reason, designers often want to limit the number of TSVs used in their network. There are two options for limiting the TSV count: the first is to reduce the width of each vertical link, and the other is to use fewer vertical links, which results in a partially connected network. We present one routing methodology for each case. For the network with reduced-bandwidth vertical links, we propose using deflection routing to mitigate the long latency of vertical links. By balancing the vertical traffic properly, the algorithm provides improved latency. A large reduction in area and energy can also be obtained by removing the router buffers. For partially connected networks, we introduce a set of routing rules for selecting the vertical links. At the expense of some routing freedom, the proposed algorithm removes the virtual channel requirement for avoiding deadlock. As a result, either performance can be improved or energy consumption can be reduced, at the designer's choice.

    Contents
    Chapter 1 Introduction: 1.1 Task Mapping and Scheduling; 1.2 Thermal Management; 1.3 Routing for 3D Networks
    Chapter 2 Mapping and Scheduling: 2.1 Introduction; 2.2 Motivation; 2.3 Background; 2.4 Related Work; 2.5 Platform Description (2.5.1 Architecture Description; 2.5.2 Energy Model; 2.5.3 Communication Delay Model); 2.6 Problem Formulation; 2.7 Proposed Solution (2.7.1 Task and Communication Mapping; 2.7.2 Communication Type Optimization; 2.7.3 Design Space Pruning via Pre-evaluation; 2.7.4 Scheduling); 2.8 Experimental Results (2.8.1 Experiments with Coarse-grained Iterative Modulo Scheduling; 2.8.2 Comparison with Different Mapping Algorithms; 2.8.3 Experiments with Overall Algorithms; 2.8.4 Experiments with Various Local Memory Sizes; 2.8.5 Experiments with Various Placements of Shared Memory)
    Chapter 3 Thermal Management: 3.1 Introduction; 3.2 Background (3.2.1 Thermal Modeling; 3.2.2 Heterogeneity in Thermal Propagation); 3.3 Motivation and Problem Definition; 3.4 Related Work; 3.5 Orchestrated Voltage-Frequency Assignment (3.5.1 Individual PI Control Method; 3.5.2 PI Controlled Weighted-Power Budgeting; 3.5.3 Performance/Power Estimation; 3.5.4 Frequency Assignment; 3.5.5 Algorithm Overview; 3.5.6 Stability Conditions for PI Controller); 3.6 Experimental Result (3.6.1 Experimental Setup; 3.6.2 Overall Algorithm Performance; 3.6.3 Accuracy of the Estimation Model; 3.6.4 Performance of the Frequency Assignment Algorithm)
    Chapter 4 Routing for Limited Bandwidth 3D NoC: 4.1 Introduction; 4.2 Motivation; 4.3 Background; 4.4 Related Work; 4.5 3D Deflection Routing (4.5.1 Serialized TSV Model; 4.5.2 TSV Link Injection/ejection Scheme; 4.5.3 Deadlock Avoidance; 4.5.4 Livelock Avoidance; 4.5.5 Router Architecture: Putting It All Together; 4.5.6 System Level Consideration); 4.6 Experimental Results (4.6.1 Experimental Setup; 4.6.2 Results on Synthetic Traffic Patterns; 4.6.3 Results on Realistic Traffic Patterns; 4.6.4 Results on Real Application Benchmarks; 4.6.5 Fairness Issue; 4.6.6 Area Cost Comparison)
    Chapter 5 Routing for Partially Connected 3D NoC: 5.1 Introduction; 5.2 Background; 5.3 Related Work; 5.4 Proposed Algorithm (5.4.1 Preliminary; 5.4.2 Routing Algorithm for 3-D Stacked Meshes with Regular Partial Vertical Connections; 5.4.3 Routing Algorithm for 3-D Stacked Meshes with Irregular Partial Vertical Connections; 5.4.4 Extension to Heterogeneous Mesh Layers); 5.5 Experimental Results (5.5.1 Experimental Setup; 5.5.2 Experiments on Synthetic Traffics; 5.5.3 Experiments on Application Benchmarks; 5.5.4 Comparison with Reduced Bandwidth Mesh)
    Chapter 6 Conclusion
    Bibliography
    Abstract in Korean (초록)
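    The two ways of limiting the TSV count described in the abstract above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the thesis: it assumes one TSV per data bit, ignores control signals and link direction, and the function names and the example numbers (an 8x8 layer, 128-bit flits, a budget of 2048 TSVs) are hypothetical.

```python
import math


def reduced_width_option(tsv_budget, nodes_per_layer, flit_bits):
    """Option 1: every node keeps a vertical link, but each link is narrower.

    Returns (bits per vertical link, cycles needed to send one flit).
    Assumes one TSV per bit, which is a simplification.
    """
    link_bits = tsv_budget // nodes_per_layer
    if link_bits == 0:
        raise ValueError("budget too small to give every node a vertical link")
    link_bits = min(link_bits, flit_bits)  # no benefit beyond the flit width
    cycles_per_flit = math.ceil(flit_bits / link_bits)
    return link_bits, cycles_per_flit


def partial_connection_option(tsv_budget, nodes_per_layer, flit_bits):
    """Option 2: links stay full width, but only some nodes get one.

    Returns the number of full-width vertical links the budget allows.
    """
    return min(tsv_budget // flit_bits, nodes_per_layer)


if __name__ == "__main__":
    budget, nodes, flit = 2048, 64, 128  # hypothetical example values
    print(reduced_width_option(budget, nodes, flit))
    print(partial_connection_option(budget, nodes, flit))
```

    Under these assumptions the same budget buys either 64 vertical links that each need 4 cycles per flit, or 16 full-width links shared by 64 nodes. The first case is where the abstract proposes deflection routing to hide the serialization latency, and the second is where it proposes rules for choosing which vertical link a packet takes.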

    Routing and Wavelength Assignment for Multicast Communication in Optical Network-on-Chip

    An Optical Network-on-Chip (ONoC) is an emerging chip-level optical interconnection technology for high-performance, power-efficient inter-core communication in many-core processors. Within the field, multicast is one of the most important forms of inter-core communication. It is widely used in parallel computing applications on Chip Multi-Processors (CMPs) and is also common in emerging areas such as neuromorphic computing. While many studies have designed ONoC architectures and routing schemes to support multicast communication, most existing solutions adopt methods that were originally proposed for electrical interconnects. These solutions can neither take full advantage of optical communication nor address the special requirements of an ONoC. Moreover, most of them optimise only a single multicast, which limits their practical use because real systems often have to handle multiple multicasts requested by various applications. Hence, this thesis addresses the design of a high-performance communication scheme for multiple multicasts that takes into account the unique characteristics and constraints of an ONoC.

    This thesis studies the problem from a network-level perspective. The design methodology is to route all multicasts requested simultaneously by the applications in an ONoC, with the objective of using the available wavelengths efficiently. The novelty is to adopt multicast-splitting strategies, in which a multicast can be split into several sub-multicasts according to the distribution of multicast nodes, in order to reduce conflicts between different multicasts. As the routing and wavelength assignment problem is NP-hard, this thesis proposes heuristic approaches that use the multicast-splitting strategy. Specifically, three routing and wavelength assignment schemes for multiple multicasts in an ONoC are proposed for different problem domains.

    Firstly, PRWAMM, a Path-based Routing and Wavelength Assignment for Multiple Multicasts in an ONoC, is proposed. Because an ONoC calls for low manufacturing complexity (e.g., no splitters), PRWAMM studies path-based routing. Two wavelength-assignment strategies for multiple multicasts under path-based routing are proposed. One is an intra-multicast wavelength assignment, which assigns wavelength(s) for one multicast. The other is an inter-multicast wavelength assignment, which assigns wavelength(s) for different multicasts according to the distributions of the multicasts. Simulation results show that PRWAMM can reduce the average number of wavelengths by 15% compared to other path-based schemes.

    Secondly, RWADMM, a Routing and Wavelength Assignment scheme for Distribution-based Multiple Multicasts in a 2D ONoC, is proposed. Because path-based routing lacks flexibility, it cannot reduce link conflicts effectively. RWADMM is therefore designed around the distribution of the different multicasts and includes two algorithms. One is an optimal routing and wavelength assignment algorithm for special distributions of multicast nodes. The other is a heuristic routing and wavelength assignment algorithm for random distributions of multicast nodes. Simulation results show that RWADMM can reduce the number of wavelengths by 21.85% on average compared to the state-of-the-art solutions in a 2D ONoC.

    Thirdly, CRRWAMM, a Cluster-based Routing and Reusable Wavelength Assignment scheme for Multiple Multicasts in a 3D ONoC, is proposed. Because a 3D ONoC differs architecturally from a 2D ONoC (e.g., in the layout of nodes and in the optical routers), the methods designed for a 2D ONoC cannot simply be extended to a 3D ONoC. In CRRWAMM, the distribution of multicast nodes in a mesh-based 3D ONoC is analysed first. Then, routing theorems for special instances are derived. Based on the theorems, a general routing scheme is proposed, which includes a cluster-based routing method and a reusable wavelength assignment method. Simulation results show that CRRWAMM can reduce the number of wavelengths by 33.2% on average compared to other schemes in a 3D ONoC.

    Overall, the three routing and wavelength assignment schemes achieve high-performance multicast communication for multiple multicasts in their respective problem domains in an ONoC. Compared to their counterparts, they all offer low routing complexity, a low wavelength requirement, and good scalability. These methods make an ONoC a flexible high-performance computing platform for executing various parallel applications with different multicast requirements. As future work, I will investigate the power consumption of various routing schemes for multicasts. A multicast-splitting strategy may increase power consumption, since it needs different wavelengths to send packets to different destinations for one multicast, although the reduction in the number of wavelengths used by the schemes can also decrease overall power consumption. Therefore, finding the best trade-off between the total number of wavelengths used and the number of sub-multicasts in order to reduce power consumption will be interesting future research.
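    To illustrate why link conflicts drive the wavelength count discussed above, the following sketch assigns wavelengths with a generic first-fit heuristic: two requests that share a link must use different wavelengths, and each request takes the lowest wavelength index not already used by a conflicting request. This is a textbook-style greedy heuristic and not PRWAMM, RWADMM, or CRRWAMM. The function names, the node numbering (a 4x2 mesh with nodes 0 to 3 on the top row and 4 to 7 on the bottom row), and the three example paths are assumptions made for illustration.

```python
def links(path):
    """Turn a node sequence into the set of directed links it traverses."""
    return {(a, b) for a, b in zip(path, path[1:])}


def assign_wavelengths(routes):
    """First-fit wavelength assignment.

    routes maps a request name to the set of links it occupies.
    Requests that share at least one link receive different wavelengths.
    """
    assignment = {}
    # Place the requests that occupy the most links first (ties by name).
    for name in sorted(routes, key=lambda n: (-len(routes[n]), n)):
        used = {
            assignment[other]
            for other in assignment
            if routes[other] & routes[name]  # non-empty intersection: conflict
        }
        wavelength = 0
        while wavelength in used:
            wavelength += 1
        assignment[name] = wavelength
    return assignment


# Hypothetical path-based routes on a 4x2 mesh.
routes = {
    "m1": links([0, 1, 2, 3]),
    "m2": links([1, 2, 6]),
    "m3": links([4, 5, 6, 7]),
}
assignment = assign_wavelengths(routes)
wavelengths_needed = len(set(assignment.values()))
print(assignment, wavelengths_needed)
```

    In this example m1 and m2 both use the link from node 1 to node 2, so they need different wavelengths, while m3 shares no link with either and can reuse a wavelength. Routing or splitting a multicast so that fewer requests overlap on a link is what lowers the total, which is the effect the three schemes in the abstract aim for.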