197 research outputs found
Highly Reliable Clock Network Design Methodology for Low-Power, High-Performance Digital Systems
Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2015. Kim Taewhan.
As process variation increasingly dominates clock timing variation among chips, a conventional clock network based on a clock tree can no longer guarantee the timing constraints of a digital system. Various techniques have been studied to overcome the limitations of traditional clock design. This dissertation addresses three techniques that are widely used for designing robust clock networks and proposes improved methods for each.
First, it is widely accepted that post-silicon tunable (PST) clock buffers can effectively resolve clock timing violations. Since PST buffers, which can reset the clock delay to flip-flops after the chip is manufactured, incur non-trivial implementation area and control circuitry, it is very important to allocate as few PST buffers as possible while satisfying the chip yield constraint. In this dissertation, we (1) develop a graph-based chip yield computation technique that can update yields very efficiently and accurately for incremental PST buffer allocation, based on which we (2) propose a systematic (bottom-up and top-down with refinement) PST buffer allocation algorithm that is able to fully explore the design space of PST buffer allocation.
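To make the notion of chip yield under tuning concrete, here is a minimal sketch that estimates timing yield by brute-force sampling for a single launch/capture flip-flop pair. It is not the graph-based yield computation of the dissertation; the function name `timing_yield`, the Gaussian variation model, and every numeric value are assumptions chosen only for illustration.

```python
import random

def timing_yield(n_chips, tuning_range, seed=0):
    """Fraction of sampled chips whose skew constraint can be met.

    All numbers below (delays in ps, Gaussian variation) are illustrative
    assumptions, not data from the dissertation.
    """
    rng = random.Random(seed)
    period, setup, hold = 1000.0, 50.0, 30.0
    passed = 0
    for _ in range(n_chips):
        d_max = rng.gauss(980.0, 30.0)   # longest data-path delay
        d_min = rng.gauss(60.0, 10.0)    # shortest data-path delay
        skew = rng.gauss(0.0, 40.0)      # capture clock minus launch clock
        lower = d_max + setup - period   # setup requires skew >= lower
        upper = d_min - hold             # hold requires skew <= upper
        # A tunable buffer on the capture clock shifts the skew by any
        # delta in [-tuning_range, tuning_range]. A feasible delta exists
        # iff [lower - skew, upper - skew] overlaps that interval.
        feasible = (
            lower <= upper
            and lower - skew <= tuning_range
            and upper - skew >= -tuning_range
        )
        if feasible:
            passed += 1
    return passed / n_chips
```

Because the same seed draws the same sample chips for any tuning range, calling `timing_yield(2000, 0.0)` and `timing_yield(2000, 30.0)` compares yield with and without a tunable buffer on identical samples, and the second value can never be smaller than the first.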
Second, clock skew scheduling is one of the essential steps that must be carefully performed during the design process. This dissertation addresses the clock skew optimization problem integrated with the interdependent relation among the setup skew, the hold skew, and the clk-to-Q delay of flip-flops, so that the time margin is set aside more accurately and reliably than with previous methods, which have not taken the integrated problem into account. Specifically, based on an accurate flexible model of setup skew, hold skew, and clk-to-Q delay, we propose a stepwise clock skew scheduling technique in which, at each iteration, the worst slack of setup and hold skews is systematically and incrementally relaxed to maximally extend the time margin.
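As background for the slack relaxation described above, the following sketch shows the classical fixed-value version of the problem for one launch/capture flip-flop pair: the setup and hold constraints bound the skew from below and above, and the midpoint of that window maximizes the worse of the two slacks. This assumes setup time, hold time, and clk-to-Q delay are constants, which is exactly the simplification the flexible model removes; the function names and the convention that skew is the capture clock arrival minus the launch clock arrival are assumptions made here.

```python
def best_skew(period, d_max, d_min, setup, hold, clk_to_q):
    """Return (skew, worst_slack) maximizing the smaller of the two slacks."""
    lower = clk_to_q + d_max + setup - period  # setup requires skew >= lower
    upper = clk_to_q + d_min - hold            # hold requires skew <= upper
    skew = (lower + upper) / 2.0               # midpoint balances both slacks
    worst_slack = (upper - lower) / 2.0        # negative if no feasible skew
    return skew, worst_slack

def slacks(skew, period, d_max, d_min, setup, hold, clk_to_q):
    """Return (setup_slack, hold_slack) for a given skew."""
    setup_slack = skew + period - (clk_to_q + d_max + setup)
    hold_slack = clk_to_q + d_min - hold - skew
    return setup_slack, hold_slack
```

For example, with a 1000 ps period, data-path delays between 100 ps and 900 ps, a 50 ps setup time, a 30 ps hold time, and an 80 ps clk-to-Q delay, zero skew violates setup by 30 ps, while `best_skew` returns a 90 ps skew that leaves 60 ps of slack on both constraints.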
Lastly, clock trees with cross links and clock spines have intermediate characteristics in skew tolerance and power consumption compared to clock trees and clock meshes, which are the two extreme structures of clock networks. Unlike the clock tree with links between clock nodes, which is an incremental modification of the clock tree structure, the clock spine network is a completely separate structure from both the tree and the mesh. Consequently, it is essential to develop a synthesis algorithm for clock spines that is compatible with the existing synthesis algorithms for clock trees and clock meshes. To this end, this dissertation first addresses the problem of automating the synthesis of clock-gated clock spines with the objective of minimizing total clock power while meeting the clock skew and slew constraints. The key idea of our proposed synthesis algorithm is to identify and group the flip-flops with tightly correlated clock-gating operations to form a spine, while accurately predicting and maintaining clock skew and slew variations through buffer insertion and stub allocation.
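The grouping idea can be illustrated with a small sketch that clusters flip-flops by the similarity of their clock-gating activity patterns. This is a greedy heuristic assumed purely for illustration, not the power-aware sink clustering of the dissertation, and it ignores placement, skew, and slew; the names `group_by_activity`, `group_activity`, and `max_mismatch` are hypothetical.

```python
def group_by_activity(patterns, max_mismatch):
    """Greedily group flip-flops whose activity patterns nearly coincide.

    patterns: dict mapping flip-flop name -> string of '0'/'1' per cycle
              ('1' means the clock is enabled in that cycle).
    Returns a list of lists of flip-flop names.
    """
    groups = []  # each entry: (representative pattern, member names)
    for name, pattern in patterns.items():
        for representative, members in groups:
            mismatch = sum(a != b for a, b in zip(representative, pattern))
            if mismatch <= max_mismatch:
                members.append(name)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((pattern, [name]))
    return [members for _, members in groups]

def group_activity(patterns, members):
    """Fraction of cycles in which a gated group must be clocked.

    A shared gate has to be open whenever any member is active, so the
    group's activity is the cycle-wise OR of the member patterns.
    """
    n_cycles = len(patterns[members[0]])
    active = sum(
        any(patterns[m][cycle] == "1" for m in members)
        for cycle in range(n_cycles)
    )
    return active / n_cycles
```

Grouping flip-flops with mismatched patterns raises the OR-ed activity of the group and therefore the clock power behind a shared gate, which is why tightly correlated flip-flops are the natural candidates to share a spine.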
In summary, this dissertation presents clock tuning techniques that consider post-silicon tuning, a flexible flip-flop timing model, and a clock-gated clock spine synthesis algorithm.
Table of contents:
Abstract
Chapter 1 INTRODUCTION
1.1 Clock Distribution Network
1.2 Process Variation
1.3 Flexible Flip-flop Timing Model
1.4 Clock Spine
1.5 Contributions of This Dissertation
Chapter 2 POST-SILICON TUNABLE CLOCK BUFFER ALLOCATION BASED ON FAST CHIP YIELD COMPUTATION
2.1 Introduction
2.2 Systematic Exploration of PST Buffer Allocation
2.2.1 Observations
2.2.2 Problem Definition
2.2.3 Allocation Algorithm
2.3 Fast Timing Yield Computation
2.3.1 Preliminaries
2.3.2 Incremental Yield Computation
2.4 Experimental Result
2.5 PST Buffer Configuration Techniques
2.6 Summary
Chapter 3 POST-SILICON TUNING BASED ON FLEXIBLE FLIP-FLOP TIMING
3.1 Introduction
3.2 Preliminary and Definitions
3.2.1 Flexible Flip-Flop Timing Model
3.2.2 Definitions
3.3 Motivational Examples
3.4 Clock Skew Scheduling for Slack Relaxation Based on Flexible Flip-Flop Timing
3.4.1 Overall Flow
3.4.2 Finding Local Clock Skew Schedule
3.5 Experimental Results
3.6 Summary
Chapter 4 SYNTHESIS FOR POWER-AWARE CLOCK SPINES
4.1 Introduction
4.2 Preliminaries and Motivation
4.2.1 Clock Spine
4.2.2 Activity Patterns
4.2.3 Power Computation
4.3 Algorithm for Clock Spine Synthesis
4.3.1 Problem Definition
4.3.2 Power-Aware Sink Clustering
4.3.3 Spine Relaxation
4.3.4 Spine Buffer Allocation
4.3.5 Top-Level Tree Construction
4.4 Experimental Results
4.5 Summary
Chapter 5 CONCLUSION
5.1 Chapter 2
5.2 Chapter 3
5.3 Chapter 4
Bibliography
Abstract (in Korean)
Variability-Aware VLSI Design Automation For Nanoscale Technologies
As technology scaling enters the nanometer regime, design of large scale ICs gets more challenging due to shrinking feature sizes and increasing design complexity. Aggressive scaling causes significant degradation in reliability, increased susceptibility to fabrication and environmental randomness and increased dynamic and leakage power dissipation. In this work, we investigate these scaling issues in large scale integrated systems.
This dissertation develops variability-aware design methodologies, proposing design analysis, design-time optimization, post-silicon tunability, and runtime-adaptivity based optimization techniques for handling variability. We discuss our research in the area of variability-aware analysis, specifically focusing on the problem of statistical timing analysis. The first technique presents the concept of error budgeting, which achieves significant runtime speedups during statistical timing analysis. The second work presents a general framework for non-linear non-Gaussian statistical timing analysis considering correlations.
Further, we present our work on design-time optimization schemes that are applicable during physical synthesis. Firstly, we present a buffer insertion technique that considers wire-length uncertainty and proposes algorithms to perform probabilistic buffer insertion. Secondly, we present a stochastic optimization framework based on the Monte-Carlo technique that considers fabrication variability. This optimization framework can be applied to problems that can be modeled as linear programs without imposing any assumptions on the nature of the variability.
Subsequently, we present our work on post-silicon tunability based design optimization. This work presents a design management framework that can be used to balance the effort spent on pre-silicon optimization (through gate sizing) and post-silicon optimization (through tunable clock-tree buffers) while maximizing the yield gains. Lastly, we present our work on variability-aware runtime optimization techniques. We look at the problem of runtime supply voltage scaling for dynamic power optimization, and propose a framework to consider the impact of variability on the reliability of such designs. We propose a probabilistic design synthesis technique where reliability of the design is a primary optimization metric.
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.
The Smart Stone Network: Design and Protocols
The Smart Stone Protocol (SSP) has been developed to achieve rapid synchronization in a wireless sensor network, establish Time Division Multiple Access (TDMA) communication slots, and perform distributed sensing with global shared awareness. The SSP achieves a synchronization precision of 50 μs among receivers. The sender is synchronized to the receivers using a novel scheme to identify the closest comparable times on the sender and receiver. The protocol is tightly related to events that occur in the mote hardware, and is designed to operate on resource-constrained wireless sensor motes. Robust TDMA communication slots are set up based on the achieved synchronization, and an innovative algorithm is employed to maintain synchronization without sending any additional synchronization bytes. To test and validate the protocol, Smart Stones have been custom designed using commercial off-the-shelf (COTS) components, and the SSP has been successfully demonstrated on the Smart Stone Network performing an acoustic sensing application.
Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip
The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called System-on-Chip (SoC) or Multi-Processor System-on-Chip (MPSoC).
MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt.
Networks-on-Chips (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
• The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
• Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
• NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
• Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to assess their suitability for next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics among themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.
Knowledge representation into Ada parallel processing
The Knowledge Representation into Ada Parallel Processing project is a joint NASA and Air Force funded project to demonstrate the execution of intelligent systems in Ada on the Charles Stark Draper Laboratory fault-tolerant parallel processor (FTPP). Two applications were demonstrated: a portion of the adaptive tactical navigator and a real-time controller. Both systems are implemented as Activation Framework Objects on the Activation Framework intelligent scheduling mechanism developed by Worcester Polytechnic Institute. The implementations, the results of performance analyses showing speedup due to parallelism, and initial efficiency improvements are detailed, and further areas for performance improvement are suggested.
Working Notes from the 1992 AAAI Spring Symposium on Practical Approaches to Scheduling and Planning
The symposium presented issues involved in the development of scheduling systems that can deal with resource and time limitations. To qualify, a system must be implemented and tested to some degree on non-trivial problems (ideally, on real-world problems). However, a system need not be fully deployed to qualify. Systems that schedule actions in terms of metric time constraints typically represent and reason about an external numeric clock or calendar and can be contrasted with those systems that represent time purely symbolically. The following topics are discussed: integrating planning and scheduling; integrating symbolic goals and numerical utilities; managing uncertainty; incremental rescheduling; managing limited computation time; anytime scheduling and planning algorithms and systems; dependency analysis and schedule reuse; management of schedule and plan execution; and incorporation of discrete event techniques.
Mapreduce and Heterogeneity: Power-Aware Bag-of-Tasks, Framework Parameter Sensitivity, and Dynamic Cluster Aware Framework Configuration
This dissertation presents the techniques for adaptation of MapReduce frameworks to incorporate heterogeneity-aware scheduling algorithms, an inspection of cluster configurations and how they impact these scheduling algorithms, an analysis regarding how the cluster configuration and the heterogeneity-aware scheduling can work together to minimize turnaround time and/or power consumption of the cluster when executing MapReduce applications, and how these lessons can be applied more broadly to Big Data infrastructure outside of MapReduce that supports multiple Big Data frameworks simultaneously.
Heterogeneity exists in various capacities in any given cluster, from static (Physical and Platform) heterogeneity to dynamic heterogeneity (Transient Data, Transient Applications, and Irregular Hardware Behavior). Historically, several types of mitigation strategies have been used within the cluster for each of these types of heterogeneity, and each has its pros and cons. We discuss these mitigation strategies, the types of heterogeneity each of them is able to address, and the history of the related work in the field.
After this, we consider taking host-level metrics and using them to schedule tasks in real time, with a desire to address cluster-wide energy usage. To do this, we consider estimators for power consumption that are available on-chip, namely temperature. We establish a correlation between CPU temperature and power consumption, then derive a scheduling algorithm that eliminates nodes that are consuming too much power from the pool of schedule-able resources. In order to do this we focus on the ability of MapReduce frameworks, constructed as we have constructed the frameworks described in this thesis, to delay binding of tasks to specific workers. We analyze the impacts this has on turnaround time of a MapReduce application, with analysis around setting this threshold properly to reduce impact on turnaround time while shifting power consumption around in the cluster, away from nodes that are over-consuming.
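The threshold-based scheduling described above can be sketched as follows, assuming the framework keeps tasks unbound in a queue until a worker is chosen. The function `assign_tasks`, its parameters, and the one-task-per-node-per-round policy are hypothetical choices for illustration, not the scheduling algorithm derived in the thesis.

```python
from collections import deque

def assign_tasks(tasks, node_temps, max_temp_c):
    """Bind queued tasks only to nodes at or below the temperature threshold.

    tasks: iterable of task ids, kept unbound until a worker is chosen.
    node_temps: dict mapping node name -> current CPU temperature (Celsius),
                used here as a stand-in for power consumption.
    Returns (assignments, still_pending).
    """
    # Nodes running too hot are removed from the pool of schedulable workers.
    eligible = sorted(n for n, temp in node_temps.items() if temp <= max_temp_c)
    pending = deque(tasks)
    assignments = {}
    for node in eligible:  # one task per eligible node in this round
        if not pending:
            break
        assignments[pending.popleft()] = node
    # Tasks left over stay unbound and wait for the next scheduling round.
    return assignments, list(pending)
```

With three tasks, node temperatures of 55, 82, and 60 degrees, and a 70 degree threshold, the hot node receives nothing and one task remains pending, which illustrates the turnaround-time cost that motivates setting the threshold carefully.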
We also address concerns with respect to upgrading a cluster in stages, introducing more Physical Heterogeneity at various levels and the types of adjustments that need to be made to MapReduce configurations in order to combat the increased Heterogeneity. In particular, we look at the concerns for MapReduce platform mis-configuration and its impacts on turnaround time, analyzing the ways in which these types of errors can be mitigated between incremental platform upgrades. In an effort to address this, we introduce a Dynamic Heterogeneity Awareness (DHA) module to our MapReduce framework in order to address these upgrades, and allow better spreading of tasks by the framework, in order to further improve turnaround time and resource utilization.
Finally, we consider the implications for framework and application co-tenancy, and we describe the state of the art in these areas. We focus on describing what co-tenancy is, why it's important, and how the state of the art can be expanded to leverage findings from this thesis so that these co-tenant clusters increase application and framework performance while also improving energy efficiency.
- …