957 research outputs found
Desynchronization: Synthesis of asynchronous circuits from synchronous specifications
Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architecture.
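As a rough illustration of the four-phase (return-to-zero) handshake mentioned in the abstract above, the sketch below traces the request and acknowledge wires through complete transfers. It shows only the generic signalling order, not any of the latch controllers studied in the paper; the function name and the trace representation are assumptions made for this example.

```python
def four_phase_handshake(n_transfers):
    """Return the (req, ack) trace of n_transfers generic four-phase handshakes."""
    req, ack = 0, 0
    trace = [(req, ack)]          # idle state: both wires low
    for _ in range(n_transfers):
        req = 1                   # sender raises request (data is valid)
        trace.append((req, ack))
        ack = 1                   # receiver captures data and raises acknowledge
        trace.append((req, ack))
        req = 0                   # sender returns request to zero
        trace.append((req, ack))
        ack = 0                   # receiver returns acknowledge to zero
        trace.append((req, ack))
    return trace


one_transfer = four_phase_handshake(1)
```

Each transfer costs four signal transitions, which is why the protocol is called four-phase; a two-phase protocol would treat every transition on a wire as an event instead.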
Timing Analysis and Optimization in Logic and Physical Synthesis
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Taewhan Kim.
Timing analysis is one of the necessary steps in the development of a semiconductor circuit. In addition, it is increasingly important in advanced process technologies due to various factors, including the increase in process-voltage-temperature variation. This dissertation addresses three problems related to timing analysis and optimization in logic and physical synthesis. Firstly, most static timing analyses today are based on conventional fixed flip-flop timing models, in which every flip-flop is assumed to have a fixed clock-to-Q delay. In reality, however, setup and hold skews affect the clock-to-Q delay. In this dissertation, I propose a mathematical formulation to solve the problem and apply it to clock skew scheduling problems as well as to the analysis of a given circuit, with a scalable speedup technique. Secondly, near-threshold computing is one of the promising concepts for energy-efficient operation of VLSI systems, but wide performance variation and nonlinearity with respect to process variations hinder its adoption. To cope with this, I propose a holistic hardware performance monitoring methodology for accurate timing prediction in a near-threshold voltage regime and advanced process technology. Lastly, asynchronous circuits are one of the alternatives to the conventional synchronous style, and asynchronous pipeline circuits are especially attractive because of their small design effort. This dissertation addresses the synthesis problem of lightening two-phase bundled-data asynchronous pipeline controllers, in which delay buffers are essential for guaranteeing correct handshaking operation but incur a considerable area increase.
1 INTRODUCTION
1.1 Flexible Flip-Flop Timing Model
1.2 Hardware Performance Monitoring Methodology
1.3 Asynchronous Pipeline Controller
1.4 Contributions of this Dissertation
2 ANALYSIS AND OPTIMIZATION CONSIDERING FLEXIBLE FLIP-FLOP TIMING MODEL
2.1 Preliminaries
2.1.1 Terminologies
2.1.2 Timing Analysis
2.1.3 Clock-to-Q Delay Surface Modeling
2.2 Clock-to-Q Delay Interval Analysis
2.2.1 Derivation
2.2.2 Additional Constraints
2.2.3 Analysis: Finding Minimum Clock Period
2.2.4 Optimization: Clock Skew Scheduling
2.2.5 Scalable Speedup Technique
2.3 Experimental Results
2.3.1 Application to Minimum Clock Period Finding
2.3.2 Application to Clock Skew Scheduling
2.3.3 Efficacy of Scalable Speedup Technique
2.4 Summary
3 HARDWARE PERFORMANCE MONITORING METHODOLOGY AT NTC AND ADVANCED TECHNOLOGY NODE
3.1 Overall Flow of Proposed HPM Methodology
3.2 Prerequisites to HPM Methodology
3.2.1 BEOL Process Variation Modeling
3.2.2 Surrogate Model Preparation
3.3 HPM Methodology: Design Phase
3.3.1 HPM2PV Model Construction
3.3.2 Optimization of Monitoring Circuits Configuration
3.3.3 PV2CPT Model Construction
3.4 HPM Methodology: Post-Silicon Phase
3.4.1 Transfer Learning in Silicon Characterization Step
3.4.2 Procedures in Volume Production Phase
3.5 Experimental Results
3.5.1 Experimental Setup
3.5.2 Exploration of Monitoring Circuits Configuration
3.5.3 Effectiveness of Monitoring Circuits Optimization
3.5.4 Considering BEOL PVs and Uncertainty Learning
3.5.5 Comparison among Different Prediction Flows
3.5.6 Effectiveness of Prediction Model Calibration
3.6 Summary
4 LIGHTENING ASYNCHRONOUS PIPELINE CONTROLLER
4.1 Preliminaries and State-of-the-Art Work
4.1.1 Bundled-data vs. Dual-rail Asynchronous Circuits
4.1.2 Two-phase vs. Four-phase Bundled-data Protocol
4.1.3 Conventional State-of-the-Art Pipeline Controller Template
4.2 Delay Path Sharing for Lightening Pipeline Controller Template
4.2.1 Synthesizing Sharable Delay Paths
4.2.2 Validating Logical Correctness for Sharable Delay Paths
4.2.3 Reformulating Timing Constraints of Controller Template
4.2.4 Minimally Allocating Delay Buffers
4.3 In-depth Pipeline Controller Template Synthesis with Delay Path Reusing
4.3.1 Synthesizing Delay Path Units
4.3.2 Validating Logical Correctness of Delay Path Units
4.3.3 Updating Timing Constraints for Delay Path Units
4.3.4 In-depth Synthesis Flow Utilizing Delay Path Units
4.4 Experimental Results
4.4.1 Environment Setup
4.4.2 Piecewise Linear Modeling of Delay Path Unit Area
4.4.3 Comparison of Power, Performance, and Area
4.5 Summary
5 CONCLUSION
5.1 Chapter 2
5.2 Chapter 3
5.3 Chapter 4
Abstract (In Korean)
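For readers unfamiliar with the conventional fixed flip-flop timing model that the dissertation abstract takes as its starting point, the sketch below computes the minimum clock period under that model for a given clock skew schedule. It uses the textbook setup constraint s_i + t_cq + D_max + t_setup <= s_j + T, with one fixed clock-to-Q delay for every flip-flop; it is not the dissertation's flexible formulation. The function name, the flip-flop labels, and all delay values are hypothetical.

```python
def min_clock_period(paths, skew, t_cq, t_setup):
    """Smallest period T satisfying every setup constraint under the fixed model.

    paths: list of (launch_ff, capture_ff, max_combinational_delay)
    skew:  clock arrival time at each flip-flop
    """
    return max(
        skew[launch] + t_cq + d_max + t_setup - skew[capture]
        for launch, capture, d_max in paths
    )


# Hypothetical three-flip-flop pipeline, delays in nanoseconds.
paths = [("ff1", "ff2", 0.80), ("ff2", "ff3", 0.50)]
zero_skew = {"ff1": 0.0, "ff2": 0.0, "ff3": 0.0}
scheduled = {"ff1": 0.0, "ff2": 0.1, "ff3": 0.0}

period_zero_skew = min_clock_period(paths, zero_skew, t_cq=0.10, t_setup=0.05)
period_scheduled = min_clock_period(paths, scheduled, t_cq=0.10, t_setup=0.05)
```

Delaying the clock of the middle flip-flop gives the longer first stage more time at the expense of the shorter second stage, which is the basic idea behind clock skew scheduling. Hold constraints are omitted here to keep the sketch short.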
Elastic circuits
Elasticity in circuits and systems provides tolerance to variations in computation and communication delays. This paper presents a comprehensive overview of elastic circuits for those designers who are mainly familiar with synchronous design. Elasticity can be implemented both synchronously and asynchronously, although it was traditionally more often associated with asynchronous circuits. This paper shows that synchronous and asynchronous elastic circuits can be designed, analyzed, and optimized using similar techniques. Thus, choices between synchronous and asynchronous implementations are localized and deferred until late in the design process.
Peer Reviewed. Postprint (published version)
Low Power Processor Architectures and Contemporary Techniques for Power Optimization - A Review
Technological evolution has significantly increased the number of transistors for a given die area and raised switching speeds from a few MHz to the GHz range. This decline in size and boost in performance consequently demand a lower supply voltage and effective power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power reduction techniques for almost every aspect of the chip, and particularly for the processor cores contained in the chip. This paper presents an overview of techniques for achieving power efficiency mainly at the processor core level, but also visits related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, cache and pipelining, can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Emerging power-efficient processor architectures are also reviewed, and research activities are discussed, which should help the reader identify how these factors in a processor contribute to power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
Dynamic Power Management for Neuromorphic Many-Core Systems
This work presents a dynamic power management architecture for neuromorphic
many core systems such as SpiNNaker. A fast dynamic voltage and frequency
scaling (DVFS) technique is presented which allows the processing elements (PE)
to change their supply voltage and clock frequency individually and
autonomously within less than 100 ns. This is employed by the neuromorphic
simulation software flow, which defines the performance level (PL) of the PE
based on the actual workload within each simulation cycle. A test chip in 28 nm
SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled
from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct
PLs. By measurement of three neuromorphic benchmarks it is shown that the total
PE power consumption can be reduced by 75%, with 80% baseline power reduction
and a 50% reduction of energy per neuron and synapse computation, all while
maintaining temporary peak system performance to achieve biological real-time
operation of the system. A numerical model of this power management model is
derived which allows DVFS architecture exploration for neuromorphics. The
proposed technique is to be used for the second generation SpiNNaker
neuromorphic many core system.
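To make the workload-driven choice of performance level in the abstract above more concrete, the sketch below picks the slowest level that still finishes a simulation cycle in time. Only the end points (0.7 V at 125 MHz and 1.0 V at 500 MHz) come from the abstract; the middle level, the workload expressed as a clock-cycle count, the 1 ms simulation cycle, and all names are assumptions. The power estimate uses the standard CMOS dynamic power relation P ~ C V^2 f and ignores leakage, so it is not the numerical model derived in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceLevel:
    name: str
    vdd: float     # supply voltage in volts
    f_mhz: float   # clock frequency in MHz


# End points are taken from the abstract; PL2 is a hypothetical middle level.
LEVELS = [
    PerformanceLevel("PL1", 0.70, 125.0),
    PerformanceLevel("PL2", 0.85, 250.0),  # assumed values
    PerformanceLevel("PL3", 1.00, 500.0),
]


def select_level(cycles_needed, sim_cycle_ms=1.0):
    """Return the slowest level whose clock budget covers the workload."""
    for level in LEVELS:
        # MHz * 1e6 cycles/s times ms * 1e-3 s gives the available clock cycles.
        budget = level.f_mhz * 1e3 * sim_cycle_ms
        if cycles_needed <= budget:
            return level
    return LEVELS[-1]  # workload exceeds every budget: run at peak performance


def relative_dynamic_power(level, reference=LEVELS[-1]):
    """Dynamic power relative to the reference level, using P ~ V^2 * f."""
    return (level.vdd / reference.vdd) ** 2 * (level.f_mhz / reference.f_mhz)


light = select_level(100_000)
heavy = select_level(400_000)
saving = relative_dynamic_power(light)
```

Under these assumptions the lowest level draws roughly 12% of the peak dynamic power, which gives a feel for why running most simulation cycles at a low level pays off.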
Delay Measurements and Self Characterisation on FPGAs
This thesis examines new timing measurement methods for self delay characterisation of Field-Programmable Gate Array (FPGA) components and delay measurement of complex circuits
on FPGAs. Two novel measurement techniques based on analysis of a circuit's output failure
rate and transition probability are proposed for accurate, precise and efficient measurement of
propagation delays. The transition probability based method is especially attractive, since
it requires no modifications in the circuit-under-test and requires little hardware resources,
making it an ideal method for physical delay analysis of FPGA circuits.
The relentless advancements in process technology have led to smaller and denser transistors
in integrated circuits. While FPGA users benefit from this in terms of increased hardware
resources for more complex designs, the actual productivity with FPGA in terms of timing
performance (operating frequency, latency and throughput) has lagged behind the potential
improvements from the improved technology due to delay variability in FPGA components
and the inaccuracy of timing models used in FPGA timing analysis. The ability to measure
delay of any arbitrary circuit on FPGA offers many opportunities for on-chip characterisation
and physical timing analysis, allowing delay variability to be accurately tracked and variation-aware optimisations to be developed, reducing the productivity gap observed in today's FPGA
designs.
The measurement techniques are developed into complete self measurement and characterisation platforms in this thesis, demonstrating their practical uses in actual FPGA hardware for
cross-chip delay characterisation and accurate delay measurement of both complex combinatorial and sequential circuits, further reinforcing their positions in solving the delay variability
problem in FPGAs.
A Review of Bayesian Methods in Electronic Design Automation
The utilization of Bayesian methods has been widely acknowledged as a viable
solution for tackling various challenges in electronic integrated circuit (IC)
design under stochastic process variation, including circuit performance
modeling, yield/failure rate estimation, and circuit optimization. As the
post-Moore era brings about new technologies (such as silicon photonics and
quantum circuits), many of the associated issues there are similar to those
encountered in electronic IC design and can be addressed using Bayesian
methods. Motivated by this observation, we present a comprehensive review of
Bayesian methods in electronic design automation (EDA). By doing so, we hope to
equip researchers and designers with the ability to apply Bayesian methods in
solving stochastic problems in electronic circuits and beyond.
Comment: 24 pages, a draft version. We welcome comments and feedback.
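As a small example of the kind of Bayesian yield estimation the abstract above refers to, the sketch below updates a Beta prior on circuit yield with pass/fail results from simulated or measured samples. This is the textbook Beta-Binomial conjugate update, not a method taken from the review; the function name and the sample counts are hypothetical.

```python
def beta_yield_posterior(passes, fails, alpha=1.0, beta=1.0):
    """Posterior Beta(a, b) on yield after observing pass/fail counts.

    alpha and beta define the prior; the default Beta(1, 1) is uniform.
    Returns (a, b, posterior_mean, posterior_variance).
    """
    a = alpha + passes
    b = beta + fails
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, variance


# Hypothetical Monte Carlo run: 95 of 100 sampled circuits meet the spec.
a, b, mean_yield, var_yield = beta_yield_posterior(passes=95, fails=5)
```

The posterior variance shrinks as more samples are observed, so it can be used to decide when further simulations are no longer worth their cost.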