65 research outputs found

    Clock routing for high performance microprocessor designs.

    Get PDF
    Tian, Haitong.Chinese abstract is on unnumbered page.Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.Includes bibliographical references (p. 65-74).Abstracts in English and Chinese.Abstract --- p.iAcknowledgement --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivations --- p.1Chapter 1.2 --- Our Contributions --- p.2Chapter 1.3 --- Organization of the Thesis --- p.3Chapter 2 --- Background Study --- p.4Chapter 2.1 --- Traditional Clock Routing Problem --- p.4Chapter 2.2 --- Tree-Based Clock Routing Algorithms --- p.5Chapter 2.2.1 --- Clock Routing Using H-tree --- p.5Chapter 2.2.2 --- Method of Means and Medians(MMM) --- p.6Chapter 2.2.3 --- Geometric Matching Algorithm (GMA) --- p.8Chapter 2.2.4 --- Exact Zero-Skew Algorithm --- p.9Chapter 2.2.5 --- Deferred Merge Embedding (DME) --- p.10Chapter 2.2.6 --- Boundary Merging and Embedding (BME) Algorithm --- p.14Chapter 2.2.7 --- Planar Clock Routing Algorithm --- p.17Chapter 2.2.8 --- Useful-skew Tree Algorithm --- p.18Chapter 2.3 --- Non-Tree Clock Distribution Networks --- p.19Chapter 2.3.1 --- Grid (Mesh) Structure --- p.20Chapter 2.3.2 --- Spine Structure --- p.20Chapter 2.3.3 --- Hybrid Structure --- p.21Chapter 2.4 --- Post-grid Clock Routing Problem --- p.22Chapter 2.5 --- Limitations of the Previous Work --- p.24Chapter 3 --- Post-Grid Clock Routing Problem --- p.26Chapter 3.1 --- Introduction --- p.26Chapter 3.2 --- Problem Definition --- p.27Chapter 3.3 --- Our Approach --- p.30Chapter 3.3.1 --- Delay-driven Path Expansion Algorithm --- p.31Chapter 3.3.2 --- Pre-processing to Connect Critical ports --- p.34Chapter 3.3.3 --- Post-processing to Reduce Capacitance --- p.36Chapter 3.4 --- Experimental Results --- p.39Chapter 3.4.1 --- Experiment Setup --- p.39Chapter 3.4.2 --- Validations of the Delay and Slew Estimation --- p.39Chapter 3.4.3 --- Comparisons with the Tree Grow (TG) Approach --- p.41Chapter 3.4.4 --- Lowest Achievable Delays --- p.42Chapter 3.4.5 --- Simulation Results --- p.42Chapter 4 --- Non-tree Based Post-Grid Clock Routing Problem --- p.44Chapter 4.1 --- Introduction --- p.44Chapter 4.2 --- Handling Ports with Large Load Capacitances --- p.46Chapter 4.2.1 --- Problem Ports Identification --- p.47Chapter 4.2.2 --- Non-Tree Construction --- p.47Chapter 4.2.3 --- Wire Link Selection --- p.48Chapter 4.3 --- Path Expansion in Non-tree Algorithm --- p.51Chapter 4.4 --- Limitations of the Non-tree Algorithm --- p.51Chapter 4.5 --- Experimental Results --- p.51Chapter 4.5.1 --- Experiment Setup --- p.51Chapter 4.5.2 --- Validations of the Delay and Slew Estimation --- p.52Chapter 4.5.3 --- Lowest Achievable Delays --- p.53Chapter 4.5.4 --- Results on New Benchmarks --- p.53Chapter 4.5.5 --- Simulation Results --- p.55Chapter 5 --- Efficient Partitioning-based Extension --- p.57Chapter 5.1 --- Introduction --- p.57Chapter 5.2 --- Partition-based Extension --- p.58Chapter 5.3 --- Experimental Results --- p.61Chapter 5.3.1 --- Experiment Setup --- p.61Chapter 5.3.2 --- Running Time Improvement with Partitioning Technique --- p.61Chapter 6 --- Conclusion --- p.63Bibliography --- p.6

    Clock Distribution Network Optimization by Sequential Quadratic Programing

    Get PDF
    Clock mesh is widely used in microprocessor designs for achieving low clock skew and high process variation tolerance. Clock mesh optimization is a very diffcult problem to solve because it has a highly connected structure and requires accurate delay models which are computationally expensive. Existing methods on clock network optimization are either restricted to clock trees, which are easy to be separated into smaller problems, or naive heuristics based on crude delay models. A clock mesh sizing algorithm, which is aimed to minimize total mesh wire area with consideration of clock skew constraints, has been proposed in this research work. This algorithm is a systematic solution search through rigorous Sequential Quadratic Programming (SQP). The SQP is guided by an efficient adjoint sensitivity analysis which has near-SPICE(Simulation Program for Integrated Circuits Emphasis)-level accuracy and faster-than-SPICE speed. Experimental results on various benchmark circuits indicate that this algorithm leads to substantial wire area reduction while maintaining low clock skew in the clock mesh. The reduction in mesh area achieved is about 33%

    Temperature-aware clock tree synthesis considering spatiotemporal hot spot correlations

    Full text link
    Temperature variation in microprocessors is a workload dependent problem. In such a design, the clock skew should be minimized with respect to temperature variation. Existing work has studied clock tree embedding perturbation consid-ering time variant temperature variation. There is no existing method that can reduce skew variation. This paper develops an efficient yet effective simultaneous hotspot avoid embedding and thermal aware routing (TMST) method, where hotspot embedding avoid tree topology located in area with high temperature possibility and thermal aware routing reduce skew in tree path with more smooth temperature area. With a thermally tolerable tree structure, our method can reduce not only delay skew but also skew variation (skew violation range). Compared with existing temperature-aware clock tree method, our TMST solution reduces skew variation by 2X compared with the Greedy-DME (GDME) method of Edahiro and existing thermal aware clock synthesis TACO and PECO. With the scale from 100 down to 1 temperature maps, our TMST also guarantees the smallest wire length overflow. TMST reduces the worst case skew up to 4X than PECO and 5X than TACO. I

    메쉬 기반의 클락 네트워크 설계 방법론

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2015. 2. 김태환.The clock distribution network in a synchronous digital circuit delivers a clock signal to every storage element i.e., clock sink in the circuit. However, since the continued technology scaling increases PVT (process-voltage-temperature) variation, the increase of clock skew variation is highly likely to cause performance degradation or system failure at run time. Recently, to mitigate the clock skew variation, many researchers have taken a profound interest in the clock mesh network. However, though the structure of clock mesh network is excellent in tolerating timing variation, it demands significantly high power consumption due to the use of excessive mesh wire and buffer resources. Thus, optimizing the resources required in the mesh clock synthesis while maintaining the variation tolerance is crucially important. The three major tasks that greatly affect the cost of resulting clock mesh are (1) mesh segment allocation, (2) mesh buffer allocation and sizing, and (3) clock sink binding to mesh segments. Previous clock mesh optimization approaches solve the three tasks sequentially, one by one at a time, to manage the run time complexity of the tasks at the expense of losing the quality of results. However, since the three tasks are tightly inter-related, simultaneously optimizing all three tasks is essential, if the run time is ever permitted, to synthesize an economical clock mesh network. In this dissertation, we propose an approach which is able to tackle the problem in an integrated fashion by combining the three tasks into an iterative framework of incremental updates and solving them simultaneously to find a globally optimal allocation of mesh resources while taking into account the clock skew tolerance constraints. The core parts of this dissertation are a precise analysis on the relation among the resource optimization tasks and an establishment of mechanism for effective and efficient integration of the tasks. In particular, to handle the run time problem, we propose a set of speed-up techniques i.e., modeling RC circuit for eliminating redundant matrix multiplications, exploiting sliding window scheme, and fast buffer sizing effect estimation, which are fitted into our context of fast clock skew estimation in mesh resource optimization as well as an invention of early decision policies. In summary, this dissertation presents the efficient design methodology for clock mesh synthesis with consideration on integration of three tasks and reduction of runtime complexity.Abstract i Contents iii List of Figures vi List of Tables x 1 Introduction 1 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Contributions of This Dissertation . . . . . . . . . . . . . . . . . . . 3 2 Background 5 2.1 Clock Distribution Network . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Clock Network Topologies . . . . . . . . . . . . . . . . . . . . . . . 6 2.3 Design Metrics of Clock Network . . . . . . . . . . . . . . . . . . . 7 2.4 The Effects of Variations on Clock Skew . . . . . . . . . . . . . . . . 9 3 Clock Mesh Synthesis Flow 12 3.1 Elements of Clock Mesh . . . . . . . . . . . . . . . . . . . . . . . . 12 3.2 Conventional Clock Mesh Synthesis Overview . . . . . . . . . . . . . 13 3.3 Initial Grid Generation . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.4 Mesh Buffer Placement and Sizing . . . . . . . . . . . . . . . . . . . 14 3.5 Clock Mesh Optimization . . . . . . . . . . . . . . . . . . . . . . . . 17 4 Integrated Resource Allocation and Binding in Clock Mesh Synthesis 19 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.2 Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4.3 Framework of Clock Mesh Optimization . . . . . . . . . . . . . . . . 26 4.3.1 Incremental Resource Updates . . . . . . . . . . . . . . . . . 29 4.3.2 Constraints for Variation Tolerance . . . . . . . . . . . . . . 34 4.3.3 Early Decision Policies . . . . . . . . . . . . . . . . . . . . . 38 4.3.4 Time Complexity Analysis . . . . . . . . . . . . . . . . . . . 39 4.4 Fast Clock Skew Estimation Techniques . . . . . . . . . . . . . . . . 40 4.4.1 Partially Reusing Matrix Multiplication for Incremental Updates 41 4.4.2 Adopting Sliding Window Scheme . . . . . . . . . . . . . . . 43 4.4.3 Adjusting Delay Caused by Buffer Resizing . . . . . . . . . . 44 4.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 46 4.5.1 Experimental Environments . . . . . . . . . . . . . . . . . . 46 4.5.2 Resource Requirement and Variation Tolerance Comparison . 48 4.5.3 Comparison with Clock Mesh Optimization using Worst Case Timing Analysis of Commercial Tool . . . . . . . . . . . . . 56 4.5.4 Analysis of the Effect of Proposed Techniques . . . . . . . . 58 4.5.5 Run Time Analysis . . . . . . . . . . . . . . . . . . . . . . . 61 4.5.6 Accuracy and Run Time of Fast Clock Skew Estimation . . . 63 4.5.7 Electromigration Analysis . . . . . . . . . . . . . . . . . . . 68 4.5.8 Run-time Analysis in Multi-thread Computing Environment . 70 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5 Conclusion 74 Abstract in Korean 84Docto

    Synthesis Methodologies for Robust and Reconfigurable Clock Networks

    Get PDF
    In today\u27s aggressively scaled technology nodes, billions of transistors are packaged into a single integrated circuit. Electronic Design Automation (EDA) tools are needed to automatically assemble the transistors into a functioning system. One of the most important design steps in the physical synthesis is the design of the clock network. The clock network delivers a synchronizing clock signal to each sequential element. The clock signal is required to be delivered meeting timing constraints under variations and in multiple operating modes. Synthesizing such clock networks is becoming increasingly difficult with the complex power management methodologies and severe manufacturing variations. Clock network synthesis is an important problem because it has a direct impact on the functional correctness, the maximum operating frequency, and the overall power consumption of each synchronous integrated circuit. In this dissertation, we proposed synthesis methodologies for robust and reconfigurable clock networks. We have made three contributions to this topic. First, we have proposed a clock network optimization framework that can achieve better timing quality than previous frameworks. Our proposed framework improves timing quality by reducing the propagation delay on critical paths in a clock network using buffer sizing and layer assignment. Second, we have proposed a clock tree synthesis methodology that integrates the clock tree synthesis with the clock tree optimization. The methodology improves timing quality by avoiding to synthesize clock trees with topologies that are sensitive to variations. Third, we have proposed a clock network that can reconfigure the topology based on the active mode of operation. Lastly, we conclude the dissertation with future research directions

    Real-time human ambulation, activity, and physiological monitoring:taxonomy of issues, techniques, applications, challenges and limitations

    Get PDF
    Automated methods of real-time, unobtrusive, human ambulation, activity, and wellness monitoring and data analysis using various algorithmic techniques have been subjects of intense research. The general aim is to devise effective means of addressing the demands of assisted living, rehabilitation, and clinical observation and assessment through sensor-based monitoring. The research studies have resulted in a large amount of literature. This paper presents a holistic articulation of the research studies and offers comprehensive insights along four main axes: distribution of existing studies; monitoring device framework and sensor types; data collection, processing and analysis; and applications, limitations and challenges. The aim is to present a systematic and most complete study of literature in the area in order to identify research gaps and prioritize future research directions

    A JavaScript Framework Comparison Based on Benchmarking Software Metrics and Environment Configuration

    Get PDF
    JavaScript is a client-side programming language that can be used in multi-platform applications. It controls HTML and CSS to manipulate page behaviours and is widely used in most websites over the internet. JavaScript frameworks are structures made to help web developers build web applications faster by offering features that enhance the user interaction with the web page. An increasing number of JavaScript frameworks have been released in recent years in the market to help front-end developers build applications in a shorter space of time. Decision makers in software companies have been struggling to determine which frameworks are best suited for a specific project. This work investigates the actual state-of-the-art of JavaScript framework comparison, and it proposes metrics and methods that could help developers when choosing a JavaScript framework. In this work, a benchmark framework executes tasks to test the efficiency of three JavaScript frameworks (AngularJS, Aurelia, and Ember). The research shows the impact of the environment (CPU usage and network connectivity) on JavaScript frameworks

    The Improvement of Multi-Satellite Orbit Determination Through the Incorporation of Intersatellite Ranging Observations

    Get PDF
    For many satellite remote sensing and communications missions, particularly those involving a formation or constellation of satellites, having precise knowledge of the satellites’ positions in both an absolute and relative sense is essential. However, the capabilities of Global Navigation Satellite Systems (GNSS)-based precise orbit determination (POD) alone may not be enough to fulfill the mission’s requirements. This thesis examines potential gains to POD when additional Intersatellite Range (ISR) observations (range magnitude only, not range direction or rate) are combined with standard GNSS observables. These ISR observations can be obtained from simple radio frequency (RF) or optical sensors. The methodology behind the combination approach is described and illustrated through a series of simulated case studies involving multiple satellites in low Earth orbit (LEO) using realistic hardware-derived (where possible) measurement noise. The results demonstrate that substantial improvements (factor of two or better) in the POD of the constellation satellites can be obtained with even intermittent ranging measurements, and with only millimeter-level ranging precision. This improved positioning capability enables new mission concepts for small-satellite constellations and formations, and makes these multi-satellite systems resilient to disruptions in GNSS signal availability. This GNSS-denial could be due to a variety of factors, such as intermittent or total hardware failure, power-related duty cycling, or ground-based jamming. Results show that under appropriate phasing of periodic GNSS-denial, combined with the new information from the ISR observations, POD levels approaching the non-GNSS-denied case can be achieved. For the cases of region-specific or single-satellite total GNSS-denial, constellations with ISR capability can be designed to completely compensate for the loss of GNSS observations and perform at levels better than with GNSS alone. Furthermore, the GNSS-denied case has an extended application for providing ISR-only POD for constellations around planetary bodies through the inversion of the invariant non-spherical gravity fields. Case studies are presented using high resolution invariant Earth and lunar gravity fields. In these example cases, ISR-only POD is demonstrated at the sub-meter level with the same millimeter precision of ISR. This research provides opportunities for new mission concepts that require precise positioning, improvements to mission operations, and enables new paradigms for orbit determination without access to GNSS.Ph.D
    corecore