17,842 research outputs found

    Parallel VLSI Circuit Analysis and Optimization

    Get PDF
    The prevalence of multi-core processors in recent years has introduced new opportunities and challenges to Electronic Design Automation (EDA) research and development. In this dissertation, a few parallel Very Large Scale Integration (VLSI) circuit analysis and optimization methods which utilize the multi-core computing platform to tackle some of the most difficult contemporary Computer-Aided Design (CAD) problems are presented. The first CAD application that is addressed in this dissertation is analyzing and optimizing mesh-based clock distribution network. Mesh-based clock distribution network (also known as clock mesh) is used in high-performance microprocessor designs as a reliable way of distributing clock signals to the entire chip. The second CAD application addressed in this dissertation is the Simulation Program with Integrated Circuit Emphasis (SPICE) like circuit simulation. SPICE simulation is often regarded as the bottleneck of the design flow. Recently, parallel circuit simulation has attracted a lot of attention. The first part of the dissertation discusses circuit analysis techniques. First, a combination of clock network specific model order reduction algorithm and a port sliding scheme is presented to tackle the challenges in analyzing large clock meshes with a large number of clock drivers. Our techniques run much faster than the standard SPICE simulation and existing model order reduction techniques. They also provide a basis for the clock mesh optimization. Then, a hierarchical multi-algorithm parallel circuit simulation (HMAPS) framework is presented as an novel technique of parallel circuit simulation. The inter-algorithm parallelism approach in HMAPS is completely different from the existing intra-algorithm parallel circuit simulation techniques and achieves superlinear speedup in practice. The second part of the dissertation talks about parallel circuit optimization. A modified asynchronous parallel pattern search (APPS) based method which utilizes the efficient clock mesh simulation techniques for the clock driver size optimization problem is presented. Our modified APPS method runs much faster than a continuous optimization method and effectively reduces the clock skew for all test circuits. The third part of the dissertation describes parallel performance modeling and optimization of the HMAPS framework. The performance models and runtime optimization scheme improve the speed of HMAPS further more. The dynamically adapted HMAPS becomes a complete solution for parallel circuit simulation

    메쉬 기반의 클락 네트워크 설계 방법론

    Get PDF
    학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2015. 2. 김태환.The clock distribution network in a synchronous digital circuit delivers a clock signal to every storage element i.e., clock sink in the circuit. However, since the continued technology scaling increases PVT (process-voltage-temperature) variation, the increase of clock skew variation is highly likely to cause performance degradation or system failure at run time. Recently, to mitigate the clock skew variation, many researchers have taken a profound interest in the clock mesh network. However, though the structure of clock mesh network is excellent in tolerating timing variation, it demands significantly high power consumption due to the use of excessive mesh wire and buffer resources. Thus, optimizing the resources required in the mesh clock synthesis while maintaining the variation tolerance is crucially important. The three major tasks that greatly affect the cost of resulting clock mesh are (1) mesh segment allocation, (2) mesh buffer allocation and sizing, and (3) clock sink binding to mesh segments. Previous clock mesh optimization approaches solve the three tasks sequentially, one by one at a time, to manage the run time complexity of the tasks at the expense of losing the quality of results. However, since the three tasks are tightly inter-related, simultaneously optimizing all three tasks is essential, if the run time is ever permitted, to synthesize an economical clock mesh network. In this dissertation, we propose an approach which is able to tackle the problem in an integrated fashion by combining the three tasks into an iterative framework of incremental updates and solving them simultaneously to find a globally optimal allocation of mesh resources while taking into account the clock skew tolerance constraints. The core parts of this dissertation are a precise analysis on the relation among the resource optimization tasks and an establishment of mechanism for effective and efficient integration of the tasks. In particular, to handle the run time problem, we propose a set of speed-up techniques i.e., modeling RC circuit for eliminating redundant matrix multiplications, exploiting sliding window scheme, and fast buffer sizing effect estimation, which are fitted into our context of fast clock skew estimation in mesh resource optimization as well as an invention of early decision policies. In summary, this dissertation presents the efficient design methodology for clock mesh synthesis with consideration on integration of three tasks and reduction of runtime complexity.Abstract i Contents iii List of Figures vi List of Tables x 1 Introduction 1 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Contributions of This Dissertation . . . . . . . . . . . . . . . . . . . 3 2 Background 5 2.1 Clock Distribution Network . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Clock Network Topologies . . . . . . . . . . . . . . . . . . . . . . . 6 2.3 Design Metrics of Clock Network . . . . . . . . . . . . . . . . . . . 7 2.4 The Effects of Variations on Clock Skew . . . . . . . . . . . . . . . . 9 3 Clock Mesh Synthesis Flow 12 3.1 Elements of Clock Mesh . . . . . . . . . . . . . . . . . . . . . . . . 12 3.2 Conventional Clock Mesh Synthesis Overview . . . . . . . . . . . . . 13 3.3 Initial Grid Generation . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.4 Mesh Buffer Placement and Sizing . . . . . . . . . . . . . . . . . . . 14 3.5 Clock Mesh Optimization . . . . . . . . . . . . . . . . . . . . . . . . 17 4 Integrated Resource Allocation and Binding in Clock Mesh Synthesis 19 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.2 Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4.3 Framework of Clock Mesh Optimization . . . . . . . . . . . . . . . . 26 4.3.1 Incremental Resource Updates . . . . . . . . . . . . . . . . . 29 4.3.2 Constraints for Variation Tolerance . . . . . . . . . . . . . . 34 4.3.3 Early Decision Policies . . . . . . . . . . . . . . . . . . . . . 38 4.3.4 Time Complexity Analysis . . . . . . . . . . . . . . . . . . . 39 4.4 Fast Clock Skew Estimation Techniques . . . . . . . . . . . . . . . . 40 4.4.1 Partially Reusing Matrix Multiplication for Incremental Updates 41 4.4.2 Adopting Sliding Window Scheme . . . . . . . . . . . . . . . 43 4.4.3 Adjusting Delay Caused by Buffer Resizing . . . . . . . . . . 44 4.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 46 4.5.1 Experimental Environments . . . . . . . . . . . . . . . . . . 46 4.5.2 Resource Requirement and Variation Tolerance Comparison . 48 4.5.3 Comparison with Clock Mesh Optimization using Worst Case Timing Analysis of Commercial Tool . . . . . . . . . . . . . 56 4.5.4 Analysis of the Effect of Proposed Techniques . . . . . . . . 58 4.5.5 Run Time Analysis . . . . . . . . . . . . . . . . . . . . . . . 61 4.5.6 Accuracy and Run Time of Fast Clock Skew Estimation . . . 63 4.5.7 Electromigration Analysis . . . . . . . . . . . . . . . . . . . 68 4.5.8 Run-time Analysis in Multi-thread Computing Environment . 70 4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5 Conclusion 74 Abstract in Korean 84Docto

    High performance IC clock networks with grid and tree topologies

    Get PDF
    In this dissertation, an essential step in the integrated circuit (IC) physical design flow—the clock network design—is investigated. Clock network design entailsa series of computationally intensive, large-scale design and optimization tasks for the generation and distribution of the clock signal through different topologies. The lack or inefficacy of the automation for implementing high performance clock networks, especially for low-power, high speed and variation-aware implementations, is the main driver for this research. The synthesis and optimization methods for the two most commonly used clock topologies in IC design—the grid topology and the tree topology—are primarily investigated.The clock mesh network, which uses the grid topology, has very low skew variation at the cost of high power dissipation. Two novel clock mesh network designmethodologies are proposed in this dissertation in order to reduce the power dissipation. These are the first methods known in literature that combine clock meshsynthesis with incremental register placement and clock gating for power saving purposes. The application of the proposed automation methods on the emerging resonant rotary clocking technology, which also has the grid topology, is investigated in this dissertation as well.The clock tree topology has the advantage of lower power dissipation compared to other traditional clock topologies (e.g. clock mesh, clock spine, clock tree with cross links) at the cost of increased performance degradation due to on-chip variations. A novel clock tree buffer polarity assignment flow is proposed in this dissertation in order to reduce these effects of on-chip variations on the clock tree topology. The proposed polarity assignment flow is the first work that introduces post-silicon, dynamic reconfigurability for polarity assignment, enabling clock gating for low power operation of the variation-tolerant clock tree networks.Ph.D., Electrical Engineering -- Drexel University, 201

    Circuit Optimization Using Efficient Parallel Pattern Search

    Get PDF
    Circuit optimization is extremely important in order to design today's high performance integrated circuits. As systems become more and more complex, traditional optimization techniques are no longer viable due to the complex and simulation intensive nature of the optimization problem. Two examples of such problems include clock mesh skew reduction and optimization of large analog systems, for example Phase locked loops. Mesh-based clock distribution has been employed in many high-performance microprocessor designs due to its favorable properties such as low clock skew and robustness. However, such clock distributions can become quite complex and may consist of hundreds of nonlinear drivers strongly coupled via a large passive network. While the simulation of clock meshes is already very time consuming, tuning such networks under tight performance constraints is an even daunting task. Same is the case with the phase locked loop. Being composed of multiple individual analog blocks, it is an extremely challenging task to optimize the entire system considering all block level trade-offs. In this work, we address these two challenging optimization problems i.e.; clock mesh skew optimization and PLL locking time reduction. The expensive objective function evaluations and difficulty in getting explicit sensitivity information make these problems intractable to standard optimization methods. We propose to explore the recently developed asynchronous parallel pattern search (APPS) method for efficient driver size tuning. While being a search-based method, APPS not only provides the desirable derivative-free optimization capability, but also is amenable to parallelization and possesses appealing theoretically rigorous convergence properties. In this work it is shown how such a method can lead to powerful parallel optimization of these complex problems with significant runtime and quality advantages over the traditional sequential quadratic programming (SQP) method. It is also shown how design-specific properties and speeding-up techniques can be exploited to make the optimization even more efficient while maintaining the convergence of APPS in a practical sense. In addition, the optimization technique is further enhanced by introducing the feature to handle non-linear constraints through the use of penalty functions. The enhanced method is used for optimizing phase locked loops at the system level

    Variation and power issues in VLSI clock networks

    Get PDF
    Clock Distribution Network (CDN) is an important component of any synchronous logic circuit. The function of CDN is to deliver the clock signal to the clock sinks. Clock skew is defined as the difference in the arrival time of the clock signal at the clock sinks. Higher uncertainty in skew (due to PVT variations) degrades circuit performance by decreasing the maximum possible delay between any two sequential elements. Aggressive frequency scaling has also led to high power consumption especially in CDN. This dissertation addresses variation and power issues in the design of current and potential future CDN. The research detailed in this work presents algorithmic techniques for the following problems: (1) Variation tolerance in useful skew design, (2) Link insertion for buffered clock nets, (3) Methodology and algorithms for rotary clocking and (4) Clock mesh optimization for skew-power trade off. For clock trees this dissertation presents techniques to integrate the different aspects of clock tree synthesis (skew scheduling, abstract topology and layout embedding) into one framework- tolerance to variations. This research addresses the issues involved in inserting cross-links in a buffered clock tree and proposes design criteria to avoid the risk of short-circuit current. Rotary clocking is a promising new clocking scheme that consists of unterminated rings formed by differential transmission lines. Rotary clocking achieves reduction in power dissipation clock skew. This dissertation addresses the issues in adopting current CAD methodology to rotary clocks. Alternative methodology and corresponding algorithmic techniques are detailed. Clock mesh is a popular form of CDN used in high performance systems. The problem of simultaneous sizing and placement of mesh buffers in a clock mesh is addressed. The algorithms presented remove the edges from the clock mesh to trade off skew tolerance for low power. For clock trees as well as link insertion, our experiments indicate significant reduction in clock skew due to variations. For clock mesh, experimental results indicate 18.5% reduction in power with 1.3% delay penalty on a average. In summary, this dissertation details methodologies/algorithms that address two critical issues- variation and power dissipation in current and potential future CDN

    Radiation safety based on the sky shine effect in reactor

    Get PDF
    In the reactor operation, neutrons and gamma rays are the most dominant radiation. As protection, lead and concrete shields are built around the reactor. However, the radiation can penetrate the water shielding inside the reactor pool. This incident leads to the occurrence of sky shine where a physical phenomenon of nuclear radiation sources was transmitted panoramic that extends to the environment. The effect of this phenomenon is caused by the fallout radiation into the surrounding area which causes the radiation dose to increase. High doses of exposure cause a person to have stochastic effects or deterministic effects. Therefore, this study was conducted to measure the radiation dose from sky shine effect that scattered around the reactor at different distances and different height above the reactor platform. In this paper, the analysis of the radiation dose of sky shine effect was measured using the experimental metho

    Fog computing, applications , security and challenges, review

    Get PDF
    The internet of things originates a world where on daily basis objects can join the internet and interchange information and in addition process, store, gather them from the nearby environment, and effectively mediate on it. A remarkable number of services might be imagined by abusing the internet of things. Fog computing which is otherwise called edge computing was introduced in 2012 as a considered is a prioritized choice for the internet of things applications. As fog computing extend services of cloud near to the edge of the network and make possible computations, communications, and storage services in proximity to the end user. Fog computing cannot only provide low latency, location awareness but also enhance real-time applications, quality of services, mobility, security and privacy in the internet of things applications scenarios. In this paper, we will summarize and overview fog computing model architecture, characteristic, similar paradigm and various applications in real-time scenarios such as smart grid, traffic control system and augmented reality. Finally, security challenges are presented
    corecore