
    Integrated Circuits Parasitic Capacitance Extraction Using Machine Learning and its Application to Layout Optimization

    The impact of parasitic elements on overall circuit performance keeps increasing from one technology generation to the next. In advanced process nodes, parasitic effects dominate the overall circuit performance. As a result, the accuracy requirements of parasitic extraction processes have increased significantly, especially for parasitic capacitance extraction. Existing parasitic capacitance extraction tools face many challenges in meeting the new accuracy requirements set by semiconductor foundries (< 5% error). Although field-solver methods can meet such requirements, they are very slow and have limited capacity. The alternative, rule-based parasitic capacitance extraction, is faster and has high capacity; however, it cannot consistently provide good accuracy because it relies on a pre-characterized library of capacitance formulas that covers a limited number of layout patterns. The new accuracy requirements also add challenges for existing parasitic-aware routing optimization methods, where simplified parasitic models are used to optimize layouts. This dissertation provides new solutions for interconnect parasitic capacitance extraction and parasitic-aware routing optimization in order to cope with the accuracy requirements of advanced process nodes, as follows. First, machine learning compact models are developed within rule-based extractors to efficiently predict parasitic capacitances of cross-section layout patterns. The developed models mitigate the limitations of the pre-characterized library approach: each compact model extracts parasitic capacitances of cross-sections of arbitrarily distributed metal polygons that belong to a specific set of metal layers (i.e., a layer combination), so the number of covered layout patterns is significantly increased. Second, machine learning compact models are developed to predict parasitic capacitances of middle-end-of-line (MEOL) layers around FinFETs and MOSFETs. Each compact model extracts parasitic capacitances of 3D MEOL patterns of a specific device type regardless of the distribution of its metal polygons, so the developed MEOL models can replace field solvers for extracting MEOL patterns. Third, a novel accuracy-based hybrid parasitic capacitance extraction method is developed. The proposed hybrid flow divides a layout into windows and extracts the parasitic capacitances of each window using one of three methods: 1) rule-based, 2) a novel deep-neural-network-based method, or 3) a field solver. The hybrid methodology uses neural-network classifiers to determine an appropriate extraction method for each window. The deep-neural-network-based method provides an intermediate level of accuracy and speed between rule-based and field-solver extraction; this intermediate level is needed because using only rule-based and field-solver methods in the hybrid flow results in invoking the field solver most of the time whenever high accuracy is required. Finally, a parasitic-aware layout routing optimization and analysis methodology is implemented based on incremental parasitic extraction and a fast optimization method. Unlike existing flows, which provide no mechanism to analyze the impact of modifying layout geometries on circuit performance, the proposed methodology provides novel sensitivity circuit models to analyze the integrity of signals in layout routes. These circuit models are based on an accurate matrix circuit representation, a cost function, and an accurate parasitic sensitivity extraction. The circuit models identify critical parasitic elements along with the corresponding layout geometries in a given route and measure the sensitivity of the route's performance to those geometries very quickly. The methodology then uses a nonlinear programming technique to optimize problematic routes with pre-determined degrees of freedom using the proposed circuit models. Furthermore, it uses a novel incremental parasitic extraction method to efficiently extract parasitic elements of modified geometries, where the incremental extraction is used as part of the routing optimization process to improve optimization runtime and accuracy.
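
    As a rough, hypothetical illustration of the hybrid flow described above (the features, model sizes, thresholds, and the rule-based formula below are invented for the example, not taken from the dissertation), a small classifier can route each layout window to a rule-based formula, a neural-network compact model, or a field solver:

```python
# Hypothetical sketch of a hybrid per-window capacitance extraction flow.
# Feature construction, thresholds, and model choices are illustrative only.
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

rng = np.random.default_rng(0)

# Toy "window" features, e.g. conductor density, min spacing, pattern complexity.
X = rng.uniform(0.0, 1.0, size=(2000, 3))
# Synthetic ground-truth capacitance used only to train the stand-in compact model.
y_cap = 0.8 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.1 * rng.normal(size=2000)

# Stand-in for a DNN compact model that predicts window capacitance.
cap_model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
cap_model.fit(X, y_cap)

# Stand-in classifier that picks an extraction method per window
# (0 = rule-based, 1 = DNN compact model, 2 = field solver). Here the labels
# are derived from pattern complexity; in a real flow they would come from
# accuracy characterization of each method on known layout patterns.
labels = np.digitize(X[:, 2], bins=[0.5, 0.85])
selector = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
selector.fit(X, labels)

def extract_window(features):
    """Route one layout window to an extraction method; return (method, cap)."""
    method = selector.predict(features.reshape(1, -1))[0]
    if method == 0:
        cap = 0.8 * features[0]                      # placeholder rule-based formula
    elif method == 1:
        cap = cap_model.predict(features.reshape(1, -1))[0]
    else:
        cap = None                                   # would invoke a field solver here
    return method, cap

print(extract_window(np.array([0.4, 0.2, 0.3])))
```

    In the actual flow, the field-solver branch would call an external solver rather than returning a placeholder, and the classifier would be trained so that the expensive solver is only selected for windows the cheaper methods cannot handle accurately.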

    Fast and Robust Design of CMOS VCO for Optimal Performance

    The exponentially growing design complexity with technological advancement calls for a large scope of design automation in analog and mixed-signal integrated circuits. In the automation process, performance optimization under different environmental constraints is of prime importance. Analog integrated circuit design requires addressing multiple competing performance objectives, with the ability to find global solutions in a constrained environment. Integrated circuit (IC) performance is significantly affected by device, interconnect, and package parasitics. Including circuit parasitics in the design phase, alongside performance optimization, has become a necessity for faster prototyping. In addition, fabrication process variations have a predominant effect on circuit performance, which is directly linked to the acceptability of manufactured integrated circuit chips; this necessitates a design that is tolerant to manufacturing process variation. The central theme of this thesis is the development of analog IC design methods that exploit the computational intelligence of evolutionary optimization techniques, integrate circuit parasitics into the design optimization process in a more meaningful way, and produce designs that are tolerant to process fluctuations. Evolutionary multi-objective optimization techniques, namely the Non-dominated Sorting Genetic Algorithm-II and the Infeasibility Driven Evolutionary Algorithm, are used for the development of parasitic-aware design techniques for analog ICs. Realistic physical and process constraints are integrated into the proposed design technique. A fast design methodology based on an efficient optimization technique is developed, and an extensive worst-case process variation analysis is performed. This work also presents a novel process-corner-variation-aware analog IC design methodology, which effectively increases the yield of chips within the acceptable performance window. The performance of all the presented techniques is demonstrated through application to CMOS ring oscillators and to current-starved and differential voltage-controlled oscillators, designed in the Cadence Virtuoso Analog Design Environment.
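
    The thesis relies on NSGA-II and IDEA for the actual optimization; the sketch below is only a minimal stand-in that shows the multi-objective, parasitic-aware sizing idea with a toy VCO performance model and a simple non-dominated filter (all equations, bounds, and the parasitic term are illustrative, not the thesis's models):

```python
# Minimal sketch of parasitic-aware, multi-objective sizing for a toy VCO.
# A real flow would use NSGA-II / IDEA driven by simulator data.
import numpy as np

rng = np.random.default_rng(1)

def evaluate(w, l, c_par=5e-15):
    """Toy objectives: maximize oscillation frequency, minimize power."""
    gm = 1e-3 * (w / l)                 # crude transconductance proxy
    c_load = 2e-15 * w + c_par          # device + parasitic capacitance
    freq = gm / (2 * np.pi * c_load)    # oscillation-frequency proxy
    power = 1.2 * 1e-4 * (w / l)        # power proxy at a 1.2 V supply
    return freq, power

# Randomly sample candidate sizings within physical bounds.
W = rng.uniform(1e-6, 20e-6, 500)
L = rng.uniform(60e-9, 500e-9, 500)
objs = np.array([evaluate(w, l) for w, l in zip(W, L)])   # columns: freq, power

def pareto_front(points):
    """Keep points not dominated by any other (maximize freq, minimize power)."""
    keep = []
    for i, (f_i, p_i) in enumerate(points):
        dominated = any(f_j >= f_i and p_j <= p_i and (f_j > f_i or p_j < p_i)
                        for j, (f_j, p_j) in enumerate(points) if j != i)
        if not dominated:
            keep.append(i)
    return keep

front = pareto_front(objs)
print(f"{len(front)} non-dominated sizings out of {len(objs)} samples")
```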

    Modeling and Analysis of Large-Scale On-Chip Interconnects

    As IC technologies scale into the nanometer regime, efficient and accurate modeling and analysis of VLSI systems with billions of transistors and interconnects become increasingly critical and difficult. VLSI systems affected by increasingly high-dimensional process-voltage-temperature (PVT) variations demand far more modeling and analysis effort than ever before, while the analysis of large-scale on-chip interconnects, which requires solving tens of millions of unknowns, imposes great challenges in computer-aided design. This dissertation presents new methodologies for addressing these two challenges in large-scale on-chip interconnect modeling and analysis. In the past, standard statistical circuit modeling techniques usually employed principal component analysis (PCA) and its variants to reduce parameter dimensionality. Although widely adopted, these techniques can be very limited, since parameter dimension reduction is achieved by considering only the statistical distributions of the controlling parameters while neglecting the important correspondence between these parameters and the circuit performances (responses) being modeled. This dissertation presents a variety of performance-oriented parameter dimension reduction methods that can achieve more than an order of magnitude of parameter reduction for a variety of VLSI circuit modeling and analysis problems. The sheer size of present-day power/ground distribution networks makes their analysis and verification extremely runtime- and memory-intensive, and at the same time limits the extent to which these networks can be optimized. Given today's commodity graphics processing units (GPUs), which deliver more than 500 GFLOPS (billions of floating-point operations per second) of computing power and 100 GB/s of memory bandwidth, more than 10X greater than what modern general-purpose quad-core microprocessors offer, it is very desirable to convert this GPU computing power into usable design automation tools for VLSI verification. In this dissertation, for the first time, we show how to exploit recent massively parallel single-instruction multiple-thread (SIMT) GPU platforms to tackle power grid analysis with very promising performance. Our GPU-based network analyzer is capable of solving tens of millions of power grid nodes in just a few seconds. Additionally, with the above GPU-based simulation framework, more challenging three-dimensional full-chip thermal analysis can be solved far more efficiently than ever before.
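
    The dissertation's specific reduction methods are not reproduced here; the following sketch only contrasts the idea of performance-oriented dimension reduction with plain PCA on synthetic data, using the parameter-response cross-covariance (in the spirit of partial least squares) as a generic stand-in:

```python
# Sketch of performance-oriented parameter dimension reduction vs. plain PCA
# on synthetic data. The reduction below is a generic stand-in, not the
# dissertation's method.
import numpy as np

rng = np.random.default_rng(2)

n_samples, n_params = 5000, 50
P = rng.normal(size=(n_samples, n_params))          # PVT-style parameters
# The circuit response depends strongly on only a couple of parameter directions.
true_dirs = rng.normal(size=(n_params, 2))
y = P @ true_dirs @ np.array([1.0, 0.3]) + 0.05 * rng.normal(size=n_samples)

# Plain PCA: directions of largest parameter variance (ignores the response).
_, _, Vt = np.linalg.svd(P - P.mean(axis=0), full_matrices=False)
pca_dirs = Vt[:2].T

# Performance-oriented reduction: direction of the parameter-response
# cross-covariance, so the kept dimension is the one the response cares about.
cross_cov = (P - P.mean(axis=0)).T @ (y - y.mean()) / n_samples
perf_dir = cross_cov / np.linalg.norm(cross_cov)

def explained(direction):
    """Fraction of response variance explained by a 1-D projection."""
    z = P @ direction
    coef = np.polyfit(z, y, 1)
    resid = y - np.polyval(coef, z)
    return 1.0 - resid.var() / y.var()

print("PCA direction R^2:", round(explained(pca_dirs[:, 0]), 3))
print("Performance-oriented direction R^2:", round(explained(perf_dir), 3))
```

    On data like this, where the parameters themselves are statistically isotropic, variance-driven PCA directions carry little information about the response, while the response-aware direction explains most of it, which is the motivation for performance-oriented reduction.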

    SPICE²: A Spatial, Parallel Architecture for Accelerating the Spice Circuit Simulator

    Spatial processing of sparse, irregular floating-point computation using a single FPGA enables up to an order of magnitude speedup (mean 2.8X) over a conventional microprocessor for the SPICE circuit simulator. We deliver this speedup using a hybrid parallel architecture that spatially implements the heterogeneous forms of parallelism available in SPICE. We decompose SPICE into its three constituent phases, Model-Evaluation, Sparse Matrix-Solve, and Iteration Control, and parallelize each phase independently. We exploit data-parallel device evaluations in the Model-Evaluation phase and sparse dataflow parallelism in the Sparse Matrix-Solve phase, and compose the complete design in streaming fashion. We name our parallel architecture SPICE²: Spatial Processors Interconnected for Concurrent Execution, for accelerating the SPICE circuit simulator. We program the parallel architecture with a high-level, domain-specific framework that identifies, exposes, and exploits the parallelism available in the SPICE circuit simulator. The design is optimized with an auto-tuner that can scale the design to use larger FPGA capacities without expert intervention and can even target other parallel architectures with the assistance of automated code generation. This FPGA architecture is able to outperform conventional processors due to a combination of factors, including high utilization of statically scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and overlapped processing of the control algorithms. We demonstrate that we can independently accelerate Model-Evaluation by a mean factor of 6.5X (1.4–23X) across a range of non-linear device models and Matrix-Solve by 2.4X (0.6–13X) across various benchmark matrices, while delivering a mean combined speedup of 2.8X (0.2–11X) for the two together when comparing a Xilinx Virtex-6 LX760 (40nm) with an Intel Core i7 965 (45nm). With our high-level framework, we can also accelerate single-precision Model-Evaluation on NVIDIA GPUs, ATI GPUs, the IBM Cell, and Sun Niagara 2 architectures. We expect approaches based on exploiting spatial parallelism to become important as frequency scaling slows down and modern processing architectures turn to parallelism (e.g., multi-core, GPUs) due to power consumption constraints. This thesis shows how to express, exploit, and optimize spatial parallelism for an important class of problems that are challenging to parallelize.
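
    The three-phase decomposition can be illustrated with a toy CPU sketch of a SPICE-like Newton-Raphson iteration on a resistor-diode chain; it shows where the data-parallel Model-Evaluation, the Sparse Matrix-Solve, and the Iteration Control phases sit, but it is not the FPGA architecture itself, and the circuit values are arbitrary:

```python
# Toy illustration of SPICE's three phases on a chain of resistor-diode cells:
# data-parallel Model-Evaluation, Sparse Matrix-Solve, and Iteration Control.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100                      # internal nodes
v = np.zeros(n)              # node voltages
g_res, i_sat, vt = 1e-3, 1e-14, 0.025
v_src = 1.0                  # a source drives node 0 through a resistor

for iteration in range(100):                      # Iteration Control loop
    # Model-Evaluation: evaluate every diode (node to ground) in parallel.
    i_d = i_sat * (np.exp(np.clip(v / vt, None, 40)) - 1.0)
    g_d = (i_sat / vt) * np.exp(np.clip(v / vt, None, 40))

    # Assemble the linearized nodal system G*v = i (diode companion models).
    main = 2 * g_res + g_d
    main[-1] = g_res + g_d[-1]                    # last node has one resistor
    G = sp.diags([main, -g_res * np.ones(n - 1), -g_res * np.ones(n - 1)],
                 [0, 1, -1], format="csc")
    rhs = g_d * v - i_d
    rhs[0] += g_res * v_src                       # source contribution

    # Sparse Matrix-Solve.
    v_new = spla.spsolve(G, rhs)

    # Iteration Control: convergence check.
    if np.max(np.abs(v_new - v)) < 1e-9:
        v = v_new
        break
    v = v_new

print(f"converged after {iteration + 1} iterations, node 0 = {v[0]:.4f} V")
```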

    2022 roadmap on neuromorphic computing and engineering

    Modern computation based on von Neumann architecture is now a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with 10^18 calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges for each research area. We hope that this roadmap will be a useful resource by providing a concise yet comprehensive introduction to readers outside this field, for those who are just entering the field, as well as providing future perspectives for those who are well established in the neuromorphic computing community.
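
    As a concrete, if simplified, example of the kind of computational primitive neuromorphic hardware implements, the sketch below simulates a single leaky integrate-and-fire neuron; the parameters are illustrative and not taken from the roadmap:

```python
# Minimal leaky integrate-and-fire (LIF) neuron, the kind of spiking primitive
# that neuromorphic hardware realizes with co-located memory and compute.
import numpy as np

rng = np.random.default_rng(3)

dt, tau, v_th, v_reset = 1e-3, 20e-3, 1.0, 0.0   # step, time constant, threshold, reset
steps = 1000
v = 0.0
spikes = []

input_current = 1.2 + 0.3 * rng.normal(size=steps)   # noisy constant drive

for t in range(steps):
    # Leaky integration: dv/dt = (-v + I) / tau
    v += dt * (-v + input_current[t]) / tau
    if v >= v_th:            # threshold crossing emits a spike
        spikes.append(t * dt)
        v = v_reset          # membrane potential resets

print(f"{len(spikes)} spikes in {steps * dt:.1f} s, "
      f"mean rate {len(spikes) / (steps * dt):.1f} Hz")
```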

    Scalable Analysis, Verification and Design of IC Power Delivery

    Due to recent aggressive process scaling into the nanometer regime, power delivery network design faces many challenges that place more stringent and specific requirements on EDA tools. For example, from the analysis perspective, simulation efficiency for large grids must be improved, and it must be possible to analyze the entire network together with off-chip models and nonlinear devices. Gated power delivery networks have multiple on/off operating conditions that need to be fully verified against the design requirements. Good power delivery network designs not only have to conserve wiring resources for signal routing, but also need optimal parameters assigned to the various system components, such as decaps, voltage regulators, and converters. This dissertation presents new methodologies to address these challenging problems. First, a novel parallel partitioning-based approach, which provides a flexible network partitioning scheme that exploits locality, is proposed for power grid static analysis. In addition, a fast combined CPU-GPU analysis engine that adopts a boundary-relaxation method to encompass several simulation strategies is developed to simulate power delivery networks with off-chip models and active circuits. These two analysis approaches achieve scalable simulation runtime. Then, for gated power delivery networks, the challenge posed by the large verification space is addressed by developing a strategy that efficiently identifies a small number of candidates for the worst-case operating condition, reducing the computational complexity from O(2^N) to O(N). Finally, motivated by a proposed two-level hierarchical optimization, this dissertation presents a novel locality-driven partitioning scheme to facilitate divide-and-conquer-based scalable wire sizing for large power delivery networks. Simultaneous sizing of multiple partitions is allowed, which leads to substantial runtime improvement. Moreover, the electrical interactions between active regulators/converters and passive networks, and their influence on key system design specifications, are analyzed comprehensively. With the derived design insights, system-level co-design of a complete power delivery network is facilitated by an automatic optimization flow. Results show significant performance enhancement brought by the co-design.
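
    For context, static power grid analysis ultimately reduces to solving a large sparse linear system G·v = i over the grid's conductance matrix; the sketch below builds and solves such a system for a small made-up mesh (the partitioning and CPU-GPU techniques described above are not shown, and all values are illustrative):

```python
# Sketch of static power grid (IR-drop) analysis: assemble the conductance
# matrix of a resistive mesh with pad connections to VDD and solve G*v = i.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

nx = ny = 50                          # 50 x 50 grid of power nodes
n = nx * ny
g_wire, g_pad, vdd = 1.0, 10.0, 1.0

rows, cols, vals = [], [], []
def add_conductance(a, b, g):
    """Stamp a conductance g between nodes a and b into the triplet lists."""
    rows.extend([a, a, b, b]); cols.extend([a, b, a, b]); vals.extend([g, -g, -g, g])

for y in range(ny):
    for x in range(nx):
        k = y * nx + x
        if x + 1 < nx:
            add_conductance(k, k + 1, g_wire)       # horizontal wire segment
        if y + 1 < ny:
            add_conductance(k, k + nx, g_wire)      # vertical wire segment

pads = [0, nx - 1, n - nx, n - 1]                   # supply pads at the corners
for pad in pads:
    rows.append(pad); cols.append(pad); vals.append(g_pad)

G = sp.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

rng = np.random.default_rng(4)
i_vec = -rng.uniform(0.0, 2e-3, n)                  # current drawn by the logic
for pad in pads:
    i_vec[pad] += g_pad * vdd                       # pad injection at VDD

v = spla.spsolve(G, i_vec)
print(f"worst-case IR drop: {vdd - v.min():.4f} V")
```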

    Rapid SoC Design: On Architectures, Methodologies and Frameworks

    Modern applications like machine learning, autonomous vehicles, and 5G networking require an order-of-magnitude boost in processing capability. For several decades, chip designers have relied on Moore's Law, the doubling of transistor count every two years, to deliver improved performance, higher energy efficiency, and an increase in transistor density. With the end of Dennard scaling and a slowdown in Moore's Law, system architects have developed several techniques to deliver the traditional performance and power improvements we have come to expect. More recently, chip designers have turned towards heterogeneous systems composed of more specialized processing units that buttress the traditional ones. These specialized units improve the overall performance, power, and area (PPA) metrics across a wide variety of workloads and applications. While the GPU serves as a classical example, accelerators for machine learning, approximate computing, graph processing, and database applications have become commonplace. This has led to exponential growth in the variety (and count) of these compute units found in modern embedded and high-performance computing platforms. The various techniques adopted to combat the slowing of Moore's Law translate directly into increased complexity for modern systems-on-chip (SoCs). This increase in complexity in turn leads to increased design effort and validation time for hardware and the accompanying software stacks. This is further aggravated by fabrication challenges (photolithography, tooling, and yield) faced at advanced technology nodes (below 28nm). The inherent complexity of modern SoCs translates into increased costs and time-to-market delays. This holds true across the spectrum, from mobile/handheld processors to high-performance data-center appliances. This dissertation presents several techniques to address the challenges of rapidly birthing complex SoCs. The first part of the dissertation focuses on foundations and architectures that aid in rapid SoC design; it presents a variety of architectural techniques that were developed and leveraged to rapidly construct complex SoCs at advanced process nodes. The next part focuses on the gap between a completed design model (in RTL form) and its physical manifestation (a GDS file sent to the foundry for fabrication); it presents methodologies and a workflow for rapidly walking a design through to completion at arbitrary technology nodes, along with progress on creating tools and a flow that depend entirely on open-source tools. The last part presents a framework that not only speeds up the integration of a hardware accelerator into an SoC ecosystem, but also emphasizes software adoption and usability.
    PhD dissertation, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/168119/1/ajayi_1.pd
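
    As a hypothetical illustration of the RTL-to-GDS workflow idea (stage names, config fields, and outputs are invented; this is not the dissertation's flow nor any specific open-source tool's API), a minimal flow runner might chain the usual physical design stages:

```python
# Hypothetical sketch of an RTL-to-GDS flow runner: each stage is a placeholder
# chained in the order a physical design flow typically follows.
import json
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FlowConfig:
    design: str
    clock_period_ns: float
    tech_node: str
    artifacts: Dict[str, str] = field(default_factory=dict)

# Ordered stages of a typical RTL-to-GDS flow.
STAGES: List[str] = ["synthesis", "floorplan", "placement",
                     "clock_tree", "routing", "gds_export"]

def run_stage(name: str, cfg: FlowConfig) -> None:
    """Placeholder for one stage: a real runner would invoke an EDA tool here
    and check its logs and QoR reports before moving on."""
    cfg.artifacts[name] = f"{cfg.design}.{name}.out"

def run_flow(cfg: FlowConfig) -> FlowConfig:
    for stage in STAGES:
        print(f"[{cfg.design}] {stage} at {cfg.tech_node}, "
              f"target clock {cfg.clock_period_ns} ns")
        run_stage(stage, cfg)
    return cfg

if __name__ == "__main__":
    result = run_flow(FlowConfig(design="soc_top", clock_period_ns=2.0,
                                 tech_node="generic_28nm"))
    print(json.dumps(result.artifacts, indent=2))
```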