2,020 research outputs found

    Statistical library characterization using belief propagation across multiple technology nodes

    Get PDF
    In this paper, we propose a novel flow to enable computationally efficient statistical characterization of delay and slew in standard cell libraries. The distinguishing feature of the proposed method is the usage of a limited combination of output capacitance, input slew rate and supply voltage for the extraction of statistical timing metrics of an individual logic gate. The efficiency of the proposed flow stems from the introduction of a novel, ultra-compact, nonlinear, analytical timing model, having only four universal regression parameters. This novel model facilitates the use of maximum-a-posteriori belief propagation to learn the prior parameter distribution for the parameters of the target technology from past characterizations of library cells belonging to various other technologies, including older ones. The framework then utilises Bayesian inference to extract the new timing model parameters using an ultra-small set of additional timing measurements from the target technology. The proposed method is validated and benchmarked on several production-level cell libraries including a state-of-the-art 14-nm technology node and a variation-aware, compact transistor model. For the same accuracy as the conventional lookup-table approach, this new method achieves at least 15x reduction in simulation runs.Masdar Institute of Science and Technology (Massachusetts Institute of Technology Cooperative Agreement

    Computationally efficient characterization of standard cells for statistical static timing analysis

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 44-45).We propose a computationally efficient statistical static timing analysis (SSTA) technique that addresses intra-die variations at near-threshold to sub-threshold supply voltage, simulated on a scaled 32nm CMOS standard cell library. This technique would characterize the propagation delay and output slew of an individual cell for subsequent timing path analyses. Its efficiency stems from the fact that it only needs to find the delay or output slew in the vicinity of the ?- sigma operating point (where ? = 0 to 3) rather than the entire probability density function of the delay or output slew, as in conventional Monte-Carlo simulations. The algorithm is simulated on combinational logic gates that include inverters, NANDs, and NORs of different sizes. The delay and output slew estimates in most cases differ from the Monte-Carlo results by less than 5%. Higher supply voltage, larger transistor widths, and slower input slews tend to improve delay and output slew estimates. Transistor stacking is found to be the only major source of under-prediction by the SSTA technique. Overall, the cell characterization approach has a substantial computational advantage compared to SPICE-based Monte-Carlo analysis.by Sharon H. Chou.M.Eng

    Multivariate Adaptive Regression Splines in Standard Cell Characterization for Nanometer Technology in Semiconductor

    Get PDF
    Multivariate adaptive regression splines (MARSP) is a nonparametric regression method. It is an adaptive procedure which does not have any predetermined regression model. With that said, the model structure of MARSP is constructed dynamically and adaptively according to the information derived from the data. Because of its ability to capture essential nonlinearities and interactions, MARSP is considered as a great fit for high-dimension problems. This chapter gives an application of MARSP in semiconductor field, more specifically, in standard cell characterization. The objective of standard cell characterization is to create a set of high-quality models of a standard cell library that accurately and efficiently capture cell behaviors. In this chapter, the MARSP method is employed to characterize the gate delay as a function of many parameters including process-voltage-temperature parameters. Due to its ability of capturing essential nonlinearities and interactions, MARSP method helps to achieve significant accuracy improvement

    CAD methodologies for low power and reliable 3D ICs

    Get PDF
    The main objective of this dissertation is to explore and develop computer-aided-design (CAD) methodologies and optimization techniques for reliability, timing performance, and power consumption of through-silicon-via(TSV)-based and monolithic 3D IC designs. The 3D IC technology is a promising answer to the device scaling and interconnect problems that industry faces today. Yet, since multiple dies are stacked vertically in 3D ICs, new problems arise such as thermal, power delivery, and so on. New physical design methodologies and optimization techniques should be developed to address the problems and exploit the design freedom in 3D ICs. Towards the objective, this dissertation includes four research projects. The first project is on the co-optimization of traditional design metrics and reliability metrics for 3D ICs. It is well known that heat removal and power delivery are two major reliability concerns in 3D ICs. To alleviate thermal problem, two possible solutions have been proposed: thermal-through-silicon-vias (T-TSVs) and micro-fluidic-channel (MFC) based cooling. For power delivery, a complex power distribution network is required to deliver currents reliably to all parts of the 3D IC while suppressing the power supply noise to an acceptable level. However, these thermal and power networks pose major challenges in signal routability and congestion. In this project, a co-optimization methodology for signal, power, and thermal interconnects in 3D ICs is presented. The goal of the proposed approach is to improve signal, thermal, and power noise metrics and to provide fast and accurate design space explorations for early design stages. The second project is a study on 3D IC partition. For a 3D IC, the target circuit needs to be partitioned into multiple parts then mapped onto the dies. The partition style impacts design quality such as footprint, wirelength, timing, and so on. In this project, the design methodologies of 3D ICs with different partition styles are demonstrated. For the LEON3 multi-core microprocessor, three partitioning styles are compared: core-level, block-level, and gate-level. The design methodologies for such partitioning styles and their implications on the physical layout are discussed. Then, to perform timing optimizations for 3D ICs, two timing constraint generation methods are demonstrated that lead to different design quality. The third project is on the buffer insertion for timing optimization of 3D ICs. For high performance 3D ICs, it is crucial to perform thorough timing optimizations. Among timing optimization techniques, buffer insertion is known to be the most effective way. The TSVs have a large parasitic capacitance that increases the signal slew and the delay on the downstream. In this project, a slew-aware buffer insertion algorithm is developed that handles full 3D nets and considers TSV parasitics and slew effects on delay. Compared with the well-known van Ginneken algorithm and a commercial tool, the proposed algorithm finds buffering solutions with lower delay values and acceptable runtime overhead. The last project is on the ultra-high-density logic designs for monolithic 3D ICs. The nano-scale 3D interconnects available in monolithic 3D IC technology enable ultra-high-density device integration at the individual transistor-level. The benefits and challenges of monolithic 3D integration technology for logic designs are investigated. First, a 3D standard cell library for transistor-level monolithic 3D ICs is built and their timing and power behavior are characterized. Then, various interconnect options for monolithic 3D ICs that improve design quality are explored. Next, timing-closed, full-chip GDSII layouts are built and iso-performance power comparisons with 2D IC designs are performed. Important design metrics such as area, wirelength, timing, and power consumption are compared among transistor-level monolithic 3D, gate-level monolithic 3D, TSV-based 3D, and traditional 2D designs.PhDCommittee Chair: Lim, Sung Kyu; Committee Member: Bakir, Muhannad; Committee Member: Kim, Hyesoon; Committee Member: Lee, Hsien-Hsin; Committee Member: Mukhopadhyay, Saiba

    Timing Closure in Chip Design

    Get PDF
    Achieving timing closure is a major challenge to the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction; circuit sizing; clock skew scheduling; threshold voltage optimization and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that not only accounts for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We presented a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice is demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for the cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We developed the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Netweights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips

    The low area probing detector as a countermeasure against invasive attacks

    Get PDF
    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting /republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other worksMicroprobing allows intercepting data from on-chip wires as well as injecting faults into data or control lines. This makes it a commonly used attack technique against security-related semiconductors, such as smart card controllers. We present the low area probing detector (LAPD) as an efficient approach to detect microprobing. It compares delay differences between symmetric lines such as bus lines to detect timing asymmetries introduced by the capacitive load of a probe. Compared with state-of-the-art microprobing countermeasures from industry, such as shields or bus encryption, the area overhead is minimal and no delays are introduced; in contrast to probing detection schemes from academia, such as the probe attempt detector, no analog circuitry is needed. We show the Monte Carlo simulation results of mismatch variations as well as process, voltage, and temperature corners on a 65-nm technology and present a simple reliability optimization. Eventually, we show that the detection of state-of-the-art commercial microprobes is possible even under extreme conditions and the margin with respect to false positives is sufficient.Peer ReviewedPostprint (author's final draft

    Design methodologies, models and tools for very-large-scale integration of NEM relay-based circuits

    Get PDF

    A fast and retargetable framework for logic-IP-internal electromigration assessment comprehending advanced waveform effects

    Get PDF
    A new methodology for system-on-chip-level logic-IP-internal electromigration verification is presented in this paper, which significantly improves accuracy by comprehending the impact of the parasitic RC loading and voltage-dependent pin capacitance in the library model. It additionally provides an on-the-fly retargeting capability for reliability constraints by allowing arbitrary specifications of lifetimes, temperatures, voltages, and failure rates, as well as interoperability of the IPs across foundries. The characterization part of the methodology is expedited through the intelligent IP-response modeling. The ultimate benefit of the proposed approach is demonstrated on a 28-nm design by providing an on-the-fly specification of retargeted reliability constraints. The results show a high correlation with SPICE and were obtained with an order of magnitude reduction in the verification runtime.Peer ReviewedPostprint (author's final draft
    corecore