7 research outputs found
Statistical static timing analysis of nonzero clock skew circuit
As microprocessor and ASIC manufacturers continue to push the limits of transistor sizing into the sub-100nm regime, variations in the manufacturing process lead to increased uncertainty about the exact geometry and performance of the resulting devices. Traditional corner-based Static Timing Analysis (STA) assumes worst-case values for process parameters such as transistor channel length and threshold voltage when verifying integrated circuit timing performance. This has become unrea-sonably pessimistic and causes over-design that degrades full-chip performance, wastes engineering effort, and erodes profits while providing negligible yield improvement. Recently, Statistical Static Timing Analysis (SSTA) methods, which model process variations statistically as probability distribution functions (PDFs) rather than deterministically, have emerged to more accurately portray integrated circuit performance. This analysis has been thoroughly performed on traditional zero clock skew circuits where the synchronizing clock signal is assumed to arrive in phase with respect to each register. However, designers will often schedule the clock skew to different registers in order to decrease the minimum clock period of the entire circuit. Clock skew scheduling (CSS) imparts very different timing constraints that are based, in part, on the topology of the circuit. In this thesis, SSTA is applied to nonzero clock skew circuits in order to determine the accuracy improvement relative to their zero skew counterparts, and also to assess how the results of skew scheduling might be impacted with more accurate statistical modeling. For 99.7% timing yield (3 variation), SSTA is observed to improve the accuracy, and therefore increase the timing margin, of nonzero clock skew circuits by up to 2.5x, and on average by 1.3x, the amount seen by zero skew circuits.M.S., Computer Engineering -- Drexel University, 200
Efficient Monte Carlo Based Methods for Variability Aware Analysis and Optimization of Digital Circuits.
Process variability is of increasing concern in modern nanometer-scale CMOS. The
suitability of Monte Carlo based algorithms for efficient analysis and optimization of
digital circuits under variability is explored in this work. Random sampling based Monte
Carlo techniques incur high cost of computation, due to the large sample size required to
achieve target accuracy. This motivates the need for intelligent sample selection
techniques to reduce the number of samples. As these techniques depend on information
about the system under analysis, there is a need to tailor the techniques to fit the specific
application context. We propose efficient smart sampling based techniques for timing and
leakage power consumption analysis of digital circuits. For the case of timing analysis, we
show that the proposed method requires 23.8X fewer samples on average to achieve
comparable accuracy as a random sampling approach, for benchmark circuits studied. It is
further illustrated that the parallelism available in such techniques can be exploited using
parallel machines, especially Graphics Processing Units. Here, we show that SH-QMC
implemented on a Multi GPU is twice as fast as a single STA on a CPU for benchmark
circuits considered. Next we study the possibility of using such information from
statistical analysis to optimize digital circuits under variability, for example to achieve
minimum area on silicon though gate sizing while meeting a timing constraint. Though
several techniques to optimize circuits have been proposed in literature, it is not clear how
much gains are obtained in these approaches specifically through utilization of statistical
information. Therefore, an effective lower bound computation technique is proposed to
enable efficient comparison of statistical design optimization techniques. It is shown that
even techniques which use only limited statistical information can achieve results to
within 10% of the proposed lower bound. We conclude that future optimization research
should shift focus from use of more statistical information to achieving more efficiency
and parallelism to obtain speed ups.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/78936/1/tvvin_1.pd
Analysis and Design of Resilient VLSI Circuits
The reliable operation of Integrated Circuits (ICs) has become increasingly difficult to
achieve in the deep sub-micron (DSM) era. With continuously decreasing device feature
sizes, combined with lower supply voltages and higher operating frequencies, the noise
immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming
more vulnerable to noise effects such as crosstalk, power supply variations and radiation-induced
soft errors. Among these noise sources, soft errors (or error caused by radiation
particle strikes) have become an increasingly troublesome issue for memory arrays as well
as combinational logic circuits. Also, in the DSM era, process variations are increasing
at an alarming rate, making it more difficult to design reliable VLSI circuits. Hence, it
is important to efficiently design robust VLSI circuits that are resilient to radiation particle
strikes and process variations. The work presented in this dissertation presents several
analysis and design techniques with the goal of realizing VLSI circuits which are tolerant
to radiation particle strikes and process variations.
This dissertation consists of two parts. The first part proposes four analysis and two
design approaches to address radiation particle strikes. The analysis techniques for the
radiation particle strikes include: an approach to analytically determine the pulse width
and the pulse shape of a radiation induced voltage glitch in combinational circuits, a technique
to model the dynamic stability of SRAMs, and a 3D device-level analysis of the
radiation tolerance of voltage scaled circuits. Experimental results demonstrate that the proposed techniques for analyzing radiation particle strikes in combinational circuits and
SRAMs are fast and accurate compared to SPICE. Therefore, these analysis approaches
can be easily integrated in a VLSI design flow to analyze the radiation tolerance of such
circuits, and harden them early in the design flow. From 3D device-level analysis of the radiation
tolerance of voltage scaled circuits, several non-intuitive observations are made and
correspondingly, a set of guidelines are proposed, which are important to consider to realize
radiation hardened circuits. Two circuit level hardening approaches are also presented
to harden combinational circuits against a radiation particle strike. These hardening approaches
significantly improve the tolerance of combinational circuits against low and very
high energy radiation particle strikes respectively, with modest area and delay overheads.
The second part of this dissertation addresses process variations. A technique is developed
to perform sensitizable statistical timing analysis of a circuit, and thereby improve the
accuracy of timing analysis under process variations. Experimental results demonstrate that
this technique is able to significantly reduce the pessimism due to two sources of inaccuracy
which plague current statistical static timing analysis (SSTA) tools. Two design approaches
are also proposed to improve the process variation tolerance of combinational circuits and
voltage level shifters (which are used in circuits with multiple interacting power supply
domains), respectively. The variation tolerant design approach for combinational circuits
significantly improves the resilience of these circuits to random process variations, with a
reduction in the worst case delay and low area penalty. The proposed voltage level shifter
is faster, requires lower dynamic power and area, has lower leakage currents, and is more
tolerant to process variations, compared to the best known previous approach.
In summary, this dissertation presents several analysis and design techniques which
significantly augment the existing work in the area of resilient VLSI circuit design
Algorithmic techniques for nanometer VLSI design and manufacturing closure
As Very Large Scale Integration (VLSI) technology moves to the nanoscale
regime, design and manufacturing closure becomes very difficult to achieve due to
increasing chip and power density. Imperfections due to process, voltage and temperature variations aggravate the problem. Uncertainty in electrical characteristic of
individual device and wire may cause significant performance deviations or even functional failures. These impose tremendous challenges to the continuation of Moore's
law as well as the growth of semiconductor industry.
Efforts are needed in both deterministic design stage and variation-aware design
stage. This research proposes various innovative algorithms to address both stages for
obtaining a design with high frequency, low power and high robustness. For deterministic optimizations, new buffer insertion and gate sizing techniques are proposed. For
variation-aware optimizations, new lithography-driven and post-silicon tuning-driven
design techniques are proposed.
For buffer insertion, a new slew buffering formulation is presented and is proved
to be NP-hard. Despite this, a highly efficient algorithm which runs > 90x faster
than the best alternatives is proposed. The algorithm is also extended to handle
continuous buffer locations and blockages.
For gate sizing, a new algorithm is proposed to handle discrete gate library in
contrast to unrealistic continuous gate library assumed by most existing algorithms. Our approach is a continuous solution guided dynamic programming approach, which
integrates the high solution quality of dynamic programming with the short runtime
of rounding continuous solution.
For lithography-driven optimization, the problem of cell placement considering
manufacturability is studied. Three algorithms are proposed to handle cell flipping
and relocation. They are based on dynamic programming and graph theoretic approaches, and can provide different tradeoff between variation reduction and wire-
length increase.
For post-silicon tuning-driven optimization, the problem of unified adaptivity
optimization on logical and clock signal tuning is studied, which enables us to significantly save resources. The new algorithm is based on a novel linear programming
formulation which is solved by an advanced robust linear programming technique.
The continuous solution is then discretized using binary search accelerated dynamic
programming, batch based optimization, and Latin Hypercube sampling based fast
simulation
Robust Design of Variation-Sensitive Digital Circuits
The nano-age has already begun, where typical feature dimensions are smaller than 100nm. The operating frequency is expected to increase up to
12 GHz, and a single chip will contain over 12 billion transistors in 2020, as given by the International Technology Roadmap for Semiconductors
(ITRS) initiative. ITRS also predicts that the scaling of CMOS devices and process technology, as it is known today, will become much more
difficult as the industry advances towards the 16nm technology node and further. This aggressive scaling of CMOS technology has pushed the
devices to their physical limits. Design goals are governed by several factors other than power, performance and area such as process
variations, radiation induced soft errors, and aging degradation mechanisms. These new design challenges have a strong impact on the parametric
yield of nanometer digital circuits and also result in functional yield losses in variation-sensitive digital circuits such as Static Random
Access Memory (SRAM) and flip-flops. Moreover, sub-threshold SRAM and flip-flops circuits, which are aggravated by the strong demand for lower
power consumption, show larger sensitivity to these challenges which reduces their robustness and yield. Accordingly, it is not surprising that
the ITRS considers variability and reliability as the most challenging obstacles for nanometer digital circuits robust design.
Soft errors are considered one of the main reliability and robustness concerns in SRAM arrays in sub-100nm technologies due to low operating
voltage, small node capacitance, and high packing density. The SRAM arrays soft errors immunity is also affected by process variations. We
develop statistical design-oriented soft errors immunity variations models for super-threshold and sub-threshold SRAM cells accounting for
die-to-die variations and within-die variations. This work provides new design insights and highlights the important design knobs that can be
used to reduce the SRAM cells soft errors immunity variations. The developed models are scalable, bias dependent, and only require the
knowledge of easily measurable parameters. This makes them useful in early design exploration, circuit optimization as well as technology
prediction. The derived models are verified using Monte Carlo SPICE simulations, referring to an industrial hardware-calibrated 65nm CMOS
technology.
The demand for higher performance leads to very deep pipelining which means that hundreds of thousands of flip-flops are required to control
the data flow under strict timing constraints. A violation of the timing constraints at a flip-flop can result in latching incorrect data
causing the overall system to malfunction. In addition, the flip-flops power dissipation represents a considerable fraction of the total power
dissipation. Sub-threshold flip-flops are considered the most energy efficient solution for low power applications in which, performance is of
secondary importance. Accordingly, statistical gate sizing is conducted to different flip-flops topologies for timing yield improvement of
super-threshold flip-flops and power yield improvement of sub-threshold flip-flops. Following that, a comparative analysis between these
flip-flops topologies considering the required overhead for yield improvement is performed. This comparative analysis provides useful
recommendations that help flip-flops designers on selecting the best flip-flops topology that satisfies their system specifications while
taking the process variations impact and robustness requirements into account.
Adaptive Body Bias (ABB) allows the tuning of the transistor threshold voltage, Vt, by controlling the transistor body voltage. A forward
body bias reduces Vt, increasing the device speed at the expense of increased leakage power. Alternatively, a reverse body bias increases
Vt, reducing the leakage power but slowing the device. Therefore, the impact of process variations is mitigated by speeding up slow and
less leaky devices or slowing down devices that are fast and highly leaky. Practically, the implementation of the ABB is desirable to bias each
device in a design independently, to mitigate within-die variations. However, supplying so many separate voltages inside a die results in a
large area overhead. On the other hand, using the same body bias for all devices on the same die limits its capability to compensate for
within-die variations. Thus, the granularity level of the ABB scheme is a trade-off between the within-die variations compensation capability
and the associated area overhead. This work introduces new ABB circuits that exhibit lower area overhead by a factor of 143X than that of
previous ABB circuits. In addition, these ABB circuits are resolution free since no digital-to-analog converters or analog-to-digital
converters are required on their implementations. These ABB circuits are adopted to high performance critical paths, emulating a real
microprocessor architecture, for process variations compensation and also adopted to SRAM arrays, for Negative Bias Temperature Instability
(NBTI) aging and process variations compensation. The effectiveness of the new ABB circuits is verified by post layout simulation results and
test chip measurements using triple-well 65nm CMOS technology.
The highly capacitive nodes of wide fan-in dynamic circuits and SRAM bitlines limit the performance of these circuits. In addition, process
variations mitigation by statistical gate sizing increases this capacitance further and fails in achieving the target yield improvement. We
propose new negative capacitance circuits that reduce the overall parasitic capacitance of these highly capacitive nodes. These negative
capacitance circuits are adopted to wide fan-in dynamic circuits for timing yield improvement up to 99.87% and to SRAM arrays for read access
yield improvement up to 100%. The area and power overheads of these new negative capacitance circuits are amortized over the large die area of
the microprocessor and the SRAM array. The effectiveness of the new negative capacitance circuits is verified by post layout simulation results
and test chip measurements using 65nm CMOS technology
Statistical timing based optimization using gate sizing
Abstract The increased dominance of intra-di