38,277 research outputs found
Parametric Yield of VLSI Systems under Variability: Analysis and Design Solutions
Variability has become one of the vital challenges that the
designers of integrated circuits encounter. variability becomes
increasingly important. Imperfect manufacturing process manifest
itself as variations in the design parameters. These variations
and those in the operating environment of VLSI circuits result in
unexpected changes in the timing, power, and reliability of the
circuits. With scaling transistor dimensions, process and
environmental variations become significantly important in the
modern VLSI design. A smaller feature size means that the physical
characteristics of a device are more prone to these
unaccounted-for changes. To achieve a robust design, the random
and systematic fluctuations in the manufacturing process and the
variations in the environmental parameters should be analyzed and
the impact on the parametric yield should be addressed.
This thesis studies the challenges and comprises solutions for
designing robust VLSI systems in the presence of variations.
Initially, to get some insight into the system design under
variability, the parametric yield is examined for a small circuit.
Understanding the impact of variations on the yield at the circuit
level is vital to accurately estimate and optimize the yield at
the system granularity. Motivated by the observations and results,
found at the circuit level, statistical analyses are performed,
and solutions are proposed, at the system level of abstraction, to
reduce the impact of the variations and increase the parametric
yield.
At the circuit level, the impact of the supply and threshold
voltage variations on the parametric yield is discussed. Here, a
design centering methodology is proposed to maximize the
parametric yield and optimize the power-performance trade-off
under variations. In addition, the scaling trend in the yield loss
is studied. Also, some considerations for design centering in the
current and future CMOS technologies are explored.
The investigation, at the circuit level, suggests that the
operating temperature significantly affects the parametric yield.
In addition, the yield is very sensitive to the magnitude of the
variations in supply and threshold voltage. Therefore, the spatial
variations in process and environmental variations make it
necessary to analyze the yield at a higher granularity. Here,
temperature and voltage variations are mapped across the chip to
accurately estimate the yield loss at the system level.
At the system level, initially the impact of process-induced
temperature variations on the power grid design is analyzed. Also,
an efficient verification method is provided that ensures the
robustness of the power grid in the presence of variations. Then,
a statistical analysis of the timing yield is conducted, by taking
into account both the process and environmental variations. By
considering the statistical profile of the temperature and supply
voltage, the process variations are mapped to the delay variations
across a die. This ensures an accurate estimation of the timing
yield. In addition, a method is proposed to accurately estimate
the power yield considering process-induced temperature and supply
voltage variations. This helps check the robustness of the
circuits early in the design process.
Lastly, design solutions are presented to reduce the power
consumption and increase the timing yield under the variations. In
the first solution, a guideline for floorplaning optimization in
the presence of temperature variations is offered. Non-uniformity
in the thermal profiles of integrated circuits is an issue that
impacts the parametric yield and threatens chip reliability.
Therefore, the correlation between the total power consumption and
the temperature variations across a chip is examined. As a result,
floorplanning guidelines are proposed that uses the correlation to
efficiently optimize the chip's total power and takes into account
the thermal uniformity.
The second design solution provides an optimization methodology
for assigning the power supply pads across the chip for maximizing
the timing yield. A mixed-integer nonlinear programming (MINLP)
optimization problem, subject to voltage drop and current
constraint, is efficiently solved to find the optimum number and
location of the pads
Ring oscillator clocks and margins
How much margin do we have to add to the delay lines of a bundled-data circuit? This paper is an attempt to give a methodical answer to this question, taking into account all sources of variability and the existing EDA machinery for timing analysis and sign-off. The paper is based on the study of the margins of a ring oscillator that substitutes a PLL as clock generator. A timing model is proposed that shows that a 12% margin for delay lines can be sufficient to cover variability in a 65nm technology. In a typical scenario, performance and energy improvements between 15% and 35% can be obtained by using a ring oscillator instead of a PLL. The paper concludes that a synchronous circuit with a ring oscillator clock shows similar benefits in performance and energy as those of bundled-data asynchronous circuits.Peer ReviewedPostprint (author's final draft
X-SRAM: Enabling In-Memory Boolean Computations in CMOS Static Random Access Memories
Silicon-based Static Random Access Memories (SRAM) and digital Boolean logic
have been the workhorse of the state-of-art computing platforms. Despite
tremendous strides in scaling the ubiquitous metal-oxide-semiconductor
transistor, the underlying \textit{von-Neumann} computing architecture has
remained unchanged. The limited throughput and energy-efficiency of the
state-of-art computing systems, to a large extent, results from the well-known
\textit{von-Neumann bottleneck}. The energy and throughput inefficiency of the
von-Neumann machines have been accentuated in recent times due to the present
emphasis on data-intensive applications like artificial intelligence, machine
learning \textit{etc}. A possible approach towards mitigating the overhead
associated with the von-Neumann bottleneck is to enable \textit{in-memory}
Boolean computations. In this manuscript, we present an augmented version of
the conventional SRAM bit-cells, called \textit{the X-SRAM}, with the ability
to perform in-memory, vector Boolean computations, in addition to the usual
memory storage operations. We propose at least six different schemes for
enabling in-memory vector computations including NAND, NOR, IMP (implication),
XOR logic gates with respect to different bit-cell topologies the 8T cell
and the 8T Differential cell. In addition, we also present a novel
\textit{`read-compute-store'} scheme, wherein the computed Boolean function can
be directly stored in the memory without the need of latching the data and
carrying out a subsequent write operation. The feasibility of the proposed
schemes has been verified using predictive transistor models and Monte-Carlo
variation analysis.Comment: This article has been accepted in a future issue of IEEE Transactions
on Circuits and Systems-I: Regular Paper
Smart Card Fault Injections with High Temperatures
Power and clock glitch attacks on smart cards can help an attacker to discover some internal
secrets or bypass certain security checks. Also, an attacker can manipulate the temperature and supply voltage
of the device, thus making the device glitch more easily. If these manipulations are within the device operating
conditions, it becomes harder to distinguish between an extreme condition from an attacker. To demonstrate
temperature and power supply effect on fault attacks, we perform several tests on an Atmega 163 microcontroller
in different conditions. Our results show that this kind of attacks are still a serious threat to small devices,
whilst maintaining the manufacturer recommendations
Voltage noise analysis with ring oscillator clocks
Voltage noise is the main source of dynamic variability in integrated circuits and a major concern for the design of Power Delivery Networks (PDNs). Ring Oscillators Clocks (ROCs) have been proposed as an alternative to mitigate the negative effects of voltage noise as technology scales down and power density increases. However, their effectiveness highly depends on the design parameters of the PDN, power consumption patterns of the system and spatial locality of the ROCs within the clock domains. This paper analyzes the impact of the PDN parameters and ROC location on the robustness to voltage noise. The capability of reacting instantaneously to unpredictable voltage droops makes ROCs an attractive solution, which allows to reduce the amount of decoupling capacitance without downgrading performance. Tolerance to voltage noise and related benefits can be increased by using multiple ROCs and reducing the size of the clock domains. The analysis shows that up to 83% of the margins for voltage noise and up to 27% of the leakage power can be reduced by using local ROCs.Peer ReviewedPostprint (author's final draft
Desynchronization: Synthesis of asynchronous circuits from synchronous specifications
Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur
- …