164 research outputs found
Digital IP Protection Using Threshold Voltage Control
This paper proposes a method to completely hide the functionality of a
digital standard cell. This is accomplished by a differential threshold logic
gate (TLG). A TLG with inputs implements a subset of Boolean functions of
variables that are linear threshold functions. The output of such a gate is
one if and only if an integer weighted linear arithmetic sum of the inputs
equals or exceeds a given integer threshold. We present a novel architecture of
a TLG that not only allows a single TLG to implement a large number of complex
logic functions, which would require multiple levels of logic when implemented
using conventional logic primitives, but also allows the selection of that
subset of functions by assignment of the transistor threshold voltages to the
input transistors. To obfuscate the functionality of the TLG, weights of some
inputs are set to zero by setting their device threshold to be a high .
The threshold voltage of the remaining transistors is set to low to
increase their transconductance. The function of a TLG is not determined by the
cell itself but rather the signals that are connected to its inputs. This makes
it possible to hide the support set of the function by essentially removing
some variable from the support set of the function by selective assignment of
high and low to the input transistors. We describe how a standard cell
library of TLGs can be mixed with conventional standard cells to realize
complex logic circuits, whose function can never be discovered by reverse
engineering. A 32-bit Wallace tree multiplier and a 28-bit 4-tap filter were
synthesized on an ST 65nm process, placed and routed, then simulated including
extracted parastics with and without obfuscation. Both obfuscated designs had
much lower area (25%) and much lower dynamic power (30%) than their
nonobfuscated CMOS counterparts, operating at the same frequency
A novel deep submicron bulk planar sizing strategy for low energy subthreshold standard cell libraries
Engineering andPhysical Science ResearchCouncil
(EPSRC) and Arm Ltd for providing funding in the form of grants and studentshipsThis work investigates bulk planar deep submicron semiconductor physics in an attempt
to improve standard cell libraries aimed at operation in the subthreshold regime and in
Ultra Wide Dynamic Voltage Scaling schemes. The current state of research in the field is
examined, with particular emphasis on how subthreshold physical effects degrade
robustness, variability and performance. How prevalent these physical effects are in a
commercial 65nm library is then investigated by extensive modeling of a BSIM4.5
compact model. Three distinct sizing strategies emerge, cells of each strategy are laid out
and post-layout parasitically extracted models simulated to determine the
advantages/disadvantages of each. Full custom ring oscillators are designed and
manufactured. Measured results reveal a close correlation with the simulated results, with
frequency improvements of up to 2.75X/2.43X obs erved for RVT/LVT devices
respectively. The experiment provides the first silicon evidence of the improvement
capability of the Inverse Narrow Width Effect over a wide supply voltage range, as well
as a mechanism of additional temperature stability in the subthreshold regime.
A novel sizing strategy is proposed and pursued to determine whether it is able to produce
a superior complex circuit design using a commercial digital synthesis flow. Two 128 bit
AES cores are synthesized from the novel sizing strategy and compared against a third
AES core synthesized from a state-of-the-art subthreshold standard cell library used by
ARM. Results show improvements in energy-per-cycle of up to 27.3% and frequency
improvements of up to 10.25X. The novel subthreshold sizing strategy proves superior
over a temperature range of 0 °C to 85 °C with a nominal (20 °C) improvement in
energy-per-cycle of 24% and frequency improvement of 8.65X.
A comparison to prior art is then performed. Valid cases are presented where the
proposed sizing strategy would be a candidate to produce superior subthreshold circuits
Power-efficient high-speed interface circuit techniques
Inter- and intra-chip connections have become the new challenge to enable the scaling of computing systems, ranging from mobile devices to high-end servers. Demand for aggregate I/O bandwidth has been driven by applications including high-speed ethernet, backplane micro-servers, memory, graphics, chip-to-chip and network onchip. I/O circuitry is becoming the major power consumer in SoC processors and memories as the increasing bandwidth demands larger per-pin data rate or larger I/O pin count per component. The aggregate I/O bandwidth has approximately doubled every three to four years across a diverse range of standards in different applications. However, in order to keep pace with these standards enabled in part by process-technology scaling, we will require more than just device scaling in the near future. New energy-efficient circuit techniques must be proposed to enable the next generations of handheld and high-performance computers, given the thermal and system-power limits they start facing. ^ In this work, we are proposing circuit architectures that improve energy efficiency without decreasing speed performance for the most power hungry circuits in high speed interfaces. By the introduction of a new kind of logic operators in CMOS, called implication operators, we implemented a new family of high-speed frequency dividers/prescalers with reduced footprint and power consumption. New techniques and circuits for clock distribution, for pre-emphasis and for driver at the transmitter side of the I/O circuitry have been proposed and implemented. At the receiver side, new DFE architecture and CDR have been proposed and have been proven experimentally
Recommended from our members
Noise shaping Asynchronous SAR ADC based time to digital converter
Time-to-digital converters (TDCs) are key elements for the digitization of timing information in modern mixed-signal circuits such as digital PLLs, DLLs, ADCs, and on-chip jitter-monitoring circuits. Especially, high-resolution TDCs are increasingly employed in on-chip timing tests, such as jitter and clock skew measurements, as advanced fabrication technologies allow fine on-chip time resolutions. Its main purpose is to quantize the time interval of a pulse signal or the time interval between the rising edges of two clock signals. Similarly to ADCs, the performance of TDCs are also primarily characterized by Resolution, Sampling Rate, FOM, SNDR, Dynamic Range and DNL/INL. This work proposes and demonstrates 2nd order noise shaping Asynchronous SAR ADC based TDC architecture with highest resolution of 0.25 ps among current state of art designs with respect to post-layout simulation results. This circuit is a combination of low power/High Resolution 2nd Order Noise Shaped Asynchronous SAR ADC backend with simple Time to Amplitude converter (TAC) front-end and is implemented in 40nm CMOS technology. Additionally, special emphasis is given on the discussion on various current state of art TDC architectures.Electrical and Computer Engineerin
Recommended from our members
Design techniques for low-power SAR ADCs in nano-scale CMOS technologies
This thesis presents low power design techniques for successive approximation register (SAR) analog-to-digital converters (ADCs) in nano-scale CMOS technologies. Low power SAR ADCs face two major challenges especially at high resolutions: (1) increased comparator power to suppress the noise, and (2) increased DAC switching energy due to the large DAC size. To improve the comparator’s power efficiency, a statistical estimation based comparator noise reduction technique is presented. It allows a low power and noisy comparator to achieve high signal-to-noise ratio (SNR) by estimating the conversion residue. A first prototype ADC in 65nm CMOS has been developed to validate the proposed noise reduction technique. It achieves 4.5 fJ/conv-step Walden figure of merit and 64.5 dB signal-to-noise and distortion ratio (SNDR). In addition, a bidirectional single-side switching technique is developed to reduce the DAC switching power. It can reduce the DAC switching power and the total number of unit capacitors by 86% and 75%, respectively. A second prototype ADC with the proposed switching technique is designed and fabricated in 180nm CMOS technology. It achieves an SNDR of 63.4 dB and consumes only 24 Wat 1MS/s, leading to aWalden figure of merit of 19.9 fJ/conv-step. This thesis also presents an improved loop-unrolled SAR ADC, which works at high frequency with reduced SAR logic power and delay. It employs the bidirectional single-side switching technique to reduce the comparator common-mode voltage variation. In addition, it uses a Vcm-adaptive offset calibration technique which can accurately calibrate comparator’s offset at its operating Vcm. A prototype ADC designed in 40nm CMOS achieves 35 dB at 700 MS/s sampling rate and consumes only 0.95 mW, leading to a Walden figure of merit of 30 fJ/conv-step.Electrical and Computer Engineerin
DIGITALLY ASSISTED TECHNIQUES FOR NYQUIST RATE ANALOG-to-DIGITAL CONVERTERS
With the advance of technology and rapid growth of digital systems, low power high speed analog-to-digital converters with great accuracy are in demand. To achieve high effective number of bits Analog-to-Digital Converter(ADC) calibration as a time consuming process is a potential bottleneck for designs. This dissertation presentsa fully digital background calibration algorithm for a 7-bit redundant flash ADC using split structure and look-up table based correction. Redundant comparators are used in the flash ADC design of this work in order to tolerate large offset voltages while minimizing signal input capacitance. The split ADC structure helps by eliminating the unknown input signal from the calibration path. The flash ADC has been designed in 180nm IBM CMOS technology and fabricated through MOSIS. This work was supported by Analog Devices, Wilmington,MA. While much research on ADC design has concentrated on increasing resolution and sample rate, there are many applications (e.g. biomedical devices and sensor networks) that do not require high performance but do require low power energy efficient ADCs. This dissertation also explores on design of a low quiescent current 100kSps Successive Approximation (SAR) ADC that has been used as an error detection ADC for an automotive application in 350nm CD (CMOS-DMOS) technology. This work was supported by ON Semiconductor Corp, East Greenwich,RI
Accelerated Aging in Devices and Circuits
abstract: The aging mechanism in devices is prone to uncertainties due to dynamic stress conditions. In AMS circuits these can lead to momentary fluctuations in circuit voltage that may be missed by a compact model and hence cause unpredictable failure. Firstly, multiple aging effects in the devices may have underlying correlations. The generation of new traps during TDDB may significantly accelerate BTI, since these traps are close to the dielectric-Si interface in scaled technology. Secondly, the prevalent reliability analysis lacks a direct validation of the lifetime of devices and circuits. The aging mechanism of BTI causes gradual degradation of the device leading to threshold voltage shift and increasing the failure rate. In the 28nm HKMG technology, contribution of BTI to NMOS degradation has become significant at high temperature as compared to Channel Hot Carrier (CHC). This requires revising the End of Lifetime (EOL) calculation based on contribution from induvial aging effects especially in feedback loops. Conventionally, aging in devices is extrapolated from a short-term measurement, but this practice results in unreliable prediction of EOL caused by variability in initial parameters and stress conditions. To mitigate the extrapolation issues and improve predictability, this work aims at providing a new approach to test the device to EOL in a fast and controllable manner. The contributions of this thesis include: (1) based on stochastic trapping/de-trapping mechanism, new compact BTI models are developed and verified with 14nm FinFET and 28nm HKMG data. Moreover, these models are implemented into circuit simulation, illustrating a significant increase in failure rate due to accelerated BTI, (2) developing a model to predict accelerated aging under special conditions like feedback loops and stacked inverters, (3) introducing a feedback loop based test methodology called Adaptive Accelerated Aging (AAA) that can generate accurate aging data till EOL, (4) presenting simulation and experimental data for the models and providing test setup for multiple stress conditions, including those for achieving EOL in 1 hour device as well as ring oscillator (RO) circuit for validation of the proposed methodology, and (5) scaling these models for finding a guard band for VLSI design circuits that can provide realistic aging impact.Dissertation/ThesisMasters Thesis Electrical Engineering 201
The Integration of nearthreshold and subthreshold CMOS logic for energy minimization
With the rapid growth in the use of portable electronic devices, more emphasis has recently been placed on low-energy circuit design. Digital subthreshold complementary metal-oxide-semiconductor (CMOS) circuit design is one area of study that offers significant energy reduction by operating at a supply voltage substantially lower than the threshold voltage of the transistor. However, these energy savings come at a critical cost to performance, restricting its use to severely energy-constrained applications such as microsensor nodes. In an effort to mitigate this performance degradation in low-energy designs, nearthreshold circuit design has been proposed and implemented in digital circuits such as Intel\u27s energy-efficient hardware accelerator. The application spectrum of nearthreshold and subthreshold design could be broadened by integrating these cells into high-performance designs. This research focuses on the integration of characterized nearthreshold and subthreshold standard cells into high-performance functional modules. Within these functional modules, energy minimization is achieved while satisfying performance constraints by replacing non-critical path logic with nearthreshold and subthreshold logic cells. Specifically, the critical path method is used to bind the timing and energy constraints of the design. The design methodology was verified and tested with several benchmark circuits, including a cryptographic hash function, Skein. An average energy savings of 41.15% was observed at a circuit performance degradation factor of 10. The energy overhead of the level shifters accounted for at least 8.5% of the energy consumption of the optimized circuit, with an average energy overhead of 26.76%. A heuristic approach is developed for estimating the energy savings of the optimized design
Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency
abstract: Static CMOS logic has remained the dominant design style of digital systems for
more than four decades due to its robustness and near zero standby current. Static
CMOS logic circuits consist of a network of combinational logic cells and clocked sequential
elements, such as latches and flip-flops that are used for sequencing computations
over time. The majority of the digital design techniques to reduce power, area, and
leakage over the past four decades have focused almost entirely on optimizing the
combinational logic. This work explores alternate architectures for the flip-flops for
improving the overall circuit performance, power and area. It consists of three main
sections.
First, is the design of a multi-input configurable flip-flop structure with embedded
logic. A conventional D-type flip-flop may be viewed as realizing an identity function,
in which the output is simply the value of the input sampled at the clock edge. In
contrast, the proposed multi-input flip-flop, named PNAND, can be configured to
realize one of a family of Boolean functions called threshold functions. In essence,
the PNAND is a circuit implementation of the well-known binary perceptron. Unlike
other reconfigurable circuits, a PNAND can be configured by simply changing the
assignment of signals to its inputs. Using a standard cell library of such gates, a technology
mapping algorithm can be applied to transform a given netlist into one with
an optimal mixture of conventional logic gates and threshold gates. This approach
was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit booth multiplier
in 65nm LP technology. Simulation and chip measurements show more than 30%
improvement in dynamic power and more than 20% reduction in core area.
The functional yield of the PNAND reduces with geometry and voltage scaling.
The second part of this research investigates the use of two mechanisms to improve
the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM
devices for low voltage operation.
The third part of this research focused on the design of flip-flops with non-volatile
storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated
with both conventional D-flipflop and the PNAND circuits to implement non-volatile
logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of
system locally when a power interruption occurs. However, manufacturing variations
in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading
to an overly pessimistic design and consequently, higher energy consumption. A
detailed analysis of the design trade-offs in the driver circuitry for performing backup
and restore, and a novel method to design the energy optimal driver for a given yield is
presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented,
in which the backup time is determined on a per-chip basis, resulting in minimizing
the energy wastage and satisfying the yield constraint. To achieve a yield of 98%,
the conventional approach would have to expend nearly 5X more energy than the
minimum required, whereas the proposed tunable approach expends only 26% more
energy than the minimum. A non-volatile threshold gate architecture NV-TLFF are
designed with the same backup and restore circuitry in 65nm technology. The embedded
logic in NV-TLFF compensates performance overhead of NVL. This leads to the
possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and-
accumulate (MAC) unit is designed to demonstrate the performance benefits of the
proposed architecture. Based on the results of HSPICE simulations, the MAC circuit
with the proposed NV-TLFF cells is shown to consume at least 20% less power and
area as compared to the circuit designed with conventional DFFs, without sacrificing
any performance.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201
myCACTI: A new cache design tool for pipelined nanometer caches
TThe presence of caches in microprocessors has always been one of the most
important techniques in bridging the memory wall, or the speed gap between the
microprocessor and main memory. This importance is continuously increasing
especially as we enter the regime of nanometer process technologies (i.e. 90nm
and below), as industry has favored investing a larger and larger fraction of a
chip.s transistor budget to improving the on-chip cache. This is the case in
practice, as it has proven to be an efficient way to utilize the increasing
number of transistors available with each succeeding technology. Consequently,
it becomes even more important to have cache design tools that give accurate
representations of designs that exist in actual microprocessors.
The prevalent cache design tools that are the most widely used in academe are
CACTI [Wilton1996] and eCACTI [Mamidipaka2004], and these have proven to be very
useful tools not just for cache designers, but also for computer architects.
This dissertation will show that both CACTI and eCACTI still contain major
limitations and even flaws in their design, making them unsuitable for use in
very-deep submicron and nanometer caches, especially pipelined designs. These
limitations and flaws will be discussed in detail.
This dissertation then introduces a new tool, called myCACTI, that addresses all
these limitations and, in addition, introduces major enhancements to the
simulation framework.
This dissertation then demonstrates the use of myCACTI in the cache design
process. Detailed design space explorations are done on multiple cache
configurations to produce pareto optimal curves of the caches to show optimal
implementations. Detailed studies are also performed to characterize the delay
and power dissipation of different cache configurations and implementations.
Finally, future directions to the development of myCACTI are identified to show
possible ways that the tool can be improved in such a way as to allow even more
different kinds of studies to be performed
- …