
    Design of variation-tolerant synchronizers for multiple clock and voltage domains

    PhD Thesis. Parametric variability increasingly affects the performance of electronic circuits as fabrication technology has reached the level of 32nm and beyond. These parameters include transistor process parameters (such as threshold voltage), supply voltage and temperature (PVT), all of which can have a significant impact on the speed and power consumption of a circuit, particularly if the variations exceed the design margins. As systems are designed with more asynchronous protocols, there is a need for highly robust synchronizers and arbiters. These components are often used as interfaces between communication links of different timing domains as well as sampling devices for asynchronous inputs coming from external components. These applications have created a need for new robust designs of synchronizers and arbiters that can tolerate process, voltage and temperature variations. The aim of this study was to investigate how synchronizers and arbiters should be designed to tolerate parametric variations. All investigations focused mainly on circuit-level and transistor-level designs, modeled and simulated in the UMC 90nm CMOS technology process. Analog simulations were used to measure timing parameters and power consumption, along with Monte Carlo statistical analysis to account for process variations. Two main components of synchronizers and arbiters were investigated: the flip-flop and the mutual-exclusion element (MUTEX). Both components can violate their input timing conditions (the setup and hold window), which can cause metastability inside their bistable elements and possibly end in failure. The mean time between failures (MTBF) is an important reliability measure of any synchronizer; it depends exponentially on the resolution time available relative to the metastability time constant τ, and on the delay through the synchronizer. The MUTEX study focused on the classical circuit, in addition to a number of techniques for improving its tolerance, based on increasing internal gain by adding current sources, reducing the capacitive loading, boosting the transconductance of the latch, compensating the existing Miller capacitance, and adding asymmetry to maneuver the metastable point. The results showed that some circuits had little or almost no improvement, while five techniques showed significant improvements by reducing τ and maintaining high tolerance. Three design approaches are proposed to provide variation-tolerant synchronizers. First, the wagging synchronizer is proposed to significantly increase reliability over that of the conventional two flip-flop synchronizer. The robustness of the wagging technique can be enhanced by using latches with a more robust τ or by adding one more cycle of synchronization. The second approach is the Metastability Auto-Detection and Correction (MADAC) latch, which relies on swiftly detecting a metastable event and correcting it by enforcing the previously stored logic value. This technique significantly reduces the resolution time. The third synchronization technique is proposed to transfer signals between Multiple-Voltage Multiple-Clock Domains (MVD/MCD) without requiring conventional level-shifters between the domains or multiple power supplies within each domain. This interface circuit uses a synchronous set and feedback reset protocol which provides level-shifting and synchronization of all signals between the domains, over a wide range of supply voltages and clock frequencies.
    Overall, synchronizer circuits can tolerate variations to a greater extent by employing the wagging technique or using a MADAC latch, while adequate MUTEX tolerance can be achieved with small circuit modifications. Communication between MVD/MCD can be achieved by an asynchronous handshake without the need to add level-shifters. Funding: The Saudi Arabian Embassy in London; Umm Al-Qura University, Saudi Arabia.
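    The reliability argument above rests on the exponential dependence of the MTBF on the resolution time available relative to τ. The sketch below evaluates the standard textbook MTBF relation for a flip-flop synchronizer, MTBF = exp(t_res/τ) / (T_w · f_clk · f_data); all numerical parameter values are illustrative assumptions and are not results from the thesis.

        import math

        def synchronizer_mtbf(resolution_time, tau, t_w, f_clk, f_data):
            # Widely used MTBF model: MTBF = exp(t_res / tau) / (T_w * f_clk * f_data)
            return math.exp(resolution_time / tau) / (t_w * f_clk * f_data)

        # Illustrative (assumed) parameter values; they are not taken from the thesis.
        tau    = 20e-12   # metastability resolution time constant (s)
        t_w    = 30e-12   # metastability window (s)
        f_clk  = 1e9      # receiving clock frequency (Hz)
        f_data = 100e6    # asynchronous data event rate (Hz)

        for cycles in (1, 2, 3):
            mtbf = synchronizer_mtbf(cycles / f_clk, tau, t_w, f_clk, f_data)
            print(f"{cycles} cycle(s) of resolution time: MTBF ~ {mtbf:.2e} s")

    Adding one cycle of resolution time (as multi-cycle or wagging-style schemes do) multiplies the MTBF in this model by roughly exp(T_clk/τ), which is why even a single extra cycle is so effective.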

    Digital Background Self-Calibration Technique for Compensating Transition Offsets in Reference-less Flash ADCs

    This dissertation focuses on proving that background calibration using adaptive algorithms is a low-cost, stable and effective method for obtaining high accuracy in flash A/D converters. An integrated reference-less 3-bit flash ADC circuit has been successfully designed and taped out in UMC 180 nm CMOS technology in order to prove the efficiency of the proposed background calibration. References for the ADC transitions are implemented virtually, built into the comparators' dynamic-latch topology by a controlled mismatch added to each comparator's input front-end. A very simple external DAC block (the calibration bank) allows the amount of mismatch added to each comparator front-end to be controlled and, therefore, the offset of its effective transition with respect to the nominal value to be compensated. To assist in the estimation of the offset of the prototype comparators, an auxiliary A/D converter with higher resolution and lower conversion speed than the flash ADC is used: a 6-bit capacitive-DAC SAR type. Special care has been taken in synchronizing the analogue sampling instant in both ADCs. In this thesis, a criterion to identify the optimum parameters of a flash ADC design with adaptive background calibration has been established. With this criterion, the best choices for the dynamic-latch architecture, the calibration-bank resolution and the flash ADC resolution are selected. The performance of the calibration algorithm has been tested, providing great programmability to the digital processor that implements the algorithm and allowing the algorithm limits, accuracy and arithmetic quantization errors to be chosen. Further, a systematic controlled offset can be forced in the comparators of the flash ADC in order to carry out a more exhaustive test of the calibration.
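    The background calibration described above can be viewed as a per-comparator feedback loop: whenever the slower, more accurate SAR ADC observes the input near a comparator's nominal transition, the flash decision is checked against it and the calibration-bank code is stepped toward the correct decision. The following behavioural sketch illustrates that idea under assumed parameters (a 3-bit flash, an idealized auxiliary ADC, a hypothetical 0.5%-of-full-scale correction step); it is not the circuit or the exact algorithm of the taped-out prototype.

        import random

        N_BITS = 3
        LEVELS = 2 ** N_BITS - 1                                   # 7 comparators in a 3-bit flash ADC
        NOMINAL = [(i + 1) / 2 ** N_BITS for i in range(LEVELS)]   # ideal transition levels (full scale = 1.0)

        offsets   = [random.gauss(0, 0.03) for _ in range(LEVELS)] # unknown comparator offsets (assumed)
        cal_codes = [0] * LEVELS                                   # calibration-bank code per comparator
        CAL_LSB   = 0.005                                          # assumed correction step (fraction of full scale)

        def comparator(i, vin):
            # Effective threshold = nominal level + offset - calibration correction
            return vin > NOMINAL[i] + offsets[i] - cal_codes[i] * CAL_LSB

        # Background calibration: whenever the slow auxiliary ADC (modelled here as the
        # exact input value) sees the input near a transition, check the flash decision
        # and nudge that comparator's calibration code in the correcting direction.
        for _ in range(20000):
            vin = random.random()
            ref = vin
            for i in range(LEVELS):
                if abs(ref - NOMINAL[i]) < 0.02:
                    decided_high   = comparator(i, vin)
                    should_be_high = ref > NOMINAL[i]
                    if decided_high != should_be_high:
                        cal_codes[i] += -1 if decided_high else 1  # sign-sign style update

        residual = [abs(offsets[i] - cal_codes[i] * CAL_LSB) for i in range(LEVELS)]
        print("worst residual transition offset:", round(max(residual), 4))

    After enough near-transition samples, each residual offset settles to within roughly one calibration LSB, which is the resolution limit of the calibration bank in this toy model.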

    Architecture Independent Timing Speculation Techniques in VLSI Circuits.

    Conventional digital circuits must ensure correct operation throughout a wide range of operating conditions, including process, voltage, and temperature variation. These conditions affect circuit delays, so safety margins must be put in place, which come at a power and performance cost. The Razor system proposed eliminating these timing margins by running a circuit with occasional timing errors and correcting the errors when they occur. Several Razor-style designs have been proposed; however, prior to this work, Razor could not be applied blindly or automatically to designs, as the various error correction schemes modified the architecture of the target design. Because of the architectural invasiveness and design complexities of these techniques, no published Razor-style system had been applied to a complete existing commercial processor. Additionally, in all prior Razor-style systems there is a fundamental tradeoff between the speculation window and short-path (minimum delay) constraints, limiting the technique's effectiveness. This thesis introduces the concept of Razor using two-phase latch-based timing. By identifying and utilizing time borrowing as an error correction mechanism, it allows Razor to be applied without the need to reload data or replay instructions. This allows Razor to be blindly and automatically applied to existing designs without detailed knowledge of their internal architecture. Additionally, latch-based Razor allows for large speculation windows, up to 100% of nominal circuit delay, because it breaks the connection between minimum delay constraints and the speculation window. By demonstrating how to transform conventional flip-flop based designs, including those which make use of clock gating, to two-phase latch-based timing, Razor can be automatically added to a large set of existing digital designs. Two forms of latch-based Razor are proposed. First, Bubble Razor involves rippling stall cycles throughout a circuit in response to timing errors; it is applied to the ARM Cortex-M3 processor, the first ever application of a Razor technique to a complete, existing processor design. Additional work applies Bubble Razor to the ARM Cortex-R4 processor. The second latch-based Razor technique, Voltage Razor, uses voltage boosting to correct for timing errors. PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/102461/1/mfojtik_1.pd
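    The central observation, that a two-phase latch-based pipeline can absorb a late arrival by time borrowing and only needs to stall when the borrowing budget is exhausted, can be illustrated with a toy cycle-level model. The sketch below uses assumed numbers (clock period normalized to 1.0, latches transparent for half a period) and is a conceptual illustration, not the Bubble Razor logic applied to the Cortex-M3.

        HALF, PERIOD = 0.5, 1.0   # assumed: latches are transparent for half of a unit clock period

        def run_pipeline(stage_delays):
            """stage_delays: combinational delay seen at each successive latch stage.
            A stage may borrow time into the next transparent phase; if the accumulated
            arrival would exceed a full period, a Razor error is flagged and one stall
            (bubble) cycle is inserted, which clears the borrowing debt."""
            borrow, stalls = 0.0, 0
            for delay in stage_delays:
                arrival = borrow + delay
                if arrival <= HALF:
                    borrow = 0.0                # met the nominal phase: nothing borrowed
                elif arrival <= PERIOD:
                    borrow = arrival - HALF     # time borrowing: next stage starts late
                else:
                    stalls += 1                 # timing error: stall one cycle to recover
                    borrow = 0.0
            return stalls

        print("stalls inserted:", run_pipeline([0.4, 0.7, 0.6, 0.3]))   # borrowing only -> 0
        print("stalls inserted:", run_pipeline([0.4, 0.9, 0.9]))        # debt overflows -> 1

    Because correction is only a stall rather than a data reload or instruction replay, no knowledge of the surrounding architecture is needed, which is what makes the transformation applicable blindly to an existing design.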

    Circuit Techniques for Adaptive and Reliable High Performance Computing.

    Increasing power density with process scaling has caused stagnation in the clock speed of modern microprocessors. Accordingly, designers have adopted message passing and shared-memory based multicore architectures in order to keep up with the rapidly rising demand for computing throughput. At the same time, applications are not entirely parallel, and improving single-thread performance continues to remain critical. Additionally, reliability is also worsening with process scaling, and margining for failures due to process and environmental variations in modern technologies consumes an increasingly large portion of the power/performance envelope. In the wake of multicore computing, reliability of signal synchronization between the cores is also becoming increasingly critical. This forces designers to search for alternative, efficient methods to improve compute performance while addressing reliability. Accordingly, this dissertation presents innovative circuit and architectural techniques for variation tolerance, performance and reliability, targeted at datapath logic, signal synchronization and memories. Firstly, a domino-logic based design style for datapath logic is presented that uses Adaptive Robustness Tuning (ART) in addition to timing speculation to provide up to 71% performance gains over conventional domino logic in a 32b×32b multiplier in 65nm CMOS. Margins are reduced until functionality errors are detected, and these errors are used to guide the tuning. Secondly, for signal synchronization across clock domains, a new class of dynamic-logic based synchronizers with single-cycle synchronization latency is presented, in which pulses, rather than stable intermediate voltages, cause metastability. Such pulses are amplified using skewed inverters to improve mean time between failures by ~1e6x over jamb latches and double flip-flops at 2GHz in 65nm CMOS. Thirdly, a reconfigurable sensing scheme for 6T SRAMs is presented that employs auto-zero calibration and pre-amplification to improve sensing reliability (by up to 1.2 standard deviations of NMOS threshold voltage in 28nm CMOS); this increased reliability is in turn traded for ~42% sensing speedup. Finally, a main-memory architecture design methodology to address reliability and power in the context of Exascale computing systems is presented. Based on 3D-stacked DRAMs, the methodology co-optimizes DRAM access energy, refresh power and the increased cost of error resilience to meet stringent power and reliability constraints. PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/107238/1/bharan_1.pd
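    The ART policy of reducing margins until errors appear is essentially a feedback loop on an operating knob such as supply voltage. The sketch below shows that control loop in the abstract, with entirely hypothetical step sizes, error-rate model and threshold; the actual tuning in the dissertation is performed in hardware around the domino datapath.

        def adaptive_margin_tuning(error_rate_at, v_start=1.0, v_min=0.7,
                                   step_down=0.01, step_up=0.02, target=1e-6):
            """Lower the supply until the measured timing-error rate exceeds the target,
            then back off by a small guardband. error_rate_at(v) is a measurement hook."""
            v = v_start
            while v > v_min:
                if error_rate_at(v) > target:
                    return min(v + step_up, v_start)   # errors detected: restore a guardband
                v -= step_down                         # still error-free: shave more margin
            return v_min

        # Hypothetical silicon: timing errors appear sharply below 0.82 V.
        model = lambda v: 0.0 if v >= 0.82 else 1e-3
        print("chosen operating point:", round(adaptive_margin_tuning(model), 3), "V")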

    Design of hardware-based security solutions for interconnected systems

    Among all the different research lines related to hardware security, there is a particular topic that strikingly attracts attention: the research regarding so-called Physical Unclonable Functions (PUFs). PUFs, as can be seen throughout the Thesis, present the novel idea of connecting digital values uniquely to a physical entity, just as human biometrics does, but with electronic devices. This beautiful idea is not free of obstacles, and it is the core of this Thesis. It is studied from different angles in order to better understand SRAM PUFs in particular, and to be able to integrate them into complex systems that expand their potential. In Chapter 1, PUFs, their properties and their main characteristics are defined. In addition, the different types of PUFs and their main applications in the field of security are summarized. Once we know what a PUF is and which types of them we can find, Chapter 2 carries out an exhaustive analysis of SRAM PUFs, given the wide availability of SRAMs today in most electronic circuits (which dramatically reduces the cost of deploying any solution). An algorithm is proposed to improve the characteristics of SRAM PUFs, both to generate identifiers and to generate random numbers, simultaneously. The results of this Chapter demonstrate the feasibility of implementing the algorithm, so the following Chapters explore its integration into both hardware and software systems. In Chapter 3, the hardware design and integration of the algorithm introduced in Chapter 2 are described. The design is presented together with some examples of use that demonstrate possible practical realizations in VLSI designs. Analogously, in Chapter 4 the software design and integration of the algorithm introduced in Chapter 2 are described. The design is presented together with some examples of use that demonstrate possible practical realizations in low-power IoT devices. The algorithm is also described as part of a secure firmware update protocol that has been designed to be resistant to most current attacks, ensuring the integrity and trustworthiness of the updated firmware. In Chapter 5, following the integration of PUF-based solutions into protocols, PUFs are used as part of an authentication protocol that uses zero-knowledge proofs. The cryptographic protocol is a lattice-based post-quantum protocol that guarantees the integrity and anonymity of the identity generated by the PUF. This type of architecture prevents any type of impersonation or virtual copy of the PUF, since the PUF response is unknown and never leaves the device. Specifically, this type of design has been carried out with the aim of having traceability of identities without ever knowing the identity behind them, which is very interesting for blockchain technologies. Finally, in Chapter 6 a new type of PUF, named the BPUF (Behavioral and Physical Unclonable Function), is proposed and analyzed according to the definitions given in Chapter 1. This new type of PUF significantly changes the metrics and concepts to which we were accustomed in previous Chapters. A new multi-modal authentication protocol that takes advantage of the challenge-response tuples of BPUFs is presented in this Chapter, and an example of BPUFs is illustrated with SRAMs. A proposal to integrate the BPUFs described in Chapter 6 into the protocol of Chapter 5, as well as the final remarks of the Thesis, can be found in Chapter 7.
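    One common way to let a single SRAM serve both purposes mentioned above is to enrol it over several power-ups: cells that always settle to the same value contribute identifier bits, while cells that flip between power-ups are harvested as an entropy source. The sketch below illustrates that classification on simulated power-up data with assumed cell statistics; it is not the specific algorithm proposed in Chapter 2.

        import random

        def classify_sram_cells(powerups):
            """powerups: one list of bits per power-up, all the same length (one bit per cell).
            Cells that agree across every power-up supply identifier bits; cells that
            disagree are candidate entropy sources for random-number generation."""
            id_bits, noisy_cells = [], []
            for cell in range(len(powerups[0])):
                values = {p[cell] for p in powerups}
                if len(values) == 1:
                    id_bits.append(powerups[0][cell])
                else:
                    noisy_cells.append(cell)
            return id_bits, noisy_cells

        # Simulated SRAM: ~90% of cells have a fixed preferred power-up value, ~10% are noisy.
        random.seed(1)
        preferred = [random.randint(0, 1) if random.random() < 0.9 else None for _ in range(256)]
        powerups = [[b if b is not None else random.randint(0, 1) for b in preferred]
                    for _ in range(8)]

        id_bits, noisy = classify_sram_cells(powerups)
        print(f"{len(id_bits)} stable cells for the identifier, {len(noisy)} noisy cells for entropy")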

    The Fifth NASA Symposium on VLSI Design

    The fifth annual NASA Symposium on VLSI Design had 13 sessions, including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next-generation advances that will serve as a basis for future VLSI design.

    inSense: A Variation and Fault Tolerant Architecture for Nanoscale Devices

    Transistor technology scaling has been the driving force in improving the size, speed, and power consumption of digital systems. As devices approach atomic size, however, their reliability and performance are increasingly compromised due to reduced noise margins, difficulties in fabrication, and emergent nano-scale phenomena. Scaled CMOS devices, in particular, suffer from process variations such as random dopant fluctuation (RDF) and line edge roughness (LER), transistor degradation mechanisms such as negative-bias temperature instability (NBTI) and hot-carrier injection (HCI), and increased sensitivity to single event upsets (SEUs). Consequently, future devices may exhibit reduced performance, diminished lifetimes, and poor reliability. This research proposes a variation and fault tolerant architecture, the inSense architecture, as a circuit-level solution to the problems induced by the aforementioned phenomena. The inSense architecture entails augmenting circuits with introspective and sensory capabilities which are able to dynamically detect and compensate for process variations, transistor degradation, and soft errors. This approach creates "smart" circuits able to function despite the use of unreliable devices and is applicable to current CMOS technology as well as next-generation devices using new materials and structures. Furthermore, this work presents an automated prototype implementation of the inSense architecture targeted to CMOS devices and is evaluated via implementation in ISCAS '85 benchmark circuits. The automated prototype implementation is functionally verified and characterized: it is found that error detection capability (with error windows from ≈30-400 ps) can be added for less than 2% area overhead for circuits of non-trivial complexity. Single event transient (SET) detection capability (configurable with target set-points) is found to be functional, although it generally tracks the standard DMR implementation with respect to overheads.
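    The added error-detection capability can be pictured as sampling a monitored node at the clock edge and again after a short detection window, and flagging any disagreement as a late transition or transient. The sketch below is an abstract behavioural illustration with hypothetical node waveforms and a window in the ~30-400 ps range quoted above; it does not model the inSense circuitry itself.

        def detect_error(node, t_clk, window):
            """node(t) returns the logic value of a monitored signal at time t (seconds).
            An error is flagged when the value captured at the clock edge differs from
            the value re-sampled after the detection window."""
            return node(t_clk) != node(t_clk + window)

        # Hypothetical waveforms: one path settles early, one transitions after the edge.
        on_time_path = lambda t: 1 if t >= 0.95e-9 else 0
        late_path    = lambda t: 1 if t >= 1.02e-9 else 0

        for name, node in (("on-time path", on_time_path), ("late path", late_path)):
            print(name, "-> error detected:", detect_error(node, t_clk=1.0e-9, window=100e-12))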

    D2.1 - Report on Selected TRNG and PUF Principles

    This report represents the final version of Deliverable 2.1 of the HECTOR work package WP2. It is a result of discussions and work on Task 2.1 by all HECTOR partners involved in WP2. The aim of Deliverable 2.1 is to select principles of random number generators (RNGs) and physical unclonable functions (PUFs) that fulfill strict technology, design and security criteria. For example, the selected RNGs must be suitable for implementation in logic devices according to the German AIS 20/31 standard. Correspondingly, the selected PUFs must be suitable for a similar security approach. A standard PUF evaluation approach does not yet exist, but one should be proposed in the framework of the project. The selected RNGs and PUFs should then be thoroughly evaluated from the point of view of security, and the most suitable principles should be implemented in logic devices, such as Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASICs), during the next phases of the project.
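    Evaluation of the selected RNG principles usually begins with simple statistical estimates on the raw noise source, for instance a most-common-value min-entropy estimate over raw symbols. The snippet below shows such an estimator on placeholder data; it is only an illustration and is not the AIS 20/31 evaluation procedure, which prescribes its own stochastic-model-based and online tests.

        import math
        import os
        from collections import Counter

        def mcv_min_entropy(samples):
            """Most-common-value estimate of min-entropy per symbol: -log2(p_max),
            where p_max is the relative frequency of the most frequent symbol."""
            counts = Counter(samples)
            p_max = max(counts.values()) / len(samples)
            return -math.log2(p_max)

        # Placeholder raw data; a real evaluation would use bits captured from the noise source.
        raw = os.urandom(100_000)
        print(f"estimated min-entropy: {mcv_min_entropy(raw):.3f} bits per 8-bit symbol")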