70 research outputs found

    Energy and performance models for clocked and asynchronous communication

    Get PDF
    Journal ArticleParameterized first-order models for throughput, energy, and bandwidth are presented in this paper. Models are developed for many common pipeline methodologies, including clocked flopped, clocked time-borrowing latch protocols, asynchronous two-cycle, four-cycle, delay-insensitive, and source synchronous. The paper focuses on communication costs which have the potential to throttle design performance as scaling continues. The models can also be applied to logic. The equations share common parameters to allow apples-to-apples comparisons against different design targets and pipeline methodologies. By applying the parameters to various design targets, one can determine when unclocked communication is superior at the physical level to clocked communication in terms of energy for a given bandwidth. Comparisons between protocols at fixed targets also allow designers to understand tradeoffs between implementations that have a varying degree of timing assumptions and design requirements

    Exploration and Design of High Performance Variation Tolerant On-Chip Interconnects

    Get PDF
    Siirretty Doriast

    Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency

    Get PDF
    abstract: Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power and area. It consists of three main sections. First, is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace Tree multiplier and a 32-bit booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area. The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture. One is the use of forward and reverse body biases to change the device threshold and the other is the use of RRAM devices for low voltage operation. The third part of this research focused on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D-flipflop and the PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy optimal driver for a given yield is presented. Efficient designs of two nonvolatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, resulting in minimizing the energy wastage and satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture NV-TLFF are designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and- accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture. Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Microarchitectural Low-Power Design Techniques for Embedded Microprocessors

    Get PDF
    With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided

    Soft-Error Resilience Framework For Reliable and Energy-Efficient CMOS Logic and Spintronic Memory Architectures

    Get PDF
    The revolution in chip manufacturing processes spanning five decades has proliferated high performance and energy-efficient nano-electronic devices across all aspects of daily life. In recent years, CMOS technology scaling has realized billions of transistors within large-scale VLSI chips to elevate performance. However, these advancements have also continually augmented the impact of Single-Event Transient (SET) and Single-Event Upset (SEU) occurrences which precipitate a range of Soft-Error (SE) dependability issues. Consequently, soft-error mitigation techniques have become essential to improve systems\u27 reliability. Herein, first, we proposed optimized soft-error resilience designs to improve robustness of sub-micron computing systems. The proposed approaches were developed to deliver energy-efficiency and tolerate double/multiple errors simultaneously while incurring acceptable speed performance degradation compared to the prior work. Secondly, the impact of Process Variation (PV) at the Near-Threshold Voltage (NTV) region on redundancy-based SE-mitigation approaches for High-Performance Computing (HPC) systems was investigated to highlight the approach that can realize favorable attributes, such as reduced critical datapath delay variation and low speed degradation. Finally, recently, spin-based devices have been widely used to design Non-Volatile (NV) elements such as NV latches and flip-flops, which can be leveraged in normally-off computing architectures for Internet-of-Things (IoT) and energy-harvesting-powered applications. Thus, in the last portion of this dissertation, we design and evaluate for soft-error resilience NV-latching circuits that can achieve intriguing features, such as low energy consumption, high computing performance, and superior soft errors tolerance, i.e., concurrently able to tolerate Multiple Node Upset (MNU), to potentially become a mainstream solution for the aerospace and avionic nanoelectronics. Together, these objectives cooperate to increase energy-efficiency and soft errors mitigation resiliency of larger-scale emerging NV latching circuits within iso-energy constraints. In summary, addressing these reliability concerns is paramount to successful deployment of future reliable and energy-efficient CMOS logic and spintronic memory architectures with deeply-scaled devices operating at low-voltages

    The effects of process variations on performance and robustness of bulk CMOS and SOI implementations of C-elements

    Get PDF
    Advances in semiconductor technology have been driven by the continuous demands of market forces for IC products with higher performance and greater functionality per unit area. To date industry has addressed these demands, principally, by scaling down device dimensions. However, several unintended consequences have undermined the benefits obtained from the advances in technology, firstly, the growing impact of process variations on interconnectivity delay, aggravated by the increase in the amount of interconnectivity as circuit complexity increases. Overall, the difficulty of establishing delay parameters in a circuit is adversely impacting on the attainment of the timing closure for a design. Secondly, the increase in the susceptibility of the circuits , even at ground level, to the effects of soft errors due to the reduction in supply voltages and nodal capacitances, together with the increase in the number of nodes in a circuit as the functionality per unit area increases. The aim of this research has been to model and analyse the reliability of logic circuits with regard to the impact of process variations and soft errors, and to finds ways to minimise these effects using different process technologies such as fully depleted silicon on insulator (FDSOI) and partially depleted silicon on insulator (PDSOI) technologies, together with the implementation of different circuit architectures. In view of the increased susceptibility of logic elements to the effects of process variations and soft errors as device geometries are reduced, a logic element which is not only widely used but also typical to asynchronous design is the Muller C-element, which can be realised in a number of different circuit configurations. The robustness of various C-element configurations implemented in different technologies with regard to the effects of process variations and soft errors was examined using the design of the experiment (DoE) and response surface (RSM) techniques. It was found that the circuits based on SOI technology were more robust compared with bulk silicon technology. On the other hand, from the circuit architecture perspective, the differential logic implementations of C-element were found to be more resilient to the effects of process variation and soft errors in comparison with the other C-element implementations investigated.EThOS - Electronic Theses Online ServiceMutah UniversityGBUnited Kingdo

    The effects of process variations on performance and robustness of bulk CMOS and SOI implementations of C-elements

    Get PDF
    Advances in semiconductor technology have been driven by the continuous demands of market forces for IC products with higher performance and greater functionality per unit area. To date industry has addressed these demands, principally, by scaling down device dimensions. However, several unintended consequences have undermined the benefits obtained from the advances in technology, firstly, the growing impact of process variations on interconnectivity delay, aggravated by the increase in the amount of interconnectivity as circuit complexity increases. Overall, the difficulty of establishing delay parameters in a circuit is adversely impacting on the attainment of the timing closure for a design. Secondly, the increase in the susceptibility of the circuits , even at ground level, to the effects of soft errors due to the reduction in supply voltages and nodal capacitances, together with the increase in the number of nodes in a circuit as the functionality per unit area increases. The aim of this research has been to model and analyse the reliability of logic circuits with regard to the impact of process variations and soft errors, and to finds ways to minimise these effects using different process technologies such as fully depleted silicon on insulator (FDSOI) and partially depleted silicon on insulator (PDSOI) technologies, together with the implementation of different circuit architectures. In view of the increased susceptibility of logic elements to the effects of process variations and soft errors as device geometries are reduced, a logic element which is not only widely used but also typical to asynchronous design is the Muller C-element, which can be realised in a number of different circuit configurations. The robustness of various C-element configurations implemented in different technologies with regard to the effects of process variations and soft errors was examined using the design of the experiment (DoE) and response surface (RSM) techniques. It was found that the circuits based on SOI technology were more robust compared with bulk silicon technology. On the other hand, from the circuit architecture perspective, the differential logic implementations of C-element were found to be more resilient to the effects of process variation and soft errors in comparison with the other C-element implementations investigated.EThOS - Electronic Theses Online ServiceMutah UniversityGBUnited Kingdo

    ESSE 2017. Proceedings of the International Conference on Environmental Science and Sustainable Energy

    Get PDF
    Environmental science is an interdisciplinary academic field that integrates physical-, biological-, and information sciences to study and solve environmental problems. ESSE - The International Conference on Environmental Science and Sustainable Energy provides a platform for experts, professionals, and researchers to share updated information and stimulate the communication with each other. In 2017 it was held in Suzhou, China June 23-25, 2017
    • …
    corecore