1,078 research outputs found
Efficient realization of RTD-CMOS logic gates
The incorporation of Resonant Tunnel Diodes (RTDs) into III/V transistor technologies has shown an improved circuit performance: higher circuit speed, reduced component count, and/or lowered power consumption. Currently, the incorporation of these devices into CMOS technologies (RTD-CMOS) is an area of active research. Although some works have focused the evaluation of the advantages of this incorporation, additional work in this direction is required. This paper compares RTD-CMOS and pure CMOS realizations of a set of logic gates which can be operated in a gate-level nanopipelined. Lower average power and energy per cycle are obtained for RTD/CMOS implementations.Spanish Ministry of Education and Science with support from ERDF under Project TEC2007- 67245Consejería de Innovación, Ciencia y Empresa, Junta de Andalucía under Project TIC-296
RTD-CMOS pipelined networks for reduced power consumption
The incorporation of resonant tunneling diodes (RTDs) into III/V transistor technologies has shown an improved circuit performance, producing higher circuit speed, reduced component count, and/or lower power consumption. Currently, the incorporation of these devices into CMOS technologies (RTD-CMOS) is an area of active research. Although some studies have concentrated on evaluating the advantages of this incorporation, more work in this direction is required. In this letter, we compare RTD-CMOS and pure CMOS realizations of a logic gate network which can be operated in a gate-level pipeline. Significantly lower average power is obtained for RTD-CMOS implementations.Gobierno de España TEC2007-67245, TEC2010-18937Junta de Andalucía TIC-296
Utilizing Magnetic Tunnel Junction Devices in Digital Systems
The research described in this dissertation is motivated by the desire to effectively utilize magnetic tunnel junctions (MTJs) in digital systems. We explore two aspects of this: (1) a read circuit useful for global clocking and magnetologic, and (2) hardware virtualization that utilizes the deeply-pipelined nature of magnetologic.
In the first aspect, a read circuit is used to sense the state of an MTJ (low or high resistance) and produce a logic output that represents this state. With global clocking, an external magnetic field combined with on-chip MTJs is used as an alternative mechanism for distributing the clock signal across the chip. With magnetologic, logic is evaluated with MTJs that must be sensed by a read circuit and used to drive downstream logic. For these two uses, we develop a resistance-to-voltage (R2V) read circuit to sense MTJ resistance and produce a logic voltage output. We design and fabricate a prototype test chip in the 3 metal 2 poly 0.5 um process for testing the R2V read circuit and experimentally validating its correctness. Using a clocked low/high resistor pair, we show that the read circuit can correctly detect the input resistance and produce the desired square wave output. The read circuit speed is measured to operate correctly up to 48 MHz. The input node is relatively insensitive to node capacitance and can handle up to 10s of pF of capacitance without changing the bandwidth of the circuit.
In the second aspect, hardware virtualization is a technique by which deeply-pipelined circuits that have feedback can be utilized. MTJs have the potential to act as state in a magnetologic circuit which may result in a deep pipeline. Streams of computation are then context switched into the hardware logic, allowing them to share hardware resources and more fully utilize the pipeline stages of the logic. While applicable to magnetologic using MTJs, virtualization is also applicable to traditional logic technologies like CMOS. Our investigation targets MTJs, FPGAs, and ASICs. We develop M/D/1 and M/G/1 queueing models of the performance of virtualized hardware with secondary memory using a fixed, hierarchical, round-robin schedule that predict average throughput, latency, and queue occupancy in the system. We develop three C-slow applications and calibrate them to a clock and resource model for FPGA and ASIC technologies. Last, using the M/G/1 model, we predict throughput, latency, and resource usage for MTJ, FPGA, and ASIC technologies. We show three design scenarios illustrating ways in which to use the model
Analog/RF Circuit Design Techniques for Nanometerscale IC Technologies
CMOS evolution introduces several problems in analog design. Gate-leakage mismatch exceeds conventional matching tolerances requiring active cancellation techniques or alternative architectures. One strategy to deal with the use of lower supply voltages is to operate critical parts at higher supply voltages, by exploiting combinations of thin- and thick-oxide transistors. Alternatively, low voltage circuit techniques are successfully developed. In order to benefit from nanometer scale CMOS technology, more functionality is shifted to the digital domain, including parts of the RF circuits. At the same time, analog control for digital and digital control for analog emerges to deal with current and upcoming imperfections
Exploring Spin-transfer-torque devices and memristors for logic and memory applications
As scaling CMOS devices is approaching its physical limits, researchers have begun exploring newer devices and architectures to replace CMOS.
Due to their non-volatility and high density, Spin Transfer Torque (STT) devices are among the most prominent candidates for logic and memory applications. In this research, we first considered a new logic style called All Spin Logic (ASL). Despite its advantages, ASL consumes a large amount of static power; thus, several optimizations can be performed to address this issue. We developed a systematic methodology to perform the optimizations to ensure stable operation of ASL.
Second, we investigated reliable design of STT-MRAM bit-cells and addressed the conflicting read and write requirements, which results in overdesign of the bit-cells. Further, a Device/Circuit/Architecture co-design framework was developed to optimize the STT-MRAM devices by exploring the design space through jointly considering yield enhancement techniques at different levels of abstraction.
Recent advancements in the development of memristive devices have opened new opportunities for hardware implementation of non-Boolean computing. To this end, the suitability of memristive devices for swarm intelligence algorithms has enabled researchers to solve a maze in hardware. In this research, we utilized swarm intelligence of memristive networks to perform image edge detection. First, we proposed a hardware-friendly algorithm for image edge detection based on ant colony. Next, we designed the image edge detection algorithm using memristive networks
Energy challenges for ICT
The energy consumption from the expanding use of information and communications technology (ICT) is unsustainable with present drivers, and it will impact heavily on the future climate change. However, ICT devices have the potential to contribute signi - cantly to the reduction of CO2 emission and enhance resource e ciency in other sectors, e.g., transportation (through intelligent transportation and advanced driver assistance systems and self-driving vehicles), heating (through smart building control), and manu- facturing (through digital automation based on smart autonomous sensors). To address the energy sustainability of ICT and capture the full potential of ICT in resource e - ciency, a multidisciplinary ICT-energy community needs to be brought together cover- ing devices, microarchitectures, ultra large-scale integration (ULSI), high-performance computing (HPC), energy harvesting, energy storage, system design, embedded sys- tems, e cient electronics, static analysis, and computation. In this chapter, we introduce challenges and opportunities in this emerging eld and a common framework to strive towards energy-sustainable ICT
Design and Robustness Analysis on Non-volatile Storage and Logic Circuit
By combining the flexibility of MOS logic and the non-volatility of spintronic devices, spin-MOS logic and storage circuitry offer a promising approach to implement highly integrated, power-efficient, and nonvolatile computing and storage systems. Besides the persistent errors due to process variations, however, the functional correctness of Spin-MOS circuitry suffers from additional non-persistent errors that are incurred by the randomness of spintronic device operations, i.e., thermal fluctuations. This work quantitatively investigates the impact of thermal fluctuations on the operations of two typical Spin-MOS circuitry: one transistor and one magnetic tunnel junction (1T1J) spin-transfer torque random access memory (STT-RAM) cell and a nonvolatile latch design. A new nonvolatile latch design is proposed based on magnetic tunneling junction (MTJ) devices. In the standby mode, the latched data can be retained in the MTJs without consuming any power. Two types of operation errors can occur, namely, persistent and non-persistent errors. These are quantitatively analyzed by including models for process variations and thermal fluctuations during the read and write operations. A mixture importance sampling methodology is applied to enable yield-driven design and extend its application beyond memories to peripheral circuits and logic blocks. Several possible design techniques to reduce thermal induced non-persistent error rate are also discussed
Naturalized Communication and Testing
We ”naturalize” the handshake communication links of a self-timed system by assigning the capabilities of filling and draining a link and of storing its full or empty status to the link itself. This contrasts with assigning these capabilities to the joints, the modules connected by the links, as was previously done. Under naturalized communication, the differences between Micropipeline, GasP, Mousetrap, and Click circuits are seen only in the links — the joints become identical; past, present, and future link and joint designs become interchangeable. We also “naturalize” the actions of a self-timed system, giving actions status equal to states — for the purpose of silicon test and debug. We partner traditional scan test techniques dedicated to state with new test capabilities dedicated to action. To each and every joint, we add a novel proper-start-stop circuit, called MrGO, that permits or forbids the action of that joint. MrGO, pronounced “Mister GO,” makes it possible to (1) exit an initial state cleanly to start circuit operation in a delay-insensitive manner, (2) stop a running circuit in a clean and delay-insensitive manner, (3) single- or multi-step circuit operations for test and debug, and (4) test sub-systems at speed.We present a static control flow analysis used in the Simple Unified Policy Programming Language(Suppl) compiler to detect internally inconsistent policies. For example, an access control policy can decide to both “allow” and “deny” access for a user; such an inconsistency is called a conflict. Policies in Suppl. follow the Event-Condition-Action paradigm; predicates are used to model conditions and event handlers are written in an imperative way. The analysis is twofold; it first computes a superset of all conflicts by looking for a combination of actions in the event handlers that might violate a user-supplied definition of conflicts. SMT solvers are then used to try to rule out the combinations that cannot possibly be executed. The analysis is formally proven sound in Coq in the sense that no actual conflict will be ruled out by the SMT solvers. Finally, we explain how we try to show the user what causes the conflicts, to make them easier to solve
Power Modeling and Optimization for GPGPUs
Modern graphics processing units (GPUs) supports tens of thousands of parallel threads and delivers remarkably high computing throughput. General-Purpose computing on GPUs (GPGPUs) is becoming the attractive platform for general-purpose applications that request high computational performance such as scientific computing, financial applications, medical data processing, and so on. However, GPGPUs is facing severe power challenge due to the increasing number of cores placed on a single chip with decreasing feature size. In order to explore the power optimization techniques in GPGPUs, I first build a power model for GPGPUs, which is able to estimate both dynamic and leakage power of major microarchitecture structures in GPGPUs. I then target on the power-hungry structures (e.g. register file) to explore the energy-efficient GPGPUs. In order to hide the long latency operations, GPGPUs employs the fine-grained multi-threading among numerous active threads, leading to the sizeable register files with massive power consumption. The conventional method to reduce dynamic power consumption is the supply voltage scaling. And the inter-bank tunneling FETs (TFETs) is the promising candidate compared to CMOS for low voltage operations regarding to both leakage and performance. However, always executing at the low voltage will result in significant performance degradation. In this study, I propose the hybrid CMOS-TFET based register file and allocate TFET-based registers to threads whose execution progress can be delayed to some degree to avoid the memory contentions with other threads to reduce both dynamic and leakage power, and the CMOS-based registers are still used for threads requiring normal execution speed. My experimental results show that the proposed technique achieves 30% energy (including both dynamic and leakage) reduction in register files with negligible performance degradation compared to the baseline case equipped with naive power optimization technique
- …