13 research outputs found

    Towards Energy-Efficient and Reliable Computing: From Highly-Scaled CMOS Devices to Resistive Memories

    Get PDF
    The continuous increase in transistor density based on Moore\u27s Law has led us to highly scaled Complementary Metal-Oxide Semiconductor (CMOS) technologies. These transistor-based process technologies offer improved density as well as a reduction in nominal supply voltage. An analysis regarding different aspects of 45nm and 15nm technologies, such as power consumption and cell area to compare these two technologies is proposed on an IEEE 754 Single Precision Floating-Point Unit implementation. Based on the results, using the 15nm technology offers 4-times less energy and 3-fold smaller footprint. New challenges also arise, such as relative proportion of leakage power in standby mode that can be addressed by post-CMOS technologies. Spin-Transfer Torque Random Access Memory (STT-MRAM) has been explored as a post-CMOS technology for embedded and data storage applications seeking non-volatility, near-zero standby energy, and high density. Towards attaining these objectives for practical implementations, various techniques to mitigate the specific reliability challenges associated with STT-MRAM elements are surveyed, classified, and assessed herein. Cost and suitability metrics assessed include the area of nanomagmetic and CMOS components per bit, access time and complexity, Sense Margin (SM), and energy or power consumption costs versus resiliency benefits. In an attempt to further improve the Process Variation (PV) immunity of the Sense Amplifiers (SAs), a new SA has been introduced called Adaptive Sense Amplifier (ASA). ASA can benefit from low Bit Error Rate (BER) and low Energy Delay Product (EDP) by combining the properties of two of the commonly used SAs, Pre-Charge Sense Amplifier (PCSA) and Separated Pre-Charge Sense Amplifier (SPCSA). ASA can operate in either PCSA or SPCSA mode based on the requirements of the circuit such as energy efficiency or reliability. Then, ASA is utilized to propose a novel approach to actually leverage the PV in Non-Volatile Memory (NVM) arrays using Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time

    Energy and Area Efficient Machine Learning Architectures using Spin-Based Neurons

    Get PDF
    Recently, spintronic devices with low energy barrier nanomagnets such as spin orbit torque-Magnetic Tunnel Junctions (SOT-MTJs) and embedded magnetoresistive random access memory (MRAM) devices are being leveraged as a natural building block to provide probabilistic sigmoidal activation functions for RBMs. In this dissertation research, we use the Probabilistic Inference Network Simulator (PIN-Sim) to realize a circuit-level implementation of deep belief networks (DBNs) using memristive crossbars as weighted connections and embedded MRAM-based neurons as activation functions. Herein, a probabilistic interpolation recoder (PIR) circuit is developed for DBNs with probabilistic spin logic (p-bit)-based neurons to interpolate the probabilistic output of the neurons in the last hidden layer which are representing different output classes. Moreover, the impact of reducing the Magnetic Tunnel Junction\u27s (MTJ\u27s) energy barrier is assessed and optimized for the resulting stochasticity present in the learning system. In p-bit based DBNs, different defects such as variation of the nanomagnet thickness can undermine functionality by decreasing the fluctuation speed of the p-bit realized using a nanomagnet. A method is developed and refined to control the fluctuation frequency of the output of a p-bit device by employing a feedback mechanism. The feedback can alleviate this process variation sensitivity of p-bit based DBNs. This compact and low complexity method which is presented by introducing the self-compensating circuit can alleviate the influences of process variation in fabrication and practical implementation. Furthermore, this research presents an innovative image recognition technique for MNIST dataset on the basis of p-bit-based DBNs and TSK rule-based fuzzy systems. The proposed DBN-fuzzy system is introduced to benefit from low energy and area consumption of p-bit-based DBNs and high accuracy of TSK rule-based fuzzy systems. This system initially recognizes the top results through the p-bit-based DBN and then, the fuzzy system is employed to attain the top-1 recognition results from the obtained top outputs. Simulation results exhibit that a DBN-Fuzzy neural network not only has lower energy and area consumption than bigger DBN topologies while also achieving higher accuracy

    High-Performance and Low-Power Magnetic Material Memory Based Cache Design

    Get PDF
    Magnetic memory technologies are very promising candidates to be universal memory due to its good scalability, zero standby power and radiation hardness. Having a cell area much smaller than SRAM, magnetic memory can be used to construct much larger cache with the same die footprint, leading to siginficant improvement of overall system performance and power consumption especially in this multi-core era. However, magnetic memories have their own drawbacks such as slow write, read disturbance and scaling limitation, making its usage as caches challenging. This dissertation comprehensively studied these two most popular magnetic memory technologies. Design exploration and optimization for the cache design from different design layers including the memory devices, peripheral circuit, memory array structure and micro-architecture are presented. By leveraging device features, two major micro-architectures -multi-retention cache hierarchy and process-variation-aware cache are presented to improve the write performance of STT-RAM. The enhancement in write performance results in the degradation of read operations, in terms of both speed and data reliability. This dissertation also presents an architecture to resolve STT-RAM read disturbance issue. Furthermore, the scaling of STT-RAM is hindered due to the required size of switching transistor. To break the cell area limitation of STT-RAM, racetrack memory is studied to achieve an even higher memory density and better performance and lower energy consumption. With dedicated elaboration, racetrack memory based cache design can achieve a siginificant area reduction and energy saving when compared to optimized STT-RAM

    Bio-inspired learning and hardware acceleration with emerging memories

    Get PDF
    Machine Learning has permeated many aspects of engineering, ranging from the Internet of Things (IoT) applications to big data analytics. While computing resources available to implement these algorithms have become more powerful, both in terms of the complexity of problems that can be solved and the overall computing speed, the huge energy costs involved remains a significant challenge. The human brain, which has evolved over millions of years, is widely accepted as the most efficient control and cognitive processing platform. Neuro-biological studies have established that information processing in the human brain relies on impulse like signals emitted by neurons called action potentials. Motivated by these facts, the Spiking Neural Networks (SNNs), which are a bio-plausible version of neural networks have been proposed as an alternative computing paradigm where the timing of spikes generated by artificial neurons is central to its learning and inference capabilities. This dissertation demonstrates the computational power of the SNNs using conventional CMOS and emerging nanoscale hardware platforms. The first half of this dissertation presents an SNN architecture which is trained using a supervised spike-based learning algorithm for the handwritten digit classification problem. This network achieves an accuracy of 98.17% on the MNIST test data-set, with about 4X fewer parameters compared to the state-of-the-art neural networks achieving over 99% accuracy. In addition, a scheme for parallelizing and speeding up the SNN simulation on a GPU platform is presented. The second half of this dissertation presents an optimal hardware design for accelerating SNN inference and training with SRAM (Static Random Access Memory) and nanoscale non-volatile memory (NVM) crossbar arrays. Three prominent NVM devices are studied for realizing hardware accelerators for SNNs: Phase Change Memory (PCM), Spin Transfer Torque RAM (STT-RAM) and Resistive RAM (RRAM). The analysis shows that a spike-based inference engine with crossbar arrays of STT-RAM bit-cells is 2X and 5X more efficient compared to PCM and RRAM memories, respectively. Furthermore, the STT-RAM design has nearly 6X higher throughput per unit Watt per unit area than that of an equivalent SRAM-based (Static Random Access Memory) design. A hardware accelerator with on-chip learning on an STT-RAM memory array is also designed, requiring 1616 bits of floating-point synaptic weight precision to reach the baseline SNN algorithmic performance on the MNIST dataset. The complete design with STT-RAM crossbar array achieves nearly 20X higher throughput per unit Watt per unit mm^2 than an equivalent design with SRAM memory. In summary, this work demonstrates the potential of spike-based neuromorphic computing algorithms and its efficient realization in hardware based on conventional CMOS as well as emerging technologies. The schemes presented here can be further extended to design spike-based systems that can be ubiquitously deployed for energy and memory constrained edge computing applications

    Low Power Memory/Memristor Devices and Systems

    Get PDF
    This reprint focusses on achieving low-power computation using memristive devices. The topic was designed as a convenient reference point: it contains a mix of techniques starting from the fundamental manufacturing of memristive devices all the way to applications such as physically unclonable functions, and also covers perspectives on, e.g., in-memory computing, which is inextricably linked with emerging memory devices such as memristors. Finally, the reprint contains a few articles representing how other communities (from typical CMOS design to photonics) are fighting on their own fronts in the quest towards low-power computation, as a comparison with the memristor literature. We hope that readers will enjoy discovering the articles within

    MRAM commercialization potential evaluation Research Based on the Chinese Market

    Get PDF
    Regarding the Chinese data storage industries, there is an urgent need for a national strategy and new investments as there are new technologies emerging in the global markets. The storage technology commercialization activities are becoming a widespread concern for the Chinese government and their strategic enterprises. The promotion of storage technology commercialization has become a common goal for enterprise and national government strategies. How the potential for commercialization of a storage technology can be assessed, what evaluation index should be used, and what the factors affect the storage technology are the important issues that must be addressed

    Addressing the RRAM Reliability and Radiation Soft-Errors in the Memory Systems

    Get PDF
    With the continuous and aggressive technology scaling, the design of memory systems becomes very challenging. The desire to have high-capacity, reliable, and energy efficient memory arrays is rising rapidly. However, from the technology side, the increasing leakage power and the restrictions resulting from the manufacturing limitations complicate the design of memory systems. In addition to this, with the new machine learning applications, which require tremendous amount of mathematical operations to be completed in a timely manner, the interest in neuromorphic systems has increased in recent years. Emerging Non- Volatile Memory (NVM) devices have been suggested to be incorporated in the design of memory arrays due to their small size and their ability to reduce leakage power since they can retain their data even in the absence of power supply. Compared to other novel NVM devices, the Resistive Random Access Memory (RRAM) device has many advantages including its low-programming requirements, the large ratio between its high and low resistive states, and its compatibility with the Complementary Metal Oxide Semiconductor (CMOS) fabrication process. RRAM device suffers from other disadvantages including the instability in its switching dynamics and its sensitivity to process variations. Yet, one of the popular issues hindering the deployment of RRAM arrays in products are the RRAM reliability and radiation soft-errors. The RRAM reliability soft-errors result from the diffusion of oxygen vacations out of the conductive channels within the oxide material of the device. On the other hand, the radiation soft-errors are caused by the highly energetic cosmic rays incident on the junction of the MOS device used as a selector for the RRAM cell. Both of those soft-errors cause the unintentional change of the resistive state of the RRAM device. While there is research work in literature to address some of the RRAM disadvantages such as the switching dynamic instability, there is no dedicated work discussing the impact of RRAM soft-errors on the various designs to which the RRAM device is integrated and how the soft-errors can be automatically detected and fixed. In this thesis, we bring the attention to the need of considering the RRAM soft-errors to avoid the degradation in design performance. In addition to this, using previously reported SPICE models, which were experimentally verified, and widely adapted system level simulators and test benches, various solutions are provided to automatically detect and fix the degradation in design performance due to the RRAM soft-errors. The main focus in this work is to propose methodologies which solve or improve the robustness of memory systems to the RRAM soft-errors. These memories are expected to be incorporated in the current and futuristic platforms running the advanced machine learning applications. In more details, the main contributions of this thesis can be summarized as: - Provide in depth analysis of the impact of RRAM soft-errors on the performance of RRAM-based designs. - Provide a new SRAM cell which uses the RRAM device to reduce the SRAM leakage power with minimal impact on its read and write operations. This new SRAM cell can be incorporated in the Graphical Processing Unit (GPU) design used currently in the implementation of the machine learning platforms. - Provide a circuit and system solutions to resolve the reliability and radiation soft-errors in the RRAM arrays. These solution can automatically detect and fix the soft-errors with minimum impact on the delay and energy consumption of the memory array. - A framework is developed to estimate the effect of RRAM soft-errors on the performance of RRAM-based neuromorphic systems. This actually provides, for the first time, a very generic methodology through which the device level RRAM soft-errors are mapped to the overall performance of the neuromorphic systems. Our analysis show that the accuracy of the RRAM-based neuromorphic system can degrade by more than 48% due to RRAM soft-errors. - Two algorithms are provided to automatically detect and restore the degradation in RRAM-based neuromorphic systems due to RRAM soft-errors. The system and circuit level techniques to implement these algorithms are also explained in this work. In conclusion, this work offers initial steps for enabling the usage of RRAM devices in products by tackling one of its most known challenges: RRAM reliability and radiation soft-errors. Despite using experimentally verified SPICE models and widely popular system simulators and test benches, the provided solutions in this thesis need to be verified in the future work through fabrication to study the impact of other RRAM technology shortcomings including: a) the instability in its switching dynamics due to the stochastic nature of oxygen vacancies movement, and b) its sensitivity to process variations

    Nouvelles Architectures Hybrides (Logique / Mémoires Non-Volatiles et technologies associées.)

    Get PDF
    Les nouvelles approches de technologies mémoires permettront une intégration dite back-end, où les cellules élémentaires de stockage seront fabriquées lors des dernières étapes de réalisation à grande échelle du circuit. Ces approches innovantes sont souvent basées sur l'utilisation de matériaux actifs présentant deux états de résistance distincts. Le passage d'un état à l'autre est contrôlé en courant ou en tension donnant lieu à une caractéristique I-V hystérétique. Nos mémoires résistives sont composées d'argent en métal électrochimiquement actif et de sulfure amorphe agissant comme électrolyte. Leur fonctionnement repose sur la formation réversible et la dissolution d'un filament conducteur. Le potentiel d'application de ces nouveaux dispositifs n'est pas limité aux mémoires ultra-haute densité mais aussi aux circuits embarqués. En empilant ces mémoires dans la troisième dimension au niveau des interconnections des circuits logiques CMOS, de nouvelles architectures hybrides et innovantes deviennent possibles. Il serait alors envisageable d'exploiter un fonctionnement à basse énergie, à haute vitesse d'écriture/lecture et de haute performance telles que l'endurance et la rétention. Dans cette thèse, en se concentrant sur les aspects de la technologie de mémoire en vue de développer de nouvelles architectures, l'introduction d'une fonctionnalité non-volatile au niveau logique est démontrée par trois circuits hybrides: commutateurs de routage non volatiles dans un Field Programmable Gate Arrays, un 6T-SRAM non volatile, et les neurones stochastiques pour un réseau neuronal. Pour améliorer les solutions existantes, les limitations de la performances des dispositifs mémoires sont identifiés et résolus avec des nouveaux empilements ou en fournissant des défauts de circuits tolérants.Novel approaches in the field of memory technology should enable backend integration, where individual storage nodes will be fabricated during the last fabrication steps of the VLSI circuit. In this case, memory operation is often based upon the use of active materials with resistive switching properties. A topology of resistive memory consists of silver as electrochemically active metal and amorphous sulfide acting as electrolyte and relies on the reversible formation and dissolution of a conductive filament. The application potential of these new memories is not limited to stand-alone (ultra-high density), but is also suitable for embedded applications. By stacking these memories in the third dimension at the interconnection level of CMOS logic, new ultra-scalable hybrid architectures becomes possible which exploit low energy operation, fast write/read access and high performance with respect to endurance and retention. In this thesis, focusing on memory technology aspects in view of developing new architectures, the introduction of non-volatile functionality at the logic level is demonstrated through three hybrid (CMOS logic ReRAM devices) circuits: nonvolatile routing switches in a Field Programmable Gate Array, nonvolatile 6T-SRAMs, and stochastic neurons of an hardware neural network. To be competitive or even improve existing solutions, limitations on the memory devices performances are identified and solved by stack engineering of CBRAM devices or providing faults tolerant circuits.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF

    Neuromorphic systems based on memristive devices - From the material science perspective to bio-inspired learning hardware

    Get PDF
    Hardware computation is facing in the present age a deep transformation of its own paradigms. Silicon based computation is reaching its limit due to the physical constraints of transistor technology. As predicted by the Moore’s law, downscaling of transistor dimensions doubled each year since the 60s, leading nowadays to the extreme of 16-nm channel width of the present state-of-the-art technology. No further improvement is possible, since laws of physics impose a different electrical behavior when lower dimensions are attempted. Multiple solutions are then envisaged, spanning the range from quantum computing to neuromorphic computing. The present dissertation wants to be a preliminary study for understanding the opportunities enabled by neuromorphic computing based on resistive switching memories. In particular, brain inspires technology and architecture of new generation processors because of its unique properties: parallel and distributed computation, superposition of processing and memory unit, low power consumption, to cite only some of them. Such features make brain particularly efficient and robust against degraded data, further than particularly suitable to process and store in memory new nformation. Despite many research projects and some commercial products are already proposing brain-like computing processors, like spiNNaker or IBM’s Bluenorth, they only mimic the brain functioning with standard Silicon technology, that is inherently serial and distinguish between processing and memory unit. Resistive switching technology on the other hand, would allow to overcome many of these issues, enabling a far better match between biological and artificial neuromorphic computation. Resistive switching are, generally speaking, Metal-Insulator-Metal structures able to change their electrical conductance as a consequence of the history of applied electric signal. In such sense, they behave exactly as synapses do in a biological neural networks. For this reason, resistive switching when modeled as memristor, i.e. memory-resistor, can act as artificial synapses and, moreover, are particularly suitable to be interfaced with artificial Silicon neurons that are designed to replicate the biological behavior when excited with electric pulses. Anyhow, from the technological standpoint, there is still no standard on the design and fabrication of resistive switching, so that multiple structure and materials are investigated. In this dissertation, it is reported an analysis of multiple resistive switching devices, based on various materials, i.e. TiO2, ZnO and HfO, and device architectures, i.e. thin film and nanostructured devices, with the scope of both characterizing and comprehending the physics behind resistive switching phenomena. Furthermore, numerical simulations of artificial spiking neural networks, embedding Silicon neurons and HfO-based resistive switching are designed and performed, in order to give a systematic analysis of the performances reached by this new kind of computing paradigm

    Caractérisation et conception d' architectures basées sur des mémoires à changement de phase

    Get PDF
    Semiconductor memory has always been an indispensable component of modern electronic systems. The increasing demand for highly scaled memory devices has led to the development of reliable non-volatile memories that are used in computing systems for permanent data storage and are capable of achieving high data rates, with the same or lower power dissipation levels as those of current advanced memory solutions.Among the emerging non-volatile memory technologies, Phase Change Memory (PCM) is the most promising candidate to replace conventional Flash memory technology. PCM offers a wide variety of features, such as fast read and write access, excellent scalability potential, baseline CMOS compatibility and exceptional high-temperature data retention and endurance performances, and can therefore pave the way for applications not only in memory devices, but also in energy demanding, high-performance computer systems. However, some reliability issues still need to be addressed in order for PCM to establish itself as a competitive Flash memory replacement.This work focuses on the study of embedded Phase Change Memory in order to optimize device performance and propose solutions to overcome the key bottlenecks of the technology, targeting high-temperature applications. In order to enhance the reliability of the technology, the stoichiometry of the phase change material was appropriately engineered and dopants were added, resulting in an optimized thermal stability of the device. A decrease in the programming speed of the memory technology was also reported, along with a residual resistivity drift of the low resistance state towards higher resistance values over time.A novel programming technique was introduced, thanks to which the programming speed of the devices was improved and, at the same time, the resistance drift phenomenon could be successfully addressed. Moreover, an algorithm for programming PCM devices to multiple bits per cell using a single-pulse procedure was also presented. A pulse generator dedicated to provide the desired voltage pulses at its output was designed and experimentally tested, fitting the programming demands of a wide variety of materials under study and enabling accurate programming targeting the performance optimization of the technology.Les mémoires à base de semi-conducteur sont indispensables pour les dispositifs électroniques actuels. La demande croissante pour des dispositifs mémoires fortement miniaturisées a entraîné le développement de mémoires non volatiles fiables qui sont utilisées dans des systèmes informatiques pour le stockage de données et qui sont capables d'atteindre des débits de données élevés, avec des niveaux de dissipation d'énergie équivalents voire moindres que ceux des technologies mémoires actuelles.Parmi les technologies de mémoires non-volatiles émergentes, les mémoires à changement de phase (PCM) sont le candidat le plus prometteur pour remplacer la technologie de mémoire Flash conventionnelle. Les PCM offrent une grande variété de fonctions, comme une lecture et une écriture rapide, un excellent potentiel de miniaturisation, une compatibilité CMOS et des performances élevées de rétention de données à haute température et d'endurance, et peuvent donc ouvrir la voie à des applications non seulement pour les dispositifs mémoires, mais également pour les systèmes informatiques à hautes performances. Cependant, certains problèmes de fiabilité doivent encore être résolus pour que les PCM se positionnent comme un remplacement concurrentiel de la mémoire Flash.Ce travail se concentre sur l'étude de mémoires à changement de phase intégrées afin d'optimiser leurs performances et de proposer des solutions pour surmonter les principaux points critiques de la technologie, ciblant des applications à hautes températures. Afin d'améliorer la fiabilité de la technologie, la stœchiométrie du matériau à changement de phase a été conçue de façon appropriée et des dopants ont été ajoutés, optimisant ainsi la stabilité thermique. Une diminution de la vitesse de programmation est également rapportée, ainsi qu'un drift résiduel de la résistance de l'état de faiblement résistif vers des valeurs de résistance plus élevées au cours du temps.Une nouvelle technique de programmation est introduite, permettant d'améliorer la vitesse de programmation des dispositifs et, dans le même temps, de réduire avec succès le phénomène de drift en résistance. Par ailleurs, un algorithme de programmation des PCM multi-bits est présenté. Un générateur d'impulsions fournissant des impulsions avec la tension souhaitée en sortie a été conçu et testé expérimentalement, répondant aux demandes de programmation d'une grande variété de matériaux innovants et en permettant la programmation précise et l’optimisation des performances des PCM
    corecore