71 research outputs found

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Get PDF
    Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems during both inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    Assessment and Improvement of the Pattern Recognition Performance of Memdiode-Based Cross-Point Arrays with Randomly Distributed Stuck-at-Faults

    Get PDF
    In this work, the effect of randomly distributed stuck-at faults (SAFs) in memristive cross-point array (CPA)-based single- and multi-layer perceptrons (SLPs and MLPs, respectively) intended for pattern recognition tasks is investigated by means of realistic SPICE simulations. The quasi-static memdiode model (QMM) is considered here for the modelling of the synaptic weights implemented with memristors. Following the standard memristive approach, the QMM comprises two coupled equations: one for the electron transport, based on the double-diode equation with a single series resistance, and a second for the internal memory state of the device, based on the so-called logistic hysteron. By modifying the state parameter in the current-voltage characteristic, SAFs of different severity are simulated and the final outcome is analysed. Supervised ex-situ training and two well-known image datasets involving hand-written digits and human faces are employed to assess the inference accuracy of the SLP as a function of the faulty device ratio. The roles played by the memristor's electrical parameters, line resistance, mapping strategy, image pixelation, and fault type (stuck-at-ON or stuck-at-OFF) on the CPA performance are statistically analysed following a Monte Carlo approach. Three different re-mapping schemes to help mitigate the effect of the SAFs in the SLP inference phase are thoroughly investigated.
    Authors: Fernando Leonel Aguirre (Universidad Tecnológica Nacional, Facultad Regional Buenos Aires; Universitat Autònoma de Barcelona; CONICET); Sebastián Matías Pazos (Universidad Tecnológica Nacional, Facultad Regional Buenos Aires; CONICET); Félix Roberto Mario Palumbo (Universidad Tecnológica Nacional, Facultad Regional Buenos Aires; CONICET); Antoni Morell (Universitat Autònoma de Barcelona); Jordi Suñé (Universitat Autònoma de Barcelona); Enrique Miranda (Universitat Autònoma de Barcelona)
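    The fault-injection study described above can be sketched in miniature with NumPy: a toy single-layer perceptron's weights are mapped onto a differential pair of conductances, a random fraction of devices is pinned to the ON state, and inference accuracy is re-estimated over Monte Carlo trials. All sizes, conductance values, and the mapping scheme are illustrative assumptions, not the paper's SPICE setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" SLP: 16 inputs -> 4 classes (illustrative stand-in).
W = rng.normal(0.0, 1.0, size=(4, 16))
X = rng.normal(0.0, 1.0, size=(200, 16))
y = np.argmax(X @ W.T, axis=1)          # labels the fault-free SLP gets right

G_ON, G_OFF = 1e-3, 1e-6                # bounding conductances (assumed values)

def map_to_conductances(W):
    """Linearly map signed weights onto a differential conductance pair."""
    w = W / np.abs(W).max()
    Gp = G_OFF + (G_ON - G_OFF) * np.clip(w, 0, 1)
    Gn = G_OFF + (G_ON - G_OFF) * np.clip(-w, 0, 1)
    return Gp, Gn

def inject_saf(G, ratio, rng, stuck_on=True):
    """Pin a random fraction of devices to G_ON (stuck-at-ON) or G_OFF."""
    G = G.copy()
    mask = rng.random(G.shape) < ratio
    G[mask] = G_ON if stuck_on else G_OFF
    return G

def accuracy(Gp, Gn):
    out = X @ (Gp - Gn).T               # differential read-out currents
    return float(np.mean(np.argmax(out, axis=1) == y))

Gp, Gn = map_to_conductances(W)
acc_clean = accuracy(Gp, Gn)
accs = [accuracy(inject_saf(Gp, 0.2, rng), inject_saf(Gn, 0.2, rng))
        for _ in range(50)]             # Monte Carlo over fault placements
acc_faulty = float(np.mean(accs))
print(acc_clean, acc_faulty)
```

    Varying the fault ratio in such a loop reproduces, in spirit, the accuracy-versus-faulty-device-ratio curves the paper studies.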

    Enabling Neuromorphic Computing for Artificial Intelligence with Hardware-Software Co-Design

    Get PDF
    In the last decade, neuromorphic computing was revived by the emergence of novel nano-devices and hardware-software co-design approaches. With the fast advancement of algorithms for today's artificial intelligence (AI) applications, deep neural networks (DNNs) have become the mainstream technology, and enabling neuromorphic designs for DNN computing with high speed and energy efficiency has become a new research trend. In this chapter, we summarize the recent advances in neuromorphic computing hardware and system designs based on non-volatile resistive random-access memory (ReRAM) devices. More specifically, we discuss ReRAM-based neuromorphic computing hardware and system implementations, hardware-software co-design approaches for quantized and sparse DNNs, and architecture designs.
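    As a toy illustration of the quantized-and-sparse co-design direction mentioned above, the sketch below applies magnitude pruning followed by uniform quantization to a random weight matrix, mimicking a ReRAM cell that offers only a limited number of conductance states. All names and parameter values are illustrative assumptions, not the chapter's method.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.5, size=(64, 64))   # illustrative layer weights

def prune_magnitude(W, sparsity):
    """Zero out the smallest-magnitude weights; zeros map to OFF-state cells."""
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) >= thresh, W, 0.0)

def quantize_uniform(W, bits):
    """Uniform symmetric quantization to at most 2**bits - 1 levels,
    mimicking the discrete conductance states of a ReRAM device."""
    levels = 2 ** bits - 1
    scale = np.abs(W).max() / (levels // 2)
    return np.round(W / scale) * scale

Wq = quantize_uniform(prune_magnitude(W, sparsity=0.5), bits=4)
err = np.abs(W - Wq).mean()               # mapping error vs. the ideal weights
density = float(np.mean(Wq != 0))         # fraction of cells left programmed
print(err, density)
```

    Pruning before quantization keeps the sparsity target exact, since the quantizer maps zeros back to zero.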

    Committee Machines—A Universal Method to Deal with Non-Idealities in Memristor-Based Neural Networks

    Get PDF
    Artificial neural networks are notoriously power- and time-consuming when implemented on conventional von Neumann computing systems. Consequently, recent years have seen an emergence of research in machine learning hardware that strives to bring memory and computing closer together. A popular approach is to realise artificial neural networks in hardware by implementing their synaptic weights using memristive devices. However, various device- and system-level non-idealities usually prevent these physical implementations from achieving high inference accuracy. We suggest applying a well-known concept in computer science, committee machines, in the context of memristor-based neural networks. Using simulations and experimental data from three different types of memristive devices, we show that committee machines employing ensemble averaging can successfully increase inference accuracy in physically implemented neural networks that suffer from faulty devices, device-to-device variability, random telegraph noise and line resistance. Importantly, we demonstrate that the accuracy can be improved even without increasing the total number of memristors.
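    The ensemble-averaging idea can be sketched as follows: several copies of the same toy network, each corrupted by assumed fault and variability models, form a committee that votes by averaging its members' output scores. The non-ideality models, sizes, and seeds here are illustrative assumptions, not the paper's device data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy task: 8-dimensional inputs, 3 classes, an ideal reference network.
W_true = rng.normal(size=(3, 8))
X = rng.normal(size=(300, 8))
y = np.argmax(X @ W_true.T, axis=1)

def faulty_copy(W, rng, fault_ratio=0.15):
    """One committee member: the same weights with some devices stuck at zero
    and mild device-to-device variability (both assumed non-ideality models)."""
    W = W * rng.normal(1.0, 0.2, size=W.shape)      # programming variability
    W = W.copy()
    W[rng.random(W.shape) < fault_ratio] = 0.0      # faulty devices
    return W

def accuracy(scores):
    return float(np.mean(np.argmax(scores, axis=1) == y))

members = [faulty_copy(W_true, rng) for _ in range(7)]
single_accs = [accuracy(X @ W.T) for W in members]
committee_acc = accuracy(np.mean([X @ W.T for W in members], axis=0))
print(np.mean(single_accs), committee_acc)
```

    Averaging the scores tends to cancel the independent per-member errors, which is why the committee usually beats a typical single member without adding any devices beyond the members themselves.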

    Committee Machines—A Universal Method to Deal with Non-Idealities in RRAM-Based Neural Networks

    Get PDF
    Artificial neural networks (ANNs) are notoriously power- and time-consuming when implemented on conventional von Neumann computing systems. Recent years have seen an emergence of research in hardware that strives to break the bottleneck of the von Neumann architecture and optimise the data flow; namely, to bring memory and computing closer together. One of the most often suggested solutions is the physical implementation of ANNs in which their synaptic weights are realised with analogue resistive devices, such as resistive random-access memory (RRAM). However, various device- and system-level non-idealities usually prevent these physical implementations from achieving high inference accuracy. We suggest applying a well-known concept in computer science, the committee machine (CM), in the context of RRAM-based neural networks. Using simulations and experimental data from three different types of RRAM devices, we show that CMs employing ensemble averaging can successfully increase inference accuracy in physically implemented neural networks that suffer from faulty devices, programming non-linearities, random telegraph noise, cycle-to-cycle variability and line resistance. Importantly, we show that the accuracy can be improved even without increasing the number of devices.

    MFPA: Mixed-Signal Field Programmable Array for Energy-Aware Compressive Signal Processing

    Get PDF
    Compressive Sensing (CS) is a signal processing technique which reduces the number of samples taken per frame to decrease energy, storage, and data transmission overheads, as well as the time taken for data acquisition in time-critical applications. The trade-off in such an approach is the increased complexity of signal reconstruction. While several algorithms have been developed for CS signal reconstruction, hardware implementation of these algorithms is still an area of active research. Prior work has sought to exploit the parallelism available in reconstruction algorithms to minimize hardware overheads; however, such approaches are limited by the underlying limitations of CMOS technology. Herein, the MFPA (Mixed-signal Field Programmable Array) approach is presented as a hybrid spin-CMOS reconfigurable fabric specifically designed for the implementation of CS data sampling and signal reconstruction. The resulting fabric consists of 1) slice-organized analog blocks providing amplifiers, transistors, capacitors, and Magnetic Tunnel Junctions (MTJs), which are configurable to achieve the square/square-root operations required for calculating vector norms, 2) digital functional blocks featuring 6-input clockless lookup tables for computation of the matrix inverse, and 3) an MRAM-based non-volatile crossbar array for carrying out low-energy matrix-vector multiplication operations. The various functional blocks are connected via a global interconnect and spin-based analog-to-digital converters. Simulation results demonstrate significant energy and area benefits compared to equivalent CMOS digital implementations for each of the functional blocks used: an 80% reduction in energy and a 97% reduction in transistor count for the non-volatile crossbar array, an 80% standby power reduction and a 25% smaller area footprint for the clockless lookup tables, and roughly a 97% reduction in transistor count for a multiplier built from components of the analog blocks. Moreover, the proposed fabric yields a 77% energy reduction compared to CMOS when used to implement CS reconstruction, in addition to latency improvements.
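    As a software-level illustration of the reconstruction problem MFPA targets, the sketch below samples a sparse signal with a random matrix and recovers it with Orthogonal Matching Pursuit, one of the standard CS reconstruction algorithms. The sizes and the choice of algorithm are illustrative assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(3)

n, m, k = 64, 32, 4                      # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.normal(size=k)   # k-sparse signal
Phi = rng.normal(size=(m, n)) / np.sqrt(m)                # sampling matrix
y = Phi @ x                              # compressed measurements (m < n)

def omp(Phi, y, k):
    """Orthogonal Matching Pursuit: greedily pick the atom most correlated
    with the residual, then re-fit by least squares on the chosen support."""
    support, residual = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

x_hat = omp(Phi, y, k)
err = float(np.linalg.norm(x - x_hat) / np.linalg.norm(x))
print(err)
```

    The inner correlation step (`Phi.T @ residual`) and the least-squares re-fit are exactly the matrix-vector multiplication and norm/inverse computations the fabric's crossbar array and functional blocks are built to accelerate.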

    Neural networks robust to machine-learning variability on passive crossbars of TiO2-based resistive memories

    Get PDF
    The presence of neural networks in our daily lives is growing exponentially. Social networks, search engines, and e-commerce are only a few examples of domains that rely on them constantly. The size of these networks, like their ubiquity, has also grown steadily. Yet the more parameters a neural network has, the more energy it consumes, since it requires ever more successive memory accesses to fetch and/or update them. It is this bottleneck between the processor and the data that dramatically limits the energy efficiency of neural networks implemented on the von Neumann architecture. A promising approach to circumvent this problem is to use an electronic component that performs, directly in memory, an analogue version of the matrix-vector multiplication operations essential to machine learning: resistive memories. However, the inherent defects that afflict these devices make them difficult to use as discrete, precise synaptic weights. It therefore becomes very attractive to account for certain sources of variability of these emerging memories when training neural networks, exploiting their learning abilities so that they can classify effectively despite their numerous non-idealities. Such stochastic networks are thus robust to the hardware variability of the devices. This master's project investigated the potential of neural networks that account for these stochastic defects. New techniques for characterizing the variability of crossbars and resistive memories were developed. Using these data to inject noise into the network's synaptic weights during training produces a more robust network.
    Simulations show that a non-ideality-aware network is considerably more accurate than a naively trained network on the same half-moons classification task when variability is taken into account.
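    A minimal sketch of the noise-injection idea, assuming a logistic-regression stand-in for the network and a simple separable toy dataset in place of the half-moons task: weights are perturbed with Gaussian noise at every forward pass during training, and both the naive and the noise-aware model are then evaluated under multiplicative read noise. All parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy two-class problem standing in for the half-moons task (illustrative).
X = np.vstack([rng.normal(-2, 0.7, size=(100, 2)),
               rng.normal(+2, 0.7, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(noise_std, epochs=200, lr=0.1):
    """Logistic regression trained with Gaussian noise injected into the
    weights at every forward pass (noise_std=0 recovers naive training)."""
    w, b = np.zeros(2), 0.0
    for _ in range(epochs):
        w_noisy = w + rng.normal(0.0, noise_std, size=2)
        p = sigmoid(X @ w_noisy + b)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def noisy_accuracy(w, b, noise_std=0.3, trials=100):
    """Monte Carlo accuracy with multiplicative read noise on the weights."""
    accs = []
    for _ in range(trials):
        wn = w * rng.normal(1.0, noise_std, size=2)
        accs.append(np.mean((sigmoid(X @ wn + b) > 0.5) == y))
    return float(np.mean(accs))

w_naive, b_naive = train(noise_std=0.0)
w_aware, b_aware = train(noise_std=0.5)
print(noisy_accuracy(w_naive, b_naive), noisy_accuracy(w_aware, b_aware))
```

    In the thesis the noise statistics come from measured crossbar characterization data rather than an assumed Gaussian, which is what makes the trained network robust to the actual hardware.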

    Neuromorphic Computing with Resistive Switching Devices.

    Full text link
    Resistive switches, commonly referred to as resistive random-access memory (RRAM) devices and modeled as memristors, are an emerging nanoscale technology that can revolutionize data storage and computing approaches. Enabled by the advancement of nanoscale semiconductor fabrication and a detailed understanding of the physical and chemical processes occurring at the atomic scale, resistive switches offer high-speed, low-power, and extremely dense nonvolatile data storage. Further, the analog capabilities of resistive switching devices enable neuromorphic computing approaches which can achieve massively parallel computation with a power and area budget that is orders of magnitude lower than today's conventional, digital approaches. This dissertation presents the investigation of tungsten-oxide-based resistive switching devices for use in neuromorphic computing applications. Device structure, fabrication, and integration are described, and physical models are developed to describe the behavior of the devices. These models are used to develop array-scale simulations in support of neuromorphic computing approaches. Several signal processing algorithms are adapted for acceleration using arrays of resistive switches. Both simulation and experimental results are reported. Finally, guiding principles and proposals for future work are discussed.
    PhD thesis, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/116743/1/sheridp_1.pd
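    The analog matrix-vector multiplication that underpins such crossbar accelerators can be sketched in a few lines of NumPy; the conductance and voltage values below are illustrative assumptions, not measured device parameters.

```python
import numpy as np

rng = np.random.default_rng(5)

# A crossbar computes y = v @ G in a single step: each input voltage v[i]
# drives row i, each device of conductance G[i, j] contributes a current
# v[i] * G[i, j] (Ohm's law), and column j sums those currents
# (Kirchhoff's current law).
G = rng.uniform(1e-6, 1e-3, size=(8, 4))    # device conductances, in siemens
v = rng.uniform(0.0, 0.2, size=8)           # read voltages applied to the rows

i_cols = v @ G                              # column currents = analog MVM
print(i_cols)
```

    Because every row-column product happens simultaneously in the physics, the whole multiplication costs one read cycle regardless of array size, which is the source of the power and area advantage claimed above.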