    Energy Efficient Neocortex-Inspired Systems with On-Device Learning

    Shifting compute workloads from the cloud toward edge devices can significantly improve overall latency for inference and learning. However, this paradigm shift exacerbates the resource constraints on edge devices. Neuromorphic computing architectures, inspired by neural processes, are natural substrates for edge devices: they offer co-located memory, in-situ training, energy efficiency, high memory density, and compute capacity in a small form factor. Owing to these features, hybrid CMOS/memristor neuromorphic computing systems have proliferated rapidly in recent years. However, most of these systems offer limited plasticity, target either spatial or temporal input streams but not both, and have not been demonstrated on large-scale heterogeneous tasks. There is a critical knowledge gap in designing scalable neuromorphic systems that support hybrid plasticity for spatio-temporal input streams on edge devices. This research proposes Pyragrid, a low-latency, energy-efficient neuromorphic computing system for processing spatio-temporal information natively on the edge. Pyragrid is a full-scale custom hybrid CMOS/memristor architecture with analog computational modules and an underlying digital communication scheme. It is designed for hierarchical temporal memory (HTM), a biomimetic sequence-memory algorithm inspired by the neocortex, and features a novel synthetic synapse representation that enables dynamic synaptic pathways with reduced memory usage and fewer interconnects. The dynamic growth of synaptic pathways is emulated in the physical behavior of the memristor devices, while synaptic modulation is enabled through a custom training scheme optimized for area and power. Pyragrid employs data reuse, in-memory computing, and event-driven sparse local computing to reduce data movement by ~44x and to improve system throughput and power efficiency by ~3x and ~161x, respectively, over a custom CMOS digital design. The innate sparsity in Pyragrid yields overall robustness to noise and device failure, particularly when processing visual input and predicting time-series sequences. Porting the proposed system to edge devices can enhance their computational capability, response time, and battery life.
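    The abstract does not detail the training scheme itself. As a minimal, hedged sketch of the kind of HTM-style synaptic plasticity such a system accelerates, the snippet below models a synaptic permanence value (which a memristive design would realize as device conductance) updated by a Hebbian reinforce/depress rule. All names and constants are illustrative assumptions, not Pyragrid's actual parameters.

```python
import numpy as np

# Hedged sketch of an HTM-style permanence update (hypothetical values;
# the actual Pyragrid training scheme is analog and device-specific).
CONNECTED = 0.5             # permanence threshold for a "connected" synapse
P_INC, P_DEC = 0.05, 0.02   # reinforcement / depression step sizes

def update_permanences(perm, active_inputs, active_columns):
    """Reinforce synapses of winning columns whose input bit was active,
    depress the rest (the classic HTM spatial-pooler learning rule)."""
    delta = np.where(active_inputs, P_INC, -P_DEC)   # shape: (n_inputs,)
    perm = perm.copy()
    perm[active_columns] = np.clip(perm[active_columns] + delta, 0.0, 1.0)
    return perm

# Tiny usage example: 4 columns x 6 input bits.
rng = np.random.default_rng(0)
perm = rng.uniform(0.3, 0.7, size=(4, 6))          # initial permanences
inputs = np.array([1, 0, 1, 0, 0, 1], dtype=bool)  # active input bits
winners = np.array([0, 2])                         # columns that won inhibition
perm = update_permanences(perm, inputs, winners)
connected = perm >= CONNECTED                      # binary synaptic pathways
```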

    Design and Analysis of a Reconfigurable Hierarchical Temporal Memory Architecture

    Self-learning hardware systems with a high degree of plasticity are critical for performing spatio-temporal tasks in next-generation computing systems. To this end, hierarchical temporal memory (HTM) offers time-based online-learning algorithms that store and recall temporal and spatial patterns. In this work, a reconfigurable and scalable HTM architecture is designed with unique pooling realizations. A virtual synapse design is proposed to address the dynamic interconnections that arise during learning. The architecture is interwoven with parallel cells and columns that enable high processing speed for the cortical learning algorithm. HTM has two core operations: spatial pooling and temporal pooling. These operations are verified on two datasets: MNIST and the European number plate font. The spatial pooling operation is independently verified for classification with and without the presence of noise, and temporal pooling is verified for simple prediction. The spatial pooler architecture is ported onto an Altera Cyclone II fabric, and the entire architecture is synthesized for a Xilinx Virtex IV. The results show 91% classification accuracy on the MNIST database and 90% accuracy on the European number plate font in the presence of Gaussian and salt-and-pepper noise. For prediction, first- and second-order predictions are observed for a five-number sequence generated from the European number plate font, with ~95% accuracy. Moreover, the proposed hardware architecture offers a 3902x speedup over the software realization. These results indicate that the proposed architecture can serve as a core for building HTM in hardware and, eventually, as a standalone self-learning hardware system.
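    The abstract does not spell out the pooling computations. As a hedged sketch of the spatial pooler's core step (overlap scoring followed by k-winner inhibition over columns), assuming global inhibition and a fixed output sparsity rather than the paper's specific realization, one could write:

```python
import numpy as np

def spatial_pool(perm, input_bits, connected=0.5, sparsity=0.02):
    """One spatial-pooling step: each column counts how many of its
    connected synapses see an active input bit (its overlap), then the
    top-k columns under global inhibition form the sparse output code.
    The threshold and sparsity values are illustrative assumptions."""
    overlaps = ((perm >= connected) & input_bits[None, :]).sum(axis=1)
    k = max(1, int(sparsity * perm.shape[0]))
    winners = np.argsort(overlaps)[-k:]        # indices of the k best columns
    sdr = np.zeros(perm.shape[0], dtype=bool)  # sparse distributed representation
    sdr[winners] = True
    return sdr

# Usage: 128 columns over a 64-bit input.
rng = np.random.default_rng(1)
perm = rng.uniform(0.0, 1.0, size=(128, 64))
x = rng.random(64) < 0.3                       # ~30% active input bits
print(spatial_pool(perm, x).sum())             # number of active columns (k)
```

    The sparse output code is what makes such architectures tolerant of noise: classification depends on the overlap of a few highly active columns rather than on every input bit, which is consistent with the noisy-MNIST results reported above.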