
1,004 research outputs found

Exploring Spin-transfer-torque devices and memristors for logic and memory applications

Author: Pajouhi Zoha
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2016

As CMOS device scaling approaches its physical limits, researchers have begun exploring newer devices and architectures to replace CMOS. Due to their non-volatility and high density, Spin Transfer Torque (STT) devices are among the most prominent candidates for logic and memory applications. In this research, we first considered a new logic style called All Spin Logic (ASL). Despite its advantages, ASL consumes a large amount of static power; thus, several optimizations can be performed to address this issue. We developed a systematic methodology to perform the optimizations to ensure stable operation of ASL. Second, we investigated reliable design of STT-MRAM bit-cells and addressed the conflicting read and write requirements, which result in overdesign of the bit-cells. Further, a Device/Circuit/Architecture co-design framework was developed to optimize the STT-MRAM devices by exploring the design space while jointly considering yield enhancement techniques at different levels of abstraction. Recent advancements in the development of memristive devices have opened new opportunities for hardware implementation of non-Boolean computing. To this end, the suitability of memristive devices for swarm intelligence algorithms has enabled researchers to solve a maze in hardware. In this research, we utilized swarm intelligence of memristive networks to perform image edge detection. First, we proposed a hardware-friendly algorithm for image edge detection based on ant colony. Next, we designed the image edge detection algorithm using memristive networks.

Purdue E-Pubs

Non-Volatile Memory Adaptation in Asynchronous Microcontroller for Low Leakage Power and Fast Turn-on Time

Author: Habimana Jean Pierre Thierry
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2021

This dissertation presents an MSP430 microcontroller implementation using Multi-Threshold NULL Convention Logic (MTNCL) methodology combined with an asynchronous non-volatile magnetic random-access-memory (RAM) to achieve low leakage power and fast turn-on. This asynchronous non-volatile RAM is designed with a Spin-Transfer Torque (STT) memory device model and CMOS transistors in a 65 nm technology. A self-timed Quasi-Delay-Insensitive 1 KB STT RAM is designed with an MTNCL interface and handshaking protocol. A replica methodology is implemented to handle write operation completion detection for long state-switching delays of the STT memory device. The MTNCL MSP430 core is integrated with the STT RAM to create a fully asynchronous non-volatile microcontroller. The MSP430 architecture, the MTNCL design methodology, and the STT RAM’s low power property, along with the STT RAM’s non-volatility, yield multiple advantages in the MTNCL-STT RAM system for a variety of applications. For comparison, a baseline system with the same MTNCL core combined with an asynchronous CMOS RAM is designed and tested. Schematic simulation results demonstrate that the MTNCL-CMOS RAM system presents advantages in execution time and active energy over the MTNCL-STT RAM system; however, the MTNCL-STT RAM system presents unmatched advantages such as negligible leakage power, zero overhead memory power failure handling, and fast system turn-on.

ScholarWorks@UARK

UARK (University of Arkansas)

Algorithm-Directed Crash Consistence in Non-Volatile Memory for HPC

Author: Li Dong
Qiao Yifan
Wu Kai
Yang Shuo
Zhai Jidong
Publication date: 16/05/2017

Fault tolerance is one of the major design goals for HPC. The emergence of non-volatile memories (NVM) provides a solution to build fault tolerant HPC. Data in NVM-based main memory are not lost when the system crashes because of the non-volatile nature of NVM. However, because of volatile caches, data must be logged and explicitly flushed from caches into NVM to ensure consistence and correctness before crashes, which can cause large runtime overhead. In this paper, we introduce an algorithm-based method to establish crash consistence in NVM for HPC applications. We slightly extend application data structures or sparsely flush cache blocks, which introduces negligible runtime overhead. Such extension or cache flushing allows us to use algorithm knowledge to reason about data consistence or correct inconsistent data when the application crashes. We demonstrate the effectiveness of our method for three algorithms: an iterative solver, dense matrix multiplication, and Monte-Carlo simulation. Based on comprehensive performance evaluation on a variety of test environments, we demonstrate that our approach has very small runtime overhead (at most 8.2% and less than 3% in most cases), much smaller than that of traditional checkpointing, while having the same or less recomputation cost. Comment: 12 pages

arXiv.org e-Print Archive

Crossref

Magnetic domain walls : Types, processes and applications

Author: Allwood D. A.
Hayward T. J.
Venkat G.
Publication date: 28/05/2023

Domain walls (DWs) in magnetic nanowires are promising candidates for a variety of applications including Boolean/unconventional logic, memories, in-memory computing as well as magnetic sensors and biomagnetic implementations. They show rich physical behaviour and are controllable using a number of methods including magnetic fields, charge and spin currents and spin-orbit torques. In this review, we detail types of domain walls in ferromagnetic nanowires and describe processes of manipulating their state. We look at the state of the art of DW applications and give our take on their current status, technological feasibility and challenges. Comment: 32 pages, 25 figures, review paper

arXiv.org e-Print Archive

Accelerating Time Series Analysis via Processing using Non-Volatile Memories

Author: Fernandez Ivan
Ghiasi Nika Mansouri
Giannoula Christina
Gutierrez Eladio
Gómez-Luna Juan
Manglik Aditya
Mutlu Onur
Plata Oscar
Quislant Ricardo
Publication date: 08/11/2022

Time Series Analysis (TSA) is a critical workload for consumer-facing devices. Accelerating TSA is vital for many domains as it enables the extraction of valuable information and the prediction of future events. The state-of-the-art algorithm in TSA is the subsequence Dynamic Time Warping (sDTW) algorithm. However, sDTW's computational complexity increases quadratically with the time series' length, resulting in two performance implications. First, the amount of data parallelism available is significantly higher than the small number of processing units enabled by commodity systems (e.g., CPUs). Second, sDTW is bottlenecked by memory because it 1) has low arithmetic intensity and 2) incurs a large memory footprint. To tackle these two challenges, we leverage Processing-using-Memory (PuM) by performing in-situ computation where data resides, using the memory cells. PuM provides a promising solution to alleviate data movement bottlenecks and exposes immense parallelism. In this work, we present MATSA, the first MRAM-based Accelerator for Time Series Analysis. The key idea is to exploit magneto-resistive memory crossbars to enable energy-efficient and fast time series computation in memory. MATSA provides the following key benefits: 1) it leverages high levels of parallelism in the memory substrate by exploiting column-wise arithmetic operations, and 2) it significantly reduces the data movement costs by performing computation using the memory cells. We evaluate three versions of MATSA to match the requirements of different environments (e.g., embedded, desktop, or HPC computing) based on MRAM technology trends. We perform a design space exploration and demonstrate that our HPC version of MATSA can improve performance by 7.35x/6.15x/6.31x and energy efficiency by 11.29x/4.21x/2.65x over server CPU, GPU and PNM architectures, respectively.
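For readers unfamiliar with sDTW, the quadratic cost mentioned in the abstract can be seen in a minimal textbook-style sketch. This is not the MATSA implementation: the function name `sdtw_distance`, the absolute-difference cost, and the two-row layout are assumptions chosen for illustration only.

```python
import math


def sdtw_distance(query, series):
    """Find the best match of `query` anywhere inside `series`.

    Returns (best_cost, end_index), where end_index is the position in
    `series` at which the best-matching subsequence ends.
    Illustrative sketch only; the cost function is an assumption.
    """
    n, m = len(query), len(series)
    # Row 0 is all zeros: the match may start at any position for free.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        # Column 0 stays infinite: a non-empty query prefix cannot
        # align with an empty part of the series.
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - series[j - 1])
            # Standard DTW recurrence: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    # The match may also end at any position: take the minimum of the last row.
    best_end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[best_end], best_end - 1


if __name__ == "__main__":
    print(sdtw_distance([1, 2, 3], [0, 0, 1, 2, 3, 0]))
```

The two nested loops visit every (query, series) pair once, so the work grows with the product of the two lengths, and each cell needs only three neighbouring values, which is the low arithmetic intensity the abstract refers to.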

arXiv.org e-Print Archive

Repository for Publications and Research Data


Shape-engineered ferromagnets and micromagnetic simulation techniques for spin-transfer-torque random access memory

Author: Pramanik Tanmoy
Publication date: 23/08/2018

Spin-transfer-torque random access memory (STTRAM) has received great attention as a prospective universal memory due to high speed read and write capabilities, scalability to smaller technology nodes and non-volatile data retention. Two major factors that could limit the performance of large scale STTRAM arrays are the high switching current and the stochastic switching behavior. In this work, possible routes to mitigate these issues have been explored and new techniques have been proposed to estimate the reliability of the write process. The large area of the selection transistor required to support high switching current impacts the bit storage density of an STTRAM memory array. To increase the bit storage density, a multi-state STTRAM cell employing a cross-shaped ferromagnet was proposed previously. Here, the spin-transfer-torque (STT) driven magnetization dynamics of the cross-shaped ferromagnet are revisited. As a low power alternative, a voltage controlled magnetic anisotropy (VCMA) based writing scheme is studied. Trade-offs and limitations of the VCMA-induced switching over STT are also discussed. In the next part of this dissertation, magnetic properties and the magnetization process of epitaxial chromium telluride thin films have been studied. The presence of strong perpendicular magnetic anisotropy in this material makes it an attractive choice for device applications. In this work, anisotropy energies of chromium telluride thin films have been estimated from magnetization measurements. The magnetization reversal process is then studied using analytical models as well as micromagnetic simulations. The last part of this work focuses on the write error rates (WER) of STTRAM. The stochastic write process of STTRAM at finite temperatures gives rise to write errors when a bit fails to switch within the duration of the write pulse. Ultra-low WERs on the scale of 10⁻⁹ or less are desired for practical applications. Micromagnetic simulations are required to capture spatially-incoherent magnetization dynamics inside a ferromagnet, which may affect the WER. In this work, using the techniques of rare event enhancement, reliable calculation of WERs to 10⁻⁹ is demonstrated while keeping the computational effort to a minimum. Employing rare-event-enhanced micromagnetic simulations, WERs of both perpendicular and in-plane STTRAM bits are calculated and effects of spatially-incoherent excitations on the WER slopes are discussed. Electrical and Computer Engineering

Texas ScholarWorks