Search CORE

201 research outputs found

Normally-Off Computing and Checkpoint/Rollback for Fast, Low-Power, and Reliable Devices

Author: Benoit Pascal
Gamatié Abdoulaye
Sassatelli Gilles
Senni Sophiane
Torres Lionel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

International audienceSince the advent of complementary metal oxide semiconductors (CMOS), the number of transistors per die has continued to increase, reaching today several billion transistors. As a result, it has been possible to design and fabricate smart devices able to run at high speed. However, the power consumption of systems-on-chip has significantly increased due to the high density integration and the high leakage power of current CMOS transistors. As a result, the limits of heat dissipation make further improvement in performance difficult. A high level of autonomy for battery-powered devices is a real challenge. To deal with these issues, spin-transfer-torque magnetic random-access memory (STT-MRAM) technology is seen as a promising solution. In addition to its attractive performance features, STT-MRAM can bring nonvolatility to a system to allow full data retention after a complete shutdown while maintaining a fast wake-up time. Considering two 32-bit embedded processors, this letter shows how STT-MRAM can improve energy efficiency and reliability of future embedded systems thanks to normally-off computing and checkpointing/rollback techniques. A detailed analysis is performed to evaluate the cost related to the backup/recovery of the system. Index Terms—Spintronic memory and logic, embedded processor, spin-transfer-torque, magnetic random-access memory

Magnetic racetrack memory: from physics to the cusp of applications within a decade

Author: Bläsing R.
Castrillon J.
Filippou P.
Garg C.
Hameed F.
Khan A.
Parkin S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2020
Field of study

Racetrack memory (RTM) is a novel spintronic memory-storage technology that has the potential to overcome fundamental constraints of existing memory and storage devices. It is unique in that its core differentiating feature is the movement of data, which is composed of magnetic domain walls (DWs), by short current pulses. This enables more data to be stored per unit area compared to any other current technologies. On the one hand, RTM has the potential for mass data storage with unlimited endurance using considerably less energy than today's technologies. On the other hand, RTM promises an ultrafast nonvolatile memory competitive with static random access memory (SRAM) but with a much smaller footprint. During the last decade, the discovery of novel physical mechanisms to operate RTM has led to a major enhancement in the efficiency with which nanoscopic, chiral DWs can be manipulated. New materials and artificially atomically engineered thin-film structures have been found to increase the speed and lower the threshold current with which the data bits can be manipulated. With these recent developments, RTM has attracted the attention of the computer architecture community that has evaluated the use of RTM at various levels in the memory stack. Recent studies advocate RTM as a promising compromise between, on the one hand, power-hungry, volatile memories and, on the other hand, slow, nonvolatile storage. By optimizing the memory subsystem, significant performance improvements can be achieved, enabling a new era of cache, graphical processing units, and high capacity memory devices. In this article, we provide an overview of the major developments of RTM technology from both the physics and computer architecture perspectives over the past decade. We identify the remaining challenges and give an outlook on its future

MPG.PuRe

Microarchitectures pour la sauvegarde incrémentale, robuste et efficace dans les systèmes à alimentation intermittente

Author: Pala Davide
Publication venue: HAL CCSD
Publication date: 10/11/2022
Field of study

Embedded devices powered with environmental energy harvesting, have to sustain computation while experiencing unexpected power failures.To preserve the progress across the power interruptions, Non-Volatile Memories (NVMs) are used to quickly save the state. This dissertation first presents an overview and comparison of different NVM technologies, based on different surveys from the literature. The second contribution we propose is a dedicated backup controller, called Freezer, that implements an on-demand incremental backup scheme. This can make the size of the backup 87.7% smaller then a full-memory backup strategy from the state of the art (SoA). Our third contribution addresses the problem of corruption of the state, due to interruptions during the backup process. Two algorithms are presented, that improve on the Freezer incremental backup process, making it robust to errors, by always guaranteeing the existence of a correct state, that can be restored in case of backup errors. These two algorithms can consume 23% less energy than the usual double-buffering technique used in the SoA. The fourth contribution, addresses the scalability of our proposed approach. Combining Freezer with Bloom filters, we introduce a backup scheme that can cover much larger address spaces, while achieving a backup size which is half the size of the regular Freezer approach.Les appareils embarqués alimentés par la récupération d'énergie environnementale doivent maintenir le calcul tout en subissant des pannes de courant inattendues. Pour préserver la progression à travers les interruptions de courant, des mémoires non volatiles (NVM) sont utilisées pour enregistrer rapidement l'état. Cette thèse présente d'abord une vue d'ensemble et une comparaison des différentes technologies NVM, basées sur différentes enquêtes de la littérature. La deuxième contribution que nous proposons est un contrôleur de sauvegarde dédié, appelé Freezer, qui implémente un schéma de sauvegarde incrémentale à la demande. Cela peut réduire la taille de la sauvegarde de 87,7% à celle d'une stratégie de sauvegarde à mémoire complète de l'état de l'art. Notre troisième contribution aborde le problème de la corruption de l'état, due aux interruptions pendant le processus de sauvegarde. Deux algorithmes sont présentés, qui améliorent le processus de sauvegarde incrémentale de Freezer, le rendant robuste aux erreurs, en garantissant toujours l'existence d'un état correct, qui peut être restauré en cas d'erreurs de sauvegarde. Ces deux algorithmes peuvent consommer

23\%

d'énergie en moins que la technique de ``double-buffering'' utilisée dans l'état de l'art. La quatrième contribution porte sur l'évolutivité de notre approche proposée. En combinant Freezer avec des filtres Bloom, nous introduisons un schéma de sauvegarde qui peut couvrir des espaces d'adressage beaucoup plus grands, tout en obtenant une taille de sauvegarde qui est la moitié de la taille de l'approche Freezer habituelle

INRIA a CCSD electronic archive server

Design and Code Optimization for Systems with Next-generation Racetrack Memories

Author: Khan Asif Ali
Publication venue
Publication date: 16/06/2022
Field of study

With the rise of computationally expensive application domains such as machine learning, genomics, and fluids simulation, the quest for performance and energy-efficient computing has gained unprecedented momentum. The significant increase in computing and memory devices in modern systems has resulted in an unsustainable surge in energy consumption, a substantial portion of which is attributed to the memory system. The scaling of conventional memory technologies and their suitability for the next-generation system is also questionable. This has led to the emergence and rise of nonvolatile memory ( NVM ) technologies. Today, in different development stages, several NVM technologies are competing for their rapid access to the market. Racetrack memory ( RTM ) is one such nonvolatile memory technology that promises SRAM -comparable latency, reduced energy consumption, and unprecedented density compared to other technologies. However, racetrack memory ( RTM ) is sequential in nature, i.e., data in an RTM cell needs to be shifted to an access port before it can be accessed. These shift operations incur performance and energy penalties. An ideal RTM , requiring at most one shift per access, can easily outperform SRAM . However, in the worst-cast shifting scenario, RTM can be an order of magnitude slower than SRAM . This thesis presents an overview of the RTM device physics, its evolution, strengths and challenges, and its application in the memory subsystem. We develop tools that allow the programmability and modeling of RTM -based systems. For shifts minimization, we propose a set of techniques including optimal, near-optimal, and evolutionary algorithms for efficient scalar and instruction placement in RTMs . For array accesses, we explore schedule and layout transformations that eliminate the longer overhead shifts in RTMs . We present an automatic compilation framework that analyzes static control flow programs and transforms the loop traversal order and memory layout to maximize accesses to consecutive RTM locations and minimize shifts. We develop a simulation framework called RTSim that models various RTM parameters and enables accurate architectural level simulation. Finally, to demonstrate the RTM potential in non-Von-Neumann in-memory computing paradigms, we exploit its device attributes to implement logic and arithmetic operations. As a concrete use-case, we implement an entire hyperdimensional computing framework in RTM to accelerate the language recognition problem. Our evaluation shows considerable performance and energy improvements compared to conventional Von-Neumann models and state-of-the-art accelerators

Technische Universität Dresden: Qucosa

Recommended from our members

Transiently Powered Computers

Author: Ransford Benjamin
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2013
Field of study

Demand for compact, easily deployable, energy-efficient computers has driven the development of general-purpose transiently powered computers (TPCs) that lack both batteries and wired power, operating exclusively on energy harvested from their surroundings. TPCs\u27 dependence solely on transient, harvested power offers several important design-time benefits. For example, omitting batteries saves board space and weight while obviating the need to make devices physically accessible for maintenance. However, transient power may provide an unpredictable supply of energy that makes operation difficult. A predictable energy supply is a key abstraction underlying most electronic designs. TPCs discard this abstraction in favor of opportunistic computation that takes advantage of available resources. A crucial question is how should a software-controlled computing device operate if it depends completely on external entities for power and other resources? The question poses challenges for computation, communication, storage, and other aspects of TPC design. The main idea of this work is that software techniques can make energy harvesting a practicable form of power supply for electronic devices. Its overarching goal is to facilitate the design and operation of usable TPCs. This thesis poses a set of challenges that are fundamental to TPCs, then pairs these challenges with approaches that use software techniques to address them. To address the challenge of computing steadily on harvested power, it describes Mementos, an energy-aware state-checkpointing system for TPCs. To address the dependence of opportunistic RF-harvesting TPCs on potentially untrustworthy RFID readers, it describes CCCP, a protocol and system for safely outsourcing data storage to RFID readers that may attempt to tamper with data. Additionally, it describes a simulator that facilitates experimentation with the TPC model, and a prototype computational RFID that implements the TPC model. To show that TPCs can improve existing electronic devices, this thesis describes applications of TPCs to implantable medical devices (IMDs), a challenging design space in which some battery-constrained devices completely lack protection against radio-based attacks. TPCs can provide security and privacy benefits to IMDs by, for instance, cryptographically authenticating other devices that want to communicate with the IMD before allowing the IMD to use any of its battery power. This thesis describes a simplified IMD that lacks its own radio, saving precious battery energy and therefore size. The simplified IMD instead depends on an RFID-scale TPC for all of its communication functions. TPCs are a natural area of exploration for future electronic design, given the parallel trends of energy harvesting and miniaturization. This work aims to establish and evaluate basic principles by which TPCs can operate

ScholarWorks@UMass Amherst

Evaluation of STT-MRAM main memory for HPC and real-time systems

Author: Asifuzzaman Kazi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2019
Field of study

It is questionable whether DRAM will continue to scale and will meet the needs of next-generation systems. Therefore, significant effort is invested in research and development of novel memory technologies. One of the candidates for nextgeneration memory is Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM is an emerging non-volatile memory with a lot of potential that could be exploited for various requirements of different computing systems. Being a novel technology, STT-MRAM devices are already approaching DRAM in terms of capacity, frequency and device size. Special STT-MRAM features such as intrinsic radiation hardness, non-volatility, zero stand-by power and capability to function in extreme temperatures also make it particularly suitable for aerospace, avionics and automotive applications. Despite of being a conceivable alternative for main memory technology, to this day, academic research of STT-MRAM main memory remains marginal. This is mainly due to the unavailability of publicly available detailed timing parameters of this novel technology, which are required to perform a cycle accurate main memory simulation. Some researchers adopt simplistic memory models to simulate main memory, but such models can introduce significant errors in the analysis of the overall system performance. Therefore, detailed timing parameters are a must-have for any evaluation or architecture exploration study of STT-MRAM main memory. These detailed parameters are not publicly available because STT-MRAM manufacturers are reluctant to release any delicate information on the technology. This thesis demonstrates an approach to perform a cycle accurate simulation of STT-MRAM main memory, being the first to release detailed timing parameters of this technology from academia, essentially enabling researchers to conduct reliable system level simulation of STT-MRAM using widely accepted existing simulation infrastructure. Our results show that, in HPC domain STT-MRAM provide performance comparable to DRAM. Results from the power estimation indicates that STT-MRAM power consumption increases significantly for Activation/Precharge power while Burst power increases moderately and Background power does not deviate much from DRAM. The thesis includes detailed STT-MRAM main memory timing parameters to the main repositories of DramSim2 and Ramulator, two of the most widely used and accepted state-of-the-art main memory simulators. The STT-MRAM timing parameters that has been originated as a part of this thesis, are till date the only reliable and publicly available timing information on this memory technology published from academia. Finally, the thesis analyzes the feasibility of using STT-MRAM in real-time embedded systems by investigating STT-MRAM main memory impact on average system performance and WCET. STT-MRAM's suitability for the real-time embedded systems is validated on benchmarks provided by the European Space Agency (ESA), EEMBC Autobench and MediaBench suite by analyzing performance and WCET impact. In quantitative terms, our results show that STT-MRAM main memory in real-time embedded systems provides performance and WCET comparable to conventional DRAM, while opening up opportunities to exploit various advantages.Es cuestionable si DRAM continuará escalando y cumplirá con las necesidades de los sistemas de la próxima generación. Por lo tanto, se invierte un esfuerzo significativo en la investigación y el desarrollo de nuevas tecnologías de memoria. Uno de los candidatos para la memoria de próxima generación es la Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM es una memoria no volátil emergente con un gran potencial que podría ser explotada para diversos requisitos de diferentes sistemas informáticos. Al ser una tecnología novedosa, los dispositivos STT-MRAM ya se están acercando a la DRAM en términos de capacidad, frecuencia y tamaño del dispositivo. Las características especiales de STTMRAM, como la dureza intrínseca a la radiación, la no volatilidad, la potencia de reserva cero y la capacidad de funcionar en temperaturas extremas, también lo hacen especialmente adecuado para aplicaciones aeroespaciales, de aviónica y automotriz. A pesar de ser una alternativa concebible para la tecnología de memoria principal, hasta la fecha, la investigación académica de la memoria principal de STT-MRAM sigue siendo marginal. Esto se debe principalmente a la falta de disponibilidad de los parámetros de tiempo detallados públicamente disponibles de esta nueva tecnología, que se requieren para realizar un ciclo de simulación de memoria principal precisa. Algunos investigadores adoptan modelos de memoria simplistas para simular la memoria principal, pero tales modelos pueden introducir errores significativos en el análisis del rendimiento general del sistema. Por lo tanto, los parámetros de tiempo detallados son indispensables para cualquier evaluación o estudio de exploración de la arquitectura de la memoria principal de STT-MRAM. Estos parámetros detallados no están disponibles públicamente porque los fabricantes de STT-MRAM son reacios a divulgar información delicada sobre la tecnología. Esta tesis demuestra un enfoque para realizar un ciclo de simulación precisa de la memoria principal de STT-MRAM, siendo el primero en lanzar parámetros de tiempo detallados de esta tecnología desde la academia, lo que esencialmente permite a los investigadores realizar una simulación confiable a nivel de sistema de STT-MRAM utilizando una simulación existente ampliamente aceptada infraestructura. Nuestros resultados muestran que, en el dominio HPC, STT-MRAM proporciona un rendimiento comparable al de la DRAM. Los resultados de la estimación de potencia indican que el consumo de potencia de STT-MRAM aumenta significativamente para la activation/Precharge power, mientras que la Burst power aumenta moderadamente y la Background power no se desvía mucho de la DRAM. La tesis incluye parámetros detallados de temporización memoria principal de STT-MRAM a los repositorios principales de DramSim2 y Ramulator, dos de los simuladores de memoria principal más avanzados y más utilizados y aceptados. Los parámetros de tiempo de STT-MRAM que se han originado como parte de esta tesis, son hasta la fecha la única información de tiempo confiable y disponible al público sobre esta tecnología de memoria publicada desde la academia. Finalmente, la tesis analiza la viabilidad de usar STT-MRAM en real-time embedded systems mediante la investigación del impacto de la memoria principal de STT-MRAM en el rendimiento promedio del sistema y WCET. La idoneidad de STTMRAM para los real-time embedded systems se valida en los applicaciones proporcionados por la European Space Agency (ESA), EEMBC Autobench y MediaBench, al analizar el rendimiento y el impacto de WCET. En términos cuantitativos, nuestros resultados muestran que la memoria principal de STT-MRAM en real-time embedded systems proporciona un desempeño WCET comparable al de una memoria DRAM convencional, al tiempo que abre oportunidades para explotar varias ventajas

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Recommended from our members

RESISTIVE SWITCHING CHARACTERISTICS OF NANOSTRUCTURED AND SOLUTION-PROCESSED COMPLEX OXIDE ASSEMBLIES

Author: Zhou Zimu
Publication venue: ScholarWorks@UMass Amherst
Publication date: 08/05/2020
Field of study

Miniaturization of conventional nonvolatile (NVM) memory devices is rapidly approaching the physical limitations of the constituent materials. An emerging random access memory (RAM), nanoscale resistive RAM (RRAM), has the potential to replace conventional nonvolatile memory and could foster novel type of computing due to its fast switching speed, high scalability, and low power consumption. RRAM, or memristors, represent a class of two terminal devices comprising an insulating layer, such as a metal oxide, sandwiched between two terminal electrodes that exhibits two or more distinct resistance states that depend on the history of the applied bias. While the sudden resistance reduction into a conductive state in metal oxide insulators has been known for almost 50 years, the fundamental resistive switching mechanism is a complex phenomenon that is still long-debated, complex process. Further improvements to existing memristor performance require a complete understanding of memristive properties under various operation conditions. Additional technical issues also remain, such as the development of facile, low-cost fabrication methods as an alternative to expensive, ultra-high vacuum (UHV) deposition methods. This collection of work explores resistive switching within metal oxide-based memristive material assemblies by analyzing the fundamental physical insulating material properties. Chapter 3 aims to translate the utility and simplicity of the highly ordered anodic aluminum oxide (AAO) template structure to complex, yet more functional (memristive) materials. Functional oxides possessing ordered, scalable nanoporous arrays and nanocapacitor arrays over a large area is of interest to both the fields of next-generation electronics and energy storing/harvesting devices. Here their switching performance will be evaluated using conductive atomic force microscopy (C-AFM). Chapter 4 demonstrates a convective self-assembly fabrication method that effectively enables the synthesis of a low-cost solution processed memristor comprising binary oxide and perovskite ABO3 nanocrystals of varying diameter. Chapter 5 systematically compares the influence of inter-nanoparticle distance on the threshold switching SET voltage of hafnium oxide (HfO2) memristors. Utilizing shorter phosphonic acid ligands with higher binding affinity on the nanocrystal surface enabled a record-low SET voltage to be achieved. Chapter 6 extends the scope to the fine tuning of solution processed memristors with two types of perovskites nanocrystals. The primary advantage of nanocrystal memristors is the ability to draw from additional degrees of freedom by tuning the constituent nanocrystal material properties. Recent advancement of solution phase techniques enables a high degree of controllability over the nanocrystal size and structure. Thus, this work found in this dissertation aims to understand and decouple the effects of the geometric size and substitutional nanocrystal parameters on resistive switching

ScholarWorks@UMass Amherst

Enabling Reliable, Efficient, and Secure Computing for Energy Harvesting Powered IoT Devices

Author: Xie Mimi
Publication venue
Publication date: 11/09/2019
Field of study

Energy harvesting is one of the most promising techniques to power devices for future generation IoT. While energy harvesting does not have longevity, safety, and recharging concerns like traditional batteries, its instability brings a new challenge to the embedded systems: the energy harvested from environment is usually weak and intermittent. With traditional CMOS based technology, whenever the power is off, the computation has to start from the very beginning. Compared with existing CMOS based memory devices, emerging non-volatile memory devices such as PCM and STT-RAM, have the benefits of sustaining the data even when there is no power. By checkpointing the processor's volatile state to non-volatile memory, a program can resume its execution immediately after power comes back on again instead of restarting from the very beginning with checkpointing techniques. However, checkpointing is not sufficient for energy harvesting systems. First, the program execution resumed from the last checkpoint might not execute correctly and causes inconsistency problem to the system. This problem is due to the inconsistency between volatile system state and non-volatile system state during checkpointing. Second, the process of checkpointing consumes a considerable amount of energy and time due to the slow and energy-consuming write operation of non-volatile memory. Finally, connecting to the internet poses many security issues to energy harvesting IoT devices. Traditional data encryption methods are both energy and time consuming which do not fit the resource constrained IoT devices. Therefore, a light-weight encryption method is in urgent need for securing IoT devices. Targeting those three challenges, this dissertation proposes three techniques to enable reliable, efficient, and secure computing in energy harvesting IoT devices. First, a consistency-aware checkpointing technique is proposed to avoid inconsistency errors generated from the inconsistency between volatile state and non-volatile state. Second, checkpoint aware hybrid cache architecture is proposed to guarantee reliable checkpointing while maintaining a low checkpointing overhead from cache. Finally, to ensure the security of energy harvesting IoT devices, an energy-efficient in-memory encryption implementation for protecting the IoT device is proposed which can quickly encrypts the data in non-volatile memory and protect the embedded system physical and on-line attacks

D-Scholarship@Pitt

High-Density Solid-State Memory Devices and Technologies

Author
Publication venue: 'MDPI AG'
Publication date: 06/05/2022
Field of study

This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continuous success in the future. Considering that broadening of the range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. Even the subjects dealt with in this Special Issue are widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of the operation and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories to pursue new computing paradigms

Directory of Open Access Books (DOAB)

Evaluation of STT-MRAM main memory for HPC and real-time systems

Author: Asifuzzaman Kazi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 29/07/2019
Field of study

UPCommons. Portal del coneixement obert de la UPC