44 research outputs found

Sytare: a Lightweight Kernel for NVRAM-Based Transiently-Powered Systems

Authors: Berthou Gautier, Delizy Tristan, Marquet Kevin, Risset Tanguy, Salagnac Guillaume
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/09/2019

In the near future, energy harvesting is expected to replace batteries in ultra-low-power embedded systems. Research prototypes of such systems have recently been proposed. As the power harvested from the environment is very low, such systems need to cope with frequent power outages. They are referred to as transiently-powered systems (TPS). In order to execute non-trivial applications, TPS need to retain information between power losses. To achieve this goal, emerging non-volatile memory (NVM) technologies are a key enabler: they provide a lightweight solution to retain, between power outages, the state of an application and of its peripheral devices. These include sensors, serial interfaces, or radio devices, for instance. Existing works have described various checkpointing mechanisms to adapt embedded applications to TPS, but the use of peripherals was not yet handled in these works. This paper proposes a solution for embedded applications using any peripheral device to run despite transient power. We follow a kernel-oriented approach resulting in minimal impact on the programming model of the application. We implement the new concepts in our lightweight kernel called Sytare, running on an MSP430FR5739 micro-controller, and we analyze the cost of the proposed solution.

INRIA a CCSD electronic archive server

Redesigning Transaction Processing Systems for Non-Volatile Memory

Author: Kim Wookhee
Publication venue: Graduate School of UNIST
Publication date: 01/02/2019

Department of Computer Science and Engineering

Transaction Processing Systems are widely used because they let users manage their data more efficiently. However, they suffer from a performance bottleneck caused by the redundant I/O needed to guarantee data consistency. In addition to the redundant I/O, slow storage devices degrade performance further. Leveraging non-volatile memory is one of the promising solutions to the performance bottleneck in Transaction Processing Systems. However, since the I/O granularity of legacy storage devices differs from that of non-volatile memory, traditional Transaction Processing Systems cannot fully exploit the performance of persistent memory. The goal of this dissertation is to fully exploit non-volatile memory for improving the performance of Transaction Processing Systems. Write amplification between Transaction Processing Systems is identified as a performance bottleneck. As a first approach, we redesigned Transaction Processing Systems to minimize the redundant I/O between the Transaction Processing Systems. We present LS-MVBT, which integrates recovery information into the main database file to remove temporary files for recovery. LS-MVBT also employs five optimizations to reduce the write traffic in single fsync() calls. We also exploit persistent memory to reduce the performance bottleneck from slow storage devices. However, since the traditional recovery method is designed for slow storage devices, we develop byte-addressable differential logging, a user-level heap manager, and transaction-aware persistence to fully exploit persistent memory. To minimize the redundant I/O for guaranteeing data consistency, we present failure-atomic slotted paging with a persistent buffer cache. Redesigning the indexing structure is the second approach to fully exploit non-volatile memory. Since the B+-tree was originally designed for block granularity, it generates excessive I/O traffic in persistent memory.
To mitigate this traffic, we develop a cache-line-friendly B+-tree that aligns its node size to the cache line size, which minimizes the write traffic. Moreover, with hardware transactional memory, it can update a single node atomically without any additional redundant I/O for guaranteeing data consistency. It can also adopt Failure-Atomic Shift and Failure-Atomic In-place Rebalancing to eliminate unnecessary I/O. Furthermore, we improved the persistent memory manager, which exploits a traditional memory heap structure with a free list instead of segregated lists for small memory allocations, to minimize the memory allocation overhead. Our performance evaluation shows that our improved version, which considers the I/O granularity of non-volatile memory, can efficiently reduce the redundant I/O traffic and improve performance by a large margin.

Systemunterstützung für moderne Speichertechnologien (System Support for Modern Memory Technologies)

Author: Sartakov Vasily
Publication date: 01/01/2019

Trust and scalability are the two significant factors that impede the dissemination of clouds. The possibility of privileged access to customer data by a cloud provider limits the usage of clouds for processing security-sensitive data. Low-latency cloud services rely on in-memory computations and are thus limited by several characteristics of Dynamic RAM (DRAM), such as capacity, density, and energy consumption. Two technological areas address these factors. Mainstream server platforms, such as Intel Software Guard eXtensions (SGX) and AMD Secure Encrypted Virtualisation (SEV), offer extensions for trusted execution in untrusted environments. Various technologies of Non-Volatile RAM (NV-RAM) have better capacity and density compared to DRAM and thus can be considered as DRAM alternatives in the future. However, these technologies and extensions require new programming approaches and system support since they add features to the system architecture: new system components (Intel SGX) and data persistence (NV-RAM). This thesis is devoted to the programming and architectural aspects of persistent and trusted systems. For trusted systems, an in-depth analysis of new architectural extensions was performed. A novel framework named EActors and a database engine named STANlite were developed to effectively use the capabilities of trusted execution. For persistent systems, an in-depth analysis of prospective memory technologies, their features, and the possible impact on system architecture was performed. A new persistence model, called the hypervisor-based model of persistence, was developed and evaluated by means of the NV-Hypervisor. This offers transparent persistence for legacy and proprietary software, and supports virtualisation of persistent memory.

Digitale Bibliothek Braunschweig

EA-PHT-HPR: Designing Scalable Data Structures for Persistent Memory

Author: Cepeda Diego
Publication venue: University of Waterloo
Publication date: 17/08/2020

Volatile memory has dominated the realm of main memory on servers and computers for a long time. In 2019, Intel released to the public the Optane data center persistent memory modules (DCPMM). These devices offer the capacity and persistence of block devices while providing the byte addressability and low latency of DRAM devices. The introduction of this technology now allows programmers to develop data structures that can remain in main memory across crashes and power failures. Implementing recoverable code is not an easy task, and adds a new degree of complexity to how we develop and prove the correctness of code. This thesis explores the different approaches that have been taken for the development of persistent data structures, specifically for hash tables. The work presents an iterative process for the development of a persistent hash table. The proposed designs are based on a previously implemented DRAM design. We intend for the design of the hash table to remain similar to its original DRAM design while achieving high performance and scalability in persistent memory. Through each step of the iterative process, the proposed design's weak points are identified, and the implementations are compared to current state-of-the-art persistent hash tables. The final proposed design consists of a hybrid hash table implementation that achieves up to 47% higher performance in write-heavy workloads, and up to 19% higher performance in read-only workloads in comparison to the dynamic and scalable hashing (DASH) implementation, which currently is one of the fastest hash tables for persistent memory. As well, to reduce the latency of a full table resize operation, the proposed design incorporates a new full table resize mechanism that takes advantage of parallelization

University of Waterloo's Institutional Repository

Embedding Logic and Non-volatile Devices in CMOS Digital Circuits for Improving Energy Efficiency

Publication date: 01/01/2018

Static CMOS logic has remained the dominant design style of digital systems for more than four decades due to its robustness and near-zero standby current. Static CMOS logic circuits consist of a network of combinational logic cells and clocked sequential elements, such as latches and flip-flops, that are used for sequencing computations over time. The majority of the digital design techniques to reduce power, area, and leakage over the past four decades have focused almost entirely on optimizing the combinational logic. This work explores alternate architectures for the flip-flops for improving the overall circuit performance, power, and area. It consists of three main sections. The first is the design of a multi-input configurable flip-flop structure with embedded logic. A conventional D-type flip-flop may be viewed as realizing an identity function, in which the output is simply the value of the input sampled at the clock edge. In contrast, the proposed multi-input flip-flop, named PNAND, can be configured to realize one of a family of Boolean functions called threshold functions. In essence, the PNAND is a circuit implementation of the well-known binary perceptron. Unlike other reconfigurable circuits, a PNAND can be configured by simply changing the assignment of signals to its inputs. Using a standard cell library of such gates, a technology mapping algorithm can be applied to transform a given netlist into one with an optimal mixture of conventional logic gates and threshold gates. This approach was used to fabricate a 32-bit Wallace tree multiplier and a 32-bit Booth multiplier in 65nm LP technology. Simulation and chip measurements show more than 30% improvement in dynamic power and more than 20% reduction in core area. The functional yield of the PNAND reduces with geometry and voltage scaling. The second part of this research investigates the use of two mechanisms to improve the robustness of the PNAND circuit architecture.
One is the use of forward and reverse body biases to change the device threshold, and the other is the use of RRAM devices for low-voltage operation. The third part of this research focuses on the design of flip-flops with non-volatile storage. Spin-transfer torque magnetic tunnel junctions (STT-MTJ) are integrated with both conventional D flip-flop and PNAND circuits to implement non-volatile logic (NVL). These non-volatile storage enhanced flip-flops are able to save the state of the system locally when a power interruption occurs. However, manufacturing variations in the STT-MTJs and in the CMOS transistors significantly reduce the yield, leading to an overly pessimistic design and, consequently, higher energy consumption. A detailed analysis of the design trade-offs in the driver circuitry for performing backup and restore, and a novel method to design the energy-optimal driver for a given yield, are presented. Efficient designs of two non-volatile flip-flop (NVFF) circuits are presented, in which the backup time is determined on a per-chip basis, minimizing the energy wastage while satisfying the yield constraint. To achieve a yield of 98%, the conventional approach would have to expend nearly 5X more energy than the minimum required, whereas the proposed tunable approach expends only 26% more energy than the minimum. A non-volatile threshold gate architecture, NV-TLFF, is designed with the same backup and restore circuitry in 65nm technology. The embedded logic in NV-TLFF compensates for the performance overhead of NVL. This leads to the possibility of zero-overhead non-volatile datapath circuits. An 8-bit multiply-and-accumulate (MAC) unit is designed to demonstrate the performance benefits of the proposed architecture.
Based on the results of HSPICE simulations, the MAC circuit with the proposed NV-TLFF cells is shown to consume at least 20% less power and area as compared to the circuit designed with conventional DFFs, without sacrificing any performance.

Dissertation/Thesis: Doctoral Dissertation, Electrical Engineering 201

Anchor: Architecture for Secure Non-Volatile Memories

Author: Swami Shivam
Publication date: 01/01/1711

The rapid growth of memory-intensive applications such as cloud computing, deep learning, and bioinformatics has propelled the memory industry to develop scalable, high-density, low-power non-volatile memory (NVM) technologies; however, computing systems that integrate these advanced NVMs are vulnerable to several security attacks that threaten (i) data confidentiality, (ii) data availability, and (iii) data integrity. This dissertation presents ANCHOR, which integrates four low-overhead, high-performance security solutions, SECRET, COVERT, ACME, and STASH, to thwart these attacks on NVM systems. SECRET is a low-cost security solution for data confidentiality in multi-/triple-level cell (i.e., MLC/TLC) NVMs. SECRET synergistically combines (i) smart encryption, which prevents re-encryption of unmodified or zero-words during a write-back, with (ii) XOR-based energy masking, which further optimizes NVM writes by transforming a high-energy ciphertext into a low-energy ciphertext. SECRET outperforms state-of-the-art encryption solutions, with the lowest write energy and latency, as well as the highest lifetime. COVERT and ACME complement SECRET to improve the system availability of counter mode encryption (CME). COVERT repurposes unused error correction resources to dynamically extend the time to counter overflow of fast-growing counters, thereby delaying frequent full memory re-encryption (system freeze). ACME performs counter write leveling (CWL) to further increase the time to counter overflow, and thereby delays the time to full memory re-encryption. COVERT+ACME achieves system availability of 99.999% during normal operation and 99.9% under a denial of memory service (DoMS) attack. In contrast, conventional CME achieves system availability of only 85.71% during normal operation and is rendered non-operational under a DoMS attack.
Finally, STASH is a comprehensive end-to-end security architecture for state-of-the-art smart hybrid memories (SHMs) that employ a smart DRAM cache with smart NVM-based main memory. STASH integrates (i) CME for data confidentiality, (ii) page-level Merkle Tree authentication for data integrity, (iii) recovery-compatible MT updates to withstand power/system failures, and (iv) page-migration friendly security meta-data management. For security guarantees equivalent to state-of-the-art, STASH reduces memory overhead by 12.7x, improves system performance by 65%, and increases NVM lifetime by 5x. This dissertation thus addresses the core security challenges of next-generation NVM-based memory systems. Directions for future research include (i) exploration of holistic architectures that ensure both security and reliability of smart memory systems, (ii) investigating applications of ANCHOR to reduce security overhead of Internet-of-Things, and (iii) extending ANCHOR to safeguard emerging non-volatile processors, especially in the light of advanced attacks like Spectre and Meltdown