Search CORE

476 research outputs found

Recommended from our members

QoS-aware mechanisms for improving cost-efficiency of datacenters

Author: Zhu Haishan
Publication venue
Publication date: 22/01/2021
Field of study

Warehouse Scale Computers (WSCs) promise high cost-efficiency by amortizing power, cooling, and management overheads. WSCs today host a large variety of jobs with two broad performance requirements categories: latency-critical (LC) and best-effort (BE). Ideally, to fully utilize all hardware resources, WSC operators can simply fill all the nodes with computing jobs. Unfortunately, because colocated jobs contend for shared resources, systems with high loads often experience performance degradation, which negatively impacts the Quality of Service (QoS) for LC jobs. In fact, service providers usually over-provision resources to avoid any interference with LC jobs, leading to significant resource inefficiencies. In this dissertation, I explore opportunities across different system-abstraction layers to improve the cost-efficiency of dataceters by increasing resource utilization of WSCs with little or no impact on the performance of LC jobs. The dissertation has three main components. First, I explore opportunities to improve the throughput of multicore systems by reducing the performance variation of LC jobs. The main insight is that by reshaping the latency distribution curve, performance headroom of LC jobs can be effectively converted to improved BE throughput. I develop, implement, and evaluate a runtime system that achieves this goal with existing hardware. I leverage the cache partitioning, per-core frequency scaling, and thread masking of server processors. Evaluation results show the proposed solution enables 30% higher system throughput compared to solutions proposed in prior works while maintaining at least as good QoS for LC jobs. Second, I study resource contention in near-future heterogeneous memory architectures (HMA). This study is motivated by recent developments in non-volatile memory (NVM) technologies, which enable higher storage density at the cost of same performance. To understand the performance and QoS impact of HMAs, I design and implement a performance emulator in the Linux kernel that runs unmodified workloads with high accuracy, low overhead, and complete transparency. I further propose and evaluate multiple data and resource management QoS mechanisms, such as locality-aware page admission, occupancy management, and write buffer jailing. Third, I focus on accelerated machine learning (ML) systems. By profiling the performance of production workloads and accelerators, I show that accelerated ML tasks are highly sensitive to main memory interference due to fine-grained interaction between CPU and accelerator tasks. As a result, memory resource contention can significantly decreases the performance and efficiency gains of accelerators. I propose a runtime system that leverages existing hardware capabilities and show 17% higher system efficiency compared to previous approaches. This study further exposes opportunities for future processor architecturesElectrical and Computer Engineerin

Texas ScholarWorks

Exploiting Nanoelectronic Properties of Memory Chips for Prevention of IC Counterfeiting

Author: Chakraborty Supriya
Das Tamoghno
Suri Manan
Publication venue
Publication date: 09/09/2022
Field of study

This study presents a methodology for anticounterfeiting of Non-Volatile Memory (NVM) chips. In particular, we experimentally demonstrate a generalized methodology for detecting (i) Integrated Circuit (IC) origin, (ii) recycled or used NVM chips, and (iii) identification of used locations (addresses) in the chip. Our proposed methodology inspects latency and variability signatures of Commercial-Off-The-Shelf (COTS) NVM chips. The proposed technique requires low-cycle (~100) pre-conditioning and utilizes Machine Learning (ML) algorithms. We observe different trends in evolution of latency (sector erase or page write) with cycling on different NVM technologies from different vendors. ML assisted approach is utilized for detecting IC manufacturers with 95.1 % accuracy obtained on prepared test dataset consisting of 3 different NVM technologies including 6 different manufacturers (9 types of chips).Comment: 5 pages, 5 figures, accepted in IEEE NANO 202

arXiv.org e-Print Archive

Software-Managed Read and Write Wear-Leveling for Non-Volatile Main Memory

Author: Amrouch H.
Bauer L.
Chen J.-J.
Chen K.-H.
Genssler P. R.
Hakert C.
Henkel J.
Schirmeier H.
Von Der Brüggen G.
Publication venue: Association for Computing Machinery
Publication date: 07/03/2022
Field of study

In-memory wear-leveling has become an important research field for emerging non-volatile main memories over the past years. Many approaches in the literature perform wear-leveling by making use of special hardware. Since most non-volatile memories only wear out from write accesses, the proposed approaches in the literature also usually try to spread write accesses widely over the entire memory space. Some non-volatile memories, however, also wear out from read accesses, because every read causes a consecutive write access. Software-based solutions only operate from the application or kernel level, where read and write accesses are realized with different instructions and semantics. Therefore different mechanisms are required to handle reads and writes on the software level. First, we design a method to approximate read and write accesses to the memory to allow aging aware coarse-grained wear-leveling in the absence of special hardware, providing the age information. Second, we provide specific solutions to resolve access hot-spots within the compiled program code (text segment) and on the application stack. In our evaluation, we estimate the cell age by counting the total amount of accesses per cell. The results show that employing all our methods improves the memory lifetime by up to a factor of 955×

KITopen

Novel Cryptographic Authentication Mechanisms for Supply Chains and OpenStack

Author: Rahaeimehr Reza
Publication venue: OpenCommons@UConn
Publication date: 02/12/2019
Field of study

In this dissertation, first, we studied the Radio-Frequency Identification (RFID) tag authentication problem in supply chains. RFID tags have been widely used as a low-cost wireless method for detecting counterfeit product injection in supply chains. We open a new direction toward solving this problem by using the Non-Volatile Memory (NVM) of recent RFID tags. We propose a method based on this direction that significantly improves the availability of the system and costs less. In our method, we introduce the notion of Software Unclonability, which is a kind of one-time MAC for authenticating random inputs. Also, we introduce three lightweight constructions that are software unclonable. Second, we focus on OpenStack that is a prestigious open-source cloud platform. OpenStack takes advantage of some tokening mechanisms to establish trust between its modules and users. It turns out that when an adversary captures user tokens by exploiting a bug in a module, he gets extreme power on behalf of users. Here, we propose a novel tokening mechanism that ties commands to tokens and enables OpenStack to support short life tokens while it keeps the performance up

DigitalCommons@UConn

OpenCommons at University of Connecticut

Accelerating Time Series Analysis via Processing using Non-Volatile Memories

Author: Fernandez Ivan
Ghiasi Nika Mansouri
Giannoula Christina
Gutierrez Eladio
Gómez-Luna Juan
Manglik Aditya
Mutlu Onur
Plata Oscar
Quislant Ricardo
Publication venue
Publication date: 08/11/2022
Field of study

Time Series Analysis (TSA) is a critical workload for consumer-facing devices. Accelerating TSA is vital for many domains as it enables the extraction of valuable information and predict future events. The state-of-the-art algorithm in TSA is the subsequence Dynamic Time Warping (sDTW) algorithm. However, sDTW's computation complexity increases quadratically with the time series' length, resulting in two performance implications. First, the amount of data parallelism available is significantly higher than the small number of processing units enabled by commodity systems (e.g., CPUs). Second, sDTW is bottlenecked by memory because it 1) has low arithmetic intensity and 2) incurs a large memory footprint. To tackle these two challenges, we leverage Processing-using-Memory (PuM) by performing in-situ computation where data resides, using the memory cells. PuM provides a promising solution to alleviate data movement bottlenecks and exposes immense parallelism. In this work, we present MATSA, the first MRAM-based Accelerator for Time Series Analysis. The key idea is to exploit magneto-resistive memory crossbars to enable energy-efficient and fast time series computation in memory. MATSA provides the following key benefits: 1) it leverages high levels of parallelism in the memory substrate by exploiting column-wise arithmetic operations, and 2) it significantly reduces the data movement costs performing computation using the memory cells. We evaluate three versions of MATSA to match the requirements of different environments (e.g., embedded, desktop, or HPC computing) based on MRAM technology trends. We perform a design space exploration and demonstrate that our HPC version of MATSA can improve performance by 7.35x/6.15x/6.31x and energy efficiency by 11.29x/4.21x/2.65x over server CPU, GPU and PNM architectures, respectively

arXiv.org e-Print Archive

Repository for Publications and Research Data

Memristive Non-Volatile Memory Based on Graphene Materials

Author: Huang Yanbo
Li Puzhuo
Mitrovic Ivona Z
Qi Yanfei
Shen Zongjie
Wen Jiacheng
Yang Li
Zhao Cezhou
Zhao Chun
Publication venue: 'MDPI AG'
Publication date: 01/04/2020
Field of study

Resistive random access memory (RRAM), which is considered as one of the most promising next-generation non-volatile memory (NVM) devices and a representative of memristor technologies, demonstrated great potential in acting as an artificial synapse in the industry of neuromorphic systems and artificial intelligence (AI), due its advantages such as fast operation speed, low power consumption, and high device density. Graphene and related materials (GRMs), especially graphene oxide (GO), acting as active materials for RRAM devices, are considered as a promising alternative to other materials including metal oxides and perovskite materials. Herein, an overview of GRM-based RRAM devices is provided, with discussion about the properties of GRMs, main operation mechanisms for resistive switching (RS) behavior, figure of merit (FoM) summary, and prospect extension of GRM-based RRAM devices. With excellent physical and chemical advantages like intrinsic Young’s modulus (1.0 TPa), good tensile strength (130 GPa), excellent carrier mobility (2.0 × 105 cm2∙V−1∙s−1), and high thermal (5000 Wm−1∙K−1) and superior electrical conductivity (1.0 × 106 S∙m−1), GRMs can act as electrodes and resistive switching media in RRAM devices. In addition, the GRM-based interface between electrode and dielectric can have an effect on atomic diffusion limitation in dielectric and surface effect suppression. Immense amounts of concrete research indicate that GRMs might play a significant role in promoting the large-scale commercialization possibility of RRAM devices

University of Liverpool Repository