PERSES: Data layout for low impact failures
Growth in disk capacity continues to outpace advances in read speed and device reliability. This has led to storage systems spending increasing amounts of time in a degraded state while failed disks are reconstructed. Users and applications that do not use the data on the failed or degraded drives are negligibly impacted by the failure, increasing the perceived performance of the system. We leverage this observation with PERSES, a statistical data allocation scheme that reduces the performance impact of reconstruction after disk failure. PERSES reduces degradation from the user's perspective by clustering data on disks such that data with a high probability of co-access is placed on the same device as often as possible. Trace-driven simulations show that, by laying out data with PERSES, we can reduce the perceived time lost due to failure over three years by up to 80% compared to arbitrary allocation.
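The clustering idea can be sketched in a few lines. The greedy pairing below is an illustrative stand-in for PERSES's statistical allocator, not the paper's actual algorithm: co-access counts come from a trace, the most co-accessed files are merged into clusters, and clusters are packed onto the least-loaded disk.

```python
from collections import defaultdict

def co_access_counts(trace):
    """Count how often each pair of files appears in the same access window."""
    counts = defaultdict(int)
    for window in trace:                 # each window: a set of co-accessed files
        files = sorted(window)
        for i in range(len(files)):
            for j in range(i + 1, len(files)):
                counts[(files[i], files[j])] += 1
    return counts

def allocate(trace, files, n_disks):
    """Greedily merge the most co-accessed files, then pack clusters onto disks."""
    cap = len(files) // n_disks + 1      # rough per-disk limit
    cluster = {f: {f} for f in files}
    for (a, b), _ in sorted(co_access_counts(trace).items(), key=lambda kv: -kv[1]):
        ca, cb = cluster[a], cluster[b]
        if ca is not cb and len(ca) + len(cb) <= cap:
            ca |= cb
            for f in cb:
                cluster[f] = ca
    disks, seen = [[] for _ in range(n_disks)], set()
    for f in files:
        c = cluster[f]
        if id(c) not in seen:
            seen.add(id(c))
            min(disks, key=len).extend(sorted(c))
    return disks

trace = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "b"}]
disks = allocate(trace, ["a", "b", "c", "d"], n_disks=2)
# "a"/"b" and "c"/"d" land on separate disks: losing either disk
# leaves the other working set fully intact.
```

With this layout, a failure of one disk degrades only the users whose working set lives there, which is exactly the effect the abstract measures.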
Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques
File system optimization is the most common task in the file system field and is usually seen as the key file system problem; it is certainly the dominant activity in commercial development. The problem of developing a new file system architecture arises more frequently in academia. End users tend to treat performance as the key problem of file systems' evolution as a technology, an understanding that stems from the common view of persistent memory as a slow subsystem. As a result, the problem of improving data-processing performance is treated as a problem of file system performance optimization. However, the evolution of physical technologies for persistent data storage requires significant changes in the concepts and approaches behind file systems' internal techniques. Generally speaking, merely trying to improve file system efficiency cannot resolve every issue facing file systems as a technological direction; on the contrary, it can impede the evolution of file system technology as a whole. It is impossible to satisfy end users' expectations by means of file system optimization alone. Without revolutionary new file system approaches, new persistent storage technologies may call the very necessity of file systems into question. However, a file system embodies a paradigm of information structuring that is very important to the end user as a human being. Two classes of tasks need to be distinguished: (1) optimization tasks and (2) tasks of elaborating a new architectural vision or paradigm. Frequently, though, a project goal that really requires elaborating a new paradigm degenerates into an optimization task. End-user expectations are a complex and contradictory set of requirements, and optimization tasks alone cannot address all of the end user's current needs in the file system field; those expectations require resolving tasks of elaborating a new architectural vision or paradigm.
Mirrored and Hybrid Disk Arrays: Organization, Scheduling, Reliability, and Performance
Basic mirroring (BM), classified as RAID level 1, replicates data on two disks, thus doubling disk access bandwidth for read requests. RAID1/0 is an array of BM pairs with balanced loads due to striping. When a disk fails, the read load on its pair is doubled, which halves the maximum attainable bandwidth. We review RAID1 organizations that attain a balanced load upon disk failure but, as shown by reliability analysis, tend to be less reliable than RAID1/0. Hybrid disk arrays, which store XORed instead of replicated data, tend to have higher reliability than mirrored disks but incur a higher overhead in updating data. Read response time can be improved by processing reads at a higher priority than writes, since reads have a direct effect on application response time. Shortest-seek-distance and affinity-based routing both shorten seek time, and anticipatory arm placement positions arms optimally to minimize seek distance. An analysis of RAID1 in normal, degraded, and rebuild modes is provided to quantify RAID1/0 performance. We compare the reliability of mirrored disk organizations against each other, hybrid disks, and erasure-coded disk arrays.
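Shortest-seek routing in a mirrored pair is easy to sketch. The class below is a minimal illustration, not the paper's analytic model: reads go to whichever live replica's arm is closer to the target cylinder, and degraded mode falls back to the surviving disk.

```python
class MirrorPair:
    """RAID1 pair with shortest-seek-distance read routing."""
    def __init__(self, cylinders=1000):
        self.cylinders = cylinders
        self.arm = [0, 0]             # current arm cylinder per replica
        self.alive = [True, True]

    def read(self, cyl):
        """Route a read to the live replica with the shorter seek;
        returns (chosen disk, seek distance)."""
        live = [d for d in (0, 1) if self.alive[d]]
        if not live:
            raise IOError("both replicas failed")
        d = min(live, key=lambda d: abs(self.arm[d] - cyl))
        seek = abs(self.arm[d] - cyl)
        self.arm[d] = cyl
        return d, seek

pair = MirrorPair()
pair.arm = [100, 900]
assert pair.read(150) == (0, 50)      # disk 0's arm is closer
assert pair.read(800) == (1, 100)     # disk 1's arm is closer
pair.alive[0] = False                 # failure: all reads now hit disk 1
assert pair.read(150)[0] == 1
```

The degraded branch makes the abstract's point concrete: with one replica gone, every read lands on the survivor, doubling its load.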
Authenticated Key-Value Stores with Hardware Enclaves
Authenticated data storage on an untrusted platform is an important computing paradigm for cloud applications ranging from big-data outsourcing, to cryptocurrency, to certificate transparency logs. These modern applications increasingly feature update-intensive workloads, whereas existing authenticated data structures (ADSs) designed around in-place updates handle such workloads inefficiently. In this paper, we address this issue and propose a novel authenticated log-structured merge tree (eLSM) based key-value store that leverages Intel SGX enclaves.
We present a system design that runs the code of the eLSM store inside an enclave. To circumvent the limited enclave memory (128 MB on the latest Intel CPUs), we propose to place the memory buffer of the eLSM store outside the enclave and protect it with a new authenticated data structure that digests individual LSM-tree levels. We design protocols to support query authentication for data integrity, completeness (under range queries), and freshness. The proofs in our protocol are kept small by including only the Merkle proofs at selected levels.
We implement eLSM on top of Google LevelDB and Facebook RocksDB with minimal code change and performance interference. We evaluate the performance of eLSM under the YCSB workload benchmark and show a performance advantage of up to 4.5X. Comment: 18 pages.
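The level-digesting idea can be illustrated with a per-level Merkle root. This sketch assumes a plain list of key-value pairs per level and SHA-256; eLSM's real structure, proof format, and enclave interaction are more involved.

```python
import hashlib

def h(*parts):
    """SHA-256 over the concatenation of byte/str parts."""
    m = hashlib.sha256()
    for p in parts:
        m.update(p if isinstance(p, bytes) else p.encode())
    return m.digest()

def merkle_root(leaves):
    """Merkle root over (key, value) leaves; duplicates the last node on odd levels."""
    nodes = [h(k, v) for k, v in leaves]
    if not nodes:
        return h(b"empty")
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        nodes = [h(nodes[i], nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

# One digest per LSM level; the store's trusted summary is just this list.
levels = [
    [("k1", "v1"), ("k2", "v2")],                  # newest (memtable-like) level
    [("k1", "old"), ("k3", "v3"), ("k4", "v4")],   # older, larger level
]
roots = [merkle_root(level) for level in levels]
# A lookup proof for "k3" only needs Merkle siblings inside level 1
# plus the short list of level roots -- not a path over the whole store.
```

Digesting each level separately is what lets the proof include only the levels a query actually touches, which is how the abstract's proofs stay small.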
Page Cache Attacks
We present a new hardware-agnostic side-channel attack that targets one of
the most fundamental software caches in modern computer systems: the operating
system page cache. The page cache is a pure software cache that contains all
disk-backed pages, including program binaries, shared libraries, and other
files, and our attacks thus work across cores and CPUs. Our side-channel
permits unprivileged monitoring of some memory accesses of other processes,
with a spatial resolution of 4KB and a temporal resolution of 2 microseconds on
Linux (restricted to 6.7 measurements per second) and 466 nanoseconds on
Windows (restricted to 223 measurements per second); this is roughly the same
order of magnitude as the current state-of-the-art cache attacks. We
systematically analyze our side channel by demonstrating different local
attacks, including a sandbox-bypassing high-speed covert channel, timed
user-interface redressing attacks, and an attack recovering automatically
generated temporary passwords. We further show that we can trade off the side
channel's hardware agnostic property for remote exploitability. We demonstrate
this via a low profile remote covert channel that uses this page-cache
side-channel to exfiltrate information from a malicious sender process through
innocuous server requests. Finally, we propose mitigations for some of our
attacks, which have been acknowledged by operating system vendors and slated
for future security patches.
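The covert-channel primitive can be modeled abstractly. The code below is a conceptual simulation, not an exploit: a real receiver would query residency with `mincore()` on Linux or `QueryWorkingSetEx()` on Windows, as the paper's attacks do, rather than with the toy `PageCache` class assumed here.

```python
class PageCache:
    """Toy stand-in for the OS page cache. Real attacks query residency
    with mincore() on Linux or QueryWorkingSetEx() on Windows."""
    def __init__(self):
        self.resident = set()
    def access(self, page):        # reading a disk-backed file caches its pages
        self.resident.add(page)
    def evict(self, page):         # attacker-forced eviction (e.g. memory pressure)
        self.resident.discard(page)
    def is_resident(self, page):
        return page in self.resident

def send(cache, pages, bits):
    """Sender encodes one bit per agreed-upon shared, disk-backed page."""
    for page, bit in zip(pages, bits):
        if bit:
            cache.access(page)
        else:
            cache.evict(page)

def receive(cache, pages):
    """Receiver decodes by observing residency -- no shared memory needed."""
    return [1 if cache.is_resident(p) else 0 for p in pages]

cache = PageCache()
pages = [f"libshared.so:page{i}" for i in range(8)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
send(cache, pages, message)
assert receive(cache, pages) == message
```

Because both parties only touch a shared disk-backed file, the channel works across processes, cores, and CPUs, which is why the attack is hardware-agnostic.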
Simulating Data Access Profiles of Computational Jobs in Data Grids
The data access patterns of applications running in computing grids are
changing due to the recent proliferation of high speed local and wide area
networks. The data-intensive jobs are no longer strictly required to run at the
computing sites, where the respective input data are located. Instead, jobs may
access the data employing arbitrary combinations of data-placement, stage-in
and remote data access. These data access profiles exhibit partially
non-overlapping throughput bottlenecks. This fact can be exploited in order to
minimize the time jobs spend waiting for input data. In this work we present a
novel grid computing simulator, which puts a heavy emphasis on the various data
access profiles. The fundamental assumptions underlying our simulator are
justified by empirical experiments performed in the Worldwide LHC Computing
Grid (WLCG) at CERN. We demonstrate how to calibrate the simulator parameters
in accordance with the true system using posterior inference with
likelihood-free Markov Chain Monte Carlo. Thereafter, we validate the
simulator's output with respect to an authentic production workload from WLCG,
demonstrating its remarkable accuracy.
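The calibration step can be sketched as ABC-style likelihood-free MCMC. Everything below is a toy under stated assumptions: the `simulator` function stands in for the grid simulator, the summary statistic is a mean job wait time, and the acceptance rule is simplified (the prior and proposal-ratio terms of a full ABC-MCMC kernel are omitted).

```python
import random

def simulator(theta, n=200, rng=random):
    """Toy grid 'simulator': mean job wait time under parameter theta."""
    return sum(rng.expovariate(1.0 / theta) for _ in range(n)) / n

def abc_mcmc(observed, steps=1500, eps=0.4, seed=1):
    """Likelihood-free chain: accept a proposed parameter when the
    simulated summary lands within eps of the observed one."""
    rng = random.Random(seed)
    theta, chain = observed, []          # pilot start near the data
    for _ in range(steps):
        prop = max(theta + rng.gauss(0.0, 0.3), 1e-3)
        if abs(simulator(prop, rng=rng) - observed) < eps:
            theta = prop                 # accept: simulator matches observation
        chain.append(theta)
    return chain

observed = simulator(3.0, rng=random.Random(0))   # pretend true parameter is 3.0
chain = abc_mcmc(observed)
estimate = sum(chain[500:]) / len(chain[500:])    # posterior mean after burn-in
```

The same accept-if-close loop is what calibrating against WLCG measurements amounts to, with the real simulator and real summary statistics in place of the toy ones.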
ReCA: an Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
In recent years, SSDs have gained tremendous attention in computing and
storage systems due to significant performance improvement over HDDs. The cost
per capacity of SSDs, however, prevents them from entirely replacing HDDs in
such systems. One approach to take effective advantage of SSDs is to use them as a caching layer that stores performance-critical data blocks, reducing the number of accesses to the disk subsystem. Due to characteristics of flash-based SSDs such as limited write endurance and long write latency, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous caching techniques are
optimized towards only one type of application, which affects both generality
and applicability. In addition, they are not adaptive when the workload pattern
changes over time. This paper presents an efficient Reconfigurable Cache
Architecture (ReCA) for storage systems using a comprehensive workload
characterization to find an optimal cache configuration for I/O intensive
applications. For this purpose, we first investigate various types of I/O
workloads and classify them into five major classes. Based on this
characterization, an optimal cache configuration is presented for each class of
workloads. Then, using the main features of each class, we continuously monitor an application's characteristics at runtime and reconfigure the cache organization if the application moves from one workload class to another. Reconfiguration is performed online, and the set of workload classes can be extended to cover emerging I/O workloads, maintaining ReCA's efficiency as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime by up to 24% and 33%, respectively.
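A minimal sketch of the monitor-classify-reconfigure loop follows; the class names, three-class classifier, and thresholds are invented for illustration (ReCA uses five workload classes and a richer characterization).

```python
from collections import deque

# Illustrative cache configurations, one per (assumed) workload class.
CONFIGS = {
    "read_random":     {"policy": "LRU", "write_cache": False},
    "read_sequential": {"policy": "MRU", "write_cache": False},
    "write_heavy":     {"policy": "LRU", "write_cache": True},
}

def classify(window):
    """Rough stand-in for ReCA's workload characterization."""
    writes = sum(1 for op, _ in window if op == "W")
    if writes / len(window) > 0.5:
        return "write_heavy"
    addrs = [a for op, a in window if op == "R"]
    runs = sum(1 for x, y in zip(addrs, addrs[1:]) if y == x + 1)
    if addrs and runs / max(len(addrs) - 1, 1) > 0.8:
        return "read_sequential"
    return "read_random"

class ReconfigurableCache:
    def __init__(self, window_size=100):
        self.window = deque(maxlen=window_size)
        self.current = "read_random"

    def observe(self, op, addr):
        """Record one request; reconfigure when the class changes."""
        self.window.append((op, addr))
        if len(self.window) == self.window.maxlen:
            cls = classify(self.window)
            if cls != self.current:
                self.current = cls       # online reconfiguration point
        return CONFIGS[self.current]

cache = ReconfigurableCache(window_size=10)
for i in range(10):
    cache.observe("R", i)                # sequential read phase
assert cache.current == "read_sequential"
for i in range(10):
    cache.observe("W", i)                # workload shifts to writes
assert cache.current == "write_heavy"
```

The key design point the abstract describes is visible in `observe`: classification runs continuously over a sliding window, so the switch happens online rather than at deployment time.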
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Typical large-scale recommender systems use deep learning models that are
stored on a large amount of DRAM. These models often rely on embeddings, which
consume most of the required memory. We present Bandana, a storage system that
reduces the DRAM footprint of embeddings, by using Non-volatile Memory (NVM) as
the primary storage medium, with a small amount of DRAM as cache. The main
challenge in storing embeddings on NVM is its limited read bandwidth compared
to DRAM. Bandana uses two primary techniques to address this limitation: first,
it stores embedding vectors that are likely to be read together in the same
physical location, using hypergraph partitioning, and second, it decides the
number of embedding vectors to cache in DRAM by simulating dozens of small
caches. These techniques allow Bandana to increase the effective read bandwidth
of NVM by 2-3x and thereby significantly reduce the total cost of ownership.
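The second technique, sizing the DRAM cache by simulating small caches, can be sketched directly: replay a sampled trace through several simulated LRU caches and pick the smallest size whose hit rate is near the best. The trace, sizes, tolerance, and LRU policy here are illustrative assumptions, not Bandana's exact parameters.

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay `trace` through a simulated LRU cache of `capacity` slots."""
    cache, hits = OrderedDict(), 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[key] = True
    return hits / len(trace)

def pick_cache_size(trace, sizes, tolerance=0.02):
    """Smallest simulated cache within `tolerance` of the best hit rate."""
    rates = {s: lru_hit_rate(trace, s) for s in sizes}
    best = max(rates.values())
    return min(s for s in sizes if rates[s] >= best - tolerance)

# A toy trace: a small hot set followed by a wider working set.
trace = [i % 4 for i in range(100)] + [i % 32 for i in range(100)]
size = pick_cache_size(trace, sizes=[4, 8, 16, 32, 64])
assert size == 32   # 64 slots buy no extra hits on this trace
```

Running dozens of such miniature simulations is cheap, and it avoids over-provisioning DRAM for embedding vectors that NVM can serve just as well.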
A Survey on Tiering and Caching in High-Performance Storage Systems
Although every storage technology yet invented represents a big step toward perfection, none of them is spotless. Essential data-store requirements such as performance, availability, and recoverability have never been met together in a single, economically affordable medium. One of the most influential factors is price, so there has always been a trade-off between having a desired set of storage properties and the cost. To address this issue, a network of various types of storage media is used to deliver the high performance of expensive devices, such as solid state drives and non-volatile memories, along with the high capacity of inexpensive ones, like hard disk drives. In software, caching and tiering are long-established concepts for handling file operations, moving data automatically within such a storage network, and managing data backup on low-cost media. Intelligently moving data between devices based on need is the key insight here. In this survey, we discuss recent research on improving high-performance storage systems with caching and tiering techniques. Comment: Ph.D. research exam report.
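The tiering idea the survey covers can be sketched as a heat-based promotion policy; the class below is a toy illustration (tier names, the access counter, and the rebalance rule are assumptions, not from any particular surveyed system).

```python
from collections import Counter

class TieredStore:
    """Heat-based tiering: hot blocks live on the fast tier (SSD/NVM),
    everything else stays on the capacity tier (HDD)."""
    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = set()        # blocks currently on the fast tier
        self.heat = Counter()    # per-block access counts

    def access(self, block):
        self.heat[block] += 1
        return "fast" if block in self.fast else "slow"

    def rebalance(self):
        """Periodic migration: promote the hottest blocks, demote the rest."""
        self.fast = {b for b, _ in self.heat.most_common(self.fast_capacity)}

store = TieredStore(fast_capacity=2)
for block in ["a"] * 5 + ["b"] * 3 + ["c"]:
    store.access(block)
store.rebalance()
assert store.fast == {"a", "b"}          # two hottest blocks promoted
assert store.access("c") == "slow"       # cold block stays on HDD
```

Caching differs from this tiering sketch mainly in that cached blocks are copies (the HDD keeps the authoritative data), whereas tiering migrates the only copy between devices.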
knor: A NUMA-Optimized In-Memory, Distributed and Semi-External-Memory k-means Library
k-means is one of the most influential and utilized machine learning
algorithms. Its computation limits the performance and scalability of many
statistical analysis and machine learning tasks. We rethink and optimize
k-means in terms of modern NUMA architectures to develop a novel
parallelization scheme that delays and minimizes synchronization barriers. The
\textit{k-means NUMA Optimized Routine} (\textsf{knor}) library has (i)
in-memory (\textsf{knori}), (ii) distributed memory (\textsf{knord}), and (iii)
semi-external memory (\textsf{knors}) modules that radically improve the
performance of k-means for varying memory and hardware budgets. \textsf{knori}
boosts performance for single machine datasets by an order of magnitude or
more. \textsf{knors} improves the scalability of k-means on a memory budget
using SSDs. \textsf{knors} scales to billions of points on a single machine,
using a fraction of the resources that distributed in-memory systems require.
\textsf{knord} retains \textsf{knori}'s performance characteristics, while
scaling in-memory through distributed computation in the cloud. \textsf{knor}
modifies Elkan's triangle inequality pruning algorithm such that we utilize it
on billion-point datasets without the significant memory overhead of the
original algorithm. We demonstrate \textsf{knor} outperforms distributed commercial products like H2O, Turi (formerly Dato, GraphLab) and Spark's MLlib by more than an order of magnitude for large datasets.
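The triangle-inequality pruning that \textsf{knor} adapts can be sketched in the assignment step: if the distance between a point's current center and another center is at least twice the point's distance to its current center, the other center cannot be closer, so its distance is never computed. This toy keeps only the center-center distance matrix, in the spirit of avoiding Elkan's full per-point bound matrices; it is a sketch, not \textsf{knor}'s implementation.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def assign(points, centers):
    """One k-means assignment pass with center-center pruning."""
    cc = [[dist(a, b) for b in centers] for a in centers]
    labels, skipped = [], 0
    for p in points:
        best, d_best = 0, dist(p, centers[0])
        for j in range(1, len(centers)):
            # Triangle inequality: d(p, c_j) >= cc[best][j] - d_best,
            # so if cc[best][j] >= 2 * d_best, center j cannot be closer.
            if cc[best][j] >= 2 * d_best:
                skipped += 1
                continue
            d = dist(p, centers[j])
            if d < d_best:
                best, d_best = j, d
        labels.append(best)
    return labels, skipped

points = [(0.1, 0.0), (0.2, 0.1), (10.0, 10.0), (9.9, 10.1)]
centers = [(0.0, 0.0), (10.0, 10.0)]
labels, skipped = assign(points, centers)
assert labels == [0, 0, 1, 1]
assert skipped == 2      # the near-origin points never measured center 1
```

With well-separated clusters most distance computations are pruned, which is what makes the approach viable at billion-point scale without per-point bound storage.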