    Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques

    File system optimization is the most common task in the file system field and is usually seen as its key problem; in commercial development, optimization clearly dominates, while the problem of developing a new file system architecture arises more frequently in academia. End users tend to treat performance as the key problem of the file system's evolution as a technology, an understanding that stems from the common view of persistent storage as a slow subsystem. As a result, the problem of improving data-processing performance is treated as a problem of file system performance optimization. However, the evolution of physical technologies for persistent data storage requires significant changes in the concepts and approaches behind file systems' internal techniques. Generally speaking, trying only to improve file system efficiency cannot resolve all the issues of file systems as a technological direction; moreover, it can impede the evolution of file system technology as a whole. It is impossible to satisfy end users' expectations by means of file system optimization alone, and new persistent storage technologies may call the very necessity of file systems into question unless revolutionary new file system approaches are proposed. At the same time, a file system embodies a paradigm of information structuring that is very important for the end user as a human being. Two classes of tasks therefore need to be distinguished: (1) optimization tasks and (2) tasks of elaborating a new architecture vision or paradigm. Frequently, however, a project goal that really calls for elaborating a new paradigm degenerates into an optimization task. End users' expectations form a complex and contradictory set of requirements, and optimization tasks alone cannot satisfy all of their current needs in the file system field; meeting these expectations requires elaborating a new architecture vision or paradigm.

    Mirrored and Hybrid Disk Arrays: Organization, Scheduling, Reliability, and Performance

    Basic mirroring (BM), classified as RAID level 1, replicates data on two disks, thus doubling the disk access bandwidth for read requests. RAID1/0 is an array of BM pairs with loads balanced by striping. When a disk fails, the read load on its pair doubles, which halves the maximum attainable bandwidth. We review RAID1 organizations that attain a balanced load upon disk failure but, as shown by reliability analysis, tend to be less reliable than RAID1/0. Hybrid disk arrays, which store XORed rather than replicated data, tend to have higher reliability than mirrored disks but incur a higher overhead in updating data. Read request response time can be improved by processing reads at a higher priority than writes, since reads have a direct effect on application response time. Shortest-seek-distance and affinity-based routing both shorten seek time, and anticipatory arm placement positions the arms optimally to minimize seek distance. An analysis of RAID1 in normal, degraded, and rebuild modes is provided to quantify RAID1/0 performance. We compare the reliability of mirrored disk organizations against each other and against hybrid disks and erasure-coded disk arrays.
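    The effect of shortest-seek-distance read routing in a mirrored pair, and the load shift when one replica fails, can be illustrated with a small simulation. The cylinder count, workload, and MirroredPair class below are illustrative assumptions, not the paper's analytic model.

```python
# Minimal sketch: shortest-seek-distance routing of reads in a basic
# mirrored (BM) pair, and the load shift when one disk fails.
# Disk positions are abstract cylinder numbers.
import random

class MirroredPair:
    def __init__(self, cylinders=10_000):
        self.cylinders = cylinders
        self.head = [0, 0]          # current arm position of each replica
        self.reads = [0, 0]         # reads served by each replica
        self.failed = None          # index of a failed disk, if any

    def read(self, cylinder):
        # Candidate replicas: both in normal mode, one in degraded mode.
        candidates = [d for d in (0, 1) if d != self.failed]
        # Shortest-seek-distance routing: pick the replica whose arm is closer.
        disk = min(candidates, key=lambda d: abs(self.head[d] - cylinder))
        self.head[disk] = cylinder
        self.reads[disk] += 1

random.seed(0)
pair = MirroredPair()
workload = [random.randrange(pair.cylinders) for _ in range(100_000)]

for c in workload[:50_000]:
    pair.read(c)
print("normal mode   reads per disk:", pair.reads)

pair.failed = 1                     # disk 1 fails: its pair absorbs all reads
for c in workload[50_000:]:
    pair.read(c)
print("degraded mode reads per disk:", pair.reads)
```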

    Authenticated Key-Value Stores with Hardware Enclaves

    Authenticated data storage on an untrusted platform is an important computing paradigm for cloud applications ranging from big-data outsourcing to cryptocurrencies and certificate transparency logs. These modern applications increasingly feature update-intensive workloads, whereas existing authenticated data structures (ADSs) designed around in-place updates are inefficient for such workloads. In this paper, we address this issue and propose a novel authenticated log-structured merge tree (eLSM) based key-value store by leveraging Intel SGX enclaves. We present a system design that runs the eLSM store code inside an enclave. To circumvent the limited enclave memory (128 MB with the latest Intel CPUs), we propose to place the memory buffer of the eLSM store outside the enclave and protect the buffer with a new authenticated data structure that digests individual LSM-tree levels. We design protocols to support query authentication for data integrity, completeness (under range queries), and freshness. The proof in our protocol is kept small by including only the Merkle proofs at selected levels. We implement eLSM on top of Google LevelDB and Facebook RocksDB with minimal code change and performance interference. We evaluate the performance of eLSM under the YCSB workload benchmark and show a speedup of up to 4.5X.
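    The per-level digesting idea can be sketched as follows. The leaf encoding, Merkle-tree layout, and the authenticated_get helper are assumptions made for illustration; they omit the completeness and freshness proofs that the paper's protocols provide.

```python
# Minimal sketch: each LSM-tree level is digested by its own Merkle tree,
# so a point lookup only needs the Merkle proof of the level that holds the key.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    nodes = [h(l) for l in leaves]
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])                    # duplicate last node
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def merkle_proof(leaves, index):
    """Sibling hashes from leaf `index` up to the root."""
    nodes, proof = [h(l) for l in leaves], []
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        proof.append(nodes[index ^ 1])
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
        index //= 2
    return proof

def verify(leaf, index, proof, root):
    node = h(leaf)
    for sib in proof:
        node = h(node + sib) if index % 2 == 0 else h(sib + node)
        index //= 2
    return node == root

# One sorted run per level; key-value pairs encoded as leaves.
levels = [
    [(b"apple", b"1"), (b"kiwi", b"7")],                     # level 0 (newest)
    [(b"banana", b"2"), (b"cherry", b"3"), (b"plum", b"9")], # level 1
]
encode = lambda kv: kv[0] + b"=" + kv[1]
# Per-level roots, kept in trusted (enclave) memory.
roots = [merkle_root([encode(kv) for kv in lvl]) for lvl in levels]

def authenticated_get(key):
    for lvl_no, lvl in enumerate(levels):                    # newest level wins
        for i, kv in enumerate(lvl):
            if kv[0] == key:
                leaves = [encode(x) for x in lvl]
                return kv[1], lvl_no, i, merkle_proof(leaves, i)
    return None

value, lvl_no, i, proof = authenticated_get(b"cherry")
assert verify(encode((b"cherry", value)), i, proof, roots[lvl_no])
print("cherry ->", value, "verified against level", lvl_no)
```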

    Page Cache Attacks

    We present a new hardware-agnostic side-channel attack that targets one of the most fundamental software caches in modern computer systems: the operating system page cache. The page cache is a pure software cache that contains all disk-backed pages, including program binaries, shared libraries, and other files, and our attacks thus work across cores and CPUs. Our side channel permits unprivileged monitoring of some memory accesses of other processes, with a spatial resolution of 4 KB and a temporal resolution of 2 microseconds on Linux (restricted to 6.7 measurements per second) and 466 nanoseconds on Windows (restricted to 223 measurements per second); this is roughly the same order of magnitude as current state-of-the-art cache attacks. We systematically analyze our side channel by demonstrating different local attacks, including a high-speed covert channel that bypasses sandboxes, timed user-interface redressing attacks, and an attack that recovers automatically generated temporary passwords. We further show that we can trade the side channel's hardware-agnostic property for remote exploitability. We demonstrate this via a low-profile remote covert channel that uses the page-cache side channel to exfiltrate information from a malicious sender process through innocuous server requests. Finally, we propose mitigations for some of our attacks, which have been acknowledged by operating system vendors and slated for future security patches.
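    The residency-probing primitive that page-cache side channels build on can be sketched with the mincore(2) system call on Linux. The file path is only an example, the full attack additionally requires targeted eviction and careful timing, and the mitigations mentioned above may blunt this probe on patched kernels.

```python
# Sketch: per-page page-cache residency of a shared file, observed without
# reading the file, by mapping it and querying mincore(2) via ctypes.
import ctypes, ctypes.util, mmap, os

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
PAGE = mmap.PAGESIZE                       # typically 4096 bytes

def resident_pages(path):
    fd = os.open(path, os.O_RDONLY)
    size = os.fstat(fd).st_size
    addr = libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
    if addr in (None, ctypes.c_void_p(-1).value):
        raise OSError(ctypes.get_errno(), "mmap failed")
    npages = (size + PAGE - 1) // PAGE
    vec = (ctypes.c_ubyte * npages)()
    if libc.mincore(ctypes.c_void_p(addr), ctypes.c_size_t(size), vec) != 0:
        raise OSError(ctypes.get_errno(), "mincore failed")
    libc.munmap(ctypes.c_void_p(addr), ctypes.c_size_t(size))
    os.close(fd)
    # Bit 0 of each vector entry: the page is currently in the page cache.
    return [bool(v & 1) for v in vec]

# Example target: a binary that other processes may have touched recently.
pages = resident_pages("/bin/ls")
print(f"{sum(pages)}/{len(pages)} pages of /bin/ls are cached")
```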

    Simulating Data Access Profiles of Computational Jobs in Data Grids

    The data access patterns of applications running in computing grids are changing due to the recent proliferation of high-speed local and wide area networks. Data-intensive jobs are no longer strictly required to run at the computing sites where their input data are located; instead, jobs may access the data through arbitrary combinations of data placement, stage-in, and remote data access. These data access profiles exhibit partially non-overlapping throughput bottlenecks, a fact that can be exploited to minimize the time jobs spend waiting for input data. In this work we present a novel grid computing simulator that puts a heavy emphasis on these data access profiles. The fundamental assumptions underlying our simulator are justified by empirical experiments performed in the Worldwide LHC Computing Grid (WLCG) at CERN. We demonstrate how to calibrate the simulator parameters to the real system using posterior inference with likelihood-free Markov Chain Monte Carlo. Thereafter, we validate the simulator's output against an authentic production workload from the WLCG, demonstrating its remarkable accuracy.
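    The calibration idea can be illustrated with a toy likelihood-free (ABC-style) MCMC loop. The stand-in simulator, its single remote_read_penalty parameter, and the acceptance tolerance below are invented for illustration and are not the paper's model.

```python
# Toy sketch: ABC-style MCMC tunes a simulator parameter so that simulated
# job runtimes match an observed summary statistic from a workload trace.
import random, statistics

def simulate_jobs(remote_read_penalty, n=500):
    """Stand-in simulator: runtime = CPU work + penalized data access time."""
    runtimes = []
    for _ in range(n):
        compute = random.expovariate(1 / 60)      # seconds of CPU work
        transfer = random.expovariate(1 / 20)     # seconds of data access
        runtimes.append(compute + remote_read_penalty * transfer)
    return runtimes

def summary(runtimes):
    return statistics.mean(runtimes)

random.seed(0)
# Pretend this summary came from a real production trace.
observed = summary(simulate_jobs(remote_read_penalty=1.8))

def abc_mcmc(observed, steps=2000, tolerance=3.0):
    theta, chain = 1.0, []
    for _ in range(steps):
        proposal = abs(theta + random.gauss(0, 0.2))   # random-walk proposal
        distance = abs(summary(simulate_jobs(proposal)) - observed)
        if distance < tolerance:                       # accept if close enough
            theta = proposal
        chain.append(theta)
    return chain

chain = abc_mcmc(observed)
print("posterior mean of the penalty parameter:", statistics.mean(chain[500:]))
```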

    ReCA: an Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization

    In recent years, SSDs have gained tremendous attention in computing and storage systems due to their significant performance improvement over HDDs. The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer that stores performance-critical data blocks, reducing the number of accesses to the disk subsystem. Because of the characteristics of flash-based SSDs, such as limited write endurance and long latency on write operations, caching algorithms employed at the Operating System (OS) level need to take these characteristics into consideration. Previous caching techniques are optimized towards only one type of application, which affects both their generality and applicability, and they do not adapt when the workload pattern changes over time. This paper presents ReCA, an efficient Reconfigurable Cache Architecture for storage systems that uses a comprehensive workload characterization to find an optimal cache configuration for I/O-intensive applications. For this purpose, we first investigate various types of I/O workloads and classify them into five major classes. Based on this characterization, an optimal cache configuration is presented for each workload class. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime and reconfigure the cache organization if the application moves from one workload class to another. Cache reconfiguration is done online, and the workload classes can be extended to cover emerging I/O workloads so that efficiency is maintained as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime by up to 24% and 33%, respectively.
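    A much-simplified sketch of the reconfiguration loop is shown below. The class names, thresholds, and cache settings are illustrative assumptions only; the paper's five workload classes and their optimal configurations differ.

```python
# Sketch: monitor I/O features in a sliding window, classify the running
# workload, and switch the SSD cache configuration when the class changes.
from collections import deque
import random

CONFIGS = {
    # Illustrative cache settings per workload class (not the paper's).
    "read_random":     {"write_policy": "read-only cache", "block_kb": 4},
    "read_sequential": {"write_policy": "read-only cache", "block_kb": 128},
    "write_intensive": {"write_policy": "write-back",      "block_kb": 4},
    "mixed":           {"write_policy": "write-through",   "block_kb": 64},
}

def classify(window):
    read_ratio = sum(1 for io in window if io["op"] == "R") / len(window)
    avg_kb = sum(io["kb"] for io in window) / len(window)
    if read_ratio > 0.8:
        return "read_sequential" if avg_kb > 64 else "read_random"
    if read_ratio < 0.3:
        return "write_intensive"
    return "mixed"

class ReconfigurableCache:
    def __init__(self, window_size=1000):
        self.window = deque(maxlen=window_size)
        self.current = "mixed"

    def observe(self, io):
        self.window.append(io)
        if len(self.window) == self.window.maxlen:
            new_class = classify(self.window)
            if new_class != self.current:            # online reconfiguration
                self.current = new_class
                print("reconfigure ->", new_class, CONFIGS[new_class])

random.seed(1)
cache = ReconfigurableCache()
trace = ([{"op": "R", "kb": 4} for _ in range(1500)] +
         [{"op": random.choice("RW"), "kb": 8} for _ in range(1500)] +
         [{"op": "W", "kb": 4} for _ in range(1500)])
for io in trace:
    cache.observe(io)
```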

    Bandana: Using Non-volatile Memory for Storing Deep Learning Models

    Typical large-scale recommender systems use deep learning models that are stored in a large amount of DRAM. These models often rely on embeddings, which consume most of the required memory. We present Bandana, a storage system that reduces the DRAM footprint of embeddings by using Non-volatile Memory (NVM) as the primary storage medium, with a small amount of DRAM as a cache. The main challenge in storing embeddings on NVM is its limited read bandwidth compared to DRAM. Bandana uses two primary techniques to address this limitation: first, it stores embedding vectors that are likely to be read together in the same physical location, using hypergraph partitioning; second, it decides how many embedding vectors to cache in DRAM by simulating dozens of small caches. These techniques allow Bandana to increase the effective read bandwidth of NVM by 2-3x and thereby significantly reduce the total cost of ownership.
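    The second technique, sizing the DRAM cache by simulating small caches, can be sketched as follows. The synthetic trace, candidate sizes, and plain LRU policy are assumptions made for illustration, not Bandana's actual implementation.

```python
# Sketch: replay a sampled access trace of embedding-vector ids against
# several small simulated LRU caches and compare their hit rates, as a basis
# for choosing how many vectors to keep in DRAM.
from collections import OrderedDict
import random

def lru_hit_rate(trace, capacity):
    cache, hits = OrderedDict(), 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)        # evict least recently used
    return hits / len(trace)

# Skewed synthetic trace: hot embedding ids are accessed far more often.
random.seed(0)
ids = list(range(100_000))
weights = [1 / (rank + 1) for rank in range(len(ids))]   # Zipf-like popularity
trace = random.choices(ids, weights=weights, k=200_000)

for size in [1_000, 5_000, 10_000, 50_000]:              # vectors kept in DRAM
    print(f"cache {size:>6} vectors -> hit rate {lru_hit_rate(trace, size):.2%}")
```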

    A Survey on Tiering and Caching in High-Performance Storage Systems

    Although each individual storage technology ever invented has taken a big step toward perfection, none of them is spotless. Essential data-store requirements such as performance, availability, and recoverability have not yet been met together in a single, economically affordable medium. One of the most influential factors is price, so there has always been a trade-off between having the desired set of storage properties and the cost. To address this issue, a network of different types of storage media is used to deliver the high performance of expensive devices, such as solid state drives and non-volatile memories, along with the high capacity of inexpensive ones, such as hard disk drives. In software, caching and tiering are long-established concepts for handling file operations, moving data automatically within such a storage network, and managing data backup on low-cost media. Intelligently moving data between different devices according to need is the key insight here. In this survey, we discuss recent research on improving high-performance storage systems with caching and tiering techniques.
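    As an illustration of the tiering half of this picture, the sketch below promotes blocks from a large capacity tier to a small fast tier once they become hot and demotes cold blocks to make room. The thresholds, capacities, and the TieredStore class are invented for this example and are not taken from the survey.

```python
# Sketch of a basic tiering policy: new data lands on the cheap tier, and
# blocks whose recent access count crosses a threshold are promoted to SSD.
from collections import Counter

class TieredStore:
    def __init__(self, ssd_capacity=4, promote_threshold=3):
        self.ssd = set()                    # block ids on the fast tier
        self.hdd = set()                    # block ids on the capacity tier
        self.heat = Counter()               # recent accesses per block
        self.ssd_capacity = ssd_capacity
        self.promote_threshold = promote_threshold

    def write(self, block):
        self.hdd.add(block)                 # new data lands on the cheap tier

    def read(self, block):
        self.heat[block] += 1
        tier = "SSD" if block in self.ssd else "HDD"
        if tier == "HDD" and self.heat[block] >= self.promote_threshold:
            self._promote(block)
        return tier

    def _promote(self, block):
        if len(self.ssd) >= self.ssd_capacity:
            coldest = min(self.ssd, key=lambda b: self.heat[b])
            self.ssd.discard(coldest)       # demote the coldest block
            self.hdd.add(coldest)
        self.hdd.discard(block)
        self.ssd.add(block)

store = TieredStore()
for b in range(8):
    store.write(b)
for b in [1, 1, 1, 2, 5, 5, 5, 5, 1]:
    store.read(b)
print("SSD tier now holds:", sorted(store.ssd))
```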

    knor: A NUMA-Optimized In-Memory, Distributed and Semi-External-Memory k-means Library

    k-means is one of the most influential and widely used machine learning algorithms, and its computation limits the performance and scalability of many statistical analysis and machine learning tasks. We rethink and optimize k-means in terms of modern NUMA architectures to develop a novel parallelization scheme that delays and minimizes synchronization barriers. The k-means NUMA Optimized Routine (knor) library has (i) in-memory (knori), (ii) distributed-memory (knord), and (iii) semi-external-memory (knors) modules that radically improve the performance of k-means for varying memory and hardware budgets. knori boosts performance for single-machine datasets by an order of magnitude or more. knors improves the scalability of k-means on a memory budget using SSDs and scales to billions of points on a single machine, using a fraction of the resources that distributed in-memory systems require. knord retains knori's performance characteristics while scaling in-memory processing through distributed computation in the cloud. knor modifies Elkan's triangle-inequality pruning algorithm so that it can be used on billion-point datasets without the significant memory overhead of the original algorithm. We demonstrate that knor outperforms distributed commercial products like H2O, Turi (formerly Dato, GraphLab), and Spark's MLlib by more than an order of magnitude for datasets of 10^7 to 10^9 points.
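    The triangle-inequality pruning that Elkan's algorithm relies on can be sketched as follows: if the distance between a point's current centroid and another centroid is at least twice the point-to-current-centroid distance, the other centroid cannot be closer, so its distance is never computed. This is a minimal illustration of the pruning lemma, not knor's NUMA-optimized or memory-bounded variant.

```python
# Sketch: k-means assignment step with triangle-inequality pruning.
import numpy as np

def assign_with_pruning(points, centroids, assignment):
    # Pairwise centroid distances, computed once per iteration: O(k^2).
    cc = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    skipped = 0
    for i, x in enumerate(points):
        best = assignment[i]
        d_best = np.linalg.norm(x - centroids[best])
        for c in range(len(centroids)):
            if c == best:
                continue
            if cc[best, c] >= 2.0 * d_best:
                skipped += 1                 # pruned: c cannot beat `best`
                continue
            d = np.linalg.norm(x - centroids[c])
            if d < d_best:
                best, d_best = c, d
        assignment[i] = best
    return assignment, skipped

rng = np.random.default_rng(0)
points = rng.normal(size=(2000, 16))
k = 8
centroids = points[rng.choice(len(points), k, replace=False)]
assignment = rng.integers(0, k, size=len(points))

for it in range(5):
    assignment, skipped = assign_with_pruning(points, centroids, assignment)
    for c in range(k):                       # standard centroid update
        members = points[assignment == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
    print(f"iteration {it}: pruned {skipped} distance computations")
```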