I/O Schedulers for Proportionality and Stability on Flash-Based SSDs in Multi-Tenant Environments
The use of flash-based Solid State Drives (SSDs) has expanded rapidly into the cloud computing environment. In cloud computing, ensuring the service level objective (SLO) of each server is the major criterion in designing a system. In particular, eliminating performance interference among virtual machines (VMs) on shared storage is a key challenge. However, studies on SSD performance to guarantee SLOs in such environments are limited. In this paper, we present an analysis of I/O behavior for a shared SSD as storage in terms of proportionality and stability. We show that performance SLOs of SSD-based storage systems being shared by VMs or tasks are not satisfactory. We present and analyze the reasons behind the unexpected behavior by examining the components of SSDs such as channels, DRAM buffer, and Native Command Queuing (NCQ). We introduce two novel SSD-aware host-level I/O schedulers on Linux, called A+CFQ and H+BFQ, based on our analysis and findings. Through experiments on Linux, we analyze I/O proportionality and stability in multi-tenant environments. In addition, through experiments using real workloads, we analyze the performance interference between workloads on a shared SSD. We then show that the proposed I/O schedulers almost eliminate the interference effect seen in CFQ and BFQ, while still providing I/O proportionality and stability for various I/O weighted scenarios.
Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks
In this paper, we primarily focus on understanding the data preprocessing
pipeline for DNN Training in the public cloud. First, we run experiments to
test the performance implications of the two major data preprocessing methods
using either raw data or record files. The preliminary results show that data
preprocessing is a clear bottleneck, even with the most efficient software and
hardware configuration enabled by NVIDIA DALI, a highly optimized data
preprocessing library. Second, we identify the potential causes, exercise a
variety of optimization methods, and present their pros and cons. We hope this
work will shed light on the new co-design of "data storage, loading pipeline"
and "training framework" and flexible resource configurations between them so
that the resources can be fully exploited and performance can be maximized.
Divided disk cache and SSD FTL for improving performance in storage
Although there are many efficient techniques to minimize the speed gap between the processor and memory, it remains a bottleneck for various commercial implementations. Since secondary memory technologies are much slower than main memory, it is challenging to match memory speed to the processor. Usually, hard disk drives include semiconductor caches to improve their performance. A hit in the disk cache eliminates the mechanical seek time and rotational latency. To further improve performance, a divided disk cache, subdivided between metadata and data, has been proposed previously. We propose a new algorithm that applies this divided cache to a flash memory-based solid state drive (SSD) through its flash translation layer (FTL). First, this paper evaluates the performance of such a disk cache via simulations using DiskSim. Then, we perform an experiment to evaluate the performance of the proposed algorithm.