65 research outputs found

    New techniques to model energy-aware I/O architectures based on SSD and hard disk drives

    Get PDF
    For years, performance improvements at the computer I/O subsystem and at other subsystems have advanced at their own pace, being less the improvements at the I/O subsystem, and making the overall system speed dependant of the I/O subsystem speed. One of the main factors for this imbalance is the inherent nature of disk drives, which has allowed big advances in disk densities, but not so many in disk performance. Thus, to improve I/O subsystem performance, disk drives have become a goal of study for many researchers, having to use, in some cases, different kind of models. Other research studies aim to improve I/O subsystem performance by tuning more abstract I/O levels. Since disk drives lay behind those levels, real disk drives or just models need to be used. One of the most common techniques to evaluate the performance of a computer I/O subsystem is found on detailed simulation models including specific features of storage devices like disk geometry, zone splitting, caching, read-ahead buffers and request reordering. However, as soon as a new technological innovation is added, those models need to be reworked to include new characteristics, making difficult to have general models up to date. Our alternative is modeling a storage device as a black-box probabilistic model, where the storage device itself, its interface and the interconnection mechanisms are modeled as a single stochastic process, defining the service time as a random variable with an unknown distribution. This approach allows generating disk service times needing less computational power by means of a variate generator included in a simulator. This approach allows to reach a greater scalability in I/O subsystems performance evaluations by means of simulation. Lately, energy saving for computing systems has become an important need. In mobile computers, the battery life is limited to a certain amount of time, and not wasting energy at certain parts would extend the usage of the computer. Here, again the computer I/O subsystem has pointed out as field of study, because disk drives, which are a main part of it, are one of the most power consuming elements due to their mechanical nature. In server or enterprise computers, where the number of disks increase considerably, power saving may reduce cooling requirements for heat dissipation and thus, great monetary costs. This dissertation also considers the question of saving energy in the disk drive, by making advantage of diverse devices in hybrid storage systems, composed of Solid State Disks (SSDs) and Disk drives. SSDs and Disk drives offer different power characteristics, being SSDs much less power consuming than disk drives. In this thesis, several techniques that use SSDs as supporting devices for Disk drives, are proposed. Various options for managing SSDs and Disk devices in such hybrid systems are examinated, and it is shown that the proposed methods save energy and monetary costs in diverse scenarios. A simulator composed of Disks and SSD devices was implemented. This thesis studies the design and evaluation of the proposed approaches with the help of realistic workloads. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Durante años, las mejoras de rendimiento en el subsistema de E/S del ordenador y en otros subsistemas han avanzado a su propio ritmo, siendo menores las mejoras en el subsistema de E/S, y provocando que la velocidad global del sistema dependa de la velocidad del subsistema de E/S. Uno de los factores principales de este desequilibrio es la naturaleza inherente de las unidades de disco, la cual que ha permitido grandes avances en las densidades de disco, pero no así en su rendimiento. Por lo tanto, para mejorar el rendimiento del subsistema de E/S, las unidades de disco se han convertido en objetivo de estudio para muchos investigadores, que se ven obligados a utilizar, en algunos casos, diferentes tipos de modelos o simuladores. Otros estudios de investigación tienen como objetivo mejorar el rendimiento del subsistema de E/S, estudiando otros niveles más abstractos. Como los dispositivos de disco siguen estando detrás de esos niveles, tanto discos reales como modelos pueden usarse para esos estudios. Una de las técnicas más comunes para evaluar el rendimiento del subsistema de E/S de un ordenador se ha encontrado en los modelos de simulación detallada, los cuales modelan características específicas de los dispositivos de almacenamiento como la geometría del disco, la división en zonas, el almacenamiento en caché, el comportamiento de los buffers de lectura anticipada y la reordenación de solicitudes. Sin embargo, cuando se agregan innovaciones tecnológicas, los modelos tienen que ser revisados a fin de incluir nuevas características que incorporen dichas innovaciones, y esto hace difícil el tener modelos generales actualizados. Nuestra alternativa es el modelado de un dispositivo de almacenamiento como un modelo probabilístico de caja negra, donde el dispositivo de almacenamiento en sí, su interfaz y sus mecanismos de interconexión se tratan como un proceso estocástico, definiendo el tiempo de servicio como una variable aleatoria con una distribución desconocida. Este enfoque permite la generación de los tiempos de servicio del disco, de forma que se necesite menos potencia de cálculo a través del uso de un generador de variable aleatoria incluido en un simulador. De este modo, se permite alcanzar una mayor escalabilidad en la evaluación del rendimiento del subsistema de E/S a través de la simulación. En los últimos años, el ahorro de energía en los sistemas de computación se ha convertido en una necesidad importante. En ordenadores portátiles, la duración de la batería se limita a una cierta cantidad de tiempo, y no desperdiciar energía en ciertas partes haría más largo el uso del ordenador. Aquí, de nuevo el subsistema de E/S se señala como campo de estudio, ya que las unidades de disco, que son una parte principal del mismo, son uno de los elementos de más consumo de energía debido a su naturaleza mecánica. En los equipos de servidor o de empresa, donde el número de discos aumenta considerablemente, el ahorro de energía puede reducir las necesidades de refrigeración para la disipación de calor y por lo tanto, grandes costes monetarios. Esta tesis también considera la cuestión del ahorro energético en la unidad de disco, haciendo uso de diversos dispositivos en sistemas de almacenamiento híbridos, que emplean discos de estado sólido (SSD) y unidades de disco. Las SSD y unidades de disco ofrecen diferentes características de potencia, consumiendo las SSDs menos energía que las unidades de disco. En esta tesis se proponen varias técnicas que utilizan los SSD como apoyo a los dispositivos de disco. Se examinan las diversas opciones para la gestión de las SSD y los dispositivos de disco en tales sistemas híbridos, y se muestra que los métodos propuestos ahorran energía y costes monetarios en diversos escenarios. Se ha implementado un simulador compuesto por discos y dispositivos SSD. Esta tesis estudia el diseño y evaluación de los enfoques propuestos con la ayuda de las cargas de trabajo reales

    Energy-Efficient Streaming Using Non-volatile Memory

    Get PDF
    The disk and the DRAM in a typical mobile system consume a significant fraction (up to 30%) of the total system energy. To save on storage energy, the DRAM should be small and the disk should be spun down for long periods of time. We show that this can be achieved for predominantly streaming workloads by connecting the disk to the DRAM via a large non-volatile memory (NVM). We refer to this as the NVM-based architecture (NVMBA); the conventional architecture with only a DRAM and a disk is referred to as DRAMBA. The NVM in the NVMBA acts as a traffic reshaper from the disk to the DRAM. The total system costs are balanced, since the cost increase due to adding the NVM is compensated by the decrease in DRAM cost. We analyze the energy saving of NVMBA, with NAND flash memory serving as NVM, relative to DRAMBA with respect to (1) the streaming demand, (2) the disk form factor, (3) the best-effort provision, and (4) the stream location on the disk. We present a worst-case analysis of the reliability of the disk drive and the flash memory, and show that a small flash capacity is sufficient to operate the system over a year at negligible cost. Disk lifetime is superior to flash, so that is of no concern

    Multi-level Hybrid Cache: Impact and Feasibility

    Full text link

    Synergistically Coupling Of Solid State Drives And Hard Disks For Qos-Aware Virtual Memory

    Get PDF
    With significant advantages in capacity, power consumption, and price, solid state disk (SSD) has good potential to be employed as an extension of dynamic random-access memory, such that applications with large working sets could run efficiently on a modestly configured system. While initial results reported in recent works show promising prospects for this use of SSD by incorporating it into the management of virtual memory, frequent writes from write-intensive programs could quickly wear out SSD, making the idea less practical. This thesis makes four contributions towards solving this issue. First, we propose a scheme, HybridSwap, that integrates a hard disk with an SSD for virtual memory man-agement, synergistically achieving the advantages of both. In addition, HybridSwap can constrain performance loss caused by swapping according to user-specified QoS requirements. Second, We develop an efficient algorithm to record memory access history and to identify page access sequences and evaluate their locality. Using a history of page access patterns HybridSwap dynamically creates an out-of-memory virtual memory page layout on the swap space spanning the SSD and hard disk such that random reads are served by SSD and sequential reads are asynchronously served by the hard disk with high efficiency. Third, we build a QoS-assurance mechanism into HybridSwap to demonstrate the flexibility of the system in bounding the performance penalty due to swapping. It allows users to specify a bound on the program stall time due to page faults as a percentage of the program\u27s total run time. Forth, we have implemented HybridSwap in a recent Linux kernel, version 2.6.35.7. Our evaluation with representative benchmarks, such as Memcached for key-value store, and scientific programs from the ALGLIB cross-platform numerical analysis and data processing library, shows that the number of writes to SSD can be reduced by 40% with the system\u27s performance comparable to that with pure SSD swapping, and can satisfy a swapping-related QoS requirement as long as the I/O resource is sufficient

    Memory Management for Emerging Memory Technologies

    Get PDF
    The Memory Wall, or the gap between CPU speed and main memory latency, is ever increasing. The latency of Dynamic Random-Access Memory (DRAM) is now of the order of hundreds of CPU cycles. Additionally, the DRAM main memory is experiencing power, performance and capacity constraints that limit process technology scaling. On the other hand, the workloads running on such systems are themselves changing due to virtualization and cloud computing demanding more performance of the data centers. Not only do these workloads have larger working set sizes, but they are also changing the way memory gets used, resulting in higher sharing and increased bandwidth demands. New Non-Volatile Memory technologies (NVM) are emerging as an answer to the current main memory issues. This thesis looks at memory management issues as the emerging memory technologies get integrated into the memory hierarchy. We consider the problems at various levels in the memory hierarchy, including sharing of CPU LLC, traffic management to future non-volatile memories behind the LLC, and extending main memory through the employment of NVM. The first solution we propose is “Adaptive Replacement and Insertion" (ARI), an adaptive approach to last-level CPU cache management, optimizing the cache miss rate and writeback rate simultaneously. Our specific focus is to reduce writebacks as much as possible while maintaining or improving miss rate relative to conventional LRU replacement policy, with minimal hardware overhead. ARI reduces writebacks on benchmarks from SPEC2006 suite on average by 32.9% while also decreasing misses on average by 4.7%. In a PCM based memory system, this decreases energy consumption by 23% compared to LRU and provides a 49% lifetime improvement beyond what is possible with randomized wear-leveling. Our second proposal is “Variable-Timeslice Thread Scheduling" (VATS), an OS kernel-level approach to CPU cache sharing. With modern, large, last-level caches (LLC), the time to fill the LLC is greater than the OS scheduling window. As a result, when a thread aggressively thrashes the LLC by replacing much of the data in it, another thread may not be able to recover its working set before being rescheduled. We isolate the threads in time by increasing their allotted time quanta, and allowing larger periods of time between interfering threads. Our approach, compared to conventional scheduling, mitigates up to 100% of the performance loss caused by CPU LLC interference. The system throughput is boosted by up to 15%. As an unconventional approach to utilizing emerging memory technologies, we present a Ternary Content-Addressable Memory (TCAM) design with Flash transistors. TCAM is successfully used in network routing but can also be utilized in the OS Virtual Memory applications. Based on our layout and circuit simulation experiments, we conclude that our FTCAM block achieves an area improvement of 7.9× and a power improvement of 1.64× compared to a CMOS approach. In order to lower the cost of Main Memory in systems with huge memory demand, it is becoming practical to extend the DRAM in the system with the less-expensive NVMe Flash, for a much lower system cost. However, given the relatively high Flash devices access latency, naively using them as main memory leads to serious performance degradation. We propose OSVPP, a software-only, OS swap-based page prefetching scheme for managing such hybrid DRAM + NVM systems. We show that it is possible to gain about 50% of the lost performance due to swapping into the NVM and thus enable the utilization of such hybrid systems for memory-hungry applications, lowering the memory cost while keeping the performance comparable to the DRAM-only system

    Memory Management for Emerging Memory Technologies

    Get PDF
    The Memory Wall, or the gap between CPU speed and main memory latency, is ever increasing. The latency of Dynamic Random-Access Memory (DRAM) is now of the order of hundreds of CPU cycles. Additionally, the DRAM main memory is experiencing power, performance and capacity constraints that limit process technology scaling. On the other hand, the workloads running on such systems are themselves changing due to virtualization and cloud computing demanding more performance of the data centers. Not only do these workloads have larger working set sizes, but they are also changing the way memory gets used, resulting in higher sharing and increased bandwidth demands. New Non-Volatile Memory technologies (NVM) are emerging as an answer to the current main memory issues. This thesis looks at memory management issues as the emerging memory technologies get integrated into the memory hierarchy. We consider the problems at various levels in the memory hierarchy, including sharing of CPU LLC, traffic management to future non-volatile memories behind the LLC, and extending main memory through the employment of NVM. The first solution we propose is “Adaptive Replacement and Insertion" (ARI), an adaptive approach to last-level CPU cache management, optimizing the cache miss rate and writeback rate simultaneously. Our specific focus is to reduce writebacks as much as possible while maintaining or improving miss rate relative to conventional LRU replacement policy, with minimal hardware overhead. ARI reduces writebacks on benchmarks from SPEC2006 suite on average by 32.9% while also decreasing misses on average by 4.7%. In a PCM based memory system, this decreases energy consumption by 23% compared to LRU and provides a 49% lifetime improvement beyond what is possible with randomized wear-leveling. Our second proposal is “Variable-Timeslice Thread Scheduling" (VATS), an OS kernel-level approach to CPU cache sharing. With modern, large, last-level caches (LLC), the time to fill the LLC is greater than the OS scheduling window. As a result, when a thread aggressively thrashes the LLC by replacing much of the data in it, another thread may not be able to recover its working set before being rescheduled. We isolate the threads in time by increasing their allotted time quanta, and allowing larger periods of time between interfering threads. Our approach, compared to conventional scheduling, mitigates up to 100% of the performance loss caused by CPU LLC interference. The system throughput is boosted by up to 15%. As an unconventional approach to utilizing emerging memory technologies, we present a Ternary Content-Addressable Memory (TCAM) design with Flash transistors. TCAM is successfully used in network routing but can also be utilized in the OS Virtual Memory applications. Based on our layout and circuit simulation experiments, we conclude that our FTCAM block achieves an area improvement of 7.9× and a power improvement of 1.64× compared to a CMOS approach. In order to lower the cost of Main Memory in systems with huge memory demand, it is becoming practical to extend the DRAM in the system with the less-expensive NVMe Flash, for a much lower system cost. However, given the relatively high Flash devices access latency, naively using them as main memory leads to serious performance degradation. We propose OSVPP, a software-only, OS swap-based page prefetching scheme for managing such hybrid DRAM + NVM systems. We show that it is possible to gain about 50% of the lost performance due to swapping into the NVM and thus enable the utilization of such hybrid systems for memory-hungry applications, lowering the memory cost while keeping the performance comparable to the DRAM-only system

    Optimizing Virtual Machine I/O Performance in Cloud Environments

    Get PDF
    Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this kind of closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments. VM migration changes VM runtime inter-connection or cache contexts, significantly degrading VM I/O performance. Virtualization is the backbone of cloud platforms. I/O virtualization adds additional hops to workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and imposes high CPU utilizations and energy consumptions to cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor to recognize warm pages and prefetch them into caches of destination hosts before migration completion. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework, which utilizes a virtual I/O front-end buffer for prefetching so as to avoid the on-demand involvement of I/O virtualization stacks and accelerate the I/O response. Analysis on the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration up to 75% of the non-migration scenario, which is more than 3 times of the Random VM choosing policy. In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time with low resource costs. At system level, we conducted comprehensive quantitative analysis on I/O virtualization overheads. Our trace replay based simulation demonstrates the effectiveness of VIP for data prefetching with ignorable additional cache resource costs

    DISK DESIGN-SPACE EXPLORATION IN TERMS OF SYSTEM-LEVEL PERFORMANCE, POWER, AND ENERGY CONSUMPTION

    Get PDF
    To make the common case fast, most studies focus on the computation phase of applications in which most instructions are executed. However, many programs spend significant time in the I/O intensive phase due to the I/O latency. To obtain a system with more balanced phases, we require greater insight into the effects of the I/O configurations to the entire system in both performance and power dissipation domains. Due to lack of public tools with the complete picture of the entire memory hierarchy, we developed SYSim. SYSim is a complete-system simulator aiming at complete memory hierarchy studies in both performance and power consumption domains. In this dissertation, we used SYSim to investigate the system-level impacts of several disk enhancements and technology improvements to the detailed interaction in memory hierarchy during the I/O-intensive phase. The experimental results are reported in terms of both total system performance and power/energy consumption. With SYSim, we conducted the complete-system experiments and revealed intriguing behaviors including, but not limited to, the following: During the I/O intensive phase which consists of both disk reads and writes, the average system CPI tracks only average disk read response time, and not overall average disk response time, which is the widely-accepted metric in disk drive research. In disk read-dominating applications, Disk Prefetching is more important than increasing the disk RPM. On the other hand, in applications with both disk reads and writes, the disk RPM matters. The execution time can be improved to an order of magnitude by applying some disk enhancements. Using disk caching and prefetching can improve the performance by the factor of 2, and write-buffering can improve the performance by the factor of 10. Moreover, using disk caching/prefetching and the write-buffering techniques in conjunction can improve the total system performance by at least an order of magnitude. Increasing the disk RPM and the number of disks in RAID disk system also have an impressive improvement over the total system performance. However, employing such techniques requires careful consideration for trade-offs in power/energy consumption
    corecore