
    Scheduling policies for disks and disk arrays

    Recent rapid advances in magnetic recording technology have enabled substantial increases in disk capacity, yet there has been less than 10% annual improvement in the random access time to small data blocks on the disk. Such accesses are very common in OLTP applications, which tend to have stringent response time requirements. Scheduling of disk requests is intended to improve their response time, reduce disk service time, and increase disk access bandwidth with respect to the default FCFS (first-come, first-served) scheduling policy. The Shortest Access Time First (SATF) policy has been shown to outperform other classical disk scheduling policies in numerous studies. Before verifying this conclusion, this dissertation develops an empirical analysis of the SATF policy, and produces a valuable by-product, expressed as x[m] = m^p, during the study. Classical scheduling policies and some well-known variations of the SATF policy are re-evaluated, and three extensions are proposed. The performance evaluation uses self-developed simulators containing detailed disk information. The simulators, driven with both synthetic and trace workloads, report per-request measurements, such as the mean and the 95th percentile of the response times, as well as system-level measurements, such as the maximum throughput.

    A comprehensive set of routing and scheduling schemes is presented for mirrored disk systems, or RAID1. The performance evaluation is based on a two-dimensional configuration classification: independent queues (i.e., a router sends requests to one of the disks as soon as they arrive) versus a shared queue (i.e., requests are held in a common queue at the router and scheduled to be served); and normal data layout versus transposed data layout (i.e., the data stored on the inner cylinders of one disk is duplicated on the outer cylinders of the mirrored disk). The availability of non-volatile storage (NVS), which allows the processing of write requests to be deferred, is also investigated. Finally, various mirrored disk declustering strategies are compared against basic disk mirroring. Their load balancing capability and their reliability are examined in both normal mode and degraded mode.
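
    As a hedged illustration of the SATF idea (a minimal sketch, not the dissertation's simulator): at each scheduling decision the policy serves, among the queued requests, the one with the smallest estimated positioning time (seek plus rotational latency) from the current head position. The Request record and the cost model in estimate_access_time below are assumptions introduced for illustration.

        from collections import namedtuple

        # Hypothetical request record: target cylinder and angular position of the block.
        Request = namedtuple("Request", ["cylinder", "angle"])

        def estimate_access_time(head_cyl, head_angle, req, ms_per_cyl=0.001, rev_ms=8.3):
            """Toy positioning-time model (assumed): linear seek plus rotational latency."""
            seek = abs(req.cylinder - head_cyl) * ms_per_cyl
            rotation = ((req.angle - head_angle) % 360.0) / 360.0 * rev_ms
            return seek + rotation

        def satf_pick(queue, head_cyl, head_angle):
            """SATF: pick the pending request with the shortest estimated access time."""
            return min(queue, key=lambda r: estimate_access_time(head_cyl, head_angle, r))

        # Example: SATF may accept a longer seek if it avoids waiting out most of a rotation,
        # which is where it differs from seek-only policies such as SSTF.
        queue = [Request(cylinder=5000, angle=10), Request(cylinder=120, angle=300)]
        print(satf_pick(queue, head_cyl=100, head_angle=0))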

    Data allocation in disk arrays with multiple RAID levels

    There has been an explosion in the amount of generated data, which has to be stored reliably because it is not easily reproducible. Some datasets require frequent read and write access, as in online transaction processing applications. Others just need to be stored safely and read once in a while, as in data mining. These different access requirements can be met by using the RAID (redundant array of inexpensive disks) paradigm, i.e., RAID1 for the first situation and RAID5 for the second. Furthermore, rather than providing two disk arrays with RAID1 and RAID5 capabilities, a controller can be postulated to emulate both. It is referred to as a heterogeneous disk array (HDA). Dedicating a subset of disks to RAID1 results in poor disk utilization, since RAID1 versus RAID5 capacity and bandwidth requirements are not known a priori. Balancing disk loads when disk space is shared among allocation requests, referred to as virtual arrays (VAs), poses a difficult problem. RAID1 disk arrays have a higher access rate per gigabyte than RAID5 disk arrays. Allocating more VAs while keeping disk utilizations balanced and within acceptable bounds is the goal of this study.

    Given its size and access rate, a VA's width, i.e., the number of its virtual disks (VDs), is determined. Allocating VDs onto physical disks using vector-packing heuristics, with disk capacity and bandwidth as the two dimensions, is shown to work best. An allocation is acceptable if it does not exceed disk capacity or overload disks, even in the presence of disk failures. When disk bandwidth rather than capacity is the bottleneck, the clustered RAID paradigm is applied, which offers a tradeoff between disk space and bandwidth. Another scenario is also considered where the RAID level is determined by a classification algorithm utilizing the access characteristics of the VA, i.e., the fractions of small versus large accesses and of write versus read accesses. The effect of RAID1 organization on its reliability and performance is studied too. The effect of disk failures on the X-code two-disk-failure-tolerant array is analyzed, and it is shown that the load across disks is highly unbalanced unless, in an N×N array, groups of N stripes are randomly rotated.
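
    As a hedged sketch of the two-dimensional vector-packing idea (illustrative only; the greedy balance rule, names and numbers below are assumptions rather than the dissertation's algorithm): each VD is a (capacity, bandwidth) demand vector, and the allocator places it on a feasible disk so that the residual capacity and bandwidth utilizations stay as balanced as possible.

        # Minimal sketch of two-dimensional vector packing for VD allocation.
        # The two dimensions are disk capacity (GB) and bandwidth (accesses/s).

        def allocate_vd(disks, vd):
            """Place one VD (cap_gb, bw) on the feasible disk whose capacity and
            bandwidth utilizations remain most balanced after the assignment."""
            cap_need, bw_need = vd
            best, best_score = None, None
            for d in disks:
                if d["cap_free"] < cap_need or d["bw_free"] < bw_need:
                    continue  # infeasible: would exceed capacity or overload the disk
                cap_util = 1.0 - (d["cap_free"] - cap_need) / d["cap_total"]
                bw_util = 1.0 - (d["bw_free"] - bw_need) / d["bw_total"]
                score = abs(cap_util - bw_util)  # smaller = better balanced dimensions
                if best_score is None or score < best_score:
                    best, best_score = d, score
            if best is None:
                return False  # allocation request rejected
            best["cap_free"] -= cap_need
            best["bw_free"] -= bw_need
            return True

        disks = [dict(cap_total=500.0, cap_free=500.0, bw_total=100.0, bw_free=100.0)
                 for _ in range(4)]
        print(allocate_vd(disks, (80.0, 35.0)))  # True: the VD was placed on some disk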

    Queueing network models of zoned RAID system performance

    RAID systems are widely deployed, both as standalone storage solutions and as the building blocks of modern virtualised storage platforms. An accurate model of RAID system performance is therefore critical to fulfilling quality of service constraints for fast, reliable storage. This thesis presents techniques and tools that model response times in zoned RAID systems. The inputs to this analysis are a specified I/O request arrival rate, an I/O request access profile, a given RAID configuration and physical disk parameters. The primary output of this analysis is an approximation to the cumulative distribution function of I/O request response time. From this, it is straightforward to calculate response time quantiles, as well as the mean, variance and higher moments of I/O request response time. The model supports RAID levels 0, 01, 10 and 5 and a variety of workload types.

    Our RAID model is developed in a bottom-up hierarchical fashion. We begin by modelling each zoned disk drive in the array as a single M/G/1 queue. The service time is modelled as the sum of the random variables representing seek time, rotational latency and data transfer time. In doing so, we take into account the properties of zoned disks. We then abstract a RAID system as a fork-join queueing network. This comprises several queues, each of which represents one disk drive in the array. We tailor our basic fork-join approximation to account for the I/O request patterns associated with particular request types and request sizes under different RAID levels. We extend the RAID and disk models to support bulk arrivals, requests of different sizes and scheduling algorithms that reorder queued requests to minimise disk head positioning time. Finally, we develop a corresponding simulation to improve and validate the model. To test the accuracy of all our models, we validate them against disk drive and RAID device measurements throughout.
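
    As a hedged illustration of the per-disk building block (a minimal sketch under simplifying assumptions, not the thesis's full analysis): treating one disk as an M/G/1 queue, the mean I/O response time follows from the Pollaczek-Khinchine formula, with the service time taken as the sum of independent seek, rotational latency and transfer components, so that their means and variances add. The numeric parameter values below are assumed for illustration only.

        # Sketch: mean response time of a single disk modelled as an M/G/1 queue.

        def mg1_mean_response(arrival_rate, mean_s, var_s):
            """Pollaczek-Khinchine: E[T] = E[S] + lam * E[S^2] / (2 * (1 - rho))."""
            rho = arrival_rate * mean_s
            assert rho < 1.0, "queue would be unstable"
            second_moment = var_s + mean_s ** 2
            return mean_s + arrival_rate * second_moment / (2.0 * (1.0 - rho))

        # Assumed per-component (mean, variance) in seconds: seek, rotation, transfer.
        seek, rotation, transfer = (0.005, 4e-6), (0.003, 3e-6), (0.001, 1e-7)
        mean_s = seek[0] + rotation[0] + transfer[0]
        var_s = seek[1] + rotation[1] + transfer[1]
        print(mg1_mean_response(arrival_rate=50.0, mean_s=mean_s, var_s=var_s))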