Response Time Approximations in Fork-Join Queues
Fork-join queueing networks model a network of parallel servers in which an arriving job splits into a number of subtasks that are serviced in parallel. Fork-join queues can be used to model disk arrays. A response time approximation of the fork-join queue is presented that attempts to comply with the additional constraints of modelling a disk array. This approximation is compared with existing analytical approximations of the fork-join queueing network.
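To make the fork-join definition above concrete, here is a minimal simulation sketch. It is not the approximation from the paper. It assumes each server is a first-come-first-served queue, and the random-input helper additionally assumes Poisson arrivals and exponential service times; the function names are hypothetical. A job's response time is the time until its slowest subtask finishes.

```python
import random


def fork_join_response_times(interarrivals, services):
    """Response times of jobs in a fork-join queue with FCFS servers.

    interarrivals[j] is the gap between job j-1 and job j (entry 0 is unused
    by the recursion); services[j][i] is the service time of job j's subtask
    at server i.
    """
    n_servers = len(services[0])
    waits = [0.0] * n_servers  # queueing delay of the current subtask per server
    responses = []
    for j, job_services in enumerate(services):
        if j > 0:
            gap = interarrivals[j]
            prev_services = services[j - 1]
            # Lindley recursion, applied independently at every server.
            waits = [max(0.0, w + s - gap) for w, s in zip(waits, prev_services)]
        # The job completes only when its slowest subtask has been served.
        responses.append(max(w + s for w, s in zip(waits, job_services)))
    return responses


def simulate_mean_response(n_servers, arrival_rate, service_rate, n_jobs, seed=0):
    """Mean response time under assumed Poisson arrivals and exponential service."""
    rng = random.Random(seed)
    interarrivals = [rng.expovariate(arrival_rate) for _ in range(n_jobs)]
    services = [
        [rng.expovariate(service_rate) for _ in range(n_servers)]
        for _ in range(n_jobs)
    ]
    responses = fork_join_response_times(interarrivals, services)
    return sum(responses) / len(responses)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, simulate_mean_response(n, 0.5, 1.0, 50_000))
```

With a single server the sketch reduces to an ordinary single-server queue, which gives a simple sanity check before trying larger numbers of servers.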
Approximations and Bounds for (n, k) Fork-Join Queues: A Linear Transformation Approach
Compared to basic fork-join queues, a job in an (n, k) fork-join queue only
needs k of its n sub-tasks to finish. Since (n, k) fork-join
queues are prevalent in popular distributed systems, erasure-coded cloud
storage, and modern network protocols such as multipath routing, estimating the
sojourn time of such queues is critical for the performance measurement
and resource planning of computer clusters. However, this estimation has remained a
well-known open challenge for years, and only rough bounds for a limited range
of load factors have been given. In this paper, we develop a closed-form
linear transformation technique for jointly-identical random variables: an
order statistic can be represented by a linear combination of maxima. This
brand-new technique is then used to transform the sojourn time of non-purging
(n, k) fork-join queues into a linear combination of the sojourn times of basic
(k, k), (k+1, k+1), ..., (n, n) fork-join queues. Consequently, existing
approximations for basic fork-join queues can be bridged to the approximations
for non-purging (n, k) fork-join queues. The uncovered approximations are then
used to improve the upper bounds for purging (n, k) fork-join queues.
Simulation experiments show that this linear transformation approach
works well for moderate n and relatively large k.
Comment: 10 page
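The claim that an order statistic can be written as a linear combination of maxima can be checked numerically. The abstract does not state the coefficients, so the sketch below uses the standard generalized maximum-minimums identity as an assumption: the k-th smallest of n values equals the sum over i = k, ..., n of (-1)^(i-k) C(i-1, k-1) times the sum of the maxima of all size-i subsets. The function name is hypothetical, and the paper's own formulation may differ.

```python
from itertools import combinations
from math import comb


def kth_smallest_from_maxima(values, k):
    """k-th smallest value expressed through maxima of subsets of size k..n."""
    n = len(values)
    total = 0
    for i in range(k, n + 1):
        # Signed binomial weight attached to all subsets of size i.
        coeff = (-1) ** (i - k) * comb(i - 1, k - 1)
        total += coeff * sum(max(subset) for subset in combinations(values, i))
    return total


if __name__ == "__main__":
    sample = [3, 1, 4, 1, 5, 9, 2, 6]
    for k in range(1, len(sample) + 1):
        print(k, kth_smallest_from_maxima(sample, k), sorted(sample)[k - 1])
```

The identity holds value by value, so taking expectations turns the mean of the k-th order statistic into a weighted sum of expected maxima over i variables, for i from k to n. That is the shape of the (k, k), ..., (n, n) decomposition the abstract describes.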
Validation of Large Zoned RAID Systems
Building on our prior work, we present an improved model for large partial stripe following full stripe writes in RAID 5. This was necessary because we observed that our previous model tended to underestimate measured results. To date, we have only validated these models against RAID systems with at most four disks. Here we validate our improved model, and also our existing models for other read and write configurations, against measurements taken from an eight disk RAID array.
RAID Organizations for Improved Reliability and Performance: A Not Entirely Unbiased Tutorial (1st revision)
The RAID proposal advocated replacing large disks with arrays of PC disks, but as
the capacity of small disks increased 100-fold in the 1990s the production of large
disks was discontinued. Storage dependability is increased via replication or
erasure coding. Cloud storage providers store multiple copies of data, obviating
the need for further redundancy. Variations of RAID based on local recovery
codes and partial MDS reduce recovery cost. NAND flash Solid State Disks (SSDs)
have low latency and high bandwidth, are more reliable, consume less power and
have a lower TCO than Hard Disk Drives, which are more viable for hyperscalers.
Comment: Submitted to ACM Computing Surveys. arXiv admin note: substantial
text overlap with arXiv:2306.0876
On Fork-Join Queues and Maximum Ratio Cliques
This dissertation consists of two parts. The first part delves into the problem of response time estimation in fork-join queueing networks. These systems have been seen in literature for more than thirty years. The estimation of the mean response time in these systems has been found to be notoriously hard for most forms of these queueing systems. In this work, simple expressions for the mean response time are proposed as conjectures. Extensive experiments demonstrate the remarkable accuracy of these conjectures. Algorithms for the estimation of response time using these conjectures are proposed. For many of the networks studied in this dissertation, no approximations are known in literature for estimation of their response time. Therefore, the contribution of this dissertation in this direction marks significant progress in the analysis of fork-join queues.
The second part of this dissertation introduces a fractional version of the classical maximum weight clique problem, the maximum ratio clique problem, which is to find a maximal clique that has the largest ratio of benefit and cost weights associated with the clique's vertices. This problem is formulated to model networks in which the vertices have a benefit as well as a cost associated with them. The maximum ratio clique problem finds applications in a wide range of areas including social networks, stock market graphs and wind farm location. NP-completeness of the decision version of the problem is established, and three solution methods are proposed. The results of numerical experiments with standard graph instances, as well as with real-life instances arising in finance and energy systems, are reported.
Queueing network models of zoned RAID system performance
RAID systems are widely deployed, both as standalone storage solutions and as
the building blocks of modern virtualised storage platforms. An accurate model of
RAID system performance is therefore critical towards fulfilling quality of service
constraints for fast, reliable storage.
This thesis presents techniques and tools that model response times in zoned
RAID systems. The inputs to this analysis are a specified I/O request arrival
rate, an I/O request access profile, a given RAID configuration and physical disk
parameters. The primary output of this analysis is an approximation to the cumulative
distribution function of I/O request response time. From this, it is straightforward
to calculate response time quantiles, as well as the mean, variance and
higher moments of I/O request response time. The model supports RAID levels
0, 01, 10 and 5 and a variety of workload types.
Our RAID model is developed in a bottom-up hierarchical fashion. We begin by
modelling each zoned disk drive in the array as a single M/G/1 queue. The service
time is modelled as the sum of the random variables of seek time, rotational
latency and data transfer time. In doing so, we take into account the properties of
zoned disks. We then abstract a RAID system as a fork-join queueing network.
This comprises several queues, each of which represents one disk drive in the array.
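As a rough illustration of the per-disk M/G/1 building block described above, the sketch below computes the mean response time from the Pollaczek-Khinchine formula. It is not the thesis's model, which targets the full response time distribution and accounts for zoning. The function name is hypothetical, and the sketch assumes seek time, rotational latency and transfer time are independent, so that their means and variances simply add.

```python
def mg1_mean_response(arrival_rate, component_means, component_variances):
    """Mean response time of an M/G/1 queue via the Pollaczek-Khinchine formula.

    The service time is the sum of independent components (for example seek,
    rotational latency and transfer); independence is an assumption here.
    """
    mean_service = sum(component_means)
    var_service = sum(component_variances)  # variances add under independence
    second_moment = var_service + mean_service ** 2
    utilisation = arrival_rate * mean_service
    if utilisation >= 1.0:
        raise ValueError("queue is unstable: utilisation must be below 1")
    mean_wait = arrival_rate * second_moment / (2.0 * (1.0 - utilisation))
    return mean_wait + mean_service


if __name__ == "__main__":
    # Illustrative numbers only (milliseconds); not taken from any measurement.
    rotation_ms = 8.0
    means = [4.0, rotation_ms / 2, 1.0]            # seek, rotational latency, transfer
    variances = [2.0, rotation_ms ** 2 / 12, 0.1]  # latency assumed uniform on [0, rotation]
    print(mg1_mean_response(0.05, means, variances))
```

Two limiting cases give a quick check: an exponential service time recovers the M/M/1 result, and a zero-variance service time recovers M/D/1.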
We tailor our basic fork-join approximation to account for the I/O request
patterns associated with particular request types and request sizes under different
RAID levels. We extend the RAID and disk models to support bulk arrivals, requests
of different sizes and scheduling algorithms that reorder queueing requests
to minimise disk head positioning time. Finally, we develop a corresponding simulation
to improve and validate the model. To test the accuracy of all our models,
we validate them against disk drive and RAID device measurements throughout