
    Redundant disk arrays: Reliable, parallel secondary storage

    During the past decade, advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of current disk subsystems but, by leveraging parallelism, many times their performance. Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant arrays of inexpensive disks (RAID) use simple redundancy schemes to provide high data reliability. The data encoding, performance, and reliability of redundant disk arrays are investigated. Organizing redundant data into a disk array is treated as a coding problem. Among the alternatives examined, codes as simple as parity are shown to effectively correct single, self-identifying disk failures.
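    A minimal sketch of the parity coding idea, assuming tiny illustrative block sizes and hypothetical helper names (this is not code from the dissertation):

        from functools import reduce

        def xor_blocks(blocks):
            """XOR equal-length byte blocks together (the parity code)."""
            return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

        # Example stripe over 4 data disks, with tiny 4-byte blocks for illustration.
        stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
                  b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
        parity = xor_blocks(stripe)                      # stored on the parity disk

        # Disk 2 fails and identifies itself; XOR of the parity block with the
        # surviving data blocks reconstructs the lost block exactly.
        survivors = stripe[:2] + stripe[3:]
        assert xor_blocks(survivors + [parity]) == stripe[2]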

    RAIDX: RAID Extended for Heterogeneous Arrays

    The computer hard drive market has diversified with the establishment of solid state disks (SSDs) as an alternative to magnetic hard disks (HDDs). Each hard drive technology has its advantages: SSDs are faster than HDDs, but HDDs are cheaper. Our goal is to construct a parallel storage system with HDDs and SSDs such that the parallel system is as fast as the SSDs. Achieving this goal is challenging since the slow HDDs store more data and become bottlenecks, while the SSDs remain idle. RAIDX is a parallel storage system designed for disks of different speeds, capacities and technologies. The RAIDX hardware consists of an array of disks; the RAIDX software consists of data structures and algorithms that allow the disks to be viewed as a single storage unit that has capacity equal to the sum of the capacities of its disks, a failure rate lower than the failure rate of its individual disks, and speeds close to those of its faster disks. RAIDX achieves its performance goals with the aid of its novel parallel data organization technique, which allows storage data to be moved on the fly without impacting the upper-level file system. We show that storage data accesses satisfy the locality of reference principle, whereby only a small fraction of storage data is accessed frequently. RAIDX has a monitoring program that identifies frequently accessed blocks and a migration program that moves frequently accessed blocks to faster disks. The faster disks are caches that store the sole copy of frequently accessed data. Experimental evaluation has shown that an HDD+SSD RAIDX array is as fast as an all-SSD array when the workload shows locality of reference.
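    A hedged sketch of the monitor-and-migrate idea described above: count block accesses and periodically promote the hottest blocks to the fast (SSD) tier. The class, names, and granularity are illustrative assumptions, not RAIDX's actual data structures:

        from collections import Counter

        class TieredArray:
            def __init__(self, fast_capacity_blocks):
                self.access_counts = Counter()   # statistics kept by the monitoring program
                self.fast_capacity = fast_capacity_blocks
                self.fast_tier = set()           # blocks whose sole copy sits on the SSDs

            def record_access(self, block_id):
                self.access_counts[block_id] += 1

            def migrate(self):
                """Run periodically: keep the most frequently accessed blocks on the
                fast tier; blocks that have cooled off are implicitly demoted."""
                hottest = {b for b, _ in self.access_counts.most_common(self.fast_capacity)}
                # In a real system these set changes would be physical block moves,
                # done on the fly and invisible to the upper-level file system.
                self.fast_tier = hottest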

    Design and Analysis of Capacity Extendible Disk Array System: The Diagonal Move Algorithm

    With increased I/O performance and tolerance of at least one disk failure, a redundant disk array used as a secondary storage system offers a better I/O rate, a higher data transfer rate, and stronger reliability than a traditional large single-disk system. The gains in I/O performance, measured as I/O operating rate and data transfer rate, come mostly from simultaneous data retrieval from several disks organized in parallel, as shown in Figure 2-1. This parallel disk organization gives a better I/O operating rate than a single-disk architecture, since several disk I/O operations run concurrently. The I/O operating rate is defined as the number of I/O operations per second. Simultaneous data access across disks likewise gives a better data transfer rate than any single-disk architecture, since data are retrieved from more than one disk at the same time. The data transfer rate is defined as the amount of data transferred over the bus or network per second, for example in bits per second (bps). Disk data access time is the sum of seek time, rotation time, and data transfer time. Because electronic data transfer over the bus or network is much faster than the slow mechanical disk drives, accessing several disks in parallel also improves bus or network utilization.
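    A back-of-the-envelope sketch of the access-time breakdown and the striping argument above; the drive parameters are made-up examples, not figures from the thesis:

        def access_time_ms(seek_ms, rpm, request_kb, transfer_mb_s):
            """Seek + average rotational latency + data transfer time, in ms."""
            rotation_ms = 0.5 * 60_000 / rpm              # half a revolution on average
            transfer_ms = request_kb / 1024 / transfer_mb_s * 1000
            return seek_ms + rotation_ms + transfer_ms

        # One disk serving a 256 KB request vs. four disks each serving a quarter
        # of it in parallel: each disk still pays seek and rotation, but the
        # transfer portion shrinks and the bus sees four concurrent streams.
        single  = access_time_ms(seek_ms=8.0, rpm=7200, request_kb=256,     transfer_mb_s=100)
        striped = access_time_ms(seek_ms=8.0, rpm=7200, request_kb=256 / 4, transfer_mb_s=100)
        print(f"single disk: {single:.1f} ms, 4-disk stripe: {striped:.1f} ms")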

    Parallel I/O system for a clustered computing environment

    Master's thesis (Master of Science).

    High performance disk array architectures.

    Yeung Kai-hau, Alan. Thesis (Ph.D.)--Chinese University of Hong Kong, 1995. Includes bibliographical references. Contents:
    Chapter 1: Introduction (The Information Age; The Importance of Input/Output; Redundant Arrays of Inexpensive Disks; Outline of the Thesis)
    Chapter 2: Selective Broadcast Data Distribution Systems (The Distributed Architecture; Mean Block Acquisition Delay for Uniform Request Distribution; Mean Block Acquisition Delay for General Request Distributions; Optimal Choice of Block Sizes)
    Chapter 3: Dynamic Multiple Parity Disk Arrays (DMP Disk Array; Average Delay; Maximum Throughput; Simulation with Precise Disk Model)
    Chapter 4: Dynamic Parity Logging Disk Arrays (DPL Disk Array Architecture; DPL Disk Array Operation; Performance of DPL Disk Array)
    Chapter 5: Performance Analysis of Mirrored Disk Array (Queueing Model; Delay Analysis; Numerical Examples and Simulation Results)
    Chapter 6: State Reduction in the Exact Analysis of Fork/Join Queues (State Reduction for Closed Fork/Join Queueing Systems; Extension to Open Fork/Join Queueing Systems)
    Chapter 7: Conclusion and Future Research (Summary; Future Research)

    Um simulador para a arquitetura RAID5 (A simulator for the RAID5 architecture)

    Advisor: Celio Cardoso Guimarães. Master's dissertation - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Ciência da Computação. Abstract: The performance of I/O devices in a computing system has not kept pace with the development of the central processing unit (CPU). As a result, much of the computational power of machines that perform a large amount of I/O is wasted. As an example, the performance of a file server is severely limited by the performance of the magnetic disk. This thesis focuses on this I/O device. A simulator of a magnetic disk subsystem, based on the RAID 5 architecture proposed by Patterson, is presented.
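    A hedged sketch of the address mapping such a simulator has to model: logical blocks striped across N disks with the parity block rotating from stripe to stripe. This is one simple rotated-parity layout, not necessarily the exact variant simulated in the dissertation:

        def raid5_map(logical_block, n_disks):
            """Return (stripe, data_disk, parity_disk) for a logical block number."""
            data_per_stripe = n_disks - 1
            stripe = logical_block // data_per_stripe
            parity_disk = stripe % n_disks                   # parity rotates across disks
            offset = logical_block % data_per_stripe
            data_disk = offset if offset < parity_disk else offset + 1   # skip the parity slot
            return stripe, data_disk, parity_disk

        # With 5 disks, parity sits on disk 0 for stripe 0, disk 1 for stripe 1, and so on.
        for lb in range(8):
            print(lb, raid5_map(lb, n_disks=5))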

    Scalability of RAID systems

    RAID systems (Redundant Arrays of Inexpensive Disks) have dominated back-end storage systems for more than two decades and have grown continuously in size and complexity. Currently they face unprecedented challenges from data-intensive applications such as image processing, transaction processing and data warehousing. As the size of RAID systems increases, designers are faced with both performance and reliability challenges. These challenges include limited back-end network bandwidth, physical interconnect failures, correlated disk failures and long disk reconstruction time. This thesis studies the scalability of RAID systems in terms of both performance and reliability through simulation, using a discrete event-driven simulator for RAID systems (SIMRAID) developed as part of this project. SIMRAID incorporates two benchmark workload generators, based on the SPC-1 and Iometer benchmark specifications. Each component of SIMRAID is highly parameterised, enabling it to explore a large design space. To improve the simulation speed, SIMRAID uses a set of abstraction techniques to extract the behaviour of the interconnection protocol without losing accuracy. Finally, to meet the technology trend toward heterogeneous storage architectures, SIMRAID provides a framework that allows easy modelling of different types of device and interconnection technique.

    Simulation experiments were first carried out on performance aspects of scalability. They were designed to answer two questions: (1) given a number of disks, which factors affect back-end network bandwidth requirements; (2) given an interconnection network, how many disks can be connected to the system. The results show that the bandwidth requirement per disk is primarily determined by workload features and stripe unit size (a smaller stripe unit size has better scalability than a larger one), with cache size and RAID algorithm having very little effect on this value. The maximum number of disks is limited, as would be expected, by the back-end network bandwidth.

    Studies of reliability have led to three proposals to improve the reliability and scalability of RAID systems. Firstly, a novel data layout called PCDSDF is proposed. PCDSDF combines the advantages of orthogonal data layouts and parity declustering data layouts, so that it can not only survive multiple disk failures caused by physical interconnect failures or correlated disk failures, but also has good degraded and rebuild performance. The generating process of PCDSDF is deterministic and time-efficient. The number of stripes per rotation (namely the number of stripes needed to achieve rebuild workload balance) is small. Analysis shows that the PCDSDF data layout can significantly improve system reliability. Simulations performed on SIMRAID confirm the good performance of PCDSDF, which is comparable to other parity declustering data layouts, such as RELPR.

    Secondly, a system architecture and rebuilding mechanism have been designed, aimed at fast disk reconstruction. This architecture is based on parity declustering data layouts and a disk-oriented reconstruction algorithm. It uses stripe groups instead of stripes as the basic distribution unit so that it can make use of the sequential nature of the rebuilding workload. The design space of system factors such as parity declustering ratio, chunk size, private buffer size of surviving disks and free buffer size is explored to provide guidelines for storage system design.

    Thirdly, an efficient distributed hot spare allocation and assignment algorithm for general parity declustering data layouts has been developed. This algorithm avoids conflict problems in the process of assigning distributed spare space for the units on the failed disk. Simulation results show that it effectively solves the write bottleneck problem and, at the same time, there is only a small increase in the average response time to user requests.
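    A hedged sketch of the kind of per-disk back-end bandwidth estimate behind the first question above; the write-overhead model and all figures are illustrative assumptions, not SIMRAID's actual model:

        def write_overhead(request_kb, stripe_unit_kb, group_size):
            """Extra back-end traffic per written byte under a RAID-5-style scheme."""
            full_stripe_kb = stripe_unit_kb * (group_size - 1)
            if request_kb >= full_stripe_kb:
                return group_size / (group_size - 1)   # full-stripe write: only parity is extra
            return 4.0                                 # small write: read and rewrite data and parity

        def backend_mb_s_per_disk(iops, request_kb, write_frac, n_disks,
                                  stripe_unit_kb, group_size=8):
            ovh = write_overhead(request_kb, stripe_unit_kb, group_size)
            total_mb_s = iops * request_kb / 1024 * ((1 - write_frac) + write_frac * ovh)
            return total_mb_s / n_disks

        # A smaller stripe unit turns the same 256 KB writes into full-stripe writes,
        # cutting per-disk back-end traffic, so more disks fit behind one link.
        for unit_kb in (64, 16):
            print(unit_kb, round(backend_mb_s_per_disk(iops=5000, request_kb=256,
                                                       write_frac=0.5, n_disks=64,
                                                       stripe_unit_kb=unit_kb), 1))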

    An Evaluation of Redundant Arrays of Disks using an Amdahl 5890

    Recently we presented several disk array architectures designed to increase the data rate and I/O rate of supercomputing applications, transaction processing, and file systems [Patterson 88]. In this paper we present a hardware performance measurement of two of these architectures, mirroring and rotated parity. We see how throughput for these two architectures is affected by response time requirements, request sizes, and read-to-write ratios. We find that for applications with large accesses, such as many supercomputing applications, a rotated parity disk array far outperforms a traditional mirroring architecture. For applications dominated by small accesses, such as transaction processing, mirroring architectures have higher performance per disk than rotated parity architectures.
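    A hedged back-of-the-envelope for the finding above, counting physical disk I/Os per logical request under the standard textbook model rather than the paper's measured numbers:

        def ios_per_request(kind, rw, blocks_spanned=1):
            """Physical disk I/Os for one logical request (textbook counting)."""
            if kind == "mirroring":
                return blocks_spanned if rw == "read" else 2 * blocks_spanned
            if kind == "rotated_parity":
                if rw == "read":
                    return blocks_spanned
                if blocks_spanned == 1:          # small write: read-modify-write
                    return 4                     # read data + read parity + write both
                return blocks_spanned + 1        # full-stripe write: data plus one parity
            raise ValueError(kind)

        # Small single-block writes (transaction processing): mirroring needs 2 I/Os,
        # rotated parity needs 4, so mirroring wins per disk.
        print(ios_per_request("mirroring", "write"), ios_per_request("rotated_parity", "write"))
        # Full-stripe writes over 4 data blocks (large supercomputing accesses):
        # mirroring needs 8 I/Os, rotated parity only 5.
        print(ios_per_request("mirroring", "write", 4), ios_per_request("rotated_parity", "write", 4))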