4 research outputs found

    Rebuild performance enhancement using onboard caching and delayed vacation termination in clustered raid 5

    The Clustered RAID 5 (CRAID5) architecture, with a parity group size (G) smaller than the number of disks (N), increases the load by the declustering ratio, denoted by α = (G - 1)/(N - 1), which can be lower than that in RAID 5, while switching to, and subsequently operating in, rebuild mode. The Nearly Random Permutation (NRP) layout provides the flexibility to vary the declustering ratio (α) for a given N, and the Vacationing Server Model (VSM) of processing rebuild requests provides acceptable rebuild and user response times. The rebuild performance and the user response time can be improved by introducing an onboard buffer in the disks, which caches a single track upon arrival of a rebuild request while in rebuild mode. Such an enhancement is proposed, and the architecture is described along with an analysis using the DASim simulation toolkit developed at NJIT. Also proposed is the delayed termination of vacations with two user requests, as this improves the rebuild performance with a negligible negative impact on user response time. Finally, the effect of limiting the rebuild buffer on the rebuild performance is presented in the context of three different disk utilizations and declustering ratios.
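The declustering ratio defined in this abstract can be illustrated with a minimal sketch. The function name `declustering_ratio` and the example values of G and N are assumptions chosen for illustration; they do not come from the paper. The sketch only evaluates α = (G - 1)/(N - 1) and shows that G = N, the RAID 5 case, gives α = 1.

```python
def declustering_ratio(group_size: int, num_disks: int) -> float:
    """Return alpha = (G - 1) / (N - 1) for parity group size G and N disks.

    Hypothetical helper for illustration; the argument checks are assumptions.
    """
    if num_disks < 2:
        raise ValueError("need at least two disks")
    if not 2 <= group_size <= num_disks:
        raise ValueError("parity group size must satisfy 2 <= G <= N")
    return (group_size - 1) / (num_disks - 1)


if __name__ == "__main__":
    num_disks = 10  # illustrative N, not a value from the paper
    for group_size in (3, 5, 10):
        alpha = declustering_ratio(group_size, num_disks)
        print(f"N={num_disks}, G={group_size}: alpha={alpha:.3f}")
```

A smaller G for a fixed N lowers α, which is the flexibility the abstract attributes to the NRP layout.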

    Studies of disk arrays tolerating two disk failures and a proposal for a heterogeneous disk array

    There has been an explosion in the amount of generated data in the past decade. Online access to these data is made possible by large disk arrays, especially in the RAID (Redundant Array of Independent Disks) paradigm. Depending on the RAID level, a disk array can tolerate one or more disk failures, so that the storage subsystem can continue operating with disk failure(s). RAID5 is a single disk failure tolerant array which dedicates the capacity of one disk to parity information. The content of the failed disk can be reconstructed on demand and written onto a spare disk. However, RAID5 does not provide enough protection for data, since data loss may occur when there is a media failure (unreadable sectors) or a second disk failure during the rebuild process. Due to the high cost of downtime in many applications, two disk failure tolerant arrays, such as RAID6 and EVENODD, have become popular. These schemes use 2/N of the capacity of the array for redundant information in order to tolerate two disk failures. RM2 is another scheme that can tolerate two disk failures, with a slightly higher redundancy ratio. However, the performance of these two disk failure tolerant RAID schemes is impaired, since there are two check disks to be updated for each write request. Therefore, their performance, especially when there are disk failure(s), is of interest. In the first part of the dissertation, the operations for the RAID5, RAID6, EVENODD and RM2 schemes are described. A cost model is developed for these RAID schemes by analyzing the operations in various operating modes. This cost model offers a measure of the volume of data being transmitted, and provides a device-independent comparison of the efficiency of these RAID schemes. Based on this cost model, the maximum throughput of a RAID scheme can be obtained given detailed disk characteristics and the RAID configuration.
Using an M/G/1 queuing model and other favorable modeling assumptions, a queuing analysis to obtain the mean read response time is described. Simulation is used to validate the analytic results, as well as to evaluate the RAID systems in analytically intractable cases. The second part of this dissertation describes a new disk array architecture, namely the Heterogeneous Disk Array (HDA). The HDA is motivated by a few observations of the trends in storage technology. The HDA architecture allows a disk array to have two forms of heterogeneity: (1) device heterogeneity, i.e., disks of different types can be incorporated in a single HDA; and (2) RAID level heterogeneity, i.e., various RAID schemes can coexist in the same array. The goals of this architecture are (1) to utilize the extra resources (i.e., bandwidth and capacity) introduced by new disk drives in an automated and efficient way; and (2) to use appropriate RAID levels to meet the varying availability requirements of different applications. In HDA, each new object is associated with an appropriate RAID level, and the allocation is carried out in a way that keeps disk bandwidth and capacity utilizations balanced. Design considerations for the data structures of HDA metadata are described, followed by the actual design of the data structures and flowcharts for the most frequent operations. Then a data allocation algorithm is described in detail. Finally, the HDA architecture is prototyped based on the DASim simulation toolkit developed at NJIT, and simulation results of an HDA with two RAID levels (RAID1 and RAID5) are presented.
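As background for the M/G/1 analysis mentioned in this abstract, the textbook Pollaczek-Khinchine result for the mean response time of a single M/G/1 queue can be sketched as follows. This is not the dissertation's model, which may include further assumptions about the disk array; the function name `mg1_mean_response_time` and the sample numbers are illustrative assumptions only.

```python
def mg1_mean_response_time(arrival_rate: float,
                           mean_service: float,
                           second_moment_service: float) -> float:
    """Mean response time of an M/G/1 queue (Pollaczek-Khinchine formula).

    W = lambda * E[S^2] / (2 * (1 - rho)), with rho = lambda * E[S],
    and the mean response time is R = E[S] + W.
    Hypothetical helper for illustration, not taken from the dissertation.
    """
    utilization = arrival_rate * mean_service
    if not 0 <= utilization < 1:
        raise ValueError("queue is unstable unless 0 <= rho < 1")
    mean_wait = arrival_rate * second_moment_service / (2 * (1 - utilization))
    return mean_service + mean_wait


if __name__ == "__main__":
    # Illustrative numbers: exponential service with mean 1, so E[S^2] = 2.
    print(mg1_mean_response_time(0.5, 1.0, 2.0))
```

With exponential service times the formula reduces to the M/M/1 result 1/(μ - λ), which offers a quick sanity check on the inputs.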