Self-Repairing Disk Arrays
As the prices of magnetic storage continue to decrease, the cost of replacing
failed disks becomes increasingly dominated by the cost of the service call
itself. We propose to eliminate these calls by building disk arrays that
contain enough spare disks to operate without any human intervention during
their whole lifetime. To evaluate the feasibility of this approach, we have
simulated the behavior of two-dimensional disk arrays with n parity disks and
n(n-1)/2 data disks under realistic failure and repair assumptions. Our
conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a
99.999 percent probability of not losing data over four years. We observe that
the same objectives cannot be reached with RAID level 6 organizations and would
require RAID stripes that could tolerate triple disk failures.
Comment: Part of ADAPT Workshop proceedings, 2015 (arXiv:1412.2347).
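The spare-pool sizing argument can be sanity-checked with a back-of-the-envelope model: if every failed disk is instantly replaced from the spare pool, the number of active disks stays constant and failures arrive approximately as a Poisson process, so the chance of exhausting n(n+1)/2 spares within four years is a Poisson tail probability. The sketch below assumes a hypothetical 1.2-million-hour disk MTTF and n = 4; it is only a rough proxy for the paper's full failure-and-repair simulation, since it ignores which disks fail and whether the array can still reconstruct them.

```python
import math

def spare_exhaustion_probability(n, mttf_hours, years=4.0):
    """Rough proxy: probability that more than n(n+1)/2 failures occur
    during the mission time, assuming each failed disk is instantly
    replaced from the spare pool (so the active-disk count is constant
    and failures form a Poisson process). This ignores repair dynamics
    and reconstruction coverage, so it is a feasibility check only."""
    data = n * (n - 1) // 2      # data disks in the 2-D layout
    parity = n                   # parity disks
    spares = n * (n + 1) // 2    # proposed spare-pool size
    active = data + parity
    mean_failures = active * years * 8760.0 / mttf_hours
    # P(N <= spares) for N ~ Poisson(mean_failures)
    p_ok = math.exp(-mean_failures) * sum(
        mean_failures ** k / math.factorial(k) for k in range(spares + 1)
    )
    return 1.0 - p_ok

# Hypothetical numbers: n = 4 and a 1.2M-hour per-disk MTTF (assumption).
p = spare_exhaustion_probability(4, 1.2e6)
```

With these illustrative numbers the expected failure count over four years is well below the spare count, so the exhaustion probability is negligible, consistent with the abstract's "more than enough" conclusion.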
Decentralized Erasure Codes for Distributed Networked Storage
We consider the problem of constructing an erasure code for storage over a
network when the data sources are distributed. Specifically, we assume that
there are n storage nodes with limited memory and k<n sources generating the
data. We want a data collector, who can appear anywhere in the network, to
query any k storage nodes and be able to retrieve the data. We introduce
Decentralized Erasure Codes, which are linear codes with a specific randomized
structure inspired by network coding on random bipartite graphs. We show that
decentralized erasure codes are optimally sparse, and lead to reduced
communication, storage, and computation costs compared with random linear coding.
Comment: to appear in IEEE Transactions on Information Theory, Special Issue: Networking and Information Theory.
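The retrieval property can be illustrated with a minimal sketch over a prime field: each storage node keeps a random linear combination of the k source symbols, and a data collector recovers the data from any k nodes by Gaussian elimination. Note that the paper's decentralized codes use sparse combinations driven by a random bipartite graph; the dense combinations, the field size P = 65537, and all parameters below are simplifying assumptions.

```python
import itertools
import random

P = 65537  # prime field modulus (an assumption; any large prime works)

def make_nodes(data, n, rng):
    """Each node stores random coefficients over GF(P) and the matching
    linear combination of the k source symbols. (The paper uses sparse
    random combinations; dense ones are used here for brevity.)"""
    nodes = []
    for _ in range(n):
        coeffs = [rng.randrange(P) for _ in data]
        value = sum(c * x for c, x in zip(coeffs, data)) % P
        nodes.append((coeffs, value))
    return nodes

def recover(subset, k):
    """Gaussian elimination mod P; returns the k symbols, or None if
    this particular k-node subset yields a singular system."""
    m = [list(c) + [v] for c, v in subset]
    for col in range(k):
        pivot = next((r for r in range(col, k) if m[r][col]), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], P - 2, P)          # Fermat inverse
        m[col] = [x * inv % P for x in m[col]]
        for r in range(k):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % P for a, b in zip(m[r], m[col])]
    return [row[k] for row in m]

rng = random.Random(1)
data = [17, 42, 7, 1234]                  # k = 4 source symbols
nodes = make_nodes(data, n=8, rng=rng)
# A collector queries k of the n nodes; with high probability the very
# first subset already gives an invertible system.
for subset in itertools.combinations(nodes, len(data)):
    out = recover(list(subset), len(data))
    if out is not None:
        break
```

Over a field this large, a random k-by-k coefficient matrix is singular with probability on the order of k/P, which is the "with high probability" guarantee the construction relies on.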
EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures
We present a novel method, which we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. This redundant storage is optimal, in the sense that two failed disks cannot be retrieved with fewer than two redundant disks. A major advantage of EVENODD is that it requires only parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The most commonly used scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed-Solomon (RS) error-correcting codes. This scheme requires computation over finite fields and results in a more complex implementation. For example, we show that the complexity of implementing EVENODD in a disk array with 15 disks is about 50% of that required by the RS scheme. The new scheme is not limited to RAID architectures: it can be used in any system requiring large symbols and relatively short codes, for instance, in multitrack magnetic recording. To this end, we also present a decoding algorithm for one column (track) in error.
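A minimal sketch of the EVENODD layout for prime p = 5 may clarify the XOR-only encoding: data occupy a (p-1) x p bit array, column p holds horizontal (row) parity, and column p+1 holds diagonal parity adjusted by a common term S. Only the simple single-column repair via row parity is shown here; the full double-erasure decoding is more involved and omitted from this sketch.

```python
import random

def evenodd_encode(data, p):
    """data: (p-1) x p bit array, p prime. Returns a (p-1) x (p+2)
    array in which column p is row (horizontal) parity and column p+1
    is diagonal parity XOR-adjusted by the term S, using XOR only."""
    rows = p - 1
    # an imaginary all-zero row p-1 simplifies the diagonal indexing
    a = [row[:] for row in data] + [[0] * p]
    S = 0
    for t in range(1, p):
        S ^= a[p - 1 - t][t]
    out = []
    for i in range(rows):
        row_parity = 0
        for t in range(p):
            row_parity ^= a[i][t]
        diag = S
        for t in range(p):
            diag ^= a[(i - t) % p][t]
        out.append(data[i] + [row_parity, diag])
    return out

def repair_single_column(coded, lost, p):
    """Rebuild one lost data column (0 <= lost < p) by XOR-ing the
    surviving data columns with the row-parity column p."""
    rebuilt = []
    for row in coded:
        bit = row[p]                       # row parity
        for t in range(p):
            if t != lost:
                bit ^= row[t]
        rebuilt.append(bit)
    return rebuilt

rng = random.Random(7)
p = 5
data = [[rng.randrange(2) for _ in range(p)] for _ in range(p - 1)]
coded = evenodd_encode(data, p)
fixed = repair_single_column(coded, lost=2, p=p)
```

Everything above is plain XOR, which is why the scheme fits on unmodified RAID-5 parity hardware.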
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need for
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality; hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
Construction of Extended Steiner Systems for Information Retrieval
A multiset batch code is a variation of information retrieval where a t-multiset of items can be retrieved by reading at most one bit from each server. We study a problem at the other end of the spectrum, namely that of retrieving a t-multiset of items by accessing exactly one server. Our solution to the problem is a combinatorial notion called an extended Steiner system, which was first studied by Johnson and Mendelsohn [11]. An extended Steiner system ES(t, k, v) is a collection of k-multisets (thus allowing repetition of elements in a block) of a v-set such that every t-multiset belongs to exactly one block. An extended triple system, with t = 2 and k = 3, has been investigated and constructed previously [3, 11]. We study extended systems over v elements with k = t + 1, denoted as ES(t, t + 1, v). We show constructions of ES(t, t + 1, v) for all t ≥ 3 and v ≥ t + 1.
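The defining exact-cover property of an ES(t, k, v) can be checked by brute force for tiny parameters. The sketch below searches for an extended triple system (the previously studied t = 2, k = 3 case): a set of 3-multisets of {1..v} such that every 2-multiset lies in exactly one block. It is exponential and purely illustrative.

```python
from itertools import combinations, combinations_with_replacement

def t_subsets(block, t):
    """Distinct t-sub-multisets of a multiset block, as sorted tuples."""
    return {tuple(sorted(c)) for c in combinations(block, t)}

def find_es(t, k, v):
    """Brute-force search for an extended Steiner system ES(t, k, v):
    a set of k-multisets of {1..v} covering every t-multiset exactly
    once. Exponential, so only suitable for tiny v."""
    targets = set(combinations_with_replacement(range(1, v + 1), t))
    blocks = list(combinations_with_replacement(range(1, v + 1), k))

    def search(remaining, chosen, start):
        if not remaining:
            return chosen
        for i in range(start, len(blocks)):
            subs = t_subsets(blocks[i], t)
            if subs <= remaining:   # block covers only uncovered t-multisets
                found = search(remaining - subs, chosen + [blocks[i]], i + 1)
                if found is not None:
                    return found
        return None

    return search(targets, [], 0)

# Smallest interesting case: over {1, 2} the blocks {1,1,1} and {1,2,2}
# cover the 2-multisets {1,1}, {1,2}, {2,2} exactly once each.
system = find_es(2, 3, 2)
```

Repetition inside blocks is exactly what the "extended" in extended Steiner system buys: the classical system S(2, 3, v) has no block at all containing the 2-multiset {1, 1}.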
Stochastic Analysis on RAID Reliability for Solid-State Drives
Solid-state drives (SSDs) have been widely deployed in desktops and data
centers. However, SSDs suffer from bit errors, and the bit error rate is time
dependent since it increases as an SSD wears down. Traditional storage systems
mainly use parity-based RAID to provide reliability guarantees by striping
redundancy across multiple devices, but the effectiveness of RAID in SSDs
remains debatable as parity updates aggravate the wearing and bit error rates
of SSDs. In particular, an open problem is how different parity
distributions over multiple devices, such as the even distribution suggested by
conventional wisdom, or uneven distributions proposed in recent RAID schemes
for SSDs, may influence the reliability of an SSD RAID array. To address this
fundamental problem, we propose the first analytical model to quantify the
reliability dynamics of an SSD RAID array. Specifically, we develop a
"non-homogeneous" continuous time Markov chain model, and derive the transient
reliability solution. We validate our model via trace-driven simulations and
conduct numerical analysis to provide insights into the reliability dynamics of
SSD RAID arrays under different parity distributions and subject to different
bit error rates and array configurations. Designers can use our model to decide
the appropriate parity distribution based on their reliability requirements.
Comment: 12 pages.
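The flavor of the model can be conveyed with a toy non-homogeneous chain: a mirrored pair whose per-device failure rate grows with wear (a stand-in for the time-dependent bit error rate), with the state probabilities advanced by forward Euler on the Kolmogorov forward equations. The three-state chain and all rates below are illustrative assumptions, not the paper's model.

```python
def transient_reliability(t_end, dt=0.01, lam0=1e-3, beta=5e-4, mu=0.1):
    """Toy non-homogeneous CTMC for a mirrored pair of devices: states
    are 0 failed, 1 failed, and absorbing data loss. The failure rate
    lam(t) = lam0 * (1 + beta * t) increases with wear, standing in for
    the time-dependent SSD bit error rate; mu is the repair rate.
    Probabilities advance by forward Euler on the Kolmogorov forward
    equations. All rates are illustrative, not drawn from the paper."""
    p0, p1, ploss = 1.0, 0.0, 0.0
    t = 0.0
    while t < t_end:
        lam = lam0 * (1.0 + beta * t)
        d0 = -2.0 * lam * p0 + mu * p1          # both devices healthy
        d1 = 2.0 * lam * p0 - (lam + mu) * p1   # one failed, repairing
        dl = lam * p1                           # second failure: loss
        p0 += d0 * dt
        p1 += d1 * dt
        ploss += dl * dt
        t += dt
    return p0, p1, ploss

p0, p1, ploss = transient_reliability(1000.0)
```

Because lam(t) varies, no single stationary generator describes the chain; the "non-homogeneous" label in the abstract refers to exactly this time dependence, which rules out the usual eigen-decomposition shortcut.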
Durability and Availability of Erasure-Coded Storage Systems with Concurrent Maintenance
The initial version of this document was written in 2014 to provide the
fundamentals of reliability theory and to identify the theoretical machinery
for predicting the durability/availability of erasure-coded storage systems.
Since the definition
of a "system" is too broad, we specifically focus on warm/cold storage systems
where the data is stored in a distributed fashion across different storage
units with or without continuous operation. The contents of this document are
dedicated to a review of fundamentals, a few major improved stochastic models,
and several contributions of my work relevant to the field. One of the
contributions of this document is the introduction of the most general form of
Markov models for the estimation of mean time to failure. This work was
later partially published in IEEE Transactions on Reliability. Very good
approximations of the closed-form solutions for this general model are also
investigated. Various storage configurations under different policies are
compared using such advanced models. In a subsequent chapter, we have
also considered multi-dimensional Markov models to address detached
drive-medium combinations such as those found in optical disk and tape storage
systems. It is not hard to anticipate that such a system structure will most
likely be part of future DNA storage libraries. This work is partially
published in Elsevier's Reliability Engineering & System Safety. Simulation
modeling for more accurate estimation is covered towards the end of the
document, noting the deficiencies of the simplified canonical as well as the
more complex Markov models, due mainly to the stationary and static nature of
Markovinity. Throughout the document, we shall focus on concurrently maintained
systems, although the discussion changes only slightly for systems repaired
one device at a time.
Comment: 58 pages, 20 figures, 9 tables. arXiv admin note: substantial text overlap with arXiv:1911.0032
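For the canonical concurrently maintained (n, k) group, the mean time to data loss can be computed exactly from the birth-death chain on the number of failed units, which makes a useful cross-check for such models. The sketch below solves the first-passage equations over exact rationals and reproduces the classic mirrored-pair closed form (3λ + μ)/(2λ²); it is the simplified canonical model, not the most general one discussed in the document.

```python
from fractions import Fraction

def mttf_concurrent(n, k, lam, mu):
    """Mean time to data loss for an (n, k) erasure-coded group under
    concurrent maintenance: in state i (i failed units) the failure
    rate is (n - i) * lam and, since repairs proceed in parallel, the
    repair rate is i * mu. Data loss is absorbing at i = n - k + 1.
    Solves E[T_i] = 1/r_i + sum_j P(i -> j) * E[T_j] by Gaussian
    elimination over exact rationals."""
    lam, mu = Fraction(lam), Fraction(mu)
    s = n - k + 1                      # transient states 0 .. s-1
    A = [[Fraction(0)] * s for _ in range(s)]
    b = [Fraction(0)] * s
    for i in range(s):
        fail = (n - i) * lam
        rep = i * mu
        total = fail + rep
        A[i][i] = Fraction(1)
        b[i] = 1 / total               # expected sojourn time in state i
        if i + 1 < s:
            A[i][i + 1] = -fail / total
        if i - 1 >= 0:
            A[i][i - 1] = -rep / total
    # exact Gaussian elimination on A x = b
    for col in range(s):
        piv = next(r for r in range(col, s) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        b[col] *= inv
        for r in range(s):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return b[0]

# Mirrored pair (n = 2, k = 1): closed form (3*lam + mu) / (2*lam**2).
mttf = mttf_concurrent(2, 1, Fraction(1, 1000), Fraction(1, 10))
```

Working in exact rationals sidesteps the numerical stiffness that appears when mu is orders of magnitude larger than lam, which is the usual regime for storage systems.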