117 research outputs found
MDS Array Codes with Optimal Rebuilding
MDS array codes are widely used in storage systems
to protect data against erasures. We address the rebuilding ratio
problem, namely, in the case of erasures, what is the fraction
of the remaining information that needs to be accessed in order
to rebuild exactly the lost information? It is clear that when the
number of erasures equals the maximum number of erasures
that an MDS code can correct then the rebuilding ratio is 1
(access all the remaining information). However, the interesting
(and more practical) case is when the number of erasures is
smaller than the erasure correcting capability of the code. For
example, consider an MDS code that can correct two erasures:
What is the smallest amount of information that one needs to
access in order to correct a single erasure? Previous work showed
that the rebuilding ratio is bounded between 1/2 and 3/4 , however,
the exact value was left as an open problem. In this paper, we
solve this open problem and prove that for the case of a single
erasure with a 2-erasure correcting code, the rebuilding ratio is
1/2 . In general, we construct a new family of r-erasure correcting
MDS array codes that has optimal rebuilding ratio of 1/r
in the
case of a single erasure. Our array codes have efficient encoding
and decoding algorithms (for the case r = 2 they use a finite field
of size 3) and an optimal update property.
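As a quick arithmetic sketch of the rebuilding-ratio notion in this abstract (illustrative bookkeeping only, not the paper's code construction; the function names are ours), compare conventional repair with optimal rebuilding of a single erasure in a (k + r, k) MDS array code whose columns hold l symbols:

```python
# Rebuilding-ratio arithmetic for a (k + r, k) MDS array code with columns
# of l symbols; an illustrative sketch, not the paper's construction.

def naive_accessed(k, r, l):
    # Conventional repair: read all l symbols from each of k surviving columns.
    return k * l

def optimal_accessed(k, r, l):
    # Optimal rebuilding of a single erasure: read l/r symbols from each of
    # the k + r - 1 surviving columns (l assumed divisible by r).
    return (k + r - 1) * (l // r)

def rebuilding_ratio(accessed, k, r, l):
    # Fraction of all remaining information that is accessed.
    return accessed / ((k + r - 1) * l)

k, r, l = 6, 2, 8
print(rebuilding_ratio(optimal_accessed(k, r, l), k, r, l))  # 1/r = 0.5
```

For r = 2 this reproduces the 1/2 ratio the abstract proves optimal, while the conventional scheme reads a k/(k + r - 1) fraction.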
Zigzag Codes: MDS Array Codes with Optimal Rebuilding
MDS array codes are widely used in storage systems to protect data against
erasures. We address the \emph{rebuilding ratio} problem, namely, in the case
of erasures, what is the fraction of the remaining information that needs to be
accessed in order to rebuild \emph{exactly} the lost information? It is clear
that when the number of erasures equals the maximum number of erasures that an
MDS code can correct then the rebuilding ratio is 1 (access all the remaining
information). However, the interesting and more practical case is when the
number of erasures is smaller than the erasure correcting capability of the
code. For example, consider an MDS code that can correct two erasures: What is
the smallest amount of information that one needs to access in order to correct
a single erasure? Previous work showed that the rebuilding ratio is bounded
between 1/2 and 3/4, however, the exact value was left as an open problem. In
this paper, we solve this open problem and prove that for the case of a single
erasure with a 2-erasure correcting code, the rebuilding ratio is 1/2. In
general, we construct a new family of r-erasure correcting MDS array codes
that has optimal rebuilding ratio of e/r in the case of e erasures,
1 <= e <= r. Our array codes have efficient encoding and decoding
algorithms (for the case r = 2 they use a finite field of size 3) and an
optimal update property.
Comment: 23 pages, 5 figures, submitted to IEEE Transactions on Information Theory
On Codes for Optimal Rebuilding Access
MDS (maximum distance separable) array codes
are widely used in storage systems due to their computationally
efficient encoding and decoding procedures. An MDS code with
r redundancy nodes can correct any r erasures by accessing
(reading) all the remaining information in both the systematic
nodes and the parity (redundancy) nodes. However, in practice,
a single erasure is the most likely failure event; hence, a natural
question is how much information do we need to access in order
to rebuild a single storage node? We define the rebuilding ratio
as the fraction of remaining information accessed during the
rebuilding of a single erasure. In our previous work we showed
that the optimal rebuilding ratio of 1/r is achievable (using
our newly constructed array codes) for the rebuilding of any
systematic node, however, all the information needs to be accessed
for the rebuilding of the parity nodes. Namely, constructing array
codes with a rebuilding ratio of 1/r was left as an open problem.
In this paper, we solve this open problem and present array codes
that achieve the lower bound of 1/r for rebuilding any single
systematic or parity node.
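For intuition on why the ratio is 1 when the number of erasures matches the correcting capability, consider the simplest case r = 1 (a single XOR parity): rebuilding any one node, systematic or parity, forces a read of every survivor. A minimal sketch (our toy example, not a code from the paper):

```python
from functools import reduce

# r = 1: one XOR parity over k data nodes. Rebuilding any single node
# requires accessing all n - 1 survivors, i.e. rebuilding ratio 1/r = 1.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

data = [b"\x01\x02", b"\x0a\x0b", b"\x10\x20"]   # 3 systematic nodes
parity = reduce(xor, data)                        # 1 parity node

# Lose data[0]; rebuilding it touches ALL remaining nodes.
rebuilt = reduce(xor, data[1:] + [parity])
assert rebuilt == data[0]
```

With r >= 2 parities, the codes in this abstract bring that full read down to a 1/r fraction for any single node, parity nodes included.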
Access vs. Bandwidth in Codes for Storage
Maximum distance separable (MDS) codes are widely used in storage systems to
protect against disk (node) failures. A node is said to have capacity l over
some field F, if it can store that amount of symbols of the field.
An (n, k) MDS code uses n nodes of capacity l to store k information
nodes. The MDS property guarantees the resiliency to any n - k node failures.
An \emph{optimal bandwidth} (resp. \emph{optimal access}) MDS code communicates
(resp. accesses) the minimum amount of data during the repair process of a
single failed node. It was shown that this amount equals a fraction of
1/(n - k) of the data stored in each node. In previous optimal bandwidth
constructions, l scaled polynomially with k in codes with asymptotic rate
smaller than 1. Moreover, in constructions with a constant number of parities,
i.e. rate approaching 1, l scaled exponentially w.r.t. k. In this paper, we
focus on the latter case of a constant number of parities r, and ask the
following question: Given the capacity of a node l, what is the largest number
k of information disks in an optimal bandwidth (resp. access) (k + r, k) MDS
code? We give an upper bound for the general case, and two tight bounds in the
special cases of two important families of codes. Moreover, the bounds show
that in some cases an optimal-bandwidth code has larger k than an optimal-access
code, and therefore these two measures are not equivalent.
Comment: This paper was presented in part at the IEEE International Symposium
on Information Theory (ISIT 2012). Submitted to IEEE Transactions on
Information Theory.
Long MDS Codes for Optimal Repair Bandwidth
MDS codes are erasure-correcting codes that can
correct the maximum number of erasures given the number of
redundancy or parity symbols. If an MDS code has r parities
and no more than r erasures occur, then by transmitting all
the remaining data in the code one can recover the original
information. However, it was shown that in order to recover a
single symbol erasure, only a fraction of 1/r of the information
needs to be transmitted. This fraction is called the repair
bandwidth (fraction). Explicit code constructions were given in
previous works. If we view each symbol in the code as a vector
or a column, then the code forms a 2D array and such codes
are especially widely used in storage systems. In this paper, we
ask the following question: given the length of the column l, can
we construct high-rate MDS array codes with optimal repair
bandwidth of 1/r, whose code length is as long as possible? We
give code constructions such that the code length is
(r + 1) log_r l.
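Taking the stated result at face value, the achievable code length grows only logarithmically in the column length l; a small helper (illustrative, the name is ours) evaluating (r + 1) log_r l for l a power of r:

```python
def max_code_length(r, l):
    # (r + 1) * log_r(l), computed with an integer logarithm; the column
    # length l is assumed to be a power of r, as in array-code constructions.
    e = 0
    while r ** e < l:
        e += 1
    if r ** e != l:
        raise ValueError("l must be a power of r")
    return (r + 1) * e

print(max_code_length(2, 1024))  # 3 * 10 = 30 columns from l = 2**10
print(max_code_length(3, 81))    # 4 * 4 = 16 columns from l = 3**4
```

Read the other way, supporting a long code at rate close to 1 requires a column length exponential in the number of columns, matching the scaling discussed in the previous abstract.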
Locality and Availability in Distributed Storage
This paper studies the problem of code symbol availability: a code symbol is
said to have (r, t)-availability if it can be reconstructed from t disjoint
groups of other symbols, each of size at most r. For example, 3-replication
supports (1, 2)-availability as each symbol can be read from its other 2
(disjoint) replicas, i.e., r = 1, t = 2. However, the rate of replication must
vanish like 1/(t + 1) as the availability increases.
This paper shows that it is possible to construct codes that can support a
scaling number of parallel reads while keeping the rate to be an arbitrarily
high constant. It further shows that this is possible with the minimum distance
arbitrarily close to the Singleton bound. This paper also presents a bound
demonstrating a trade-off between minimum distance, availability and locality.
Our codes match the aforementioned bound and their construction relies on
combinatorial objects called resolvable designs.
From a practical standpoint, our codes seem useful for distributed storage
applications involving hot data, i.e., the information which is frequently
accessed by multiple processes in parallel.
Comment: Submitted to ISIT 201
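Availability through disjoint repair groups can be seen in a toy product code: with XOR row and column parities, each data symbol has two disjoint groups (its row, its column) of size at most 2, i.e. (2, 2)-availability. This toy code is our illustration, not a construction from the paper:

```python
# Toy (2, 2)-availability: a 2x2 data grid with XOR row and column parities.
d = [[3, 5], [6, 9]]
row_par = [d[0][0] ^ d[0][1], d[1][0] ^ d[1][1]]
col_par = [d[0][0] ^ d[1][0], d[0][1] ^ d[1][1]]

# Two DISJOINT repair groups for d[0][0], each of size 2, so two processes
# can reconstruct the same symbol in parallel without sharing any reads:
from_row = d[0][1] ^ row_par[0]   # group 1: {d[0][1], row parity 0}
from_col = d[1][0] ^ col_par[0]   # group 2: {d[1][0], column parity 0}
assert from_row == from_col == d[0][0]
```

The paper's point is that, unlike replication, such parallel reads can scale while the rate stays an arbitrarily high constant.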
Constructions of Optimal and Almost Optimal Locally Repairable Codes
Constructions of optimal locally repairable codes (LRCs) in the case of
(r+1) ∤ n and over small finite fields were stated as open problems for
LRCs in [I. Tamo \emph{et al.}, "Optimal locally repairable codes and
connections to matroid theory", \emph{2013 IEEE ISIT}]. In this paper, these
problems are studied by constructing almost optimal linear LRCs, which are
proven to be optimal for certain parameters, including cases for which
(r+1) ∤ n. More precisely, linear codes for given length, dimension, and
all-symbol locality are constructed with almost optimal minimum distance.
`Almost optimal' refers to the fact that their minimum distance differs by at
most one from the optimal value given by a known bound for LRCs. In addition to
these linear LRCs, optimal LRCs which do not require a large field are
constructed for certain classes of parameters.
Comment: 5 pages, conference
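The "known bound" here is presumably the generalized Singleton bound for LRCs, d <= n - k - ceil(k/r) + 2, due to Gopalan et al.; a helper (names are ours) for checking almost-optimality in the abstract's sense:

```python
import math

def lrc_singleton_bound(n, k, r):
    # Generalized Singleton bound for an (n, k) code with all-symbol
    # locality r: d <= n - k - ceil(k / r) + 2.
    return n - k - math.ceil(k / r) + 2

def almost_optimal(n, k, r, d):
    # 'Almost optimal': minimum distance within one of the bound's value.
    return lrc_singleton_bound(n, k, r) - d <= 1

print(lrc_singleton_bound(16, 10, 5))  # 6
```

Setting r = k recovers the classical Singleton bound d <= n - k + 1, so locality r < k costs minimum distance.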
On Minimizing Data-read and Download for Storage-Node Recovery
We consider the problem of efficient recovery of the data stored in any
individual node of a distributed storage system, from the rest of the nodes.
Applications include handling failures and degraded reads. We measure
efficiency in terms of the amount of data-read and the download required. To
minimize the download, we focus on the minimum bandwidth setting of the
'regenerating codes' model for distributed storage. Under this model, the
system has a total of n nodes, and the data stored in any node must be
(efficiently) recoverable from any d of the other (n-1) nodes. Lower bounds on
the two metrics under this model were derived previously; it has also been
shown that these bounds are achievable for the amount of data-read and download
when d=n-1, and for the amount of download alone when d<n-1.
In this paper, we complete this picture by proving the converse result, that
when d<n-1, these lower bounds are strictly loose with respect to the amount of
read required. The proof is information-theoretic, and hence applies to
non-linear codes as well. We also show that under two (practical) relaxations
of the problem setting, these lower bounds can be met for both read and
download simultaneously.
Comment: IEEE Communications Letters
Repairable Block Failure Resilient Codes
In large scale distributed storage systems (DSS) deployed in cloud computing,
correlated failures resulting in simultaneous failure (or unavailability) of
blocks of nodes are common. In such scenarios, the stored data or the content
of a failed node can only be reconstructed from the available live nodes
belonging to the available blocks. To analyze the resilience of the system
against such block failures, this work introduces the framework of Block
Failure Resilient (BFR) codes, wherein the data (e.g., a file in DSS) can be
decoded by reading out the same number of codeword symbols (nodes) from each
available block of the underlying codeword. Further, repairable BFR codes are
introduced, wherein any codeword symbol in a failed block can be repaired by
contacting the remaining blocks in the system. Motivated by regenerating
codes, file size bounds for repairable BFR codes are derived, the trade-off
between per-node storage and repair bandwidth is analyzed, and BFR-MSR and
BFR-MBR points are derived. Explicit codes achieving these two operating
points for a wide set of parameters are constructed by utilizing combinatorial
designs, wherein the codewords of the underlying outer codes are distributed
to BFR codeword symbols according to projective planes.