Optimal Rebuilding of Multiple Erasures in MDS Codes

Authors: Bruck Jehoshua; Tamo Itzhak; Wang Zhiying
Publication date: 03/03/2016

MDS array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with $r$ redundancy nodes can correct any $r$ node erasures by accessing all the remaining information in the surviving nodes. However, in practice, $e$ erasures are a more likely failure event, for $1\le e<r$. Hence, a natural question is how much information do we need to access in order to rebuild $e$ storage nodes? We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of $e$ erasures. In our previous work we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of $1/r$ for the rebuilding of any systematic node when $e=1$; however, all the information needs to be accessed for the rebuilding of the parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework where any $r$ erasures can be corrected by accessing some of the remaining information, and any $e=1$ erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this work, we study 3 questions on rebuilding of codes: (i) We show a fundamental trade-off between the storage size of the node and the repair bandwidth similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of $e/r$ for MDS codes, for any $1\le e\le r$. (ii) We construct systematic codes that achieve optimal rebuilding ratio of $1/r$, for any systematic or parity node erasure. (iii) We present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.

Comment: There is an overlap of this work with our two previous submissions: Zigzag Codes: MDS Array Codes with Optimal Rebuilding; On Codes for Optimal Rebuilding Access. arXiv admin note: text overlap with arXiv:1112.037
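As a rough illustration of the rebuilding ratio defined in the abstract above, the following Python sketch uses a toy array code with two information columns and r = 2 parity columns over GF(3). The coefficients, function names, and symbol layout are assumptions chosen for illustration in the spirit of a zigzag-style construction, not the authors' actual code. It rebuilds one erased systematic column by reading 3 of the 6 surviving symbols, which gives a ratio of 1/2 = 1/r.

```python
from fractions import Fraction
from itertools import product

P = 3  # arithmetic is modulo the prime 3, i.e. over GF(3)


def encode(a, b):
    """Return (row_parity, zigzag_parity) for information columns a and b.

    Each column holds two symbols. The coefficients 1 and 2 in the zigzag
    parity are hypothetical choices made for this sketch.
    """
    row = [(a[0] + b[0]) % P, (a[1] + b[1]) % P]
    zig = [(a[0] + 1 * b[1]) % P, (a[1] + 2 * b[0]) % P]
    return row, zig


def rebuild_a(b0, row0, zig1):
    """Rebuild column a from only three of the six surviving symbols."""
    a0 = (row0 - b0) % P       # row0 = a0 + b0
    a1 = (zig1 - 2 * b0) % P   # zig1 = a1 + 2*b0
    return [a0, a1]


def rebuilding_ratio(symbols_accessed, symbols_remaining):
    """Fraction of the remaining information accessed during rebuilding."""
    return Fraction(symbols_accessed, symbols_remaining)


# Check the partial-access rebuild against every possible message.
all_ok = True
for a0, a1, b0, b1 in product(range(P), repeat=4):
    row, zig = encode([a0, a1], [b0, b1])
    if rebuild_a(b0, row[0], zig[1]) != [a0, a1]:
        all_ok = False

# Three surviving columns of two symbols each leave 6 symbols; 3 are read.
ratio = rebuilding_ratio(3, 6)
print(all_ok, ratio)
```

The same symbol `b0` serves both the row equation and the zigzag equation, which is what lets the rebuild read half of the surviving data instead of all of it.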

arXiv.org e-Print Archive

Caltech Authors

Optimal Rebuilding of Multiple Erasures in MDS Codes

Authors: Bruck Jehoshua; Tamo Itzhak; Wang Zhiying
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/02/2017

Maximum distance separable (MDS) array codes are widely used in storage systems due to their computationally efficient encoding and decoding procedures. An MDS code with r redundancy nodes can correct any r node erasures by accessing (reading) all the remaining information in the surviving nodes. However, in practice, e erasures are a more likely failure event, for some 1≤e<r. Hence, a natural question is how much information do we need to access in order to rebuild e storage nodes. We define the rebuilding ratio as the fraction of remaining information accessed during the rebuilding of e erasures. In our previous work, we constructed MDS codes, called zigzag codes, that achieve the optimal rebuilding ratio of 1/r for the rebuilding of any systematic node when e=1; however, all the information needs to be accessed for the rebuilding of the parity node erasure. The (normalized) repair bandwidth is defined as the fraction of information transmitted from the remaining nodes during the rebuilding process. For codes that are not necessarily MDS, Dimakis et al. proposed the regenerating codes framework where any r erasures can be corrected by accessing some of the remaining information, and any e=1 erasure can be rebuilt from some subsets of surviving nodes with optimal repair bandwidth. In this paper, we present three results on rebuilding of codes: 1) we show a fundamental outer bound on the storage size of the node and the repair bandwidth similar to the regenerating codes framework, and show that zigzag codes achieve the optimal rebuilding ratio of e/r for systematic nodes of MDS codes, for any 1≤e≤r; 2) we construct systematic codes that achieve optimal rebuilding ratio of 1/r, for any systematic or parity node erasure; and 3) we present error correction algorithms for zigzag codes, and in particular demonstrate how these codes can be corrected beyond their minimum Hamming distances.

Access vs. Bandwidth in Codes for Storage

Authors: Bruck Jehoshua; Tamo Itzhak; Wang Zhiying
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2012

Maximum distance separable (MDS) codes are widely used in storage systems to protect against disk (node) failures. A node is said to have capacity $l$ over some field $\mathbb{F}$ if it can store that many symbols of the field. An $(n,k,l)$ MDS code uses $n$ nodes of capacity $l$ to store $k$ information nodes. The MDS property guarantees resiliency to any $n-k$ node failures. An optimal bandwidth (resp. optimal access) MDS code communicates (resp. accesses) the minimum amount of data during the repair process of a single failed node. It was shown that this amount equals a fraction of $1/(n-k)$ of the data stored in each node. In previous optimal bandwidth constructions, $l$ scaled polynomially with $k$ in codes with asymptotic rate $<1$. Moreover, in constructions with a constant number of parities, i.e., where the rate approaches 1, $l$ scaled exponentially with respect to $k$. In this paper, we focus on the latter case of a constant number of parities $n-k=r$, and ask the following question: given the capacity of a node $l$, what is the largest number of information disks $k$ in an optimal bandwidth (resp. access) $(k+r,k,l)$ MDS code? We give an upper bound for the general case, and two tight bounds in the special cases of two important families of codes. Moreover, the bounds show that in some cases an optimal-bandwidth code has larger $k$ than an optimal-access code, and therefore these two measures are not equivalent.

Comment: This paper was presented in part at the IEEE International Symposium on Information Theory (ISIT 2012). submitted to IEEE transactions on information theor
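To make the $1/(n-k)$ repair fraction from the abstract above concrete, here is a small Python sketch of the arithmetic. It assumes the repair contacts all $n-1$ surviving nodes and takes a $1/(n-k)$ fraction of each, and it compares that with a naive repair that reads $k$ full nodes. The function names and the parameters $(n,k,l)=(6,4,8)$ are illustrative assumptions, not values from the paper.

```python
from fractions import Fraction


def naive_repair_symbols(k, l):
    """Symbols read when a failed node is repaired by decoding from k full nodes."""
    return k * l


def optimal_repair_symbols(n, k, l):
    """Symbols read when each of the n-1 survivors supplies 1/(n-k) of its l symbols."""
    r = n - k
    return Fraction((n - 1) * l, r)


# Hypothetical parameters: 6 nodes, 4 of them information nodes, capacity 8.
n, k, l = 6, 4, 8
naive = naive_repair_symbols(k, l)
optimal = optimal_repair_symbols(n, k, l)
print(naive, optimal, optimal / naive)  # prints: 32 20 5/8
```

With these numbers the optimal repair reads 20 symbols instead of 32, even though it contacts more nodes than the naive repair does.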

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Repair Framework for Scalar MDS Codes

Authors: Caire Giuseppe; Dimakis Alexandros G.; Papailiopoulos Dimitris S.; Shanmugam Karthikeyan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 07/12/2013

Several works have developed vector-linear maximum-distance separable (MDS) storage codes that minimize the total communication cost required to repair a single coded symbol after an erasure, referred to as repair bandwidth (BW). Vector codes allow communicating fewer sub-symbols per node, instead of the entire content. This allows non-trivial savings in repair BW. In sharp contrast, classic codes, like Reed-Solomon (RS), used in current storage systems, are deemed to suffer from naive repair, i.e., downloading the entire stored message to repair one failed node. This mainly happens because they are scalar-linear. In this work, we present a simple framework that treats scalar codes as vector-linear. In some cases, this allows significant savings in repair BW. We show that vectorized scalar codes exhibit properties that simplify the design of repair schemes. Our framework can be seen as a finite field analogue of real interference alignment. Using our simplified framework, we design a scheme that we call clique-repair, which provably identifies the best linear repair strategy for any scalar 2-parity MDS code, under some conditions on the sub-field chosen for vectorization. We specify optimal repair schemes for specific (5,3)- and (6,4)-Reed-Solomon (RS) codes. Further, we present a repair strategy for the RS code currently deployed in the Facebook Analytics Hadoop cluster that leads to 20% of repair BW savings over naive repair, which is the repair scheme currently used for this code.

Comment: 10 Pages; accepted to IEEE JSAC -Distributed Storage 201

arXiv.org e-Print Archive

Crossref