Convertible Codes: New Class of Codes for Efficient Conversion of Coded Data in Distributed Storage
Erasure codes are typically used in large-scale distributed storage systems to provide durability of data in the face of failures. In this setting, a set of k blocks to be stored is encoded using an [n, k] code to generate n blocks that are then stored on different storage nodes. A recent work by Kadekodi et al. [Kadekodi et al., 2019] shows that the failure rate of storage devices varies significantly over time, and that changing the rate of the code (via a change in the parameters n and k) in response to such variations yields significant reductions in storage space requirements. However, the resource overhead of realizing such a change in the code rate on already encoded data is prohibitively high for traditional codes.
Motivated by this application, in this work we first present a new framework to formalize the notion of code conversion: the process of converting data encoded with an [n^I, k^I] code into data encoded with an [n^F, k^F] code while maintaining desired decodability properties, such as the maximum-distance-separable (MDS) property. We then introduce convertible codes, a new class of code pairs that allow for code conversions in a resource-efficient manner. For an important parameter regime (which we call the merge regime), under the widely used linearity and MDS decodability constraints, we prove tight bounds on the number of nodes accessed during code conversion. In particular, our achievability result is an explicit construction of MDS convertible codes that are optimal for all parameter values in the merge regime, albeit with a high field size. We then present explicit low-field-size constructions of optimal MDS convertible codes for a broad range of parameters in the merge regime. Our results thus show that it is indeed possible to achieve code conversions with significantly fewer resources than the default approach of re-encoding.
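To make the baseline concrete, the following is a toy illustration of the "default approach" to code conversion that the abstract says is expensive: decode everything and re-encode. The code uses a minimal systematic Reed-Solomon-style MDS code over GF(257); the field size, evaluation points, and parameters are illustrative choices, not the paper's construction.

```python
P = 257  # a prime, so arithmetic mod P forms a field (illustrative choice)

def lagrange_eval(pts, x0):
    """Evaluate the polynomial interpolating pts = [(x, y), ...] at x0, mod P."""
    total = 0
    for xi, yi in pts:
        num, den = 1, 1
        for xj, _ in pts:
            if xj != xi:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, n):
    """Systematic [n, k] MDS encode: data at points 1..k, parities at k+1..n."""
    k = len(data)
    pts = list(zip(range(1, k + 1), data))
    return list(data) + [lagrange_eval(pts, x) for x in range(k + 1, n + 1)]

# Two codewords of an initial [5, 3] code ...
cw1 = encode([10, 20, 30], 5)
cw2 = encode([40, 50, 60], 5)

# ... converted to one codeword of a final [8, 6] code the default way:
# access all k^F = 6 data blocks and recompute every parity from scratch.
cw_final = encode(cw1[:3] + cw2[:3], 8)

# MDS property of the result: any 6 of the 8 blocks recover any data block.
pts = [(x, cw_final[x - 1]) for x in (2, 4, 5, 6, 7, 8)]
assert lagrange_eval(pts, 1) == 10
```

Convertible codes reach the same final codeword while accessing fewer nodes than the six data reads this baseline performs.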
Two Piggybacking Codes with Flexible Sub-Packetization to Achieve Lower Repair Bandwidth
As a special class of array codes, piggybacking codes are MDS codes
(i.e., any k out of n nodes can retrieve all data symbols) that achieve
low repair bandwidth for a single-node failure with low sub-packetization. In
this paper, we propose two new piggybacking codes that have lower repair
bandwidth than the existing piggybacking codes given the same parameters. Our
first piggybacking codes support flexible sub-packetization. We show that our
first piggybacking codes have lower repair bandwidth for any single-node
failure than the existing piggybacking codes over a range of parameters.
Moreover, we propose second piggybacking codes in which the sub-packetization
is a multiple of the number of parity nodes, obtained by jointly designing the
piggyback function for data-node repair and the transformation function for
parity-node repair. We show that the proposed second piggybacking codes have
the lowest repair bandwidth for any single-node failure among all the existing
piggybacking codes for the evaluated parameters.
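The core piggybacking mechanism can be sketched on a tiny example. This is a minimal sketch in the spirit of the original piggybacking framework, not of the two constructions proposed here; a [4, 2] code with sub-packetization 2 is used, and all values and coefficients are illustrative.

```python
# Data nodes A = (a1, a2) and B = (b1, b2); parity node P1 stores plain
# parities, parity node P2 carries a1 piggybacked onto its second symbol.
a1, a2 = 3, 5
b1, b2 = 7, 2
P1 = (a1 + b1, a2 + b2)               # plain parities of both substripes
P2 = (a1 + 2 * b1, a2 + 2 * b2 + a1)  # second symbol carries piggyback a1

# Repair failed node A with 3 symbol reads instead of the naive 4:
r_b2, r_p1 = b2, P1[1]                # read 2 second-substripe symbols
rec_a2 = r_p1 - r_b2                  # MDS-decode a2
r_p2 = P2[1]                          # read 1 piggybacked parity symbol
rec_a1 = r_p2 - (rec_a2 + 2 * r_b2)   # strip the known parity part; a1 remains
assert (rec_a1, rec_a2) == (a1, a2)
```

The piggyback lets the repair of A reuse symbols already read for the second substripe, which is the source of the bandwidth saving (here 3 reads instead of 4).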
A Family of Erasure Correcting Codes with Low Repair Bandwidth and Low Repair Complexity
We present the construction of a new family of erasure correcting codes for
distributed storage that yields low repair bandwidth and low repair complexity.
The construction is based on two classes of parity symbols. The primary goal of
the first class of symbols is to provide good erasure correcting capability,
while the second class facilitates node repair, reducing the repair bandwidth
and the repair complexity. We compare the proposed codes with other codes
proposed in the literature.
Comment: Accepted; will appear in the proceedings of Globecom 2015 (Selected Areas in Communications: Data Storage).
Code Constructions for Distributed Storage With Low Repair Bandwidth and Low Repair Complexity
We present the construction of a family of erasure correcting codes for
distributed storage that achieves low repair bandwidth and complexity at the
expense of a lower fault tolerance. The construction is based on two classes of
codes, where the primary goal of the first class of codes is to provide fault
tolerance, while the second class aims at reducing the repair bandwidth and
repair complexity. The repair is a two-step procedure where parts of
the failed node are repaired in the first step using the first code. The
downloaded symbols during the first step are cached in the memory and used to
repair the remaining erased data symbols at minimal additional read cost during
the second step. The first class of codes is based on MDS codes modified using
piggybacks, while the second class is designed to reduce the number of
additional symbols that need to be downloaded to repair the remaining erased
symbols. We numerically show that the proposed codes achieve better repair
bandwidth compared to MDS codes, codes constructed using piggybacks, and local
reconstruction/Pyramid codes, while a better repair complexity is achieved when
compared to MDS, Zigzag, Pyramid codes, and codes constructed using piggybacks.
Comment: To appear in IEEE Transactions on Communications.
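The two-step repair flow with caching described above can be sketched as follows. The storage layout here (single XOR parities per class) is a hedged stand-in chosen for brevity, not the paper's piggyback-based construction; the point is that step 2 reuses step-1 downloads from the cache, so only one additional read is needed.

```python
reads = 0
cache = {}

def read(node, i, storage):
    """Download symbol i of a node; cached symbols are reused for free."""
    global reads
    if (node, i) not in cache:
        reads += 1
        cache[(node, i)] = storage[node][i]
    return cache[(node, i)]

a1, a2 = 3, 5                      # the two symbols of the failed node A
storage = {
    "B":  (7, 2),                  # surviving data node
    "P1": (a1 ^ 7, a2 ^ 2),        # first class: plain XOR parities
    "P2": (a1 ^ a2 ^ 2,),          # second class: overlaps with step-1 reads
}

# Step 1: repair a2 using the first code (2 reads, both cached).
rec_a2 = read("B", 1, storage) ^ read("P1", 1, storage)
# Step 2: repair a1; b2 comes from the cache, so only 1 additional read.
rec_a1 = read("P2", 0, storage) ^ rec_a2 ^ read("B", 1, storage)
assert (rec_a1, rec_a2) == (a1, a2) and reads == 3
```

Designing the second class of parities to overlap with the symbols already downloaded in step 1 is what keeps the additional read cost of step 2 minimal.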