Convertible Codes: New Class of Codes for Efficient Conversion of Coded Data in Distributed Storage
Erasure codes are typically used in large-scale distributed storage systems to provide durability of data in the face of failures. In this setting, a set of k blocks to be stored is encoded using an [n, k] code to generate n blocks that are then stored on different storage nodes. A recent work by Kadekodi et al. [Kadekodi et al., 2019] shows that the failure rate of storage devices varies significantly over time, and that changing the rate of the code (via a change in the parameters n and k) in response to such variations yields significant reductions in storage space requirements. However, the resource overhead of realizing such a rate change on already-encoded data with traditional codes is prohibitively high.
Motivated by this application, in this work we first present a new framework to formalize the notion of code conversion: the process of converting data encoded with an [n^I, k^I] code into data encoded with an [n^F, k^F] code while maintaining desired decodability properties, such as the maximum-distance-separable (MDS) property. We then introduce convertible codes, a new class of code pairs that allow for code conversions in a resource-efficient manner. For an important parameter regime (which we call the merge regime), along with the widely used linearity and MDS decodability constraints, we prove tight bounds on the number of nodes accessed during code conversion. In particular, our achievability result is an explicit construction of MDS convertible codes that is optimal for all parameter values in the merge regime, albeit with a high field size. We then present explicit low-field-size constructions of optimal MDS convertible codes for a broad range of parameters in the merge regime. Our results thus show that it is indeed possible to achieve code conversions with significantly fewer resources than the default approach of re-encoding.
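The [n, k] MDS encoding described above, and the costly default "re-encode everything" conversion that convertible codes are designed to beat, can be sketched with a small example. This is an illustrative toy, not the paper's construction: it uses a Vandermonde generator matrix over the prime field GF(257) (an assumed field choice), whose k x k submatrices are invertible whenever the evaluation points are distinct, giving the MDS property.

```python
# Toy [n, k] MDS code over GF(257) via a Vandermonde generator matrix.
# Any k of the n coded blocks suffice to recover the data (MDS property).
P = 257  # prime field size (illustrative choice)

def vandermonde(points, k):
    return [[pow(x, j, P) for j in range(k)] for x in points]

def encode(data, n):
    k = len(data)
    G = vandermonde(range(1, n + 1), k)  # distinct evaluation points 1..n
    return [sum(g * d for g, d in zip(row, data)) % P for row in G]

def decode(blocks, ids, k):
    # Recover the k data symbols from any k coded blocks by solving the
    # k x k Vandermonde system with Gauss-Jordan elimination mod P.
    A = vandermonde([i + 1 for i in ids], k)
    b = list(blocks)
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col])
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        inv = pow(A[col][col], P - 2, P)       # modular inverse via Fermat
        A[col] = [a * inv % P for a in A[col]]
        b[col] = b[col] * inv % P
        for r in range(k):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * ac) % P for a, ac in zip(A[r], A[col])]
                b[r] = (b[r] - f * b[col]) % P
    return b

data = [10, 20, 30, 40]                  # k = 4 data symbols
coded = encode(data, 6)                  # [n=6, k=4] code
assert decode(coded[2:6], [2, 3, 4, 5], 4) == data  # any 4 of 6 suffice

# Default conversion, e.g. merging two [6, 4] stripes into one [9, 8]
# stripe: read back ALL data and re-encode from scratch -- the costly
# baseline whose node-access cost convertible codes reduce.
data2 = [50, 60, 70, 80]
merged = data + data2
coded_new = encode(merged, 9)            # [n=9, k=8] code over merged data
```

Note that the re-encoding baseline accesses all k^I nodes of every initial stripe; the paper's tight bounds show how much of that access is avoidable in the merge regime.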
Two Piggybacking Codes with Flexible Sub-Packetization to Achieve Lower Repair Bandwidth
As a special class of array codes, piggybacking codes are MDS codes (i.e., any k out of n nodes can retrieve all data symbols) that achieve low repair bandwidth for single-node failures with low sub-packetization. In this paper, we propose two new piggybacking codes that have lower repair bandwidth than existing piggybacking codes with the same parameters. Our first piggybacking codes support flexible sub-packetization, and we show that they have lower repair bandwidth for any single-node failure than existing piggybacking codes over a range of parameter values. Moreover, we propose second piggybacking codes in which the sub-packetization is a multiple of the number of parity nodes, obtained by jointly designing the piggyback function for data-node repair and the transformation function for parity-node repair. We show that the proposed second piggybacking codes have the lowest repair bandwidth for any single-node failure among all existing piggybacking codes for the evaluated parameters.
A Repair Framework for Scalar MDS Codes
Several works have developed vector-linear maximum-distance-separable (MDS) storage codes that minimize the total communication cost required to repair a single coded symbol after an erasure, referred to as the repair bandwidth (BW). Vector codes allow communicating fewer sub-symbols per node, instead of the entire content, which enables nontrivial savings in repair BW. In sharp contrast, classic codes such as Reed-Solomon (RS) codes, used in current storage systems, are deemed to suffer from naive repair, i.e., downloading the entire stored message to repair one failed node. This is mainly because they are scalar-linear. In this work, we present a simple framework that treats scalar codes as vector-linear. In some cases, this allows significant savings in repair BW. We show that vectorized scalar codes exhibit properties that simplify the design of repair schemes. Our framework can be seen as a finite-field analogue of real interference alignment. Using our simplified framework, we design a scheme that we call clique-repair, which provably identifies the best linear repair strategy for any scalar 2-parity MDS code, under some conditions on the sub-field chosen for vectorization. We specify optimal repair schemes for specific (5,3)- and (6,4)-Reed-Solomon (RS) codes. Further, we present a repair strategy for the RS code currently deployed in the Facebook Analytics Hadoop cluster that leads to 20% repair-BW savings over naive repair, which is the repair scheme currently used for this code.
Comment: 10 pages; accepted to IEEE JSAC - Distributed Storage 201
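The naive-versus-vectorized repair comparison above can be made concrete with a small accounting sketch. The (n, k) = (14, 10) parameters and the sub-packetization level below are illustrative assumptions (the abstract does not state them); only the 20% savings figure comes from the text.

```python
# Illustrative repair-bandwidth accounting for a scalar [n, k] MDS code.
# Assumptions (not from the abstract): (n, k) = (14, 10), vectorized
# into alpha = 4 sub-symbols per node.
n, k, alpha = 14, 10, 4

# Naive repair of one failed node: contact any k nodes and download
# their full contents, i.e. k * alpha sub-symbols in total.
naive_bw = k * alpha

# A vectorized repair scheme downloads only some beta <= alpha
# sub-symbols per helper node; the 20% savings reported in the
# abstract corresponds to this total download:
target_bw = naive_bw * 0.8

print(f"naive repair: {naive_bw} sub-symbols; with 20% savings: {target_bw:.0f}")
```

The point of the vector view is that each helper node can ship a strict subset of its sub-symbols, which is impossible when the code is treated as purely scalar.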
A Family of Erasure Correcting Codes with Low Repair Bandwidth and Low Repair Complexity
We present the construction of a new family of erasure correcting codes for distributed storage that yield low repair bandwidth and low repair complexity. The construction is based on two classes of parity symbols. The primary goal of the first class of symbols is to provide good erasure correcting capability, while the second class facilitates node repair, reducing both the repair bandwidth and the repair complexity. We compare the proposed codes with other codes proposed in the literature.
Comment: Accepted, will appear in the proceedings of Globecom 2015 (Selected Areas in Communications: Data Storage)