Storage codes -- coding rate and repair locality
The {\em repair locality} of a distributed storage code is the maximum number
of nodes that ever needs to be contacted during the repair of a failed node.
Having small repair locality is desirable, since it is proportional to the
number of disk accesses during repair. However, recent publications show that
small repair locality comes with a penalty in terms of code distance or storage
overhead if exact repair is required.
Here, we first review some of the main results on storage codes under various
repair regimes and discuss the recent work on possible
(information-theoretical) trade-offs between repair locality and other code
parameters like storage overhead and code distance, under the exact repair
regime.
Then we present some new information theoretical lower bounds on the storage
overhead as a function of the repair locality, valid for all common coding and
repair models. In particular, we show that if each of the $n$ nodes in a
distributed storage system has storage capacity $\alpha$ and if, at any time, a
failed node can be {\em functionally} repaired by contacting {\em some} set of
$r$ nodes (which may depend on the actual state of the system) and downloading
an amount $\beta$ of data from each, then in the extreme cases where $\alpha=\beta$ or
$\alpha = r\beta$, the maximal coding rate is at most $r/(r+1)$ or $1/2$, respectively
(that is, the excess storage overhead is at least $1/r$ or $1$, respectively).
Comment: Accepted for publication in ICNC'13, San Diego, USA
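As a reading aid for the last sentence of this abstract, the pairing of a rate bound with an overhead bound can be sketched as follows. The symbol $R$ and the definition of excess storage overhead as $1/R - 1$ are assumptions made for illustration; the abstract itself does not spell them out.

```latex
% Assumption: R is the coding rate (useful data / total stored data),
% and the excess storage overhead is the extra storage per unit of data.
\[
  \text{excess storage overhead} \;=\; \frac{1}{R} - 1,
  \qquad\text{so}\qquad
  R \le \frac{1}{2}
  \;\Longleftrightarrow\;
  \frac{1}{R} - 1 \ge 1 .
\]
```

Under this reading, a rate of at most 1/2 and an excess overhead of at least 1 are the same statement, which matches the way the abstract quotes the two bounds side by side.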
High-Rate Regenerating Codes Through Layering
In this paper, we provide explicit constructions for a class of exact-repair
regenerating codes that possess a layered structure. These regenerating codes
correspond to interior points on the storage-repair-bandwidth tradeoff, and
compare very favourably with a scheme that employs space-sharing between
MSR and MBR codes. For the parameter set with , we
construct a class of codes with an auxiliary parameter , referred to as
canonical codes. With in the range , these codes operate in
the region between the MSR point and the MBR point, and perform significantly
better than the space-sharing line. They only require a field size greater than
. For the case of , canonical codes can also be shown to
achieve an interior point on the line-segment joining the MSR point and the
next point of slope-discontinuity on the storage-repair-bandwidth tradeoff.
Thus we establish the existence of exact-repair codes on a point other than the
MSR and the MBR point on the storage-repair-bandwidth tradeoff. We also
construct layered regenerating codes for general parameter set ,
which we refer to as non-canonical codes. These codes also perform
significantly better than the space-sharing line, though they require a
significantly higher field size. All the codes constructed in this paper are
high-rate, can repair multiple node-failures and do not require any computation
at the helper nodes. We also construct optimal codes with locality in which the
local codes are layered regenerating codes.
Comment: 20 pages, 9 figures
Optimal Locally Repairable Codes and Connections to Matroid Theory
Petabyte-scale distributed storage systems are currently transitioning to
erasure codes to achieve higher storage efficiency. Classical codes like
Reed-Solomon are highly sub-optimal for distributed environments due to their
high overhead in single-failure events. Locally Repairable Codes (LRCs) form a
new family of codes that are repair efficient. In particular, LRCs minimize the
number of nodes participating in single node repairs during which they generate
small network traffic. Two large-scale distributed storage systems have already
implemented different types of LRCs: Windows Azure Storage and the Hadoop
Distributed File System RAID used by Facebook. The fundamental bounds for LRCs,
namely the best possible distance for a given code locality, were recently
discovered, but few explicit constructions exist. In this work, we present
explicit and optimal LRCs that are simple to construct. Our construction is
based on grouping Reed-Solomon (RS) coded symbols to obtain RS coded symbols
over a larger finite field. We then partition these RS symbols in small groups,
and re-encode them using a simple local code that offers low repair locality.
For the analysis of the optimality of the code, we derive a new result on the
matroid represented by the code generator matrix.
Comment: Submitted for publication, a shorter version was presented at ISIT 201
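The idea of partitioning coded symbols into small groups and re-encoding each group with a simple local code can be sketched as below. This is only an illustration of low repair locality, not the construction from the paper: the use of a single XOR parity per group, the function names `add_local_parity` and `repair`, and the byte values standing in for RS-coded symbols are all assumptions.

```python
from functools import reduce
from operator import xor


def add_local_parity(symbols, group_size):
    """Split symbols into groups and append one XOR parity to each group.

    Assumption: a single XOR parity stands in for the 'simple local code'.
    """
    groups = []
    for start in range(0, len(symbols), group_size):
        group = list(symbols[start:start + group_size])
        group.append(reduce(xor, group))  # local parity symbol
        groups.append(group)
    return groups


def repair(group, lost_index):
    """Rebuild one lost symbol using only the other symbols of its group."""
    return reduce(xor, (s for i, s in enumerate(group) if i != lost_index))


# Hypothetical byte values standing in for RS-coded symbols.
coded = [0x1F, 0xA2, 0x37, 0x5C, 0x09, 0xE4]
groups = add_local_parity(coded, group_size=3)

# Lose the first symbol of the second group and repair it locally.
lost_group, lost_index = 1, 0
repaired = repair(groups[lost_group], lost_index)
```

In this sketch a single failure is repaired by reading the three surviving symbols of one group, rather than contacting nodes across the whole codeword.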
A Class of MSR Codes for Clustered Distributed Storage
Clustered distributed storage models real data centers where intra- and
cross-cluster repair bandwidths are different. In this paper, exact-repair
minimum-storage-regenerating (MSR) codes achieving capacity of clustered
distributed storage are designed. Focus is given on two cases: and
, where is the ratio of the available cross- and
intra-cluster repair bandwidths, is the total number of distributed nodes
and is the number of contact nodes in data retrieval. The former represents
the scenario where cross-cluster communication is not allowed, while the latter
corresponds to the case of minimum cross-cluster bandwidth that is possible
under the minimum storage overhead constraint. For the case, two
types of locally repairable codes are proven to achieve the MSR point. As for
, an explicit MSR coding scheme is suggested for the
two-cluster situation under the specific condition of .
Comment: 9 pages, a part of this paper is submitted to IEEE ISIT201
A family of optimal locally recoverable codes
A code over a finite alphabet is called locally recoverable (LRC) if every
symbol in the encoding is a function of a small number (at most $r$) other
symbols. We present a family of LRC codes that attain the maximum possible
value of the distance for a given locality parameter and code cardinality. The
codewords are obtained as evaluations of specially constructed polynomials over
a finite field, and reduce to a Reed-Solomon code if the locality parameter $r$
is set to be equal to the code dimension. The size of the code alphabet for
most parameters is only slightly greater than the code length. The recovery
procedure is performed by polynomial interpolation over $r$ points. We also
construct codes with several disjoint recovering sets for every symbol. This
construction enables the system to conduct several independent and simultaneous
recovery processes of a specific symbol by accessing different parts of the
codeword. This property enables high availability of frequently accessed data
("hot data").
Comment: Minor changes. This is the final published version of the paper
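The recovery step mentioned in this abstract, interpolation over a small number of points, can be illustrated with a minimal sketch. It is not the authors' construction: it only assumes that the symbols of one recovery group are evaluations of a polynomial of degree less than r (writing r for the locality parameter), and it picks the prime field GF(13), r = 2, and the names `interpolate_at` and `poly_eval` purely for illustration.

```python
P = 13  # small prime field GF(13), chosen only for illustration


def interpolate_at(points, x0, p=P):
    """Evaluate at x0 the unique polynomial of degree < len(points)
    through the given (x, y) pairs, with arithmetic modulo the prime p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def poly_eval(coeffs, x, p=P):
    """Horner evaluation of coeffs[0] + coeffs[1]*x + ... modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


# Hypothetical recovery group: r + 1 = 3 symbols that are evaluations
# of a polynomial of degree < r = 2, here 5 + 3x.
coeffs = [5, 3]
xs = [1, 2, 3]
symbols = {x: poly_eval(coeffs, x) for x in xs}

# Lose the symbol at x = 2 and recover it from the r = 2 survivors.
lost = 2
helpers = [(x, y) for x, y in symbols.items() if x != lost]
recovered = interpolate_at(helpers, lost)
```

Here the lost symbol is rebuilt from exactly r other symbols of its group, which is the sense in which the locality parameter bounds the number of symbols read during recovery.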