219 research outputs found
Update-Efficiency and Local Repairability Limits for Capacity Approaching Codes
Motivated by distributed storage applications, we investigate the degree to
which capacity achieving encodings can be efficiently updated when a single
information bit changes, and the degree to which such encodings can be
efficiently (i.e., locally) repaired when a single encoded bit is lost.
Specifically, we first develop conditions under which optimum
error-correction and update-efficiency are possible, and establish that the
number of encoded bits that must change in response to a change in a single
information bit must scale logarithmically in the block-length of the code if
we are to achieve any nontrivial rate with vanishing probability of error over
the binary erasure or binary symmetric channels. Moreover, we show there exist
capacity-achieving codes with this scaling.
With respect to local repairability, we develop tight upper and lower bounds
on the number of remaining encoded bits that are needed to recover a single
lost bit of the encoding. In particular, we show that if the code-rate is ε
less than the capacity, then for optimal codes, the maximum number
of codeword symbols required to recover one lost symbol must scale as
log(1/ε).
Several variations on---and extensions of---these results are also developed.
Comment: Accepted to appear in JSA
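To make the update-efficiency notion in the abstract above concrete, here is a minimal sketch for a binary linear code. It relies on one standard fact: flipping information bit i changes the codeword by exactly row i of the generator matrix, so the number of encoded bits that change equals that row's Hamming weight. The matrix `G` (a systematic [7,4] Hamming code) and the helper names `encode` and `update_efficiency` are illustrative assumptions, not taken from the paper, and this toy code is not capacity-achieving.

```python
def encode(G, message):
    """Encode a binary message with generator matrix G over GF(2)."""
    n = len(G[0])
    return [sum(m * row[j] for m, row in zip(message, G)) % 2 for j in range(n)]


def update_efficiency(G):
    """Worst-case number of coded bits that change when one message bit flips.

    For a linear code this is the maximum Hamming weight of a row of G.
    """
    return max(sum(row) for row in G)


# Hypothetical toy example: systematic [7,4] Hamming generator matrix.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

message = [1, 0, 1, 1]
flipped = list(message)
flipped[3] ^= 1  # change a single information bit

before = encode(G, message)
after = encode(G, flipped)
changed = sum(a != b for a, b in zip(before, after))

print("bits changed by flipping message bit 3:", changed)
print("update-efficiency of G:", update_efficiency(G))
```

The abstract's result concerns how this quantity must grow with the block length; the sketch only shows how the quantity is measured for one fixed small code.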
Capacity of Locally Recoverable Codes
Motivated by applications in distributed storage, the notion of a locally
recoverable code (LRC) was introduced a few years back. In an LRC, any
coordinate of a codeword is recoverable by accessing only a small number of
other coordinates. While different properties of LRCs have been well-studied,
their performance on channels with random erasures or errors has been mostly
unexplored. In this note, we analyze the performance of LRCs over such
stochastic channels. In particular, for input-symmetric discrete memoryless
channels, we give a tight characterization of the gap to Shannon capacity when
LRCs are used over the channel.
Comment: Invited paper to the Information Theory Workshop (ITW) 201
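The locality property described in the abstract above (any coordinate is recoverable from a small number of other coordinates) can be illustrated with a toy code. This sketch assumes the simplest possible scheme: data symbols are split into groups of `r`, and each group gets one XOR parity, so any single lost symbol is rebuilt from the `r` other symbols in its group. The function names and the scheme are illustrative assumptions only; this is not a construction from any of the listed papers, and it makes no attempt at good minimum distance.

```python
from functools import reduce
from operator import xor


def encode_with_local_parity(data, r):
    """Split data into groups of r symbols and append one XOR parity per group."""
    if len(data) % r != 0:
        raise ValueError("len(data) must be a multiple of r")
    codeword = []
    for start in range(0, len(data), r):
        group = data[start:start + r]
        codeword.extend(group)
        codeword.append(reduce(xor, group))
    return codeword


def repair(codeword, lost, r):
    """Recover coordinate `lost` by reading only the r other symbols in its group.

    Returns the recovered value and the indices that were read.
    """
    group_start = (lost // (r + 1)) * (r + 1)
    helpers = [i for i in range(group_start, group_start + r + 1) if i != lost]
    return reduce(xor, (codeword[i] for i in helpers)), helpers


# Hypothetical data: six small integers, locality r = 3.
data = [5, 9, 12, 7, 3, 1]
r = 3
codeword = encode_with_local_parity(data, r)

value, helpers = repair(codeword, lost=5, r=r)
print("recovered", value, "by reading coordinates", helpers)
```

Each repair touches exactly `r` coordinates regardless of the total code length, which is the sense in which the recovery is local.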
Locally Encodable and Decodable Codes for Distributed Storage Systems
We consider the locality of encoding and decoding operations in distributed
storage systems (DSS), and propose a new class of codes, called locally
encodable and decodable codes (LEDC), that provides a higher degree of
operational locality compared to currently known codes. For a given locality
structure, we derive an upper bound on the global distance and demonstrate the
existence of an optimal LEDC for sufficiently large field size. In addition, we
also construct two families of optimal LEDC for fields with size linear in code
length.
Comment: 7 page
Local recovery in data compression for general sources
Source coding is concerned with optimally compressing data, so that it can be reconstructed up to a specified distortion from its compressed representation. Usually, in fixed-length compression, a sequence of n symbols (from some alphabet) is encoded to a sequence of k symbols (bits). The decoder produces an estimate of the original sequence of n symbols from the encoded bits. The rate-distortion function characterizes the optimal possible rate of compression allowing a given distortion in reconstruction as n grows. This function depends on the source probability distribution. In a locally recoverable decoding, to reconstruct a single symbol, only a few compressed bits are accessed. In this paper we find the limits of local recovery for rates near the rate-distortion function. For a wide set of source distributions, we show that it is possible to compress within ε of the rate-distortion function such that the local recoverability grows as Ω(log(1/ε)); that is, in order to recover one source symbol, at least Ω(log(1/ε)) bits of the compressed symbols are queried. We also show order-optimal impossibility results. Similar results are provided for lossless source coding as well.
National Science Foundation (U.S.). (grant CCF 1318093) United States. Air Force. Office of Scientific Research (FA9550-11-1-0183) National Science Foundation (U.S.). (grant CCF-1319828
Optimal Locally Repairable Codes and Connections to Matroid Theory
Petabyte-scale distributed storage systems are currently transitioning to
erasure codes to achieve higher storage efficiency. Classical codes like
Reed-Solomon are highly sub-optimal for distributed environments due to their
high overhead in single-failure events. Locally Repairable Codes (LRCs) form a
new family of codes that are repair efficient. In particular, LRCs minimize the
number of nodes participating in single node repairs during which they generate
small network traffic. Two large-scale distributed storage systems have already
implemented different types of LRCs: Windows Azure Storage and the Hadoop
Distributed File System RAID used by Facebook. The fundamental bounds for LRCs,
namely the best possible distance for a given code locality, were recently
discovered, but few explicit constructions exist. In this work, we present
explicit and optimal LRCs that are simple to construct. Our construction is
based on grouping Reed-Solomon (RS) coded symbols to obtain RS coded symbols
over a larger finite field. We then partition these RS symbols in small groups,
and re-encode them using a simple local code that offers low repair locality.
For the analysis of the optimality of the code, we derive a new result on the
matroid represented by the code generator matrix.
Comment: Submitted for publication, a shorter version was presented at ISIT 201