CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems
Data availability is critical in distributed storage systems, especially when
node failures are prevalent in real life. A key requirement is to minimize the
amount of data transferred among nodes when recovering the lost or unavailable
data of failed nodes. This paper explores recovery solutions based on
regenerating codes, which are shown to provide fault-tolerant storage and
minimum recovery bandwidth. Existing optimal regenerating codes are designed
for single node failures. We build a system called CORE, which augments
existing optimal regenerating codes to support a general number of failures
including single and concurrent failures. We theoretically show that CORE
achieves the minimum possible recovery bandwidth for most cases. We implement
CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to
20 storage nodes. We demonstrate that our CORE prototype conforms to our
theoretical findings and achieves recovery bandwidth saving when compared to
the conventional recovery approach based on erasure codes.
Comment: 25 pages
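As a rough illustration of the bandwidth saving this abstract refers to (a generic single-failure sketch, not CORE's actual multi-failure scheme), the standard minimum-storage regenerating (MSR) repair-bandwidth formula gamma = M*d/(k*(d-k+1)) can be compared against conventional erasure-code repair, which reconstructs the whole file; all parameter values below are illustrative assumptions:

```python
# Hedged sketch: repair-bandwidth comparison between conventional
# erasure-code repair and minimum-storage regenerating (MSR) repair.
# Uses the standard MSR formula gamma = M*d/(k*(d-k+1)); parameters
# are illustrative, not taken from the CORE paper.

def conventional_repair_bandwidth(M, k):
    # Conventional repair downloads k fragments of size M/k each,
    # i.e. the whole file of M bytes, to rebuild one lost fragment.
    return M

def msr_repair_bandwidth(M, k, d):
    # An MSR code contacts d helper nodes, each sending
    # M/(k*(d-k+1)) bytes, for a single-node repair.
    return M * d / (k * (d - k + 1))

M, k, d = 1_000_000, 10, 19  # assumed file size and code parameters
print(conventional_repair_bandwidth(M, k))      # 1000000
print(round(msr_repair_bandwidth(M, k, d)))     # 190000
```

With these assumed parameters, MSR repair moves roughly a fifth of the data that conventional repair does, which is the kind of saving the prototype evaluation measures.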
Construction of Near-Optimum Burst Erasure Correcting Low-Density Parity-Check Codes
In this paper, a simple, general-purpose and effective tool for the design of
low-density parity-check (LDPC) codes for iterative correction of bursts of
erasures is presented. The design method consists in starting from the
parity-check matrix of an LDPC code and developing an optimized parity-check
matrix, with the same performance on the memory-less erasure channel, and
suitable also for the iterative correction of single bursts of erasures. The
parity-check matrix optimization is performed by an algorithm called pivot
searching and swapping (PSS) algorithm, which executes permutations of
carefully chosen columns of the parity-check matrix, after a local analysis of
particular variable nodes called stopping set pivots. This algorithm can be in
principle applied to any LDPC code. If the input parity-check matrix is
designed for achieving good performance on the memory-less erasure channel,
then the code obtained after the application of the PSS algorithm provides good
joint correction of independent erasures and single erasure bursts. Numerical
results are provided in order to show the effectiveness of the PSS algorithm
when applied to different categories of LDPC codes.
Comment: 15 pages, 4 figures. IEEE Trans. on Communications, accepted
(submitted in Feb. 2007).
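The key observation behind the PSS algorithm can be illustrated in miniature: permuting columns of a parity-check matrix only relabels variable nodes, so performance on the memoryless erasure channel is unchanged, while the set of code positions covered by a single erasure burst does change. The matrix and the chosen swap below are illustrative assumptions, not from the paper:

```python
# Hedged sketch: a column swap on a toy parity-check matrix H.
# The PSS algorithm applies such permutations to carefully chosen
# columns (identified via stopping-set pivots); here we only show
# that swapping columns yields an equivalent code up to relabeling.
H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]

def swap_columns(H, i, j):
    """Return a copy of H with columns i and j exchanged."""
    return [[row[j] if c == i else row[i] if c == j else row[c]
             for c in range(len(row))] for row in H]

Hp = swap_columns(H, 0, 4)
print(Hp[0])  # first check equation with columns 0 and 4 exchanged
# Swapping back recovers the original matrix exactly.
print(swap_columns(Hp, 0, 4) == H)  # True
```

The actual algorithm chooses which columns to swap from a local analysis of stopping-set pivots; this sketch only demonstrates the equivalence-preserving operation it is built on.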
Enhanced Recursive Reed-Muller Erasure Decoding
Recent work has shown that Reed-Muller (RM) codes achieve the erasure
channel capacity. However, this performance is obtained with maximum-likelihood
decoding which can be costly for practical applications. In this paper, we
propose an encoding/decoding scheme for Reed-Muller codes on the packet erasure
channel based on Plotkin construction. We present several improvements over the
generic decoding. At a modest cost, they approach maximum-likelihood
decoding performance, especially on high-rate codes, while
significantly outperforming it in speed.
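The Plotkin construction the scheme builds on combines two shorter codewords u and v into (u | u + v), with addition over GF(2); on an erasure channel, a surviving half can recover its erased partner by XOR. A minimal sketch of the combining step (the vectors are illustrative, and real RM decoding recurses on this structure):

```python
# Hedged sketch of the Plotkin (u | u+v) combining step used in
# recursive Reed-Muller constructions. Over GF(2), addition is XOR,
# so given v and the u+v half, the erased u half is recovered by
# XORing them again. Inputs here are illustrative.
def plotkin_encode(u, v):
    """Concatenate u with the bitwise XOR u+v (GF(2) addition)."""
    assert len(u) == len(v)
    return u + [a ^ b for a, b in zip(u, v)]

u = [1, 0, 1, 1]
v = [0, 1, 1, 0]
c = plotkin_encode(u, v)
print(c)  # [1, 0, 1, 1, 1, 1, 0, 1]

# Recovery example: if the first half (u) is erased but v is known,
# XOR the second half with v to get u back.
recovered_u = [a ^ b for a, b in zip(c[4:], v)]
print(recovered_u == u)  # True
```

The recursive decoder in the paper exploits exactly this structure at every level of the construction, which is where its speed advantage over maximum-likelihood decoding comes from.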
Low-Complexity Codes for Random and Clustered High-Order Failures in Storage Arrays
RC (Random/Clustered) codes are a new, efficient array-code family for recovering from 4-erasures. RC codes correct most 4-erasures, and essentially all 4-erasures that are clustered. Clustered erasures are introduced as a new erasure model for storage arrays. This model draws its motivation from correlated device failures caused by the physical proximity of devices or by the age proximity of endurance-limited solid-state drives. The reliability of storage arrays that employ RC codes is analyzed and compared to known codes. The new RC code is significantly more efficient, in all practical implementation factors, than the best known 4-erasure-correcting MDS code. These factors include small-write update complexity, full-device update complexity, decoding complexity, and the number of supported devices in the array.
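The clustered-erasure model distinguishes failure sets by how close the affected device indices are. A toy predicate makes the distinction concrete (the window size is an illustrative assumption, not the paper's definition):

```python
# Hedged sketch: classifying a 4-erasure as "clustered" when all
# erased device indices fall within a small window of adjacent
# positions. The window size of 4 is an illustrative assumption.
def is_clustered(erased, window=4):
    """True if all erased indices fit inside `window` adjacent slots."""
    erased = sorted(erased)
    return erased[-1] - erased[0] < window

print(is_clustered([3, 4, 5, 6]))    # True  (adjacent devices)
print(is_clustered([0, 5, 9, 13]))   # False (scattered devices)
```

Correlated failures from physical or age proximity tend to produce the first kind of pattern, which is why a code that handles essentially all clustered 4-erasures covers the practically likely cases.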
Asymmetric LOCO Codes: Constrained Codes for Flash Memories
In data storage and data transmission, certain patterns are more likely to be
subject to error when written (transmitted) onto the media. In magnetic
recording systems with binary data and bipolar non-return-to-zero signaling,
patterns that have insufficient separation between consecutive transitions
exacerbate inter-symbol interference. Constrained codes are used to eliminate
such error-prone patterns. A recent example is a new family of
capacity-achieving constrained codes, named lexicographically-ordered
constrained codes (LOCO codes). LOCO codes are symmetric, that is, the set of
forbidden patterns is closed under taking pattern complements. LOCO codes are
suboptimal in terms of rate when used in Flash devices where block erasure is
employed since the complement of an error-prone pattern is not detrimental in
these devices. This paper introduces asymmetric LOCO codes (A-LOCO codes),
which are lexicographically-ordered constrained codes that forbid only those
patterns that are detrimental for Flash performance. A-LOCO codes are also
capacity-achieving, and at finite-lengths, they offer higher rates than the
available state-of-the-art constrained codes designed for the same goal. The
mapping-demapping between the index and the codeword in A-LOCO codes allows
low-complexity encoding and decoding algorithms that are simpler than their
LOCO counterparts.
Comment: 9 pages (double column), 0 figures, accepted at the Annual Allerton Conference on Communication, Control, and Computing.
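The core idea of lexicographically-ordered constrained coding can be shown in a toy form: list the fixed-length binary strings avoiding a forbidden pattern in lexicographic order, and encode an index as the i-th admissible string. The forbidden pattern "101" below is an illustrative stand-in, not the actual (A-)LOCO forbidden set, and real (A-)LOCO codes compute the index-codeword mapping arithmetically rather than by enumeration:

```python
from itertools import product

# Hedged toy sketch of lexicographically-ordered constrained coding:
# enumerate length-n binary strings that avoid one forbidden pattern
# ("101" is an illustrative choice) in lexicographic order, so an
# index maps to a codeword by position. Real (A-)LOCO encoders derive
# this mapping with closed-form rank computations, not enumeration.
def admissible(n, forbidden="101"):
    return ["".join(bits) for bits in product("01", repeat=n)
            if forbidden not in "".join(bits)]

words = admissible(4)
print(len(words))            # 12 admissible length-4 strings
print(words[0], words[-1])   # 0000 1111
```

Restricting the forbidden set to only the patterns that actually harm the target medium, as A-LOCO does for Flash, enlarges the admissible set and therefore raises the achievable rate at a given length.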