Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need for
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality and hence
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially
supported by Swiss National Science Foundation SNSF Doc.Mobility 162014; 2018
48th Annual IEEE/IFIP International Conference on Dependable Systems and
Networks (DSN).
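The chain idea behind entanglement can be sketched in a few lines. The toy below is a single-chain simplification (not the full alpha-entanglement construction, which maintains alpha chains per block): each parity "entangles" a new data block with the previous parity, so any lost data block is rebuilt from its two neighbouring parities. All function names are illustrative.

```python
# Toy single-chain entanglement sketch: p[i] = d[i] XOR p[i-1].
# A lost data block d[i] is recovered as p[i-1] XOR p[i], without
# contacting the original source. Real alpha entanglement codes keep
# alpha such chains, multiplying the available repair paths.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def entangle(data_blocks, block_size=4):
    """Build the parity chain p[i] = d[i] XOR p[i-1] (p[-1] = zeros)."""
    parities = []
    prev = bytes(block_size)
    for d in data_blocks:
        prev = xor(d, prev)
        parities.append(prev)
    return parities

def repair(i, parities, block_size=4):
    """Recover data block i from its two adjacent parities."""
    left = parities[i - 1] if i > 0 else bytes(block_size)
    return xor(left, parities[i])

data = [b"aaaa", b"bbbb", b"cccc"]
par = entangle(data)
assert repair(1, par) == b"bbbb"   # lost block rebuilt from parities only
```

Note how storage overhead grows linearly (one parity per data block per chain) while each extra chain adds another independent repair path, which is the trade-off the abstract describes.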
Self-Repairing Disk Arrays
As the prices of magnetic storage continue to decrease, the cost of replacing
failed disks becomes increasingly dominated by the cost of the service call
itself. We propose to eliminate these calls by building disk arrays that
contain enough spare disks to operate without any human intervention during
their whole lifetime. To evaluate the feasibility of this approach, we have
simulated the behavior of two-dimensional disk arrays with n parity disks and
n(n-1)/2 data disks under realistic failure and repair assumptions. Our
conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a
99.999 percent probability of not losing data over four years. We observe that
the same objectives cannot be reached with RAID level 6 organizations and would
require RAID stripes that could tolerate triple disk failures.
Comment: Part of ADAPT Workshop proceedings, 2015 (arXiv:1412.2347).
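A back-of-the-envelope version of the spare-pool question can be computed directly. The sketch below assumes failed disks are immediately swapped from the spare pool, so total failures over the mission are roughly Poisson with mean T·N/MTTF, and asks how likely it is to exhaust n(n+1)/2 spares in four years. The MTTF figure is an assumed illustrative value, and this ignores the two-dimensional repair dynamics the paper actually simulates.

```python
import math

def p_spares_exhausted(n, mttf_hours=200_000.0, years=4):
    """Probability of using up the spare pool, under a crude Poisson
    failure model. mttf_hours is an assumed illustrative figure."""
    disks = n + n * (n - 1) // 2            # n parity + n(n-1)/2 data disks
    spares = n * (n + 1) // 2               # spare pool size from the paper
    lam = years * 8766 * disks / mttf_hours  # expected failures over T
    # P(more than `spares` failures) under Poisson(lam)
    p_le = sum(math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(spares + 1))
    return 1.0 - p_le

for n in (4, 6, 8):
    print(n, f"{p_spares_exhausted(n):.2e}")
```

Even this crude model shows why the pool is generous: the expected number of failures stays well below n(n+1)/2 for realistic MTTF values, consistent with the five-nines claim above.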
Self-repairing Homomorphic Codes for Distributed Storage Systems
Erasure codes provide a storage efficient alternative to replication based
redundancy in (networked) storage systems. They however entail high
communication overhead for maintenance, when some of the encoded fragments are
lost and need to be replenished. Such overheads arise from the fundamental need
to recreate (or keep separately) first a copy of the whole object before any
individual encoded fragment can be generated and replenished. There has
recently been intense interest in exploring alternatives, the most prominent
being regenerating codes (RGC) and hierarchical codes (HC). We propose as an
alternative a new family of codes to improve the maintenance process, which we
call self-repairing codes (SRC), with the following salient features: (a)
encoded fragments can be repaired directly from other subsets of encoded
fragments without having to reconstruct first the original data, ensuring that
(b) a fragment is repaired from a fixed number of encoded fragments, the number
depending only on how many encoded blocks are missing and independent of which
specific blocks are missing. These properties allow for not only low
communication overhead to recreate a missing fragment, but also independent
reconstruction of different missing fragments in parallel, possibly in
different parts of the network. We analyze the static resilience of SRCs with
respect to traditional erasure codes, and observe that SRCs incur marginally
larger storage overhead in order to achieve the aforementioned properties. The
salient SRC properties naturally translate to low communication overheads for
reconstruction of lost fragments, and allow reconstruction with lower latency
by facilitating repairs in parallel. These desirable properties make
self-repairing codes a good and practical candidate for networked distributed
storage systems.
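The repair-without-decode property (a) and the fixed repair degree (b) can be illustrated with a deliberately trivial code. The sketch below uses a single XOR parity over two fragments, not the actual homomorphic SRC construction (which is built from linearized polynomials and scales to many fragments); it only demonstrates that a lost fragment is rebuilt from other fragments directly, never from the reconstructed object.

```python
# Toy illustration of the self-repairing property: with fragments
# f1 = d1, f2 = d2, f3 = d1 XOR d2, any single lost fragment is the
# XOR of the two survivors -- a fixed repair degree, independent of
# *which* fragment is missing, and no full-object decode is needed.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(d1: bytes, d2: bytes):
    return {1: d1, 2: d2, 3: xor(d1, d2)}   # three encoded fragments

def repair(lost: int, fragments: dict) -> bytes:
    others = [f for i, f in fragments.items() if i != lost]
    return xor(others[0], others[1])

frags = encode(b"data", b"more")
assert repair(3, frags) == frags[3]
assert repair(1, frags) == frags[1]
```

Because each repair touches a fixed small subset, two different missing fragments can be rebuilt in parallel by disjoint groups of nodes, which is the latency advantage the abstract highlights.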
Redundancy and Aging of Efficient Multidimensional MDS-Parity Protected Distributed Storage Systems
The effect of redundancy on the aging of an efficient Maximum Distance
Separable (MDS) parity-protected distributed storage system that consists of
multidimensional arrays of storage units is explored. In light of the
experimental evidence and survey data, this paper develops generalized
expressions for the reliability of array storage systems based on more
realistic time to failure distributions such as Weibull. For instance, a
distributed disk array system is considered in which the array components are
disseminated across the network and are subject to independent failure rates.
From this model, generalized closed-form hazard rate expressions are derived.
These expressions are extended to estimate the asymptotic reliability
behavior of large scale storage networks equipped with MDS parity-based
protection. Unlike previous studies, a generic hazard rate function is assumed,
a generic MDS code for parity generation is used, and an evaluation of the
implications of adjustable redundancy level for an efficient distributed
storage system is presented. Results of this study are applicable to any
erasure correction code as long as it is accompanied by a suitable structure
and an appropriate encoding/decoding algorithm such that the MDS property is
maintained.
Comment: 11 pages, 6 figures, Accepted for publication in IEEE Transactions on
Device and Materials Reliability (TDMR), Nov. 201
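The basic reliability computation behind such an analysis is easy to sketch. For an (n, k) MDS-protected array whose units fail independently with Weibull lifetimes, the system survives time t as long as at least k of the n units survive. The shape and scale parameters below are illustrative assumptions, not the paper's fitted values.

```python
import math

def weibull_unit_reliability(t, shape, scale):
    """Weibull survival function R(t) = exp(-(t/scale)^shape)."""
    return math.exp(-((t / scale) ** shape))

def mds_system_reliability(t, n, k, shape=1.2, scale=500_000.0):
    """P(at least k of n independent Weibull units survive to time t).
    shape/scale are illustrative, in hours."""
    r = weibull_unit_reliability(t, shape, scale)
    return sum(math.comb(n, i) * r**i * (1 - r) ** (n - i)
               for i in range(k, n + 1))

# e.g. a (14, 10) MDS code after one year (8766 hours):
print(f"{mds_system_reliability(8766, n=14, k=10):.6f}")
```

A shape parameter above 1 models the wear-out (aging) regime the abstract discusses; setting shape = 1 recovers the memoryless exponential case that the paper argues is unrealistic.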
Storage codes -- coding rate and repair locality
The {\em repair locality} of a distributed storage code is the maximum number
of nodes that ever needs to be contacted during the repair of a failed node.
Having small repair locality is desirable, since it is proportional to the
number of disk accesses during repair. However, recent publications show that
small repair locality comes with a penalty in terms of code distance or storage
overhead if exact repair is required.
Here, we first review some of the main results on storage codes under various
repair regimes and discuss the recent work on possible
(information-theoretical) trade-offs between repair locality and other code
parameters like storage overhead and code distance, under the exact repair
regime.
Then we present some new information theoretical lower bounds on the storage
overhead as a function of the repair locality, valid for all common coding and
repair models. In particular, we show that if each of the nodes in a
distributed storage system has storage capacity α and if, at any time, a
failed node can be {\em functionally} repaired by contacting {\em some} set of
r nodes (which may depend on the actual state of the system) and downloading
an amount β of data from each, then in the extreme cases where α = β or
α = rβ, the maximal coding rate is at most r/(r+1) or 1/2, respectively
(that is, the excess storage overhead is at least 1/r or 1, respectively).
Comment: Accepted for publication in ICNC'13, San Diego, US
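The correspondence between coding rate and excess storage overhead used in this abstract is a simple identity worth spelling out: a code of rate R stores 1/R units of raw data per unit of user data, so the excess overhead is 1/R - 1. The sketch below checks the rate-1/2 case quoted above.

```python
# Excess storage overhead of a code of rate R: storing one unit of
# user data takes 1/R raw units, so the overhead beyond the data
# itself is 1/R - 1.

def excess_overhead(rate: float) -> float:
    return 1.0 / rate - 1.0

assert excess_overhead(0.5) == 1.0            # rate 1/2 -> overhead 1
for r in (2, 3, 4):
    # a rate of r/(r+1) corresponds to excess overhead 1/r
    assert abs(excess_overhead(r / (r + 1)) - 1 / r) < 1e-12
```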
Emerging Approaches to DNA Data Storage: Challenges and Prospects
With the total amount of worldwide data skyrocketing, the global data storage demand is predicted to grow to 1.75 × 10^14 GB by 2025. Traditional storage methods have difficulties keeping pace given that current storage media have a maximum density of 10^3 GB/mm^3. As such, data production will far exceed the capacity of currently available storage methods. The costs of maintaining and transferring data, as well as the limited lifespans and significant data losses associated with current technologies also demand advanced solutions for information storage. Nature offers a powerful alternative through the storage of information that defines living organisms in unique orders of four bases (A, T, C, G) located in molecules called deoxyribonucleic acid (DNA). DNA molecules as information carriers have many advantages over traditional storage media. Their high storage density, potentially low maintenance cost, ease of synthesis, and chemical modification make them an ideal alternative for information storage. To this end, rapid progress has been made over the past decade by exploiting user-defined DNA materials to encode information. In this review, we discuss the most recent advances of DNA-based data storage with a major focus on the challenges that remain in this promising field, including the current intrinsic low speed in data writing and reading and the high cost per byte stored. Alternatively, data storage relying on DNA nanostructures (as opposed to DNA sequence) as well as on other combinations of nanomaterials and biomolecules are proposed with promising technological and economic advantages. In summarizing the advances that have been made and underlining the challenges that remain, we provide a roadmap for the ongoing research in this rapidly growing field, which will enable the development of technological solutions to the global demand for superior storage methodologies.
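The simplest sequence-based encoding the review alludes to maps every byte to four bases at two bits per base. The sketch below shows that quaternary mapping only; real pipelines add biochemical constraints (GC content, homopolymer limits) and error-correcting codes on top.

```python
# Minimal sketch of sequence-based DNA data storage: 2 bits per base,
# 4 bases per byte. The bit-to-base assignment below is one common
# convention, not a standard.

BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BITS_TO_BASE[(byte >> s) & 0b11]
                   for byte in data for s in (6, 4, 2, 0))

def decode(dna: str) -> bytes:
    """Invert encode(): four bases back to one byte."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)

seq = encode(b"DNA")
assert decode(seq) == b"DNA"
assert len(seq) == 12   # 4 bases per byte, i.e. 2 bits of payload per base
```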
In-Network Redundancy Generation for Opportunistic Speedup of Backup
Erasure coding is a storage-efficient alternative to replication for
achieving reliable data backup in distributed storage systems. During the
storage process, traditional erasure codes require a unique source node to
create and upload all the redundant data to the different storage nodes.
However, such a source node may have limited communication and computation
capabilities, which constrain the storage process throughput. Moreover, the
source node and the different storage nodes might not be able to send and
receive data simultaneously -- e.g., nodes might be busy in a datacenter
setting, or simply be offline in a peer-to-peer setting -- which can further
threaten the efficacy of the overall storage process. In this paper we propose
an "in-network" redundancy generation process which distributes the data
insertion load among the source and storage nodes by allowing the storage nodes
to generate new redundant data by exchanging partial information among
themselves, improving the throughput of the storage process. The process is
carried out asynchronously, utilizing spare bandwidth and computing resources
from the storage nodes. The proposed approach leverages the local
repairability property of newly proposed erasure codes tailor-made for the
needs of distributed storage systems. We analytically show that the performance
of this technique relies on an efficient usage of the spare node resources, and
we derive a set of scheduling algorithms to maximize that usage. We
experimentally show, using availability traces from real peer-to-peer
applications as well as Google data center availability and workload traces,
that our algorithms can, depending on the environment characteristics, increase
the throughput of the storage process significantly (up to 90% in data centers,
and 60% in peer-to-peer settings) with respect to the classical naive data
insertion approach.
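The in-network idea can be sketched with the simplest possible redundancy, a single XOR parity. In the sketch below, the source uploads only the k data fragments; the node holding the parity then accumulates it from partial blocks sent by the other storage nodes, instead of the source computing and uploading every redundant fragment itself. Fragment contents are hypothetical, and real schemes use locally repairable codes rather than one global parity.

```python
# Sketch of in-network redundancy generation: the source's upload load
# is only the k data fragments; the parity is built storage-node-side
# by XOR-ing partial information exchanged among the nodes (in practice
# asynchronously, using their spare bandwidth).

from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Source-side work: k data fragments, one per storage node.
data_fragments = [b"frag", b"ment", b"sxyz"]

# Storage-node-side work: the parity node folds in each partial block
# as it arrives.
parity = reduce(xor, data_fragments)

# Any single lost data fragment is then recoverable in-network:
lost = 1
recovered = reduce(xor,
                   [f for i, f in enumerate(data_fragments) if i != lost]
                   + [parity])
assert recovered == data_fragments[lost]
```

The throughput gain in the abstract comes precisely from this shift: redundancy generation traffic moves off the single source's uplink and onto the aggregate spare bandwidth of the storage nodes.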