TOWARDS DIGITAL TWINS FOR OPTIMIZING METRICS IN DISTRIBUTED STORAGE SYSTEMS - A REVIEW
With exponential data growth, there is a crucial need for highly available, scalable, reliable, and cost-effective Distributed Storage Systems (DSSs). To keep such systems efficient and fault tolerant, traditional DSSs typically rely on replication and erasure coding. These systems are nevertheless prone to failure and require dedicated failure prevention and recovery algorithms. Failure recovery and data reconstruction techniques, in turn, optimize different performance metrics during the recovery process. In this paper, DSS performance metrics are introduced. Several recent papers on adopting erasure coding in DSSs are surveyed, and the performance metrics introduced in those papers are highlighted. Next, we present recent literature in which Digital Twins (DTs) are used to monitor DSSs and to assist data center managers in intelligent decision-making. Finally, important open issues are identified to inspire future studies toward fully efficient DSSs.
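As a rough illustration of the trade-off between replication and erasure coding mentioned in this abstract, the sketch below compares two textbook metrics: storage overhead and the number of simultaneous losses tolerated. The function names and the choice of these two metrics are assumptions made for illustration; they are not taken from the reviewed paper.

```python
def replication_profile(copies):
    """Textbook metrics for keeping `copies` full replicas of an object."""
    return {
        "storage_overhead": float(copies),   # stored bytes / original bytes
        "tolerated_failures": copies - 1,    # replicas that may be lost
    }


def mds_erasure_profile(n, k):
    """Textbook metrics for an (n, k) MDS erasure code.

    The object is split into k fragments and encoded into n fragments,
    any k of which suffice to recover it.
    """
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return {
        "storage_overhead": n / k,
        "tolerated_failures": n - k,
    }


# Hypothetical comparison: 3-way replication versus a (9, 6) code.
three_way = replication_profile(3)
nine_six = mds_erasure_profile(9, 6)
```

With these illustrative parameters, the erasure code tolerates more losses at half the storage cost, which is the usual motivation for erasure coding in storage systems.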
Self-repairing Homomorphic Codes for Distributed Storage Systems
Erasure codes provide a storage-efficient alternative to replication-based
redundancy in (networked) storage systems. However, they entail high
communication overhead for maintenance, when some of the encoded fragments are
lost and need to be replenished. Such overheads arise from the fundamental need
to first recreate (or separately keep) a copy of the whole object before any
individual encoded fragment can be generated and replenished. There has
recently been intense interest in exploring alternatives, the most prominent being
regenerating codes (RGC) and hierarchical codes (HC). We propose as an
alternative a new family of codes to improve the maintenance process, which we
call self-repairing codes (SRC), with the following salient features: (a)
encoded fragments can be repaired directly from other subsets of encoded
fragments without first having to reconstruct the original data, ensuring that
(b) a fragment is repaired from a fixed number of encoded fragments, the number
depending only on how many encoded blocks are missing and independent of which
specific blocks are missing. These properties allow for not only low
communication overhead to recreate a missing fragment, but also independent
reconstruction of different missing fragments in parallel, possibly in
different parts of the network. We analyze the static resilience of SRCs with
respect to traditional erasure codes, and observe that SRCs incur marginally
larger storage overhead in order to achieve the aforementioned properties. The
salient SRC properties naturally translate to low communication overheads for
reconstruction of lost fragments, and allow reconstruction with lower latency
by facilitating repairs in parallel. These desirable properties make
self-repairing codes a good and practical candidate for networked distributed
storage systems.
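The repair property described in this abstract can be illustrated with a small sketch. The abstract does not spell out the construction, so the code below assumes one simple homomorphic encoding: the object's symbols are the coefficients of a linearized polynomial p(X) = sum_i o_i X^(2^i) over GF(2^4), and each fragment is p(alpha) for a nonzero field element alpha. Because such a polynomial is additive in characteristic 2, p(a + b) = p(a) + p(b), so a lost fragment can be rebuilt by XOR-ing two surviving fragments, without decoding the object. The field size, the reduction polynomial, and all function names are illustrative assumptions.

```python
M = 4            # work in GF(2^4); an assumed, illustrative field size
POLY = 0b10011   # x^4 + x + 1, an irreducible polynomial over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M): carry-less multiply, reduce by POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= POLY
    return result


def frobenius_power(x, i):
    """Return x^(2^i), computed by squaring i times."""
    for _ in range(i):
        x = gf_mul(x, x)
    return x


def encode_fragment(obj, alpha):
    """Evaluate p(X) = sum_i obj[i] * X^(2^i) at alpha."""
    acc = 0
    for i, coeff in enumerate(obj):
        acc ^= gf_mul(coeff, frobenius_power(alpha, i))
    return acc


def encode(obj):
    """Map every nonzero field element alpha to the fragment p(alpha)."""
    return {alpha: encode_fragment(obj, alpha) for alpha in range(1, 1 << M)}


def repair(survivors, lost):
    """Rebuild fragment `lost` from one surviving pair (beta, lost XOR beta)."""
    for beta in survivors:
        partner = lost ^ beta
        if beta != lost and partner in survivors:
            return survivors[beta] ^ survivors[partner], (beta, partner)
    raise ValueError("no surviving repair pair for this fragment")


# Hypothetical usage: a two-symbol object, fragment 5 lost and repaired.
fragments = encode([3, 7])
survivors = {a: v for a, v in fragments.items() if a != 5}
rebuilt, pair_used = repair(survivors, 5)
```

In this sketch a single lost fragment is always repaired from exactly two others, and several different pairs can serve the same repair, which matches the abstract's point that repairs need few fragments and can proceed in parallel.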
- …