Implementation and performance evaluation of distributed cloud storage solutions using random linear network coding
This paper advocates the use of random linear network coding for storage in distributed clouds in order to reduce storage and traffic costs in dynamic settings, i.e. when adding and removing numerous storage devices/clouds on the fly and when the number of reachable clouds is limited. We introduce various network coding approaches that trade off reliability, storage and traffic costs, and system complexity, relying on probabilistic recoding for cloud regeneration. We compare these approaches with other approaches based on data replication and Reed-Solomon codes. A simulator has been developed to carry out a thorough performance evaluation of the various approaches when relying on different system settings, e.g., finite fields, and network/storage conditions, e.g., storage space used per cloud, limited network use, and limited recoding capabilities. In contrast to standard coding approaches, our techniques do not require us to retrieve the full original information in order to store meaningful information. Our numerical results show a high resilience over a large number of regeneration cycles compared to other approaches.
Danish Council for Independent Research (Green Mobile Cloud Project DFF-090201372B); Hungarian National Development Agency (Research and Technology Innovation Fund Grant KMR_12-1-2012-0441); European Union (European Social Fund Project FuturICT.hu Grant TAMOP-4.2.2.C-11/1/KONV-2012-0013)
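The recoding idea in the abstract above (a new cloud can store useful coded data without anyone first retrieving the full original) can be illustrated with a minimal sketch. It assumes the binary field GF(2), packets represented as Python integers, and coefficient vectors stored as bitmasks; the function names `encode`, `recode`, and `decode` are hypothetical and are not taken from the paper, which also evaluates other finite fields.

```python
import random

def encode(source, rng):
    """Return (coeffs, payload): a random nonzero GF(2) combination of source packets."""
    k = len(source)
    coeffs = 0
    while coeffs == 0:                      # the all-zero combination carries no information
        coeffs = rng.getrandbits(k)
    payload = 0
    for i, packet in enumerate(source):
        if (coeffs >> i) & 1:
            payload ^= packet               # addition in GF(2) is XOR
    return coeffs, payload

def recode(stored, rng):
    """Build a new coded packet from already-coded packets, without decoding them."""
    if not stored:
        raise ValueError("need at least one stored packet")
    coeffs = 0
    while coeffs == 0:
        coeffs, payload = 0, 0
        for c, p in stored:
            if rng.random() < 0.5:          # include each stored packet with probability 1/2
                coeffs ^= c
                payload ^= p
    return coeffs, payload

def decode(packets, k):
    """Gaussian elimination over GF(2); returns the k source packets, or None if rank < k."""
    pivots = {}                             # pivot bit -> fully reduced (coeffs, payload) row
    for c, p in packets:
        for bit, (pc, pp) in pivots.items():
            if (c >> bit) & 1:              # clear every known pivot bit from the new row
                c ^= pc
                p ^= pp
        if c == 0:
            continue                        # linearly dependent packet, adds nothing
        bit = c.bit_length() - 1
        for b, (pc, pp) in list(pivots.items()):
            if (pc >> bit) & 1:             # keep older rows free of the new pivot bit
                pivots[b] = (pc ^ c, pp ^ p)
        pivots[bit] = (c, p)
    if len(pivots) < k:
        return None
    return [pivots[i][1] for i in range(k)]
```

A regenerated cloud in this sketch would call `recode` on packets fetched from the surviving clouds; a reader recovers the data by passing any set of packets with full rank to `decode`.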
Coded Computation Against Processing Delays for Virtualized Cloud-Based Channel Decoding
The uplink of a cloud radio access network architecture is studied in which
decoding at the cloud takes place via network function virtualization on
commercial off-the-shelf servers. In order to mitigate the impact of straggling
decoders in this platform, a novel coding strategy is proposed, whereby the
cloud re-encodes the received frames via a linear code before distributing them
to the decoding processors. Transmission of a single frame is considered first,
and upper bounds on the resulting frame unavailability probability as a
function of the decoding latency are derived by assuming a binary symmetric
channel for uplink communications. Then, the analysis is extended to account
for random frame arrival times. In this case, the trade-off between average
decoding latency and the frame error rate is studied for two different queuing
policies, whereby the servers carry out per-frame decoding or continuous
decoding, respectively. Numerical examples demonstrate that the bounds are
useful tools for code design and that coding is instrumental in obtaining a
desirable compromise between decoding latency and reliability.
Comment: 11 pages and 12 figures, Submitte
TOFEC: Achieving Optimal Throughput-Delay Trade-off of Cloud Storage Using Erasure Codes
Our paper presents solutions that combine erasure coding, parallel connections
to the storage cloud, and limited chunking (i.e., dividing the object into a few
smaller segments) to significantly improve the delay performance of
uploading and downloading data in and out of cloud storage.
TOFEC is a strategy that helps a front-end proxy adapt to the level of workload by
treating scalable cloud storage (e.g. Amazon S3) as a shared resource requiring
admission control. Under light workloads, TOFEC creates more, smaller chunks and
uses more parallel connections per file, minimizing service delay. Under heavy
workloads, TOFEC automatically reduces the level of chunking (fewer chunks with
increased size) and uses fewer parallel connections to reduce overhead,
resulting in higher throughput and preventing queueing delay. Our trace-driven
simulation results show that TOFEC's adaptation mechanism converges to an
appropriate code that provides the optimal delay-throughput trade-off without
reducing system capacity. Compared to a non-adaptive strategy optimized for
throughput, TOFEC delivers 2.5x lower latency under light workloads; compared
to a non-adaptive strategy optimized for latency, TOFEC can scale to support
over 3x as many requests
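The adaptation described in the TOFEC abstract (many small chunks under light load, fewer and larger chunks under heavy load) can be sketched as a simple backlog-threshold policy. This is only an illustration under stated assumptions: the threshold values, the chunk levels, and the names `chunking_level` and `split` are hypothetical, and the paper's actual adaptation mechanism and erasure-code selection are not reproduced here.

```python
from bisect import bisect_right

# Hypothetical values: queue lengths at which the policy steps down one level.
BACKLOG_LIMITS = [2, 8, 32]
# Chunks per object at each level; more chunks means more parallel connections.
CHUNK_LEVELS = [8, 4, 2, 1]

def chunking_level(queue_len):
    """Pick how many chunks to use for the next object, given the request backlog."""
    return CHUNK_LEVELS[bisect_right(BACKLOG_LIMITS, queue_len)]

def split(data, k):
    """Split data into at most k contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // k))       # ceiling division, at least one byte per chunk
    return [data[i:i + size] for i in range(0, len(data), size)]
```

With an empty queue the sketch selects the highest chunking level, and it falls back to a single chunk once the backlog reaches the last threshold, which mirrors the trade-off between per-request delay and per-chunk overhead described above.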
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need for
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality, hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially
supported by Swiss National Science Foundation SNSF Doc.Mobility 162014, 2018
48th Annual IEEE/IFIP International Conference on Dependable Systems and
Networks (DSN)
CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems
Data availability is critical in distributed storage systems, especially when
node failures are prevalent in real life. A key requirement is to minimize the
amount of data transferred among nodes when recovering the lost or unavailable
data of failed nodes. This paper explores recovery solutions based on
regenerating codes, which are shown to provide fault-tolerant storage and
minimum recovery bandwidth. Existing optimal regenerating codes are designed
for single node failures. We build a system called CORE, which augments
existing optimal regenerating codes to support a general number of failures
including single and concurrent failures. We theoretically show that CORE
achieves the minimum possible recovery bandwidth for most cases. We implement
CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to
20 storage nodes. We demonstrate that our CORE prototype conforms to our
theoretical findings and achieves recovery bandwidth saving when compared to
the conventional recovery approach based on erasure codes.
Comment: 25 pages