
    Implementation and performance evaluation of distributed cloud storage solutions using random linear network coding

    This paper advocates the use of random linear network coding for storage in distributed clouds in order to reduce storage and traffic costs in dynamic settings, i.e., when adding and removing numerous storage devices/clouds on the fly and when the number of reachable clouds is limited. We introduce various network coding approaches that trade off reliability, storage and traffic costs, and system complexity, relying on probabilistic recoding for cloud regeneration. We compare these approaches with other approaches based on data replication and Reed-Solomon codes. A simulator has been developed to carry out a thorough performance evaluation of the various approaches under different system settings, e.g., finite fields, and network/storage conditions, e.g., storage space used per cloud, limited network use, and limited recoding capabilities. In contrast to standard coding approaches, our techniques do not require retrieving the full original information in order to store meaningful information. Our numerical results show a high resilience over a large number of regeneration cycles compared to other approaches.
    Funding: Danish Council for Independent Research (Green Mobile Cloud Project DFF-090201372B); Hungarian National Development Agency (Research and Technology Innovation Fund Grant KMR_12-1-2012-0441); European Union (European Social Fund Project FuturICT.hu Grant TAMOP-4.2.2.C-11/1/KONV-2012-0013).
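
    The key property exploited here is that coded fragments can be recombined without first decoding. Below is a minimal illustrative sketch of that recoding step, written over GF(2) for brevity (the paper evaluates larger finite fields); all function names are ours, not the paper's.

        import random

        def xor_bytes(a: bytes, b: bytes) -> bytes:
            return bytes(x ^ y for x, y in zip(a, b))

        def encode(fragments: list[bytes], n_coded: int):
            """Produce n_coded random linear combinations (over GF(2)) of the source fragments."""
            coded = []
            for _ in range(n_coded):
                coeffs = [random.randint(0, 1) for _ in fragments]
                payload = bytes(len(fragments[0]))
                for c, frag in zip(coeffs, fragments):
                    if c:
                        payload = xor_bytes(payload, frag)
                coded.append((coeffs, payload))
            return coded

        def recode(coded: list, n_out: int):
            """Regenerate fresh coded fragments directly from stored coded ones, without decoding."""
            out = []
            for _ in range(n_out):
                picks = [random.randint(0, 1) for _ in coded]
                coeffs = [0] * len(coded[0][0])
                payload = bytes(len(coded[0][1]))
                for p, (c, frag) in zip(picks, coded):
                    if p:
                        coeffs = [a ^ b for a, b in zip(coeffs, c)]  # coefficient vectors add over GF(2)
                        payload = xor_bytes(payload, frag)
                out.append((coeffs, payload))
            return out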

    Coded Computation Against Processing Delays for Virtualized Cloud-Based Channel Decoding

    The uplink of a cloud radio access network architecture is studied in which decoding at the cloud takes place via network function virtualization on commercial off-the-shelf servers. In order to mitigate the impact of straggling decoders in this platform, a novel coding strategy is proposed, whereby the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Transmission of a single frame is considered first, and upper bounds on the resulting frame unavailability probability as a function of the decoding latency are derived by assuming a binary symmetric channel for uplink communications. Then, the analysis is extended to account for random frame arrival times. In this case, the trade-off between average decoding latency and the frame error rate is studied for two different queuing policies, whereby the servers carry out per-frame decoding or continuous decoding, respectively. Numerical examples demonstrate that the bounds are useful tools for code design and that coding is instrumental in obtaining a desirable compromise between decoding latency and reliability.
    Comment: 11 pages and 12 figures, Submitted
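
    To make the straggler-mitigation argument concrete, here is a small simulation sketch (our own illustration, not the paper's analytical model): with an (n, k) linear code, a frame becomes decodable once the k fastest of n servers finish, so latency is the k-th order statistic of the per-server delays rather than the maximum required by uncoded splitting over k servers.

        import random

        def coded_latency(n: int, k: int, mean_delay: float = 1.0) -> float:
            """Latency with an (n, k) code: wait only for the k fastest of n servers."""
            delays = sorted(random.expovariate(1.0 / mean_delay) for _ in range(n))
            return delays[k - 1]

        def uncoded_latency(k: int, mean_delay: float = 1.0) -> float:
            """Latency when the frame is split over k servers with no redundancy."""
            return max(random.expovariate(1.0 / mean_delay) for _ in range(k))

        trials = 10_000
        print(sum(coded_latency(10, 8) for _ in range(trials)) / trials)   # coded: lower mean latency
        print(sum(uncoded_latency(8) for _ in range(trials)) / trials)     # uncoded: dominated by stragglers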

    TOFEC: Achieving Optimal Throughput-Delay Trade-off of Cloud Storage Using Erasure Codes

    This paper presents solutions that use erasure coding, parallel connections to cloud storage, and limited chunking (i.e., dividing an object into a few smaller segments) together to significantly improve the delay performance of uploading and downloading data into and out of cloud storage. TOFEC is a strategy that helps a front-end proxy adapt to the level of workload by treating scalable cloud storage (e.g., Amazon S3) as a shared resource requiring admission control. Under light workloads, TOFEC creates more, smaller chunks and uses more parallel connections per file, minimizing service delay. Under heavy workloads, TOFEC automatically reduces the level of chunking (fewer chunks of increased size) and uses fewer parallel connections to reduce overhead, resulting in higher throughput and preventing queueing delay. Our trace-driven simulation results show that TOFEC's adaptation mechanism converges to an appropriate code that provides the optimal delay-throughput trade-off without reducing system capacity. Compared to a non-adaptive strategy optimized for throughput, TOFEC delivers 2.5x lower latency under light workloads; compared to a non-adaptive strategy optimized for latency, TOFEC can scale to support over 3x as many requests.
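
    The adaptation policy can be pictured with a toy controller like the one below. The threshold logic and names are hypothetical illustrations of the idea (more, smaller chunks when idle; fewer, larger chunks under backlog), not TOFEC's actual algorithm.

        def choose_chunking(queue_len: int, max_chunks: int = 8) -> int:
            """Pick the number of chunks (and parallel connections) per request."""
            if queue_len == 0:
                return max_chunks                             # idle system: minimize per-request delay
            return max(1, max_chunks // (queue_len + 1))      # backlog: back off to cut overhead

        for q in range(6):
            print(q, choose_chunking(q))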

    Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments

    Data centres that use consumer-grade disk drives and distributed peer-to-peer systems are unreliable environments in which to archive data without enough redundancy. Most redundancy schemes are not completely effective at providing high availability, durability and integrity in the long term. We propose alpha entanglement codes, a mechanism that creates a virtual layer of highly interconnected storage devices to propagate redundant information across a large-scale storage system. Our motivation is to design flexible and practical erasure codes with high fault tolerance to improve data durability and availability even in catastrophic scenarios. By flexible and practical, we mean code settings that can be adapted to future requirements and practical implementations with reasonable trade-offs between security, resource usage and performance. The codes have three parameters. Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. The two other parameters increase fault tolerance even further without the need for additional storage. As a result, an entangled storage system can provide high availability and durability and offer additional integrity: it is more difficult to modify data undetectably. We evaluate how several redundancy schemes perform in unreliable environments and show that alpha entanglement codes are flexible and practical codes. Remarkably, they excel at code locality; hence, they reduce repair costs and become less dependent on storage locations with poor availability. Our solution outperforms Reed-Solomon codes in many disaster recovery scenarios.
    Comment: 12 pages, 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
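
    The propagation mechanism can be illustrated with the simplest, single-chain form of entanglement; the full alpha codes entangle each block into alpha chains. The sketch and helper names below are ours: each parity is the XOR of the incoming data block with the chain's previous parity, so every parity carries information about all earlier blocks.

        def xor_bytes(a: bytes, b: bytes) -> bytes:
            return bytes(x ^ y for x, y in zip(a, b))

        def entangle(blocks: list[bytes]) -> list[bytes]:
            """Single-chain entanglement: parity i = block i XOR parity (i - 1)."""
            parities, last = [], bytes(len(blocks[0]))
            for block in blocks:
                last = xor_bytes(last, block)    # fold the new block into the chain
                parities.append(last)
            return parities

        # A lost block is recoverable from its neighbouring parities:
        # blocks[i] == xor_bytes(parities[i - 1], parities[i]) for i >= 1.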

    CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems

    Data availability is critical in distributed storage systems, especially when node failures are prevalent in real life. A key requirement is to minimize the amount of data transferred among nodes when recovering the lost or unavailable data of failed nodes. This paper explores recovery solutions based on regenerating codes, which are shown to provide fault-tolerant storage and minimum recovery bandwidth. Existing optimal regenerating codes are designed for single node failures. We build a system called CORE, which augments existing optimal regenerating codes to support a general number of failures, including single and concurrent failures. We theoretically show that CORE achieves the minimum possible recovery bandwidth in most cases. We implement CORE and evaluate our prototype atop a Hadoop HDFS cluster testbed with up to 20 storage nodes. We demonstrate that our CORE prototype conforms to our theoretical findings and achieves recovery bandwidth savings compared to the conventional recovery approach based on erasure codes.
    Comment: 25 pages
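
    For intuition on why regenerating codes save repair bandwidth, the standard minimum-storage regenerating (MSR) point gives a single-node repair cost of d*M/(k*(d-k+1)) when d helper nodes assist, versus downloading the whole file of size M under conventional erasure-code recovery. The numbers below are illustrative, not CORE's evaluation.

        def msr_repair_bandwidth(M: float, k: int, d: int) -> float:
            """Total download to repair one node at the MSR point with d helpers."""
            return d * M / (k * (d - k + 1))

        M, k, d = 1.0, 6, 9                      # file of size 1, k = 6, d = 9 helpers
        print(msr_repair_bandwidth(M, k, d))     # 0.375, versus M = 1.0 for conventional recovery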