
    Fundamental Limits on Communication for Oblivious Updates in Storage Networks

    In distributed storage systems, storage nodes intermittently go offline for numerous reasons. On coming back online, nodes need to update their contents to reflect any modifications made to the data in the interim. In this paper, we consider a setting where no information regarding the modified data needs to be logged in the system. In such a setting, a 'stale' node needs to update its contents by downloading data from already updated nodes, while neither the stale node nor the updated nodes have any knowledge as to which data symbols were modified or what their new values are. We investigate the fundamental limits on the amount of communication necessary for such an "oblivious" update process. We first present a generic lower bound on the amount of communication that is necessary under any storage code with a linear encoding (while allowing non-linear update protocols). This lower bound is derived under a set of extremely weak conditions, giving all updated nodes access to the entire modified data and the stale node access to the entire stale data as side information. We then present codes and update algorithms that are optimal in that they meet this lower bound. Next, we present a lower bound for an important subclass of codes, that of linear Maximum-Distance-Separable (MDS) codes. We then present an MDS code construction and an associated update algorithm that meet this lower bound. These results thus establish the capacity of oblivious updates in terms of the communication requirements under these settings.
    Comment: IEEE Global Communications Conference (GLOBECOM) 201
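
    As a concrete point of reference for the setting above (and not the paper's optimal scheme), the following sketch shows the naive oblivious baseline for a toy single-parity XOR code: the stale node simply re-downloads every updated data symbol and re-encodes its own content, with no knowledge of which symbols changed. The symbol size and node layout are illustrative assumptions.

```python
import os
from functools import reduce

SYMBOL_BYTES = 16  # illustrative symbol size

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Toy layout: k data nodes plus one XOR parity node (the node that went stale).
k = 4
old_data = [os.urandom(SYMBOL_BYTES) for _ in range(k)]
old_parity = reduce(xor, old_data)           # the stale node's current content

# Some data symbols are modified while the parity node is offline;
# nothing records which ones changed (no logging of modifications).
new_data = list(old_data)
new_data[1] = os.urandom(SYMBOL_BYTES)
new_data[3] = os.urandom(SYMBOL_BYTES)

# Naive oblivious update: download all k updated symbols and re-encode.
downloaded = list(new_data)                  # k * SYMBOL_BYTES bytes of communication
new_parity = reduce(xor, downloaded)

assert new_parity == reduce(xor, new_data)   # the stale node is now up to date
print(f"naive oblivious update downloaded {k * SYMBOL_BYTES} bytes")
```

    The paper's lower bounds characterize how far below this naive cost an oblivious update can go when the protocol exploits the linearity of the code and the stale node's side information.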

    Evaluating Erasure Codes in Dicoogle PACS

    DICOM (Digital Imaging and Communications in Medicine) is a standard for image and data transmission in medical-purpose hardware and is commonly used for viewing, storing, printing and transmitting images. As part of the DICOM file-transmission workflow, the PACS (Picture Archiving and Communication System) platform Dicoogle has become one of the most in-demand image processing and viewing platforms. However, the Dicoogle PACS architecture does not guarantee recovery of image information in the case of information loss. Therefore, this paper proposes a file recovery solution for the Dicoogle architecture. The proposal consists of maximizing the encoding and decoding performance of medical images through computational parallelism. To validate the proposal, a Java implementation based on the Reed-Solomon algorithm is evaluated in different performance tests. The experimental results show that the proposal is optimal in terms of image processing time for the Dicoogle PACS storage system.
    Funding: Ministry of Science, Innovation and Universities (MICINN) of Spain PGC2018 098883-B-C44; European Commission; Programa para el Desarrollo Profesional Docente para el Tipo Superior (PRODEP) of Mexico; Corporacion Ecuatoriana para el Desarrollo de la Investigacion y la Academia (CEDIA) of Ecuador CEPRA XII-2018-13; Universidad de Las Americas (UDLA), Quito, Ecuador IEA.WHP.21.0
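
    The parallelism idea can be illustrated with a short sketch: split a DICOM file into stripes and encode each stripe independently, so stripes can be processed concurrently across workers. The encoder below is a stand-in single-XOR-parity function rather than the Reed-Solomon coder used in the paper (whose implementation is in Java, not Python); the stripe and shard sizes are arbitrary assumptions.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from functools import reduce

K = 4                    # data shards per stripe (assumed)
SHARD_BYTES = 64 * 1024  # shard size (assumed)

def encode_stripe(stripe: bytes) -> bytes:
    """Stand-in erasure encoder: one XOR parity shard per stripe.
    A real deployment would call a Reed-Solomon encoder here."""
    shards = [stripe[i * SHARD_BYTES:(i + 1) * SHARD_BYTES] for i in range(K)]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), shards)

def encode_file_parallel(data: bytes, workers: int = 4) -> list[bytes]:
    """Cut the file into stripes and encode the stripes in parallel."""
    stripe_size = K * SHARD_BYTES
    stripes = [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_stripe, stripes))

if __name__ == "__main__":
    dicom_blob = os.urandom(8 * K * SHARD_BYTES)   # placeholder for a DICOM file
    parities = encode_file_parallel(dicom_blob)
    print(f"encoded {len(parities)} stripes in parallel")
```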

    X-Cipher: Achieving Data Resiliency in Homomorphic Ciphertexts

    Homomorphic encryption (HE) allows for computations on encrypted data without requiring decryption. HE is commonly applied to outsource computation on private data. Due to the additional risks caused by data outsourcing, the ability to recover from losses is essential, but doing so on data encrypted under an HE scheme introduces additional challenges for recovery and usability. This work introduces X-Cipher, which aims to make HE data resilient by ensuring it is private and fault-tolerant simultaneously at all stages of data outsourcing. X-Cipher allows for data recovery without decryption, and maintains its ability to recover and keep data private even when a cluster server has been compromised. X-Cipher reduces ciphertext storage overhead by introducing a novel encoding and leveraging previously introduced ciphertext packing. X-Cipher's capabilities were evaluated on a synthetic dataset and compared to prior work, demonstrating that X-Cipher enables additional security capabilities while still supporting privacy-preserving outsourced computations.
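
    A property that makes recovery without decryption possible in general is that the parities of a linear erasure code can be computed directly on ciphertexts whenever the cipher is additively homomorphic. The sketch below illustrates this with a toy additive one-time-pad cipher standing in for a real HE scheme; it is not X-Cipher's construction, and the modulus, shard values, and key handling are illustrative assumptions.

```python
import random

Q = 2 ** 31 - 1          # toy plaintext/ciphertext modulus (assumed)

def enc(m: int, key: int) -> int:
    """Toy additively homomorphic cipher: c = m + key (mod Q)."""
    return (m + key) % Q

def dec(c: int, key: int) -> int:
    return (c - key) % Q

# Data shards and their keys (a real HE scheme manages keys very differently).
shards = [1234, 5678, 91011, 121314]
keys = [random.randrange(Q) for _ in shards]
cts = [enc(m, k) for m, k in zip(shards, keys)]

# The storage cluster computes a sum parity on ciphertexts alone.
ct_parity = sum(cts) % Q
key_parity = sum(keys) % Q                     # known only to the key holder
assert dec(ct_parity, key_parity) == sum(shards) % Q

# Suppose shard 2 is lost: its ciphertext is reconstructed from the parity
# ciphertext and the surviving ciphertexts, still without decrypting anything.
ct_recovered = (ct_parity - sum(cts[:2]) - sum(cts[3:])) % Q
assert dec(ct_recovered, keys[2]) == shards[2]
print("lost shard reconstructed without decryption")
```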

    Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads

    To reduce storage overhead, cloud file systems are transitioning from replication to erasure codes. This process has revealed new dimensions on which to evaluate the performance of different coding schemes: the amount of data used in recovery and when performing degraded reads. We present an algorithm that finds the optimal number of codeword symbols needed for recovery for any XOR-based erasure code and produces recovery schedules that use a minimum amount of data. We differentiate popular erasure codes based on this criterion and demonstrate that the differences improve I/O performance in practice for the large block sizes used in cloud file systems. Several cloud systems [Ford10, Calder11] have adopted Reed-Solomon (RS) codes because of their generality and their ability to tolerate larger numbers of failures. We define a new class of rotated Reed-Solomon codes that perform degraded reads more efficiently than all known codes, but otherwise inherit the reliability and performance properties of Reed-Solomon codes. The online home for this paper may be found at: http://web.eecs.utk.edu/~plank/plank/papers/FAST-2012
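
    The recovery-scheduling idea can be made concrete with a tiny brute-force version of the search described above: represent each stored symbol of an XOR-based code as the set of data symbols it XORs together, then, for a lost symbol, find the smallest set of surviving symbols whose XOR reproduces it. The sketch below performs an exhaustive search over a toy grid-of-parities layout; it is not the paper's algorithm, and the code layout is an illustrative assumption.

```python
from itertools import combinations

# Toy XOR-based layout over four data symbols d0..d3 (assumed, not from the paper):
# each stored symbol is described by the set of data symbols it XORs together.
symbols = {
    "d0": {0}, "d1": {1}, "d2": {2}, "d3": {3},
    "p_row0": {0, 1}, "p_row1": {2, 3},
    "p_col0": {0, 2}, "p_col1": {1, 3},
}

def min_read_recovery(lost: str):
    """Exhaustively find the fewest surviving symbols whose XOR equals the lost one."""
    target = symbols[lost]
    survivors = [name for name in symbols if name != lost]
    for r in range(1, len(survivors) + 1):
        for combo in combinations(survivors, r):
            acc = set()
            for name in combo:          # XOR over GF(2) = symmetric difference of sets
                acc ^= symbols[name]
            if acc == target:
                return combo
    return None

print(min_read_recovery("d0"))   # -> ('d1', 'p_row0'): a minimum-size recovery set for d0
```

    Real codes have far larger search spaces; the paper's algorithm finds the optimal number of symbols to read for any XOR-based code and turns the result into recovery schedules that minimize the data read.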

    Signal Processing for Caching Networks and Non-volatile Memories

    The recent information explosion has created a pressing need for faster and more reliable data storage and transmission schemes. This thesis focuses on two systems: caching networks and non-volatile storage systems. It proposes network protocols to improve the efficiency of information delivery, as well as signal processing schemes to reduce errors at the physical layer. This thesis first investigates caching and delivery strategies for content delivery networks. Caching has been investigated as a useful technique to reduce the network burden by prefetching some contents during off-peak hours. Coded caching [1], proposed by Maddah-Ali and Niesen, is the foundation of our algorithms; it has been shown to reduce peak traffic rates by encoding transmissions so that different users can extract different information from the same packet. Content delivery networks store information distributed across multiple servers, so as to balance the load and avoid unrecoverable losses in case of node or disk failures. On one hand, distributed storage limits the capability of combining content from different servers into a single message, causing performance losses in coded caching schemes. On the other hand, the inherent redundancy existing in distributed storage systems can be used to improve the performance of those schemes through parallelism. This thesis proposes a scheme combining distributed storage of the content in multiple servers with an efficient coded caching algorithm for delivery to the users. This scheme is shown to reduce the peak transmission rate below that of state-of-the-art algorithms.
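
    For context, the coded caching scheme the thesis builds on (the Maddah-Ali and Niesen scheme cited as [1]) can be sketched for a toy configuration. The sketch below is the basic single-server placement-and-delivery scheme, not the thesis's distributed-storage extension; the numbers of users and files, the subfile contents, and the demands are illustrative assumptions.

```python
from itertools import combinations

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Toy setup: K users, each caching a fraction t/K of every file, with t an integer.
K, t = 3, 1
subfile_labels = list(combinations(range(K), t))           # one subfile per t-subset
files = {f: {lab: bytes([f * 16 + i] * 8) for i, lab in enumerate(subfile_labels)}
         for f in range(3)}                                 # N = 3 small dummy files

# Placement: user k caches every subfile whose label contains k.
cache = {k: {(f, lab): files[f][lab]
             for f in files for lab in subfile_labels if k in lab}
         for k in range(K)}

# Delivery for demands d[k]: one XOR-coded packet per (t+1)-subset of users.
demand = {0: 0, 1: 1, 2: 2}
packets = {}
for group in combinations(range(K), t + 1):
    parts = [files[demand[k]][tuple(sorted(set(group) - {k}))] for k in group]
    acc = parts[0]
    for p in parts[1:]:
        acc = xor(acc, p)
    packets[group] = acc

# User 0 recovers its missing subfile of its demanded file from packet (0, 1),
# cancelling the other term with a subfile it already holds in its cache.
missing = xor(packets[(0, 1)], cache[0][(demand[1], (0,))])
assert missing == files[demand[0]][(1,)]
print(f"delivery used {len(packets)} coded packets")
```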

    Efficient data reliability management of cloud storage systems for big data applications

    Cloud service providers are consistently striving to provide efficient and reliable service for their clients' Big Data storage needs. Replication is a simple and flexible method to ensure reliability and availability of data. However, it is not an efficient solution for Big Data, which routinely scales to terabytes and petabytes. Hence erasure coding is gaining traction despite its shortcomings. Deploying erasure coding in cloud storage confronts several challenges, such as encoding/decoding complexity, load balancing, exponential resource consumption due to data repair, and read latency. This thesis addresses many of these challenges. Even though data durability and availability should not be compromised for any reason, clients' requirements on read performance (access latency) may vary with the nature of the data and its access pattern behaviour. Access latency is one of the important metrics, and the acceptable latency range can be recorded in the client's SLA. Several proactive recovery methods for erasure codes are proposed in this research to reduce resource consumption due to recovery. Also, a novel cache-based solution is proposed to mitigate the access latency issue of erasure coding.
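
    The cache-based idea in the last sentence can be illustrated with a small sketch: when a read targets a block on a failed node, the block is reconstructed once from surviving shards and kept in a cache so that subsequent reads avoid repeated degraded-read reconstructions. This is a generic illustration under a single-XOR-parity, single-failure assumption, not the thesis's specific design.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class DegradedReadCache:
    """Cache reconstructed blocks so repeated reads of a failed shard
    do not trigger repeated erasure-code reconstructions."""

    def __init__(self, shards: dict[int, bytes], parity: bytes):
        self.shards = shards                     # surviving data shards, keyed by index
        self.parity = parity                     # XOR of all data shards
        self.cache: dict[int, bytes] = {}

    def read(self, idx: int) -> bytes:
        if idx in self.shards:                   # normal read
            return self.shards[idx]
        if idx not in self.cache:                # degraded read: reconstruct once
            self.cache[idx] = reduce(xor, self.shards.values(), self.parity)
        return self.cache[idx]                   # later reads are cache hits

# Usage: shard 2 has failed; only the first read pays the reconstruction cost.
data = [bytes([i] * 8) for i in range(4)]
store = DegradedReadCache({0: data[0], 1: data[1], 3: data[3]},
                          parity=reduce(xor, data))
assert store.read(2) == data[2]
assert store.read(2) == data[2]   # served from the cache
```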