
    On the Communication Cost of MDS Erasure Codes in Distributed Storage Systems

    Distributed storage systems store redundant data to keep the availability of stored data constant and to increase the system's resistance to failures. Such systems usually use pure replication or RAID-based methods as redundancy schemes. In this paper, we study the communication cost of a distributed data storage system using Maximum Distance Separable (MDS) erasure codes. Our focus is reducing the cost of the one-to-many communication used in data reconstruction/repair initialization and update operations. We propose two communication approaches for these operations in distributed storage systems: the Steiner tree approach and the multi-shortest path approach. We analyse both approaches empirically and theoretically. Our theoretical results indicate that the Steiner tree approach uses fewer messages, whereas the multi-shortest path approach takes less time, for data reconstruction/repair initialization operations. For the data update process, on the other hand, the Steiner tree approach is better on both message and time metrics. Our experimental results support these theoretical results. Thus, users can choose between the two approaches depending on their needs and priorities.

    On the Erasure Recoverability of MDS Codes under Concurrent Updates

    Abstract: We consider a fault-tolerant distributed storage system that protects data on k disks using a systematic linear (n, k) MDS code. In such a system, updates to data blocks require corresponding updates to check blocks. Concurrent fault-prone access by multiple writers can drive the system into an inconsistent state with reduced tolerance for disk failures. We show tight bounds on the erasure recoverability of an (n, k) MDS code in this scenario. The bounds depend not just on the minimum distance of the code, but also on the maximum number of concurrent faulty writers and the manner in which they attempt to update the check blocks (one at a time or all at once).
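
    The dependence of check blocks on data blocks can be illustrated with a minimal sketch. It assumes the simplest systematic MDS code, a single XOR parity block (n = k + 1), which is not necessarily the code studied in the paper. All function names here are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(data_blocks):
    """Check block of a (k + 1, k) single-parity code: XOR of all data blocks."""
    return reduce(xor_blocks, data_blocks)


def update_block(data_blocks, parity, index, new_block):
    """Overwrite one data block in place and return the matching new parity."""
    delta = xor_blocks(data_blocks[index], new_block)
    data_blocks[index] = new_block
    return xor_blocks(parity, delta)


def recover_block(data_blocks, parity, lost_index):
    """Rebuild one erased data block from the surviving blocks and the parity."""
    survivors = [b for i, b in enumerate(data_blocks) if i != lost_index]
    return reduce(xor_blocks, survivors, parity)


data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
stale_parity = encode_parity(data)

# A correct writer updates the data block and the check block together.
fresh_parity = update_block(data, stale_parity, 1, b"\xff\x00")
consistent = recover_block(data, fresh_parity, 0) == data[0]

# A faulty writer that changed the data block but left the old parity in
# place makes recovery of a different block return wrong contents.
inconsistent = recover_block(data, stale_parity, 0) != data[0]
```

    With a consistent parity block any single erased block is recoverable; with the stale parity left behind by a failed writer, the same recovery silently yields wrong data, which is the kind of reduced failure tolerance the abstract refers to.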