28 research outputs found

    Network Traffic Driven Storage Repair

    Recently we constructed an explicit family of locally repairable and locally regenerating codes. Their existence was proven by Kamath et al., but no explicit construction was given. Our design is based on HashTag codes, which can have different sub-packetization levels. In this work we emphasize the importance of having two ways to repair a node: repair with local parity nodes only, or repair with both local and global parity nodes. We say that the repair strategy is network traffic driven since it depends on the concrete system and code parameters: the repair bandwidth of the code, the number of I/O operations, the access time for the contacted parts, and the size of the stored file. We show the benefits of having repair duality in one practical example implemented in Hadoop. We also give algorithms for efficient repair of the global parity nodes. Comment: arXiv admin note: text overlap with arXiv:1701.0666

    An Explicit Construction of Systematic MDS Codes with Small Sub-packetization for All-Node Repair

    An explicit construction of systematic MDS codes, called HashTag+ codes, with arbitrary sub-packetization level for all-node repair is proposed. It is shown that even for small sub-packetization levels, HashTag+ codes achieve the optimal MSR point for repair of any parity node, while the repair bandwidth for a single systematic node depends on the sub-packetization level. Compared to other codes in the literature, HashTag+ codes provide savings of 20% to 40% in the average amount of data accessed and transferred during repair.
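
For orientation, the "optimal MSR point" mentioned in this abstract can be illustrated with the standard cut-set result for regenerating codes. The sketch below is not taken from the HashTag+ paper: the function name, the (14, 10) parameters, and the unit file size are assumptions chosen only for illustration. It computes the total repair download at the MSR point when d helper nodes are contacted, and compares it with downloading the whole file, as a conventional MDS repair would.

```python
from fractions import Fraction


def msr_repair_bandwidth(file_size, n, k, d=None):
    """Total download needed to repair one node at the MSR point.

    Uses the standard cut-set expression gamma = M * d / (k * (d - k + 1)),
    where M is the file size and d is the number of helper nodes.
    The name and signature are hypothetical, for illustration only.
    """
    if d is None:
        d = n - 1  # contact all surviving nodes by default
    if not (k <= d <= n - 1):
        raise ValueError("d must satisfy k <= d <= n - 1")
    return Fraction(file_size * d, k * (d - k + 1))


# Assumed example parameters: an (n, k) = (14, 10) code and a file of size 1.
gamma_msr = msr_repair_bandwidth(1, 14, 10)
gamma_naive = Fraction(1)  # conventional repair downloads the whole file

print(f"MSR repair download:   {gamma_msr} of the file")
print(f"Naive repair download: {gamma_naive} of the file")
```

With d = k the expression reduces to the full file size, which matches conventional MDS repair; the saving comes from contacting more than k helpers and downloading only a fraction of each helper's stored data.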

    Functional broadcast repair of multiple partial failures in wireless distributed storage systems

    We consider a distributed storage system with n nodes, where a user can recover the stored file from any k nodes, and study the problem of repairing r partially failed nodes. We consider broadcast repair, that is, d surviving nodes transmit broadcast messages on an error-free wireless channel to the r nodes being repaired, which are then used, together with the surviving data in the local memories of the failed nodes, to recover the lost content. First, we derive the trade-off between the storage capacity and the repair bandwidth for partial repair of multiple failed nodes, based on the cut-set bound for information flow graphs. It is shown that utilizing the broadcast nature of the wireless medium and the surviving contents at the partially failed nodes reduces the repair bandwidth per node significantly. Then, we list a set of invariant conditions that are sufficient for a functional repair code to be feasible. We further propose a scheme for functional repair of multiple failed nodes that satisfies the invariant conditions with high probability, and its extension to the repair of partial failures. The performance of the proposed scheme meets the cut-set bound on all the points on the trade-off curve for all admissible parameters when k is divisible by r, while employing linear subpacketization, which is an important practical consideration in the design of distributed storage codes. Unlike random linear codes, which are conventionally used for functional repair of failed nodes, the proposed repair scheme has lower overhead, lower input-output cost, and lower computational complexity during repair.