110 research outputs found

    New Centralized MSR Codes With Small Sub-packetization

    Full text link
    Centralized repair refers to repairing hβ‰₯2h\geq 2 node failures using dd helper nodes in a centralized way, where the repair bandwidth is counted by the total amount of data downloaded from the helper nodes. A centralized MSR code is an MDS array code with (h,d)(h,d)-optimal repair for some hh and dd. In this paper, we present several classes of centralized MSR codes with small sub-packetization. At first, we construct an alternative MSR code with (1,di)(1,d_i)-optimal repair for multiple repair degrees did_i simultaneously. Based on the code structure, we are able to construct a centralized MSR code with (hi,di)(h_i,d_i)-optimal repair property for all possible (hi,di)(h_i,d_i) with hi∣(diβˆ’k)h_i\mid (d_i-k) simultaneously. The sub-packetization is no more than lcm(1,2,…,nβˆ’k)(nβˆ’k)n{\rm lcm}(1,2,\ldots,n-k)(n-k)^n, which is much smaller than a previous work given by Ye and Barg ((lcm(1,2,…,nβˆ’k))n({\rm lcm}(1,2,\ldots,n-k))^n). Moreover, for general parameters 2≀h≀nβˆ’k2\leq h\leq n-k and k≀d≀nβˆ’hk\leq d\leq n-h, we further give a centralized MSR code enabling (h,d)(h,d)-optimal repair with sub-packetization smaller than all previous works

    Optimal Locally Repairable and Secure Codes for Distributed Storage Systems

    Full text link
    This paper aims to go beyond resilience into the study of security and local-repairability for distributed storage systems (DSS). Security and local-repairability are both important as features of an efficient storage system, and this paper aims to understand the trade-offs between resilience, security, and local-repairability in these systems. In particular, this paper first investigates security in the presence of colluding eavesdroppers, where eavesdroppers are assumed to work together in decoding stored information. Second, the paper focuses on coding schemes that enable optimal local repairs. It further brings these two concepts together, to develop locally repairable coding schemes for DSS that are secure against eavesdroppers. The main results of this paper include: a. An improved bound on the secrecy capacity for minimum storage regenerating codes, b. secure coding schemes that achieve the bound for some special cases, c. a new bound on minimum distance for locally repairable codes, d. code construction for locally repairable codes that attain the minimum distance bound, and e. repair-bandwidth-efficient locally repairable codes with and without security constraints.Comment: Submitted to IEEE Transactions on Information Theor

    Network Traffic Driven Storage Repair

    Full text link
    Recently we constructed an explicit family of locally repairable and locally regenerating codes. Their existence was proven by Kamath et al. but no explicit construction was given. Our design is based on HashTag codes that can have different sub-packetization levels. In this work we emphasize the importance of having two ways to repair a node: repair only with local parity nodes or repair with both local and global parity nodes. We say that the repair strategy is network traffic driven since it is in connection with the concrete system and code parameters: the repair bandwidth of the code, the number of I/O operations, the access time for the contacted parts and the size of the stored file. We show the benefits of having repair duality in one practical example implemented in Hadoop. We also give algorithms for efficient repair of the global parity nodes.Comment: arXiv admin note: text overlap with arXiv:1701.0666

    Erasure Coding Optimization for Data Storage: Acceleration Techniques and Delayed Parities Generation

    Get PDF
    Various techniques have been proposed in the literature to improve erasure code computation efficiency, including optimizing bitmatrix design and computation schedule, common XOR operation reduction, caching management techniques, and vectorization techniques. These techniques were largely proposed individually, and in this work, we seek to use them jointly. To accomplish this task, these techniques need to be thoroughly evaluated individually, and their relation better understood. Building on extensive testing, we develop methods to systematically optimize the computation chain together with the underlying bitmatrix. This led to a simple design approach of optimizing the bitmatrix by minimizing a weighted computation cost function, and also a straightforward coding procedure: follow a computation schedule produced from the optimized bitmatrix to apply XOR-level vectorization. This procedure provides better performances than most existing techniques (e.g., those used in ISA-L and Jerasure libraries), and sometimes can even compete against well-known but less general codes such as EVENODD, RDP, and STAR codes. One particularly important observation is that vectorizing the XOR operations is a better choice than directly vectorizing finite field operations, not only because of the flexibility in choosing finite field size and the better encoding throughput, but also its minimal migration efforts onto newer CPUs. A delayed parity generation technique for maximum distance separable (MDS) storage codes is proposed as well, for two possible applications: the first is to improve the write-speed during data intake where only a subset of the parities are initially produced and stored into the system, and the rest can be produced from the stored data during a later time of lower system load; the second is to provide better adaptivity, where a lower number of parities can be chosen initially in a storage system, and more parities can be produced when the existing ones are not sufficient to guarantee the needed reliability or performance. In both applications, it is important to reduce the data access as much as possible during the delayed parity generation procedure. For this purpose, we first identify the fundamental limit for delayed parity generation through a connection to the well-known multicast network coding problem, then provide an explicit and low-complexity code transformation that is applicable on any MDS codes to obtain optimal codes. The problem we consider is closely related to the regenerating code problem, however the proposed codes are much simpler and have a much smaller subpacketization factor than regenerating codes, and thus our result in fact shows that blindly adopting regenerating codes in these two settings is unnecessary and wasteful. Moreover, two aspects of this approach is addressed. The first is to optimize the underlying coding matrix, and the second is to understand its behavior in a system setting. For the former, we generalize the existing approach by allowing more flexibility in the code design, and then optimize the underlying coding matrix in the familiar bitmatrix-based coding framework. For the latter, we construct a prototype system, and conduct tests on a local storage network and on two virtual machine-based setups. In both cases, the results confirm the benefit of delayed parity generation when the system bottleneck is in the communication bandwidth instead of the computation
    • …
    corecore