
    Repairing Multiple Failures with Coordinated and Adaptive Regenerating Codes

    Accepted as a Regular Paper (6 pages) at NetCod 2011: The 2011 International Symposium on Network Coding, Hong Kong, July 2011. Also available on arXiv: http://arxiv.org/abs/1102.0204

    Erasure-correcting codes are widely used to ensure data persistence in distributed storage systems. This paper addresses the repair of such codes in the presence of simultaneous failures; maintaining the required redundancy over time is crucial to prevent permanent data loss. We go beyond existing work (i.e., regenerating codes by Dimakis et al.) and propose coordinated regenerating codes, which allow devices to coordinate during simultaneous repairs and thereby reduce repair costs further. We provide closed-form expressions for the communication costs of our new codes as a function of the number of live devices and the number of devices being repaired. We prove that deliberately delaying repairs brings no additional gains in itself; this means that regenerating codes are optimal as long as each failure can be repaired before a second one occurs. When multiple failures are detected simultaneously, however, we prove that our coordinated regenerating codes are optimal and outperform uncoordinated repairs with respect to both communication and storage costs. Finally, we define adaptive regenerating codes that self-adapt to the state of the system, and we prove that they are optimal.
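For context, the regenerating codes of Dimakis et al. that this abstract builds on trade per-node storage against single-failure repair bandwidth. A minimal sketch of the two standard operating points of that trade-off (the function names and the example parameters are mine, for illustration only; they are not from the paper):

```python
# Illustrative background sketch (not the paper's coordinated codes):
# a file of size M is encoded over n nodes with an (n, k) code, and a
# single failed node is repaired by contacting d >= k live nodes.

def msr_repair_bandwidth(M, k, d):
    """Total download to repair one node at the minimum-storage-
    regenerating (MSR) point: gamma = d*M / (k*(d - k + 1))."""
    return d * M / (k * (d - k + 1))

def mbr_repair_bandwidth(M, k, d):
    """Total download at the minimum-bandwidth-regenerating (MBR)
    point: gamma = 2*M*d / (k*(2*d - k + 1))."""
    return 2 * M * d / (k * (2 * d - k + 1))

# Example: a file of size M = 1.0, (n, k) = (10, 4), repairing from
# d = 9 live nodes.  A classical erasure-code repair would download
# the whole file (size 1.0); regenerating codes download far less.
M, k, d = 1.0, 4, 9
naive = M                              # rebuild-everything baseline
msr = msr_repair_bandwidth(M, k, d)    # less than naive
mbr = mbr_repair_bandwidth(M, k, d)    # less still
```

Coordinated regenerating codes generalize this single-failure repair to t simultaneous failures, with the devices under repair exchanging data among themselves as well.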

    Storage codes -- coding rate and repair locality

    The "repair locality" of a distributed storage code is the maximum number of nodes that ever needs to be contacted during the repair of a failed node. Small repair locality is desirable, since it is proportional to the number of disk accesses during repair. However, recent publications show that small repair locality comes with a penalty in terms of code distance or storage overhead if exact repair is required. Here, we first review some of the main results on storage codes under various repair regimes and discuss recent work on possible (information-theoretic) trade-offs between repair locality and other code parameters, such as storage overhead and code distance, under the exact-repair regime. We then present new information-theoretic lower bounds on the storage overhead as a function of the repair locality, valid for all common coding and repair models. In particular, we show that if each of the n nodes in a distributed storage system has storage capacity α, and if, at any time, a failed node can be functionally repaired by contacting some set of r nodes (which may depend on the actual state of the system) and downloading an amount ÎČ of data from each, then in the extreme cases where α = ÎČ or α = rÎČ, the maximal coding rate is at most r/(r+1) or 1/2, respectively (that is, the excess storage overhead is at least 1/r or 1, respectively). Comment: Accepted for publication in ICNC'13, San Diego, US
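The closing bound can be stated as a tiny computation. A sketch (the helper names and the exact-fraction check are mine, not the paper's), using exact rationals so the rate bound and the resulting excess overhead match term for term:

```python
# Sketch of the abstract's rate bounds for repair locality r under
# functional repair (helper names are mine, for illustration only).
from fractions import Fraction

def max_rate(r, case):
    """Upper bound on the coding rate for repair locality r.
    case='alpha=beta'   -> rate <= r/(r+1)
    case='alpha=r*beta' -> rate <= 1/2"""
    if case == 'alpha=beta':
        return Fraction(r, r + 1)
    if case == 'alpha=r*beta':
        return Fraction(1, 2)
    raise ValueError(case)

def excess_overhead(rate):
    # Total storage is (1/rate) times the file size; the excess
    # overhead is what is stored beyond the file itself.
    return 1 / rate - 1

# A rate of r/(r+1) gives excess overhead exactly 1/r, and a rate
# of 1/2 gives excess overhead exactly 1, matching the abstract.
bound_eq = excess_overhead(max_rate(4, 'alpha=beta'))      # 1/4 for r = 4
bound_r = excess_overhead(max_rate(4, 'alpha=r*beta'))     # 1
```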
    • 

    corecore