1,246 research outputs found

    On Secure Distributed Data Storage Under Repair Dynamics

    We address the problem of securing distributed storage systems against passive eavesdroppers that can observe a limited number of storage nodes. An important aspect of these systems is node failures over time, which demand a repair mechanism aimed at maintaining a targeted high level of system reliability. If an eavesdropper observes a node that is added to the system to replace a failed node, it will have access to all the data downloaded during repair, which can potentially compromise the entire information in the system. We are interested in determining the secrecy capacity of distributed storage systems under repair dynamics, i.e., the maximum amount of data that can be securely stored and made available to a legitimate user without revealing any information to any eavesdropper. We derive a general upper bound on the secrecy capacity and show that this bound is tight for the bandwidth-limited regime, which is of importance in scenarios such as peer-to-peer distributed storage systems. We also provide a simple explicit code construction that achieves the capacity for this regime. Comment: 5 pages, 4 figures, to appear in Proceedings of IEEE ISIT 201

    Storage codes -- coding rate and repair locality

    The repair locality of a distributed storage code is the maximum number of nodes that ever needs to be contacted during the repair of a failed node. Having small repair locality is desirable, since it is proportional to the number of disk accesses during repair. However, recent publications show that small repair locality comes with a penalty in terms of code distance or storage overhead if exact repair is required. Here, we first review some of the main results on storage codes under various repair regimes and discuss the recent work on possible (information-theoretical) trade-offs between repair locality and other code parameters like storage overhead and code distance, under the exact repair regime. Then we present some new information-theoretical lower bounds on the storage overhead as a function of the repair locality, valid for all common coding and repair models. In particular, we show that if each of the $n$ nodes in a distributed storage system has storage capacity $\alpha$ and if, at any time, a failed node can be functionally repaired by contacting some set of $r$ nodes (which may depend on the actual state of the system) and downloading an amount $\beta$ of data from each, then in the extreme cases where $\alpha = \beta$ or $\alpha = r\beta$, the maximal coding rate is at most $r/(r+1)$ or $1/2$, respectively (that is, the excess storage overhead is at least $1/r$ or $1$, respectively). Comment: Accepted for publication in ICNC'13, San Diego, US
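The link between the two rate bounds and the two overhead figures quoted above can be checked with a short calculation. The sketch below assumes that "excess storage overhead" means 1/rate - 1, which is the reading consistent with the numbers in the abstract; the function names and the regime labels are illustrative and do not come from the paper.

```python
from fractions import Fraction


def rate_upper_bound(r: int, regime: str) -> Fraction:
    """Upper bound on the coding rate stated in the abstract.

    regime is a hypothetical label: "alpha_eq_beta" for alpha = beta,
    "alpha_eq_r_beta" for alpha = r * beta.
    """
    if r < 1:
        raise ValueError("repair locality r must be a positive integer")
    if regime == "alpha_eq_beta":
        return Fraction(r, r + 1)
    if regime == "alpha_eq_r_beta":
        return Fraction(1, 2)
    raise ValueError(f"unknown regime: {regime}")


def excess_overhead(rate: Fraction) -> Fraction:
    """Excess storage overhead, assumed here to be 1/rate - 1."""
    return 1 / rate - 1


if __name__ == "__main__":
    for r in (2, 4, 8):
        low = excess_overhead(rate_upper_bound(r, "alpha_eq_beta"))
        high = excess_overhead(rate_upper_bound(r, "alpha_eq_r_beta"))
        print(f"r={r}: overhead >= {low} (alpha = beta), >= {high} (alpha = r*beta)")
```

Exact fractions are used so that the 1/r relationship is visible without floating-point rounding.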

    Repairable Block Failure Resilient Codes

    In large-scale distributed storage systems (DSS) deployed in cloud computing, correlated failures resulting in the simultaneous failure (or unavailability) of blocks of nodes are common. In such scenarios, the stored data or the content of a failed node can only be reconstructed from the available live nodes belonging to available blocks. To analyze the resilience of the system against such block failures, this work introduces the framework of Block Failure Resilient (BFR) codes, wherein the data (e.g., a file in a DSS) can be decoded by reading the same number of codeword symbols (nodes) from each available block of the underlying codeword. Further, repairable BFR codes are introduced, wherein any codeword symbol in a failed block can be repaired by contacting the remaining blocks in the system. Motivated by regenerating codes, file size bounds for repairable BFR codes are derived, the trade-off between per-node storage and repair bandwidth is analyzed, and BFR-MSR and BFR-MBR points are derived. Explicit codes achieving these two operating points for a wide set of parameters are constructed by utilizing combinatorial designs, wherein the codewords of the underlying outer codes are distributed to BFR codeword symbols according to projective planes

    Access vs. Bandwidth in Codes for Storage

    Maximum distance separable (MDS) codes are widely used in storage systems to protect against disk (node) failures. A node is said to have capacity $l$ over some field $\mathbb{F}$ if it can store that amount of symbols of the field. An $(n,k,l)$ MDS code uses $n$ nodes of capacity $l$ to store $k$ information nodes. The MDS property guarantees the resiliency to any $n-k$ node failures. An optimal bandwidth (resp. optimal access) MDS code communicates (resp. accesses) the minimum amount of data during the repair process of a single failed node. It was shown that this amount equals a fraction of $1/(n-k)$ of data stored in each node. In previous optimal bandwidth constructions, $l$ scaled polynomially with $k$ in codes with asymptotic rate $<1$. Moreover, in constructions with a constant number of parities, i.e. rate approaches 1, $l$ is scaled exponentially w.r.t. $k$. In this paper, we focus on the latter case of constant number of parities $n-k=r$, and ask the following question: Given the capacity of a node $l$, what is the largest number of information disks $k$ in an optimal bandwidth (resp. access) $(k+r,k,l)$ MDS code? We give an upper bound for the general case, and two tight bounds in the special cases of two important families of codes. Moreover, the bounds show that in some cases an optimal-bandwidth code has larger $k$ than an optimal-access code, and therefore these two measures are not equivalent. Comment: This paper was presented in part at the IEEE International Symposium on Information Theory (ISIT 2012). Submitted to IEEE transactions on information theor
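To make the 1/(n-k) repair fraction concrete, the sketch below compares the total repair download of an optimal-bandwidth code with a naive repair that reads k full nodes and re-encodes. It assumes that all n-1 surviving nodes act as helpers and that l is divisible by n-k; neither assumption is stated in the abstract, and the function name is illustrative.

```python
def repair_download(n: int, k: int, l: int) -> tuple[int, int]:
    """Return (optimal, naive) repair downloads in field symbols.

    optimal: each of the n - 1 surviving nodes (assumed helpers) sends a
             1/(n - k) fraction of its l stored symbols.
    naive:   read k whole nodes and rebuild the lost one.
    """
    r = n - k  # number of parity nodes
    if k < 1 or r < 1:
        raise ValueError("need 1 <= k < n")
    if l % r != 0:
        raise ValueError("this sketch assumes l is divisible by n - k")
    per_helper = l // r
    optimal = (n - 1) * per_helper
    naive = k * l
    return optimal, naive


if __name__ == "__main__":
    optimal, naive = repair_download(n=14, k=10, l=4)
    print(f"optimal-bandwidth repair: {optimal} symbols, naive repair: {naive} symbols")
```

With a constant number of parities r, the optimal download per helper shrinks only if l is a multiple of r, which hints at why the node capacity l constrains how large k can be.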