3,528 research outputs found

    Multiset Combinatorial Batch Codes

    Batch codes, first introduced by Ishai, Kushilevitz, Ostrovsky, and Sahai, mimic a distributed storage of a set of $n$ data items on $m$ servers, in such a way that any batch of $k$ data items can be retrieved by reading at most some $t$ symbols from each server. Combinatorial batch codes are replication-based batch codes in which each server stores a subset of the data items. In this paper, we propose a generalization of combinatorial batch codes, called multiset combinatorial batch codes (MCBC), in which $n$ data items are stored in $m$ servers, such that any multiset request of $k$ items, where any item is requested at most $r$ times, can be retrieved by reading at most $t$ items from each server. The setup of this new family of codes is motivated by recent work on codes which enable high availability and parallel reads in distributed storage systems. The main problem under this paradigm is to minimize the number of items stored in the servers, given the values of $n,m,k,r,t$; this minimum is denoted by $N(n,k,m,t;r)$. We first give a necessary and sufficient condition for the existence of MCBCs. Then, we present several bounds on $N(n,k,m,t;r)$ and constructions of MCBCs. In particular, we determine the value of $N(n,k,m,1;r)$ for any $n\geq \left\lfloor\frac{k-1}{r}\right\rfloor{m\choose k-1}-(m-k+1)A(m,4,k-2)$, where $A(m,4,k-2)$ is the maximum size of a binary constant weight code of length $m$, distance four and weight $k-2$. We also determine the exact value of $N(n,k,m,1;r)$ when $r\in\{k,k-1\}$ or $k=m$
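
The MCBC retrieval condition for t = 1 can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the paper's existence condition or construction: the function names `can_serve`, `is_mcbc_t1`, and `total_storage` are hypothetical, items are labelled 0..n-1, each server is modelled as a Python set of stored items, and every multiset request is tested with a standard augmenting-path bipartite matching.

```python
from collections import Counter
from itertools import combinations_with_replacement


def can_serve(request, servers):
    """Return True if every requested copy can be read from a distinct server.

    `request` is a tuple of item labels (a multiset); `servers` is a list of
    sets. This is the t = 1 case: at most one item is read per server.
    """
    owner = {}  # server index -> index of the request copy it serves

    def try_assign(i, seen):
        # Augmenting-path step of Kuhn's bipartite matching algorithm.
        for s, stored in enumerate(servers):
            if request[i] in stored and s not in seen:
                seen.add(s)
                if s not in owner or try_assign(owner[s], seen):
                    owner[s] = i
                    return True
        return False

    return all(try_assign(i, set()) for i in range(len(request)))


def is_mcbc_t1(servers, n, k, r):
    """Brute-force check over all multisets of k items, each used at most r times."""
    for request in combinations_with_replacement(range(n), k):
        if max(Counter(request).values()) > r:
            continue  # this multiset is not an admissible request
        if not can_serve(request, servers):
            return False
    return True


def total_storage(servers):
    """Total number of stored items, the quantity the abstract seeks to minimize."""
    return sum(len(stored) for stored in servers)


# Hypothetical toy layout: n = 2 items, m = 4 servers, each item stored twice.
layout = [{0}, {0}, {1}, {1}]
print(is_mcbc_t1(layout, n=2, k=3, r=2), total_storage(layout))
```

The enumeration is exponential in k, so this is useful only for checking small examples by hand, not for computing $N(n,k,m,1;r)$.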

    Load-Balanced Fractional Repetition Codes

    We introduce load-balanced fractional repetition (LBFR) codes, which are a strengthening of fractional repetition (FR) codes. LBFR codes have the additional property that multiple node failures can be sequentially repaired by downloading no more than one block from any other node. This allows for better use of the network, and can additionally reduce the number of disk reads necessary to repair multiple nodes. We characterize LBFR codes in terms of their adjacency graphs, and use this characterization to present explicit constructions of LBFR codes with storage capacity comparable to existing FR codes. Surprisingly, in some parameter regimes, our constructions of LBFR codes match the parameters of the best constructions of FR codes

    Repairable Block Failure Resilient Codes

    In large-scale distributed storage systems (DSS) deployed in cloud computing, correlated failures resulting in simultaneous failure (or unavailability) of blocks of nodes are common. In such scenarios, the stored data or the content of a failed node can only be reconstructed from the available live nodes belonging to available blocks. To analyze the resilience of the system against such block failures, this work introduces the framework of Block Failure Resilient (BFR) codes, wherein the data (e.g., a file in DSS) can be decoded by reading the same number of codeword symbols (nodes) from each available block of the underlying codeword. Further, repairable BFR codes are introduced, wherein any codeword symbol in a failed block can be repaired by contacting the remaining blocks in the system. Motivated by regenerating codes, file size bounds for repairable BFR codes are derived, the trade-off between per-node storage and repair bandwidth is analyzed, and BFR-MSR and BFR-MBR points are derived. Explicit codes achieving these two operating points for a wide set of parameters are constructed by utilizing combinatorial designs, wherein the codewords of the underlying outer codes are distributed to BFR codeword symbols according to projective planes

    Locality and Availability in Distributed Storage

    This paper studies the problem of code symbol availability: a code symbol is said to have $(r, t)$-availability if it can be reconstructed from $t$ disjoint groups of other symbols, each of size at most $r$. For example, $3$-replication supports $(1, 2)$-availability as each symbol can be read from its $t = 2$ other (disjoint) replicas, i.e., $r = 1$. However, the rate of replication must vanish like $\frac{1}{t+1}$ as the availability increases. This paper shows that it is possible to construct codes that can support a scaling number of parallel reads while keeping the rate at an arbitrarily high constant. It further shows that this is possible with the minimum distance arbitrarily close to the Singleton bound. This paper also presents a bound demonstrating a trade-off between minimum distance, availability and locality. Our codes match the aforementioned bound and their construction relies on combinatorial objects called resolvable designs. From a practical standpoint, our codes seem useful for distributed storage applications involving hot data, i.e., the information which is frequently accessed by multiple processes in parallel. Comment: Submitted to ISIT 201
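
The replication example in this abstract can be made concrete with a short sketch. It assumes hypothetical helper names (`has_availability`, `replication_rate`) and checks only the combinatorial shape of the definition: at least t pairwise disjoint groups, each of size at most r, none containing the symbol itself. Whether a group really reconstructs the symbol depends on the code and is taken as given here.

```python
from fractions import Fraction
from itertools import combinations


def has_availability(symbol, repair_groups, r, t):
    """Check the (r, t)-availability shape for one symbol.

    `repair_groups` is a list of sets of symbol positions, each assumed
    (not verified) to be sufficient to reconstruct `symbol`.
    """
    if len(repair_groups) < t:
        return False
    for group in repair_groups:
        # Each group must be non-empty, small enough, and exclude the symbol.
        if not group or len(group) > r or symbol in group:
            return False
    # All supplied groups must be pairwise disjoint.
    return all(a.isdisjoint(b) for a, b in combinations(repair_groups, 2))


def replication_rate(t):
    """Rate of (t + 1)-fold replication: one data symbol per t + 1 stored copies."""
    return Fraction(1, t + 1)


# 3-replication: positions 0, 1, 2 hold copies of the same symbol, so
# position 0 can be read from {1} or from {2}.
print(has_availability(0, [{1}, {2}], r=1, t=2), replication_rate(2))
```

The rate function shows why replication alone cannot keep a constant rate: each extra unit of availability adds a full copy, so the rate falls as $\frac{1}{t+1}$.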