430 research outputs found

    PIR Array Codes with Optimal Virtual Server Rate

    There has been much recent interest in Private Information Retrieval (PIR) in models where a database is stored across several servers using coding techniques from distributed storage, rather than being simply replicated. In particular, a recent breakthrough result of Fazeli, Vardy and Yaakobi introduces the notion of a PIR code and a PIR array code, and uses this notion to produce efficient PIR protocols. In this paper we are interested in designing PIR array codes. We consider the case when we have $m$ servers, with each server storing a fraction $(1/s)$ of the bits of the database; here $s$ is a fixed rational number with $s > 1$. A PIR array code with the $k$-PIR property enables a $k$-server PIR protocol (with $k \leq m$) to be emulated on $m$ servers, with the overall storage requirements of the protocol being reduced. The communication complexity of a PIR protocol reduces as $k$ grows, so the virtual server rate, defined to be $k/m$, is an important parameter. We study the maximum virtual server rate of a PIR array code with the $k$-PIR property. We present upper bounds on the achievable virtual server rate, some constructions, and ideas on how to obtain PIR array codes with the highest possible virtual server rate. In particular, we present constructions that asymptotically meet our upper bounds, and the exact largest virtual server rate is obtained when $1 < s \leq 2$. A $k$-PIR code (and similarly a $k$-PIR array code) is also a locally repairable code with symbol availability $k-1$. Such a code ensures $k$ parallel reads for each information symbol. So the virtual server rate is very closely related to the symbol availability of the code when used as a locally repairable code. The results of this paper are also discussed in this context, where subspace codes also have an important role.

    Some new constructions of optimal linear codes and alphabet-optimal $(r,\delta)$-locally repairable codes

    In distributed storage systems, locally repairable codes (LRCs) are designed to reduce disk I/O and repair costs by enabling recovery of each code symbol from a small number of other symbols. To handle multiple node failures, $(r,\delta)$-LRCs are introduced to enable local recovery in the event of up to $\delta-1$ failed nodes. Constructing optimal $(r,\delta)$-LRCs has been a significant research topic over the past decade. In \cite{Luo2022}, Luo et al. proposed a construction of linear codes by using unions of some projective subspaces within a projective space. Several new classes of Griesmer codes and distance-optimal codes were constructed, and some of them were proved to be alphabet-optimal $2$-LRCs. In this paper, we first modify the method of constructing linear codes in \cite{Luo2022} by considering a more general situation of intersecting projective subspaces. This modification enables us to construct good codes with more flexible parameters. Additionally, we present the conditions for the constructed linear codes to qualify as Griesmer codes or achieve distance optimality. Next, we explore the locality of linear codes constructed by eliminating elements from a complete projective space. The novelty of our work lies in establishing the locality as $(2,p-2)$-, $(2,p-1)$-, or $(2,p)$-locality, in contrast to the previous literature that only considered $2$-locality. Moreover, by combining analysis of code parameters and the C-M like bound for $(r,\delta)$-LRCs, we construct some alphabet-optimal $(2,\delta)$-LRCs which may or may not be Griesmer codes. Finally, we investigate the availability and alphabet-optimality of $(r,\delta)$-LRCs constructed from our modified framework. Comment: 25 pages
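
The abstract above refers to Griesmer codes, i.e. linear $[n,k,d]_q$ codes whose length meets the Griesmer bound $n \geq \sum_{i=0}^{k-1} \lceil d/q^i \rceil$ with equality. As a minimal sketch (the function names `griesmer_length` and `meets_griesmer` are illustrative and not taken from the paper), the bound can be evaluated as follows:

```python
def griesmer_length(k: int, d: int, q: int) -> int:
    """Smallest length n allowed by the Griesmer bound for an [n, k, d]_q linear code."""
    # -(-d // q**i) is integer ceiling division: ceil(d / q^i)
    return sum(-(-d // q**i) for i in range(k))


def meets_griesmer(n: int, k: int, d: int, q: int) -> bool:
    """True when the parameters [n, k, d]_q attain the Griesmer bound with equality."""
    return n == griesmer_length(k, d, q)


if __name__ == "__main__":
    # The binary simplex code [7, 3, 4] is a standard example of a Griesmer code.
    print(griesmer_length(3, 4, 2), meets_griesmer(7, 3, 4, 2))
```

This only checks the parameter condition; it says nothing about whether a code with those parameters exists.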

    Coding for the Clouds: Coding Techniques for Enabling Security, Locality, and Availability in Distributed Storage Systems

    Cloud systems have become the backbone of many applications such as multimedia streaming, e-commerce, and cluster computing. At the foundation of any cloud architecture lies a large-scale, distributed, data storage system. To accommodate the massive amount of data being stored on the cloud, these distributed storage systems (DSS) have been scaled to contain hundreds to thousands of nodes that are connected through a networking infrastructure. Such data-centers are usually built out of commodity components, which make failures the norm rather than the exception. In order to combat node failures, data is typically stored in a redundant fashion. Due to the exponential data growth rate, many DSS are beginning to resort to error control coding over conventional replication methods, as coding offers high storage space efficiency. This paradigm shift from replication to coding, along with the need to guarantee reliability, efficiency, and security in DSS, has created a new set of challenges and opportunities, opening up a new area of research. This thesis addresses several of these challenges and opportunities by broadly making the following contributions. (i) We design practically amenable, low-complexity coding schemes that guarantee security of cloud systems, ensure quick recovery from failures, and provide high availability for retrieving partial information; and (ii) We analyze fundamental performance limits and optimal trade-offs between the key performance metrics of these coding schemes. More specifically, we first consider the problem of achieving information-theoretic security in DSS against an eavesdropper that can observe a limited number of nodes. We present a framework that enables design of secure repair-efficient codes through a joint construction of inner and outer codes. 
    Then, we consider a practically appealing notion of weakly secure coding, and construct coset codes that can weakly secure a wide class of regenerating codes that reduce the amount of data downloaded during node repair. Second, we consider the problem of meeting repair locality constraints, which specify the number of nodes participating in the repair process. We propose a notion of unequal locality, which enables different locality values for different nodes, ensuring quick recovery for nodes storing important data. We establish tight upper bounds on the minimum distance of linear codes with unequal locality, and present optimal code constructions. Next, we extend the notion of locality from the Hamming metric to the rank and subspace metrics, with the goal of designing codes for efficient data recovery from special types of correlated failures in DSS. We construct a family of locally recoverable rank-metric codes with optimal data recovery properties. Finally, we consider the problem of providing high availability, which is ensured by enabling node repair from multiple disjoint subsets of nodes of small size. We study codes with availability from a queuing-theoretical perspective by analyzing the average time necessary to download a block of data under the Poisson request arrival model when each node takes a random amount of time to fetch its contents. We compare the delay performance of the availability codes with several alternatives such as conventional erasure codes and replication schemes.

    Codes with efficient erasure correction

    Distributed storage systems are becoming increasingly ubiquitous in the emerging era of the Internet of Things. Major internet technology companies employ large-scale distributed storage systems to accommodate the massive amounts of data generated and requested by global users. The need for reliable and efficient storage of immense amounts of data calls for new applications and development of classical error-correcting codes. This dissertation is devoted to a study of codes with efficient erasure correction for distributed storage systems. The efficiency of erasure correction is often assessed by two performance metrics, bandwidth and locality. In this dissertation we address several problems for each of these two metrics. We construct families of codes with optimal communication complexity for erasure correction ("repair bandwidth") for a heterogeneous storage model, and derive several results for the problem of optimal repair of Reed-Solomon codes. We also construct families of cyclic and convolutional codes with locality, extending the range of parameters for which such families were previously known.

    A family of optimal locally recoverable codes

    A code over a finite alphabet is called locally recoverable (LRC) if every symbol in the encoding is a function of a small number (at most $r$) of other symbols. We present a family of LRC codes that attain the maximum possible value of the distance for a given locality parameter and code cardinality. The codewords are obtained as evaluations of specially constructed polynomials over a finite field, and reduce to a Reed-Solomon code if the locality parameter $r$ is set to be equal to the code dimension. The size of the code alphabet for most parameters is only slightly greater than the code length. The recovery procedure is performed by polynomial interpolation over $r$ points. We also construct codes with several disjoint recovering sets for every symbol. This construction enables the system to conduct several independent and simultaneous recovery processes of a specific symbol by accessing different parts of the codeword. This property enables high availability of frequently accessed data ("hot data"). Comment: Minor changes. This is the final published version of the paper
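
The recovery-by-interpolation idea in the abstract above can be illustrated with a small sketch. Everything concrete here is an assumption chosen for illustration, not a statement of the paper's general construction: the field $\mathbb{F}_{13}$, the parameters $n=9$, $k=4$, $r=2$, the polynomial $g(x)=x^3$ (constant on each group below), and the names `encode` and `recover`.

```python
P = 13  # assumed prime field size for this toy example
# Assumed partition of the 9 evaluation points; x^3 mod 13 is constant on each group.
GROUPS = [(1, 3, 9), (2, 6, 5), (4, 12, 10)]


def encode(msg):
    """Evaluate f(x) = a00 + a01*g(x) + (a10 + a11*g(x))*x with g(x) = x^3 over F_13."""
    a00, a01, a10, a11 = msg

    def f(x):
        g = pow(x, 3, P)
        return (a00 + a01 * g + (a10 + a11 * g) * x) % P

    return {x: f(x) for grp in GROUPS for x in grp}


def recover(codeword, lost):
    """Recover the symbol at point `lost` from the r = 2 other symbols in its group.

    Within a group g(x) is constant, so f restricts to a polynomial of degree
    at most r - 1 = 1 in x; Lagrange interpolation through the two surviving
    points therefore reproduces the lost value.
    """
    grp = next(g for g in GROUPS if lost in g)
    pts = [(x, codeword[x]) for x in grp if x != lost]
    total = 0
    for i, (xi, yi) in enumerate(pts):
        num, den = 1, 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (lost - xj) % P
                den = den * (xi - xj) % P
        # Modular inverse via Fermat's little theorem (P is prime).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


if __name__ == "__main__":
    cw = encode((1, 1, 1, 1))
    print(all(recover(cw, x) == cw[x] for x in cw))
```

Each symbol is repaired by reading only two other symbols, rather than the four that a Reed-Solomon decoder of the same dimension would need.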

    Optimal Locally Repairable Linear Codes

    Linear erasure codes with local repairability are desirable for distributed data storage systems. An $[n, k, d]$ code having all-symbol $(r, \delta)$-locality, denoted as $(r, \delta)_a$, is considered optimal if it also meets the minimum Hamming distance bound. The existing results on the existence and the construction of optimal $(r, \delta)_a$ codes are limited to only the special case of $\delta = 2$, and to only two small regions within this special case, namely, $m = 0$ or $m \geq (v+\delta-1) > (\delta-1)$, where $m = n \bmod (r+\delta-1)$ and $v = k \bmod r$. This paper investigates the existence conditions and presents deterministic constructive algorithms for optimal $(r, \delta)_a$ codes with general $r$ and $\delta$. First, a structure theorem is derived for general optimal $(r, \delta)_a$ codes which helps illuminate some of their structural properties. Next, the entire problem space with arbitrary $n$, $k$, $r$ and $\delta$ is divided into eight different cases (regions) with regard to the specific relations of these parameters. For two cases, it is rigorously proved that no optimal $(r, \delta)_a$ codes could exist. For four other cases the optimal $(r, \delta)_a$ codes are shown to exist, deterministic constructions are proposed and the lower bound on the required field size for these algorithms to work is provided. Our new constructive algorithms not only cover more cases, but for the same cases where previous algorithms exist, the new constructions require a considerably smaller field, which translates to potentially lower computational complexity. Our findings substantially enrich the knowledge on $(r, \delta)_a$ codes, leaving only two cases in which the existence of optimal codes is yet to be determined. Comment: Under Review
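
The abstract above does not state the minimum distance bound explicitly. Assuming it is the usual Singleton-like bound for $(r,\delta)$-locality, $d \leq n - k + 1 - (\lceil k/r \rceil - 1)(\delta - 1)$, a short sketch can evaluate it together with the quantities $m$ and $v$ defined in the abstract. The function names are illustrative only.

```python
def max_distance(n: int, k: int, r: int, delta: int) -> int:
    """Assumed Singleton-like upper bound on d for an [n, k, d] code with (r, delta)-locality."""
    ceil_k_over_r = -(-k // r)  # integer ceiling of k / r
    return n - k + 1 - (ceil_k_over_r - 1) * (delta - 1)


def region_parameters(n: int, k: int, r: int, delta: int):
    """Return (m, v) with m = n mod (r + delta - 1) and v = k mod r, as in the abstract."""
    return n % (r + delta - 1), k % r


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(max_distance(9, 4, 2, 2), region_parameters(9, 4, 2, 2))
```

With $r = k$ the correction term vanishes and the expression reduces to the classical Singleton bound $n - k + 1$.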