131 research outputs found

    Private Computation of Systematically Encoded Data with Colluding Servers

    Private Computation (PC), recently introduced by Sun and Jafar, is a generalization of Private Information Retrieval (PIR) in which a user wishes to privately compute an arbitrary function of data stored across several servers. We construct a PC scheme which accounts for server collusion, coded data, and non-linear functions. For data replicated over several possibly colluding servers, our scheme computes arbitrary functions of the data with rate equal to the asymptotic capacity of PIR for this setup. For systematically encoded data stored over colluding servers, we privately compute arbitrary functions of the columns of the data matrix and calculate the rate explicitly for polynomial functions. The scheme is a generalization of previously studied star-product PIR schemes. Comment: Submitted to IEEE International Symposium on Information Theory 2018. Version 2 fixes some typos and adds some clarifying remarks.

    Private Polynomial Computation from Lagrange Encoding

    Private computation is a generalization of private information retrieval, in which a user is able to compute a function on a distributed dataset without revealing the identity of that function to the servers that store the dataset. In this paper it is shown that Lagrange encoding, a recently suggested powerful technique for encoding Reed-Solomon codes, enables private computation in many cases of interest. In particular, we present a scheme that enables private computation of polynomials of any degree on Lagrange encoded data, while being robust to Byzantine and straggling servers, and to servers that collude in an attempt to deduce the identities of the functions to be evaluated. Moreover, incorporating ideas from the well-known Shamir secret sharing scheme allows the data itself to be concealed from the servers as well. Our results extend private computation to non-linear polynomials and to data privacy, and reveal a tight connection between private computation and coded computation.

    Private Polynomial Computation from Lagrange Encoding

    Private computation is a generalization of private information retrieval, in which a user is able to compute a function on a distributed dataset without revealing the identity of that function to the servers. In this paper it is shown that Lagrange encoding, a powerful technique for encoding Reed-Solomon codes, enables private computation in many cases of interest. In particular, we present a scheme that enables private computation of polynomials of any degree on Lagrange encoded data, while being robust to Byzantine and straggling servers, and to servers that collude in an attempt to deduce the identities of the functions to be evaluated. Moreover, incorporating ideas from the well-known Shamir secret sharing scheme allows the data itself to be concealed from the servers as well. Our results extend private computation to high degree polynomials and to data privacy, and reveal a tight connection between private computation and coded computation. Comment: To appear in Transactions on Information Forensics and Security.
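    As a rough illustration of the Lagrange encoding idea mentioned in this abstract, the following minimal sketch encodes a small dataset over a prime field, lets each server evaluate a polynomial on its coded share, and recovers the results by interpolation even when some servers straggle. It covers only the encode/compute/decode step; it does not hide the identity of the function, tolerate Byzantine servers, or add Shamir-style masking. The field size, evaluation points, polynomial `f`, and all function names are assumptions chosen for the example, not taken from the paper.

```python
# Minimal sketch of Lagrange-encoded polynomial computation (assumed parameters).
P = 2_147_483_647  # prime modulus 2^31 - 1; the field choice is an assumption


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial of degree < len(xs) through (xs[i], ys[i]) mod p."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def encode(data, betas, alphas, p=P):
    """Share for server i is u(alpha_i), where u interpolates u(beta_j) = data[j]."""
    return [lagrange_eval(betas, data, a, p) for a in alphas]


def f(x, p=P):
    """Hypothetical degree-3 polynomial the user wants evaluated on every data symbol."""
    return (x * x * x + 2 * x + 5) % p


data = [7, 11, 13]                      # K = 3 data symbols
betas = [1, 2, 3]                       # interpolation points for the data
deg_f = 3
needed = deg_f * (len(data) - 1) + 1    # f(u(z)) has degree 6, so 7 answers suffice
alphas = list(range(4, 4 + needed + 2)) # N = 9 servers, so 2 may straggle

shares = encode(data, betas, alphas)
answers = [f(s) for s in shares]        # each server works only on its coded share

# Pretend the first two servers straggle and decode from the remaining seven.
recovered = [lagrange_eval(alphas[2:], answers[2:], b) for b in betas]
```

    Here `recovered` holds `f` applied to each original data symbol, because `f(u(z))` is a polynomial of degree 6 and any 7 evaluations determine it.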

    On the Asymptotic Capacity of $X$-Secure $T$-Private Information Retrieval with Graph Based Replicated Storage

    The problem of private information retrieval with graph-based replicated storage was recently introduced by Raviv, Tamo and Yaakobi. Its capacity remains open in almost all cases. In this work the asymptotic (large number of messages) capacity of this problem is studied along with its generalizations to include arbitrary $T$-privacy and $X$-security constraints, where the privacy of the user must be protected against any set of up to $T$ colluding servers and the security of the stored data must be protected against any set of up to $X$ colluding servers. A general achievable scheme for arbitrary storage patterns is presented that achieves the rate $(\rho_{\min}-X-T)/N$, where $N$ is the total number of servers, and each message is replicated at least $\rho_{\min}$ times. Notably, the scheme makes use of a special structure inspired by dual Generalized Reed Solomon (GRS) codes. A general converse is also presented. The two bounds are shown to match for many settings, including symmetric storage patterns. Finally, the asymptotic capacity is fully characterized for the case without security constraints $(X=0)$ for arbitrary storage patterns provided that each message is replicated no more than $T+2$ times. As an example of this result, consider PIR with arbitrary graph based storage ($T=1, X=0$) where every message is replicated at exactly $3$ servers. For this $3$-replicated storage setting, the asymptotic capacity is equal to $2/\nu_2(G)$ where $\nu_2(G)$ is the maximum size of a $2$-matching in a storage graph $G[V,E]$. In this undirected graph, the vertices $V$ correspond to the set of servers, and there is an edge $uv\in E$ between vertices $u,v$ only if a subset of messages is replicated at both servers $u$ and $v$.
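    The achievable rate stated in this abstract, $(\rho_{\min}-X-T)/N$, can be evaluated for concrete parameters with a few lines of Python. This is only a worked-arithmetic sketch: the function name `achievable_rate` and the sample parameters are hypothetical, and clamping a negative value to zero is an assumption of this example rather than something the abstract states.

```python
from fractions import Fraction


def achievable_rate(rho_min, X, T, N):
    """Return (rho_min - X - T) / N as an exact fraction.

    rho_min: minimum number of servers each message is replicated at
    X: security threshold, T: privacy threshold, N: total number of servers
    Clamping at zero is an assumption of this sketch.
    """
    if N <= 0:
        raise ValueError("N must be a positive number of servers")
    return max(Fraction(rho_min - X - T, N), Fraction(0))


# Hypothetical setting: every message replicated 3 times, T = 1, X = 0, N = 6.
rate = achievable_rate(rho_min=3, X=0, T=1, N=6)
print(rate)
```

    With these sample values the expression evaluates to $(3-0-1)/6 = 1/3$.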

    Coding against stragglers in distributed computation scenarios

    Data and analytics capabilities have made a leap forward in recent years. The volume of available data has grown exponentially, and this huge amount of data needs to be transferred and stored with extremely high reliability. Coded computing, a distributed computing paradigm that uses coding theory to inject and exploit data and computation redundancy in distributed computing systems, mitigates the fundamental performance bottlenecks of large-scale data analytics. In the first part of this dissertation, a distributed computing framework is studied for input files stored in a distributed fashion on the uplink of a cloud radio access network architecture. The focus is on the setting in which decoding at the cloud takes place via network function virtualization on commercial off-the-shelf servers. In order to mitigate the impact of straggling decoders in this platform, a novel coding strategy is proposed, whereby the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Transmission of a single frame is considered first, and upper bounds on the resulting frame unavailability probability as a function of the decoding latency are derived by assuming a binary symmetric channel for uplink communications. Then, the analysis is extended to account for random frame arrival times. In this case, the trade-off between the average decoding latency and the frame error rate is studied for two different queuing policies, whereby the servers carry out per-frame decoding or continuous decoding, respectively. Numerical examples demonstrate that the bounds are useful tools for code design and that coding is instrumental in obtaining a desirable compromise between decoding latency and reliability. In the second part of this dissertation, large matrix multiplications, which are central to large-scale machine learning applications, are considered. These operations are often carried out on a distributed computing platform with a master server and multiple workers in the cloud operating in parallel. For such distributed platforms, it has been recently shown that coding over the input data matrices can reduce the computational delay, yielding a trade-off between recovery threshold, i.e., the number of workers required to recover the matrix product, and communication load, i.e., the total amount of data to be downloaded from the workers. In addition to exact recovery requirements, security and privacy constraints on the data matrices are imposed, and the recovery threshold as a function of the communication load is studied. First, it is assumed that both matrices contain private information and that workers can collude to eavesdrop on the content of these data matrices. For this problem, a novel class of secure codes is introduced, referred to as secure generalized PolyDot codes, which generalize state-of-the-art non-secure codes for matrix multiplication. Secure generalized PolyDot codes allow a flexible trade-off between recovery threshold and communication load for a fixed maximum number of colluding workers while providing perfect secrecy for the two data matrices. Then, a connection between secure matrix multiplication and private information retrieval is studied. It is assumed that one of the data matrices is taken from a public set known to all the workers. In this setup, the identity of the matrix of interest should be kept private from the workers. For this model, a variant of generalized PolyDot codes is presented that can guarantee both secrecy of one matrix and privacy for the identity of the other matrix for the case of no colluding servers.


    The Asymptotic Capacity of $X$-Secure $T$-Private Linear Computation with Graph Based Replicated Storage

    The problem of $X$-secure $T$-private linear computation with graph based replicated storage (GXSTPLC) is to enable the user to privately retrieve a linear combination of messages from a set of $N$ distributed servers, where every message may be stored only among a subset of servers subject to an $X$-security constraint, i.e., any group of up to $X$ colluding servers must learn nothing about the messages. In addition, any group of up to $T$ servers must learn nothing about the coefficients of the linear combination retrieved by the user. In this work, we completely characterize the asymptotic capacity of GXSTPLC, i.e., the supremum of the average number of desired symbols retrieved per downloaded symbol, in the limit as the number of messages $K$ approaches infinity. Specifically, it is shown that a prior linear programming based upper bound on the asymptotic capacity of GXSTPLC due to Jia and Jafar is tight by constructing achievability schemes. Notably, our achievability scheme also settles the exact capacity (i.e., for finite $K$) of $X$-secure linear combination with graph based replicated storage (GXSLC). Our achievability proof builds upon an achievability scheme for a closely related problem named asymmetric $\mathbf{X}$-secure $\mathbf{T}$-private linear computation with graph based replicated storage (Asymm-GXSTPLC) that guarantees non-uniform security and privacy levels across messages and coefficients. In particular, by carefully designing Asymm-GXSTPLC settings for GXSTPLC problems, the corresponding Asymm-GXSTPLC schemes can be reduced to asymptotic capacity achieving schemes for GXSTPLC. In regard to the achievability scheme for Asymm-GXSTPLC, interesting aspects of our construction include a novel query and answer design which makes use of a Vandermonde decomposition of Cauchy matrices, and a trade-off among message replication, security and privacy thresholds. Comment: 39 pages, 2 figures.