126 research outputs found

    Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy

    We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to $13.43\times$, and also achieves a $2.36\times$ to $12.65\times$ speedup over the state-of-the-art straggler mitigation strategies.
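The encoding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only straggler resiliency (no Byzantine workers, no privacy padding), uses scalar data over a prime field, and assumes that results from any $(K-1)\deg f + 1$ workers suffice to decode. The modulus `P`, the polynomial `f`, the evaluation points, and the function names `lagrange_eval` and `lcc_demo` are all hypothetical choices made for illustration.

```python
P = 2_147_483_647  # prime modulus (2^31 - 1); an illustrative choice


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial through the points (xs[i], ys[i]) mod p."""
    total = 0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for l, xl in enumerate(xs):
            if l != j:
                num = num * (z - xl) % p
                den = den * (xj - xl) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        total = (total + yj * num * pow(den, -1, p)) % p
    return total


def f(x, p=P):
    """Hypothetical degree-2 polynomial that each worker evaluates."""
    return (x * x + 3 * x + 1) % p


def lcc_demo(data, n_workers, stragglers, deg_f=2, p=P):
    """Encode data, let non-straggling workers compute f, and decode f(data[j])."""
    k = len(data)
    betas = list(range(1, k + 1))                    # points carrying the data
    alphas = list(range(k + 1, k + 1 + n_workers))   # one point per worker

    # Encoding: worker i receives u(alpha_i), where u interpolates the data.
    encoded = [lagrange_eval(betas, data, a, p) for a in alphas]

    # Each responsive worker applies f to its coded share.
    results = {i: f(encoded[i], p)
               for i in range(n_workers) if i not in stragglers}

    # f(u(z)) has degree at most (k - 1) * deg_f, so this many points fix it.
    need = (k - 1) * deg_f + 1
    if len(results) < need:
        raise ValueError("too many stragglers to decode")

    idx = sorted(results)[:need]
    xs = [alphas[i] for i in idx]
    ys = [results[i] for i in idx]

    # Decoding: evaluate the interpolated f(u(z)) back at the data points.
    return [lagrange_eval(xs, ys, b, p) for b in betas]


print(lcc_demo([5, 7, 11], n_workers=8, stragglers={0, 3, 6}))  # [41, 71, 155]
```

With three data items and a degree-2 function, five worker results are enough, so the eight-worker example tolerates three stragglers.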

    Near MDS poset codes and distributions

    We study $q$-ary codes with distance defined by a partial order of the coordinates of the codewords. Maximum Distance Separable (MDS) codes in the poset metric have been studied in a number of earlier works. We consider codes that are close to MDS codes by the value of their minimum distance. For such codes, we determine their weight distribution, and in the particular case of the "ordered metric" characterize distributions of points in the unit cube defined by the codes. We also give some constructions of codes in the ordered Hamming space. Comment: 13 pages, 1 figure

    Full Diversity Unitary Precoded Integer-Forcing

    We consider a point-to-point flat-fading MIMO channel with channel state information known both at transmitter and receiver. At the transmitter side, a lattice coding scheme is employed at each antenna to map information symbols to independent lattice codewords drawn from the same codebook. Each lattice codeword is then multiplied by a unitary precoding matrix $\mathbf{P}$ and sent through the channel. At the receiver side, an integer-forcing (IF) linear receiver is employed. We denote this scheme as unitary precoded integer-forcing (UPIF). We show that UPIF can achieve full diversity under a constraint based on the shortest vector of a lattice generated by the precoding matrix $\mathbf{P}$. This constraint and a simpler version of it provide design criteria for two types of full-diversity UPIF. Type I uses a unitary precoder that adapts at each channel realization. Type II uses a unitary precoder that remains fixed for all channel realizations. We then verify our results by computer simulations in $2\times 2$ and $4\times 4$ MIMO using different QAM constellations. We finally show that the proposed Type II UPIF outperforms the MIMO precoding X-codes at high data rates. Comment: 12 pages, 8 figures, to appear in IEEE-TW

    Storage Codes with Flexible Number of Nodes

    This paper presents flexible storage codes, a class of error-correcting codes that can recover information from a flexible number of storage nodes. As a result, one can make better use of the available storage nodes in the presence of unpredictable node failures and reduce the data access latency. Let us assume a storage system encodes $k\ell$ information symbols over a finite field $\mathbb{F}$ into $n$ nodes, each of size $\ell$ symbols. The code is parameterized by a set of tuples $\{(R_j,k_j,\ell_j): 1 \le j \le a\}$, satisfying $k_1\ell_1=k_2\ell_2=\dots=k_a\ell_a$ and $k_1>k_2>\dots>k_a = k$, $\ell_a=\ell$, such that the information symbols can be reconstructed from any $R_j$ nodes, each node accessing $\ell_j$ symbols. In other words, the code allows a flexible number of nodes for decoding to accommodate the variance in the data access time of the nodes. Code constructions are presented for different storage scenarios, including LRC (locally recoverable) codes, PMDS (partial MDS) codes, and MSR (minimum storage regenerating) codes. We analyze the latency of accessing information and perform simulations on Amazon clusters to show the efficiency of the presented codes.
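The parameter constraints stated in this abstract can be checked mechanically. The sketch below is only a validator for the tuple conditions, not a code construction; the function name `check_flexible_params` and the sample values are hypothetical, and the example sets $R_j = k_j$ purely as an illustrative assumption.

```python
def check_flexible_params(tuples, k, ell):
    """Check the conditions on (R_j, k_j, ell_j), j = 1..a, from the abstract.

    tuples: list of (R_j, k_j, ell_j) in order j = 1..a.
    Returns True when k_j * ell_j is the same for all j, the k_j are
    strictly decreasing, and the last tuple has k_a == k and ell_a == ell.
    """
    ks = [kj for _, kj, _ in tuples]
    ells = [lj for _, _, lj in tuples]

    # Every decoding option must recover the same k * ell information symbols.
    same_product = all(kj * lj == k * ell for kj, lj in zip(ks, ells))

    # k_1 > k_2 > ... > k_a
    decreasing = all(x > y for x, y in zip(ks, ks[1:]))

    # The last option matches the base parameters (k, ell).
    ends_ok = ks[-1] == k and ells[-1] == ell

    return same_product and decreasing and ends_ok


# Illustrative values with k * ell = 12 and R_j = k_j (an assumption):
# read 3 symbols from each of 4 nodes, 4 from each of 3, or 6 from each of 2.
print(check_flexible_params([(4, 4, 3), (3, 3, 4), (2, 2, 6)], k=2, ell=6))  # True
```

In this example, a reader that reaches four nodes quickly needs only three symbols from each, while one that reaches only two nodes reads all six symbols from each.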