Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy
We consider a scenario involving computations over a massive dataset stored
in a distributed manner across multiple workers, which is at the core of distributed
learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework
to simultaneously provide (1) resiliency against stragglers that may prolong
computations; (2) security against Byzantine (or malicious) workers that
deliberately modify the computation for their benefit; and (3)
(information-theoretic) privacy of the dataset amidst possible collusion of
workers. LCC, which leverages the well-known Lagrange polynomial to create
computation redundancy in a novel coded form across workers, can be applied to
any computation scenario in which the function of interest is an arbitrary
multivariate polynomial of the input dataset, hence covering many computations
of interest in machine learning. LCC significantly generalizes prior works to
go beyond linear computations. It also enables secure and private computing in
distributed settings, improving the computation and communication efficiency of
the state-of-the-art. Furthermore, we prove the optimality of LCC by showing
that it achieves the optimal tradeoff between resiliency, security, and
privacy, i.e., in terms of tolerating the maximum number of stragglers and
adversaries, and providing data privacy against the maximum number of colluding
workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the
conventional uncoded implementation of distributed least-squares linear
regression, and also achieves a speedup over the state-of-the-art straggler
mitigation strategies.
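The encoding idea in the abstract above can be illustrated with a minimal sketch. It covers only straggler resiliency (no Byzantine workers, no privacy padding), and it assumes scalar data, a prime field of size 2^31 - 1, a toy degree-2 function `f`, and evaluation points chosen as consecutive integers. All of these names and choices are illustrative and not taken from the paper. Since the function applied to the encoding polynomial has degree deg·(K-1), any deg·(K-1)+1 worker results suffice to interpolate it.

```python
# Toy Lagrange-coded computation over a prime field (straggler tolerance only).
P = 2**31 - 1  # prime modulus; an assumed field size for this sketch


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial of degree < len(xs) through (xs[i], ys[i]) mod p."""
    total = 0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for l, xl in enumerate(xs):
            if l != j:
                num = num * (z - xl) % p
                den = den * (xj - xl) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        total = (total + yj * num * pow(den, -1, p)) % p
    return total


def f(x, p=P):
    """Hypothetical polynomial of degree 2 that each worker evaluates."""
    return (x * x + 3 * x) % p


def lcc_demo(data, n_workers, stragglers, deg=2):
    """Encode data, let non-straggling workers compute f, then decode f(data[j]) for all j."""
    k = len(data)
    betas = list(range(1, k + 1))                    # points where u(beta_j) = data[j]
    alphas = list(range(k + 1, k + 1 + n_workers))   # one evaluation point per worker
    encoded = [lagrange_eval(betas, data, a) for a in alphas]
    # Each responsive worker applies f to its single encoded share.
    results = {i: f(encoded[i]) for i in range(n_workers) if i not in stragglers}
    needed = deg * (k - 1) + 1                       # degree of f(u(z)) plus one
    if len(results) < needed:
        raise ValueError("too many stragglers to decode")
    idx = sorted(results)[:needed]
    xs = [alphas[i] for i in idx]
    ys = [results[i] for i in idx]
    # Interpolate f(u(z)) and read off its values at the betas.
    return [lagrange_eval(xs, ys, b) for b in betas]
```

For example, `lcc_demo([5, 7, 11], 8, {0, 3, 6})` decodes from the five workers that respond, since 2·(3-1)+1 = 5 results are needed in this setting.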
Near MDS poset codes and distributions
We study q-ary codes with distance defined by a partial order of the
coordinates of the codewords. Maximum Distance Separable (MDS) codes in the
poset metric have been studied in a number of earlier works. We consider codes
that are close to MDS codes by the value of their minimum distance. For such
codes, we determine their weight distribution, and in the particular case of
the "ordered metric" characterize distributions of points in the unit cube
defined by the codes. We also give some constructions of codes in the ordered
Hamming space. Comment: 13 pages, 1 figure
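As a rough illustration of a distance defined by a partial order on coordinates, the sketch below computes the usual poset weight: the size of the smallest downward-closed set (ideal) containing the support of a vector. The representation of the poset as a dictionary `lower`, mapping each coordinate to its immediate predecessors, is an assumption made for this example and not the paper's notation.

```python
def poset_weight(x, lower):
    """Size of the ideal generated by the support of x.

    lower[i] lists the coordinates immediately below i in the partial order
    (a hypothetical encoding of the poset chosen for this sketch).
    """
    ideal = set()
    stack = [i for i, xi in enumerate(x) if xi != 0]
    while stack:
        i = stack.pop()
        if i not in ideal:
            ideal.add(i)
            stack.extend(lower.get(i, ()))  # pull in everything below i
    return len(ideal)


def poset_distance(x, y, lower):
    """Poset distance: weight of the indicator of positions where x and y differ."""
    diff = [1 if a != b else 0 for a, b in zip(x, y)]
    return poset_weight(diff, lower)
```

With an empty `lower` (an antichain) this reduces to the Hamming weight, while a single chain such as `{1: [0], 2: [1]}` gives the position of the last nonzero coordinate, as in the ordered metric.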
Full Diversity Unitary Precoded Integer-Forcing
We consider a point-to-point flat-fading MIMO channel with channel state
information known both at transmitter and receiver. At the transmitter side, a
lattice coding scheme is employed at each antenna to map information symbols to
independent lattice codewords drawn from the same codebook. Each lattice
codeword is then multiplied by a unitary precoding matrix and sent
through the channel. At the receiver side, an integer-forcing (IF) linear
receiver is employed. We denote this scheme as unitary precoded integer-forcing
(UPIF). We show that UPIF can achieve full-diversity under a constraint based
on the shortest vector of a lattice generated by the precoding matrix. This constraint and a simpler version of it provide design criteria for
two types of full-diversity UPIF. Type I uses a unitary precoder that adapts at
each channel realization. Type II uses a unitary precoder that remains fixed
for all channel realizations. We then verify our results by computer
simulations for multiple MIMO configurations using different QAM
constellations. We finally show that the proposed Type II UPIF outperforms the
MIMO precoding X-codes at high data rates. Comment: 12 pages, 8 figures, to appear in IEEE-TW
Storage Codes with Flexible Number of Nodes
This paper presents flexible storage codes, a class of error-correcting codes
that can recover information from a flexible number of storage nodes. As a
result, one can make better use of the available storage nodes in the
presence of unpredictable node failures and reduce the data access latency.
Suppose a storage system encodes information symbols over a finite field into
a set of nodes, each storing a fixed number of symbols. The code is
parameterized by a set of tuples, subject to constraints on their entries, such
that the information symbols can be reconstructed from any of the node counts
the tuples specify, with each node accessing the corresponding number of
symbols. In other words, the code
allows a flexible number of nodes for decoding to accommodate the variance in
the data access time of the nodes. Code constructions are presented for
different storage scenarios, including LRC (locally recoverable) codes, PMDS
(partial MDS) codes, and MSR (minimum storage regenerating) codes. We analyze
the latency of accessing information and perform simulations on Amazon clusters
to show the efficiency of the presented codes.