Coding against stragglers in distributed computation scenarios
Data and analytics capabilities have made a leap forward in recent years. The volume of available data has grown exponentially, and this data needs to be transferred and stored with extremely high reliability. Coded computing, a distributed computing paradigm that uses coding theory to inject and leverage data and computation redundancy in distributed computing systems, mitigates the fundamental performance bottlenecks of running large-scale data analytics.
In this dissertation, a distributed computing framework is studied, first for input files stored in a distributed manner on the uplink of a cloud radio access network architecture. The focus is on the case in which decoding at the cloud takes place via network function virtualization on commercial off-the-shelf servers. In order to mitigate the impact of straggling decoders in this platform, a novel coding strategy is proposed, whereby the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Transmission of a single frame is considered first, and upper bounds on the resulting frame unavailability probability as a function of the decoding latency are derived by assuming a binary symmetric channel for uplink communications. Then, the analysis is extended to account for random frame arrival times. In this case, the trade-off between average decoding latency and the frame error rate is studied for two different queuing policies, whereby the servers carry out per-frame decoding or continuous decoding, respectively. Numerical examples demonstrate that the bounds are useful tools for code design and that coding is instrumental in obtaining a desirable compromise between decoding latency and reliability.
In the second part of this dissertation, large matrix multiplications, which are central to large-scale machine learning applications, are considered. These operations are often carried out on a distributed computing platform with a master server and multiple workers in the cloud operating in parallel. For such distributed platforms, it has been recently shown that coding over the input data matrices can reduce the computational delay, yielding a trade-off between recovery threshold, i.e., the number of workers required to recover the matrix product, and communication load, i.e., the total amount of data to be downloaded from the workers. In addition to exact recovery requirements, security and privacy constraints on the data matrices are imposed, and the recovery threshold as a function of the communication load is studied. First, it is assumed that both matrices contain private information and that workers can collude to eavesdrop on the content of these data matrices. For this problem, a novel class of secure codes is introduced, referred to as secure generalized PolyDot codes, that generalize state-of-the-art non-secure codes for matrix multiplication. Secure generalized PolyDot codes allow a flexible trade-off between recovery threshold and communication load for a fixed maximum number of colluding workers while providing perfect secrecy for the two data matrices. Then, a connection between secure matrix multiplication and private information retrieval is studied. It is assumed that one of the data matrices is taken from a public set known to all the workers. In this setup, the identity of the matrix of interest should be kept private from the workers. For this model, a variant of generalized PolyDot codes is presented that can guarantee both secrecy of one matrix and privacy for the identity of the other matrix for the case of no colluding servers.
Coded Computation Against Processing Delays for Virtualized Cloud-Based Channel Decoding
The uplink of a cloud radio access network architecture is studied in which
decoding at the cloud takes place via network function virtualization on
commercial off-the-shelf servers. In order to mitigate the impact of straggling
decoders in this platform, a novel coding strategy is proposed, whereby the
cloud re-encodes the received frames via a linear code before distributing them
to the decoding processors. Transmission of a single frame is considered first,
and upper bounds on the resulting frame unavailability probability as a
function of the decoding latency are derived by assuming a binary symmetric
channel for uplink communications. Then, the analysis is extended to account
for random frame arrival times. In this case, the trade-off between average
decoding latency and the frame error rate is studied for two different queuing
policies, whereby the servers carry out per-frame decoding or continuous
decoding, respectively. Numerical examples demonstrate that the bounds are
useful tools for code design and that coding is instrumental in obtaining a
desirable compromise between decoding latency and reliability.
Comment: 11 pages and 12 figures, Submitte
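The latency side of this trade-off can be illustrated with a small Monte Carlo sketch. It is not the paper's model: it assumes exponentially distributed server times (the `rate` parameter, the function names, and the values n = 8, k = 4 are all hypothetical), it assumes an idealized code in which any k of the n servers suffice, and it ignores the reliability penalty of decoding re-encoded frames that the abstract analyzes.

```python
import random

def completion_times(n, k, rate, rng):
    """Return (uncoded, coded) completion times for one job.

    Uncoded: the job is split over k servers and all k must finish.
    Coded:   the job is spread over n >= k servers and any k suffice
             (an idealized assumption, not the paper's exact scheme).
    """
    # Assumed model: independent exponential processing times per server.
    times = [rng.expovariate(rate) for _ in range(n)]
    uncoded = max(times[:k])       # wait for every one of the first k servers
    coded = sorted(times)[k - 1]   # wait only for the k fastest of all n
    return uncoded, coded

def average_latency(n, k, rate=1.0, trials=10_000, seed=0):
    """Average both completion times over many simulated jobs."""
    rng = random.Random(seed)
    total_uncoded = total_coded = 0.0
    for _ in range(trials):
        uncoded, coded = completion_times(n, k, rate, rng)
        total_uncoded += uncoded
        total_coded += coded
    return total_uncoded / trials, total_coded / trials

uncoded_avg, coded_avg = average_latency(n=8, k=4)
print(f"uncoded: {uncoded_avg:.3f}  coded: {coded_avg:.3f}")
```

Because the coded job waits for the k-th fastest of n servers rather than the slowest of k, its completion time in this sketch is never larger than the uncoded one for the same draws.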
Distributed and Private Coded Matrix Computation with Flexible Communication Load
Tensor operations, such as matrix multiplication, are central to large-scale
machine learning applications. For user-driven tasks these operations can be
carried out on a distributed computing platform with a master server at the
user side and multiple workers in the cloud operating in parallel. For
distributed platforms, it has been recently shown that coding over the input
data matrices can reduce the computational delay, yielding a trade-off between
recovery threshold and communication load. In this paper we impose an
additional security constraint on the data matrices and assume that workers can
collude to eavesdrop on the content of these data matrices. Specifically, we
introduce a novel class of secure codes, referred to as secure generalized
PolyDot codes, that generalizes previously published non-secure versions of
these codes for matrix multiplication. These codes extend the state-of-the-art
by allowing a flexible trade-off between recovery threshold and communication
load for a fixed maximum number of colluding workers.
Comment: 8 pages, 6 figures, submitted to 2019 IEEE International Symposium on
Information Theory (ISIT)
Public awareness, education and participation in solid waste management in Tehran
Background and objectives: Public participation is vital to the optimal management of municipal solid waste. Public awareness, education and empowerment are the prerequisites for using this potential. In this study, public awareness, education and participation in solid waste management were studied among a community sample in Tehran in 2012. Materials and methods: The overall situation of solid waste management in Tehran was assessed first. Study participants were then sampled from households in the 22 urban districts of the city of Tehran. A questionnaire was prepared and administered to 500 householders to estimate public awareness, education and participation in solid waste management. Results: The results of this study showed that only about one-third of people had appropriate awareness in the field of solid waste management. The overall status of public education in solid waste management was also insufficient: the level of training of 86% of people was poor or very poor. Public participation in solid waste management varied across fields. Public participation in simple activities, such as avoiding waste spillage and splurge in public places and the scheduled transfer of collected waste to public containers, was relatively good; participation in waste reduction and separation of recyclable components was moderate. Furthermore, separation of hazardous waste and household composting were not done due to a lack of the required facilities and training. Conclusion: The present study revealed that public education and the required facilities should be supplied and expanded in order to increase public participation in solid waste management. Repetition and continuity of education programs, face-to-face training, and greater use of television and Internet media are emphasized. Keywords: Public education, Public participation, Municipal solid waste, Tehran Cit
Private and Secure Distributed Matrix Multiplication with Flexible Communication Load
Large matrix multiplications are central to large-scale machine learning
applications. These operations are often carried out on a distributed computing
platform with a master server and multiple workers in the cloud operating in
parallel. For such distributed platforms, it has been recently shown that
coding over the input data matrices can reduce the computational delay,
yielding a trade-off between recovery threshold, i.e., the number of workers
required to recover the matrix product, and communication load, i.e., the total
amount of data to be downloaded from the workers. In this paper, in addition to
exact recovery requirements, we impose security and privacy constraints on the
data matrices, and study the recovery threshold as a function of the
communication load. We first assume that both matrices contain private
information and that workers can collude to eavesdrop on the content of these
data matrices. For this problem, we introduce a novel class of secure codes,
referred to as secure generalized PolyDot (SGPD) codes, that generalize
state-of-the-art non-secure codes for matrix multiplication. SGPD codes allow a
flexible trade-off between recovery threshold and communication load for a
fixed maximum number of colluding workers while providing perfect secrecy for
the two data matrices. We then study a connection between secure matrix
multiplication and private information retrieval. We specifically assume that
one of the data matrices is taken from a public set known to all the workers.
In this setup, the identity of the matrix of interest should be kept private
from the workers. For this model, we present a variant of generalized PolyDot
codes that can guarantee both secrecy of one matrix and privacy for the
identity of the other matrix for the case of no colluding servers.
Comment: 12 pages, 9 figures, this submission subsumes arXiv:1901.07705. This
work has been submitted to the IEEE for possible publicatio
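The recovery-threshold idea can be sketched with a much simpler code than the SGPD construction. The example below is a toy polynomial code over a prime field and is an illustration only: the modulus `P`, the function names, the evaluation points and the choice to mask only the matrix A (with B sent in the clear and no colluding workers) are all assumptions made here, not details taken from the paper. A is split into two row blocks A1 and A2, worker i receives A1 + x_i A2 + x_i^2 R for a uniformly random mask R, and any three worker results suffice to recover AB.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1 (an assumed field size)

def matmul(X, Y):
    """Matrix product over GF(P)."""
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*Y)]
            for row in X]

def combine(mats, coeffs):
    """Entrywise linear combination sum_k coeffs[k] * mats[k] over GF(P)."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[r][j] for c, M in zip(coeffs, mats)) % P
             for j in range(cols)] for r in range(rows)]

def encode(A, xs, rng):
    """Share for point x: A1 + x*A2 + x^2*R, with R a uniform random mask."""
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    R = [[rng.randrange(P) for _ in A1[0]] for _ in A1]
    return [combine([A1, A2, R], [1, x, x * x % P]) for x in xs]

def invert_mod(M):
    """Gauss-Jordan inverse of a square matrix over GF(P)."""
    n = len(M)
    aug = [[v % P for v in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # Fermat inverse, P is prime
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def decode(xs, results):
    """Recover AB from any 3 worker results by inverting a Vandermonde system."""
    Vinv = invert_mod([[1, x, x * x] for x in xs])
    top = combine(results, Vinv[0])     # coefficient of x^0: A1 * B
    bottom = combine(results, Vinv[1])  # coefficient of x^1: A2 * B
    return top + bottom                 # stack the two row blocks

rng = random.Random(0)
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0, 2], [0, 1, 3]]
xs = [1, 2, 3, 4, 5]                         # five workers, distinct nonzero points
answers = [matmul(S, B) for S in encode(A, xs, rng)]
fast = [4, 0, 2]                             # workers 1 and 3 are stragglers
product = decode([xs[i] for i in fast], [answers[i] for i in fast])
assert product == matmul(A, B)
```

In this sketch the recovery threshold is 3 out of 5 workers, one more than the unmasked version would need. Since R is uniform over the field and every evaluation point is nonzero, a single worker's share is uniformly distributed regardless of A. Each worker returns a result half the size of AB, which is the kind of quantity the communication load measures.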
Linear label code of a root lattice using Gröbner bases
The label code of a lattice plays a key role in the characterization of the lattice. Every lattice Λ can be described in terms of a label code L and an orthogonal sublattice Λ′ such that Λ/Λ′ ≅ L. We identify the binomial ideal associated to an integer lattice and then establish a relation between the ideal quotient of the lattice and its label code. Furthermore, we present the Gröbner basis of the well-known root lattice D_n. As an application of the relation I_Λ = I_Λ′ + I_L, where I_Λ, I_Λ′ and I_L denote the binomial ideals associated to Λ, Λ′ and L, respectively, a linear label code of D_n is obtained using its Gröbner basis.
Binomial Ideal Associated to a Lattice and Its Label Code
Extended abstract: In coding theory, the study of the binomial ideal derived from an arbitrary code is currently of great interest. Every lattice Λ can be described in terms of a label code L and an orthogonal sublattice Λ′ such that Λ/Λ′ ≅ L [2]. We assign binomial ideals I_Λ and I_L to an integer lattice Λ and its label code L, respectively. In this work, we identify the binomial ideal associated to an integer lattice and then establish the relation I_Λ = I_Λ′ + I_L between the ideal of the lattice and its label code.