
    Coding against stragglers in distributed computation scenarios

    Data and analytics capabilities have made a leap forward in recent years. The volume of available data has grown exponentially, and this data must be transferred and stored with extremely high reliability. Coded computing, a distributed computing paradigm that uses coding theory to inject and leverage data and computation redundancy in distributed computing systems, mitigates the fundamental performance bottlenecks of running large-scale data analytics. This dissertation first studies a distributed computing framework for input files stored in a distributed manner on the uplink of a cloud radio access network architecture. It focuses on the setting in which decoding at the cloud takes place via network function virtualization on commercial off-the-shelf servers. To mitigate the impact of straggling decoders in this platform, a novel coding strategy is proposed, whereby the cloud re-encodes the received frames via a linear code before distributing them to the decoding processors. Transmission of a single frame is considered first, and upper bounds on the resulting frame unavailability probability as a function of the decoding latency are derived by assuming a binary symmetric channel for uplink communications. Then, the analysis is extended to account for random frame arrival times. In this case, the trade-off between average decoding latency and frame error rate is studied for two different queuing policies, whereby the servers carry out per-frame decoding or continuous decoding, respectively. Numerical examples demonstrate that the bounds are useful tools for code design and that coding is instrumental in obtaining a desirable compromise between decoding latency and reliability. The second part of this dissertation considers large matrix multiplications, which are central to large-scale machine learning applications.
These operations are often carried out on a distributed computing platform with a master server and multiple workers in the cloud operating in parallel. For such distributed platforms, it has recently been shown that coding over the input data matrices can reduce the computational delay, yielding a trade-off between the recovery threshold, i.e., the number of workers required to recover the matrix product, and the communication load, i.e., the total amount of data to be downloaded from the workers. In addition to exact recovery requirements, security and privacy constraints on the data matrices are imposed, and the recovery threshold as a function of the communication load is studied. First, it is assumed that both matrices contain private information and that workers can collude to eavesdrop on the content of these data matrices. For this problem, a novel class of secure codes is introduced, referred to as secure generalized PolyDot codes, that generalize state-of-the-art non-secure codes for matrix multiplication. Secure generalized PolyDot codes allow a flexible trade-off between recovery threshold and communication load for a fixed maximum number of colluding workers while providing perfect secrecy for the two data matrices. Then, a connection between secure matrix multiplication and private information retrieval is studied. It is assumed that one of the data matrices is taken from a public set known to all the workers. In this setup, the identity of the matrix of interest should be kept private from the workers. For this model, a variant of generalized PolyDot codes is presented that can guarantee both secrecy of one matrix and privacy for the identity of the other matrix for the case of no colluding servers.
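
As a rough illustration of the coded matrix multiplication idea in the abstract above (coding over the input so that any sufficiently large subset of workers is enough), the following sketch uses a simple Reed-Solomon-style polynomial code for a matrix-vector product. It is not the secure generalized PolyDot construction of the dissertation and provides no secrecy. The function names, the evaluation points, and the toy matrix are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def encode(blocks, points):
    """Worker with evaluation point x receives sum_j blocks[j] * x**j."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [
        [[sum(Fraction(blk[r][c]) * x**j for j, blk in enumerate(blocks))
          for c in range(cols)] for r in range(rows)]
        for x in points
    ]


def solve(V, Y):
    """Gauss-Jordan elimination for V X = Y, where each row of Y is a vector."""
    n = len(V)
    aug = [[Fraction(a) for a in V[i]] + list(Y[i]) for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def decode(results, k):
    """Recover A @ b from any k worker outputs, given as {point: output}."""
    if len(results) < k:
        raise ValueError("fewer results than the recovery threshold")
    pts = list(results)[:k]
    V = [[Fraction(x) ** j for j in range(k)] for x in pts]  # Vandermonde matrix
    X = solve(V, [results[x] for x in pts])  # X[j] equals blocks[j] @ b
    return [y for block in X for y in block]


# Toy setup (assumed values): A is split into k = 2 row blocks, 4 workers.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
b = [1, -1]
k = 2
blocks = [A[0:2], A[2:4]]
points = [1, 2, 3, 4]  # distinct evaluation points, one per worker
outputs = {x: matvec(C, b) for x, C in zip(points, encode(blocks, points))}

# Any k of the 4 workers suffice, so up to 2 stragglers can be ignored.
for pair in combinations(points, k):
    assert decode({x: outputs[x] for x in pair}, k) == matvec(A, b)
```

In this sketch the recovery threshold is k = 2: the master can decode as soon as any two workers respond. Each worker returns a vector half the length of the full product, which is the download cost per worker here.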

    Providing Private and Fast Data Access for Cloud Systems

    Cloud storage and computing systems have become the backbone of many applications such as streaming (Netflix, YouTube), storage (Dropbox, Google Drive), and computing (Amazon Elastic Computing, Microsoft Azure). To address the ever-growing storage and computing demands of these applications, cloud services are typically implemented over a large-scale distributed data storage system. Cloud systems are expected to provide the following two pivotal services for the users: 1) private content access and 2) fast content access. The goal of this thesis is to understand and address some of the challenges that need to be overcome to provide these two services. The first part of this thesis focuses on private data access in distributed systems. In particular, we contribute to the areas of Private Information Retrieval (PIR) and Private Computation (PC). In the PIR problem, a user wishes to privately retrieve a subset of files belonging to a database stored on a single remote server or on multiple remote servers. In the PC problem, the user wants to privately compute functions of a subset of files in the database. The PIR and PC problems seek the most efficient solutions, i.e., those with the minimum download cost, that enable the user to retrieve or compute what it wants privately. We establish fundamental bounds on the minimum download cost required for guaranteeing the privacy requirement in some practical and realistic settings of the PIR and PC problems and develop novel and efficient privacy-preserving algorithms for these settings. In particular, we study the single-server and multi-server settings of PIR in which the user initially has a random linear combination of a subset of files in the database as side information, referred to as PIR with coded side information.
We also study the multi-server setting of the PC problem in which the user wants to privately compute multiple linear combinations of a subset of files in the database, referred to as Private Linear Transformation. The second part of this thesis focuses on fast content access in distributed systems. In particular, we study the use of erasure coding to handle data access requests in distributed storage and computing systems. The service rate region is an important performance metric for coded distributed systems; it is the set of all data access request rates that the system can serve simultaneously. In this context, two classes of problems arise: 1) characterizing the service rate region of a given storage scheme and finding the optimal request allocation, and 2) designing the underlying erasure code to handle a given desired service rate region. As contributions to the first class of problems, we characterize the service rate region of systems with some common coding schemes such as Simplex codes and Reed-Muller codes by introducing two novel techniques: 1) fractional matching and vertex cover on graph representations of codes, and 2) geometric representations of codes. Moreover, for the second class of problems, code design, we establish some lower bounds on the minimum storage required to handle a desired service rate region for a coded distributed system, and in some regimes we design efficient storage schemes that provide the desired service rate region while minimizing the storage requirements.
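
To make the PIR problem described in the abstract above concrete, here is a minimal sketch of the classical two-server XOR-based scheme for a replicated database with non-colluding servers. It is a textbook baseline, not one of the thesis's algorithms, and it uses no side information. The function names and the toy database are assumptions made for this example.

```python
import secrets


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(db, query):
    """A server returns the XOR of all files whose index is in the query set."""
    ans = bytes(len(db[0]))  # all-zero string of one file length
    for idx in query:
        ans = xor_bytes(ans, db[idx])
    return ans


def make_queries(n, wanted):
    """Build one query per server; each is a uniformly random subset on its own."""
    s1 = {j for j in range(n) if secrets.randbits(1)}
    s2 = s1 ^ {wanted}  # symmetric difference: toggle the wanted index
    return s1, s2


def retrieve(db, wanted):
    """Simulate both servers locally and combine their answers."""
    q1, q2 = make_queries(len(db), wanted)
    # Every file except the wanted one appears in both or neither answer,
    # so it cancels out under XOR.
    return xor_bytes(server_answer(db, q1), server_answer(db, q2))


# Toy database (assumed): equal-length files replicated on both servers.
files = [b"alpha", b"bravo", b"gamma", b"delta"]
assert retrieve(files, 2) == files[2]
```

The user downloads two file-lengths of data to obtain one file. Neither server alone learns the wanted index, because the query each one sees is a uniformly random subset of indices. The guarantee fails if the two servers compare their queries.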