1,428 research outputs found

    Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy

    Get PDF
    We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to 13.43×13.43\times, and also achieves a 2.36×2.36\times-12.65×12.65\times speedup over the state-of-the-art straggler mitigation strategies

    How to Securely Compute the Modulo-Two Sum of Binary Sources

    Full text link
    In secure multiparty computation, mutually distrusting users in a network want to collaborate to compute functions of data which is distributed among the users. The users should not learn any additional information about the data of others than what they may infer from their own data and the functions they are computing. Previous works have mostly considered the worst case context (i.e., without assuming any distribution for the data); Lee and Abbe (2014) is a notable exception. Here, we study the average case (i.e., we work with a distribution on the data) where correctness and privacy is only desired asymptotically. For concreteness and simplicity, we consider a secure version of the function computation problem of K\"orner and Marton (1979) where two users observe a doubly symmetric binary source with parameter p and the third user wants to compute the XOR. We show that the amount of communication and randomness resources required depends on the level of correctness desired. When zero-error and perfect privacy are required, the results of Data et al. (2014) show that it can be achieved if and only if a total rate of 1 bit is communicated between every pair of users and private randomness at the rate of 1 is used up. In contrast, we show here that, if we only want the probability of error to vanish asymptotically in block length, it can be achieved by a lower rate (binary entropy of p) for all the links and for private randomness; this also guarantees perfect privacy. We also show that no smaller rates are possible even if privacy is only required asymptotically.Comment: 6 pages, 1 figure, extended version of submission to IEEE Information Theory Workshop, 201

    Finding the Median (Obliviously) with Bounded Space

    Full text link
    We prove that any oblivious algorithm using space SS to find the median of a list of nn integers from {1,...,2n}\{1,...,2n\} requires time Ω(nloglogSn)\Omega(n \log\log_S n). This bound also applies to the problem of determining whether the median is odd or even. It is nearly optimal since Chan, following Munro and Raman, has shown that there is a (randomized) selection algorithm using only ss registers, each of which can store an input value or O(logn)O(\log n)-bit counter, that makes only O(loglogsn)O(\log\log_s n) passes over the input. The bound also implies a size lower bound for read-once branching programs computing the low order bit of the median and implies the analog of PNPcoNPP \ne NP \cap coNP for length o(nloglogn)o(n \log\log n) oblivious branching programs

    Learning to Prune: Speeding up Repeated Computations

    Get PDF
    It is common to encounter situations where one must solve a sequence of similar computational problems. Running a standard algorithm with worst-case runtime guarantees on each instance will fail to take advantage of valuable structure shared across the problem instances. For example, when a commuter drives from work to home, there are typically only a handful of routes that will ever be the shortest path. A naive algorithm that does not exploit this common structure may spend most of its time checking roads that will never be in the shortest path. More generally, we can often ignore large swaths of the search space that will likely never contain an optimal solution. We present an algorithm that learns to maximally prune the search space on repeated computations, thereby reducing runtime while provably outputting the correct solution each period with high probability. Our algorithm employs a simple explore-exploit technique resembling those used in online algorithms, though our setting is quite different. We prove that, with respect to our model of pruning search spaces, our approach is optimal up to constant factors. Finally, we illustrate the applicability of our model and algorithm to three classic problems: shortest-path routing, string search, and linear programming. We present experiments confirming that our simple algorithm is effective at significantly reducing the runtime of solving repeated computations
    corecore