79,987 research outputs found

    MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture

    Get PDF
    Massively Parallel Computation (MPC) is a model of computation widely believed to best capture realistic parallel computing architectures such as large-scale MapReduce and Hadoop clusters. Motivated by the fact that many data analytics tasks performed on these platforms involve sensitive user data, we initiate the theoretical exploration of how to leverage MPC architectures to enable efficient, privacy-preserving computation over massive data. Clearly if a computation task does not lend itself to an efficient implementation on MPC even without security, then we cannot hope to compute it efficiently on MPC with security. We show, on the other hand, that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we show the following results: - any MPC algorithm can be compiled to a communication-oblivious counterpart while asymptotically preserving its round and space complexity, where communication-obliviousness ensures that any network intermediary observing the communication patterns learn no information about the secret inputs; - assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against an adversary who controls not only intermediate network routers but additionally up to 1/3 - ? fraction of machines (for an arbitrarily small constant ?) - moreover, this compilation preserves the round complexity tightly, and preserves the space complexity upto a multiplicative security parameter related blowup. As an initial exploration of this important direction, our work suggests new definitions and proposes novel protocols that blend algorithmic and cryptographic techniques

    Black-Box Parallelization for Machine Learning

    Get PDF
    The landscape of machine learning applications is changing rapidly: large centralized datasets are replaced by high volume, high velocity data streams generated by a vast number of geographically distributed, loosely connected devices, such as mobile phones, smart sensors, autonomous vehicles or industrial machines. Current learning approaches centralize the data and process it in parallel in a cluster or computing center. This has three major disadvantages: (i) it does not scale well with the number of data-generating devices since their growth exceeds that of computing centers, (ii) the communication costs for centralizing the data are prohibitive in many applications, and (iii) it requires sharing potentially privacy-sensitive data. Pushing computation towards the data-generating devices alleviates these problems and allows to employ their otherwise unused computing power. However, current parallel learning approaches are designed for tightly integrated systems with low latency and high bandwidth, not for loosely connected distributed devices. Therefore, I propose a new paradigm for parallelization that treats the learning algorithm as a black box, training local models on distributed devices and aggregating them into a single strong one. Since this requires only exchanging models instead of actual data, the approach is highly scalable, communication-efficient, and privacy-preserving. Following this paradigm, this thesis develops black-box parallelizations for two broad classes of learning algorithms. One approach can be applied to incremental learning algorithms, i.e., those that improve a model in iterations. Based on the utility of aggregations it schedules communication dynamically, adapting it to the hardness of the learning problem. In practice, this leads to a reduction in communication by orders of magnitude. It is analyzed for (i) online learning, in particular in the context of in-stream learning, which allows to guarantee optimal regret and for (ii) batch learning based on empirical risk minimization where optimal convergence can be guaranteed. The other approach is applicable to non-incremental algorithms as well. It uses a novel aggregation method based on the Radon point that allows to achieve provably high model quality with only a single aggregation. This is achieved in polylogarithmic runtime on quasi-polynomially many processors. This relates parallel machine learning to Nick's class of parallel decision problems and is a step towards answering a fundamental open problem about the abilities and limitations of efficient parallel learning algorithms. An empirical study on real distributed systems confirms the potential of the approaches in realistic application scenarios

    Distributed Training Large-Scale Deep Architectures

    Full text link
    Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this paper, we focus on employing the system approach to speed up large-scale training. Via lessons learned from our routine benchmarking effort, we first identify bottlenecks and overheads that hinter data parallelism. We then devise guidelines that help practitioners to configure an effective system and fine-tune parameters to achieve desired speedup. Specifically, we develop a procedure for setting minibatch size and choosing computation algorithms. We also derive lemmas for determining the quantity of key components such as the number of GPUs and parameter servers. Experiments and examples show that these guidelines help effectively speed up large-scale deep learning training

    Cloudbus Toolkit for Market-Oriented Cloud Computing

    Full text link
    This keynote paper: (1) presents the 21st century vision of computing and identifies various IT paradigms promising to deliver computing as a utility; (2) defines the architecture for creating market-oriented Clouds and computing atmosphere by leveraging technologies such as virtual machines; (3) provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; (4) presents the work carried out as part of our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a Service software system containing SDK (Software Development Kit) for construction of Cloud applications and deployment on private or public Clouds, in addition to supporting market-oriented resource management; (ii) internetworking of Clouds for dynamic creation of federated computing environments for scaling of elastic applications; (iii) creation of 3rd party Cloud brokering services for building content delivery networks and e-Science applications and their deployment on capabilities of IaaS providers such as Amazon along with Grid mashups; (iv) CloudSim supporting modelling and simulation of Clouds for performance studies; (v) Energy Efficient Resource Allocation Mechanisms and Techniques for creation and management of Green Clouds; and (vi) pathways for future research.Comment: 21 pages, 6 figures, 2 tables, Conference pape
    • …
    corecore