194 research outputs found

    MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture

    Get PDF
    Massively Parallel Computation (MPC) is a model of computation widely believed to best capture realistic parallel computing architectures such as large-scale MapReduce and Hadoop clusters. Motivated by the fact that many data analytics tasks performed on these platforms involve sensitive user data, we initiate the theoretical exploration of how to leverage MPC architectures to enable efficient, privacy-preserving computation over massive data. Clearly if a computation task does not lend itself to an efficient implementation on MPC even without security, then we cannot hope to compute it efficiently on MPC with security. We show, on the other hand, that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we show the following results: - any MPC algorithm can be compiled to a communication-oblivious counterpart while asymptotically preserving its round and space complexity, where communication-obliviousness ensures that any network intermediary observing the communication patterns learn no information about the secret inputs; - assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against an adversary who controls not only intermediate network routers but additionally up to 1/3 - ? fraction of machines (for an arbitrarily small constant ?) - moreover, this compilation preserves the round complexity tightly, and preserves the space complexity upto a multiplicative security parameter related blowup. As an initial exploration of this important direction, our work suggests new definitions and proposes novel protocols that blend algorithmic and cryptographic techniques

    Secure Merge with O(n log log n) Secure Operations

    Get PDF
    Data-oblivious algorithms are a key component of many secure computation protocols. In this work, we show that advances in secure multiparty shuffling algorithms can be used to increase the efficiency of several key cryptographic tools. The key observation is that many secure computation protocols rely heavily on secure shuffles. The best data-oblivious shuffling algorithms require O(nlogn)O(n \log n), operations, but in the two-party or multiparty setting, secure shuffling can be achieved with only O(n)O(n) communication. Leveraging the efficiency of secure multiparty shuffling, we give novel algorithms that improve the efficiency of securely sorting sparse lists, secure stable compaction, and securely merging two sorted lists. Securely sorting private lists is a key component of many larger secure computation protocols. The best data-oblivious sorting algorithms for sorting a list of nn elements require O(nlogn)O(n \log n) comparisons. Using black-box access to a linear-communication secure shuffle, we give a secure algorithm for sorting a list of length nn with tnt \ll n nonzero elements with communication O(tlog2n+n)O(t \log^2 n + n), which beats the best oblivious algorithms when the number of nonzero elements, tt, satisfies t<n/log2nt < n/\log^2 n. Secure compaction is the problem of removing dummy elements from a list, and is essentially equivalent to sorting on 1-bit keys. The best oblivious compaction algorithms run in O(n)O(n)-time, but they are unstable, i.e., the order of the remaining elements is not preserved. Using black-box access to a linear-communication secure shuffle, we give a stable compaction algorithm with only O(n)O(n) communication. Our main result is a novel secure merge protocol. The best previous algorithms for securely merging two sorted lists into a sorted whole required O(nlogn)O(n \log n) secure operations. Using black-box access to an O(n)O(n)-communication secure shuffle, we give the first secure merge algorithm that requires only O(nloglogn)O(n \log \log n) communication. Our algorithm takes as input nn secret-shared values, and outputs a secret-sharing of the sorted list. All our algorithms are generic, i.e., they can be implemented using generic secure computations techniques and make black-box access to a secure shuffle. Our techniques extend naturally to the multiparty situation (with a constant number of parties) as well as to handle malicious adversaries without changing the asymptotic efficiency. These algorithm have applications to securely computing database joins and order statistics on private data as well as multiparty Oblivious RAM protocols

    Assorted algorithms and protocols for secure computation

    Get PDF

    Assorted algorithms and protocols for secure computation

    Get PDF

    Scalable and Robust Distributed Algorithms for Privacy-Preserving Applications

    Get PDF
    We live in an era when political and commercial entities are increasingly engaging in sophisticated cyber attacks to damage, disrupt, or censor information content and to conduct mass surveillance. By compiling various patterns from user data over time, untrusted parties could create an intimate picture of sensitive personal information such as political and religious beliefs, health status, and so forth. In this dissertation, we study scalable and robust distributed algorithms that guarantee user privacy when communicating with other parties to either solely exchange information or participate in multi-party computations. We consider scalability and robustness requirements in three privacy-preserving areas: secure multi-party computation (MPC), anonymous broadcast, and blocking-resistant Tor bridge distribution. We propose decentralized algorithms for MPC that, unlike most previous work, scale well with the number of parties and tolerate malicious faults from a large fraction of the parties. Our algorithms do not require any trusted party and are fully load-balanced. Anonymity is an essential tool for achieving privacy; it enables individuals to communicate with each other without being identified as the sender or the receiver of the information being exchanged. We show that our MPC algorithms can be effectively used to design a scalable anonymous broadcast protocol. We do this by developing a multi-party shuffling protocol that can efficiently anonymize a sequence of messages in the presence of many faulty nodes. Our final approach for preserving user privacy in cyberspace is to improve Tor; the most popular anonymity network in the Internet. A current challenge with Tor is that colluding corrupt users inside a censorship territory can completely block user\u27s access to Tor by obtaining information about a large fraction of Tor bridges; a type of relay nodes used as the Tor\u27s primary mechanism for blocking-resistance. We describe a randomized bridge distribution algorithm, where all honest users are guaranteed to connect to Tor in the presence of an adversary corrupting an unknown number of users. Our simulations suggest that, with minimal resource costs, our algorithm can guarantee Tor access for all honest users after a small (logarithmic) number of rounds

    Secure Computation Protocols for Privacy-Preserving Machine Learning

    Get PDF
    Machine Learning (ML) profitiert erheblich von der Verfügbarkeit großer Mengen an Trainingsdaten, sowohl im Bezug auf die Anzahl an Datenpunkten, als auch auf die Anzahl an Features pro Datenpunkt. Es ist allerdings oft weder möglich, noch gewollt, mehr Daten unter zentraler Kontrolle zu aggregieren. Multi-Party-Computation (MPC)-Protokolle stellen eine Lösung dieses Dilemmas in Aussicht, indem sie es mehreren Parteien erlauben, ML-Modelle auf der Gesamtheit ihrer Daten zu trainieren, ohne die Eingabedaten preiszugeben. Generische MPC-Ansätze bringen allerdings erheblichen Mehraufwand in der Kommunikations- und Laufzeitkomplexität mit sich, wodurch sie sich nur beschränkt für den Einsatz in der Praxis eignen. Das Ziel dieser Arbeit ist es, Privatsphäreerhaltendes Machine Learning mittels MPC praxistauglich zu machen. Zuerst fokussieren wir uns auf zwei Anwendungen, lineare Regression und Klassifikation von Dokumenten. Hier zeigen wir, dass sich der Kommunikations- und Rechenaufwand erheblich reduzieren lässt, indem die aufwändigsten Teile der Berechnung durch Sub-Protokolle ersetzt werden, welche auf die Zusammensetzung der Parteien, die Verteilung der Daten, und die Zahlendarstellung zugeschnitten sind. Insbesondere das Ausnutzen dünnbesetzter Datenrepräsentationen kann die Effizienz der Protokolle deutlich verbessern. Diese Beobachtung verallgemeinern wir anschließend durch die Entwicklung einer Datenstruktur für solch dünnbesetzte Daten, sowie dazugehöriger Zugriffsprotokolle. Aufbauend auf dieser Datenstruktur implementieren wir verschiedene Operationen der Linearen Algebra, welche in einer Vielzahl von Anwendungen genutzt werden. Insgesamt zeigt die vorliegende Arbeit, dass MPC ein vielversprechendes Werkzeug auf dem Weg zu Privatsphäre-erhaltendem Machine Learning ist, und die von uns entwickelten Protokolle stellen einen wesentlichen Schritt in diese Richtung dar.Machine learning (ML) greatly benefits from the availability of large amounts of training data, both in terms of the number of samples, and the number of features per sample. However, aggregating more data under centralized control is not always possible, nor desirable, due to security and privacy concerns, regulation, or competition. Secure multi-party computation (MPC) protocols promise a solution to this dilemma, allowing multiple parties to train ML models on their joint datasets while provably preserving the confidentiality of the inputs. However, generic approaches to MPC result in large computation and communication overheads, which limits the applicability in practice. The goal of this thesis is to make privacy-preserving machine learning with secure computation practical. First, we focus on two high-level applications, linear regression and document classification. We show that communication and computation overhead can be greatly reduced by identifying the costliest parts of the computation, and replacing them with sub-protocols that are tailored to the number and arrangement of parties, the data distribution, and the number representation used. One of our main findings is that exploiting sparsity in the data representation enables considerable efficiency improvements. We go on to generalize this observation, and implement a low-level data structure for sparse data, with corresponding secure access protocols. On top of this data structure, we develop several linear algebra algorithms that can be used in a wide range of applications. Finally, we turn to improving a cryptographic primitive named vector-OLE, for which we propose a novel protocol that helps speed up a wide range of secure computation tasks, within private machine learning and beyond. Overall, our work shows that MPC indeed offers a promising avenue towards practical privacy-preserving machine learning, and the protocols we developed constitute a substantial step in that direction

    Privet: A Privacy-Preserving Vertical Federated Learning Service for Gradient Boosted Decision Tables

    Full text link
    Vertical federated learning (VFL) has recently emerged as an appealing distributed paradigm empowering multi-party collaboration for training high-quality models over vertically partitioned datasets. Gradient boosting has been popularly adopted in VFL, which builds an ensemble of weak learners (typically decision trees) to achieve promising prediction performance. Recently there have been growing interests in using decision table as an intriguing alternative weak learner in gradient boosting, due to its simpler structure, good interpretability, and promising performance. In the literature, there have been works on privacy-preserving VFL for gradient boosted decision trees, but no prior work has been devoted to the emerging case of decision tables. Training and inference on decision tables are different from that the case of generic decision trees, not to mention gradient boosting with decision tables in VFL. In light of this, we design, implement, and evaluate Privet, the first system framework enabling privacy-preserving VFL service for gradient boosted decision tables. Privet delicately builds on lightweight cryptography and allows an arbitrary number of participants holding vertically partitioned datasets to securely train gradient boosted decision tables. Extensive experiments over several real-world datasets and synthetic datasets demonstrate that Privet achieves promising performance, with utility comparable to plaintext centralized learning.Comment: Accepted in IEEE Transactions on Services Computing (TSC
    corecore