260 research outputs found

    Secure Computation Protocols for Privacy-Preserving Machine Learning

    Get PDF
    Machine Learning (ML) profitiert erheblich von der Verfügbarkeit großer Mengen an Trainingsdaten, sowohl im Bezug auf die Anzahl an Datenpunkten, als auch auf die Anzahl an Features pro Datenpunkt. Es ist allerdings oft weder möglich, noch gewollt, mehr Daten unter zentraler Kontrolle zu aggregieren. Multi-Party-Computation (MPC)-Protokolle stellen eine Lösung dieses Dilemmas in Aussicht, indem sie es mehreren Parteien erlauben, ML-Modelle auf der Gesamtheit ihrer Daten zu trainieren, ohne die Eingabedaten preiszugeben. Generische MPC-Ansätze bringen allerdings erheblichen Mehraufwand in der Kommunikations- und Laufzeitkomplexität mit sich, wodurch sie sich nur beschränkt für den Einsatz in der Praxis eignen. Das Ziel dieser Arbeit ist es, Privatsphäreerhaltendes Machine Learning mittels MPC praxistauglich zu machen. Zuerst fokussieren wir uns auf zwei Anwendungen, lineare Regression und Klassifikation von Dokumenten. Hier zeigen wir, dass sich der Kommunikations- und Rechenaufwand erheblich reduzieren lässt, indem die aufwändigsten Teile der Berechnung durch Sub-Protokolle ersetzt werden, welche auf die Zusammensetzung der Parteien, die Verteilung der Daten, und die Zahlendarstellung zugeschnitten sind. Insbesondere das Ausnutzen dünnbesetzter Datenrepräsentationen kann die Effizienz der Protokolle deutlich verbessern. Diese Beobachtung verallgemeinern wir anschließend durch die Entwicklung einer Datenstruktur für solch dünnbesetzte Daten, sowie dazugehöriger Zugriffsprotokolle. Aufbauend auf dieser Datenstruktur implementieren wir verschiedene Operationen der Linearen Algebra, welche in einer Vielzahl von Anwendungen genutzt werden. Insgesamt zeigt die vorliegende Arbeit, dass MPC ein vielversprechendes Werkzeug auf dem Weg zu Privatsphäre-erhaltendem Machine Learning ist, und die von uns entwickelten Protokolle stellen einen wesentlichen Schritt in diese Richtung dar.Machine learning (ML) greatly benefits from the availability of large amounts of training data, both in terms of the number of samples, and the number of features per sample. However, aggregating more data under centralized control is not always possible, nor desirable, due to security and privacy concerns, regulation, or competition. Secure multi-party computation (MPC) protocols promise a solution to this dilemma, allowing multiple parties to train ML models on their joint datasets while provably preserving the confidentiality of the inputs. However, generic approaches to MPC result in large computation and communication overheads, which limits the applicability in practice. The goal of this thesis is to make privacy-preserving machine learning with secure computation practical. First, we focus on two high-level applications, linear regression and document classification. We show that communication and computation overhead can be greatly reduced by identifying the costliest parts of the computation, and replacing them with sub-protocols that are tailored to the number and arrangement of parties, the data distribution, and the number representation used. One of our main findings is that exploiting sparsity in the data representation enables considerable efficiency improvements. We go on to generalize this observation, and implement a low-level data structure for sparse data, with corresponding secure access protocols. On top of this data structure, we develop several linear algebra algorithms that can be used in a wide range of applications. Finally, we turn to improving a cryptographic primitive named vector-OLE, for which we propose a novel protocol that helps speed up a wide range of secure computation tasks, within private machine learning and beyond. Overall, our work shows that MPC indeed offers a promising avenue towards practical privacy-preserving machine learning, and the protocols we developed constitute a substantial step in that direction

    Implementation of a Secure Multiparty Computation Protocol

    Get PDF
    Secure multiparty computation (SMC) allows a set of parties to jointly compute a function on private inputs such that, they learn only the output of the function, and the correctness of the output is guaranteed even when a subset of the parties is controlled by an adversary. SMC allows data to be kept in an uncompromisable form and still be useful, and it also gives new meaning to data ownership, allowing data to be shared in a useful way while retaining its privacy. Thus, applications of SMC hold promise for addressing some of the security issues information-driven societies struggle with. In this thesis, we implement two SMC protocols. Our primary objective is to gain a solid understanding of the basic concepts related to SMC. We present a brief survey of the field, with focus on SMC based on secret sharing. In addition to the protocol im- plementations, we implement circuit randomization, a common technique for efficiency improvement. The implemented protocols are run on a simulator to securely evaluate some simple arithmetic functions, and the round complexities of the implemented protocols are compared. Finally, we attempt to extend the implementation to support more general computations

    Secure Multi-Party Computation In Practice

    Get PDF
    Secure multi-party computation (MPC) is a cryptographic primitive for computing on private data. MPC provides strong privacy guarantees, but practical adoption requires high-quality application design, software development, and resource management. This dissertation aims to identify and reduce barriers to practical deployment of MPC applications. First, the dissertation evaluates the design, capabilities, and usability of eleven state-of-the-art MPC software frameworks. These frameworks are essential for prototyping MPC applications, but their qualities vary widely; the survey provides insight into their current abilities and limitations. A comprehensive online repository augments the survey, including complete build environments, sample programs, and additional documentation for each framework. Second, the dissertation applies these lessons in two practical applications of MPC. The first addresses algorithms for assessing stability in financial networks, traditionally designed in a full-information model with a central regulator or data aggregator. This case study describes principles to transform two such algorithms into data-oblivious versions and benchmark their execution under MPC using three frameworks. The second aims to enable unlinkability of payments made with blockchain-based cryptocurrencies. This study uses MPC in conjunction with other privacy techniques to achieve unlinkability in payment channels. Together, these studies illuminate the limitations of existing software, develop guidelines for transforming non-private algorithms into versions suitable for execution under MPC, and illustrate the current practical feasibility of MPC as a solution to a wide variety of applications

    Turvalisel ühisarvutusel põhinev privaatsust säilitav statistiline analüüs

    Get PDF
    Väitekirja elektrooniline versioon ei sisalda publikatsioone.Kaasaegses ühiskonnas luuakse inimese kohta digitaalne kirje kohe pärast tema sündi. Sellest hetkest alates jälgitakse tema käitumist ning kogutakse andmeid erinevate eluvaldkondade kohta. Kui kasutate poes kliendikaarti, käite arsti juures, täidate maksudeklaratsiooni või liigute lihtsalt ringi mobiiltelefoni taskus kandes, koguvad ning salvestavad firmad ja riigiasutused teie tundlikke andmeid. Vahel anname selliseks jälitustegevuseks vabatahtlikult loa, et saada mingit kasu. Näiteks võime saada soodustust, kui kasutame kliendikaarti. Teinekord on meil vaja teha keeruline otsus, kas loobuda võimalusest teha mobiiltelefonikõnesid või lubada enda jälgimine mobiilimastide kaudu edastatava info abil. Riigiasutused haldavad infot meie tervise, hariduse ja sissetulekute kohta, et meid paremini ravida, harida ja meilt makse koguda. Me loodame, et meie andmeid kasutatakse mõistlikult, aga samas eeldame, et meie privaatsus on tagatud. Käesolev töö uurib, kuidas teostada statistilist analüüsi nii, et tagada üksikisiku privaatsus. Selle eesmärgi saavutamiseks kasutame turvalist ühisarvutust. See krüptograafiline meetod lubab analüüsida andmeid nii, et üksikuid väärtuseid ei ole kunagi võimalik näha. Hoolimata sellest, et turvalise ühisarvutuse kasutamine on aeganõudev protsess, näitame, et see on piisavalt kiire ja seda on võimalik kasutada isegi väga suurte andmemahtude puhul. Me oleme teinud võimalikuks populaarseimate statistilise analüüsi meetodite kasutamise turvalise ühisarvutuse kontekstis. Me tutvustame privaatsust säilitavat statistilise analüüsi tööriista Rmind, mis sisaldab kõiki töö käigus loodud funktsioone. Rmind sarnaneb tööriistadele, millega statistikud on harjunud. See lubab neil viia läbi uuringuid ilma, et nad peaksid üksikasjalikult tundma allolevaid krüptograafilisi protokolle. Kasutame dissertatsioonis kirjeldatud meetodeid, et valmistada ette statistiline uuring, mis ühendab kaht Eesti riiklikku andmekogu. Uuringu eesmärk on teada saada, kas Eesti tudengid, kes töötavad ülikooliõpingute ajal, lõpetavad nominaalajaga väiksema tõenäosusega kui nende õpingutele keskenduvad kaaslased.In a modern society, from the moment a person is born, a digital record is created. From there on, the person’s behaviour is constantly tracked and data are collected about the different aspects of his or her life. Whether one is swiping a customer loyalty card in a store, going to the doctor, doing taxes or simply moving around with a mobile phone in one’s pocket, sensitive data are being gathered and stored by governments and companies. Sometimes, we give our permission for this kind of surveillance for some benefit. For instance, we could get a discount using a customer loyalty card. Other times we have a difficult choice – either we cannot make phone calls or our movements are tracked based on cellular data. The government tracks information about our health, education and income to cure us, educate us and collect taxes. We hope that the data are used in a meaningful way, however, we also have an expectation of privacy. This work focuses on how to perform statistical analyses in a way that preserves the privacy of the individual. To achieve this goal, we use secure multi-­‐party computation. This cryptographic technique allows data to be analysed without seeing the individual values. Even though using secure multi-­‐party computation is a time-­‐consuming process, we show that it is feasible even for large-­‐scale databases. We have developed ways for using the most popular statistical analysis methods with secure multi-­‐party computation. We introduce a privacy-­‐preserving statistical analysis tool called Rmind that contains all of our resulting implementations. Rmind is similar to tools that statistical analysts are used to. This allows them to carry out studies on the data without having to know the details of the underlying cryptographic protocols. The methods described in the thesis are used in practice to prepare for running a statistical study on large-­‐scale real-­‐life data to find out whether Estonian students who are working during university studies are less likely to graduate in nominal time

    Information-Theoretic Secure Outsourced Computation in Distributed Systems

    Get PDF
    Secure multi-party computation (secure MPC) has been established as the de facto paradigm for protecting privacy in distributed computation. One of the earliest secure MPC primitives is the Shamir\u27s secret sharing (SSS) scheme. SSS has many advantages over other popular secure MPC primitives like garbled circuits (GC) -- it provides information-theoretic security guarantee, requires no complex long-integer operations, and often leads to more efficient protocols. Nonetheless, SSS receives less attention in the signal processing community because SSS requires a larger number of honest participants, making it prone to collusion attacks. In this dissertation, I propose an agent-based computing framework using SSS to protect privacy in distributed signal processing. There are three main contributions to this dissertation. First, the proposed computing framework is shown to be significantly more efficient than GC. Second, a novel game-theoretical framework is proposed to analyze different types of collusion attacks. Third, using the proposed game-theoretical framework, specific mechanism designs are developed to deter collusion attacks in a fully distributed manner. Specifically, for a collusion attack with known detectors, I analyze it as games between secret owners and show that the attack can be effectively deterred by an explicit retaliation mechanism. For a general attack without detectors, I expand the scope of the game to include the computing agents and provide deterrence through deceptive collusion requests. The correctness and privacy of the protocols are proved under a covert adversarial model. Our experimental results demonstrate the efficiency of SSS-based protocols and the validity of our mechanism design
    corecore