139 research outputs found

    Secure Computation Protocols for Privacy-Preserving Machine Learning

    Machine learning (ML) greatly benefits from the availability of large amounts of training data, both in terms of the number of samples, and the number of features per sample. However, aggregating more data under centralized control is not always possible, nor desirable, due to security and privacy concerns, regulation, or competition. Secure multi-party computation (MPC) protocols promise a solution to this dilemma, allowing multiple parties to train ML models on their joint datasets while provably preserving the confidentiality of the inputs. However, generic approaches to MPC result in large computation and communication overheads, which limits their applicability in practice. The goal of this thesis is to make privacy-preserving machine learning with secure computation practical. First, we focus on two high-level applications, linear regression and document classification. We show that communication and computation overhead can be greatly reduced by identifying the costliest parts of the computation, and replacing them with sub-protocols that are tailored to the number and arrangement of parties, the data distribution, and the number representation used. One of our main findings is that exploiting sparsity in the data representation enables considerable efficiency improvements. We go on to generalize this observation, and implement a low-level data structure for sparse data, with corresponding secure access protocols. On top of this data structure, we develop several linear algebra algorithms that can be used in a wide range of applications.
Finally, we turn to improving a cryptographic primitive named vector-OLE, for which we propose a novel protocol that helps speed up a wide range of secure computation tasks, within private machine learning and beyond. Overall, our work shows that MPC indeed offers a promising avenue towards practical privacy-preserving machine learning, and the protocols we developed constitute a substantial step in that direction.
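The benefit of secret-sharing only the nonzero entries of a sparse vector can be sketched in a few lines. This is an illustrative toy (assumed two-party additive sharing over a ring mod 2^32, with public nonzero indices), not the thesis's actual protocols, whose access protocols also hide which entries are nonzero:

```python
import random

Q = 2**32  # ring modulus; an illustrative choice, not from the thesis

def share(value, n_parties=2):
    """Additively secret-share a value: the shares sum to value mod Q."""
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Recombine additive shares."""
    return sum(shares) % Q

def share_sparse(dense_vector):
    """Share only the nonzero entries of a vector. Here the indices stay
    public; a real sparse data structure would also protect access patterns."""
    return {i: share(v) for i, v in enumerate(dense_vector) if v != 0}

shared = share_sparse([0, 0, 7, 0, 5])  # two shared entries instead of five
recovered = {i: reconstruct(s) for i, s in shared.items()}
```

The cost of communicating shares scales with the number of nonzeros rather than the full dimension, which is the source of the efficiency gains described above.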

    Improved security and privacy preservation for biometric hashing

    In this thesis, we address improving the verification performance, as well as the security and privacy aspects, of biohashing methods. We propose various methods to increase the verification performance of random-projection-based biohashing systems. First, we introduce a new biohashing method based on an optimal linear transform, which seeks to find a better projection matrix. Second, we propose another biohashing method based on a discriminative projection selection technique that selects the rows of the random projection matrix using the Fisher criterion. Third, we introduce a new quantization method that attempts to optimize biohashes using ideas from the diversification of error-correcting output code classifiers. Simulation results show that the introduced methods improve the verification performance of biohashing. We also consider various security and privacy attack scenarios for biohashing methods, and propose new attack methods based on minimum l1- and l2-norm reconstructions. The results of these attacks show that biohashing is vulnerable to such attacks and that better template protection methods are necessary. Therefore, we propose an identity verification system with new enrollment and authentication protocols based on threshold homomorphic encryption. The system can be used with any biometric modality and feature extraction method whose output templates can be binarized, so it is not limited to biohashing. Our analysis shows that the introduced system is robust against most security and privacy attacks conceived in the literature. In addition, a straightforward implementation of its authentication protocol is sufficiently fast for use in real applications.
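A minimal sketch of the random-projection biohashing baseline that these methods build on (assumed toy parameters; the token-derived projection matrix is what makes the template revocable):

```python
import random

def biohash(features, projection, threshold=0.0):
    """Project the feature vector onto each (token-derived) random row
    and binarize against a threshold: one bit per projection."""
    return [1 if sum(f * w for f, w in zip(features, row)) > threshold else 0
            for row in projection]

# The user's secret token seeds the random projection matrix.
token_rng = random.Random(42)
projection = [[token_rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]

enrolled = biohash([0.9, 0.1, -0.3, 0.5], projection)
probe = biohash([0.8, 0.15, -0.25, 0.55], projection)  # noisy re-sample
distance = sum(a != b for a, b in zip(enrolled, probe))  # Hamming distance
```

Verification accepts when the Hamming distance is below a tolerance; the projection-selection and quantization methods above improve how well this distance separates genuine from impostor comparisons.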

    ์žก์Œํ‚ค๋ฅผ ๊ฐ€์ง€๋Š” ์‹ ์›๊ธฐ๋ฐ˜ ๋™ํ˜•์•”ํ˜ธ์— ๊ด€ํ•œ ์—ฐ๊ตฌ

    Ph.D. thesis, Seoul National University, Department of Mathematical Sciences, 2020.
    Secure data analysis delegation on the cloud is one of the most powerful applications that homomorphic encryption (HE) can bring. As the technical level of HE arrives at a practical regime, this model is being considered as a more serious and realistic paradigm. This increasing attention calls for a more versatile and secure model to deal with much more complicated real-world problems. First, since real-world modeling involves a number of data owners and clients, authorized control of data access is still required even in the HE scenario. Second, we note that although homomorphic operations require no secret key, decryption does; the secret-key management concern thus remains even for HE. Last, from a more fundamental viewpoint, we thoroughly analyze the concrete hardness of the base problem of HE, the so-called Learning With Errors (LWE) problem. In fact, for the sake of efficiency, HE exploits a weaker variant of LWE whose security is believed not to be fully understood. For efficiency in the data encryption phase, we improve the previously suggested NTRU-lattice ID-based encryption by generalizing the NTRU concept into the module-NTRU lattice. Moreover, we design a novel method that decrypts the resulting ciphertext with a noisy key. This enables the decryptor to use its own noisy source, in particular biometrics, and hence fundamentally solves the key management problem.
Finally, by considering further improvements to existing LWE-solving algorithms, we propose new algorithms that show much faster performance. Consequently, we argue that the HE parameter choices should be updated with respect to our attacks in order to maintain the currently claimed security level.
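The LWE problem underlying these security estimates is easy to state concretely. A toy sample generator (parameters chosen for readability, far too small to be secure):

```python
import random

def lwe_sample(secret, q=257, err_bound=2, rng=random):
    """Return one LWE sample (a, b) with b = <a, s> + e (mod q) for a
    small error e. Recovering s from many such samples is the hard
    problem; attacks trade off dimension against error size."""
    a = [rng.randrange(q) for _ in range(len(secret))]
    e = rng.randrange(-err_bound, err_bound + 1)
    b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
    return a, b

secret = [3, 1, 4, 1, 5, 9]
a, b = lwe_sample(secret)
# With the secret in hand, the residual is just the small error e mod q;
# without it, b looks uniform.
residual = (b - sum(ai * si for ai, si in zip(a, secret))) % 257
```

The "weaker variant" mentioned above corresponds to choosing such small errors (and sparse or small secrets) for efficiency, which is precisely what the attacks in the thesis exploit.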

    Privacy-aware Security Applications in the Era of Internet of Things

    In this dissertation, we introduce several novel privacy-aware security applications, split into three main categories. First, to strengthen current authentication mechanisms, we designed two novel privacy-aware alternative and complementary authentication mechanisms: Continuous Authentication (CA) and Multi-Factor Authentication (MFA). Our first system is Wearable-Assisted Continuous Authentication (WACA), in which we used sensor data collected from a wrist-worn device to authenticate users continuously. We then improved WACA by integrating a noise-tolerant template matching technique called NTT-Sec to make it privacy-aware, as the collected data can be sensitive. We also designed a novel, lightweight, Privacy-Aware Continuous Authentication (PACA) protocol. PACA is easily applicable to other biometric authentication mechanisms when feature vectors are represented as fixed-length real-valued vectors. In addition to CA, we introduced a privacy-aware multi-factor authentication method called PINTA. In PINTA, we used fuzzy hashing and homomorphic encryption mechanisms to protect users' sensitive profiles while providing privacy-preserving authentication. As the second privacy-aware contribution, we designed a multi-stage privacy attack on smart home users that uses the wireless network traffic generated during device communication. The attack works even on encrypted data, as it uses only the metadata of the network traffic. Moreover, we designed a novel defense based on the generation of spoofed traffic. Finally, we introduced two privacy-aware secure data exchange mechanisms, which allow sharing data among multiple parties (e.g., companies, hospitals) while preserving the privacy of the individuals in the dataset. These mechanisms were realized with a combination of Secure Multiparty Computation (SMC) and Differential Privacy (DP) techniques.
In addition, we designed a policy language, called Curie Policy Language (CPL), to handle conflicting relationships among parties. The novel methods, attacks, and countermeasures in this dissertation were verified with theoretical analysis and extensive experiments with real devices and users. We believe that the research in this dissertation has far-reaching implications for privacy-aware alternative and complementary authentication methods, smart home user privacy research, and privacy-aware secure data exchange methods.
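The differential-privacy half of such a data-exchange mechanism can be illustrated with the standard Laplace mechanism for counting queries (a textbook sketch under assumed parameters, not the dissertation's exact construction):

```python
import random

def dp_count(true_count, epsilon, rng=random):
    """Release a count perturbed with Laplace(1/epsilon) noise. A
    counting query has sensitivity 1, so this satisfies epsilon-DP."""
    scale = 1.0 / epsilon
    # A Laplace sample is the difference of two i.i.d. exponential samples.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

rng = random.Random(0)
releases = [dp_count(100, epsilon=1.0, rng=rng) for _ in range(1000)]
average = sum(releases) / len(releases)  # unbiased: concentrates near 100
```

In the combined SMC + DP setting, the noise is added inside the secure computation so that no party ever sees the exact joint statistic.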

    Counteracting Bloom Filter Encoding Techniques for Private Record Linkage

    Record linkage is the process of combining records that represent the same entity across multiple, disparate data sources, primarily for data analytics. Traditionally, this is performed by comparing personal identifiers present in the data (e.g., given name, surname, social security number). However, sharing information across databases maintained by disparate organizations entails the exchange of personal information pertaining to individuals, and in practice various statutory regulations and policies prohibit the disclosure of such identifiers. Private record linkage (PRL) techniques have been developed to execute record linkage without disclosing any information about other, dissimilar records. Various techniques have been proposed to implement PRL, including cryptographically secure multi-party computation protocols. However, these protocols have been questioned over their scalability, as they are computationally intensive by nature. Bloom filter encoding (BFE) for private record linkage has become a topic of recent interest in the medical informatics community due to its versatility and ability to match records approximately in a manner that is (ostensibly) privacy-preserving. It also has the advantage of computing matches directly in plaintext space, making it much faster than its secure multi-party computation counterparts. The trouble with BFEs lies in their security guarantees: by their very nature, BFEs leak information to assist in the matching process. Despite this known shortcoming, BFEs continue to be studied in the context of new, heuristically designed countermeasures to address known attacks. This thesis proposes a new class of set-intersection attack that re-examines the security of BFEs through experiments, demonstrating an inverse relationship between security and accuracy.
With real-world deployment of BFEs in the health information sector approaching, the results from this work will generate renewed discussion around the security of BFEs, as well as motivate research into new, more efficient multi-party protocols for private approximate matching.
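The BFE matching process being attacked works roughly as follows (a common textbook variant with assumed parameters m=64 bits and k=4 hash functions; the leakage stems precisely from computing this in plaintext):

```python
import hashlib

def bigrams(name):
    """Split a padded name into its character bigrams."""
    padded = f"_{name.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def bfe(name, m=64, k=4):
    """Hash each bigram into an m-bit Bloom filter using k hash functions."""
    bits = [0] * m
    for gram in bigrams(name):
        for i in range(k):
            pos = int(hashlib.sha256(f"{i}:{gram}".encode()).hexdigest(), 16) % m
            bits[pos] = 1
    return bits

def dice(x, y):
    """Dice coefficient over set bits: the approximate-match score,
    computed directly on the encodings."""
    common = sum(a & b for a, b in zip(x, y))
    return 2 * common / (sum(x) + sum(y))
```

Similar names share bigrams and hence set bits, so `dice(bfe("smith"), bfe("smyth"))` scores high while unrelated names score low; that same visible structure is what set-intersection attacks exploit.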

    A Cryptanalysis of Two Cancelable Biometric Schemes based on Index-of-Max Hashing

    Cancelable biometric schemes generate secure biometric templates by combining user-specific tokens and biometric data. The main objective is to create irreversible, unlinkable, and revocable templates with high matching accuracy. In this paper, we cryptanalyze two recent cancelable biometric schemes based on a particular locality-sensitive hashing function, index-of-max (IoM): Gaussian Random Projection-IoM (GRP-IoM) and Uniformly Random Permutation-IoM (URP-IoM). As originally proposed, these schemes were claimed to be resistant against reversibility, authentication, and linkability attacks under the stolen-token scenario. We propose several attacks against GRP-IoM and URP-IoM, and argue that both schemes are severely vulnerable to authentication and linkability attacks. We also propose better, but not yet practical, reversibility attacks against GRP-IoM. The correctness and practical impact of our attacks are verified over the same dataset provided by the authors of these two schemes.
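The GRP-IoM construction being cryptanalyzed can be sketched as follows (assumed toy parameters; the seed stands in for the user-specific token):

```python
import random

def grp_iom(features, n_hashes=16, bundle=4, token_seed=1):
    """For each output symbol, project the feature vector onto `bundle`
    random Gaussian directions and keep only the index of the maximum.
    Discarding the projection magnitudes is what is meant to make the
    template hard to invert."""
    rng = random.Random(token_seed)  # derived from the user's token
    code = []
    for _ in range(n_hashes):
        scores = [sum(f * rng.gauss(0, 1) for f in features)
                  for _ in range(bundle)]
        code.append(scores.index(max(scores)))
    return code

template = grp_iom([0.3, -0.7, 0.2, 0.9])
```

Each symbol is an argmax over `bundle` projections, so the code is stable under small feature noise; the attacks in the paper show that, with a stolen token, these argmax constraints still reveal enough geometry to link and falsely authenticate.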

    Homomorphic Encryption and Cryptanalysis of Lattice Cryptography

    The vast amount of personal data being collected and analyzed through internet-connected devices is vulnerable to theft and misuse. Modern cryptography offers several powerful techniques that can help solve the puzzle of how to harness data for use while at the same time protecting it; one such technique is homomorphic encryption, which allows computations to be performed on data while it is still encrypted. The security of homomorphic encryption relates to the broader field of lattice cryptography, one of the main areas of cryptography that promises to be secure even against quantum computing. In this dissertation, we touch on several aspects of homomorphic encryption and its security based on lattice cryptography. Our main contributions are: 1. proving some heuristics that are used in major results in the literature for controlling the error size in bootstrapping for fully homomorphic encryption; 2. presenting a new fully homomorphic encryption scheme that supports k-bit arbitrary operations and achieves an asymptotic ciphertext expansion of one; 3. thoroughly studying certain attacks against the Ring Learning with Errors problem; and 4. precisely characterizing the performance of an algorithm for solving the Approximate Common Divisor problem.
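The error-growth issue behind the bootstrapping heuristics in contribution 1 is visible even in a toy symmetric LWE scheme (assumed parameters, far too small to be secure; for illustration only):

```python
import random

Q, N = 2**15, 8  # toy modulus and dimension; insecure, illustrative only

def keygen(rng):
    return [rng.randrange(Q) for _ in range(N)]

def encrypt(bit, key, rng, err_bound=4):
    """Encode the bit in the high part of b; a small error e hides it."""
    a = [x_i for x_i in (rng.randrange(Q) for _ in range(N))]
    e = rng.randrange(-err_bound, err_bound + 1)
    b = (sum(x * s for x, s in zip(a, key)) + e + bit * (Q // 2)) % Q
    return a, b

def decrypt(ct, key):
    a, b = ct
    phase = (b - sum(x * s for x, s in zip(a, key))) % Q
    return 1 if Q // 4 < phase < 3 * Q // 4 else 0

def add(ct1, ct2):
    """Homomorphic XOR of the plaintext bits. The errors add up, which is
    why only limited computation is possible before bootstrapping must
    reset the accumulated error."""
    (a1, b1), (a2, b2) = ct1, ct2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q

rng = random.Random(7)
key = keygen(rng)
```

Decryption succeeds only while the total error stays below Q/4; controlling exactly how fast it grows is what the heuristics in question are used for.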

    Private Polynomial Commitments and Applications to MPC

    Polynomial commitment schemes allow a prover to commit to a polynomial and later reveal its evaluation at an arbitrary point, along with a proof of validity. This object is central to the design of many cryptographic schemes, such as zero-knowledge proofs and verifiable secret sharing. In the standard definition, the polynomial is known to the prover, and the evaluation points are not private. In this paper, we put forward the notion of private polynomial commitments, which capture additional privacy guarantees: the evaluation points are hidden from the verifier, while the polynomial is hidden from both parties. We provide concretely efficient constructions that allow simultaneously batching the verification of many evaluations with a small additive overhead. As an application, we design a new concretely efficient multi-party private set intersection protocol with malicious security and improved asymptotic communication and space complexities. We demonstrate the concrete efficiency of our construction via an implementation. Our scheme can prove 2^10 evaluations of a private polynomial of degree 2^10 in 157 s. The proof size is only 169 KB and the verification time is 11.8 s. Moreover, we also implemented the multi-party private set intersection protocol and scaled it to 1000 parties (which had not been shown before). The total running time for 2^14 elements per party is 2,410 seconds. While existing protocols offer better computational complexity, our scheme offers significantly smaller communication and better scalability (in the number of parties) owing to better memory usage.
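The commit/evaluate/open interface can be illustrated with a deliberately naive stand-in (hash the coefficients; binding but neither hiding nor succinct, unlike the private scheme above, where opening reveals neither the polynomial nor, to the verifier, the point):

```python
import hashlib

Q = 97  # toy field modulus for the example

def commit(coeffs):
    """Toy commitment: hash the coefficient list."""
    return hashlib.sha256(",".join(map(str, coeffs)).encode()).hexdigest()

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x, mod Q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % Q
    return acc

def verify_opening(commitment, coeffs, x, claimed_y):
    """Naive opening: reveal everything and recheck. A real polynomial
    commitment instead proves evaluate(coeffs, x) == y succinctly
    without revealing coeffs."""
    return commit(coeffs) == commitment and evaluate(coeffs, x) == claimed_y

poly = [1, 2, 3]  # 1 + 2x + 3x^2
com = commit(poly)
```

The paper's constructions replace this reveal-everything opening with zero-knowledge proofs, and additionally batch many evaluation proofs against one commitment with small additive overhead.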