    Exploring the Effectiveness of Privacy Preserving Classification in Convolutional Neural Networks

    A front-runner in modern technological advancement, machine learning relies heavily on the use of personal data. It follows that, when assessing the scope of confidentiality for machine learning models, understanding the potential role of encryption is critical. Convolutional Neural Networks (CNNs) are a subset of artificial feed-forward neural networks tailored specifically for image recognition and classification. As the popularity of CNNs increases, so too does the need for privacy-preserving classification. Homomorphic Encryption (HE) refers to a cryptographic system that allows computation on encrypted data to obtain an encrypted result such that, when decrypted, the result is the same value that would have been obtained if the operations had been performed on the original unencrypted data. The objective of this research was to explore the application of HE alongside CNNs by creating privacy-preserving CNN layers that can operate on encrypted images. This was accomplished by (1) researching the underlying structure of preexisting privacy-preserving CNN classifiers, (2) creating privacy-preserving convolution, pooling, and fully-connected layers by mapping the computations found within each layer to a space of homomorphic computations, (3) developing a polynomial-approximated activation function and creating a privacy-preserving activation layer based on this approximation, and (4) testing and profiling the designed application to assess efficiency, performance, accuracy, and overall practicality.
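
    To make step (3) concrete, below is a minimal sketch of a privacy-preserving activation layer using the TenSEAL library's CKKS scheme. TenSEAL, the encryption parameters, and the particular degree-2 polynomial are illustrative assumptions, not the thesis's actual choices.

```python
# Hypothetical privacy-preserving activation layer: a low-degree polynomial
# stand-in for ReLU, evaluated directly on a CKKS-encrypted vector.
import tenseal as ts

# CKKS context with commonly used parameters (an assumption, not the thesis's).
ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

def poly_activation(enc_x):
    # f(x) = 0.25 + 0.5*x + 0.125*x^2, an illustrative smooth ReLU surrogate;
    # polyval evaluates the polynomial homomorphically, slot by slot.
    return enc_x.polyval([0.25, 0.5, 0.125])

enc = ts.ckks_vector(ctx, [-1.0, 0.0, 2.0])  # an encrypted feature-map slice
print(poly_activation(enc).decrypt())        # approximate activations
```

    The same pattern covers the linear layers in step (2): convolution and fully-connected layers reduce to dot products, which CKKS supports natively on packed ciphertexts.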

    Glyph: Fast and Accurately Training Deep Neural Networks on Encrypted Data

    Big data is one of the cornerstones of enabling and training deep neural networks (DNNs). Because of a lack of expertise, average users who want to gain benefits from their data have to rely on and upload their private data to big data companies they may not trust. Due to compliance, legal, or privacy constraints, most users are willing to contribute only their encrypted data, and lack the interest or resources to join the training of DNNs in the cloud. To train a DNN on encrypted data in a completely non-interactive way, a recent work proposes a fully homomorphic encryption (FHE)-based technique implementing all activations in the neural network by Brakerski-Gentry-Vaikuntanathan (BGV)-based lookup tables. However, such inefficient lookup-table-based activations significantly prolong the training latency of privacy-preserving DNNs. In this paper, we propose Glyph, an FHE-based scheme to quickly and accurately train DNNs on encrypted data by switching between the TFHE (Fast Fully Homomorphic Encryption over the Torus) and BGV cryptosystems. Glyph uses logic-operation-friendly TFHE to implement nonlinear activations, while adopting vectorial-arithmetic-friendly BGV to perform multiply-accumulate (MAC) operations. Glyph further applies transfer learning to the training of DNNs to improve test accuracy and reduce the number of ciphertext-ciphertext MAC operations in convolutional layers. Our experimental results show that Glyph obtains state-of-the-art test accuracy while reducing training latency by 99% over the prior FHE-based technique on various encrypted datasets.
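
    As a concrete illustration of the arithmetic half of Glyph's split, the sketch below runs a vectorised multiply-accumulate on encrypted integers. It assumes TenSEAL's BFV scheme as a stand-in for BGV (TenSEAL exposes neither BGV nor TFHE, and no scheme switching happens here); the weights and inputs are toy values.

```python
# Hypothetical encrypted MAC, the workload Glyph routes to the arithmetic
# (BGV-style) cryptosystem; BFV is used here as a closely related stand-in.
import tenseal as ts

ctx = ts.context(ts.SCHEME_TYPE.BFV,
                 poly_modulus_degree=4096,
                 plain_modulus=1032193)

weights = [3, -1, 4, 1, -5]                   # plaintext filter taps
enc_x = ts.bfv_vector(ctx, [2, 7, 1, 8, 2])   # encrypted inputs

# Ciphertext-plaintext elementwise multiply; in a real pipeline the slot-wise
# sum would also be done homomorphically via rotations.
prods = enc_x * weights
print(sum(prods.decrypt()))                   # 6 - 7 + 4 + 8 - 10 = 1
```

    The nonlinear activations are the part this sketch cannot show: Glyph evaluates those with logic-operation-friendly TFHE rather than with arithmetic circuits.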

    Privacy-Preserving Classification on Deep Neural Network

    Neural Networks (NNs) are today increasingly used in Machine Learning, where they have become deeper and deeper to accurately model or classify high-level abstractions of data. Their development, however, also gives rise to important data privacy risks. This observation motivated Microsoft researchers to propose a framework called Cryptonets. The core idea is to combine simplifications of the NN with Fully Homomorphic Encryption (FHE) techniques to obtain both confidentiality of the manipulated data and efficiency of the processing. While efficiency and accuracy are demonstrated when the number of non-linear layers is small (e.g., 2), Cryptonets unfortunately becomes ineffective for deeper NNs, which leaves the problem of privacy-preserving classification open in these contexts. This work successfully addresses this problem by combining the original ideas of Cryptonets' solution with the batch normalization principle introduced at ICML 2015 by Ioffe and Szegedy. We experimentally validate the soundness of our approach with a neural network with 6 non-linear layers. When applied to the MNIST database, it matches the accuracy of the best non-secure versions, thus significantly improving on Cryptonets.
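
    The sketch below illustrates the paper's key enabler with toy numpy parameters: at inference time batch normalization is just an affine map, so it folds into the preceding linear layer, and the normalized pre-activations stay in a range where Cryptonets' FHE-friendly square activation (or a low-degree polynomial) behaves well. The network shapes and values here are illustrative, not the paper's trained model.

```python
# Folding batch normalization into a linear layer; all parameters are toy
# values, not the paper's trained network.
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 8)), rng.normal(size=4)       # linear layer
gamma, beta = rng.normal(size=4), rng.normal(size=4)     # learned BN params
mu, var, eps = rng.normal(size=4), np.abs(rng.normal(size=4)), 1e-5

# BN at inference: y = gamma * (Wx + b - mu) / sqrt(var + eps) + beta.
# Fold the scale s into W and b so the whole block is one affine map.
s = gamma / np.sqrt(var + eps)
W_f, b_f = s[:, None] * W, s * (b - mu) + beta

x = rng.normal(size=8)
bn_out = gamma * (W @ x + b - mu) / np.sqrt(var + eps) + beta
assert np.allclose(W_f @ x + b_f, bn_out)

out = (W_f @ x + b_f) ** 2    # Cryptonets-style square activation
```

    Homomorphically, the folded affine map costs no more than the original linear layer, which is why adding batch normalization lets the network grow deeper without extra ciphertext operations per layer.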

    Cloud-based homomorphic encryption for privacy-preserving machine learning in clinical decision support

    While privacy and security concerns dominate public cloud services, Homomorphic Encryption (HE) is seen as an emerging solution that ensures secure processing of sensitive data via untrusted networks in the public cloud or by third-party cloud vendors. It relies on the fact that some encryption algorithms display the property of homomorphism, which allows them to manipulate data meaningfully while still in encrypted form, although there are major stumbling blocks to overcome before the technology is considered mature for production cloud environments. Such a framework would find particular relevance in Clinical Decision Support (CDS) applications deployed in the public cloud. CDS applications have an important computational and analytical role over confidential healthcare information, with the aim of supporting decision-making in clinical practice. Machine Learning (ML) is employed in CDS applications, which typically learn and can personalise actions based on individual behaviour. A relatively simple-to-implement, common, and consistent framework is sought that can overcome most limitations of Fully Homomorphic Encryption (FHE) in order to offer an expanded and flexible set of HE capabilities. In the absence of a significant breakthrough in FHE efficiency and practical use, a solution relying on client interactions appears to be the most viable known approach to private CDS-based computation, so long as security is not significantly compromised. A hybrid solution is introduced that intersperses limited two-party interactions amongst the main homomorphic computations, allowing exchange of both numerical and logical cryptographic contexts in addition to resolving other major FHE limitations. Interactions involve the use of client-based ciphertext decryptions, blinded by data obfuscation techniques to maintain privacy. This thesis explores the middle ground whereby HE schemes can provide improved and efficient arbitrary computational functionality over a significantly reduced two-party network interaction model involving data obfuscation techniques. This compromise allows the powerful capabilities of HE to be leveraged, providing a more uniform, flexible, and general approach to privacy-preserving system integration that is suitable for cloud deployment. The proposed platform is uniquely designed to make HE more practical for mainstream clinical application use, equipped with a rich set of capabilities and a potentially very complex depth of HE operations. Such a solution would be suitable for the long-term privacy-preserving processing requirements of a cloud-based CDS system, which would typically require complex combinatorial logic, workflow, and ML capabilities.
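
    To show the interaction pattern the thesis builds on, here is a minimal two-party round sketched in TenSEAL's CKKS scheme. The blinding scheme (a random positive multiplicative mask enabling a ReLU) is an illustrative choice, not the thesis's exact obfuscation protocol.

```python
# Hypothetical blinded-decryption round: the server offloads a non-polynomial
# operation to the client without revealing the underlying values.
import random
import tenseal as ts

ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40

enc_x = ts.ckks_vector(ctx, [-3.0, 1.5, 0.5])   # server-held ciphertext

# Server: blind with a random positive scalar r and send Enc(r*x).
r = random.uniform(1.0, 10.0)
blinded = enc_x * r

# Client: decrypt, apply the non-polynomial function, re-encrypt.
# ReLU commutes with positive scaling: relu(r*x) = r * relu(x).
relu_rx = [max(v, 0.0) for v in blinded.decrypt()]
enc_relu_rx = ts.ckks_vector(ctx, relu_rx)

# Server: unblind with 1/r; it never saw x or relu(x) in the clear.
enc_relu_x = enc_relu_rx * (1.0 / r)
print(enc_relu_x.decrypt())                     # ~ [0.0, 1.5, 0.5]
```

    In this toy setup a single context holds the secret key for brevity; in the thesis's client-server setting only the client could run the decrypt calls.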

    The High-Level Practical Overview of Open-Source Privacy-Preserving Machine Learning Solutions

    This paper aims to provide a high-level overview of practical approaches to machine learning that respect the privacy and confidentiality of customer information, a field known as Privacy-Preserving Machine Learning (PPML). First, the security approaches in offline-learning privacy methods are assessed. These focus on modern cryptographic methods, such as Homomorphic Encryption and Secure Multi-Party Computation, as well as on dedicated combined hardware and software platforms like the Trusted Execution Environment of Intel® Software Guard Extensions (Intel® SGX). Combining these security approaches with different machine learning architectures leads to our Proof of Concept (PoC), in which the accuracy and speed of the security solutions are examined. The next step was exploring and comparing open-source Python-based solutions for PPML. Four solutions were selected from almost 40 separate state-of-the-art systems: SyMPC, TF-Encrypted, TenSEAL, and Gramine. Three different neural network architectures were designed to show the different libraries' capabilities. The PoC solves the image classification problem based on the MNIST dataset. As the computational results show, the accuracy of all considered secure approaches is similar: the maximum difference between the non-secure and secure flows does not exceed 1.2%. In terms of secure computation, the most effective PPML library is based on the Trusted Execution Environment, followed by Secure Multi-Party Computation and Homomorphic Encryption. However, most of these are at least 1000 times slower than the non-secure evaluation, which is not acceptable for a real-world scenario. Future work could combine different security approaches, explore other new and existing state-of-the-art libraries, or implement support for hardware-accelerated secure computation.
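
    The slowdown the paper measures can be reproduced in miniature. The micro-benchmark below compares a plaintext dot product against the same dot product under TenSEAL's CKKS scheme (one of the four libraries compared); the sizes and the resulting ratio are machine-dependent illustrations, not the paper's figures.

```python
# Hypothetical micro-benchmark: plaintext vs encrypted dot product for one
# 784-dimensional input (a flattened 28x28 MNIST image) and one neuron.
import time
import tenseal as ts

ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()        # dot() uses rotations internally

x = [float(i) / 784 for i in range(784)]
w = [0.01] * 784
enc_x = ts.ckks_vector(ctx, x)

t0 = time.perf_counter()
plain = sum(a * b for a, b in zip(x, w))
t1 = time.perf_counter()
enc = enc_x.dot(w)
t2 = time.perf_counter()

print(f"plain: {t1 - t0:.6f}s  encrypted: {t2 - t1:.6f}s  "
      f"slowdown: {(t2 - t1) / (t1 - t0):.0f}x")
```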

    Private-Key Fully Homomorphic Encryption for Private Classification of Medical Data

    A wealth of medical data is inaccessible to researchers and clinicians due to privacy restrictions such as HIPAA. Clinicians would benefit from access to predictive models for diagnosis, such as classification of tumors as malignant or benign, without compromising patients' privacy. In addition, the medical institutions and companies who own these medical information systems wish to keep their models private when used by outside parties. Fully homomorphic encryption (FHE) enables practical polynomial computation over encrypted data. This dissertation begins with speed and security improvements to existing private-key fully homomorphic encryption methods. Next, it presents a protocol for third-party private search using private-key FHE. Finally, fully homomorphic protocols for polynomial machine learning algorithms are presented, using privacy-preserving Naive Bayes and Decision Tree classifiers. These protocols allow clients to privately classify their data points without direct access to the learned model. Experiments using these classifiers are run on publicly available medical data sets, applying the protocols to the task of privacy-preserving classification of real-world medical data. Results show that private-key fully homomorphic encryption is able to provide fast and accurate results for privacy-preserving medical classification.
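
    As a sketch of how a Naive Bayes classifier can be evaluated this way, the example below scores an encrypted feature vector against a plaintext model. TenSEAL's CKKS scheme stands in for the dissertation's private-key FHE construction, and the two-class, four-feature model is an illustrative toy.

```python
# Hypothetical privacy-preserving Naive Bayes scoring: the server never sees
# the client's features; the client never sees the model parameters.
import tenseal as ts

ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

# Server-side model: per-class log P(feature=1 | class) and log priors
# (illustrative numbers only).
log_likelihood = {"benign":    [-0.2, -1.6, -0.9, -2.3],
                  "malignant": [-1.2, -0.3, -1.1, -0.4]}
log_prior = {"benign": [-0.4], "malignant": [-1.1]}

# Client encrypts its binary feature vector and sends only the ciphertext.
enc_features = ts.ckks_vector(ctx, [1.0, 0.0, 1.0, 1.0])

# Server computes one encrypted log-score per class.
enc_scores = {c: enc_features.dot(log_likelihood[c]) + log_prior[c]
              for c in log_likelihood}

# Client decrypts the scores and takes the argmax locally.
scores = {c: s.decrypt()[0] for c, s in enc_scores.items()}
print(max(scores, key=scores.get))   # predicted class
```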

    Privacy-aware Security Applications in the Era of Internet of Things

    In this dissertation, we introduce several novel privacy-aware security applications. We split these contributions into three main categories. First, to strengthen current authentication mechanisms, we designed two novel privacy-aware complementary authentication mechanisms: Continuous Authentication (CA) and Multi-Factor Authentication (MFA). Our first system is Wearable-Assisted Continuous Authentication (WACA), in which we used sensor data collected from a wrist-worn device to authenticate users continuously. We then improved WACA by integrating a noise-tolerant template matching technique called NTT-Sec to make it privacy-aware, as the collected data can be sensitive. We also designed a novel, lightweight, Privacy-aware Continuous Authentication (PACA) protocol. PACA is easily applicable to other biometric authentication mechanisms when feature vectors are represented as fixed-length real-valued vectors. In addition to CA, we also introduced a privacy-aware multi-factor authentication method called PINTA. In PINTA, we used fuzzy hashing and homomorphic encryption mechanisms to protect users' sensitive profiles while providing privacy-preserving authentication. For the second privacy-aware contribution, we designed a multi-stage privacy attack against smart home users that uses the wireless network traffic generated during the communication of their devices. The attack works even on encrypted data, as it uses only the metadata of the network traffic. We also designed a novel defense based on the generation of spoofed traffic. Finally, we introduced two privacy-aware secure data exchange mechanisms, which allow data to be shared between multiple parties (e.g., companies, hospitals) while preserving the privacy of the individuals in the dataset. These mechanisms were realized with a combination of Secure Multiparty Computation (SMC) and Differential Privacy (DP) techniques. In addition, we designed a policy language, called Curie Policy Language (CPL), to handle conflicting relationships among parties. The novel methods, attacks, and countermeasures in this dissertation were verified with theoretical analysis and extensive experiments with real devices and users. We believe that this research has far-reaching implications for privacy-aware complementary authentication methods, smart home user privacy research, and privacy-aware secure data exchange methods.
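
    For the differential-privacy half of the data-exchange mechanisms, the sketch below applies the standard Laplace mechanism to a count query before the result leaves a party. The epsilon and data are illustrative, and the SMC half of the combination is out of scope here.

```python
# Hypothetical DP release of an aggregate: Laplace noise calibrated to the
# query's sensitivity (a count changes by at most 1 per individual).
import numpy as np

rng = np.random.default_rng()

def dp_count(values, predicate, epsilon=0.5):
    """Return a noisy count satisfying epsilon-differential privacy."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

ages = [34, 61, 45, 29, 70, 52]              # one hospital's records
print(dp_count(ages, lambda a: a >= 50))     # noisy count of patients 50+
```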