106 research outputs found

    Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners

    The k-nearest neighbors (k-NN) algorithm is a popular and effective classification algorithm. Due to its large storage and computational requirements, it is suitable for cloud outsourcing. However, k-NN is often run on sensitive data such as medical records, user images, or personal information. It is important to protect the privacy of data in an outsourced k-NN system. Prior works have all assumed the data owners (who submit data to the outsourced k-NN system) are a single trusted party. However, we observe that in many practical scenarios, there may be multiple mutually distrusting data owners. In this work, we present the first framing and exploration of privacy preservation in an outsourced k-NN system with multiple data owners. We consider the various threat models introduced by this modification. We discover that under a particularly practical threat model that covers numerous scenarios, there exists a set of adaptive attacks that breach the data privacy of any exact k-NN system. The vulnerability is a result of the mathematical properties of k-NN and its output. Thus, we propose a privacy-preserving alternative system supporting kernel density estimation using a Gaussian kernel, a classification algorithm from the same family as k-NN. In many applications, this similar algorithm serves as a good substitute for k-NN. We additionally investigate solutions for other threat models, often through extensions of prior single-data-owner systems.
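
The Gaussian-kernel classifier that the abstract above proposes as a k-NN substitute can be illustrated in plaintext. The following is a minimal sketch, not the authors' privacy-preserving system: the function name `gaussian_kde_classify`, the `bandwidth` parameter, and the toy data are all assumptions made for illustration. Each class is scored by the summed (unnormalized) Gaussian kernel weight of its training points at the query, so every training point contributes smoothly rather than only the k closest ones.

```python
import math


def gaussian_kde_classify(train, query, bandwidth=1.0):
    """Return the label with the largest summed Gaussian kernel weight at `query`.

    `train` is a list of (point, label) pairs; `bandwidth` is a hypothetical
    smoothing parameter chosen for this illustration.
    """
    scores = {}
    for point, label in train:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, query))
        weight = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Toy data: two well-separated clusters (illustrative only).
train = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.2), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"),
]
print(gaussian_kde_classify(train, (0.2, 0.1), bandwidth=0.5))
```

Unlike exact k-NN, the output depends on a weighted sum over all points, which is one plausible reading of why the abstract treats it as a different target for the adaptive attacks it describes.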

    On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects

    The Internet of Things (IoT) will be a main data generation infrastructure for achieving better system intelligence. This paper considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a better machine learning model based on the data samples contributed by a number of IoT objects, while the confidentiality of the raw forms of the training data is protected against the coordinator. Existing distributed machine learning and data encryption approaches incur significant computation and communication overhead, rendering them ill-suited for resource-constrained IoT objects. We study an approach that applies independent Gaussian random projection at each IoT object to obfuscate data and trains a deep neural network at the coordinator based on the projected data from the IoT objects. This approach introduces light computation overhead to the IoT objects and moves most workload to the coordinator that can have sufficient computing resources. Although the independent projections performed by the IoT objects address the potential collusion between the curious coordinator and some compromised IoT objects, they significantly increase the complexity of the projected data. In this paper, we leverage the superior learning capability of deep learning in capturing sophisticated patterns to maintain good learning performance. Extensive comparative evaluation shows that this approach outperforms other lightweight approaches that apply additive noisification for differential privacy and/or support vector machines for learning in the applications with light data pattern complexities. Comment: 12 pages, IOTDI 201
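
The per-object obfuscation step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the helper names `make_projection` and `project`, the dimensions, the seeds, and the `1/sqrt(out_dim)` scaling are all hypothetical choices, and Python's `random` module is not a cryptographically secure source of randomness. Each device draws its own Gaussian matrix and only ever transmits the projected vector.

```python
import random


def make_projection(in_dim, out_dim, seed):
    """Draw a device-specific Gaussian projection matrix (out_dim x in_dim)."""
    rng = random.Random(seed)  # per-device secret seed (assumption)
    scale = 1.0 / out_dim ** 0.5
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
        for _ in range(out_dim)
    ]


def project(matrix, sample):
    """Multiply the projection matrix by one raw data sample."""
    return [sum(w * x for w, x in zip(row, sample)) for row in matrix]


# Two devices with independent matrices obfuscate the same raw sample differently.
device_a = make_projection(in_dim=4, out_dim=3, seed=1)
device_b = make_projection(in_dim=4, out_dim=3, seed=2)
sample = [0.5, -1.0, 2.0, 0.0]
obfuscated_a = project(device_a, sample)
obfuscated_b = project(device_b, sample)
print(obfuscated_a)
print(obfuscated_b)
```

Because the matrices are independent, a compromised device's matrix says nothing about another device's, which matches the collusion argument in the abstract; the cost is that the coordinator's model must learn across differently projected inputs.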

    Homomorphic Encryption for Machine Learning in Medicine and Bioinformatics

    Machine learning techniques are an excellent tool for the medical community for analyzing large amounts of medical and genomic data. On the other hand, ethical concerns and privacy regulations prevent the free sharing of this data. Encryption methods such as fully homomorphic encryption (FHE) provide a method to evaluate computations over encrypted data. Using FHE, machine learning models such as deep learning, decision trees, and naive Bayes have been implemented for private prediction using medical data. FHE has also been shown to enable secure genomic algorithms, such as paternity testing, and secure application of genome-wide association studies. This survey provides an overview of fully homomorphic encryption and its applications in medicine and bioinformatics. The high-level concepts behind FHE and its history are introduced. Details on current open-source implementations are provided, as is the state of FHE for privacy-preserving techniques in machine learning and bioinformatics and future growth opportunities for FHE.

    Towards privacy-aware mobile-based continuous authentication systems

    User authentication is used to verify the identity of individuals attempting to gain access to a certain system. It traditionally refers to the initial authentication using knowledge factors (e.g. passwords), or ownership factors (e.g. smart cards). However, initial authentication cannot protect the computer (or smartphone), if left unattended, after the initial login. Thus, continuous authentication was proposed to complement initial authentication by transparently and continuously testing the users' behavior against the stored profile (machine learning model). Since continuous authentication utilizes users' behavioral data to build machine learning models, certain privacy and security concerns have to be addressed before these systems can be widely deployed. In almost all of the continuous authentication research, non-privacy-preserving classification methods were used (such as SVM or KNN). The motivation of this work is twofold: (1) studying the implications of such an assumption on continuous authentication security and users' privacy, and (2) proposing privacy-aware solutions to address the threats introduced by these assumptions. First, we study and propose reconstruction attacks and model inversion attacks in relation to continuous authentication systems, and we implement solutions that can be effective against our proposed attacks. We conduct this research assuming that a certain cloud service (which relies on continuous authentication) was compromised, and that the adversary is trying to utilize this compromised system to access a user's account on another cloud service. We identify two types of adversaries based on how their knowledge is obtained: (1) a full-profile adversary that has access to the victim's profile, and (2) a decision value adversary, an active adversary that only has access to the cloud service mobile app (which is used to obtain a feature vector).
Eventually, both adversaries use the user's compromised feature vectors to generate raw data based on our proposed reconstruction methods: a numerical method that is tied to a single attacked system (set of features), and a randomized algorithm that is not restricted to a single set of features. We conducted experiments using a public data set where we evaluated the attacks performed by our two types of adversaries and two types of reconstruction algorithms, and we have shown that our attacks are feasible. Finally, we analyzed the results, and provided recommendations to resist our attacks. Our remedies directly limit the effectiveness of model inversion attacks, thus dealing with decision value adversaries. Second, we study privacy-enhancing technologies for machine learning that can potentially prevent full-profile adversaries from utilizing the stored profiles to obtain the original feature vectors. We also study the problem of restricting undesired inference on users' private data within the context of continuous authentication. We propose a gesture-based continuous authentication framework that utilizes supervised dimensionality reduction (S-DR) techniques to protect against undesired inference attacks, and meets the non-invertibility (security) requirement of cancelable biometrics. These S-DR methods are Discriminant Component Analysis (DCA), and Multiclass Discriminant Ratio (MDR). Using experiments on a public data set, our results show that DCA and MDR provide better privacy/utility performance than random projection, which was extensively utilized in cancelable biometrics. Third, since using DCA (or MDR) requires computing the projection matrix from data distributed across multiple data owners, we proposed privacy-preserving PCA/DCA protocols that enable a data user (cloud server) to compute the projection matrices without compromising the privacy of the individual data owners.
To achieve this, we propose new protocols for computing the scatter matrices using additive homomorphic encryption, and performing the eigendecomposition using garbled circuits. We implemented our protocols using Java and Obliv-C, and conducted experiments on public datasets. We show that our protocols are efficient, and preserve privacy while maintaining accuracy.
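
The idea of summing scatter-matrix contributions under additive homomorphic encryption can be illustrated with a toy Paillier scheme. This is a sketch under loud assumptions: the abstract does not name the encryption scheme, so Paillier is a swapped-in example; the primes are far too small to be secure; the function names are hypothetical; and the thesis's own protocols were written in Java and Obliv-C, not Python. The point shown is only that multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so an aggregator can combine per-owner values without decrypting them.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two tiny primes (insecure, illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse; requires Python 3.8+
    return n, (phi, mu)


def encrypt(n, message, rng):
    """Encrypt an integer 0 <= message < n with generator g = n + 1."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + message * n) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext using the private pair (phi, mu)."""
    phi, mu = private_key
    n_sq = n * n
    return (((pow(ciphertext, phi, n_sq) - 1) // n) * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the ciphertext product encrypts the plaintext sum."""
    return (c1 * c2) % (n * n)


# Two data owners each encrypt one entry of their local scatter contribution.
n, private_key = keygen()
rng = random.Random(0)  # not a secure RNG; used here for reproducibility
owner_a_entry = encrypt(n, 12, rng)
owner_b_entry = encrypt(n, 30, rng)
combined = add_encrypted(n, owner_a_entry, owner_b_entry)
print(decrypt(n, private_key, combined))
```

Who holds the private key, and how the subsequent eigendecomposition is protected, are protocol decisions this sketch does not model.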

    Efficient privacy-preserving facial expression classification

    This paper proposes an efficient algorithm to perform privacy-preserving (PP) facial expression classification (FEC) in the client-server model. The server holds a database and offers the classification service to the clients. The client uses the service to classify the facial expression (FaE) of a subject. It should be noted that the client and server are mutually untrusted parties and they want to perform the classification without revealing their inputs to each other. In contrast to the existing works, which rely on computationally expensive cryptographic operations, this paper proposes a lightweight algorithm based on the randomization technique. The proposed algorithm is validated using the widely used JAFFE and MUG FaE databases. Experimental results demonstrate that the proposed algorithm does not degrade the performance compared to existing works, while preserving the privacy of inputs and improving the computational complexity by 120 times and the communication complexity by 31 percent against the existing homomorphic-cryptography-based approach.

    Towards Attack-Resilient Geometric Data Perturbation

    CryptGPU: Fast Privacy-Preserving Machine Learning on the GPU

    We introduce CryptGPU, a system for privacy-preserving machine learning that implements all operations on the GPU (graphics processing unit). Just as GPUs played a pivotal role in the success of modern deep learning, they are also essential for realizing scalable privacy-preserving deep learning. In this work, we start by introducing a new interface to losslessly embed cryptographic operations over secret-shared values (in a discrete domain) into floating-point operations that can be processed by highly-optimized CUDA kernels for linear algebra. We then identify a sequence of GPU-friendly cryptographic protocols to enable privacy-preserving evaluation of both linear and non-linear operations on the GPU. Our microbenchmarks indicate that our private GPU-based convolution protocol is over 150x faster than the analogous CPU-based protocol; for non-linear operations like the ReLU activation function, our GPU-based protocol is around 10x faster than its CPU analog. With CryptGPU, we support private inference and private training on convolutional neural networks with over 60 million parameters as well as handle large datasets like ImageNet. Compared to the previous state-of-the-art, when considering large models and datasets, our protocols achieve a 2x to 8x improvement in private inference and a 6x to 36x improvement for private training. Our work not only showcases the viability of performing secure multiparty computation (MPC) entirely on the GPU to enable fast privacy-preserving machine learning, but also highlights the importance of designing new MPC primitives that can take full advantage of the GPU's computing capabilities.
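
One way to picture embedding integer arithmetic on secret shares into floating-point operations is to split each ring element into small limbs whose products stay exactly representable as doubles. The sketch below is an illustration of that general idea only, not CryptGPU's actual interface: the 16-bit ring, the 8-bit limb width, and all function names are assumptions, and it runs on Python floats rather than CUDA kernels. It covers only a linear operation with public weights on additively shared inputs; multiplying two secret values needs an additional protocol that is not shown.

```python
import random

MODULUS_BITS = 16  # illustrative ring Z_{2^16} (assumption)
LIMB_BITS = 8      # illustrative limb width (assumption)
MODULUS = 1 << MODULUS_BITS
NUM_LIMBS = MODULUS_BITS // LIMB_BITS


def share(secret, rng):
    """Split `secret` into two additive shares modulo MODULUS."""
    first = rng.randrange(MODULUS)
    return first, (secret - first) % MODULUS


def reconstruct(first, second):
    """Recombine two additive shares."""
    return (first + second) % MODULUS


def limbs(value):
    """Decompose a ring element into LIMB_BITS-wide limbs stored as floats."""
    mask = (1 << LIMB_BITS) - 1
    return [float((value >> (LIMB_BITS * i)) & mask) for i in range(NUM_LIMBS)]


def float_dot_mod(xs, ys):
    """Dot product modulo MODULUS using only float multiply-adds on limbs.

    Limb products are below 2^16, so their sums stay exact in a 64-bit float
    for any realistic vector length; the integer recombination is lossless.
    """
    x_limbs = [limbs(x) for x in xs]
    y_limbs = [limbs(y) for y in ys]
    total = 0
    for i in range(NUM_LIMBS):
        for j in range(NUM_LIMBS):
            partial = sum(xl[i] * yl[j] for xl, yl in zip(x_limbs, y_limbs))
            total += int(partial) << (LIMB_BITS * (i + j))
    return total % MODULUS


# Public weights applied to a secret-shared input vector; each party works locally.
rng = random.Random(0)  # not a secure RNG; used here for reproducibility
weights = [3, 500, 65535]
secret_input = [7, 1234, 42]
shares = [share(value, rng) for value in secret_input]
party0_output = float_dot_mod(weights, [s[0] for s in shares])
party1_output = float_dot_mod(weights, [s[1] for s in shares])
expected = sum(w * x for w, x in zip(weights, secret_input)) % MODULUS
print(reconstruct(party0_output, party1_output) == expected)
```

The inner sums are the part that would map onto a floating-point linear-algebra kernel, while the shifts and the final reduction stay in integer arithmetic.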