4,639 research outputs found

    Conditionals in Homomorphic Encryption and Machine Learning Applications

    Get PDF
    Homomorphic encryption aims at allowing computations on encrypted data without decryption other than that of the final result. This could provide an elegant solution to the issue of privacy preservation in data-based applications, such as those using machine learning, but several open issues hamper this plan. In this work we assess the possibility for homomorphic encryption to fully implement its program without relying on other techniques, such as multiparty computation (SMPC), which may be impossible in many use cases (for instance due to the high level of communication required). We proceed in two steps: i) on the basis of the structured program theorem (Bohm-Jacopini theorem) we identify the relevant minimal set of operations homomorphic encryption must be able to perform to implement any algorithm; and ii) we analyse the possibility to solve -- and propose an implementation for -- the most fundamentally relevant issue as it emerges from our analysis, that is, the implementation of conditionals (requiring comparison and selection/jump operations). We show how this issue clashes with the fundamental requirements of homomorphic encryption and could represent a drawback for its use as a complete solution for privacy preservation in data-based applications, in particular machine learning ones. Our approach for comparisons is novel and entirely embedded in homomorphic encryption, while previous studies relied on other techniques, such as SMPC, demanding high level of communication among parties, and decryption of intermediate results from data-owners. Our protocol is also provably safe (sharing the same safety as the homomorphic encryption schemes), differently from other techniques such as Order-Preserving/Revealing-Encryption (OPE/ORE).Comment: 14 pages, 1 figure, corrected typos, added introductory pedagogical section on polynomial approximatio

    Encrypted statistical machine learning: new privacy preserving methods

    Full text link
    We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and na\"{i}ve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods.Comment: 39 page

    VirtualIdentity : privacy preserving user profiling

    Get PDF
    User profiling from user generated content (UGC) is a common practice that supports the business models of many social media companies. Existing systems require that the UGC is fully exposed to the module that constructs the user profiles. In this paper we show that it is possible to build user profiles without ever accessing the user's original data, and without exposing the trained machine learning models for user profiling - which are the intellectual property of the company - to the users of the social media site. We present VirtualIdentity, an application that uses secure multi-party cryptographic protocols to detect the age, gender and personality traits of users by classifying their user-generated text and personal pictures with trained support vector machine models in a privacy preserving manner

    Privacy-preserving scoring of tree ensembles : a novel framework for AI in healthcare

    Get PDF
    Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data with personally identifiable information (PII). Yet very little of SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification of tree ensembles with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to a ML scoring service and obtain encrypted class labels without the scoring service actually seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols in a cloud based scalable risk-prediction platform with multiple ML models for healthcare AI. Included are system internals, and evaluations of our deployment for supporting physicians to drive better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare
    corecore