105 research outputs found

    Privacy-preserving scoring of tree ensembles : a novel framework for AI in healthcare

    Get PDF
    Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data with personally identifiable information (PII). Yet very little of SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification of tree ensembles with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to a ML scoring service and obtain encrypted class labels without the scoring service actually seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols in a cloud based scalable risk-prediction platform with multiple ML models for healthcare AI. Included are system internals, and evaluations of our deployment for supporting physicians to drive better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare

    VirtualIdentity : privacy preserving user profiling

    Get PDF
    User profiling from user generated content (UGC) is a common practice that supports the business models of many social media companies. Existing systems require that the UGC is fully exposed to the module that constructs the user profiles. In this paper we show that it is possible to build user profiles without ever accessing the user's original data, and without exposing the trained machine learning models for user profiling - which are the intellectual property of the company - to the users of the social media site. We present VirtualIdentity, an application that uses secure multi-party cryptographic protocols to detect the age, gender and personality traits of users by classifying their user-generated text and personal pictures with trained support vector machine models in a privacy preserving manner

    Protecting privacy of users in brain-computer interface applications

    Get PDF
    Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on the use of large amounts of personal data for training and inference. Among the most intimate exploited data sources is electroencephalogram (EEG) data, a kind of data that is so rich with information that application developers can easily gain knowledge beyond the professed scope from unprotected EEG signals, including passwords, ATM PINs, and other intimate data. The challenge we address is how to engage in meaningful ML with EEG data while protecting the privacy of users. Hence, we propose cryptographic protocols based on secure multiparty computation (SMC) to perform linear regression over EEG signals from many users in a fully privacy-preserving(PP) fashion, i.e., such that each individual's EEG signals are not revealed to anyone else. To illustrate the potential of our secure framework, we show how it allows estimating the drowsiness of drivers from their EEG signals as would be possible in the unencrypted case, and at a very reasonable computational cost. Our solution is the first application of commodity-based SMC to EEG data, as well as the largest documented experiment of secret sharing-based SMC in general, namely, with 15 players involved in all the computations

    Efficient and Private Scoring of Decision Trees, Support Vector Machines and Logistic Regression Models based on Pre-Computation

    Get PDF
    Many data-driven personalized services require that private data of users is scored against a trained machine learning model. In this paper we propose a novel protocol for privacy-preserving classification of decision trees, a popular machine learning model in these scenarios. Our solutions are composed out of building blocks, namely a secure comparison protocol, a protocol for obliviously selecting inputs, and a protocol for evaluating polynomials. By combining some of the building blocks for our decision tree classification protocol, we also improve previously proposed solutions for classification of support vector machines and logistic regression models. Our protocols are information theoretically secure and, unlike previously proposed solutions, do not require modular exponentiations. We show that our protocols for privacy-preserving classification lead to more efficient results from the point of view of computational and communication complexities. We present accuracy and runtime results for 7 classification benchmark datasets from the UCI repository

    Applying Secure Multi-party Computation in Practice

    Get PDF
    In this work, we present solutions for technical difficulties in deploying secure multi-party computation in real-world applications. We will first give a brief overview of the current state of the art, bring out several shortcomings and address them. The main contribution of this work is an end-to-end process description of deploying secure multi-party computation for the first large-scale registry-based statistical study on linked databases. Involving large stakeholders like government institutions introduces also some non-technical requirements like signing contracts and negotiating with the Data Protection Agency
    corecore