90 research outputs found

    Privacy-preserving scoring of tree ensembles : a novel framework for AI in healthcare

    Get PDF
    Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data with personally identifiable information (PII). Yet very little of SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification of tree ensembles with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to a ML scoring service and obtain encrypted class labels without the scoring service actually seeing that input in the clear. We then describe the deployment challenges we solved to integrate these protocols in a cloud based scalable risk-prediction platform with multiple ML models for healthcare AI. Included are system internals, and evaluations of our deployment for supporting physicians to drive better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare

    Enabling Privacy-preserving Auctions in Big Data

    Full text link
    We study how to enable auctions in the big data context to solve many upcoming data-based decision problems in the near future. We consider the characteristics of the big data including, but not limited to, velocity, volume, variety, and veracity, and we believe any auction mechanism design in the future should take the following factors into consideration: 1) generality (variety); 2) efficiency and scalability (velocity and volume); 3) truthfulness and verifiability (veracity). In this paper, we propose a privacy-preserving construction for auction mechanism design in the big data, which prevents adversaries from learning unnecessary information except those implied in the valid output of the auction. More specifically, we considered one of the most general form of the auction (to deal with the variety), and greatly improved the the efficiency and scalability by approximating the NP-hard problems and avoiding the design based on garbled circuits (to deal with velocity and volume), and finally prevented stakeholders from lying to each other for their own benefit (to deal with the veracity). We achieve these by introducing a novel privacy-preserving winner determination algorithm and a novel payment mechanism. Additionally, we further employ a blind signature scheme as a building block to let bidders verify the authenticity of their payment reported by the auctioneer. The comparison with peer work shows that we improve the asymptotic performance of peer works' overhead from the exponential growth to a linear growth and from linear growth to a logarithmic growth, which greatly improves the scalability

    Supporting Regularized Logistic Regression Privately and Efficiently

    Full text link
    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Increasing concerns over data privacy make it more and more difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely-used machine learning model in various disciplines while at the same time has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributing computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluation on several studies validated the privacy guarantees, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc

    Convex optimization-based Privacy-Preserving Distributed Least Squares via Subspace Perturbation

    Get PDF

    A Novel Privacy-Preserved Recommender System Framework based on Federated Learning

    Full text link
    Recommender System (RS) is currently an effective way to solve information overload. To meet users' next click behavior, RS needs to collect users' personal information and behavior to achieve a comprehensive and profound user preference perception. However, these centrally collected data are privacy-sensitive, and any leakage may cause severe problems to both users and service providers. This paper proposed a novel privacy-preserved recommender system framework (PPRSF), through the application of federated learning paradigm, to enable the recommendation algorithm to be trained and carry out inference without centrally collecting users' private data. The PPRSF not only able to reduces the privacy leakage risk, satisfies legal and regulatory requirements but also allows various recommendation algorithms to be applied
    corecore