3 research outputs found

    Privacy-preserving naive Bayes classification on distributed data via semi-trusted mixers

    No full text
    Distributed data mining applications, such as those dealing with health care, finance, counter-terrorism and homeland defense, use sensitive data from distributed databases held by different parties. This comes into direct conflict with an individual's need and right to privacy. It is thus of great importance to develop adequate security techniques for protecting privacy of individual values used for data mining. In this paper, we consider privacy-preserving naive Bayes classifier for horizontally partitioned distributed data and propose a two-party protocol and a multi-party protocol to achieve it. Our multi-party protocol is built on the semi-trusted mixer model, in which each data site sends messages to two semi-trusted mixers, respectively, which run our two-party protocol and then broadcast the classification result. This model facilitates both trust management and implementation. Security analysis has showed that our two-party protocol is a private protocol and our multi-party protocol is a private protocol as long as the two mixers do not conclude

    A survey of state-of-the-art methods for securing medical databases

    Get PDF
    This review article presents a survey of recent work devoted to advanced state-of-the-art methods for securing of medical databases. We concentrate on three main directions, which have received attention recently: attribute-based encryption for enabling secure access to confidential medical databases distributed among several data centers; homomorphic encryption for providing answers to confidential queries in a secure manner; and privacy-preserving data mining used to analyze data stored in medical databases for verifying hypotheses and discovering trends. Only the most recent and significant work has been included

    Privacy Preserving Data Mining For Horizontally Distributed Medical Data Analysis

    Get PDF
    To build reliable prediction models and identify useful patterns, assembling data sets from databases maintained by different sources such as hospitals becomes increasingly common; however, it might divulge sensitive information about individuals and thus leads to increased concerns about privacy, which in turn prevents different parties from sharing information. Privacy Preserving Distributed Data Mining (PPDDM) provides a means to address this issue without accessing actual data values to avoid the disclosure of information beyond the final result. In recent years, a number of state-of-the-art PPDDM approaches have been developed, most of which are based on Secure Multiparty Computation (SMC). SMC requires expensive communication cost and sophisticated secure computation. Besides, the mining progress is inevitable to slow down due to the increasing volume of the aggregated data. In this work, a new framework named Privacy-Aware Non-linear SVM (PAN-SVM) is proposed to build a PPDDM model from multiple data sources. PAN-SVM employs the Secure Sum Protocol to protect privacy at the bottom layer, and reduces the complex communication and computation via Nystrom matrix approximation and Eigen decomposition methods at the medium layer. The top layer of PAN-SVM speeds up the whole algorithm for large scale datasets. Based on the proposed framework of PAN-SVM, a Privacy Preserving Multi-class Classifier is built, and the experimental results on several benchmark datasets and microarray datasets show its abilities to improve classification accuracy compared with a regular SVM. In addition, two Privacy Preserving Feature Selection methods are also proposed based on PAN-SVM, and tested by using benchmark data and real world data. PAN-SVM does not depend on a trusted third party; all participants collaborate equally. Many experimental results show that PAN-SVM can not only effectively solve the problem of collaborative privacy-preserving data mining by building non-linear classification rules, but also significantly improve the performance of built classifiers
    corecore