20,780 research outputs found

    Privacy-Preserving Collaborative Association Rule Mining

    Get PDF
    In recent times, the development of privacy technologies has promoted the speed of research on privacy-preserving collaborative data mining. People borrowed the ideas of secure multi-party computation and developed secure multi-party protocols to deal with privacy-preserving collaborative data mining problems. Random perturbation was also identified to be an efficient estimation technique to solve the problems. Both secure multi-party protocol and random perturbation technique have their advantages and shortcomings. In this paper, we develop a new approach that combines existing techniques in such a way that the new approach gains the advantages from both of them

    Geo-tagging and privacy-preservation in mobile cloud computing

    Get PDF
    With the emerge of the cloud computing service and the explosive growth of the mobile devices and applications, mobile computing technologies and cloud computing technologies have been drawing significant attentions. Mobile cloud computing, with the synergy between the cloud and mobile technologies, has brought us new opportunities to develop novel and practical systems such as mobile multimedia systems and cloud systems that provide collaborative data-mining services for data from disparate owners (e.g., mobile users). However, it also creates new challenges, e.g., the algorithms deployed in the computationally weak mobile device require higher efficiency, and introduces new problems such as the privacy concern when the private data is shared in the cloud for collaborative data-mining. The main objectives of this dissertation are: 1. to develop practical systems based on the unique features of mobile devices (i.e., all-in-one computing platform and sensors) and the powerful computing capability of the cloud; 2. to propose solutions protecting the data privacy when the data from disparate owners are shared in the cloud for collaborative data-mining. We first propose a mobile geo-tagging system. It is a novel, accurate and efficient image and video based remote target localization and tracking system using the Android smartphone. To cope with the smartphones' computational limitation, we design light-weight image/video processing algorithms to achieve a good balance between estimation accuracy and computational complexity. Our system is first of its kind and we provide first hand real-world experimental results, which demonstrate that our system is feasible and practicable. To address the privacy concern when data from disparate owners are shared in the cloud for collaborative data-mining, we then propose a generic compressive sensing (CS) based secure multiparty computation (MPC) framework for privacy-preserving collaborative data-mining in which data mining is performed in the CS domain. We perform the CS transformation and reconstruction processes with MPC protocols. We modify the original orthogonal matching pursuit algorithm and develop new MPC protocols so that the CS reconstruction process can be implemented using MPC. Our analysis and experimental results show that our generic framework is capable of enabling privacy preserving collaborative data-mining. The proposed framework can be applied to many privacy preserving collaborative data-mining and signal processing applications in the cloud. We identify an application scenario that requires simultaneously performing secure watermark detection and privacy preserving multimedia data storage. We further propose a privacy preserving storage and secure watermark detection framework by adopting our generic framework to address such a requirement. In our secure watermark detection framework, the multimedia data and secret watermark pattern are presented to the cloud for secure watermark detection in a compressive sensing domain to protect the privacy. We also give mathematical and statistical analysis to derive the expected watermark detection performance in the compressive sensing domain, based on the target image, watermark pattern and the size of the compressive sensing matrix (but without the actual CS matrix), which means that the watermark detection performance in the CS domain can be estimated during the watermark embedding process. The correctness of the derived performance has been validated by our experiments. Our theoretical analysis and experimental results show that secure watermark detection in the compressive sensing domain is feasible. By taking advantage of our mobile geo-tagging system and compressive sensing based privacy preserving data-mining framework, we develop a mobile privacy preserving collaborative filtering system. In our system, mobile users can share their personal data with each other in the cloud and get daily activity recommendations based on the data-mining results generated by the cloud, without leaking the privacy and secrecy of the data to other parties. Experimental results demonstrate that the proposed system is effective in enabling efficient mobile privacy preserving collaborative filtering services.Includes bibliographical references (pages 126-133)

    Privacy Preserving Distributed Data Mining

    Get PDF
    Privacy preserving distributed data mining aims to design secure protocols which allow multiple parties to conduct collaborative data mining while protecting the data privacy. My research focuses on the design and implementation of privacy preserving two-party protocols based on homomorphic encryption. I present new results in this area, including new secure protocols for basic operations and two fundamental privacy preserving data mining protocols. I propose a number of secure protocols for basic operations in the additive secret-sharing scheme based on homomorphic encryption. I derive a basic relationship between a secret number and its shares, with which we develop efficient secure comparison and secure division with public divisor protocols. I also design a secure inverse square root protocol based on Newton\u27s iterative method and hence propose a solution for the secure square root problem. In addition, we propose a secure exponential protocol based on Taylor series expansions. All these protocols are implemented using secure multiplication and can be used to develop privacy preserving distributed data mining protocols. In particular, I develop efficient privacy preserving protocols for two fundamental data mining tasks: multiple linear regression and EM clustering. Both protocols work for arbitrarily partitioned datasets. The two-party privacy preserving linear regression protocol is provably secure in the semi-honest model, and the EM clustering protocol discloses only the number of iterations. I provide a proof-of-concept implementation of these protocols in C++, based on the Paillier cryptosystem

    Privacy-Preserving Decision Tree Classification over Horizontally Partitioned Data

    Get PDF
    Protection of privacy is one of important problems in data mining. The unwillingness to share their data frequently results in failure of collaborative data mining. This paper studies how to build a decision tree classifier under the following scenario: a database is horizontally partitioned into multiple pieces, with each piece owned by a particular party. All the parties want to build a decision tree classifier based on such a database, but due to the privacy constraints, neither of them wants to disclose their private pieces. We build a privacy-preserving system, including a set of secure protocols, that allows the parties to construct such a classifier. We guarantee that the private data are securely protected

    A Toolbox for privacy preserving distributed data mining

    Get PDF
    Distributed structure of individual data makes it necessary for data holders to perform collaborative analysis over the collective database for better data mining results. However each site has to ensure the privacy of its individual data, which means no information is revealed about individual values. Privacy preserving distributed data mining is utilized for that purpose. In this study, we try to draw more attention to the topic of privacy preserving data mining by showing a model which is realistic for data mining, and allows for very efficient protocols. We give two protocols which are useful tools in data mining: a protocol for Yaoѫs millionaires problem, and a protocol for numerical distance. Our solution to Yaoѫs millionaires problem is of independent interest since it gives a solution which improves on known protocols with respect to both computation complexity and communication overhead. This protocol can be used for different purposes in privacy preserving data mining algorithms such as comparison and equality test of data records. Our numerical distance protocol is also applicable to variety of algorithms. In this study we applied our numerical distance protocol in a privacy preserving distributed clustering protocol for horizontally partitioned data. We show application of our protocol over different attribute types such as interval-scaled,binary, nominal, ordinal, ratio-scaled, and alphanumeric. We present proof of security of our protocol, and explain communication, and computation complexity analysis indetail

    Privacy Preserving Data Mining For Horizontally Distributed Medical Data Analysis

    Get PDF
    To build reliable prediction models and identify useful patterns, assembling data sets from databases maintained by different sources such as hospitals becomes increasingly common; however, it might divulge sensitive information about individuals and thus leads to increased concerns about privacy, which in turn prevents different parties from sharing information. Privacy Preserving Distributed Data Mining (PPDDM) provides a means to address this issue without accessing actual data values to avoid the disclosure of information beyond the final result. In recent years, a number of state-of-the-art PPDDM approaches have been developed, most of which are based on Secure Multiparty Computation (SMC). SMC requires expensive communication cost and sophisticated secure computation. Besides, the mining progress is inevitable to slow down due to the increasing volume of the aggregated data. In this work, a new framework named Privacy-Aware Non-linear SVM (PAN-SVM) is proposed to build a PPDDM model from multiple data sources. PAN-SVM employs the Secure Sum Protocol to protect privacy at the bottom layer, and reduces the complex communication and computation via Nystrom matrix approximation and Eigen decomposition methods at the medium layer. The top layer of PAN-SVM speeds up the whole algorithm for large scale datasets. Based on the proposed framework of PAN-SVM, a Privacy Preserving Multi-class Classifier is built, and the experimental results on several benchmark datasets and microarray datasets show its abilities to improve classification accuracy compared with a regular SVM. In addition, two Privacy Preserving Feature Selection methods are also proposed based on PAN-SVM, and tested by using benchmark data and real world data. PAN-SVM does not depend on a trusted third party; all participants collaborate equally. Many experimental results show that PAN-SVM can not only effectively solve the problem of collaborative privacy-preserving data mining by building non-linear classification rules, but also significantly improve the performance of built classifiers

    A Privacy-Preserving Framework for Collaborative Association Rule Mining in Cloud

    Get PDF
    Collaborative Data Mining facilitates multiple organizations to integrate their datasets and extract useful knowledge from their joint datasets for mutual benefits. The knowledge extracted in this manner is found to be superior to the knowledge extracted locally from a single organization’s dataset. With the rapid development of outsourcing, there is a growing interest for organizations to outsource their data mining tasks to a cloud environment to effectively address their economic and performance demands. However, due to privacy concerns and stringent compliance regulations, organizations do not want to share their private datasets neither with the cloud nor with other participating organizations. In this paper, we address the problem of outsourcing association rule mining task to a federated cloud environment in a privacy-preserving manner. Specifically, we propose a privacy-preserving framework that allows a set of users, each with a private dataset, to outsource their encrypted databases and the cloud returns the association rules extracted from the aggregated encrypted databases to the participating users. Our proposed solution ensures the confidentiality of the outsourced data and also minimizes the users’ participation during the association rule mining process. Additionally, we show that the proposed solution is secure under the standard semi-honest model and demonstrate its practicality
    • …
    corecore