186 research outputs found

    Mining Frequent Graph Patterns with Differential Privacy

    Full text link
    Discovering frequent graph patterns in a graph database offers valuable information in a variety of applications. However, if the graph dataset contains sensitive data of individuals such as mobile phone-call graphs and web-click graphs, releasing discovered frequent patterns may present a threat to the privacy of individuals. {\em Differential privacy} has recently emerged as the {\em de facto} standard for private data analysis due to its provable privacy guarantee. In this paper we propose the first differentially private algorithm for mining frequent graph patterns. We first show that previous techniques on differentially private discovery of frequent {\em itemsets} cannot apply in mining frequent graph patterns due to the inherent complexity of handling structural information in graphs. We then address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling based algorithm. Unlike previous work on frequent itemset mining, our techniques do not rely on the output of a non-private mining algorithm. Instead, we observe that both frequent graph pattern mining and the guarantee of differential privacy can be unified into an MCMC sampling framework. In addition, we establish the privacy and utility guarantee of our algorithm and propose an efficient neighboring pattern counting technique as well. Experimental results show that the proposed algorithm is able to output frequent patterns with good precision

    Mining Privacy-Preserving Association Rules based on Parallel Processing in Cloud Computing

    Full text link
    With the onset of the Information Era and the rapid growth of information technology, ample space for processing and extracting data has opened up. However, privacy concerns may stifle expansion throughout this area. The challenge of reliable mining techniques when transactions disperse across sources is addressed in this study. This work looks at the prospect of creating a new set of three algorithms that can obtain maximum privacy, data utility, and time savings while doing so. This paper proposes a unique double encryption and Transaction Splitter approach to alter the database to optimize the data utility and confidentiality tradeoff in the preparation phase. This paper presents a customized apriori approach for the mining process, which does not examine the entire database to estimate the support for each attribute. Existing distributed data solutions have a high encryption complexity and an insufficient specification of many participants' properties. Proposed solutions provide increased privacy protection against a variety of attack models. Furthermore, in terms of communication cycles and processing complexity, it is much simpler and quicker. Proposed work tests on top of a realworld transaction database demonstrate that the aim of the proposed method is realistic

    Frequent Symptom Sets Identification from Uncertain Medical Data in Differentially Private Way

    Get PDF

    Protecting Privacy When Releasing Search Results from Medical Document Data

    Get PDF
    Health information technologies have greatly facilitated sharing of personal health data for secondary use, which is critical to medical and health research. However, there is a growing concern about privacy due to data sharing and publishing. Medical and health data typically contain unstructured text documents, such as clinical narratives, pathology reports, and discharge summaries. This study concerns privacy-preserving extraction, summary, and release of information from medical documents. Existing studies on privacy-preserving data mining and publishing focus mostly on structured data. We propose a novel approach to enable privacy-preserving extract, summarize, query and report patients’ demographic, health and medical information from medical documents. The extracted data is represented in a semi-structured, set-valued data format, which can be stored in a health information system for query and analysis. The privacy preserving mechanism is based on the cutting-edge idea of differential privacy, which offers rigorous privacy guarantee
    corecore