Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantees of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
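The idea of drawing a frequent pattern by MCMC sampling under differential privacy can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic Metropolis-Hastings sampler whose target distribution is the standard exponential mechanism, with probability proportional to exp(epsilon * support / (2 * sensitivity)). The names `mh_sample`, `SUPPORT`, and `RING`, the toy supports, and the ring-shaped neighbor relation are all hypothetical and stand in for a real pattern space.

```python
import math
import random

# Hypothetical toy "patterns" with made-up support counts (assumption).
SUPPORT = {"A": 50, "B": 10, "C": 5, "D": 1}
# Hypothetical neighbor relation: patterns arranged on a ring A-B-C-D-A.
RING = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}


def mh_sample(start, score, neighbors, epsilon, sensitivity=1.0,
              steps=200, rng=None):
    """Metropolis-Hastings walk whose stationary distribution is
    proportional to exp(epsilon * score(x) / (2 * sensitivity))."""
    rng = rng or random.Random()
    current = start
    for _ in range(steps):
        nbrs = neighbors(current)
        proposal = rng.choice(nbrs)
        # Log of the target ratio pi(proposal) / pi(current).
        log_ratio = epsilon * (score(proposal) - score(current)) / (2 * sensitivity)
        # Hastings correction for a uniform-over-neighbors proposal.
        log_ratio += math.log(len(nbrs)) - math.log(len(neighbors(proposal)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
    return current


sampled = mh_sample("D", SUPPORT.get, RING.get, epsilon=2.0,
                    rng=random.Random(0))
print(sampled)
```

With these toy numbers the high-support pattern dominates the target distribution, so the walk settles on it. In a real setting the privacy guarantee depends on how close the chain is to its stationary distribution, which this sketch does not analyze.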
Mining Privacy-Preserving Association Rules based on Parallel Processing in Cloud Computing
With the onset of the Information Era and the rapid growth of information
technology, ample opportunity for processing and extracting data has opened up.
However, privacy concerns may stifle expansion in this area. This study
addresses the challenge of reliable mining when transactions are dispersed
across sources. It explores a new set of three algorithms that aim to maximize
privacy, data utility, and time savings. For the preparation phase, the paper
proposes a double encryption and Transaction Splitter approach that alters the
database to optimize the tradeoff between data utility and confidentiality. For
the mining process, it presents a customized Apriori approach that does not
examine the entire database to estimate the support for each attribute.
Existing distributed data solutions have high encryption complexity and an
insufficient specification of many participants' properties. The proposed
solutions provide increased privacy protection against a variety of attack
models. Furthermore, in terms of communication cycles and processing
complexity, they are much simpler and quicker. Tests on a real-world
transaction database demonstrate that the aim of the proposed method is
realistic.
Protecting Privacy When Releasing Search Results from Medical Document Data
Health information technologies have greatly facilitated the sharing of personal health data for secondary use, which is critical to medical and health research. However, there is growing concern about privacy due to data sharing and publishing. Medical and health data typically contain unstructured text documents, such as clinical narratives, pathology reports, and discharge summaries. This study concerns privacy-preserving extraction, summarization, and release of information from medical documents. Existing studies on privacy-preserving data mining and publishing focus mostly on structured data. We propose a novel approach that enables privacy-preserving extraction, summarization, querying, and reporting of patients’ demographic, health, and medical information from medical documents. The extracted data is represented in a semi-structured, set-valued data format, which can be stored in a health information system for query and analysis. The privacy-preserving mechanism is based on the cutting-edge idea of differential privacy, which offers a rigorous privacy guarantee.
- …