Mining Frequent Graph Patterns with Differential Privacy

Author: Geweke J.
Gilks W.
Karwa V.
Rubinstein R.
Williams O.
Yan X.
Publication date: 01/03/2013

Discovering frequent graph patterns in a graph database offers valuable information in a variety of applications. However, if the graph dataset contains sensitive data of individuals, such as mobile phone-call graphs and web-click graphs, releasing discovered frequent patterns may present a threat to the privacy of individuals. Differential privacy has recently emerged as the de facto standard for private data analysis due to its provable privacy guarantee. In this paper we propose the first differentially private algorithm for mining frequent graph patterns. We first show that previous techniques for differentially private discovery of frequent itemsets cannot be applied to mining frequent graph patterns, due to the inherent complexity of handling structural information in graphs. We then address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling based algorithm. Unlike previous work on frequent itemset mining, our techniques do not rely on the output of a non-private mining algorithm. Instead, we observe that both frequent graph pattern mining and the guarantee of differential privacy can be unified into an MCMC sampling framework. In addition, we establish the privacy and utility guarantees of our algorithm and propose an efficient neighboring pattern counting technique. Experimental results show that the proposed algorithm is able to output frequent patterns with good precision.

arXiv.org e-Print Archive

CiteSeerX

Crossref
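
The abstract of "Mining Frequent Graph Patterns with Differential Privacy" describes sampling patterns by MCMC so that frequent patterns are favored while differential privacy holds. The following is only a minimal sketch of that general idea, not the paper's algorithm: it runs a Metropolis-Hastings walk whose target weights follow the exponential mechanism, exp(epsilon * support / (2 * sensitivity)). The pattern names, support counts, neighbor relation, and function names are all hypothetical, and the sensitivity of 1 is an assumption (one graph added or removed changes a support count by at most one).

```python
import math
import random

# Hypothetical toy pattern space: pattern name -> support count in the database.
SUPPORT = {"a-b": 9, "b-c": 7, "a-b-c": 6, "a-b-c-d": 1}

# Hypothetical symmetric neighbor relation (patterns that differ by one edge).
NEIGHBORS = {
    "a-b": ["a-b-c"],
    "b-c": ["a-b-c"],
    "a-b-c": ["a-b", "b-c", "a-b-c-d"],
    "a-b-c-d": ["a-b-c"],
}


def log_accept_ratio(current, proposal, epsilon, sensitivity=1.0):
    """Log Metropolis-Hastings ratio for the target
    exp(epsilon * support / (2 * sensitivity))."""
    log_target = epsilon * (SUPPORT[proposal] - SUPPORT[current]) / (2.0 * sensitivity)
    # Hastings correction: proposals are uniform over the current pattern's neighbors.
    log_proposal = math.log(len(NEIGHBORS[current])) - math.log(len(NEIGHBORS[proposal]))
    return log_target + log_proposal


def sample_pattern(epsilon, steps, rng):
    """Run one chain for `steps` moves and return the pattern it ends on."""
    current = rng.choice(sorted(SUPPORT))  # arbitrary (assumed) starting pattern
    for _ in range(steps):
        proposal = rng.choice(NEIGHBORS[current])
        log_ratio = log_accept_ratio(current, proposal, epsilon)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current
```

Calling `sample_pattern(2.0, 300, random.Random(7))` returns one pattern name. A chain of finite length only approximates the target distribution, so a sketch like this says nothing by itself about the formal privacy and utility guarantees the abstract mentions.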

The effect of missing values using genetic programming on evolvable diagnosis

Author: Kalganova T
Werner JC
Publication date: 01/01/2002

Medical databases usually contain missing values due to the policy of reducing stress and harm to the patient. In practice, missing values have been a problem mainly because the mathematical equations obtained by genetic programming must be evaluated. The solution to this problem is to use fill-in methods to estimate the missing values. This paper analyses three fill-in methods: (1) attribute means, (2) conditional means, and (3) random number generation. The methods are evaluated using sensitivity, specificity, and entropy to explain the exchange in knowledge of the results. The results are illustrated based on the breast cancer database. Conditional means produced the best fill-in experimental results.

Brunel University Research Archive
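
The three fill-in methods named in the abstract above can be illustrated with a short sketch. This is not the authors' code. It assumes that "conditional means" refers to the mean of an attribute among records sharing the same class label, and that random generation draws uniformly between the observed minimum and maximum. The function names, the `size` and `label` fields, and the sample rows are hypothetical.

```python
import random
from statistics import mean


def fill_attribute_mean(rows, col):
    """Replace None in `col` with the mean of all observed values of `col`."""
    observed = [r[col] for r in rows if r[col] is not None]
    m = mean(observed)
    return [dict(r, **{col: m}) if r[col] is None else dict(r) for r in rows]


def fill_conditional_mean(rows, col, label):
    """Replace None in `col` with the mean of observed values in the same class."""
    by_class = {}
    for r in rows:
        if r[col] is not None:
            by_class.setdefault(r[label], []).append(r[col])
    means = {k: mean(v) for k, v in by_class.items()}
    return [dict(r, **{col: means[r[label]]}) if r[col] is None else dict(r) for r in rows]


def fill_random(rows, col, rng):
    """Replace None in `col` with a uniform draw from the observed range (assumption)."""
    observed = [r[col] for r in rows if r[col] is not None]
    lo, hi = min(observed), max(observed)
    return [dict(r, **{col: rng.uniform(lo, hi)}) if r[col] is None else dict(r) for r in rows]


# Hypothetical records; None marks a missing measurement.
rows = [
    {"size": 2.0, "label": "benign"},
    {"size": 4.0, "label": "benign"},
    {"size": 9.0, "label": "malignant"},
    {"size": None, "label": "malignant"},
]
```

Each function returns new records and leaves `rows` untouched, so the three methods can be compared on the same input, for example `fill_conditional_mean(rows, "size", "label")`.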

Graph-based real-time fault diagnostics

Author: Karsai G.
Padalkar S.
Sztipanovits J.

A real-time fault detection and diagnosis capability is absolutely crucial in the design of large-scale space systems. Some of the existing AI-based fault diagnostic techniques, like expert systems and qualitative modelling, are frequently ill-suited for this purpose. Expert systems are often inadequately structured, difficult to validate, and suffer from knowledge acquisition bottlenecks. Qualitative modelling techniques sometimes generate a large number of failure source alternatives, thus hampering speedy diagnosis. In this paper we present a graph-based technique which is well suited for real-time fault diagnosis, structured knowledge representation and acquisition, and testing and validation. A Hierarchical Fault Model of the system to be diagnosed is developed. At each level of hierarchy, there exist fault propagation digraphs denoting causal relations between failure modes of subsystems. The edges of such a digraph are weighted with fault propagation time intervals. Efficient and restartable graph algorithms are used for speedy on-line identification of failure source components.

NASA Technical Reports Server
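
The fault propagation digraph described in the abstract of "Graph-based real-time fault diagnostics" can be illustrated with a small sketch. It is not the paper's algorithm: it ignores the hierarchy and restartability, assumes a single acyclic digraph, and simply keeps the candidate sources for which one failure onset time is compatible with every observed alarm, given the interval on each edge. The node names, the interval values, and the function names are hypothetical.

```python
def delay_bounds(graph, source):
    """Smallest and largest possible propagation delay from `source` to each
    reachable node. graph[u] is a list of (v, t_min, t_max) edges; the graph
    is assumed to be acyclic."""
    bounds = {source: (0.0, 0.0)}
    changed = True
    while changed:  # relax edges until no bound moves (terminates on a DAG)
        changed = False
        for u, edges in graph.items():
            if u not in bounds:
                continue
            lo_u, hi_u = bounds[u]
            for v, t_min, t_max in edges:
                lo_v, hi_v = bounds.get(v, (float("inf"), float("-inf")))
                new = (min(lo_v, lo_u + t_min), max(hi_v, hi_u + t_max))
                if new != (lo_v, hi_v):
                    bounds[v] = new
                    changed = True
    return bounds


def consistent_sources(graph, alarms):
    """Return sources whose failure could explain all alarms.
    alarms maps node -> observed alarm time."""
    nodes = set(graph) | {v for edges in graph.values() for v, _, _ in edges}
    result = []
    for s in sorted(nodes):
        bounds = delay_bounds(graph, s)
        if any(n not in bounds for n in alarms):
            continue  # some alarmed node cannot be reached from s
        # Each alarm limits the onset time to [t - max_delay, t - min_delay].
        earliest = max(t - bounds[n][1] for n, t in alarms.items())
        latest = min(t - bounds[n][0] for n, t in alarms.items())
        if earliest <= latest:
            result.append(s)
    return result


# Hypothetical failure modes and propagation intervals (arbitrary time units).
graph = {
    "pump": [("valve", 1, 3)],
    "valve": [("sensor", 2, 4)],
    "heater": [("sensor", 5, 8)],
}
```

A call such as `consistent_sources(graph, {"valve": 10.0, "sensor": 13.0})` returns the candidate sources in sorted order. Treating each node's delay bounds independently is a simplification, so the result may include sources that a path-exact analysis would rule out.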