100,966 research outputs found
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. {\em Differential privacy} has recently emerged
as the {\em de facto} standard for private data analysis due to its provable
privacy guarantee. In this paper we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques on differentially private discovery of
frequent {\em itemsets} cannot apply in mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantee of our algorithm and
propose an efficient neighboring pattern counting technique as well.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision
Recommended from our members
The effect of missing values using genetic programming on evolvable diagnosis
Medical databases usually contain missing values due the policy of
reducing stress and harm to the patient. In practice missing values has been a
problem mainly due to the necessity to evaluate mathematical equations obtained
by genetic programming. The solution to this problem is to use fill in methods to
estimate the missing values. This paper analyses three fill in methods: (1) attribute
means, (2) conditional means, and (3) random number generation. The methods
are evaluated using sensitivity, specificity, and entropy to explain the exchange in
knowledge of the results. The results are illustrated based on the breast cancer
database. Conditional means produced the best fill in experimental results
Graph-based real-time fault diagnostics
A real-time fault detection and diagnosis capability is absolutely crucial in the design of large-scale space systems. Some of the existing AI-based fault diagnostic techniques like expert systems and qualitative modelling are frequently ill-suited for this purpose. Expert systems are often inadequately structured, difficult to validate and suffer from knowledge acquisition bottlenecks. Qualitative modelling techniques sometimes generate a large number of failure source alternatives, thus hampering speedy diagnosis. In this paper we present a graph-based technique which is well suited for real-time fault diagnosis, structured knowledge representation and acquisition and testing and validation. A Hierarchical Fault Model of the system to be diagnosed is developed. At each level of hierarchy, there exist fault propagation digraphs denoting causal relations between failure modes of subsystems. The edges of such a digraph are weighted with fault propagation time intervals. Efficient and restartable graph algorithms are used for on-line speedy identification of failure source components
- …